Infrastructure

We run servers in multiple Tier III datacentres, with sub-second backups of data from primary to backup servers. We obsessively monitor each server, running over 200 different tests on it every few minutes. If any test fails, our infrastructure team is alerted by text message. You can view our live infrastructure status page here.
Tier III Datacentres
The "Tier III" designation means:
- Site infrastructure guarantees 99.982% availability
- All equipment is dual-powered
- Multiple internet connections to site
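
To put the 99.982% figure in everyday terms, the short calculation below converts an availability percentage into the downtime it permits over a year. It is a minimal sketch: the function name and the 365-day year are illustrative assumptions, not part of any datacentre specification.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def max_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the downtime (in hours) permitted at a given availability."""
    return (1 - availability_percent / 100) * period_hours


tier3_hours = max_downtime_hours(99.982)
print(f"{tier3_hours:.2f} hours per year")         # about 1.58 hours
print(f"{tier3_hours * 60:.0f} minutes per year")  # about 95 minutes
```

In other words, 99.982% availability corresponds to roughly an hour and a half of site downtime per year.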
To keep our servers running, each one has:
- Two power supplies, permanently connected
- Battery packs in case of power failure
- Two network connections, routed to separate switches
- Disk drives in a Redundant Array of Independent Disks (RAID), so that if one hard disk fails, the server keeps running
Using Xen virtualisation software, we slice each of our Ubuntu-based servers into several separate Debian "Virtual Machines". This lets us monitor each service individually and allocate more server resources to busier services.
We deploy world-class software to support The PODFather. Web pages are served by Apache, with a Squid reverse proxy in front. Data is stored in MySQL, with real-time, per-transaction replication from the primary to the backup servers.
We use the Nagios monitoring suite to run over 200 tests per server every few minutes. Warnings are emailed to the team and Criticals are sent by text message. This lets us closely monitor, for example, peaks in load, and act quickly to allocate resources as required, avoiding an outage.
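
As an illustration of how one such test might grade a reading and choose a notification route, here is a minimal sketch in Python. The status codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The load thresholds, function names and channel labels are hypothetical and are not taken from our actual configuration.

```python
# Standard Nagios plugin status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_load(load, warn=4.0, crit=8.0):
    """Grade a load-average reading against hypothetical thresholds."""
    if load >= crit:
        return CRITICAL
    if load >= warn:
        return WARNING
    return OK


def notification_channel(status):
    """Warnings go out by email, Criticals by text; other states send nothing."""
    return {WARNING: "email", CRITICAL: "sms"}.get(status)


def report(load):
    """Build the one-line status message a check would print."""
    status = classify_load(load)
    return status, f"LOAD {STATUS_NAMES[status]} - load1={load:.2f}"


# Example readings: quiet, busy, and overloaded.
for reading in (0.7, 5.2, 9.1):
    status, message = report(reading)
    print(message, "->", notification_channel(status))
```

A real Nagios plugin would print the message and then exit with the status code, leaving Nagios itself to decide who is notified and how.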

