What is uptime and why it matters

In brief: Uptime is the percentage of time your website is available. 99 % sounds like a lot, but in practice it allows more than three days of outage per year. For sites and apps that bring in revenue, a realistic target is at least 99.9 % (about 8 h 45 min of outage per year) - and the key is learning about a problem within a minute of it starting.

Definition: what exactly we measure

Uptime is the ratio of the time the service responds as expected (typically HTTP 200 with a response body that contains a given keyword) to the total measurement time. It is expressed as a percentage, most commonly over a 30- or 365-day period.

The opposite is downtime - time when the service doesn't respond, returns a 5xx error, or takes longer than the configured timeout. Scheduled maintenance counts as downtime too, unless you explicitly exclude it from the calculation (an exclusion you should state in the SLA).
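The definition above can be written down as a short calculation. This is a minimal sketch, not a reference implementation: the function name and the `excluded_maintenance_seconds` parameter are hypothetical, and it assumes that `downtime_seconds` already includes any scheduled maintenance.

```python
def uptime_percent(total_seconds: float,
                   downtime_seconds: float,
                   excluded_maintenance_seconds: float = 0.0) -> float:
    """Return uptime as a percentage of the measured period.

    Assumption: downtime_seconds includes scheduled maintenance. Any
    maintenance the SLA explicitly excludes is removed from both the
    downtime and the measured period.
    """
    measured = total_seconds - excluded_maintenance_seconds
    if measured <= 0:
        raise ValueError("measured period must be positive")
    counted_downtime = max(downtime_seconds - excluded_maintenance_seconds, 0.0)
    return 100.0 * (measured - counted_downtime) / measured


# 30-day period with 43 min 12 s (2592 s) of downtime, nothing excluded
print(round(uptime_percent(30 * 86400, 2592), 3))  # 99.9
```

Whether maintenance is excluded changes the result, so the same convention has to be used everywhere the number is reported.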

The "nines" table: how much time each decimal place means

Uptime	Allowed downtime per year	Per month	Per day
99 %	3 days 15 h	7 h 18 min	14 min
99.5 %	1 day 19 h	3 h 39 min	7 min
99.9 % (three nines)	8 h 45 min	43 min	1 min 26 s
99.95 %	4 h 22 min	21 min	43 s
99.99 % (four nines)	52 min	4 min 22 s	8.6 s
99.999 % (five nines)	5 min 15 s	26 s	0.86 s

Each additional nine multiplies infrastructure costs. Five nines (99.999 %) is the domain of global providers with active cross-continent redundancy - for most business applications 99.9 % is the right target.
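The figures in the table follow from one multiplication, which can be sketched as below. The period lengths are assumptions (a 365-day year and a month of one twelfth of that), so other conventions will shift the results by a few seconds or minutes.

```python
from datetime import timedelta

# Assumed period lengths: 365-day year, month = year / 12
PERIODS = {
    "year": timedelta(days=365),
    "month": timedelta(days=365) / 12,
    "day": timedelta(days=1),
}


def allowed_downtime(uptime_percent: float, period: timedelta) -> timedelta:
    """Downtime budget that a given uptime target leaves in one period."""
    return period * (1 - uptime_percent / 100)


for target in (99.0, 99.9, 99.99):
    budget = {name: allowed_downtime(target, p) for name, p in PERIODS.items()}
    print(target, budget)
```

For example, 99.9 % over a 365-day year leaves 0.001 × 8760 h = 8.76 h, which is the 8 h 45 min shown in the table.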

What uptime you actually need

Marketing site (company, portfolio): 99 % is enough. A visitor who comes during an outage will try again later.
SaaS app with a desktop client: 99.9 % is the minimum. Customers pay for work they can't do during downtime.
E-shop, payment gateway, real-time service: 99.95 % and above. Every minute = direct losses.
Infrastructure (API used by others): At least 99.99 %. Your SLA caps your clients' SLA: a service built on your API cannot promise more availability than you deliver.
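The infrastructure case can be illustrated with a little arithmetic. The sketch below rests on a simplifying assumption that is not part of the text above: the service needs every dependency to be up, and outages are independent, so the availabilities multiply. The function name is hypothetical.

```python
from math import prod


def serial_availability(*uptimes_percent: float) -> float:
    """Combined uptime of components that must all be up at once.

    Assumes independent failures; real outages are often correlated,
    so treat the result as a rough estimate.
    """
    return 100.0 * prod(u / 100.0 for u in uptimes_percent)


# A client service that is itself 99.95 % available, built on a 99.9 % API
print(round(serial_availability(99.95, 99.9), 3))  # 99.85
```

Under these assumptions the combined figure is always lower than the weakest component, which is why an API used by others needs a stricter target than the applications built on it.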

How uptime is measured

The monitoring service periodically calls your endpoint (typically with an HTTP GET, though TCP connections, ICMP pings and DNS lookups are also used). Each check has a binary result: up or down.
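A single HTTP check of this kind might look like the following sketch, which uses only the Python standard library. The function names, the default timeout and the keyword rule are illustrative assumptions, not a description of how any particular monitoring service works.

```python
import http.client
import urllib.request


def is_up(status: int, body: str, keyword: str) -> bool:
    """Binary verdict: HTTP 200 and the expected keyword in the body."""
    return status == 200 and keyword in body


def check_http(url: str, keyword: str, timeout: float = 10.0) -> bool:
    """Run one GET check; any error or timeout counts as 'down'.

    Note: the timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return is_up(resp.status, body, keyword)
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError
        return False
```

A call such as `check_http("https://example.com/", "Example")` would then be scheduled at the chosen interval; the URL and keyword here are placeholders.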

A common interval is 1-5 minutes. The shorter it is, the faster you catch an outage, but the more false-positive alerts you'll get (local network glitch, brief deploy restart). The solution is multi-region checking: an outage is confirmed only when N regions report it, not just one.
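The multi-region rule can be sketched as a simple quorum. The region names and the default threshold of two regions are assumptions made for illustration.

```python
def outage_confirmed(region_results: dict[str, bool],
                     min_regions_down: int = 2) -> bool:
    """Confirm an outage only when at least N regions report 'down'.

    region_results maps a region name to True (up) or False (down).
    """
    down = sum(1 for up in region_results.values() if not up)
    return down >= min_regions_down


# One failing region is treated as a local glitch, not an outage
print(outage_confirmed({"eu": True, "us": False, "asia": True}))   # False
print(outage_confirmed({"eu": False, "us": False, "asia": True}))  # True
```

A higher threshold suppresses more false positives, but it also means an outage that is visible from only one region will not raise an alert.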

Most common sources of "lost nines"

Expired SSL/TLS certificate. The browser blocks the page. Without monitoring you find out Monday morning when the phone rings.
Domain expiry. DNS resolution for the whole domain stops working. Email, web, status page - everything goes down at the same time.
Crashed database worker. The site returns 500 or times out for some requests. A plain ping check may still pass.
DDoS or flooding. Server overloaded, response time climbs above the limit, monitoring reports an outage.
Botched deploy. A new version has a bug that breaks a critical path. Without integration tests you find out when customers start complaining.
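The first item on this list, an expired certificate, is one of the easier ones to catch ahead of time. The sketch below reads a certificate's expiry date with the Python standard library; the function names and the idea of alerting at some threshold of remaining days are assumptions, not a prescribed method.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the certificate's 'notAfter' string for a host.

    If the certificate has already expired, the TLS handshake raises
    ssl.SSLCertVerificationError, which is itself a signal to alert on.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_left(not_after: str, now: Optional[float] = None) -> float:
    """Days until expiry, given a 'notAfter' string such as
    'Jan  5 09:34:43 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400
```

A monitor could call `days_left(fetch_not_after("example.com"))` once a day and alert when the value drops below a chosen threshold; the host name is a placeholder.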

Conclusion

Uptime isn't a marketing number - it's the measure of how much you can rely on your own infrastructure. 99.9 % uptime isn't a luxury, it's a standard requirement for any service that generates revenue or has paying customers.

Step one is to measure. If you don't have external monitoring, technically you don't know what uptime you have - you're just guessing.

Start measuring uptime of your services

ePulz.io checks your endpoints from multiple regions at intervals as short as 1 minute. 7 days free.