ePulz.io Blog

2026-05-23 · 7 min

Monitoring devices in a client's internal network via a LAN agent

Cloud monitoring can't reach NAS devices, cameras, or local servers behind the client's router. The ePulz.io LAN agent reverses the direction of communication: a small daemon in your network calls us over standard HTTPS. No port forwards, no VPN.

2026-05-21 · 4 min

Why monitor your website (and what it costs when you don't)

A one-hour e-shop outage during peak hours = hundreds of euros in lost orders. SSL expiry = 100% traffic loss. We calculated the real costs.

2025-12-03 · 6 min

Webhook, email or Telegram: which alert to use and when

Email is slow but auditable, Telegram is fast but informal, webhook is flexible but requires integration. A practical guide on how to combine channels without alert fatigue.

2025-09-11 · 8 min

Incident response playbook for small teams

Roles, severity levels, the first 15 minutes of a SEV1 incident, post-mortem structure. A practical guide for a 5-20 person team without a dedicated SRE.

2025-05-19 · 7 min

SLI, SLO, SLA: measuring availability without illusions

Three terms from the Google SRE book that often get confused. SLI is a metric, SLO an internal target, SLA a contract. Plus the error budget concept in practice.
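
As an illustration of the error budget idea, here is a minimal sketch in Python. It assumes a request-based SLI (good requests divided by total requests); the function name and the example numbers are hypothetical and not taken from the article.

```python
def error_budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent).

    slo is the target success ratio, e.g. 0.999 for "99.9% of requests succeed".
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The budget is the number of requests that are allowed to fail under the SLO.
    allowed_failures = (1.0 - slo) * total_requests
    return (allowed_failures - failed_requests) / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 400 of them failed, SLO of 99.9%.
    remaining = error_budget_remaining(0.999, 1_000_000, 400)
    print(f"Error budget remaining: {remaining:.0%}")
```

A value near zero suggests slowing down risky releases; a negative value means the SLO has already been missed for the period.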

2025-04-15 · 7 min

API monitoring: when HTTP 200 isn't enough

A backend can return 200 OK with a body of 'status: error'. True API monitoring combines status code, keyword match in content, JSONPath assertions and response time.
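
The combination of checks described above could be sketched roughly as follows. This is an assumption-laden example, not ePulz.io's implementation: `evaluate_response` and `lookup` are hypothetical names, and the dotted-path lookup is a simplified stand-in for a real JSONPath engine.

```python
import json
from typing import Any


def lookup(document: Any, dotted_path: str) -> Any:
    """Follow a dotted path such as 'data.status' through nested dicts.

    Simplified stand-in for JSONPath; returns None when a key is missing.
    """
    current = document
    for key in dotted_path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return None
    return current


def evaluate_response(status_code: int, body: str, elapsed_ms: float, *,
                      keyword: str, path: str, expected: Any,
                      max_ms: float) -> list[str]:
    """Return a list of failed checks; an empty list means the API looks healthy."""
    failures = []
    if status_code != 200:
        failures.append(f"unexpected status code {status_code}")
    if keyword not in body:
        failures.append(f"keyword {keyword!r} not found in body")
    try:
        document = json.loads(body)
    except json.JSONDecodeError:
        failures.append("body is not valid JSON")
    else:
        actual = lookup(document, path)
        if actual != expected:
            failures.append(f"{path} is {actual!r}, expected {expected!r}")
    if elapsed_ms > max_ms:
        failures.append(f"response took {elapsed_ms:.0f} ms, limit is {max_ms:.0f} ms")
    return failures


if __name__ == "__main__":
    # A 200 OK response whose body still reports an error.
    print(evaluate_response(200, '{"status": "error"}', 120,
                            keyword="status", path="status",
                            expected="ok", max_ms=500))
```

The function takes an already-fetched response, so it can be paired with any HTTP client and tested without network access.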

2025-02-25 · 6 min

Core Web Vitals and uptime: when 200 OK isn't enough for Google

LCP, INP, CLS - three real UX metrics Google uses as a ranking factor. A server can have 100% uptime and still lose rankings due to slow LCP.

2024-12-18 · 6 min

What a good public status page looks like

Components, incident timeline, post-mortem, subscribers, hosting on independent infrastructure. Anti-pattern: hiding problems.

2024-10-04 · 6 min

How to eliminate false-positive outages in monitoring

Single-region monitoring lies. Multi-region cross-check with a consensus algorithm (M of N probes) drastically reduces noise and protects against alert fatigue.
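
The M-of-N idea can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the region names, the threshold, and the function name `is_confirmed_outage` are hypothetical, and a production consensus algorithm would likely also handle retries and probe timeouts.

```python
def is_confirmed_outage(probe_results: dict[str, bool], min_failures: int) -> bool:
    """Report an outage only when at least min_failures probes (M of N) failed.

    probe_results maps a probe region to True (check passed) or False (check failed).
    """
    if not 1 <= min_failures <= len(probe_results):
        raise ValueError("min_failures must be between 1 and the number of probes")
    failures = sum(1 for passed in probe_results.values() if not passed)
    return failures >= min_failures


if __name__ == "__main__":
    # Hypothetical probes: one region has a local network problem.
    results = {"eu-central": True, "us-east": False, "ap-south": True}
    # With M = 2 of N = 3, a single failing region does not raise an alert.
    print(is_confirmed_outage(results, min_failures=2))
```

Choosing M is a trade-off: a higher M suppresses more regional noise but can delay alerts for partial outages.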

2024-07-30 · 7 min

HTTP security headers: HSTS, CSP, X-Frame-Options and others

Practical configuration of security headers in nginx. HSTS preload, CSP with nonce/hash, Permissions-Policy. Half an hour of work for a solid layer of browser-side defense.

2024-05-12 · 7 min

DNS troubleshooting: nslookup, dig and DNS-over-HTTPS

Practical DNS debugging procedures. dig +trace, +dnssec, RDAP API, DoH to bypass blocked port 53. Plus a checklist for 'domain not working'.

2024-04-08 · 6 min

Domain expiration: WHOIS monitoring in practice

Domain expiration = simultaneous outage of website, email and all subdomains. Grace periods for different TLDs, why auto-renewal fails, and how monitoring warns you 90 days in advance.

2024-02-20 · 6 min

HTTP status codes: 200, 301, 404, 5xx and what to do with them

A practical cheat sheet of HTTP codes 2xx-5xx. Which to alert on immediately, which to ignore, when to react to a trend. Including Cloudflare 5xx (520-525).

2023-11-14 · 6 min

Monitoring cron jobs: heartbeat pattern in practice

Background jobs without an HTTP endpoint are a monitoring blind spot. The heartbeat pattern reverses the direction of communication - cron pings the monitor. Implementation in bash, Python, Node.
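
A minimal Python sketch of the heartbeat pattern, under stated assumptions: the heartbeat URL is a placeholder rather than a real ePulz.io endpoint, and `run_with_heartbeat` and `send_ping` are hypothetical helper names.

```python
import urllib.request
from typing import Callable

# Placeholder URL; replace with the heartbeat address issued by your monitor.
HEARTBEAT_URL = "https://monitor.example/heartbeat/your-token"


def send_ping(url: str = HEARTBEAT_URL, timeout: float = 10.0) -> None:
    """Tell the monitor that the job finished by requesting the heartbeat URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_with_heartbeat(job: Callable[[], None],
                       ping: Callable[[], None] = send_ping) -> None:
    """Run the job and ping the monitor only if the job did not raise.

    If the job fails, the exception propagates, no ping is sent, and the
    monitor can alert once the expected heartbeat is overdue.
    """
    job()
    ping()


def nightly_backup() -> None:
    """Stand-in for the real work a cron job would do."""
    print("backup finished")


if __name__ == "__main__":
    # In a real cron job, omit the ping argument to use send_ping.
    run_with_heartbeat(nightly_backup, ping=lambda: print("heartbeat sent"))
```

Passing the ping as a callable keeps the wrapper testable without network access.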

2023-09-08 · 6 min

What to do when your SSL certificate expires

A quick procedure for renewing Let's Encrypt and commercial certs, automatic renewal via certbot, Caddy/Traefik with ACME, expiry monitoring as a safety net.
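
For the expiry-monitoring part, a rough sketch of the core calculation in Python. It assumes the certificate's `notAfter` string has already been obtained (for example from `ssl.SSLSocket.getpeercert()`); the function name and the 14-day threshold are hypothetical.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between now and the certificate's notAfter time.

    not_after uses the certificate time format, e.g. 'Jan  5 09:34:43 2018 GMT'.
    now must be a timezone-aware datetime.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    return (expires_at - now.timestamp()) / 86400


if __name__ == "__main__":
    remaining = days_until_expiry("Jan  5 09:34:43 2018 GMT",
                                  datetime.now(timezone.utc))
    # Hypothetical threshold: warn when fewer than 14 days remain.
    if remaining < 14:
        print(f"Certificate needs attention: {remaining:.1f} days left")
```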

2023-06-15 · 5 min

What is uptime and why it matters

A table of the "nines" and the downtime each allows per year (99% = 3.65 days, 99.9% = 8.76 h, 99.99% = 52 min). What uptime you need by product type. The most common sources of lost nines.
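
The figures above follow from simple arithmetic, sketched here in Python. The function name is hypothetical, and a 365-day year is assumed.

```python
def allowed_downtime_seconds(uptime_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime, in seconds, permitted by an uptime target over a period."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        seconds = allowed_downtime_seconds(target)
        print(f"{target}% uptime allows {seconds / 3600:.2f} hours of downtime per year")
```

Passing `period_days=30` gives the monthly budget, which is often the more practical number for alerting.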