Monitoring SRE's Golden Signals

Request Rate -- how many requests the service receives, in requests/sec.

Error Rate -- how many requests fail, in errors/sec.

Latency -- response time, including queue/wait time, in milliseconds.

Saturation -- how overloaded something is, measured directly by things like queue depth (or sometimes concurrency). It stays at zero while the system keeps up and becomes non-zero once the system is saturated.

Utilization -- how busy the resource or system is. Usually expressed as 0-100%, and most useful for predictions (saturation is usually more useful for alerts).
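
As a rough illustration of how the five signals above relate to each other, here is a minimal Python sketch that derives them from a list of completed requests. The Request record, the golden_signals function, the window length, and the worker count are all hypothetical names and values chosen for this example, not part of any particular monitoring tool. It assumes every request arrives and completes inside the measurement window, and it uses time-averaged queue depth (total wait time divided by window length) as the saturation measure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """Hypothetical record of one completed request (times in seconds)."""
    arrival_s: float  # when the request reached the service
    start_s: float    # when a worker began serving it
    end_s: float      # when the response was sent
    ok: bool          # False if the request failed


def golden_signals(requests: List[Request], window_s: float, workers: int) -> Dict[str, float]:
    """Compute the five signals over one window.

    Assumes all requests arrive and finish within the window.
    """
    count = len(requests)
    errors = sum(1 for r in requests if not r.ok)

    # Latency is measured from arrival, so it includes queue/wait time.
    latencies_ms = [(r.end_s - r.arrival_s) * 1000.0 for r in requests]

    # Time workers spent actually serving requests.
    busy_s = sum(r.end_s - r.start_s for r in requests)

    # Time requests spent waiting before a worker picked them up.
    wait_s = sum(r.start_s - r.arrival_s for r in requests)

    return {
        "request_rate_per_s": count / window_s,
        "error_rate_per_s": errors / window_s,
        "latency_mean_ms": sum(latencies_ms) / count if count else 0.0,
        "latency_max_ms": max(latencies_ms) if count else 0.0,
        # Time-averaged queue depth: zero unless some request had to wait.
        "saturation_avg_queue_depth": wait_s / window_s,
        # Share of total worker capacity that was in use, as a percentage.
        "utilization_pct": 100.0 * busy_s / (workers * window_s),
    }


# Made-up sample: one worker, a 10-second window, two requests that had to queue.
sample = [
    Request(arrival_s=0.0, start_s=0.0, end_s=2.0, ok=True),
    Request(arrival_s=1.0, start_s=2.0, end_s=4.0, ok=True),
    Request(arrival_s=3.0, start_s=4.0, end_s=6.0, ok=False),
    Request(arrival_s=8.0, start_s=8.0, end_s=9.0, ok=True),
]
signals = golden_signals(sample, window_s=10.0, workers=1)
for name, value in signals.items():
    print(f"{name}: {value:.2f}")
```

In this sample the worker is busy for 7 of the 10 seconds, yet two requests still had to queue, so saturation is already non-zero at 70% utilization. That is one way to see why saturation tends to be the better alerting signal while utilization is better suited to predictions.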