Mission Control
DemoCentral command — log clusters, anomaly synopsis, and AI chat in one pane
Ask Mission Control
ai-sre chatAsk about your logs — e.g. “What’s driving the error spike?”
Live Anomalies
4novel event · 1,243,891,402 logs scanned
Zero history, no learned baseline, no monitor configured. Restarted in 8s, so all 42 pods stayed green and no threshold fired.
Risk if unchecked
First tremor of a memory leak. At peak load it OOM-kills every replica — all 42 checkout pods down, full checkout outage.
restarts in 2 min
Correlated to the cgroup OOM 40s earlier on the same pod.
Risk if unchecked
Each restart widens the back-off. The pod stops serving entirely and checkout capacity drops to 41/42 replicas.
readiness probes passing
Pod marked NotReady; traffic shifted onto the 41 healthy replicas.
Risk if unchecked
The 41 remaining replicas now absorb extra load. If one more drops, checkout runs under capacity and latency spikes.
5xx rate · 210× over baseline
Downstream blast radius of the checkout-api OOM cascade.
Risk if unchecked
Real users are already failing to check out. At 4.2% and climbing, every minute is lost orders and direct revenue loss.