Meaning
This alert is triggered when a core service logs more than one fatal-level message within a 1-minute window. Fatal logs indicate unrecoverable errors that typically cause service crashes or major disruptions. The alert fires at a critical level.
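The trigger condition can be illustrated with a small sliding-window check. This is only a sketch of the rule as described here, not the alerting system's actual implementation: the function name `fatal_alert_times`, the `(timestamp, level)` tuple shape, and the choice of a trailing window are all assumptions made for illustration.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=1)  # window length stated in the alert description
THRESHOLD = 1                  # the alert fires when the count exceeds this


def fatal_alert_times(entries, window=WINDOW, threshold=THRESHOLD):
    """Return the timestamps at which the alert condition would be met.

    `entries` is an iterable of (timestamp, level) tuples sorted by
    timestamp. This input shape is hypothetical, not a real log schema.
    """
    recent = deque()  # timestamps of fatal entries inside the trailing window
    firing = []
    for ts, level in entries:
        if level.lower() != "fatal":
            continue
        recent.append(ts)
        # Drop fatal entries that have fallen out of the trailing window.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > threshold:
            firing.append(ts)
    return firing
```

Under these assumptions, two fatal entries 30 seconds apart would meet the condition at the second entry, while a single fatal entry with no other within the same minute would not.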
Full context
Fatal log entries usually reflect catastrophic failures such as unhandled exceptions, corrupted state, or forced shutdowns.
Impact
- The service has likely crashed or entered a failed state.
- Immediate disruption of functionality or data processing.
- Potential cascading impact on dependent systems.
Diagnosis
- Identify the affected service from the alert label.
- Check the affected service's logs for the fatal entries and any accompanying stack traces.
- Correlate with system events like deployments, resource spikes, or dependency failures.
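The correlation step above can be sketched as a simple time-window lookup. The helper `events_near`, the `(timestamp, name)` event tuples, and the 15-minute lookback are hypothetical choices for illustration; in practice the events would come from whatever deployment history and monitoring data are available.

```python
from datetime import datetime, timedelta


def events_near(fatal_time, events, lookback=timedelta(minutes=15)):
    """Return names of events that occurred within `lookback` before `fatal_time`.

    `events` is an iterable of (timestamp, name) tuples, for example
    deployments, resource spikes, or dependency failures. Events after the
    fatal log are ignored because they cannot have caused it.
    """
    return [
        name
        for ts, name in events
        if timedelta(0) <= fatal_time - ts <= lookback
    ]
```

For example, calling `events_near(first_fatal_timestamp, recent_events)` with the timestamp of the first fatal entry narrows the list to changes that happened shortly before the failure, which are the most likely candidates to investigate or revert.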
Mitigation
- Restart the service immediately if it has stopped.
- Revert recent changes if the fatal errors began after a deployment.
- Escalate to engineering for root cause analysis and remediation.