KnovvuCoreFatalLogRules

Meaning

This alert is triggered when a core service logs more than one fatal-level message within a 1-minute window. Fatal logs indicate unrecoverable errors that typically cause service crashes or major disruptions. The alert fires at a critical level.

Full context

Fatal log entries usually reflect catastrophic failures such as unhandled exceptions, corrupted state, or forced shutdowns.

Impact

Service likely crashed or entered a failed state.
Immediate disruption of functionality or data processing.
Potential cascading impact on dependent systems.

Diagnosis

Identify the affected service from the alert label.
Check logs for fatal errors and stack traces.
Correlate with system events like deployments, resource spikes, or dependency failures.

Mitigation

Restart the service immediately if it has stopped.
Revert recent changes if fatal errors were introduced during deployment.
Escalate to engineering for root cause analysis and remediation.