KnovvuVAErrorLogs

Prev Next

Meaning

This alert is triggered when a Virtual Agent service generates more than 100 error-level logs within a 5-minute period (excluding the usher container). It indicates a potential malfunction or instability in the service. The alert fires at a critical level if the issue persists for 1 minute.

Full context

A sudden spike in error-level logs can signal service-level failures, unhandled exceptions, or misconfigurations that require attention.

Impact

High error volume can indicate:

  • Service disruptions or degraded behavior.
  • Increased failure rate in user interactions.
  • Risk of log storage overflow or reduced observability.

Diagnosis

  • Identify the service from the alert label.
  • Check recent logs for common patterns or repeated exceptions.
  • Correlate with recent deployments or traffic spikes.

Mitigation

  • Restart the affected service if necessary.
  • Roll back recent changes or hotfix issues.
  • Escalate to engineering if errors persist or impact production behavior.