Scalability

Document Number	Revision Number	Revision Date
KN. GU.44.EN	Rev4	05.03.2026

Horizontal Auto-Scaling Based on Demand

One of the major advantages of operating in the cloud is the ability to scale elastically. The Knovvu platform uses both AWS and Kubernetes auto-scaling mechanisms to dynamically adjust compute capacity in real time.

During periods of peak load, the platform automatically scales out by adding service instances. When demand decreases, it scales in to reduce cost and resource usage. This ensures:

Smooth handling of traffic spikes
No customer impact during sudden increases in workload
Efficient use of cloud resources
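
As an illustration of how a scale-out or scale-in decision can be computed, the sketch below applies the ratio rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas x current metric / target metric). This document does not state which autoscaler, metrics, or limits the Knovvu platform uses, so the function name, the replica bounds, and the 10% tolerance (the Kubernetes default) are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return a replica count using the HPA-style ratio rule.

    All parameter defaults are illustrative assumptions, not Knovvu settings.
    """
    ratio = current_metric / target_metric
    # If the metric is within the tolerance band around the target,
    # keep the current count to avoid scaling back and forth.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Peak load: 4 pods averaging 90% CPU against a 60% target -> scale out.
print(desired_replicas(4, 90, 60))
# Quiet period: 6 pods averaging 20% CPU against a 60% target -> scale in.
print(desired_replicas(6, 20, 60))
```

The clamp to max_replicas is what keeps a sudden spike from requesting unbounded capacity; the tolerance band prevents small metric fluctuations from triggering constant rescaling.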

Why Horizontal Scaling?

Vertical scaling (adding resources to a single server) has inherent limits: a single machine can only grow so large, and performance does not increase linearly with the resources added. Instead, the Knovvu platform uses horizontal scaling, which adds more pods, that is, more instances of a microservice.

Load Balancing Across Services

Every Knovvu service is fronted by an AWS Application Load Balancer (ALB) or an internal Kubernetes load-balancing mechanism. Incoming traffic is distributed evenly across multiple service instances.

This provides:

Optimal performance
Even workload distribution
Protection against bottlenecks or hotspots
Seamless failover when one instance becomes unhealthy
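
The distribution and failover behavior listed above can be pictured with a small round-robin balancer that skips instances marked unhealthy. This is a simplified sketch, not the ALB or Kubernetes implementation; the class name, method names, and pod names are hypothetical, and real load balancers determine health through periodic health checks rather than explicit calls.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer with health-aware failover (illustrative only)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0  # index of the next instance to try

    def mark_unhealthy(self, instance):
        self.healthy.discard(instance)

    def mark_healthy(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def pick(self):
        """Return the next healthy instance, skipping unhealthy ones."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


lb = RoundRobinBalancer(["pod-a", "pod-b", "pod-c"])
print([lb.pick() for _ in range(4)])  # requests rotate across all three pods
lb.mark_unhealthy("pod-b")
print([lb.pick() for _ in range(3)])  # pod-b is skipped until it recovers
```

Because unhealthy instances are skipped at selection time, callers never see the failure; traffic simply shifts to the remaining instances.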

Decoupled Architecture Using Message Queues

To enable independent scalability, the Knovvu platform uses message queues to decouple microservices. This allows each component to scale at its own pace, based on demand.
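
The decoupling idea can be sketched with an in-process queue: the producer only knows about the queue, so the number of consumers can change without touching the producer. This example uses Python's standard queue and threading modules as a stand-in for a real message broker; the document does not name the queueing technology used by the platform, and the function name and the uppercase "processing" step are placeholders.

```python
import queue
import threading


def run_pipeline(messages, worker_count):
    """Process messages through a queue with a configurable number of consumers."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def consumer():
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this consumer
                work.task_done()
                return
            with lock:
                results.append(item.upper())  # placeholder for real processing
            work.task_done()

    workers = [threading.Thread(target=consumer) for _ in range(worker_count)]
    for w in workers:
        w.start()
    # The producer side: it enqueues work and never references the consumers.
    for m in messages:
        work.put(m)
    for _ in workers:
        work.put(None)  # one sentinel per consumer
    work.join()
    for w in workers:
        w.join()
    return sorted(results)


# The same producer code works whether one consumer or three are running.
print(run_pipeline(["a", "b", "c"], worker_count=1))
print(run_pipeline(["a", "b", "c"], worker_count=3))
```

Only worker_count changes between the two calls, which mirrors how a consumer service can be scaled out independently of the services that publish to the queue.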

Automatically Scalable Managed Services

The platform also leverages AWS managed services that scale automatically with usage. For example, Amazon S3 scales seamlessly to handle large data volumes and high request rates, with no configuration required. By relying on these services wherever possible, the platform minimizes operational overhead while maximizing reliability and performance.