Version 1.82
17 April, 2024
Important Note
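To enable autoscaling, set the AppService__Serving__HorizontalScaling__Enabled flag to true during deployment and adjust the default settings below to match the desired autoscaling configuration. Because MinReplicas and MaxReplicas both default to 1, MaxReplicas must be raised before any scale-up can take place.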
Default Settings:
AppService__Serving__HorizontalScaling__Enabled:
AppService__Serving__HorizontalScaling__MinReplicas: 1
AppService__Serving__HorizontalScaling__MaxReplicas: 1
AppService__Serving__HorizontalScaling__AverageUtilization: 50
AppService__Serving__HorizontalScaling__ScaleDown__Enabled: true
AppService__Serving__HorizontalScaling__ScaleDown__StabilizationSeconds: 300
AppService__Serving__HorizontalScaling__ScaleDown__PeriodSeconds: 300
AppService__Serving__HorizontalScaling__ScaleDown__ValuePods: 1
AppService__Serving__HorizontalScaling__ScaleUp__StabilizationSeconds: 30
AppService__Serving__HorizontalScaling__ScaleUp__PeriodSeconds: 15
AppService__Serving__HorizontalScaling__ScaleUp__ValuePods: 1
AppService__Serving__HorizontalScaling__ScaleUp__ValuePercentage: 40
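The setting names above resemble the fields of a Kubernetes HorizontalPodAutoscaler (autoscaling/v2). The release notes do not say how the settings are consumed, so the following is only a minimal sketch of one plausible mapping. The function name build_hpa_spec, the choice of CPU as the scaled metric, and the mapping itself are assumptions made for illustration, not the product's documented behavior.

```python
# Sketch only: maps the HorizontalScaling settings onto a Kubernetes
# autoscaling/v2 HPA spec. The mapping and the CPU metric are assumptions.
PREFIX = "AppService__Serving__HorizontalScaling__"

# Defaults copied from the list above. "Enabled" has no listed value,
# so it is deliberately left out.
DEFAULTS = {
    "MinReplicas": "1",
    "MaxReplicas": "1",
    "AverageUtilization": "50",
    "ScaleDown__Enabled": "true",
    "ScaleDown__StabilizationSeconds": "300",
    "ScaleDown__PeriodSeconds": "300",
    "ScaleDown__ValuePods": "1",
    "ScaleUp__StabilizationSeconds": "30",
    "ScaleUp__PeriodSeconds": "15",
    "ScaleUp__ValuePods": "1",
    "ScaleUp__ValuePercentage": "40",
}


def build_hpa_spec(overrides=None):
    """Return an HPA spec dict built from the defaults plus any overrides.

    `overrides` uses the full setting names, e.g.
    {"AppService__Serving__HorizontalScaling__MaxReplicas": "5"}.
    """
    overrides = overrides or {}

    def get(key):
        return overrides.get(PREFIX + key, DEFAULTS[key])

    def num(key):
        return int(get(key))

    scale_down = {
        "stabilizationWindowSeconds": num("ScaleDown__StabilizationSeconds"),
        "policies": [
            {
                "type": "Pods",
                "value": num("ScaleDown__ValuePods"),
                "periodSeconds": num("ScaleDown__PeriodSeconds"),
            }
        ],
    }
    # Assumption: ScaleDown__Enabled=false corresponds to selectPolicy: Disabled.
    if get("ScaleDown__Enabled").lower() != "true":
        scale_down["selectPolicy"] = "Disabled"

    period_up = num("ScaleUp__PeriodSeconds")
    scale_up = {
        "stabilizationWindowSeconds": num("ScaleUp__StabilizationSeconds"),
        "policies": [
            {"type": "Pods", "value": num("ScaleUp__ValuePods"),
             "periodSeconds": period_up},
            {"type": "Percent", "value": num("ScaleUp__ValuePercentage"),
             "periodSeconds": period_up},
        ],
    }

    return {
        "minReplicas": num("MinReplicas"),
        "maxReplicas": num("MaxReplicas"),
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",  # assumed metric; the notes do not name one
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": num("AverageUtilization"),
                    },
                },
            }
        ],
        "behavior": {"scaleDown": scale_down, "scaleUp": scale_up},
    }
```

Calling build_hpa_spec() with no overrides yields minReplicas and maxReplicas of 1, which illustrates why the defaults alone leave the replica count fixed. Passing an override such as {PREFIX + "MaxReplicas": "5"} widens the range the autoscaler may use.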
Improvements
Horizontal Autoscale Capability for AI Modules:
Horizontal autoscaling has been added to all AI modules, eliminating the need to configure replica counts manually. This configurable feature adjusts resources dynamically based on workload demand, optimizing performance and resource utilization.
Reducing Dependency on Nexus:
Module information (metadata) is retrieved from Nexus and written to the database once, initially. When a new tenant is created after that, the existing metadata is fetched from the database and copied for that tenant rather than being requested from Nexus again.
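The tenant-creation flow described above can be sketched as a fetch-once, copy-afterwards pattern. This is an illustrative in-memory model only: the class ModuleMetadataStore, its method names, and the fetch_from_nexus callable are hypothetical, and a real deployment would use the product's actual database and Nexus client.

```python
from copy import deepcopy


class ModuleMetadataStore:
    """Hypothetical stand-in for the database holding module metadata."""

    def __init__(self, fetch_from_nexus):
        # fetch_from_nexus: a callable returning module metadata as a dict.
        # It stands in for the real Nexus request, which is not shown here.
        self._fetch_from_nexus = fetch_from_nexus
        self._baseline = None   # metadata written to the "DB" on first use
        self._per_tenant = {}   # tenant_id -> that tenant's copy

    def create_tenant(self, tenant_id):
        # Nexus is contacted only if no metadata has been stored yet.
        if self._baseline is None:
            self._baseline = self._fetch_from_nexus()
        # Every tenant receives its own copy of the stored metadata.
        self._per_tenant[tenant_id] = deepcopy(self._baseline)
        return self._per_tenant[tenant_id]


if __name__ == "__main__":
    calls = []

    def fake_nexus():
        # Records each call so the number of Nexus requests can be inspected.
        calls.append(1)
        return {"modules": ["module-a", "module-b"]}

    store = ModuleMetadataStore(fake_nexus)
    store.create_tenant("tenant-1")
    store.create_tenant("tenant-2")
    print("Nexus requests:", len(calls))
```

Using a deep copy keeps each tenant's metadata independent, so a change made for one tenant does not leak into the stored baseline or into other tenants.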