Features

Document Number	Revision Number	Revision Date
KN. GU.25.EN	Rev34	13.04.2026

Support for 99+ Languages: More than 99 languages are available, enabling organizations to build speech-enabled experiences for a wide range of geographies, markets, and customer segments.
Multilingual and Bilingual Models: Multilingual and bilingual model options leverage knowledge from multiple languages, helping improve performance in multilingual and mixed-language scenarios.
Accent Coverage in a Single Model: Multiple accents of the same language can be handled within one model, reducing operational complexity and eliminating the need to manage separate accent-specific models.

Support for Hosting Different Model Foundations: Different speech recognition model foundations, such as Whisper-based and Dolphin-based models, can be hosted to match the target language, use case, and performance expectations.
Flexible Architecture for Evolving Speech Technologies: A flexible technology foundation makes it possible to adapt recognition solutions as speech technologies evolve, rather than being limited to a single fixed model strategy.

Custom Word Support: Domain-specific, business-specific, or customer-specific words that are not currently included in the language model can be added upon request.
Fine-Tuning with Real Customer Data: Models can be improved through fine-tuning with real customer data, allowing better adaptation to customer terminology, speaking habits, domain language, and real-life acoustic conditions.
Domain Adaptation for Industry Needs: Sector-specific terminology and enterprise scenarios can be addressed through tailored adaptation, helping improve recognition quality in areas such as contact centers, banking, telecom, and public services.

Model Creation for Low-Resource Languages: Dedicated speech recognition models can be developed even for languages with little or no readily available training data by working with customer-provided or specially collected datasets.
Custom Data-Based Model Training: For languages without a mature existing model, a custom model can be trained once enough language data is available. This typically means a large-scale dataset of 200+ hours, depending on the language and target quality.
High Accuracy Potential for New Language Models: With adequate, high-quality training data, custom speech recognition models can reach success rates of 85% or higher for previously unsupported or low-resource languages.
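
The source does not define how the success rate for a new language model is measured. One common way to quantify recognition quality is word accuracy, computed as 1 minus the word error rate (WER) against a reference transcript. The sketch below assumes that metric purely for illustration; the function names `word_error_rate` and `word_accuracy` are hypothetical and are not part of the product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missing)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # substitution or exact match
            ))
        previous = current
    return previous[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - WER, clamped at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


if __name__ == "__main__":
    print(word_accuracy("the cat sat on the mat", "the cat sat on mat"))
```

Under this assumed metric, a model would meet an 85% target when the accuracy averaged over a representative test set is at least 0.85.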

Wide Audio Format Compatibility: Audio conversion is handled through ffmpeg, with support for major formats such as G729, MP3, MP4, WAV, and Opus.
Flexible Audio Input Handling: Different audio sources and integration flows can be accommodated, making it easier to connect speech recognition into existing telephony and application environments.

Numeral & Entity Formatting: Spoken numerals and selected entities are converted into normalized written forms (for example, digits instead of spelled-out numbers), which improves readability and downstream system usability.
Masking of Sensitive Information: Sensitive information can be masked in recognition output through user-defined regex rules, helping protect transcribed content.
Structured Recognition Output: Transcribed speech can be delivered in a format suitable for automation, analytics, reporting, and operational workflows.
Time-Aligned Transcription: Transcription output can be provided together with timing information, supporting use cases such as subtitle generation, audio-text synchronization, search within recordings, analytics, and detailed post-processing.
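
The regex-based masking and time-aligned output described above can be sketched together as follows: user-defined rules are applied to each transcribed segment, and the timed segments are then rendered as SRT subtitles. The rule list `MASK_RULES`, the `Segment` structure, and the helper names are hypothetical; the product's actual rule syntax and output schema are not given in the source.

```python
import re
from dataclasses import dataclass

# Hypothetical user-defined masking rules: (compiled pattern, replacement).
MASK_RULES = [
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),           # 13-16 digits
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),      # e-mail address
]


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def mask(text: str, rules=MASK_RULES) -> str:
    """Apply every masking rule in order and return the masked text."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render masked, time-aligned segments as SRT subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{mask(seg.text)}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "my card is 4111 1111 1111 1111"),
        Segment(2.5, 4.0, "mail jane.doe@example.com please"),
    ]
    print(to_srt(demo))
```

Masking is applied before rendering so that sensitive values never reach the subtitle output; the two example patterns are deliberately simple and would need tuning for real data.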

Easy Integration with APIs and SDKs: User-friendly APIs and SDKs help simplify integration into existing applications and platforms, reducing implementation effort.
Flexible Integration Options: Different integration methods and communication structures can be used to align with varying architectural and operational needs.
Cloud and On-Prem Deployment: Deployment can be made in cloud or on-prem environments, depending on infrastructure, security, and compliance requirements.
