Features

Document Number: KN. GU.25.EN
Revision Number: Rev34
Revision Date: 13.04.2026

Language Coverage

  • Support for 99+ Languages: Recognition is available in more than 99 languages, enabling organizations to build speech-enabled experiences across a wide range of geographies, markets, and customer segments.

  • Multilingual and Bilingual Models: Models that cover two or more languages draw on knowledge shared across those languages, helping improve performance in multilingual and mixed-language scenarios.

  • Accent Coverage in a Single Model: Multiple accents of the same language can be handled within one model, reducing operational complexity and eliminating the need to manage separate accent-specific models.


Model Flexibility and Technology Foundation

  • Support for Hosting Different Model Foundations: Different speech recognition model foundations, including Whisper-based and Dolphin-based models, can be hosted to better match language, use case, and performance expectations.

  • Flexible Architecture for Evolving Speech Technologies: A flexible technology foundation makes it possible to adapt recognition solutions as speech technologies evolve, rather than being limited to a single fixed model strategy.


Vocabulary and Model Adaptation

  • Custom Word Support: Domain-specific, business-specific, or customer-specific words that are not currently included in the language model can be added upon request.

  • Fine-Tuning with Real Customer Data: Models can be improved through fine-tuning with real customer data, allowing better adaptation to customer terminology, speaking habits, domain language, and real-life acoustic conditions.

  • Domain Adaptation for Industry Needs: Sector-specific terminology and enterprise scenarios can be addressed through tailored adaptation, helping improve recognition quality in areas such as contact centers, banking, telecom, and public services.


Language Model Development

  • Model Creation for Low-Resource Languages: Dedicated speech recognition models can be developed even for languages with little or no ready training data by working with customer-provided or specially collected datasets.

  • Custom Data-Based Model Training: For languages without an existing mature model, a custom model can be developed when sufficient language data is available. This typically requires a large-scale dataset of 200+ hours, depending on the language and target quality.

  • High Accuracy Potential for New Language Models: With adequate, high-quality training data, custom speech recognition models can reach success rates of 85% or higher for previously unsupported or low-resource languages.
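
The success-rate figure above is not tied to a specific metric in this document. One common way to score a speech recognition model is word accuracy, defined as 1 minus the word error rate (WER). The sketch below assumes that interpretation; the function names `word_error_rate` and `word_accuracy` are illustrative and not part of the product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition of 'success rate': 1 - WER, floored at 0."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))
```

In practice the score would be averaged over a held-out test set recorded under the target acoustic conditions, and text normalization (casing, punctuation, numerals) applied before scoring can change the result noticeably.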


Audio and Input Support

  • Wide Audio Format Compatibility: Audio conversion is handled through ffmpeg, which supports major formats such as G.729, MP3, MP4, WAV, and Opus.

  • Flexible Audio Input Handling: Different audio sources and integration flows can be accommodated, making it easier to connect speech recognition into existing telephony and application environments.
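
As an illustration of ffmpeg-based conversion, the sketch below builds an ffmpeg command that normalizes an input file to mono 16-bit PCM WAV. The 16 kHz target, the file names, and the helper names `build_ffmpeg_command` and `convert` are assumptions for the example; the conversion settings the product actually uses are not documented here.

```python
import subprocess


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts `src` to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input file; ffmpeg usually detects the format
        "-vn",                    # ignore any video stream (e.g. in MP4 input)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate (assumed 16 kHz)
        "-c:a", "pcm_s16le",      # encode as 16-bit little-endian PCM
        dst,
    ]


def convert(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `convert("call.mp3", "call.wav")` requires ffmpeg to be installed and on the PATH. Headerless inputs, such as raw G.729 streams, may also need an explicit input format option before `-i`.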


Text Processing and Output Control

  • Numeral and Entity Formatting: Spoken numerals and selected entities are converted into normalized written forms, improving readability and usability in downstream systems.

  • Masking of Sensitive Information: Sensitive information can be masked in recognition output through user-defined regex rules, helping protect transcribed content.

  • Structured Recognition Output: Transcribed speech can be delivered in a format suitable for automation, analytics, reporting, and operational workflows.

  • Time-Aligned Transcription: Transcription output can be provided together with timing information, supporting use cases such as subtitle generation, audio-text synchronization, search within recordings, analytics, and detailed post-processing.
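
Regex-based masking and time-aligned output can be combined when producing subtitles. The following sketch assumes the recognizer returns segments with start and end times in seconds; the `Segment` type, the two masking rules, and the function names are hypothetical and only illustrate the idea. The output follows the SubRip (SRT) subtitle layout.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


# Hypothetical user-defined rules: (compiled pattern, replacement label).
MASK_RULES = [
    (re.compile(r"\b\d{16}\b"), "[CARD]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
]


def mask(text: str) -> str:
    """Apply every masking rule in order and return the masked text."""
    for pattern, replacement in MASK_RULES:
        text = pattern.sub(replacement, text)
    return text


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp form used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render masked segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{timing}\n{mask(seg.text)}")
    return "\n\n".join(cues) + "\n"
```

Masking is applied just before rendering, so the unmasked text never reaches the subtitle file. Real rules would need to be tuned to how the recognizer formats numbers, since a card number spoken digit by digit may not appear as one 16-digit token.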


Integration and Deployment

  • Easy Integration with APIs and SDKs: User-friendly APIs and SDKs help simplify integration into existing applications and platforms, reducing implementation effort.

  • Flexible Integration Options: Different integration methods and communication structures can be used to align with varying architectural and operational needs.

  • Cloud and On-Prem Deployment: The solution can be deployed in cloud or on-premises environments, depending on infrastructure, security, and compliance requirements.