This page provides a comprehensive list of languages supported by Knovvu Speech Recognition.
Single Models
⬇️We currently support the following languages through the endpoint:
Arabic, Azerbaijani, Croatian, Czech, Danish, Dutch, English, Farsi, Finnish, Flemish, French, German, Greek, Hindi, Italian, Japanese, Kazakh, Kurmanji-Kurdish, Korean, Latvian, Malay, Mandarin, Mongolian, Norwegian, Pashto, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Tagalog, Tamil, Turkish, Ukrainian, Urdu, Welsh.
Multilingual Models
Multilingual models enhance the user experience for speakers of multiple languages. For example, a voice assistant capable of understanding and responding in multiple languages can be more useful and engaging for users compared to one limited to a single language.
⬇️We currently support the following multilingual models through the endpoint:
Model Name | Languages Supported |
---|---|
EnglishTurkish | English, Turkish |
DutchFrench | Dutch, French |
ArabicEnglish | Arabic, English |
ArabicEnglishTurkish | Arabic, English, Turkish |
EnglishSpanish | English, Spanish |
LatvianRussian | Latvian, Russian |
Mena | Arabic, English, French, Urdu |
NorthAmerica | English, French, Portuguese, Spanish |
Europe | English, French, Portuguese, Spanish, Dutch, German, Italian |
FourLanguagesMulti | English, Arabic, French, Spanish |
SixLanguagesMulti | English, Turkish, Arabic, Russian, French, Spanish |
Large | Arabic, Danish, Dutch, English, Finnish, French, German, Hindi, Italian, Latvian, Mandarin, Norwegian, Portuguese, Russian, Spanish, Swedish, Tagalog, Turkish, Urdu |
For example, to transcribe mixed speech in English and Spanish, you can use the NorthAmerica or FourLanguagesMulti model. Similarly, to transcribe mixed speech in Arabic, French, and Urdu, you can use the Mena or Large model, since both cover all three languages.
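The selection rule above (pick a model whose language list contains every language in the audio) can be sketched as a small lookup. This is only an illustration: the dictionary is transcribed from the table above, and `models_covering` is a hypothetical helper, not part of the Knovvu API. Ordering the results from narrowest to broadest is a convenience of this sketch, not a documented recommendation.

```python
# Language coverage transcribed from the multilingual model table above.
MULTILINGUAL_MODELS = {
    "EnglishTurkish": {"English", "Turkish"},
    "DutchFrench": {"Dutch", "French"},
    "ArabicEnglish": {"Arabic", "English"},
    "ArabicEnglishTurkish": {"Arabic", "English", "Turkish"},
    "EnglishSpanish": {"English", "Spanish"},
    "LatvianRussian": {"Latvian", "Russian"},
    "Mena": {"Arabic", "English", "French", "Urdu"},
    "NorthAmerica": {"English", "French", "Portuguese", "Spanish"},
    "Europe": {"English", "French", "Portuguese", "Spanish",
               "Dutch", "German", "Italian"},
    "FourLanguagesMulti": {"English", "Arabic", "French", "Spanish"},
    "SixLanguagesMulti": {"English", "Turkish", "Arabic", "Russian",
                          "French", "Spanish"},
    "Large": {"Arabic", "Danish", "Dutch", "English", "Finnish", "French",
              "German", "Hindi", "Italian", "Latvian", "Mandarin",
              "Norwegian", "Portuguese", "Russian", "Spanish", "Swedish",
              "Tagalog", "Turkish", "Urdu"},
}


def models_covering(languages):
    """Return the models that support every requested language (hypothetical helper)."""
    wanted = set(languages)
    matches = [name for name, supported in MULTILINGUAL_MODELS.items()
               if wanted <= supported]
    # Fewest supported languages first; ties are broken alphabetically.
    return sorted(matches, key=lambda name: (len(MULTILINGUAL_MODELS[name]), name))


print(models_covering(["English", "Spanish"]))
# ['EnglishSpanish', 'FourLanguagesMulti', 'NorthAmerica', 'SixLanguagesMulti', 'Europe', 'Large']
```

An empty result means no multilingual model in the table covers the full combination.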
Request Sample with Curl:
curl --location '{{Address}}/v1/speech/dictation/request' \
--header 'ModelName: FourLanguagesMulti' \
--header 'ModelVersion: 2' \
--header 'Content-Type: audio/wave' \
--header 'Authorization: Bearer Token' \
--data-binary '@audio file.wav'
You can specify the language within a recognition request's ModelName parameter. For detailed instructions on sending a recognition request and specifying the language for transcription, please refer to the API Reference on performing speech recognition.
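For callers working in Python rather than curl, the same request might be assembled as follows. This is a minimal sketch using only the standard library: the path and header names are taken from the curl sample, while `build_dictation_request`, the example address, and the token value are placeholders assumed for illustration. The response format is not described on this page, so the sketch stops at building the request.

```python
import urllib.request


def build_dictation_request(address, token, model_name, audio_bytes, model_version="2"):
    """Build (but do not send) a POST request mirroring the curl sample."""
    return urllib.request.Request(
        url=f"{address}/v1/speech/dictation/request",
        data=audio_bytes,  # raw bytes of the audio file
        headers={
            "ModelName": model_name,        # selects the language or multilingual model
            "ModelVersion": model_version,  # "2" as in the curl sample
            "Content-Type": "audio/wave",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder address and token; substitute the values for your deployment.
request = build_dictation_request(
    "https://example.invalid", "Token", "FourLanguagesMulti", b"RIFF")
print(request.get_method(), request.full_url)
```

To send the request, read the audio with `open(path, "rb").read()` and pass the built object to `urllib.request.urlopen`.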
Whisper Models
As part of Knovvu's commitment to providing comprehensive speech recognition solutions, we have integrated Whisper models into Knovvu SR. This extends Knovvu SR to additional languages, particularly those of regions or dialects we do not directly support.
Performance and Limitations
It is important to note that while Whisper models enable access to additional languages, Sestek does not assume responsibility for the performance or accuracy of these third-party models. Results may vary based on language and environmental factors, and Whisper models may not achieve the same performance standards as Knovvu’s core SR technology. We recommend testing these models within specific environments to ensure they meet your project requirements.
Supported Languages for Whisper Models
⬇️We currently support the following languages through the endpoint:
Afrikaans, Amharic, Arabic, Assamese, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Tibetan, Breton, Bosnian, Catalan, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Basque, Persian, Finnish, Faroese, French, Galician, Gujarati, Hawaiian, Hausa, Hebrew, Hindi, Croatian, Haitian Creole, Hungarian, Armenian, Indonesian, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, Khmer, Kannada, Korean, Latin, Luxembourgish, Lingala, Lao, Lithuanian, Latvian, Malagasy, Maori, Macedonian, Malayalam, Mongolian, Marathi, Malay, Maltese, Burmese, Nepali, Dutch, Norwegian Nynorsk, Norwegian, Occitan, Punjabi, Polish, Pashto, Portuguese, Romanian, Russian, Sanskrit, Sindhi, Sinhala, Slovak, Slovenian, Shona, Somali, Albanian, Serbian, Sundanese, Swedish, Swahili, Tamil, Telugu, Tajik, Thai, Turkmen, Tagalog, Turkish, Tatar, Ukrainian, Urdu, Uzbek, Vietnamese, Yiddish, Yoruba, Yue Chinese, Chinese.
Model Options
To accommodate diverse performance needs, Knovvu offers two versions of Whisper models: WhisperTiny and WhisperTurbo. Each model provides a distinct balance of speed and recognition accuracy:
- WhisperTiny: This is a smaller, lightweight model designed for faster performance. While WhisperTiny provides quick processing speeds, its recognition accuracy may be limited in comparison to larger models. This option is ideal for applications where speed is prioritized over the highest accuracy levels, or where computing resources are limited.
- WhisperTurbo: Equivalent to the advanced "large v3 turbo" Whisper model, WhisperTurbo represents the latest in Whisper technology, offering improved recognition accuracy. However, this model requires more processing power and operates at a slower speed than WhisperTiny. WhisperTurbo is suited for applications that demand high accuracy, especially in complex or noisy environments, but can accommodate slower processing times.
Integrating Whisper Models
To utilize Whisper models in the Knovvu SR API, users simply need to specify the desired model in the ModelName parameter. Enter either WhisperTiny or WhisperTurbo as the value. There is no need to specify the language, as Whisper models automatically detect and process supported languages.
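The choice between the two Whisper models and the resulting request headers could be sketched as below. The mapping from a "speed" or "accuracy" priority to a model name follows the Model Options descriptions; `whisper_headers` is a hypothetical helper, and reusing the Content-Type and Authorization headers from the multilingual curl sample is an assumption. ModelVersion is left out because this page does not say whether Whisper models require it.

```python
# Priority-to-model mapping based on the Model Options descriptions above.
WHISPER_MODELS = {
    "speed": "WhisperTiny",      # smaller, faster, lower accuracy
    "accuracy": "WhisperTurbo",  # slower, needs more processing power
}


def whisper_headers(priority, token):
    """Return request headers for the Whisper model matching the priority."""
    if priority not in WHISPER_MODELS:
        raise ValueError(f"priority must be one of {sorted(WHISPER_MODELS)}")
    return {
        "ModelName": WHISPER_MODELS[priority],
        # No language is specified: Whisper models detect it automatically.
        "Content-Type": "audio/wave",        # assumed, as in the curl sample
        "Authorization": f"Bearer {token}",  # assumed, as in the curl sample
    }


print(whisper_headers("accuracy", "Token")["ModelName"])  # WhisperTurbo
```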