SESTEK SR provides two primary recognition methods, each designed for different use cases. Choosing the right method depends on the nature of your application, the vocabulary it needs to handle, and its accuracy and speed requirements.
Speech Recognition with Grammar
Grammar-based recognition constrains the recognizer to a predefined set of phrases or structures. The vocabulary, sentence structure, and acceptable word combinations are explicitly defined in advance, and the recognizer only considers inputs that match the grammar.
This makes grammar-based recognition highly accurate and fast for applications with well-defined inputs. Because the search space is limited, there is less room for misrecognition, and processing is significantly faster than open-ended dictation.
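The effect of a restricted search space can be illustrated with a small sketch. This is not the SESTEK SR API: the toy grammar, the `expand_grammar` and `recognize` helpers, and the scored candidates are all hypothetical stand-ins for what a real engine does internally. It only shows the principle that a candidate outside the grammar is never returned, even when it scores higher.

```python
from itertools import product

# Hypothetical toy grammar of the form: <action> the <device>
ACTIONS = ("turn on", "turn off")
DEVICES = ("lights", "heater")


def expand_grammar(actions, devices):
    """Enumerate every phrase the toy grammar accepts."""
    return {f"{action} the {device}" for action, device in product(actions, devices)}


def recognize(hypotheses, allowed):
    """Return the best-scoring hypothesis accepted by the grammar, or None.

    `hypotheses` is a list of (text, score) pairs standing in for the
    candidates a recognition engine would produce.
    """
    in_grammar = [(text, score) for text, score in hypotheses if text in allowed]
    if not in_grammar:
        return None  # no-match: nothing fits the grammar
    return max(in_grammar, key=lambda pair: pair[1])[0]


allowed = expand_grammar(ACTIONS, DEVICES)

# The higher-scoring candidate is not in the grammar, so it is discarded.
candidates = [("turn on the flights", 0.62), ("turn on the lights", 0.58)]
print(recognize(candidates, allowed))  # turn on the lights
```

With only four accepted phrases there is little for the recognizer to confuse, which is the intuition behind both the accuracy and the speed advantage described above.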
Use cases:
- IVR systems - automated customer service where users select options by voice
- Command-based applications - smart home controls, call routing, automated interactions
- Voice-activated interfaces - systems where users issue specific, predictable instructions
- Security applications - strict phrase recognition for authentication prompts
Available for both cloud and on-premises deployments.
Speech Dictation with Language Model
Dictation mode enables free-form speech recognition. Users can speak naturally without constraints on vocabulary or sentence structure. The SESTEK language model, trained on diverse speech data, handles a broad range of linguistic variation and transcribes spoken language into text.
Unlike grammar-based recognition, dictation mode does not restrict input to predefined phrases. This flexibility comes with trade-offs: recognition is slower and more prone to errors, particularly in noisy environments or when domain-specific jargon is involved. Context biasing can help mitigate this for specialized vocabulary.
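As a rough illustration of the idea behind context biasing, the sketch below rescores a list of candidate transcripts so that those containing known domain phrases rank higher. It is a simplified assumption, not a description of how SESTEK SR implements the feature: the `rescore` function, the `boost` value, and the sample scores are hypothetical, and production engines typically apply biasing inside the decoder rather than as a post-processing step.

```python
def rescore(hypotheses, bias_phrases, boost=0.1):
    """Add `boost` to a hypothesis score for each bias phrase it contains.

    `hypotheses` is a list of (text, score) pairs; the result is sorted
    from highest to lowest adjusted score.
    """
    rescored = []
    for text, score in hypotheses:
        hits = sum(1 for phrase in bias_phrases if phrase in text.lower())
        rescored.append((text, score + boost * hits))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates: the misrecognized version scores slightly higher.
n_best = [
    ("the patient has a trial fibrillation", 0.71),
    ("the patient has atrial fibrillation", 0.68),
]

best_text, best_score = rescore(n_best, ["atrial fibrillation"])[0]
print(best_text)  # the patient has atrial fibrillation
```

Without any bias phrases the first candidate would win; supplying the domain term is enough to promote the correct transcript.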
Use cases:
- Speech-to-text applications - transcribing long-form speech such as meeting recordings or interviews
- Voice assistants - hands-free interaction with chatbots and AI-driven support systems
- Medical and legal transcription - recording dictated notes in professional environments
- IVR with NLU - conversational AI in customer service applications
Available for both cloud and on-premises deployments.
Choosing the Right Method

| Factor | Grammar-based | Dictation |
|---|---|---|
| Input type | Fixed phrases and commands | Open-ended, natural speech |
| Accuracy | High - vocabulary is restricted | Moderate - broad vocabulary increases error risk |
| Processing speed | Fast - limited search space | Slower - much larger search space |
| Best for | IVR, voice commands, authentication | Transcription, NLU, dictation |

