Performs language identification on provided audio.
Parameters:
name | description | default |
---|---|---|
Address | Address of Language Identification service that will be used. | http://core-sr |
IgnoreSslErrors | Ignores any certificate errors if the SestekSR address uses https. | false |
ConfidenceThreshold | Language detection results with a confidence lower than this value are ignored. The value should be between 0.0 and 1.0. | 0.75 |
Languages | Semicolon-separated list of languages that the language detection service treats as candidates. If left empty, all languages known to the service are evaluated. If your target audience is limited, it is good practice to list only the possible languages, because this improves detection accuracy. E.g. tr-TR;en-US | |
MinimumSegmentDuration(ms) | Sets the minimum duration of speech to be sent to the Language Identifier service. | 3000 |
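The parameter constraints in the table can be illustrated with a small validation helper. This is only a sketch: the `LanguageIdentifierConfig` class, its field names, and its methods are hypothetical and are not part of the product. Only the default values and the semicolon-separated format come from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LanguageIdentifierConfig:
    """Hypothetical container mirroring the node parameters (not a product API)."""
    address: str = "http://core-sr"
    ignore_ssl_errors: bool = False
    confidence_threshold: float = 0.75
    languages: str = ""  # semicolon-separated, e.g. "tr-TR;en-US"
    minimum_segment_duration_ms: int = 3000

    def candidate_languages(self) -> List[str]:
        """Split the Languages value; an empty list means 'evaluate all known languages'."""
        return [code.strip() for code in self.languages.split(";") if code.strip()]

    def validate(self) -> None:
        """Check the documented range for ConfidenceThreshold."""
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("ConfidenceThreshold must be between 0.0 and 1.0")
        # Assumption: a negative duration is meaningless, so it is rejected here.
        if self.minimum_segment_duration_ms < 0:
            raise ValueError("MinimumSegmentDuration(ms) must not be negative")
```

For instance, `LanguageIdentifierConfig(languages="tr-TR;en-US").candidate_languages()` yields the two candidate codes, while the default empty value yields an empty list.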
Inputs
Audio:
Accepts audio from a single channel.
Important Note
The audio must pass through a VAD or an Audio Segmenter before it is sent to this node. Connecting the Entry node's audio output directly to the Language Identifier produces no output.
Events:
name | description | known nodes that generate this event |
---|---|---|
Start Of Segment | Signals the start of a speech segment. | Audio Segmenter |
Speech Started | Signals the start of speech in realtime. | Vad |
End Of Segment | Signals the end of a speech segment. | Audio Segmenter |
Speech Ended | Signals the end of speech in realtime. | Vad |
Outputs
Audio:
One Channel Segmented Audio
Events:
name | description |
---|---|
Language Change | Raised when language change has been detected. |
Start Of Segment | Signals the start of a speech segment. |
End Of Segment | Signals the end of a speech segment. |
Remarks:
- This node produces events that can change the behavior of other nodes. For example, the SR Http node switches its model when it receives a Language Change event.
- For a Language Change event to be raised, the received audio segment must be longer than MinimumSegmentDuration(ms), and the confidence of identification must be higher than ConfidenceThreshold.
- The Language Identifier service performs best with segments that are 5 seconds long. Segments that diverge from this length will have lower confidence scores.
Important Note
Language Identifier behaves as a pass-through node for audio segments. This means nodes such as SR Http can be connected directly to the output of Language Identifier, which ensures that each audio segment has the chance to change the transcription language.
Requirements:
Project Requirements:
- The node needs to be connected to a VAD or an Audio Segmenter.
Detailed Workflow
- The node waits for a whole audio segment.
- When a "Speech Ended" or "End Of Segment" event is received, the node checks whether the segment is longer than MinimumSegmentDuration(ms).
- If the segment is longer than MinimumSegmentDuration(ms), it is sent to the Audio Language Identifier service.
- If the segment is shorter than MinimumSegmentDuration(ms), the node just outputs the audio segment.
- If the confidence of identification is lower than ConfidenceThreshold, or the identified language is the same as before, the node just outputs the audio segment.
- If the identified language differs from the current language, the node sends the Language Change event and then outputs the audio segment.
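The decision steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the node's actual implementation: `Identification`, `process_segment`, and the `identify` callback are hypothetical names, the service call is represented by a plain callable, and the handling of the exact boundary values (a segment exactly as long as the minimum, a confidence exactly equal to the threshold) is assumed because the description does not specify it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Identification:
    """Hypothetical result returned by the language identification service."""
    language: str
    confidence: float


def process_segment(
    duration_ms: int,
    current_language: str,
    identify: Callable[[], Identification],
    min_duration_ms: int = 3000,
    threshold: float = 0.75,
) -> Tuple[List[str], str]:
    """Return the outputs emitted for one finished segment and the new current language."""
    # Short segments are passed through without calling the service.
    # Assumption: a segment exactly min_duration_ms long counts as too short.
    if duration_ms <= min_duration_ms:
        return ["Audio Segment"], current_language

    result = identify()  # stands in for the call to the identification service

    # Low-confidence or unchanged results do not raise an event.
    if result.confidence < threshold or result.language == current_language:
        return ["Audio Segment"], current_language

    # Assumption: the event is emitted before the audio segment.
    return ["Language Change", "Audio Segment"], result.language
```

In every branch the audio segment is still emitted, which reflects the pass-through behavior described earlier; only the Language Change event is conditional.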
Supported flow types: Stream, Batch