Language Identifier


Performs language identification on provided audio.

Parameters

| name | description | default |
| --- | --- | --- |
| Address | Address of the Language Identification service to use. | http://core-sr |
| IgnoreSslErrors | Ignore any certificate errors if the SestekSR address contains https. | false |
| ConfidenceThreshold | Language detection results with a confidence lower than this value are ignored. The value should be between 0.0 and 1.0. | 0.75 |
| Languages | Semicolon-separated list of languages that the language detection service treats as candidates. If left empty, all languages known to the service are evaluated. If your target audience is limited, it is good practice to list only the possible languages, because this improves detection accuracy. Example: tr-TR;en-US | |
| MinimumSegmentDuration(ms) | Minimum duration, in milliseconds, that a speech segment must have to be sent to the Language Identifier service. | 3000 |
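
As an illustration of how these parameters fit together, the following is a minimal Python sketch. The class `LanguageIdentifierSettings` and its methods are hypothetical and not part of the product. The sketch only mirrors the documented defaults, the 0.0 to 1.0 range of ConfidenceThreshold, and the semicolon-separated format of Languages. Rejecting negative durations is an extra assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LanguageIdentifierSettings:
    """Hypothetical container mirroring the parameters table above."""

    address: str = "http://core-sr"
    ignore_ssl_errors: bool = False
    confidence_threshold: float = 0.75
    languages: str = ""  # e.g. "tr-TR;en-US"; empty means all known languages
    minimum_segment_duration_ms: int = 3000

    def candidate_languages(self) -> List[str]:
        """Split the semicolon-separated Languages value, dropping blanks."""
        return [code.strip() for code in self.languages.split(";") if code.strip()]

    def validate(self) -> None:
        """Raise ValueError for values outside the documented ranges."""
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("ConfidenceThreshold must be between 0.0 and 1.0")
        if self.minimum_segment_duration_ms < 0:
            # Assumption: a negative duration is treated as invalid.
            raise ValueError("MinimumSegmentDuration(ms) must not be negative")


settings = LanguageIdentifierSettings(languages="tr-TR;en-US")
settings.validate()
print(settings.candidate_languages())
```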

Inputs

Audio

 Accepts audio from a single channel.

Important Note

Note that passing the audio through a VAD or Audio Segmenter before sending to this node is required. Directly connecting the Entry node audio output to the Language Identifier will result in no outputs.

Events

| name | description | known nodes that generate this event |
| --- | --- | --- |
| Start Of Segment | Signals the start of a speech segment. | Audio Segmenter |
| Speech Started | Signals the start of speech in realtime. | Vad |
| End Of Segment | Signals the end of a speech segment. | Audio Segmenter |
| Speech Ended | Signals the end of speech in realtime. | Vad |

Outputs

Audio

  One Channel Segmented Audio

Events

| name | description |
| --- | --- |
| Language Change | Raised when a language change has been detected. |
| Language Detected | Raised for every analyzed utterance. |
| Start Of Segment | Signals the start of a speech segment. |
| End Of Segment | Signals the end of a speech segment. |

Remarks

  • This node produces events that can change the behavior of other nodes, e.g. the SR Http node switches its model when it receives a Language Change event.
  • For a Language Change event to be raised, the received audio segment must be longer than MinimumSegmentDuration(ms) and the confidence of the identification must be higher than ConfidenceThreshold.
  • The Language Identifier service performs best with segments that are 5 seconds long. Segments that diverge from this length will have lower confidence scores.

Important Note

Language Identifier behaves as a Pass-Through node for audio segments. This means nodes like SR Http can be connected directly to the output of Language Identifier, which ensures that each audio segment has the chance to change the transcription language.
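
To illustrate the consumer side of this pass-through behavior, here is a small Python sketch of a downstream node. `TranscriberSketch`, its `handle` method, the `(kind, payload)` message shape and the `en-US` starting language are all assumptions made for this example. They do not describe the actual SR Http node or its API.

```python
from typing import List, Tuple


class TranscriberSketch:
    """Hypothetical stand-in for a downstream node such as SR Http."""

    def __init__(self, default_language: str = "en-US"):
        # Assumption: the downstream node starts with some configured language.
        self.language = default_language
        self.transcribed: List[Tuple[str, bytes]] = []

    def handle(self, kind: str, payload) -> None:
        """Process one message coming out of the Language Identifier."""
        if kind == "Language Change":
            # Switch the language before the next audio segment arrives.
            self.language = payload
        elif kind == "Audio":
            # Record which language would be used for this segment.
            self.transcribed.append((self.language, payload))


transcriber = TranscriberSketch()
stream = [("Audio", b"seg-1"), ("Language Change", "tr-TR"), ("Audio", b"seg-2")]
for kind, payload in stream:
    transcriber.handle(kind, payload)
print(transcriber.transcribed)
```

Because the event arrives before the segment that triggered it, the second segment in this example would already be handled with the new language.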

Requirements

Project Requirements

Detailed Workflow

  • The node waits for a whole audio segment.
  • When a "Speech Ended" or "End Of Segment" event is received, the node checks whether the segment is longer than MinimumSegmentDuration(ms).
  • If the segment is longer than MinimumSegmentDuration(ms), it is sent to the Audio Language Identifier service.
    • If the segment is shorter than MinimumSegmentDuration(ms), the node just outputs the audio segment.
  • If the confidence of the identification is lower than the threshold, or the identified language is the same as before, the node just outputs the audio segment.
  • If the identified language differs from the current language, the node sends the Language Change event and then outputs the audio segment.
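
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the node's implementation: `Segment`, `LanguageIdentifierSketch` and the `identify` callable (standing in for the service call) are hypothetical names. The sketch also assumes that no language is known before the first confident identification, so that first result counts as a change. The Language Detected event is left out because the workflow above does not describe when it is emitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Segment:
    audio: bytes
    duration_ms: int


# Hypothetical service call: returns (language code, confidence in 0.0-1.0).
Identify = Callable[[bytes], Tuple[str, float]]


class LanguageIdentifierSketch:
    def __init__(self, identify: Identify, threshold: float = 0.75,
                 min_duration_ms: int = 3000):
        self.identify = identify
        self.threshold = threshold
        self.min_duration_ms = min_duration_ms
        # Assumption: no language is known until the first confident result.
        self.current_language: Optional[str] = None

    def on_segment_end(self, segment: Segment) -> List[tuple]:
        """Called on "Speech Ended" or "End Of Segment"; returns the outputs."""
        outputs: List[tuple] = []
        # Only segments longer than the minimum are sent to the service.
        if segment.duration_ms > self.min_duration_ms:
            language, confidence = self.identify(segment.audio)
            # Results below the threshold or equal to the current language
            # produce no event.
            if confidence >= self.threshold and language != self.current_language:
                self.current_language = language
                outputs.append(("Language Change", language))
        # The audio segment is always passed through, after any event.
        outputs.append(("Audio", segment))
        return outputs


node = LanguageIdentifierSketch(lambda audio: ("tr-TR", 0.9))
print(node.on_segment_end(Segment(b"...", 5000)))
```

Emitting the event before the audio keeps the ordering described in the last step, so a downstream node can switch languages before it receives the segment.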

Supported flow types

Stream, Batch

Release Notes

v4.1.0
  • Fixed a bug that caused the node to send every detected language as a "Language Change" event.
  • Added `Language Detected` event that is sent on each detection.
v3.3.0
  • Added Minimum Segment Duration Parameter.
v3.2.0
  • The node is now project type agnostic. It doesn't behave differently between batch and stream projects.
v3.1.0
  • Fixed a rarely observed crash while responding to Stop.
v1.0.0
  • Introduced node.