Language Identifier

Performs language identification on provided audio.

Parameters

name	description	default
Address	Address of Language Identification service that will be used.	http://core-sr
IgnoreSslErrors	Ignores any certificate errors if the SestekSR address contains https.	false
ConfidenceThreshold	Language detection results with less confidence than this value will be ignored. This value should be between 0.0 and 1.0	0.75
Languages	Semicolon-separated list of languages that the language detection service treats as candidates. If left empty, all languages known to the service are evaluated. If you have a limited target audience, it is good practice to list only the possible languages, because this improves detection accuracy. E.g. tr-TR;en-US
MinimumSegmentDuration(ms)	Sets the minimum duration, in milliseconds, that a speech segment must have to be sent to the Language Identifier service.	3000
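
As an illustration of the value formats described above, the following minimal Python sketch splits a Languages value and range-checks a ConfidenceThreshold value. The helper names `parse_languages` and `validate_threshold` are hypothetical and are not part of the node; the node's own parsing may differ.

```python
def parse_languages(value: str) -> list[str]:
    """Split a semicolon-separated Languages value into language codes.

    An empty value yields an empty list, which the table above describes
    as "evaluate all languages known to the service".
    """
    return [code.strip() for code in value.split(";") if code.strip()]


def validate_threshold(value: float) -> float:
    """Return the threshold unchanged, or raise if it is outside 0.0 to 1.0."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(
            f"ConfidenceThreshold must be between 0.0 and 1.0, got {value}"
        )
    return value


# Example values taken from the parameter table.
candidates = parse_languages("tr-TR;en-US")
threshold = validate_threshold(0.75)
```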

Inputs

Audio

Accepts audio from a single channel.

Important Note

Note that passing the audio through a VAD or Audio Segmenter before sending to this node is required. Directly connecting the Entry node audio output to the Language Identifier will result in no outputs.

Events

name	description	known nodes that generate this event
Start Of Segment	Signals the start of a speech segment.	Audio Segmenter
Speech Started	Signals the start of speech in realtime.	Vad
End Of Segment	Signals the end of a speech segment.	Audio Segmenter
Speech Ended	Signals the end of speech in realtime.	Vad

Outputs

Audio

One Channel Segmented Audio

Events

name	description
Language Change	Raised when language change has been detected.
Language Detected	Raised for every analyzed utterance.
Start Of Segment	Signals the start of a speech segment.
End Of Segment	Signals the end of a speech segment.

Remarks

This node produces events that can trigger changes in the behavior of other nodes. For example, the SR Http node changes its model accordingly when it receives a Language Change event.
For a Language Change event to be raised, the received audio segment needs to be longer than MinimumSegmentDuration(ms), and the confidence of the identification needs to be higher than ConfidenceThreshold.
The Language Identifier service performs best with segments that are 5 seconds long. Segments whose duration diverges from this value will have lower confidence scores.

Important Note

Language Identifier behaves as a pass-through node for audio segments. This means nodes like SR Http can be connected directly to the output of Language Identifier, which ensures that each audio segment has the chance to change the transcription language.

Requirements

Project Requirements

The node needs to be connected to a VAD or an Audio Segmenter.

Detailed Workflow

The node waits for a whole audio segment.
When a "Speech Ended" or "End Of Segment" event is received, the node checks whether the segment is longer than MinimumSegmentDuration(ms).
If the segment is longer than MinimumSegmentDuration(ms), the segment is sent to the Audio Language Identifier service.
- If the segment is shorter than MinimumSegmentDuration(ms), the node just outputs the audio segment.
If the confidence of the identification is lower than ConfidenceThreshold, or the identified language is the same as the current one, the node just outputs the audio segment.
If the identified language differs from the current language, the node sends the Language Change event and then outputs the audio segment.
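
The decision steps above can be sketched in Python as follows. This is a minimal illustration, not the node's implementation. The function `process_segment`, the `identify` callback, and the event strings are hypothetical. It also rests on several assumptions: a result at exactly the threshold is accepted, a segment at exactly the minimum duration is not analyzed, Language Detected is raised only for accepted results, and the first accepted detection counts as a change.

```python
from __future__ import annotations

from typing import Callable, Optional


def process_segment(
    duration_ms: int,
    identify: Callable[[], tuple[str, float]],
    current_language: Optional[str],
    min_duration_ms: int = 3000,
    threshold: float = 0.75,
) -> tuple[list[str], Optional[str]]:
    """Return the outputs for one segment and the updated current language.

    `identify` stands in for the call to the identification service and
    returns a (language, confidence) pair.
    """
    outputs: list[str] = []
    # Segments that are too short are never sent to the service.
    if duration_ms > min_duration_ms:
        language, confidence = identify()
        # Low-confidence results are ignored (boundary handling is assumed).
        if confidence >= threshold:
            outputs.append(f"Language Detected: {language}")
            if language != current_language:
                outputs.append(f"Language Change: {language}")
                current_language = language
    # The audio segment is always passed through, after any events.
    outputs.append("Audio Segment")
    return outputs, current_language


outputs, language = process_segment(5000, lambda: ("en-US", 0.9), "tr-TR")
```

In this sketch the audio segment is always the last output, which mirrors the pass-through behavior: a downstream node sees any Language Change event before it receives the segment.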

Supported flow types

Stream, Batch

Release Notes

v4.1.0

Fixed a bug that caused the node to send every detected language as a "Language Change" event.
Added `Language Detected` event that is sent on each detection.

v3.3.0

Added Minimum Segment Duration Parameter.

v3.2.0

The node is now project type agnostic. It doesn't behave differently between batch and stream projects.

v3.1.0

Fixed a rarely observed crash while responding to Stop.

v1.0.0

Introduced node.