Language Identifier
  • 19 Jul 2024

Performs language identification on provided audio.

Parameters:

name | description | default
Address | Address of the Language Identification service to use. | http://core-sr
IgnoreSslErrors | Ignores certificate errors if the SestekSR address uses https. | false
ConfidenceThreshold | Language detection results with a confidence below this value are ignored. Must be between 0.0 and 1.0. | 0.75
Languages | Semicolon-separated list of candidate languages for the language detection service, e.g. tr-TR;en-US. If left empty, all languages known to the service are evaluated. If your target audience is limited, listing only the possible languages improves detection accuracy. |
MinimumSegmentDuration(ms) | Minimum duration of speech, in milliseconds, for a segment to be sent to the Language Identifier service. | 3000
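
As a rough illustration of how these parameters constrain one another, the sketch below models them in Python. The `LanguageIdentifierConfig` class, its field names, and the non-negative duration check are assumptions made for this example; they mirror the table above and are not part of the product's API.

```python
from dataclasses import dataclass


@dataclass
class LanguageIdentifierConfig:
    """Hypothetical container mirroring the parameter table; not a product API."""

    address: str = "http://core-sr"
    ignore_ssl_errors: bool = False
    confidence_threshold: float = 0.75
    languages: str = ""  # semicolon-separated, e.g. "tr-TR;en-US"
    minimum_segment_duration_ms: int = 3000

    def candidate_languages(self) -> list[str]:
        """Split the semicolon-separated list; an empty result means 'all known languages'."""
        return [tag.strip() for tag in self.languages.split(";") if tag.strip()]

    def validate(self) -> None:
        # The documented range for ConfidenceThreshold is 0.0 to 1.0.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("ConfidenceThreshold must be between 0.0 and 1.0")
        # Assumption: a negative duration is treated as invalid.
        if self.minimum_segment_duration_ms < 0:
            raise ValueError("MinimumSegmentDuration(ms) must not be negative")


config = LanguageIdentifierConfig(languages="tr-TR;en-US")
config.validate()
print(config.candidate_languages())  # ['tr-TR', 'en-US']
```

Restricting `languages` to the two expected candidates follows the recommendation in the table for deployments with a limited target audience.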

Inputs

Audio:
Accepts audio from a single channel.

Important Note

Audio must pass through a VAD or Audio Segmenter node before it reaches this node. Connecting the Entry node's audio output directly to the Language Identifier produces no output.

Events:

name | description | known nodes that generate this event
Start Of Segment | Signals the start of a speech segment. | Audio Segmenter
Speech Started | Signals the start of speech in real time. | Vad
End Of Segment | Signals the end of a speech segment. | Audio Segmenter
Speech Ended | Signals the end of speech in real time. | Vad

Outputs

Audio:
One Channel Segmented Audio

Events:

name | description
Language Change | Raised when a language change has been detected.
Start Of Segment | Signals the start of a speech segment.
End Of Segment | Signals the end of a speech segment.

Remarks:

  • This node produces events that can change the behavior of other nodes. For example, the SR Http node switches its model when it receives a Language Change event.
  • For a Language Change event to be raised, the received audio segment must be longer than MinimumSegmentDuration(ms) and the identification confidence must be higher than ConfidenceThreshold.
  • The Language Identifier service performs best with segments that are 5 seconds long. Segments that diverge from this length receive lower confidence scores.

Important Note

Language Identifier behaves as a pass-through node for audio segments, so nodes such as SR Http can be connected directly to its output. This ensures that each audio segment has the chance to change the transcription language.

Requirements:

Project Requirements:

Detailed Workflow

  • The node waits for a whole audio segment.
  • When a "Speech Ended" or "End Of Segment" event is received, the node checks whether the segment is longer than MinimumSegmentDuration(ms).
  • If the segment is longer than MinimumSegmentDuration(ms), it is sent to the Language Identifier service.
    • If the segment is shorter than MinimumSegmentDuration(ms), the node just outputs the audio segment.
  • If the identification confidence is lower than ConfidenceThreshold, or the identified language is the same as before, the node just outputs the audio segment.
  • If the identified language differs from the current language, the node sends the Language Change event and then outputs the audio segment.

Supported flow types: Stream, Batch
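
The decision steps of the workflow above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the node's actual implementation: `process_segment`, the `identify` callback (a stand-in for the call to the identification service), and the output strings are all hypothetical. How a value exactly equal to a limit is handled is not specified in this article, so the comparisons at those boundaries are a guess.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for the service call: audio in, (language, confidence) out.
Identify = Callable[[bytes], Tuple[str, float]]


def process_segment(
    audio: bytes,
    duration_ms: int,
    current_language: Optional[str],
    identify: Identify,
    confidence_threshold: float = 0.75,
    minimum_segment_duration_ms: int = 3000,
) -> Tuple[List[str], Optional[str]]:
    """Return (outputs in emission order, language in effect afterwards)."""
    outputs: List[str] = []
    # Only segments longer than the minimum are sent for identification.
    if duration_ms > minimum_segment_duration_ms:
        language, confidence = identify(audio)
        # Results below the threshold, or matching the current language, are ignored.
        # Assumption: a confidence exactly equal to the threshold is accepted.
        if confidence >= confidence_threshold and language != current_language:
            outputs.append(f"Language Change: {language}")
            current_language = language
    # The audio segment is always passed through, after any event.
    outputs.append("audio segment")
    return outputs, current_language


def fake_identify(audio: bytes) -> Tuple[str, float]:
    """Dummy identifier used only for this example."""
    return "en-US", 0.9


outputs, language = process_segment(b"\x00" * 10, 5000, "tr-TR", fake_identify)
print(outputs, language)  # ['Language Change: en-US', 'audio segment'] en-US
```

Because the audio segment is appended in every branch, the sketch also reflects the pass-through behavior: a downstream node receives each segment whether or not a Language Change event precedes it.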
