Parameters
name | description | default |
---|---|---|
Engine Type | Underlying VAD Engine | 0.9 |
Sensitivity | Range: 0.0 to 1.0 inclusive. Sets the threshold for Speech Possibility: the Speech Started event triggers when Speech Possibility > Sensitivity. | 0.9 |
MaxSpeechDurationMsec | If the speech does not end after this duration it will be ended by VAD. This will be treated as a normal end of speech, and an appropriate speech-ended event will be generated. Exceeding this timeout will not generate an error. | -1 |
Max Speech Duration Graceful End (%) | Specifies the final portion of Max Speech Duration, as a percentage, during which the engine becomes more sensitive to silence. Active only if Max Speech Duration is set. | 20 |
PreSpeechBufferMsec | After the start of speech is detected, VAD rewinds and includes a little more data before the detected beginning, in case low-energy voice is present there. This duration is set by pre-speech-buffer-msec. | 300 |
PostSpeechBufferMsec | After the end of speech is detected, VAD includes a little more data after the detected end, in case low-energy voice is present there. This duration is set by post-speech-buffer-msec. | 300 |
SilenceTriggerMsec | The amount of silence, in milliseconds, that VAD must observe before deciding that the speech has actually ended. | 400 |
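
The interaction between Sensitivity, SilenceTriggerMsec and MaxSpeechDurationMsec can be illustrated with a small state machine. The following is only a sketch under stated assumptions, not the engine's actual implementation: it assumes the engine yields one Speech Possibility value per fixed-length frame, that -1 disables the maximum duration, and that a stream ending mid-speech closes the open fragment. `VadConfig` and `detect_events` are hypothetical names, and the graceful-end and buffer parameters are left out.

```python
from dataclasses import dataclass


@dataclass
class VadConfig:
    # Hypothetical container mirroring three parameters from the table above.
    sensitivity: float = 0.9             # Sensitivity
    silence_trigger_msec: int = 400      # SilenceTriggerMsec
    max_speech_duration_msec: int = -1   # MaxSpeechDurationMsec (assumed: -1 = no limit)


def detect_events(probs, frame_msec, cfg):
    """Return (event_name, time_msec) tuples for per-frame speech probabilities."""
    events = []
    in_speech = False
    speech_start = 0
    silence_msec = 0
    for i, p in enumerate(probs):
        frame_end = (i + 1) * frame_msec
        is_speech = p > cfg.sensitivity  # strict comparison, as in the table
        if not in_speech:
            if is_speech:
                in_speech = True
                speech_start = i * frame_msec
                silence_msec = 0
                events.append(("Speech Started", speech_start))
            continue
        # Track how much consecutive silence has accumulated inside speech.
        silence_msec = 0 if is_speech else silence_msec + frame_msec
        too_long = (
            cfg.max_speech_duration_msec >= 0
            and frame_end - speech_start >= cfg.max_speech_duration_msec
        )
        if silence_msec >= cfg.silence_trigger_msec:
            # Enough silence: speech ended where the silence began.
            events.append(("Speech Ended", frame_end - silence_msec))
            in_speech = False
        elif too_long:
            # Timeout: a normal end of speech, not an error.
            events.append(("Speech Ended", frame_end))
            in_speech = False
    if in_speech:
        # Assumption: end of input closes an open fragment.
        events.append(("Speech Ended", len(probs) * frame_msec))
    return events
```

With 100 ms frames and the default values, a short burst of speech followed by 400 ms of silence produces one Speech Started and one Speech Ended event; with `max_speech_duration_msec=300`, continuous speech is cut into fragments of at most 300 ms.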
Inputs
Audio
Accepts audio from a single channel.
Events
none
Outputs
Audio
Silences are removed from the input audio, and the remaining data is sent to the output.
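
As a rough illustration of how the output audio relates to the detected speech and the pre- and post-speech buffers, the sketch below widens each detected fragment by the buffer durations and concatenates what remains. It assumes fragments are given as (start, end) pairs in milliseconds and that overlapping padded fragments are merged; both functions are hypothetical helpers, not part of the node's API, and the real engine may handle overlaps differently.

```python
def padded_segments(segments, total_msec, pre_msec=300, post_msec=300):
    """Widen each (start, end) fragment by the buffers, clamp to the audio, merge overlaps."""
    merged = []
    for start, end in sorted(segments):
        start = max(0, start - pre_msec)            # PreSpeechBufferMsec
        end = min(total_msec, end + post_msec)      # PostSpeechBufferMsec
        if merged and start <= merged[-1][1]:
            # Assumption: padded fragments that touch or overlap become one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def remove_silence(samples, sample_rate, segments):
    """Concatenate the samples that fall inside the given millisecond fragments."""
    out = []
    for start, end in segments:
        first = start * sample_rate // 1000
        last = end * sample_rate // 1000
        out.extend(samples[first:last])
    return out
```

For example, fragments at 500-1000 ms and 1400-2000 ms in a 2100 ms recording become a single 200-2100 ms fragment once the default 300 ms buffers are applied.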
Events
name | description |
---|---|
Speech Started | Raised once at the beginning of each detected speech fragment in the audio. |
Speech Ended | Raised once at the end of each detected speech fragment in the audio. |
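
Because each fragment raises exactly one Speech Started and one Speech Ended event, a consumer can pair them into fragment boundaries. The sketch below assumes each event arrives as a (name, time_msec) tuple; the timestamp payload and the `pair_events` helper are assumptions made for illustration, since the event payload is not described here.

```python
def pair_events(events):
    """Turn an ordered stream of (name, time_msec) events into (start, end) fragments."""
    fragments = []
    start = None
    for name, time_msec in events:
        if name == "Speech Started":
            start = time_msec
        elif name == "Speech Ended" and start is not None:
            fragments.append((start, time_msec))
            start = None  # wait for the next Speech Started
    return fragments
```

An unmatched Speech Ended event is ignored here rather than treated as an error; a stricter consumer might raise an exception instead.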
Remarks
Supported Flow Types
Batch, Stream