Merges two incoming audio sources and outputs the result as a stream. To work correctly, the node must be used in a streaming project. Barge-In is supported as of Data Flow 4.1.0.
Parameters
name | description | default |
---|---|---|
Stream Input Volume Percentage | Changes the volume of the incoming streaming audio | 100 |
Inputs
Audio
Accepts Streaming and Batch audio from a single channel.
Events
name | description | known nodes that generate this event |
---|---|---|
Start of TTS Fragment | Signals the start of a TTS segment. | TTS Http |
Speech Started | Indicates Barge-In. | Vad |
End of TTS Fragment | Signals the end of a TTS segment. | TTS Http |
Speech Ended | Signals the end of Barge-In. | Vad |
Outputs
Audio
Streams one-channel audio from the output connection.
Events
name | description |
---|---|
Merge Audio Started | Signals the start of audio merging. Sent only when the batch audio queue becomes non-empty. |
Merge Audio Ended | Signals the end of audio merging. Sent only when the batch audio queue has been emptied. |
Remarks
Working Principle
- The node always outputs the streaming audio immediately.
- Whenever an audio batch is received from a TTS node or a VAD node, the audio batch will be kept in a queue.
- If the batch audio queue is empty, the output stream is the same as the input.
- If the batch audio queue is not empty, the incoming streaming audio is merged with the first audio in the queue and streamed at the rate of the incoming stream.
If the merged audio is too loud for the audio format, the node will automatically normalize the output stream.
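The working principle above can be illustrated with a minimal sketch. It is not the node's actual implementation: the class `MergerSketch`, the helper `normalize`, the 16-bit integer sample format, sample-wise addition as the merge operation, and per-chunk peak scaling as the normalization are all assumptions made for illustration.

```python
from collections import deque

INT16_MAX = 32767  # assumed sample format: signed 16-bit PCM


def normalize(samples):
    """Scale a chunk down if its peak exceeds the 16-bit range (assumed strategy)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak <= INT16_MAX:
        return [int(s) for s in samples]
    scale = INT16_MAX / peak
    return [int(s * scale) for s in samples]


class MergerSketch:
    """Hypothetical model of the queue-and-merge behaviour described above."""

    def __init__(self):
        # Pending batch audio; each entry is one batch (a list of samples).
        self.queue = deque()

    def push_batch(self, samples):
        """Keep a received audio batch in the queue."""
        if samples:
            self.queue.append(list(samples))

    def process(self, stream_chunk):
        """Return the output for one chunk of streaming audio."""
        out = []
        for sample in stream_chunk:
            if self.queue:
                # Merge with the first batch in the queue, one sample at a time,
                # so batch audio is consumed at the incoming stream rate.
                head = self.queue[0]
                sample += head.pop(0)
                if not head:
                    self.queue.popleft()
            out.append(sample)
        return normalize(out)


merger = MergerSketch()
passthrough = merger.process([1, 2, 3])   # queue empty: output equals input
merger.push_batch([10, 10])
merged = merger.process([1, 2, 3])        # first two samples are merged
merger.push_batch([30000])
loud = merger.process([30000])            # sum exceeds the range, so it is scaled
```

In this sketch the batch audio is consumed only as fast as streaming samples arrive, which is one way to read "streamed at the incoming rate"; the real node may buffer or resample differently.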
Barge-In
Barge-In occurs when the connected VAD node detects speech. If a Speech Started event is received during a merge, the merge stops and only the streaming audio source is output.
The merging will continue normally after a Speech Ended event is received.
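The Barge-In behaviour can be sketched as a flag that gates the merge. This is a simplified model under stated assumptions, not the node's real code: the class `BargeInSketch` and its methods are hypothetical, samples are plain integers, normalization is left out, and the sketch assumes that queued batch audio is held (not discarded) while speech is active.

```python
from collections import deque


class BargeInSketch:
    """Hypothetical model: pause merging between Speech Started and Speech Ended."""

    def __init__(self):
        self.queue = deque()        # pending batch audio samples
        self.speech_active = False  # True between Speech Started and Speech Ended

    def on_event(self, name):
        """Handle the two VAD events that control Barge-In."""
        if name == "Speech Started":
            self.speech_active = True
        elif name == "Speech Ended":
            self.speech_active = False

    def push_batch(self, samples):
        """Queue batch audio for later merging."""
        self.queue.extend(samples)

    def process(self, stream_chunk):
        """Merge queued audio into the stream unless Barge-In is active."""
        out = []
        for sample in stream_chunk:
            if self.queue and not self.speech_active:
                sample += self.queue.popleft()
            out.append(sample)
        return out


node = BargeInSketch()
node.push_batch([10, 10, 10, 10])
before = node.process([1, 1])      # merging: [11, 11]
node.on_event("Speech Started")
during = node.process([1, 1])      # Barge-In: stream only, [1, 1]
node.on_event("Speech Ended")
after = node.process([1, 1, 1])    # merging resumes: [11, 11, 1]
```

Whether the real node resumes the interrupted batch from where it stopped, as this sketch does, or handles it differently is not stated in this page.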
Project Structure
The figure above shows a basic project that supports simultaneous translation with the Barge-In feature.
Supported flow types
Stream