Batch-To-Stream Audio Merger

Merges two incoming audio sources and outputs the result as a stream. The node only works correctly in a streaming project. Supports Barge-In as of Data Flow 4.1.0.

Parameters

name	description	default
Stream Input Volume Percentage	Changes the volume of the incoming streaming audio	100

Inputs

Audio

Accepts single-channel streaming and batch audio.

Events

name	description	known nodes that generate this event
Start of TTS Fragment	Signals the start of a TTS segment.	TTS Http
Speech Started	Indicates Barge-In.	Vad
End of TTS Fragment	Signals the end of a TTS segment.	TTS Http
Speech Ended	Signals the end of Barge-In.	Vad

Outputs

Audio

Streams one-channel audio from the output connection.

Events

name	description
Merge Audio Started	Signals the start of audio merging. Sent only when the batch audio queue changes from empty to non-empty.
Merge Audio Ended	Signals the end of audio merging. Sent only when the batch audio queue has been emptied.
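
The timing of the two events can be illustrated with a minimal Python sketch. It is not the node's implementation: the class `MergeEventTracker` and its methods are hypothetical, and the sketch only models the rule stated in the table, namely that an event fires when the batch audio queue changes between empty and non-empty.

```python
from collections import deque
from typing import Deque, List


class MergeEventTracker:
    """Hypothetical model of when the merge events fire; not the node's code."""

    def __init__(self) -> None:
        self.queue: Deque[List[int]] = deque()  # pending audio batches, oldest first
        self.events: List[str] = []             # events emitted so far, in order

    def push_batch(self, batch: List[int]) -> None:
        """Queue a batch; emit the start event only on the empty -> non-empty change."""
        was_empty = not self.queue
        self.queue.append(batch)
        if was_empty:
            self.events.append("Merge Audio Started")

    def pop_batch(self) -> List[int]:
        """Remove the oldest batch; emit the end event once the queue is empty.

        Raises IndexError if the queue is already empty.
        """
        batch = self.queue.popleft()
        if not self.queue:
            self.events.append("Merge Audio Ended")
        return batch
```

In this model, queuing a second batch while the first is still pending does not produce a second Merge Audio Started event.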

Remarks

Working Principle

The node always outputs the streaming audio immediately.
Whenever an audio batch is received from a TTS node or a VAD node, the batch is placed in a queue.
If the batch audio queue is empty, the output stream is the same as the input.
If the batch audio queue is not empty, the incoming streaming audio is merged with the first audio in the queue and streamed at the incoming rate.
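
The queue-and-merge behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the node's actual code: `MergerSketch` and its methods are hypothetical names, samples are plain integers, merging is modelled as sample-wise addition, and the sketch moves on to the next queued batch within the same chunk when the first one runs out.

```python
from collections import deque
from typing import Deque, List


class MergerSketch:
    """Hypothetical model of the queue-and-mix behaviour; not the node's code."""

    def __init__(self) -> None:
        self.queue: Deque[List[int]] = deque()  # pending batch audio, oldest first

    def push_batch(self, samples: List[int]) -> None:
        """Queue a received audio batch (for example, one TTS fragment)."""
        if samples:
            self.queue.append(list(samples))

    def process_chunk(self, stream_chunk: List[int]) -> List[int]:
        """Return an output chunk of the same length as the incoming chunk."""
        out = list(stream_chunk)  # empty queue: output equals input
        i = 0
        while i < len(out) and self.queue:
            head = self.queue[0]
            n = min(len(out) - i, len(head))
            for k in range(n):
                out[i + k] += head[k]  # assumed merge rule: sample-wise sum
            del head[:n]               # consume the merged part of the batch
            i += n
            if not head:
                self.queue.popleft()   # batch fully played, drop it
        return out
```

Because the output chunk always has the length of the input chunk, the output follows the rate of the incoming stream regardless of how much batch audio is queued.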


Overflow

If the merged audio is too loud for the audio format, the node will automatically normalize the output stream.
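
One common way to handle this kind of overflow is peak normalization, sketched below in Python. The documentation does not state which method the node uses, so this is only an assumed approach: the function `normalize_if_overflow` is hypothetical, the sample format is assumed to be signed 16-bit PCM, and scaling is applied per chunk.

```python
from typing import List

INT16_MAX = 32767  # assumption: signed 16-bit PCM samples


def normalize_if_overflow(samples: List[int], limit: int = INT16_MAX) -> List[int]:
    """Scale a chunk down only when its peak exceeds the format limit."""
    peak = max((abs(s) for s in samples), default=0)
    if peak <= limit:
        return list(samples)  # fits the format, leave untouched
    # Scale every sample by limit / peak; int() truncates toward zero,
    # so no result can exceed the limit in magnitude.
    return [int(s * limit / peak) for s in samples]
```

Scaling the whole chunk preserves the waveform's shape, whereas clipping each sample to the limit would distort it.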

Barge-In

Barge-In occurs when the connected VAD node detects speech. If a Speech Started event is received during a merge, the merge stops and only the streaming audio source is output.
Merging continues normally after a Speech Ended event is received.
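
The Barge-In behaviour can be modelled as a simple gate around the merge step. The Python sketch below is illustrative only: `BargeInGate` and its methods are hypothetical, merging is modelled as sample-wise addition, and it assumes that queued batch audio is kept while merging is paused, which the documentation does not confirm.

```python
from collections import deque
from typing import Deque, List


class BargeInGate:
    """Hypothetical model of pausing the merge during speech; not the node's code."""

    def __init__(self) -> None:
        self.queue: Deque[int] = deque()  # queued batch samples, flattened
        self.speech_active = False        # True between Speech Started and Speech Ended

    def on_event(self, name: str) -> None:
        """React to the two VAD events; other event names are ignored."""
        if name == "Speech Started":
            self.speech_active = True
        elif name == "Speech Ended":
            self.speech_active = False

    def push_batch(self, samples: List[int]) -> None:
        """Append a batch's samples to the pending queue."""
        self.queue.extend(samples)

    def process_chunk(self, stream_chunk: List[int]) -> List[int]:
        """Pass the stream through during speech, otherwise mix in queued audio."""
        if self.speech_active:
            return list(stream_chunk)  # Barge-In: stream only, queue untouched (assumption)
        return [s + (self.queue.popleft() if self.queue else 0) for s in stream_chunk]
```

Under that assumption, the remaining batch audio resumes from where it stopped once Speech Ended arrives.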

Project Structure

The figure above shows a basic project that supports simultaneous translation with the Barge-In feature.

Supported flow types

Stream

Release Notes

v4.2.0

Added Merge Audio Started and Merge Audio Ended events.

v4.1.0

Added Barge-In support.

v3.7.0

Added Stream Input Volume Percentage as a Parameter.

v3.4.0

Introduced the node.