1. Purpose
This document explains how to use the TTS service over WebSocket. Over a WebSocket connection, a client sends text to the service for synthesis into audio and completes the synthesis flow by sending the required control messages.
2. WebSocket Connection
A WebSocket request can be created in Postman.
WebSocket URL Format:
wss://<tts-service-host>/synthesizer
Example:
wss://<environment-specific-tts-host>/synthesizer
Note: Actual environment URLs should not be shared openly in documentation. The relevant TTS WebSocket URL should be used depending on the target environment.
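As a minimal sketch of keeping the environment-specific host out of shared material, the URL can be assembled at run time from the format above. The function names and the TTS_SERVICE_HOST variable name are hypothetical, chosen only for illustration.

```python
import os


def build_tts_url(host: str) -> str:
    """Build the synthesizer WebSocket URL from a bare host name."""
    host = host.strip()
    # Reject empty values, full URLs/paths, and unreplaced <placeholders>.
    if not host or "/" in host or "<" in host:
        raise ValueError("expected a real host name, not a URL or placeholder")
    return f"wss://{host}/synthesizer"


def tts_url_from_env(variable: str = "TTS_SERVICE_HOST") -> str:
    """Read the host from an environment variable (the name is an assumption)."""
    return build_tts_url(os.environ[variable])
```

Calling build_tts_url with the host of the target environment yields the URL to paste into Postman or pass to a WebSocket client.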
3. Message Flow
After the WebSocket connection is established, the messages should be sent in the following order:
1. synthesize
2. add-tts-text
3. flush
4. stop
5. finalize-synthesis
Note: The separate flush message is required only when the add-tts-text message does not include the optional flush parameter.
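The ordering rule and its one exception can be written down as a small helper. This is an illustrative sketch, not part of the service; the function name message_order is hypothetical.

```python
def message_order(inline_flush: bool = False) -> list[str]:
    """Return the message names in the order they should be sent.

    With inline_flush=True the add-tts-text message carries the optional
    flush parameter, so the separate flush message is left out.
    """
    order = ["synthesize", "add-tts-text"]
    if not inline_flush:
        order.append("flush")
    order += ["stop", "finalize-synthesis"]
    return order
```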
4. Start Synthesis Message
The first message should be the synthesize message. This message initializes the synthesis process and includes the audio format, voice, sample rate, volume, speaking rate, and authorization information.
Request Format
{
"message-name": "synthesize",
"audio-format": "pcm",
"voice-name": "<voice-name>",
"sample-rate": "<sample-rate>",
"volume": "<volume>",
"rate": "<rate>",
"Authorization": "<access-token>"
}
Example
{
"message-name": "synthesize",
"audio-format": "pcm",
"voice-name": "Emily_Premium",
"sample-rate": "24000",
"volume": "1.0",
"rate": "1.0",
"Authorization": "<access-token>"
}
Parameter Descriptions
| Parameter | Description | Example Value |
|---|---|---|
| message-name | Specifies the message type. | synthesize |
| audio-format | Specifies the output audio format. | pcm |
| voice-name | Specifies the TTS voice to be used. | Emily_Premium |
| sample-rate | Specifies the audio sample rate. | 24000 |
| volume | Specifies the output volume level. | 1.0 |
| rate | Specifies the speaking rate. | 1.0 |
| Authorization | Access token used for authorization. | <access-token> |
Security note: The actual access token should not be shared openly in documentation, emails, tickets, or any shared environment.
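The following sketch builds the synthesize message programmatically so that the token never has to appear in a saved file. The function name is hypothetical, and the defaults simply mirror the example above. Values are kept as strings because that is how the examples show them; whether the service also accepts JSON numbers is not stated here.

```python
import json


def build_synthesize_message(
    access_token: str,
    voice_name: str = "Emily_Premium",
    sample_rate: str = "24000",
    volume: str = "1.0",
    rate: str = "1.0",
    audio_format: str = "pcm",
) -> str:
    """Serialize the start message; field names follow the request format above."""
    if not access_token:
        raise ValueError("an access token is required in the Authorization field")
    return json.dumps(
        {
            "message-name": "synthesize",
            "audio-format": audio_format,
            "voice-name": voice_name,
            "sample-rate": sample_rate,
            "volume": volume,
            "rate": rate,
            "Authorization": access_token,
        }
    )
```

In practice the token would be read at run time, for example from an environment variable or a secret store, rather than typed into the script.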
5. Text Message
After the synthesis process is initialized, the text to be synthesized should be sent using the add-tts-text message.
Request Format
{
"message-name": "add-tts-text",
"text": "<text-to-synthesize>"
}
Example
{
"message-name": "add-tts-text",
"text": "Hello world. How are you today?"
}
Parameter Descriptions
| Parameter | Description | Example Value |
|---|---|---|
| message-name | Specifies the message type. | add-tts-text |
| text | Contains the text to be synthesized. | Hello world. How are you today? |
| flush | Optional parameter. If set to true, yes, or 1, the text message is flushed immediately without sending a separate flush message. | true |
Inline Flush Usage
The add-tts-text message also supports an optional flush parameter. When this parameter is set to a truthy value such as true, yes, or 1, the text message is flushed immediately. In this case, a separate flush message is not required for that specific text input.
Example
{
"message-name": "add-tts-text",
"text": "Hello world. How are you today?",
"flush": "true"
}
If the flush parameter is not provided, a separate flush message should be sent after the text message to trigger synthesis.
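A small builder makes the two variants explicit. This is a sketch with a hypothetical function name; it emits the string "true" for the inline flush, matching the example above.

```python
import json


def build_add_tts_text(text: str, flush: bool = False) -> str:
    """Serialize an add-tts-text message, optionally with the inline flush."""
    message = {"message-name": "add-tts-text", "text": text}
    if flush:
        # The flush key is omitted entirely when not requested, in which case
        # a separate flush message is expected to follow.
        message["flush"] = "true"
    return json.dumps(message)
```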
6. Flush Message
The flush message indicates that the current text input is complete and should be processed by the service. It should be sent after the text message, and it is required only when the add-tts-text message does not include the optional flush parameter.
Request Format
{
"message-name": "flush"
}
7. Stop Message
The stop message is used to stop the synthesis process.
Request Format
{
"message-name": "stop"
}
8. Finalize Synthesis Message
The finalize-synthesis message is used to finalize and close the synthesis flow.
Request Format
{
"message-name": "finalize-synthesis"
}
9. Sample Message Sequence
The following examples show the two possible message sequences to be sent after the WebSocket connection is established.
Option 1: Text Message Followed by Separate Flush
1. Start Message
{
"message-name": "synthesize",
"audio-format": "pcm",
"voice-name": "Emily_Premium",
"sample-rate": "24000",
"volume": "1.0",
"rate": "1.0",
"Authorization": "<access-token>"
}
2. Text Message
{
"message-name": "add-tts-text",
"text": "Hello world. How are you today?"
}
3. Flush Message
{
"message-name": "flush"
}
4. Stop Message
{
"message-name": "stop"
}
5. Finalize Message
{
"message-name": "finalize-synthesis"
}
Option 2: Text Message with Inline Flush
1. Start Message
{
"message-name": "synthesize",
"audio-format": "pcm",
"voice-name": "Emily_Premium",
"sample-rate": "24000",
"volume": "1.0",
"rate": "1.0",
"Authorization": "<access-token>"
}
2. Text Message with Inline Flush
{
"message-name": "add-tts-text",
"text": "Hello world. How are you today?",
"flush": "true"
}
3. Stop Message
{
"message-name": "stop"
}
4. Finalize Message
{
"message-name": "finalize-synthesis"
}
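Both options can be driven by one routine. The sketch below takes the sending function as a parameter, so it can be exercised without a network connection. With a WebSocket client library, that parameter would be the connection's own send method; for instance, the third-party websockets package exposes an awaitable send on its connections. The name run_synthesis_flow and the fake sender are illustrative assumptions.

```python
import asyncio
import json
from typing import Awaitable, Callable

Send = Callable[[str], Awaitable[None]]


async def run_synthesis_flow(
    send: Send, start_message: dict, text: str, inline_flush: bool = False
) -> None:
    """Send the messages in the documented order through `send`."""
    await send(json.dumps(start_message))
    text_message = {"message-name": "add-tts-text", "text": text}
    if inline_flush:
        text_message["flush"] = "true"
    await send(json.dumps(text_message))
    if not inline_flush:
        await send(json.dumps({"message-name": "flush"}))
    await send(json.dumps({"message-name": "stop"}))
    await send(json.dumps({"message-name": "finalize-synthesis"}))


# Dry run: a fake sender records the messages instead of using a socket.
sent: list[str] = []


async def record(message: str) -> None:
    sent.append(message)


start = {
    "message-name": "synthesize",
    "audio-format": "pcm",
    "voice-name": "Emily_Premium",
    "sample-rate": "24000",
    "volume": "1.0",
    "rate": "1.0",
    "Authorization": "<access-token>",
}
asyncio.run(run_synthesis_flow(record, start, "Hello world. How are you today?"))
print([json.loads(m)["message-name"] for m in sent])
# prints ['synthesize', 'add-tts-text', 'flush', 'stop', 'finalize-synthesis']
```

The sketch only sends. How and when the service returns audio is not covered in this document, so a real client would likely need to read incoming data before sending stop.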
10. Testing via Postman
- Create a new WebSocket Request in Postman.
- Enter the WebSocket URL using the following format:
wss://<tts-service-host>/synthesizer
- Connect to the WebSocket endpoint.
- Send the synthesize message first.
- Send the text to be synthesized using the add-tts-text message.
- Send the flush message to trigger processing of the provided text. This step can be skipped if the add-tts-text message includes the optional flush parameter.
- Send the stop message if the synthesis process needs to be stopped.
- Send the finalize-synthesis message to complete the synthesis flow.
11. Important Notes
- Messages should only be sent after the WebSocket connection is successfully established.
- The first message must be synthesize.
- A valid access token must be provided in the Authorization field.
- Access tokens should never be shared openly in documentation.
- Environment-specific endpoint information should be replaced with placeholders such as <tts-service-host>.
- The voice-name value must be one of the voices supported in the relevant environment.
- The sample-rate, audio-format, volume, and rate values must be compatible with the values supported by the service.
- If the optional flush parameter is used in the add-tts-text message, a separate flush message is not required for that text input.
- If the optional flush parameter is not used, the flush message should be sent separately after the text message.
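Some of these notes can be checked on the client side before anything is sent. The following is a rough sketch with a hypothetical function name. It covers only the ordering, token, and flush rules, and it matches the flush values true, yes, and 1 exactly as written, since case handling is not specified here.

```python
INLINE_FLUSH_VALUES = {"true", "yes", "1"}


def check_sequence(messages: list[dict]) -> list[str]:
    """Return a list of problems found in a planned message sequence."""
    problems = []
    if not messages or messages[0].get("message-name") != "synthesize":
        problems.append("first message must be synthesize")
    elif not messages[0].get("Authorization"):
        problems.append("synthesize message has no Authorization value")

    unflushed_text = False
    for message in messages:
        name = message.get("message-name")
        if name == "add-tts-text":
            # Text without an inline flush needs a later flush message.
            if message.get("flush") not in INLINE_FLUSH_VALUES:
                unflushed_text = True
        elif name == "flush":
            unflushed_text = False
    if unflushed_text:
        problems.append("add-tts-text has no inline flush and no flush message follows")
    return problems
```

An empty list means none of the checked rules were violated; it does not confirm that the voice, sample rate, or other values are supported by the target environment.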
