Analyzes conversational speech. This node detects:
- The level of tension between the parties involved
- Interruptions of one party's speech by the other
- Speaking rates (number of words per second)
- Periods of silence when both parties remain quiet
- Hesitations
- Sentiment of the transcribed speech
- Speech overlaps
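As a rough illustration of how metrics such as overlaps and silences can be derived, the sketch below intersects per-channel speech segments. This is a minimal, hypothetical implementation, not the node's actual algorithm; segment times are assumed to be in milliseconds.

```python
# Hypothetical sketch: deriving overlap and silence intervals from
# per-channel speech segments (times in milliseconds).

def invert(segments, total):
    """Return the gaps between segments within [0, total)."""
    gaps, cursor = [], 0
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total:
        gaps.append((cursor, total))
    return gaps

def intersect(a, b):
    """Pairwise intersection of two interval lists."""
    result = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start < end:
                result.append((start, end))
    return sorted(result)

agent = [(0, 2000), (5000, 9000)]
customer = [(1500, 4000), (8000, 10000)]
total = 10000

# Overlap: intervals where both channels speak at once.
overlap = intersect(agent, customer)
# Silence: intervals where neither channel speaks.
silence = intersect(invert(agent, total), invert(customer, total))
```

Here `overlap` evaluates to `[(1500, 2000), (8000, 9000)]` and `silence` to `[(4000, 5000)]`, matching the start/end/duration style of the node's output fields.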
Inputs
Events:
| name | description |
|---|---|
| SR Milestone | Most of the fields examined by the Conversation Analyzer come from this event. Features such as silence durations, speech rates, and interrupts are obtained by compiling channel-based events from the sr-milestone. |
| Emotion | Information from the emotion service is compiled per segment and channel for display in the output. For example, the output can indicate that the agent channel exhibits an "angry" emotion between seconds 4 and 8. |
| Sentiment | Results from the sentiment service are analyzed per agent and customer, and the sentiment of each segment is reflected in the output. |
Audio:
none
Outputs
| name | subname | description |
|---|---|---|
| agent | | The analysis of the agent's side of the conversation is located under the "agent" node. |
| | anger | The emotional state detected from the channel's audio, shown with specific time intervals. Results can appear as "angry", "normal", etc. |
| | blocks | Sections where the agent speaks continuously without interruption are called "blocks". The output includes the total block count, the block count per minute, and the start time, end time, and duration of each block. |
| | hesitations | The "pause" information specific to the agent, along with the corresponding time intervals. |
| | interrupt | The sections where the agent interrupts the customer's speech. |
| | speed_letter_per_second | The speaking rate, obtained by dividing the channel's word count by its total speech duration. |
| | sr | The speech recognition results for the audio from the "agent" channel, including time, text, and duration. |
| customer | | The analysis of the customer's side of the conversation is located under the "customer" node. |
| | anger | The emotional state detected from the channel's audio, shown with specific time intervals. Results can appear as "angry", "normal", etc. |
| | blocks | Sections where the customer speaks continuously without interruption are called "blocks". The output includes the total block count, the block count per minute, and the start time, end time, and duration of each block. |
| | hesitations | The "pause" information specific to the customer, along with the corresponding time intervals. |
| | interrupt | The sections where the customer interrupts the agent's speech. |
| | speed_letter_per_second | The speaking rate, obtained by dividing the channel's word count by its total speech duration. |
| | sr | The speech recognition results for the audio from the "customer" channel, including time, text, and duration. |
| overlap | | Sections where both channels speak simultaneously throughout the dialogue. Each overlap entry consists of a start time, end time, and duration. |
| silence | | Sections where no audio is received from either channel throughout the dialogue. Each silence entry consists of a start time, end time, and duration. |
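To make the table above concrete, the snippet below sketches the shape the output might take. Field names follow the table; all values and the nesting details are hypothetical examples, not real node output.

```python
# Illustrative sketch of the Conversation Analyzer output shape.
# Field names come from the Outputs table; values are made up.
example_output = {
    "agent": {
        "anger": [{"start": 4.0, "end": 8.0, "state": "angry"}],
        "blocks": {
            "count": 2,
            "per_minute": 1.0,
            "items": [{"start": 0.0, "end": 12.5, "duration": 12.5}],
        },
        "hesitations": [{"start": 3.1, "end": 3.6}],
        "interrupt": [{"start": 20.0, "end": 21.2}],
        "speed_letter_per_second": 2.4,
        "sr": [{"time": 0.0, "duration": 12.5, "text": "hello"}],
    },
    "customer": {},  # mirrors the sub-fields of "agent"
    "overlap": [{"start": 20.0, "end": 21.2, "duration": 1.2}],
    "silence": [{"start": 30.0, "end": 33.0, "duration": 3.0}],
}
```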
Audio:
none
Remarks:
To obtain data from the Conversation Analyzer, it must be configured together with the events listed under "Inputs".
License:
none
Project Structure
Conversation Analyzer is probably the most complex node, as it requires the output of several nodes working together. The project structure can be seen in the default project `ca-offline`. The node requires the `agent` and `customer` to be separated into two channels. If a mono audio input is given, the Speaker Diarizer node is used for the separation. The appropriate audio segments are then filtered and used for Gender, Emotion, and Sentiment analysis.
Audio Segment Picker and Flush Barrier are used so that the most appropriate segment (5000-10000 milliseconds) is picked for Language Identification before the transcripts are generated in the SR Http node.
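The segment-picking step can be sketched as a simple duration filter: keep only segments whose length falls in the 5000-10000 millisecond window before sending them to Language Identification. The function and field names below are illustrative assumptions, not the node's real API.

```python
# Hypothetical sketch of the Audio Segment Picker's duration filter.
# Segments outside the 5000-10000 ms window are dropped before
# language identification.

def pick_segments(segments, min_ms=5000, max_ms=10000):
    return [s for s in segments
            if min_ms <= (s["end"] - s["start"]) <= max_ms]

segments = [
    {"start": 0, "end": 3000},      # 3000 ms: too short, dropped
    {"start": 3000, "end": 9500},   # 6500 ms: within window, kept
    {"start": 9500, "end": 25000},  # 15500 ms: too long, dropped
]
picked = pick_segments(segments)
```

With the sample data above, only the middle segment survives the filter.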
Supported flow types: Batch