Conversation Analyzer
  • 19 Jul 2024
  • 3 Minutes to read

The Conversation Analyzer node analyzes conversational speech. It detects:

  • The level of tension between the parties involved
  • Interruptions in each other's speech
  • Speaking rates (number of words per second)
  • Periods of silence when both parties remain quiet
  • Hesitations
  • Sentiment of the speech text
  • Speech overlaps
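As a sketch of how features like these can be derived, the following Python snippet computes overlaps, silences, and a speaking rate from per-channel timestamped speech segments. The segment format `(start, end, text)` is an assumption for illustration; this is not the node's actual implementation.

```python
# Illustrative sketch only: derive conversation features from
# per-channel speech segments, each given as (start_sec, end_sec, text).

def overlaps(agent_segments, customer_segments):
    """Intervals where both channels speak at the same time."""
    result = []
    for a_start, a_end, _ in agent_segments:
        for c_start, c_end, _ in customer_segments:
            start, end = max(a_start, c_start), min(a_end, c_end)
            if start < end:
                result.append((start, end, end - start))
    return result

def silences(agent_segments, customer_segments, total_duration):
    """Intervals where neither channel produces speech."""
    spans = sorted((s, e) for s, e, _ in agent_segments + customer_segments)
    result, cursor = [], 0.0
    for start, end in spans:
        if start > cursor:
            result.append((cursor, start, start - cursor))
        cursor = max(cursor, end)
    if cursor < total_duration:
        result.append((cursor, total_duration, total_duration - cursor))
    return result

def speaking_rate(segments):
    """Words per second over a channel's total speech time."""
    words = sum(len(text.split()) for _, _, text in segments)
    speech_time = sum(end - start for start, end, _ in segments)
    return words / speech_time if speech_time else 0.0
```

In the real node these values are compiled from the sr-milestone channel events rather than computed from raw segment lists.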

Inputs

Events:

| Name | Description |
| --- | --- |
| SR Milestone | Most of the fields examined by the Conversation Analyzer come from this event. Features such as silence durations, speech rates, and interrupts are obtained by compiling the channel-based events from sr-milestone. |
| Emotion | Information from the emotion service is compiled and displayed in the output by segment and channel. For example, the output can indicate that the agent channel exhibits "angry" emotion between seconds 4 and 8. |
| Sentiment | Results from the sentiment service are analyzed per agent and customer, and the sentiment of each segment is reflected in the output. |

Audio:
none

Outputs

| Name | Subname | Description |
| --- | --- | --- |
| agent | | The analysis belonging to the agent is located under the "agent" node. |
| | anger | The emotion state detected from the channel's audio, reported with specific time intervals. Results can appear as "angry", "normal", etc. |
| | blocks | Sections where the agent speaks continuously without interruption are called "blocks". An example output includes the total block count, the block count per minute, and the start-end time and duration of each block. |
| | hesitations | The agent's "pause" information, along with its time intervals. |
| | interrupt | Sections where the agent interrupts the customer's speech. |
| | speed_letter_per_second | The channel's word count divided by its total speech duration. |
| | sr | The "speech recognition" results for the audio from the "agent" channel, including time, text, and duration. |
| customer | | The analysis belonging to the customer is located under the "customer" node. |
| | anger | The emotion state detected from the channel's audio, reported with specific time intervals. Results can appear as "angry", "normal", etc. |
| | blocks | Sections where the customer speaks continuously without interruption are called "blocks". An example output includes the total block count, the block count per minute, and the start-end time and duration of each block. |
| | hesitations | The customer's "pause" information, along with its time intervals. |
| | interrupt | Sections where the customer interrupts the agent's speech. |
| | speed_letter_per_second | The channel's word count divided by its total speech duration. |
| | sr | The "speech recognition" results for the audio from the "customer" channel, including time, text, and duration. |
| overlap | | Sections in which both channels speak simultaneously, listed in the "overlap" field. Each entry consists of start time, end time, and duration. |
| silence | | Sections in which no audio is received from either channel, listed in the "silence" field. Each entry consists of start time, end time, and duration. |
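To make the fields concrete, here is an illustrative shape of an analyzer result using the names above. The nesting, units, and value types are assumptions for illustration; the actual output schema may differ.

```python
# Assumed, illustrative shape of a Conversation Analyzer result.
example_output = {
    "agent": {
        "anger": [{"start": 4.0, "end": 8.0, "state": "angry"}],
        "blocks": {
            "count": 2,
            "per_minute": 1.3,
            "items": [{"start": 0.0, "end": 2.0, "duration": 2.0}],
        },
        "hesitations": [{"start": 2.4, "end": 2.9}],
        "interrupt": [{"start": 5.1, "end": 5.6}],
        "speed_letter_per_second": 11.2,
        "sr": [{"start": 0.0, "duration": 2.0, "text": "hello how can I help"}],
    },
    "customer": {},  # same subfields as the "agent" node
    "overlap": [{"start": 1.5, "end": 2.0, "duration": 0.5}],
    "silence": [{"start": 8.0, "end": 10.0, "duration": 2.0}],
}
```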

Audio:
none

Remarks:
To obtain data from the Conversation Analyzer, it must be configured together with the events listed in the Inputs section.

License:
none

Project Structure

Conversation Analyzer is probably the most complex node, as it requires the outputs of several nodes working together. The project structure can be seen in the default project ca-offline. The node requires the agent and customer to be separated into two channels; if a mono audio input is given, the Speaker Diarizer node is used to perform the separation. The appropriate audio segments are then filtered and used for Gender, Emotion, and Sentiment analysis.

Audio Segment Picker and Flush Barrier are used so that the most appropriate segment (5000-10000 milliseconds) is picked for Language Identification before the transcripts are generated in the SR Http node.
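The segment-selection step can be sketched as follows: from a list of candidate audio segments, pick one whose length falls in the 5000-10000 ms window preferred for Language Identification. This is an assumed approximation of what Audio Segment Picker does, not its actual code.

```python
# Sketch of segment selection for Language Identification.
# Segments are (start_ms, end_ms) pairs; the window is taken from the
# text above, the fallback strategy is an assumption.
PREFERRED_MIN_MS = 5000
PREFERRED_MAX_MS = 10000

def pick_segment(segments_ms):
    """Return the longest segment within the preferred window,
    falling back to the segment closest to the window's midpoint."""
    in_window = [s for s in segments_ms
                 if PREFERRED_MIN_MS <= (s[1] - s[0]) <= PREFERRED_MAX_MS]
    if in_window:
        return max(in_window, key=lambda s: s[1] - s[0])
    mid = (PREFERRED_MIN_MS + PREFERRED_MAX_MS) / 2
    return min(segments_ms, key=lambda s: abs((s[1] - s[0]) - mid))
```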

[Figure: project structure of the ca-offline default project]

Supported flow types: Batch

