This document summarizes the performance tests run on our product. It reports dialogue, segment, and translation durations, together with throughput and average delay, across threading configurations and segment lengths.

Sample Dialogue Durations:
Short Segments (in seconds)
| Threads | 1 | 10 | 50 |
|---|---|---|---|
| Dialogue Duration | 496.0 | 602.0 | 1845.0 |
| Talking Duration | 191.64 | 191.64 | 191.64 |
| Playing Audio Duration | 191.64 | 191.64 | 191.64 |
| Translation Duration | 111.0 | 223.0 | 1458.0 |
| Average Segment Duration | 3.6 | 3.6 | 3.6 |
| Average Translation Duration | 2.1 | 4.2 | 27.5 |
Long Segments (in seconds)
| Threads | 1 | 10 | 50 |
|---|---|---|---|
| Dialogue Duration | 477.0 | 478.0 | 997.0 |
| Talking Duration | 210.0 | 210.0 | 210.0 |
| Playing Audio Duration | 210.0 | 210.0 | 210.0 |
| Translation Duration | 57.0 | 58.0 | 577.0 |
| Average Segment Duration | 10.5 | 10.5 | 10.5 |
| Average Translation Duration | 2.85 | 2.9 | 28.85 |
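
The long-segment sample above is internally consistent, which a few lines of arithmetic can confirm. The sketch below rests on two assumptions inferred from the numbers, not stated in the source: dialogue duration is the sum of talking, playback, and translation time, and the average translation duration is the total translation time divided by the segment count. The function and field names are illustrative.

```python
import math

# Long-segment sample values copied from the table above (seconds).
LONG_SAMPLE = {
    1: {"dialogue": 477.0, "talking": 210.0, "playing": 210.0, "translation": 57.0},
    10: {"dialogue": 478.0, "talking": 210.0, "playing": 210.0, "translation": 58.0},
    50: {"dialogue": 997.0, "talking": 210.0, "playing": 210.0, "translation": 577.0},
}
AVG_SEGMENT_DURATION = 10.5  # seconds


def derive(row, avg_segment):
    """Recompute derived metrics under the assumptions named in the lead-in."""
    segments = row["talking"] / avg_segment  # inferred segment count (assumption)
    return {
        "segments": segments,
        "dialogue": row["talking"] + row["playing"] + row["translation"],
        "avg_translation": row["translation"] / segments,
    }


for threads, row in LONG_SAMPLE.items():
    d = derive(row, AVG_SEGMENT_DURATION)
    # The recomputed dialogue duration should equal the reported one.
    assert math.isclose(d["dialogue"], row["dialogue"])
    print(threads, round(d["segments"]), round(d["avg_translation"], 2))
```

Under these assumptions the long-segment sample works out to 20 segments, and the recomputed averages agree with the table. The short-segment sample matches only approximately (for example, 191.64 + 191.64 + 111.0 = 494.28 against a reported 496.0), so the relationship should be read as a plausibility check, not as a definition.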
Average Durations for 100 Dialogues:
Short Segments (in seconds)
| Threads | 1 | 10 | 50 |
|---|---|---|---|
| Average Dialogue Duration | 467.53 | 541.08 | 1464.91 |
| Average Segment Duration | 4.10 | 4.10 | 4.10 |
| Average Translation Duration | 2.15 | 3.75 | 24.17 |
Long Segments (in seconds)
| Threads | 1 | 10 | 50 |
|---|---|---|---|
| Average Dialogue Duration | 539.79 | 541.39 | 1139.21 |
| Average Segment Duration | 11.95 | 11.95 | 11.95 |
| Average Translation Duration | 3.09 | 3.17 | 33.06 |
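
To make the scaling in the two tables above easier to compare, the following sketch expresses each average translation duration as a multiple of the 1-thread value. The helper name `slowdown` is illustrative and not part of the test tooling.

```python
# Average translation duration per segment (seconds), from the 100-dialogue tables above.
AVG_TRANSLATION = {
    "short": {1: 2.15, 10: 3.75, 50: 24.17},
    "long": {1: 3.09, 10: 3.17, 50: 33.06},
}


def slowdown(durations, baseline_threads=1):
    """Return each duration divided by the duration at the baseline thread count."""
    base = durations[baseline_threads]
    return {threads: value / base for threads, value in durations.items()}


for kind, durations in AVG_TRANSLATION.items():
    ratios = slowdown(durations)
    print(kind, {threads: round(ratio, 2) for threads, ratio in ratios.items()})
```

With these figures, translation takes roughly 11 times longer per segment at 50 threads than at 1 thread for both segment lengths, while at 10 threads long segments are nearly unaffected and short segments slow down by about 1.7 times.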
Segment Statistics
| | Min | Avg | Max |
|---|---|---|---|
| Short Segments | 1 | 4 | 29 |
| Long Segments | 5 | 12 | 19 |
Comparable Summary
Configurations: 1 Thread – 200 Samples and 10 Thread – 20 Samples. The values listed match the 1 thread – 200 samples rows of the per-test tables.
| Case | Throughput | Average Delay |
|---|---|---|
| All Pipeline | 0.21 | 4875.53 |
| Only Whisper | 0.49 | 2038.53 |
| Only Whisper (Params Changed) | 0.49 | 2045.80 |
| Only Google Translate | 3.16 | 315.86 |
| Only Azure TTS | 3.11 | 320.76 |
Note: Tests labeled "Only" still perform network-based audio input/output and save audio to disk, so they measure more than inference alone.
Two Server Tests
Servers: Server1, Server2
| Test | Omp_num_threads | Capacity Limiter |
|---|---|---|
| (5 threads – 20 samples) x 2 | 4 | 4 |
All Pipeline Tests
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 0.21 | 4875.53 |
| 5 threads – 40 samples | 4 | 4 | 0.49 | 9935.84 |
| 10 threads – 20 samples | 4 | 4 | 0.50 | 19451.39 |
| 1 thread – 200 samples | 8 | 8 | 0.18 | 5443.84 |
| 5 threads – 40 samples | 8 | 8 | 0.51 | 9406.90 |
| 10 threads – 20 samples | 8 | 8 | 0.54 | 18050.80 |
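
The throughput and delay columns above appear to be related through Little's law: with N requests in flight, throughput is approximately N divided by the time per request. The sketch below checks this for the rows with Omp_num_threads = 4. It assumes that Average Delay is in milliseconds, that Throughput is in samples per second, and that the thread count equals the number of concurrent requests; the source states none of these explicitly.

```python
def estimated_throughput(concurrency, avg_delay_ms):
    """Little's law estimate: requests in flight divided by seconds per request."""
    return concurrency / (avg_delay_ms / 1000.0)


# (threads, reported throughput, reported average delay) for Omp_num_threads = 4.
ROWS = [
    (1, 0.21, 4875.53),
    (5, 0.49, 9935.84),
    (10, 0.50, 19451.39),
]

for threads, reported, delay_ms in ROWS:
    estimate = estimated_throughput(threads, delay_ms)
    print(f"{threads:>2} threads: reported {reported:.2f}, estimated {estimate:.2f}")
```

For these three rows the estimate lands within about 0.02 of the reported throughput, which supports the assumed units. It also suggests that beyond 5 threads the extra concurrency mostly adds queueing delay instead of throughput.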
All Pipeline - Whisper Param Changed
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 0.20 | 4926.17 |
| 5 threads – 40 samples | 4 | 4 | 0.59 | 8296.78 |
| 10 threads – 20 samples | 4 | 4 | 0.55 | 17268.97 |
Only Whisper Tests
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 0.49 | 2038.53 |
| 5 threads – 40 samples | 4 | 4 | 0.55 | 8685.92 |
| 10 threads – 20 samples | 4 | 4 | 0.54 | 18019.95 |
Model Parameters Changed
cpu_threads=os.environ["OMP_NUM_THREADS"]
num_workers=os.environ["OMP_NUM_THREADS"]
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 0.49 | 2045.80 |
| 5 threads – 40 samples | 4 | 4 | 0.71 | 6893.10 |
| 10 threads – 20 samples | 4 | 4 | 0.70 | 13391.11 |
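
The two parameters listed above could be derived from the environment as in the sketch below. One detail worth noting is that `os.environ` values are strings, so an integer conversion is likely needed before they are passed to a model constructor. The helper name `whisper_thread_kwargs` and the fallback value of 4 are assumptions made for illustration. The commented usage assumes the faster-whisper library, whose `WhisperModel` constructor accepts `cpu_threads` and `num_workers`; the source does not name the library it used.

```python
import os


def whisper_thread_kwargs(default=4):
    """Build cpu_threads/num_workers keyword arguments from OMP_NUM_THREADS."""
    # os.environ values are strings, so convert to int before passing them on.
    n = int(os.environ.get("OMP_NUM_THREADS", default))
    return {"cpu_threads": n, "num_workers": n}


# Hypothetical usage, assuming faster-whisper is installed:
# from faster_whisper import WhisperModel
# model = WhisperModel("small", device="cpu", **whisper_thread_kwargs())
print(whisper_thread_kwargs())
```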
Only Google Translate
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 3.16 | 315.86 |
| 5 threads – 40 samples | 4 | 4 | 9.59 | 476.84 |
| 10 threads – 20 samples | 4 | 4 | | 918.03 |
Only Azure TTS
| Test | Omp_num_threads | Capacity Limiter | Throughput | Average Delay |
|---|---|---|---|---|
| 1 thread – 200 samples | 4 | 4 | 3.11 | 320.76 |
| 5 threads – 40 samples | 4 | 4 | 10.13 | 468.56 |
| 10 threads – 20 samples | 4 | 4 | 9.95 | 950.62 |
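
For readers who want to reproduce measurements of this shape, the following is a minimal sketch of a harness that runs a task from several threads behind a concurrency limit and reports throughput and average delay. It is not the tooling used for the tests above. It assumes that "N threads – M samples" means M samples per thread and that the capacity limiter caps the number of tasks running at once; a `threading.Semaphore` stands in for that limiter, and `run_benchmark` and its parameters are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(task, threads, samples_per_thread, capacity):
    """Run task() from several threads; return (throughput per second, average delay in ms)."""
    limiter = threading.Semaphore(capacity)  # stand-in for the capacity limiter
    delays = []
    lock = threading.Lock()

    def worker():
        for _ in range(samples_per_thread):
            start = time.perf_counter()
            with limiter:  # at most `capacity` tasks run at once
                task()
            # The delay includes time spent waiting for the limiter.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            with lock:
                delays.append(elapsed_ms)

    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker) for _ in range(threads)]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    total_seconds = time.perf_counter() - began
    return len(delays) / total_seconds, sum(delays) / len(delays)


# Dummy task standing in for a pipeline call.
throughput, avg_delay = run_benchmark(
    lambda: time.sleep(0.01), threads=5, samples_per_thread=4, capacity=4
)
print(f"throughput={throughput:.2f}/s, average delay={avg_delay:.2f} ms")
```

Measuring the delay from before the limiter is acquired means queueing time counts toward the reported delay, which is one plausible reason delay grows with thread count while throughput levels off.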
