Performance Tests & Hardware Sizing

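This document summarizes the performance tests run on our product. It reports dialogue, segment, and translation durations for short and long segments at 1, 10, and 50 threads, followed by throughput and average delay for the full pipeline and for its individual stages (Whisper, Google Translate, Azure TTS).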

Sample Dialogue Durations:

Short Segments (in seconds)

Threads	1	10	50
Dialogue Duration	496.0	602.0	1845.0
Talking Duration	191.64	191.64	191.64
Playing Audio Duration	191.64	191.64	191.64
Translation Duration	111.0	223.0	1458.0
Average Segment Duration	3.6	3.6	3.6
Average Translation Duration	2.1	4.2	27.5

Long Segments (in seconds)

Threads	1	10	50
Dialogue Duration	477.0	478.0	997.0
Talking Duration	210.0	210.0	210.0
Playing Audio Duration	210.0	210.0	210.0
Translation Duration	57.0	58.0	577.0
Average Segment Duration	10.5	10.5	10.5
Average Translation Duration	2.85	2.9	28.85
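
The sample figures above appear to follow a simple relationship: dialogue duration is the sum of talking, playing, and translation time, and the average translation duration is the total translation time divided by the number of segments. This is inferred from the numbers rather than stated by the tests, so the Python sketch below only checks it against the long-segment values (the short-segment values agree to within a few seconds). The function name is illustrative.

```python
# Values copied from the "Long Segments" sample dialogue table (seconds).
TALKING = 210.0
PLAYING = 210.0
AVG_SEGMENT = 10.5
TRANSLATION = {1: 57.0, 10: 58.0, 50: 577.0}          # threads -> total translation time
REPORTED_DIALOGUE = {1: 477.0, 10: 478.0, 50: 997.0}  # threads -> reported dialogue time


def breakdown(talking, playing, avg_segment, translation):
    """Return (dialogue duration, average translation time per segment).

    Assumes the three phases run back to back, which is an inference
    from the reported figures, not a documented property of the test.
    """
    segments = round(talking / avg_segment)
    return talking + playing + translation, translation / segments


for threads, total in TRANSLATION.items():
    dialogue, avg_translation = breakdown(TALKING, PLAYING, AVG_SEGMENT, total)
    print(f"{threads:>2} threads: dialogue={dialogue:.1f} s "
          f"(reported {REPORTED_DIALOGUE[threads]:.1f} s), "
          f"avg translation={avg_translation:.2f} s")
```

Only the translation term grows with the thread count; talking and playing time are fixed by the audio itself.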

Average Durations for 100 Dialogues:

Short Segments (in seconds)

Threads	1	10	50
Average Dialogue Duration	467.53	541.08	1464.91
Average Segment Duration	4.10	4.10	4.10
Average Translation Duration	2.15	3.75	24.17

Long Segments (in seconds)

Threads	1	10	50
Average Dialogue Duration	539.79	541.39	1139.21
Average Segment Duration	11.95	11.95	11.95
Average Translation Duration	3.09	3.17	33.06
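
One way to read the 100-dialogue averages for sizing is to compare the average translation duration with the average segment duration. The sketch below computes that ratio in Python. Treating a ratio above 1.0 as "translation falls behind the speaker" is an interpretation added here, not a criterion defined by the tests.

```python
# Averages over 100 dialogues, copied from the tables above (seconds).
AVG_SEGMENT = {"short": 4.10, "long": 11.95}
AVG_TRANSLATION = {
    "short": {1: 2.15, 10: 3.75, 50: 24.17},
    "long": {1: 3.09, 10: 3.17, 50: 33.06},
}


def translation_ratio(kind, threads):
    """Average translation time divided by average segment length."""
    return AVG_TRANSLATION[kind][threads] / AVG_SEGMENT[kind]


for kind in AVG_SEGMENT:
    for threads in (1, 10, 50):
        ratio = translation_ratio(kind, threads)
        # The 1.0 threshold is an assumed rule of thumb for this sketch.
        status = "keeps up" if ratio <= 1.0 else "falls behind"
        print(f"{kind:>5} segments, {threads:>2} threads: "
              f"ratio={ratio:.2f} ({status})")
```

Under this reading, both segment types stay below 1.0 at 1 and 10 threads and exceed it at 50 threads.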

Segment Statistics (in seconds)

Segment Type	Min	Avg	Max
Short Segments	1	4	29
Long Segments	5	12	19

Comparable Summary

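The values below are from the 1 thread – 200 samples runs and match the corresponding rows of the per-test tables that follow. The 10 threads – 20 samples results are listed in those tables.

Case	Throughput	Average Delay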
All Pipeline	0.21	4875.53
Only Whisper	0.49	2038.53
Only Whisper (Params Changed)	0.49	2045.80
Only Google Translate	3.16	315.86
Only Azure TTS	3.11	320.76

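Note: In tests labeled "Only", network-based audio input/output and saving audio to disk remain enabled, so the measured times include more than model inference.

Units for throughput and average delay are not stated. In the summary above, throughput multiplied by average delay is close to 1000 in every row, which is consistent with throughput in samples per second and delay in milliseconds.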

Two Server Tests

	Server1	Server2
Test	OMP_NUM_THREADS	Capacity Limiter
(5 threads – 20 samples) x 2	4	4

All Pipeline Tests

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	0.21	4875.53
5 threads – 40 samples	4	4	0.49	9935.84
10 threads – 20 samples	4	4	0.50	19451.39
1 thread – 200 samples	8	8	0.18	5443.84
5 threads – 40 samples	8	8	0.51	9406.90
10 threads – 20 samples	8	8	0.54	18050.80
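
The throughput and delay columns can be cross-checked against each other. Assuming each test thread sends one sample at a time and waits for the result, and assuming delay is in milliseconds and throughput in samples per second (neither is stated in the tables), Little's law gives throughput as threads divided by average delay. The Python sketch below applies that check to the rows above.

```python
# (threads, OMP_NUM_THREADS, reported throughput, reported average delay)
# copied from the "All Pipeline Tests" table.
ROWS = [
    (1, 4, 0.21, 4875.53),
    (5, 4, 0.49, 9935.84),
    (10, 4, 0.50, 19451.39),
    (1, 8, 0.18, 5443.84),
    (5, 8, 0.51, 9406.90),
    (10, 8, 0.54, 18050.80),
]


def implied_throughput(threads, avg_delay_ms):
    """Little's law for a closed loop: throughput = concurrency / latency.

    Assumes delay is in milliseconds and the result is in samples per second.
    """
    return threads / (avg_delay_ms / 1000.0)


for threads, omp, reported, delay_ms in ROWS:
    implied = implied_throughput(threads, delay_ms)
    print(f"threads={threads:>2} OMP_NUM_THREADS={omp}: "
          f"reported={reported:.2f} implied={implied:.2f}")
```

The implied values stay within a few hundredths of the reported ones. Going from 5 to 10 threads roughly doubles the delay while throughput barely moves, which suggests the pipeline is already saturated at about 0.5 in these runs.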

All Pipeline - Whisper Param Changed

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	0.20	4926.17
5 threads – 40 samples	4	4	0.59	8296.78
10 threads – 20 samples	4	4	0.55	17268.97

Only Whisper Tests

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	0.49	2038.53
5 threads – 40 samples	4	4	0.55	8685.92
10 threads – 20 samples	4	4	0.54	18019.95

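Only Whisper - Model Parameters Changed

For these runs, the following Whisper model parameters were set from OMP_NUM_THREADS:

# os.environ values are strings, so convert them to integers.
cpu_threads=int(os.environ["OMP_NUM_THREADS"])
num_workers=int(os.environ["OMP_NUM_THREADS"])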
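
Environment variables are read as strings, so thread settings like these need converting before they are passed to a model. The Python sketch below shows one way to build both parameters from OMP_NUM_THREADS. The helper name, the fallback value of 4, and the commented-out faster-whisper call are assumptions for illustration; the tests do not say which Whisper implementation or model size was used.

```python
import os


def whisper_cpu_kwargs(env=os.environ, default=4):
    """Build cpu_threads/num_workers from OMP_NUM_THREADS.

    `default` is a hypothetical fallback used when the variable is unset.
    """
    n = int(env.get("OMP_NUM_THREADS", default))
    if n < 1:
        raise ValueError("OMP_NUM_THREADS must be a positive integer")
    return {"cpu_threads": n, "num_workers": n}


kwargs = whisper_cpu_kwargs({"OMP_NUM_THREADS": "4"})
print(kwargs)

# Hypothetical usage, assuming the faster-whisper package is installed
# and a "small" model is acceptable:
# from faster_whisper import WhisperModel
# model = WhisperModel("small", device="cpu", **kwargs)
```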

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	0.49	2045.80
5 threads – 40 samples	4	4	0.71	6893.10
10 threads – 20 samples	4	4	0.70	13391.11
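
To see how much the parameter change helps, the rows above can be compared with the baseline "Only Whisper Tests" rows. The Python sketch below computes the percentage change in throughput and average delay per thread count. It is plain arithmetic on the reported figures; the function name is illustrative.

```python
# threads -> (throughput, average delay), copied from the two Only Whisper tables.
BASELINE = {1: (0.49, 2038.53), 5: (0.55, 8685.92), 10: (0.54, 18019.95)}
CHANGED = {1: (0.49, 2045.80), 5: (0.71, 6893.10), 10: (0.70, 13391.11)}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


for threads in BASELINE:
    tp_before, delay_before = BASELINE[threads]
    tp_after, delay_after = CHANGED[threads]
    print(f"{threads:>2} threads: "
          f"throughput {percent_change(tp_before, tp_after):+.1f}%, "
          f"delay {percent_change(delay_before, delay_after):+.1f}%")
```

The single-thread run is essentially unchanged, while the 5- and 10-thread runs show higher throughput and lower delay.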

Only Google Translate

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	3.16	315.86
5 threads – 40 samples	4	4	9.59	476.84
10 threads – 20 samples	4	4		918.03

Only Azure TTS

Test	OMP_NUM_THREADS	Capacity Limiter	Throughput	Average Delay
1 thread – 200 samples	4	4	3.11	320.76
5 threads – 40 samples	4	4	10.13	468.56
10 threads – 20 samples	4	4	9.95	950.62
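
The single-thread delays of the three "Only" tests give a rough picture of where time goes in the pipeline. The Python sketch below computes each stage's share of the summed stage delays and compares that sum with the full-pipeline delay. It assumes the stages run one after another for each sample and that delays are in milliseconds; neither is stated by the tests. Because the "Only" tests still include audio input/output and disk writes, the shares are indicative at best.

```python
# 1 thread – 200 samples average delays, copied from the tables above.
STAGE_DELAY = {
    "Whisper": 2038.53,
    "Google Translate": 315.86,
    "Azure TTS": 320.76,
}
FULL_PIPELINE_DELAY = 4875.53


def stage_shares(delays):
    """Fraction of the summed stage delay spent in each stage."""
    total = sum(delays.values())
    return {name: delay / total for name, delay in delays.items()}


total = sum(STAGE_DELAY.values())
for name, share in stage_shares(STAGE_DELAY).items():
    print(f"{name:<17} {share:.1%} of summed stage delay")
print(f"Sum of stages: {total:.2f}, full pipeline: {FULL_PIPELINE_DELAY:.2f}, "
      f"difference: {FULL_PIPELINE_DELAY - total:.2f}")
```

Whisper accounts for most of the summed stage delay, which matches its lower throughput in the tables. The gap between the summed stages and the full pipeline is not explained by these measurements.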