The success of the Virtual Agent solution depends on the accuracy of its intent detection. The accuracy of the Virtual Agent can be tested by importing test data that includes utterances and their expected intents. The resulting percentage can then be compared with the expected KPI.
Accuracy is calculated as the ratio of the number of correct predictions to the total number of input samples.
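The calculation above can be sketched as follows. This is a minimal illustration, not part of the product; the function name and sample intents are made up for the example.

```python
# Minimal sketch: intent-detection accuracy from paired lists of
# predicted and expected intents. Names are illustrative only.
def intent_accuracy(predictions, expected):
    """Return accuracy as a percentage of correct predictions."""
    if not expected:
        raise ValueError("expected intent list must not be empty")
    correct = sum(1 for p, e in zip(predictions, expected) if p == e)
    return 100.0 * correct / len(expected)

score = intent_accuracy(
    ["check_balance", "transfer_money", "fallback"],
    ["check_balance", "transfer_money", "greeting"],
)
print(f"{score:.1f}%")  # 2 of 3 predictions are correct
```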
The accuracy test is done via the interface with the following steps:
- The user should go to the intent page of the VA. To test the utterances, there is a Test button on the right side of the page; by clicking this button, the user can first download the test template.
- In the downloaded Excel file, the user is expected to fill in the first two columns, namely Utterance and Expected Intent.
Then the user needs to upload this test list. After the Excel file is uploaded, the test accuracy percentage is shown in the list.
The user can download the test results to check the scores.
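The two-column test file described in the steps above can be sketched as follows. The actual template is an Excel file; CSV is used here only for simplicity, and the filename, utterances, and intent names are hypothetical.

```python
import csv

# Hypothetical contents of the test template: one utterance per row
# with the intent it is expected to match.
rows = [
    ("Utterance", "Expected Intent"),
    ("What is my balance?", "check_balance"),
    ("Send 50 euros to John", "transfer_money"),
]

# Write the test list to a file so it can be uploaded for testing.
with open("va_test_template.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```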
Findings are the scores observed during the tests that affect the related success rates. To improve model performance, it is suggested to train the Virtual Agent with a sufficient number of well-defined utterances for each intent.
The default confidence threshold for matching an utterance to an intent is 0.5. Below this threshold, the utterance is not matched to any intent and the fallback scenario runs. The threshold can be changed for each Virtual Agent.
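The threshold behavior can be sketched as follows. This is an illustration of the logic described above, assuming the model returns a confidence score per intent; the function and intent names are hypothetical.

```python
# Sketch of the confidence-threshold logic: pick the highest-scoring
# intent, or fall back when its confidence is below the threshold.
DEFAULT_THRESHOLD = 0.5  # configurable per Virtual Agent

def resolve_intent(scores, threshold=DEFAULT_THRESHOLD):
    """Return the best-matching intent name, or 'fallback'."""
    intent, confidence = max(scores.items(), key=lambda kv: kv[1])
    if confidence < threshold:
        return "fallback"
    return intent

print(resolve_intent({"check_balance": 0.82, "transfer_money": 0.11}))  # check_balance
print(resolve_intent({"check_balance": 0.34, "transfer_money": 0.29}))  # fallback
```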
This success rate is independent of the speech recognition (SR) success rate. For projects that include SR, the following criteria should be satisfied:
- The voice input used in the SR process should be clear and intelligible, without background noise or discontinuities.
- Necessary hardware and network infrastructure should be provided.
- The words which are used in the tests should be included in the already-defined language model and dictionary.
- Lossless encoding and compression methods should be preferred.
- To improve default SR performance, language model optimization should have been done with voice recordings received from the customer.
- Language model and acoustic model optimizations should have been done with the voice recordings recorded on the device which is being used for speech recognition.
- Measurements should be performed from at most 1 meter away from the device.
Model performance can be improved by using “fallback utterances”. When an utterance does not match any intent and goes to fallback, it is stored in the reporting database. Fallback data can be reviewed manually to improve the utterance lists of the relevant intents.
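The review step above can be sketched as follows: frequent fallback phrases are the strongest candidates to add as training utterances for an existing or new intent. The log format is hypothetical; the actual reporting database schema may differ.

```python
from collections import Counter

# Hypothetical sample of utterances that fell back (no intent matched).
fallback_log = [
    "whats my iban",
    "show my iban",
    "whats my iban",
]

# Rank fallback utterances by frequency so the most common gaps in
# the utterance lists can be reviewed first.
for utterance, count in Counter(fallback_log).most_common():
    print(f"{count}x {utterance}")
```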