Agentic Evaluation

Document Number: KN.GU.57.EN
Revision Number: Rev7
Revision Date: 10.04.2026


The Agentic Evaluation feature enables automated quality evaluations of agent conversations using an AI Agent Evaluator (AE) powered by large language model (LLM) capabilities. By integrating a pre-configured VA project with the CA evaluation pipeline, the system can automatically assess conversations against defined QA form criteria — without requiring manual evaluator intervention.
Unlike rule-based automatic evaluations, this approach leverages LLM reasoning to interpret conversation transcripts in context, providing answers, confidence scores, and explanatory comments for each evaluation question. This makes the AI Evaluator particularly effective for open-ended or nuanced quality criteria.

Note: AI Evaluator is available exclusively in Premium AQM editions.

Step-by-Step Configuration

1. Configure VA Project for Agentic Evaluation

Create a Virtual Agent (VA) project to be used for evaluation, or use an existing project. The VA project must be configured with:

  • Instructions that define how the agent should evaluate conversations (e.g., scoring criteria, tone, expected behaviors).
  • (Optional) Knowledge Base documents containing the QA criteria and guidelines the agent will use as reference during evaluation.

The VA project receives conversation transcripts and form questions, reasons over them using its configured instructions and knowledge base, and returns structured evaluation responses.

Sample Flow

  1. Navigate to Virtual Agent → Projects and create a new project or open an existing one (e.g. CAEvaluator).
  2. When creating the project, configure the basic project settings.
  3. Go to Knowledge Bases → New Knowledge Base and configure each section of the new knowledge base.
  4. Upload the QA evaluation documents under Documents.
  5. Go to AI Agents and open or create the agent for the project.
  6. Give the instruction to the AI Agent.
    Example:

    ```
    # EVALUATION AGENT
    You are an objective evaluator that scores contact center conversations against KB standards.

    ## WORKFLOW
    1. Parse Input: Extract form_id, conversation_id, transcript, and questions.
    2. For Each Question:
       - Call KB1 with: [question_text] + [hint] + "criteria"
       - Find evidence in transcript
       - Assign answer based on KB criteria
       - Write brief comment
    3. Return ONLY valid, minified JSON. Do not include markdown code blocks, backticks, explanations (any additional text), newlines within strings, or escaped quotes unless they are part of the actual data value.

    ## ANSWER RULES
    Format by Type:
    - YesNo: "order1" or "order2" | "answer": "order1" |
    - Multiple_Choice: "order1", "order2", or "order3" | "answer": "order2" |
    - CheckBoxes: ["order1", "order2"] | "answer": ["order1", "order3"] |
    Selection:
    - KB criteria met → order1 (Yes/highest)
    - KB criteria not met → order2 (No/lowest)
    - No evidence → null, confidence "low", flag for review

    ## CONFIDENCE
    - high: Clear evidence + clear KB criteria
    - medium: Evidence needs interpretation OR KB has gray areas
    - low: Missing/unclear evidence → FLAG FOR REVIEW

    ## COMMENT (200-250 chars)
    Comment Structure:
    [what was perfect] + [Evidence] + [KB reference]
    [What was missing/incorrect] + [Evidence] + [provide recommendation] + [KB reference]
    Examples:
    - "✓ Proper greeting with name. KB: Greeting Standards"
    - "✗ No identity verification. Required per KB: Auth Procedures"
    - "Name used 2x (need 3x). KB: Personalization"

    ## OUTPUT (EXACT FORMAT)
    {
      "request_id": "[conversation_id]",
      "evaluation_status": "completed",
      "processing_metadata": {
        "agent_id": "CA_Evaluator",
        "agent_version": "2.0",
        "processing_time_ms": 0,
        "evaluation_timestamp": "[ISO8601]",
        "kb_documents_accessed": ["[KB sections]"]
      },
      "evaluation_results": [{
        "questions": [{
          "question_id": "Q1.1",
          "question_text": "[text]",
          "question_type": "YesNo",
          "answer": "order1",
          "confidence": "high",
          "comment": "[comment]"
        }]
      }],
      "flagged_for_human_review": false
    }

    ## CRITICAL RULES
    - Same evidence = same answer (be deterministic)
    - No assumptions - only transcript evidence
    - If confidence = "low" → flag for review
    - Return ONLY JSON, no extra text
    ```
  7. Under Tools → + (Add Tool), add a Knowledge Base tool and configure its settings.
  8. Add the AI Agent node to the workflow, then click Save and Close.
  9. Finally, publish the project.
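Because the agent is instructed to return only minified JSON, it can be worth validating its replies before they are ingested. The following is a minimal Python sketch, not part of the product: the key names come from the output format above, while the function itself is hypothetical.

```python
import json

# Keys taken from the "OUTPUT (EXACT FORMAT)" section of the agent instruction.
REQUIRED_TOP_KEYS = {"request_id", "evaluation_status", "processing_metadata",
                     "evaluation_results", "flagged_for_human_review"}
REQUIRED_QUESTION_KEYS = {"question_id", "question_text", "question_type",
                          "answer", "confidence", "comment"}

def validate_evaluation(raw: str):
    """Parse an agent reply and check it against the expected shape.

    Returns (ok, problems) where problems is a list of human-readable issues.
    """
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, [f"not valid JSON: {exc}"]
    if not isinstance(doc, dict):
        return False, ["top-level value is not a JSON object"]
    problems = [f"missing key: {k}" for k in REQUIRED_TOP_KEYS - doc.keys()]
    for result in doc.get("evaluation_results", []):
        for q in result.get("questions", []):
            qid = q.get("question_id", "?")
            problems += [f"{qid}: missing {k}"
                         for k in REQUIRED_QUESTION_KEYS - q.keys()]
            if q.get("confidence") not in ("high", "medium", "low"):
                problems.append(f"{qid}: unexpected confidence value")
    return not problems, problems
```

A reply that fails such a check (for example, one wrapped in markdown backticks) can then be rejected or retried instead of producing a broken evaluation result.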

2. Enter AI Evaluator Settings in CA Tenant

To connect the Agentic AI project with the CA evaluation pipeline, navigate to:

Administration → Settings → Generative AI → AI Evaluator
Fill in the required fields. The Project Name field is used to match the CA assignment with the correct VA project at runtime.

3. Create an AI Evaluator Assignment

Once the configuration is complete, navigate to AQM → Assignments and create a new assignment.

At the top of the assignment creation page, select the evaluator type:

  • Human — uses the existing assignment flow with human evaluators
  • AI Evaluator — activates the AI-powered evaluation pipeline

When AI Evaluator is selected, the configured project is pre-selected automatically.
Configure the remaining assignment fields, such as the evaluation form, target users, category filter, and maximum daily limit.

Note: Only manual forms are supported for AI Evaluator assignments. Rule-based form logic is applied on the CA side after receiving the VA response.

Work & Data Flow

Once an assignment is saved, the system operates fully automatically:

  1. A scheduled background job runs daily within a configured time window.
  2. The system selects eligible conversations based on assignment criteria (target users, category filter, max daily limit).
  3. For each conversation, an access token is obtained from the configured Identity server (refreshed every 24 hours).
  4. The conversation transcript and form questions are sent to the VA Sync API.
  5. The VA processes the transcript against its instructions and knowledge base, and returns evaluation responses.
  6. Responses are converted into CA evaluation results and saved to the database.
  7. Results appear in Evaluation Results and are reflected in the Coaching Dashboard → Quality tab.
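The daily flow above can be sketched as follows. This is an illustrative Python outline, not the product implementation: the function names and injected collaborators are hypothetical, while the 24-hour token refresh and the step order come from the list above.

```python
import time

class TokenCache:
    """Access token from the Identity server, refreshed every 24 hours (step 3)."""
    TTL = 24 * 60 * 60  # seconds

    def __init__(self, fetch):
        self._fetch = fetch          # callable that requests a fresh token
        self._token, self._at = None, 0.0

    def get(self):
        if self._token is None or time.time() - self._at >= self.TTL:
            self._token, self._at = self._fetch(), time.time()
        return self._token

def run_daily_job(select_conversations, tokens, send_to_va, save_results):
    """One pass of the scheduled background job (steps 2-6)."""
    for conv in select_conversations():   # assignment criteria applied here
        token = tokens.get()              # refreshed at most once per 24 hours
        response = send_to_va(conv, token)  # transcript + questions to VA Sync API
        save_results(conv, response)        # converted to CA evaluation results
```

Injecting the token cache and the API call as parameters keeps the sketch testable; the real job additionally honors the configured time window and daily limit.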

Request & Response Format

Example request for a VA Sync API call:

curl --location --request POST 'https://sandbox.web.cai.demo.sestek.com/magpie/ext-api/messages/synchronized' \
--header 'Content-Type: application/json' \
--header 'Project: CAEvaluator' \
--header 'Authorization: Bearer <YOUR_TOKEN>' \
--header 'X-Conversation-Id: conv_20251120_cs_00212334567' \
--data-raw '{
  "text": "
# FORM INFO
form_id: QA_Form_CS_v2024
form_name: Müşteri Hizmetleri Kalite Değerlendirme Formu
conversation_id: conv_20251120_cs_002334

# TRANSCRIPT
Agent:    hello hello this is carlos from power telecom i would like to tell you about our plan hi sir i would like to tell you about our new sales if you have a few minutes
Customer: hello okay i guess i have a minute what is
Agent:    perfect the plan includes unlimited talk text data and nationwide coverage for only fifty dollars a month
Customer: sixty dollars my current plan is forty five dollars so the price is not
Agent:    we have the best coverage and signal nationwide sir no weak signals
Customer: okay but i think i will pass on that i am happy with my current plan and i dont want to make another commitment what will bind me for years you understand what im zain
Agent:    the sale end the sale ends today i think its a great deal you will have unlimited talk text and data for only fifty dollars a month
Customer: i will pass on your offer im not interested i will call you if i decide to do so
Agent:    okay sir sorry for the inconvenience please feel free to call me back if there is anything i can help you with
Customer: okay have a good day
Agent:    you too sir i appreciate your time bye bye

# QUESTIONS

Q1.1: Avoid Distractions?
Type: Multiple_Choice
Options:
  - order1: Full Points
  - order2: Partial Points
  - order3: Zero Points
Hint: (none)

Q1.2: Maintained Friendly, Professional Tone?
Type: Multiple_Choice
Options:
  - order1: Full Points
  - order2: Partial Points
  - order3: Zero Points
Hint: (none)

Q1.3: Offer Further Assistance & Close Call Politely?
Type: Multiple_Choice
Options:
  - order1: Full Points
  - order2: Partial Points
  - order3: Zero Points
Hint: (none)

Q1.4: Avoid Distractions / Maintained Friendly, Professional Tone
Type: CheckBoxes
Options:
  - order1: Agent Avoided Distractions
  - order2: Agent Maintained Friendly, Professional Tone
Hint: (none)

Q2.1: Provide Correct Information?
Type: YesNo
Options:
  - order1: Yes
  - order2: No
Hint: (none)
  ",
  "type": "message",
  "channelId": "IVR",
  "conversation": {
    "id": "conv_20251120_cs_00212334567"
  },
  "attachments": [],
  "channelData": {
    "ResponseType": "Text"
  }
}'
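The same request can be assembled programmatically. The sketch below only builds the headers and body shown in the curl example; the function name and parameters are illustrative, and the token is whatever the Identity server issued.

```python
import json

def build_va_sync_request(project, token, conversation_id,
                          form_info, transcript, questions):
    """Assemble headers and JSON body for the VA Sync API call (sketch)."""
    headers = {
        "Content-Type": "application/json",
        "Project": project,                      # matched to the VA project name
        "Authorization": f"Bearer {token}",      # token from the Identity server
        "X-Conversation-Id": conversation_id,
    }
    # The "text" field carries the form info, transcript, and questions
    # in the sectioned layout shown in the example request above.
    text = "\n".join([
        "# FORM INFO", form_info, "",
        "# TRANSCRIPT", transcript, "",
        "# QUESTIONS", questions,
    ])
    body = {
        "text": text,
        "type": "message",
        "channelId": "IVR",
        "conversation": {"id": conversation_id},
        "attachments": [],
        "channelData": {"ResponseType": "Text"},
    }
    return headers, json.dumps(body)
```

The returned headers and body can then be POSTed to the `/magpie/ext-api/messages/synchronized` endpoint with any HTTP client.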

Example response from the VA Sync API call:

{
  "request_id": "conv_20251120_cs_002334",
  "evaluation_status": "completed",
  "processing_metadata": {
    "agent_id": "CA_Evaluator",
    "agent_version": "2.0",
    "processing_time_ms": 12,
    "evaluation_timestamp": "2025-11-20T00:00:00Z",
    "kb_documents_accessed": [
      "Avoid Distractions criteria",
      "Provide Correct Information criteria",
      "Customer Relations Monitoring Definitions"
    ]
  },
  "evaluation_results": [
    {
      "questions": [
        {
          "question_id": "Q1.1",
          "question_text": "Avoid Distractions?",
          "question_type": "Multiple_Choice",
          "answer": "opt3",
          "confidence": "high",
          "comment": "Agent did not stay focused and attempted to push a sale despite customer refusal; evidence: agent repeatedly promoted plan after customer said not interested; KB: Avoid Distractions expectations suggest giving full attention and not persisting past clear refusal"
        },
        {
          "question_id": "Q1.2",
          "question_text": "Maintained Friendly, Professional Tone?",
          "question_type": "Multiple_Choice",
          "answer": "opt1",
          "confidence": "high",
          "comment": "Agent used polite language and closed politely; evidence: greeted, used 'sir', apologized for inconvenience and thanked customer; KB: Maintain Friendly, Professional Tone standards require care and warmth in vocal inflection"
        },
        {
          "question_id": "Q1.3",
          "question_text": "Offer further Assistance & Close call Politely",
          "question_type": "Multiple_Choice",
          "answer": "opt1",
          "confidence": "high",
          "comment": "Agent offered further assistance and closed politely; evidence: 'please feel free to call me back if there is anything i can help you with' and courteous farewell; KB: Offer Further Assistance & Close Call Politely requires asking if anything else and courteous close"
        },
        {
          "question_id": "Q1.4",
          "question_text": "Avoid Distractions/Maintained Friendly, Professional Tone",
          "question_type": "CheckBoxes",
          "answer": [
            "opt1",
            "opt2"
          ],
          "confidence": "medium",
          "comment": "Agent maintained friendly tone and closed politely but did not avoid repetitive sales persistence after customer refusal; evidence: polite language and apology plus repeated pitch after customer said not interested; KB: Avoid Distractions and Tone criteria"
        },
        {
          "question_id": "Q2.1",
          "question_text": "Provide Correct Information",
          "question_type": "YesNo",
          "answer": "order2",
          "confidence": "high",
          "comment": "Agent provided plan details but misstated price consistency and pressured incorrectly; evidence: agent said unlimited for fifty dollars while customer referenced sixty and current forty five leading to confusion; KB: Provide Correct Information requires accurate, relevant and complete info"
        }
      ]
    }
  ],
  "flagged_for_human_review": false
}
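On the CA side, each question's answer, confidence, and comment is converted into an evaluation result (step 6 of the flow). A minimal sketch of that conversion, assuming the response shape shown above; the function and row layout are illustrative, not the product schema:

```python
def to_ca_results(va_response: dict):
    """Flatten a VA evaluation response into per-question result rows.

    Questions with low confidence, or a flagged response, are marked
    for human review per the agent's confidence rules.
    """
    flagged = va_response.get("flagged_for_human_review", False)
    rows = []
    for result in va_response.get("evaluation_results", []):
        for q in result.get("questions", []):
            rows.append({
                "conversation_id": va_response.get("request_id"),
                "question_id": q["question_id"],
                "answer": q["answer"],
                "confidence": q["confidence"],
                "comment": q["comment"],
                "needs_review": flagged or q["confidence"] == "low",
            })
    return rows
```

Rows produced this way map directly onto what appears in Evaluation Results and the Coaching Dashboard → Quality tab.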