
Testing and debugging agents

After you create an agent in the Agent Designer, you can test it in the Test Agent window before deploying it. The agent's responses include a Show Trace section where you can see details on the agent's response, reasoning, and the tools it uses. These details can help you fine-tune your agent and troubleshoot issues with responses and behavior.

[Image: Test Agent window showing a response trace]

Troubleshooting tips

  • Test your agent iteratively: Test your agent before and after you add tasks and guardrails. Testing iteratively makes it easier to identify which configuration is causing an issue and which configurations are working correctly.
note

Testing your agent may count against any usage limits.

Instructions

  • Be specific and detailed: You may need to adjust your instructions or add tasks so that the Large Language Model (LLM) understands how to behave in certain situations. Without enough information or context, the agent may not act appropriately, and incorrect reasoning can appear in the trace.

  • Include timelines and action triggers: Tell the agent when to perform an action. This can correct issues where the agent does not follow instructions as intended. For example: "After you get information from the database about X, confirm with the user that they want to do X," or "Before you do X, ask the user for the X parameter needed to make the API call with the X tool."

Read Guidelines for building effective AI agents for instruction best practices.

Tools

  • Make changes to tool configuration: Your tool configuration may need adjustment to work correctly. The trace can indicate whether the agent is having trouble using a specific tool during a tool step. Review Building an agent for more information.

  • Ensure your tool is linked to the correct task: Attach the tool to the task where it is relevant; a tool can be attached to multiple tasks. You may also need to add instructions in the task that tell the agent when to use the tool for a particular outcome. For example, "Use the X tool to query the database and get information about X."

API tools

  • Remove any extra spaces surrounding parameters: Extra spaces can cause an error when the agent calls the API.

  • Test API authentication: Test the API endpoint using Postman or a similar tool. Ensure the API call is successful and that you have entered the correct credentials.

  • Check for duplication: Do not repeat the base URL in the endpoint path. The API tool concatenates the base URL and the endpoint path to form the API call. For example, if you enter the base URL and then enter base URL + endpoint path as the path, the tool calls baseURL + baseURL + endpoint path, which causes an error.

[Image: example of base URL duplication]
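The two pitfalls above can be sketched in a few lines. This is illustrative only, with a hypothetical base URL, endpoint path, and parameter names; the platform's internal URL handling may differ:

```python
# Illustrative sketch (not the platform's internal code): shows why repeating
# the base URL in the endpoint path duplicates it, and why stray whitespace
# around parameter values breaks the API call.
BASE_URL = "https://api.example.com"    # hypothetical base URL
ENDPOINT_PATH = "/v1/orders/status"     # hypothetical endpoint path

def build_url(base: str, path: str) -> str:
    """The API tool appends the endpoint path to the base URL."""
    return base.rstrip("/") + "/" + path.lstrip("/")

# Correct: base URL + endpoint path
good = build_url(BASE_URL, ENDPOINT_PATH)
# → "https://api.example.com/v1/orders/status"

# Incorrect: repeating the base URL inside the endpoint path duplicates it
bad = build_url(BASE_URL, BASE_URL + ENDPOINT_PATH)
# → "https://api.example.com/https://api.example.com/v1/orders/status"

def clean_params(params: dict) -> dict:
    """Strip the extra spaces around parameters that cause API errors."""
    return {k.strip(): v.strip() if isinstance(v, str) else v
            for k, v in params.items()}

clean = clean_params({" latitude ": " 40.71 ", "longitude": "-74.00"})
# → {"latitude": "40.71", "longitude": "-74.00"}
```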

Guardrails

  • Adjust guardrails: Evaluate and adjust guardrails so they do not prevent the agent from accomplishing the task. When triggered, a guardrail causes the agent to respond with the blocked message you configured (for example, "I'm sorry but I'm only able to provide an order status and customer support contact information."). The trace can indicate when and how the LLM triggered the guardrail while following instructions.
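The blocked-message behavior can be sketched as follows. The topic list, message text, and function here are hypothetical illustrations of the mechanism, not the platform's actual implementation:

```python
# Minimal sketch of a topic guardrail (illustrative only; the denied topics
# and blocked message below are hypothetical examples).
BLOCKED_MESSAGE = ("I'm sorry but I'm only able to provide an order status "
                   "and customer support contact information.")
DENIED_TOPICS = ["refund policy", "legal advice"]  # hypothetical DENY topics

def apply_topic_guardrail(user_message: str):
    """Return the configured blocked message if a denied topic is detected."""
    text = user_message.lower()
    for topic in DENIED_TOPICS:
        if topic in text:
            # The trace would record: name=<policy>, type=DENY, action=BLOCKED
            return BLOCKED_MESSAGE
    return None  # no guardrail triggered; the agent answers normally

print(apply_topic_guardrail("What is your refund policy?"))  # blocked message
print(apply_topic_guardrail("Where is my order?"))           # None
```

If a legitimate request keeps hitting the blocked message, the trace shows which topic or word matched, so you can narrow the guardrail rather than loosen your instructions.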

How to test your agent

  1. Click the AI icon in Platform and navigate to Agent Garden > Agents.
  2. Click on an agent to enter edit mode.
  3. In the Test Agent window, start engaging with your agent. When you receive a response, click the Show Trace drop-down. The response contains a trace that provides details about the agent's steps, reasoning, and the tools it uses. These details help you determine whether your agent is behaving as intended or whether there are fixes you need to make.

  4. Click the Steps drop-down. Open each step to see a breakdown of how the request was processed. Each step contains details about the agent's reasoning, actions, and interpretations. Use this information to modify the tasks, instructions, or tools to fine-tune agent behavior.

    • Type - describes the nature of the step, e.g., THINKING (reasoning via the LLM) or ACTION (using a tool).
    • Details - describes what occurred during the step.

[Image: THINKING step with details of LLM reasoning]
  5. Click the Invocation Metrics drop-down. It contains metrics about the LLM invocation, which help you monitor the agent's use of the LLM.

    • Count - number of times the LLM has been called.
    • averageLatency - average time in milliseconds to process the LLM request.
    • inputTokenCount - number of tokens in the input.
    • outputTokenCount - number of tokens in the output.
    • ttft - Time to First Token, a measure of LLM latency: the time between sending the request to the model and receiving the first token (or word/character) of the response.
  6. Click the Tool Calls drop-down to see the tool function calls made during processing. This helps you identify when and which tools are called and the input parameters the agent passes to them.

    • Tool_name - the name of the tool you created in Agent Designer that was invoked.
    • Parameters - the parameters passed to the tool (e.g., latitude, longitude).
    • Latency - the time in milliseconds taken to run the tool call.
  7. Click the Guardrail drop-down to see which policy enforcement mechanisms were applied while the agent handled the request.

    • topicPolicy - describes how the LLM applied topic-based filtering. Under topics, you'll find the following information:
      • name - the name of the policy from the Guardrails tab
      • type - the type of restriction (e.g., DENY)
      • action - the action taken (e.g., BLOCKED)
    • wordPolicy - describes word-based filtering that blocks the agent from responding when the user's message contains a blocked word. Under wordPolicy, you'll find the following information:
      • customWords - displays the number and list of blocked words configured by the user in the guardrail. Match is the blocked word, and Action describes the action the agent took ("BLOCKED").
      • managedWordLists - displays the number and list of blocked words applied by default for all agents. Match is the blocked word, Action is the action the agent took ("BLOCKED"), and Type is the category of the word (e.g., PROFANITY, INSULTS).
    • sensitiveInformationPolicy - displays the number and list of RegEx matches configured by the user to prevent the agent from processing or producing sensitive information that matches a RegEx pattern. Name refers to the name of the policy in the Guardrails tab, Match is the word or phrase that matched, regex is the pattern it matched, and Action is the action the agent took ("BLOCKED").
    • contentPolicy - default content filters that apply to all agents for the following categories:
      • HATE
      • SEXUAL
      • VIOLENCE
      • INSULTS
      • MISCONDUCT
      • PROMPT_ATTACK

These filters prevent agents from behaving inappropriately or unsafely. The results of this policy are a list of filters: Type denotes the category of filter that was triggered; Confidence is a numerical score from 1-100 that indicates how provocative a given prompt was; Filter Strength is the strength at which the filter is configured (the only value is "HIGH"); and Action is the action the agent took ("BLOCKED").

  8. Based on the information in the trace, make adjustments to your tasks, instructions, tools, and guardrails.
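To see how the trace fields fit together, the steps above can be sketched as a small summary over a trace-like structure. The JSON layout below is an assumption for illustration; only the field names (Type, count, averageLatency, ttft, Tool_name, Parameters, Latency) come from the trace details listed in the steps:

```python
# Hypothetical trace structure assembled from the fields described above;
# the actual trace layout in the Test Agent window may differ.
trace = {
    "steps": [
        {"type": "THINKING", "details": "LLM decides which tool to call"},
        {"type": "ACTION", "details": "Calls the get_weather tool"},
    ],
    "invocationMetrics": {
        "count": 2, "averageLatency": 850,
        "inputTokenCount": 412, "outputTokenCount": 96, "ttft": 310,
    },
    "toolCalls": [
        {"Tool_name": "get_weather",
         "Parameters": {"latitude": 40.71, "longitude": -74.0},
         "Latency": 120},
    ],
}

def summarize(trace: dict) -> str:
    """Condense a trace into the numbers worth watching while testing."""
    m = trace["invocationMetrics"]
    lines = [
        f"{len(trace['steps'])} steps, {m['count']} LLM calls",
        f"avg latency {m['averageLatency']} ms, ttft {m['ttft']} ms",
        f"tokens in/out: {m['inputTokenCount']}/{m['outputTokenCount']}",
    ]
    for call in trace["toolCalls"]:
        lines.append(f"tool {call['Tool_name']} took {call['Latency']} ms")
    return "\n".join(lines)

print(summarize(trace))
```

Watching these numbers across test iterations makes regressions visible: a new task that doubles the LLM call count or a tool whose latency dominates the response time shows up immediately.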