# Debugging agents

> How Wild Moose investigates production incidents

Wild Moose uses specialized debugging agents to investigate production incidents. When an alert fires, an orchestrator dynamically selects and coordinates the right agents - each one a focused investigator that knows how to query your tools and analyze the results. Investigations complete in seconds, not minutes.

<Info>
  Agent behavior is configured through **playbooks** - YAML files that define what each agent should investigate. You don't need to write these by hand - the [Moose Builder](/features/builder) creates and tests them for you.
</Info>

## How agents work

<Steps>
  <Step title="Alert arrives">
    A monitoring tool posts an alert to one of your monitored channels, or an investigation is triggered via the API.
  </Step>

  <Step title="Orchestrator selects agents">
    The orchestrator analyzes the alert context, then selects and coordinates the debugging agents that fit it. Multiple agents can be selected for a single alert.
  </Step>

  <Step title="Agents investigate in parallel">
    Each agent queries your connected tools - metrics, logs, traces, deployments - gathering evidence simultaneously from different angles. Data collection is deterministic and parallel; AI is used only to synthesize the results.
  </Step>

  <Step title="Results are synthesized">
    The findings are combined into a structured enrichment with root cause analysis, key findings, and suggested mitigations, posted as a threaded reply - in seconds.
  </Step>
</Steps>

## Why agents are fast and cheap

Each agent combines deterministic automation with AI-powered analysis. The automation layer handles all data collection - which tools to call, in what order, with what parameters. The AI layer analyzes and synthesizes the output into a clear conclusion. There is no LLM-driven planning and there are no retry loops: data is gathered in parallel, then synthesized in a single pass.

This architecture means agents are:

* **Fast** - investigations complete in seconds because tool calls run in parallel
* **Cost-effective** - lightweight enough to run on every alert, not just critical ones
* **Reliable** - deterministic data collection means predictable, debuggable behavior
* **Transparent** - every step is visible and configurable
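
The split between deterministic collection and a single synthesis pass can be illustrated with a minimal Python sketch. This is not Wild Moose's implementation: the `query_*` functions, their simulated findings, and `synthesize` are hypothetical stand-ins, and the only point is the shape of the control flow (a fixed set of tool calls run concurrently, followed by one synthesis step).

```python
import asyncio

# Hypothetical stand-ins for deterministic tool queries. A real agent would
# call your connected metrics, logs, and deployment tools here; the findings
# below are simulated purely for illustration.
async def query_metrics(service):
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"source": "metrics", "finding": f"{service}: simulated latency spike"}

async def query_logs(service):
    await asyncio.sleep(0.01)
    return {"source": "logs", "finding": f"{service}: simulated timeout errors"}

async def query_deployments(service):
    await asyncio.sleep(0.01)
    return {"source": "deployments", "finding": f"{service}: simulated recent deploy"}

def synthesize(evidence):
    """Placeholder for the single AI synthesis pass.

    A real system would hand the evidence to a model; here it is just joined.
    """
    return "; ".join(f"[{e['source']}] {e['finding']}" for e in evidence)

async def investigate(service):
    # The set of calls and their parameters is fixed up front: no model
    # decides what to run next. gather() runs them concurrently and returns
    # results in the same order as the calls.
    evidence = await asyncio.gather(
        query_metrics(service),
        query_logs(service),
        query_deployments(service),
    )
    # One synthesis pass over all evidence, with no planning or retry loop.
    return synthesize(evidence)

if __name__ == "__main__":
    print(asyncio.run(investigate("checkout")))
```

Because the calls run concurrently, total latency in this sketch is close to that of the slowest single query rather than the sum of all three.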

## Building agents

The [Moose Builder](/features/builder) is the fastest way to create agents. It can:

* Infer investigation workflows from your historical incidents
* Build agents conversationally from your team's debugging knowledge
* Test agents against real past issues before they go live

You can also configure agents manually through playbooks.

## Configuring agents with playbooks

Each agent's behavior is defined by a playbook - a YAML configuration that specifies what to investigate and how. Workflows are open, transparent, and fully configurable.

<AccordionGroup>
  <Accordion title="Actions" icon="play" defaultOpen={true}>
    Actions are the individual investigation steps. Each action queries one of your connected tools - metrics, logs, traces, deployments - to gather evidence. Actions run in parallel when possible to keep investigations fast.
  </Accordion>

  <Accordion title="Tip" icon="lightbulb">
    A contextual hint that guides the agent's AI when analyzing results. The tip tells the agent what this investigation is trying to accomplish, helping it produce more relevant root cause analysis and recommendations.
  </Accordion>

  <Accordion title="Mitigation tools" icon="wrench">
    Remediation suggestions that are surfaced after an investigation completes. Each mitigation tool can be either:

    * A **link** to an external tool (e.g., a runbook, a scaling dashboard, a rollback pipeline)
    * An **action** that can be executed directly

    You can mark a mitigation tool as `displayAlways` to show it regardless of the investigation outcome.
  </Accordion>

  <Accordion title="Monitor patterns" icon="radar">
    Title patterns that map incoming alerts to this agent. Patterns support `{{placeholder}}` wildcards for flexible matching.

    For example, the pattern `Instance {{host}} CPU is >{{threshold}}%` would match alerts like *"Instance web-01 CPU is >90%"* and automatically extract `host` and `threshold` as attributes.
  </Accordion>

  <Accordion title="Required attributes" icon="list-check">
    Attributes that scope the investigation to the right context. Common examples include `environment`, `cluster`, `host`, or `service`. Attributes can be extracted automatically from the alert via monitor patterns, or provided manually when triggering an investigation.
  </Accordion>

  <Accordion title="Variables" icon="brackets-curly">
    Key-value pairs that are substituted into action arguments using `{{variableName}}` syntax, so you can reuse the same actions with different configurations.
  </Accordion>
</AccordionGroup>
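
To make the `{{placeholder}}` mechanics for monitor patterns and variables concrete, here is a minimal Python sketch of one plausible interpretation. The actual matching and substitution rules are not specified on this page, so everything below is an assumption: the helper names (`compile_pattern`, `extract_attributes`, `substitute`), the choice to treat each placeholder as a non-greedy wildcard, and the example action argument string are all hypothetical.

```python
import re

# Matches {{name}} where name is a valid identifier.
PLACEHOLDER = re.compile(r"\{\{([A-Za-z_]\w*)\}\}")

def compile_pattern(pattern):
    """Turn a `{{placeholder}}` title pattern into an anchored regex.

    Assumes each placeholder name appears at most once in the pattern.
    """
    parts = []
    last = 0
    for m in PLACEHOLDER.finditer(pattern):
        parts.append(re.escape(pattern[last:m.start()]))  # literal text
        parts.append(f"(?P<{m.group(1)}>.+?)")            # named wildcard
        last = m.end()
    parts.append(re.escape(pattern[last:]))
    return re.compile("^" + "".join(parts) + "$")

def extract_attributes(pattern, title):
    """Return the extracted attributes, or None if the title does not match."""
    match = compile_pattern(pattern).match(title)
    return match.groupdict() if match else None

def substitute(template, values):
    """Replace each `{{name}}` with its value; unknown names are left as-is."""
    return PLACEHOLDER.sub(
        lambda m: str(values.get(m.group(1), m.group(0))), template
    )

if __name__ == "__main__":
    pattern = "Instance {{host}} CPU is >{{threshold}}%"
    attrs = extract_attributes(pattern, "Instance web-01 CPU is >90%")
    print(attrs)
    # Hypothetical action argument that reuses the extracted attribute.
    print(substitute("host:{{host}} env:{{environment}}",
                     {**attrs, "environment": "prod"}))
```

Anchoring the regex at both ends means a pattern only matches when the whole alert title fits it, which is one way to keep overly broad patterns from matching unrelated alerts.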

## Agent capabilities

Beyond standard investigation, agents can:

* **Detect anomalies** - identify unusual patterns in metrics and logs that may not have triggered alerts themselves
* **Identify suspicious services** - pinpoint the service or component most likely responsible for the incident
* **Suggest mitigations** - recommend specific remediation steps based on the investigation findings
* **Deduplicate alerts** - recognize when multiple alerts are symptoms of the same underlying issue
* **Learn from feedback** - improve investigation quality over time based on your team's input

## How investigations are triggered

By default, an investigation starts when an alert arrives in a monitored channel: the orchestrator analyzes the alert context and selects the debugging agents to run. Multiple agents can run in parallel for the same alert, each investigating a different angle based on what the system finds.

Investigations can also be triggered:

* Manually via commands in Slack or Teams
* Programmatically via the [Execution API](/api-reference/endpoint/execute-playbook)

## Best practices

<Columns cols={2}>
  <Card title="Start with the Builder" icon="hammer">
    Let the [Moose Builder](/features/builder) create your first agents. It will test them against historical incidents and refine them for you.
  </Card>

  <Card title="Use specific patterns" icon="bullseye">
    Write monitor patterns that are specific enough to avoid false matches but flexible enough to cover alert variations.
  </Card>
</Columns>
