Agent behavior is configured through playbooks - YAML files that define what each agent should investigate. You don’t need to write these by hand - the Moose Builder creates and tests them for you.
How agents work
Alert arrives
A monitoring tool posts an alert to one of your monitored channels, or an investigation is triggered via the API.
Orchestrator selects agents
The orchestrator analyzes the alert context and dynamically coordinates the right debugging agents. Multiple agents can be selected for a single alert.
Agents investigate in parallel
Each agent queries your connected tools - metrics, logs, traces, deployments - gathering evidence simultaneously from different angles. Data collection is deterministic and parallel; AI is used only to synthesize the results.
Why agents are fast and cheap
Each agent combines deterministic automation with AI-powered analysis. The automation layer handles all data collection - which tools to call, in what order, with what parameters. The AI layer analyzes and synthesizes the output into a clear conclusion. There are no LLM-driven planning or retry loops - just parallel data gathering followed by a single synthesis pass. This architecture means agents are:
- Fast - investigations complete in seconds because tool calls run in parallel
- Cost-effective - lightweight enough to run on every alert, not just critical ones
- Reliable - deterministic data collection means predictable, debuggable behavior
- Transparent - every step is visible and configurable
Building agents
The Moose Builder is the fastest way to create agents. It can:
- Infer investigation workflows from your historical incidents
- Build agents conversationally from your team’s debugging knowledge
- Test agents against real past issues before they go live
Configuring agents with playbooks
Each agent’s behavior is defined by a playbook - a YAML configuration that specifies what to investigate and how. Workflows are open, transparent, and fully configurable.
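To make the pieces concrete, a playbook might look something like the following sketch. This is illustrative only - the field names and structure here are assumptions, not the actual schema; the Moose Builder generates the real format for you.

```yaml
# Hypothetical playbook sketch - field names are illustrative, not the real schema
name: high-cpu-investigation
monitor_patterns:
  - "Instance {{host}} CPU is >{{threshold}}%"
required_attributes:
  - environment
  - host
variables:
  lookback: 30m
actions:
  - tool: metrics
    query: "cpu.usage{host={{host}}}"
    window: "{{lookback}}"
  - tool: logs
    query: "host:{{host}} level:error"
tip: "Determine whether the CPU spike correlates with a recent deployment or a traffic increase."
mitigation_tools:
  - name: "High-CPU runbook"
    link: "https://example.com/runbooks/high-cpu"
```

Each of these sections is described in detail below.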
Actions
Actions are the individual investigation steps. Each action queries one of your connected tools - metrics, logs, traces, deployments - to gather evidence. Actions run in parallel when possible to keep investigations fast.
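For example, two actions querying different tools might be declared like this (a hypothetical sketch - the actual fields depend on your connected tools and the real playbook schema):

```yaml
# Illustrative only - real action fields depend on the tool integration
actions:
  - tool: metrics
    query: "avg:cpu.usage{host={{host}}}"
    window: "{{lookback}}"      # variables are substituted into arguments
  - tool: deployments
    service: "{{service}}"      # recent deployments for the affected service
```

Because neither action depends on the other's output, both can run in parallel.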
Tip
A contextual hint that guides the agent’s AI when analyzing results. The tip tells the agent what this investigation is trying to accomplish, helping it produce more relevant root cause analysis and recommendations.
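In playbook form, a tip could be a single field, for example (field name is an assumption):

```yaml
tip: "Focus on whether recent deployments correlate with the error spike; if so, recommend a rollback."
```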
Mitigation tools
Remediation suggestions that are surfaced after an investigation completes. Each mitigation tool can be either:
- A link to an external tool (e.g., a runbook, a scaling dashboard, a rollback pipeline)
- An action that can be executed directly
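Both kinds might be declared like this (an illustrative sketch - field names are assumptions, not the actual schema):

```yaml
mitigation_tools:
  - name: "Rollback pipeline"
    link: "https://ci.example.com/rollback"    # link to an external tool
  - name: "Restart service"
    action:
      tool: kubernetes
      command: "rollout restart deployment/{{service}}"   # directly executable
    displayAlways: true    # surface even when the investigation is inconclusive
```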
Set displayAlways to show it regardless of the investigation outcome.
Monitor patterns
Title patterns that map incoming alerts to this agent. Patterns support {{placeholder}} wildcards for flexible matching.
For example, the pattern Instance {{host}} CPU is >{{threshold}}% would match alerts like “Instance web-01 CPU is >90%” and automatically extract host and threshold as attributes.
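In playbook form, that pattern might be declared as follows (illustrative sketch, not the actual schema):

```yaml
monitor_patterns:
  - "Instance {{host}} CPU is >{{threshold}}%"
# An alert titled "Instance web-01 CPU is >90%" would match and extract:
#   host: web-01
#   threshold: 90
```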
Required attributes
Attributes that scope the investigation to the right context. Common examples include environment, cluster, host, or service. Attributes can be extracted automatically from the alert via monitor patterns, or provided manually when triggering an investigation.
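A sketch of how required attributes might be declared (field name is an assumption):

```yaml
required_attributes:
  - environment
  - service
# If neither the monitor pattern nor the trigger provides these,
# the investigation cannot be scoped correctly.
```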
Variables
Key-value pairs that get substituted into action arguments using {{variableName}} syntax, allowing you to reuse the same actions with different configurations.
Agent capabilities
Beyond standard investigation, agents can:
- Detect anomalies - identify unusual patterns in metrics and logs that may not have triggered alerts themselves
- Identify suspicious services - pinpoint the service or component most likely responsible for the incident
- Suggest mitigations - recommend specific remediation steps based on the investigation findings
- Deduplicate alerts - recognize when multiple alerts are symptoms of the same underlying issue
- Learn from feedback - improve investigation quality over time based on your team’s input
How investigations are triggered
When an alert arrives in a monitored channel, the orchestrator analyzes the alert context and dynamically coordinates the right debugging agents. Multiple agents can run in parallel for the same alert, each investigating a different angle based on what the system finds. Investigations can also be triggered:
- Manually via commands in Slack or Teams
- Programmatically via the Execution API
Best practices
Start with the Builder
Let the Moose Builder create your first agents. It will test them against historical incidents and refine them for you.
Use specific patterns
Write monitor patterns that are specific enough to avoid false matches but flexible enough to cover alert variations.
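As an illustration of the trade-off (the schema shown is hypothetical):

```yaml
monitor_patterns:
  # Too broad - would match nearly any alert title
  # - "{{anything}}"
  # Specific but flexible - matches CPU alerts for any host and any threshold
  - "Instance {{host}} CPU is >{{threshold}}%"
```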