Overview
Dakora provides complete visibility into your AI agent operations:
- Analytics dashboard with cost trends and performance metrics
- Execution traces with full conversation history
- Budget monitoring with alerts and enforcement
- AI-powered optimization to reduce spending

How Tracking Works
When you render templates with the SDK, executions are automatically tracked; a code sketch of this flow follows the steps below.
Data Flow
1. Template Rendering - SDK embeds a metadata marker in your prompt
2. LLM Call - Your agent framework makes the API call
3. Span Ingestion - OpenTelemetry spans are sent to Dakora
4. Template Linkage - Dakora matches spans to templates via markers
5. Analytics - Usage aggregated for dashboards and recommendations
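To make this concrete, here is a minimal sketch of steps 1-3, assuming a hypothetical Dakora client. Only the embed_metadata=True flag is documented here; the import path, class, method names, template ID, and ingestion URL are illustrative, not the actual Dakora API.

```python
# Illustrative sketch of steps 1-3 above. The Dakora client class,
# method names, template ID, and ingestion endpoint are assumptions;
# only the embed_metadata=True flag comes from these docs.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

from dakora import Dakora  # assumed import path

# Step 3 setup: export OpenTelemetry spans to Dakora's ingestion
# endpoint (URL is a placeholder).
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="https://ingest.dakora.example/v1/traces")
    )
)
trace.set_tracer_provider(provider)

client = Dakora(api_key="dk_...")  # hypothetical constructor

# Step 1: render a template; embed_metadata=True (the default) embeds
# the marker that later links spans back to this template.
prompt = client.templates.render(
    "support-triage",                                   # hypothetical template ID
    variables={"ticket": "Order #1234 never arrived"},  # hypothetical inputs
    embed_metadata=True,
)

# Step 2: your agent framework sends the prompt to the LLM; the
# wrapping span carries the marker, and steps 4-5 happen server-side.
```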
What Gets Tracked
| Field | Description |
|---|---|
| trace_id | Unique identifier for the execution |
| provider | LLM provider (openai, azure_openai, etc.) |
| model | Model identifier used |
| tokens_in / tokens_out | Token counts |
| cost_usd | Calculated cost based on pricing |
| latency_ms | Response time in milliseconds |
| messages | Full input/output message history |
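Put together, one tracked execution can be pictured as a record like the following; the field names mirror the table above, but the concrete types are an assumption rather than Dakora's published schema.

```python
# A plausible shape for one tracked execution, mirroring the table
# above. Types are assumptions, not Dakora's published schema.
from typing import TypedDict

class ExecutionRecord(TypedDict):
    trace_id: str          # unique identifier for the execution
    provider: str          # "openai", "azure_openai", etc.
    model: str             # model identifier used
    tokens_in: int         # input token count
    tokens_out: int        # output token count
    cost_usd: float        # calculated cost based on pricing
    latency_ms: int        # response time in milliseconds
    messages: list[dict]   # full input/output message history
```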
Analytics Dashboard
Access the Analytics Dashboard from your project sidebar for aggregate insights.
Key Metrics
| Metric | Description |
|---|---|
| Total Executions | LLM calls in the selected time range |
| Active Prompts | Template count currently in use |
| Daily Average | Token consumption over last 24 hours |
| Weekly Cost | Total spend for past 7 days |
Time Range
Toggle between periods:
- Last 24 hours - Real-time monitoring
- Last 7 days - Weekly trends
- Last 30 days - Monthly planning
Health Metrics
The Overview tab shows:
- Avg Latency with P95/P99 percentiles
- Success Rate with error breakdown
- Error Count for the selected period
P95/P99 latency well above the average indicates occasional slow responses. Consider investigating the slowest model calls.
Token Usage Timeline
An interactive chart showing consumption over time:
- Hover to see tokens, cost, and execution count
- Identify usage spikes and patterns
- Correlate with deployments or traffic changes
Model Distribution
See which models consume your tokens.
Executions List
Navigate to Executions in your project sidebar to view all LLM calls.
Filtering
| Filter | Description |
|---|---|
| Provider | Filter by LLM provider (OpenAI, Azure OpenAI) |
| Model | Search for specific models (e.g., “gpt-4o”) |
| Agent ID | Show calls from a specific agent |
| Prompt ID | Filter by template usage |
| Has Templates | Toggle to show only template-linked calls |
Performance Badges
Executions are tagged with performance indicators (the thresholds are sketched in code after this list):
- 🟢 Fast - Under 2 seconds
- 🟡 Normal - 2-5 seconds
- 🟠 Slow - 5-10 seconds
- 🔴 Very Slow - Over 10 seconds
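Expressed as code, the badge thresholds look like this; the exact boundary handling (whether precisely 2 seconds counts as Fast or Normal) is an assumption.

```python
# Badge thresholds from the list above; behavior exactly at the
# 2 s / 5 s / 10 s boundaries is assumed.
def performance_badge(latency_ms: float) -> str:
    if latency_ms < 2_000:
        return "🟢 Fast"
    if latency_ms < 5_000:
        return "🟡 Normal"
    if latency_ms < 10_000:
        return "🟠 Slow"
    return "🔴 Very Slow"
```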
Trace Detail
Click any execution to see the full trace view.
Metrics Summary
Four key metric cards show (the derived figures are worked through after this list):
- Total Tokens - Combined input/output with tokens/second throughput
- Cost - USD cost with per-1K-token rate
- Latency - Response time in milliseconds
- Messages - Conversation count and linked templates
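The derived figures on these cards follow from the tracked fields. A worked example, using the obvious formulas (assumed rather than confirmed by Dakora):

```python
# Worked example of the derived card figures; the formulas are the
# straightforward ones and are an assumption, not Dakora's spec.
tokens_in, tokens_out = 1_200, 800
cost_usd, latency_ms = 0.0125, 3_400

total_tokens = tokens_in + tokens_out                   # 2000
tokens_per_second = total_tokens / (latency_ms / 1000)  # ~588 tok/s
cost_per_1k_tokens = cost_usd / (total_tokens / 1000)   # $0.00625 per 1K
```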
Conversation View
The main content displays the full conversation:
- User messages - What was sent to the model
- Assistant responses - Model outputs
- System prompts - Initial context
- Tool calls - Function invocations in agentic workflows
Template Linkage
If templates were used, the sidebar shows:
- Template name with version number
- Direct link to edit the template
Timeline View
For multi-step workflows, the timeline shows the span hierarchy (a span-nesting sketch follows this list):
- Agent spans - High-level agent invocations
- Chat spans - Individual LLM calls
- Tool spans - Function calls and responses
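With OpenTelemetry, that hierarchy arises from span nesting; a minimal sketch, with span names that are illustrative rather than Dakora conventions:

```python
# Nested spans produce the agent > chat > tool hierarchy shown in the
# timeline; span names here are illustrative.
from opentelemetry import trace

tracer = trace.get_tracer("my-agent")  # hypothetical tracer name

with tracer.start_as_current_span("agent.run"):            # agent span
    with tracer.start_as_current_span("chat.completion"):  # chat span
        pass  # LLM call happens here
    with tracer.start_as_current_span("tool.web_search"):  # tool span
        pass  # function call and response
```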
Cost Optimization
Dakora analyzes execution history to find savings opportunities; a sketch of the heuristic follows the steps below.
How It Works
1. Usage Analysis - Examines prompts with 20+ executions
2. Output Assessment - Measures average output token count
3. Variance Detection - Checks output consistency
4. Model Matching - Suggests cheaper models for simple tasks
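A sketch of that heuristic, assuming per-prompt lists of output token counts. Only the 20-execution threshold comes from step 1; the variance measure, confidence cutoffs, and model mapping are illustrative, not Dakora's actual implementation.

```python
# Sketch of the optimization heuristic described above. Only the
# 20+ execution threshold is documented; the variance measure,
# confidence cutoffs, and model mapping are assumptions.
from statistics import mean, pstdev

CHEAPER_ALTERNATIVE = {"gpt-4o": "gpt-4o-mini"}  # assumed mapping

def recommend(prompt_id: str, model: str, output_tokens: list[int]) -> dict | None:
    if len(output_tokens) < 20:                        # 1. usage analysis
        return None
    avg = mean(output_tokens)                          # 2. output assessment
    cv = pstdev(output_tokens) / avg if avg else 0.0   # 3. variance detection
    cheaper = CHEAPER_ALTERNATIVE.get(model)           # 4. model matching
    if cheaper is None:
        return None
    confidence = "High" if cv < 0.2 else "Medium" if cv < 0.5 else "Low"
    return {"prompt_id": prompt_id, "current_model": model,
            "suggested_model": cheaper, "confidence": confidence}
```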
Recommendation Cards
Each recommendation shows:
| Field | Description |
|---|---|
| Prompt ID | The template to optimize |
| Current Model | What you’re using (e.g., gpt-4o) |
| Suggested Model | Cheaper alternative (e.g., gpt-4o-mini) |
| Confidence | High, Medium, or Low |
| Weekly Savings | Estimated USD saved per week |
Confidence Levels
- High - Low output variance, many executions, simple outputs
- Medium - Moderate variance or fewer executions
- Low - Outputs may vary; test carefully
Best Practices
Keep Tracking Enabled
Leave embed_metadata=True (the default) when rendering templates for accurate cost attribution.
Set Budget Alerts Early
Configure alerts at 50% and 80% thresholds before costs become a problem.
Review Weekly
Make analytics review part of your routine. Catch trends before month-end surprises.
Start with High-Confidence Optimizations
Implement high-confidence recommendations first for lowest risk and highest impact.
Use Separate Projects
Create distinct projects for dev, staging, and production to avoid skewing analytics.
Next Steps
Project Budgets
Configure spending limits