Operations¶
Dashboard¶
- Agent Dashboard: atl-e.dashecorp.com
- Kanban Board: kanban.dashecorp.com
The Automate-E dashboard shows:
- Agent Info — name, bio, memory type
- Active Sessions — Discord threads being tracked
- MCP Servers — GitHub MCP status (green = connected)
- Recent Tool Calls — GitHub API calls with latency and status
- Token Usage & Cost — LLM calls, tokens, cost per model
- Stats — Tool call success rate
- Live Logs — Agent activity log
Checking Logs¶
# Gateway logs (Discord connection)
kubectl logs -n atl-e deploy/atl-e-automate-e-gateway
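# Worker logs (agent loop + GitHub MCP); with a label selector, kubectl returns only the last 10 lines per pod unless --tail is set
kubectl logs -n atl-e -l app.kubernetes.io/component=worker --tail=100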
# Latest cron run (find the newest cron pod, then read its logs)
kubectl logs -n atl-e --tail=30 "$(kubectl get pods -n atl-e -l app.kubernetes.io/component=cron --sort-by=.metadata.creationTimestamp -o name | tail -n 1)"
# All pods
kubectl get pods -n atl-e
Common Issues¶
Workers not responding to Discord messages¶
Check that workers have the GITHUB_PERSONAL_ACCESS_TOKEN env var:
kubectl get deploy atl-e-automate-e-worker -n atl-e \
-o jsonpath='{.spec.template.spec.containers[0].env[*].name}'
The output should include GITHUB_PERSONAL_ACCESS_TOKEN. If it is missing, ArgoCD has most likely not synced the latest chart yet.
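The check above can be made scriptable with a small helper. This is a minimal sketch: `env_has` is a hypothetical function name, not part of Automate-E, and it only inspects the names it receives on stdin, so it assumes the variable is declared under `env` rather than injected through `envFrom`.

```shell
# env_has NAME: succeed if NAME appears as a whole word in the
# whitespace-separated list read from stdin.
# (Hypothetical helper for illustration.)
env_has() {
  tr -s ' \t' '\n' | grep -qxF -- "$1"
}

# Usage against the live deployment (requires cluster access):
# kubectl get deploy atl-e-automate-e-worker -n atl-e \
#   -o jsonpath='{.spec.template.spec.containers[0].env[*].name}' \
#   | env_has GITHUB_PERSONAL_ACCESS_TOKEN \
#   && echo "token variable present" || echo "token variable missing"
```

Matching whole names (`-x`) avoids a false positive from a similarly named variable such as one with an extra suffix.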
GitHub MCP server fails to connect¶
kubectl logs -n atl-e -l app.kubernetes.io/component=worker --tail=1000 | grep -iE "mcp|github"
Common causes:
- GITHUB_PERSONAL_ACCESS_TOKEN not set or expired
- Network policy blocking outbound HTTPS
- npm registry issues (MCP server downloaded via npx)
Cron posts reasoning instead of just notifications¶
The character prompt must include strict output rules. Check that personality contains:
- "Output ONLY the final notifications"
- "Do NOT include reasoning, analysis, or chain-of-thought"
- "Keep total response under 1500 characters"
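The three rules above can be checked mechanically. The following is a sketch, assuming the personality text has been exported to a plain-text file; `missing_output_rules` and `personality.txt` are hypothetical names used only for illustration.

```shell
# missing_output_rules: read prompt text on stdin and print every required
# phrase that is absent. Prints nothing and returns 0 when all are present.
# (Hypothetical helper for illustration.)
missing_output_rules() {
  text=$(cat)
  status=0
  for phrase in \
    "Output ONLY the final notifications" \
    "Do NOT include reasoning, analysis, or chain-of-thought" \
    "Keep total response under 1500 characters"
  do
    # -F treats the phrase as a literal string, not a regular expression
    if ! printf '%s\n' "$text" | grep -qF -- "$phrase"; then
      printf 'missing: %s\n' "$phrase"
      status=1
    fi
  done
  return "$status"
}

# Usage (personality.txt is an assumed export of the character prompt):
# missing_output_rules < personality.txt
```

The match is line-based and case-sensitive, so a phrase that is wrapped across two lines in the source file would be reported as missing.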
Webhooks not arriving¶
Check gateway logs for webhook events:
kubectl logs -n atl-e deploy/atl-e-automate-e-gateway | grep -i webhook
Common causes:
- GITHUB_WEBHOOK_SECRET env var missing or mismatched
- Cloudflare Tunnel not routing to gateway port 3000
- GitHub webhook delivery failures (check repo Settings → Webhooks → Recent Deliveries)
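To rule out a mismatched secret, the signature can be recomputed locally. GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex>`. The sketch below assumes the gateway validates that header with `GITHUB_WEBHOOK_SECRET`, which the source does not state explicitly; `github_signature` is a hypothetical helper name.

```shell
# github_signature SECRET PAYLOAD_FILE
# Print the X-Hub-Signature-256 value for the bytes in PAYLOAD_FILE.
# (Hypothetical helper for illustration; requires openssl and awk.)
github_signature() {
  # openssl prints "<label>= <hex>"; the hex digest is always the last field
  printf 'sha256=%s\n' \
    "$(openssl dgst -sha256 -hmac "$1" < "$2" | awk '{print $NF}')"
}

# Usage (payload.json is an assumed byte-for-byte capture of the request body):
# github_signature "$GITHUB_WEBHOOK_SECRET" payload.json
```

The signature covers the exact bytes GitHub sent, so a pretty-printed or re-serialized copy of the payload will not produce a matching value.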
Postgres connection issues¶
# Check if Postgres is running
kubectl get pods -n atl-e -l app=postgres
# Check that the worker has DATABASE_URL set (this does not open a connection)
kubectl exec -n atl-e deploy/atl-e-automate-e-worker -- \
node -e "console.log(process.env.DATABASE_URL ? 'DB URL set' : 'DB URL missing')"
# Check if facts are being saved
kubectl exec -n atl-e postgres-0 -- psql -U atl_e -d atl_e -c 'SELECT count(*) FROM facts'
High cost per cron run¶
Input tokens are high (~80K per run) because the GitHub MCP server returns full PR bodies, which inflates the token count. To reduce cost:
- Reduce the number of monitored repos
- Increase the cron interval (currently 1 hour)
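A rough estimate shows how the cron interval drives cost. This is a sketch only: the price of 1.00 USD per million input tokens is a placeholder, not the actual rate for the configured model, and output tokens are ignored. `estimate_daily_cost` is a hypothetical helper name.

```shell
# estimate_daily_cost INPUT_TOKENS_PER_RUN RUNS_PER_DAY USD_PER_MILLION_INPUT_TOKENS
# Print the estimated daily input-token cost in USD, rounded to two decimals.
# (Hypothetical helper; substitute the real price for your model.)
estimate_daily_cost() {
  awk -v t="$1" -v r="$2" -v p="$3" \
    'BEGIN { printf "%.2f\n", t * r / 1000000 * p }'
}

# ~80K input tokens per run, hourly cron (24 runs/day), placeholder price
estimate_daily_cost 80000 24 1.00   # prints 1.92

# Same run size with a 2-hour interval (12 runs/day)
estimate_daily_cost 80000 12 1.00   # prints 0.96
```

Because the cost is linear in the number of runs, doubling the interval halves the daily figure; reducing the number of monitored repos lowers the tokens-per-run term instead.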
Scaling¶
| Setting | Current | To change |
|---|---|---|
| Worker replicas | 2 | workers.replicas in values.yaml |
| Cron frequency | Every hour | cron.schedule in values.yaml |
| Model | Haiku (cheapest) | character.llm.model in values.yaml |
Restart¶
# Restart every Deployment in the namespace (gateway + workers)
kubectl rollout restart deploy -n atl-e
# Force an ArgoCD hard refresh (a sync follows only if auto-sync is enabled)
kubectl -n argocd patch application atl-e --type merge \
-p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'