# Phase 10.2: Better Logging & Log Management - Context

Gathered: 2026-02-08
Status: Ready for planning

## Phase Boundary

Improve operational visibility into the bot's own execution. Add centralized error capture, execution tracing, and debugging infrastructure so that issues (sub-workflow data loss, callback routing confusion, Docker API failures) can be diagnosed programmatically rather than through manual investigation of n8n execution logs.

This is NOT about container log viewing (the /logs command) — it's about the bot's internal execution logging.

## Implementation Decisions

### Error capture & reporting

- Errors display inline to the user as summary + cause (e.g., "Failed to stop nginx: Docker API returned 404 (container not found)")
- Full diagnostic data (sub-workflow name, node, raw response, stack trace) is captured in a central error store for Claude's use
- Report errors only on user-triggered actions; no proactive or unsolicited error notifications
- Error store uses a ring buffer: last 50 errors, auto-rotated
- Manual clear command also available (/clear-errors or similar, hidden/unlisted)
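
The split between the short inline message and the full diagnostic record could look like the sketch below. The `ErrorRecord` shape and the `formatUserMessage` helper are hypothetical names chosen for illustration, and the sub-workflow name, node name, timestamp, and raw response are made-up sample values. The field list follows the decisions above, and nothing here is an existing n8n API.

```typescript
// Hypothetical shape of one entry in the central error store.
interface ErrorRecord {
  timestamp: string;    // ISO 8601 time the error was captured
  action: string;       // what the user asked for, e.g. "stop nginx"
  cause: string;        // short, human-readable cause
  subWorkflow: string;  // sub-workflow that reported the error
  node: string;         // n8n node where the failure surfaced
  rawResponse: unknown; // unmodified response from the failing call
  stack?: string;       // stack trace, when one exists
}

// Build the inline message shown to the user: summary + cause only.
// Everything else stays in the stored record for later debugging.
function formatUserMessage(record: ErrorRecord): string {
  return `Failed to ${record.action}: ${record.cause}`;
}

// Sample record; all values are illustrative.
const exampleRecord: ErrorRecord = {
  timestamp: "2026-02-08T12:00:00.000Z",
  action: "stop nginx",
  cause: "Docker API returned 404 (container not found)",
  subWorkflow: "container-actions", // assumed name
  node: "Stop Container",           // assumed name
  rawResponse: { statusCode: 404, message: "No such container: nginx" },
};

// "Failed to stop nginx: Docker API returned 404 (container not found)"
const userMessage: string = formatUserMessage(exampleRecord);
```

Keeping the user message a pure function of the stored record means the inline text and the diagnostic entry cannot drift apart.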

### Execution traceability

- All sub-workflows report errors back to the main workflow for centralized storage
- Trace data is designed for programmatic access, so Claude can query it during debugging sessions
- Hidden/unlisted Telegram commands for quick error checks (e.g., /errors to see recent errors)
- File-based access also available for deep investigation during debugging sessions
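
One way to have every sub-workflow report back is a uniform result envelope that the main workflow unwraps. This is a sketch under assumptions: `ReportedError`, `SubWorkflowResult`, `unwrapResult`, and `formatErrorList` are hypothetical names, and the reply layout for a quick-check command such as /errors is only one possible format.

```typescript
// Hypothetical error payload a sub-workflow hands back to the main workflow.
interface ReportedError {
  subWorkflow: string;
  node: string;
  message: string;
  rawResponse: unknown;
}

// Every sub-workflow returns either data or an error, never a bare value.
type SubWorkflowResult<T> =
  | { status: "ok"; data: T }
  | { status: "error"; error: ReportedError };

// Main-workflow side: on failure, record the error centrally and return null.
function unwrapResult<T>(result: SubWorkflowResult<T>, sink: ReportedError[]): T | null {
  if (result.status === "ok") {
    return result.data;
  }
  sink.push(result.error);
  return null;
}

// One line per error, suitable for a short Telegram reply.
function formatErrorList(errors: ReportedError[]): string {
  if (errors.length === 0) {
    return "No recent errors";
  }
  return errors
    .map((e, i) => `${i + 1}. [${e.subWorkflow} / ${e.node}] ${e.message}`)
    .join("\n");
}
```

Because the envelope is explicit, a sub-workflow that returns nothing at all is distinguishable from one that succeeded with empty data, which is the kind of boundary ambiguity this phase is meant to remove.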

### Log output & storage

- Error/trace data stored in n8n workflow static data (main workflow)
- Centralized in main workflow: sub-workflows report back, main stores
- Auto-rotate (ring buffer, 50 entries) + manual clear command
- Both Telegram commands (quick checks) and file/API access (deep investigation)
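
The 50-entry ring buffer and the clear command can be sketched as plain functions over the static-data object. In an n8n Code node that object would come from `$getWorkflowStaticData('global')`; here it is passed in as a parameter so the logic can run anywhere. `StoredError`, `StaticData`, `pushError`, `clearErrors`, and `recentErrors` are hypothetical names, and the entry fields are an assumption.

```typescript
const MAX_ERRORS = 50; // ring buffer capacity from the decision above

// Hypothetical minimal entry; a real one would carry the full diagnostics.
interface StoredError {
  timestamp: string;
  summary: string;
}

// Stand-in for the object returned by $getWorkflowStaticData('global').
interface StaticData {
  errors?: StoredError[];
}

// Append an entry and drop the oldest ones beyond the capacity.
function pushError(store: StaticData, entry: StoredError): void {
  const errors = store.errors ?? [];
  errors.push(entry);
  store.errors = errors.slice(-MAX_ERRORS);
}

// Backing logic for a manual clear command; returns how many were removed.
function clearErrors(store: StaticData): number {
  const removed = store.errors?.length ?? 0;
  store.errors = [];
  return removed;
}

// Most recent entries first, for a quick check such as /errors.
function recentErrors(store: StaticData, limit: number): StoredError[] {
  if (limit <= 0) {
    return [];
  }
  return (store.errors ?? []).slice(-limit).reverse();
}
```

An array trimmed with `slice(-MAX_ERRORS)` is used instead of a fixed-size array with a write index because static data is stored as JSON and 50 entries is small enough that the copy cost is negligible.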

### Debug mode

- Debug mode is for Claude's use during debugging, not user-facing
- Must address three specific pain points:
  1. Sub-workflow data loss: capture what data was sent to and received from each sub-workflow at boundaries
  2. Callback routing confusion: trace which path a callback took through routing logic
  3. n8n API execution log parsing: make execution data easily queryable without manual workflow investigation
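
A minimal sketch of trace capture for the first two pain points follows; the third depends on the n8n execution API response shape and is not covered here. All names (`TraceEvent`, `Trace`, `traceBoundary`, `traceRoute`, `routePath`, `lostKeys`) are hypothetical, and the event fields are an assumption, since trace format is left open under Claude's Discretion.

```typescript
type Direction = "sent" | "received";

// Hypothetical trace events: boundary snapshots and routing decisions.
type TraceEvent =
  | { kind: "boundary"; subWorkflow: string; direction: Direction; payload: unknown }
  | { kind: "route"; callbackData: string; branch: string };

interface Trace {
  executionId: string; // e.g. the value of $execution.id in n8n
  events: TraceEvent[];
}

// Record what crossed a sub-workflow boundary, in either direction.
function traceBoundary(trace: Trace, subWorkflow: string, direction: Direction, payload: unknown): void {
  // Store a JSON copy so later mutation of the payload cannot alter the snapshot.
  const snapshot: unknown = payload === undefined ? null : JSON.parse(JSON.stringify(payload));
  trace.events.push({ kind: "boundary", subWorkflow, direction, payload: snapshot });
}

// Record which routing branch handled a callback.
function traceRoute(trace: Trace, callbackData: string, branch: string): void {
  trace.events.push({ kind: "route", callbackData, branch });
}

// Ordered list of branches taken during this execution.
function routePath(trace: Trace): string[] {
  const path: string[] = [];
  for (const event of trace.events) {
    if (event.kind === "route") {
      path.push(event.branch);
    }
  }
  return path;
}

// Top-level keys that were sent but are missing from what came back.
function lostKeys(sent: Record<string, unknown>, received: Record<string, unknown>): string[] {
  return Object.keys(sent).filter((key) => !(key in received));
}
```

Comparing the "sent" and "received" snapshots for one sub-workflow with `lostKeys` turns "data disappeared at the boundary" into a direct query instead of a manual diff of execution logs.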

### Claude's Discretion

- Trace format and structure (timeline vs. data snapshots vs. both)
- Whether to trace all executions or only errors (overhead vs. usefulness)
- Structured entries vs. simple log lines (what enables best debugging)
- Debug toggle mechanism (global toggle, per-request, or always-on for errors)
- Log level granularity (on/off vs. error/warn/info)
- What specific debug data to capture (raw API responses, sub-workflow I/O, timing)
- Telegram command naming and exact interface

## Specific Ideas

- "I want you to be more easily able to track down issues when they occur": the driving goal is Claude's ability to programmatically diagnose issues
- Past pain points: sub-workflow boundary data disappearing, callback routing taking unexpected paths, difficulty parsing n8n execution API responses
- "These logs would resolve these issues": the logging infrastructure should make the three pain points immediately queryable
- Error commands should be hidden/unlisted (developer/debug tools, not part of the normal command set)

## Deferred Ideas

None — discussion stayed within phase scope

Phase: 10.2-better-logging-and-log-management
Context gathered: 2026-02-08
