docs(10.2-01): complete error ring buffer foundation plan

- Create 10.2-01-SUMMARY.md with full plan execution details
- Update STATE.md: Phase 10.2 plan 1 of 3 complete (33% progress)
- Document all technical decisions and architecture patterns
- Self-check verification passed: all files and commits present
- Duration: 156 seconds (2.6 minutes)
- Node count: 168 -> 172 (+4 nodes: 2 command, 2 utility)
- Ring buffer infrastructure ready for Plan 02 wiring
2026-02-08 12:49:54 -05:00
parent 030118efb3
commit 6833641ad1
2 changed files with 236 additions and 10 deletions
@@ -0,0 +1,209 @@
+---
+phase: 10.2-better-logging-and-log-management
+plan: 01
+subsystem: logging-infrastructure
+tags: [error-logging, debug-commands, ring-buffer, telegram-commands]
+dependency_graph:
+  requires: []
+  provides: [error-ring-buffer, debug-toggle, trace-infrastructure, hidden-commands]
+  affects: [main-workflow]
+tech_stack:
+  added: [workflow-static-data, ring-buffer-pattern]
+  patterns: [ring-buffer, structured-logging, command-routing]
+key_files:
+  created: []
+  modified: [n8n-workflow.json]
+decisions:
+  - "Ring buffer size set to 50 entries for both errors and traces"
+  - "Debug mode auto-disables after 100 executions to prevent performance impact"
+  - "Field truncation: error stack 500 chars, raw response 1000 chars"
+  - "All 4 debug commands use single unified code node for maintainability"
+  - "Commands use startsWith operator to prevent false matches with regular text"
+metrics:
+  duration: 156
+  completed: 2026-02-08T17:47:57Z
+---
+
+# Phase 10.2 Plan 01: Error Ring Buffer Foundation and Hidden Debug Commands Summary
+
+**Built the error ring buffer infrastructure and hidden Telegram debug commands in the main workflow, establishing centralized error/trace storage in workflow static data and providing a chat-command interface for quick diagnostics.**
+
+## Completed Tasks
+
+### Task 1: Add hidden command routing and debug command processor
+**Status:** Complete
+**Commit:** daff5bc
+
+Added 4 new keyword routing rules to the Keyword Router switch node for the /errors, /clear-errors, /debug, and /trace commands. All use the `startsWith` operator (not `contains`) to prevent false matches with regular text. Created a unified Process Debug Command code node that handles all 4 commands in a single code block, implementing:
+
+- Static data initialization (errorLog structure with debug, errors, traces)
+- `/errors [N]` command: displays last N errors (default 5, max 50) with formatted output
+- `/clear-errors` command: resets error ring buffer
+- `/debug on|off|status` command: toggles debug mode, shows status
+- `/trace <correlationId>` command: queries all entries matching a correlation ID
+
+Added a Send Debug Response Telegram node with HTML parse mode for formatted output. Positioned the two nodes at [1120, -200] and [1340, -200] to keep them visually grouped above the existing menu path. Commands remain hidden (not listed in the /start help menu).
+
+**Key changes:**
+- Keyword Router: 9 rules -> 13 rules (+4 hidden commands)
+- New node: Process Debug Command (id: `code-process-debug-command`)
+- New node: Send Debug Response (id: `telegram-send-debug-response`)
+- Connections: Keyword Router -> Process Debug Command -> Send Debug Response
+- Node count: 168 -> 170 (+2 nodes)
+
+### Task 2: Add error logging utility function and ring buffer write helper
+**Status:** Complete
+**Commit:** 4c2194c
+
+Created two standalone utility Code nodes that will be wired by Plan 02 and Plan 03:
+
+**Log Error node** (id: `code-log-error`) - Centralized error logging entry point:
+- Accepts structured error input (correlationId, workflow, node, operation, userMessage, errorMessage, httpCode, rawResponse, contextData)
+- Initializes static data structure if missing
+- Creates error entry with auto-incrementing ID (err_001, err_002, etc.)
+- Implements ring buffer: max 50 entries, auto-rotates (shift oldest when full)
+- Truncates large fields: error stack to 500 chars, raw response to 1000 chars
+- Passes through all input data with _errorLogged flag for downstream nodes
+- Positioned at [2600, -200] in utility area
+
+**Log Trace node** (id: `code-log-trace`) - Debug mode trace logging:
+- Checks debug.enabled flag before logging (passes through unchanged if off)
+- Increments execution count and auto-disables at 100 executions
+- Stores trace entries in separate ring buffer (max 50)
+- Trace entry fields: id, correlationId, timestamp, executionId, event, workflow, node, data
+- Supports two event types: "sub-workflow-call" and "callback-routing"
+- Passes through all input data, adding a _traceLogged flag when a trace is written
+- Positioned at [2600, -400] in utility area
+
+Both nodes are standalone (no incoming/outgoing connections) and ready for wiring: Log Error in Plan 02, Log Trace in Plan 03.
+
+**Key changes:**
+- New node: Log Error (id: `code-log-error`)
+- New node: Log Trace (id: `code-log-trace`)
+- Node count: 170 -> 172 (+2 utility nodes)
+
+## Technical Implementation
+
+### Ring Buffer Pattern
+Both the error and trace buffers use a simple array-based ring buffer with auto-rotation: push the new entry, then drop the oldest one once the array exceeds its maximum size:
+```javascript
+buffer.push(entry);
+if (buffer.length > MAX_SIZE) {
+  buffer.shift();  // Remove oldest
+}
+```
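
As a slightly fuller sketch of the same pattern, the helper below also assigns the `err_001`-style ID described for the Log Error node. The function name `pushEntry`, its parameters, and the three-digit zero padding are assumptions made for illustration; this is not code taken from the workflow.

```javascript
// Hypothetical helper: append an entry to a bounded buffer and drop the
// oldest entry once the limit is exceeded. `store` mirrors the
// { buffer, nextId } shape described in this summary.
function pushEntry(store, prefix, fields, maxSize = 50) {
  // Build an ID such as err_001 (the padding width is an assumption).
  const id = `${prefix}_${String(store.nextId).padStart(3, '0')}`;
  store.nextId += 1;

  const entry = { id, timestamp: new Date().toISOString(), ...fields };
  store.buffer.push(entry);
  if (store.buffer.length > maxSize) {
    store.buffer.shift(); // remove oldest
  }
  return entry;
}

// Usage: with a limit of 3, only the three most recent entries survive.
const demo = { buffer: [], nextId: 1 };
for (let i = 1; i <= 5; i++) {
  pushEntry(demo, 'err', { errorMessage: `failure ${i}` }, 3);
}
console.log(demo.buffer.map((e) => e.id).join(', ')); // err_003, err_004, err_005
```

Because `nextId` keeps counting while old entries rotate out, IDs stay unique even after the buffer has wrapped.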
+
+### Static Data Structure
+```javascript
+staticData.errorLog = {
+  debug: { enabled: false, executionCount: 0 },
+  errors: { buffer: [], nextId: 1, count: 0, lastCleared: ISO_DATE },
+  traces: { buffer: [], nextId: 1 }
+}
+```
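
The nodes described above initialize this structure when it is missing. A minimal sketch of such a guard follows. `ensureErrorLog` is a hypothetical name, `ISO_DATE` is assumed to be an ISO-8601 timestamp, and a plain object stands in for the workflow static data (inside an n8n Code node it would presumably come from `$getWorkflowStaticData('global')`, which is an assumption about how the workflow obtains it).

```javascript
// Hypothetical guard: create the errorLog structure only if it is absent,
// so existing errors and traces are never overwritten.
function ensureErrorLog(staticData) {
  if (!staticData.errorLog) {
    staticData.errorLog = {
      debug: { enabled: false, executionCount: 0 },
      errors: {
        buffer: [],
        nextId: 1,
        count: 0,
        lastCleared: new Date().toISOString(), // assumed meaning of ISO_DATE
      },
      traces: { buffer: [], nextId: 1 },
    };
  }
  return staticData.errorLog;
}

// Usage: a plain object stands in for the workflow static data.
const staticData = {};
const first = ensureErrorLog(staticData);
first.errors.buffer.push({ id: 'err_001' });

// A second call returns the same structure with its contents intact.
const second = ensureErrorLog(staticData);
console.log(second.errors.buffer.length);
```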
+
+### Command Routing
+All 4 debug commands route through a single Process Debug Command node:
+- `/errors [N]` - Display recent errors (HTML formatted)
+- `/clear-errors` - Reset error buffer
+- `/debug on|off|status` - Toggle/query debug mode
+- `/trace <correlationId>` - Query by correlation ID
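
The argument handling for these commands can be sketched as below. `parseDebugCommand` is a hypothetical function, not the node's actual code; the default of 5 and the maximum of 50 for `/errors` come from this summary, while clamping to a minimum of 1 and treating a bare `/debug` as `status` are assumptions.

```javascript
// Hypothetical parser for the four hidden commands. It runs after the
// Keyword Router has already matched the message with startsWith.
function parseDebugCommand(text) {
  const [command, ...args] = text.trim().split(/\s+/);
  switch (command) {
    case '/errors': {
      const n = Number.parseInt(args[0], 10);
      // Default to 5 entries; clamp explicit values to the 1..50 range.
      const limit = Number.isNaN(n) ? 5 : Math.min(Math.max(n, 1), 50);
      return { command: 'errors', limit };
    }
    case '/clear-errors':
      return { command: 'clear-errors' };
    case '/debug': {
      // Assumption: anything other than on/off/status falls back to status.
      const action = ['on', 'off', 'status'].includes(args[0]) ? args[0] : 'status';
      return { command: 'debug', action };
    }
    case '/trace':
      return args[0]
        ? { command: 'trace', correlationId: args[0] }
        : { command: 'trace', error: 'missing correlationId' };
    default:
      return { command: 'unknown' };
  }
}

// Usage
console.log(parseDebugCommand('/errors 200'));
console.log(parseDebugCommand('/trace abc-123'));
```

Keeping the parsing in one function matches the single-node design: adding a fifth command would mean one new `case` rather than a new node.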
+
+### Auto-Disable Mechanism
+Debug mode automatically disables after 100 executions to prevent:
+- Performance impact from continuous tracing
+- Ring buffer fill-up with trace noise
+- Forgotten debug mode affecting production
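
One way to express this gate is sketched below. The name `shouldTrace`, the decision to still record the 100th execution, and the counter reset on auto-disable are assumptions; only the limit of 100 and the `debug` object's fields come from this summary.

```javascript
const MAX_DEBUG_EXECUTIONS = 100; // limit stated in this summary

// Hypothetical gate: returns true when the current execution should be
// traced, and switches debug mode off once the limit is reached.
function shouldTrace(debug) {
  if (!debug.enabled) return false;

  debug.executionCount += 1;
  if (debug.executionCount >= MAX_DEBUG_EXECUTIONS) {
    debug.enabled = false;    // auto-disable
    debug.executionCount = 0; // assumption: reset for the next "/debug on"
  }
  return true;
}

// Usage: simulate 150 executions with debug mode switched on once.
const debug = { enabled: true, executionCount: 0 };
let traced = 0;
for (let i = 0; i < 150; i++) {
  if (shouldTrace(debug)) traced += 1;
}
console.log(traced, debug.enabled); // 100 false
```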
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Architecture Decisions
+
+**1. Unified command handler**
+Used a single Process Debug Command code node for all 4 commands rather than separate nodes per command. This improves maintainability (single code block to update) and reduces node count.
+
+**2. Ring buffer size: 50 entries**
+Conservative size based on research recommendations. Keeps the log within workflow static data size limits while providing sufficient history for debugging. Can be adjusted if needed.
+
+**3. Field truncation**
+Error stack: 500 chars, raw response: 1000 chars. Balances diagnostic value with static data size constraints.
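
A minimal sketch of this truncation follows. The helper names and the `stack` field name are assumptions (`rawResponse` appears in the Log Error input list), and the hard cut without an ellipsis marker is also assumed. Non-string values are serialized with `JSON.stringify` before being cut.

```javascript
// Limits from this summary; the `stack` field name is an assumption.
const LIMITS = { stack: 500, rawResponse: 1000 };

// Hypothetical helper: stringify non-string values, then hard-cut.
function truncate(value, maxLength) {
  if (value === undefined || value === null) return value;
  const text =
    typeof value === 'string' ? value : JSON.stringify(value) ?? String(value);
  return text.length > maxLength ? text.slice(0, maxLength) : text;
}

// Returns a copy of the entry with the large fields truncated.
function truncateErrorFields(entry) {
  const out = { ...entry };
  for (const [field, max] of Object.entries(LIMITS)) {
    if (field in out) out[field] = truncate(out[field], max);
  }
  return out;
}

// Usage
const trimmed = truncateErrorFields({
  errorMessage: 'container not found',
  stack: 'x'.repeat(2000),
  rawResponse: { body: 'y'.repeat(5000) },
});
console.log(trimmed.stack.length, trimmed.rawResponse.length); // 500 1000
```

With 50 entries per buffer, these caps bound the two largest fields to roughly 75,000 characters in total for the error buffer (50 × 1,500).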
+
+**4. startsWith operator for keywords**
+Used `startsWith` instead of `contains` to prevent false matches (e.g., a user typing "debug the container" wouldn't trigger the /debug command).
+
+**5. HTML formatting for Telegram output**
+Used `parse_mode: HTML` with `<b>`, `<pre>`, `<code>` tags for better readability in Telegram. Escapes `>` as `&gt;` to prevent HTML parsing errors.
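
The summary mentions escaping only `>`. Telegram's HTML parse mode also treats `<` and `&` as special characters, so a more defensive helper could escape all three. The sketch below shows that variant; `escapeHtml` and `formatErrorEntry` are hypothetical names and the output layout is illustrative, not the node's actual format.

```javascript
// Hypothetical helper: escape the three characters that are special in
// Telegram's HTML parse mode. "&" must be replaced first so the entities
// produced by the later replacements are not escaped twice.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Illustrative formatter using the <b>, <code> and <pre> tags named above.
function formatErrorEntry(entry) {
  return [
    `<b>${escapeHtml(entry.id)}</b> <code>${escapeHtml(entry.node)}</code>`,
    `<pre>${escapeHtml(entry.errorMessage)}</pre>`,
  ].join('\n');
}

// Usage
console.log(
  formatErrorEntry({
    id: 'err_001',
    node: 'HTTP Request',
    errorMessage: 'expected <200> but got 500 & retry failed',
  })
);
```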
+
+## Success Criteria Met
+
+- [x] Main workflow has 172 nodes (168 + 4 new)
+- [x] Ring buffer infrastructure initialized in workflow static data
+- [x] /errors, /clear-errors, /debug, /trace commands routed and handled
+- [x] Log Error and Log Trace utility nodes ready for wiring
+- [x] No regression to existing functionality
+- [x] All new nodes use correct n8n typeVersion (2 for code, 1.2 for Telegram)
+- [x] Commands remain hidden (not in /start menu)
+
+## Self-Check
+
+Running verification of created files and commits:
+
+**Files created:**
+```bash
+# No new files - only modified n8n-workflow.json
+```
+
+**Files modified:**
+```bash
+$ [ -f "/home/luc/Projects/unraid-docker-manager/n8n-workflow.json" ] && echo "FOUND: n8n-workflow.json"
+```
+FOUND: n8n-workflow.json
+
+**Commits created:**
+```bash
+$ git log --oneline --all | grep -q "daff5bc" && echo "FOUND: daff5bc"
+$ git log --oneline --all | grep -q "4c2194c" && echo "FOUND: 4c2194c"
+```
+FOUND: daff5bc
+FOUND: 4c2194c
+
+**Node count verification:**
+```bash
+$ python3 -c "import json; wf=json.load(open('n8n-workflow.json')); print(f'Node count: {len(wf[\"nodes\"])}')"
+```
+Node count: 172
+
+## Self-Check: PASSED
+
+All files, commits, and node counts verified. Plan executed successfully.
+
+## Next Steps
+
+**Plan 02:** Wire error logging to main workflow error paths
+- Connect Log Error node to sub-workflow error returns
+- Add error context capture (sub-workflow input/output)
+- Wire error propagation from sub-workflows to centralized storage
+
+**Plan 03:** Add debug tracing to sub-workflow boundaries and callback routing
+- Wire Log Trace node to sub-workflow call points (capture I/O)
+- Add trace logging to callback routing decisions
+- Test debug mode toggle and auto-disable behavior
+
+## Metrics
+
+- **Duration:** 156 seconds (2.6 minutes)
+- **Tasks completed:** 2/2
+- **Commits:** 2 (1 per task)
+- **Files modified:** 1 (n8n-workflow.json)
+- **Nodes added:** 4 (2 command handler nodes, 2 utility nodes)
+- **Node count:** 168 -> 172 (+2.4%)
+- **Rules added:** 4 (hidden command keywords)
+- **Connections added:** 5 (4 keyword outputs + 1 command-to-response)
+
+---
+
+*Plan completed: 2026-02-08*
+*Phase: 10.2-better-logging-and-log-management*
+*Execution agent: Claude Sonnet 4.5*