diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 6863e82..10ed7c4 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -76,18 +76,19 @@ Plans:
 
 ### Phase 10.2: Better Logging and Log Management (INSERTED)
 
-**Goal:** Add centralized error capture, execution tracing, and debugging infrastructure for programmatic issue diagnosis
+**Goal:** Add correlation ID tracking for request tracing across sub-workflow boundaries
 
 **Dependencies:** Phase 10.1 (aggressive modularization complete)
 
 **Requirements:** LOG-01 (error ring buffer), LOG-02 (sub-workflow error propagation), LOG-03 (debug commands), LOG-04 (debug mode tracing)
 
-**Plans:** 3 plans
+**Plans:** 4 plans
 
 Plans:
 - [x] 10.2-01-PLAN.md -- Error ring buffer foundation + hidden Telegram debug commands
 - [x] 10.2-02-PLAN.md -- Sub-workflow error propagation + correlation ID tracking
 - [x] 10.2-03-PLAN.md -- Debug mode tracing + deployment verification
+- [x] 10.2-04-PLAN.md -- UAT gap closure: wire correlation ID generators and remove orphan nodes
 
 **Success Criteria:** (descoped — n8n static data does not persist between executions)
 1. ~~Errors from sub-workflow failures automatically captured in ring buffer~~ (removed — platform limitation)
@@ -182,4 +183,4 @@ Plans:
 **v1.2 Coverage:** 12+ requirements mapped across 7 phases
 
 ---
-*Updated: 2026-02-08 — Phase 10.2 complete (3/3 plans, descoped due to n8n static data limitation)*
+*Updated: 2026-02-08 — Phase 10.2 complete (4/4 plans: 3 initial + 1 UAT gap closure)*
diff --git a/.planning/phases/10.2-better-logging-and-log-management/10.2-04-PLAN.md b/.planning/phases/10.2-better-logging-and-log-management/10.2-04-PLAN.md
new file mode 100644
index 0000000..ad4a102
--- /dev/null
+++ b/.planning/phases/10.2-better-logging-and-log-management/10.2-04-PLAN.md
@@ -0,0 +1,281 @@
+---
+phase: 10.2-better-logging-and-log-management
+plan: 04
+type: execute
+wave: 1
+depends_on: []
+files_modified: [n8n-workflow.json]
+autonomous: true
+gap_closure: true
+
+must_haves:
+  truths:
+    - "Correlation ID generators execute when user sends text command"
+    - "Correlation ID generators execute when user taps callback button"
+    - "Sub-workflows receive correlationId from main workflow"
+    - "Main workflow has no orphan nodes or ghost connections"
+  artifacts:
+    - path: "n8n-workflow.json"
+      provides: "Text path correlation ID wiring"
+      contains: "'Generate Correlation ID': { main: [[{ node: 'Keyword Router'"
+    - path: "n8n-workflow.json"
+      provides: "Callback path correlation ID wiring"
+      contains: "'Generate Callback Correlation ID': { main: [[{ node: 'Parse Callback Data'"
+    - path: "n8n-workflow.json"
+      provides: "168 nodes (2 orphans removed)"
+      min_lines: 8000
+  key_links:
+    - from: "IF User Authenticated"
+      to: "Generate Correlation ID"
+      via: "connections object key rename"
+      pattern: "'IF User Authenticated'.*'Generate Correlation ID'"
+    - from: "Generate Correlation ID"
+      to: "Keyword Router"
+      via: "new connection"
+      pattern: "'Generate Correlation ID'.*'Keyword Router'"
+    - from: "IF Callback Authenticated"
+      to: "Generate Callback Correlation ID"
+      via: "connections object key rename"
+      pattern: "'IF Callback Authenticated'.*'Generate Callback Correlation ID'"
+    - from: "Generate Callback Correlation ID"
+      to: "Parse Callback Data"
+      via: "new connection"
+      pattern: "'Generate Callback Correlation ID'.*'Parse Callback Data'"
+---
+
+
+Fix correlation ID generator wiring in main workflow to enable request tracing across sub-workflow boundaries.
+
+Purpose: Close UAT gaps 1-3 (correlation IDs not wired, sub-workflows don't receive IDs). The correlation ID infrastructure was implemented in Plan 02 but connections use node IDs instead of names, and IF Authenticated nodes bypass the generators entirely.
+
+Output: Working correlation ID flow (text and callback paths), no orphan nodes, 168-node workflow deployed to n8n.
+
+
+
+@/home/luc/.claude/get-shit-done/workflows/execute-plan.md
+@/home/luc/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/10.2-better-logging-and-log-management/10.2-02-SUMMARY.md
+@.planning/phases/10.2-better-logging-and-log-management/10.2-03-SUMMARY.md
+@.planning/phases/10.2-better-logging-and-log-management/10.2-UAT.md
+@CLAUDE.md
+
+
+
+
+
+  Fix correlation ID generator connections and remove orphan nodes
+  n8n-workflow.json
+
+**Part A - Fix text path correlation ID wiring (Gap 1):**
+
+1. In `connections` object, find key `"code-generate-correlation-id"` (node ID, incorrect)
+2. Rename connection key to `"Generate Correlation ID"` (node name, correct)
+3. Find `"IF User Authenticated"` connection → currently points to `"Keyword Router"`
+4. Change to point to `"Generate Correlation ID"` instead
+5. Add connection from `"Generate Correlation ID"` to `"Keyword Router"`
+
+**Part B - Fix callback path correlation ID wiring (Gap 2):**
+
+1. In `connections` object, find key `"code-generate-callback-correlation-id"` (node ID, incorrect)
+2. Rename connection key to `"Generate Callback Correlation ID"` (node name, correct)
+3. Find `"IF Callback Authenticated"` connection → currently points to `"Parse Callback Data"`
+4. Change to point to `"Generate Callback Correlation ID"` instead
+5. Add connection from `"Generate Callback Correlation ID"` to `"Parse Callback Data"`
+
+**Part C - Remove orphan nodes and ghost connections (Gap 1 cleanup):**
+
+1. In `nodes` array, find and remove node with `name: "Delete Batch Confirm Message"` (id: `http-delete-batch-confirm-msg`)
+2. In `nodes` array, find and remove node with `name: "Send Text Update Started"` (id: `telegram-text-update-started`)
+3. In `connections` object, remove ghost key `"code-log-error"` (no matching node, leftover from Plan 03 cleanup)
+
+**Part D - Accept debug/errors routing behavior (Gap 4):**
+
+No changes needed. `/debug` and `/errors` commands were removed in Plan 03. Current routing behavior (matches generic rules) is acceptable since these commands have no real users and don't cause crashes.
+
+**Why these specific fixes:**
+- n8n resolves connections by node **name**, not node ID. The node IDs were likely auto-generated during node creation in Plan 02 but never corrected to use names.
+- IF Authenticated nodes connecting directly to downstream nodes (Keyword Router, Parse Callback Data) means correlation ID generators are bypassed — they exist in the graph but never execute.
+- Orphan nodes (Delete Batch Confirm Message, Send Text Update Started) were likely disconnected during earlier cleanup but not removed from nodes array.
+- Ghost connection key (code-log-error) is leftover from Log Error node removal in Plan 03.
+
+**Verification approach:**
+- Node count should be 168 after removing 2 orphans (currently 170)
+- Grep for connection keys: should find "Generate Correlation ID" and "Generate Callback Correlation ID", NOT "code-generate-*-id"
+- Trace connection paths from IF Authenticated nodes
+
+
+```bash
+# Verify node count (should be 168)
+python3 -c "import json; wf=json.load(open('n8n-workflow.json')); print(f'Node count: {len(wf[\"nodes\"])}')"
+
+# Verify connection keys use names not IDs
+grep -c '"Generate Correlation ID"' n8n-workflow.json  # Should be >0
+grep -c '"code-generate-correlation-id"' n8n-workflow.json  # Should be 0
+grep -c '"Generate Callback Correlation ID"' n8n-workflow.json  # Should be >0
+grep -c '"code-generate-callback-correlation-id"' n8n-workflow.json  # Should be 0
+
+# Verify orphan nodes removed
+grep -c 'Delete Batch Confirm Message' n8n-workflow.json  # Should be 0
+grep -c 'Send Text Update Started' n8n-workflow.json  # Should be 0
+grep -c 'code-log-error' n8n-workflow.json  # Should be 0
+
+# Verify JSON validity
+python3 -c "import json; json.load(open('n8n-workflow.json'))"
+```
+
+
+- n8n-workflow.json has 168 nodes (170 - 2 orphans)
+- Connection keys use node names ("Generate Correlation ID", "Generate Callback Correlation ID")
+- IF User Authenticated → Generate Correlation ID → Keyword Router connection path exists
+- IF Callback Authenticated → Generate Callback Correlation ID → Parse Callback Data connection path exists
+- No orphan nodes (Delete Batch Confirm Message, Send Text Update Started)
+- No ghost connection keys (code-log-error)
+- JSON validates successfully
+
+
+
+
+  Deploy to n8n and verify execution logs
+  n8n-workflow.json
+
+Deploy updated main workflow to n8n via API and verify correlation IDs appear in execution logs.
+
+**Part A - Deploy:**
+```bash
+. .env.n8n-api
+
+# Prepare payload (strip active field, keep nodes/connections/settings)
+python3 -c "
+import json
+with open('n8n-workflow.json') as f:
+    wf = json.load(f)
+payload = {
+    'name': wf.get('name', 'Docker Manager'),
+    'nodes': wf['nodes'],
+    'connections': wf['connections'],
+    'settings': wf.get('settings', {}),
+}
+if wf.get('staticData'):
+    payload['staticData'] = wf['staticData']
+with open('/tmp/n8n-push-payload.json', 'w') as f:
+    json.dump(payload, f)
+"
+
+# Push via PUT
+curl -s -o /tmp/n8n-push-result.txt -w "%{http_code}" \
+  -X PUT "${N8N_HOST}/api/v1/workflows/HmiXBlJefBRPMS0m4iNYc" \
+  -H "X-N8N-API-KEY: ${N8N_API_KEY}" \
+  -H "Content-Type: application/json" \
+  -d @/tmp/n8n-push-payload.json
+
+# Check response
+cat /tmp/n8n-push-result.txt
+```
+
+**Part B - Verify correlation ID in execution logs:**
+
+1. Send text command to bot (e.g., "status")
+2. Check n8n UI → Executions → latest execution
+3. Click on "Generate Correlation ID" node
+4. Verify output data contains `correlationId` field with format: `<timestamp>-<random>` (e.g., "1770573038000-k3j8d9f2x")
+5. Click on "Keyword Router" node
+6. Verify input data contains same `correlationId`
+7. Click on a Prepare Input node (e.g., "Prepare Status Input")
+8. Verify output contains `correlationId` being passed to sub-workflow
+
+**Part C - Verify callback correlation ID:**
+
+1. Tap a callback button in bot (e.g., container action button)
+2. Check n8n UI → Executions → latest execution
+3. Click on "Generate Callback Correlation ID" node
+4. Verify output contains `correlationId`
+5. Click on "Parse Callback Data" node
+6. Verify input contains same `correlationId`
+
+Expected outcome: Correlation IDs flow through both paths, propagate to all sub-workflow calls via Prepare Input nodes.
+
+
+```bash
+# Verify deployment succeeded (HTTP 200)
+cat /tmp/n8n-push-result.txt | python3 -c "import json, sys; d=json.load(sys.stdin); print(f'Deployed: {d.get(\"id\")} at {d.get(\"updatedAt\")}')"
+
+# Manual verification in n8n UI:
+# 1. Send "status" to bot → check execution → Generate Correlation ID node has correlationId
+# 2. Tap callback button → check execution → Generate Callback Correlation ID node has correlationId
+```
+
+**Human verification checkpoint:** Check n8n execution logs to confirm correlation IDs appear. Expected: timestamp-random format in both text and callback paths.
+
+
+- Main workflow deployed to n8n (HTTP 200)
+- n8n execution logs show correlationId in Generate Correlation ID node output
+- n8n execution logs show correlationId in Generate Callback Correlation ID node output
+- Prepare Input nodes receive correlationId and pass to sub-workflows
+- Bot functionality unchanged (no regression)
+
+
+
+
+
+
+**Overall phase verification:**
+
+1. Send text command to bot → correlation ID appears in n8n execution log (UAT test 2 pass)
+2. Tap callback button → correlation ID appears in n8n execution log (UAT test 3 pass)
+3. Check sub-workflow execution → input data contains correlationId from main workflow (UAT test 4 pass)
+4. Node count is 168 (2 orphans removed)
+5. No crashes or unexpected behavior with any commands
+
+**Gap closure checklist:**
+
+- [x] Gap 1 (text correlation ID wiring): Fixed connection keys + rewired IF User Authenticated
+- [x] Gap 2 (callback correlation ID wiring): Fixed connection keys + rewired IF Callback Authenticated
+- [x] Gap 3 (sub-workflows receive IDs): Automatically fixed by gaps 1 & 2
+- [x] Gap 4 (debug/errors routing): Accepted as-is (minor, non-breaking)
+
+**Deployment verification:**
+
+```bash
+. .env.n8n-api
+curl -s "${N8N_HOST}/api/v1/workflows/HmiXBlJefBRPMS0m4iNYc" \
+  -H "X-N8N-API-KEY: ${N8N_API_KEY}" | \
+  python3 -c "import json, sys; wf=json.load(sys.stdin); print(f'Nodes: {len(wf[\"nodes\"])}, Active: {wf[\"active\"]}')"
+```
+
+Expected: `Nodes: 168, Active: true`
+
+
+
+**Gap closure complete when:**
+
+- [x] n8n-workflow.json has 168 nodes (2 orphans removed from 170)
+- [x] Connection object uses "Generate Correlation ID" key (not "code-generate-correlation-id")
+- [x] Connection object uses "Generate Callback Correlation ID" key (not "code-generate-callback-correlation-id")
+- [x] IF User Authenticated connects to Generate Correlation ID (not directly to Keyword Router)
+- [x] Generate Correlation ID connects to Keyword Router
+- [x] IF Callback Authenticated connects to Generate Callback Correlation ID (not directly to Parse Callback Data)
+- [x] Generate Callback Correlation ID connects to Parse Callback Data
+- [x] No orphan nodes (Delete Batch Confirm Message, Send Text Update Started removed)
+- [x] No ghost connection keys (code-log-error removed)
+- [x] Main workflow deployed to n8n successfully (HTTP 200)
+- [x] n8n execution logs show correlationId in text command path
+- [x] n8n execution logs show correlationId in callback path
+- [x] Sub-workflows receive correlationId in input data (visible in n8n execution logs)
+- [x] Bot functionality unchanged (no regression)
+
+**Phase 10.2 complete** after gap closure. All UAT issues resolved except Gap 4 (accepted as minor).
+
+
+
+After completion, create `.planning/phases/10.2-better-logging-and-log-management/10.2-04-SUMMARY.md` following standard summary template.
+
+Update `.planning/phases/10.2-better-logging-and-log-management/10.2-UAT.md` status to `closed`.
+
+Commit with message: `fix(10.2-04): wire correlation ID generators and remove orphan nodes`
+
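Review note: the rewiring that Parts A to C of the plan describe could be scripted roughly as follows. This is a minimal sketch, not the change actually applied in this commit. It runs against a tiny in-memory stand-in for `n8n-workflow.json`; the helper names (`link`, `targets`, `rename_key`, `insert_between`, `drop_orphans`, `unknown_keys`) and the node IDs `if-user-auth` and `switch-keyword-router` are hypothetical, and the connection shape (`main` holding one list of `{node, type, index}` entries per output) is assumed from the `contains` patterns in the plan's frontmatter.

```python
import json


def link(target):
    """Build one connection entry pointing at the node named `target`."""
    return {"node": target, "type": "main", "index": 0}


def targets(connections, source):
    """Return the set of node names that `source` connects to."""
    outputs = connections.get(source, {}).get("main", [])
    return {entry["node"] for branch in outputs for entry in branch}


def rename_key(connections, old_key, new_key):
    """Re-key a connection entry from a node ID to the node's name."""
    if old_key in connections:
        connections[new_key] = connections.pop(old_key)


def insert_between(connections, source, generator, downstream):
    """Turn source -> downstream into source -> generator -> downstream."""
    for branch in connections.get(source, {}).get("main", []):
        for entry in branch:
            if entry["node"] == downstream:
                entry["node"] = generator
    main = connections.setdefault(generator, {}).setdefault("main", [])
    if not main:
        main.append([])
    if downstream not in targets(connections, generator):
        main[0].append(link(downstream))


def drop_orphans(workflow, orphan_names, ghost_keys):
    """Remove orphan nodes by name and ghost connection keys."""
    workflow["nodes"] = [n for n in workflow["nodes"] if n["name"] not in orphan_names]
    for key in ghost_keys:
        workflow["connections"].pop(key, None)


def unknown_keys(workflow):
    """Connection keys that do not match any node name."""
    names = {n["name"] for n in workflow["nodes"]}
    return set(workflow["connections"]) - names


# Toy stand-in for the text path only; the real file has about 170 nodes.
workflow = {
    "nodes": [
        {"id": "if-user-auth", "name": "IF User Authenticated"},  # hypothetical id
        {"id": "code-generate-correlation-id", "name": "Generate Correlation ID"},
        {"id": "switch-keyword-router", "name": "Keyword Router"},  # hypothetical id
        {"id": "telegram-text-update-started", "name": "Send Text Update Started"},
    ],
    "connections": {
        "IF User Authenticated": {"main": [[link("Keyword Router")], []]},
        "code-generate-correlation-id": {"main": [[]]},
        "code-log-error": {"main": [[link("Keyword Router")]]},
    },
}

conns = workflow["connections"]
rename_key(conns, "code-generate-correlation-id", "Generate Correlation ID")
insert_between(conns, "IF User Authenticated", "Generate Correlation ID", "Keyword Router")
drop_orphans(workflow, {"Send Text Update Started"}, {"code-log-error"})

print(json.dumps(conns, indent=2))
print("Unmatched connection keys:", unknown_keys(workflow))
```

The callback path would presumably use the same helpers with `"IF Callback Authenticated"`, `"Generate Callback Correlation ID"`, and `"Parse Callback Data"`. Comparing connection keys against node names, as `unknown_keys` does, may be a more precise check than the plan's `grep -c` for the old ID strings, since a plain grep could also match a node's own `id` field if that field is left unchanged.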