unraid-docker-manager/.planning/STATE.md
Lucas Berger afddb6130a docs(12-02): complete Phase 12 — all v1.2 requirements closed
BATCH-04 and BATCH-05 UAT passed. 9 bugs fixed during testing.
All 12 v1.2 requirements now complete. Phase 13 (docs overhaul) next.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 18:56:44 -05:00


Project State -- Unraid Docker Manager

Current Position

  • Milestone: v1.2 -- Modularization & Polish
  • Phase: 12 of 13 (Polish & Audit)
  • Plan: 2 of 2 complete
  • Status: Phase 12 COMPLETE (documentation + UAT, all v1.2 requirements closed)
  • Last activity: 2026-02-08 -- Completed 12-02 (UAT: BATCH-04, BATCH-05 passed after 9 bug fixes)

Progress

v1.0: [**********] 100% SHIPPED
v1.1: [**********] 100% SHIPPED

v1.2: [*********_] 83%

Phase 10:   Workflow Modularization         [**********] 100% COMPLETE (+ 10-07 UAT fixes)
Phase 10.1: Aggressive Modularization       [**********] 100% COMPLETE (9/9 plans + UAT closure)
Phase 10.2: Better Logging & Log Management [**********] 100% COMPLETE (4/4 plans complete)
Phase 11:   Update All & Callback Limits    [**********] 100% COMPLETE (2/2 plans, UAT 6/6 pass)
Phase 12:   Polish & Audit                  [**********] 100% COMPLETE (2/2 plans, all requirements closed)
Phase 13:   Documentation Overhaul          [          ] Pending

Phase 10 Completion Summary

| Plan | Description | Status |
|------|-------------|--------|
| 10-01 | Orphan node cleanup | Complete |
| 10-02 | Container Update sub-workflow | Complete |
| 10-03 | Container Actions sub-workflow | Complete |
| 10-04 | Integration verification | Complete |
| 10-05 | Complete modularization (batch, logs) | Complete |
| 10-06 | Remediation: routing, logs, cleanup | Complete |
| 10-07 | UAT gap closure (5 fixes) | Complete |

Achievements:

  • 3 sub-workflows created and deployed (Update, Actions, Logs)
  • All container operations consolidated (no duplicate logic)
  • Old inline batch execution path removed
  • Legacy callbacks modernized to new format
  • Main workflow: 209 -> 192 nodes (-8%)
  • 6 Python helper scripts removed
  • UAT gaps closed: race condition, data chain errors, fuzzy matching, refresh errors

Key Artifacts

  • n8n-workflow.json -- Main workflow (166 nodes: structural minimum achieved, orphan callback chain removed)
  • n8n-batch-ui.json -- Batch UI sub-workflow (17 nodes: 16 baseline + 1 Fetch Containers For Exec) -- ID: ZJhnGzJT26UUmW45
  • n8n-status.json -- Container Status sub-workflow (11 nodes) -- ID: lqpg2CqesnKE2RJQ
  • n8n-confirmation.json -- Confirmation Dialogs sub-workflow (16 nodes) -- ID: fZ1hu8eiovkCk08G
  • n8n-update.json -- Container Update sub-workflow (34 nodes) -- ID: 7AvTzLtKXM2hZTio92_mC
  • n8n-actions.json -- Container Actions sub-workflow (11 nodes) -- ID: fYSZS5PkH0VSEaT5
  • n8n-logs.json -- Container Logs sub-workflow (9 nodes) -- ID: oE7aO2GhbksXDEIw
  • n8n-matching.json -- Container Matching sub-workflow (23 nodes) -- ID: kL4BoI8ITSP9Oxek
  • DEPLOY-SUBWORKFLOWS.md -- Full architecture docs, contracts, and node analysis

Technical Notes

n8n typeVersion 1.2 requirement (workflowId must be a resource-locator object):

"workflowId": { "__rl": true, "mode": "list", "value": "<id>" }

Docker API success detection:

  • 204 No Content = success (empty response body)
  • Check !response.message && !response.error
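
A minimal sketch of the success check described above, combined with the 10.1-08 decisions recorded under Accumulated Decisions (status code checks first, HTTP 304 treated as success). The function name `isDockerActionSuccess` and its parameters are hypothetical, not names from the workflow files.

```javascript
// Hypothetical helper: decide whether a Docker API container action succeeded.
// statusCode: HTTP status if one is available; body: parsed response body (may be empty).
function isDockerActionSuccess(statusCode, body) {
  // 204 No Content: the action was applied and the response body is empty.
  if (statusCode === 204) return true;
  // 304: the container was already in the requested state (treated as success per 10.1-08).
  if (statusCode === 304) return true;
  // Any 4xx/5xx status is a failure regardless of the body.
  if (typeof statusCode === 'number' && statusCode >= 400) return false;
  // Message-based fallback: a body with neither message nor error counts as success.
  const payload = body || {};
  return !payload.message && !payload.error;
}
```

For example, `isDockerActionSuccess(undefined, { message: 'No such container' })` returns `false`, while `isDockerActionSuccess(204, '')` returns `true`.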

Sub-workflow input contracts:

  • Container Update: containerId, containerName, chatId, messageId, responseMode
  • Container Actions: containerId, containerName, action, chatId, messageId, responseMode
  • Container Logs: containerId/containerName, lineCount, chatId, messageId, responseMode
  • Batch UI: chatId, messageId, queryId, callbackData, action, batchPage, selectedCsv, toggleName, batchAction
  • Container Status: chatId, messageId, action, containerId, containerName, page, queryId, searchTerm
  • Confirmation: chatId, messageId, action, containerId, containerName, confirmAction, confirmationToken, expired, responseMode
  • Matching: action, containerList, searchTerm, selectedContainers, chatId, messageId
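
The contracts above could be checked before a sub-workflow call with something like the sketch below. It covers only three of the seven contracts, and it assumes that every listed field must be present as a key (a `null` value is allowed) and that `containerId/containerName` means either one is enough. The names `INPUT_CONTRACTS` and `missingFields` are hypothetical.

```javascript
// Field lists copied from the contracts above. A nested array means "any one of these".
const INPUT_CONTRACTS = {
  'Container Update': ['containerId', 'containerName', 'chatId', 'messageId', 'responseMode'],
  'Container Actions': ['containerId', 'containerName', 'action', 'chatId', 'messageId', 'responseMode'],
  'Container Logs': [['containerId', 'containerName'], 'lineCount', 'chatId', 'messageId', 'responseMode'],
};

// Return the contract entries that the input object does not satisfy.
function missingFields(subWorkflow, input) {
  const contract = INPUT_CONTRACTS[subWorkflow];
  if (!contract) throw new Error(`No contract recorded for: ${subWorkflow}`);
  return contract
    .filter((entry) => {
      const alternatives = Array.isArray(entry) ? entry : [entry];
      return !alternatives.some((field) => field in input);
    })
    .map((entry) => (Array.isArray(entry) ? entry.join('/') : entry));
}
```

Calling `missingFields('Container Update', { containerId: 'abc', chatId: 1 })` reports `containerName`, `messageId`, and `responseMode` as missing.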

Sub-workflow output patterns:

  • Batch UI returns action field (keyboard/execute/cancel)
  • Container Status returns action field (list/status/paginate)
  • Confirmation returns action field (show_stop/show_update/confirm_stop_result/confirm_update/cancel/expired)
  • Matching returns action field (matched/multiple/no_match/error/suggestion/batch_matched/disambiguation/not_found + update variants)
  • Main workflow routes based on action to appropriate Telegram response handler
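
The action-based routing above can be sketched as a lookup from the returned action field to a handler. The action values come from the Batch UI line above; the handler labels and the function name `routeBatchUiResult` are hypothetical, and in the real workflow this routing is presumably done by n8n nodes, not by a single function.

```javascript
// Action values from the Batch UI output pattern; handler labels are illustrative only.
const BATCH_UI_HANDLERS = new Map([
  ['keyboard', 'Show Batch Keyboard'],
  ['execute', 'Run Batch Execution'],
  ['cancel', 'Cancel Batch'],
]);

// Map a sub-workflow result to the handler that should send the Telegram response.
function routeBatchUiResult(result) {
  const handler = BATCH_UI_HANDLERS.get(result.action);
  if (handler === undefined) {
    // Fail loudly so an unrouted action is noticed, not silently dropped.
    throw new Error(`Unhandled Batch UI action: ${result.action}`);
  }
  return handler;
}
```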

Data chain pattern (10-07):

  • Use $('Build Progress Message').item.json to reference data across async nodes
  • Do not rely on $json after Telegram API calls (response overwrites data)
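
The data chain pattern can be illustrated outside n8n with stand-ins for the runtime. In the sketch below, `nodeOutputs`, `$`, and `$json` are simulated here (in n8n the runtime provides `$` and `$json`), the sample field values are made up, and the response shape is assumed to be the raw Telegram Bot API envelope.

```javascript
// Simulated per-node output store; in n8n the runtime keeps this for each node.
const nodeOutputs = {
  'Build Progress Message': {
    json: { containerName: 'plex', chatId: 12345, text: 'Updating plex...' },
  },
};

// Simulated stand-in for n8n's $('Node Name') accessor.
const $ = (nodeName) => ({ item: nodeOutputs[nodeName] });

// After a Telegram API call the current item holds the API response,
// so the fields built earlier are no longer on $json.
const $json = { ok: true, result: { message_id: 678 } };

// Reading from the current item loses the original data.
const fromCurrentItem = $json.containerName;

// Reading from the named upstream node still returns it.
const fromNamedNode = $('Build Progress Message').item.json.containerName;
```

Here `fromCurrentItem` is `undefined`, while `fromNamedNode` is `'plex'`.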

Dynamic input pattern (10-09):

  • Use $input.item.json for nodes with multiple predecessors
  • Matching sub-workflow returns both action (routing label) and actionType (user's requested action)

Accumulated Decisions

| Phase | Decision | Rationale |
|-------|----------|-----------|
| 10-05 | Use placeholder workflow ID for logs sub-workflow | ID assigned by n8n on import |
| 10-05 | Retain Parse Logs Command in main workflow | Handles error cases before sub-workflow call |
| 10-06 | Remove old batch inline path | Migrated to bexec: callback format, uses sub-workflow |
| 10-06 | Defer aggressive modularization to 10.1 | Core goals achieved, deeper work needs separate phase |
| 10-07 | Timestamp on logs refresh | Prevents "message not modified" error, shows freshness |
| 10-07 | Fuzzy matching in logs sub-workflow | Simpler than duplicating Docker query infrastructure |
| 10.1-01 | Realistic target 115-125 nodes (not 50-80) | 58 Telegram response nodes locked to main workflow |
| 10.1-01 | Wave 2: Batch UI + Container List extraction | Highest-value domains with clear boundaries |
| 10.1-02 | Partial batch UI extraction (UI only, not loop) | Batch execution loop cannot be in sub-workflow due to n8n limitations |
| 10.1-02 | Action-based sub-workflow routing | Sub-workflow returns action field, main routes to Telegram handlers |
| 10.1-03 | Minimal net node reduction due to integration overhead | Removed 10 nodes but added 9 integration nodes; value is complexity reduction |
| 10.1-04 | Return confirm_update action to main workflow | Update flow tightly integrated with existing update sub-workflow |
| 10.1-04 | Call n8n-actions.json for stop execution | Reuse existing action execution instead of duplicating Docker API calls |
| 10.1-06 | Downstream nodes reference original parse nodes for action type | Sub-workflow doesn't carry user's requested action (stop/start) through return data |
| 10.1-06 | Text-mode status needs keyboard strip + messageId routing | Pre-existing bug exposed by testing; text commands have no message to edit |
| 10.1-06 | Batch text needs Prepare Batch Execution transform | Sub-workflow returns matchedContainers/batch_matched, downstream expects allMatched/stop |
| 10.1-07 | No further Code node extraction viable | 2 candidates yield net-negative extraction (-50% efficiency) |
| 10.1-07 | 168 nodes is near-minimal (structural minimum: 166) | Evidence-based analysis of all 168 nodes by category |
| 10.1-07 | 115-125 target was unrealistic | Based on incomplete extraction overhead analysis |
| 10.1-08 | Status code checks before message-based fallback | Explicit HTTP response handling before message parsing |
| 10.1-08 | HTTP 304 treated as success | Docker API returns 304 for already-in-state, better UX than error |
| 10.1-09 | /list command as alias for status | Status command already provides list functionality; alias simpler than duplication |
| 10.1-09 | Dynamic predecessor reference pattern | Use $input.item.json for nodes with multiple incoming paths |

  • [Phase 10.2-03]: n8n workflow static data does NOT persist between executions (critical platform limitation)
  • [Phase 10.2-03]: Ring buffer + debug commands architecture non-functional due to static data limitation
  • [Phase 10.2-03]: Stripped all static-data-dependent features, kept correlation IDs + structured error returns
  • [Phase 10.2-02]: Correlation ID uses timestamp + random string (no UUID dependency)
  • [Phase 10.2-02]: Use $input.item.json.correlationId pattern for Prepare Input nodes
  • [Phase 10.2-04]: Fixed connection keys to use node names per n8n resolution protocol
  • [Phase 10.2-04]: Accepted debug/errors routing behavior as minor (commands removed, no real users)
  • [Phase 10.2-04]: Final state 168 nodes (includes 2 correlation ID generators, 2 orphans removed)
  • [Phase 11-01]: Use base36 BigInt encoding for bitmaps (supports 50+ containers, max ~20 bytes callback size)
  • [Phase 11-01]: Retain old batch parsers for graceful migration of in-flight messages (<1 minute window)
  • [Quick 1-1]: Removed 6 orphan callback nodes (no incoming connections after Phase 10 modularization)
  • [Quick 1-1]: Achieved structural minimum of 166 nodes (per Phase 10.1-07 analysis)
  • [Phase 12-01]: Document Unraid badge limitation instead of programmatic fix (Unraid API integration adds complexity for cosmetic issue)
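
The Phase 10.2-02 decision above (correlation ID built from a timestamp plus a random string, no UUID dependency) could look roughly like the sketch below. The exact format used in the workflow is not recorded here, so the base36 encoding, the dash separator, and the name `makeCorrelationId` are assumptions.

```javascript
// Hypothetical generator: base36 timestamp, a dash, then six random base36 characters.
function makeCorrelationId(now = Date.now()) {
  const timePart = now.toString(36);
  // Math.random().toString(36) looks like "0.k3j9x2..."; drop the "0." prefix.
  // padEnd covers the rare case where fewer than six characters are produced.
  const randomPart = Math.random().toString(36).slice(2, 8).padEnd(6, '0');
  return `${timePart}-${randomPart}`;
}
```

The timestamp part keeps IDs roughly sortable by creation time, and the random part reduces collisions between executions that start in the same millisecond.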

Phase 10.1 Progress

| Plan | Description | Status |
|------|-------------|--------|
| 10.1-01 | Foundation and Domain Analysis | Complete |
| 10.1-02 | Batch UI Sub-workflow (Wave 2) | Complete |
| 10.1-03 | Container Status Sub-workflow (Wave 2) | Complete |
| 10.1-04 | Confirmation Sub-workflow (Wave 3) | Complete |
| 10.1-05 | Integration Verification | Complete |
| 10.1-06 | Matching Sub-workflow Extraction | Complete |
| 10.1-07 | Code Classification + Contract Documentation | Complete |
| 10.1-08 | UAT Gap Closure: Container Action Status Codes | Complete |
| 10.1-09 | UAT Gap Closure: Data Flow Fixes | Complete |

Node count progress:

  • Start: 192 nodes
  • After 10.1-02: 179 nodes (-13)
  • After 10.1-03: 178 nodes (-1)
  • After 10.1-04: 168 nodes (-10)
  • After 10.1-06: 168 nodes (net 0: -12 extracted, +9 integration, +3 fix nodes)
  • Final: 168 nodes (structural minimum: 166, gap: 2 non-viable candidates)

Extraction complete:

  • Batch UI: -13 nodes (16 nodes in sub-workflow)
  • Container Status: -1 net (11 nodes in sub-workflow, complexity reduction)
  • Confirmation: -10 nodes (16 nodes in sub-workflow)
  • Matching: net 0 (23 nodes in sub-workflow, complexity reduction)
  • Total reduction: 24 nodes (192 -> 168, -12.5%)

Phase 10.1 Sub-workflows

All 7 sub-workflows deployed and operational:

  • n8n-update.json -- 7AvTzLtKXM2hZTio92_mC
  • n8n-actions.json -- fYSZS5PkH0VSEaT5
  • n8n-logs.json -- oE7aO2GhbksXDEIw
  • n8n-batch-ui.json -- ZJhnGzJT26UUmW45
  • n8n-status.json -- lqpg2CqesnKE2RJQ
  • n8n-confirmation.json -- fZ1hu8eiovkCk08G
  • n8n-matching.json -- kL4BoI8ITSP9Oxek

Phase 10.2 Progress

| Plan | Description | Status |
|------|-------------|--------|
| 10.2-01 | Error Ring Buffer Foundation and Hidden Debug Commands | Complete (infrastructure later removed) |
| 10.2-02 | Wire Error Logging to Main Workflow | Complete (error logging removed, correlation IDs kept) |
| 10.2-03 | Add Debug Tracing to Sub-workflow Boundaries | Complete (scope reduced due to static data limitation) |
| 10.2-04 | Gap Closure: Correlation ID Wiring | Complete (UAT gaps 1-3 closed) |

Critical Finding:

  • n8n workflow static data does NOT persist between executions (execution-scoped, not workflow-scoped)
  • Ring buffer + debug command architecture non-functional due to this limitation
  • All static-data-dependent features stripped in Plan 03 cleanup

Achievements (10.2-01): [REMOVED in 10.2-03 cleanup]

  • Ring buffer infrastructure (non-functional - static data doesn't persist)
  • 4 hidden debug commands (removed)
  • Log Error and Log Trace utility nodes (removed)

Achievements (10.2-02): [PARTIALLY RETAINED]

  • Structured error returns in all 7 sub-workflows (KEPT - success/error fields)
  • Correlation ID generation for text and callback paths (KEPT - 2 nodes)
  • 19 Prepare Input nodes modified to pass correlationId (KEPT)
  • Error detection IF nodes (REMOVED - depended on static data logging)

Final State (10.2-04):

  • Main workflow: 168 nodes (includes 2 correlation ID generators, 2 orphans removed)
  • Correlation ID infrastructure wired and functional (text + callback paths)
  • Correlation IDs flow to all sub-workflows via Prepare Input nodes
  • Structured error returns in all sub-workflows (enables better error handling)
  • All static-data-dependent features removed cleanly
  • UAT gaps 1-3 closed (correlation ID wiring), gap 4 accepted as minor
  • No regression to bot functionality

Phase 11 Progress

| Plan | Description | Status |
|------|-------------|--------|
| 11-01 | Bitmap encoding for batch selection | Complete |
| 11-02 | Update All button with confirmation | Complete |

Achievements (11-01):

  • Bitmap-encoded batch selection keeps callback data within Telegram's 64-byte callback_data limit
  • Supports selecting 50+ containers (callback data stays around 20 bytes)
  • Base36 BigInt encoding: b:0:1a3:5 vs old CSV batch:toggle:0:plex,sonarr:jellyfin
  • Graceful migration: old parsers retained as fallback for in-flight messages
  • Batch stop confirmation works with bitmap via resolution flow
  • n8n-batch-ui.json: 17 nodes (16 + 1 Fetch Containers For Exec)
  • n8n-workflow.json: 166 nodes (structural minimum achieved)
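
A minimal sketch of the base36 BigInt bitmap idea from 11-01: each selected container index sets one bit, and the resulting integer is written in base36. The function names are hypothetical, and the layout of the other fields in a callback string such as b:0:1a3:5 is not reproduced here because it is not specified in this file.

```javascript
// Encode selected container indexes (0-based) as a base36 bitmap string.
function encodeSelection(selectedIndexes) {
  let bitmap = 0n;
  for (const index of selectedIndexes) {
    bitmap |= 1n << BigInt(index); // set the bit for this container
  }
  return bitmap.toString(36);
}

// Decode a base36 bitmap string back into ascending container indexes.
function decodeSelection(encoded) {
  let bitmap = 0n;
  for (const ch of encoded) {
    const digit = parseInt(ch, 36);
    if (Number.isNaN(digit)) throw new Error(`Invalid base36 digit: ${ch}`);
    bitmap = bitmap * 36n + BigInt(digit); // BigInt has no radix parser, so accumulate manually
  }
  const indexes = [];
  for (let i = 0; bitmap > 0n; i += 1, bitmap >>= 1n) {
    if (bitmap & 1n) indexes.push(i);
  }
  return indexes;
}
```

For example, `encodeSelection([0, 2, 5])` gives `'11'` (bits 1 + 4 + 32 = 37, which is 11 in base36), and selecting all of 50 containers encodes to a 10-character string, well under the 64-byte callback limit.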

Phase 12 Progress

| Plan | Description | Status |
|------|-------------|--------|
| 12-01 | Documentation audit (ENV-01, ENV-02, DEBT-01, DEBT-02, UNR-01) | Complete |
| 12-02 | Deferred UAT: BATCH-04 + BATCH-05 (9 bug fixes) | Complete |

Achievements (12-02):

  • BATCH-04 (text "update all") passed end-to-end UAT
  • BATCH-05 (inline keyboard "Update All :latest") passed end-to-end UAT
  • 9 bugs discovered and fixed during UAT (data chains, format mismatches, infra exclusion)
  • Infrastructure container exclusion added (n8n, socket-proxy) — prevents bot self-destruction
  • Batch responseMode added to update sub-workflow — suppresses per-container Telegram messages
  • Dynamic edit/send endpoint for confirmation (editMessageText for keyboard, sendMessage for text)
  • All v1.2 requirements now closed (12/12)
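
Two of the fixes above, infrastructure exclusion and the dynamic edit/send endpoint, can be sketched as small helpers. The container names come from the list above; exact-name matching, the use of a missing messageId to detect text commands, and the function names are all assumptions. `editMessageText` and `sendMessage` are Telegram Bot API methods.

```javascript
// Containers the bot depends on; updating them mid-run could take the bot down.
const INFRA_CONTAINERS = new Set(['n8n', 'socket-proxy']);

// Drop infrastructure containers from an "update all" target list.
function excludeInfrastructure(containerNames) {
  return containerNames.filter((name) => !INFRA_CONTAINERS.has(name));
}

// Keyboard callbacks have a message to edit; text commands do not, so send a new one.
function telegramMethodFor(messageId) {
  return messageId ? 'editMessageText' : 'sendMessage';
}
```

With these, `excludeInfrastructure(['plex', 'n8n', 'sonarr', 'socket-proxy'])` keeps only `plex` and `sonarr`.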

Achievements (12-01):

  • README updated to document docker-socket-proxy architecture (not direct socket mount)
  • Clarified TELEGRAM_BOT_TOKEN requires both n8n credential AND environment variable
  • Clarified user ID is hardcoded in IF nodes (no TELEGRAM_USERID env var)
  • Documented all 8 workflow files (main + 7 sub-workflows) in installation section
  • Added missing commands to usage table: update all and /list alias
  • Verified DEBT-02 is fixed: single --max-time 600 flag, no duplicates
  • Documented Unraid update badge limitation (UNR-01) with root cause and workaround
  • Closed 4 requirements: ENV-01, ENV-02, DEBT-01, DEBT-02
  • Resolved UNR-01 as documented limitation (not a fix, but closed)

Quick Tasks Completed

| Task | Description | Status | Date | Node Impact |
|------|-------------|--------|------|-------------|
| quick-1-1 | Remove orphan callback node chain | Complete | 2026-02-08 | 172→166 nodes |

Next Step

Phase 12 complete (documentation + UAT, all v1.2 requirements closed). Next: Phase 13 (Documentation Overhaul).

Session Continuity

Last session: 2026-02-08
Stopped at: Completed 12-02 (UAT: BATCH-04, BATCH-05 passed, 9 bug fixes, all v1.2 requirements closed)
Resume file: None


Auto-maintained by GSD workflow