Architecture Research: Unraid Update Status Sync Integration

Domain: Telegram Bot Docker Management Extension
Researched: 2026-02-08
Confidence: MEDIUM

Integration Overview

This research focuses on integrating Unraid update status sync into the existing 287-node n8n workflow system (1 main + 7 sub-workflows). The goal is to clear Unraid's "update available" badges after the bot successfully updates a container.

Current Architecture (Baseline)

┌─────────────────────────────────────────────────────────────┐
│                     Telegram User                            │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│               n8n-workflow.json (Main)                       │
│                     166 nodes                                │
├─────────────────────────────────────────────────────────────┤
│  Telegram Trigger → Auth → Correlation ID → Keyword Router  │
│                        │                                     │
│              Execute Workflow nodes (17)                     │
│                        │                                     │
│           ┌────────────┼────────────┐                        │
│           ▼            ▼            ▼                        │
│     n8n-update   n8n-actions   n8n-status                    │
│      (34 nodes)   (11 nodes)   (11 nodes)                    │
│           │            │            │                        │
│       [4 more sub-workflows: logs, batch-ui,                 │
│        confirmation, matching]                               │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│            docker-socket-proxy:2375                          │
│              (Tecnativa/LinuxServer)                         │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│              Docker Engine (Unraid Host)                     │
└─────────────────────────────────────────────────────────────┘

Target Architecture (With Unraid Sync)

┌─────────────────────────────────────────────────────────────┐
│                     Telegram User                            │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│               n8n-workflow.json (Main)                       │
│                     166 nodes                                │
├─────────────────────────────────────────────────────────────┤
│  Telegram Trigger → Auth → Correlation ID → Keyword Router  │
│                        │                                     │
│              Execute Workflow nodes (17)                     │
│                        │                                     │
│           ┌────────────┼────────────┐                        │
│           ▼            ▼            ▼                        │
│     n8n-update   n8n-actions   n8n-status                    │
│      (34 nodes)   (11 nodes)   (11 nodes)                    │
│           │            │            │                        │
│       [4 more sub-workflows]                                 │
└───────────┬───────────────────────┬─────────────────────────┘
            │                       │
            │                   ┌───▼──────────────────────┐
            │                   │   NEW: Clear Unraid      │
            │                   │   Update Status (node)   │
            │                   └───┬──────────────────────┘
            │                       │
┌───────────▼───────────────────────▼─────────────────────────┐
│            docker-socket-proxy:2375                          │
│              (Tecnativa/LinuxServer)                         │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│              Docker Engine (Unraid Host)                     │
│                                                              │
│  /var/lib/docker/unraid-update-status.json ← DELETE HERE    │
└─────────────────────────────────────────────────────────────┘

Integration Points

Where to Add Sync Logic

Option A: Extend n8n-update.json sub-workflow (RECOMMENDED)

| Pros | Cons |
| --- | --- |
| Single responsibility: update sub-workflow owns all update-related actions | Couples Unraid-specific logic to the generic update flow |
| Minimal changes to main workflow | Breaks if called from non-Unraid systems (not a concern here) |
| Sync executes immediately after successful update | None significant |
| Easier to test in isolation | |

Option B: Add to main workflow after Execute Update returns

| Pros | Cons |
| --- | --- |
| Keeps Unraid logic separate from generic update | More complex routing in main workflow |
| Could conditionally enable based on environment | Requires checking sub-workflow success result |
| | Adds latency between update and sync |
| | Harder to test (requires full workflow execution) |

Option C: New sub-workflow (n8n-unraid-sync.json)

| Pros | Cons |
| --- | --- |
| Complete separation of concerns | Overkill for single operation (file deletion) |
| Reusable if other Unraid integrations added | Adds 8th sub-workflow to manage |
| | Main workflow needs new Execute Workflow node |
| | Extra complexity for minimal benefit |

Recommendation: Option A (extend n8n-update.json) because:

  1. Sync is tightly coupled to update success
  2. Single responsibility: "update a container AND clear its Unraid status"
  3. Minimal architectural impact
  4. Easiest to test and maintain

Modification to n8n-update.json

Current End State (Return Success node, line 499):

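// `data` is the incoming item's JSON from the previous node (in an n8n
// Code node, typically `const data = $input.item.json;`; the exact source
// is an assumption, as it is not shown in this excerpt).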
return {
  json: {
    success: true,
    updated: true,
    message: data.message,
    oldDigest: data.oldDigest,
    newDigest: data.newDigest,
    correlationId: data.correlationId || ''
  }
};

New Flow:

Remove Old Image (Success) → Clear Unraid Status → Return Success
                                     │
                                     ▼
                          (Execute Command node)
                  rm -f /host/var/lib/docker/unraid-update-status.json

Node Addition (1 new node):

  • Type: Execute Command
  • Position: After "Remove Old Image (Success)", before "Return Success"
  • Command: rm -f /host/var/lib/docker/unraid-update-status.json
  • Error Handling: continueRegularOutput (don't fail update if sync fails)
  • Total Nodes: 34 → 35
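A minimal sketch of the addition, assuming n8n's standard node schema (type n8n-nodes-base.executeCommand) and working on an exported copy of the workflow; the position coordinates are placeholders, and rewiring the workflow's connections object (Remove Old Image (Success) → Clear Unraid Status → Return Success) is omitted:

# Inject the new Execute Command node into an exported workflow file with jq
jq '.nodes += [{
      "name": "Clear Unraid Status",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [1800, 300],
      "onError": "continueRegularOutput",
      "parameters": {
        "command": "rm -f /host/var/lib/docker/unraid-update-status.json"
      }
    }]' n8n-update.json > n8n-update.patched.json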

Data Flow

Update Completion to Unraid Status Clear

1. User: "update plex"
     ↓
2. Main workflow → n8n-update.json
     ↓
3. Update sub-workflow:
     Inspect Container → Parse Config → Pull Image → Check Pull Success
     → Inspect New Image → Compare Digests → Stop Container
     → Remove Container → Build Create Body → Create Container
     → Start Container → Format Update Success → Send Response
     → Remove Old Image (Success)
     ↓
4. NEW: Clear Unraid Status
     Execute Command: rm -f /host/var/lib/docker/unraid-update-status.json
     ↓
5. Return Success (existing)
     ↓
6. Main workflow routes result to Telegram response

Key Properties:

  • Sync happens AFTER container is updated and old image removed
  • Sync happens BEFORE sub-workflow returns (ensures completion)
  • Sync failure does NOT fail the update (onError: continueRegularOutput)
  • User receives success message regardless of sync status

Why Delete the Entire Status File?

Unraid's Update Tracking Mechanism:

  1. Unraid stores update status in /var/lib/docker/unraid-update-status.json
  2. File contains: {"containerName": "true|false", ...} (true = updated, false = needs update)
  3. When bot updates externally, Unraid's file is stale
  4. Unraid only checks for newly available updates, not "containers now current"
  5. Deleting file forces Unraid to recheck ALL containers on next "Check for Updates"
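Following the format described in point 2, the file's contents might look like this (an illustrative sketch only; the real schema lives in Unraid's DockerClient.php and may differ between versions):

# Run on the Unraid host; the output shown is an assumption based on the
# format described above, not a verified schema.
cat /var/lib/docker/unraid-update-status.json
# {"plex":"false","sonarr":"true"}
# "true"  = container is up to date; "false" = update available
# (stale "false" entries are what leave the badge lit after a bot update)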

Why Not Modify the JSON?

  • File format is internal to Unraid's DockerClient.php
  • Could change between Unraid versions
  • Parsing/modifying JSON from Execute Command is fragile
  • Deletion is simpler and forces full recheck (HIGH confidence)

User Impact:

  • After bot update, Unraid badge may show "outdated" for ~30 seconds until next UI refresh
  • User can manually click "Check for Updates" in Unraid Docker tab to force immediate recheck
  • Next automatic Unraid check will rebuild status file correctly

Container Access Requirements

How n8n Accesses Unraid Host Filesystem

Current n8n Container Configuration:

The n8n container must be able to delete /var/lib/docker/unraid-update-status.json on the Unraid host.

Access Pattern Options:

| Method | Implementation | Pros | Cons |
| --- | --- | --- | --- |
| Volume mount (RECOMMENDED) | `-v /var/lib/docker:/host/var/lib/docker:rw` | Direct filesystem access, simple | Grants access to entire /var/lib/docker |
| Docker API via exec | `docker exec unraid-host rm -f /var/lib/docker/...` | No volume mount needed | Requires exec API (security risk) |
| SSH into host | Execute Command with SSH | No volume mount | Requires SSH credentials in workflow |
| Unraid API (future) | GraphQL mutation to clear status | Proper API layer | Requires Unraid 7.2+, API key setup |

Recommendation: Volume mount /var/lib/docker as read-write because:

  1. n8n already accesses Docker via socket proxy (security boundary established)
  2. Unraid status file is Docker-internal data (reasonable scope)
  3. No additional credentials or services needed
  4. Direct file operations are faster than API calls
  5. Works on all Unraid versions (no version dependency)

Security Consideration:

  • /var/lib/docker contains Docker data, not general host filesystem
  • Socket proxy already limits Docker API access
  • File deletion is least-privilege operation (no read of sensitive data)
  • Alternative is exec API (worse security than filesystem mount)

Volume Mount Configuration

Add to n8n Container:

services:
  n8n:
    # ... existing config ...
    volumes:
      - /var/lib/docker:/host/var/lib/docker:rw  # NEW
      # ... existing volumes ...

Execute Command Node:

# Path accessible from inside n8n container
rm -f /host/var/lib/docker/unraid-update-status.json

Why the /host/ prefix:

  • Inside the container, /var/lib/docker is the container's own filesystem
  • The volume mount at /host/var/lib/docker is the Unraid host's filesystem
  • The prefix prevents accidental deletion of n8n's own Docker files

Component Modifications

Modified Components

| Component | Type | Change | Impact |
| --- | --- | --- | --- |
| n8n container | Infrastructure | Add volume mount `/var/lib/docker:/host/var/lib/docker:rw` | MEDIUM - requires container recreation |
| n8n-update.json | Sub-workflow | Add 1 Execute Command node after "Remove Old Image (Success)" | LOW - workflow edit only |
| Clear Unraid Status (NEW) | Node | Execute Command: `rm -f /host/var/lib/docker/unraid-update-status.json` | NEW - single operation |

Unchanged Components

  • Main workflow (n8n-workflow.json): No changes
  • Other 6 sub-workflows: No changes
  • Socket proxy configuration: No changes
  • Docker socket access pattern: No changes
  • Telegram integration: No changes

Build Order

Based on dependencies and risk:

Phase 1: Infrastructure Setup

Delivers: n8n container has host filesystem access

Tasks:

  1. Add volume mount to n8n container configuration
  2. Recreate n8n container with new mount
  3. Verify mount accessible: docker exec n8n ls /host/var/lib/docker
  4. Test file deletion: docker exec n8n rm -f /host/var/lib/docker/test-file
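Tasks 3 and 4 can be folded into a single smoke test; a sketch assuming the container is named n8n:

# Verify the host mount is present and writable from inside the n8n container
docker exec n8n sh -c '
  test -d /host/var/lib/docker || { echo "mount missing"; exit 1; }
  touch /host/var/lib/docker/.n8n-write-test \
    && rm -f /host/var/lib/docker/.n8n-write-test \
    && echo "mount is read-write"
'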

Rationale: Infrastructure change first, before workflow modifications. Ensures mount works before relying on it.

Risks:

  • Container recreation causes brief downtime (~10 seconds)
  • Mount path typo breaks functionality

Mitigation:

  • Schedule during low-traffic window
  • Test mount manually before workflow change
  • Document rollback: remove volume mount, recreate container

Phase 2: Workflow Modification

Delivers: Update sub-workflow clears Unraid status

Tasks:

  1. Read n8n-update.json via n8n API
  2. Add Execute Command node after "Remove Old Image (Success)"
  3. Configure command: rm -f /host/var/lib/docker/unraid-update-status.json
  4. Set error handling: continueRegularOutput
  5. Connect to "Return Success" node
  6. Push updated workflow via n8n API (see the curl sketch after this list)
  7. Test with single container update
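A hedged sketch of the read/push round-trip (tasks 1 and 6) against n8n's public REST API. The base URL, WORKFLOW_ID, and N8N_API_KEY are placeholders for this instance's values, and the PUT payload may need trimming to the fields the API accepts (name, nodes, connections, settings):

# Fetch the current sub-workflow definition (task 1)
curl -s -H "X-N8N-API-KEY: $N8N_API_KEY" \
  "http://n8n:5678/api/v1/workflows/$WORKFLOW_ID" > n8n-update.json

# ...insert the Clear Unraid Status node and rewire connections
#    (see the jq sketch in "Modification to n8n-update.json")...

# Push the modified definition back (task 6)
curl -s -X PUT \
  -H "X-N8N-API-KEY: $N8N_API_KEY" \
  -H "Content-Type: application/json" \
  --data @n8n-update.patched.json \
  "http://n8n:5678/api/v1/workflows/$WORKFLOW_ID"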

Rationale: Modify workflow only after infrastructure proven working. Single node addition is minimal risk.

Risks:

  • File path wrong (typo in command)
  • Permissions issue (mount is read-only)
  • Delete fails silently

Mitigation:

  • Test command manually first (Phase 1 testing)
  • Verify mount is :rw not :ro
  • Check execution logs for errors

Phase 3: Validation

Delivers: Confirmed end-to-end functionality

Tasks:

  1. Update container via bot
  2. Check Unraid UI - badge should still show "update available" (file deleted)
  3. Click "Check for Updates" in Unraid Docker tab
  4. Verify badge clears (Unraid rechecked and found container current)
  5. Verify workflow execution logs show no errors

Rationale: Prove the integration works before considering it complete.

Success Criteria:

  • Container updates successfully
  • Status file deleted (verify via ls /var/lib/docker/unraid-update-status.json returns "not found")
  • Unraid recheck clears badge
  • No errors in n8n execution log
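The file-deletion criterion can be spot-checked from a shell on the Unraid host:

# Immediately after a bot-driven update, the status file should be gone
ls /var/lib/docker/unraid-update-status.json
# expected: ls: cannot access ... No such file or directory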

Architectural Patterns

Pattern 1: Post-Action Sync

What: Execute an external-system sync after the primary operation completes successfully.
When to use: When the primary operation (update container) should trigger related state updates (clear Unraid cache).
Trade-offs:

  • PRO: Keeps systems consistent
  • PRO: User doesn't need manual sync step
  • CON: Couples unrelated systems
  • CON: Sync failure can confuse (update worked, but Unraid shows stale state)

Example (this implementation):

Update Container → Remove Old Image → Clear Unraid Status → Return Success

Error Handling Strategy: Sync failure does NOT fail the primary operation. Use continueRegularOutput to log error but continue to success return.

Pattern 2: Filesystem Access from Containerized Workflow

What: Mount the host filesystem into the container to enable file operations from the workflow.
When to use: When the workflow needs to manipulate host-specific files (e.g., clear a cache, trigger a recheck).
Trade-offs:

  • PRO: Direct access, no additional services
  • PRO: Works regardless of API availability
  • CON: Breaks container isolation
  • CON: Path changes between environments (host vs container)

Example (this implementation):

# Host path -> Container path
-v /var/lib/docker:/host/var/lib/docker:rw

Security Boundary: Limit mount scope to minimum required directory. Here: /var/lib/docker (Docker-internal) not /var/lib (too broad).

Pattern 3: Best-Effort External Sync

What: Attempt a sync with an external system but don't fail the primary operation if the sync fails.
When to use: When the sync is nice-to-have but not critical to the primary operation's success.
Trade-offs:

  • PRO: Primary operation reliability unaffected
  • PRO: Degrades gracefully (manual sync still possible)
  • CON: Silent failures can go unnoticed
  • CON: Systems drift out of sync

Example (this implementation):

// Execute Command node configuration
{
  "onError": "continueRegularOutput",  // Don't throw on sync failure
  "command": "rm -f /host/var/lib/docker/unraid-update-status.json"
}

Monitoring: Log sync failures but return success. User can manually sync if needed (Unraid "Check for Updates").

Anti-Patterns

Anti-Pattern 1: Parsing Unraid Status File

What people might do: Read /var/lib/docker/unraid-update-status.json, parse JSON, update only the changed container's status, write back

Why it's wrong:

  • File format is internal to Unraid's DockerClient.php implementation
  • Could change between Unraid versions without notice
  • Parsing JSON in Execute Command (bash) is fragile
  • Risk of corrupting file if concurrent Unraid writes happen

Do this instead: Delete entire file to force full recheck. Simpler, more robust, version-agnostic.

Anti-Pattern 2: Using Docker exec for File Deletion

What people might do: docker exec unraid-host rm -f /var/lib/docker/unraid-update-status.json

Why it's wrong:

  • Requires EXEC API access on socket proxy (major security risk)
  • An unraid-host container doesn't exist (Unraid itself is the host, not a container)
  • More complex than direct filesystem access

Do this instead: Volume mount for direct filesystem access (more secure than exec API).

Anti-Pattern 3: Blocking Update on Sync Failure

What people might do: Fail entire update if Unraid status file deletion fails

Why it's wrong:

  • Update already completed (container recreated with new image)
  • Failing at this point leaves system in inconsistent state (container updated, user told it failed)
  • User can manually sync (click "Check for Updates")

Do this instead: Log sync failure, return success, document manual sync option.

Scaling Considerations

| Scale | Approach | Notes |
| --- | --- | --- |
| 1-10 containers | Current approach works | File deletion is a <1ms operation |
| 10-50 containers | Current approach works | Unraid recheck time increases linearly but stays <10s |
| 50+ containers | Current approach works | Deleting the status file forces a full recheck (may take 30-60s) but is acceptable as a one-time cost |

Optimization Not Needed:

  • File deletion is instant regardless of container count
  • Unraid recheck is user-initiated (not blocking bot operation)
  • No performance bottleneck identified

Alternative for Many Containers (future): If Unraid exposes a GraphQL API that can selectively clear a single container's status (none was found in this research), the integration could be optimized to:

  • Clear only updated container's status
  • Avoid forcing full recheck
  • Requires Unraid 7.2+ and API discovery

Integration Testing Strategy

Test Cases

| Test | Expected Behavior | Verification |
| --- | --- | --- |
| Update single container | Container updates, status file deleted | Check file gone: `ls /var/lib/docker/unraid-update-status.json` |
| Update container, sync fails | Update succeeds, error logged | Check execution log for error; container still updated |
| Batch update multiple containers | Each update clears status file | File deleted after first update, remains deleted |
| Update with no status file | Update succeeds, no error | `rm -f` tolerates a missing file |
| Mount not accessible | Update succeeds, sync error logged | Execution log shows file-not-found error |
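The "Update with no status file" row rests on rm -f being a no-op for missing paths, which is quick to confirm:

# rm -f exits 0 even when the target does not exist
rm -f /tmp/definitely-not-here && echo "exit status: $?"
# prints: exit status: 0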

Rollback Plan

If the integration causes issues:

  1. Quick rollback (workflow only):
    • Revert n8n-update.json to the previous version via the n8n API
    • Status sync stops happening
    • Core update functionality unaffected
  2. Full rollback (infrastructure):
    • Remove the volume mount from the n8n container config
    • Recreate the n8n container
    • Revert the workflow
    • Manual Unraid sync only
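The quick rollback in step 1 is the Phase 2 push in reverse, assuming a pre-change export of the workflow was saved (n8n-update.backup.json is a placeholder name):

# Restore the saved pre-change definition via the n8n API
curl -s -X PUT \
  -H "X-N8N-API-KEY: $N8N_API_KEY" \
  -H "Content-Type: application/json" \
  --data @n8n-update.backup.json \
  "http://n8n:5678/api/v1/workflows/$WORKFLOW_ID"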

Rollback triggers:

  • Sync consistently fails (execution log errors)
  • Permissions issues prevent file deletion
  • Unraid behavior changes unexpectedly

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence)

  • n8n Execute Command documentation (operational patterns)
  • Community reports of Unraid update badge behavior (anecdotal)

Architecture research for: Unraid Update Status Sync Integration
Researched: 2026-02-08