Architecture Research: v1.1

Researched: 2026-02-02
Domain: Integration architecture for v1.1 features
Confidence: HIGH (n8n API, Docker socket proxy), MEDIUM (Telegram keyboards), LOW (Unraid update sync)

Executive Summary

The v1.1 features require careful integration with the existing single-workflow architecture. The key finding is that all four features (n8n API access, Docker socket security, Telegram keyboards, Unraid update sync) can be implemented without major refactoring of the existing workflow structure. However, the Docker socket proxy is the most impactful change, requiring container-level infrastructure modifications before any workflow changes.

Current Architecture

+-------------------+     +------------------+     +-------------------+
|   Telegram User   |---->|   n8n Workflow   |---->|  Docker Socket    |
|   (Webhook)       |     |   (3,200 lines)  |     |  (/var/run/...)   |
+-------------------+     +------------------+     +-------------------+
                                 |
                                 v
                          +------------------+
                          |  Telegram Bot    |
                          |  API (Response)  |
                          +------------------+

Current Components:

| Component | Description | Modification Risk |
|---|---|---|
| Telegram Trigger | Webhook receiving messages & callbacks | LOW - add callback_query handling |
| Route Update Type | Switch node: message vs callback | LOW - already exists |
| Auth Check | IF nodes checking user ID | NONE - unchanged |
| Keyword Router | Switch node with contains rules | LOW - add new routes |
| Docker operations | Execute Command + curl | HIGH - needs socket proxy changes |
| Response formatting | Code nodes + Telegram send | LOW - extend existing |

Integration Points

1. n8n API Access

What needs to change: Enable n8n REST API and create API key for external access.

Integration approach: Configuration change only, no workflow modifications needed.

Configuration required:

# n8n container environment variables
N8N_PUBLIC_API_DISABLED=false     # Default - API is enabled
# No additional env vars needed for basic API access

How Claude Code can use it:

# Get workflow (requires API key)
curl -X GET "http://n8n:5678/api/v1/workflows" \
  -H "X-N8N-API-KEY: <api-key>"

# Get specific workflow
curl -X GET "http://n8n:5678/api/v1/workflows/<workflow-id>" \
  -H "X-N8N-API-KEY: <api-key>"

# Update workflow
curl -X PUT "http://n8n:5678/api/v1/workflows/<workflow-id>" \
  -H "X-N8N-API-KEY: <api-key>" \
  -H "Content-Type: application/json" \
  -d @workflow.json

# Get execution logs
curl -X GET "http://n8n:5678/api/v1/executions" \
  -H "X-N8N-API-KEY: <api-key>"

# Activate a workflow
curl -X POST "http://n8n:5678/api/v1/workflows/<workflow-id>/activate" \
  -H "X-N8N-API-KEY: <api-key>"

Impact on existing workflow: NONE - this is infrastructure configuration.
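For scripted callers, the same endpoints can be wrapped in one small helper so the base URL and auth header live in a single place. A minimal sketch mirroring the curl examples above (the function name and host are illustrative, not an existing utility):

```javascript
// Minimal request builder for the n8n public API; pass the result to fetch()
// or any HTTP client. Keeps base URL and auth header in one place.
function n8nRequest(method, path, apiKey) {
  return {
    url: `http://n8n:5678/api/v1${path}`,
    method,
    headers: { 'X-N8N-API-KEY': apiKey },
  };
}

// e.g. const res = await fetch(req.url, { method: req.method, headers: req.headers });
const req = n8nRequest('GET', '/workflows', '<api-key>');
console.log(req.url); // -> http://n8n:5678/api/v1/workflows
```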

2. Docker Socket Security (Socket Proxy)

What needs to change: Replace direct Docker socket mount with socket proxy container.

Current architecture (INSECURE):

n8n container ----mount----> /var/run/docker.sock (host)

Target architecture (SECURE):

n8n container ---tcp---> socket-proxy container ---mount---> docker.sock (host)

New container required: LinuxServer or Tecnativa docker-socket-proxy

Recommended configuration (LinuxServer):

# docker-compose.yml
services:
  socket-proxy:
    image: lscr.io/linuxserver/socket-proxy:latest
    container_name: socket-proxy
    environment:
      - CONTAINERS=1      # List containers
      - POST=0            # Disable general POST (security)
      - ALLOW_START=1     # Allow container start
      - ALLOW_STOP=1      # Allow container stop
      - ALLOW_RESTARTS=1  # Allow restart/kill
      - IMAGES=1          # For image operations (update flow)
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    restart: unless-stopped
    read_only: true
    tmpfs:
      - /run

  n8n:
    # Remove: -v /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DOCKER_HOST=tcp://socket-proxy:2375
    depends_on:
      - socket-proxy

Impact on existing workflow: HIGH - all curl commands must change

| Current Pattern | New Pattern |
|---|---|
| `curl --unix-socket /var/run/docker.sock http://localhost/...` | `curl http://socket-proxy:2375/...` |

Workflow changes needed:

  1. All Execute Command nodes using --unix-socket flag need modification
  2. Host changes from localhost to socket-proxy
  3. Socket path removed from curl commands

Search & replace pattern:

FROM: curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/
TO:   curl -s 'http://socket-proxy:2375/v1.47/
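The same rewrite expressed as a pure function, useful if the migration is scripted against a workflow export instead of done by hand in the editor (the function name is illustrative):

```javascript
// Rewrites a socket-based curl command to target the proxy. Applying this to
// every Execute Command node's command string performs the whole migration.
function migrateCommand(cmd) {
  return cmd.replace(
    /--unix-socket \/var\/run\/docker\.sock 'http:\/\/localhost\//g,
    "'http://socket-proxy:2375/"
  );
}

const before =
  "curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/containers/json'";
console.log(migrateCommand(before));
// -> curl -s 'http://socket-proxy:2375/v1.47/containers/json'
```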

Node count affected: ~15 Execute Command nodes (Docker List, Inspect, Start, Stop, etc.)

Security benefit:

  • n8n no longer has root-equivalent host access
  • Socket proxy limits which Docker API endpoints are accessible
  • Read-only mount on socket-proxy container

3. Telegram Inline Keyboards

What needs to change: Add persistent menu buttons and expand inline keyboard usage.

Integration with existing architecture:

  • Existing callback_query handling already in place (suggestions, batch confirmations)
  • HTTP Request nodes already used for complex Telegram API calls
  • Switch node (Route Callback) handles different callback types

Current callback flow (already exists):

Telegram Trigger (callback_query)
  -> IF Callback Authenticated
    -> Parse Callback Data
      -> Route Callback (Switch)
        -> cancel / expired / batch / single action

Additions needed:

  1. Persistent keyboard on /start or unknown input:
{
  "reply_markup": {
    "keyboard": [
      [{"text": "Status"}],
      [{"text": "Start"}, {"text": "Stop"}],
      [{"text": "Restart"}, {"text": "Update"}],
      [{"text": "Logs"}]
    ],
    "is_persistent": true,
    "resize_keyboard": true
  }
}
  2. Container selection keyboards: After the user sends "start" without a container name, show an inline keyboard listing the running/stopped containers.
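A sketch of the keyboard-generation step for an n8n Code node, assuming input shaped like Docker's GET /containers/json response and a `sel:` callback prefix (both are assumptions, not existing workflow code):

```javascript
// Builds a one-column inline keyboard from a Docker container list.
function buildSelectionKeyboard(containers) {
  return {
    inline_keyboard: containers.map((c) => [{
      text: c.Names[0].replace(/^\//, ''),        // Docker names arrive as "/plex" etc.
      callback_data: `sel:${c.Id.slice(0, 12)}`,  // stay under Telegram's 64-byte limit
    }]),
  };
}

const demo = [{ Id: 'a1b2c3d4e5f67890', Names: ['/plex'] }];
console.log(JSON.stringify(buildSelectionKeyboard(demo)));
// -> {"inline_keyboard":[[{"text":"plex","callback_data":"sel:a1b2c3d4e5f6"}]]}
```

The short container ID keeps the payload well inside Telegram's 64-byte callback_data cap even for long container names.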

Known limitation: n8n's native Telegram node doesn't support dynamic inline keyboards well. Continue using HTTP Request nodes for complex keyboards (already the pattern in v1.0).

New callback routing needed:

| Callback Type | Action Code | Handler |
|---|---|---|
| Container selection | `sel:{containerId}` | New route in Route Callback |
| Menu action | `menu:{action}` | New route for keyboard-triggered actions |
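Routing on these prefixes can reuse the existing Parse Callback Data step. A sketch of the prefix split (pure illustration, not the current node's code):

```javascript
// Splits callback_data into a route key and payload; the Route Callback
// switch then branches on `type`.
function parseCallback(data) {
  const i = data.indexOf(':');
  return i === -1
    ? { type: data, payload: '' }
    : { type: data.slice(0, i), payload: data.slice(i + 1) };
}

console.log(parseCallback('sel:a1b2c3d4e5f6'));
// -> { type: 'sel', payload: 'a1b2c3d4e5f6' }
```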

Impact on existing workflow: MEDIUM

  • Add 1-2 new outputs to Route Callback switch
  • Add nodes for container selection keyboard generation
  • Modify "Show Menu" response to include persistent keyboard

4. Unraid Update Sync

What needs to change: After bot updates a container, clear Unraid's "update available" badge.

How Unraid tracks updates:

  • File: /var/lib/docker/unraid-update-status.json
  • Managed by: /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php
  • Update check compares local image digest vs registry digest

Known issue: When containers are updated by external tools (Watchtower, our bot), Unraid doesn't automatically clear the "update available" badge. Workaround is deleting the status file.

Integration approach:

Option A: Delete status file (simple but blunt)

# After successful container update
rm /var/lib/docker/unraid-update-status.json

This forces Unraid to recheck all containers on next "Check for Updates".

Option B: Trigger Unraid's check (cleaner)

# Call Unraid's update check endpoint (if exposed)
# Requires research into Unraid API

Option C: Modify status file directly

# Update the specific container's entry in the JSON
# Requires understanding the JSON schema
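A hedged sketch of Option C, assuming (unverified) that the file is a flat JSON object keyed by container name; the real schema still needs confirmation on a live Unraid system:

```javascript
// Removes one container's entry so Unraid re-evaluates it on the next check.
// The flat { name: value } shape is an ASSUMPTION, not a documented schema.
function clearUpdateFlag(jsonText, containerName) {
  const status = JSON.parse(jsonText);
  delete status[containerName];
  return JSON.stringify(status);
}

console.log(clearUpdateFlag('{"plex":"true","sonarr":"false"}', 'plex'));
// -> {"sonarr":"false"}
```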

Challenges:

  1. Status file is on host, not in n8n container
  2. n8n doesn't have host filesystem access
  3. Socket proxy won't help (not a Docker API operation)

Possible solutions:

  1. Mount status file into n8n: -v /var/lib/docker/unraid-update-status.json:/unraid-status.json
  2. Create helper script on host: Cron job that monitors for updates
  3. Use Unraid API: If Unraid exposes an endpoint to trigger update checks

Confidence: LOW - This requires host-level access that may not be available via Docker. Needs further investigation with actual Unraid system access.

Component Changes Summary

| Component | Change Type | Description | Effort |
|---|---|---|---|
| socket-proxy | NEW | Docker socket proxy container | Medium |
| n8n container | MODIFY | Remove socket mount, add DOCKER_HOST env | Low |
| n8n API key | NEW | Create API key in n8n settings | Low |
| Execute Command nodes | MODIFY | Change curl commands to use proxy | Medium |
| Show Menu node | MODIFY | Add persistent keyboard | Low |
| Route Callback switch | MODIFY | Add new callback types | Low |
| Container selection | NEW | Generate dynamic keyboards | Medium |
| Unraid sync | NEW | Status file update mechanism | High (uncertain) |

Data Flow Changes

Before v1.1 (current)

User -> Telegram -> n8n webhook -> curl -> docker.sock -> Docker Engine

After v1.1

User -> Telegram -> n8n webhook -> curl -> socket-proxy:2375 -> docker.sock -> Docker Engine
                         ^
                         |
Claude Code -> n8n API --+

Suggested Build Order

Based on dependencies and risk:

Phase 1: Infrastructure (no workflow changes)

  1. n8n API enablement - Configuration only, enables faster iteration
  2. Socket proxy setup - Container deployment, test connectivity

Rationale: API access first means Claude Code can more easily modify the workflow for subsequent phases. Socket proxy is infrastructure that must exist before workflow changes.

Phase 2: Workflow Migration

  1. Update all curl commands - Migrate from socket to proxy
  2. Test all existing functionality - Verify no regressions

Rationale: This is the highest-risk change. Must be completed and tested before adding new features.

Phase 3: UX Enhancements

  1. Persistent keyboard - Quick win, minimal integration
  2. Container selection keyboards - Depends on callback routing

Rationale: UX improvements can proceed once core functionality is stable.

Phase 4: Advanced Integration

  1. Unraid update sync - Requires further research, may need host access

Rationale: Most uncertain feature, defer until other features are stable.

Risk Assessment

| Feature | Risk Level | Mitigation |
|---|---|---|
| n8n API | LOW | Configuration only, easy rollback |
| Socket proxy | MEDIUM | Test in isolation before workflow changes |
| Curl migration | MEDIUM | Use search/replace, comprehensive testing |
| Persistent keyboard | LOW | Additive change, no existing code modified |
| Inline container selection | LOW | Extends existing callback pattern |
| Unraid sync | HIGH | May require host access that is unavailable to the container |

Open Questions

  1. Unraid API: Does Unraid expose an API endpoint to trigger "Check for Updates"?
  2. Status file format: What is the exact JSON structure of unraid-update-status.json?
  3. Socket proxy port: Should socket-proxy be exposed only on internal Docker network (recommended) or also externally?
  4. n8n API network: Should n8n API be exposed on LAN or only via Tailscale/VPN?

Recommendations

  1. Start with n8n API enablement - Low risk, high value for development velocity
  2. Deploy socket proxy as sidecar - Same Docker network as n8n, not exposed externally
  3. Use HTTP Request nodes for keyboards - Native Telegram node has limitations with dynamic keyboards
  4. Defer Unraid sync - Needs host access investigation, may not be feasible via Docker alone
  5. Test socket proxy thoroughly - All container operations (start, stop, restart, update, logs) before proceeding
