Phase 3: Container Actions - Research

Researched: 2026-01-29
Domain: Docker container lifecycle control via n8n + Telegram inline buttons
Confidence: HIGH

Summary

This phase implements container control actions (start, stop, restart, update) through natural language commands with fuzzy name matching. The research confirms that the Docker Engine API provides straightforward POST endpoints for container lifecycle operations (/containers/{id}/start, /containers/{id}/stop, /containers/{id}/restart). Container updates require a multi-step process: pull new image, stop old container, remove it, create new container with same config, and start it.

The critical technical finding is that n8n's native Telegram node does not properly support dynamic inline keyboards via expressions. The workaround is to use the HTTP Request node to call the Telegram Bot API directly with full JSON payload control. This enables the confirmation buttons required for batch actions.

State management for pending confirmations (with a 2-minute timeout) can use n8n's workflow static data or, more simply, encode all necessary context (action, container IDs, timestamp) in callback_data so that no server-side state is needed.

Primary recommendation: Use Docker Engine API v1.47 POST endpoints for container control, HTTP Request node for Telegram inline keyboards, and encode confirmation state in callback_data to avoid complex state management.

Standard Stack

The established libraries/tools for this domain:

Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| Docker Engine API | v1.47 | Container lifecycle control | Official API, already working in Phase 2 |
| curl | 7.50+ | HTTP requests to Unix socket | Already established, supports POST with -X POST |
| n8n Execute Command | Latest | Run Docker API calls | Already established pattern |
| n8n HTTP Request | Latest | Telegram API inline keyboards | Required workaround for dynamic buttons |
| n8n Code node | Latest | Response formatting, state encoding | Already established pattern |

Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| n8n Switch node | Latest | Route callback queries vs messages | Handle different Telegram update types |
| n8n Telegram node | Latest | Answer callback queries | Native node works for answerCallbackQuery |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| HTTP Request for keyboard | Native Telegram node | Native node doesn't support dynamic inline keyboards via expressions |
| Stateless callback_data | n8n Static Data | Static data adds complexity; callback_data encoding simpler for 2-min timeout |
| Container recreate via API | Watchtower | Watchtower is automated; we want user-controlled updates |

No additional installation required - all tools already available from Phase 1 and 2.

Architecture Patterns

Telegram Trigger (message + callback_query)
  |
  +-> IF User Authenticated
        |
        +-> Switch: Update Type
              |
              +-> [message] -> Route Message -> Action Branch
              |                                   |
              |                                   +-> Start/Stop/Restart
              |                                   +-> Update (pull+recreate)
              |
              +-> [callback_query] -> Process Confirmation
                                        |
                                        +-> Decode callback_data
                                        +-> Validate timestamp (2-min)
                                        +-> Execute action
                                        +-> Answer callback query
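
The "Switch: Update Type" step in the flow above could be driven by a small Code node helper like the sketch below. The function name `classifyUpdate` and the `'ignored'` bucket are assumptions made for illustration; the only facts taken from this document are that the trigger delivers `message` and `callback_query` updates.

```javascript
// Hypothetical helper: pick the branch for an incoming Telegram update.
// Returns 'callback_query' for button clicks, 'message' for text messages,
// and 'ignored' for anything else (an assumed catch-all bucket).
function classifyUpdate(update) {
  if (update && update.callback_query) {
    return 'callback_query';
  }
  if (update && update.message && typeof update.message.text === 'string') {
    return 'message';
  }
  return 'ignored';
}
```

A Switch node could then route on the returned string, keeping the routing rule in one place instead of spreading it across node conditions.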

Pattern 1: Container Lifecycle Actions via API

What: Use POST requests to the Docker API for start/stop/restart
When to use: All container control operations

Example:

# Start container
curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/{id}/start'

# Stop container (with 10s timeout)
curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/{id}/stop?t=10'

# Restart container
curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/{id}/restart?t=10'

Response codes:

  • 204: Success (no content)
  • 304: Container already started/stopped (for start/stop)
  • 404: Container not found
  • 500: Server error

Source: Docker Engine API Examples
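
The curl calls above run with `-s` and print nothing on success, so the response code is not visible to later nodes. One possible way to surface it, sketched here under the assumption that the command string is built in a Code node and passed to Execute Command, is to let curl print only the HTTP status via `-o /dev/null -w '%{http_code}'`. The helper name `buildLifecycleCommand` is hypothetical.

```javascript
const API_BASE = 'http://localhost/v1.47';
const SOCKET = '/var/run/docker.sock';

// Hypothetical helper: build a curl command whose stdout is just the HTTP status code.
function buildLifecycleCommand(action, containerId, timeoutSeconds = 10) {
  if (!['start', 'stop', 'restart'].includes(action)) {
    throw new Error(`Unsupported action: ${action}`);
  }
  const id = encodeURIComponent(containerId);
  // start takes no timeout; stop and restart accept ?t=<seconds>
  const query = action === 'start' ? '' : `?t=${timeoutSeconds}`;
  return `curl -s -o /dev/null -w '%{http_code}' --unix-socket ${SOCKET} ` +
    `-X POST '${API_BASE}/containers/${id}/${action}${query}'`;
}
```

With this shape, the Execute Command node's stdout would be a string such as `204`, which a following node can compare against the response codes listed above.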

Pattern 2: Inline Keyboard via HTTP Request Node

What: Send messages with inline buttons using the HTTP Request node
When to use: Batch confirmations, suggestions ("did you mean X?")

Example (Code node to generate, HTTP Request to send):

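// Code node: Generate keyboard JSON
// chatId comes from the incoming Telegram message (assumes a message update)
const chatId = $input.item.json.message.chat.id;
const containers = ['sonarr', 'radarr', 'lidarr'];
const action = 'stop';
const timestamp = Date.now();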

const keyboard = {
  inline_keyboard: [
    [
      {
        text: `Yes, ${action} ${containers.length} containers`,
        callback_data: JSON.stringify({
          a: action,           // action
          c: containers,       // container IDs (short)
          t: timestamp         // timestamp for timeout check
        })
      },
      {
        text: "Cancel",
        callback_data: JSON.stringify({ a: 'cancel' })
      }
    ]
  ]
};

return {
  json: {
    chat_id: chatId,
    text: `Found ${containers.length} matches: ${containers.join(', ')}`,
    reply_markup: keyboard
  }
};

// HTTP Request node config:
// URL: https://api.telegram.org/bot{{ $credentials.telegramApi.accessToken }}/sendMessage
// Method: POST
// Body: JSON (from previous node)

Source: n8n Community - Dynamic Inline Keyboard

Pattern 3: Handle Callback Queries

What: Process inline button clicks and respond
When to use: When the user clicks a confirmation or suggestion button

Telegram Trigger config:

// Set updates to receive both messages and callback queries
{
  "updates": ["message", "callback_query"]
}

Processing callback_query (Code node):

const update = $input.item.json;

// Check if this is a callback query
if (update.callback_query) {
  const callbackData = JSON.parse(update.callback_query.data);
  const queryId = update.callback_query.id;
  const chatId = update.callback_query.message.chat.id;
  const messageId = update.callback_query.message.message_id;

  // Check timeout (2 minutes = 120000ms)
  if (Date.now() - callbackData.t > 120000) {
    return {
      json: {
        expired: true,
        queryId,
        text: "Confirmation expired. Please try again."
      }
    };
  }

  return {
    json: {
      action: callbackData.a,
      containers: callbackData.c,
      queryId,
      chatId,
      messageId
    }
  };
}

// Not a callback query: pass the update through for the message branch
return { json: update };

Answer callback query (Telegram node):

// Operation: Answer Query
// Query ID: {{ $json.queryId }}
// Text: (optional toast message)
// Show Alert: false

Source: Telegram Bot API - CallbackQuery
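
The decision made after decoding callback_data can be isolated into one pure function, which makes the 2-minute rule easy to test outside n8n. This is a sketch only: `decideCallbackAction` is a hypothetical name, and treating a missing timestamp as expired is an assumption that is stricter than the Code node example above.

```javascript
const CONFIRM_TTL_MS = 2 * 60 * 1000; // 2-minute confirmation window

// Hypothetical helper: returns 'cancel', 'expired', or 'execute'.
function decideCallbackAction(data, now = Date.now()) {
  // Both cancel spellings used in this document's examples are accepted
  if (data.a === 'cancel' || data.a === 'x') {
    return 'cancel';
  }
  // Assumption: a confirmation without a numeric timestamp is rejected
  if (typeof data.t !== 'number' || now - data.t > CONFIRM_TTL_MS) {
    return 'expired';
  }
  return 'execute';
}
```

Passing `now` as a parameter keeps the function deterministic, so the expiry boundary can be checked with fixed timestamps.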

Pattern 4: Container Update (Pull + Recreate)

What: Pull new image, stop container, remove, recreate with same config, start
When to use: "update plex" command

Steps:

// 1. Get current container config
const inspectCmd = `curl -s --unix-socket /var/run/docker.sock \
  'http://localhost/v1.47/containers/${containerId}/json'`;
// Returns: { Config: {...}, HostConfig: {...}, Name: "...", ... }

// 2. Pull new image (streaming response)
// Note: make sure imageName carries a tag; the API docs state that an empty tag pulls all tags
const imageName = containerConfig.Config.Image;
const pullCmd = `curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/images/create?fromImage=${encodeURIComponent(imageName)}'`;
// Returns: Stream of {"status": "Pulling...", "progress": "..."} lines

// 3. Compare image IDs to detect if update occurred
// Old image ID: containerConfig.Image (the sha256:... ID the container was created from)
// New image ID: inspect the image after the pull and read its Id

// 4. Stop container
const stopCmd = `curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/${containerId}/stop?t=10'`;

// 5. Remove container
const removeCmd = `curl -s --unix-socket /var/run/docker.sock \
  -X DELETE 'http://localhost/v1.47/containers/${containerId}'`;

// 6. Create new container with same config
const createBody = {
  ...containerConfig.Config,
  HostConfig: containerConfig.HostConfig,
  // The create endpoint expects networks under EndpointsConfig (see Pitfall 3)
  NetworkingConfig: { EndpointsConfig: containerConfig.NetworkSettings.Networks }
};
// POST to /containers/create?name=containerName with createBody

// 7. Start new container
const startCmd = `curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/${newContainerId}/start'`;

Source: Docker Forums - Recreate Container
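
Step 6 is the part of the update flow most likely to go wrong, so it may help to build the create request in one place. The sketch below assumes the inspect output has the shape described in step 1 and applies the network handling from Pitfall 3. `buildCreateRequest` is a hypothetical helper, and the exact set of fields Docker accepts is still listed under Open Questions.

```javascript
// Hypothetical helper: turn `GET /containers/{id}/json` output into a create request.
function buildCreateRequest(inspect) {
  // Inspect reports the name with a leading slash, e.g. "/plex"
  const name = inspect.Name.replace(/^\//, '');

  // Keep only the endpoint fields used in Pitfall 3
  const endpoints = {};
  const networks = inspect.NetworkSettings?.Networks ?? {};
  for (const [network, cfg] of Object.entries(networks)) {
    endpoints[network] = {
      IPAMConfig: cfg.IPAMConfig,
      Links: cfg.Links,
      Aliases: cfg.Aliases,
    };
  }

  return {
    path: `/containers/create?name=${encodeURIComponent(name)}`,
    body: {
      ...inspect.Config,
      HostConfig: inspect.HostConfig,
      NetworkingConfig: { EndpointsConfig: endpoints },
    },
  };
}
```

The returned `body` would be serialized with `JSON.stringify` and sent with a `Content-Type: application/json` header; testing against a simple container first, as recommended later in this document, still applies.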

Pattern 5: Version Detection for Update Messages

What: Detect if the image actually updated and extract version info
When to use: Showing "plex updated: v1.32.0 -> v1.32.1"

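// Assumes child_process is allowed in the Code node (NODE_FUNCTION_ALLOW_BUILTIN)
const { execSync } = require('child_process');

// Container name from inspect has a leading slash
const containerName = containerConfig.Name.replace(/^\//, '');

// Record the old image ID (sha256:...) before the pull
const oldImageId = containerConfig.Image;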

// After pull, inspect new image
const newImageInspect = JSON.parse(execSync(`curl -s --unix-socket /var/run/docker.sock \
  'http://localhost/v1.47/images/${encodeURIComponent(imageName)}/json'`));
const newImageId = newImageInspect.Id;

// Compare
if (oldImageId === newImageId) {
  return { updated: false, message: null }; // Stay silent per user decision
}

// Try to extract version from labels (common pattern)
const oldVersion = containerConfig.Config.Labels?.['org.opencontainers.image.version']
  || containerConfig.Config.Labels?.['version']
  || oldImageId.substring(7, 19);
const newVersion = newImageInspect.Config.Labels?.['org.opencontainers.image.version']
  || newImageInspect.Config.Labels?.['version']
  || newImageId.substring(7, 19);

return {
  updated: true,
  message: `${containerName} updated: ${oldVersion} -> ${newVersion}`
};

Source: Docker Image Digests

Anti-Patterns to Avoid

  • Using native Telegram node for dynamic keyboards: Doesn't work - use HTTP Request node instead
  • Server-side state for confirmations: Adds complexity; encode everything in callback_data
  • Not handling 304 responses: Container already in desired state is success, not error
  • Force-killing without timeout: Use ?t=10 to give container graceful shutdown time
  • Assuming image pull always updates: Must compare digests to detect actual changes

Don't Hand-Roll

Problems that look simple but have existing solutions:

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Inline keyboard buttons | Native Telegram node | HTTP Request + Telegram API | Native node has expression bug, HTTP works |
| Container config extraction | Manual JSON manipulation | Docker inspect API | Full config including HostConfig, Networks |
| Timeout enforcement | setTimeout in Code node | Encode timestamp in callback_data | Stateless, survives workflow restarts |
| Image update detection | File hash comparison | Docker image digest comparison | Registry-aware, handles layers correctly |

Key insight: The complexity in this phase is state management for confirmations and the container update workflow. Keep confirmations stateless by encoding in callback_data. The update workflow is inherently multi-step but each step is a simple API call.

Common Pitfalls

Pitfall 1: Native Telegram Node Inline Keyboard Bug

What goes wrong: Passing an inline keyboard via expression results in a "The value is not supported!" error
Why it happens: The n8n Telegram node interprets the array as a string instead of JSON
How to avoid: Use the HTTP Request node to call the Telegram API directly with full JSON control
Warning signs: Buttons don't appear despite a valid-looking keyboard structure

Source: n8n Issue #19955

Pitfall 2: Not Handling 304 "Already Stopped/Started"

What goes wrong: Code treats a 304 response as an error
Why it happens: 304 means "not modified" - the container is already in the desired state
How to avoid:

// 204 = success, 304 = already in state (also success)
if (statusCode === 204 || statusCode === 304) {
  return { success: true };
}

Warning signs: "Error stopping container" when container was already stopped
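
Extending the check above to every documented response code gives later nodes a uniform result. The sketch below assumes the status code arrives as a number or a numeric string (for example from curl's stdout); `interpretLifecycleStatus` and the shape of the returned object are hypothetical.

```javascript
// Hypothetical helper: map a Docker lifecycle response code to a result object.
function interpretLifecycleStatus(action, statusCode) {
  const code = Number(statusCode);
  if (code === 204) {
    return { success: true, changed: true, message: `${action} succeeded` };
  }
  if (code === 304) {
    // Already in the desired state counts as success, not an error
    return { success: true, changed: false, message: 'already in desired state' };
  }
  if (code === 404) {
    return { success: false, changed: false, message: 'container not found' };
  }
  return { success: false, changed: false, message: `unexpected status ${statusCode}` };
}
```

The `changed` flag lets the reply distinguish "stopped plex" from "plex was already stopped" without treating either as a failure.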

Pitfall 3: Container Recreate Loses Network Settings

What goes wrong: New container can't connect to other containers
Why it happens: NetworkSettings from inspect need special handling for create
How to avoid:

// Extract network config correctly
const networks = {};
for (const [name, config] of Object.entries(containerConfig.NetworkSettings.Networks)) {
  networks[name] = {
    IPAMConfig: config.IPAMConfig,
    Links: config.Links,
    Aliases: config.Aliases
  };
}
// Use in create: { NetworkingConfig: { EndpointsConfig: networks } }

Warning signs: Container starts but can't reach other services

Pitfall 4: Image Pull Returns Stream, Not JSON

What goes wrong: JSON.parse() fails on the image pull response
Why it happens: The pull endpoint returns a newline-delimited JSON stream
How to avoid:

// Parse last line for final status
const lines = pullOutput.trim().split('\n');
const lastLine = JSON.parse(lines[lines.length - 1]);
if (lastLine.error) {
  throw new Error(lastLine.error);
}
// Or just check exit code - success means pull completed

Warning signs: "Unexpected token" errors during update
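
A slightly more defensive variant scans every line rather than only the last one, so an error event in the middle of the stream or a stray non-JSON line does not slip through. This is a sketch under the assumption that each line is an independent JSON object with optional `status` and `error` fields, as described above; `summarizePull` is a hypothetical name.

```javascript
// Hypothetical helper: reduce the newline-delimited pull stream to one summary.
function summarizePull(pullOutput) {
  const events = pullOutput
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      try {
        return JSON.parse(line);
      } catch {
        return null; // skip lines that are not valid JSON
      }
    })
    .filter(Boolean);

  // Any event carrying an error field fails the whole pull
  const failure = events.find((event) => event.error);
  if (failure) {
    return { ok: false, error: failure.error, status: null };
  }

  // Otherwise report the last status text seen in the stream
  const last = [...events].reverse().find((event) => typeof event.status === 'string');
  return { ok: true, error: null, status: last ? last.status : null };
}
```

The returned `status` string is informational only; comparing image IDs remains the reliable way to decide whether an update happened.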

Pitfall 5: callback_data Size Limit

What goes wrong: Telegram silently fails to send buttons with large callback_data
Why it happens: callback_data is limited to 64 bytes
How to avoid: Use short keys, container short IDs (12 chars), and abbreviated action names

// Bad: { action: "restart", containers: ["full-id-1234567890abcdef..."] }
// Good: { a: "r", c: ["abc123"] }  // Short ID is unique enough

Warning signs: Buttons don't appear, no error

Source: Telegram Bot API Docs
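
Because the limit is measured in bytes, it is worth checking the encoded payload before sending rather than discovering missing buttons afterwards. The sketch below uses the same `a`/`c`/`t` keys as the examples in this document; `encodeCallbackData` is a hypothetical helper, and `Buffer` is assumed to be available in the Code node's Node.js runtime.

```javascript
const MAX_CALLBACK_BYTES = 64; // Telegram's callback_data limit

// Hypothetical helper: encode confirmation state and fail loudly if it is too large.
function encodeCallbackData(action, containerIds, timestamp) {
  const data = JSON.stringify({ a: action, c: containerIds, t: timestamp });
  const bytes = Buffer.byteLength(data, 'utf8');
  if (bytes > MAX_CALLBACK_BYTES) {
    throw new Error(`callback_data is ${bytes} bytes; limit is ${MAX_CALLBACK_BYTES}`);
  }
  return data;
}
```

For scale: the three names `sonarr`, `radarr`, `lidarr` with action `stop` and a 13-digit timestamp encode to 63 bytes, while three 12-character IDs with the same fields already exceed the limit. Batch confirmations with more than a couple of containers may therefore need shorter identifiers or the static-data alternative mentioned earlier.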

Pitfall 6: Race Condition in Update Workflow

What goes wrong: Container remove fails because the container is still stopping
Why it happens: With a short timeout, stop can return before the container has fully stopped
How to avoid: Use an adequate timeout (10s default) or check container state before remove

// execStop, inspectContainer, and removeContainer are placeholder helpers
// Wait for stop to complete
await execStop();
// Verify stopped before remove
const state = await inspectContainer();
if (state.State.Running) {
  throw new Error('Container still running after stop');
}
await removeContainer();

Warning signs: "You cannot remove a running container" errors
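
Instead of failing on the first check, the state check can be retried a few times before giving up. The sketch below takes the inspect call as a parameter so it stays independent of how the Docker API is reached; `waitUntilStopped`, the attempt count, and the delay are all assumptions chosen for illustration.

```javascript
// Hypothetical helper: poll container state until it reports not running.
// `inspect` is any async function resolving to Docker inspect output.
async function waitUntilStopped(inspect, { attempts = 5, delayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const info = await inspect();
    if (!info.State.Running) {
      return true; // safe to remove
    }
    // Sleep between attempts, but not after the final one
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false; // still running after all attempts
}
```

A caller would remove the container only when this resolves to `true`, and report a failure to the user otherwise.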

Code Examples

Verified patterns from official sources:

Start Container

# Execute Command node
curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/plex/start'

# Returns nothing (204) on success
# Returns 304 if already running
# Returns 404 if container not found

Stop Container with Timeout

# Execute Command node
curl -s --unix-socket /var/run/docker.sock \
  -X POST 'http://localhost/v1.47/containers/plex/stop?t=10'

# t=10 gives container 10 seconds to shutdown gracefully
# After timeout, SIGKILL is sent

Send Message with Inline Keyboard (HTTP Request)

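// Code node: Prepare payload
// chatId comes from the incoming Telegram message (assumes a message update)
const chatId = $input.item.json.message.chat.id;
const payload = {
  chat_id: chatId,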
  text: "Found 3 containers matching 'arr': sonarr, radarr, lidarr\n\nStop all?",
  parse_mode: "HTML",
  reply_markup: {
    inline_keyboard: [
      [
        { text: "Yes, stop 3 containers", callback_data: '{"a":"stop","c":["abc","def","ghi"],"t":1706544000000}' },
        { text: "Cancel", callback_data: '{"a":"x"}' }
      ]
    ]
  }
};

return { json: payload };

// HTTP Request node config:
// Method: POST
// URL: https://api.telegram.org/bot{{$credentials.telegramApi.accessToken}}/sendMessage
// Body Content Type: JSON
// Body: {{ JSON.stringify($json) }}

Handle Callback Query

// Code node after Telegram Trigger
const update = $input.item.json;

if (!update.callback_query) {
  // Not a callback query, handle as message
  return { json: { type: 'message', data: update.message } };
}

const callback = update.callback_query;
let data;
try {
  data = JSON.parse(callback.data);
} catch (e) {
  data = { a: callback.data }; // Plain string fallback
}

// Check timeout (2 minutes)
const TWO_MINUTES = 120000;
const isExpired = Boolean(data.t) && (Date.now() - data.t > TWO_MINUTES);

return {
  json: {
    type: 'callback',
    queryId: callback.id,
    chatId: callback.message.chat.id,
    messageId: callback.message.message_id,
    action: data.a,
    containers: data.c || [],
    expired: isExpired,
    userId: callback.from.id
  }
};

Answer Callback Query (Telegram Node)

// Telegram node settings
// Operation: Answer Query
// Query ID: {{ $json.queryId }}
// Text: Action completed  (or leave empty for no notification)
// Show Alert: false
// Cache Time: 0

Delete Confirmation Message After Action

// HTTP Request node to delete the confirmation message
// URL: https://api.telegram.org/bot{{$credentials.telegramApi.accessToken}}/deleteMessage
// Method: POST
// Body: { "chat_id": {{ $json.chatId }}, "message_id": {{ $json.messageId }} }

Pull Image and Check for Update

// Code node: Pull image and compare
const containerId = $json.containerId;
const chatId = $json.chatId;

// Get current container info
const inspectResult = $('Docker Inspect').item.json;
const currentImageId = inspectResult.Image;
const imageName = inspectResult.Config.Image;

// Pull result (from Execute Command node that ran curl POST to /images/create)
const pullOutput = $('Docker Pull').item.json.stdout;

// Parse pull output (newline-delimited JSON)
const lines = pullOutput.trim().split('\n').filter(l => l);
const statuses = lines.map(l => {
  try { return JSON.parse(l); }
  catch { return null; }
}).filter(Boolean);

// Check for errors
const errorStatus = statuses.find(s => s.error);
if (errorStatus) {
  return { json: { error: true, message: errorStatus.error } };
}

// Get new image ID
const newInspect = $('Docker Image Inspect').item.json;
const newImageId = newInspect.Id;

if (currentImageId === newImageId) {
  return { json: { updated: false } }; // No message per user decision
}

// Extract versions from labels
const getVersion = (config) =>
  config?.Labels?.['org.opencontainers.image.version'] ||
  config?.Labels?.['version'] ||
  'unknown';

return {
  json: {
    updated: true,
    oldVersion: getVersion(inspectResult.Config),
    newVersion: getVersion(newInspect.Config),
    containerId,
    chatId
  }
};

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Native Telegram node keyboard | HTTP Request + Telegram API | n8n limitation (ongoing) | Required for dynamic buttons |
| Server-side confirmation state | Stateless callback_data encoding | Best practice | Simpler, no cleanup needed |
| docker commit for update | Pull + inspect + recreate | Always preferred | Preserves exact config, no manual re-entry |
| Manual docker pull/stop/rm/run | API calls in sequence | Phase 3 | Scriptable, error-handled |

Deprecated/outdated:

  • Watchtower for user-initiated updates: Watchtower is for automated updates; we want manual control
  • docker exec for container control: Always use Docker Engine API, not CLI parsing
  • Telegram node editMessageReplyMarkup: Same expression bug; use HTTP Request

Open Questions

Things that couldn't be fully resolved:

  1. Container NetworkingConfig exact format

    • What we know: Need to extract from inspect and pass to create
    • What's unclear: Exact transformation needed between inspect output and create input
    • Recommendation: Test with a simple container first; may need to strip some fields
  2. Image pull authentication for private registries

    • What we know: Public images (Docker Hub) work without auth
    • What's unclear: If user has private registry images, need X-Registry-Auth header
    • Recommendation: Document as limitation for v1; add auth support if requested
  3. Long-running pull timeout

    • What we know: Large images can take minutes to pull
    • What's unclear: n8n Execute Command timeout, user patience threshold
    • Recommendation: Send "in progress" message for update actions (Claude's discretion per CONTEXT.md)
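
For the private-registry question, the Docker Engine API documentation describes the X-Registry-Auth header as a base64url-encoded JSON object holding the registry credentials. If auth support is added later, the header value could be produced roughly as sketched below. `buildRegistryAuthHeader` is a hypothetical helper, the field names follow the Docker documentation as far as known here, and keeping the `=` padding is an assumption made for compatibility with strict decoders.

```javascript
// Hypothetical helper: build an X-Registry-Auth header value.
function buildRegistryAuthHeader({ username, password, serveraddress }) {
  const json = JSON.stringify({ username, password, serveraddress });
  // URL-safe base64: swap '+' and '/' for '-' and '_', keep padding
  return Buffer.from(json, 'utf8')
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}
```

The value would be passed to the pull request with curl's `-H 'X-Registry-Auth: <value>'`. Credentials should come from n8n credentials or environment variables rather than being written into the workflow.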

Metadata

Confidence breakdown:

  • Standard stack: HIGH - Docker API endpoints verified, n8n patterns established in Phase 2
  • Architecture: HIGH - Patterns from official docs and verified community workarounds
  • Pitfalls: MEDIUM - Based on documented issues and community reports
  • Container update: MEDIUM - Multi-step process, some edge cases around NetworkingConfig

Research date: 2026-01-29
Valid until: 2026-04-29 (90 days - Docker API stable, n8n may fix Telegram keyboard issue)