Phase 7: Socket Security - Research

Researched: 2026-02-03
Domain: Docker socket security proxy with tecnativa/docker-socket-proxy
Confidence: HIGH

Summary

The tecnativa/docker-socket-proxy is an Alpine-based HAProxy container that filters Docker API requests based on environment variables. It prevents direct socket access by placing a configurable proxy between n8n and the Docker daemon. The proxy operates on TCP port 2375 and returns HTTP 403 Forbidden with "Request forbidden by administrative rules" for blocked endpoints.

The standard approach is to deploy the proxy on the same Docker network as n8n, configure environment variables to enable only required Docker API endpoints (CONTAINERS=1, IMAGES=1, POST=1, plus granular ALLOW_START, ALLOW_STOP, ALLOW_RESTARTS), and update all n8n workflow curl commands from --unix-socket /var/run/docker.sock to TCP calls against http://docker-socket-proxy:2375.

Docker API v1.53 is the current version (January 2026) but v1.47 (used in existing workflow) remains compatible. Container operations use POST to /containers/{id}/start|stop|restart, image pulls use POST to /images/create?fromImage={image}, and all endpoints accept both short and long container IDs.

Primary recommendation: Deploy tecnativa/docker-socket-proxy via Unraid CA "dockersocket" template with minimal configuration (CONTAINERS=1, IMAGES=1, POST=1, ALLOW_START=1, ALLOW_STOP=1, ALLOW_RESTARTS=1), add to n8n's Docker network, update workflow nodes to replace unix socket curl with TCP curl to docker-socket-proxy:2375, handle 403 responses as immediate failures without retry.

Standard Stack

The established solution for Docker socket security proxying:

Core

| Library | Version | Purpose | Why Standard |
| --- | --- | --- | --- |
| tecnativa/docker-socket-proxy | latest | Docker API proxy with HAProxy filtering | Industry standard for limiting socket access, used by Traefik/Portainer integrations, actively maintained |
| Docker Engine API | v1.53 (current), v1.47 (compatible) | RESTful container/image operations | Official Docker API, backward compatible across minor versions |

Supporting

| Library | Version | Purpose | When to Use |
| --- | --- | --- | --- |
| HAProxy | Alpine-based (embedded) | HTTP proxy filtering | Embedded in tecnativa proxy, no separate deployment |
| Docker custom bridge network | Built-in | Network isolation | When multiple containers need inter-container communication |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
| --- | --- | --- |
| tecnativa/docker-socket-proxy | wollomatic/socket-proxy (Go-based) | Go version is lighter but less widely adopted; tecnativa has broader community usage |
| tecnativa/docker-socket-proxy | linuxserver/socket-proxy | LinuxServer.io fork with the same functionality but a different maintenance cadence |
| TCP proxy | Read-only socket mount | Read-only mount does NOT prevent dangerous operations, only makes them harder to exploit |

Installation:

Via Unraid Community Apps (recommended):

  1. Search for "dockersocket" template
  2. Install tecnativa/docker-socket-proxy:latest
  3. Configure environment variables (see Architecture Patterns below)
  4. Add to same network as n8n

Manual Docker deployment:

docker run -d \
  --name docker-socket-proxy \
  --privileged \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -p 127.0.0.1:2375:2375 \
  -e CONTAINERS=1 \
  -e IMAGES=1 \
  -e POST=1 \
  -e ALLOW_START=1 \
  -e ALLOW_STOP=1 \
  -e ALLOW_RESTARTS=1 \
  tecnativa/docker-socket-proxy:latest
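
To smoke-test the deployment, run a one-off container on the same network and query an endpoint the proxy allows by default. A sketch, assuming the curlimages/curl image and a network named n8n-net (both placeholders):

# /version is permitted by the proxy's defaults (VERSION=1)
docker run --rm --network n8n-net curlimages/curl \
  -s 'http://docker-socket-proxy:2375/version'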

Architecture Patterns

Current vs Future Pattern

Current (Phase 6 - Direct Socket):

n8n container ---mount---> /var/run/docker.sock (host)
                ↓
          Docker Engine

Future (Phase 7 - Proxy):

n8n container ---TCP---> socket-proxy container ---mount---> /var/run/docker.sock (host)
                                                        ↓
                                                  Docker Engine

Proxy Environment Variable Configuration

Environment variables control API access (0=deny, 1=allow):

Required for container operations:

CONTAINERS=1           # Enable /containers/* endpoints
POST=1                 # Enable POST/PUT/DELETE (read-only without this)
ALLOW_START=1          # Enable /containers/{id}/start
ALLOW_STOP=1           # Enable /containers/{id}/stop
ALLOW_RESTARTS=1       # Enable /containers/{id}/stop|restart|kill

Required for image operations (update command):

IMAGES=1               # Enable /images/* endpoints (includes pull)

Blocked by default (do NOT enable):

BUILD=0                # Blocks /build endpoint
COMMIT=0               # Blocks /commit endpoint
EXEC=0                 # Blocks /containers/{id}/exec (command execution inside containers)
SECRETS=0              # Blocks /secrets endpoint
AUTH=0                 # Blocks authentication endpoints

Optional logging:

LOG_LEVEL=info         # Values: debug, info, notice, warning, err, crit, alert, emerg
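
A quick way to confirm the policy is active is to hit an endpoint that was left disabled and check for 403. A hedged example, run from any container on the proxy's network (abc123 is a placeholder container ID):

# Allowed endpoint (CONTAINERS=1): expect 200
curl -s -o /dev/null -w '%{http_code}\n' 'http://docker-socket-proxy:2375/v1.47/containers/json'

# Disabled endpoint (EXEC=0): expect 403 "Request forbidden by administrative rules"
curl -s -o /dev/null -w '%{http_code}\n' \
  -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/exec'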

Pattern 1: Replacing Unix Socket Curl Commands

What: Convert all n8n Execute Command nodes from unix socket to TCP proxy calls.

When to use: Every Execute Command node that currently calls --unix-socket /var/run/docker.sock

Search & Replace Pattern:

FROM: curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/
TO:   curl -s 'http://docker-socket-proxy:2375/v1.47/
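
For bulk updates, the same substitution can be applied to an exported workflow JSON before re-importing. A sketch, where workflow.json is a placeholder filename (verify the result before importing, since escaping in the export may vary):

sed -i "s|--unix-socket /var/run/docker.sock 'http://localhost/v1.47/|'http://docker-socket-proxy:2375/v1.47/|g" workflow.json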

Example transformations:

List containers:

# Before
curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/containers/json?all=true'

# After
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'

Start container:

# Before
curl -s -o /dev/null -w "%{http_code}" --unix-socket /var/run/docker.sock -X POST 'http://localhost/v1.47/containers/abc123/start'

# After
curl -s -o /dev/null -w "%{http_code}" -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/start'

Pull image:

# Before
curl -s --unix-socket /var/run/docker.sock -X POST 'http://localhost/v1.47/images/create?fromImage=alpine'

# After
curl -s -X POST 'http://docker-socket-proxy:2375/v1.47/images/create?fromImage=alpine'

Pattern 2: Error Handling for Blocked APIs

What: Distinguish between policy blocks (403), Docker errors (4xx/5xx), and connectivity failures (timeout/refused).

When to use: After every Docker API curl call in n8n workflow.

Example (Code node after Execute Command):

// Input: $json.exitCode, $json.stdout, $json.stderr
const exitCode = $json.exitCode;
const stdout = $json.stdout || '';
const stderr = $json.stderr || '';

// 403 = blocked by proxy policy (do NOT retry)
if (stdout.includes('403') || stderr.includes('403 Forbidden')) {
  throw new Error('This action is blocked by security policy');
}

// Connection failures (proxy unavailable)
if (stderr.includes('Connection refused') || stderr.includes('Could not resolve host')) {
  throw new Error('Docker proxy unavailable — please check server');
}

// Timeout (allow ONE retry via workflow logic)
if (stderr.includes('timeout') || stderr.includes('timed out')) {
  return {
    json: {
      retry: true,
      error: 'Request timed out'
    }
  };
}

// Success or other Docker errors
return {
  json: {
    response: stdout,
    exitCode: exitCode
  }
};
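
Where the Execute Command node runs plain shell, the same triage can happen before the Code node, using curl's exit code plus the -w status write-out. A minimal sketch (abc123 is a placeholder container ID):

# curl exit codes: 6 = could not resolve host, 7 = connection refused, 28 = timed out
status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
  -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/start')
rc=$?
if [ "$rc" -eq 28 ]; then echo 'retryable: request timed out'; exit 1; fi
if [ "$rc" -ne 0 ]; then echo 'Docker proxy unavailable'; exit 1; fi
if [ "$status" = "403" ]; then echo 'blocked by security policy'; exit 1; fi
echo "HTTP $status"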

Pattern 3: Network Configuration (Unraid)

What: Add docker-socket-proxy to n8n's Docker network for inter-container communication.

When to use: During proxy deployment phase.

Method 1 - Via Unraid CA template:

  1. Install dockersocket template
  2. In template configuration, set "Network Type" to Custom: br0 or existing custom network
  3. Note: Unraid GUI may require manual network joining via docker network connect

Method 2 - Via docker network connect (after deployment):

# List the networks n8n is attached to (the keys of NetworkSettings.Networks)
docker inspect -f '{{range $net, $cfg := .NetworkSettings.Networks}}{{$net}} {{end}}' n8n

# Connect proxy to same network
docker network connect [network_name] docker-socket-proxy

# Verify
docker network inspect [network_name]
# Should show both 'n8n' and 'docker-socket-proxy' containers

Pattern 4: Timeout Configuration

What: Add timeout flag to curl commands to prevent indefinite hangs.

When to use: All proxy curl commands.

Example:

# 5 second timeout
curl -s --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'
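
--max-time caps the whole transfer; pairing it with --connect-timeout fails fast when the proxy is down while still allowing slower successful responses:

# give up after 2s if the proxy cannot be reached, 5s overall
curl -s --connect-timeout 2 --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'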

Anti-Patterns to Avoid

  • Exposing the proxy port to the host: never publish 2375 on all interfaces (-p 2375:2375). Prefer no -p mapping at all, so the proxy is reachable only from its Docker network; if a host binding is unavoidable (as in the docker run example above), restrict it to 127.0.0.1.
  • Falling back to direct socket: do NOT add fallback logic that uses /var/run/docker.sock when the proxy fails; failing closed is the correct behavior.
  • Retrying 403 responses: blocked API calls should fail immediately, not retry (retrying wastes time and adds confusion).
  • Enabling POST globally without granular controls: even with POST=1, set ALLOW_START/STOP/RESTARTS for defense in depth.
  • Mounting the socket read-write in the proxy: mount it as :ro anyway, but understand the limits: a read-only bind mount only prevents replacing the socket file, while read and write I/O through the socket still works (that is how HAProxy forwards requests). The HAProxy filter rules, not the mount flag, are the real boundary.

Don't Hand-Roll

Problems that look simple but have existing solutions:

| Problem | Don't Build | Use Instead | Why |
| --- | --- | --- | --- |
| Docker API filtering | Custom Node.js proxy checking URLs | tecnativa/docker-socket-proxy | HAProxy handles edge cases (partial paths, encoded URLs, method verification), widely tested in production |
| Error response parsing | Regex matching on curl stderr | HTTP status code checks + known message patterns | Docker API responses are structured; proxy returns a consistent 403 format |
| Container network discovery | Parsing docker inspect output | docker network connect command | Built-in Docker networking handles bridge/overlay/macvlan correctly |
| Retry logic for timeouts | Sleep loops in bash | n8n's built-in "On Error" workflow + "Stop and Error" node | n8n provides workflow-level retry with backoff, cleaner than curl retry flags |

Key insight: Docker socket security is a solved problem with established tooling. The tecnativa proxy is the de facto standard used by Traefik, Portainer, and other Docker management tools. Custom filtering logic will miss edge cases and introduce vulnerabilities.

Common Pitfalls

Pitfall 1: Container Name vs Network Hostname Confusion

What goes wrong: n8n workflow calls http://docker-socket-proxy:2375 but gets "Could not resolve host" error.

Why it happens: Container name does NOT automatically become a resolvable hostname unless containers are on the same user-defined network. Default bridge network doesn't provide DNS resolution.

How to avoid:

  • Verify both n8n and docker-socket-proxy are on same custom network (not default bridge)
  • Use docker network inspect [network] to confirm both containers listed
  • Container name must match exactly (docker-socket-proxy, not dockersocket or socket-proxy)

Warning signs:

  • curl: (6) Could not resolve host: docker-socket-proxy
  • getaddrinfo failed in n8n logs
  • Works with IP address but not hostname
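
One quick resolution check from inside the n8n container (this assumes the image ships busybox's nslookup, which is true for the Alpine-based official image):

docker exec n8n nslookup docker-socket-proxy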

Pitfall 2: Forgetting POST=1 for Write Operations

What goes wrong: Container start/stop/restart commands return 403 Forbidden even though ALLOW_START=1 is set.

Why it happens: ALLOW_START/STOP/RESTARTS only work when POST=1 is also enabled. The granular ALLOW_* variables are subsets of the POST permission.

How to avoid:

  • Always set POST=1 when enabling any write operations
  • Think of POST as the "write operations enabled" master switch
  • ALLOW_* variables then control which specific write operations within POST are permitted

Warning signs:

  • Container operation endpoints return 403
  • GET requests work but POST requests blocked
  • Proxy logs show "Request forbidden by administrative rules" for POST

Pitfall 3: HTTP 403 Treated Like Temporary Error

What goes wrong: Workflow retries blocked API calls multiple times, delaying error response to user by 15+ seconds.

Why it happens: Retry logic doesn't distinguish between "proxy unavailable" (retry makes sense) and "action blocked by policy" (retry pointless).

How to avoid:

  • Check HTTP status code (403) or response body ("Request forbidden by administrative rules")
  • Fail immediately on 403 without retry
  • Only retry on timeout, connection refused, or 5xx errors

Warning signs:

  • User reports slow error responses
  • Telegram bot takes 10+ seconds to say "blocked by security policy"
  • n8n execution logs show multiple identical curl attempts

Pitfall 4: API Version Mismatch Breaking Endpoints

What goes wrong: After updating Docker Engine, API v1.47 endpoints return 400 Bad Request or unexpected responses.

Why it happens: Docker API maintains backward compatibility, but new Docker versions may change defaults (e.g., v1.53 changed Aliases field behavior in container inspect).

How to avoid:

  • Use current API version (v1.53) when possible
  • If staying on v1.47, avoid relying on fields documented as "changed in v1.50+"
  • Test workflow after Docker Engine updates
  • Pin API version in curl URLs (/v1.47/ explicit, never an unversioned path)

Warning signs:

  • Workflows break after Unraid update
  • Container operations return 400 instead of 200
  • JSON response structure different than expected
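
To see what the daemon currently supports, the /version endpoint (enabled by the proxy's defaults) reports both ApiVersion and MinAPIVersion. A sketch, assuming jq is available:

curl -s 'http://docker-socket-proxy:2375/version' | jq '{ApiVersion, MinAPIVersion}'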

Pitfall 5: Short Container IDs Breaking in API v1.53

What goes wrong: Code that parsed Aliases field to get short container ID gets empty array or wrong values.

Why it happens: API v1.53 (Docker Engine 29.2.0, Jan 2026) changed Aliases field to only show user-provided values, not auto-generated short IDs. Use DNSNames field instead.

How to avoid:

  • Don't rely on Aliases field for short container IDs in new code
  • Use Id field (returns full 64-char) and truncate in code if needed: Id.substring(0, 12)
  • Or use new DNSNames field if on v1.53+

Warning signs:

  • Container short ID extraction returns empty value
  • Workflows break after updating to Docker Engine 29.x
  • Code checking Aliases[0] gets unexpected value
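
A version-agnostic alternative is to truncate the full Id client-side. A sketch, assuming jq is available in the execution environment:

# full 64-char Id truncated to the familiar 12-char short ID
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true' | jq -r '.[].Id[0:12]'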

Code Examples

Verified patterns from official sources:

List All Containers (GET)

# Source: https://docs.docker.com/engine/api/sdk/examples/
# After proxy deployment (no --unix-socket flag)
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'

Start Container (POST)

# Source: https://docs.docker.com/engine/api/sdk/examples/
# Returns HTTP 204 on success, 304 if already started, 404 if not found, 500 on error
curl -s -o /dev/null -w "%{http_code}" \
  -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/start'

Stop Container with Timeout (POST)

# Source: https://docs.docker.com/engine/api/sdk/examples/
# t=10 gives container 10 seconds to gracefully stop before SIGKILL
curl -s -o /dev/null -w "%{http_code}" \
  -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/stop?t=10'

Restart Container (POST)

# Source: https://docs.docker.com/engine/api/sdk/examples/
# Combines stop + start, respects t parameter for graceful stop timeout
curl -s -o /dev/null -w "%{http_code}" \
  -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/restart?t=10'

Pull Image (POST)

# Source: https://docs.docker.com/engine/api/sdk/examples/
# fromImage parameter takes full image name with optional tag
# Returns HTTP 200 with a JSON progress stream (404 if the image name is unknown);
# mid-pull failures arrive as {"error": ...} objects inside the stream, so inspect
# the output rather than relying on curl's exit code
curl -s -X POST 'http://docker-socket-proxy:2375/v1.47/images/create?fromImage=alpine:latest'

Inspect Container (GET)

# Source: Docker API reference
# Returns full container JSON including Config, State, NetworkSettings
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/abc123/json'

n8n Node Choice: Execute Command vs HTTP Request

For n8n workflows, use Execute Command node (not HTTP Request node) because:

  • Execute Command can use shell timeout flags
  • Easier to capture both stdout and stderr
  • Consistent with existing workflow pattern

// Code node: Build Docker API curl command
return {
  json: {
    cmd: `curl -s --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'`
  }
};

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
| --- | --- | --- | --- |
| Direct docker.sock mount | Socket proxy with filtering | 2020-2021 (proxy released 2018) | Industry standard for limiting Docker API access in multi-container apps |
| Read-only socket mount | Filtered proxy | 2020+ | Read-only mount insufficient (doesn't prevent dangerous read operations like inspect revealing secrets) |
| docker.sock at 0666 perms | Proxy on isolated network | 2021+ | Network isolation prevents unauthorized containers from reaching socket |
| Pinning to old API versions | Current version, still pinned in the URL, relying on backward compat | API v1.53 (Jan 2026) | Some fields changed (Aliases), but endpoints remain compatible |

Deprecated/outdated:

  • Mounting /var/run/docker.sock as read-only for security: This does NOT prevent dangerous operations. A read-only mount still allows exec, inspect (which can leak environment variables with secrets), and other sensitive operations. Use a filtering proxy instead.

  • Using an unversioned API path: Always pin the API version in the URL path (e.g., /v1.47/). A request without a /vX.Y/ prefix is treated as the daemon's newest supported version, which may change behavior after Docker Engine updates.

  • Checking Aliases field for short container ID: In API v1.53+, this field only contains user-provided aliases. Use Id.substring(0, 12) or the new DNSNames field.

Open Questions

Things that couldn't be fully resolved:

  1. Unraid CA template network configuration

    • What we know: Unraid CA "dockersocket" template exists and provides tecnativa/docker-socket-proxy
    • What's unclear: Whether the CA template supports selecting custom Docker network during initial setup, or if docker network connect must be run post-deployment
    • Recommendation: Document both methods (CA template with manual network join, or docker run with --network flag). Verify via Unraid GUI during planning.
  2. n8n timeout behavior with unavailable proxy

    • What we know: curl supports --max-time flag for operation timeout, we want 5 second timeout
    • What's unclear: Whether n8n Execute Command node respects curl timeout or has its own timeout that could interfere
    • Recommendation: Set curl --max-time 5 and test in a dev workflow. If n8n's own timeout is longer, the curl timeout will trigger first (desired behavior).
  3. Proxy container restart order dependency

    • What we know: If proxy restarts, n8n curl commands will fail with connection refused until proxy is back up
    • What's unclear: Whether we should add depends_on or Docker restart policy coordination between n8n and proxy
    • Recommendation: Don't add orchestration. User's decision was "proxy managed by Unraid" — let Unraid handle restart order. n8n error messages will alert user if proxy is down.

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Secondary (verified with official source)

Tertiary (LOW confidence - WebSearch only)

  • Various community forum threads on docker-socket-proxy deployment patterns - useful for common pitfalls but not authoritative for configuration

Metadata

Confidence breakdown:

  • Standard stack: HIGH - tecnativa/docker-socket-proxy is documented standard, Docker API v1.53 is current official version
  • Architecture: HIGH - Environment variables documented in official README, curl patterns from Docker's official examples
  • Pitfalls: MEDIUM - Network DNS issue from Docker docs, POST=1 requirement from tecnativa README, 403 retry issue from observed n8n behavior patterns (LOW source but logical conclusion)

Research date: 2026-02-03
Valid until: 2026-03-03 (30 days) - Docker API stable, proxy project mature with infrequent breaking changes