docs(07): research phase domain

Phase 07: socket-security
- Standard stack identified
- Architecture patterns documented
- Pitfalls catalogued

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Lucas Berger
Date: 2026-02-03 08:40:04 -05:00
Commit: 1432d4feb2 (parent: e17c5bf0d4)
# Phase 7: Socket Security - Research

**Researched:** 2026-02-03
**Domain:** Docker socket security proxy with tecnativa/docker-socket-proxy
**Confidence:** HIGH

## Summary
The tecnativa/docker-socket-proxy is an Alpine-based HAProxy container that filters Docker API requests based on environment variables. It prevents direct socket access by placing a configurable proxy between n8n and the Docker daemon. The proxy operates on TCP port 2375 and returns HTTP 403 Forbidden with "Request forbidden by administrative rules" for blocked endpoints.
The standard approach is to deploy the proxy on the same Docker network as n8n, configure environment variables to enable only required Docker API endpoints (CONTAINERS=1, IMAGES=1, POST=1, plus granular ALLOW_START, ALLOW_STOP, ALLOW_RESTARTS), and update all n8n workflow curl commands from `--unix-socket /var/run/docker.sock` to TCP calls against `http://docker-socket-proxy:2375`.
Docker API v1.53 is the current version (January 2026) but v1.47 (used in existing workflow) remains compatible. Container operations use POST to `/containers/{id}/start|stop|restart`, image pulls use POST to `/images/create?fromImage={image}`, and all endpoints accept both short and long container IDs.
**Primary recommendation:** Deploy tecnativa/docker-socket-proxy via the Unraid CA "dockersocket" template with a minimal configuration (CONTAINERS=1, IMAGES=1, POST=1, ALLOW_START=1, ALLOW_STOP=1, ALLOW_RESTARTS=1). Add it to n8n's Docker network, update workflow nodes to replace unix-socket curl with TCP curl to docker-socket-proxy:2375, and treat 403 responses as immediate failures without retry.
## Standard Stack
The established solution for Docker socket security proxying:
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| tecnativa/docker-socket-proxy | latest | Docker API proxy with HAProxy filtering | Industry standard for limiting socket access, used by Traefik/Portainer integrations, actively maintained |
| Docker Engine API | v1.53 (current), v1.47 (compatible) | RESTful container/image operations | Official Docker API, backward compatible across minor versions |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| HAProxy | Alpine-based (embedded) | HTTP proxy filtering | Embedded in tecnativa proxy, no separate deployment |
| Docker custom bridge network | Built-in | Network isolation | When multiple containers need inter-container communication |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| tecnativa/docker-socket-proxy | wollomatic/socket-proxy (Go-based) | Go version is lighter but less widely adopted, tecnativa has broader community usage |
| tecnativa/docker-socket-proxy | linuxserver/socket-proxy | LinuxServer.io fork with same functionality but different maintenance cadence |
| TCP proxy | Read-only socket mount | Read-only mount does NOT prevent dangerous operations, only makes them harder to exploit |
**Installation:**

Via Unraid Community Apps (recommended):
1. Search for the "dockersocket" template
2. Install tecnativa/docker-socket-proxy:latest
3. Configure environment variables (see Architecture Patterns below)
4. Add the container to the same Docker network as n8n

Manual Docker deployment:
```bash
# Binding to 127.0.0.1 exposes the proxy for host-side testing only;
# omit the -p line entirely when n8n reaches the proxy over a shared
# Docker network (see Anti-Patterns below).
docker run -d \
  --name docker-socket-proxy \
  --privileged \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -p 127.0.0.1:2375:2375 \
  -e CONTAINERS=1 \
  -e IMAGES=1 \
  -e POST=1 \
  -e ALLOW_START=1 \
  -e ALLOW_STOP=1 \
  -e ALLOW_RESTARTS=1 \
  tecnativa/docker-socket-proxy:latest
```
## Architecture Patterns
### Current vs Future Pattern
**Current (Phase 6 - Direct Socket):**
```
n8n container ---mount---> /var/run/docker.sock (host) ---> Docker Engine
```
**Future (Phase 7 - Proxy):**
```
n8n container ---TCP---> socket-proxy container ---mount---> /var/run/docker.sock (host) ---> Docker Engine
```
### Proxy Environment Variable Configuration
Environment variables control API access (0=deny, 1=allow):
**Required for container operations:**
```bash
CONTAINERS=1 # Enable /containers/* endpoints
POST=1 # Enable POST/PUT/DELETE (read-only without this)
ALLOW_START=1 # Enable /containers/{id}/start
ALLOW_STOP=1 # Enable /containers/{id}/stop
ALLOW_RESTARTS=1 # Enable /containers/{id}/stop|restart|kill
```
**Required for image operations (update command):**
```bash
IMAGES=1 # Enable /images/* endpoints (includes pull)
```
**Blocked by default (do NOT enable):**
```bash
BUILD=0 # Blocks /build endpoint
COMMIT=0 # Blocks /commit endpoint
EXEC=0 # Blocks /containers/{id}/exec (command execution inside containers)
SECRETS=0 # Blocks /secrets endpoint
AUTH=0 # Blocks authentication endpoints
```
**Optional logging:**
```bash
LOG_LEVEL=info # Values: debug, info, notice, warning, err, crit, alert, emerg
```
### Pattern 1: Replacing Unix Socket Curl Commands
**What:** Convert all n8n Execute Command nodes from unix socket to TCP proxy calls.
**When to use:** Every Execute Command node that currently calls `--unix-socket /var/run/docker.sock`
**Search & Replace Pattern:**
```
FROM: curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/
TO: curl -s 'http://docker-socket-proxy:2375/v1.47/
```
**Example transformations:**
List containers:
```bash
# Before
curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/containers/json?all=true'
# After
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'
```
Start container:
```bash
# Before
curl -s -o /dev/null -w "%{http_code}" --unix-socket /var/run/docker.sock -X POST 'http://localhost/v1.47/containers/abc123/start'
# After
curl -s -o /dev/null -w "%{http_code}" -X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/start'
```
Pull image:
```bash
# Before
curl -s --unix-socket /var/run/docker.sock -X POST 'http://localhost/v1.47/images/create?fromImage=alpine'
# After
curl -s -X POST 'http://docker-socket-proxy:2375/v1.47/images/create?fromImage=alpine'
```
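Since the transformation is mechanical, the existing node commands can be rewritten with a single sed expression; a sketch covering the common form shown above (verify each rewritten command before saving the workflow):

```shell
# Rewrite a unix-socket curl command into the equivalent TCP call
# against the proxy: drop the --unix-socket flag, then swap the host.
convert_cmd() {
  printf '%s\n' "$1" | sed \
    -e "s|--unix-socket /var/run/docker.sock ||" \
    -e "s|http://localhost/v1|http://docker-socket-proxy:2375/v1|"
}

convert_cmd "curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/containers/json?all=true'"
```

The two substitutions are deliberately narrow so that unrelated parts of a command (flags, query strings, container IDs) pass through untouched.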
### Pattern 2: Error Handling for Blocked APIs
**What:** Distinguish between policy blocks (403), Docker errors (4xx/5xx), and connectivity failures (timeout/refused).
**When to use:** After every Docker API curl call in n8n workflow.
**Example (Code node after Execute Command):**
```javascript
// Input: $json.exitCode, $json.stdout, $json.stderr
const exitCode = $json.exitCode;
const stdout = $json.stdout || '';
const stderr = $json.stderr || '';
// 403 = blocked by proxy policy (do NOT retry)
if (stdout.includes('403') || stderr.includes('403 Forbidden')) {
throw new Error('This action is blocked by security policy');
}
// Connection failures (proxy unavailable)
if (stderr.includes('Connection refused') || stderr.includes('Could not resolve host')) {
throw new Error('Docker proxy unavailable — please check server');
}
// Timeout (allow ONE retry via workflow logic)
if (stderr.includes('timeout') || stderr.includes('timed out')) {
return {
json: {
retry: true,
error: 'Request timed out'
}
};
}
// Success or other Docker errors
return {
json: {
response: stdout,
exitCode: exitCode
}
};
```
### Pattern 3: Network Configuration (Unraid)
**What:** Add docker-socket-proxy to n8n's Docker network for inter-container communication.
**When to use:** During proxy deployment phase.
**Method 1 - Via Unraid CA template:**
1. Install dockersocket template
2. In template configuration, set "Network Type" to `Custom: br0` or existing custom network
3. Note: Unraid GUI may require manual network joining via `docker network connect`
**Method 2 - Via docker network connect (after deployment):**
```bash
# Find n8n's network
docker inspect n8n | grep NetworkMode
# Connect proxy to same network
docker network connect [network_name] docker-socket-proxy
# Verify
docker network inspect [network_name]
# Should show both 'n8n' and 'docker-socket-proxy' containers
```
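Method 2 can be scripted end to end; a sketch that assumes the standard `docker inspect` JSON layout shown above (`"NetworkMode": "<name>"`):

```shell
# Extract the network name from `docker inspect` output so the
# connect step can be scripted rather than read off by eye.
network_of() {
  sed -n 's/.*"NetworkMode": *"\([^"]*\)".*/\1/p'
}

# Usage (requires a running Docker daemon):
#   net=$(docker inspect n8n | network_of)
#   docker network connect "$net" docker-socket-proxy
```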
### Pattern 4: Timeout Configuration
**What:** Add timeout flag to curl commands to prevent indefinite hangs.
**When to use:** All proxy curl commands.
**Example:**
```bash
# 5 second timeout
curl -s --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'
```
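When `--max-time` (or any transport failure) fires, curl exits non-zero before any HTTP status exists; mapping the documented curl exit codes onto the error classes from Pattern 2 keeps the handling consistent. A sketch:

```shell
# Map curl exit codes to the error classes used in Pattern 2.
# Codes are from curl's documented exit-code list:
#   6 = could not resolve host, 7 = failed to connect, 28 = --max-time exceeded.
classify_curl_exit() {
  case "$1" in
    0)  echo ok ;;
    6)  echo unresolvable ;;  # wrong network / missing DNS (see Pitfall 1)
    7)  echo refused ;;       # proxy down: fail and alert the user
    28) echo timeout ;;       # transient: allow one retry
    *)  echo error ;;
  esac
}

# Usage:
#   curl -s --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'
#   classify_curl_exit $?
```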
### Anti-Patterns to Avoid
- **Exposing the proxy port to the host:** Never publish the proxy with `-p 2375:2375` — it should be reachable only from the Docker network (a `127.0.0.1` binding is tolerable for temporary host-side debugging)
- **Falling back to the direct socket:** Do NOT add fallback logic that uses `/var/run/docker.sock` when the proxy fails — failing closed is the correct behavior
- **Retrying 403 responses:** Blocked API calls should fail immediately; retrying wastes time and muddies the error reported to the user
- **Enabling POST globally without granular controls:** Even with POST=1, set ALLOW_START/ALLOW_STOP/ALLOW_RESTARTS for defense in depth
- **Mounting the socket read-write in the proxy:** Mount the socket as `:ro` — a read-only bind mount still allows writes through the socket itself; it only prevents the file from being replaced
## Don't Hand-Roll
Problems that look simple but have existing solutions:
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Docker API filtering | Custom Node.js proxy checking URLs | tecnativa/docker-socket-proxy | HAProxy handles edge cases (partial paths, encoded URLs, method verification), widely tested in production |
| Error response parsing | Regex matching on curl stderr | HTTP status code checks + known message patterns | Docker API responses are structured, proxy returns consistent 403 format |
| Container network discovery | Parsing docker inspect output | `docker network connect` command | Built-in Docker networking, handles bridge/overlay/macvlan correctly |
| Retry logic for timeouts | Sleep loops in bash | n8n's built-in "On Error" workflow + "Stop and Error" node | n8n provides workflow-level retry with backoff, cleaner than curl retry flags |
**Key insight:** Docker socket security is a solved problem with established tooling. The tecnativa proxy is the de facto standard used by Traefik, Portainer, and other Docker management tools. Custom filtering logic will miss edge cases and introduce vulnerabilities.
## Common Pitfalls
### Pitfall 1: Container Name vs Network Hostname Confusion
**What goes wrong:** n8n workflow calls `http://docker-socket-proxy:2375` but gets "Could not resolve host" error.
**Why it happens:** Container name does NOT automatically become a resolvable hostname unless containers are on the same user-defined network. Default bridge network doesn't provide DNS resolution.
**How to avoid:**
- Verify both n8n and docker-socket-proxy are on same custom network (not default bridge)
- Use `docker network inspect [network]` to confirm both containers listed
- Container name must match exactly (docker-socket-proxy, not dockersocket or socket-proxy)
**Warning signs:**
- `curl: (6) Could not resolve host: docker-socket-proxy`
- `getaddrinfo failed` in n8n logs
- Works with IP address but not hostname
### Pitfall 2: Forgetting POST=1 for Write Operations
**What goes wrong:** Container start/stop/restart commands return 403 Forbidden even though ALLOW_START=1 is set.
**Why it happens:** ALLOW_START/STOP/RESTARTS only work when POST=1 is also enabled. The granular ALLOW_* variables are subsets of the POST permission.
**How to avoid:**
- Always set POST=1 when enabling any write operations
- Think of POST as the "write operations enabled" master switch
- ALLOW_* variables then control which specific write operations within POST are permitted
**Warning signs:**
- Container operation endpoints return 403
- GET requests work but POST requests blocked
- Proxy logs show "Request forbidden by administrative rules" for POST
### Pitfall 3: HTTP 403 Treated Like Temporary Error
**What goes wrong:** Workflow retries blocked API calls multiple times, delaying error response to user by 15+ seconds.
**Why it happens:** Retry logic doesn't distinguish between "proxy unavailable" (retry makes sense) and "action blocked by policy" (retry pointless).
**How to avoid:**
- Check HTTP status code (403) or response body ("Request forbidden by administrative rules")
- Fail immediately on 403 without retry
- Only retry on timeout, connection refused, or 5xx errors
**Warning signs:**
- User reports slow error responses
- Telegram bot takes 10+ seconds to say "blocked by security policy"
- n8n execution logs show multiple identical curl attempts
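The classification can live in a small helper consumed by the Execute Command wrapper; a sketch keyed on the status captured with `-w "%{http_code}"`:

```shell
# Decide retry behavior from the HTTP status code alone:
# 403 = proxy policy block (fail fast, never retry), 5xx = retryable,
# 2xx and 304 = success (304 = container already in the desired state).
handle_status() {
  case "$1" in
    403)             echo fail-blocked ;;
    5[0-9][0-9])     echo retry ;;
    2[0-9][0-9]|304) echo ok ;;
    *)               echo fail-other ;;
  esac
}
```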
### Pitfall 4: API Version Mismatch Breaking Endpoints
**What goes wrong:** After updating Docker Engine, API v1.47 endpoints return 400 Bad Request or unexpected responses.
**Why it happens:** Docker API maintains backward compatibility, but new Docker versions may change defaults (e.g., v1.53 changed Aliases field behavior in container inspect).
**How to avoid:**
- Use current API version (v1.53) when possible
- If staying on v1.47, avoid relying on fields documented as "changed in v1.50+"
- Test workflow after Docker Engine updates
- Pin the API version in curl URLs (`/v1.47/` explicitly); an unversioned path is served with the daemon's newest API, which can change behavior after Engine updates
**Warning signs:**
- Workflows break after Unraid update
- Container operations return 400 instead of 200
- JSON response structure different than expected
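A quick guard after Engine updates is to compare the pinned version against the daemon's reported `ApiVersion` (returned by `GET /version`, which the proxy's documented defaults allow). A sketch using `sort -V` for natural version ordering:

```shell
# True when version $1 is less than or equal to version $2.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Usage (assumes the proxy exposes /version):
#   daemon=$(curl -s 'http://docker-socket-proxy:2375/version' \
#     | sed -n 's/.*"ApiVersion": *"\([^"]*\)".*/\1/p')
#   version_le 1.47 "$daemon" && echo "pinned version still supported"
```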
### Pitfall 5: Short Container IDs Breaking in API v1.53
**What goes wrong:** Code that parsed `Aliases` field to get short container ID gets empty array or wrong values.
**Why it happens:** API v1.53 (Docker Engine 29.2.0, Jan 2026) changed Aliases field to only show user-provided values, not auto-generated short IDs. Use `DNSNames` field instead.
**How to avoid:**
- Don't rely on Aliases field for short container IDs in new code
- Use `Id` field (returns full 64-char) and truncate in code if needed: `Id.substring(0, 12)`
- Or use new `DNSNames` field if on v1.53+
**Warning signs:**
- Container short ID extraction returns empty value
- Workflows break after updating to Docker Engine 29.x
- Code checking `Aliases[0]` gets unexpected value
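The safe extraction described above is a one-liner; a sketch:

```shell
# Derive the 12-character short ID from the full 64-character Id field
# instead of reading the Aliases array.
short_id() {
  printf '%s' "$1" | cut -c1-12
}
```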
## Code Examples
Verified patterns from official sources:
### List All Containers (GET)
```bash
# Source: https://docs.docker.com/engine/api/sdk/examples/
# After proxy deployment (no --unix-socket flag)
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'
```
### Start Container (POST)
```bash
# Source: https://docs.docker.com/engine/api/sdk/examples/
# Returns HTTP 204 on success, 304 if already started, 404 if not found, 500 on error
curl -s -o /dev/null -w "%{http_code}" \
-X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/start'
```
### Stop Container with Timeout (POST)
```bash
# Source: https://docs.docker.com/engine/api/sdk/examples/
# t=10 gives container 10 seconds to gracefully stop before SIGKILL
curl -s -o /dev/null -w "%{http_code}" \
-X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/stop?t=10'
```
### Restart Container (POST)
```bash
# Source: https://docs.docker.com/engine/api/sdk/examples/
# Combines stop + start, respects t parameter for graceful stop timeout
curl -s -o /dev/null -w "%{http_code}" \
-X POST 'http://docker-socket-proxy:2375/v1.47/containers/abc123/restart?t=10'
```
### Pull Image (POST)
```bash
# Source: https://docs.docker.com/engine/api/sdk/examples/
# fromImage parameter takes full image name with optional tag
# Returns a JSON progress stream; scan it for errorDetail, since curl's
# exit code reflects only transport success, not pull success
curl -s -X POST 'http://docker-socket-proxy:2375/v1.47/images/create?fromImage=alpine:latest'
```
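A failed pull can surface either as a non-2xx status or as `errorDetail` objects inside an otherwise successful stream, so checking curl's exit code is not enough; a minimal stream check (sketch):

```shell
# Read the pull progress stream on stdin; succeed (exit 0) when the
# stream contains an errorDetail object, i.e. the pull failed.
pull_failed() {
  grep -q '"errorDetail"'
}

# Usage:
#   if curl -s -X POST 'http://docker-socket-proxy:2375/v1.47/images/create?fromImage=alpine:latest' | pull_failed; then
#     echo "pull failed" >&2
#   fi
```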
### Inspect Container (GET)
```bash
# Source: Docker API reference
# Returns full container JSON including Config, State, NetworkSettings
curl -s 'http://docker-socket-proxy:2375/v1.47/containers/abc123/json'
```
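Pulling a single field such as `State.Status` out of the inspect JSON is normally a job for jq; where jq isn't available, a rough sed sketch works for the common case (fragile against layout changes — the field name is from the documented inspect schema):

```shell
# Extract the container state ("running", "exited", ...) from inspect
# JSON on stdin. jq -r '.State.Status' is the more robust alternative.
container_status() {
  sed -n 's/.*"Status": *"\([a-z]*\)".*/\1/p' | head -n1
}
```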
### n8n HTTP Request Node Configuration
For n8n workflows, use Execute Command node (not HTTP Request node) because:
- Execute Command can use shell timeout flags
- Easier to capture both stdout and stderr
- Consistent with existing workflow pattern
```javascript
// Code node: Build Docker API curl command
return {
json: {
cmd: `curl -s --max-time 5 'http://docker-socket-proxy:2375/v1.47/containers/json?all=true'`
}
};
```
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Direct docker.sock mount | Socket proxy with filtering | 2020-2021 (proxy released 2018) | Industry standard for limiting Docker API access in multi-container apps |
| Read-only socket mount | Filtered proxy | 2020+ | Read-only mount insufficient (doesn't prevent dangerous read operations like inspect revealing secrets) |
| docker.sock at 0666 perms | Proxy on isolated network | 2021+ | Network isolation prevents unauthorized containers from reaching socket |
| API version pinning | Latest version with backward compat | API v1.53 (Jan 2026) | Some fields changed (Aliases), but endpoints remain compatible |
**Deprecated/outdated:**
- **Mounting `/var/run/docker.sock` as read-only for security:** This does NOT prevent dangerous operations. A read-only mount still allows exec, inspect (which can leak environment variables with secrets), and other sensitive operations. Use a filtering proxy instead.
- **Using an unversioned API path:** Always pin the API version in the URL path (e.g., `/v1.47/`). Requests without a version prefix are served with the daemon's newest API, which may include breaking changes after Docker Engine updates.
- **Checking `Aliases` field for short container ID:** In API v1.53+, this field only contains user-provided aliases. Use `Id.substring(0, 12)` or the new `DNSNames` field.
## Open Questions
Things that couldn't be fully resolved:
1. **Unraid CA template network configuration**
- What we know: Unraid CA "dockersocket" template exists and provides tecnativa/docker-socket-proxy
- What's unclear: Whether the CA template supports selecting custom Docker network during initial setup, or if `docker network connect` must be run post-deployment
- Recommendation: Document both methods (CA template with manual network join, or docker run with --network flag). Verify via Unraid GUI during planning.
2. **n8n timeout behavior with unavailable proxy**
- What we know: curl supports `--max-time` flag for operation timeout, we want 5 second timeout
- What's unclear: Whether n8n Execute Command node respects curl timeout or has its own timeout that could interfere
- Recommendation: Set both curl `--max-time 5` and test in dev workflow. If n8n timeout is longer, curl timeout will trigger first (desired behavior).
3. **Proxy container restart order dependency**
- What we know: If proxy restarts, n8n curl commands will fail with connection refused until proxy is back up
- What's unclear: Whether we should add `depends_on` or Docker restart policy coordination between n8n and proxy
- Recommendation: Don't add orchestration. User's decision was "proxy managed by Unraid" — let Unraid handle restart order. n8n error messages will alert user if proxy is down.
## Sources
### Primary (HIGH confidence)
- [GitHub - Tecnativa/docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy) - Official README with environment variables, security model, configuration examples
- [Docker Engine API v1.53 Documentation](https://docs.docker.com/reference/api/engine/) - Official API reference, version history, current version (v1.53)
- [Docker API SDK Examples](https://docs.docker.com/engine/api/sdk/examples/) - Official curl examples for container start, stop, pull
- [Protect the Docker daemon socket | Docker Docs](https://docs.docker.com/engine/security/protect-access/) - Official security guidance on socket protection, TLS, authorization
### Secondary (MEDIUM confidence)
- [Docker Socket Proxy Security Best Practices - LinuxServer.io](https://docs.linuxserver.io/images/docker-socket-proxy/) - Community documentation on deployment patterns
- [Does a docker socket proxy improve security? - Docker Forums](https://forums.docker.com/t/does-a-docker-socket-proxy-improve-security/136305) - Community discussion confirming security benefits over read-only mount
- [Managing & customizing containers | Unraid Docs](https://docs.unraid.net/unraid-os/using-unraid-to/run-docker-containers/managing-and-customizing-containers/) - Official Unraid documentation on custom networks
- [n8n HTTP Request node documentation](https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.httprequest/) - Official n8n docs on timeout configuration
### Secondary (verified with official source)
- [Docker API v1.53 Engine version history](https://docs.docker.com/reference/api/engine/version-history/) - Verified API v1.53 current as of January 2026, Aliases field change documented
- [API usage examples | Portainer Documentation](https://docs.portainer.io/api/examples) - Verified container ID format (short/long both work) via official Portainer docs
### Tertiary (LOW confidence - WebSearch only)
- Various community forum threads on docker-socket-proxy deployment patterns - useful for common pitfalls but not authoritative for configuration
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH - tecnativa/docker-socket-proxy is documented standard, Docker API v1.53 is current official version
- Architecture: HIGH - Environment variables documented in official README, curl patterns from Docker's official examples
- Pitfalls: MEDIUM - Network DNS issue from Docker docs, POST=1 requirement from tecnativa README, 403 retry issue from observed n8n behavior patterns (LOW source but logical conclusion)
**Research date:** 2026-02-03
**Valid until:** 2026-03-03 (30 days) - Docker API stable, proxy project mature with infrequent breaking changes