# Pitfalls Research: v1.1

**Project:** Unraid Docker Manager
**Milestone:** v1.1 - n8n Integration & Polish
**Researched:** 2026-02-02
**Confidence:** MEDIUM-HIGH (verified with official docs where possible)

## Context

This research identifies pitfalls specific to **adding** these features to an existing, working system:

- n8n API access (programmatic workflow read/update/test/logs)
- Docker socket proxy (security hardening)
- Telegram inline keyboards (UX improvements)
- Unraid update sync (clear the "update available" badge)

**Risk focus:** Breaking existing functionality while adding new features.

---

## n8n API Access Pitfalls

| Pitfall | Warning Signs | Prevention | Phase |
|---------|---------------|------------|-------|
| **API key with full access** | API key created without scopes; all workflows accessible | Enterprise: use scoped API keys (read-only for Claude Code initially). Non-enterprise: accept the risk and rotate keys every 6-12 months | API Setup |
| **Missing X-N8N-API-KEY header** | 401 Unauthorized errors on all API calls | Store the API key in the Claude Code MCP config; always send it as the `X-N8N-API-KEY` header, not as a Bearer token | API Setup |
| **Workflow ID mismatch after import** | API calls return 404; workflow actions fail | Workflow IDs change on import; query `/api/v1/workflows` first to get current IDs instead of hardcoding them | API Setup |
| **Editing active workflow via API** | Production workflow changes unexpectedly; users see partial updates | n8n 2.0: Save and Publish are separate actions. Use the API for reads only; publish manually via the UI | API Setup |
| **N8N_BLOCK_ENV_ACCESS_IN_NODE default** | Code nodes can't access env vars; lookups return undefined | n8n 2.0+ blocks env vars by default. Use the credentials system instead, or explicitly set `N8N_BLOCK_ENV_ACCESS_IN_NODE=false` | API Setup |
| **API not enabled on instance** | Connection refused on /api/v1 endpoints | Self-hosted: API is available by default. Cloud trial: API not available. Verify with `curl http://localhost:5678/api/v1/workflows` | API Setup |
| **Rate limiting on rapid API calls** | 429 errors when reading a workflow repeatedly | Add a delay between API calls (1-2 seconds); cache workflow data that doesn't change frequently | API Usage |

**Sources:**

- [n8n API Authentication](https://docs.n8n.io/api/authentication/)
- [n8n API Reference](https://docs.n8n.io/api/)
- [n8n v2.0 Breaking Changes](https://docs.n8n.io/2-0-breaking-changes/)

---

## Docker Socket Security Pitfalls

| Pitfall | Warning Signs | Prevention | Phase |
|---------|---------------|------------|-------|
| **Proxy exposes POST by default** | Container can create/delete containers; security scan flags it | Set `POST=0` on the socket proxy when only read access is needed; most read operations work with GET only. This project needs write actions, so see the minimum configuration below | Socket Proxy |
| **Using `--privileged` unnecessarily** | Security audit fails; container has excessive permissions | Remove the `--privileged` flag; the Tecnativa proxy works without it on standard Docker | Socket Proxy |
| **Outdated socket proxy image** | Using the `latest` tag, which is 3+ years old | Pin to a specific version: `tecnativa/docker-socket-proxy:0.2.0`, or use `linuxserver/socket-proxy` | Socket Proxy |
| **Proxy port exposed publicly** | Port 2375 accessible from the network; security scan fails | Never expose the proxy port; run it on an internal Docker network only | Socket Proxy |
| **Insufficient permissions for n8n** | "Permission denied" or empty responses from the Docker API | Enable the minimum required: `CONTAINERS=1`, plus `ALLOW_START=1`, `ALLOW_STOP=1`, `ALLOW_RESTARTS=1` for actions | Socket Proxy |
| **Breaking existing curl commands** | Existing workflow fails after adding the proxy; commands time out | The socket proxy uses TCP, not a Unix socket. Update curl commands: `curl http://socket-proxy:2375/...` instead of `--unix-socket` | Socket Proxy |
| **Network isolation breaks connectivity** | n8n can't reach the proxy; "connection refused" errors | Both containers must be on the same Docker network; verify with `docker network inspect` | Socket Proxy |
| **Permissions too restrictive** | Container list works but start/stop fails | Action endpoints must be enabled explicitly: `ALLOW_START=1`, `ALLOW_STOP=1`, `ALLOW_RESTARTS=1` (separate from `CONTAINERS=1`) | Socket Proxy |
| **Missing INFO or VERSION permissions** | Some Docker API calls fail unexpectedly | `VERSION=1` and `PING=1` are enabled by default; `INFO=1` may be needed for system queries | Socket Proxy |

**Minimum safe configuration for this project:**

```yaml
environment:
  - CONTAINERS=1      # Read container info
  - ALLOW_START=1     # Start containers
  - ALLOW_STOP=1      # Stop containers
  - ALLOW_RESTARTS=1  # Restart containers
  - IMAGES=1          # Pull images (for updates)
  - POST=1            # Required for start/stop/restart actions
  - NETWORKS=0        # Not needed
  - VOLUMES=0         # Not needed
  - BUILD=0           # Not needed
  - COMMIT=0          # Not needed
  - CONFIGS=0         # Not needed
  - SECRETS=0         # Security critical - keep disabled
  - EXEC=0            # Security critical - keep disabled
  - AUTH=0            # Security critical - keep disabled
```

**Sources:**

- [Tecnativa docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy)
- [LinuxServer socket-proxy](https://docs.linuxserver.io/images/docker-socket-proxy/)
- [Docker Community Forums - Socket Proxy Security](https://forums.docker.com/t/does-a-docker-socket-proxy-improve-security/136305)

---

## Telegram Keyboard Pitfalls

| Pitfall | Warning Signs | Prevention | Phase |
|---------|---------------|------------|-------|
| **Native node rejects dynamic keyboards** | Error: "The value '[[...]]' is not supported!" | Use the HTTP Request node for inline keyboards instead of the native Telegram node; this is a known n8n limitation | Keyboards |
| **callback_data exceeds 64 bytes** | Buttons don't respond; no callback_query received; 400 BUTTON_DATA_INVALID | Use short codes: `s:plex`, not `start_container:plex-media-server`. Hash long names to 8-char IDs | Keyboards |
| **Callback auth path missing** | Keyboard clicks ignored; no response to button press | The existing workflow already handles callback_query (lines 56-74 in the workflow). Ensure new keyboards use the same auth flow | Keyboards |
| **Multiple additional fields ignored** | Button has both callback_data and URL; only the URL works | n8n Telegram node limitation: both can't be used together. Choose one per button: either an action (callback) or a link (URL) | Keyboards |
| **Keyboard flickers on every message** | Visual glitches; keyboard re-renders constantly | Send `reply_markup` only on /start or menu requests; omit it from action responses (the keyboard persists) | Keyboards |
| **Inline vs Reply keyboard confusion** | Wrong keyboard type appears; buttons don't trigger callbacks | Inline keyboards (InlineKeyboardMarkup) are for callbacks; Reply keyboards (ReplyKeyboardMarkup) are for persistent menus. Use inline keyboards for container actions | Keyboards |
| **answerCallbackQuery not called** | "Loading..." spinner persists after button click; Telegram shows a timeout | Call `answerCallbackQuery` within 10 seconds of receiving the callback_query, even if only to acknowledge it | Keyboards |
| **Button layout exceeds limits** | Buttons don't appear; API error | Bot API 7.0: max 100 buttons total per message. For container lists, paginate or limit to 8-10 buttons | Keyboards |

**Recommended keyboard structure for container actions:**

```javascript
// Short callback_data pattern: action:container_short_id
// Example: "s:abc12345" for start, "x:abc12345" for stop
// Assumes containerId already holds the full Docker container ID.
const shortId = containerId.slice(0, 8);

const replyMarkup = {
  inline_keyboard: [
    [
      { text: "Start", callback_data: "s:" + shortId },
      { text: "Stop", callback_data: "x:" + shortId }
    ],
    [
      { text: "Restart", callback_data: "r:" + shortId },
      { text: "Logs", callback_data: "l:" + shortId }
    ]
  ]
};
```

**Sources:**

- [n8n GitHub Issue #19955 - Inline Keyboard Expression](https://github.com/n8n-io/n8n/issues/19955)
- [n8n Telegram Callback Operations](https://docs.n8n.io/integrations/builtin/app-nodes/n8n-nodes-base.telegram/callback-operations/)
- [Telegram Bot API - InlineKeyboardButton](https://core.telegram.org/bots/api#inlinekeyboardbutton)

---

## Unraid Integration Pitfalls

| Pitfall | Warning Signs | Prevention | Phase |
|---------|---------------|------------|-------|
| **Update badge persists after bot update** | Unraid UI shows "update available" after a container was updated via the bot | Delete `/var/lib/docker/unraid-update-status.json` to force a recheck, or trigger Unraid's check mechanism | Unraid Sync |
| **unraid-update-status.json format unknown** | Attempted to modify the file directly; broke the Unraid Docker tab | The file format is undocumented. Safest approach: delete the file and let Unraid regenerate it. Don't modify it directly | Unraid Sync |
| **Unraid only checks for new updates** | Badge never clears; Unraid only detects new updates, not resolved ones | This is known Unraid behavior. Deleting the status file is the current workaround per the Unraid forums | Unraid Sync |
| **Race condition on status file** | Status file deleted but badge still shows; file regenerated too fast | Wait for Unraid's update check interval, or manually trigger "Check for Updates" from the Unraid UI after deletion | Unraid Sync |
| **Bot can't access Unraid filesystem** | Permission denied when accessing /var/lib/docker/ | The n8n container needs an additional volume mount (`/var/lib/docker:/var/lib/docker`), or the command must be executed via SSH | Unraid Sync |
| **Breaking Unraid's Docker management** | Unraid Docker tab shows errors; containers appear in the wrong state | Never modify Unraid's internal files (in /boot/config/docker or /var/lib/docker) other than deleting update-status.json | Unraid Sync |

**Unraid sync approach (safest):**

1. Wait until the bot has successfully updated the container
2. Execute: `rm -f /var/lib/docker/unraid-update-status.json`
3. Unraid regenerates the file on the next "Check for Updates" or automatically

**Sources:**

- [Unraid Forums - Update notification regression](https://forums.unraid.net/bug-reports/stable-releases/regression-incorrect-docker-update-notification-r2807/)
- [Unraid Forums - Update badge persists](https://forums.unraid.net/topic/157820-docker-shows-update-ready-after-updating/)
- [Unraid Forums - Containers show update available incorrectly](https://forums.unraid.net/topic/142238-containers-show-update-available-even-when-it-is-up-to-date/)

---

## Integration Pitfalls (Breaking Existing Functionality)

| Pitfall | Warning Signs | Prevention | Phase |
|---------|---------------|------------|-------|
| **Socket proxy breaks existing curl** | All Docker commands fail after adding the proxy | The existing workflow uses `--unix-socket`. Migrate curl commands to the proxy's TCP endpoint: `http://socket-proxy:2375` | Socket Proxy |
| **Auth flow bypassed on new paths** | New keyboard handlers skip the user ID check; anyone can click buttons | The existing workflow has auth at lines 92-122 and 126-155. Copy the same pattern for any new callback handlers | All |
| **Workflow test vs production mismatch** | Works in test mode; fails when activated | Test with actual Telegram messages, not just manual execution. Production triggers differ from manual runs | All |
| **n8n 2.0 upgrade breaks workflow** | After an n8n update, the workflow stops working; nodes are missing | n8n 2.0 has breaking changes: Execute Command disabled by default, Start node removed, env vars blocked. Check the [migration guide](https://docs.n8n.io/2-0-breaking-changes/) before upgrading | All |
| **Credential reference breaks after import** | Imported workflow can't decrypt credentials; all nodes fail | n8n encrypts credentials with N8N_ENCRYPTION_KEY. After import, recreate credentials manually in the n8n UI | All |
| **HTTP Request node vs Execute Command** | HTTP Request can't reach the Docker socket; timeout errors | The HTTP Request node doesn't support Unix sockets. Keep using Execute Command with curl for the Docker API (or migrate to the TCP proxy) | Socket Proxy |
| **Parallel execution race conditions** | Two button clicks cause conflicting container states | Add debounce logic: ignore rapid duplicate callbacks within 2-3 seconds. Store the last action timestamp | Keyboards |
| **Error workflow doesn't fire** | Errors occur but no notification is sent; silent failures | Error Trigger only fires on automatic executions, not manual test runs. Test by triggering via Telegram with an intentional failure | All |
| **Save vs Publish confusion (n8n 2.0)** | Workflow was edited but production still uses the old version | n8n 2.0 separates Save (preserves edits) from Publish (updates production). Changes must be published explicitly | All |

**Pre-migration checklist:**

- [ ] Export current workflow JSON as backup
- [ ] Document current curl commands and endpoints
- [ ] Test that each existing command works after changes
- [ ] Verify auth flow applies to new handlers
- [ ] Test that error handling triggers correctly

**Sources:**

- [n8n v2.0 Breaking Changes](https://docs.n8n.io/2-0-breaking-changes/)
- [n8n Manual vs Production Executions](https://docs.n8n.io/workflows/executions/manual-partial-and-production-executions/)
- [n8n Community - Test vs Production Behavior](https://community.n8n.io/t/workflow-behaves-differently-in-test-vs-production/139973)

---

## Summary: Top 5 Risks

Ranked by likelihood x impact for this specific milestone:

### 1. Socket Proxy Breaks Existing Commands (HIGH likelihood, HIGH impact)

**Why:** The current workflow uses the `--unix-socket` flag. The socket proxy uses TCP. All existing functionality breaks if the commands are not migrated correctly.

**Prevention:**

1. Add the socket proxy container first (don't remove the direct socket yet)
2. Update curl commands one by one to use the proxy
3. Test that each command works via the proxy
4. Only then remove the direct socket mount

### 2. Native Telegram Node Rejects Dynamic Keyboards (HIGH likelihood, MEDIUM impact)

**Why:** n8n's native Telegram node has a known bug (Issue #19955) where it rejects array expressions for inline keyboards.

**Prevention:** Use the HTTP Request node to call the Telegram API directly for any dynamic keyboard generation. Keep the native node for simple text responses only.

### 3. Unraid Update Badge Never Clears (HIGH likelihood, LOW impact)

**Why:** Unraid doesn't check for "no longer outdated" containers, only for new updates. This is documented behavior, not a bug.

**Prevention:** Delete `/var/lib/docker/unraid-update-status.json` after a successful bot update. This requires an additional volume mount or SSH access.

### 4. n8n 2.0 Breaking Changes on Upgrade (MEDIUM likelihood, HIGH impact)

**Why:** n8n 2.0 (released Dec 2025) has multiple breaking changes: Execute Command disabled by default, env vars blocked, Save/Publish separation.

**Prevention:**

1. Check the current n8n version before starting
2. If upgrading, run the Migration Report first (Settings > Migration Report)
3. Don't upgrade n8n during this milestone unless necessary

### 5. callback_data Exceeds 64 Bytes (MEDIUM likelihood, MEDIUM impact)

**Why:** Container names can be long (e.g., `linuxserver-plex-media-server`). Combined with a verbose action prefix, they can push callback_data past the 64-byte limit.

**Prevention:** Use short action codes (`s:`, `x:`, `r:`, `l:`) and a container ID prefix (8 chars) instead of full names. Map back to the container via lookup.

---

## Phase Assignment Summary

| Phase | Pitfalls to Address |
|-------|---------------------|
| **API Setup** | API key scoping, header format, workflow ID discovery, env var blocking |
| **Socket Proxy** | Proxy configuration, permission settings, curl command migration, network setup |
| **Keyboards** | HTTP Request node for keyboards, callback_data limits, answerCallbackQuery |
| **Unraid Sync** | Update status file deletion, volume mount for access |
| **All Phases** | Auth flow consistency, test vs production, error workflow testing |

---

## Confidence Assessment

| Area | Confidence | Rationale |
|------|------------|-----------|
| n8n API | HIGH | Official docs verified, known breaking changes documented |
| Docker Socket Proxy | HIGH | Official Tecnativa docs, community best practices verified |
| Telegram Keyboards | MEDIUM-HIGH | n8n GitHub issues confirm limitations, Telegram API docs verified |
| Unraid Integration | MEDIUM | Forum posts describe workaround, but file format undocumented |
| Integration Risks | MEDIUM | Based on existing v1.0 codebase analysis and general patterns |

**Research date:** 2026-02-02
**Valid until:** 2026-03-02 (one month - n8n and Telegram APIs stable)
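
---

## Appendix: callback_data Helper Sketch

The short-code prevention for the 64-byte `callback_data` limit, together with the "map back via lookup" step, could be sketched roughly as follows. This is a minimal illustration in plain Node.js, not the workflow's actual code: `buildCallbackData`, `parseCallbackData`, and `resolveContainer` are hypothetical helper names, and it assumes the container list has the shape returned by the Docker API's container listing (objects with an `Id` field). Whether `Buffer` is available inside an n8n Code node on a given instance should be verified before reuse.

```javascript
// Hypothetical helpers; none of these names come from n8n or Telegram.
const MAX_CALLBACK_BYTES = 64; // Telegram limit for callback_data

// Short action codes from this document: s, x, r, l.
const ACTION_TO_CODE = new Map([
  ["start", "s"],
  ["stop", "x"],
  ["restart", "r"],
  ["logs", "l"],
]);
const CODE_TO_ACTION = new Map(
  [...ACTION_TO_CODE].map(([action, code]) => [code, action])
);

// Build "code:token" and fail loudly if the payload is too large.
function buildCallbackData(action, token) {
  const code = ACTION_TO_CODE.get(action);
  if (!code) throw new Error(`Unknown action: ${action}`);
  const data = `${code}:${token}`;
  const size = Buffer.byteLength(data, "utf8"); // bytes, not characters
  if (size > MAX_CALLBACK_BYTES) {
    throw new Error(`callback_data is ${size} bytes (max ${MAX_CALLBACK_BYTES})`);
  }
  return data;
}

// Split "code:token" back into an action name and token; null if malformed.
function parseCallbackData(data) {
  const sep = data.indexOf(":");
  if (sep === -1) return null;
  const action = CODE_TO_ACTION.get(data.slice(0, sep));
  return action ? { action, token: data.slice(sep + 1) } : null;
}

// Map an ID prefix back to exactly one container; null if none or ambiguous.
function resolveContainer(token, containers) {
  const matches = containers.filter((c) => c.Id.startsWith(token));
  return matches.length === 1 ? matches[0] : null;
}
```

A caller would pass the 8-character prefix, for example `buildCallbackData("start", containerId.slice(0, 8))`, and run `parseCallbackData` followed by `resolveContainer` when the callback_query arrives. Returning `null` on an ambiguous prefix is a deliberate choice here, so a button press never acts on the wrong container.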