docs: complete v1.1 research (4 researchers + synthesis)

Files:
- STACK.md: Socket proxy, n8n API, Telegram keyboards
- FEATURES.md: Table stakes, differentiators, MVP scope
- ARCHITECTURE.md: Integration points, data flow changes
- PITFALLS.md: Top 5 risks with prevention strategies
- SUMMARY.md: Executive summary, build order, confidence

Key findings:
- Stack: LinuxServer socket-proxy, HTTP Request nodes for keyboards
- Architecture: TCP curl migration (~15 nodes), new callback routes
- Critical pitfall: Socket proxy breaks existing curl commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Lucas Berger
2026-02-02 22:09:06 -05:00
parent ff289677ab
commit 811030cee4
5 changed files with 1614 additions and 0 deletions
@@ -0,0 +1,268 @@
# Research Summary: v1.1 n8n Integration & Polish
**Project:** Unraid Docker Manager
**Domain:** Telegram Bot Enhancement / Security Hardening
**Researched:** 2026-02-02
**Confidence:** MEDIUM-HIGH
## Executive Summary
The v1.1 milestone focuses on four areas: Docker socket security (critical), Telegram UX improvements (inline keyboards), n8n API access (development workflow), and Unraid update sync (nice-to-have). Research confirms all features are achievable with existing n8n capabilities and no new application dependencies beyond a Docker socket proxy container.
The recommended approach is security-first: deploy the Docker socket proxy before any workflow changes, then migrate existing curl commands to use TCP instead of Unix socket. This order minimizes risk and provides a clean foundation. Telegram inline keyboards require HTTP Request nodes due to n8n native node limitations with dynamic keyboards. The n8n API is enabled by default on self-hosted instances and requires only an API key for Claude Code access.
The key risk is breaking existing functionality during the socket proxy migration. All ~15 Execute Command nodes using `--unix-socket` must be updated before the direct socket mount is removed. The mitigation is incremental migration, with each command tested through the proxy while direct socket access is still available as a fallback. Unraid update sync has the lowest confidence: it works via file deletion but requires additional volume mounts.
---
## Stack Additions
| Component | Purpose | Why This Choice |
|-----------|---------|-----------------|
| **LinuxServer socket-proxy** | Docker socket security | HAProxy-based filtering, active maintenance, Unraid community familiarity |
| **n8n REST API** | Programmatic workflow management | Already enabled by default, no new dependencies |
| **HTTP Request nodes** | Dynamic Telegram keyboards | Workaround for n8n native node limitations with inline keyboards |
**No new application dependencies** - all solutions use existing n8n capabilities:
- HTTP Request node for Telegram API and Docker via proxy
- Execute Command node for Unraid file operations
- n8n public API for Claude Code workflow management
**Socket proxy environment variables (minimum required):**
```bash
CONTAINERS=1 # List/inspect containers
IMAGES=1 # Pull images for updates
POST=1 # Enable write operations
ALLOW_START=1 # Container start
ALLOW_STOP=1 # Container stop
ALLOW_RESTARTS=1 # Container restart/kill
```
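One way these variables might be passed to the proxy container is sketched below. The image reference `lscr.io/linuxserver/socket-proxy:latest`, the network name `dockernet`, and the `DOCKER` dry-run override are assumptions for illustration, not settings taken from the research files. No port is published, so the proxy is reachable only from containers on the same network.

```shell
# Sketch: start the socket proxy with the minimum permissions listed above.
# Assumptions: image reference, network name "dockernet", DOCKER override for dry runs.
DOCKER="${DOCKER:-docker}"

start_socket_proxy() {
  network="${1:-dockernet}"   # hypothetical internal network shared with n8n
  "$DOCKER" run -d \
    --name socket-proxy \
    --network "$network" \
    -e CONTAINERS=1 \
    -e IMAGES=1 \
    -e POST=1 \
    -e ALLOW_START=1 \
    -e ALLOW_STOP=1 \
    -e ALLOW_RESTARTS=1 \
    -v /var/run/docker.sock:/var/run/docker.sock:ro \
    lscr.io/linuxserver/socket-proxy:latest
}
```

On Unraid the same settings would more likely be entered through a container template; the function form only makes the full set of flags visible in one place.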
---
## Feature Table Stakes
### Must Have
| Feature | Rationale |
|---------|-----------|
| **Docker socket proxy** | Security requirement - remove root-equivalent access from internet-exposed n8n |
| **Inline keyboard callback handling** | Already partially implemented; must complete for button responses |
| **answerCallbackQuery responses** | Telegram requirement - loading spinner persists up to 1 minute without it |
| **n8n API key setup** | Enables programmatic workflow management for Claude Code |
| **Batch update with progress** | Core batch use case - `update plex sonarr radarr` with per-container feedback |
### Should Have
| Feature | Rationale |
|---------|-----------|
| **Container action buttons** | Tap-to-action UX improvement over typing commands |
| **Confirmation dialogs** | "Are you sure?" before stop/restart/update prevents accidents |
| **"Update all" command** | Convenience feature with mandatory confirmation |
| **Unraid status file sync** | Clear "update available" badge after bot updates (file deletion method) |
### Defer to v2+
| Feature | Rationale |
|---------|-----------|
| **Unraid GraphQL API integration** | Requires Unraid 7.2+, adds complexity |
| **MCP integration for n8n** | Unofficial server exists but adds significant complexity |
| **Cancel mid-batch** | Requires state management, which adds complexity |
| **Pagination for containers** | Only needed if more than 10 containers is common |
---
## Architecture Changes
### Target Architecture
```
User -> Telegram -> n8n webhook -> curl -> socket-proxy:2375 -> docker.sock -> Docker Engine
                         ^
                         |
Claude Code -> n8n API --+
```
### Key Integration Points
| Component | Change | Impact |
|-----------|--------|--------|
| **socket-proxy container** | NEW - sidecar on internal network | Infrastructure |
| **n8n container** | MODIFY - `DOCKER_HOST=tcp://socket-proxy:2375`, remove socket mount | Medium |
| **Execute Command nodes** | MODIFY - change curl from `--unix-socket` to TCP | ~15 nodes |
| **Route Callback switch** | MODIFY - add new callback types for keyboards | Low |
| **HTTP Request nodes** | NEW - for dynamic inline keyboard generation | Medium |
### Curl Migration Pattern
```
FROM: curl -s --unix-socket /var/run/docker.sock 'http://localhost/v1.47/...'
TO: curl -s 'http://socket-proxy:2375/v1.47/...'
```
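The FROM/TO pattern above can be applied mechanically when reviewing exported commands. The following is a minimal sketch, assuming the commands are available as plain text; `migrate_curl` and the `PROXY_URL` variable are hypothetical names, not part of the existing workflow.

```shell
# Sketch: rewrite unix-socket curl commands to the TCP proxy form.
# PROXY_URL defaults to the "TO" endpoint above; migrate_curl is a hypothetical helper.
PROXY_URL="${PROXY_URL:-http://socket-proxy:2375}"

migrate_curl() {
  # Reads commands on stdin and writes the rewritten commands to stdout.
  sed -E \
    -e 's#[[:space:]]+--unix-socket[[:space:]]+/var/run/docker\.sock##' \
    -e "s#http://localhost/#${PROXY_URL}/#"
}
```

For example, piping the FROM line through `migrate_curl` prints the TO line, while commands that never referenced the Unix socket pass through unchanged.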
---
## Top Risks
### 1. Socket Proxy Breaks Existing Commands (CRITICAL)
**Risk:** All Docker commands fail after migration
**Prevention:**
1. Deploy socket-proxy first without removing direct socket
2. Update curl commands one-by-one to use proxy
3. Test each command via proxy before removing direct socket
4. Maintain rollback capability throughout
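Step 3 could be supported by a small probe that classifies what the proxy returns for each API path. This is a sketch under stated assumptions: the `CURL` override exists only for dry runs, the function names are hypothetical, and it assumes the proxy answers HTTP 403 for endpoints it has not been configured to allow.

```shell
# Sketch: probe one Docker API path through the proxy and classify the result.
# Assumptions: CURL override for dry runs; the proxy returns 403 for denied endpoints.
CURL="${CURL:-curl}"
PROXY_URL="${PROXY_URL:-http://socket-proxy:2375}"

classify_status() {
  case "$1" in
    2??) echo "ok" ;;
    403) echo "blocked-by-proxy" ;;
    000) echo "proxy-unreachable" ;;   # curl reports 000 when no response arrived
    *)   echo "unexpected-$1" ;;
  esac
}

check_via_proxy() {
  # $1 is an API path such as /v1.47/containers/json
  code=$("$CURL" -s -o /dev/null -w '%{http_code}' "${PROXY_URL}$1")
  printf '%s %s\n' "$1" "$(classify_status "$code")"
}
```

Running `check_via_proxy` for every path the workflow uses, before the direct socket mount is removed, gives a quick list of operations that still need a permission enabled.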
### 2. n8n Native Telegram Node Rejects Dynamic Keyboards (HIGH)
**Risk:** Error "The value is not supported!" when using expressions in keyboard fields
**Prevention:** Use HTTP Request node to call Telegram Bot API directly for any dynamic keyboard. Keep native node for simple text responses only.
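The direct Bot API call that the HTTP Request node would make can be sketched with curl. The names here are assumptions for illustration: `TELEGRAM_BOT_TOKEN`, the `CURL` override, the `s:` action code, and both function names. The sketch also assumes container names need no JSON escaping.

```shell
# Sketch: build a dynamic inline keyboard and send it with a direct Bot API call.
# Assumptions: TELEGRAM_BOT_TOKEN env var, CURL override, "s:" action code,
# and container names that need no JSON escaping.
CURL="${CURL:-curl}"

build_keyboard() {
  # Args: alternating container name and container ID pairs.
  rows=""
  sep=""
  while [ "$#" -ge 2 ]; do
    short_id=$(printf '%.8s' "$2")   # first 8 characters of the container ID
    rows="${rows}${sep}[{\"text\":\"$1\",\"callback_data\":\"s:${short_id}\"}]"
    sep=","
    shift 2
  done
  printf '{"inline_keyboard":[%s]}\n' "$rows"
}

send_keyboard() {
  # $1 chat ID, $2 message text, remaining args are name/ID pairs.
  chat_id="$1"
  text="$2"
  shift 2
  "$CURL" -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
    --data-urlencode "chat_id=${chat_id}" \
    --data-urlencode "text=${text}" \
    --data-urlencode "reply_markup=$(build_keyboard "$@")"
}
```

In n8n, the JSON produced by `build_keyboard` corresponds to the `reply_markup` value that an expression would generate in the HTTP Request node body.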
### 3. callback_data Exceeds 64 Bytes (MEDIUM)
**Risk:** Buttons silently fail when callback_data is too long
**Prevention:** Use short codes: `s:abc12345` (action:container_id_prefix) instead of full names. Map back via container ID lookup.
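The short-code scheme might look like the sketch below. The function names and the 8-character prefix length are assumptions; the byte check reflects the 64-byte limit named above.

```shell
# Sketch: short callback_data codes in the action:container_id_prefix form.
# Function names and the 8-character prefix length are assumptions.
encode_callback() {
  # $1 action code (e.g. "s"), $2 full container ID
  data="$1:$(printf '%.8s' "$2")"
  # The limit is in bytes, not characters, so count bytes explicitly.
  bytes=$(( $(printf '%s' "$data" | wc -c) ))
  if [ "$bytes" -gt 64 ]; then
    echo "callback_data is $bytes bytes (limit 64)" >&2
    return 1
  fi
  printf '%s\n' "$data"
}

resolve_callback() {
  # $1 callback_data; full container IDs arrive on stdin, one per line.
  # Prints "<action> <full id>" for the first ID that starts with the prefix.
  action="${1%%:*}"
  prefix="${1#*:}"
  while IFS= read -r id; do
    case "$id" in
      "$prefix"*) printf '%s %s\n' "$action" "$id"; return 0 ;;
    esac
  done
  return 1
}
```

`resolve_callback` expects the current list of container IDs on stdin, so the lookup always runs against live data rather than a stored mapping.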
### 4. n8n 2.0 Breaking Changes (MEDIUM)
**Risk:** Execute Command disabled by default, env vars blocked, Save/Publish separation
**Prevention:** Check n8n version before starting. If 2.0+, verify Execute Command is enabled in settings. Don't upgrade n8n during this milestone.
### 5. Unraid Update Badge Never Clears (LOW impact but HIGH likelihood)
**Risk:** Unraid UI shows "update available" even after bot updates container
**Prevention:** Delete `/var/lib/docker/unraid-update-status.json` after a successful bot update. Document that the user may need to click "Check for Updates" in the Unraid UI.
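A minimal sketch of the deletion step, assuming the status file is reachable at the same path from wherever the command runs (which depends on the volume mount still to be verified). `STATUS_FILE` and `clear_update_status` are hypothetical names.

```shell
# Sketch: clear Unraid's cached update status after a successful bot update.
# Assumption: the status file is reachable at this path from where the command runs.
STATUS_FILE="${STATUS_FILE:-/var/lib/docker/unraid-update-status.json}"

clear_update_status() {
  # rm -f succeeds even when the file is already gone, so repeated calls are safe.
  rm -f -- "$STATUS_FILE" && echo "status file cleared: $STATUS_FILE"
}
```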
---
## Recommended Build Order
Based on dependencies and risk mitigation:
### Phase 1: Socket Security Foundation
**Delivers:** Secure Docker socket access via proxy
**What:**
1. Deploy socket-proxy container on internal network
2. Configure minimum required permissions
3. Migrate all curl commands to use TCP endpoint
4. Test all existing functionality
5. Remove direct socket mount from n8n
**Rationale:** Security is the primary v1.1 goal. Must complete before adding any new features to avoid compounding risk.
**Pitfalls to avoid:**
- Proxy port exposed publicly (keep internal only)
- Insufficient permissions (test each operation)
- Breaking existing curl commands (migrate incrementally)
### Phase 2: n8n API Access
**Delivers:** Claude Code can read/update/test workflows programmatically
**What:**
1. Create API key in n8n Settings
2. Document API endpoints for workflow management
3. Test basic operations (list, get, update workflow)
**Rationale:** Low risk, high value. Enables faster iteration for subsequent phases.
**Pitfalls to avoid:**
- API key committed to repository (use environment/secrets)
- Workflow ID hardcoded (query API to discover)
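Both pitfalls can be avoided with a thin wrapper around the n8n public API. This is a sketch, assuming `N8N_BASE_URL` and `N8N_API_KEY` are supplied through the environment; those variable names, the `CURL` override, and the function names are illustrative.

```shell
# Sketch: discover workflow IDs through the n8n public API instead of hardcoding them.
# Assumptions: N8N_BASE_URL and N8N_API_KEY come from the environment; CURL is overridable.
CURL="${CURL:-curl}"

n8n_api() {
  # $1 is a path under /api/v1, e.g. /workflows
  : "${N8N_API_KEY:?set N8N_API_KEY in the environment, not in the repository}"
  "$CURL" -s \
    -H "X-N8N-API-KEY: ${N8N_API_KEY}" \
    -H "Accept: application/json" \
    "${N8N_BASE_URL:-http://localhost:5678}/api/v1$1"
}

list_workflows() { n8n_api /workflows; }
get_workflow()   { n8n_api "/workflows/$1"; }   # $1 is an ID taken from list_workflows
```

Calling `list_workflows` first and reading the ID from its response keeps workflow IDs out of scripts and documentation.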
### Phase 3: Inline Keyboard Infrastructure
**Delivers:** Foundation for button-based UX
**What:**
1. HTTP Request node pattern for dynamic keyboards
2. Callback routing for new action types
3. answerCallbackQuery integration
4. Short callback_data encoding scheme
**Rationale:** Foundation needed before adding specific button features.
**Pitfalls to avoid:**
- Using native Telegram node for keyboards (use HTTP Request)
- callback_data exceeding 64 bytes (use short codes)
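The answerCallbackQuery step is a single Bot API call made as soon as a button press arrives. A minimal sketch follows; `TELEGRAM_BOT_TOKEN`, the `CURL` override, and `answer_callback` are assumed names for illustration.

```shell
# Sketch: acknowledge a button press so the client stops showing its loading state.
# Assumptions: TELEGRAM_BOT_TOKEN env var, CURL override for dry runs.
CURL="${CURL:-curl}"

answer_callback() {
  # $1 is callback_query.id from the incoming update, $2 is optional toast text.
  "$CURL" -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/answerCallbackQuery" \
    --data-urlencode "callback_query_id=$1" \
    --data-urlencode "text=${2:-}"
}
```

Sending the acknowledgement before starting a slow Docker operation keeps the button responsive even when the action itself takes a while.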
### Phase 4: UX Improvements
**Delivers:** Button-based container management
**What:**
1. Container selection keyboards from commands
2. Confirmation dialogs for dangerous actions
3. Message editing for progress/results
4. Batch update with progress feedback
**Rationale:** User-facing improvements built on Phase 3 foundation.
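Per-container progress for a batch command could be produced along these lines. This is a sketch only: `update_container` is a hypothetical hook standing in for the real update steps, and the message wording is illustrative.

```shell
# Sketch: per-container progress lines for a batch command.
# update_container is a hypothetical hook; replace its body with the real update steps.
update_container() { return 0; }

batch_update() {
  # $1 is the full command text, e.g. "update plex sonarr radarr"
  set -f                      # no glob expansion while word-splitting user input
  set -- $1
  set +f
  [ "$#" -gt 0 ] && shift     # drop the leading "update"
  total=$#
  i=0
  failed=0
  for name in "$@"; do
    i=$((i + 1))
    if update_container "$name"; then
      printf '[%d/%d] %s: updated\n' "$i" "$total" "$name"
    else
      printf '[%d/%d] %s: FAILED\n' "$i" "$total" "$name"
      failed=$((failed + 1))
    fi
  done
  printf 'Done: %d updated, %d failed\n' "$((total - failed))" "$failed"
}
```

In the bot, each printed line would correspond to one edit of the progress message, and a failure on one container does not stop the rest of the batch.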
### Phase 5: Unraid Sync (Optional)
**Delivers:** Clear update badges after bot updates
**What:**
1. Add volume mount for status file access
2. Delete status file after successful updates
3. Document user step to refresh Unraid UI
**Rationale:** Lowest-confidence phase. May require additional host access. Defer if other phases take longer than expected.
---
## Open Questions
1. **Socket proxy network:** Should socket-proxy be on a dedicated internal network or share n8n's network? (Recommendation: dedicated internal network)
2. **n8n API exposure:** Should n8n API be accessible only via Tailscale/VPN or also on LAN? (Recommendation: Tailscale only for security)
3. **Unraid status file access:** Does n8n already have `/var/lib/docker` mounted, or is a new volume mount needed? (Needs verification)
4. **Batch size limits:** Should "update all" have a maximum container limit or always require confirmation? (Recommendation: always confirm, no limit)
---
## Ready for Requirements
Research is complete. All four research files provide sufficient detail for roadmap creation:
- **STACK.md:** Socket proxy configuration, n8n API setup, Telegram keyboard patterns
- **FEATURES.md:** Table stakes vs differentiators, complexity estimates, MVP recommendations
- **ARCHITECTURE.md:** Integration points, data flow changes, component modifications
- **PITFALLS.md:** Top 5 risks ranked, prevention strategies, phase-specific warnings
**Recommended phase count:** 4-5 phases
**Estimated complexity:** Medium (infrastructure change + UX improvements)
**Confidence for planning:** HIGH for Phases 1-4, MEDIUM for Phase 5 (Unraid sync)
---
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | Official docs for socket-proxy and n8n API verified |
| Features | MEDIUM-HIGH | Telegram API well-documented; n8n limitations confirmed via GitHub issues |
| Architecture | HIGH | Clear integration points, existing workflow well understood |
| Pitfalls | MEDIUM-HIGH | Based on official docs + community experience; Unraid behavior forum-confirmed |
**Overall confidence:** MEDIUM-HIGH
### Gaps to Address
1. **n8n version check** - Verify current n8n version before starting (2.0 has breaking changes)
2. **Unraid volume mounts** - Verify existing mounts or plan for new ones
3. **Telegram keyboard testing** - HTTP Request pattern needs validation with actual workflow
---
## Sources
### Primary (HIGH confidence)
- [n8n API Documentation](https://docs.n8n.io/api/) - Authentication, endpoints
- [LinuxServer socket-proxy](https://docs.linuxserver.io/images/docker-socket-proxy/) - Configuration options
- [Telegram Bot API](https://core.telegram.org/bots/api) - Inline keyboards, callback queries
### Secondary (MEDIUM confidence)
- [n8n GitHub Issue #19955](https://github.com/n8n-io/n8n/issues/19955) - Native node keyboard limitations
- [Tecnativa docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy) - Alternative proxy option
### Tertiary (LOW confidence)
- [Unraid Forums](https://forums.unraid.net/bug-reports/stable-releases/regression-incorrect-docker-update-notification-r2807/) - Update badge behavior workarounds
---
*Research completed: 2026-02-02*
*Ready for roadmap: yes*