# Features Research: v1.1

Domain: Telegram Bot for Docker Container Management
Researched: 2026-02-02
Confidence: MEDIUM-HIGH (WebSearch verified with official docs where available)
## Telegram Inline Keyboards

### Table Stakes
| Feature | Why Expected | Complexity | Dependencies |
|---|---|---|---|
| Callback button handling | Core inline keyboard functionality - buttons must trigger actions | Low | Telegram Trigger already handles callback_query |
| answerCallbackQuery response | Required by Telegram - clients show loading animation until answered (up to 1 minute) | Low | None |
| Edit message after button press | Standard pattern - update existing message rather than send new one to reduce clutter | Low | None |
| Container action buttons | Users expect tap-to-action for start/stop/restart without typing | Medium | Existing container matching logic |
| Status view with action buttons | Show container list with inline buttons for each container | Medium | Existing status command |
### Differentiators
| Feature | Value Proposition | Complexity | Dependencies |
|---|---|---|---|
| Confirmation dialogs for dangerous actions | "Are you sure?" before stop/restart/update prevents accidental actions | Low | None - edit message with Yes/No buttons |
| Contextual button removal | Remove buttons after action completes (prevents double-tap issues) | Low | None |
| Dynamic container list keyboards | Generate buttons based on actual running containers | Medium | Container listing logic |
| Progress indicators via message edit | Update message with "Updating..." then "Complete" states | Low | None |
| Pagination for many containers | "Next page" button when >8-10 containers | Medium | None |
### Anti-features
| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Reply keyboards for actions | Takes over user keyboard space, sends visible messages to chat | Use inline keyboards attached to bot messages |
| More than 5 buttons per row | Wraps poorly on mobile/desktop, breaks muscle memory | Max 3-4 buttons per row for container actions |
| Complex callback_data structures | 64-byte limit, easy to exceed with JSON | Use short action codes: start_plex, stop_sonarr |
| Buttons without feedback | Users think tap didn't work, tap again | Always answerCallbackQuery, even for errors |
| Auto-refreshing keyboards | High API traffic, rate limiting risk | Refresh on explicit user action only |
### Implementation Notes
Critical constraint: `callback_data` is limited to 64 bytes. Use short codes like `action:containername` rather than JSON structures.
n8n native node limitation: the Telegram node doesn't support dynamic inline keyboards well. The workaround is an HTTP Request node that calls the Telegram Bot API `sendMessage` method directly with a `reply_markup` parameter.
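As a sketch of that workaround, the snippet below builds the JSON body such an HTTP Request node might POST to `sendMessage`, generating one button row per container and keeping each `callback_data` under the 64-byte limit. The container names, `chat_id`, and action codes are placeholders, not values from this project.

```python
# Illustrative sketch of the JSON body an HTTP Request node would POST to
# https://api.telegram.org/bot<token>/sendMessage. Container names, chat_id,
# and action codes are placeholders.

ACTIONS = ("start", "stop", "restart")

def build_keyboard(containers):
    """One row of action buttons per container, with short callback_data codes."""
    rows = []
    for name in containers:
        row = []
        for action in ACTIONS:
            data = f"{action}:{name}"
            # Telegram rejects callback_data longer than 64 bytes
            if len(data.encode("utf-8")) > 64:
                raise ValueError(f"callback_data too long: {data}")
            row.append({"text": f"{action} {name}", "callback_data": data})
        rows.append(row)
    return {"inline_keyboard": rows}

def build_send_message(chat_id, text, containers):
    """sendMessage body; reply_markup can be nested directly in a JSON request body."""
    return {"chat_id": chat_id, "text": text,
            "reply_markup": build_keyboard(containers)}
```

Because the HTTP Request node sends a JSON body, `reply_markup` can be a nested object here rather than a separately serialized string.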
Pattern for confirmations:
- User taps "Stop plex"
- Edit message: "Stop plex container?" with [Yes] [Cancel] buttons
- User taps Yes -> perform action, edit message with result, remove buttons
- User taps Cancel -> edit message back to original state
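The confirmation pattern above boils down to two Bot API calls per tap: `answerCallbackQuery` (always, to stop the loading spinner) and `editMessageText`. The sketch below shows the request bodies one might build for each step; the field names follow the Bot API, while the action codes and container name are illustrative.

```python
# Illustrative sketch of the two Telegram API calls behind the confirmation
# pattern: answerCallbackQuery (always) plus editMessageText (per state).
# Action codes like "stop"/"confirm_stop"/"cancel" are assumptions.

def handle_callback(update):
    """Return (answerCallbackQuery body, editMessageText body) for one tap."""
    cq = update["callback_query"]
    action, _, name = cq["data"].partition(":")
    chat_id = cq["message"]["chat"]["id"]
    message_id = cq["message"]["message_id"]

    answer = {"callback_query_id": cq["id"]}  # always answer, stops the spinner

    if action == "stop":
        # First tap: ask for confirmation instead of acting immediately
        edit = {
            "chat_id": chat_id,
            "message_id": message_id,
            "text": f"Stop {name} container?",
            "reply_markup": {"inline_keyboard": [[
                {"text": "Yes", "callback_data": f"confirm_stop:{name}"},
                {"text": "Cancel", "callback_data": f"cancel:{name}"},
            ]]},
        }
    elif action == "confirm_stop":
        # Second tap: act, then omit reply_markup to remove the buttons
        edit = {"chat_id": chat_id, "message_id": message_id,
                "text": f"Stopped {name}."}
    else:  # cancel
        edit = {"chat_id": chat_id, "message_id": message_id,
                "text": f"{name}: no action taken."}
    return answer, edit
```

Leaving `reply_markup` out of the final edit is what removes the buttons and prevents double-taps.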
Sources:
- Telegram Bot Features (HIGH confidence)
- Telegram Bot API Buttons (HIGH confidence)
- n8n Telegram Callback Operations (HIGH confidence)
- n8n Community: Dynamic Inline Keyboard (MEDIUM confidence)
## Batch Operations

### Table Stakes
| Feature | Why Expected | Complexity | Dependencies |
|---|---|---|---|
| Update multiple specified containers | Core batch use case - update plex sonarr radarr | Medium | Existing update logic, loop handling |
| Sequential execution | Process one at a time to avoid resource contention | Low | None |
| Per-container status feedback | "Updated plex... Updated sonarr..." progress | Low | Existing message sending |
| Error handling per container | One failure shouldn't abort the batch | Low | Try-catch per iteration |
| Final summary message | "3 updated, 1 failed: jellyfin" | Low | Accumulator pattern |
### Differentiators
| Feature | Value Proposition | Complexity | Dependencies |
|---|---|---|---|
| "Update all" command | Single command to update everything (with confirmation) | Medium | Container listing |
| "Update all except X" | Exclude specific containers from batch | Medium | Exclusion pattern |
| Parallel status checks | Check which containers have updates available first | Medium | None |
| Batch operation confirmation | Show what will happen before doing it | Low | Keyboard buttons |
| Cancel mid-batch | Stop processing remaining containers | High | State management |
### Anti-features
| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Parallel container updates | Resource contention, disk I/O saturation, network bandwidth | Sequential with progress feedback |
| Silent batch operations | User thinks bot is frozen during long batch | Send progress message per container |
| Update without checking first | Wastes time on already-updated containers | Check for updates, report "3 containers have updates" |
| Auto-update on schedule | Out of scope - user might be using system when update causes downtime | User-initiated only, this is reactive tool |
### Implementation Notes
Existing update flow: Current implementation pulls image, recreates container, cleans up old image. Batch needs to wrap this in a loop.
Progress pattern:

```text
User: update all
Bot: Found 5 containers with updates. Update now? [Yes] [Cancel]
User: Yes
Bot: Updating plex (1/5)...
Bot: (edit) Updated plex. Updating sonarr (2/5)...
...
Bot: (edit) Batch complete: 5 updated, 0 failed.
```
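The loop behind that pattern can be sketched as below: sequential processing, per-container error isolation, and an accumulated summary. `update_container` stands in for the existing pull/recreate/cleanup flow, and `notify` stands in for sending (or editing) a Telegram message; both names are placeholders.

```python
# Sketch of the sequential batch loop: one container at a time, per-container
# error isolation, and a final summary. update_container and notify are
# stand-ins for the existing update flow and Telegram messaging.

def run_batch(containers, update_container, notify=print):
    updated, failed = [], []
    total = len(containers)
    for i, name in enumerate(containers, start=1):
        notify(f"Updating {name} ({i}/{total})...")
        try:
            update_container(name)  # one failure must not abort the batch
            updated.append(name)
        except Exception as exc:
            notify(f"Failed to update {name}: {exc}")
            failed.append(name)
    summary = f"Batch complete: {len(updated)} updated, {len(failed)} failed"
    if failed:
        summary += ": " + ", ".join(failed)
    notify(summary)
    return updated, failed
```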
Watchtower-style options (NOT recommended for this bot):
- Watchtower does automatic updates on schedule
- This bot is intentionally reactive (user asks, bot does)
- Automation can cause downtime at bad times
Sources:
- Watchtower Documentation (HIGH confidence)
- Docker Multi-Container Apps (HIGH confidence)
- How to Update Docker Containers (MEDIUM confidence)
## Development API Workflow

### Table Stakes
| Feature | Why Expected | Complexity | Dependencies |
|---|---|---|---|
| API key authentication | Standard n8n API auth method | Low | n8n configuration |
| Get workflow by ID | Read current workflow JSON | Low | n8n REST API |
| Update workflow | Push modified workflow back | Low | n8n REST API |
| Activate/deactivate workflow | Turn workflow on/off programmatically | Low | n8n REST API |
| Get execution list | See recent runs | Low | n8n REST API |
| Get execution details/logs | Debug failed executions | Low | n8n REST API |
### Differentiators
| Feature | Value Proposition | Complexity | Dependencies |
|---|---|---|---|
| Execute workflow on demand | Trigger test run via API | Medium | n8n REST API with test data |
| Version comparison | Diff local vs deployed workflow | High | JSON diff tooling |
| Backup before update | Save current version before pushing changes | Low | File system or git |
| Rollback capability | Restore previous version on failure | Medium | Version history |
| MCP integration | Claude Code can manage workflows via MCP | High | MCP server setup |
### Anti-features
| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Direct n8n database access | Bypasses API, can corrupt state | Use REST API only |
| Credential exposure via API | API returns credential IDs, not values | Never try to extract credential values |
| Auto-deploy on git push | Adds CI/CD complexity, not needed for single-user | Manual deploy via API call |
| Real-time workflow editing | n8n UI is better for this | API for read/bulk operations only |
### Implementation Notes
n8n REST API key endpoints:
| Operation | Method | Endpoint |
|---|---|---|
| List workflows | GET | /api/v1/workflows |
| Get workflow | GET | /api/v1/workflows/{id} |
| Update workflow | PUT | /api/v1/workflows/{id} |
| Activate | POST | /api/v1/workflows/{id}/activate |
| Deactivate | POST | /api/v1/workflows/{id}/deactivate |
| List executions | GET | /api/v1/executions |
| Get execution | GET | /api/v1/executions/{id} |
| Execute workflow | POST | /rest/workflows/{id}/run (internal UI endpoint, not part of the public /api/v1 surface - verify before relying on it) |
Authentication: header `X-N8N-API-KEY: your_api_key`
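As a sketch of how the "backup before update" differentiator could use these endpoints, the helper below builds an authenticated request against the public API. The base URL, workflow id, and API key are placeholders, and the usage lines are not executed here since they require a running n8n instance.

```python
import json
import urllib.request

# Illustrative sketch of "backup before update" against the n8n public API.
# N8N_URL, the workflow id, and the API key are placeholders.

N8N_URL = "http://localhost:5678"  # assumption: local n8n instance

def n8n_request(method, path, api_key, body=None):
    """Build an authenticated request for the n8n public API (/api/v1)."""
    return urllib.request.Request(
        url=f"{N8N_URL}/api/v1{path}",
        method=method,
        headers={"X-N8N-API-KEY": api_key, "Content-Type": "application/json"},
        data=json.dumps(body).encode() if body is not None else None,
    )

# Usage (not executed here - requires a running n8n instance):
#   wf = json.load(urllib.request.urlopen(n8n_request("GET", "/workflows/42", api_key)))
#   open("workflow-42.backup.json", "w").write(json.dumps(wf))   # backup first
#   urllib.request.urlopen(n8n_request("PUT", "/workflows/42", api_key, body=wf))
```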
Workflow structure: n8n workflows are JSON documents (~3,200 lines for this bot). Key sections:
- `nodes[]` - array of workflow nodes
- `connections` - how nodes connect
- `settings` - workflow-level settings
MCP option: There's an unofficial n8n MCP server (makafeli/n8n-workflow-builder) that could enable Claude Code to manage workflows directly, but this adds complexity. Standard REST API is simpler for v1.1.
Sources:
- n8n API Documentation (HIGH confidence)
- n8n API Reference (HIGH confidence)
- n8n Workflow Manager API Template (MEDIUM confidence)
- Python n8n API Guide (MEDIUM confidence)
## Update Notification Sync

### Table Stakes
| Feature | Why Expected | Complexity | Dependencies |
|---|---|---|---|
| Update clears bot's "update available" state | Bot should know container is now current | Low | Already works - re-check after update |
| Accurate update status reporting | Status command shows which have updates | Medium | Image digest comparison |
### Differentiators
| Feature | Value Proposition | Complexity | Dependencies |
|---|---|---|---|
| Sync with Unraid UI | Clear "update available" badge in Unraid web UI | High | Unraid API or file manipulation |
| Pre-update check | Show what version you're on, what version available | Medium | Image tag inspection |
| Update notification to user | "3 containers have updates available" proactive message | Medium | Scheduled check, notification logic |
### Anti-features
| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Taking over Unraid notifications | Explicitly out of scope per PROJECT.md | Keep Unraid notifications, bot is for control |
| Proactive monitoring | Bot is reactive per PROJECT.md | User checks status manually |
| Blocking Unraid auto-updates | User may want both systems | Coexist with Unraid's own update mechanism |
### Implementation Notes
The core problem: When you update a container via the bot (or Watchtower), Unraid's web UI may still show "update available" because it has its own tracking.
Unraid update status file: `/var/lib/docker/unraid-update-status.json`
- This file tracks which containers have updates
- Deleting it forces Unraid to recheck
- Can also trigger recheck via: Settings > Docker > Check for Updates
Unraid API (v7.2+):
- GraphQL API for Docker containers
- Can query container status
- Mutations for notifications exist
- API key auth via the `x-api-key` header
Practical approach for v1.1:
- Minimum: document that the Unraid UI may lag behind - the user can click "Check for Updates" in Unraid
- Better: after a bot update, delete `/var/lib/docker/unraid-update-status.json` to force an Unraid recheck
- Best (requires Unraid 7.2+): use the Unraid GraphQL API to clear notification state
Known issue: Users report Unraid shows "update ready" even after container is updated. This is a known Unraid bug where it only checks for new updates, not whether containers are now current.
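The "delete the status file" option is small enough to sketch. Note that clearing this file is a community workaround rather than an official API, so it should be verified on the target Unraid system before the bot relies on it.

```python
# Sketch of the status-file deletion workaround. The path is the one Unraid
# uses for update tracking; clearing it is a community workaround, not an
# official API, so verify the behavior on the target system.

from pathlib import Path

UPDATE_STATUS_FILE = Path("/var/lib/docker/unraid-update-status.json")

def clear_unraid_update_status(status_file=UPDATE_STATUS_FILE):
    """Delete the status file if present; Unraid rebuilds it on its next check."""
    if status_file.exists():
        status_file.unlink()
        return True
    return False
```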
Sources:
- Unraid API Documentation (HIGH confidence)
- Unraid Docker Integration DeepWiki (MEDIUM confidence)
- Watchtower + Unraid Discussion (MEDIUM confidence)
- Unraid Forum: Update Badge Issues (MEDIUM confidence)
## Docker Socket Security

### Table Stakes
| Feature | Why Expected | Complexity | Dependencies |
|---|---|---|---|
| Remove direct socket from internet-exposed n8n | Security requirement per PROJECT.md scope | Medium | Socket proxy setup |
| Maintain all existing functionality | Bot should work identically after security change | Medium | API compatibility |
| Container start/stop/restart/update | Core actions must still work | Low | Proxy allows these APIs |
| Container list/inspect | Status command must still work | Low | Proxy allows read APIs |
| Image pull | Update command needs this | Low | Proxy configuration |
### Differentiators
| Feature | Value Proposition | Complexity | Dependencies |
|---|---|---|---|
| Granular API restrictions | Only allow APIs the bot actually uses | Low | Socket proxy env vars |
| Block dangerous APIs | Prevent exec, create, system commands | Low | Socket proxy defaults |
| Audit logging | Log all Docker API calls through proxy | Medium | Proxy logging config |
### Anti-features
| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| Read-only socket mount (:ro) | The :ro flag only makes the mount path read-only - the socket itself still accepts API writes | Use proper socket proxy |
| Direct socket access from internet-facing container | Full root access if n8n is compromised | Socket proxy isolates access |
| Allowing exec API | Enables arbitrary command execution in containers | Block exec in proxy |
| Allowing create/network APIs | Bot doesn't need to create containers | Block creation APIs |
### Implementation Notes
Recommended: Tecnativa/docker-socket-proxy or LinuxServer.io/docker-socket-proxy
Both provide HAProxy-based filtering of Docker API requests.
Minimal proxy configuration for this bot:

```yaml
# docker-compose.yml
services:
  socket-proxy:
    image: tecnativa/docker-socket-proxy
    environment:
      - CONTAINERS=1  # List/inspect containers
      - IMAGES=1      # Pull images
      - POST=1        # Allow write operations
      - SERVICES=0    # Swarm services (not needed)
      - TASKS=0       # Swarm tasks (not needed)
      - NETWORKS=0    # Network management (not needed)
      - VOLUMES=0     # Volume management (not needed)
      - EXEC=0        # CRITICAL: block exec
      - BUILD=0       # CRITICAL: block build
      - COMMIT=0      # CRITICAL: block commit
      - SECRETS=0     # CRITICAL: block secrets
      - CONFIGS=0     # CRITICAL: block configs
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - docker-proxy

  n8n:
    # ... existing config ...
    environment:
      - DOCKER_HOST=tcp://socket-proxy:2375
    networks:
      - docker-proxy
      # plus existing networks
```
Key security benefits:
- n8n no longer has direct socket access
- Only whitelisted API categories are available
- EXEC=0 prevents arbitrary command execution
- Proxy is on internal network only, not internet-exposed
Migration path:
1. Deploy the socket-proxy container
2. Update n8n to use `DOCKER_HOST=tcp://socket-proxy:2375`
3. Remove the direct socket mount from n8n
4. Test that all bot commands still work
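For the final migration step, it helps to be explicit about which Docker API calls the proxy configuration should permit. The sketch below is a grossly simplified model of that whitelist for reasoning and test planning; the real proxy implements it with HAProxy path rules, so this is an illustration of the intended policy, not the proxy's actual matching logic.

```python
# Grossly simplified model of the whitelist the configuration above encodes.
# The real tecnativa image uses HAProxy path rules; this only illustrates
# which calls the bot's config is meant to permit.

ALLOWED_PREFIXES = {"containers", "images",        # CONTAINERS=1, IMAGES=1
                    "_ping", "version", "events"}  # enabled by default in the image
BLOCKED_KEYWORDS = {"exec", "build", "commit", "secrets", "configs"}  # all =0 above

def is_allowed(method, path, post_enabled=True):
    """Blocked keywords win; then the prefix whitelist; then the POST gate."""
    segments = [s for s in path.strip("/").split("/") if s]
    if any(seg in BLOCKED_KEYWORDS for seg in segments):
        return False
    if method != "GET" and not post_enabled:
        return False  # POST=0 would block every write operation
    return bool(segments) and segments[0] in ALLOWED_PREFIXES
```

Under this model, `POST /containers/plex/restart` and `POST /images/create` pass (start/stop/restart/update keep working), while anything touching `exec` or `build` is refused even though it sits under an allowed prefix.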
Sources:
- Tecnativa docker-socket-proxy (HIGH confidence)
- LinuxServer.io docker-socket-proxy (HIGH confidence)
- Docker Socket Security Guide (MEDIUM confidence)
## Feature Summary Table
| Feature | Complexity | Dependencies | Priority | Notes |
|---|---|---|---|---|
| Inline Keyboards | ||||
| Basic callback handling | Low | Existing trigger | Must Have | Foundation for all buttons |
| Container action buttons | Medium | Container matching | Must Have | Core UX improvement |
| Confirmation dialogs | Low | None | Should Have | Prevents accidents |
| Dynamic keyboard generation | Medium | HTTP Request node | Must Have | n8n native node limitation workaround |
| Batch Operations | ||||
| Update multiple containers | Medium | Existing update | Must Have | Sequential with progress |
| "Update all" command | Medium | Container listing | Should Have | With confirmation |
| Per-container feedback | Low | None | Must Have | Progress visibility |
| n8n API | ||||
| API key setup | Low | n8n config | Must Have | Enable programmatic access |
| Read workflow | Low | REST API | Must Have | Development workflow |
| Update workflow | Low | REST API | Must Have | Development workflow |
| Activate/deactivate | Low | REST API | Should Have | Testing workflow |
| Update Sync | ||||
| Delete status file | Low | SSH/exec access | Should Have | Simple Unraid sync |
| Unraid GraphQL API | High | Unraid 7.2+, API key | Nice to Have | Requires version check |
| Security | ||||
| Socket proxy deployment | Medium | New container | Must Have | Security requirement |
| API restriction config | Low | Proxy env vars | Must Have | Minimize attack surface |
| Migration testing | Low | All commands | Must Have | Verify no regression |
## MVP Recommendation for v1.1

### Phase 1: Foundation (Must Have)
1. Docker socket security via proxy - security first
2. n8n API access setup - enables faster development
3. Basic inline keyboard infrastructure - callback handling

### Phase 2: UX Improvements (Should Have)
4. Container action buttons from status view
5. Confirmation dialogs for stop/update actions
6. Batch update with progress feedback

### Phase 3: Polish (Nice to Have)
7. Unraid update status sync (file deletion method)
8. "Update all" convenience command
## Confidence Assessment
| Area | Confidence | Reason |
|---|---|---|
| Telegram Inline Keyboards | HIGH | Official Telegram docs + n8n docs verified |
| Batch Operations | MEDIUM-HIGH | Standard Docker patterns, well-documented |
| n8n API | MEDIUM | API exists but detailed endpoint docs required fetching |
| Unraid Update Sync | MEDIUM | Community knowledge, API docs limited |
| Docker Socket Security | HIGH | Well-documented proxy solutions |
## Gaps to Address in Phase Planning
- Exact n8n API endpoints - Need to verify full endpoint list during implementation
- Unraid version compatibility - GraphQL API requires Unraid 7.2+, need version check
- n8n Telegram node workarounds - HTTP Request approach needs testing
- Socket proxy on Unraid - Deployment specifics for Unraid environment