# Features Research: v1.1

**Domain:** Telegram Bot for Docker Container Management
**Researched:** 2026-02-02
**Confidence:** MEDIUM-HIGH (WebSearch findings verified against official docs where available)

## Telegram Inline Keyboards

### Table Stakes

| Feature | Why Expected | Complexity | Dependencies |
|---------|--------------|------------|--------------|
| Callback button handling | Core inline keyboard functionality - buttons must trigger actions | Low | Telegram Trigger already handles callback_query |
| answerCallbackQuery response | Required by Telegram - clients show loading animation until answered (up to 1 minute) | Low | None |
| Edit message after button press | Standard pattern - update existing message rather than send new one to reduce clutter | Low | None |
| Container action buttons | Users expect tap-to-action for start/stop/restart without typing | Medium | Existing container matching logic |
| Status view with action buttons | Show container list with inline buttons for each container | Medium | Existing status command |

### Differentiators

| Feature | Value Proposition | Complexity | Dependencies |
|---------|-------------------|------------|--------------|
| Confirmation dialogs for dangerous actions | "Are you sure?" before stop/restart/update prevents accidental actions | Low | None - edit message with Yes/No buttons |
| Contextual button removal | Remove buttons after action completes (prevents double-tap issues) | Low | None |
| Dynamic container list keyboards | Generate buttons based on actual running containers | Medium | Container listing logic |
| Progress indicators via message edit | Update message with "Updating..." then "Complete" states | Low | None |
| Pagination for many containers | "Next page" button when >8-10 containers | Medium | None |

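The pagination row in the table above could look roughly like the sketch below. It is illustrative only: the function name, the `status:` and `page:` callback prefixes, and the page size of 8 are assumptions, not part of the existing workflow.

```python
def paginate_buttons(containers, page, page_size=8):
    """Build an inline keyboard for one page of container names.

    Hypothetical helper: one container per row, plus a navigation row
    when there are containers before or after the current page.
    """
    if page < 0:
        raise ValueError("page must be >= 0")
    start = page * page_size
    visible = containers[start:start + page_size]
    rows = [[{"text": name, "callback_data": f"status:{name}"}] for name in visible]

    nav = []
    if page > 0:
        nav.append({"text": "Previous page", "callback_data": f"page:{page - 1}"})
    if start + page_size < len(containers):
        nav.append({"text": "Next page", "callback_data": f"page:{page + 1}"})
    if nav:
        rows.append(nav)
    return {"inline_keyboard": rows}
```

The returned dictionary has the shape Telegram expects for `reply_markup`, so the same structure can be reused for every page.
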
### Anti-features

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Reply keyboards for actions | Takes over user keyboard space, sends visible messages to chat | Use inline keyboards attached to bot messages |
| More than 5 buttons per row | Wraps poorly on mobile/desktop, breaks muscle memory | Max 3-4 buttons per row for container actions |
| Complex callback_data structures | 64-byte limit, easy to exceed with JSON | Use short action codes: `start_plex`, `stop_sonarr` |
| Buttons without feedback | Users think tap didn't work, tap again | Always answerCallbackQuery, even for errors |
| Auto-refreshing keyboards | High API traffic, rate limiting risk | Refresh on explicit user action only |

### Implementation Notes

**Critical constraint:** callback_data is limited to 64 bytes. Use short codes like `action:containername` rather than JSON structures.

**n8n native node limitation:** The Telegram node doesn't support dynamic inline keyboards well. Workaround is HTTP Request node calling Telegram Bot API directly for `sendMessage` with `reply_markup` parameter.

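As a minimal sketch of both notes above, the following builds a `sendMessage` JSON body with an inline keyboard and rejects any `callback_data` over 64 bytes. The helper names and the default action list are hypothetical; only `chat_id`, `text`, `reply_markup`, `inline_keyboard`, and `callback_data` are Telegram Bot API field names.

```python
MAX_CALLBACK_BYTES = 64  # Telegram's limit on callback_data


def make_callback_data(action: str, container: str) -> str:
    """Encode a button press as `action:containername` and enforce the size limit."""
    data = f"{action}:{container}"
    if len(data.encode("utf-8")) > MAX_CALLBACK_BYTES:
        raise ValueError(f"callback_data exceeds {MAX_CALLBACK_BYTES} bytes: {data!r}")
    return data


def build_action_keyboard(container: str, actions=("start", "stop", "restart")) -> dict:
    """Return a reply_markup object with one row of action buttons."""
    row = [
        {"text": action.capitalize(), "callback_data": make_callback_data(action, container)}
        for action in actions
    ]
    return {"inline_keyboard": [row]}


def build_send_message_body(chat_id: int, text: str, container: str) -> dict:
    """JSON body for POST https://api.telegram.org/bot<token>/sendMessage."""
    return {"chat_id": chat_id, "text": text, "reply_markup": build_action_keyboard(container)}
```

In n8n the resulting dictionary would be the JSON body of the HTTP Request node; the byte check uses the UTF-8 encoded length because the limit is in bytes, not characters.
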
**Pattern for confirmations:**
1. User taps "Stop plex"
2. Edit message: "Stop plex container?" with [Yes] [Cancel] buttons
3. User taps Yes -> perform action, edit message with result, remove buttons
4. User taps Cancel -> edit message back to original state

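The four steps above can be sketched as a small pure function that maps a pressed button to the next message edit. Everything here is an assumption for illustration: the `yes_` and `cancel` callback prefixes, the `execute` field, and passing the original message in so Cancel can restore it.

```python
DANGEROUS = {"stop", "restart", "update"}


def next_step(callback_data: str, original: dict) -> dict:
    """Return the message edit for a button press.

    `original` holds the text and buttons of the message before the
    confirmation prompt, so that Cancel can restore it unchanged.
    """
    action, _, container = callback_data.partition(":")

    if action in DANGEROUS:
        # Step 2: ask for confirmation instead of acting immediately.
        return {
            "text": f"{action.capitalize()} {container} container?",
            "buttons": [
                {"text": "Yes", "callback_data": f"yes_{action}:{container}"},
                {"text": "Cancel", "callback_data": f"cancel:{container}"},
            ],
            "execute": None,
        }
    if action.startswith("yes_") and action[4:] in DANGEROUS:
        # Step 3: caller runs the action, then edits in the result; buttons are removed.
        return {
            "text": f"Running {action[4:]} on {container}...",
            "buttons": [],
            "execute": (action[4:], container),
        }
    if action == "cancel":
        # Step 4: restore the original message and its buttons.
        return {"text": original["text"], "buttons": original["buttons"], "execute": None}
    raise ValueError(f"unknown callback_data: {callback_data!r}")
```

Keeping this logic free of I/O makes it easy to check before wiring it into a workflow, where the actual edit would go through `editMessageText`.
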
**Sources:**
- [Telegram Bot Features](https://core.telegram.org/bots/features) (HIGH confidence)
- [Telegram Bot API Buttons](https://core.telegram.org/api/bots/buttons) (HIGH confidence)
- [n8n Telegram Callback Operations](https://docs.n8n.io/integrations/builtin/app-nodes/n8n-nodes-base.telegram/callback-operations/) (HIGH confidence)
- [n8n Community: Dynamic Inline Keyboard](https://community.n8n.io/t/dynamic-inline-keyboard-for-telegram-bot/86568) (MEDIUM confidence)

---

## Batch Operations

### Table Stakes

| Feature | Why Expected | Complexity | Dependencies |
|---------|--------------|------------|--------------|
| Update multiple specified containers | Core batch use case - `update plex sonarr radarr` | Medium | Existing update logic, loop handling |
| Sequential execution | Process one at a time to avoid resource contention | Low | None |
| Per-container status feedback | "Updated plex... Updated sonarr..." progress | Low | Existing message sending |
| Error handling per container | One failure shouldn't abort the batch | Low | Try-catch per iteration |
| Final summary message | "3 updated, 1 failed: jellyfin" | Low | Accumulator pattern |

### Differentiators

| Feature | Value Proposition | Complexity | Dependencies |
|---------|-------------------|------------|--------------|
| "Update all" command | Single command to update everything (with confirmation) | Medium | Container listing |
| "Update all except X" | Exclude specific containers from batch | Medium | Exclusion pattern |
| Parallel status checks | Check which containers have updates available first | Medium | None |
| Batch operation confirmation | Show what will happen before doing it | Low | Keyboard buttons |
| Cancel mid-batch | Stop processing remaining containers | High | State management |

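One possible way to turn the three command forms from the tables above (`update plex sonarr radarr`, `update all`, `update all except X`) into a target list is sketched below. The function name and the exact-match lookup are assumptions; the bot's existing container matching logic may be more forgiving.

```python
def parse_update_command(text, known):
    """Split an update command into (targets, unknown_names).

    `known` is the list of container names the bot currently sees.
    Matching is exact and lowercase here, which is an assumption.
    """
    words = text.lower().split()
    if not words or words[0] != "update":
        raise ValueError("not an update command")
    args = words[1:]

    if args[:1] == ["all"]:
        # "update all" or "update all except a b"
        excluded = args[2:] if args[1:2] == ["except"] else []
        targets = [name for name in known if name not in excluded]
        unknown = [name for name in excluded if name not in known]
        return targets, unknown

    targets = [name for name in args if name in known]
    unknown = [name for name in args if name not in known]
    return targets, unknown
```

Returning the unknown names separately lets the bot report typos instead of silently skipping them.
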
### Anti-features

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Parallel container updates | Resource contention, disk I/O saturation, network bandwidth | Sequential with progress feedback |
| Silent batch operations | User thinks bot is frozen during long batch | Send progress message per container |
| Update without checking first | Wastes time on already-updated containers | Check for updates, report "3 containers have updates" |
| Auto-update on schedule | Out of scope - user might be using system when update causes downtime | User-initiated only, this is a reactive tool |

### Implementation Notes

**Existing update flow:** Current implementation pulls image, recreates container, cleans up old image. Batch needs to wrap this in a loop.

**Progress pattern:**
```
User: update all
Bot: Found 5 containers with updates. Update now? [Yes] [Cancel]
User: Yes
Bot: Updating plex (1/5)...
Bot: (edit) Updated plex. Updating sonarr (2/5)...
...
Bot: (edit) Batch complete: 5 updated, 0 failed.
```

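A sketch of the sequential loop behind this transcript, with per-container error handling and a final summary, might look as follows. `update_one` stands in for the existing pull/recreate/cleanup flow and `notify` for the message edit; both are hypothetical callables, not existing code.

```python
def run_batch(containers, update_one, notify):
    """Update containers one at a time, reporting progress and a summary.

    A failure in one container is recorded and the loop continues,
    so one bad update does not abort the batch.
    """
    total = len(containers)
    updated, failed = [], []
    previous = ""  # result text for the container that was just processed

    for index, name in enumerate(containers, start=1):
        notify(f"{previous}Updating {name} ({index}/{total})...")
        try:
            update_one(name)
        except Exception:
            failed.append(name)
            previous = f"Failed {name}. "
        else:
            updated.append(name)
            previous = f"Updated {name}. "

    summary = f"Batch complete: {len(updated)} updated, {len(failed)} failed"
    summary += f": {', '.join(failed)}" if failed else "."
    notify(summary)
    return summary
```

Catching the broad `Exception` is deliberate in this sketch; a real implementation would likely narrow it and include the error detail in the progress text.
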
**Watchtower-style options (NOT recommended for this bot):**
- Watchtower does automatic updates on schedule
- This bot is intentionally reactive (user asks, bot does)
- Automation can cause downtime at bad times

**Sources:**
- [Watchtower Documentation](https://containrrr.dev/watchtower/) (HIGH confidence)
- [Docker Multi-Container Apps](https://docs.docker.com/get-started/docker-concepts/running-containers/multi-container-applications/) (HIGH confidence)
- [How to Update Docker Containers](https://phoenixnap.com/kb/update-docker-image-container) (MEDIUM confidence)

---

## Development API Workflow

### Table Stakes

| Feature | Why Expected | Complexity | Dependencies |
|---------|--------------|------------|--------------|
| API key authentication | Standard n8n API auth method | Low | n8n configuration |
| Get workflow by ID | Read current workflow JSON | Low | n8n REST API |
| Update workflow | Push modified workflow back | Low | n8n REST API |
| Activate/deactivate workflow | Turn workflow on/off programmatically | Low | n8n REST API |
| Get execution list | See recent runs | Low | n8n REST API |
| Get execution details/logs | Debug failed executions | Low | n8n REST API |

### Differentiators

| Feature | Value Proposition | Complexity | Dependencies |
|---------|-------------------|------------|--------------|
| Execute workflow on demand | Trigger test run via API | Medium | n8n REST API with test data |
| Version comparison | Diff local vs deployed workflow | High | JSON diff tooling |
| Backup before update | Save current version before pushing changes | Low | File system or git |
| Rollback capability | Restore previous version on failure | Medium | Version history |
| MCP integration | Claude Code can manage workflows via MCP | High | MCP server setup |

### Anti-features

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Direct n8n database access | Bypasses API, can corrupt state | Use REST API only |
| Credential exposure via API | API returns credential IDs, not values | Never try to extract credential values |
| Auto-deploy on git push | Adds CI/CD complexity, not needed for single-user | Manual deploy via API call |
| Real-time workflow editing | n8n UI is better for this | API for read/bulk operations only |

### Implementation Notes

**n8n REST API key endpoints:**

| Operation | Method | Endpoint |
|-----------|--------|----------|
| List workflows | GET | `/api/v1/workflows` |
| Get workflow | GET | `/api/v1/workflows/{id}` |
| Update workflow | PUT | `/api/v1/workflows/{id}` |
| Activate | POST | `/api/v1/workflows/{id}/activate` |
| Deactivate | POST | `/api/v1/workflows/{id}/deactivate` |
| List executions | GET | `/api/v1/executions` |
| Get execution | GET | `/api/v1/executions/{id}` |
| Execute workflow | POST | `/rest/workflows/{id}/run` (appears to be an internal editor endpoint outside the public `/api/v1` API; verify before relying on it) |

**Authentication:** Header `X-N8N-API-KEY: your_api_key`

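A minimal sketch of calling the `/api/v1` endpoints above with the standard library is shown below. The helper names, base URL, and 30-second timeout are assumptions; the paths, methods, and header name come from the table and the authentication note.

```python
import json
import urllib.request


def n8n_request(base_url, api_key, method, path, body=None):
    """Build (but do not send) an authenticated request to the n8n public API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"X-N8N-API-KEY": api_key, "Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def get_workflow(base_url, api_key, workflow_id):
    return n8n_request(base_url, api_key, "GET", f"/api/v1/workflows/{workflow_id}")


def update_workflow(base_url, api_key, workflow_id, workflow):
    return n8n_request(base_url, api_key, "PUT", f"/api/v1/workflows/{workflow_id}", workflow)


def set_active(base_url, api_key, workflow_id, active):
    action = "activate" if active else "deactivate"
    return n8n_request(base_url, api_key, "POST", f"/api/v1/workflows/{workflow_id}/{action}")


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending keeps the URL and header logic checkable without a running n8n instance, for example `send(get_workflow("http://localhost:5678", key, "123"))` once the instance is reachable.
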
**Workflow structure:** n8n workflows are JSON documents (~3,200 lines for this bot). Key sections:
- `nodes[]` - Array of workflow nodes
- `connections` - How nodes connect
- `settings` - Workflow-level settings

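Given that structure, a quick inspection and a timestamped backup before pushing changes could be sketched like this. The function names and the backup file naming are assumptions, as is the detail that `connections` is keyed by source node name.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def summarize_workflow(workflow: dict) -> dict:
    """Report node count, node names, and which nodes have outgoing connections."""
    nodes = workflow.get("nodes", [])
    return {
        "node_count": len(nodes),
        "node_names": [node.get("name") for node in nodes],
        # Assumption: connections is an object keyed by source node name.
        "connection_sources": sorted(workflow.get("connections", {})),
    }


def backup_workflow(workflow: dict, backup_dir) -> Path:
    """Write the workflow JSON to a timestamped file and return its path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = backup_dir / f"workflow-{stamp}.json"
    path.write_text(json.dumps(workflow, indent=2), encoding="utf-8")
    return path
```

Calling `backup_workflow` on the result of a GET before every PUT gives a simple rollback point without any extra tooling.
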
**MCP option:** There's an unofficial n8n MCP server (makafeli/n8n-workflow-builder) that could enable Claude Code to manage workflows directly, but this adds complexity. Standard REST API is simpler for v1.1.

**Sources:**
- [n8n API Documentation](https://docs.n8n.io/api/) (HIGH confidence)
- [n8n API Reference](https://docs.n8n.io/api/api-reference/) (HIGH confidence)
- [n8n Workflow Manager API Template](https://n8n.io/workflows/4166-n8n-workflow-manager-api/) (MEDIUM confidence)
- [Python n8n API Guide](https://martinuke0.github.io/posts/2025-12-10-a-detailed-guide-to-using-the-n8n-api-with-python/) (MEDIUM confidence)

---

## Update Notification Sync

### Table Stakes

| Feature | Why Expected | Complexity | Dependencies |
|---------|--------------|------------|--------------|
| Update clears bot's "update available" state | Bot should know container is now current | Low | Already works - re-check after update |
| Accurate update status reporting | Status command shows which have updates | Medium | Image digest comparison |

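The image digest comparison listed as a dependency above can be reduced to a small check, sketched here. It assumes the local digests come from the `RepoDigests` field of an image inspect (entries shaped like `repo@sha256:...`) and that the remote digest has already been fetched by some other step; how the bot currently obtains either value is not covered by this document.

```python
def has_update(repo_digests: list, remote_digest: str) -> bool:
    """Return True when the remote digest is not among the local image digests.

    Images without any repo digest (for example locally built ones)
    cannot be compared, so they are reported as having no update.
    """
    local = {entry.split("@", 1)[1] for entry in repo_digests if "@" in entry}
    return bool(local) and remote_digest not in local
```

Comparing digests rather than tags avoids false negatives when a tag such as `latest` is moved to a new image.
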
### Differentiators

| Feature | Value Proposition | Complexity | Dependencies |
|---------|-------------------|------------|--------------|
| Sync with Unraid UI | Clear "update available" badge in Unraid web UI | High | Unraid API or file manipulation |
| Pre-update check | Show what version you're on, what version available | Medium | Image tag inspection |
| Update notification to user | "3 containers have updates available" proactive message | Medium | Scheduled check, notification logic |

### Anti-features

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Taking over Unraid notifications | Explicitly out of scope per PROJECT.md | Keep Unraid notifications, bot is for control |
| Proactive monitoring | Bot is reactive per PROJECT.md | User checks status manually |
| Blocking Unraid auto-updates | User may want both systems | Coexist with Unraid's own update mechanism |

### Implementation Notes

**The core problem:** When you update a container via the bot (or Watchtower), Unraid's web UI may still show "update available" because it has its own tracking.

**Unraid update status file:** `/var/lib/docker/unraid-update-status.json`
- This file tracks which containers have updates
- Deleting it forces Unraid to recheck
- Can also trigger recheck via: Settings > Docker > Check for Updates

**Unraid API (v7.2+):**
- GraphQL API for Docker containers
- Can query container status
- Mutations for notifications exist
- API key auth: `x-api-key` header

**Practical approach for v1.1:**
1. **Minimum:** Document that Unraid UI may lag behind - user can click "Check for Updates" in Unraid
2. **Better:** After bot update, delete `/var/lib/docker/unraid-update-status.json` to force Unraid recheck
3. **Best (requires Unraid 7.2+):** Use Unraid GraphQL API to clear notification state

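The "Better" option above amounts to removing one file after a successful update. A minimal sketch, assuming the code runs somewhere that can see the host path (on the Unraid host itself, or in a container with that path mounted), and with a hypothetical function name:

```python
from pathlib import Path

# Path taken from the notes above.
UNRAID_STATUS_FILE = Path("/var/lib/docker/unraid-update-status.json")


def force_unraid_recheck(status_file: Path = UNRAID_STATUS_FILE) -> bool:
    """Delete Unraid's update status file so it rechecks on next view.

    Returns True if a file was removed, False if there was nothing to delete.
    """
    try:
        status_file.unlink()
    except FileNotFoundError:
        return False
    return True
```

Treating a missing file as a non-error keeps the step safe to run after every update, including when Unraid has not written the file yet.
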
**Known issue:** Users report Unraid shows "update ready" even after container is updated. This is a known Unraid bug where it only checks for new updates, not whether containers are now current.

**Sources:**
- [Unraid API Documentation](https://docs.unraid.net/API/how-to-use-the-api/) (HIGH confidence)
- [Unraid Docker Integration DeepWiki](https://deepwiki.com/unraid/api/2.4.1-docker-integration) (MEDIUM confidence)
- [Watchtower + Unraid Discussion](https://github.com/containrrr/watchtower/discussions/1389) (MEDIUM confidence)
- [Unraid Forum: Update Badge Issues](https://forums.unraid.net/topic/157820-docker-shows-update-ready-after-updating/) (MEDIUM confidence)

---

## Docker Socket Security

### Table Stakes

| Feature | Why Expected | Complexity | Dependencies |
|---------|--------------|------------|--------------|
| Remove direct socket from internet-exposed n8n | Security requirement per PROJECT.md scope | Medium | Socket proxy setup |
| Maintain all existing functionality | Bot should work identically after security change | Medium | API compatibility |
| Container start/stop/restart/update | Core actions must still work | Low | Proxy allows these APIs |
| Container list/inspect | Status command must still work | Low | Proxy allows read APIs |
| Image pull | Update command needs this | Low | Proxy configuration |

### Differentiators

| Feature | Value Proposition | Complexity | Dependencies |
|---------|-------------------|------------|--------------|
| Granular API restrictions | Only allow APIs the bot actually uses | Low | Socket proxy env vars |
| Block dangerous APIs | Prevent exec, create, system commands | Low | Socket proxy defaults |
| Audit logging | Log all Docker API calls through proxy | Medium | Proxy logging config |

### Anti-features

| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| Read-only socket mount (:ro) | Doesn't actually protect - socket as pipe stays writable | Use proper socket proxy |
| Direct socket access from internet-facing container | Full root access if n8n is compromised | Socket proxy isolates access |
| Allowing exec API | Enables arbitrary command execution in containers | Block exec in proxy |
| Allowing create/network APIs | Bot doesn't need to create containers | Block creation APIs |

### Implementation Notes

**Recommended: Tecnativa/docker-socket-proxy or LinuxServer.io/docker-socket-proxy**

Both provide HAProxy-based filtering of Docker API requests.

**Minimal proxy configuration for this bot:**

```yaml
# docker-compose.yml
services:
  socket-proxy:
    image: tecnativa/docker-socket-proxy
    environment:
      - CONTAINERS=1    # List/inspect containers
      - IMAGES=1        # Pull images
      - POST=1          # Allow write operations
      - SERVICES=0      # Swarm services (not needed)
      - TASKS=0         # Swarm tasks (not needed)
      - NETWORKS=0      # Network management (not needed)
      - VOLUMES=0       # Volume management (not needed)
      - EXEC=0          # CRITICAL: Block exec
      - BUILD=0         # CRITICAL: Block build
      - COMMIT=0        # CRITICAL: Block commit
      - SECRETS=0       # CRITICAL: Block secrets
      - CONFIGS=0       # CRITICAL: Block configs
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - docker-proxy

  n8n:
    # ... existing config ...
    environment:
      - DOCKER_HOST=tcp://socket-proxy:2375
    networks:
      - docker-proxy
      # Plus existing networks
```

**Key security benefits:**
1. n8n no longer has direct socket access
2. Only whitelisted API categories are available
3. EXEC=0 prevents arbitrary command execution
4. Proxy is on internal network only, not internet-exposed

**Migration path:**
1. Deploy socket-proxy container
2. Update n8n to use `DOCKER_HOST=tcp://socket-proxy:2375`
3. Remove direct socket mount from n8n
4. Test all bot commands still work

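After the migration, Docker calls go over plain HTTP to the proxy instead of the Unix socket. The sketch below builds such requests with the standard library; the `socket-proxy:2375` address comes from the compose example, while the helper names and the restriction to three actions are assumptions.

```python
import urllib.request
from urllib.parse import quote

# Host and port from the compose example above.
PROXY_BASE = "http://socket-proxy:2375"

ALLOWED_ACTIONS = {"start", "stop", "restart"}


def container_action_request(name: str, action: str, base: str = PROXY_BASE):
    """Build a POST /containers/{name}/{action} request for the Docker Engine API."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    url = f"{base}/containers/{quote(name, safe='')}/{action}"
    # An empty body keeps this a well-formed POST with Content-Length: 0.
    return urllib.request.Request(url, data=b"", method="POST")


def list_containers_request(base: str = PROXY_BASE):
    """Build a GET /containers/json request that includes stopped containers."""
    return urllib.request.Request(f"{base}/containers/json?all=true", method="GET")
```

Running each bot command against these URLs is a quick way to cover step 4 of the migration path: anything the proxy blocks should come back as an HTTP error rather than a Docker response.
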
**Sources:**
- [Tecnativa docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy) (HIGH confidence)
- [LinuxServer.io docker-socket-proxy](https://docs.linuxserver.io/images/docker-socket-proxy/) (HIGH confidence)
- [Docker Socket Security Guide](https://www.paulsblog.dev/how-to-secure-your-docker-environment-by-using-a-docker-socket-proxy/) (MEDIUM confidence)

---

## Feature Summary Table

| Feature | Complexity | Dependencies | Priority | Notes |
|---------|------------|--------------|----------|-------|
| **Inline Keyboards** | | | | |
| Basic callback handling | Low | Existing trigger | Must Have | Foundation for all buttons |
| Container action buttons | Medium | Container matching | Must Have | Core UX improvement |
| Confirmation dialogs | Low | None | Should Have | Prevents accidents |
| Dynamic keyboard generation | Medium | HTTP Request node | Must Have | n8n native node limitation workaround |
| **Batch Operations** | | | | |
| Update multiple containers | Medium | Existing update | Must Have | Sequential with progress |
| "Update all" command | Medium | Container listing | Should Have | With confirmation |
| Per-container feedback | Low | None | Must Have | Progress visibility |
| **n8n API** | | | | |
| API key setup | Low | n8n config | Must Have | Enable programmatic access |
| Read workflow | Low | REST API | Must Have | Development workflow |
| Update workflow | Low | REST API | Must Have | Development workflow |
| Activate/deactivate | Low | REST API | Should Have | Testing workflow |
| **Update Sync** | | | | |
| Delete status file | Low | SSH/exec access | Should Have | Simple Unraid sync |
| Unraid GraphQL API | High | Unraid 7.2+, API key | Nice to Have | Requires version check |
| **Security** | | | | |
| Socket proxy deployment | Medium | New container | Must Have | Security requirement |
| API restriction config | Low | Proxy env vars | Must Have | Minimize attack surface |
| Migration testing | Low | All commands | Must Have | Verify no regression |

## MVP Recommendation for v1.1

**Phase 1: Foundation (Must Have)**
1. Docker socket security via proxy - security first
2. n8n API access setup - enables faster development
3. Basic inline keyboard infrastructure - callback handling

**Phase 2: UX Improvements (Should Have)**
4. Container action buttons from status view
5. Confirmation dialogs for stop/update actions
6. Batch update with progress feedback

**Phase 3: Polish (Nice to Have)**
7. Unraid update status sync (file deletion method)
8. "Update all" convenience command

## Confidence Assessment

| Area | Confidence | Reason |
|------|------------|--------|
| Telegram Inline Keyboards | HIGH | Official Telegram docs + n8n docs verified |
| Batch Operations | MEDIUM-HIGH | Standard Docker patterns, well-documented |
| n8n API | MEDIUM | API exists but detailed endpoint docs required fetching |
| Unraid Update Sync | MEDIUM | Community knowledge, API docs limited |
| Docker Socket Security | HIGH | Well-documented proxy solutions |

## Gaps to Address in Phase Planning

1. **Exact n8n API endpoints** - Need to verify full endpoint list during implementation
2. **Unraid version compatibility** - GraphQL API requires Unraid 7.2+, need version check
3. **n8n Telegram node workarounds** - HTTP Request approach needs testing
4. **Socket proxy on Unraid** - Deployment specifics for Unraid environment