unraid-docker-manager/.planning/phases/05-polish-deploy/05-RESEARCH.md
Lucas Berger 4c09d61943 docs(05): research phase domain
Phase 5: Polish & Deploy
- Standard stack identified (n8n, Telegram Bot API, Docker)
- Architecture patterns documented (Switch routing, persistent keyboards, error workflows)
- Pitfalls catalogued (credential leaks, testing limitations, configuration issues)
- Code examples for keyword routing, persistent menus, error handling

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 20:55:29 -05:00


Phase 5: Polish & Deploy - Research

Researched: 2026-01-31
Domain: Production deployment with n8n workflow polishing, Telegram bot UX, and deployment packaging
Confidence: HIGH

Summary

Phase 5 prepares the bot for production deployment across four areas: replacing the NLU/Claude nodes with keyword routing, adding persistent Telegram menu buttons for discoverability, hardening error handling with minimal user-facing messages, and packaging the workflow for deployment with proper credential handling.

The standard approach for n8n production workflows emphasizes testing in non-production environments first, using n8n's built-in credentials system for sensitive data, implementing centralized error handling with the Error Trigger node, and exporting workflow JSON to version control while ensuring credentials are never hardcoded. For Telegram bots, the persistent menu pattern uses ReplyKeyboardMarkup with is_persistent=true to keep command buttons always visible, while inline keyboards handle dynamic interactions like container selection.

Based on user decisions from CONTEXT.md, the implementation will use n8n's Switch node for keyword matching (replacing Claude nodes entirely), ReplyKeyboardMarkup for the persistent menu with grouped commands, n8n credentials system for the Telegram user ID, and minimal error messages following the "Failed to X" pattern with infrastructure-specific messages only for Docker socket errors.

Primary recommendation: Use n8n Switch node with string "contains" operators for keyword routing, set up persistent Telegram menu with ReplyKeyboardMarkup, move sensitive values to n8n credentials before exporting workflow JSON, and create root-level README with step-by-step deployment instructions.

Standard Stack

The established tools for this deployment phase:

Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| n8n | Current stable | Workflow orchestration and credential management | Already deployed on Unraid, handles webhook security |
| Telegram Bot API | 6.4+ | Persistent menu buttons and inline keyboards | The is_persistent field on ReplyKeyboardMarkup was added in Bot API 6.4 |
| Docker API | Host version | Container management via Unix socket | Standard on Unraid installations |

Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| n8n Error Trigger | Built-in | Centralized error workflow | Production error handling and monitoring |
| n8n HTTP Request node | Built-in | Telegram API calls for keyboards | When native Telegram node has limitations |
| Git | Any | Version control for workflow JSON | Workflow versioning and rollback capability |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Switch node routing | IF node cascade | Switch handles multiple routes more cleanly; IF requires a nested structure |
| ReplyKeyboardMarkup | InlineKeyboardMarkup | Reply keyboards persist but take keyboard space; inline keyboards are per-message |
| n8n credentials | Environment variables | n8n CE blocks env var access in expressions (known limitation) |

Installation:

# No additional packages needed - using built-in n8n nodes
# Workflow will be imported via n8n UI or CLI

Architecture Patterns

Telegram Trigger
├── Route Update Type (Switch: message vs callback_query)
│   ├── [message path]
│   │   └── Auth Check (IF)
│   │       └── Keyword Router (Switch: contains operations)
│   │           ├── status → Container Status flow
│   │           ├── restart → Container Action flow (checked before start, since "restart" contains "start")
│   │           ├── start → Container Action flow
│   │           ├── stop → Container Action flow
│   │           ├── update → Container Action flow
│   │           ├── logs → Logs flow
│   │           └── [fallback] → Show Menu
│   └── [callback_query path]
│       └── Auth Check (IF)
│           └── [existing callback handlers]
└── Error Trigger Workflow (separate)
    └── Log + Notify

Pattern 1: Keyword Routing with Switch Node

What: Replace NLU intent parsing with simple keyword matching, using a Switch node with multiple "contains" rules.
When to use: User input routing for command-based bots where keywords are predictable.
Example:

{
  "parameters": {
    "rules": {
      "values": [
        {
          "conditions": {
            "conditions": [
              {
                "leftValue": "={{ $json.message.text.toLowerCase() }}",
                "rightValue": "status",
                "operator": {
                  "type": "string",
                  "operation": "contains"
                }
              }
            ]
          },
          "renameOutput": true,
          "outputKey": "status"
        }
      ]
    },
    "options": {
      "fallbackOutput": "extra"
    }
  },
  "type": "n8n-nodes-base.switch"
}

Source: n8n Switch node documentation
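
The routing behaviour can be sketched outside n8n to check rule order before building the node. This is a minimal sketch, assuming the Switch node sends each item to the first matching rule (its default when sending to all matching outputs is not enabled). `routeKeyword` and `RULES` are hypothetical names used only for illustration; they are not part of n8n.

```javascript
// Hypothetical stand-in for the Keyword Router Switch node.
// Rules are checked in order; the first keyword contained in the text wins.
// "restart" is listed before "start" because "restart" contains "start".
const RULES = ["status", "restart", "start", "stop", "update", "logs"];

function routeKeyword(text, rules = RULES) {
  const normalized = String(text ?? "").toLowerCase();
  const match = rules.find((keyword) => normalized.includes(keyword));
  // "fallback" mirrors the Switch node's fallbackOutput: "extra"
  return match ?? "fallback";
}

console.log(routeKeyword("Restart plex")); // restart
console.log(routeKeyword("asdfgh")); // fallback

// With "start" listed before "restart", the same input is misrouted:
console.log(routeKeyword("restart plex", ["status", "start", "restart"])); // start
```

The last call shows why rule order matters with "contains" matching: any keyword that is a substring of another must come after it.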

Pattern 2: Persistent Telegram Menu Button

What: Use ReplyKeyboardMarkup with is_persistent=true to display command buttons that remain visible when the regular keyboard is hidden.
When to use: When users need constant access to core commands without remembering keywords.
Example:

{
  "chat_id": "={{ $json.message.chat.id }}",
  "text": "Welcome! Use buttons below:",
  "reply_markup": {
    "keyboard": [
      [{"text": "📊 Status"}],
      [{"text": "▶️ Start"}, {"text": "⏹️ Stop"}],
      [{"text": "🔄 Restart"}, {"text": "⬆️ Update"}],
      [{"text": "📜 Logs"}]
    ],
    "is_persistent": true,
    "resize_keyboard": true
  }
}

Source: Telegram Bot API - Persistent Menu
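
The keyboard body can also be generated from a list of labels, for example in a Code node placed before the HTTP Request node. This is a minimal sketch; `buildMenuPayload` is a hypothetical helper, and the row grouping simply copies the layout shown above.

```javascript
// Hypothetical helper that builds a sendMessage body with a persistent
// reply keyboard. Each inner array is one row of buttons.
function buildMenuPayload(chatId, text = "Welcome! Use buttons below:") {
  const rows = [
    ["📊 Status"],
    ["▶️ Start", "⏹️ Stop"],
    ["🔄 Restart", "⬆️ Update"],
    ["📜 Logs"],
  ];
  return {
    chat_id: chatId,
    text,
    reply_markup: {
      // Telegram expects an array of rows, each an array of button objects
      keyboard: rows.map((row) => row.map((label) => ({ text: label }))),
      is_persistent: true,
      resize_keyboard: true,
    },
  };
}

console.log(JSON.stringify(buildMenuPayload(42), null, 2));
```

Keeping the labels in one array makes it easier to change the grouping later without editing nested JSON by hand.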

Pattern 3: Credential References in n8n

What: Store sensitive values in the n8n credentials system and reference them in workflow expressions.
When to use: Any hardcoded sensitive data (user IDs, tokens, API keys) before exporting the workflow.
Example:

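// In n8n IF node condition - checking authorized user
// Instead of: $json.message.from.id === 123456789
// Use a credential reference (confirm that $credentials resolves in this
// node's expressions on the target n8n version before relying on it):
$json.message.from.id === parseInt($credentials.telegramAuth.userId, 10)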

Source: n8n Credentials Documentation
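
The comparison itself is easy to get wrong when the stored ID is a string and the Telegram ID is a number. This is a minimal sketch of the check, assuming the allowed ID arrives as a string from wherever it is stored; `isAuthorized` is a hypothetical helper, not an n8n function.

```javascript
// Hypothetical authorization check for a Telegram update.
// Handles both message updates and callback_query updates.
function isAuthorized(update, allowedUserId) {
  const sender =
    update?.message?.from?.id ?? update?.callback_query?.from?.id;
  const allowed = Number.parseInt(String(allowedUserId), 10);
  // Reject missing senders and unparsable stored IDs instead of
  // letting undefined or NaN slip through a loose comparison.
  return (
    Number.isInteger(sender) && Number.isInteger(allowed) && sender === allowed
  );
}

console.log(isAuthorized({ message: { from: { id: 123456789 } } }, "123456789")); // true
console.log(isAuthorized({ message: { from: { id: 987654321 } } }, "123456789")); // false
console.log(isAuthorized({}, "123456789")); // false
```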

Pattern 4: Centralized Error Workflow

What: Create a separate workflow with an Error Trigger node that catches failures from all workflows.
When to use: Production deployments requiring error monitoring and graceful failure handling.
Example:

Error Workflow:
[Error Trigger]
  → [Code: Format Error Details]
    → [Telegram: Notify Admin "Cannot connect to Docker"]
    → [HTTP: Log to monitoring service]

Source: n8n Error Handling
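
The user-facing side of this pattern follows the "Failed to X" convention from the summary, with a specific message only for Docker socket errors. This is a minimal sketch of that mapping; `userFacingError` is a hypothetical helper, and detecting socket errors by the substring `docker.sock` is an assumption about how those errors are worded.

```javascript
// Hypothetical mapping from an internal error to a terse user message.
// Only Docker socket failures get an infrastructure-specific message.
function userFacingError(action, errorMessage) {
  const message = String(errorMessage ?? "");
  if (message.includes("docker.sock")) {
    return "Cannot connect to Docker";
  }
  // Everything else follows the "Failed to X" pattern, no internals exposed
  return `Failed to ${action}`;
}

console.log(userFacingError("restart plex", "connect EACCES /var/run/docker.sock"));
console.log(userFacingError("restart plex", "HTTP 404: no such container"));
```

The first call yields "Cannot connect to Docker" and the second "Failed to restart plex"; the detailed text stays in the admin log.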

Anti-Patterns to Avoid

  • Hardcoding credentials in workflow nodes - Export will expose sensitive data, use n8n credentials system instead
  • Complex regex in Switch conditions - Simple "contains" operations are sufficient for keyword matching, regex adds complexity
  • Verbose error messages to end users - Expose internal state and overwhelm users; keep messages terse
  • Editing production workflows directly - Test changes in duplicate workflow first to prevent breaking live bot
  • Using "Save Execution Progress" - Debug feature causes excessive database writes in production (3000+ writes/day for 30-node workflow running 100x/day)

Don't Hand-Roll

Problems that look simple but have existing solutions:

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Secure credential storage | Custom encryption or env vars | n8n credentials system | Built-in AES256 encryption, credential sharing, OAuth support |
| Error tracking | Manual logging nodes | Error Trigger workflow | Automatic error capture, centralized handling, no manual wiring |
| Telegram keyboard rendering | String concatenation | Telegram reply_markup object | Proper escaping, layout control, persistent menu support |
| Workflow versioning | Manual JSON backups | Git with workflow export | Diff tracking, rollback capability, team collaboration |
| User authorization | Custom auth logic | n8n IF node + credentials | Simple, tested, integrates with credential system |
| Keyword matching | Custom parser code | Switch node "contains" | Native n8n, no code maintenance, visual debugging |
| Retry logic for API calls | Custom retry code | n8n node "Retry On Fail" setting | Configurable max tries and wait between tries, built in |

Key insight: n8n provides production-grade features (credentials, error handling, retry logic) that seem simple to replicate but have edge cases around encryption keys, error propagation, and failure recovery. Using built-in capabilities ensures upgrades don't break custom solutions.

Common Pitfalls

Pitfall 1: Credentials Leak in Exported Workflow

What goes wrong: Hardcoded user IDs, API keys, or tokens remain in workflow JSON when exported, exposing sensitive data when sharing or committing to Git.
Why it happens: n8n CE blocks environment variable access in expressions, leading developers to hardcode values directly in nodes.
How to avoid:

  • Create custom credential type in n8n with required fields (e.g., "Telegram Auth" with userId field)
  • Reference credential in expressions: $credentials.telegramAuth.userId
  • Before export, verify no hardcoded IDs with: grep -E '[0-9]{8,}' workflow.json

Warning signs: grep finds large numbers in workflow JSON, credential fields in nodes show raw values instead of credential references
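
The grep check reports matching lines but not which node holds the value. This is a minimal sketch of the same check that walks the parsed JSON and reports the path of each suspect value; `findSuspectValues` is a hypothetical helper, and the 8-digit threshold is taken from the grep pattern above.

```javascript
// Hypothetical pre-export check: report every string or number in the
// workflow JSON that contains a run of 8 or more digits, with its path.
function findSuspectValues(workflowJson) {
  const findings = [];
  const longNumber = /\d{8,}/;

  const visit = (value, path) => {
    if (Array.isArray(value)) {
      value.forEach((item, index) => visit(item, `${path}[${index}]`));
    } else if (value && typeof value === "object") {
      Object.entries(value).forEach(([key, child]) => visit(child, `${path}.${key}`));
    } else if (
      (typeof value === "string" || typeof value === "number") &&
      longNumber.test(String(value))
    ) {
      findings.push({ path, value });
    }
  };

  visit(JSON.parse(workflowJson), "$");
  return findings;
}

const sample = JSON.stringify({
  nodes: [
    { name: "Auth Check", parameters: { rightValue: 123456789 } },
    { name: "Keyword Router", parameters: { rightValue: "status" } },
  ],
});
console.log(findSuspectValues(sample));
```

For the sample, the only finding is at `$.nodes[0].parameters.rightValue`. Like the grep, this can flag harmless long numbers, so each hit needs a manual look.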

Pitfall 2: Testing Error Workflows with Manual Execution

What goes wrong: Error Trigger only fires on automatic workflow failures, not manual test runs. Manual testing therefore cannot confirm that error handling works, and a broken error workflow goes unnoticed until a production failure is left unhandled.
Why it happens: n8n runs the error workflow only for automatically triggered (production) executions; manual executions bypass it.
How to avoid:

  • Use "Stop and Error" node in main workflow to force failures
  • Test by triggering workflow via webhook/Telegram (automatic execution)
  • Verify error workflow with intentional Docker socket disconnect

Warning signs: Error workflow never shows execution history, production failures go unhandled

Pitfall 3: Switch Node Fallback Misconfiguration

What goes wrong: Setting fallback to "none" silently drops messages that don't match any rules. Users send commands but get no response.
Why it happens: Default fallback is "none", so messages that don't match any routing rule disappear without executing downstream nodes.
How to avoid:

  • Set Switch node fallback to "extra" output
  • Connect fallback output to "Show Menu" or "Unknown command" response
  • Test with unrecognized input: "asdfgh" should get helpful response

Warning signs: Some user messages disappear without response, execution history shows Switch node with no output paths taken

Pitfall 4: Case-Sensitive Keyword Matching

What goes wrong: User types "Status" (capitalized) but Switch rule checks for lowercase "status", command not recognized.
Why it happens: Switch node conditions are case-sensitive by default.
How to avoid:

  • Normalize input: $json.message.text.toLowerCase() in leftValue expression
  • Alternatively, set Switch node "Ignore Case" option to true
  • Test with various capitalizations: "status", "Status", "STATUS"

Warning signs: Same command works sometimes but not others based on capitalization
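
Calling toLowerCase() directly also throws when an update has no text, for example a photo or sticker. This is a minimal sketch of a safer normalization step that could run before the router; `normalizeText` is a hypothetical helper.

```javascript
// Hypothetical normalization step: returns lowercase trimmed text,
// or an empty string for updates that carry no text.
function normalizeText(update) {
  const text = update?.message?.text;
  return typeof text === "string" ? text.trim().toLowerCase() : "";
}

for (const input of ["status", "Status", "  STATUS  ", "📊 Status"]) {
  const normalized = normalizeText({ message: { text: input } });
  console.log(JSON.stringify(input), "->", normalized.includes("status"));
}
console.log(normalizeText({ message: { photo: [] } }) === ""); // true
```

All four text inputs match "status" after normalization, including the menu button label with its emoji prefix, and the photo update falls through to the fallback route instead of raising an error.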

Pitfall 5: Persistent Keyboard Overwrites

What goes wrong: Every response includes full keyboard definition, causing Telegram to re-render unnecessarily and creating visual flickering.
Why it happens: Setting reply_markup on every message instead of only on initial welcome or menu request.
How to avoid:

  • Send keyboard only on /start command, unknown input, or explicit menu request
  • Normal responses omit reply_markup parameter (preserves existing keyboard)
  • Use reply_markup: {"remove_keyboard": true} only when intentionally hiding keyboard

Warning signs: Keyboard flickers on every bot response, excessive data in Telegram messages
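
The rule of sending the keyboard only when asked can be kept in one place. This is a minimal sketch, assuming a single helper builds every outgoing message body; `buildResponse`, `MENU_KEYBOARD`, and the `showMenu` flag are hypothetical names.

```javascript
// Hypothetical shared keyboard definition (abbreviated to two rows here).
const MENU_KEYBOARD = {
  keyboard: [[{ text: "📊 Status" }], [{ text: "📜 Logs" }]],
  is_persistent: true,
  resize_keyboard: true,
};

// Attach reply_markup only when the caller asks for the menu.
// Omitting the field leaves the user's existing keyboard untouched.
function buildResponse(chatId, text, { showMenu = false } = {}) {
  const payload = { chat_id: chatId, text };
  if (showMenu) {
    payload.reply_markup = MENU_KEYBOARD;
  }
  return payload;
}

console.log(buildResponse(42, "plex restarted")); // no reply_markup key
console.log(buildResponse(42, "Unknown command", { showMenu: true }));
```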

Pitfall 6: Workflow Export Without Encryption Key

What goes wrong: Workflow imported on different n8n instance can't decrypt credentials, all authenticated nodes fail.
Why it happens: n8n uses N8N_ENCRYPTION_KEY for credential encryption; different instances have different keys.
How to avoid:

  • Document in README: credentials must be recreated on target n8n instance
  • Export workflow, manually create credentials on new instance
  • Never copy encryption key between environments (security risk)
  • Use external secrets manager (Vault, AWS Secrets Manager) for team environments

Warning signs: Imported workflow shows credentials as "missing" or nodes fail with auth errors
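
To document which credentials must be recreated, the exported JSON can be scanned for credential references. This is a minimal sketch, assuming each node stores its references under a `credentials` object keyed by credential type, as in the Notify User node of the error handler example; `listRequiredCredentials` is a hypothetical helper.

```javascript
// Hypothetical helper: collect the distinct credentials a workflow refers to,
// so the README can list what must be recreated on the target instance.
function listRequiredCredentials(workflow) {
  const required = new Map();
  for (const node of workflow.nodes ?? []) {
    for (const [type, ref] of Object.entries(node.credentials ?? {})) {
      // Key on type and name so the same credential is listed once
      required.set(`${type}:${ref.name}`, { type, name: ref.name });
    }
  }
  return [...required.values()];
}

const exported = {
  nodes: [
    { name: "Error Trigger", type: "n8n-nodes-base.errorTrigger" },
    {
      name: "Notify User",
      type: "n8n-nodes-base.telegram",
      credentials: { telegramApi: { id: "telegram-credential", name: "Telegram API" } },
    },
    {
      name: "Send Menu",
      type: "n8n-nodes-base.telegram",
      credentials: { telegramApi: { id: "telegram-credential", name: "Telegram API" } },
    },
  ],
};
console.log(listRequiredCredentials(exported));
```

For the sample, the result is a single entry with type `telegramApi` and name `Telegram API`, even though two nodes use it.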

Pitfall 7: Inline Keyboard Callback Data Limits

What goes wrong: Callback data exceeds Telegram's 64-byte limit, inline buttons fail silently.
Why it happens: Encoding full container names or multiple parameters in callback_data without length validation.
How to avoid:

  • Use short encoding: single-char action codes (s/t/r/x for start/stop/restart/update)
  • Validate callback_data size in bytes, not characters: Buffer.byteLength(callback_data, 'utf8') <= 64 (callback_data.length undercounts non-ASCII names)
  • Batch limit already addressed (4 containers max)

Warning signs: Inline buttons don't respond when clicked, no callback_query received
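
The short encoding and the size check fit in one helper. This is a minimal sketch using the action codes listed above; the `code:containerName` layout and the name `encodeCallbackData` are assumptions made for illustration, not the project's actual format.

```javascript
// Action codes from the list above; the "code:name" layout is an assumption.
const ACTION_CODES = { start: "s", stop: "t", restart: "r", update: "x" };
const MAX_CALLBACK_BYTES = 64;

function encodeCallbackData(action, containerName) {
  const code = ACTION_CODES[action];
  if (!code) {
    throw new Error(`Unknown action: ${action}`);
  }
  const data = `${code}:${containerName}`;
  // Measure UTF-8 bytes, because the limit is in bytes, not characters
  const bytes = new TextEncoder().encode(data).length;
  if (bytes > MAX_CALLBACK_BYTES) {
    throw new Error(`callback_data is ${bytes} bytes; limit is ${MAX_CALLBACK_BYTES}`);
  }
  return data;
}

console.log(encodeCallbackData("restart", "plex")); // r:plex

try {
  // 40 two-byte characters: 42 characters in total, but 82 bytes
  encodeCallbackData("stop", "é".repeat(40));
} catch (err) {
  console.log(err.message);
}
```

The second call passes a character-count check (42 is below 64) but fails the byte check, which is the case the validation needs to catch.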

Pitfall 8: Docker Socket Permission Errors After Deployment

What goes wrong: n8n container can execute curl commands but gets "permission denied" on /var/run/docker.sock.
Why it happens: n8n runs as node user (UID 1000) without docker group membership.
How to avoid:

  • n8n container must use --group-add 281 (docker group on Unraid)
  • Document in deployment README as required Docker run flag
  • Test with: docker exec n8n curl --unix-socket /var/run/docker.sock http://localhost/containers/json

Warning signs: "Cannot connect to Docker" messages, curl permission denied errors
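
When the socket test succeeds, the endpoint returns a JSON array of container objects. This is a minimal sketch of turning that array into a short status message, assuming the Docker Engine API fields Names (with a leading slash), State, and Status; `summarizeContainers` is a hypothetical helper and the sample data is made up.

```javascript
// Hypothetical formatter for the /containers/json response.
function summarizeContainers(containers) {
  return containers
    .map((container) => {
      // Docker reports names with a leading slash, e.g. "/plex"
      const name = (container.Names?.[0] ?? "unknown").replace(/^\//, "");
      return `${name}: ${container.State} (${container.Status})`;
    })
    .join("\n");
}

const response = [
  { Names: ["/plex"], State: "running", Status: "Up 2 hours" },
  { Names: ["/sonarr"], State: "exited", Status: "Exited (0) 5 minutes ago" },
];
console.log(summarizeContainers(response));
```

Without the `all=true` query parameter the endpoint lists only running containers, so a status flow that should show stopped containers needs that parameter on the request.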

Code Examples

Verified patterns from official sources:

Keyword Router Switch Node

{
  "parameters": {
    "rules": {
      "values": [
        {
          "id": "match-status",
          "conditions": {
            "options": {
              "caseSensitive": false
            },
            "conditions": [
              {
                "leftValue": "={{ $json.message.text }}",
                "rightValue": "status",
                "operator": {
                  "type": "string",
                  "operation": "contains"
                }
              }
            ]
          },
          "renameOutput": true,
          "outputKey": "status"
        },
        {
          "id": "match-start",
          "conditions": {
            "options": {
              "caseSensitive": false
            },
            "conditions": [
              {
                "leftValue": "={{ $json.message.text }}",
                "rightValue": "start",
                "operator": {
                  "type": "string",
                  "operation": "contains"
                }
              }
            ]
          },
          "renameOutput": true,
          "outputKey": "start"
        }
      ]
    },
    "options": {
      "fallbackOutput": "extra"
    }
  },
  "name": "Keyword Router",
  "type": "n8n-nodes-base.switch"
}

Source: n8n Switch node docs

Persistent Menu with HTTP Request Node

// In n8n HTTP Request node sending Telegram message
// URL: https://api.telegram.org/bot{{ $credentials.telegramApi.token }}/sendMessage
// Method: POST
// Body (JSON):
{
  "chat_id": "={{ $json.message.chat.id }}",
  "text": "Use buttons below or type commands:",
  "parse_mode": "HTML",
  "reply_markup": {
    "keyboard": [
      [{"text": "📊 Status"}],
      [{"text": "▶️ Start"}, {"text": "⏹️ Stop"}],
      [{"text": "🔄 Restart"}, {"text": "⬆️ Update"}],
      [{"text": "📜 Logs"}]
    ],
    "is_persistent": true,
    "resize_keyboard": true,
    "one_time_keyboard": false
  }
}

Source: Telegram Bot API - ReplyKeyboardMarkup

Error Handler Workflow

{
  "name": "Docker Bot Error Handler",
  "nodes": [
    {
      "parameters": {},
      "name": "Error Trigger",
      "type": "n8n-nodes-base.errorTrigger",
      "position": [240, 300]
    },
    {
      "parameters": {
        "jsCode": "// Format error for user notification\n// Error Trigger items carry the failure under execution.error\nconst { execution = {}, workflow = {} } = $input.first().json;\nconst message = execution.error?.message ?? 'Unknown error';\n\n// Check for Docker socket errors\nif (message.includes('docker.sock')) {\n  return [{ json: {\n    userMessage: 'Cannot connect to Docker',\n    adminMessage: `Docker socket error in ${workflow.name}: ${message}`\n  } }];\n}\n\n// Generic infrastructure error\nreturn [{ json: {\n  userMessage: 'Something went wrong',\n  adminMessage: `Error in ${workflow.name} at node ${execution.lastNodeExecuted}: ${message}`\n} }];"
      },
      "name": "Format Error",
      "type": "n8n-nodes-base.code",
      "position": [440, 300]
    },
    {
      "parameters": {
        "chatId": "={{ $credentials.telegramAuth.userId }}",
        "text": "={{ $json.userMessage }}",
        "additionalFields": {
          "parse_mode": "HTML"
        }
      },
      "name": "Notify User",
      "type": "n8n-nodes-base.telegram",
      "position": [640, 300],
      "credentials": {
        "telegramApi": {
          "id": "telegram-credential",
          "name": "Telegram API"
        }
      }
    }
  ]
}

Source: n8n Error Trigger documentation

Credential Reference Pattern

// In n8n IF node - check authorized user
// Instead of hardcoding: $json.message.from.id === 123456789
// Create credential type "Telegram Auth" with field "userId"
// Then reference in condition:

// Condition leftValue:
={{ $json.message.from.id }}

// Condition rightValue (using credential):
={{ parseInt($credentials.telegramAuth.userId, 10) }}

// operator: equals (number type)

Source: n8n Credentials Library

Deployment README Template

# Docker Manager Bot - Deployment Guide

## Prerequisites

- Unraid server with Docker enabled
- n8n container running on Unraid
- Telegram Bot Token (from @BotFather)
- Your Telegram User ID (from @userinfobot)

## Installation Steps

### 1. Create n8n Credentials

In n8n UI, create two credentials:

**Telegram API:**
- Type: Telegram API
- Name: `Telegram API`
- Access Token: `<your bot token from @BotFather>`

**Telegram Auth:**
- Type: Generic Credential Type → HTTP Header Auth
- Name: `Telegram Auth`
- Add custom field: `userId` = `<your Telegram user ID>`

### 2. Import Workflow

1. Copy `n8n-workflow.json` to your server
2. In n8n UI: Workflows → Import from File
3. Select `n8n-workflow.json`
4. Map credentials when prompted:
   - `Telegram API` → your Telegram API credential
   - `Telegram Auth` → your Telegram Auth credential

### 3. Configure n8n Container

Ensure n8n container has Docker socket access:

```bash
docker run -d \
  --name n8n \
  --group-add 281 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /path/to/curl:/usr/bin/curl:ro \
  n8nio/n8n
```

**Required:**
- `--group-add 281` - Docker group for socket access
- Socket mount: `/var/run/docker.sock`
- Static curl binary mount

### 4. Activate Workflow

1. Open imported workflow in n8n
2. Click "Active" toggle in top-right
3. Test by messaging your bot: "status"

## Usage

Send commands via Telegram:
- `status` - View container status
- `start` - Start container
- `stop` - Stop container
- `restart` - Restart container
- `update` - Pull latest image and restart
- `logs` - View recent logs

Or use persistent menu buttons for common actions.

## Troubleshooting

**Bot doesn't respond:**
- Check workflow is Active
- Verify Telegram credentials are correct
- Check n8n execution logs

**"Cannot connect to Docker":**
- Verify `--group-add 281` in n8n container
- Check docker.sock mount exists
- Test: `docker exec n8n curl --unix-socket /var/run/docker.sock http://localhost/containers/json`

**Credentials missing after import:**
- Credentials are not exported with workflow
- Recreate credentials in n8n UI
- Re-map in workflow settings

**Source:** [README Best Practices](https://github.com/jehna/readme-best-practices)

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Claude API for NLU | Keyword matching with Switch node | 2026-01-31 | Removes external API dependency, faster response, no API costs |
| Commands menu | Persistent ReplyKeyboardMarkup | Telegram Bot API 6.4 (is_persistent) | Menu always visible, better UX for non-technical users |
| Hardcoded user ID | n8n credentials system | Project start | Allows sharing workflow without exposing sensitive data |
| Manual workflow backup | Git version control | Industry standard | Enables rollback, change tracking, team collaboration |
| Ad-hoc error handling | Error Trigger workflow | n8n v0.x | Centralized error management, consistent user experience |

**Deprecated/outdated:**
- **Custom keyboard on each message**: Use is_persistent instead - avoids re-rendering and flickering
- **Environment variables in n8n CE**: Use credentials system - env vars blocked in expressions
- **"Save Execution Progress" in production**: Disable - causes excessive database writes (known performance issue)
- **IF node cascades for routing**: Use Switch node - cleaner multiple-output routing

## Open Questions

Things that couldn't be fully resolved:

1. **Exact menu button layout UX**
   - What we know: Telegram supports grouped buttons (arrays within keyboard array), emojis render correctly
   - What's unclear: Optimal grouping for 6 commands (Status + 5 actions) - user preference on rows vs columns
   - Recommendation: Start with CONTEXT.md structure (Status solo, Actions in pairs), iterate based on user feedback during testing

2. **Retry buttons on retriable errors**
   - What we know: Telegram inline keyboards can include retry buttons that re-trigger callback with same parameters
   - What's unclear: Whether retry UX adds value vs just asking user to tap action again
   - Recommendation: Mark as Claude's discretion in CONTEXT.md - implement if time permits, not critical for v1.0

3. **README location**
   - What we know: Root README is standard for project entry point, docs/ folder separates documentation from code
   - What's unclear: This is n8n workflow (JSON) not code - root vs docs/ both valid
   - Recommendation: Use root README.md (marked as Claude's discretion) - single-file deployment guide, no docs/ needed for single-workflow project

## Sources

### Primary (HIGH confidence)
- [n8n Switch node documentation](https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.switch/) - Keyword routing patterns
- [n8n Error Handling documentation](https://docs.n8n.io/flow-logic/error-handling/) - Error Trigger workflow setup
- [n8n Credentials Library](https://docs.n8n.io/credentials/) - Credential system and references
- [n8n Workflow Export/Import](https://docs.n8n.io/workflows/export-import/) - Export best practices and sensitive data handling
- [Telegram Bot API](https://core.telegram.org/bots/api) - ReplyKeyboardMarkup and is_persistent parameter

### Secondary (MEDIUM confidence)
- [n8n Credential Hygiene (Medium, Jan 2026)](https://medium.com/@bhagyarana80/n8n-credential-hygiene-for-self-hosted-reality-cfa90ef1a114) - Credential best practices verified with official docs
- [7 Common n8n Workflow Mistakes (Medium, Jan 2026)](https://medium.com/@juanm.acebal/7-common-n8n-workflow-mistakes-that-can-break-your-automations-9638903fb076) - Pitfalls cross-referenced with n8n documentation
- [n8n Workflow Testing (Medium, Jan 2026)](https://medium.com/@Modexa/n8n-workflow-testing-without-the-panic-deploy-7376586a8b43) - Testing practices verified with community discussions
- [Seven n8n Workflow Best Practices for 2026](https://michaelitoback.com/n8n-workflow-best-practices/) - Current best practices aggregated from multiple sources
- [README Best Practices](https://github.com/jehna/readme-best-practices) - README structure template

### Tertiary (LOW confidence)
- [n8n Telegram Bot Templates](https://n8n.io/workflows/) - Example workflows for pattern reference, not authoritative for best practices
- Various n8n Community Forum discussions - Real-world issues but not official guidance

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH - Official n8n and Telegram Bot API documentation verified
- Architecture patterns: HIGH - Direct verification with official docs and existing workflow structure
- Pitfalls: MEDIUM - Mix of official documentation (Error Trigger) and community-reported issues (verified where possible)
- Code examples: HIGH - All examples based on official API documentation and n8n node schemas

**Research date:** 2026-01-31
**Valid until:** 2026-02-28 (28 days) - n8n is a stable platform, and the Telegram Bot API is unlikely to change core features