Lucas Berger 23ba84ae4e docs(11): research phase domain

2026-02-08 18:56:44 -05:00


Phase 11: Update All & Callback Limits - Research

Researched: 2026-02-08
Domain: Telegram Bot API callback data optimization, n8n workflow state management
Confidence: HIGH

Summary

Phase 11 adds "update all" functionality for :latest containers and works around Telegram's 64-byte callback_data limit, which currently restricts batch selection to ~2 containers. The main workflow already has a partial "update all" implementation (text command routing, confirmation keyboard, :latest filtering) but lacks an inline keyboard entry point. The critical blocker is the batch selection keyboard's callback_data format (batch:toggle:0::plex = 20 bytes with nothing selected, plus the CSV of selected containers), which grows linearly with the selection and reaches the 64-byte limit after only a few selections (2-3 with longer container names).

Primary recommendation: Replace the CSV-in-callback approach with server-side state storage in n8n workflow static data to track batch selection state, reducing callback_data to short tokens whose size does not depend on the selection (e.g., batch:toggle:abc123:plex, where abc123 is a session key). Add an "Update All" button to the container list keyboard that triggers the existing update-all confirmation flow.

Standard Stack

Core

Library	Version	Purpose	Why Standard
Telegram Bot API	7.0+	Inline keyboard, callback queries	Official Telegram bot interface, 64-byte callback_data limit enforced
n8n workflow	1.x	Orchestration, sub-workflow execution	Project's existing automation platform
n8n static data	n8n built-in	Workflow-scoped persistence	n8n's native state storage (scoped to one workflow, not instance-global; Phase 10.2 UAT found it may not persist between executions)

Supporting

Library	Version	Purpose	When to Use
Docker API	1.47	Container list, image tags	Filtering :latest containers for update-all
JavaScript (n8n Code nodes)	ES6+	Callback parsing, keyboard building	All workflow logic implemented in Code nodes

Alternatives Considered

Instead of	Could Use	Tradeoff
n8n static data	External Redis/DB	n8n static data is execution-scoped (doesn't persist between executions), but workflow executions in this bot are short-lived (sub-minute), so state only needs to survive within a single conversation flow; Redis adds infrastructure complexity
Callback data tokens	Protobuf + base85	Protobuf/base85 saves ~30% space but still hits 64-byte limit with 3+ selections; token approach eliminates linear growth
Session tokens	Callback data compression	Compression saves bytes but doesn't solve the fundamental limit; tokens cap the size at ~20 bytes plus the container name, regardless of selection count

Installation: No new dependencies. Changes confined to existing n8n workflow JSON files.

Architecture Patterns

Current State (Problematic)

Batch selection callback data format:

batch:toggle:{page}:{selectedCsv}:{containerName}
Example: batch:toggle:0:plex,sonarr,radarr:jellyfin
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = 42 bytes (16 bytes prefix and separators + 18 bytes CSV + 8 bytes toggle name)

Problem: With 4 short-named containers already selected (plex, sonarr, radarr, nzbget) and jellyfin being toggled:

Prefix overhead: batch:toggle:0:: = 16 bytes
Selection CSV: plex,sonarr,radarr,nzbget = 25 bytes
Toggle name: jellyfin = 8 bytes
Total: 49 bytes (leaves only 15 bytes headroom)

With jellyfin as a 5th selection: plex,sonarr,radarr,nzbget,jellyfin = 34 bytes → 50 bytes before the toggle name, so any toggle name longer than 14 bytes goes over the limit. Longer container names reach the limit after fewer selections.
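
The byte counts in this section can be reproduced mechanically. The following is a minimal sketch, assuming a Node.js runtime (for Buffer); buildLegacyToggle, callbackBytes, and fitsTelegramLimit are hypothetical helper names that reproduce the current CSV-in-callback format and are not taken from the workflow.

```javascript
// Telegram documents callback_data as 1-64 bytes.
const CALLBACK_DATA_LIMIT = 64;

// Hypothetical helper reproducing the current format:
// batch:toggle:{page}:{selectedCsv}:{containerName}
function buildLegacyToggle(page, selected, toggleName) {
  return `batch:toggle:${page}:${selected.join(',')}:${toggleName}`;
}

// The limit counts bytes of the UTF-8 encoding, not JavaScript string length.
function callbackBytes(data) {
  return Buffer.byteLength(data, 'utf8');
}

function fitsTelegramLimit(data) {
  const bytes = callbackBytes(data);
  return bytes >= 1 && bytes <= CALLBACK_DATA_LIMIT;
}

// Grow the selection one container at a time and report the size of the
// callback that would toggle a long-named container.
const names = ['plex', 'sonarr', 'radarr', 'nzbget', 'jellyfin'];
for (let count = 0; count <= names.length; count++) {
  const data = buildLegacyToggle(0, names.slice(0, count), 'my-container-name-v2');
  console.log(count, callbackBytes(data), fitsTelegramLimit(data));
}
```

Running the loop shows the size growing with every added selection, which is the linear growth the session-based format is meant to remove.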

Recommended Pattern: Session-Based State Storage

Callback Data Format (Fixed Size):
batch:toggle:{sessionId}:{containerName}
Example: batch:toggle:a7f3d2:plex
         ^^^^^^^^^^^^^^^^^^^^^^^^ = 24 bytes (20 bytes of prefix and session ID + container name; independent of selection size)

State Storage (n8n Static Data):
{
  "batchSessions": {
    "a7f3d2": {
      "chatId": 563878771,
      "page": 0,
      "selected": ["plex", "sonarr", "radarr", "nzbget", "jellyfin"],
      "action": "stop",
      "created": 1738972800000,
      "expires": 1738973100000  // 5 minutes TTL
    }
  }
}

Benefits:

Callback data size stays at 20 bytes plus the container name (24 bytes for plex, about 60% below the 64-byte ceiling), regardless of selection count
Supports unlimited container selections
Session cleanup prevents static data bloat

Pattern Implementation: Session Lifecycle

1. Session Creation (Batch Mode Entry)

// Code node: "Initialize Batch Session"
const staticData = $getWorkflowStaticData('global');
const sessions = JSON.parse(staticData._batchSessions || '{}');

// Generate 6-char session ID
const sessionId = Math.random().toString(36).substring(2, 8);
const now = Date.now();

sessions[sessionId] = {
  chatId: $json.chatId,
  page: 0,
  selected: [],
  action: $json.batchAction || 'stop',
  created: now,
  expires: now + 300000  // 5 minutes
};

// Clean expired sessions (prevent bloat)
Object.keys(sessions).forEach(id => {
  if (sessions[id].expires < now) delete sessions[id];
});

staticData._batchSessions = JSON.stringify(sessions);

return { json: { sessionId, chatId: $json.chatId } };

2. Session Update (Toggle Selection)

// Code node: "Update Batch Session"
const staticData = $getWorkflowStaticData('global');
const sessions = JSON.parse(staticData._batchSessions || '{}');
const sessionId = $json.sessionId;
const toggleName = $json.toggleName;

// Treat missing and expired sessions the same way
if (!sessions[sessionId] || sessions[sessionId].expires < Date.now()) {
  return { json: { error: 'Session expired', chatId: $json.chatId } };
}

const session = sessions[sessionId];
const selected = new Set(session.selected);

// Toggle selection
if (selected.has(toggleName)) {
  selected.delete(toggleName);
} else {
  selected.add(toggleName);
}

session.selected = Array.from(selected);
staticData._batchSessions = JSON.stringify(sessions);

return { json: {
  sessionId,
  selectedCount: selected.size,
  selectedCsv: session.selected.join(',')
} };

3. Keyboard Building (Retrieve Session)

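// Code node: "Build Batch Keyboard With Session"
const staticData = $getWorkflowStaticData('global');
const sessions = JSON.parse(staticData._batchSessions || '{}');
const sessionId = $json.sessionId;
const session = sessions[sessionId];

// Bail out early if the session is missing or past its TTL
if (!session || session.expires < Date.now()) {
  return { json: { error: 'Session expired', chatId: $json.chatId } };
}

const selectedSet = new Set(session.selected);
const page = session.page;
// Assumption: the containers for the current page arrive on the input item
const displayContainers = $json.containers || [];

// Build keyboard; callback size no longer depends on selection count
const keyboard = displayContainers.map(c => {
  const isSelected = selectedSet.has(c.name);
  const icon = c.state === 'running' ? '🟢' : '⚪';
  const checkmark = isSelected ? '✓ ' : '';
  return [{
    text: `${checkmark}${icon} ${c.name}`,
    callback_data: `batch:toggle:${sessionId}:${c.name}`  // 20 bytes + name
  }];
});

// Navigation buttons also use session ID
const navRow = [];
if (page > 0) {
  navRow.push({
    text: '◀️ Previous',
    callback_data: `batch:nav:${sessionId}:${page - 1}`
  });
}
if (navRow.length > 0) keyboard.push(navRow);

return { json: {
  sessionId,
  chatId: session.chatId,
  reply_markup: { inline_keyboard: keyboard }
} };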

Pattern 2: Update All Entry Points

Text Command (Already Implemented):

User: "update all"
  ↓
Keyword Router → "updateall" output
  ↓
Get All Containers For Update All (HTTP: filter :latest)
  ↓
Build Update All Confirmation (keyboard with uall:confirm:{timestamp})
  ↓
Send confirmation message

Inline Keyboard Entry Point (NEW):

Container List keyboard:
[🟢 plex]   [🟢 sonarr]
[🟢 radarr] [⚪ nzbget]
──────────────────────
[🔄 Update All :latest] ← NEW BUTTON
[◀️ Previous] [1/2] [Next ▶️]

Callback data: uall:start (10 bytes, no parameters needed — fetches :latest containers on click)
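
To make the flow above concrete, here is a hedged sketch of the filter-then-confirm step. It assumes container objects shaped like the Docker Engine API list response (Image and Names fields); isLatestImage, buildUpdateAllConfirmation, and the uall:cancel callback are hypothetical names used for illustration, and the existing workflow's own :latest filter may differ.

```javascript
// Returns true when an image reference resolves to the :latest tag.
// An untagged reference (e.g. "nginx") implies :latest; a digest-pinned
// reference (name@sha256:...) is treated as not :latest.
function isLatestImage(image) {
  if (image.includes('@')) return false;
  // Only look for a tag after the last slash, so a registry port
  // such as localhost:5000/plex is not mistaken for a tag.
  const lastSegment = image.slice(image.lastIndexOf('/') + 1);
  const colon = lastSegment.indexOf(':');
  const tag = colon === -1 ? 'latest' : lastSegment.slice(colon + 1);
  return tag === 'latest';
}

// Builds the confirmation message. uall:confirm:{timestamp} comes from the
// flow above; uall:cancel is a hypothetical callback for illustration.
function buildUpdateAllConfirmation(containers, now = Date.now()) {
  const targets = containers
    .filter(c => isLatestImage(c.Image))
    .map(c => c.Names[0].replace(/^\//, ''));
  return {
    text: `Update ${targets.length} containers?`,
    targets,
    reply_markup: {
      inline_keyboard: [[
        { text: 'Confirm', callback_data: `uall:confirm:${now}` },
        { text: 'Cancel', callback_data: 'uall:cancel' },
      ]],
    },
  };
}

const sample = [
  { Names: ['/plex'], Image: 'lscr.io/linuxserver/plex:latest' },
  { Names: ['/db'], Image: 'postgres:16' },
  { Names: ['/proxy'], Image: 'nginx' },
];
console.log(buildUpdateAllConfirmation(sample).targets);
```

Because the uall:start button carries no parameters, this filtering would run only when the button is pressed, not on every list render.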

Anti-Patterns to Avoid

Storing entire selection in callback_data: Hits 64-byte limit after 2-3 containers
Using message ID as session key: Message ID reused across conversations; use generated tokens
Global session store without TTL: n8n static data persists indefinitely; must clean expired sessions
Session lookup without expiry check: Old sessions can cause stale state bugs
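
The last two anti-patterns can be avoided with a single lookup helper. The sketch below is one possible shape, not the workflow's implementation: getLiveSession is a hypothetical name, and the chat ID comparison is an added assumption so that a token only resolves for the chat that created it.

```javascript
// Returns the session only if it exists, has not expired, and belongs to
// the chat that sent the callback. Otherwise returns null.
function getLiveSession(sessions, sessionId, chatId, now = Date.now()) {
  // hasOwnProperty guards against inherited keys such as "constructor"
  if (!Object.prototype.hasOwnProperty.call(sessions, sessionId)) return null;
  const session = sessions[sessionId];
  if (session.expires < now) return null;
  if (session.chatId !== chatId) return null;
  return session;
}

const sessions = {
  a7f3d2: { chatId: 563878771, selected: ['plex'], expires: 1738973100000 },
};
console.log(getLiveSession(sessions, 'a7f3d2', 563878771, 1738972800000));
console.log(getLiveSession(sessions, 'a7f3d2', 563878771, 1738973100001));
```

Callers then handle a single null case ("session expired") instead of checking existence and expiry separately.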

Don't Hand-Roll

Problem	Don't Build	Use Instead	Why
Callback data compression	Custom LZ4/zlib compression	Session tokens + static data	Compression can't bypass 64-byte hard limit; tokens eliminate size dependency
Session ID generation	Timestamp-based sequential IDs	Math.random().toString(36)	Sequential IDs leak execution count; random alphanumeric sufficient for short-lived sessions
Static data serialization	Custom binary format	JSON.stringify/parse	n8n static data already uses JSON internally; custom format adds complexity
Session cleanup	Background cron node	Inline cleanup on session access	n8n workflows don't support background tasks; cleanup-on-access prevents bloat

Key insight: Telegram's 64-byte limit is a hard constraint enforced at the API level. The only viable workarounds are: (1) reduce callback_data to fixed-size tokens, or (2) use alternative callback methods (e.g., switch_inline_query). Token-based approach is simplest and requires no architecture changes beyond state management.

Common Pitfalls

Pitfall 1: n8n Static Data Scope Confusion

What goes wrong: Assuming $getWorkflowStaticData('global') persists across workflow activations or different workflow instances

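Why it happens: "Global" means "workflow-scoped" (shared by all nodes in one workflow), not "instance-global" (shared across workflows or guaranteed to persist forever). n8n's documentation also states that static data is saved only when an active workflow is started by a trigger or webhook, not during manual test runs. From Phase 10.2 UAT: static data behaved as execution-scoped in n8n cloud and may not persist between executions.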

How to avoid:

Document that sessions are conversation-scoped (survive single execution only)
Implement TTL cleanup to prevent session bloat in long-running executions
Test session persistence across multiple callback interactions in same execution

Warning signs:

User reports "session expired" immediately after creating batch selection
Static data object grows unbounded with old session IDs
Session lookups fail after workflow re-activation

Pitfall 2: Deep Nested Mutation of Static Data

What goes wrong: Modifying staticData.sessions.abc123.selected.push('plex') doesn't persist changes

Why it happens: n8n only tracks top-level property changes for static data persistence. Deep mutations are silently lost. (From CLAUDE.md: "Deep nested mutations are silently lost. Always use JSON serialization.")

How to avoid:

// WRONG - deep mutation not persisted
staticData.sessions[sessionId].selected.push('plex');

// CORRECT - top-level assignment persisted
const sessions = JSON.parse(staticData._batchSessions || '{}');
sessions[sessionId].selected.push('plex');
staticData._batchSessions = JSON.stringify(sessions);
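
One way to make the serialization rule hard to forget is to route every access through a pair of helpers. This is a sketch under stated assumptions: readSessions and writeSessions are hypothetical names, and a plain object stands in for $getWorkflowStaticData('global') so the code runs outside n8n.

```javascript
// Key used throughout this document's examples.
const SESSIONS_KEY = '_batchSessions';

// Parses the stored JSON string; falls back to an empty store if the
// value is missing, corrupt, or not a plain object.
function readSessions(staticData) {
  try {
    const parsed = JSON.parse(staticData[SESSIONS_KEY] || '{}');
    const isPlainObject = parsed && typeof parsed === 'object' && !Array.isArray(parsed);
    return isPlainObject ? parsed : {};
  } catch (err) {
    return {};
  }
}

// Always writes a string to a top-level property, never a nested object.
function writeSessions(staticData, sessions) {
  staticData[SESSIONS_KEY] = JSON.stringify(sessions);
}

// Stand-in for $getWorkflowStaticData('global')
const staticData = {};
const sessions = readSessions(staticData);
sessions.a7f3d2 = { chatId: 563878771, selected: [], expires: Date.now() + 300000 };
sessions.a7f3d2.selected.push('plex'); // safe: mutates the parsed copy only
writeSessions(staticData, sessions);
console.log(staticData._batchSessions);
```

With these helpers, nested mutation happens only on the parsed copy, and the only write that reaches static data is a top-level string assignment.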

Warning signs:

Session state reverts to initial state after toggle
Selection list shows empty array despite successful toggles
Debugging shows correct in-memory values but wrong persisted values

Pitfall 3: Callback Data URL Encoding

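What goes wrong: Container names that are long or contain special or non-ASCII characters exceed the 64-byte limit even though the string looks short enough in code

Why it happens: Telegram measures callback_data in bytes, not characters. Any percent-encoding applied to a name before it is placed in callback_data counts against the limit (container name becomes container%20name, +2 bytes per space), as does every multi-byte UTF-8 character.

How to avoid:

Normalize container names to remove leading slash (Docker returns /plex, store as plex)
Session tokens are alphanumeric only (no encoding needed)
Measure callback_data length in UTF-8 bytes, not string length, before sending
Test with long container names and names containing dashes, underscores, and dots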
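
A guard at the point where callback_data is built catches oversized values before they reach the API. The following is a minimal sketch assuming Node.js (Buffer); normalizeContainerName and buildToggleCallback are hypothetical names, and throwing a RangeError is just one possible way to surface the problem.

```javascript
const CALLBACK_DATA_LIMIT = 64;

// Docker's API reports names with a leading slash (/plex); strip it.
function normalizeContainerName(name) {
  return name.replace(/^\//, '');
}

// Builds batch:toggle:{sessionId}:{containerName} and refuses values that
// exceed the limit, measured in UTF-8 bytes rather than characters.
function buildToggleCallback(sessionId, containerName) {
  const data = `batch:toggle:${sessionId}:${normalizeContainerName(containerName)}`;
  const bytes = Buffer.byteLength(data, 'utf8');
  if (bytes > CALLBACK_DATA_LIMIT) {
    throw new RangeError(`callback_data is ${bytes} bytes (limit ${CALLBACK_DATA_LIMIT})`);
  }
  return data;
}

console.log(buildToggleCallback('a7f3d2', '/plex'));
```

With a 6-character session ID the prefix takes 20 bytes, which leaves 44 bytes for the container name.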

Warning signs:

Batch toggle works for plex but fails for my-container-name-v2
Telegram API returns 400 Bad Request with no error details
Callback data length looks under 64 bytes in code but fails at API

Pitfall 4: Update All Without Confirmation

What goes wrong: Adding "Update All" button that immediately triggers batch update without confirmation

Why it happens: Copying pattern from batch start/stop exec buttons, which show confirmation for stop only

How to avoid:

ALWAYS show confirmation for update-all (updates are destructive — image pull can fail, container recreation can break state)
Reuse existing Build Update All Confirmation code node (already implemented at line 2810 in main workflow)
Add inline keyboard entry point that routes to confirmation flow, not direct execution

Warning signs:

User reports containers updated without confirmation prompt
Update-all triggers immediately on button press
No 30-second timeout check for update-all
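
The timeout check named in the last warning sign could look like the sketch below. It assumes the uall:confirm:{timestamp} format shown earlier and a 30-second window; isConfirmationFresh and CONFIRM_TIMEOUT_MS are hypothetical names, and the existing workflow may implement the check differently.

```javascript
const CONFIRM_TIMEOUT_MS = 30000; // 30-second confirmation window

// Parses uall:confirm:{timestamp} and reports whether the confirmation is
// still within the window. Malformed data and timestamps from the future
// are treated as not fresh.
function isConfirmationFresh(callbackData, now = Date.now()) {
  const match = /^uall:confirm:(\d+)$/.exec(callbackData);
  if (!match) return false;
  const age = now - Number(match[1]);
  return age >= 0 && age <= CONFIRM_TIMEOUT_MS;
}

const issued = Date.now();
console.log(isConfirmationFresh(`uall:confirm:${issued}`, issued + 5000));
console.log(isConfirmationFresh(`uall:confirm:${issued}`, issued + 45000));
```

Rejecting stale confirmations means an old message left in the chat cannot trigger a batch update long after it was shown.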

Code Examples

Verified patterns from existing implementation and Telegram Bot API:

Session-Based Batch Toggle

// Source: n8n-batch-ui.json + Telegram Bot API docs
// Modified from current CSV-in-callback to session-based approach

// Code node: "Handle Toggle With Session"
const triggerData = $('When executed by another workflow').item.json;
const sessionId = triggerData.sessionId;
const toggleName = triggerData.toggleName;
const chatId = triggerData.chatId;

// Load session state
const staticData = $getWorkflowStaticData('global');
const sessions = JSON.parse(staticData._batchSessions || '{}');

// Treat missing and expired sessions the same way
if (!sessions[sessionId] || sessions[sessionId].expires < Date.now()) {
  return {
    json: {
      success: false,
      action: 'expired',
      queryId: triggerData.queryId,
      chatId: chatId,
      answerText: 'Session expired (5 min timeout)',
      showAlert: true
    }
  };
}

const session = sessions[sessionId];
const selectedSet = new Set(session.selected);

// Toggle selection
if (selectedSet.has(toggleName)) {
  selectedSet.delete(toggleName);
} else {
  selectedSet.add(toggleName);
}

session.selected = Array.from(selectedSet);

// CRITICAL: Top-level assignment for persistence
staticData._batchSessions = JSON.stringify(sessions);

return {
  json: {
    success: true,
    action: 'toggle_update',
    sessionId: sessionId,
    selectedCount: selectedSet.size,
    selectedCsv: session.selected.join(','),
    needsKeyboardUpdate: true
  }
};

Update All Inline Keyboard Entry

// Source: n8n-status.json Build Container List node
// Add "Update All" button to container list keyboard

// Code node: "Build Container List" (modified)
// ... existing container list logic ...

// Add Update All button row after pagination
keyboard.push([
  {
    text: '🔄 Update All :latest',
    callback_data: 'uall:start'  // 10 bytes, triggers existing flow
  }
]);

return {
  json: {
    success: true,
    action: 'list',
    chatId: chatId,
    messageId: messageId,
    text: message,
    reply_markup: { inline_keyboard: keyboard }
  }
};

Session Cleanup on Access

// Source: n8n best practices + project patterns
// Clean expired sessions every time static data is accessed

function getSessionsWithCleanup() {
  const staticData = $getWorkflowStaticData('global');
  const sessions = JSON.parse(staticData._batchSessions || '{}');
  const now = Date.now();
  let cleaned = false;

  // Remove expired sessions (5-minute TTL)
  Object.keys(sessions).forEach(id => {
    if (sessions[id].expires < now) {
      delete sessions[id];
      cleaned = true;
    }
  });

  // Persist cleanup
  if (cleaned) {
    staticData._batchSessions = JSON.stringify(sessions);
  }

  return sessions;
}

// Usage in any session-access code
const sessions = getSessionsWithCleanup();

Callback Data Parser Update

// Source: n8n-workflow.json "Parse Callback Data" node (line 589)
// Add session-based batch toggle parsing

// Existing: batch:toggle:{page}:{selectedCsv}:{containerName}
// New:      batch:toggle:{sessionId}:{containerName}

if (rawData.startsWith('batch:toggle:')) {
  const parts = rawData.substring(13).split(':');
  const sessionId = parts[0];  // Changed from page number
  const toggleName = parts.slice(1).join(':');  // Handle names with colons

  return {
    json: {
      queryId,
      chatId,
      messageId,
      isBatchToggle: true,
      sessionId: sessionId,  // NEW field
      toggleName: toggleName,
      // Removed: batchPage, selectedCsv (now in session state)
    }
  };
}

// NEW: Update All start button
if (rawData === 'uall:start') {
  return {
    json: {
      queryId,
      chatId,
      messageId,
      isUpdateAllStart: true,  // Routes to existing confirmation flow
    }
  };
}
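
The session-based keyboard also emits navigation callbacks of the form batch:nav:{sessionId}:{page}, so the parser would need a matching branch. The sketch below is a standalone version under stated assumptions: parseBatchNav and the isBatchNav field are hypothetical names chosen to mirror isBatchToggle.

```javascript
// Parses batch:nav:{sessionId}:{page}. Returns null for anything else,
// including non-numeric or negative page values and extra segments.
function parseBatchNav(rawData) {
  const prefix = 'batch:nav:';
  if (!rawData.startsWith(prefix)) return null;
  const [sessionId, pageText, ...rest] = rawData.slice(prefix.length).split(':');
  if (!sessionId || rest.length > 0 || !/^\d+$/.test(pageText || '')) return null;
  return { isBatchNav: true, sessionId, page: Number(pageText) };
}

console.log(parseBatchNav('batch:nav:a7f3d2:1'));
console.log(parseBatchNav('batch:toggle:a7f3d2:plex'));
```

Inside the Code node, a non-null result would be merged with queryId, chatId, and messageId in the same way as the toggle branch.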

State of the Art

Old Approach	Current Approach	When Changed	Impact
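CSV in callback_data	Session tokens + server state	The 64-byte callback_data limit has applied since inline keyboards were introduced (Bot API 2.0, 2016) and is unchanged in Bot API 7.0	Libraries like python-telegram-bot ship CallbackDataCache (arbitrary callback data) for server-side storage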
Manual session cleanup	Inline cleanup on access	n8n lacks background tasks	Must clean on every session read to prevent bloat
Direct :latest image pull	Filter then confirm	Docker Hub rate limits (2020)	Always confirm batch operations to avoid wasted pulls
Batch exec without limit UI	Multi-select keyboard	Telegram inline keyboard UX (2018+)	Users expect checkbox-style interfaces for batch selection

Deprecated/outdated:

Storing full selection in callback_data: python-telegram-bot provides CallbackDataCache (its arbitrary callback data feature) so that large payloads are kept server-side instead of in callback_data
Unlimited batch operations without confirmation: Docker Hub introduced pull rate limits in November 2020 (100 pulls per 6 hours for anonymous users, 200 for authenticated free accounts), so always confirm before batch image pulls
Using message_id as state key: Early Telegram bots used message ID for state lookup, but message IDs are reused across chats — use chat_id + random token

Open Questions

What is the maximum practical selection size?
- What we know: Session state stored in n8n static data (execution-scoped JSON)
- What's unclear: n8n static data size limits, if any
- Recommendation: Cap batch selection at 50 containers (practical UX limit for 6 per page = 9 pages), document limit in keyboard message
Should session TTL be configurable or hardcoded?
- What we know: Telegram callback queries expire after message age threshold (unclear exact time)
- What's unclear: Optimal balance between UX (allow user to take time selecting) vs resource usage (cleanup frequency)
- Recommendation: Hardcode 5-minute TTL (matches Telegram confirmation timeout pattern already in workflow), revisit if users report timeout issues
Does Update All inline keyboard need "Update All (N containers)" dynamic count?
- What we know: Text command shows count in confirmation (Update 12 containers?)
- What's unclear: Whether button should show live count (requires Docker API call on every list render) or static text
- Recommendation: Static text 🔄 Update All :latest in list keyboard, dynamic count shown in confirmation message after click (reduces API calls)
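
The cap recommended in the first question can be enforced at toggle time. The sketch below assumes 6 containers per page and a 50-container cap, as stated in that recommendation; pageCount and toggleWithCap are hypothetical helper names.

```javascript
const PAGE_SIZE = 6;       // containers per keyboard page
const MAX_SELECTION = 50;  // recommended selection cap

// Number of keyboard pages needed; an empty list still renders one page.
function pageCount(total, pageSize = PAGE_SIZE) {
  return Math.max(1, Math.ceil(total / pageSize));
}

// Toggles a name in the selection, refusing additions beyond the cap.
// Returns a new array plus a flag the caller can surface to the user.
function toggleWithCap(selected, name, max = MAX_SELECTION) {
  if (selected.includes(name)) {
    return { selected: selected.filter(n => n !== name), capped: false };
  }
  if (selected.length >= max) {
    return { selected: [...selected], capped: true };
  }
  return { selected: [...selected, name], capped: false };
}

console.log(pageCount(50)); // 9
console.log(toggleWithCap(['plex'], 'sonarr'));
```

Deselecting is always allowed, so a user who reaches the cap can still swap one container for another.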

Sources

Primary (HIGH confidence)

Telegram Bot API Official Docs - InlineKeyboardButton callback_data 1-64 bytes limit
Telegram Limits Reference - Comprehensive Bot API limits documentation
Project codebase: n8n-workflow.json (lines 276-296, 589-1020, 2750-3074) - Existing "update all" implementation, callback parser, :latest filtering
Project codebase: n8n-batch-ui.json (lines 236-251) - Current CSV-in-callback approach and 64-byte limit check
Project codebase: CLAUDE.md - n8n static data deep mutation pitfall, JSON serialization requirement

Secondary (MEDIUM confidence)

n8n getWorkflowStaticData Docs - Static data persistence behavior, execution scope
n8n Static Data Persistence GitHub Issue #17321 - Cloud execution-scoped behavior (may not persist between triggers)
python-telegram-bot CallbackDataCache - Standard library pattern for callback_data workarounds
Telegram Inline Keyboard UX Guide - Best practices for multi-select interfaces

Tertiary (LOW confidence)

Medium: Telegram bot inline buttons with large data - Community workarounds for 64-byte limit using Redis
Enhanced callback_data with protobuf + base85 - Advanced encoding techniques (35% space savings but still hits limit with selections)

Metadata

Confidence breakdown:

Standard stack: HIGH - Telegram Bot API and n8n workflow are existing project dependencies with official docs
Architecture: HIGH - Session token pattern is standard workaround documented in python-telegram-bot and multiple sources; existing "update all" code verified in workflow JSON
Pitfalls: HIGH - n8n static data mutation pitfall directly from project CLAUDE.md; callback_data limit enforced by Telegram API

Research date: 2026-02-08
Valid until: 2026-03-08 (30 days - stable domain, Telegram Bot API 7.0 limit unchanged since 2023)
