Lucas Berger 21e888c1ce docs: remove NLU from v1.0 scope
- Remove Claude API integration and intent parsing (04-02-PLAN)
- REQ-08 (conversational queries) moved to out of scope
- Phase 4 renamed from "Logs & Intelligence" to "Logs" (complete)
- v1.0 now focuses on keyword-based container control

Simple substring matching works well for container management.
NLU adds complexity without proportional value for v1.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 14:52:37 -05:00


Phase 4: Logs & Intelligence - Research

Researched: 2026-01-30
Domain: Docker API integration, Claude API NLU, conversational AI
Confidence: HIGH

Summary

Phase 4 integrates Docker logs retrieval and Claude-powered conversational intelligence into the n8n workflow. Research focused on five domains: Docker Engine API for logs and stats, Claude Messages API for natural language understanding, intent parsing patterns, n8n workflow integration, and security considerations.

The standard approach uses Docker Engine API's /containers/{id}/logs endpoint with the tail parameter for configurable log retrieval, and /containers/{id}/stats for resource metrics. Claude API provides intent parsing through pure LLM reasoning (no traditional classification needed), using the Messages API with system prompts to guide behavior. n8n HTTP Request nodes handle API calls with proper error handling and retry logic.

Key findings show that prompt caching can reduce Claude API costs by 90% for repeated context (system prompts, conversation history), making conversational workflows highly cost-effective. Docker API calls via Unix socket are secure by default (no authentication needed), while Claude API requires X-Api-Key header authentication. The primary security concern is prompt injection attacks in conversational interfaces, mitigated through input validation and system prompt design.

Primary recommendation: Use n8n HTTP Request nodes for Docker API calls via Unix socket, Claude Messages API with prompt caching for intent parsing, and structured output to ensure reliable JSON responses for workflow routing.

Standard Stack

The established libraries/tools for this domain:

Core

| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| Docker Engine API | v1.53 | Container logs and stats retrieval | Official Docker API, direct socket access |
| Claude Messages API | 2023-06-01 | Natural language understanding and intent parsing | Anthropic's production API, superior reasoning |
| n8n HTTP Request | Built-in | API orchestration and workflow routing | Already in stack, handles authentication and retries |
| curl | Static binary | Execute Docker API calls from n8n | Lightweight, works in hardened containers |

Supporting

| Library | Version | Purpose | When to Use |
|---|---|---|---|
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | Primary NLU model | Best balance of speed, cost, and intelligence for agent workflows |
| Claude Haiku 4.5 | claude-haiku-4-5 | Lightweight intent classification | When simple intent detection suffices (cost optimization) |
| n8n Execute Code | Built-in | JSON validation and transformation | Transform Claude responses into workflow-compatible data |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
|---|---|---|
| Claude API | Local LLM on N100 | N100 too weak for fast inference (already decided against) |
| HTTP Request | Docker SDK libraries | Requires installing packages in hardened container (not feasible) |
| Structured outputs | Regex parsing | Brittle, fails on natural language variations |

Installation:

# No installation needed - using existing stack
# curl binary already mounted to n8n container
# Claude API accessed via HTTP Request node

Architecture Patterns

n8n Workflow:
├── Telegram Trigger           # Incoming user message
├── HTTP Request (Claude)      # Intent parsing
├── Execute Code               # Validate/transform response
├── Switch                     # Route based on intent
│   ├── logs → Docker API
│   ├── stats → Docker API
│   └── error → Error handler
└── Telegram Reply             # Send response

Pattern 1: Intent-First Routing

What: Use Claude to parse user intent before executing Docker commands
When to use: All conversational queries (prevents misinterpretation)
Example:

// n8n Execute Code node - Transform Claude response
const claudeResponse = $input.item.json.content[0].text;

// Claude returns JSON with structured intent
const intent = JSON.parse(claudeResponse);

return {
  intent: intent.action,        // "view_logs" | "query_stats" | "unknown"
  container: intent.container,  // Container name/ID
  params: intent.parameters     // { lines: 100 } or { metric: "memory" }
};

Pattern 2: Prompt Caching for System Instructions

What: Cache static system prompts to reduce latency and cost
When to use: All Claude API calls (5min TTL, auto-refreshed on use)
Example:

{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "You are a Docker container management assistant. Parse user requests and return JSON with: {\"action\": \"view_logs|query_stats|unknown\", \"container\": \"name\", \"parameters\": {...}}",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "{{$json.message}}"}
  ]
}

Pattern 3: Docker API via Unix Socket

What: Use curl with --unix-socket for secure Docker API access
When to use: All Docker API calls from n8n
Example:

# n8n Execute Command node or HTTP Request pre-processing
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/v1.53/containers/{{$json.container}}/logs?stdout=1&stderr=1&tail={{$json.lines}}"

Pattern 4: Structured Output Validation

What: Use Claude's structured outputs or JSON schema validation
When to use: When reliable JSON parsing is critical
Example:

// n8n Execute Code node - Validate Claude response
const response = $input.item.json.content[0].text;

// Try parsing as JSON
try {
  const intent = JSON.parse(response);

  // Validate required fields
  if (!intent.action || !intent.container) {
    throw new Error('Missing required fields');
  }

  return intent;
} catch (error) {
  // Fallback to error handler
  return {
    action: 'error',
    message: 'Could not parse intent'
  };
}

Anti-Patterns to Avoid

  • Regex parsing of natural language: Brittle, fails on variations. Use LLM intent parsing instead.
  • Streaming Docker logs in n8n: Workflow nodes expect finite responses. Use tail parameter for bounded output.
  • Hardcoded API keys in workflows: Use n8n credentials storage (encrypted).
  • Ignoring rate limits: Implement exponential backoff for Claude API 429 errors.
  • No cache invalidation strategy: Don't cache conversation history indefinitely - use 5min TTL.

Don't Hand-Roll

Problems that look simple but have existing solutions:

| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Intent classification | Regex rules, keyword matching | Claude API with system prompt | Handles natural language variations, understands context |
| JSON extraction from LLM | String manipulation, regex | Structured outputs or schema validation | Claude can return validated JSON directly |
| Docker API authentication | Custom auth logic | Unix socket file permissions | OS-level security, no tokens needed |
| Rate limiting | Manual retry counters | n8n's built-in retry with exponential backoff | Handles transient failures, respects retry-after headers |
| Prompt management | String concatenation | Prompt caching with cache_control | 90% cost reduction, automatic deduplication |
| Conversation state | Custom database | Claude conversation history in messages array | Stateless API design, simpler architecture |

Key insight: Modern LLM APIs are designed for conversational workflows. Don't build traditional NLU pipelines (tokenization, feature extraction, classification) - Claude handles intent understanding end-to-end through natural language prompts.

Common Pitfalls

Pitfall 1: Docker Logs Streaming Without Bounds

What goes wrong: Using follow=true or omitting the tail parameter causes infinite streaming that blocks n8n nodes
Why it happens: The Docker logs API defaults to streaming all logs from container start
How to avoid: Always specify the tail parameter with a reasonable limit (e.g., 100-500 lines)
Warning signs: n8n workflow hangs on HTTP Request node, timeout errors

Pitfall 2: Claude API Rate Limiting (429 Errors)

What goes wrong: Exceeding 50 RPM (Tier 1) or token limits causes API rejections
Why it happens: Short bursts of requests, or acceleration limits on new organizations
How to avoid:

  • Implement exponential backoff with retry-after header
  • Use prompt caching to reduce ITPM (cached tokens don't count toward limits)
  • Enable n8n's "Retry on Fail" with increasing intervals

Warning signs: 429 status codes, "rate_limit_error" in response
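The backoff step can be sketched as logic for an n8n Execute Code node. `computeDelayMs` and `callWithBackoff` are hypothetical helpers, not n8n built-ins; the `retry-after` header and 429 status follow the Claude API's documented rate-limit behavior:

```javascript
// Compute the wait before the next retry: honor a retry-after header
// (seconds) when the server sends one, otherwise double per attempt.
function computeDelayMs(attempt, retryAfterHeader, baseMs = 1000, maxMs = 30000) {
  if (retryAfterHeader !== undefined && retryAfterHeader !== null) {
    const seconds = Number(retryAfterHeader);
    if (!Number.isNaN(seconds)) return Math.min(seconds * 1000, maxMs);
  }
  // Exponential growth: 1s, 2s, 4s, ... capped at maxMs
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry a request function while it returns 429, waiting between tries.
async function callWithBackoff(doRequest, maxTries = 3) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;
    const delay = computeDelayMs(attempt, res.headers && res.headers['retry-after']);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error('rate limited after retries');
}
```

n8n's built-in "Retry on Fail" covers the simple case; a sketch like this only matters if you need to honor retry-after precisely.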

Pitfall 3: Prompt Injection Attacks

What goes wrong: User input manipulates system behavior ("Ignore previous instructions and...")
Why it happens: LLMs process all text as potential instructions, with no hard separation between instructions and user data
How to avoid:

  • Use structured system prompts that explicitly define valid actions
  • Validate LLM output against expected schema
  • Limit LLM's action space to safe operations (read-only queries)
  • Don't execute arbitrary commands from LLM responses

Warning signs: Unexpected LLM behavior, security boundary violations

Pitfall 4: Cache Invalidation on Minor Changes

What goes wrong: Small prompt changes invalidate the entire cache, causing unnecessary costs
Why it happens: A cache hit requires a 100% identical prefix up to the cache_control breakpoint
How to avoid:

  • Place static content first (tools, system instructions)
  • Put variable content (user message) after cache breakpoint
  • Use multiple breakpoints for content that changes at different rates
  • Monitor cache_read_input_tokens vs cache_creation_input_tokens

Warning signs: cache_creation_input_tokens > 0 on every request, high costs
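The monitoring step can be sketched as a small check over the `usage` block the Messages API returns (`cacheStatus` is a hypothetical helper; the field names match the documented response):

```javascript
// Classify a Claude response's cache behavior from its usage block.
function cacheStatus(usage) {
  const created = usage.cache_creation_input_tokens || 0;
  const read = usage.cache_read_input_tokens || 0;
  if (read > 0 && created === 0) return 'hit';      // prefix served from cache
  if (created > 0 && read === 0) return 'miss';     // cache (re)built this call
  if (created > 0 && read > 0) return 'partial';    // prefix hit, suffix rebuilt
  return 'uncached';                                // no cache_control in request
}
```

Logging this per request makes the "cache_creation_input_tokens > 0 on every request" warning sign easy to spot.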

Pitfall 5: Ignoring Docker API Version in URL

What goes wrong: API calls fail or use deprecated features
Why it happens: The Docker API is versioned; different endpoints are available in different versions
How to avoid: Always specify the version in the URL path (/v1.53/containers/...)
Warning signs: 404 errors, unexpected API behavior

Pitfall 6: Not Handling Container Name vs ID

What goes wrong: User says "portainer" but the Docker API expects a container ID
Why it happens: Docker accepts both, but stats/logs endpoints may behave differently
How to avoid:

  • Use /containers/json endpoint to resolve names to IDs first
  • Or rely on Docker's name resolution (works for most endpoints)
  • Handle both patterns in intent parsing

Warning signs: Inconsistent results, "container not found" errors
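The name-to-ID step can be sketched as a helper over the GET /containers/json payload (`resolveContainerId` is hypothetical; the leading-slash `Names` format matches the Docker API's documented response shape):

```javascript
// Resolve a user-supplied name to a container ID. The /containers/json
// endpoint returns names with a leading slash, e.g. "/portainer".
function resolveContainerId(containers, userName) {
  const wanted = userName.toLowerCase().trim();
  const match = containers.find((c) =>
    (c.Names || []).some((n) => n.replace(/^\//, '').toLowerCase() === wanted)
  );
  return match ? match.Id : null;
}
```

Returning null lets the workflow route "container not found" to the error handler instead of passing a bad name to the logs/stats endpoints.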

Code Examples

Verified patterns from official sources:

Docker Logs Retrieval (Bounded)

# Source: https://docs.docker.com/reference/api/engine/
# Non-streaming logs with tail limit
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/v1.53/containers/portainer/logs?stdout=1&stderr=1&tail=100"

Docker Stats Query

# Source: https://docs.docker.com/reference/api/engine/
# Single snapshot (non-streaming)
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/v1.53/containers/portainer/stats?stream=false"

Claude Intent Parsing with Caching

// Source: https://platform.claude.com/docs/en/api/messages
// POST https://api.anthropic.com/v1/messages
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "You are a Docker management assistant. Parse user requests about containers and return JSON.\n\nValid actions: view_logs, query_stats, unknown\n\nExamples:\n- \"Show me portainer logs\" → {\"action\": \"view_logs\", \"container\": \"portainer\", \"parameters\": {\"lines\": 100}}\n- \"What's using most memory?\" → {\"action\": \"query_stats\", \"container\": \"all\", \"parameters\": {\"metric\": \"memory\", \"sort\": \"desc\"}}\n- \"Hello\" → {\"action\": \"unknown\", \"message\": \"I can help with Docker logs and stats. Try: 'show logs' or 'what's using memory?'\"}",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "Show me the last 50 lines of nginx logs"
    }
  ]
}

n8n HTTP Request Node Configuration (Claude)

// Source: n8n documentation + Claude API docs
{
  "method": "POST",
  "url": "https://api.anthropic.com/v1/messages",
  "authentication": "predefinedCredentialType",
  "nodeCredentialType": "claudeApi",
  "sendHeaders": true,
  "headerParameters": {
    "parameters": [
      {"name": "anthropic-version", "value": "2023-06-01"}
    ]
  },
  "sendBody": true,
  "bodyParameters": {
    "parameters": [
      {"name": "model", "value": "claude-sonnet-4-5-20250929"},
      {"name": "max_tokens", "value": 1024},
      {"name": "system", "value": "={{$json.systemPrompt}}"},
      {"name": "messages", "value": "={{$json.messages}}"}
    ]
  },
  "options": {
    "retry": {
      "enabled": true,
      "maxTries": 3,
      "waitBetweenTries": 1000
    },
    "timeout": 30000
  }
}

Intent Validation (n8n Execute Code)

// Source: Research synthesis
// Validate and transform Claude response
const content = $input.item.json.content[0].text;

try {
  const intent = JSON.parse(content);

  // Validate schema
  const validActions = ['view_logs', 'query_stats', 'unknown'];
  if (!validActions.includes(intent.action)) {
    throw new Error('Invalid action');
  }

  // Normalize container name
  if (intent.container) {
    intent.container = intent.container.toLowerCase().trim();
  }

  // Set defaults
  if (intent.action === 'view_logs' && !intent.parameters?.lines) {
    intent.parameters = { ...intent.parameters, lines: 100 };
  }

  return intent;

} catch (error) {
  return {
    action: 'error',
    message: 'Failed to parse intent: ' + error.message
  };
}

Resource Query Pattern

# Source: https://docs.docker.com/reference/cli/docker/container/stats/
# Get stats for all containers (for "what's using most memory?" queries)
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/v1.53/containers/json" | \
  jq -r '.[].Id' | \
  while read container_id; do
    curl -s --unix-socket /var/run/docker.sock \
      "http://localhost/v1.53/containers/$container_id/stats?stream=false"
  done
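If the snapshots are aggregated in an n8n Execute Code node rather than bash, the ranking step might look like this sketch (`rankByMemory` is a hypothetical helper; `memory_stats.usage` is the documented bytes field in the stats payload, and `name` is assumed to be carried along with each snapshot):

```javascript
// Rank containers by memory from /containers/{id}/stats?stream=false
// snapshots, for "what's using most memory?" style queries.
function rankByMemory(snapshots) {
  return snapshots
    .map((s) => ({
      name: s.name,
      memoryMiB: Math.round(((s.memory_stats && s.memory_stats.usage) || 0) / (1024 * 1024)),
    }))
    .sort((a, b) => b.memoryMiB - a.memoryMiB);
}
```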

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Rule-based intent classification | LLM-native reasoning | 2023-2024 | No regex patterns needed, handles natural language variations |
| Separate NLU pipeline (tokenize, extract, classify) | End-to-end LLM with system prompts | 2023-2024 | Simpler architecture, fewer moving parts |
| Docker CLI parsing | Docker Engine API direct | Always available | Programmatic access, structured responses |
| Per-request pricing only | Prompt caching (cache reads 90% cheaper) | 2024 | Conversational AI economically viable |
| Fine-tuned models | Few-shot prompting with examples | 2023-2024 | No training needed, faster iteration |
| Organization-level cache isolation | Workspace-level cache isolation | Feb 5, 2026 | Multi-workspace users need separate caching strategies |

Deprecated/outdated:

  • Traditional intent classification libraries (Rasa NLU, LUIS): LLMs handle this natively now
  • Docker Remote API v1.24 and earlier: Use v1.53 for latest features
  • Claude models: Sonnet 3.7 deprecated, use Sonnet 4.5 or 4
  • Embeddings + similarity search for intent: Direct LLM reasoning is more accurate

Open Questions

Things that couldn't be fully resolved:

  1. Optimal cache breakpoint strategy for multi-turn conversations

    • What we know: Cache persists 5min, refreshed on use; can use up to 4 breakpoints
    • What's unclear: Whether to cache conversation history incrementally or use single breakpoint at end
    • Recommendation: Start with single breakpoint at end of static system prompt; add conversation caching if chats exceed 5 turns
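Under that recommendation, a multi-turn request with two breakpoints might look like the following sketch (the second cache_control on the final user turn is the optional conversation-history breakpoint; message text is illustrative):

```json
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "You are a Docker container management assistant. ...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [
    {"role": "user", "content": "show portainer logs"},
    {"role": "assistant", "content": "{\"action\": \"view_logs\", \"container\": \"portainer\", \"parameters\": {\"lines\": 100}}"},
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "now the nginx ones",
          "cache_control": {"type": "ephemeral"}
        }
      ]
    }
  ]
}
```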
  2. Claude model selection for production

    • What we know: Sonnet 4.5 is "best for agents", Haiku 4.5 is fastest/cheapest
    • What's unclear: Whether simple intent parsing justifies Sonnet's cost vs Haiku
    • Recommendation: Start with Sonnet 4.5 (proven for agent workflows), test Haiku 4.5 if costs are concern
  3. Docker stats aggregation for "what's using most X?" queries

    • What we know: Stats API returns per-container data, must query multiple containers
    • What's unclear: Best way to aggregate in n8n (bash script vs Execute Code node)
    • Recommendation: Use Execute Code node with multiple HTTP Request results; avoid bash for portability
  4. Rate limit tier for Claude API

    • What we know: Tier 1 = 50 RPM, higher tiers require usage history
    • What's unclear: Single-user bot's actual request rate, whether Tier 1 sufficient
    • Recommendation: Monitor usage; prompt caching reduces effective RPM needs significantly

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence - WebSearch only)

Metadata

Confidence breakdown:

  • Standard stack: HIGH - Official API documentation verified, existing tools in stack
  • Architecture: HIGH - Patterns derived from official docs and proven n8n workflows
  • Pitfalls: MEDIUM - Synthesized from official docs + community experience, prompt injection requires ongoing research

Research date: 2026-01-30
Valid until: 2026-02-28 (30 days - stable APIs, but Claude features evolve rapidly)