Phase 10.2: Better Logging and Log Management - Research
Researched: 2026-02-08
Domain: n8n workflow execution logging, error tracking, and debug infrastructure
Confidence: HIGH
Summary
Phase 10.2 adds centralized logging and error tracking to improve Claude's ability to diagnose issues in the n8n-based Unraid Docker Manager bot. The research reveals that n8n provides native capabilities for this exact use case: workflow static data for in-memory storage, structured error data from Error Trigger nodes, sub-workflow return patterns for error propagation, and API access to execution logs. The primary challenge is designing a trace format that makes the three specific pain points (sub-workflow data loss, callback routing confusion, execution log parsing) immediately queryable.
The standard approach combines ring buffer storage in workflow static data, structured error objects with context, correlation IDs for request tracing, and programmatic access via both Telegram commands and n8n API. This infrastructure is well-established in distributed systems observability (2026) and maps cleanly to n8n's architecture.
Primary recommendation: Use workflow static data for ring buffer storage (50 errors), structured error objects with correlation IDs, sub-workflow error propagation via return values, and selective debug mode that captures boundary data only when enabled. Avoid over-logging; focus on the three stated pain points with targeted trace data.
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
Error capture & reporting:
- Errors display inline to the user as summary + cause (e.g., "Failed to stop nginx: Docker API returned 404 (container not found)")
- Full diagnostic data (sub-workflow name, node, raw response, stack trace) captured in central error store for Claude's use
- Only report errors on user-triggered actions — no proactive/unsolicited error notifications
- Error store uses ring buffer: last 50 errors, auto-rotated
- Manual clear command also available (/clear-errors or similar, hidden/unlisted)
Execution traceability:
- All sub-workflows report errors back to main workflow for centralized storage
- Trace data designed for programmatic access — Claude can query it during debugging sessions
- Hidden/unlisted Telegram commands for quick error checks (e.g., /errors to see recent errors)
- File-based access also available for deep investigation during debugging sessions
Log output & storage:
- Error/trace data stored in n8n workflow static data (main workflow)
- Centralized in main workflow — sub-workflows report back, main stores
- Auto-rotate (ring buffer, 50 entries) + manual clear command
- Both Telegram commands (quick checks) and file/API access (deep investigation)
Debug mode:
- Debug mode is for Claude's use during debugging — not user-facing
- Must address three specific pain points:
- Sub-workflow data loss — capture what data was sent to and received from each sub-workflow at boundaries
- Callback routing confusion — trace which path a callback took through routing logic
- n8n API execution log parsing — make execution data easily queryable without manual workflow investigation
Claude's Discretion
- Trace format and structure (timeline vs. data snapshots vs. both)
- Whether to trace all executions or only errors (overhead vs. usefulness)
- Structured entries vs. simple log lines (what enables best debugging)
- Debug toggle mechanism (global toggle, per-request, or always-on for errors)
- Log level granularity (on/off vs. error/warn/info)
- What specific debug data to capture (raw API responses, sub-workflow I/O, timing)
- Telegram command naming and exact interface
Deferred Ideas (OUT OF SCOPE)
None — discussion stayed within phase scope
</user_constraints>
Standard Stack
Core Components
| Component | Version/Type | Purpose | Why Standard |
|---|---|---|---|
| n8n Workflow Static Data | Built-in (`$getWorkflowStaticData('global')`) | In-memory ring buffer storage | Native n8n persistence mechanism; survives across executions |
| n8n Error Trigger | Built-in node type | Structured error capture | Standard n8n error handling pattern, provides rich error context |
| n8n Execute Workflow | Built-in node type | Sub-workflow communication | Existing pattern in project (7 sub-workflows deployed) |
| n8n API | `/api/v1/executions` endpoint | Programmatic execution log access | Official n8n API for querying execution history and data |
| Correlation ID | String field in trace entries | Request tracking across workflow boundaries | Industry standard for distributed tracing (OpenTelemetry pattern) |
Note: No external logging libraries needed. n8n's built-in capabilities are sufficient for this use case.
Supporting Patterns
| Pattern | Implementation | Purpose | When to Use |
|---|---|---|---|
| Ring Buffer | JavaScript array with push/shift rotation | Auto-rotating error store (50 entries) | Size-bounded in-memory storage |
| Structured Error Object | JSON with standard fields (timestamp, executionId, node, error, context) | Queryable error data | Always — enables programmatic access |
| Error Propagation | Sub-workflow return values include error object | Centralized error collection | When sub-workflow encounters error |
| Debug Toggle | Boolean flag in workflow static data | Enable/disable debug tracing | Claude sets via Telegram command or API |
| Correlation ID | UUID passed through sub-workflow calls | Trace single request across workflows | All sub-workflow invocations |
Alternatives Considered
| Instead of | Could Use | Tradeoff |
|---|---|---|
| Workflow static data | External database (Redis, MongoDB) | External DB provides unlimited storage but adds infrastructure complexity; static data is simpler, sufficient for 50-entry ring buffer |
| Ring buffer | Append-only log with external rotation | Unlimited history but requires external storage and log rotation scripts; ring buffer is self-managing |
| n8n API access | n8n log streaming to external service | Real-time streaming but requires external log aggregator; API access is simpler for on-demand queries |
| Correlation IDs | Execution ID only | Execution ID doesn't span sub-workflows; correlation ID tracks single user request across all workflows |
Installation: No external packages needed. All components are n8n built-ins.
Architecture Patterns
Recommended Data Structure
// Workflow static data structure
{
"debug": {
"enabled": false, // Debug mode toggle
"logLevel": "error" // "off" | "error" | "warn" | "info" | "debug"
},
"errors": {
"buffer": [ // Ring buffer (max 50 entries)
{
"id": "err_001", // Sequential error ID
"correlationId": "uuid-v4", // Trace across sub-workflows
"timestamp": "2026-02-08T10:30:00Z",
"executionId": "12345", // n8n execution ID
"workflow": "main", // "main" or sub-workflow name
"node": "Execute Container Action",
"operation": "docker.stop",
"userMessage": "Failed to stop nginx: Docker API returned 404 (container not found)",
"error": {
"message": "Container not found",
"stack": "Error: Container not found\n at ...",
"httpCode": 404,
"rawResponse": "{\"message\":\"No such container: nginx\"}"
},
"context": {
"userId": "123456789",
"containerId": "nginx",
"subWorkflowInput": {...}, // Data sent to sub-workflow
"subWorkflowOutput": {...} // Data received from sub-workflow
}
}
],
"nextId": 2, // Auto-increment for error IDs
"count": 1, // Total errors captured (all-time)
"lastCleared": "2026-02-08T09:00:00Z"
},
"traces": { // Debug mode traces (only when debug.enabled = true)
"buffer": [ // Ring buffer (max 50 entries)
{
"id": "trace_001",
"correlationId": "uuid-v4",
"timestamp": "2026-02-08T10:29:55Z",
"executionId": "12345",
"event": "sub-workflow-call",
"workflow": "n8n-actions",
"node": "Execute Container Action",
"data": {
"input": {...}, // Boundary data: what was sent
"output": {...}, // Boundary data: what was received
"duration": 234 // Execution time in ms
}
},
{
"id": "trace_002",
"correlationId": "uuid-v4",
"timestamp": "2026-02-08T10:29:56Z",
"executionId": "12345",
"event": "callback-routing",
"node": "Route Callback",
"data": {
"callbackData": "action:stop:nginx",
"routeTaken": "single-action", // Which switch output path
"availableRoutes": ["cancel", "expired", "batch", "single-action"]
}
}
],
"nextId": 3
}
}
Pattern 1: Ring Buffer Implementation
What: Fixed-size circular buffer that auto-rotates when full, keeping only the most recent N entries.
When to use: Storing errors and traces in bounded memory (workflow static data has size limits).
Example:
// Code node: Add Error to Ring Buffer
const staticData = $getWorkflowStaticData('global');
// Initialize if needed
if (!staticData.errors) {
staticData.errors = {
buffer: [],
nextId: 1,
count: 0,
lastCleared: new Date().toISOString()
};
}
const MAX_ENTRIES = 50;
const errorEntry = {
id: `err_${String(staticData.errors.nextId).padStart(3, '0')}`,
correlationId: $input.item.json.correlationId || $execution.id, // Prefer the request's correlation ID (see Pattern 3); fall back to execution ID
timestamp: new Date().toISOString(),
executionId: $execution.id,
workflow: 'main',
node: 'Execute Container Action', // Name of the node where the error occurred
operation: 'docker.stop',
userMessage: $input.item.json.errorMessage,
error: {
message: $input.item.json.error.message,
stack: $input.item.json.error.stack,
httpCode: $input.item.json.error.httpCode,
rawResponse: $input.item.json.error.rawResponse
},
context: {
userId: $input.item.json.userId,
containerId: $input.item.json.containerId,
subWorkflowInput: $input.item.json.subWorkflowInput,
subWorkflowOutput: $input.item.json.subWorkflowOutput
}
};
// Ring buffer: add at end, remove from start if full
staticData.errors.buffer.push(errorEntry);
if (staticData.errors.buffer.length > MAX_ENTRIES) {
staticData.errors.buffer.shift(); // Remove oldest
}
staticData.errors.nextId++;
staticData.errors.count++;
return { json: { success: true, errorId: errorEntry.id } };
Source: Ring buffer pattern from Tucker Leach - Ring Buffer in TypeScript
Pattern 2: Sub-workflow Error Propagation
What: Sub-workflows return error objects to main workflow for centralized storage.
When to use: All sub-workflow calls. Enables centralized error collection.
Example:
// Sub-workflow (n8n-actions.json): Return error to main workflow
// Code node: Format Error Response (on error path)
return {
json: {
success: false,
error: {
message: $input.item.json.error.message,
stack: $input.item.json.error.stack || '',
httpCode: $input.item.json.error.httpCode || 500,
rawResponse: $input.item.json.error.rawResponse || ''
},
context: {
workflow: 'n8n-actions',
node: 'Stop Container', // Node where the error occurred
operation: 'docker.stop',
input: $('When executed by another workflow').item.json // What was sent to this sub-workflow
}
}
};
// Main workflow: Capture sub-workflow error
// IF node: Check Sub-workflow Success
{{ $('Execute Container Action').item.json.success }} equals false
// Code node: Log Error (on false path)
const subWorkflowResult = $('Execute Container Action').item.json;
const errorData = {
errorMessage: `Failed to stop ${subWorkflowResult.context.input.containerId}: ${subWorkflowResult.error.message}`,
error: subWorkflowResult.error,
userId: $('Telegram Trigger').item.json.message.from.id,
containerId: subWorkflowResult.context.input.containerId,
subWorkflowInput: subWorkflowResult.context.input,
subWorkflowOutput: subWorkflowResult
};
// Pass to ring buffer node
return { json: errorData };
Source: n8n sub-workflow pattern from n8n Execute Sub-workflow docs
Pattern 3: Correlation ID for Request Tracing
What: Unique ID generated at workflow entry point, passed through all sub-workflow calls, used to correlate logs/traces for single user request.
When to use: Always. Essential for tracing requests across sub-workflows.
Example:
// Main workflow: Generate Correlation ID
// Code node: Initialize Request Context (early in workflow, after auth)
const { v4: uuidv4 } = require('uuid'); // bundled with n8n; external requires may need NODE_FUNCTION_ALLOW_EXTERNAL
const correlationId = uuidv4();
const requestContext = {
correlationId,
userId: $('Telegram Trigger').item.json.message.from.id,
messageId: $('Telegram Trigger').item.json.message.message_id,
timestamp: new Date().toISOString()
};
return { json: { ...requestContext, ...$input.item.json } };
// Pass correlation ID to sub-workflow
// Execute Workflow node: Execute Container Action
// Input parameters:
{{ { correlationId: $('Initialize Request Context').item.json.correlationId, ...otherParams } }}
// Debug trace: Log callback routing decision
const staticData = $getWorkflowStaticData('global');
if (staticData.debug?.enabled) {
if (!staticData.traces) staticData.traces = { buffer: [], nextId: 1 }; // Initialize on first use
const traceEntry = {
id: `trace_${String(staticData.traces.nextId).padStart(3, '0')}`,
correlationId: $('Initialize Request Context').item.json.correlationId,
timestamp: new Date().toISOString(),
executionId: $execution.id,
event: 'callback-routing',
node: 'Route Callback',
data: {
callbackData: $input.item.json.callback_query.data,
routeTaken: $input.item.json.routeName, // Set by upstream routing logic ({{ }} expressions don't evaluate inside Code nodes)
availableRoutes: ['cancel', 'expired', 'batch', 'single-action']
}
};
// Add to ring buffer (same pattern as errors)
staticData.traces.buffer.push(traceEntry);
if (staticData.traces.buffer.length > 50) {
staticData.traces.buffer.shift();
}
staticData.traces.nextId++;
}
Source: Correlation ID pattern from Microsoft Engineering Playbook - Correlation IDs
Pattern 4: Debug Mode Toggle
What: Boolean flag in workflow static data that enables/disables debug tracing. When enabled, captures boundary data (sub-workflow I/O) and routing decisions.
When to use: Claude needs to diagnose issues. User doesn't see debug traces; only visible via /errors command or API.
Example:
// Telegram command: /debug on|off (hidden command)
// Code node: Toggle Debug Mode
const staticData = $getWorkflowStaticData('global');
const command = $input.item.json.message.text.toLowerCase();
if (!staticData.debug) {
staticData.debug = { enabled: false, logLevel: 'error' };
}
if (command === '/debug on') {
staticData.debug.enabled = true;
return { json: { message: 'Debug mode enabled. Tracing sub-workflow boundaries and callback routing.' } };
} else if (command === '/debug off') {
staticData.debug.enabled = false;
return { json: { message: 'Debug mode disabled.' } };
} else if (command === '/debug status') {
return { json: {
message: `Debug mode: ${staticData.debug.enabled ? 'ON' : 'OFF'}\nLog level: ${staticData.debug.logLevel}`
} };
}
Pattern 5: Query Errors via Telegram
What: Hidden command that returns recent errors in human-readable format.
When to use: Quick error checks during debugging sessions.
Example:
// Telegram command: /errors [count] (hidden command)
// Code node: Format Error Report
const staticData = $getWorkflowStaticData('global');
const errors = staticData.errors?.buffer || [];
const requestedCount = parseInt($input.item.json.message.text.split(' ')[1], 10) || 5;
const recentErrors = errors.slice(-requestedCount).reverse();
if (recentErrors.length === 0) {
return { json: { message: 'No errors recorded.' } };
}
let message = `📋 Recent Errors (${recentErrors.length}):\n\n`;
recentErrors.forEach(err => {
const time = new Date(err.timestamp).toLocaleString();
message += `🔴 ${err.id} - ${time}\n`;
message += `Workflow: ${err.workflow} → ${err.node}\n`;
message += `User: ${err.userMessage}\n`;
message += `Error: ${err.error.message}\n`;
if (err.error.httpCode) {
message += `HTTP: ${err.error.httpCode}\n`;
}
message += `\n`;
});
message += `Total errors: ${staticData.errors.count}\n`;
message += `Last cleared: ${new Date(staticData.errors.lastCleared).toLocaleString()}`;
return { json: { message } };
Pattern 6: n8n API Access for Deep Investigation
What: Use n8n API to retrieve full execution data including node inputs/outputs.
When to use: Deep debugging when Telegram command output isn't sufficient.
Example:
# Claude Code: Query recent failed executions
curl -X GET 'http://n8n:5678/api/v1/executions?status=error&limit=10' \
-H 'X-N8N-API-KEY: <api-key>'
# Response:
{
"data": [
{
"id": "12345",
"workflowId": "1000",
"status": "error",
"startedAt": "2026-02-08T10:29:55Z",
"finishedAt": "2026-02-08T10:30:00Z"
}
]
}
# Get detailed execution data
curl -X GET 'http://n8n:5678/api/v1/executions/12345?includeData=true' \
-H 'X-N8N-API-KEY: <api-key>'
# Response includes node-level data:
{
"id": "12345",
"data": {
"resultData": {
"runData": {
"Execute Container Action": [
{
"startTime": "...",
"executionTime": 234,
"data": {
"main": [
[
{
"json": {
"success": false,
"error": { ... }
}
}
]
]
}
}
]
}
}
}
}
Source: n8n Executions API
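The nested `runData` shape above is awkward to inspect by hand. A minimal sketch of flattening it into a queryable list; the helper name `summarizeRunData` is hypothetical, and the response shape is assumed from the example response above:

```javascript
// Hypothetical helper: flatten the nested runData shape from
// /api/v1/executions/:id?includeData=true into a list of
// { node, executionTime, failed, items } entries.
function summarizeRunData(execution) {
  const runData = execution?.data?.resultData?.runData || {};
  const summary = [];
  for (const [node, runs] of Object.entries(runData)) {
    for (const run of runs) {
      // main is an array of output branches, each an array of items
      const items = (run.data?.main || []).flat().map((i) => i.json);
      summary.push({
        node,
        executionTime: run.executionTime,
        failed: items.some((j) => j && j.success === false),
        items,
      });
    }
  }
  return summary;
}

// Example against the documented response shape
const sample = {
  id: '12345',
  data: {
    resultData: {
      runData: {
        'Execute Container Action': [
          {
            executionTime: 234,
            data: {
              main: [[{ json: { success: false, error: { message: 'Container not found' } } }]],
            },
          },
        ],
      },
    },
  },
};
const failedRuns = summarizeRunData(sample).filter((r) => r.failed);
```

Claude Code can run a helper like this against the raw API response to find failed nodes without manually walking the JSON tree.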
Anti-Patterns to Avoid
- Over-logging: Don't trace every node execution — only boundaries (sub-workflow I/O) and decision points (routing). Full tracing creates noise and fills the ring buffer quickly.
- Logging sensitive data: Don't capture Telegram API keys, Docker socket responses with sensitive container environment variables, or user credentials in error context.
- Unbounded storage: Don't append errors indefinitely to workflow static data — use ring buffer with fixed size (50 entries). Static data has size limits and isn't designed for unlimited storage.
- Synchronous API calls: Don't call n8n API from within workflow execution for logging — too slow, creates circular dependency. Use workflow static data; query API externally (Claude Code).
- User-facing debug output: Don't send raw error objects or stack traces to the Telegram user; show only the `userMessage` field. Full diagnostic data is for Claude only.
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Ring buffer with manual rotation | Custom linked list, manual cleanup logic | Simple array with push() and shift() | A push/shift ring buffer is about 10 lines of code; custom structures add complexity for zero benefit |
| Correlation ID generation | Manual timestamp-based IDs | UUID v4 (`require('uuid').v4()`) | UUIDs are collision-resistant; timestamp-based IDs risk collisions |
| Error serialization | Custom error formatting | Explicit field copying (`error.message`, `error.stack`) wrapped in try-catch | Error objects aren't JSON-serializable by default; safe serialization needs explicit fields |
| Execution log parsing | Manual n8n database queries | n8n API `/api/v1/executions` | API provides structured access; database queries are fragile and break on schema changes |
| Log aggregation service | External ELK/Splunk/Datadog | Workflow static data + n8n API | 50-entry ring buffer is sufficient for debugging; external service is over-engineering for this use case |
Key insight: n8n's built-in capabilities (static data, Error Trigger, API) are designed for exactly this use case. Don't add external dependencies when native features are sufficient.
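On error serialization specifically: plain `JSON.stringify(new Error('x'))` yields `'{}'` because Error fields are non-enumerable. A hedged sketch of safe serialization (`serializeError` is a hypothetical helper name; the truncation limits match those recommended elsewhere in this document):

```javascript
// Hypothetical helper: copy the useful Error fields explicitly,
// truncate large ones, and never let logging itself throw.
function serializeError(err) {
  try {
    return {
      message: err?.message ?? String(err),
      stack: typeof err?.stack === 'string' ? err.stack.substring(0, 500) : '',
      httpCode: err?.httpCode ?? null,
      rawResponse:
        typeof err?.rawResponse === 'string' ? err.rawResponse.substring(0, 1000) : '',
    };
  } catch (e) {
    // Last resort for exotic values (Proxies, getters that throw, etc.)
    return { message: 'Unserializable error', stack: '', httpCode: null, rawResponse: '' };
  }
}
```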
Common Pitfalls
Pitfall 1: Workflow Static Data Not Persisting
What goes wrong: Static data cleared between executions, errors not retained.
Why it happens: Workflow static data only persists when workflow is active (not testing mode) and execution completes successfully. If workflow execution errors before reaching end, static data changes are lost.
How to avoid:
- Ensure main workflow is active (not testing)
- Write to static data in nodes that execute before error occurs
- For error logging: catch errors (try/catch in a Code node, or an Error Trigger workflow) so the execution completes and static data is written
Warning signs:
- `/errors` command shows no errors despite known failures
- Ring buffer resets to empty on every execution
- `nextId` counter doesn't increment
Source: n8n workflow static data behavior
Pitfall 2: Execution ID vs Correlation ID Confusion
What goes wrong: Using execution ID to trace across sub-workflows fails because each sub-workflow has its own execution ID.
Why it happens: n8n creates new execution ID for each sub-workflow invocation. Single user request = multiple execution IDs (main + N sub-workflows).
How to avoid:
- Generate correlation ID in main workflow (UUID v4)
- Pass correlation ID to all sub-workflows as input parameter
- Use correlation ID (not execution ID) to query logs for single user request
Warning signs:
- Can't trace callback from callback_query through sub-workflow to result
- Errors from sub-workflows appear unrelated to main workflow execution
Example:
User request "stop nginx"
├─ Main workflow execution: executionId=12345, correlationId=uuid-abc
├─ Sub-workflow (n8n-actions): executionId=12346, correlationId=uuid-abc ← Same correlation ID
└─ Error logged with correlationId=uuid-abc ← Can query all entries for this request
Source: Distributed tracing correlation ID pattern
Pitfall 3: Static Data Size Limits
What goes wrong: Workflow static data grows unbounded, eventually fails with "data too large" error.
Why it happens: n8n stores static data in database. Large objects (50+ entries with full rawResponse fields) can exceed database column size limits.
How to avoid:
- Use ring buffer (fixed size, auto-rotate)
- Limit `rawResponse` field size (truncate to 1000 chars)
- Don't store binary data or large payloads in error context
- Provide a manual clear command (`/clear-errors`) for ring buffer reset
Warning signs:
- Workflow execution fails with database error
- Static data write operations timing out
- Execution time increases as ring buffer fills
Mitigation:
// Truncate large fields before storing
error: {
message: err.message,
stack: err.stack?.substring(0, 500) || '', // Limit stack trace
rawResponse: err.rawResponse?.substring(0, 1000) || '' // Limit response
}
Source: n8n community: static data size limits
Pitfall 4: Querying Errors by Wrong Field
What goes wrong: Can't find specific error when searching logs because field name assumptions are wrong.
Why it happens: Inconsistent field naming (e.g., containerId vs container_id, workflow vs workflowName).
How to avoid:
- Define standard error schema (see Architecture Patterns above)
- Use TypeScript-style interfaces as comments in Code nodes
- Validate error object structure when storing (check required fields exist)
Warning signs:
- `/errors` command can't filter by container or user
- Claude's queries return empty results despite known errors for that container
Prevention:
// Code node: Validate Error Schema
const requiredFields = ['id', 'correlationId', 'timestamp', 'workflow', 'node', 'userMessage', 'error'];
const errorEntry = { ... };
// Validate
const missing = requiredFields.filter(field => !errorEntry[field]);
if (missing.length > 0) {
console.error(`Missing required error fields: ${missing.join(', ')}`);
}
Pitfall 5: Debug Mode Always-On Performance Impact
What goes wrong: Debug mode left enabled, fills ring buffer with traces, obscures actual errors.
Why it happens: Claude enables debug mode for investigation, forgets to disable it.
How to avoid:
- Default debug mode to OFF
- Auto-disable debug mode after N executions (e.g., 100)
- Include debug status in `/errors` command output
- Separate ring buffers for errors (always on) and traces (debug mode only)
Warning signs:
- Ring buffer fills with trace entries, pushes out error entries
- `/errors` command mostly shows traces, not actual errors
- Workflow execution noticeably slower
Mitigation:
// Auto-disable debug mode after 100 executions
const staticData = $getWorkflowStaticData('global');
if (staticData.debug?.enabled) {
staticData.debug.executionCount = (staticData.debug.executionCount || 0) + 1;
if (staticData.debug.executionCount > 100) {
staticData.debug.enabled = false;
// Send notification to Claude via Telegram
return { json: {
message: '⚠️ Debug mode auto-disabled after 100 executions.'
}};
}
}
Code Examples
All code examples provided in Architecture Patterns section above. Key patterns:
- Ring Buffer Implementation - Add/rotate entries in workflow static data
- Sub-workflow Error Propagation - Return error objects from sub-workflows
- Correlation ID Tracking - Generate and pass correlation ID through calls
- Debug Mode Toggle - Enable/disable tracing via Telegram command
- Query Errors via Telegram - Format and display recent errors
- n8n API Access - Retrieve execution data for deep investigation
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Log to external service (Splunk, Datadog) | Store in workflow static data + query via API | 2024-2025 | n8n static data sufficient for small-scale debugging; no external dependencies |
| Trace every node execution | Trace only boundaries and decisions | 2025-2026 | Reduces noise, focuses on actionable data (distributed tracing best practices) |
| Execution ID only | Correlation ID + Execution ID | 2024-2026 | Correlation ID essential for multi-workflow tracing (OpenTelemetry pattern) |
| Manual log parsing | Structured JSON logs | 2023-2024 | Programmatic querying replaces manual log reading |
| Error Trigger to external workflow | Error propagation via return values | 2024-2025 | Centralized storage in main workflow, simpler architecture |
Deprecated/outdated:
- n8n log streaming to external service: Requires self-hosted n8n with log streaming enabled. Adds infrastructure complexity. Static data + API is simpler for debugging use case.
- External error tracking service (Sentry, Rollbar): Over-engineering for workflow errors. These services are for application errors in production systems, not workflow debugging.
- Database storage for logs: n8n already stores execution data in database. Querying via API is cleaner than direct database access (which is fragile and breaks on schema changes).
Source: n8n log streaming (optional feature, not required)
Open Questions
1. Workflow Static Data Size Limits
- What we know: Static data persists in n8n database, has size limits, can fail with "data too large" error
- What's unclear: Exact size limit in bytes/entries before failure occurs
- Recommendation: Conservative ring buffer size (50 entries), truncate large fields (`rawResponse` to 1000 chars), and provide a manual clear command. Monitor in production; reduce to 25 entries if size errors occur.
2. Sub-workflow Error Context Propagation
- What we know: Sub-workflows can return error objects via return values
- What's unclear: Do all 7 sub-workflows currently return structured responses, or do some fail silently?
- Recommendation: Audit existing sub-workflows during implementation. Standardize the return format: `{ success: boolean, error?: object, data?: object }`. Update all sub-workflows to return errors rather than throwing and failing the execution.
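A minimal sketch of that standardized envelope, using hypothetical helper names (`ok`, `fail`, `normalize`); the main-workflow side treats any non-conforming response as a failure, so sub-workflows that fail silently still surface:

```javascript
// Hypothetical sub-workflow helpers for the { success, error?, data? } envelope.
function ok(data) {
  return { success: true, data };
}

function fail(message, extra = {}) {
  return {
    success: false,
    error: {
      message,
      httpCode: extra.httpCode ?? 500,
      rawResponse: extra.rawResponse ?? '',
    },
    context: extra.context ?? {},
  };
}

// Main-workflow side: coerce anything non-standard into a failure
// so "silent" sub-workflow responses still get logged.
function normalize(response) {
  if (response && typeof response.success === 'boolean') return response;
  return fail('Sub-workflow returned non-standard response', {
    rawResponse: JSON.stringify(response)?.substring(0, 1000) ?? '',
  });
}
```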
3. Debug Mode Performance Impact
- What we know: Capturing boundary data and routing decisions adds code execution overhead
- What's unclear: Measurable impact on workflow execution time (milliseconds? seconds?)
- Recommendation: Implement debug mode with selective tracing (only 3 pain points). Measure execution time before/after debug mode enabled. If impact > 500ms, reduce trace granularity.
4. n8n API Rate Limits
- What we know: n8n provides API for querying executions
- What's unclear: Are there rate limits on API calls? Does frequent querying impact n8n performance?
- Recommendation: Use Telegram commands for quick checks (doesn't hit API, reads static data). Reserve API queries for deep investigation. If rate limits discovered, implement query caching/throttling.
5. Telegram Message Size Limits
- What we know: Telegram messages have 4096 character limit
- What's unclear: If the `/errors` command returns 50 errors, will the message exceed the limit?
- Recommendation: Paginate error output (default: last 5 errors, optional count parameter). Provide `/errors full` for file-based export (Telegram file upload API). Split long messages if needed.
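A sketch of the message-splitting fallback (the `chunkMessage` helper is hypothetical; 4096 is Telegram's documented text-message limit), preferring newline boundaries so individual error entries stay intact:

```javascript
// Hypothetical helper: split a long report into Telegram-sized chunks,
// breaking at the last newline inside the limit where possible.
const TELEGRAM_LIMIT = 4096;

function chunkMessage(text, limit = TELEGRAM_LIMIT) {
  const chunks = [];
  let rest = text;
  while (rest.length > limit) {
    let cut = rest.lastIndexOf('\n', limit);
    if (cut <= 0) cut = limit; // No newline available: hard split
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, '');
  }
  chunks.push(rest);
  return chunks;
}
```

Each chunk would then be sent as a separate sendMessage call.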
Sources
Primary (HIGH confidence)
- n8n Workflow Static Data - Official docs on `$getWorkflowStaticData()`
- n8n Error Trigger Node - Error data structure and usage
- n8n Execute Sub-workflow - Sub-workflow communication patterns
- n8n Executions API - Querying execution data programmatically
- n8n workflow data access - Accessing node data and workflow metadata
Secondary (MEDIUM confidence)
- Better Stack: Node.js Logging Best Practices - Structured logging patterns
- Microsoft Engineering Playbook: Correlation IDs - Request tracing pattern
- Distributed Tracing Logs (GroundCover) - Tracing workflow debugging patterns
- Tucker Leach: Ring Buffer in TypeScript - Ring buffer implementation
- n8n Community: Workflow Static Data - Static data limitations and behaviors
Tertiary (LOW confidence)
- n8n community: inline keyboard callback query - Telegram callback patterns (referenced for callback routing context)
- Ring buffer npm packages - External libraries (not needed, but validate pattern)
Metadata
Confidence breakdown:
- Standard stack: HIGH - All components are n8n built-ins, well-documented in official docs
- Architecture patterns: HIGH - Ring buffer, correlation IDs, structured errors are industry-standard patterns; n8n static data verified in official docs
- Common pitfalls: MEDIUM - Based on n8n community reports and general workflow debugging experience; specific size limits not documented precisely
- Code examples: HIGH - All examples use documented n8n APIs and standard JavaScript patterns
Research date: 2026-02-08 Valid until: 2026-03-08 (30 days - stable technology stack)
Implementation Recommendations
Based on research findings and user constraints:
1. Trace Format (Claude's Discretion)
Recommendation: Hybrid approach — structured error objects (always on) + selective debug traces (opt-in).
Rationale: Errors are rare and always need full context. Debug traces are verbose and only needed for specific pain points. Separate ring buffers prevent trace noise from obscuring errors.
Structure:
- `staticData.errors.buffer` - 50 entries, always on
- `staticData.traces.buffer` - 50 entries, only when `staticData.debug.enabled = true`
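Since both buffers share the same push-and-rotate logic, it can be factored into one helper reused by the error and trace Code nodes. A sketch; `pushToRing` and the `prefix` field are assumptions, not part of the patterns above:

```javascript
// Hypothetical shared helper for both ring buffers in workflow static data.
// store: { buffer: [], nextId: number, count?: number, prefix?: string }
function pushToRing(store, entry, maxEntries = 50) {
  const id = `${store.prefix || 'entry'}_${String(store.nextId).padStart(3, '0')}`;
  store.buffer.push({ ...entry, id });
  if (store.buffer.length > maxEntries) store.buffer.shift(); // Drop oldest
  store.nextId++;
  store.count = (store.count || 0) + 1; // All-time counter survives rotation
  return id;
}
```

Usage: `pushToRing(staticData.errors, errorEntry)` in the error path, `pushToRing(staticData.traces, traceEntry)` in debug-mode traces.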
2. Trace Scope (Claude's Discretion)
Recommendation: Trace only errors (always) + three pain points (debug mode only).
Pain point traces (debug mode only):
- Sub-workflow boundaries: Capture input/output at Execute Workflow nodes
- Callback routing: Capture which switch path taken in Route Callback node
- n8n API queries: (No tracing needed — query via API is already structured)
Rationale: Tracing every execution creates noise. Focus on high-value data: errors (always actionable) and specific debug scenarios (when Claude needs deep visibility).
3. Structured vs. Simple Logs (Claude's Discretion)
Recommendation: Structured JSON objects.
Rationale: Claude needs programmatic access to query by correlationId, workflow, node, error type. Simple log lines require text parsing; structured objects enable direct field access.
4. Debug Toggle Mechanism (Claude's Discretion)
Recommendation: Global toggle via Telegram command (/debug on|off) with auto-disable after 100 executions.
Rationale: Global toggle is simplest. Per-request debugging adds complexity (need to tag specific requests). Always-on would fill ring buffer with traces. Auto-disable prevents performance impact from forgotten debug mode.
5. Log Level Granularity (Claude's Discretion)
Recommendation: Binary on/off for debug mode. Errors are always logged (no levels).
Rationale: Traditional log levels (error/warn/info/debug) are for application logs. Workflow debugging has two modes: normal (errors only) and debug (errors + traces). Additional levels add complexity without benefit.
6. Specific Debug Data to Capture (Claude's Discretion)
Recommendation: Minimal boundary data + routing decisions.
Capture:
- Sub-workflow I/O: `{ input: {...}, output: {...}, duration: 234 }`
- Callback routing: `{ callbackData: "...", routeTaken: "...", availableRoutes: [...] }`
- Docker API responses: `{ httpCode: 404, rawResponse: "..." }` (truncate to 1000 chars)
Don't capture:
- Every node execution (too verbose)
- Full execution data from n8n API (query on-demand, don't cache)
- User messages, Telegram webhook payloads (not relevant to pain points)
7. Telegram Command Interface (Claude's Discretion)
Recommendation:
| Command | Description | Hidden? |
|---|---|---|
| `/errors [count]` | Show last N errors (default 5) | Yes (unlisted) |
| `/clear-errors` | Clear error ring buffer | Yes (unlisted) |
| `/debug on\|off\|status` | Toggle debug mode | Yes (unlisted) |
| `/trace <correlationId>` | Show all entries for correlation ID | Yes (unlisted) |
Rationale: Developer/debug tools should be hidden (not in /help menu). Claude can use them during debugging sessions. User never needs to see these commands.