docs(16): research Unraid GraphQL API migration patterns
@@ -0,0 +1,767 @@
|
|||||||
|
# Phase 16: API Migration - Research
|
||||||
|
|
||||||
|
**Researched:** 2026-02-09
|
||||||
|
**Domain:** Unraid GraphQL API migration for Docker container operations
|
||||||
|
**Confidence:** HIGH
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Phase 16 replaces all Docker socket proxy API calls with Unraid GraphQL API mutations and queries. This is a **pure substitution migration** — the user experience remains identical (same Telegram commands, same responses, same timing), but the backend switches from Docker Engine REST API to Unraid's GraphQL API.
|
||||||
|
|
||||||
|
The migration complexity is mitigated by Phase 15 infrastructure: Container ID Registry handles ID translation (Docker 64-char hex → Unraid 129-char PrefixedID), GraphQL Response Normalizer transforms API responses to Docker contract format, and GraphQL Error Handler standardizes error checking. The workflows already have 60+ Code nodes expecting Docker API response shapes — the normalizer ensures zero changes to these downstream nodes.
|
||||||
|
|
||||||
|
Key architectural wins: (1) A single `updateContainer` GraphQL mutation replaces the 5-step Docker flow (inspect → stop → remove → create → start) and its follow-up image cleanup, (2) batch operations can use the plural `updateContainers` mutation instead of N serial API calls, (3) Unraid update badges clear automatically (no manual "Apply Update" clicks), (4) there is no Docker socket proxy security boundary left to manage.
|
||||||
|
|
||||||
|
**Primary recommendation:** Migrate workflows in dependency order (n8n-status.json first for container listing, then n8n-actions.json for lifecycle, then n8n-update.json for updates), using the Phase 15 utility nodes as drop-in replacements for Docker API HTTP Request nodes. Keep existing Code node logic unchanged — let normalizer/error handler bridge the API differences.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Standard Stack
|
||||||
|
|
||||||
|
### Core
|
||||||
|
|
||||||
|
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| Unraid GraphQL API | 7.2+ native | Container lifecycle and update operations | Official Unraid interface, same mechanism as WebGUI, v1.3 Phase 14 verified |
| Phase 15 utility nodes | Current | Data transformation layer | Container ID Registry, GraphQL Normalizer, Error Handler; purpose-built for this migration |
| n8n HTTP Request node | Built-in | GraphQL client | GraphQL-over-HTTP with POST method, 15s timeout for myunraid.net relay |
|
||||||
|
|
||||||
|
### Supporting
|
||||||
|
|
||||||
|
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| Unraid API HTTP Template | Phase 15-02 | Pre-configured HTTP node | Duplicate and modify query for each GraphQL call |
| Container ID Registry | Phase 15-01 | Name ↔ PrefixedID mapping | All GraphQL mutations (require 129-char PrefixedID format) |
| Callback Token Encoder/Decoder | Phase 15-01 | Telegram callback data encoding | Inline keyboard callbacks with PrefixedIDs (64-byte limit) |
|
||||||
|
|
||||||
|
### Alternatives Considered
|
||||||
|
|
||||||
|
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| GraphQL API | Keep Docker socket proxy | Misses architectural goal (single API), no update badge sync, security boundary remains |
| Single updateContainer mutation | 5-step Docker flow via GraphQL | Not possible: Unraid doesn't expose low-level primitives, GraphQL abstracts container recreation |
| Normalizer layer | Rewrite 60+ Code nodes for Unraid response shape | High risk, massive changeset, testing nightmare |
| Container ID Registry | Store only container names, fetch ID on each mutation | N extra API calls, latency overhead, cache staleness risk |
|
||||||
|
|
||||||
|
**Installation:**
|
||||||
|
|
||||||
|
No new dependencies. Phase 15 utility nodes already deployed in n8n-workflow.json. Migration uses existing HTTP Request nodes (duplicate template, wire to normalizer/error handler).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Architecture Patterns
|
||||||
|
|
||||||
|
### Pattern 1: GraphQL Query Migration (Container Listing)
|
||||||
|
|
||||||
|
**What:** Replace Docker API `GET /containers/json` with Unraid GraphQL `containers` query
|
||||||
|
|
||||||
|
**When to use:** n8n-status.json (container list/status), n8n-batch-ui.json (batch selection), main workflow (container lookups)
|
||||||
|
|
||||||
|
**Example migration:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API):
|
||||||
|
// HTTP Request node: GET http://docker-socket-proxy:2375/containers/json?all=true
|
||||||
|
// Response: [{ "Id": "abc123", "Names": ["/plex"], "State": "running" }]
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// 1. Duplicate "Unraid API HTTP Template" node
|
||||||
|
// 2. Set query body:
|
||||||
|
{
|
||||||
|
"query": "query { docker { containers { id names state image } } }"
|
||||||
|
}
|
||||||
|
|
||||||
|
// 3. Wire: HTTP Request → GraphQL Response Normalizer → (existing downstream Code nodes)
|
||||||
|
// Normalizer output: [{ "Id": "server_hash:container_hash", "Names": ["/plex"], "State": "running", "_unraidId": "..." }]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key pattern:** Normalizer transforms Unraid response to Docker contract — downstream nodes see identical data structure.
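
To make the contract concrete, here is a minimal sketch of the kind of mapping the normalizer performs. It is not the Phase 15 implementation: the function name `normalizeContainers` and the `Image` output field are assumptions for illustration, and only the `Id`, `Names`, `State`, and `_unraidId` fields come from the normalizer output shown above.

```javascript
// Hypothetical sketch of the normalizer mapping; the real Phase 15 node may differ.
function normalizeContainers(graphqlBody) {
  // Tolerate a missing data/docker/containers path by returning an empty list.
  const containers = graphqlBody?.data?.docker?.containers ?? [];
  return containers.map((c) => ({
    Id: c.id,                              // Unraid PrefixedID stands in for the Docker ID
    Names: c.names,                        // already Docker-style, e.g. ["/plex"]
    State: String(c.state).toLowerCase(),  // "RUNNING" becomes "running"
    Image: c.image,                        // assumed field name, mirrors Docker's "Image"
    _unraidId: c.id,                       // kept so later mutations can reuse the ID
  }));
}

const sample = {
  data: {
    docker: {
      containers: [
        { id: "server_hash:container_hash", names: ["/plex"], state: "RUNNING", image: "plexinc/pms-docker:latest" },
      ],
    },
  },
};

console.log(normalizeContainers(sample));
```

In an n8n Code node the returned array would typically be wrapped as `{ json: ... }` items before being handed to the downstream nodes.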
|
||||||
|
|
||||||
|
**Source:** Phase 15-02 Plan (GraphQL Response Normalizer implementation)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pattern 2: GraphQL Mutation Migration (Container Start/Stop/Restart)
|
||||||
|
|
||||||
|
**What:** Replace Docker API `POST /containers/{id}/start` with Unraid GraphQL `start(id: PrefixedID!)` mutation
|
||||||
|
|
||||||
|
**When to use:** n8n-actions.json (start/stop/restart operations)
|
||||||
|
|
||||||
|
**Example migration:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API):
|
||||||
|
// HTTP Request: POST http://docker-socket-proxy:2375/v1.47/containers/abc123/start
|
||||||
|
// On 304: Container already started (handled by existing Code node checking statusCode === 304)
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// 1. Look up PrefixedID from Container ID Registry (by container name)
|
||||||
|
// 2. Call GraphQL mutation:
|
||||||
|
{
|
||||||
|
"query": "mutation { docker { start(id: \"server_hash:container_hash\") { id state } } }"
|
||||||
|
}
|
||||||
|
|
||||||
|
// 3. Wire: HTTP Request → GraphQL Error Handler → (existing downstream Code nodes)
|
||||||
|
// Error Handler maps ALREADY_IN_STATE error to { statusCode: 304, alreadyInState: true }
|
||||||
|
// Existing Code node: if (response.statusCode === 304) { /* already started */ }
|
||||||
|
```
|
||||||
|
|
||||||
|
**RESTART special case:** No native `restart` mutation in Unraid GraphQL. Implement as sequential `stop` + `start`:
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// GraphQL has no restart mutation — use two operations:
|
||||||
|
// 1. mutation { docker { stop(id: "...") { id state } } }
|
||||||
|
// 2. mutation { docker { start(id: "...") { id state } } }
|
||||||
|
// Wire: Stop HTTP → Error Handler → Start HTTP → Error Handler → Success Response
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key pattern:** Error Handler maps GraphQL error codes to HTTP status codes (ALREADY_IN_STATE → 304) — existing Code nodes unchanged.
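
The mapping can be sketched as a small pure function. This is an illustration only, not the Phase 15 Error Handler: the name `handleGraphqlResponse`, the `statusCode: 200` and `statusCode: 500` values, and the `message` field are assumptions. The `ALREADY_IN_STATE` → `{ statusCode: 304, alreadyInState: true }` mapping is the part taken from this document.

```javascript
// Hypothetical sketch of the error mapping; field names other than
// statusCode 304 / alreadyInState / success are assumptions.
function handleGraphqlResponse(body) {
  const errors = Array.isArray(body?.errors) ? body.errors : [];

  // GraphQL reports failures in an "errors" array, often alongside HTTP 200.
  if (errors.some((e) => e?.extensions?.code === "ALREADY_IN_STATE")) {
    return { statusCode: 304, alreadyInState: true };
  }
  if (errors.length > 0) {
    return {
      success: false,
      statusCode: 500,
      message: errors.map((e) => e.message).join("; "),
    };
  }
  return { success: true, statusCode: 200, data: body?.data ?? null };
}

// Example: starting a container that is already running.
const alreadyRunning = {
  errors: [{ message: "Container already started", extensions: { code: "ALREADY_IN_STATE" } }],
  data: null,
};
console.log(handleGraphqlResponse(alreadyRunning));
```

Keeping the function free of n8n-specific globals makes it easy to exercise with canned responses before wiring it into a workflow.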
|
||||||
|
|
||||||
|
**Source:** Unraid GraphQL schema (DockerMutations type), Phase 15-02 Plan (GraphQL Error Handler implementation)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pattern 3: Single Container Update Migration (5-Step Flow → 1 Mutation)
|
||||||
|
|
||||||
|
**What:** Replace Docker's 5-step update flow with single `updateContainer(id: PrefixedID!)` mutation
|
||||||
|
|
||||||
|
**When to use:** n8n-update.json (single container update), main workflow (text command "update \<name\>")
|
||||||
|
|
||||||
|
**Current Docker flow (5 steps plus cleanup):**
1. Inspect container (get current config)
2. Stop container
3. Remove container
4. Create container (with new image)
5. Start container
6. Remove old image (cleanup)
|
||||||
|
|
||||||
|
**New 1-step Unraid flow:**
|
||||||
|
```javascript
|
||||||
|
// Single GraphQL mutation replaces entire flow:
|
||||||
|
{
|
||||||
|
"query": "mutation { docker { updateContainer(id: \"server_hash:container_hash\") { id state image imageId } } }"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Unraid internally handles: pull new image, stop, remove, recreate, start
|
||||||
|
// Returns: Updated container object (normalized by GraphQL Response Normalizer)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Success criteria verification:**
|
||||||
|
- **Before:** Check old vs new image digest to confirm update happened
|
||||||
|
- **After:** Unraid mutation updates `imageId` field — compare before/after values
|
||||||
|
|
||||||
|
**Migration steps:**
|
||||||
|
1. Get container name from user input
|
||||||
|
2. Look up current container state (for "before" imageId comparison)
|
||||||
|
3. Look up PrefixedID from Container ID Registry
|
||||||
|
4. Call `updateContainer` mutation
|
||||||
|
5. Normalize response
|
||||||
|
6. Compare imageId: if different → updated, if same → no update available
|
||||||
|
7. Return same success/failure messages as before
|
||||||
|
|
||||||
|
**Key win:** Simpler flow, Unraid handles retry logic and state management, update badge clears automatically.
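
One way to build the request body is with GraphQL variables rather than string interpolation, which avoids quoting problems in the ID. The sketch below assumes the Unraid endpoint accepts the standard `variables` field, as spec-compliant GraphQL servers do; this has not been verified here. The helper names `buildUpdateRequest` and `wasUpdated` are hypothetical.

```javascript
// Hypothetical helpers; assumes the endpoint accepts standard GraphQL variables.
function buildUpdateRequest(prefixedId) {
  // PrefixedIDs in this document have the form "server_hash:container_hash".
  if (typeof prefixedId !== "string" || !prefixedId.includes(":")) {
    throw new Error(`Expected a PrefixedID like "server_hash:container_hash", got: ${prefixedId}`);
  }
  return {
    query:
      "mutation UpdateOne($id: PrefixedID!) { docker { updateContainer(id: $id) { id state image imageId } } }",
    variables: { id: prefixedId },
  };
}

// Step 6 of the migration: a changed imageId means an update was applied.
function wasUpdated(oldImageId, newImageId) {
  return Boolean(oldImageId) && Boolean(newImageId) && oldImageId !== newImageId;
}

console.log(JSON.stringify(buildUpdateRequest("server_hash:container_hash"), null, 2));
console.log(wasUpdated("sha256:aaa", "sha256:bbb"));
```

The returned object can be sent as the JSON body of the HTTP Request node.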
|
||||||
|
|
||||||
|
**Source:** Unraid GraphQL schema (DockerMutations.updateContainer), WebSearch results (Unraid update implementation shells to Dynamix Docker Manager)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pattern 4: Batch Update Migration (Serial → Parallel)
|
||||||
|
|
||||||
|
**What:** Replace N serial Docker update flows with single `updateContainers(ids: [PrefixedID!]!)` mutation
|
||||||
|
|
||||||
|
**When to use:** Batch update (multiple container selection), "Update All :latest" feature
|
||||||
|
|
||||||
|
**Example migration:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API): Loop over selected containers, call update flow N times serially
|
||||||
|
// for (const container of selectedContainers) {
|
||||||
|
// await updateDockerContainer(container.id); // 5-step flow each
|
||||||
|
// }
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// 1. Look up all PrefixedIDs from Container ID Registry (by names)
|
||||||
|
// 2. Single mutation:
|
||||||
|
{
|
||||||
|
"query": "mutation { docker { updateContainers(ids: [\"id1\", \"id2\", \"id3\"]) { id state imageId } } }"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Returns: Array of updated containers (each normalized)
|
||||||
|
```
|
||||||
|
|
||||||
|
**"Update All :latest" special case:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// Option 1: Filter in workflow Code node, call updateContainers
|
||||||
|
// 1. Query all containers: query { docker { containers { id image } } }
|
||||||
|
// 2. Filter where image.endsWith(':latest')
|
||||||
|
// 3. Call updateContainers(ids: [...filteredIds])
|
||||||
|
|
||||||
|
// Option 2: Use updateAllContainers mutation (updates everything, slower)
|
||||||
|
{
|
||||||
|
"query": "mutation { docker { updateAllContainers { id state imageId } } }"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Recommendation: Option 1 (filtered updateContainers) — matches current ":latest" filter behavior
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key pattern:** Batch efficiency — 1 API call instead of N, Unraid handles parallelization internally.
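
The filtered variant (Option 1) could look roughly like the following. It is a sketch under assumptions: `buildLatestBatchRequest` is a hypothetical name, the input is the raw `containers` array from the query above (fields `id` and `image`), and the endpoint is assumed to accept standard GraphQL variables.

```javascript
// Hypothetical helper: select ":latest" containers and build one batch mutation.
function buildLatestBatchRequest(containers) {
  const ids = containers
    .filter((c) => typeof c.image === "string" && c.image.endsWith(":latest"))
    .map((c) => c.id);

  // Nothing to update: let the caller skip the HTTP request entirely.
  if (ids.length === 0) return null;

  return {
    query:
      "mutation UpdateMany($ids: [PrefixedID!]!) { docker { updateContainers(ids: $ids) { id state imageId } } }",
    variables: { ids },
  };
}

const listed = [
  { id: "srv:aaa", image: "plexinc/pms-docker:latest" },
  { id: "srv:bbb", image: "postgres:16" },
  { id: "srv:ccc", image: "linuxserver/sonarr:latest" },
];
console.log(buildLatestBatchRequest(listed));
```

Returning `null` for an empty selection keeps the "nothing to update" case out of the mutation path.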
|
||||||
|
|
||||||
|
**Source:** Unraid GraphQL schema (DockerMutations.updateContainers, updateAllContainers)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pattern 5: Container ID Registry Usage
|
||||||
|
|
||||||
|
**What:** All GraphQL mutations require Unraid's 129-character PrefixedID format — use Container ID Registry to map container names to IDs
|
||||||
|
|
||||||
|
**When to use:** Every mutation call (start, stop, update), every inline keyboard callback (encode PrefixedID into 64-byte limit)
|
||||||
|
|
||||||
|
**Workflow integration:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// 1. User input: container name (e.g., "plex")
|
||||||
|
// 2. Look up in Container ID Registry:
|
||||||
|
// Input: { action: "lookup", containerName: "plex" }
|
||||||
|
// Output: { prefixedId: "server_hash:container_hash", found: true }
|
||||||
|
// 3. Use prefixedId in GraphQL mutation
|
||||||
|
// 4. Store result back in registry (cache refresh)
|
||||||
|
|
||||||
|
// Cache refresh pattern:
|
||||||
|
// After GraphQL query/mutation returns container data:
|
||||||
|
// Input: { action: "updateCache", containers: [...normalizedContainers] }
|
||||||
|
// Registry extracts Names[0] and Id, updates internal map
|
||||||
|
```
|
||||||
|
|
||||||
|
**Callback encoding:**
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// Inline keyboard callbacks (64-byte limit):
|
||||||
|
// BEFORE: "s:abc123" (status, Docker ID)
|
||||||
|
// AFTER: Use Callback Token Encoder
|
||||||
|
// Input: { containerName: "plex", action: "status" }
|
||||||
|
// Output: "s:1a2b3c4d" (8-char hash token, deterministic)
|
||||||
|
// Decoder: "s:1a2b3c4d" → lookup in registry → "plex" → get PrefixedID
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key pattern:** Registry is the single source of truth for name ↔ PrefixedID mapping. Update it after every GraphQL query/mutation that returns container data.
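
As an illustration of the lookup and cache-refresh behavior described above, here is a minimal in-memory sketch. It is not the Phase 15 node: `createRegistry` is a hypothetical name and the storage (a plain `Map`) is an assumption. The `lookup` output shape and the use of `Names[0]` and `Id` follow the examples in this section.

```javascript
// Hypothetical in-memory registry; the real node may persist state differently.
function createRegistry() {
  const byName = new Map();
  return {
    // Refresh name -> PrefixedID mappings from normalized container objects.
    updateCache(containers) {
      for (const c of containers) {
        const name = (c.Names?.[0] ?? "").replace(/^\//, ""); // "/plex" becomes "plex"
        if (name && c.Id) byName.set(name, c.Id);
      }
    },
    lookup(containerName) {
      const prefixedId = byName.get(containerName);
      return prefixedId
        ? { prefixedId, found: true }
        : { prefixedId: null, found: false };
    },
  };
}

const registry = createRegistry();
registry.updateCache([{ Id: "srv:old_hash", Names: ["/plex"] }]);
// After an update the container is recreated, so the cache must be refreshed.
registry.updateCache([{ Id: "srv:new_hash", Names: ["/plex"] }]);
console.log(registry.lookup("plex"));
console.log(registry.lookup("missing"));
```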
|
||||||
|
|
||||||
|
**Source:** Phase 15-01 Plan (Container ID Registry implementation)
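
The callback token idea can be sketched as follows. The hash used by the actual Callback Token Encoder is not specified in this document, so FNV-1a (32-bit) is swapped in here purely as an example of a deterministic 8-hex-character token. The names `fnv1a32Hex`, `encodeCallback`, and `decodeCallback`, and the token-to-name `Map`, are hypothetical.

```javascript
// FNV-1a 32-bit hash rendered as 8 hex characters (stand-in for the real hash).
function fnv1a32Hex(text) {
  let hash = 0x811c9dc5;
  for (const byte of new TextEncoder().encode(text)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16).padStart(8, "0");
}

// Build "<actionCode>:<token>" and enforce Telegram's 64-byte callback limit.
function encodeCallback(actionCode, containerName, tokenToName) {
  const token = fnv1a32Hex(containerName);
  tokenToName.set(token, containerName);
  const data = `${actionCode}:${token}`;
  if (new TextEncoder().encode(data).length > 64) {
    throw new Error("callback data exceeds 64 bytes");
  }
  return data;
}

// Resolve the token back to a container name, or null if it is unknown.
function decodeCallback(data, tokenToName) {
  const [actionCode, token] = data.split(":");
  return { actionCode, containerName: tokenToName.get(token) ?? null };
}

const tokens = new Map();
const encoded = encodeCallback("s", "plex", tokens);
console.log(encoded, decodeCallback(encoded, tokens));
```

A 32-bit token can collide across container names; a real encoder would need to detect that when populating the map.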
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Anti-Patterns to Avoid
|
||||||
|
|
||||||
|
- **Rewriting existing Code nodes:** GraphQL Normalizer exists to prevent this; use it
- **Storing PrefixedIDs in Telegram callback data directly:** Too long (129 chars vs 64-byte limit); use Callback Token Encoder
- **Calling GraphQL mutations without Error Handler:** Skips ALREADY_IN_STATE → 304 mapping, breaks existing error logic
- **Querying containers without updating Registry cache:** Stale ID lookups, mutations fail with "container not found"
- **Using Docker container IDs in GraphQL calls:** Unraid expects PrefixedID format, Docker IDs are incompatible
- **Implementing custom restart via low-level operations:** Unraid doesn't expose container create/remove; use stop + start pattern
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Don't Hand-Roll
|
||||||
|
|
||||||
|
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| GraphQL response transformation | Custom mapping for each Code node | Phase 15 GraphQL Response Normalizer | 60+ Code nodes expect Docker contract, normalizer handles all |
| Container ID translation | Ad-hoc lookups in each workflow | Phase 15 Container ID Registry | Single source of truth, cache management, name resolution |
| Error code mapping | Custom error checks per node | Phase 15 GraphQL Error Handler | Standardized ALREADY_IN_STATE → 304, NOT_FOUND handling |
| Callback data encoding | Custom compression/truncation | Phase 15 Callback Token Encoder | Deterministic 8-char hash, 64-byte limit compliance |
| Restart mutation | Try to recreate container via GraphQL | Sequential stop + start | Unraid abstracts low-level ops, no create/remove exposed |
|
||||||
|
|
||||||
|
**Key insight:** Phase 15 infrastructure was built specifically to make this migration low-risk. Using it prevents cascading changes across 60+ nodes.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Common Pitfalls
|
||||||
|
|
||||||
|
### Pitfall 1: Forgetting to Update Container ID Registry Cache
|
||||||
|
|
||||||
|
**What goes wrong:** User updates container via bot. Next command uses stale registry cache, mutation fails with "container not found: server_hash:old_container_hash".
|
||||||
|
|
||||||
|
**Why it happens:** `updateContainer` mutation recreates the container with a new ID (same as Docker update flow). Registry still has the old PrefixedID.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. After every GraphQL query/mutation that returns container data, wire through Registry's "updateCache" action
|
||||||
|
2. Extract normalized containers from response, pass to Registry
|
||||||
|
3. Registry refreshes name → PrefixedID mappings
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Mutation succeeds, but next command on same container fails
|
||||||
|
- "Container not found" errors after successful updates
|
||||||
|
- Registry lookup returns PrefixedID that doesn't exist in Unraid
|
||||||
|
|
||||||
|
**Prevention pattern:**
|
||||||
|
```javascript
|
||||||
|
// After updateContainer mutation:
|
||||||
|
// 1. Normalize response (get updated container object)
|
||||||
|
// 2. Update Registry cache:
|
||||||
|
// Input: { action: "updateCache", containers: [normalizedContainer] }
|
||||||
|
// 3. Proceed with success message
|
||||||
|
```
|
||||||
|
|
||||||
|
**Source:** Docker behavior (container ID changes on recreate), Phase 15-01 design
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 2: GraphQL Timeout on Slow Update Operations
|
||||||
|
|
||||||
|
**What goes wrong:** `updateContainer` mutation for large container (10GB+ image) times out at 15 seconds, leaving container in intermediate state (stopped, old image removed).
|
||||||
|
|
||||||
|
**Why it happens:** Phase 15 HTTP Template uses 15-second timeout for myunraid.net cloud relay latency. Container updates can take 30+ seconds for large images.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. **Increase timeout for update mutations specifically:** Duplicate HTTP Template, set timeout to 60000ms (60s) for updateContainer/updateContainers nodes
|
||||||
|
2. **Keep 15s timeout for queries and quick mutations** (start/stop)
|
||||||
|
3. Document in ARCHITECTURE.md: "Update operations have 60s timeout to accommodate large image pulls"
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Timeout errors during container updates (not start/stop)
|
||||||
|
- Containers stuck in "stopped" state after timeout
|
||||||
|
- Unraid shows "pulling image" in Docker tab, but bot reports failure
|
||||||
|
|
||||||
|
**Recommended timeouts by operation:**
|
||||||
|
- Queries (containers list): 15s (current)
|
||||||
|
- Start/stop/restart: 15s (current)
|
||||||
|
- Single container update: 60s (increase)
|
||||||
|
- Batch updates: 120s (increase further)
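
These values could be kept in one place so each HTTP Request node picks its timeout by operation. A minimal sketch, assuming hypothetical operation keys and a helper named `timeoutFor`; the millisecond values are the ones recommended above.

```javascript
// Recommended timeouts from this section, in milliseconds.
// The operation keys are hypothetical labels, not n8n or Unraid identifiers.
const TIMEOUTS_MS = new Map([
  ["query", 15000],
  ["start", 15000],
  ["stop", 15000],
  ["updateContainer", 60000],
  ["updateContainers", 120000],
]);

// Fall back to the 15s template default for anything not listed.
function timeoutFor(operation) {
  return TIMEOUTS_MS.get(operation) ?? 15000;
}

console.log(timeoutFor("updateContainer"));
console.log(timeoutFor("start"));
```

A restart is two calls (stop, then start), so each call gets its own 15s budget.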
|
||||||
|
|
||||||
|
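**Source:** Docker image pull times scale with image size and link speed (a gigabit link tops out near 125 MB/s, so multi-gigabyte pulls routinely exceed 15s and the largest images can exceed 60s); myunraid.net relay adds 200-500ms per request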
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 3: ALREADY_IN_STATE Not Mapped to HTTP 304
|
||||||
|
|
||||||
|
**What goes wrong:** User taps "Start" on running container. GraphQL returns ALREADY_IN_STATE error. Existing Code node expects `statusCode === 304`, throws generic error instead of "already started" message.
|
||||||
|
|
||||||
|
**Why it happens:** Forgetting to wire GraphQL Error Handler between HTTP Request and existing Code node.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. **Every GraphQL mutation HTTP Request node MUST wire through GraphQL Error Handler**
|
||||||
|
2. Error Handler maps `error.extensions.code === "ALREADY_IN_STATE"` → `{ statusCode: 304, alreadyInState: true }`
|
||||||
|
3. Existing Code nodes check `response.statusCode === 304` unchanged
|
||||||
|
|
||||||
|
**Warning signs:**
|
||||||
|
- Generic error messages instead of "Container already started"
|
||||||
|
- Errors when user repeats same action (stop stopped container, etc.)
|
||||||
|
- Code nodes throwing on ALREADY_IN_STATE instead of graceful handling
|
||||||
|
|
||||||
|
**Correct wiring:**
|
||||||
|
```
HTTP Request (GraphQL mutation)
  ↓
GraphQL Error Handler (maps ALREADY_IN_STATE → 304)
  ↓
Existing Code node (checks statusCode === 304)
```
|
||||||
|
|
||||||
|
**Source:** Phase 15-02 Plan (GraphQL Error Handler implementation), n8n-actions.json existing pattern
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 4: Restart Implementation Without Error Handling
|
||||||
|
|
||||||
|
**What goes wrong:** Restart operation calls `stop` mutation, which fails with ALREADY_IN_STATE (container already stopped). Sequential `start` mutation never executes, user sees error.
|
||||||
|
|
||||||
|
**Why it happens:** Implementing restart as sequential `stop` + `start` without ALREADY_IN_STATE tolerance.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. **Stop mutation:** Wire through Error Handler, **continue on 304** (already stopped is OK)
|
||||||
|
2. **Start mutation:** Wire through Error Handler, fail on ALREADY_IN_STATE for start (indicates logic error)
|
||||||
|
3. Use n8n "Continue On Fail" or explicit error checking in Code node
|
||||||
|
|
||||||
|
**Correct implementation:**
|
||||||
|
```
1. Stop mutation → Error Handler
   - On 304: Continue to start (container was already stopped, fine)
   - On error: Fail restart operation
2. Start mutation → Error Handler
   - On success: Return "restarted" message
   - On 304: Fail (container started during restart, unexpected)
   - On error: Fail restart operation
```
|
||||||
|
|
||||||
|
**Alternative:** Check container state first, only stop if running. Adds latency but avoids ALREADY_IN_STATE on stop.
|
||||||
|
|
||||||
|
**Source:** Unraid GraphQL schema (no native restart mutation), standard restart logic patterns
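
The decision rules above can be expressed as two small checks, one after each Error Handler. This is a sketch, not the workflow's actual Code nodes: the function names and message strings are hypothetical, and the inputs are assumed to have the Error Handler output shape described in this document (`{ success: true }` or `{ statusCode: 304, alreadyInState: true }`).

```javascript
// After the stop mutation: "already stopped" (304) is acceptable for a restart.
function shouldProceedToStart(stopResult) {
  return stopResult.success === true || stopResult.statusCode === 304;
}

// After the start mutation: 304 is unexpected because we just stopped the container.
function evaluateStart(startResult) {
  if (startResult.statusCode === 304) {
    return { ok: false, message: "Restart failed: container was already running" };
  }
  if (startResult.success !== true) {
    return { ok: false, message: "Restart failed: start mutation returned an error" };
  }
  return { ok: true, message: "Container restarted" };
}

console.log(shouldProceedToStart({ statusCode: 304, alreadyInState: true }));
console.log(evaluateStart({ success: true }));
```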
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 5: Batch Update Progress Not Visible
|
||||||
|
|
||||||
|
**What goes wrong:** User selects 10 containers for batch update. Bot sends "Updating..." then silence for 2 minutes, then "Done". User doesn't know if bot is working or stuck.
|
||||||
|
|
||||||
|
**Why it happens:** The `updateContainers` mutation is a single blocking call: it returns only after every container has been updated and emits no progress events along the way.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. **Keep existing Docker pattern:** Serial updates with Telegram message edits per container
|
||||||
|
2. **Alternative (faster but no progress):** Use `updateContainers` mutation, send initial "Updating X containers..." then final result
|
||||||
|
3. **Hybrid (recommended):** Small batches (≤5) use `updateContainers` for speed, large batches (>5) use serial with progress
|
||||||
|
|
||||||
|
**Implementation for hybrid:**
|
||||||
|
```javascript
// In batch update Code node. lookupPrefixedId, updateContainers, updateContainer,
// and editTelegramMessage stand in for the corresponding workflow nodes.
if (selectedContainers.length <= 5) {
  // Fast path: Single updateContainers mutation
  const ids = selectedContainers.map(c => lookupPrefixedId(c.name));
  await updateContainers(ids);
  return { message: `Updated ${selectedContainers.length} containers` };
} else {
  // Progress path: Serial updates with Telegram edits
  const total = selectedContainers.length;
  for (const [i, container] of selectedContainers.entries()) {
    await updateContainer(container.prefixedId);
    await editTelegramMessage(`Updated ${i + 1}/${total}: ${container.name}`);
  }
  return { message: `Updated ${total} containers` };
}
```
|
||||||
|
|
||||||
|
**Tradeoff:** Progress visibility vs speed. User decision from v1.2 batch work: progress is important.
|
||||||
|
|
||||||
|
**Source:** v1.2 batch operations design, user feedback on "silent operations"
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Pitfall 6: Update Badge Still Shows After Bot Update
|
||||||
|
|
||||||
|
**What goes wrong:** User updates container via bot. Unraid Docker tab still shows "apply update" badge. User clicks badge, update completes instantly (image already cached).
|
||||||
|
|
||||||
|
**Why it happens:** This is **the problem v1.4 solves**. If it still occurs, GraphQL mutation isn't properly clearing Unraid's internal update tracking.
|
||||||
|
|
||||||
|
**How to avoid:**
|
||||||
|
1. **Verify GraphQL mutation returns success** (not just HTTP 200, but valid container object)
|
||||||
|
2. **Check Unraid version:** Update badge sync requires Unraid 7.2+ or a recent version of the Connect plugin
|
||||||
|
3. **Test in real environment:** Synthetic tests may not reveal badge state issues
|
||||||
|
|
||||||
|
**Verification test:**
|
||||||
|
```bash
|
||||||
|
# 1. Via bot: Update container
|
||||||
|
# 2. Check Unraid Docker tab: Badge should be GONE
|
||||||
|
# 3. If badge remains: Check Unraid logs for GraphQL mutation execution
|
||||||
|
# 4. If logs show success but badge remains: Unraid bug, report to Unraid team
|
||||||
|
```
|
||||||
|
|
||||||
|
**Expected behavior (success):** After `updateContainer` mutation completes, refreshing Unraid Docker tab shows no update badge for that container.
|
||||||
|
|
||||||
|
**If badge persists:** Check Unraid API version, verify mutation actually executed (not just HTTP success), check Unraid internal logs (`/var/log/syslog`).
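
To distinguish "HTTP 200" from "mutation actually executed", the response body itself has to be inspected. A minimal sketch, assuming the response nests the result under `data.docker.updateContainer` (mirroring the mutation's shape) and using a hypothetical helper name `updateSucceeded`:

```javascript
// Returns true only when the body has no GraphQL errors and carries a container object.
function updateSucceeded(body) {
  if (Array.isArray(body?.errors) && body.errors.length > 0) return false;
  const container = body?.data?.docker?.updateContainer;
  return Boolean(container && typeof container.id === "string" && container.id.length > 0);
}

// HTTP 200 with an errors array is still a failure.
console.log(updateSucceeded({ errors: [{ message: "pull failed" }], data: null }));
// HTTP 200 with a populated container object is a success.
console.log(
  updateSucceeded({ data: { docker: { updateContainer: { id: "srv:new_hash", state: "RUNNING" } } } })
);
```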
|
||||||
|
|
||||||
|
**Source:** v1.3 Known Limitations (update badge issue), v1.4 migration goal, Unraid GraphQL API design
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Code Examples
|
||||||
|
|
||||||
|
### Container List Query Migration
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API):
|
||||||
|
// HTTP Request node: GET http://docker-socket-proxy:2375/containers/json?all=true
|
||||||
|
// Next node (Code): processes response as-is
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// HTTP Request node (duplicate "Unraid API HTTP Template"):
|
||||||
|
{
|
||||||
|
"method": "POST",
|
||||||
|
"url": "={{ $env.UNRAID_HOST }}/graphql",
|
||||||
|
"body": {
|
||||||
|
"query": "query { docker { containers { id names state image } } }"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Wire: HTTP Request → GraphQL Response Normalizer → Update Container ID Registry → (existing Code nodes)
|
||||||
|
|
||||||
|
// Normalizer transforms:
|
||||||
|
// IN: { data: { docker: { containers: [{ id: "hash:hash", names: ["/plex"], state: "RUNNING" }] } } }
|
||||||
|
// OUT: [{ Id: "hash:hash", Names: ["/plex"], State: "running", _unraidId: "hash:hash" }]
|
||||||
|
|
||||||
|
// Registry update (Code node after normalizer):
|
||||||
|
const containers = $input.all().map(item => item.json);
|
||||||
|
const registryInput = {
|
||||||
|
action: "updateCache",
|
||||||
|
containers: containers
|
||||||
|
};
|
||||||
|
// Pass to Container ID Registry node
|
||||||
|
|
||||||
|
// Existing Code nodes see Docker API format unchanged
|
||||||
|
```
|
||||||
|
|
||||||
|
**Source:** Phase 15-02 normalizer implementation, ARCHITECTURE.md Docker API contract
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Container Start Mutation Migration
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API):
|
||||||
|
// HTTP Request: POST http://docker-socket-proxy:2375/v1.47/containers/abc123/start
|
||||||
|
// Code node checks: if (response.statusCode === 304) { /* already started */ }
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// Step 1: Lookup PrefixedID (Code node before HTTP Request)
|
||||||
|
const containerName = $json.containerName; // From upstream input
|
||||||
|
const registryLookup = {
|
||||||
|
action: "lookup",
|
||||||
|
containerName: containerName
|
||||||
|
};
|
||||||
|
// Pass to Container ID Registry → returns { prefixedId: "...", found: true }
|
||||||
|
|
||||||
|
// Step 2: Build mutation (Code node prepares GraphQL body)
|
||||||
|
const prefixedId = $('Container ID Registry').item.json.prefixedId;
|
||||||
|
return {
|
||||||
|
json: {
|
||||||
|
query: `mutation { docker { start(id: "${prefixedId}") { id state } } }`
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Step 3: Execute mutation (HTTP Request, uses Unraid API HTTP Template)
|
||||||
|
// Body: {{ $json.query }}
|
||||||
|
|
||||||
|
// Step 4: Handle errors (wire through GraphQL Error Handler)
|
||||||
|
// Error Handler maps ALREADY_IN_STATE → { statusCode: 304, alreadyInState: true }
|
||||||
|
|
||||||
|
// Step 5: Existing Code node (unchanged)
|
||||||
|
const response = $input.item.json;
|
||||||
|
if (response.statusCode === 304) {
|
||||||
|
return { json: { message: "Container already started" } };
|
||||||
|
}
|
||||||
|
if (response.success) {
|
||||||
|
return { json: { message: "Container started successfully" } };
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Source:** Phase 15 utility node integration, n8n-actions.json existing error handling
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Single Container Update Mutation Migration
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// BEFORE (Docker API 5-step flow in n8n-update.json):
|
||||||
|
// 1. Inspect container → get image digest
|
||||||
|
// 2. Stop container
|
||||||
|
// 3. Remove container
|
||||||
|
// 4. Create container (pulls new image)
|
||||||
|
// 5. Start container
|
||||||
|
// 6. Remove old image
|
||||||
|
// Total: 6 HTTP Request nodes, 8 Code nodes for orchestration
|
||||||
|
|
||||||
|
// AFTER (Unraid GraphQL):
|
||||||
|
// Step 1: Get current container state (for imageId comparison)
|
||||||
|
const containerName = $json.containerName;
|
||||||
|
// Query: { docker { containers { id image imageId } } } (filter by name)
|
||||||
|
|
||||||
|
// Step 2: Lookup PrefixedID
|
||||||
|
// Registry input: { action: "lookup", containerName: containerName }
|
||||||
|
|
||||||
|
// Step 3: Single mutation
|
||||||
|
const prefixedId = $('Container ID Registry').item.json.prefixedId;
|
||||||
|
const oldImageId = $json.currentImageId; // From step 1
|
||||||
|
return {
|
||||||
|
json: {
|
||||||
|
query: `mutation { docker { updateContainer(id: "${prefixedId}") { id state image imageId } } }`
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Step 4: Execute mutation (HTTP Request with 60s timeout)
|
||||||
|
|
||||||
|
// Step 5: Normalize response and check if updated
|
||||||
|
// GraphQL Response Normalizer → Code node:
|
||||||
|
const response = $input.item.json;
|
||||||
|
const newImageId = response.imageId;
|
||||||
|
const updated = (newImageId !== oldImageId);
|
||||||
|
|
||||||
|
if (updated) {
|
||||||
|
return {
|
||||||
|
json: {
|
||||||
|
success: true,
|
||||||
|
updated: true,
|
||||||
|
message: `Updated ${containerName}: ${oldImageId.slice(0,12)} → ${newImageId.slice(0,12)}`
|
||||||
|
}
|
||||||
|
};
|
||||||
|
} else {
|
||||||
|
return {
|
||||||
|
json: {
|
||||||
|
success: true,
|
||||||
|
updated: false,
|
||||||
|
message: `No update available for ${containerName}`
|
||||||
|
}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// Total: 3 HTTP Request nodes (query current, lookup ID, update mutation), 3 Code nodes
|
||||||
|
// Reduction: 6 → 3 HTTP nodes, 8 → 3 Code nodes
|
||||||
|
```
|
||||||
|
|
||||||
|
**Source:** n8n-update.json current implementation, Unraid GraphQL schema updateContainer mutation
|
||||||
|
|
||||||
|
---

### Batch Update Migration

```javascript
// BEFORE (Docker API): Loop in Code node, Execute Workflow sub-workflow call per container (serial)

// AFTER (Unraid GraphQL):
// Option A: Small batch (≤5 containers), parallel mutation
const selectedNames = $json.selectedContainers.split(',');

// Lookup all PrefixedIDs
const ids = [];
for (const name of selectedNames) {
  const result = lookupInRegistry(name); // Call Registry node
  ids.push(result.prefixedId);
}

// Single mutation
return {
  json: {
    query: `mutation { docker { updateContainers(ids: ${JSON.stringify(ids)}) { id state imageId } } }`
  }
};

// HTTP Request (120s timeout for batch) → Normalizer → Success message

// Option B: Large batch (>5 containers), serial with progress
// Keep existing pattern: loop + Execute Workflow calls, replace inner logic with GraphQL mutation

// Hybrid recommendation:
const batchSize = selectedNames.length;
if (batchSize <= 5) {
  // Use updateContainers mutation (Option A)
} else {
  // Use serial loop with Telegram progress updates (Option B)
}
```

**Source:** n8n-batch-ui.json, Unraid GraphQL schema updateContainers mutation

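The hybrid recommendation can be collapsed into one decision function. The sketch below is illustrative only: `planBatchUpdate` is a hypothetical name, and `registry` is assumed to be a `Map` of container name to PrefixedID rather than the actual Registry node interface.

```javascript
// Hypothetical planner for the hybrid batch strategy described above.
// `registry` is assumed to be a Map of container name -> PrefixedID.
const SMALL_BATCH_LIMIT = 5; // threshold taken from the recommendation above

function planBatchUpdate(selectedContainers, registry, limit = SMALL_BATCH_LIMIT) {
  const names = selectedContainers
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  // Fail early if any selected container has no known PrefixedID.
  const missing = names.filter((name) => !registry.has(name));
  if (missing.length > 0) {
    throw new Error(`No PrefixedID in registry for: ${missing.join(', ')}`);
  }

  const ids = names.map((name) => registry.get(name));

  if (names.length <= limit) {
    // Option A: one updateContainers mutation for the whole batch.
    return {
      mode: 'single-mutation',
      query: `mutation { docker { updateContainers(ids: ${JSON.stringify(ids)}) { id state imageId } } }`,
    };
  }

  // Option B: one entry per container, to be processed serially with progress messages.
  return {
    mode: 'serial',
    steps: names.map((name, index) => ({ name, prefixedId: ids[index] })),
  };
}
```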
---

### Restart Implementation (Sequential Stop + Start)

```javascript
// Unraid has no native restart mutation, so implement restart as two operations

// Step 1: Stop mutation (tolerate ALREADY_IN_STATE)
const prefixedId = $json.prefixedId;
return {
  json: {
    query: `mutation { docker { stop(id: "${prefixedId}") { id state } } }`
  }
};

// HTTP Request → GraphQL Error Handler
// Error Handler output: { statusCode: 304, alreadyInState: true } OR { success: true }

// Step 2: Check stop result (Code node)
const stopResult = $input.item.json;
if (stopResult.statusCode === 304 || stopResult.success) {
  // Container stopped (or was already stopped), proceed to start
  return { json: { proceedToStart: true } };
}
// Other errors fail the restart
throw new Error('Restart failed: stop mutation returned an error');

// Step 3: Start mutation (prefixedId must be passed through from Step 1)
return {
  json: {
    query: `mutation { docker { start(id: "${prefixedId}") { id state } } }`
  }
};

// HTTP Request → Error Handler → Success

// Wiring: Stop HTTP → Error Handler → Check Result IF → Start HTTP → Error Handler → Format Result
```

**Source:** Unraid GraphQL schema (no restart mutation), standard restart implementation pattern

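The Check Result step can also return a structured outcome so that the IF node routes failures to an error message. This is a minimal sketch under assumptions: `checkStopResult` is a hypothetical name, the input shapes follow the Error Handler output described above, and the `message` field on failures is assumed rather than confirmed.

```javascript
// Hypothetical decision helper for the Check Result step.
// Assumes the Error Handler emits { statusCode: 304, alreadyInState: true } or
// { success: true }; any other shape is treated as a failed stop.
function checkStopResult(stopResult) {
  const result = stopResult || {};
  const alreadyStopped = result.statusCode === 304;

  if (alreadyStopped || result.success === true) {
    // Stopped now, or was already stopped: safe to issue the start mutation.
    return { proceedToStart: true, alreadyStopped };
  }

  // `message` is an assumed field name for the handler's error text.
  const reason = result.message || 'unknown error';
  return {
    proceedToStart: false,
    error: `Restart aborted: stop failed (${reason})`,
  };
}
```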
---

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Docker REST API via socket proxy | Unraid GraphQL API via myunraid.net relay | This phase (v1.4) | Single API, update badge sync, no proxy security boundary |
| 5-step update flow (inspect/stop/remove/create/start, then cleanup) | Single `updateContainer` mutation | This phase | Simpler, faster, Unraid handles retry logic |
| Serial batch updates with progress | `updateContainers` plural mutation for small batches | This phase | Parallel execution, faster for ≤5 containers |
| Docker 64-char container IDs | Unraid 129-char PrefixedID with Registry mapping | Phase 15-16 | Requires translation layer, but enables GraphQL API |
| Manual "Apply Update" in Unraid UI | Automatic badge clear via GraphQL | This phase | Core user pain point solved |

**Deprecated/outdated:**
- **docker-socket-proxy container:** Removed in Phase 17, GraphQL API replaces Docker socket access
- **Container logs feature:** Removed in Phase 17, not valuable enough to maintain hybrid architecture
- **Direct Docker container ID storage:** Replaced by Container ID Registry lookups (PrefixedID required)

**Current best practice (post-Phase 16):** All container operations go through the Unraid GraphQL API. The Docker socket proxy is a legacy artifact.

---

## Open Questions

1. **Actual updateContainer mutation timeout needs**
   - What we know: Large images (10GB+) can take 30+ seconds to pull
   - What's unclear: Does the myunraid.net relay time out separately? Will 60s be enough for all cases?
   - Recommendation: Start with a 60s timeout, add workflow logging to capture actual duration, adjust if needed

2. **Batch update progress tradeoff**
   - What we know: `updateContainers` is fast but silent; serial updates show progress but are slow
   - What's unclear: Which the user prefers, speed or visibility
   - Recommendation: Hybrid approach (≤5 fast, >5 with progress), can adjust threshold based on user feedback

3. **Restart error handling edge cases**
   - What we know: Stop + start pattern works, need to tolerate ALREADY_IN_STATE on stop
   - What's unclear: What if the container exits between stop and start? Is retry logic needed?
   - Recommendation: Implement basic stop→start, add retry if real-world issues occur

4. **Container ID Registry cache invalidation**
   - What we know: Registry caches name → PrefixedID mapping, must refresh after updates
   - What's unclear: Cache expiry strategy? Time-based TTL or event-driven only?
   - Recommendation: Event-driven only (update after every GraphQL query/mutation), no TTL needed

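For the event-driven recommendation in question 4, one way to refresh the cache is to rebuild the whole mapping whenever a container list response arrives, so stale PrefixedIDs and removed containers drop out without a TTL. The sketch below is an assumption-laden illustration: `refreshRegistry` is a hypothetical name, and the container shape `{ id, names }` with an optional leading slash in names is assumed, not taken from the Phase 15 implementation.

```javascript
// Hypothetical event-driven refresh: rebuild the name -> PrefixedID map from a
// container list. Assumes each container looks like { id: '<PrefixedID>', names: ['/plex'] };
// the leading slash mirrors Docker naming and is an assumption about the response.
function refreshRegistry(containers) {
  const registry = new Map();
  for (const container of containers) {
    for (const rawName of container.names || []) {
      // Strip a single leading slash so lookups can use the plain container name.
      const name = rawName.replace(/^\//, '');
      registry.set(name, container.id);
    }
  }
  return registry;
}
```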
---

## Sources

### Primary (HIGH confidence)
- [Unraid GraphQL Schema](https://raw.githubusercontent.com/unraid/api/main/api/generated-schema.graphql): Mutation signatures, DockerContainer type fields
- [Using the Unraid API](https://docs.unraid.net/API/how-to-use-the-api/): Authentication, endpoint, rate limiting
- Phase 15-01 Plan: Container ID Registry, Callback Token Encoder/Decoder implementation
- Phase 15-02 Plan: GraphQL Response Normalizer, Error Handler, HTTP Template implementation
- ARCHITECTURE.md: Current Docker API contracts, workflow node breakdown, error patterns

### Secondary (MEDIUM confidence)
- [Docker and VM Integration | Unraid API](https://deepwiki.com/unraid/api/2.4.2-notification-system): Unraid update implementation details (shells to Dynamix Docker Manager)
- [Core Services | Unraid API](https://deepwiki.com/unraid/api/2.4-docker-integration): DockerService retry logic (5 polling attempts at 500ms intervals)
- n8n-update.json: Current 5-step Docker update flow implementation
- n8n-actions.json: Current start/stop error handling pattern (statusCode === 304 check)
- n8n-status.json: Current container list query pattern

### Tertiary (LOW confidence)
- Community forum posts on Unraid container updates: Anecdotal timing data for large image pulls
- Real-world myunraid.net relay latency observations: 200-500ms baseline from Phase 14 testing

---

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH. Unraid GraphQL API verified in Phase 14, Phase 15 infrastructure already built
- Architecture: HIGH. Migration patterns are straightforward substitutions, Phase 15 utilities handle complexity
- Pitfalls: MEDIUM-HIGH. Most are standard API migration issues; actual timeout needs and batch tradeoffs require real-world testing

**Research date:** 2026-02-09
**Valid until:** 60 days from the research date (Unraid GraphQL API stable, schema changes infrequent)

**Critical dependencies for planning:**
- Phase 15 utility nodes deployed and tested (Container ID Registry, GraphQL Normalizer, Error Handler, HTTP Template)
- Phase 14 Unraid API access verified (credentials, network connectivity, authentication working)
- n8n workflow JSON structure understood (node IDs, connections, typeVersion patterns from CLAUDE.md)

**Migration risk assessment:**
- **Low risk:** Container queries (status, list). Direct substitution, normalizer handles response shape
- **Medium risk:** Container lifecycle (start/stop/restart). ALREADY_IN_STATE error mapping is critical, restart needs sequential implementation
- **Medium risk:** Single container update. Timeout configuration is important, imageId comparison is used for success detection
- **Medium-high risk:** Batch updates. Tradeoff between speed and progress visibility, hybrid approach recommended

**Ready for planning:** YES. Clear migration patterns identified, Phase 15 infrastructure ready, pitfalls documented, code examples provided for each operation type.