# Feature Research: Unraid GraphQL API Migration

**Domain:** Unraid native container management via GraphQL API
**Researched:** 2026-02-09
**Confidence:** HIGH

## Context

**Existing system:** The bot uses a Docker socket proxy → Docker REST API for all container operations (status, start, stop, restart, update, logs). Unraid is unaware of bot-initiated operations, so its "apply update" badge persists after the bot updates a container.

**Migration target:** Replace the Docker socket proxy with Unraid's native GraphQL API for all operations. Unraid 7.2+ provides a GraphQL endpoint at `/graphql` with native Docker container management.

**Key question:** Which existing features are drop-in replacements (same capability, different API), which gain new capabilities, and which need workarounds?

---

## Feature Landscape

### Direct Replacements (Same Behavior, Different API)

Features that work identically via the Unraid API, with no user-visible changes.

| Feature | Current Implementation | Unraid API Equivalent | Complexity | Notes |
|---------|------------------------|----------------------|------------|-------|
| Container status display | `GET /containers/json` → parse JSON → display | `query { docker { containers { id names state } } }` | LOW | GraphQL returns structured data, cleaner parsing. State values are uppercase (`RUNNING`, not `running`) |
| Container start | `POST /containers/{id}/start` → 204 No Content | `mutation { docker { start(id: PrefixedID) { id names state } } }` | LOW | Returns container object instead of empty body. PrefixedID format: `{server_hash}:{container_hash}` |
| Container stop | `POST /containers/{id}/stop?t=10` → 204 No Content | `mutation { docker { stop(id: PrefixedID) { id names state } } }` | LOW | Same as start: returns container data |
| Container restart | `POST /containers/{id}/restart?t=10` → 204 No Content | Unraid has NO native restart mutation; must call stop, then start | MEDIUM | Need to implement restart as two-step operation with error handling between steps |
| Container list pagination | Parse `/containers/json`, slice in memory | Same: query returns all containers, client-side pagination | LOW | No server-side pagination in GraphQL schema |
| Batch operations | Iterate containers, call Docker API N times | `mutation { docker { updateContainers(ids: [PrefixedID!]!) } }` for updates, iterate for start/stop | MEDIUM | Batch update is native, batch start/stop still requires iteration |

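
As a rough illustration of the start/stop rows above, the request body for either mutation could be assembled as below. This is a minimal sketch, not the bot's actual n8n node configuration: the helper names `build_container_mutation` and `is_running` are hypothetical, and passing the ID through a `$id` variable declared as `PrefixedID!` is an assumption about how the request would be shaped.

```python
# Only the two mutations named in the table above are allowed here.
ALLOWED_ACTIONS = {"start", "stop"}


def build_container_mutation(action: str, prefixed_id: str) -> dict:
    """Return the JSON body for a start or stop mutation on one container.

    Hypothetical helper: the selection set (id names state) is copied from
    the table; the use of a GraphQL variable is an assumption.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    query = (
        "mutation ($id: PrefixedID!) { docker { "
        + action
        + "(id: $id) { id names state } } }"
    )
    return {"query": query, "variables": {"id": prefixed_id}}


def is_running(container: dict) -> bool:
    """Compare state case-insensitively.

    Unraid reports RUNNING where the Docker API reported running, so
    upper-casing keeps old and new responses comparable during migration.
    """
    return str(container.get("state", "")).upper() == "RUNNING"
```

Keeping the action on an allow-list means a user-supplied command such as `restart` fails early instead of producing a mutation the schema does not define.
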
### Enhanced Features (Gain New Capabilities)

Features that work better with the Unraid API.

| Feature | New Capability | Value | Complexity | Notes |
|---------|----------------|-------|------------|-------|
| Container update | **Automatic update status sync**: Unraid knows the bot updated the container, so no "apply update" badge | Solves core v1.3 pain point with zero manual cleanup | LOW | Unraid API's `updateContainer` mutation handles internal state sync automatically |
| "Update All :latest" | **Batch update mutation**: single GraphQL call updates multiple containers | Faster than N sequential Docker API calls, and issued as one request | LOW | `updateAllContainers` mutation exists but may not respect :latest filter. May need `updateContainers(ids: [...])` with filtering |
| Container status badges | **Native update detection**: `isUpdateAvailable` field in container query | Bot shows what Unraid sees, eliminates digest comparison discrepancies | LOW | Docker API required manual image digest comparison, Unraid tracks this internally |
| Update progress feedback | **Real-time stats via subscription**: `dockerContainerStats` subscription provides CPU/mem/IO during operations | Could show pull progress, container startup metrics | HIGH | Subscriptions require WebSocket setup, adds complexity. DEFER to future phase |

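
If `updateAllContainers` turns out not to honor the `:latest` filter, the fallback mentioned in the table could look roughly like this. It is a sketch under stated assumptions: each container object is assumed to carry an `image` field with the image reference (not confirmed by this research), the helper names are hypothetical, and the selection set on `updateContainers` assumes it returns container objects.

```python
def is_latest_tag(image: str) -> bool:
    """True when an image reference uses :latest, explicitly or by default."""
    if not image or "@" in image:
        # Empty reference, or pinned by digest: not a floating :latest tag.
        return False
    # Look only at the last path segment so a registry port such as
    # registry:5000/app is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # Docker treats a missing tag as :latest
    return last_segment.rsplit(":", 1)[1] == "latest"


def select_latest_updates(containers: list[dict]) -> list[str]:
    """Return IDs of containers that have an update and track :latest."""
    return [
        c["id"]
        for c in containers
        if c.get("isUpdateAvailable") and is_latest_tag(c.get("image", ""))
    ]


def build_batch_update(ids: list[str]) -> dict:
    """Build one updateContainers request for the selected IDs."""
    query = (
        "mutation ($ids: [PrefixedID!]!) { docker { "
        "updateContainers(ids: $ids) { id names state } } }"
    )
    return {"query": query, "variables": {"ids": ids}}
```

Filtering on the client keeps the bot's existing ":latest only" behavior independent of whatever `updateAllContainers` does server-side.
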
### Features Requiring Workarounds

Features where the Unraid API is less capable than the Docker API.

| Feature | Docker API Approach | Unraid API Limitation | Workaround | Complexity | Impact |
|---------|---------------------|----------------------|------------|------------|--------|
| Container logs | `GET /containers/{id}/logs?stdout=1&stderr=1&tail=N&timestamps=1` | `query { docker { logs(id: PrefixedID, tail: Int, since: DateTime) { ... } } }` | Unraid API has a logs query; need to verify field structure and timestamp support | LOW-MEDIUM | Schema shows `logs` query exists, need to test response format |
| Container restart | Single `POST /restart` call | No native restart mutation | Call `stop` mutation, wait for state change, call `start` mutation. Need error handling if stop succeeds but start fails | MEDIUM | Adds latency, two points of failure instead of one |
| Container pause/unpause | `POST /containers/{id}/pause` | Unraid has `pause`/`unpause` mutations | No workaround needed, not currently used by bot | N/A | Bot doesn't use pause feature, no impact |

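
The restart workaround in the table above (stop, wait for the state change, start) can be sketched as a small function with the API calls injected. This is an illustration only: `restart_container`, `RestartError`, and the timeout defaults are hypothetical, and the loop deliberately checks for "no longer `RUNNING`" because the exact stopped-state value has not been verified.

```python
import time
from typing import Callable


class RestartError(Exception):
    """Records which step of the two-step restart failed."""

    def __init__(self, stage: str, message: str):
        super().__init__(f"restart failed during {stage}: {message}")
        self.stage = stage


def restart_container(
    container_id: str,
    stop: Callable[[str], None],
    start: Callable[[str], None],
    get_state: Callable[[str], str],
    timeout_s: float = 30.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Stop, wait until the container leaves RUNNING, then start it."""
    try:
        stop(container_id)
    except Exception as exc:
        raise RestartError("stop", str(exc)) from exc

    # Poll until the container is no longer RUNNING or the timeout expires.
    deadline = time.monotonic() + timeout_s
    while get_state(container_id).upper() == "RUNNING":
        if time.monotonic() >= deadline:
            raise RestartError("verify", "container still running after stop")
        sleep(poll_interval_s)

    try:
        start(container_id)
    except Exception as exc:
        # The container is stopped at this point, so the caller should
        # report that it is down rather than "restart failed" alone.
        raise RestartError("start", str(exc)) from exc
    return get_state(container_id).upper()
```

Tagging the error with its stage lets the bot distinguish "nothing changed" (stop failed) from "container is now down" (start failed), which is the edge case the table calls out.
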
### New Capabilities NOT in Current Bot

Features the Unraid API enables that the Docker socket proxy doesn't support.

| Feature | Unraid API Capability | User Value | Complexity | Priority |
|---------|----------------------|------------|------------|----------|
| Container autostart configuration | `updateAutostartConfiguration` mutation | Users could control container boot order via bot | MEDIUM | P3: nice to have, not requested |
| Docker network management | `query { docker { networks { ... } } }` | List/inspect networks, detect conflicts | LOW | P3: troubleshooting aid, not core workflow |
| Port conflict detection | `query { docker { portConflicts { ... } } }` | Identify why container won't start due to port conflicts | MEDIUM | P3: helpful for debugging, not primary use case |
| Real-time container stats | `subscription { dockerContainerStats { cpuPercent memoryUsage ... } }` | Live resource monitoring during updates | HIGH | P3: requires WebSocket infrastructure |

---

## Feature Dependencies

```
Container Operations (start/stop/update)
    └──requires──> PrefixedID format mapping
    └──requires──> Container ID resolution (existing matching logic)

Batch Update
    └──requires──> Container selection UI (existing)
    └──enhances──> "Update All :latest" (atomic operation)

Update Status Sync
    └──automatically provided by──> Unraid API mutations (no explicit action needed)
    └──eliminates need for──> File writes to /var/lib/docker/unraid-update-status.json

Container Restart
    └──requires──> Stop mutation
    └──requires──> Start mutation
    └──requires──> State polling between operations

Container Logs
    └──requires──> GraphQL logs query testing
    └──may require──> Response format adaptation (if different from Docker API)
```

### Dependency Notes

- **PrefixedID format is critical:** Unraid uses `{server_hash}:{container_hash}` (128-char total) instead of Docker's short container ID. Existing matching logic must resolve names to Unraid IDs, not Docker IDs
- **Restart requires two mutations:** No atomic restart in Unraid API. Must implement stop → verify → start pattern
- **Update status sync is automatic:** Biggest win: no manual file manipulation needed, Unraid knows about updates immediately
- **Logs query needs verification:** Schema shows `logs` exists but field structure unknown until tested

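
The name-to-PrefixedID resolution described in the first note could be sketched as follows. This is not the bot's existing matching logic, which this research does not reproduce: the helper names are hypothetical, the exact-then-substring matching rule is an assumption, and so is the idea that `names` entries may carry a leading `/` as they do in Docker's `/containers/json`.

```python
def build_name_index(containers: list[dict]) -> dict[str, str]:
    """Map each normalized container name to its Unraid PrefixedID."""
    index: dict[str, str] = {}
    for container in containers:
        for raw_name in container.get("names", []):
            # Assumption: names may look like "/plex", as in the Docker API.
            index[raw_name.lstrip("/").lower()] = container["id"]
    return index


def resolve_prefixed_id(name: str, index: dict[str, str]) -> str:
    """Resolve a user-typed name to exactly one PrefixedID or raise."""
    key = name.strip().lstrip("/").lower()
    if key in index:
        return index[key]  # an exact match always wins
    # Fall back to substring matching, but only accept a unique hit.
    matches = sorted({cid for known, cid in index.items() if key in known})
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no container matches {name!r}")
    raise LookupError(f"{name!r} is ambiguous: {len(matches)} containers match")
```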

---

## Migration Complexity Assessment

### Drop-in Replacements (LOW complexity)

Change the API endpoint and request format; behavior is unchanged.

- [x] Container list/status display
- [x] Container start
- [x] Container stop
- [x] Batch container selection UI (no API changes)
- [x] Confirmation dialogs (no API changes)

**Effort:** 1-2 nodes per operation. Replace HTTP Request URL and body, adapt response parsing. Error handling pattern stays the same.

### Adapted Replacements (MEDIUM complexity)

Requires implementation changes but keeps the same user experience.

- [ ] Container restart: implement as stop + start sequence with state verification
- [ ] Container logs: adapt to GraphQL logs query response format
- [ ] Batch update: use `updateContainers(ids: [...])` mutation instead of N individual calls
- [ ] Container ID resolution: map container names to PrefixedID format

**Effort:** 3-5 nodes per operation. Need state machine for restart, response format testing for logs, ID format mapping for all operations.

### Enhanced Features (LOW-MEDIUM complexity)

Gain new capabilities with minimal work.

- [x] Update status sync: automatic via Unraid API, remove Phase 14 manual sync
- [x] Update detection: use `isUpdateAvailable` field instead of Docker digest comparison
- [x] Batch mutations: native support for multi-container updates

**Effort:** Remove old workarounds, use new API fields. Net simplification.

---

## Migration Phases

### Phase 1: Infrastructure (Phase 14, COMPLETE)

- [x] Unraid GraphQL API connectivity
- [x] Authentication setup (API key, Header Auth credential)
- [x] Test query validation
- [x] Container ID format documentation

**Status:** Complete per Phase 14 verification. Ready for mutation implementation.

### Phase 2: Core Operations (Next Phase)

Replace Docker socket proxy for fundamental operations.

- [ ] Container start mutation
- [ ] Container stop mutation
- [ ] Container restart (two-step: stop + start)
- [ ] Container status query (replace `/containers/json`)
- [ ] Update PrefixedID resolution in matching sub-workflow

**Impact:** All single-container operations switch to Unraid API. Docker socket proxy only used for updates and logs temporarily.

### Phase 3: Update Operations

Replace update workflow with Unraid API.

- [ ] Single container update via `updateContainer` mutation
- [ ] Batch update via `updateContainers` mutation
- [ ] "Update All" via `updateAllContainers` mutation (or filtered `updateContainers`)
- [ ] Verify automatic update status sync (no badge persistence)

**Impact:** Solves v1.3 milestone pain point. Unraid UI reflects bot updates immediately.

### Phase 4: Logs and Polish

Replace remaining Docker API calls.

- [ ] Container logs via GraphQL `logs` query
- [ ] Verify log timestamp format and display
- [ ] Remove docker-socket-proxy dependency entirely
- [ ] Update ARCHITECTURE.md (remove Docker API contract, document Unraid API)

**Impact:** Complete migration. Docker socket proxy container can be removed.

---

## Complexity Matrix

| Operation | Docker API | Unraid API | Complexity | Blocker |
|-----------|------------|------------|------------|---------|
| Start | POST /start | mutation start(id) | LOW | None |
| Stop | POST /stop | mutation stop(id) | LOW | None |
| Restart | POST /restart | stop + start (2 calls) | MEDIUM | State verification between mutations |
| Status | GET /json | query containers | LOW | PrefixedID format mapping |
| Update | POST /images/create + stop + rename + start | mutation updateContainer(id) | LOW | None, simpler than Docker API |
| Batch Update | N × update | mutation updateContainers(ids) | LOW | None, native support |
| Logs | GET /logs | query logs(id, tail, since) | MEDIUM | Response format unknown |

**Key insight:** Most operations are simpler with the Unraid API. Only restart and logs require adaptation work.

---

## Anti-Features

Features that seem useful but complicate migration without user value.

| Feature | Why Tempting | Why Problematic | Alternative |
|---------|--------------|-----------------|-------------|
| Parallel use of Docker API + Unraid API | "Keep both during migration" | Two sources of truth, complex ID mapping, defeats purpose of migration | Full cutover per operation: start/stop on Unraid API, then update, then logs |
| GraphQL subscriptions for real-time stats | "Monitor container resource usage live" | Requires WebSocket setup, n8n HTTP Request node doesn't support subscriptions, adds infrastructure complexity | Poll if needed, defer to future phase with dedicated subscription node |
| Expose full GraphQL schema to user | "Let users run arbitrary queries via bot" | Security risk (unrestricted API access), complex query parsing, unclear user benefit | Expose only operations via commands (`start`, `update`, `logs`), not raw GraphQL |
| Port conflict detection on every status check | "Proactively warn about port conflicts" | Performance impact (extra query), rare occurrence, clutters UI | Only query port conflicts when start/restart fails with port binding error |

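
The alternative in the last row (query port conflicts only after a port binding failure) needs some way to recognize that failure. A minimal sketch follows; the patterns are the Docker daemon's usual wording, and whether the Unraid API passes that text through unchanged in its GraphQL `errors` is an assumption that would need testing. The function names are hypothetical.

```python
import re

# Typical Docker daemon wording, for example:
#   "Bind for 0.0.0.0:8080 failed: port is already allocated"
# Assumption: Unraid surfaces this message text in its GraphQL errors.
PORT_ERROR_PATTERNS = (
    re.compile(r"port is already allocated", re.IGNORECASE),
    re.compile(r"address already in use", re.IGNORECASE),
)


def is_port_binding_error(message: str) -> bool:
    """True when an error message looks like a host port collision."""
    return any(pattern.search(message) for pattern in PORT_ERROR_PATTERNS)


def should_query_port_conflicts(errors: list[dict]) -> bool:
    """Decide whether a failed start/restart warrants a portConflicts query."""
    return any(is_port_binding_error(e.get("message", "")) for e in errors)
```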

---

## Success Criteria

Migration is successful when:

- [x] **Zero Docker socket proxy calls:** All operations use Unraid GraphQL API
- [x] **Update badge sync works:** Unraid UI shows correct status after bot updates
- [x] **Restart works reliably:** Two-step restart handles edge cases (stop succeeds, start fails)
- [x] **Logs display correctly:** GraphQL logs query returns usable data for Telegram display
- [x] **No performance regression:** Operations complete in same or better time than Docker API
- [x] **Error messages stay clear:** GraphQL errors map to actionable user feedback

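
For the last criterion, note that a GraphQL server normally reports failures in an `errors` array in the response body rather than through the HTTP status alone, so the HTTP-status checks used with the Docker API are not sufficient on their own. The following sketch shows one way to turn that array into user-facing text; `GraphQLOperationError` and `extract_result` are hypothetical names, and treating any entry in `errors` as a full failure (even when partial `data` is present) is a design assumption.

```python
class GraphQLOperationError(Exception):
    """Raised with a message suitable for showing to the bot user."""


def extract_result(response: dict, operation: str) -> dict:
    """Return response['data'], or raise with an actionable message.

    `operation` is a short verb phrase such as "start plex" that is
    embedded in the message shown to the user.
    """
    errors = response.get("errors") or []
    if errors:
        # Each GraphQL error object carries a human-readable "message".
        details = "; ".join(e.get("message", "unknown error") for e in errors)
        raise GraphQLOperationError(f"Could not {operation}: {details}")
    data = response.get("data")
    if data is None:
        raise GraphQLOperationError(
            f"Could not {operation}: empty response from Unraid API"
        )
    return data
```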

---

## Sources

### Primary (HIGH confidence)

- [Unraid GraphQL Schema](https://raw.githubusercontent.com/unraid/api/main/api/generated-schema.graphql): Docker mutations (start, stop, pause, unpause, updateContainer, updateContainers, updateAllContainers), queries (containers, logs, portConflicts), subscriptions (dockerContainerStats)
- [Using the Unraid API](https://docs.unraid.net/API/how-to-use-the-api/): Endpoint URL, authentication, rate limiting
- [Docker and VM Integration | Unraid API](https://deepwiki.com/unraid/api/2.4.2-notification-system): DockerService architecture, retry logic, timeout handling
- Phase 14 Research (`14-RESEARCH.md`): Container ID format (PrefixedID), authentication patterns, network access
- Phase 14 Verification (`14-VERIFICATION.md`): Confirmed working query, credential setup, myunraid.net URL requirement

### Secondary (MEDIUM confidence)

- [Core Services | Unraid API](https://deepwiki.com/unraid/api/2.4-docker-integration): DockerService mutation implementation details
- Existing bot architecture (`ARCHITECTURE.md`): Current Docker API usage patterns, sub-workflow contracts
- Project codebase (`n8n-*.json`): Docker API calls (grep results), error handling patterns

### Implementation Details (HIGH confidence)

- **Restart requires two mutations:** Confirmed by schema. No `restart` mutation exists, only `start` and `stop`
- **Batch updates native:** Schema defines `updateContainers(ids: [PrefixedID!]!)` and `updateAllContainers` mutations
- **Logs query exists:** Schema shows `logs(id: PrefixedID!, since: DateTime, tail: Int)` → `DockerContainerLogs!` type
- **Real-time stats via subscription:** `dockerContainerStats` subscription exists but requires WebSocket transport

---

## Open Questions

1. **DockerContainerLogs response structure**
   - What we know: Schema defines type, accepts `since` and `tail` params
   - What's unclear: Field names, timestamp format, stdout/stderr separation
   - Resolution: Test logs query in Phase 2/3, adapt parsing logic as needed

2. **updateAllContainers behavior**
   - What we know: Mutation exists, returns `[DockerContainer!]!`
   - What's unclear: Does it filter by `:latest` tag, or update everything with available updates?
   - Resolution: Test mutation or use `updateContainers(ids)` with manual filtering

3. **Restart failure scenarios**
   - What we know: Must implement as stop + start
   - What's unclear: Best retry/backoff pattern if start fails after stop succeeds
   - Resolution: Design state machine with error recovery (Phase 2 planning)

4. **Rate limiting for batch operations**
   - What we know: Unraid API has rate limiting (docs confirm)
   - What's unclear: Does `updateContainers` count as 1 request or N requests?
   - Resolution: Test batch update with 20+ containers, monitor for 429 errors

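
Until question 4 is answered by testing, one defensive option is to retry on 429 with exponential backoff. The sketch below is illustrative only: `call_with_backoff` is a hypothetical helper, the attempt count and delays are arbitrary, and `send` stands in for whatever performs the HTTP request and returns a `(status_code, json_body)` pair.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 4,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns a non-429 status, doubling the delay."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return body
        if attempt < max_attempts - 1:
            # Delays grow as 1x, 2x, 4x ... of base_delay_s.
            sleep(base_delay_s * 2**attempt)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```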

---

*Feature research for: Unraid GraphQL API migration*
*Researched: 2026-02-09*
*Milestone: Replace Docker socket proxy with Unraid native API*