From 652d877ce1005bdb6fb80c8e01b68a8faafe37cd Mon Sep 17 00:00:00 2001
From: Lucas Berger
Date: Wed, 4 Feb 2026 08:22:15 -0500
Subject: [PATCH] docs(09-04): add comprehensive test plan for batch operations

- 21 test cases across 5 suites (text commands, update all, multi-select, errors, regression)
- Deployment steps for n8n workflow import
- Success criteria mapping to tests
- Test execution log template with issue tracking
- Covers all BAT-01 through BAT-06 requirements
---
 .../DEPLOYMENT-TEST-PLAN.md | 499 ++++++++++++++++++
 1 file changed, 499 insertions(+)
 create mode 100644 .planning/phases/09-batch-operations/DEPLOYMENT-TEST-PLAN.md

diff --git a/.planning/phases/09-batch-operations/DEPLOYMENT-TEST-PLAN.md b/.planning/phases/09-batch-operations/DEPLOYMENT-TEST-PLAN.md
new file mode 100644
index 0000000..91caba9
--- /dev/null
+++ b/.planning/phases/09-batch-operations/DEPLOYMENT-TEST-PLAN.md
@@ -0,0 +1,499 @@
+# Phase 09-04 Deployment and Test Plan
+
+**Generated:** 2026-02-04
+**Purpose:** Verification testing for batch operations implementation
+
+## Deployment Steps
+
+### 1. Import Updated Workflow
+
+1. Open your n8n instance (typically at http://your-server:5678)
+2. Navigate to Workflows
+3. Select "Docker Manager Bot" workflow
+4. Click the three-dot menu → Export
+5. Save current version as backup: `n8n-workflow-backup-20260204.json`
+6. Return to Workflows → Import from File
+7. Select the updated `n8n-workflow.json` from this repository
+8. Confirm credential mapping (should use existing "Telegram API" credential)
+9. Save and activate the workflow
+
+### 2. Verify Workflow Health
+
+Before testing, confirm:
+- ✅ Workflow is "Active" (toggle in top-right)
+- ✅ No error indicators on nodes
+- ✅ Telegram Trigger shows connected
+- ✅ Your user ID is still configured in IF User/Callback Authenticated nodes
+
+## Test Plan
+
+### Test Suite A: Batch Text Commands
+
+#### Test A1: Multi-Container Update
+**Objective:** Verify batch update with space-separated names
+
+**Steps:**
+1. Send to bot: `update plex sonarr` (use 2-3 of your actual container names)
+2. Observe behavior
+
+**Expected Results:**
+- ✅ Message shows "Updating N containers..."
+- ✅ Progress updates appear for each container individually
+- ✅ Shows "Pulling image..." → "Stopping..." → "Starting..." per container
+- ✅ Final summary message appears with success count
+- ✅ If any container fails, shows failure with reason but continues batch
+- ✅ Summary emphasizes failures (if any) over successes
+
+**Notes:**
+- Record actual execution time for performance assessment
+- Screenshot final summary for documentation
+
+---
+
+#### Test A2: Multi-Container Start
+**Objective:** Verify batch start executes immediately (no confirmation)
+
+**Preparation:**
+1. Manually stop 2-3 containers via existing single commands: `stop plex`, `stop sonarr`
+2. Confirm containers are stopped via `status`
+
+**Steps:**
+1. Send to bot: `start plex sonarr` (use your stopped container names)
+2. Observe behavior
+
+**Expected Results:**
+- ✅ No confirmation prompt (starts immediately)
+- ✅ Progress shows for each container
+- ✅ Summary shows successful starts
+- ✅ Verify containers are running via `status`
+
+---
+
+#### Test A3: Multi-Container Stop with Confirmation
+**Objective:** Verify batch stop requires confirmation (safety measure)
+
+**Steps:**
+1. Send to bot: `stop plex sonarr` (use 2+ running containers)
+2. Wait for confirmation prompt
+3. Tap "Confirm" button
+4. Observe execution
+
+**Expected Results:**
+- ✅ Confirmation message appears: "Stop 2 containers?"
+- ✅ Lists container names in confirmation
+- ✅ Has "Confirm" and "Cancel" buttons
+- ✅ After confirm: batch execution proceeds with progress
+- ✅ Summary shows containers stopped
+
+**Follow-up Test:**
+1. Repeat but tap "Cancel" button
+2. Expected: Confirmation deleted, no action taken
+
+**Follow-up Test 2:**
+1. Repeat but don't respond for 30+ seconds
+2. Expected: Confirmation expires with message
+
+---
+
+#### Test A4: Fuzzy Matching with Exact Match Priority
+**Objective:** Verify exact match takes priority over partial matches
+
+**Scenario 1: Exact match exists**
+1. Send: `update plex` (when both "plex" and "jellyplex" exist)
+2. Expected: Only "plex" container updates (no disambiguation)
+
+**Scenario 2: Only partial matches**
+1. Send: `update jelly` (matches "jellyplex" but not exact)
+2. Expected: If only one match, proceeds; if multiple, shows disambiguation
+
+**Note:** This test depends on your actual container names. Adjust to match your server.
+
+---
+
+#### Test A5: Disambiguation for Ambiguous Names
+**Objective:** Verify disambiguation prompt appears when multiple containers match
+
+**Steps:**
+1. Send command with ambiguous partial match (e.g., `update lin` if you have multiple "lin*" containers)
+2. Wait for disambiguation prompt
+3. Select intended container
+
+**Expected Results:**
+- ✅ Shows "Multiple containers match: lin"
+- ✅ Lists matching containers with buttons
+- ✅ Selecting one proceeds with single-container action
+- ✅ Batch not triggered (user clarified intent)
+
+**Note:** If no ambiguous names on your server, document "Cannot test - no ambiguous container names"
+
+---
+
+### Test Suite B: Update All Command
+
+#### Test B1: Update All with Available Updates
+**Objective:** Verify "update all" targets only :latest containers with updates
+
+**Steps:**
+1. Send: `update all` (or `updateall`)
+2. Observe confirmation prompt
+3. Note which containers are listed
+4. Tap "Confirm"
+5. Observe execution
+
+**Expected Results:**
+- ✅ Command recognized and routed
+- ✅ Confirmation shows: "Update N containers?"
+- ✅ Lists containers (max 10 displayed in message)
+- ✅ Only includes containers using :latest tag
+- ✅ 30-second timeout on confirmation
+- ✅ After confirm: batch execution with progress per container
+- ✅ Summary shows results
+
+**Verification:**
+- Check that only :latest containers were updated
+- Containers with specific tags (e.g., `:1.2.3`) should not appear in list
+
+---
+
+#### Test B2: Update All When No Updates Available
+**Objective:** Verify appropriate message when all containers are current
+
+**Preparation:**
+1. Update all containers manually first OR wait until all are current
+
+**Steps:**
+1. Send: `update all`
+2. Observe response
+
+**Expected Results:**
+- ✅ Shows: "All containers are up to date!" (or similar message)
+- ✅ No confirmation prompt
+- ✅ No batch execution attempted
+
+---
+
+#### Test B3: Update All Cancel
+**Objective:** Verify cancel works
+
+**Steps:**
+1. Send: `update all`
+2. Wait for confirmation
+3. Tap "Cancel"
+
+**Expected Results:**
+- ✅ Confirmation message deleted
+- ✅ Shows cancellation feedback
+- ✅ No containers updated
+
+---
+
+#### Test B4: Update All Timeout
+**Objective:** Verify expiration behavior
+
+**Steps:**
+1. Send: `update all`
+2. Wait for confirmation
+3. Don't respond for 30+ seconds
+4. Try tapping button after expiry
+
+**Expected Results:**
+- ✅ After 30s: Shows expiry message
+- ✅ Confirmation becomes inactive
+- ✅ Tapping expired button shows alert
+
+---
+
+### Test Suite C: Inline Keyboard Multi-Select
+
+#### Test C1: Enter Multi-Select Mode
+**Objective:** Verify multi-select keyboard appears
+
+**Steps:**
+1. Send: `/status`
+2. Locate "Select Multiple" button (may need to be added in future plan)
+3. OR send callback manually: Use bot command that triggers `batch:mode`
+
+**Note:** If no entry point exists yet, test by:
+- Temporarily adding "Select Multiple" button to status keyboard
+- OR testing via n8n "Execute Node" with callback_query data: `batch:mode`
+
+**Expected Results:**
+- ✅ Keyboard shows container list with state icons
+- ✅ Running containers: 🟢
+- ✅ Stopped containers: ⚪
+- ✅ Each button shows container name
+- ✅ No checkmarks initially
+- ✅ Bottom row has "Cancel" button
+
+---
+
+#### Test C2: Toggle Selection
+**Objective:** Verify checkmarks toggle on/off
+
+**Steps:**
+1. Enter multi-select mode (from C1)
+2. Tap a container button
+3. Observe keyboard update
+4. Tap same container again
+5. Tap different container
+
+**Expected Results:**
+- ✅ First tap: Checkmark (✓) appears before container name
+- ✅ Second tap: Checkmark disappears
+- ✅ Multiple containers can have checkmarks
+- ✅ Action buttons appear when any container selected
+- ✅ "Clear Selection" button appears with selection
+
+---
+
+#### Test C3: Execute Batch Update from Multi-Select
+**Objective:** Verify batch execution from inline keyboard
+
+**Steps:**
+1. Enter multi-select mode
+2. Select 2-3 containers (tap to add checkmarks)
+3. Tap "Update Selected (N)" button
+4. Observe execution
+
+**Expected Results:**
+- ✅ Immediate execution (no confirmation for update)
+- ✅ Progress shows for each selected container
+- ✅ Summary shows results
+- ✅ Selection message deleted after execution starts
+
+---
+
+#### Test C4: Execute Batch Stop with Confirmation
+**Objective:** Verify stop requires confirmation from multi-select
+
+**Steps:**
+1. Enter multi-select mode
+2. Select 2+ running containers
+3. Tap "Stop Selected (N)" button
+4. Wait for confirmation
+5. Tap "Confirm"
+
+**Expected Results:**
+- ✅ Confirmation prompt appears (doesn't execute immediately)
+- ✅ Lists selected containers
+- ✅ After confirm: batch stop executes
+- ✅ Summary shows stopped containers
+
+---
+
+#### Test C5: Selection Limit Enforcement
+**Objective:** Verify callback size limit prevents overflow
+
+**Steps:**
+1. Enter multi-select mode
+2. Select containers one by one
+3. Attempt to select 9+ containers
+
+**Expected Results:**
+- ✅ Selection works smoothly for first ~8 containers
+- ✅ At limit: Alert appears "Selection limit reached"
+- ✅ Cannot select additional containers
+- ✅ Can deselect and select different containers
+- ✅ Guidance shown (e.g., "Use 'update all' for larger batches")
+
+**Note:** Exact limit depends on container name lengths. Shorter names = more selections possible.
+
+---
+
+#### Test C6: Clear Selection
+**Objective:** Verify clear button resets selection
+
+**Steps:**
+1. Select 3+ containers
+2. Tap "Clear Selection" button
+
+**Expected Results:**
+- ✅ All checkmarks removed
+- ✅ Action buttons disappear
+- ✅ Only "Cancel" button remains
+- ✅ Can start new selection
+
+---
+
+#### Test C7: Cancel Multi-Select
+**Objective:** Verify cancel exits cleanly
+
+**Steps:**
+1. Enter multi-select mode (with or without selection)
+2. Tap "Cancel" button
+
+**Expected Results:**
+- ✅ Selection message deleted
+- ✅ Returns to previous state
+- ✅ No actions executed
+
+---
+
+### Test Suite D: Error Handling
+
+#### Test D1: Failure Isolation
+**Objective:** Verify one failure doesn't abort batch
+
+**Steps:**
+1. Create a batch with intentional failure (e.g., non-existent container mixed with real ones)
+2. Example: `update plex nonexistent sonarr`
+3. Observe execution
+
+**Expected Results:**
+- ✅ First container processes
+- ✅ Failed container shows error message with reason
+- ✅ Remaining containers still process (batch continues)
+- ✅ Summary shows: "2 succeeded, 1 failed"
+- ✅ Failure details prominent in summary
+
+---
+
+#### Test D2: Warning vs Error Classification
+**Objective:** Verify non-critical warnings don't show as errors
+
+**Setup:**
+1. Stop a container: `stop plex`
+2. Try stopping again: `stop plex`
+
+**Expected Results:**
+- ✅ Shows as warning, not error
+- ✅ Message: "Already stopped" or similar
+- ✅ Summary distinguishes warnings from errors
+
+**Similar tests:**
+- Update container with no update available → Warning, not error
+- Start already-running container → Warning
+
+---
+
+### Test Suite E: Regression Tests
+
+#### Test E1: Single-Container Commands Still Work
+**Objective:** Verify no regression in existing functionality
+
+**Commands to test:**
+1. `status` → Shows container list keyboard
+2. `start plex` → Starts single container
+3. `stop plex` → Shows confirmation, then stops
+4. `restart plex` → Restarts with progress
+5. `update plex` → Updates single container
+6. `logs plex` → Shows last 50 lines
+7. `logs plex 100` → Shows last 100 lines
+
+**Expected Results:**
+- ✅ All single-container commands work exactly as before
+- ✅ No batch behavior triggered for single containers
+- ✅ Confirmation behavior unchanged (stop requires confirm, others don't)
+
+---
+
+#### Test E2: Inline Keyboard Actions Still Work
+**Objective:** Verify phase 8 keyboard functionality intact
+
+**Steps:**
+1. Send: `status`
+2. Tap a container name button
+3. Tap an action button (e.g., "▶️ Start", "Update 🔄")
+4. Complete action
+
+**Expected Results:**
+- ✅ Container detail view appears
+- ✅ Action buttons work
+- ✅ Actions execute correctly
+- ✅ No interference from batch infrastructure
+
+---
+
+#### Test E3: Pagination Still Works
+**Objective:** Verify container list pagination for many containers
+
+**Steps:**
+1. Send: `status` (if you have 10+ containers)
+2. Navigate with Previous/Next buttons
+
+**Expected Results:**
+- ✅ Pagination works correctly
+- ✅ Page numbers accurate
+- ✅ No batch selection interference
+
+**Note:** If < 10 containers, document "Cannot test - insufficient containers"
+
+---
+
+## Success Criteria Verification
+
+After completing all tests, verify these criteria are met:
+
+- [ ] **BAT-01:** User can update multiple containers in one command
+  - Tests: A1, C3
+
+- [ ] **BAT-02:** Batch updates execute sequentially with per-container feedback
+  - Tests: A1, A2, C3
+
+- [ ] **BAT-03:** "Update all" updates only containers with updates available
+  - Tests: B1, B2
+
+- [ ] **BAT-04:** "Update all" requires confirmation
+  - Tests: B1, B3, B4
+
+- [ ] **BAT-05:** One failure doesn't abort remaining batch
+  - Tests: D1
+
+- [ ] **BAT-06:** Final summary shows success/failure count
+  - Tests: A1, C3, D1
+
+- [ ] **Inline keyboard batch selection works**
+  - Tests: C1-C7
+
+- [ ] **No regression in existing commands**
+  - Tests: E1, E2, E3
+
+## Test Execution Log
+
+**Date:** ___________
+**Tester:** ___________
+**n8n Version:** ___________
+**Workflow Import Time:** ___________
+
+### Results Summary
+
+| Test | Status | Notes |
+|------|--------|-------|
+| A1: Multi-container update | ⬜ Pass / ⬜ Fail | |
+| A2: Multi-container start | ⬜ Pass / ⬜ Fail | |
+| A3: Multi-container stop | ⬜ Pass / ⬜ Fail | |
+| A4: Fuzzy matching | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
+| A5: Disambiguation | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
+| B1: Update all with updates | ⬜ Pass / ⬜ Fail | |
+| B2: Update all (none available) | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
+| B3: Update all cancel | ⬜ Pass / ⬜ Fail | |
+| B4: Update all timeout | ⬜ Pass / ⬜ Fail | |
+| C1: Enter multi-select | ⬜ Pass / ⬜ Fail | |
+| C2: Toggle selection | ⬜ Pass / ⬜ Fail | |
+| C3: Batch update from multi-select | ⬜ Pass / ⬜ Fail | |
+| C4: Batch stop with confirm | ⬜ Pass / ⬜ Fail | |
+| C5: Selection limit | ⬜ Pass / ⬜ Fail | |
+| C6: Clear selection | ⬜ Pass / ⬜ Fail | |
+| C7: Cancel multi-select | ⬜ Pass / ⬜ Fail | |
+| D1: Failure isolation | ⬜ Pass / ⬜ Fail | |
+| D2: Warning vs error | ⬜ Pass / ⬜ Fail | |
+| E1: Single commands regression | ⬜ Pass / ⬜ Fail | |
+| E2: Inline keyboard regression | ⬜ Pass / ⬜ Fail | |
+| E3: Pagination regression | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
+
+### Issues Found
+
+| Issue # | Test | Description | Severity | Status |
+|---------|------|-------------|----------|--------|
+| | | | | |
+
+### Additional Observations
+
+_Record any unexpected behavior, performance notes, UX feedback, etc._
+
+---
+
+**Overall Assessment:** ⬜ Ready for Production / ⬜ Issues Need Resolution
+
+**Notes for Phase 10:**
+_Record any polish/improvement ideas discovered during testing_
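
Reviewer note: the name-resolution behavior that Tests A4 and A5 exercise (exact match wins, a single partial match proceeds, several partial matches trigger disambiguation) can be sketched as below. This is a minimal Python illustration under stated assumptions, not the workflow's actual n8n code: `resolve_name` and `Resolution` are hypothetical names, the containers `linkding` and `linuxserver-code` are invented for the example, and the sketch assumes partial matching means case-insensitive substring matching, which the plan does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Resolution:
    """Outcome of resolving one user-typed name (hypothetical structure)."""
    query: str
    matches: list[str]

    @property
    def status(self) -> str:
        # No match: report an error for this name (Test D1 scenario).
        if not self.matches:
            return "not_found"
        # Exactly one candidate: proceed without prompting (Test A4).
        if len(self.matches) == 1:
            return "resolved"
        # Several candidates: show a disambiguation prompt (Test A5).
        return "ambiguous"


def resolve_name(query: str, containers: list[str]) -> Resolution:
    """Resolve a typed name, giving an exact match priority over partial ones."""
    q = query.lower()
    exact = [c for c in containers if c.lower() == q]
    if exact:
        # An exact hit suppresses partial matches such as "jellyplex" for "plex".
        return Resolution(query, exact[:1])
    # Assumption: "partial match" means case-insensitive substring containment.
    partial = [c for c in containers if q in c.lower()]
    return Resolution(query, partial)


if __name__ == "__main__":
    # "plex", "jellyplex" and "sonarr" come from the plan; the rest are made up.
    names = ["plex", "jellyplex", "sonarr", "linkding", "linuxserver-code"]
    for typed in ["plex", "jelly", "lin", "nonexistent"]:
        result = resolve_name(typed, names)
        print(f"{typed!r}: {result.status} {result.matches}")
```

When recording results for A4 and A5, comparing the bot's behavior against these three outcomes (`resolved`, `ambiguous`, `not_found`) makes it easier to tell a matching bug from a server that simply has no ambiguous names.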