docs(09-04): add comprehensive test plan for batch operations

- 21 test cases across 5 suites (text commands, update all, multi-select, errors, regression)
- Deployment steps for n8n workflow import
- Success criteria mapping to tests
- Test execution log template with issue tracking
- Covers all BAT-01 through BAT-06 requirements
2026-02-04 08:22:15 -05:00
parent 93be20d9c4
commit 652d877ce1
1 changed files with 499 additions and 0 deletions
@@ -0,0 +1,499 @@
 # Phase 09-04 Deployment and Test Plan
 **Generated:** 2026-02-04
 **Purpose:** Verification testing for batch operations implementation
 ## Deployment Steps
 ### 1. Import Updated Workflow
 1. Open your n8n instance (typically at http://your-server:5678)
 2. Navigate to Workflows
 3. Select "Docker Manager Bot" workflow
 4. Click the three-dot menu → Export
 5. Save current version as backup: `n8n-workflow-backup-20260204.json`
 6. Return to Workflows → Import from File
 7. Select the updated `n8n-workflow.json` from this repository
 8. Confirm credential mapping (should use existing "Telegram API" credential)
 9. Save and activate the workflow
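Before importing, it can help to sanity-check the workflow file and generate the dated backup name from step 5. The sketch below is one way to do that in Python. The helper names are hypothetical, and the check assumes an exported workflow is a JSON object with top-level `nodes` and `connections` keys; adjust it if your export looks different.

```python
import json
from datetime import date


def backup_filename(day: date) -> str:
    """Build a dated backup name in the style used in step 5."""
    return f"n8n-workflow-backup-{day:%Y%m%d}.json"


def check_workflow_export(text: str) -> list[str]:
    """Return a list of problems found in an exported workflow file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    # Assumption: an export carries top-level "nodes" and "connections" keys.
    return [f"missing key: {key}" for key in ("nodes", "connections") if key not in data]
```

An empty list means the file passed these basic checks; it does not prove the workflow will import cleanly.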
 ### 2. Verify Workflow Health
 Before testing, confirm:
 - ✅ Workflow is "Active" (toggle in top-right)
 - ✅ No error indicators on nodes
 - ✅ Telegram Trigger shows connected
 - ✅ Your user ID is still configured in IF User/Callback Authenticated nodes
 ## Test Plan
 ### Test Suite A: Batch Text Commands
 #### Test A1: Multi-Container Update
 **Objective:** Verify batch update with space-separated names
 **Steps:**
 1. Send to bot: `update plex sonarr` (use 2-3 of your actual container names)
 2. Observe behavior
 **Expected Results:**
 - ✅ Message shows "Updating N containers..."
 - ✅ Progress updates appear for each container individually
 - ✅ Shows "Pulling image..." → "Stopping..." → "Starting..." per container
 - ✅ Final summary message appears with success count
 - ✅ If any container fails, shows failure with reason but continues batch
 - ✅ Summary emphasizes failures (if any) over successes
 **Notes:**
 - Record actual execution time for performance assessment
 - Screenshot final summary for documentation
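For reference while testing, the parsing that A1 exercises can be sketched as follows. This is an illustration only, not the workflow's actual code: `parse_command` and `BATCH_ACTIONS` are hypothetical names, and dropping repeated names is an assumption about the intended behavior.

```python
BATCH_ACTIONS = {"update", "start", "stop"}


def parse_command(text: str) -> tuple[str, list[str]]:
    """Split a message such as 'update plex sonarr' into (action, names)."""
    words = text.split()
    if not words or words[0].lower() not in BATCH_ACTIONS:
        raise ValueError(f"unsupported command: {text!r}")
    action = words[0].lower()
    # Assumption: repeated names are dropped, keeping first-seen order.
    names = list(dict.fromkeys(words[1:]))
    return action, names
```

A message with two or more names after the action is what this plan treats as a batch.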
 ---
 #### Test A2: Multi-Container Start
 **Objective:** Verify batch start executes immediately (no confirmation)
 **Preparation:**
 1. Manually stop 2-3 containers via existing single commands: `stop plex`, `stop sonarr`
 2. Confirm containers are stopped via `status`
 **Steps:**
 1. Send to bot: `start plex sonarr` (use your stopped container names)
 2. Observe behavior
 **Expected Results:**
 - ✅ No confirmation prompt (starts immediately)
 - ✅ Progress shows for each container
 - ✅ Summary shows successful starts
 - ✅ `status` confirms the containers are running
 ---
 #### Test A3: Multi-Container Stop with Confirmation
 **Objective:** Verify batch stop requires confirmation (safety measure)
 **Steps:**
 1. Send to bot: `stop plex sonarr` (use 2+ running containers)
 2. Wait for confirmation prompt
 3. Tap "Confirm" button
 4. Observe execution
 **Expected Results:**
 - ✅ Confirmation message appears: "Stop 2 containers?"
 - ✅ Lists container names in confirmation
 - ✅ Has "Confirm" and "Cancel" buttons
 - ✅ After confirm: batch execution proceeds with progress
 - ✅ Summary shows containers stopped
 **Follow-up Test:**
 1. Repeat but tap "Cancel" button
 2. Expected: Confirmation deleted, no action taken
 **Follow-up Test 2:**
 1. Repeat but don't respond for 30+ seconds
 2. Expected: Confirmation expires with message
 ---
 #### Test A4: Fuzzy Matching with Exact Match Priority
 **Objective:** Verify exact match takes priority over partial matches
 **Scenario 1: Exact match exists**
 1. Send: `update plex` (when both "plex" and "jellyplex" exist)
 2. Expected: Only "plex" container updates (no disambiguation)
 **Scenario 2: Only partial matches**
 1. Send: `update jelly` (partially matches "jellyplex"; no container is named exactly "jelly")
 2. Expected: If only one match, proceeds; if multiple, shows disambiguation
 **Note:** This test depends on your actual container names. Adjust to match your server.
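The two scenarios follow from a resolution order of exact match first, then partial matches. A minimal sketch of that order, assuming case-insensitive substring matching (the real workflow's fuzzy matching may differ, and `resolve` is a hypothetical name):

```python
def resolve(query: str, names: list[str]) -> tuple[str, list[str]]:
    """Resolve a user-typed name against the known container names."""
    q = query.lower()
    exact = [n for n in names if n.lower() == q]
    if exact:
        # An exact match wins even when other names contain the query.
        return "match", exact[:1]
    partial = [n for n in names if q in n.lower()]
    if len(partial) == 1:
        return "match", partial
    if partial:
        return "ambiguous", partial  # caller shows a disambiguation prompt
    return "none", []
```

With containers `plex` and `jellyplex`, the query `plex` resolves to `plex` alone, while `jelly` resolves to `jellyplex` because it is the only partial match.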
 ---
 #### Test A5: Disambiguation for Ambiguous Names
 **Objective:** Verify disambiguation prompt appears when multiple containers match
 **Steps:**
 1. Send command with ambiguous partial match (e.g., `update lin` if you have multiple "lin*" containers)
 2. Wait for disambiguation prompt
 3. Select intended container
 **Expected Results:**
 - ✅ Shows "Multiple containers match: lin"
 - ✅ Lists matching containers with buttons
 - ✅ Selecting one proceeds with single-container action
 - ✅ Batch not triggered (user clarified intent)
 **Note:** If your server has no ambiguous container names, record "Cannot test - no ambiguous container names"
 ---
 ### Test Suite B: Update All Command
 #### Test B1: Update All with Available Updates
 **Objective:** Verify "update all" targets only :latest containers with updates
 **Steps:**
 1. Send: `update all` (or `updateall`)
 2. Observe confirmation prompt
 3. Note which containers are listed
 4. Tap "Confirm"
 5. Observe execution
 **Expected Results:**
 - ✅ Command recognized and routed
 - ✅ Confirmation shows: "Update N containers?"
 - ✅ Lists containers (max 10 displayed in message)
 - ✅ Only includes containers using :latest tag
 - ✅ 30-second timeout on confirmation
 - ✅ After confirm: batch execution with progress per container
 - ✅ Summary shows results
 **Verification:**
 - Check that only :latest containers were updated
 - Containers with specific tags (e.g., `:1.2.3`) should not appear in list
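When checking the list by hand, the rule "uses the :latest tag" can be sketched as below. The function names are hypothetical. The sketch relies on Docker treating an image reference without a tag as `latest`, and it assumes digest-pinned references should be excluded.

```python
from typing import Optional


def image_tag(image: str) -> Optional[str]:
    """Return the tag of an image reference, or None if pinned by digest."""
    if "@" in image:
        return None  # digest-pinned, e.g. repo@sha256:...
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference is a registry port (e.g. localhost:5000/app).
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"  # Docker's default when no tag is given


def uses_latest(image: str) -> bool:
    return image_tag(image) == "latest"
```

Whether a container actually has an update available is a separate check that this sketch does not cover.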
 ---
 #### Test B2: Update All When No Updates Available
 **Objective:** Verify appropriate message when all containers are current
 **Preparation:**
 1. Update all containers manually first OR wait until all are current
 **Steps:**
 1. Send: `update all`
 2. Observe response
 **Expected Results:**
 - ✅ Shows: "All containers are up to date!" (or similar message)
 - ✅ No confirmation prompt
 - ✅ No batch execution attempted
 ---
 #### Test B3: Update All Cancel
 **Objective:** Verify cancel works
 **Steps:**
 1. Send: `update all`
 2. Wait for confirmation
 3. Tap "Cancel"
 **Expected Results:**
 - ✅ Confirmation message deleted
 - ✅ Shows cancellation feedback
 - ✅ No containers updated
 ---
 #### Test B4: Update All Timeout
 **Objective:** Verify expiration behavior
 **Steps:**
 1. Send: `update all`
 2. Wait for confirmation
 3. Don't respond for 30+ seconds
 4. Try tapping button after expiry
 **Expected Results:**
 - ✅ After 30s: Shows expiry message
 - ✅ Confirmation becomes inactive
 - ✅ Tapping expired button shows alert
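The expiry rule being tested can be sketched as a simple timestamp comparison. The names below are hypothetical, and treating a prompt as expired at exactly 30 seconds is an assumption; the workflow may draw the boundary differently.

```python
from datetime import datetime, timedelta

CONFIRM_TTL = timedelta(seconds=30)


def is_expired(prompt_sent_at: datetime, now: datetime,
               ttl: timedelta = CONFIRM_TTL) -> bool:
    """True once the confirmation prompt is at least `ttl` old."""
    return now - prompt_sent_at >= ttl
```

Waiting a few seconds past the limit in step 3 avoids ambiguity about the exact boundary.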
 ---
 ### Test Suite C: Inline Keyboard Multi-Select
 #### Test C1: Enter Multi-Select Mode
 **Objective:** Verify multi-select keyboard appears
 **Steps:**
 1. Send: `/status`
 2. Locate the "Select Multiple" button (it may not exist yet and may need to be added in a future plan)
 3. If the button is missing, trigger the `batch:mode` callback manually as described in the note below
 **Note:** If no entry point exists yet, test by:
 - Temporarily adding "Select Multiple" button to status keyboard
 - OR testing via n8n "Execute Node" with callback_query data: `batch:mode`
 **Expected Results:**
 - ✅ Keyboard shows container list with state icons
 - ✅ Running containers: 🟢
 - ✅ Stopped containers: ⚪
 - ✅ Each button shows container name
 - ✅ No checkmarks initially
 - ✅ Bottom row has "Cancel" button
 ---
 #### Test C2: Toggle Selection
 **Objective:** Verify checkmarks toggle on/off
 **Steps:**
 1. Enter multi-select mode (from C1)
 2. Tap a container button
 3. Observe keyboard update
 4. Tap same container again
 5. Tap different container
 **Expected Results:**
 - ✅ First tap: Checkmark (✓) appears before container name
 - ✅ Second tap: Checkmark disappears
 - ✅ Multiple containers can have checkmarks
 - ✅ Action buttons appear when any container selected
 - ✅ "Clear Selection" button appears with selection
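The toggle behavior and button labels checked above can be modeled as follows. This is a sketch under assumptions: the function names are hypothetical, and the label layout (checkmark, then state icon, then name) is one reading of "checkmark appears before container name".

```python
def toggle(selected: list[str], name: str) -> list[str]:
    """Return a new selection with `name` added, or removed if present."""
    if name in selected:
        return [n for n in selected if n != name]
    return selected + [name]


def button_label(name: str, running: bool, selected: list[str]) -> str:
    """Build the button text: optional checkmark, state icon, name."""
    icon = "🟢" if running else "⚪"
    mark = "✓ " if name in selected else ""
    return f"{mark}{icon} {name}"
```

Tapping the same container twice returns the selection to where it started, which is what steps 2 through 4 verify.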
 ---
 #### Test C3: Execute Batch Update from Multi-Select
 **Objective:** Verify batch execution from inline keyboard
 **Steps:**
 1. Enter multi-select mode
 2. Select 2-3 containers (tap to add checkmarks)
 3. Tap "Update Selected (N)" button
 4. Observe execution
 **Expected Results:**
 - ✅ Immediate execution (no confirmation for update)
 - ✅ Progress shows for each selected container
 - ✅ Summary shows results
 - ✅ Selection message deleted after execution starts
 ---
 #### Test C4: Execute Batch Stop with Confirmation
 **Objective:** Verify stop requires confirmation from multi-select
 **Steps:**
 1. Enter multi-select mode
 2. Select 2+ running containers
 3. Tap "Stop Selected (N)" button
 4. Wait for confirmation
 5. Tap "Confirm"
 **Expected Results:**
 - ✅ Confirmation prompt appears (doesn't execute immediately)
 - ✅ Lists selected containers
 - ✅ After confirm: batch stop executes
 - ✅ Summary shows stopped containers
 ---
 #### Test C5: Selection Limit Enforcement
 **Objective:** Verify callback size limit prevents overflow
 **Steps:**
 1. Enter multi-select mode
 2. Select containers one by one
 3. Attempt to select 9+ containers
 **Expected Results:**
 - ✅ Selection works smoothly for first ~8 containers
 - ✅ At limit: Alert appears "Selection limit reached"
 - ✅ Cannot select additional containers
 - ✅ Can deselect and select different containers
 - ✅ Guidance shown (e.g., "Use 'update all' for larger batches")
 **Note:** The exact limit depends on container name lengths: shorter names allow more selections.
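The limit comes from Telegram, which caps `callback_data` on inline keyboard buttons at 64 bytes. The sketch below shows why the number of selectable containers varies with name length. The encoding (`batch:toggle:` followed by comma-separated names) is a hypothetical layout for illustration, not necessarily what the workflow uses.

```python
CALLBACK_DATA_LIMIT = 64  # bytes, per the Telegram Bot API


def encode_selection(selected: list[str]) -> str:
    # Hypothetical layout: a fixed prefix plus comma-separated names.
    return "batch:toggle:" + ",".join(selected)


def can_select(selected: list[str], name: str) -> bool:
    """True if adding `name` keeps the callback data within the limit."""
    candidate = encode_selection(selected + [name])
    return len(candidate.encode("utf-8")) <= CALLBACK_DATA_LIMIT
```

Under this layout, six short names already use 52 bytes, so a seventh name fits only if it is 11 bytes or shorter.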
 ---
 #### Test C6: Clear Selection
 **Objective:** Verify clear button resets selection
 **Steps:**
 1. Select 3+ containers
 2. Tap "Clear Selection" button
 **Expected Results:**
 - ✅ All checkmarks removed
 - ✅ Action buttons disappear
 - ✅ Only "Cancel" button remains
 - ✅ Can start new selection
 ---
 #### Test C7: Cancel Multi-Select
 **Objective:** Verify cancel exits cleanly
 **Steps:**
 1. Enter multi-select mode (with or without selection)
 2. Tap "Cancel" button
 **Expected Results:**
 - ✅ Selection message deleted
 - ✅ Returns to previous state
 - ✅ No actions executed
 ---
 ### Test Suite D: Error Handling
 #### Test D1: Failure Isolation
 **Objective:** Verify one failure doesn't abort batch
 **Steps:**
 1. Create a batch with intentional failure (e.g., non-existent container mixed with real ones)
 2. Example: `update plex nonexistent sonarr`
 3. Observe execution
 **Expected Results:**
 - ✅ First container processes
 - ✅ Failed container shows error message with reason
 - ✅ Remaining containers still process (batch continues)
 - ✅ Summary shows: "2 succeeded, 1 failed"
 - ✅ Failure details prominent in summary
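Failure isolation amounts to catching each container's error inside the loop rather than around it. A minimal sketch, with hypothetical names and a caller-supplied `action` standing in for the real update step:

```python
from typing import Callable


def run_batch(names: list[str], action: Callable[[str], None]) -> dict[str, list]:
    """Run `action` on every name, recording failures without stopping."""
    succeeded: list[str] = []
    failed: list[tuple[str, str]] = []
    for name in names:
        try:
            action(name)
            succeeded.append(name)
        except Exception as exc:  # isolate the failure to this container
            failed.append((name, str(exc)))
    return {"succeeded": succeeded, "failed": failed}


def summarize(result: dict[str, list]) -> str:
    """Counts on the first line, then failure details before anything else."""
    lines = [f"{len(result['succeeded'])} succeeded, {len(result['failed'])} failed"]
    lines += [f"FAILED {name}: {reason}" for name, reason in result["failed"]]
    return "\n".join(lines)
```

For `update plex nonexistent sonarr`, this produces a first summary line of "2 succeeded, 1 failed" followed by the reason for the failed container.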
 ---
 #### Test D2: Warning vs Error Classification
 **Objective:** Verify non-critical warnings don't show as errors
 **Setup:**
 1. Stop a container: `stop plex`
 2. Try stopping again: `stop plex`
 **Expected Results:**
 - ✅ Shows as warning, not error
 - ✅ Message: "Already stopped" or similar
 - ✅ Summary distinguishes warnings from errors
 **Similar tests:**
 - Update container with no update available → Warning, not error
 - Start already-running container → Warning
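One way to draw the warning/error line, assuming the workflow talks to the Docker Engine API over HTTP (it may not): the container start and stop endpoints return 204 on success and 304 when the container is already in the requested state. The `classify` helper below is hypothetical.

```python
def classify(status_code: int) -> str:
    """Map a Docker Engine API start/stop response code to a result class."""
    if status_code == 204:
        return "success"
    if status_code == 304:
        return "warning"  # already started or already stopped
    return "error"  # e.g. 404 no such container, 500 server error
```

The "no update available" case in the list above is not covered by these status codes and would need its own check.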
 ---
 ### Test Suite E: Regression Tests
 #### Test E1: Single-Container Commands Still Work
 **Objective:** Verify no regression in existing functionality
 **Commands to test:**
 1. `status` → Shows container list keyboard
 2. `start plex` → Starts single container
 3. `stop plex` → Shows confirmation, then stops
 4. `restart plex` → Restarts with progress
 5. `update plex` → Updates single container
 6. `logs plex` → Shows last 50 lines
 7. `logs plex 100` → Shows last 100 lines
 **Expected Results:**
 - ✅ All single-container commands work exactly as before
 - ✅ No batch behavior triggered for single containers
 - ✅ Confirmation behavior unchanged (stop requires confirm, others don't)
 ---
 #### Test E2: Inline Keyboard Actions Still Work
 **Objective:** Verify phase 8 keyboard functionality intact
 **Steps:**
 1. Send: `status`
 2. Tap a container name button
 3. Tap an action button (e.g., "▶️ Start", "Update 🔄")
 4. Complete action
 **Expected Results:**
 - ✅ Container detail view appears
 - ✅ Action buttons work
 - ✅ Actions execute correctly
 - ✅ No interference from batch infrastructure
 ---
 #### Test E3: Pagination Still Works
 **Objective:** Verify container list pagination for many containers
 **Steps:**
 1. Send: `status` (if you have 10+ containers)
 2. Navigate with Previous/Next buttons
 **Expected Results:**
 - ✅ Pagination works correctly
 - ✅ Page numbers accurate
 - ✅ No batch selection interference
 **Note:** If you have fewer than 10 containers, record "Cannot test - insufficient containers"
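To know what "page numbers accurate" should look like for your container count, the arithmetic can be sketched as below. The page size of 10 is an assumption, and `paginate` is a hypothetical helper; check the workflow for the real value.

```python
import math


def paginate(items: list[str], page: int, per_page: int = 10) -> tuple[list[str], int, int]:
    """Return (items on the page, clamped page number, total pages)."""
    total_pages = max(1, math.ceil(len(items) / per_page))
    page = min(max(page, 1), total_pages)  # clamp out-of-range requests
    start = (page - 1) * per_page
    return items[start:start + per_page], page, total_pages
```

With 23 containers and this page size, expect three pages, the last holding three containers.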
 ---
 ## Success Criteria Verification
 After completing all tests, verify these criteria are met:
 - [ ] **BAT-01:** User can update multiple containers in one command
  - Tests: A1, C3
 - [ ] **BAT-02:** Batch updates execute sequentially with per-container feedback
  - Tests: A1, A2, C3
 - [ ] **BAT-03:** "Update all" updates only containers with updates available
  - Tests: B1, B2
 - [ ] **BAT-04:** "Update all" requires confirmation
  - Tests: B1, B3, B4
 - [ ] **BAT-05:** One failure doesn't abort remaining batch
  - Tests: D1
 - [ ] **BAT-06:** Final summary shows success/failure count
  - Tests: A1, C3, D1
 - [ ] **Inline keyboard batch selection works**
  - Tests: C1-C7
 - [ ] **No regression in existing commands**
  - Tests: E1, E2, E3
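If you record results electronically, the BAT mapping above can be checked mechanically. This sketch copies the requirement-to-test mapping from the list; the function name and the choice to treat "n/a" as non-blocking are assumptions.

```python
CRITERIA = {
    "BAT-01": ["A1", "C3"],
    "BAT-02": ["A1", "A2", "C3"],
    "BAT-03": ["B1", "B2"],
    "BAT-04": ["B1", "B3", "B4"],
    "BAT-05": ["D1"],
    "BAT-06": ["A1", "C3", "D1"],
}


def unmet_criteria(results: dict[str, str]) -> dict[str, list[str]]:
    """Map each unmet criterion to the tests that block it.

    `results` maps a test ID to "pass", "fail", or "n/a"; a test with no
    recorded result counts as blocking.
    """
    unmet = {}
    for criterion, tests in CRITERIA.items():
        blocking = [t for t in tests if results.get(t) not in ("pass", "n/a")]
        if blocking:
            unmet[criterion] = blocking
    return unmet
```

An empty result means every BAT criterion is backed by passing (or not applicable) tests.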
 ## Test Execution Log
 **Date:** ___________
 **Tester:** ___________
 **n8n Version:** ___________
 **Workflow Import Time:** ___________
 ### Results Summary
 | Test | Status | Notes |
 |------|--------|-------|
 | A1: Multi-container update | ⬜ Pass / ⬜ Fail | |
 | A2: Multi-container start | ⬜ Pass / ⬜ Fail | |
 | A3: Multi-container stop | ⬜ Pass / ⬜ Fail | |
 | A4: Fuzzy matching | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
 | A5: Disambiguation | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
 | B1: Update all with updates | ⬜ Pass / ⬜ Fail | |
 | B2: Update all (none available) | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
 | B3: Update all cancel | ⬜ Pass / ⬜ Fail | |
 | B4: Update all timeout | ⬜ Pass / ⬜ Fail | |
 | C1: Enter multi-select | ⬜ Pass / ⬜ Fail | |
 | C2: Toggle selection | ⬜ Pass / ⬜ Fail | |
 | C3: Batch update from multi-select | ⬜ Pass / ⬜ Fail | |
 | C4: Batch stop with confirm | ⬜ Pass / ⬜ Fail | |
 | C5: Selection limit | ⬜ Pass / ⬜ Fail | |
 | C6: Clear selection | ⬜ Pass / ⬜ Fail | |
 | C7: Cancel multi-select | ⬜ Pass / ⬜ Fail | |
 | D1: Failure isolation | ⬜ Pass / ⬜ Fail | |
 | D2: Warning vs error | ⬜ Pass / ⬜ Fail | |
 | E1: Single commands regression | ⬜ Pass / ⬜ Fail | |
 | E2: Inline keyboard regression | ⬜ Pass / ⬜ Fail | |
 | E3: Pagination regression | ⬜ Pass / ⬜ Fail / ⬜ N/A | |
 ### Issues Found
 | Issue # | Test | Description | Severity | Status |
 |---------|------|-------------|----------|--------|
 | | | | | |
 ### Additional Observations
 _Record any unexpected behavior, performance notes, UX feedback, etc._
 ---
 **Overall Assessment:** ⬜ Ready for Production / ⬜ Issues Need Resolution
 **Notes for Phase 10:**
 _Record any polish/improvement ideas discovered during testing_