surfaces/.planning/phases/05-mvp-deployment/05-VALIDATION.md

| phase | slug | status | nyquist_compliant | wave_0_complete | created |
|---|---|---|---|---|---|
| 05 | mvp-deployment | revised | true | false | 2026-04-28 |

Phase 05 — Validation Strategy

Per-phase validation contract for feedback sampling during execution.


Test Infrastructure

| Property | Value |
|---|---|
| Framework | pytest + pytest-asyncio |
| Config file | `pyproject.toml` |
| Quick run command | `pytest tests/adapter/matrix/test_reconciliation.py tests/adapter/matrix/test_restart_persistence.py -v` |
| Full suite command | `pytest tests/ -v` |
| Estimated runtime | Targeted slices < 60 seconds each; full suite longer |

Sampling Rate

  • After every task commit: Run the exact <automated> command from the task that just changed
  • After every plan wave: Run pytest tests/adapter/matrix/ -v
  • Before $gsd-verify-work: Full suite must be green
  • Max feedback latency: 60 seconds for task-level slices
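
The 60-second ceiling for task-level slices can be checked mechanically rather than by eye. The following is a minimal sketch, assuming a hypothetical helper `run_slice` (not part of this repository) that wraps a task's automated command in a timeout and reports whether it stayed within the budget:

```python
import shlex
import subprocess
import time

# Hypothetical constant and helper, not part of this repository.
MAX_FEEDBACK_LATENCY_S = 60.0


def run_slice(command: str, budget_s: float = MAX_FEEDBACK_LATENCY_S) -> dict:
    """Run one task-level slice and report status, duration, and budget compliance."""
    started = time.monotonic()
    try:
        # subprocess.run kills the child and raises TimeoutExpired on overrun.
        completed = subprocess.run(shlex.split(command), timeout=budget_s)
        returncode = completed.returncode
        timed_out = False
    except subprocess.TimeoutExpired:
        returncode = None
        timed_out = True
    elapsed = time.monotonic() - started
    return {
        "returncode": returncode,
        "elapsed_s": elapsed,
        "within_budget": not timed_out,
        "green": returncode == 0,
    }
```

A call such as `run_slice("pytest tests/adapter/matrix/test_reconciliation.py -v")` would then flag a slice that is either red or too slow to count as task-level feedback.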

Per-Task Verification Map

| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---|---|---|---|---|---|---|---|
| 05-01-01 | 01 | 1 | PH05-01 | integration | `pytest tests/adapter/matrix/test_invite_space.py tests/adapter/matrix/test_chat_space.py tests/adapter/matrix/test_reconciliation.py tests/adapter/matrix/test_restart_persistence.py -v` | W0 | pending |
| 05-01-02 | 01 | 1 | PH05-03 | integration | `pytest tests/adapter/matrix/test_invite_space.py tests/adapter/matrix/test_chat_space.py tests/adapter/matrix/test_reconciliation.py tests/adapter/matrix/test_restart_persistence.py tests/adapter/matrix/test_dispatcher.py -v` | W0 | pending |
| 05-02-01 | 02 | 2 | PH05-02 | integration | `pytest tests/adapter/matrix/test_context_commands.py tests/adapter/matrix/test_routed_platform.py -v` | partial | pending |
| 05-02-02 | 02 | 2 | PH05-02 | integration | `pytest tests/adapter/matrix/test_context_commands.py tests/adapter/matrix/test_routed_platform.py tests/adapter/matrix/test_dispatcher.py -v` | partial | pending |
| 05-03-01 | 03 | 1 | PH05-04 | integration | `pytest tests/adapter/matrix/test_files.py tests/platform/test_real.py -v` | partial | pending |
| 05-03-02 | 03 | 1 | PH05-04 | integration | `pytest tests/adapter/matrix/test_files.py tests/platform/test_real.py tests/adapter/matrix/test_send_outgoing.py -v` | partial | pending |
| 05-04-01 | 04 | 2 | PH05-05 | smoke | `docker compose -f docker-compose.prod.yml config && docker compose -f docker-compose.fullstack.yml config` | W0 | pending |
| 05-04-02 | 04 | 2 | PH05-05 | docs smoke | `rg -n "docker-compose\.prod docker-compose\.fullstack /agents | | |

Status: pending · green · red · ⚠️ flaky
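
The File Exists column depends on whether the test files named in each automated command are already present. A minimal sketch of that check follows, assuming the commands are plain `pytest <paths> -v` invocations; `missing_test_files` is a hypothetical helper, not something this repository provides:

```python
import shlex
from pathlib import Path


def missing_test_files(command: str, repo_root: Path) -> list[str]:
    """Return the test paths named in a pytest command that do not exist yet."""
    tokens = shlex.split(command)
    # Keep only arguments that look like Python test files; the `pytest`
    # executable and flags such as -v are skipped.
    paths = [token for token in tokens if token.endswith(".py")]
    return [path for path in paths if not (repo_root / path).is_file()]
```

The sketch only reports which files are absent. How that result maps onto the `W0` and `partial` markers is left to the plan.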


Wave 0 Requirements

  • tests/adapter/matrix/test_reconciliation.py — startup recovery of user and room metadata from Matrix state
  • tests/adapter/matrix/test_restart_persistence.py additions — deterministic backfill for legacy rooms missing platform_chat_id
  • tests/adapter/matrix/test_context_commands.py additions — room-local !clear rotation semantics
  • tests/adapter/matrix/test_files.py additions — cross-room attachment isolation and shared-root consistency
  • Compose smoke coverage or documented verification command for docker-compose.prod.yml and docker-compose.fullstack.yml
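
The backfill requirement above comes down to two properties: repeated runs produce the same `platform_chat_id` for a legacy room, and rooms that already have an ID are left alone. The adapter's real API is not shown in this document, so the sketch below uses hypothetical stand-ins (`derive_platform_chat_id`, `backfill_platform_chat_ids`, and a hash-based ID scheme) purely to illustrate the shape such a test could take:

```python
import hashlib


def derive_platform_chat_id(room_id: str) -> str:
    """Hypothetical stand-in: derive a stable ID from the Matrix room ID."""
    digest = hashlib.sha256(room_id.encode("utf-8")).hexdigest()
    return f"chat-{digest[:16]}"


def backfill_platform_chat_ids(rooms: dict[str, dict]) -> dict[str, dict]:
    """Return a copy of `rooms` with platform_chat_id filled in where missing."""
    result = {}
    for room_id, meta in rooms.items():
        updated = dict(meta)
        if not updated.get("platform_chat_id"):
            updated["platform_chat_id"] = derive_platform_chat_id(room_id)
        result[room_id] = updated
    return result


def test_backfill_is_deterministic_and_preserves_existing_ids():
    rooms = {
        "!legacy:example.org": {"label": "old room"},
        "!current:example.org": {"platform_chat_id": "chat-existing"},
    }
    first = backfill_platform_chat_ids(rooms)
    second = backfill_platform_chat_ids(rooms)
    # Same input, same output: the backfill does not depend on run order or time.
    assert first == second
    # Rooms that already carry an ID are not rewritten.
    assert first["!current:example.org"]["platform_chat_id"] == "chat-existing"
    assert first["!legacy:example.org"]["platform_chat_id"].startswith("chat-")
```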

Manual-Only Verifications

| Behavior | Requirement | Why Manual | Test Instructions |
|---|---|---|---|
| Restart after real Matrix room topology exists | PH05-03 | Full recovery depends on live Space hierarchy and persisted homeserver state | Start the bot, provision a Space and chat rooms, stop the bot, remove local SQLite metadata, restart, and confirm that routing and room labels are rebuilt before live messages are handled |
| Shared `/agents` volume behavior across bot and platform containers | PH05-04 | Container mounts and permissions are environment-dependent | Run `docker compose -f docker-compose.fullstack.yml up`, upload a file in Matrix, confirm the agent sees the relative `workspace_path`, then confirm an agent-created file is readable back from the bot side |
| Operator handoff of prod compose | PH05-05 | Final deploy contract depends on real env files and target host conventions | Run `docker compose -f docker-compose.prod.yml config` on the target deployment checkout and confirm only bot services, required env vars, and shared volumes are present |
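
The prod compose handoff check can be partly scripted once the rendered config is available as JSON, for example from `docker compose -f docker-compose.prod.yml config --format json`. The sketch below is an illustration only: the helper names are hypothetical, and so are the service name `bot` and volume name `agents` used in the usage note.

```python
import json


def unexpected_services(config: dict, allowed: set[str]) -> set[str]:
    """Return service names in a rendered Compose config that are not allowed."""
    return set(config.get("services") or {}) - allowed


def missing_volumes(config: dict, required: set[str]) -> set[str]:
    """Return required top-level volumes that the rendered config does not declare."""
    return required - set(config.get("volumes") or {})


def load_rendered_config(raw_json: str) -> dict:
    """Parse the JSON text of a rendered Compose config."""
    return json.loads(raw_json)
```

With the rendered JSON in `raw`, `unexpected_services(load_rendered_config(raw), {"bot"})` and `missing_volumes(load_rendered_config(raw), {"agents"})` should both return empty sets on a clean prod config. Environment variables and host conventions still need the manual review described in the table.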

Validation Sign-Off

  • All tasks have <automated> verify or Wave 0 dependencies
  • Sampling continuity: no 3 consecutive tasks without automated verify
  • Wave 0 covers all MISSING references
  • No watch-mode flags
  • Feedback latency target tightened to task slices under 60s
  • nyquist_compliant: true set in frontmatter
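
The sampling-continuity item is a simple run-length condition over the task list in plan order. A minimal sketch, assuming each task is reduced to a boolean meaning "has an automated verify command"; both function names are hypothetical:

```python
def longest_unverified_run(has_automated: list[bool]) -> int:
    """Length of the longest streak of consecutive tasks lacking automated verify."""
    longest = current = 0
    for verified in has_automated:
        current = 0 if verified else current + 1
        longest = max(longest, current)
    return longest


def sampling_is_continuous(has_automated: list[bool], max_gap: int = 2) -> bool:
    """True when no more than `max_gap` consecutive tasks lack automated verify."""
    return longest_unverified_run(has_automated) <= max_gap
```

With the default `max_gap` of 2, a streak of three unverified tasks fails the check, matching the "no 3 consecutive tasks" rule.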

Approval: pending