surfaces/.planning/phases/05-mvp-deployment/.continue-here.md

---
phase: 05-mvp-deployment
task: 0
total_tasks: 0
status: paused_after_handoff
last_updated: 2026-04-28T18:39:43.064Z
---
<current_state>
Phase 05 implementation and deployment handoff are complete. The latest handoff commit is `5b53788` on `feat/deploy`, pushed to origin. The Matrix surface image was built and published as `mput1/surfaces-bot:latest` with digest `sha256:26ba3a49290ab7c1cf0fa97f3de3fefdc70b59df7e6f1e0c2255728f8e2369be`.
The production model is one generic Matrix surface container connected to 25-30 externally managed platform agents. The surface does not start or manage agent containers.
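Because `latest` is a mutable tag, the platform may prefer to pull the image by digest. The following is a minimal sketch, assuming the standard `name@sha256:<hex>` image reference form and assuming `SURFACES_BOT_IMAGE` is passed through unchanged to the compose `image:` field. The helper name `pinned_reference` is hypothetical and not part of this repo.

```python
import re

IMAGE = "mput1/surfaces-bot"
DIGEST = "sha256:26ba3a49290ab7c1cf0fa97f3de3fefdc70b59df7e6f1e0c2255728f8e2369be"

# A sha256 digest is the literal prefix plus exactly 64 lowercase hex characters.
_SHA256 = re.compile(r"sha256:[0-9a-f]{64}")


def pinned_reference(image: str, digest: str) -> str:
    """Return an immutable `image@digest` reference (hypothetical helper)."""
    if not _SHA256.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{image}@{digest}"


# Candidate value for SURFACES_BOT_IMAGE when a pinned pull is wanted.
print(pinned_reference(IMAGE, DIGEST))
```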
</current_state>
<completed_work>
- Finalized `docker-compose.prod.yml` as a bot-only handoff that requires `SURFACES_BOT_IMAGE` to be set.
- Kept `docker-compose.fullstack.yml` as an internal E2E harness with one local `platform-agent` and a local `agent_api` build context.
- Updated `Dockerfile` so production installs `platform/agent_api` from Git and no longer depends on local `external/`.
- Updated `.dockerignore` to keep `external/`, `.planning/`, tests, local runtime state, and real `config/matrix-agents.yaml` out of the image context.
- Updated `README.md`, `.env.example`, `docs/deploy-architecture.md`, and `config/matrix-agents.example.yaml` with the multi-agent contract.
- Added deploy contract tests and file-volume routing tests covering `/agents/17/incoming` and `/agents/17/output`.
- Verified handoff slice: `74 passed`, ruff clean, compose render clean, `git diff --check` clean.
- User built and pushed `mput1/surfaces-bot:latest` successfully.
</completed_work>
<remaining_work>
- Send platform the published image tag/digest and the deploy contract:
  - `mput1/surfaces-bot:latest`
  - `sha256:26ba3a49290ab7c1cf0fa97f3de3fefdc70b59df7e6f1e0c2255728f8e2369be`
  - one surface container, external 25-30 agents, routing through `config/matrix-agents.yaml`
- Platform must provide real `config/matrix-agents.yaml` with `agent_id`, `base_url`, and `workspace_path` for each agent.
- Platform must mount shared storage so bot-side `/agents/N` is the same storage each `agent_N` sees as `/workspace`.
- Run a real Matrix smoke test against platform-managed agents after the platform deploys the image.
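
Before the smoke test, the platform-provided agent list could be sanity-checked along these lines. This is a sketch only: it assumes the YAML has already been parsed into a list of mappings, and the only field names taken from this note are `agent_id`, `base_url`, and `workspace_path`. The `AgentEntry` class, the `validate_agents` function, the string coercion of `agent_id`, and the sample `base_url` are illustrative assumptions, not the bot's actual loader.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

REQUIRED = ("agent_id", "base_url", "workspace_path")


@dataclass(frozen=True)
class AgentEntry:
    agent_id: str
    base_url: str
    workspace_path: str


def validate_agents(entries: list[dict]) -> list[AgentEntry]:
    """Check required fields, unique ids, http(s) URLs, absolute workspace paths."""
    seen: set[str] = set()
    agents: list[AgentEntry] = []
    for i, raw in enumerate(entries):
        missing = [k for k in REQUIRED if raw.get(k) in (None, "")]
        if missing:
            raise ValueError(f"entry {i}: missing {missing}")
        # Assumption: ids are compared as strings, so 17 and "17" collide.
        agent = AgentEntry(*(str(raw[k]) for k in REQUIRED))
        if agent.agent_id in seen:
            raise ValueError(f"entry {i}: duplicate agent_id {agent.agent_id}")
        if urlparse(agent.base_url).scheme not in ("http", "https"):
            raise ValueError(f"entry {i}: base_url must be http(s)")
        if not agent.workspace_path.startswith("/"):
            raise ValueError(f"entry {i}: workspace_path must be absolute")
        seen.add(agent.agent_id)
        agents.append(agent)
    return agents


# Sample entry; the host name is made up for illustration.
sample = [
    {
        "agent_id": 17,
        "base_url": "http://agent-17.example:8000",
        "workspace_path": "/agents/17",
    }
]
print(validate_agents(sample))
```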
</remaining_work>
<decisions_made>
- Ship one generic Matrix surface image instead of attempting to model 25-30 agent services in our production compose.
- Keep agent lifecycle, scaling, and orchestration owned by the platform.
- Use `SURFACES_BOT_IMAGE=mput1/surfaces-bot:latest` as the documented image for handoff.
- Preserve `docker-compose.fullstack.yml` only as a local/internal E2E harness, not as production topology.
- Treat file exchange as a shared-volume contract: user files go to `{workspace_path}/incoming/...`; agent output is read from `{workspace_path}/output/...`.
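
The shared-volume contract in the last decision can be sketched as a pair of path helpers. The layout beneath `incoming/` and `output/` is elided in this note, so the sketch assumes files sit directly under those directories. The function names are hypothetical and do not reflect the bot's real routing code.

```python
from pathlib import PurePosixPath


def _safe_join(workspace_path: str, subdir: str, filename: str) -> PurePosixPath:
    # Keep only the final path component so a name such as "../../etc/passwd"
    # cannot escape the agent's workspace.
    name = PurePosixPath(filename).name
    if name in ("", ".", ".."):
        raise ValueError(f"unusable filename: {filename!r}")
    return PurePosixPath(workspace_path) / subdir / name


def incoming_path(workspace_path: str, filename: str) -> PurePosixPath:
    """Where the bot would write a user's file for the agent to read."""
    return _safe_join(workspace_path, "incoming", filename)


def output_path(workspace_path: str, filename: str) -> PurePosixPath:
    """Where the bot would read a file produced by the agent."""
    return _safe_join(workspace_path, "output", filename)


print(incoming_path("/agents/17", "report.pdf"))  # /agents/17/incoming/report.pdf
print(output_path("/agents/17", "summary.txt"))   # /agents/17/output/summary.txt
```

With the mount described under remaining work, the same files would appear to `agent_17` as `/workspace/incoming/report.pdf` and `/workspace/output/summary.txt`.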
</decisions_made>
<blockers>
- Full production verification is external: it requires the platform team's real 25-30 agent orchestration, reverse proxy routes, Matrix credentials, and volume mounts.
- Existing unrelated `.planning` changes and a local jpg remain in the worktree; they predate this pause and were not part of the deploy handoff commit.
</blockers>
<context>
If resuming, do not re-open the old single-chat / DM-first deployment direction. The accepted model is Space+rooms, per-room `platform_chat_id`, one Matrix surface image, and external per-agent routing via `matrix-agents.yaml`.
The likely next conversation with platform should be operational, not implementation-heavy: confirm they pull `mput1/surfaces-bot:latest`, mount `/agents`, provide `matrix-agents.yaml`, and run one user-to-agent file round trip.
</context>
<next_action>
Start by sending platform the image tag/digest and the concise deployment checklist. Then coordinate the first real smoke test: one Matrix user mapped to one agent, text message, incoming file to `/agents/N/incoming`, outgoing file from `/agents/N/output`.
</next_action>