wip: 02-prototype paused at task 4/4

Mikhail Putilovskij 2026-04-08 02:55:30 +03:00
parent 9c73266ea5
commit 7507b2f252
2 changed files with 138 additions and 40 deletions

@@ -0,0 +1,72 @@
---
phase: 02-prototype
task: 4
total_tasks: 4
status: paused
last_updated: 2026-04-07T23:54:30.473Z
---
<current_state>
The Matrix direct-agent prototype is implemented and manually proven working on branch `feat/matrix-direct-agent-prototype`. The current code path can log into Matrix, accept invites, provision the first Space/chat tree for a fresh user, and send live text messages to a patched local `platform-agent` over WebSocket. The immediate remaining engineering gap is not feature delivery but resilience: backend/provider failures can still bubble up as `PlatformError` and crash the Matrix bot process.
</current_state>
<completed_work>
- Task 1: Added `sdk/agent_session.py` and transport tests for direct WebSocket messaging with collision-safe `thread_key` generation.
- Task 2: Added `sdk/prototype_state.py` and tests for stable local user mapping, settings defaults, and mutation-safe settings copies.
- Task 3: Added `sdk/real.py` as the `PlatformClient` implementation, fixed import-time dependency leakage, and aligned thread-key tests to the actual dispatcher contract.
- Task 4: Wired Matrix runtime selection through `MATRIX_PLATFORM_BACKEND=real`, documented usage in `README.md`, and added dispatcher coverage for real backend selection.
- Fixed repeat Matrix invites so the bot now `join()`s before the existing-user early return path.
- Added Russian runbook doc `docs/matrix-direct-agent-prototype-ru.md` and pushed the branch.
- Manually validated live bring-up using a local patched `external/platform-agent` on port 8000 plus the Matrix homeserver `https://matrix.lambda.coredump.ru`.
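
The repeat-invite fix above comes down to ordering: accept the invite first, then decide whether provisioning is needed. A minimal sketch of that ordering follows. `FakeRoomClient`, `handle_invite`, and the `known_users` set are hypothetical stand-ins for illustration, not the actual code in `adapter/matrix/handlers/auth.py`.

```python
import asyncio


class FakeRoomClient:
    """Hypothetical stand-in for the Matrix client; it only records joins."""

    def __init__(self):
        self.joined = []

    async def join(self, room_id):
        self.joined.append(room_id)


async def handle_invite(client, known_users, room_id, user_id):
    # Join before any early return, so a repeat invite is still accepted.
    await client.join(room_id)
    if user_id in known_users:
        # Already provisioned locally: do not recreate Space/chat state.
        return "already-provisioned"
    known_users.add(user_id)
    return "provisioned"


def demo_invites():
    """Send two invites from the same user and report what happened."""
    client = FakeRoomClient()
    known_users = set()
    first = asyncio.run(
        handle_invite(client, known_users, "!a:example.org", "@u:example.org")
    )
    second = asyncio.run(
        handle_invite(client, known_users, "!b:example.org", "@u:example.org")
    )
    return first, second, client.joined
```

In this sketch both rooms end up joined, but only the first invite triggers provisioning.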
</completed_work>
<remaining_work>
- Add graceful degradation for backend/provider failures so `PlatformError` does not crash the Matrix process.
- Decide whether to upstream or separately push the required `external/platform-agent` patch (`1dca2c1`) that enables WebSocket `thread_id`.
- Optionally revisit the repeat-invite UX if already-known users should ever have their Space/chat tree reprovisioned.
- Optionally prepare a PR from `feat/matrix-direct-agent-prototype`.
</remaining_work>
<decisions_made>
- Keep the prototype in this repo, not a separate Matrix-only repo.
- Keep Matrix adapter logic intact and absorb backend differences inside `sdk/`.
- Split the real backend into `AgentSessionClient` and `PrototypeStateStore` behind `RealPlatformClient`.
- Patch only `platform-agent` for per-thread memory instead of changing both `agent` and `agent_api`.
- Use a serialized collision-safe thread key because Matrix user IDs contain colons.
- For repeat invites, join the room but do not recreate Space/chat state if the user is already provisioned locally.
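
To illustrate the thread-key decision: Matrix user IDs such as `@alice:example.org` contain colons, so joining the parts with `:` can map two different pairs to the same key. One possible collision-safe serialization is a JSON array, sketched below. The helper names and the choice of JSON are assumptions; the format actually used in `sdk/agent_session.py` is not shown here.

```python
import json


def naive_thread_key(user_id: str, chat_id: str) -> str:
    # Ambiguous: the delimiter also appears inside Matrix user IDs.
    return f"{user_id}:{chat_id}"


def make_thread_key(user_id: str, chat_id: str) -> str:
    # JSON quotes and escapes each part, so the part boundaries stay unambiguous.
    return json.dumps([user_id, chat_id], separators=(",", ":"))


def parse_thread_key(key: str) -> tuple[str, str]:
    # Inverse of make_thread_key: recover the original pair.
    user_id, chat_id = json.loads(key)
    return user_id, chat_id
```

With the naive form, `("@a:b", "c")` and `("@a", "b:c")` both become `@a:b:c`; the serialized form keeps them distinct and round-trips back to the original pair.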
</decisions_made>
<blockers>
- Technical: provider/backend errors still crash the Matrix bot instead of returning a user-facing failure reply.
- External: the required `platform-agent` patch exists only in the local clone under `external/` and is not yet upstream.
- Operational: credentials used during manual bring-up were exposed in-session and should be rotated.
</blockers>
<context>
The important mental model is stable. `platform/master` is still not the backend for surfaces, so the working prototype goes directly to `platform-agent` over `/agent_ws/`. The live setup that worked was:
- `surfaces-bot` branch: `feat/matrix-direct-agent-prototype`
- Matrix bot env: `MATRIX_PLATFORM_BACKEND=real`, `AGENT_WS_URL=ws://127.0.0.1:8000/agent_ws/`
- patched local `external/platform-agent` with `thread_id` support
- provider configured through OpenRouter using model `qwen/qwen3.5-122b-a10b`
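
As a rough sketch of how the two environment variables above could drive backend selection: the `load_backend_config` helper, the `"mock"` default, and the URL check are all assumptions for illustration, not the dispatcher's actual logic.

```python
import os


def load_backend_config(env=None):
    """Read backend settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # The "mock" default is an assumption; the real default is not shown here.
    backend = env.get("MATRIX_PLATFORM_BACKEND", "mock")
    if backend != "real":
        return {"backend": backend, "agent_ws_url": None}
    url = env.get("AGENT_WS_URL", "")
    if not url.startswith(("ws://", "wss://")):
        raise ValueError("AGENT_WS_URL must be a ws:// or wss:// URL")
    return {"backend": "real", "agent_ws_url": url}
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the process environment.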
Important files:
- `sdk/agent_session.py`
- `sdk/prototype_state.py`
- `sdk/real.py`
- `adapter/matrix/bot.py`
- `adapter/matrix/handlers/auth.py`
- `docs/matrix-direct-agent-prototype-ru.md`
Important local-only dependency:
- `external/platform-agent` commit `1dca2c1` (`feat: support websocket thread ids`)
Likely running background process at pause time:
- local `platform-agent` server on port 8000, PID 13499
</context>
<next_action>
Start with the failure path: catch `PlatformError` around Matrix message handling so a bad provider response becomes a normal reply like “backend unavailable, try again later” instead of killing the process. After that, either upstream `external/platform-agent` commit `1dca2c1` or document it as an explicit prerequisite in the platform repo.
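
A minimal sketch of that failure path, under stated assumptions: the `PlatformError` class below is a local stand-in for the repo's exception, and `safe_handle_message` and the two demo backends are hypothetical names rather than the existing handler API.

```python
import asyncio


class PlatformError(Exception):
    """Local stand-in for the repo's PlatformError (real definition not shown)."""


FALLBACK_REPLY = "backend unavailable, try again later"


async def safe_handle_message(send_to_agent, text):
    """Forward text to the backend; degrade to a normal reply on failure."""
    try:
        return await send_to_agent(text)
    except PlatformError:
        # Swallow only the known backend failure so the bot process keeps running.
        return FALLBACK_REPLY


async def failing_backend(text):
    # Simulates a provider failure surfacing as PlatformError.
    raise PlatformError("provider returned an error")


async def echo_backend(text):
    # Simulates a healthy backend.
    return f"echo: {text}"
```

Catching only `PlatformError`, rather than a bare `Exception`, keeps unrelated bugs visible while still turning provider failures into a user-facing reply.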
</next_action>