diff --git a/.planning/phases/05-mvp-deployment/05-RESEARCH.md b/.planning/phases/05-mvp-deployment/05-RESEARCH.md new file mode 100644 index 0000000..10ba8f8 --- /dev/null +++ b/.planning/phases/05-mvp-deployment/05-RESEARCH.md @@ -0,0 +1,670 @@ +# Phase 05: MVP Deployment — Research + +**Researched:** 2026-04-27 +**Domain:** Matrix bot deployment — config refactor, DM-first onboarding, file transfer, docker-compose prod topology +**Confidence:** HIGH (all findings verified against actual codebase) + +--- + + +## User Constraints (from CONTEXT.md) + +### Locked Decisions + +**Single-chat architecture** +- D-01: chat_id=0 for all messages. One agent context per user. `!clear` resets context. +- D-02: Delete all multi-room infrastructure: C1/C2/C3, `!new`, `!archive`, `!rename`, Space-creation, room provisioning. Matrix bot operates only in DM room. +- D-03: Delete `!save` and `!load` — unreliable without persistent memory in agent. + +**Onboarding (DM-first)** +- D-04: On DM invite — accept, send welcome: "Привет! Я Lambda AI-агент. Просто напиши — и я отвечу. `!clear` чтобы начать новый разговор, `!context` чтобы посмотреть статус." +- D-05: No Space, no child rooms. All conversation in one DM room. + +**!clear (new command)** +- D-06: Reset agent context — close current AgentApi connection and create new (`await agent.close()` + `await agent.connect()`). Confirm: "Контекст сброшен. Начнём с чистого листа." +- D-11: No confirmation dialog — immediate reset. + +**!agent command** +- D-07: Delete completely. user→agent mapping is static from config. + +**Agent config (config/matrix-agents.yaml)** +- D-02 (config): Extend current matrix-agents.yaml — add user_agents dict and base_url/workspace_path fields per agent. +- D-03 (schema): AgentDefinition gains `base_url: str` and `workspace_path: str`. AgentRegistry adds `user_agents: dict[matrix_user_id, agent_id]` and `get_agent_id_by_user(matrix_user_id)`. 
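A minimal sketch of how the extended config from D-02/D-03 (config) might be turned into a registry, assuming the YAML has already been loaded into a plain dict (for example via `yaml.safe_load`). The `parse_registry` helper and its fail-fast validation are illustrative assumptions, not existing code in `agent_registry.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentDefinition:
    agent_id: str
    label: str
    base_url: str
    workspace_path: str


class AgentRegistry:
    def __init__(self, agents, user_agents):
        self.agents = tuple(agents)
        self._by_id = {a.agent_id: a for a in self.agents}
        self.user_agents = dict(user_agents)  # Matrix user_id -> agent_id

    def get_agent_id_by_user(self, matrix_user_id):
        return self.user_agents.get(matrix_user_id)


def parse_registry(raw: dict) -> AgentRegistry:
    """Hypothetical helper: build a registry from an already-loaded mapping."""
    agents = [
        AgentDefinition(
            agent_id=item["id"],
            label=item.get("label", item["id"]),
            base_url=item["base_url"],
            workspace_path=item["workspace_path"],
        )
        for item in raw.get("agents", [])
    ]
    known = {a.agent_id for a in agents}
    user_agents = raw.get("user_agents") or {}
    # Assumed validation: fail fast if a user points at an undefined agent.
    for user_id, agent_id in user_agents.items():
        if agent_id not in known:
            raise ValueError(f"{user_id} is mapped to unknown agent {agent_id!r}")
    return AgentRegistry(agents, user_agents)


registry = parse_registry({
    "user_agents": {"@user0:matrix.lambda.coredump.ru": "agent-0"},
    "agents": [{
        "id": "agent-0",
        "label": "Agent 0",
        "base_url": "ws://lambda.coredump.ru:7000/agent_0/",
        "workspace_path": "/agents/0/",
    }],
})
```

Validating the mapping at load time means a typo in `user_agents` surfaces at startup rather than as a routing error on the first message.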
+ +**Routing user → agent in _build_platform_from_env** +- D-04 (routing): Per-agent URL from config instead of global AGENT_BASE_URL. `_build_platform_from_env` builds delegates with correct base_url per agent. `RoutedPlatformClient._resolve_delegate` uses user_agents from registry. + +**Incoming files (user → agent)** +- D-05 (files): Path inside agent workspace: `incoming/{filename}`. Absolute: `{workspace_path}/incoming/{filename}`. Update `files.py`: `build_workspace_attachment_path` takes agent workspace_path and builds `incoming/{filename}`. Pass to `agent.send_message()` as `attachments=["incoming/{filename}"]` (relative to /workspace). +- D-06 (files): workspace_path is taken from AgentDefinition by user's agent_id. + +**Outgoing files (agent → user)** +- D-07 (files): On `MsgEventSendFile(path="output/report.pdf")` — read from `{workspace_path}/{path}`. Send as Matrix file message. + +**docker-compose for prod** +- D-08: `docker-compose.prod.yml` includes: Matrix bot + agent container (placeholder image `lambda-agent:latest`) + named volume `agents`. +- D-09: Named volume `agents` mounted in Matrix bot as `/agents/` and in agent container as `/workspace`. Env vars from `.env.prod`. Start: `docker compose -f docker-compose.prod.yml up`. + +**Unauthorized users** +- D-10: If Matrix user_id not in `user_agents` — accept invite, reply: "К вашему аккаунту не привязан агент. Напишите @og_mput в Telegram для получения доступа." Ignore further messages (or repeat message). + +**!settings and other settings commands** +- D-12: Delete `!settings`, `!settings soul`, `!settings skills`, `!settings safety`. 
+ +### Claude's Discretion +- MATRIX_AGENT_REGISTRY_PATH — keep as env var for config path (already exists) +- Format of .env.prod +- Group room invites (non-DM) — reject automatically +- Existing Space+rooms for old users — ignore, do not migrate + +### Deferred Ideas (OUT OF SCOPE) +- platform-master integration (dynamic `get_agent_url` via POST /api/v1/create) — when feat/storage is ready +- !agent as admin-override — not needed for MVP +- Per-chat context isolation via different chat_id (currently chat_id=0) — waiting for platform signal + + +--- + +## Summary + +Phase 05 is a code-and-config refactor of the existing Matrix adapter. There is no new framework to learn — the full stack (matrix-nio, AgentApi, docker-compose) is already in use. The work is: (1) simplify the data model from multi-room to single DM room per user, (2) extend AgentRegistry with per-user routing and per-agent URLs/paths, (3) reroute file I/O to the shared `/agents/` volume, (4) write a prod docker-compose, and (5) delete substantial legacy code (Space provisioning, C1/C2/C3, !agent, !save, !load, !settings). + +The current codebase has 35 failing tests (pre-existing on `feat/deploy`), mostly in `test_dispatcher.py`, `test_invite_space.py`, `test_routed_platform.py` — all testing behaviors that Phase 05 will delete or replace. New tests must cover the simplified DM-first invite flow, the user_agents lookup path, and the new file path logic. Existing passing tests (203) must stay green. + +**Primary recommendation:** Execute as three sequential mini-plans: (A) config/registry extension + routing, (B) DM-first onboarding + !clear + legacy deletion, (C) file transfer + docker-compose.prod.yml + .env.prod. + +--- + +## Standard Stack + +All libraries are already installed and in use. No new dependencies. 
+ +### Core (already in pyproject.toml) + +| Library | Version | Purpose | Source | +|---------|---------|---------|--------| +| matrix-nio | installed | Matrix client — join rooms, send messages, upload files | [VERIFIED: adapter/matrix/bot.py imports] | +| pyyaml | installed | YAML config parsing in AgentRegistry | [VERIFIED: agent_registry.py line 7] | +| aiohttp | installed | WebSocket transport inside AgentApi | [VERIFIED: external/platform-agent_api/lambda_agent_api/agent_api.py] | +| structlog | installed | Structured logging | [VERIFIED: bot.py imports] | +| python-dotenv | installed | .env loading | [VERIFIED: bot.py line 79] | + +### AgentApi (external, local path) + +`external/platform-agent_api/lambda_agent_api/agent_api.py` — imported via `sdk/upstream_agent_api.py` which patches `sys.path`. + +**Verified constructor signature** [VERIFIED: agent_api.py]: +```python +AgentApi( + agent_id: str, + base_url: str, # ws://host:port/agent_N/ + callback: Optional[Callable] = None, + on_disconnect: Optional[Callable[["AgentApi"], None]] = None, + chat_id: int = 0, +) +``` + +**Key AgentApi facts** [VERIFIED: agent_api.py]: +- `self.url = urljoin(base_url, f"v1/agent_ws/{chat_id}/")` — builds WebSocket URL automatically from base_url + chat_id +- `await agent.connect()` — must be called before `send_message()` +- `await agent.close()` — explicit close; triggers `on_disconnect` callback, drains queue +- `async for event in agent.send_message(text, attachments=["incoming/file.pdf"])` — attachments are paths relative to `/workspace` +- `agent.id` attribute (not `agent_id`) — used as dict key in connection pool + +**Lifecycle for !clear** [VERIFIED: agent_api.py `close()` + `connect()`]: +Close → triggers `on_disconnect` → removes from pool → next message recreates. Or: for an immediately-reset flow, call `close()` then `connect()` on the same instance (safe — `_connected` flag is reset in `_cleanup()`). 
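The close-then-connect lifecycle described above can be sketched with a stand-in object. `FakeAgent` is a hypothetical stub that only mimics the `connect()` / `close()` / `on_disconnect` behaviour listed in this section; it is not the real `AgentApi`, and `reset_agent` is an illustrative helper name.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in mimicking the connect/close behaviour described above."""

    def __init__(self, agent_id, on_disconnect=None):
        self.id = agent_id
        self._on_disconnect = on_disconnect
        self._connected = False
        self.connect_count = 0

    async def connect(self):
        self._connected = True
        self.connect_count += 1

    async def close(self):
        # Flag is reset on close, as this section says _cleanup() does.
        self._connected = False
        if self._on_disconnect is not None:
            self._on_disconnect(self)


async def reset_agent(agent):
    """Immediate reset per D-06: close, then reconnect the same instance."""
    await agent.close()
    await agent.connect()
    return "Контекст сброшен. Начнём с чистого листа."


disconnected = []
agent = FakeAgent("agent-0", on_disconnect=lambda a: disconnected.append(a.id))


async def demo():
    await agent.connect()
    return await reset_agent(agent)


reply = asyncio.run(demo())
```

Note that the sketch only exercises the client-side lifecycle; whether the remote agent actually forgets its context is a separate question.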
+ +--- + +## Architecture Patterns + +### Existing Code to Modify (not rewrite) + +``` +adapter/matrix/ + agent_registry.py — extend AgentDefinition + AgentRegistry + bot.py — _build_platform_from_env, handle_invite, _materialize_incoming_attachments + routed_platform.py — _resolve_delegate (add user_agents lookup) + files.py — build_workspace_attachment_path (new path logic) + room_router.py — resolve_chat_id (chat_id=0 for DM-first, no C1/C2/C3 lookup needed) + handlers/ + agent.py — DELETE or make no-op + auth.py — replace provision_workspace_chat with simple DM-accept + context_commands.py — DELETE make_handle_save, make_handle_load; keep make_handle_context + settings.py — DELETE or strip handle_settings, handle_settings_soul, etc. + __init__.py — unregister deleted commands + +config/ + matrix-agents.yaml — extend format + +docker-compose.prod.yml — new file +.env.prod — new file (or .env.example update) +``` + +### Pattern 1: AgentRegistry Extension + +Current `AgentDefinition` has only `agent_id` and `label`. New fields needed [VERIFIED: CONTEXT.md D-03]: + +```python +# adapter/matrix/agent_registry.py + +@dataclass(frozen=True) +class AgentDefinition: + agent_id: str + label: str + base_url: str # ws://lambda.coredump.ru:7000/agent_0/ + workspace_path: str # /agents/0/ + + +class AgentRegistry: + def __init__( + self, + agents: list[AgentDefinition], + user_agents: dict[str, str], # Matrix user_id -> agent_id + ) -> None: + self.agents = tuple(agents) + self._by_id = {agent.agent_id: agent for agent in self.agents} + self.user_agents = user_agents # NEW + + def get_agent_id_by_user(self, matrix_user_id: str) -> str | None: # NEW + return self.user_agents.get(matrix_user_id) +``` + +### Pattern 2: _build_platform_from_env with Per-Agent URLs + +Current code uses `_agent_base_url_from_env()` globally for all delegates [VERIFIED: bot.py lines 148-161]. 
New pattern:
+
+```python
+def _build_platform_from_env(*, store: StateStore, chat_mgr: ChatManager) -> PlatformClient:
+    backend = os.environ.get("MATRIX_PLATFORM_BACKEND", "mock").strip().lower()
+    if backend == "real":
+        prototype_state = PrototypeStateStore()
+        registry = _load_agent_registry_from_env(required=True)
+        assert registry is not None
+        delegates = {
+            agent.agent_id: RealPlatformClient(
+                agent_id=agent.agent_id,
+                agent_base_url=agent.base_url,  # PER-AGENT URL from config
+                prototype_state=prototype_state,
+                platform="matrix",
+            )
+            for agent in registry.agents
+        }
+        return RoutedPlatformClient(
+            chat_mgr=chat_mgr,
+            store=store,
+            delegates=delegates,
+            registry=registry,  # pass registry for user_agents lookup
+        )
+    return MockPlatformClient()
+```
+
+### Pattern 3: RoutedPlatformClient._resolve_delegate (user_agents lookup)
+
+The current implementation [VERIFIED: routed_platform.py lines 80-110] resolves the agent via `room_meta.get("agent_id")`, which requires the room to be pre-bound to an agent. The new DM-first model looks up agent_id from the `user_agents` dict by Matrix user_id.
+
+The `_resolve_delegate` signature receives `user_id` (Matrix user_id string) and `local_chat_id` (room_id in the DM-first model). New logic:
+
+```python
+async def _resolve_delegate(
+    self, user_id: str, local_chat_id: str
+) -> tuple[PlatformClient, str]:
+    # 1. Look up agent_id by Matrix user_id
+    agent_id = self._registry.get_agent_id_by_user(user_id)
+    if agent_id is None:
+        raise PlatformError(
+            f"no agent configured for user: {user_id}",
+            code="MATRIX_USER_NOT_CONFIGURED",
+        )
+    # 2. Get delegate
+    delegate = self._delegates.get(agent_id)
+    if delegate is None:
+        raise PlatformError(f"unknown agent: {agent_id}", code="MATRIX_AGENT_NOT_FOUND")
+    # 3. chat_id=0 always (single-chat arch, D-01)
+    return delegate, "0"
+```
+
+### Pattern 4: DM-First Invite Handler
+
+Replace `handle_invite` + `provision_workspace_chat` in `auth.py` [VERIFIED: auth.py lines 122-163]:
+
+```python
+# `registry` (AgentRegistry) must be threaded into the handler; see Pitfall 2.
+async def handle_invite(client, room, event, platform, store, auth_mgr, chat_mgr, registry) -> None:
+    matrix_user_id = getattr(event, "sender", "")
+    # Reject group rooms (non-DM): Claude's discretion.
+    # Best-effort only: `is_direct` may not be set on the room object (see note below),
+    # so the default of True means the invite is accepted.
+    is_dm = getattr(room, "is_direct", True)
+    if not is_dm:
+        await client.room_leave(room.room_id)
+        return
+
+    await client.join(room.room_id)
+
+    # Check authorization via the user_agents lookup
+    if registry.get_agent_id_by_user(matrix_user_id) is None:
+        await client.room_send(room.room_id, "m.room.message", {
+            "msgtype": "m.text",
+            "body": "К вашему аккаунту не привязан агент. Напишите @og_mput в Telegram для получения доступа."
+        })
+        return
+
+    # Idempotent: don't send welcome twice
+    meta = await get_room_meta(store, room.room_id)
+    if meta and meta.get("welcomed"):
+        return
+
+    await set_room_meta(store, room.room_id, {
+        "matrix_user_id": matrix_user_id,
+        "chat_id": "0",  # single-chat: chat_id=0 always
+        "welcomed": True,
+    })
+    await client.room_send(room.room_id, "m.room.message", {
+        "msgtype": "m.text",
+        "body": "Привет! Я Lambda AI-агент. Просто напиши — и я отвечу. !clear чтобы начать новый разговор, !context чтобы посмотреть статус."
+    })
+```
+
+**Note on is_direct detection:** matrix-nio's `InviteMemberEvent` does not expose `is_direct` directly. The `MatrixRoom` object has `room_type`; DM rooms created by the client have `join_rule = "invite"` and member count 2. A safer approach: accept all invites and check `user_agents` for authorization. Group room detection is a Claude's Discretion item; the simplest implementation is to not detect it in Phase 05 and only reject unauthorized users.
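Because the note recommends accepting every invite and authorizing purely by `user_agents`, the decision itself can be isolated as a pure function that is easy to unit-test without a Matrix client. This is a sketch under that assumption; `InviteAction` and `decide_invite_action` are hypothetical names, not existing code.

```python
from enum import Enum


class InviteAction(Enum):
    DENY = "deny"        # send the "no agent bound" message (D-10)
    WELCOME = "welcome"  # send the welcome text once (D-04)
    NOOP = "noop"        # already welcomed, stay silent


def decide_invite_action(sender, user_agents, room_meta):
    """Decide what to send after the bot has joined the room."""
    if sender not in user_agents:
        return InviteAction.DENY
    if room_meta and room_meta.get("welcomed"):
        return InviteAction.NOOP
    return InviteAction.WELCOME


user_agents = {"@user0:matrix.lambda.coredump.ru": "agent-0"}
first_invite = decide_invite_action("@user0:matrix.lambda.coredump.ru", user_agents, None)
repeat_invite = decide_invite_action(
    "@user0:matrix.lambda.coredump.ru", user_agents, {"welcomed": True}
)
stranger = decide_invite_action("@nobody:example.org", user_agents, None)
```

The handler then only maps each action to a `room_send` call, which keeps the Matrix I/O out of the logic under test.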
+ +### Pattern 5: File Path for Incoming Attachments + +Current `build_workspace_attachment_path` [VERIFIED: files.py lines 31-46] builds: +`surfaces/matrix/{safe_user}/{safe_room}/inbox/{stamp}-{filename}` + +New path needed [VERIFIED: CONTEXT.md D-05]: +`incoming/{filename}` (relative), absolute: `{workspace_path}/incoming/{filename}` + +New signature: +```python +def build_workspace_attachment_path( + *, + workspace_path: str, # agent's workspace_path from AgentDefinition, e.g. "/agents/0/" + filename: str, + timestamp: str | None = None, +) -> tuple[str, Path]: + """Returns (relative_path_for_agent, absolute_path_for_download).""" + stamp = timestamp or datetime.now(UTC).strftime("%Y%m%d-%H%M%S") + safe_name = _sanitize_component(filename) or "attachment.bin" + relative_path = f"incoming/{stamp}-{safe_name}" # relative to /workspace + absolute_path = Path(workspace_path) / relative_path + return relative_path, absolute_path +``` + +**Callers:** `download_matrix_attachment()` in files.py and `_materialize_incoming_attachments()` in bot.py. Both need to receive `workspace_path` (from `AgentDefinition`). The bot must resolve `agent_id` for the sender before downloading — requires `registry.get_agent_id_by_user(matrix_user_id)`. + +### Pattern 6: Outgoing Files (MsgEventSendFile handling) + +Current `send_message` in `sdk/real.py` [VERIFIED: real.py lines 88-98] already calls `_attachment_from_send_file_event` but the result goes into `MessageResponse.attachments` — which `OutgoingMessage.attachments` then carries. The `send_outgoing()` in bot.py [VERIFIED: bot.py lines 656-686] already handles `event.attachments` by resolving `attachment.workspace_path` via `resolve_workspace_attachment_path(workspace_root, ...)`. + +**Current problem:** `workspace_root` is `Path(os.environ.get("SURFACES_WORKSPACE_DIR", "/workspace"))` — a global, not per-agent. With shared volume `/agents/`, the agent workspace is `/agents/0/`, `/agents/1/`, etc. 
+ +**Fix strategy:** When processing `MsgEventSendFile(path="output/report.pdf")` for agent N, the absolute path is `/agents/N/output/report.pdf`. The `workspace_path` stored in `Attachment` (from `_attachment_from_send_file_event`) is `"output/report.pdf"`. The `workspace_root` passed to `resolve_workspace_attachment_path` must be the agent's `workspace_path` (e.g. `/agents/0/`). + +**Two options:** +1. Store absolute path directly in `Attachment.workspace_path` (simplest — no env var needed) +2. Pass per-agent workspace_root through context + +Option 1 is simpler: in `_attachment_from_send_file_event`, when building `Attachment`, set `workspace_path` to the absolute path (`{agent_workspace_path}/output/report.pdf`). The `resolve_workspace_attachment_path` function already handles absolute paths [VERIFIED: files.py line 87-90: `if path.is_absolute(): return path`]. + +This means `RealPlatformClient` needs to know the agent's `workspace_path` — pass it in constructor. + +### Pattern 7: !clear Command + +New handler in `context_commands.py` (or new `clear.py`): + +```python +def make_handle_clear(agent_pool: dict[str, AgentApi]): + async def handle_clear(event: IncomingCommand, auth_mgr, platform, chat_mgr, settings_mgr): + # The "platform" here is RoutedPlatformClient. + # Need to access the underlying RealPlatformClient and its AgentApi. + # Two approaches: + # A) Give RoutedPlatformClient a reset_agent(user_id) method + # B) Access delegate directly via platform._delegates[agent_id] + agent_id = platform._registry.get_agent_id_by_user(event.user_id) + if agent_id and agent_id in platform._delegates: + delegate = platform._delegates[agent_id] + await delegate.reset_agent() # new method on RealPlatformClient + return [OutgoingMessage(chat_id=event.chat_id, text="Контекст сброшен.")] + return handle_clear +``` + +**reset_agent() on RealPlatformClient:** Close the active AgentApi connection. 
Since `RealPlatformClient` currently creates a fresh `AgentApi` per request (see `_build_chat_api` — no connection pool) [VERIFIED: real.py lines 173-178], there's nothing to close. The reset is implicit — the next `send_message` creates a fresh `AgentApi(chat_id="0")` which reconnects. + +**However:** `chat_id="0"` is a string in `RealPlatformClient._build_chat_api` [VERIFIED: real.py line 177: `chat_id=str(chat_id)`], but `AgentApi` constructor takes `chat_id: int = 0`. The `urljoin(base_url, f"v1/agent_ws/{chat_id}/")` call will produce `v1/agent_ws/0/` regardless. + +**Actual reset mechanism with current RealPlatformClient:** Since a new AgentApi is created per `send_message()` call (stateless client pattern), the "context" is held in the remote agent's `MemorySaver`. True reset = reconnect at the agent side. The `!reset` command already does `disconnect_chat` [VERIFIED: context_commands.py `make_handle_reset`]. The `!clear` can reuse this pattern: call `platform.disconnect_chat("0")` if available, or simply confirm immediately (MemorySaver resets on next connection with a fresh `chat_id` key — but chat_id=0 is always 0, so MemorySaver persists across connections). + +**Implication:** True context reset with MemorySaver requires the agent to restart or use a different chat_id. For Phase 05 MVP, `!clear` can: (a) confirm to user "Контекст сброшен." and (b) note this is best-effort until agent side supports it. This matches D-11 (immediate, no confirmation dialog). + +### Pattern 8: docker-compose.prod.yml + +```yaml +services: + matrix-bot: + image: surfaces-bot:latest + build: . + env_file: .env.prod + volumes: + - agents:/agents/ + - ./config:/app/config:ro + restart: unless-stopped + + agent-0: + image: lambda-agent:latest + env_file: .env.prod + environment: + AGENT_ID: "agent-0" + volumes: + - agents:/workspace + restart: unless-stopped + +volumes: + agents: + driver: local +``` + +**Note:** `lambda-agent:latest` is a placeholder image name per D-08. 
The platform team owns the actual image.
+
+### Anti-Patterns to Avoid
+
+- **Do not introduce a long-lived AgentApi connection pool**: the current `RealPlatformClient` creates a fresh `AgentApi` per request (stateless), which is the intended pattern. Don't change it for Phase 05.
+- **Do not add chat_id logic**: single-chat arch means chat_id=0 always. Any code that increments or stores platform_chat_ids in room_meta is legacy being deleted.
+- **Do not try to detect is_direct at invite time via matrix-nio**: the library's InviteMemberEvent doesn't expose this reliably. Accept all invites, authorize by user_agents lookup.
+- **Do not change the sdk/real.py AgentApi constructor call**: `_build_chat_api` uses `chat_id=str(chat_id)`. Keep as is; the value is only interpolated into the WebSocket URL, so the string `"0"` yields the same path as the integer `0`.
+
+---
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| File upload to Matrix | Custom HTTP multipart | `client.upload(handle, content_type, filename, filesize)` | matrix-nio provides this; already used in bot.py send_outgoing |
+| Matrix file message | Custom m.room.message | `client.room_send(room_id, "m.room.message", {"msgtype": "m.file", ...})` | Already implemented in send_outgoing |
+| YAML parsing | Custom parser | `yaml.safe_load()` (already in agent_registry.py) | Already works; just extend the schema |
+| WebSocket to agent | Custom aiohttp ws | `AgentApi` from external/platform-agent_api | Already used via sdk/real.py |
+
+---
+
+## Common Pitfalls
+
+### Pitfall 1: `_materialize_incoming_attachments` uses global SURFACES_WORKSPACE_DIR
+
+**What goes wrong:** The bot downloads the file to `/workspace/surfaces/matrix/...` (old path) when it should write to `/agents/0/incoming/...`.
+**Why it happens:** `_materialize_incoming_attachments` in bot.py [VERIFIED: bot.py line 449] reads the `SURFACES_WORKSPACE_DIR` env var. In prod this would need to be `/agents/`, but the per-user path varies.
+**How to avoid:** Pass the agent's `workspace_path` (from `AgentDefinition`) into `download_matrix_attachment`. The bot must resolve `matrix_user_id → agent_id → AgentDefinition.workspace_path` before calling download. The `registry` object is available in `build_runtime()` but not currently threaded into `MatrixBot._materialize_incoming_attachments`. Either (a) store registry on `MatrixRuntime`, or (b) pass it into `MatrixBot.__init__`. + +### Pitfall 2: AgentRegistry reference not available in handlers + +**What goes wrong:** `handle_invite`, `_check_agent_routing`, `_materialize_incoming_attachments` all need the registry to look up user_agents. Currently registry is loaded in `build_runtime()` and passed only to `register_matrix_handlers`. +**Why it happens:** `MatrixBot` doesn't store the registry. Only the dispatcher gets it. +**How to avoid:** Store `registry: AgentRegistry | None` on `MatrixRuntime`. Thread it into `MatrixBot`. + +### Pitfall 3: Existing tests test behaviors being deleted + +**What goes wrong:** 35 currently failing tests (pre-existing) test Space provisioning, !agent, C1/C2/C3, !save/!load. After deletion, these tests must be deleted or replaced. +**Why it happens:** The test suite was written for the old multi-room architecture. +**How to avoid:** Plan explicitly identifies which test files to delete/rewrite: +- Delete: `test_invite_space.py`, `test_agent_handler.py`, `test_chat_space.py` +- Rewrite: `test_dispatcher.py` (large — slim to DM-first behavior), `test_routed_platform.py` (update to user_agents lookup) +- Update: `test_files.py` (new path format) +- Keep: `test_converter.py`, `test_store.py`, `test_restart_persistence.py`, `test_routing_enforcement.py`, `test_context_commands.py` (partial) + +### Pitfall 4: `resolve_chat_id` returns C1/C2/C3 chat IDs + +**What goes wrong:** `room_router.resolve_chat_id` [VERIFIED: room_router.py] reads `room_meta.get("chat_id")`. Old room_meta stores `"C1"`, `"C2"` etc. 
In DM-first model, chat_id is always `"0"`. +**How to avoid:** Update `set_room_meta` calls in the new invite handler to set `"chat_id": "0"`. The `resolve_chat_id` function can remain as-is — it will return `"0"` when that's what's stored. + +### Pitfall 5: `RoutedPlatformClient._resolve_delegate` expects room_meta with agent_id + +**What goes wrong:** Current `_resolve_delegate` [VERIFIED: routed_platform.py lines 80-110] reads `room_meta.get("agent_id")` — requires the room to have been pre-bound. In DM-first model with user_agents lookup, rooms are never explicitly bound. +**How to avoid:** Replace the agent_id lookup with `registry.get_agent_id_by_user(user_id)`. The `user_id` parameter is the Matrix user_id string, which is already passed into `send_message()` / `stream_message()`. + +### Pitfall 6: `RealPlatformClient` needs workspace_path for outgoing file resolution + +**What goes wrong:** When agent emits `MsgEventSendFile(path="output/report.pdf")`, the current `_attachment_from_send_file_event` strips `/workspace/` prefix [VERIFIED: real.py lines 207-218] leaving `"output/report.pdf"`. Then `send_outgoing` in bot.py resolves it with `SURFACES_WORKSPACE_DIR` — which doesn't know which agent's workspace to use. +**How to avoid:** Add `workspace_path: str` to `RealPlatformClient.__init__`. In `_attachment_from_send_file_event`, build absolute path: `Path(workspace_path) / event.path`. Store absolute path in `Attachment.workspace_path`. `resolve_workspace_attachment_path` already returns absolute paths unchanged [VERIFIED: files.py line 87-90]. + +### Pitfall 7: docker-compose.prod.yml volume mount collision + +**What goes wrong:** If `/agents/` named volume is used and the agent container also mounts it as `/workspace`, all agents share the same volume root. Agent-0 writes to `/workspace/output/`, Agent-1 also writes to `/workspace/output/` — collision. +**Why it happens:** Named volume `agents` is mounted as `/workspace` in ALL agent containers. 
+**How to avoid:** Give each agent container its own volume or subpath. Options: (1) a subpath mount of the shared volume via `volume.subpath`, which is only available in recent Compose releases (the Compose file reference lists it as introduced in v2.26.0, so verify the installed version); (2) separate named volumes per agent (`agents_0`, `agents_1`); (3) configure the agent container with `WORKSPACE_SUBDIR` so it uses `/workspace/{agent_id}/`. Per D-08 there is one placeholder agent container, so the final choice is a platform concern. For Phase 05 with a single placeholder, the simplest approach is one `agents` volume, mounted at `/workspace` in agent-0 and at `/agents/` in the bot, with `workspace_path: "/agents/0/"` in config. Note the path offset this creates: a file the bot writes to `/agents/0/incoming/` appears inside the agent container at `/workspace/0/incoming/`, not `/workspace/incoming/`.
+
+**Correct topology per deploy-architecture.md** [VERIFIED: docs/deploy-architecture.md]:
+- Volume `agents` mounted in bot as `/agents/`
+- Volume `agents` mounted in agent-0 as `/workspace`
+- Agent workspace_path in config: `/agents/0/`
+- Bot writes file to `/agents/0/incoming/photo.jpg`
+- Agent reads from `/workspace/0/incoming/photo.jpg`. This works if the agent container mounts the volume at `/workspace` and the volume root contains the `/0/` subdirectory.
+
+So: one named volume, mounted at `/agents/` in the bot and at `/workspace` in the agent, with the subdirectory `/0/` as the isolation boundary. **This requires the agent container to treat `/workspace/0/`, not `/workspace/`, as its workspace root**; otherwise the relative attachment path `incoming/{filename}` from D-05 resolves to `/workspace/incoming/` and misses the file. This is a platform concern. With the single-agent Phase 05 placeholder there is no cross-agent collision, but the path offset still has to be confirmed with the platform team.
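The path offset in this pitfall can be checked mechanically. The sketch below assumes the mounts from D-09 (`/agents` in the bot, `/workspace` in the agent) and that the agent resolves attachments strictly relative to `/workspace`; `to_agent_view` is a hypothetical helper used only for illustration.

```python
from pathlib import PurePosixPath


def to_agent_view(
    bot_path: str,
    bot_mount: str = "/agents",
    agent_mount: str = "/workspace",
) -> PurePosixPath:
    """Translate a path on the shared volume from the bot's mount to the agent's."""
    relative_to_volume = PurePosixPath(bot_path).relative_to(bot_mount)
    return PurePosixPath(agent_mount) / relative_to_volume


# Where the bot writes, given workspace_path "/agents/0/".
written_by_bot = "/agents/0/incoming/photo.jpg"
seen_by_agent = to_agent_view(written_by_bot)

# Where the agent looks if "incoming/photo.jpg" is relative to /workspace.
resolved_from_attachment = PurePosixPath("/workspace") / "incoming/photo.jpg"

paths_match = seen_by_agent == resolved_from_attachment
```

Under these assumptions the two paths differ by the `0/` segment, which is why the agent's effective workspace root needs to be agreed with the platform team.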
+ +--- + +## Code Examples + +### AgentApi usage (verified from source) + +```python +# Source: external/platform-agent_api/lambda_agent_api/agent_api.py + +agent = AgentApi( + agent_id="agent-0", + base_url="ws://lambda.coredump.ru:7000/agent_0/", + on_disconnect=lambda a: connected_agents.pop(a.id, None), + chat_id=0, +) +await agent.connect() # Must call before send_message + +async for event in agent.send_message("Hello", attachments=["incoming/photo.jpg"]): + if isinstance(event, MsgEventTextChunk): + print(event.text) + elif isinstance(event, MsgEventSendFile): + # event.path = "output/report.pdf" + abs_path = Path(agent_workspace_path) / event.path + +await agent.close() # Triggers on_disconnect +``` + +### Matrix file upload (verified from bot.py) + +```python +# Source: adapter/matrix/bot.py send_outgoing() + +with file_path.open("rb") as handle: + upload_response, _ = await client.upload( + handle, + content_type=attachment.mime_type or "application/octet-stream", + filename=attachment.filename or file_path.name, + filesize=file_path.stat().st_size, + ) +content_uri = upload_response.content_uri +await client.room_send(room_id, "m.room.message", { + "msgtype": "m.file", # or m.image, m.audio, m.video + "body": filename, + "url": content_uri, +}) +``` + +### YAML config extension (target format) + +```yaml +# config/matrix-agents.yaml (new format per D-02/D-03) + +user_agents: + "@user0:matrix.lambda.coredump.ru": agent-0 + "@user1:matrix.lambda.coredump.ru": agent-1 + +agents: + - id: agent-0 + label: "Agent 0" + base_url: "ws://lambda.coredump.ru:7000/agent_0/" + workspace_path: "/agents/0/" + + - id: agent-1 + label: "Agent 1" + base_url: "ws://lambda.coredump.ru:7000/agent_1/" + workspace_path: "/agents/1/" +``` + +--- + +## Runtime State Inventory + +> Phase includes refactoring but NOT renaming of string identifiers in user-facing data. Users interacting with the old multi-room bot will have SQLite room_meta records with old schema keys. 
+ +| Category | Items Found | Action Required | +|----------|-------------|------------------| +| Stored data (SQLite) | `lambda_matrix.db` (dev). Room meta records contain `chat_id: "C1"`, `space_id`, `redirect_room_id`, `agent_id` — from old multi-room flow. | No migration. D-05 says: ignore existing Space+rooms, do not migrate. New users get DM-first. Old users' DM rooms will lack `welcomed` key — first message in DM room triggers normal message dispatch path (acceptable). | +| Stored data (SQLite) | `selected_agent_id` key in user metadata — written by `!agent` command being deleted. | No migration needed. `!agent` is gone. The new routing uses `user_agents` from YAML config. Old `selected_agent_id` values are orphaned but harmless. | +| Live service config | No external services with stored config (no n8n, no Datadog). | None. | +| OS-registered state | None. Bot runs in Docker, no launchd/systemd registration. | None. | +| Secrets/env vars | `AGENT_BASE_URL` (global) → replaced by per-agent `base_url` in YAML. `SURFACES_WORKSPACE_DIR` (global workspace) → per-agent `workspace_path` from YAML. Both env vars become deprecated for prod but remain for backward compat in dev. | Update `.env.example`. Add `.env.prod` template. | +| Build artifacts | None in prod context. Local: `.venv`, `__pycache__` — unaffected. | None. 
| + +--- + +## Validation Architecture + +### Test Framework + +| Property | Value | +|----------|-------| +| Framework | pytest 9.0.2 + pytest-asyncio | +| Config file | `pyproject.toml` (`asyncio_mode = "auto"`) | +| Quick run command | `uv run pytest tests/adapter/matrix/ -q` | +| Full suite command | `uv run pytest tests/ -q` | + +### Current Test Status (pre-Phase-05) + +| File | Status | Disposition in Phase 05 | +|------|--------|-------------------------| +| test_converter.py | 14 passing | Keep as-is | +| test_files.py | 2 passing | Update for new path format | +| test_reactions.py | 2 passing | Keep as-is | +| test_restart_persistence.py | 5 passing | Keep; update if routing logic changes | +| test_routing_enforcement.py | 5 passing | Update for user_agents routing model | +| test_store.py | 2 passing | Keep as-is | +| test_agent_handler.py | failing (import?) | DELETE — !agent is deleted | +| test_agent_registry.py | failing (import?) | REWRITE — test new AgentDefinition schema | +| test_chat_space.py | failing | DELETE — Space provisioning deleted | +| test_confirm.py | failing | Keep or update | +| test_context_commands.py | 4 failing | REWRITE — !save/!load deleted; keep !context, add !clear | +| test_dispatcher.py | 20 failing | REWRITE — DM-first flow replaces multi-room | +| test_invite_space.py | 3 failing | DELETE and REPLACE with DM-first invite tests | +| test_routed_platform.py | 1 failing | REWRITE — user_agents lookup replaces room binding | +| test_send_outgoing.py | failing | REWRITE — per-agent workspace_path | + +### Phase Requirements → Test Map + +| Behavior | Test Type | Automated Command | Wave | +|----------|-----------|-------------------|------| +| AgentRegistry parses new YAML format (user_agents + base_url/workspace_path) | unit | `uv run pytest tests/adapter/matrix/test_agent_registry.py -x` | Wave 1 | +| Unauthorized user gets access-denied message on invite | unit | `uv run pytest tests/adapter/matrix/test_invite_dm.py -x` | 
Wave 2 | +| Authorized user gets welcome on DM invite | unit | `uv run pytest tests/adapter/matrix/test_invite_dm.py -x` | Wave 2 | +| Message from authorized user routes to correct delegate | unit | `uv run pytest tests/adapter/matrix/test_routed_platform.py -x` | Wave 2 | +| Incoming file saved to `incoming/{filename}` under agent workspace | unit | `uv run pytest tests/adapter/matrix/test_files.py -x` | Wave 3 | +| !clear command returns "Контекст сброшен." | unit | `uv run pytest tests/adapter/matrix/test_context_commands.py -x` | Wave 2 | +| Full suite green | integration | `uv run pytest tests/ -q` | Phase gate | + +### Wave 0 Gaps + +- [ ] `tests/adapter/matrix/test_invite_dm.py` — DM-first invite flow (new file) +- [ ] Updated `tests/adapter/matrix/test_agent_registry.py` — new schema + +*(All other existing test infrastructure is in place. No new framework install needed.)* + +--- + +## Environment Availability + +| Dependency | Required By | Available | Version | Fallback | +|------------|------------|-----------|---------|----------| +| uv / Python 3.11 | tests, bot run | ✓ | Python 3.11.9, pytest 9.0.2 | — | +| Docker | docker-compose.prod.yml | ✓ (assumed dev machine) | — | Manual install | +| matrix-nio | Matrix adapter | ✓ | installed in .venv | — | +| pyyaml | agent_registry.py | ✓ | installed (yaml import works in bot context) | — | +| lambda-agent:latest image | docker-compose.prod.yml | ✗ | placeholder — platform team owns | Use `build: ./external/platform-agent` for local testing | + +**Missing dependencies with no fallback:** +- `lambda-agent:latest` — docker-compose.prod.yml uses this as placeholder image. For actual testing, use `build: ./external/platform-agent` fallback or `image: busybox` stub. + +--- + +## Open Questions + +1. **is_direct detection for group room rejection (D-05, Claude's Discretion)** + - What we know: matrix-nio's `InviteMemberEvent` does not expose `is_direct` flag directly. 
The `MatrixRoom` type exposes the member count via `room.member_count` or `room.joined_members`. + - What's unclear: Whether InviteMemberEvent or MatrixRoom in nio exposes enough to reliably detect DM vs. group at invite time. + - Recommendation: In Phase 05, accept all invites and immediately check user_agents authorization. Non-DM group rooms where the bot is invited by an authorized user will also work (no harm). Add a `room.member_count <= 2` check if desired. + +2. **True !clear semantics with MemorySaver** + - What we know: `RealPlatformClient._build_chat_api` creates a new `AgentApi(chat_id="0")` per request. The agent's `MemorySaver` is keyed by `chat_id` — always `"0"`. So context is NOT cleared by reconnecting. + - What's unclear: Whether `!clear` should work "for real" (requires the platform to support a reset endpoint or a different chat_id) or just show a user-facing message (MVP-acceptable). + - Recommendation: Phase 05 sends "Контекст сброшен." ("Context reset.") immediately (D-11). Document the limitation. Actual context reset is a platform concern. + +3. **lambda-agent:latest image name** + - What we know: D-08 says "placeholder image `lambda-agent:latest` — confirm with Azamat". + - Recommendation: Use `lambda-agent:latest` as the image name in docker-compose.prod.yml. Add a comment indicating it's a placeholder. Provide a `build:` fallback pointing to `./external/platform-agent` for local dev validation. 
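Open question 2 can be illustrated with a small toy model. The sketch below is not the real `AgentApi` or `MemorySaver`; `ToyCheckpointStore`, `ToySession`, `cosmetic_clear`, and `real_clear` are hypothetical names introduced only to show why reconnecting under the same `chat_id` leaves history in place, and what a platform-side reset would have to do.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCheckpointStore:
    """Hypothetical stand-in for agent-side memory keyed by chat_id."""

    histories: dict[str, list[str]] = field(default_factory=dict)

    def append(self, chat_id: str, message: str) -> None:
        self.histories.setdefault(chat_id, []).append(message)

    def history(self, chat_id: str) -> list[str]:
        return list(self.histories.get(chat_id, []))

    def drop(self, chat_id: str) -> None:
        # Assumption: a platform reset endpoint would do something like this.
        self.histories.pop(chat_id, None)


class ToySession:
    """Hypothetical client connection; it holds no history of its own."""

    def __init__(self, store: ToyCheckpointStore, chat_id: str = "0") -> None:
        self.store = store
        self.chat_id = chat_id
        self.connected = True

    def send(self, text: str) -> None:
        self.store.append(self.chat_id, text)

    def history(self) -> list[str]:
        return self.store.history(self.chat_id)

    def close(self) -> None:
        self.connected = False


def cosmetic_clear(session: ToySession) -> ToySession:
    """Close and reconnect with the same chat_id: stored history survives."""
    session.close()
    return ToySession(session.store, session.chat_id)


def real_clear(session: ToySession) -> ToySession:
    """Drop the stored history before reconnecting (needs platform support)."""
    session.store.drop(session.chat_id)
    session.close()
    return ToySession(session.store, session.chat_id)
```

In this model `cosmetic_clear` matches the Phase 05 recommendation (the user sees a confirmation, the stored history is unchanged), while `real_clear` corresponds to the reset endpoint or alternate `chat_id` that the platform would need to provide.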
+--- + +## Assumptions Log + +| # | Claim | Section | Risk if Wrong | +|---|-------|---------|---------------| +| A1 | `lambda-agent:latest` is the agreed image name for the agent container | docker-compose section | docker-compose.prod.yml won't work; easy to fix by updating image name | +| A2 | Group room invite detection is not required for Phase 05 (DM-first only means "start in DM", not "reject group invites") | DM-first onboarding | If group room rejection IS required, need to investigate matrix-nio InviteMemberEvent structure | +| A3 | !clear in Phase 05 is cosmetic (shows "cleared" but MemorySaver persists until agent restart) | !clear section | User confusion if they expect real context reset | + +--- + +## Project Constraints (from CLAUDE.md) + +| Directive | Implication for Phase 05 | +|-----------|--------------------------| +| Platform calls go through `platform/interface.py` (Protocol) | RealPlatformClient stays the SDK boundary; AgentApi is internal to sdk/ | +| When wiring in the real SDK, change only `platform/mock.py` | Phase 05 touches `sdk/real.py` for workspace_path — acceptable, it's a refinement, not a rewrite | +| Hotfixes (< 20 lines) → Claude Code directly, not Codex | Phase 05 is >20 lines; must go through Codex via GSD | +| Implementation is done by codex:rescue | Plans must be in PLAN.md format so they can be passed to Codex | +| Never commit `.env` | `.env.prod` must be in `.gitignore` — only `.env.prod.example` is committed | +| `uv sync` for dependencies | No new pip installs; all deps already in pyproject.toml | +| pytest tests/ for tests | Phase gate: `uv run pytest tests/ -q` must be green | + +--- + +## Sources + +### Primary (HIGH confidence) +- [VERIFIED: adapter/matrix/agent_registry.py] — current AgentDefinition/AgentRegistry structure +- [VERIFIED: adapter/matrix/bot.py] — _build_platform_from_env, MatrixBot, handle_invite, _materialize_incoming_attachments +- [VERIFIED: adapter/matrix/routed_platform.py] — _resolve_delegate logic +- 
[VERIFIED: adapter/matrix/files.py] — build_workspace_attachment_path, download_matrix_attachment +- [VERIFIED: adapter/matrix/handlers/agent.py] — !agent handler (to be deleted) +- [VERIFIED: adapter/matrix/handlers/auth.py] — provision_workspace_chat (to be replaced) +- [VERIFIED: adapter/matrix/handlers/context_commands.py] — !save/!load/!reset handlers +- [VERIFIED: adapter/matrix/handlers/__init__.py] — handler registration +- [VERIFIED: sdk/real.py] — RealPlatformClient, _build_chat_api, _attachment_from_send_file_event +- [VERIFIED: sdk/upstream_agent_api.py] — sys.path patching, AgentApi import +- [VERIFIED: external/platform-agent_api/lambda_agent_api/agent_api.py] — actual AgentApi implementation +- [VERIFIED: config/matrix-agents.yaml] — current format +- [VERIFIED: docker-compose.yml] — existing dev compose topology +- [VERIFIED: .env.example] — current env var set +- [VERIFIED: docs/deploy-architecture.md] — prod topology spec +- [VERIFIED: .planning/phases/05-mvp-deployment/05-CONTEXT.md] — locked decisions + +### Secondary (MEDIUM confidence) +- [ASSUMED: A1] lambda-agent image name — from CONTEXT.md D-08 description +- [ASSUMED: A2] Group room handling scope — inferred from D-05 wording + +--- + +## Metadata + +**Confidence breakdown:** +- Standard stack: HIGH — all libraries verified in existing code +- Architecture patterns: HIGH — all patterns verified against actual source files +- Pitfalls: HIGH — all pitfalls derived from reading actual code, not from training assumptions +- Test strategy: HIGH — test files enumerated and statuses verified by running pytest + +**Research date:** 2026-04-27 +**Valid until:** 2026-05-27 (stable codebase; short-circuit if platform-agent_api changes)