Phase 05: MVP Deployment — Research
Researched: 2026-04-27
Domain: Matrix bot deployment (config refactor, DM-first onboarding, file transfer, docker-compose prod topology)
Confidence: HIGH (all findings verified against actual codebase)
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
Single-chat architecture
- D-01: chat_id=0 for all messages. One agent context per user. !clear resets context.
- D-02: Delete all multi-room infrastructure: C1/C2/C3, !new, !archive, !rename, Space-creation, room provisioning. Matrix bot operates only in DM room.
- D-03: Delete !save and !load: unreliable without persistent memory in agent.
Onboarding (DM-first)
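- D-04: On DM invite: accept, send welcome: "Привет! Я Lambda AI-агент. Просто напиши — и я отвечу. !clear чтобы начать новый разговор, !context чтобы посмотреть статус." (English: "Hi! I'm the Lambda AI agent. Just write and I'll reply. !clear to start a new conversation, !context to view status.")
- D-05: No Space, no child rooms. All conversation in one DM room.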
!clear (new command)
- D-06: Reset agent context: close the current AgentApi connection and create a new one (await agent.close() + await agent.connect()). Confirm: "Контекст сброшен. Начнём с чистого листа." (English: "Context reset. Let's start with a clean slate.")
- D-11: No confirmation dialog; the reset is immediate.
!agent command
- D-07: Delete completely. user→agent mapping is static from config.
Agent config (config/matrix-agents.yaml)
- D-02 (config): Extend current matrix-agents.yaml — add user_agents dict and base_url/workspace_path fields per agent.
- D-03 (schema): AgentDefinition gains base_url: str and workspace_path: str. AgentRegistry adds user_agents: dict[matrix_user_id, agent_id] and get_agent_id_by_user(matrix_user_id).
Routing user → agent in _build_platform_from_env
- D-04 (routing): Per-agent URL from config instead of global AGENT_BASE_URL. _build_platform_from_env builds delegates with the correct base_url per agent. RoutedPlatformClient._resolve_delegate uses user_agents from the registry.
Incoming files (user → agent)
- D-05 (files): Path inside agent workspace: incoming/{filename}. Absolute: {workspace_path}/incoming/{filename}. Update files.py:build_workspace_attachment_path so that it takes the agent's workspace_path and builds incoming/{filename}. Pass to agent.send_message() as attachments=["incoming/{filename}"] (relative to /workspace).
- D-06 (files): workspace_path is taken from AgentDefinition by the user's agent_id.
Outgoing files (agent → user)
- D-07 (files): On MsgEventSendFile(path="output/report.pdf"): read from {workspace_path}/{path}. Send as Matrix file message.
docker-compose for prod
- D-08: docker-compose.prod.yml includes: Matrix bot + agent container (placeholder image lambda-agent:latest) + named volume agents.
- D-09: Named volume agents mounted in Matrix bot as /agents/ and in agent container as /workspace. Env vars from .env.prod. Start: docker compose -f docker-compose.prod.yml up.
Unauthorized users
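- D-10: If the Matrix user_id is not in user_agents: accept invite, reply: "К вашему аккаунту не привязан агент. Напишите @og_mput в Telegram для получения доступа." (English: "No agent is linked to your account. Message @og_mput on Telegram to get access.") Ignore further messages (or repeat the message).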
!settings and other settings commands
- D-12: Delete !settings, !settings soul, !settings skills, !settings safety.
Claude's Discretion
- MATRIX_AGENT_REGISTRY_PATH — keep as env var for config path (already exists)
- Format of .env.prod
- Group room invites (non-DM) — reject automatically
- Existing Space+rooms for old users — ignore, do not migrate
Deferred Ideas (OUT OF SCOPE)
- platform-master integration (dynamic get_agent_url via POST /api/v1/create): when feat/storage is ready
- !agent as admin-override: not needed for MVP
- Per-chat context isolation via different chat_id (currently chat_id=0): waiting for platform signal
</user_constraints>
Summary
Phase 05 is a code-and-config refactor of the existing Matrix adapter. There is no new framework to learn — the full stack (matrix-nio, AgentApi, docker-compose) is already in use. The work is: (1) simplify the data model from multi-room to single DM room per user, (2) extend AgentRegistry with per-user routing and per-agent URLs/paths, (3) reroute file I/O to the shared /agents/ volume, (4) write a prod docker-compose, and (5) delete substantial legacy code (Space provisioning, C1/C2/C3, !agent, !save, !load, !settings).
The current codebase has 35 failing tests (pre-existing on feat/deploy), mostly in test_dispatcher.py, test_invite_space.py, test_routed_platform.py — all testing behaviors that Phase 05 will delete or replace. New tests must cover the simplified DM-first invite flow, the user_agents lookup path, and the new file path logic. Existing passing tests (203) must stay green.
Primary recommendation: Execute as three sequential mini-plans: (A) config/registry extension + routing, (B) DM-first onboarding + !clear + legacy deletion, (C) file transfer + docker-compose.prod.yml + .env.prod.
Standard Stack
All libraries are already installed and in use. No new dependencies.
Core (already in pyproject.toml)
| Library | Version | Purpose | Source |
|---|---|---|---|
| matrix-nio | installed | Matrix client — join rooms, send messages, upload files | [VERIFIED: adapter/matrix/bot.py imports] |
| pyyaml | installed | YAML config parsing in AgentRegistry | [VERIFIED: agent_registry.py line 7] |
| aiohttp | installed | WebSocket transport inside AgentApi | [VERIFIED: external/platform-agent_api/lambda_agent_api/agent_api.py] |
| structlog | installed | Structured logging | [VERIFIED: bot.py imports] |
| python-dotenv | installed | .env loading | [VERIFIED: bot.py line 79] |
AgentApi (external, local path)
external/platform-agent_api/lambda_agent_api/agent_api.py — imported via sdk/upstream_agent_api.py which patches sys.path.
Verified constructor signature [VERIFIED: agent_api.py]:
AgentApi(
    agent_id: str,
    base_url: str,  # ws://host:port/agent_N/
    callback: Optional[Callable] = None,
    on_disconnect: Optional[Callable[["AgentApi"], None]] = None,
    chat_id: int = 0,
)
Key AgentApi facts [VERIFIED: agent_api.py]:
- self.url = urljoin(base_url, f"v1/agent_ws/{chat_id}/") - builds the WebSocket URL automatically from base_url + chat_id
- await agent.connect() - must be called before send_message()
- await agent.close() - explicit close; triggers the on_disconnect callback, drains the queue
- async for event in agent.send_message(text, attachments=["incoming/file.pdf"]) - attachments are paths relative to /workspace
- agent.id attribute (not agent_id) - used as dict key in connection pool
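Because the WebSocket URL is derived with urljoin, the trailing slash on base_url matters. The following is a minimal sketch that only mirrors the expression quoted above; agent_ws_url is a hypothetical helper name, not part of AgentApi.

```python
from __future__ import annotations

from urllib.parse import urljoin


def agent_ws_url(base_url: str, chat_id: int | str = 0) -> str:
    """Mirror the quoted expression: urljoin(base_url, f"v1/agent_ws/{chat_id}/")."""
    return urljoin(base_url, f"v1/agent_ws/{chat_id}/")


# With a trailing slash the agent prefix is preserved.
with_slash = agent_ws_url("ws://lambda.coredump.ru:7000/agent_0/")

# Without it, urljoin replaces the last path segment and the prefix is lost.
without_slash = agent_ws_url("ws://lambda.coredump.ru:7000/agent_0")

print(with_slash)
print(without_slash)
```

If this holds for the real client, every base_url in matrix-agents.yaml should end with "/". The sketch also shows that an int and a string chat_id format to the same URL.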
Lifecycle for !clear [VERIFIED: agent_api.py close() + connect()]:
Close → triggers on_disconnect → removes from pool → next message recreates. Or: for an immediately-reset flow, call close() then connect() on the same instance (safe — _connected flag is reset in _cleanup()).
Architecture Patterns
Existing Code to Modify (not rewrite)
adapter/matrix/
    agent_registry.py        # extend AgentDefinition + AgentRegistry
    bot.py                   # _build_platform_from_env, handle_invite, _materialize_incoming_attachments
    routed_platform.py       # _resolve_delegate (add user_agents lookup)
    files.py                 # build_workspace_attachment_path (new path logic)
    room_router.py           # resolve_chat_id (chat_id=0 for DM-first, no C1/C2/C3 lookup needed)
    handlers/
        agent.py             # DELETE or make no-op
        auth.py              # replace provision_workspace_chat with simple DM-accept
        context_commands.py  # DELETE make_handle_save, make_handle_load; keep make_handle_context
        settings.py          # DELETE or strip handle_settings, handle_settings_soul, etc.
        __init__.py          # unregister deleted commands
config/
    matrix-agents.yaml       # extend format
docker-compose.prod.yml      # new file
.env.prod                    # new file (or .env.example update)
Pattern 1: AgentRegistry Extension
Current AgentDefinition has only agent_id and label. New fields needed [VERIFIED: CONTEXT.md D-03]:
# adapter/matrix/agent_registry.py
@dataclass(frozen=True)
class AgentDefinition:
    agent_id: str
    label: str
    base_url: str        # ws://lambda.coredump.ru:7000/agent_0/
    workspace_path: str  # /agents/0/


class AgentRegistry:
    def __init__(
        self,
        agents: list[AgentDefinition],
        user_agents: dict[str, str],  # Matrix user_id -> agent_id
    ) -> None:
        self.agents = tuple(agents)
        self._by_id = {agent.agent_id: agent for agent in self.agents}
        self.user_agents = user_agents  # NEW

    def get_agent_id_by_user(self, matrix_user_id: str) -> str | None:  # NEW
        return self.user_agents.get(matrix_user_id)
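The extended registry still has to be built from the YAML file. The sketch below assumes the mapping that yaml.safe_load() would return for the target format (keys user_agents, agents, id, label, base_url, workspace_path). registry_from_mapping is a hypothetical helper, and the check for dangling agent ids is an added suggestion, not something the current loader is known to do.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AgentDefinition:
    agent_id: str
    label: str
    base_url: str
    workspace_path: str


class AgentRegistry:
    def __init__(self, agents: list[AgentDefinition], user_agents: dict[str, str]) -> None:
        self.agents = tuple(agents)
        self._by_id = {agent.agent_id: agent for agent in self.agents}
        self.user_agents = dict(user_agents)

    def get_agent_id_by_user(self, matrix_user_id: str) -> str | None:
        return self.user_agents.get(matrix_user_id)


def registry_from_mapping(raw: Mapping[str, Any]) -> AgentRegistry:
    """Hypothetical helper: build a registry from an already-parsed config mapping."""
    agents = [
        AgentDefinition(
            agent_id=str(item["id"]),
            label=str(item.get("label", item["id"])),
            base_url=str(item["base_url"]),
            workspace_path=str(item["workspace_path"]),
        )
        for item in raw.get("agents", [])
    ]
    known_ids = {agent.agent_id for agent in agents}
    user_agents = {str(u): str(a) for u, a in (raw.get("user_agents") or {}).items()}
    # Fail at load time if a user is mapped to an agent that is not defined.
    dangling = sorted(a for a in user_agents.values() if a not in known_ids)
    if dangling:
        raise ValueError(f"user_agents references unknown agents: {dangling}")
    return AgentRegistry(agents, user_agents)


config = {
    "user_agents": {"@user0:matrix.lambda.coredump.ru": "agent-0"},
    "agents": [
        {
            "id": "agent-0",
            "label": "Agent 0",
            "base_url": "ws://lambda.coredump.ru:7000/agent_0/",
            "workspace_path": "/agents/0/",
        }
    ],
}
registry = registry_from_mapping(config)
```

Validating the mapping at load time means a typo in user_agents surfaces at startup rather than as MATRIX_AGENT_NOT_FOUND on the first message.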
Pattern 2: _build_platform_from_env with Per-Agent URLs
Current code uses _agent_base_url_from_env() globally for all delegates [VERIFIED: bot.py lines 148-161]. New pattern:
def _build_platform_from_env(*, store: StateStore, chat_mgr: ChatManager) -> PlatformClient:
    backend = os.environ.get("MATRIX_PLATFORM_BACKEND", "mock").strip().lower()
    if backend == "real":
        prototype_state = PrototypeStateStore()
        registry = _load_agent_registry_from_env(required=True)
        assert registry is not None
        delegates = {
            agent.agent_id: RealPlatformClient(
                agent_id=agent.agent_id,
                agent_base_url=agent.base_url,  # PER-AGENT URL from config
                prototype_state=prototype_state,
                platform="matrix",
            )
            for agent in registry.agents
        }
        return RoutedPlatformClient(
            chat_mgr=chat_mgr,
            store=store,
            delegates=delegates,
            registry=registry,  # pass registry for user_agents lookup
        )
    return MockPlatformClient()
Pattern 3: RoutedPlatformClient._resolve_delegate (user_agents lookup)
Current implementation [VERIFIED: routed_platform.py lines 80-110] resolves agent via room_meta.get("agent_id") — requires the room to be pre-bound to an agent. New DM-first model: look up agent_id from user_agents dict by Matrix user_id.
The _resolve_delegate signature receives user_id (Matrix user_id string) and local_chat_id (room_id in DM-first model). New logic:
async def _resolve_delegate(
    self, user_id: str, local_chat_id: str
) -> tuple[PlatformClient, str]:
    # 1. Look up agent_id by Matrix user_id
    agent_id = self._registry.get_agent_id_by_user(user_id)
    if agent_id is None:
        raise PlatformError(
            f"no agent configured for user: {user_id}",
            code="MATRIX_USER_NOT_CONFIGURED",
        )
    # 2. Get delegate
    delegate = self._delegates.get(agent_id)
    if delegate is None:
        raise PlatformError(f"unknown agent: {agent_id}", code="MATRIX_AGENT_NOT_FOUND")
    # 3. chat_id=0 always (single-chat arch, D-01)
    return delegate, "0"
Pattern 4: DM-First Invite Handler
Replace handle_invite + provision_workspace_chat in auth.py [VERIFIED: auth.py lines 122-163]:
async def handle_invite(client, room, event, platform, store, auth_mgr, chat_mgr, registry) -> None:
    # registry is the AgentRegistry; it has to be threaded into the handler (see Pitfall 2)
    matrix_user_id = getattr(event, "sender", "")
    # Reject group rooms (non-DM): Claude's discretion
    # The attribute may be absent on MatrixRoom, so default to True (see the note below)
    is_dm = getattr(room, "is_direct", True)
    if not is_dm:
        await client.room_leave(room.room_id)
        return
    await client.join(room.room_id)
    # Check authorization via the user_agents lookup
    if registry.get_agent_id_by_user(matrix_user_id) is None:
        await client.room_send(room.room_id, "m.room.message", {
            "msgtype": "m.text",
            "body": "К вашему аккаунту не привязан агент. Напишите @og_mput в Telegram для получения доступа."
        })
        return
    # Idempotent: don't send welcome twice
    meta = await get_room_meta(store, room.room_id)
    if meta and meta.get("welcomed"):
        return
    await set_room_meta(store, room.room_id, {
        "matrix_user_id": matrix_user_id,
        "chat_id": "0",  # single-chat: chat_id=0 always
        "welcomed": True,
    })
    await client.room_send(room.room_id, "m.room.message", {
        "msgtype": "m.text",
        "body": "Привет! Я Lambda AI-агент. Просто напиши — и я отвечу. !clear чтобы начать новый разговор, !context чтобы посмотреть статус."
    })
Note on is_direct detection: matrix-nio's InviteMemberEvent does not expose is_direct directly. The MatrixRoom object has room_type — DM rooms created by the client have join_rule = "invite" and member count 2. A safer approach: accept all invites, check user_agents for authorization. Group room detection is a Claude's Discretion item — the simplest implementation is to not detect it at phase 05 and only reject unauthorized users.
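The branching in the invite handler can be isolated into a pure function, which makes the planned test_invite_dm.py cases easy to write without a Matrix client. This is a sketch only: InviteAction and decide_invite_action are hypothetical names, and it follows the recommended variant that skips group-room detection.

```python
from __future__ import annotations

from enum import Enum


class InviteAction(Enum):
    DENY = "deny"        # join, then send the "no agent linked" notice (D-10)
    WELCOME = "welcome"  # join, send the welcome text, mark the room as welcomed
    NOOP = "noop"        # join only; the welcome was already sent


def decide_invite_action(
    matrix_user_id: str,
    user_agents: dict[str, str],
    room_meta: dict | None,
) -> InviteAction:
    """Decide what to do after joining, based on authorization and stored room meta."""
    if matrix_user_id not in user_agents:
        return InviteAction.DENY
    if room_meta and room_meta.get("welcomed"):
        return InviteAction.NOOP
    return InviteAction.WELCOME


user_agents = {"@user0:matrix.lambda.coredump.ru": "agent-0"}
print(decide_invite_action("@user0:matrix.lambda.coredump.ru", user_agents, None))
```

The handler would then perform the join and the room_send calls according to the returned action, keeping the I/O separate from the decision.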
Pattern 5: File Path for Incoming Attachments
Current build_workspace_attachment_path [VERIFIED: files.py lines 31-46] builds:
surfaces/matrix/{safe_user}/{safe_room}/inbox/{stamp}-{filename}
New path needed [VERIFIED: CONTEXT.md D-05]:
incoming/{filename} (relative), absolute: {workspace_path}/incoming/{filename}
New signature:
def build_workspace_attachment_path(
    *,
    workspace_path: str,  # agent's workspace_path from AgentDefinition, e.g. "/agents/0/"
    filename: str,
    timestamp: str | None = None,
) -> tuple[str, Path]:
    """Returns (relative_path_for_agent, absolute_path_for_download)."""
    stamp = timestamp or datetime.now(UTC).strftime("%Y%m%d-%H%M%S")
    safe_name = _sanitize_component(filename) or "attachment.bin"
    relative_path = f"incoming/{stamp}-{safe_name}"  # relative to /workspace
    absolute_path = Path(workspace_path) / relative_path
    return relative_path, absolute_path
Callers: download_matrix_attachment() in files.py and _materialize_incoming_attachments() in bot.py. Both need to receive workspace_path (from AgentDefinition). The bot must resolve agent_id for the sender before downloading — requires registry.get_agent_id_by_user(matrix_user_id).
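A self-contained version of the new path builder is sketched below so the behavior can be checked in isolation. The body of _sanitize_component here is an assumption (the real one in files.py was not quoted); it keeps only the last path component and a conservative character set. Note that the timestamp prefix goes slightly beyond the incoming/{filename} wording of D-05.

```python
from __future__ import annotations

import re
from datetime import datetime, timezone
from pathlib import Path


def _sanitize_component(name: str) -> str:
    """Assumed behavior: drop directories, keep [A-Za-z0-9._-], trim leading/trailing '.' and '_'."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    return re.sub(r"[^A-Za-z0-9._-]+", "_", base).strip("._")


def build_workspace_attachment_path(
    *, workspace_path: str, filename: str, timestamp: str | None = None
) -> tuple[str, Path]:
    """Return (path relative to the agent workspace, absolute path on the bot side)."""
    stamp = timestamp or datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    safe_name = _sanitize_component(filename) or "attachment.bin"
    relative_path = f"incoming/{stamp}-{safe_name}"
    return relative_path, Path(workspace_path) / relative_path


# A hostile filename cannot climb out of incoming/ because directories are dropped.
rel, abs_path = build_workspace_attachment_path(
    workspace_path="/agents/0/",
    filename="../../etc/passwd",
    timestamp="20260427-120000",
)
print(rel)
print(abs_path.as_posix())
```

The relative value is what would go into attachments=[...], while the absolute value is where the bot writes the downloaded bytes.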
Pattern 6: Outgoing Files (MsgEventSendFile handling)
Current send_message in sdk/real.py [VERIFIED: real.py lines 88-98] already calls _attachment_from_send_file_event but the result goes into MessageResponse.attachments — which OutgoingMessage.attachments then carries. The send_outgoing() in bot.py [VERIFIED: bot.py lines 656-686] already handles event.attachments by resolving attachment.workspace_path via resolve_workspace_attachment_path(workspace_root, ...).
Current problem: workspace_root is Path(os.environ.get("SURFACES_WORKSPACE_DIR", "/workspace")) — a global, not per-agent. With shared volume /agents/, the agent workspace is /agents/0/, /agents/1/, etc.
Fix strategy: When processing MsgEventSendFile(path="output/report.pdf") for agent N, the absolute path is /agents/N/output/report.pdf. The workspace_path stored in Attachment (from _attachment_from_send_file_event) is "output/report.pdf". The workspace_root passed to resolve_workspace_attachment_path must be the agent's workspace_path (e.g. /agents/0/).
Two options:
1. Store the absolute path directly in Attachment.workspace_path (simplest; no env var needed)
2. Pass per-agent workspace_root through context
Option 1 is simpler: in _attachment_from_send_file_event, when building Attachment, set workspace_path to the absolute path ({agent_workspace_path}/output/report.pdf). The resolve_workspace_attachment_path function already handles absolute paths [VERIFIED: files.py line 87-90: if path.is_absolute(): return path].
This means RealPlatformClient needs to know the agent's workspace_path — pass it in constructor.
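A minimal sketch of the absolute-path construction for Option 1 follows. absolute_outgoing_path is a hypothetical helper, and the ".." rejection is an added precaution that the source does not mention. It assumes agents report either workspace-relative paths or paths under /workspace.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def absolute_outgoing_path(workspace_path: str, event_path: str) -> PurePosixPath:
    """Map MsgEventSendFile.path to a bot-side absolute path under the agent's workspace."""
    root = PurePosixPath(workspace_path)
    rel = PurePosixPath(event_path)
    # The agent may report container-absolute paths; strip the /workspace prefix.
    if rel.is_absolute():
        rel = rel.relative_to("/workspace")  # raises ValueError for any other root
    # Refuse paths that try to climb out of the workspace.
    if ".." in rel.parts:
        raise ValueError(f"path escapes workspace: {event_path}")
    return root / rel


print(absolute_outgoing_path("/agents/0/", "output/report.pdf"))
print(absolute_outgoing_path("/agents/0/", "/workspace/output/report.pdf"))
```

Storing the result in Attachment.workspace_path would let the existing absolute-path branch of resolve_workspace_attachment_path return it unchanged.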
Pattern 7: !clear Command
New handler in context_commands.py (or new clear.py):
def make_handle_clear(agent_pool: dict[str, AgentApi]):
    async def handle_clear(event: IncomingCommand, auth_mgr, platform, chat_mgr, settings_mgr):
        # The "platform" here is RoutedPlatformClient.
        # Need to access the underlying RealPlatformClient and its AgentApi.
        # Two approaches:
        # A) Give RoutedPlatformClient a reset_agent(user_id) method
        # B) Access delegate directly via platform._delegates[agent_id]
        agent_id = platform._registry.get_agent_id_by_user(event.user_id)
        if agent_id and agent_id in platform._delegates:
            delegate = platform._delegates[agent_id]
            await delegate.reset_agent()  # new method on RealPlatformClient
        return [OutgoingMessage(chat_id=event.chat_id, text="Контекст сброшен.")]
    return handle_clear
reset_agent() on RealPlatformClient: Close the active AgentApi connection. Since RealPlatformClient currently creates a fresh AgentApi per request (see _build_chat_api — no connection pool) [VERIFIED: real.py lines 173-178], there's nothing to close. The reset is implicit — the next send_message creates a fresh AgentApi(chat_id="0") which reconnects.
However: chat_id="0" is a string in RealPlatformClient._build_chat_api [VERIFIED: real.py line 177: chat_id=str(chat_id)], but AgentApi constructor takes chat_id: int = 0. The urljoin(base_url, f"v1/agent_ws/{chat_id}/") call will produce v1/agent_ws/0/ regardless.
Actual reset mechanism with current RealPlatformClient: Since a new AgentApi is created per send_message() call (stateless client pattern), the "context" is held in the remote agent's MemorySaver. True reset = reconnect at the agent side. The !reset command already does disconnect_chat [VERIFIED: context_commands.py make_handle_reset]. The !clear can reuse this pattern: call platform.disconnect_chat("0") if available, or simply confirm immediately (MemorySaver resets on next connection with a fresh chat_id key — but chat_id=0 is always 0, so MemorySaver persists across connections).
Implication: True context reset with MemorySaver requires the agent to restart or use a different chat_id. For Phase 05 MVP, !clear can: (a) confirm to user "Контекст сброшен." and (b) note this is best-effort until agent side supports it. This matches D-11 (immediate, no confirmation dialog).
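Approach A (a reset_agent(user_id) method on the routed client) keeps the handler from reaching into private attributes. The sketch below uses stand-in classes; FakeDelegate and RoutedPlatformSketch are hypothetical and only illustrate the call shape, not the real RealPlatformClient or RoutedPlatformClient.

```python
from __future__ import annotations

import asyncio


class FakeDelegate:
    """Stand-in for RealPlatformClient; only counts reset calls."""

    def __init__(self) -> None:
        self.resets = 0

    async def reset_agent(self) -> None:
        # In the real client this is best-effort until the agent side supports a reset.
        self.resets += 1


class RoutedPlatformSketch:
    def __init__(self, delegates: dict[str, FakeDelegate], user_agents: dict[str, str]) -> None:
        self._delegates = delegates
        self._user_agents = user_agents

    async def reset_agent(self, user_id: str) -> bool:
        """Return True if a delegate was found for the user and asked to reset."""
        agent_id = self._user_agents.get(user_id)
        delegate = self._delegates.get(agent_id) if agent_id else None
        if delegate is None:
            return False
        await delegate.reset_agent()
        return True


async def handle_clear(platform: RoutedPlatformSketch, user_id: str) -> str:
    # Immediate reset with no confirmation dialog (D-11).
    await platform.reset_agent(user_id)
    return "Контекст сброшен."


delegate = FakeDelegate()
platform = RoutedPlatformSketch({"agent-0": delegate}, {"@u:hs": "agent-0"})
reply = asyncio.run(handle_clear(platform, "@u:hs"))
```

When the platform later exposes a real reset, only reset_agent on the delegate would need to change; the handler and its tests stay as they are.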
Pattern 8: docker-compose.prod.yml
services:
  matrix-bot:
    image: surfaces-bot:latest
    build: .
    env_file: .env.prod
    volumes:
      - agents:/agents/
      - ./config:/app/config:ro
    restart: unless-stopped
  agent-0:
    image: lambda-agent:latest
    env_file: .env.prod
    environment:
      AGENT_ID: "agent-0"
    volumes:
      - agents:/workspace
    restart: unless-stopped
volumes:
  agents:
    driver: local
Note: lambda-agent:latest is a placeholder image name per D-08. The platform team owns the actual image.
Anti-Patterns to Avoid
- Do not introduce a long-lived AgentApi connection pool: the current RealPlatformClient creates a fresh AgentApi per request (stateless), which is correct. Don't change this pattern for Phase 05.
- Do not add chat_id logic: single-chat arch means chat_id=0 always. Any code that increments or stores platform_chat_ids in room_meta is legacy being deleted.
- Do not try to detect is_direct at invite time via matrix-nio: the library's InviteMemberEvent doesn't expose this reliably. Accept all invites, authorize by user_agents lookup.
- Do not change the sdk/real.py AgentApi constructor call: _build_chat_api uses chat_id=str(chat_id). Keep as is; the AgentApi accepts string-coercible chat_id.
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| File upload to Matrix | Custom HTTP multipart | client.upload(handle, content_type, filename, filesize) | matrix-nio provides this; already used in bot.py send_outgoing |
| Matrix file message | Custom m.room.message | client.room_send(room_id, "m.room.message", {"msgtype": "m.file", ...}) | Already implemented in send_outgoing |
| YAML parsing | Custom parser | yaml.safe_load() (already in agent_registry.py) | Already works; just extend the schema |
| WebSocket to agent | Custom aiohttp ws | AgentApi from external/platform-agent_api | Already used via sdk/real.py |
Common Pitfalls
Pitfall 1: _materialize_incoming_attachments uses global SURFACES_WORKSPACE_DIR
What goes wrong: Bot downloads file to /workspace/surfaces/matrix/... (old path) when it should write to /agents/0/incoming/....
Why it happens: _materialize_incoming_attachments in bot.py [VERIFIED: bot.py line 449] reads SURFACES_WORKSPACE_DIR env var. In prod, this needs to be /agents/ — but the per-user path varies.
How to avoid: Pass the agent's workspace_path (from AgentDefinition) into download_matrix_attachment. The bot must resolve matrix_user_id → agent_id → AgentDefinition.workspace_path before calling download. The registry object is available in build_runtime() but not currently threaded into MatrixBot._materialize_incoming_attachments. Either (a) store registry on MatrixRuntime, or (b) pass it into MatrixBot.__init__.
Pitfall 2: AgentRegistry reference not available in handlers
What goes wrong: handle_invite, _check_agent_routing, _materialize_incoming_attachments all need the registry to look up user_agents. Currently registry is loaded in build_runtime() and passed only to register_matrix_handlers.
Why it happens: MatrixBot doesn't store the registry. Only the dispatcher gets it.
How to avoid: Store registry: AgentRegistry | None on MatrixRuntime. Thread it into MatrixBot.
Pitfall 3: Existing tests test behaviors being deleted
What goes wrong: 35 currently failing tests (pre-existing) test Space provisioning, !agent, C1/C2/C3, !save/!load. After deletion, these tests must be deleted or replaced.
Why it happens: The test suite was written for the old multi-room architecture.
How to avoid: The plan explicitly identifies which test files to delete or rewrite:
- Delete: test_invite_space.py, test_agent_handler.py, test_chat_space.py
- Rewrite: test_dispatcher.py (large; slim to DM-first behavior), test_routed_platform.py (update to user_agents lookup)
- Update: test_files.py (new path format)
- Keep: test_converter.py, test_store.py, test_restart_persistence.py, test_routing_enforcement.py, test_context_commands.py (partial)
Pitfall 4: resolve_chat_id returns C1/C2/C3 chat IDs
What goes wrong: room_router.resolve_chat_id [VERIFIED: room_router.py] reads room_meta.get("chat_id"). Old room_meta stores "C1", "C2" etc. In DM-first model, chat_id is always "0".
How to avoid: Update set_room_meta calls in the new invite handler to set "chat_id": "0". The resolve_chat_id function can remain as-is — it will return "0" when that's what's stored.
Pitfall 5: RoutedPlatformClient._resolve_delegate expects room_meta with agent_id
What goes wrong: Current _resolve_delegate [VERIFIED: routed_platform.py lines 80-110] reads room_meta.get("agent_id") — requires the room to have been pre-bound. In DM-first model with user_agents lookup, rooms are never explicitly bound.
How to avoid: Replace the agent_id lookup with registry.get_agent_id_by_user(user_id). The user_id parameter is the Matrix user_id string, which is already passed into send_message() / stream_message().
Pitfall 6: RealPlatformClient needs workspace_path for outgoing file resolution
What goes wrong: When agent emits MsgEventSendFile(path="output/report.pdf"), the current _attachment_from_send_file_event strips /workspace/ prefix [VERIFIED: real.py lines 207-218] leaving "output/report.pdf". Then send_outgoing in bot.py resolves it with SURFACES_WORKSPACE_DIR — which doesn't know which agent's workspace to use.
How to avoid: Add workspace_path: str to RealPlatformClient.__init__. In _attachment_from_send_file_event, build absolute path: Path(workspace_path) / event.path. Store absolute path in Attachment.workspace_path. resolve_workspace_attachment_path already returns absolute paths unchanged [VERIFIED: files.py line 87-90].
Pitfall 7: docker-compose.prod.yml volume mount collision
What goes wrong: If /agents/ named volume is used and the agent container also mounts it as /workspace, all agents share the same volume root. Agent-0 writes to /workspace/output/, Agent-1 also writes to /workspace/output/ — collision.
Why it happens: Named volume agents is mounted as /workspace in ALL agent containers.
How to avoid: Each agent container gets its own volume or subpath. With Docker Compose named volumes, subpath mounts are possible in Compose v2.17+ with volume.subpath. Or: use separate named volumes per agent (agents_0, agents_1). Or: the agent container is configured with WORKSPACE_SUBDIR and uses /workspace/{agent_id}/. Per D-08, there is one placeholder agent container, so this is a platform concern. For Phase 05 with a single placeholder, the simplest approach is one agents volume, mounted at /workspace in agent-0 and at /agents/ in the bot, with workspace_path: "/agents/0/" in config. Note the path offset this creates: the bot writes to /agents/0/incoming/, which the agent sees as /workspace/0/incoming/, not /workspace/incoming/.
Correct topology per deploy-architecture.md [VERIFIED: docs/deploy-architecture.md]:
- Volume agents mounted in bot as /agents/
- Volume agents mounted in agent-0 as /workspace
- Agent workspace_path in config: /agents/0/
- Bot writes file to /agents/0/incoming/photo.jpg
- Agent reads from /workspace/0/incoming/photo.jpg. This works if the agent container mounts the volume at /workspace and the volume root contains a /0/ subdirectory.
So: one named volume shared by both containers (mounted at /agents/ in the bot, at /workspace in the agent). The subdirectory /0/ is the isolation boundary. This requires the agent container to be aware that it lives in /workspace/0/, not /workspace/, which is a platform concern. For the Phase 05 single-agent placeholder, the collision itself cannot occur because there is only one agent; the /0/ offset still has to be honored on the agent side.
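The offset between the two mount points can be made concrete with a small path translation. This is an illustrative sketch; agent_view is a hypothetical helper and the mount points are the ones named in D-09.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def agent_view(bot_path: str, bot_mount: str = "/agents", agent_mount: str = "/workspace") -> str:
    """Translate a bot-side path on the shared volume to the agent container's view."""
    relative = PurePosixPath(bot_path).relative_to(bot_mount)
    return str(PurePosixPath(agent_mount) / relative)


bot_file = "/agents/0/incoming/photo.jpg"
seen_by_agent = agent_view(bot_file)

# What the agent would open if it resolved attachments=["incoming/photo.jpg"]
# against /workspace rather than /workspace/0/.
naive_lookup = str(PurePosixPath("/workspace") / "incoming/photo.jpg")

print(seen_by_agent)
print(naive_lookup)
```

The two printed paths differ, which is the reason the agent has to resolve relative attachment paths under its own subdirectory (or the volume has to be mounted with a subpath) for file transfer to work.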
Code Examples
AgentApi usage (verified from source)
# Source: external/platform-agent_api/lambda_agent_api/agent_api.py
agent = AgentApi(
    agent_id="agent-0",
    base_url="ws://lambda.coredump.ru:7000/agent_0/",
    on_disconnect=lambda a: connected_agents.pop(a.id, None),
    chat_id=0,
)
await agent.connect()  # Must call before send_message
async for event in agent.send_message("Hello", attachments=["incoming/photo.jpg"]):
    if isinstance(event, MsgEventTextChunk):
        print(event.text)
    elif isinstance(event, MsgEventSendFile):
        # event.path = "output/report.pdf"
        abs_path = Path(agent_workspace_path) / event.path
await agent.close()  # Triggers on_disconnect
Matrix file upload (verified from bot.py)
# Source: adapter/matrix/bot.py send_outgoing()
with file_path.open("rb") as handle:
    upload_response, _ = await client.upload(
        handle,
        content_type=attachment.mime_type or "application/octet-stream",
        filename=attachment.filename or file_path.name,
        filesize=file_path.stat().st_size,
    )
content_uri = upload_response.content_uri
await client.room_send(room_id, "m.room.message", {
    "msgtype": "m.file",  # or m.image, m.audio, m.video
    "body": filename,
    "url": content_uri,
})
YAML config extension (target format)
# config/matrix-agents.yaml (new format per D-02/D-03)
user_agents:
  "@user0:matrix.lambda.coredump.ru": agent-0
  "@user1:matrix.lambda.coredump.ru": agent-1
agents:
  - id: agent-0
    label: "Agent 0"
    base_url: "ws://lambda.coredump.ru:7000/agent_0/"
    workspace_path: "/agents/0/"
  - id: agent-1
    label: "Agent 1"
    base_url: "ws://lambda.coredump.ru:7000/agent_1/"
    workspace_path: "/agents/1/"
Runtime State Inventory
Phase includes refactoring but NOT renaming of string identifiers in user-facing data. Users interacting with the old multi-room bot will have SQLite room_meta records with old schema keys.
| Category | Items Found | Action Required |
|---|---|---|
| Stored data (SQLite) | lambda_matrix.db (dev). Room meta records contain chat_id: "C1", space_id, redirect_room_id, agent_id from the old multi-room flow. | No migration. Per Claude's Discretion (CONTEXT.md): ignore existing Space+rooms, do not migrate. New users get DM-first. Old users' DM rooms will lack the welcomed key; the first message in such a DM room goes through the normal message dispatch path (acceptable). |
| Stored data (SQLite) | selected_agent_id key in user metadata, written by the !agent command being deleted. | No migration needed. !agent is gone. The new routing uses user_agents from YAML config. Old selected_agent_id values are orphaned but harmless. |
| Live service config | No external services with stored config (no n8n, no Datadog). | None. |
| OS-registered state | None. Bot runs in Docker, no launchd/systemd registration. | None. |
| Secrets/env vars | AGENT_BASE_URL (global) → replaced by per-agent base_url in YAML. SURFACES_WORKSPACE_DIR (global workspace) → per-agent workspace_path from YAML. Both env vars become deprecated for prod but remain for backward compat in dev. | Update .env.example. Add .env.prod template. |
| Build artifacts | None in prod context. Local: .venv, __pycache__ (unaffected). | None. |
Validation Architecture
Test Framework
| Property | Value |
|---|---|
| Framework | pytest 9.0.2 + pytest-asyncio |
| Config file | pyproject.toml (asyncio_mode = "auto") |
| Quick run command | uv run pytest tests/adapter/matrix/ -q |
| Full suite command | uv run pytest tests/ -q |
Current Test Status (pre-Phase-05)
| File | Status | Disposition in Phase 05 |
|---|---|---|
| test_converter.py | 14 passing | Keep as-is |
| test_files.py | 2 passing | Update for new path format |
| test_reactions.py | 2 passing | Keep as-is |
| test_restart_persistence.py | 5 passing | Keep; update if routing logic changes |
| test_routing_enforcement.py | 5 passing | Update for user_agents routing model |
| test_store.py | 2 passing | Keep as-is |
| test_agent_handler.py | failing (import?) | DELETE — !agent is deleted |
| test_agent_registry.py | failing (import?) | REWRITE — test new AgentDefinition schema |
| test_chat_space.py | failing | DELETE — Space provisioning deleted |
| test_confirm.py | failing | Keep or update |
| test_context_commands.py | 4 failing | REWRITE — !save/!load deleted; keep !context, add !clear |
| test_dispatcher.py | 20 failing | REWRITE — DM-first flow replaces multi-room |
| test_invite_space.py | 3 failing | DELETE and REPLACE with DM-first invite tests |
| test_routed_platform.py | 1 failing | REWRITE — user_agents lookup replaces room binding |
| test_send_outgoing.py | failing | REWRITE — per-agent workspace_path |
Phase Requirements → Test Map
| Behavior | Test Type | Automated Command | Wave |
|---|---|---|---|
| AgentRegistry parses new YAML format (user_agents + base_url/workspace_path) | unit | uv run pytest tests/adapter/matrix/test_agent_registry.py -x | Wave 1 |
| Unauthorized user gets access-denied message on invite | unit | uv run pytest tests/adapter/matrix/test_invite_dm.py -x | Wave 2 |
| Authorized user gets welcome on DM invite | unit | uv run pytest tests/adapter/matrix/test_invite_dm.py -x | Wave 2 |
| Message from authorized user routes to correct delegate | unit | uv run pytest tests/adapter/matrix/test_routed_platform.py -x | Wave 2 |
| Incoming file saved to incoming/{filename} under agent workspace | unit | uv run pytest tests/adapter/matrix/test_files.py -x | Wave 3 |
| !clear command returns "Контекст сброшен." | unit | uv run pytest tests/adapter/matrix/test_context_commands.py -x | Wave 2 |
| Full suite green | integration | uv run pytest tests/ -q | Phase gate |
Wave 0 Gaps
- tests/adapter/matrix/test_invite_dm.py: DM-first invite flow (new file)
- Updated tests/adapter/matrix/test_agent_registry.py: new schema
(All other existing test infrastructure is in place. No new framework install needed.)
Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| uv / Python 3.11 | tests, bot run | ✓ | Python 3.11.9, pytest 9.0.2 | — |
| Docker | docker-compose.prod.yml | ✓ (assumed dev machine) | — | Manual install |
| matrix-nio | Matrix adapter | ✓ | installed in .venv | — |
| pyyaml | agent_registry.py | ✓ | installed (yaml import works in bot context) | — |
| lambda-agent:latest image | docker-compose.prod.yml | ✗ | placeholder — platform team owns | Use build: ./external/platform-agent for local testing |
Missing dependencies with no fallback:
- lambda-agent:latest: docker-compose.prod.yml uses this as a placeholder image. For actual testing, use the build: ./external/platform-agent fallback or an image: busybox stub.
Open Questions
1. is_direct detection for group room rejection (D-05, Claude's Discretion)
   - What we know: matrix-nio's InviteMemberEvent does not expose an is_direct flag directly. The MatrixRoom type has member count accessible via room.member_count or room.joined_members.
   - What's unclear: Whether InviteMemberEvent or MatrixRoom in nio exposes enough to reliably detect DM vs. group at invite time.
   - Recommendation: At Phase 05, accept all invites and immediately check user_agents authorization. Non-DM group rooms where the bot is invited by an authorized user will also work (no harm). Add a room.member_count <= 2 check if desired.
2. True !clear semantics with MemorySaver
   - What we know: RealPlatformClient._build_chat_api creates a new AgentApi(chat_id="0") per request. The agent's MemorySaver is keyed by chat_id, which is always "0". So context is NOT cleared by reconnecting.
   - What's unclear: Whether !clear should work "for real" (requires the platform to support a reset endpoint or a different chat_id) or just show a user-facing message (MVP-acceptable).
   - Recommendation: Phase 05 sends "Контекст сброшен." immediately (D-11). Document the limitation. Actual context reset is a platform concern.
3. lambda-agent:latest image name
   - What we know: D-08 says "placeholder image lambda-agent:latest (check with Azamat)".
   - Recommendation: Use lambda-agent:latest as the image name in docker-compose.prod.yml. Add a comment indicating it's a placeholder. Provide a build: fallback pointing to ./external/platform-agent for local dev validation.
Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | lambda-agent:latest is the agreed image name for the agent container | docker-compose section | docker-compose.prod.yml won't work; easy to fix by updating image name |
| A2 | Group room invite detection is not required for Phase 05 (DM-first only means "start in DM", not "reject group invites") | DM-first onboarding | If group room rejection IS required, need to investigate matrix-nio InviteMemberEvent structure |
| A3 | !clear in Phase 05 is cosmetic (shows "cleared" but MemorySaver persists until agent restart) | !clear section | User confusion if they expect real context reset |
Project Constraints (from CLAUDE.md)
| Directive | Implication for Phase 05 |
|---|---|
| Platform calls go through platform/interface.py (Protocol) | RealPlatformClient stays the SDK boundary; AgentApi is internal to sdk/ |
| When connecting the real SDK, change only platform/mock.py | Phase 05 touches sdk/real.py for workspace_path: acceptable, it's a refinement not a rewrite |
| Hotfixes (< 20 lines) go to Claude Code directly, not Codex | Phase 05 is >20 lines; must go through Codex via GSD |
| Implementation is done by codex:rescue | Plans must be PLAN.md format passable to Codex |
| Never commit .env | .env.prod must be in .gitignore; only .env.prod.example is committed |
| uv sync for dependencies | No new pip installs; all deps already in pyproject.toml |
| pytest tests/ for tests | Phase gate: uv run pytest tests/ -q must be green |
Sources
Primary (HIGH confidence)
- [VERIFIED: adapter/matrix/agent_registry.py] — current AgentDefinition/AgentRegistry structure
- [VERIFIED: adapter/matrix/bot.py] — _build_platform_from_env, MatrixBot, handle_invite, _materialize_incoming_attachments
- [VERIFIED: adapter/matrix/routed_platform.py] — _resolve_delegate logic
- [VERIFIED: adapter/matrix/files.py] — build_workspace_attachment_path, download_matrix_attachment
- [VERIFIED: adapter/matrix/handlers/agent.py] — !agent handler (to be deleted)
- [VERIFIED: adapter/matrix/handlers/auth.py] — provision_workspace_chat (to be replaced)
- [VERIFIED: adapter/matrix/handlers/context_commands.py] — !save/!load/!reset handlers
- [VERIFIED: adapter/matrix/handlers/__init__.py]: handler registration
- [VERIFIED: sdk/real.py] — RealPlatformClient, _build_chat_api, _attachment_from_send_file_event
- [VERIFIED: sdk/upstream_agent_api.py] — sys.path patching, AgentApi import
- [VERIFIED: external/platform-agent_api/lambda_agent_api/agent_api.py] — actual AgentApi implementation
- [VERIFIED: config/matrix-agents.yaml] — current format
- [VERIFIED: docker-compose.yml] — existing dev compose topology
- [VERIFIED: .env.example] — current env var set
- [VERIFIED: docs/deploy-architecture.md] — prod topology spec
- [VERIFIED: .planning/phases/05-mvp-deployment/05-CONTEXT.md] — locked decisions
Secondary (MEDIUM confidence)
- [ASSUMED: A1] lambda-agent image name — from CONTEXT.md D-08 description
- [ASSUMED: A2] Group room handling scope — inferred from D-05 wording
Metadata
Confidence breakdown:
- Standard stack: HIGH — all libraries verified in existing code
- Architecture patterns: HIGH — all patterns verified against actual source files
- Pitfalls: HIGH — all pitfalls derived from reading actual code, not from training assumptions
- Test strategy: HIGH — test files enumerated and statuses verified by running pytest
Research date: 2026-04-27
Valid until: 2026-05-27 (stable codebase; short-circuit if platform-agent_api changes)