

Phase 4: Matrix MVP — Shared Agent Context + Context Management — Research

Researched: 2026-04-16
Domain: Matrix bot, AgentApi WebSocket client, context management commands, Docker packaging
Confidence: HIGH (all findings verified against actual source files in this repo)


<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

Platform architecture:

  • One container = one chat: AgentService with thread_id = "default" is the intended architecture. Do not change it.
  • Remove the thread_id patch: drop our commit 1dca2c1 in external/platform-agent and switch to origin/main of platform-agent.
  • Delete build_thread_key: the function is no longer needed. Remove it from sdk/agent_session.py and sdk/real.py.
  • Replace AgentSessionClient with AgentApi: use AgentApi from external/platform-agent_api/lambda_agent_api/agent_api.py.

!save: Syntax is !save (auto-generated name) or !save [name]. Mechanism: send the agent a text message. Save names are stored in PrototypeStateStore.

!load: !load with no arguments → numbered list. The user enters a number. Exit: 0 or !cancel. After the selection, send the agent a text message. The pending state is kept in the Matrix store.

!reset: Confirmation dialog with !yes / !save name / !no. !yes → POST {AGENT_BASE_URL}/reset. Fallback on 404: a message to the user.

!context: Shows the session name, tokens, and the list of saves. Makes no agent calls.

Dockerfile + docker-compose: For the Matrix bot. Env via .env. Platform-agent runs separately.

Claude's Discretion

  • Storage structure for saved sessions in PrototypeStateStore (dict name→timestamp)
  • Auto-name format for !save without arguments
  • HTTP client for POST /reset (aiohttp or httpx)
  • Exact format of the prompts sent to the agent for save/load

Deferred Ideas (OUT OF SCOPE)

  • Replacing PrototypeStateStore with the real control plane from platform-master
  • Skills integration via SkillsMiddleware
  • E2EE for Matrix
  • !reset via docker restart
  • Context summarization </user_constraints>

Summary

Phase 4 replaces the custom AgentSessionClient with the production AgentApi from lambda_agent_api, adds four context management commands to the Matrix bot, and packages it in Docker. All findings are verified directly against source files.

Primary recommendation: Wire AgentApi as a persistent connection in MatrixBot.__init__ (connect on start, close in finally block of main()). Expose it through RealPlatformClient. The four commands follow the existing handler registration pattern in adapter/matrix/handlers/__init__.py.

The platform-agent at origin/main already works with AgentApi — it does NOT require thread_id query param. Our local patch (1dca2c1) must be discarded and external/platform-agent reset to origin/main.


Project Constraints (from CLAUDE.md)

  • Tech stack: matrix-nio for Matrix — do not change without discussion
  • Platform client: connected only via sdk/interface.py Protocol — core/ and adapters untouched when swapping implementation
  • No E2EE — matrix-nio without python-olm
  • Hotfixes < 20 lines → Claude Code directly; implementation → Codex via GSD
  • MATRIX_PLATFORM_BACKEND env var controls mock vs real

Standard Stack

Core (verified)

| Library | Version | Purpose | Source |
|---|---|---|---|
| lambda_agent_api | local (external/platform-agent_api) | AgentApi WebSocket client | [VERIFIED: file read] |
| aiohttp | >=3.9 (surfaces-bot), >=3.13.4 (agent_api) | WebSocket transport inside AgentApi | [VERIFIED: pyproject.toml] |
| pydantic | >=2.5 | Message serialization (MsgUserMessage, MsgEventEnd, etc.) | [VERIFIED: server.py/client.py] |
| httpx | >=0.27 | HTTP client for POST /reset (already in deps) | [VERIFIED: pyproject.toml] |
| structlog | >=24.1 | Logging (existing pattern) | [VERIFIED: pyproject.toml] |

Supporting

| Library | Version | Purpose | When to Use |
|---|---|---|---|
| aiohttp | already a dep | Alternative HTTP for POST /reset | Could use instead of httpx; both are available |

Installation: No new packages needed. lambda_agent_api is installed as a local path package (currently accessed via sys.path injection in tests; for production use, add to pyproject.toml as path dep or install via pip install -e external/platform-agent_api).

Critical: lambda_agent_api requires Python >=3.14 per its own pyproject.toml. The surfaces-bot requires Python >=3.11. [VERIFIED: pyproject.toml of both]. This is a version mismatch — see Pitfalls.


Architecture Patterns

AgentApi Constructor (verified)

# Source: external/platform-agent_api/lambda_agent_api/agent_api.py
AgentApi(
    agent_id: str,               # arbitrary string ID, used in logs
    url: str,                    # WebSocket URL, e.g. "ws://127.0.0.1:8000/agent_ws/"
    callback: Optional[Callable[[ServerMessage], None]] = None,  # for orphaned msgs
    on_disconnect: Optional[Callable[['AgentApi'], None]] = None  # called on WS close
)

AgentApi Lifecycle (verified)

# Source: external/platform-agent_api/lambda_agent_api/agent_api.py
agent = AgentApi(agent_id="matrix-bot", url=ws_url)
await agent.connect()   # opens WS, waits for MsgStatus, starts _listen() task
# ... use agent ...
await agent.close()     # cancels _listen task, closes WS and session

connect() blocks until MsgStatus is received from server (5s timeout). After connect(), a background _listen() asyncio task runs continuously, routing server messages to an internal asyncio.Queue.

AgentApi.send_message() semantics (verified)

# Source: external/platform-agent_api/lambda_agent_api/agent_api.py, line 134
async def send_message(self, text: str) -> AsyncIterator[AgentEventUnion]:
  • AgentEventUnion = Union[MsgEventTextChunk, MsgEventEnd], but the generator yields only MsgEventTextChunk chunks; it breaks (stops) on MsgEventEnd without yielding it.
  • MsgEventEnd carries tokens_used: int — to capture this, the caller must intercept the queue or handle MsgEventEnd in the _listen loop. Currently send_message discards tokens_used. This affects !context which needs tokens.

Resolution: In RealPlatformClient.stream_message(), after iterating through send_message(), tokens_used won't be directly available. Options:

  1. Store tokens_used in a shared attribute after each response (add self._last_tokens_used to AgentApi or a wrapper).
  2. Use the callback parameter to capture MsgEventEnd events from the _listen loop.

[ASSUMED] The simplest approach: wrap AgentApi in a thin AgentApiAdapter class that intercepts _listen output and exposes last_tokens_used. Or: store tokens in PrototypeStateStore after each message.
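The wrapper idea can be sketched roughly as follows. This is a minimal illustration, not the lambda_agent_api implementation: TextChunk, End and TokenTrackingStream are hypothetical stand-ins (the real MsgEventTextChunk and MsgEventEnd would take their place), and it assumes the consumer of the queue can see the end-of-stream event before stopping.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Union


@dataclass
class TextChunk:  # hypothetical stand-in for MsgEventTextChunk
    text: str


@dataclass
class End:  # hypothetical stand-in for MsgEventEnd
    tokens_used: int


class TokenTrackingStream:
    """Yields text chunks from a queue and remembers tokens_used from the end event."""

    def __init__(self) -> None:
        self.last_tokens_used = 0

    async def stream(
        self, queue: "asyncio.Queue[Union[TextChunk, End]]"
    ) -> AsyncIterator[TextChunk]:
        self.last_tokens_used = 0  # reset before each response
        while True:
            event = await queue.get()
            if isinstance(event, End):
                # Record the token count instead of silently dropping it.
                self.last_tokens_used = event.tokens_used
                break
            yield event


async def demo() -> tuple[str, int]:
    queue: asyncio.Queue = asyncio.Queue()
    for item in (TextChunk("Hel"), TextChunk("lo"), End(tokens_used=42)):
        queue.put_nowait(item)
    tracker = TokenTrackingStream()
    text = "".join([chunk.text async for chunk in tracker.stream(queue)])
    return text, tracker.last_tokens_used
```

After the stream is exhausted, the caller reads last_tokens_used from the wrapper, which is the value !context would later display.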

AgentApi concurrency constraint (verified)

AgentApi._request_lock prevents parallel send_message() calls — second call raises AgentBusyException. In the single-user Matrix prototype this is acceptable. The bot must not dispatch two messages concurrently to the same agent.
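One way the bot could avoid triggering AgentBusyException is to serialize its own sends. The sketch below is an assumption about how that might look, not existing code: SerializedSender is a hypothetical helper, and for brevity it wraps a plain coroutine function rather than the async generator that send_message() actually is.

```python
import asyncio


class SerializedSender:
    """Queues callers behind an asyncio.Lock so only one send runs at a time."""

    def __init__(self, send) -> None:
        self._send = send  # any coroutine function taking the message text
        self._lock = asyncio.Lock()

    async def send(self, text: str) -> str:
        async with self._lock:
            return await self._send(text)


async def demo() -> int:
    in_flight = 0
    max_in_flight = 0

    async def fake_send(text: str) -> str:
        nonlocal in_flight, max_in_flight
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        await asyncio.sleep(0.01)  # simulate the agent producing a reply
        in_flight -= 1
        return text.upper()

    sender = SerializedSender(fake_send)
    await asyncio.gather(sender.send("a"), sender.send("b"), sender.send("c"))
    return max_in_flight
```

With the lock in place, later messages wait for the current response to finish instead of failing.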

Wiring AgentApi into MatrixBot (integration pattern)

The AgentApi must be a persistent connection (not per-message connect/disconnect) because:

  1. _listen() task runs in background and routes server push events.
  2. Per-message connect/disconnect would recreate the aiohttp session each time and discard LangGraph thread state.

Recommended wiring:

# adapter/matrix/bot.py — main() function
agent_api = AgentApi(agent_id="matrix-bot", url=ws_url)
await agent_api.connect()
runtime = build_runtime(store=SQLiteStore(db_path), client=client, agent_api=agent_api)
try:
    await client.sync_forever(timeout=30000, since=since_token)
finally:
    await client.close()
    await agent_api.close()

_build_platform_from_env() currently instantiates everything synchronously. It must be refactored to async or split: construct AgentApi synchronously, call connect() in main() before starting sync loop.

RealPlatformClient updates

RealPlatformClient currently imports AgentSessionClient and calls build_thread_key. Both are removed. The updated class:

class RealPlatformClient(PlatformClient):
    def __init__(
        self,
        agent_api: AgentApi,           # replaces agent_sessions: AgentSessionClient
        prototype_state: PrototypeStateStore,
        platform: str = "matrix",
    ) -> None:

send_message() and stream_message() call agent_api.send_message(text) directly — no thread_key needed.

platform-agent origin/main: what changes (verified)

Our patch 1dca2c1 added thread_id query param handling to external/platform-agent/src/api/external.py. On origin/main, the process_message() function does NOT use thread_id — it calls agent_service.astream(msg.text) without thread_id. The WS URL becomes simply ws://host:port/agent_ws/ — no query params.

Existing command registration pattern (verified)

# adapter/matrix/handlers/__init__.py — register_matrix_handlers()
dispatcher.register(IncomingCommand, "new",      make_handle_new_chat(client, store))
dispatcher.register(IncomingCommand, "settings", handle_settings)
dispatcher.register(IncomingCallback, "confirm", make_handle_confirm(store))

Handler signature (all existing handlers follow this):

async def handle_X(
    event: IncomingCommand,
    auth_mgr,
    platform,
    chat_mgr,
    settings_mgr,
) -> list[OutgoingEvent]:

New context commands need access to agent_api (for !save, !load) and store (for !context, !load pending state). Pattern: use make_handle_X(agent_api, store) closures — same as make_handle_new_chat(client, store).
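A closure-based handler along these lines might look like the sketch below. Command and Reply are hypothetical stand-ins for IncomingCommand and OutgoingMessage, send_to_agent stands in for whatever call forwards text to the agent, and the manager parameters are given defaults only so the example runs on its own.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Command:  # hypothetical stand-in for IncomingCommand
    user_id: str
    args: list[str]


@dataclass
class Reply:  # hypothetical stand-in for OutgoingMessage
    text: str


def make_handle_save(send_to_agent, saved_names: dict[str, list[str]]):
    """Binds dependencies once; the returned coroutine keeps the handler signature."""

    async def handle_save(event: Command, auth_mgr=None, platform=None,
                          chat_mgr=None, settings_mgr=None) -> list[Reply]:
        name = event.args[0] if event.args else "unnamed"
        await send_to_agent(f"save:{name}")  # placeholder prompt text
        saved_names.setdefault(event.user_id, []).append(name)
        return [Reply(text=f"Saved: {name}")]

    return handle_save


async def demo():
    sent: list[str] = []

    async def fake_send(text: str) -> None:
        sent.append(text)

    saves: dict[str, list[str]] = {}
    handler = make_handle_save(fake_send, saves)
    replies = await handler(Command(user_id="@u:hs", args=["draft"]))
    return sent, saves, replies[0].text
```

The dispatcher only ever sees handle_save, so the registration call stays identical to the existing make_handle_new_chat(client, store) style.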

!load pending state pattern (verified)

Existing PENDING_CONFIRM_PREFIX = "matrix_pending_confirm:" in adapter/matrix/store.py.

New key for load pending state:

LOAD_PENDING_PREFIX = "matrix_load_pending:"

def _load_pending_key(user_id: str, room_id: str) -> str:
    return f"{LOAD_PENDING_PREFIX}{user_id}:{room_id}"

Stored data structure:

{
    "saves": [{"name": "my-save", "ts": "2026-04-16T12:00:00Z"}, ...],
    "display": "1. my-save (2026-04-16)\n2. other..."
}

The numeric input 1, 2, etc. is intercepted in MatrixBot.on_room_message() BEFORE dispatching as IncomingMessage — check if load_pending exists for this user+room, resolve to save name, dispatch the load command internally.

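Alternative: handle numeric input inside the dispatcher, either through a pre-dispatch interceptor on IncomingMessage or through a dedicated check for messages that are pure integers. Pitfall 4 and the Open Questions section both favor intercepting in MatrixBot.on_room_message(), which keeps the dispatcher clean.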
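Wherever the interception ends up living, the selection logic itself can be a small pure function. The sketch below is illustrative: resolve_load_selection and its three result labels are hypothetical names, and it assumes the pending state carries the "saves" list in the structure shown above.

```python
from typing import Optional


def resolve_load_selection(text: str, saves: list[dict]) -> tuple[str, Optional[str]]:
    """Maps a reply to the !load menu onto ("load", name), ("cancel", None) or ("invalid", None)."""
    stripped = text.strip()
    if stripped in ("0", "!cancel"):
        return ("cancel", None)
    # isascii() guards against characters such as "²" that isdigit() accepts but int() rejects.
    if stripped.isascii() and stripped.isdigit():
        index = int(stripped)
        if 1 <= index <= len(saves):
            return ("load", saves[index - 1]["name"])
    return ("invalid", None)
```

An "invalid" result lets the bot re-prompt instead of forwarding the text to the agent while the menu is still pending.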

!reset confirmation dialog pattern

!reset reuses the OutgoingUI + pending_confirm mechanism or a simpler custom state. Since the dialog options are !yes, !save name, !no (not just yes/no), it cannot reuse pending_confirm directly without extension.

Simplest approach: store reset_pending:{user_id}:{room_id} key (boolean) and check for !yes/!no/!save commands from the IncomingCommand dispatcher when reset_pending is set.

saved sessions storage in PrototypeStateStore

New dict attribute on PrototypeStateStore:

self._saved_sessions: dict[str, list[dict]] = {}
# Key: matrix_user_id
# Value: [{"name": "my-save", "created_at": "2026-04-16T12:00:00Z"}, ...]

Methods to add:

async def add_saved_session(self, user_id: str, name: str) -> None: ...
async def list_saved_sessions(self, user_id: str) -> list[dict]: ...
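An in-memory sketch of these two methods is shown below. SavedSessionsSketch is a hypothetical class, not the real PrototypeStateStore, and the choice to replace an existing entry when the same name is saved again is an assumption the plan would need to confirm.

```python
import asyncio
from datetime import datetime, timezone


class SavedSessionsSketch:
    """In-memory sketch of the two proposed methods; not the real PrototypeStateStore."""

    def __init__(self) -> None:
        self._saved_sessions: dict[str, list[dict]] = {}

    async def add_saved_session(self, user_id: str, name: str) -> None:
        created_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        sessions = self._saved_sessions.setdefault(user_id, [])
        # Assumption: saving under an existing name replaces the older entry.
        sessions[:] = [s for s in sessions if s["name"] != name]
        sessions.append({"name": name, "created_at": created_at})

    async def list_saved_sessions(self, user_id: str) -> list[dict]:
        # Return copies so callers cannot mutate the stored state.
        return [dict(s) for s in self._saved_sessions.get(user_id, [])]


async def demo() -> list[str]:
    store = SavedSessionsSketch()
    await store.add_saved_session("@u:hs", "alpha")
    await store.add_saved_session("@u:hs", "beta")
    await store.add_saved_session("@u:hs", "alpha")  # replaces the first "alpha"
    return [s["name"] for s in await store.list_saved_sessions("@u:hs")]
```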

!context tokens_used tracking

MsgEventEnd.tokens_used: int is available from server.py. Since AgentApi.send_message() drops it, the planner must decide how to surface it. Recommended: store in PrototypeStateStore as _last_tokens_used: dict[str, int] keyed by user_id, updated after each successful agent response in RealPlatformClient.
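Since !context makes no agent calls, its reply can be rendered purely from stored values. The function below is a sketch under that assumption; format_context_snapshot and the exact output layout are hypothetical, and the saves are assumed to carry "name" and "created_at" keys.

```python
def format_context_snapshot(session_name: str, tokens_used: int, saves: list[dict]) -> str:
    """Renders the !context reply purely from locally stored state."""
    lines = [f"Session: {session_name}", f"Tokens used (last reply): {tokens_used}"]
    if saves:
        lines.append("Saves:")
        # Show only the date part of the ISO timestamp.
        lines.extend(
            f"{i}. {s['name']} ({s['created_at'][:10]})" for i, s in enumerate(saves, 1)
        )
    else:
        lines.append("Saves: none")
    return "\n".join(lines)
```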

Prompts for !save / !load (Claude's Discretion)

# !save
SAVE_PROMPT = (
    "Summarize our conversation and save to /workspace/contexts/{name}.md. "
    "Reply only with: Saved: {name}"
)

# !load
LOAD_PROMPT = (
    "Load context from /workspace/contexts/{name}.md and use it as background "
    "for our conversation. Reply: Loaded: {name}"
)

Auto-name format (Claude's Discretion): context-{YYYYMMDD-HHMMSS} (UTC, no spaces, no special chars, safe as filename).
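The auto-name format and prompt substitution could be implemented roughly as follows. auto_save_name and build_save_prompt are hypothetical helper names; a naive datetime passed in would be interpreted as local time by astimezone(), so callers are assumed to pass timezone-aware values.

```python
from datetime import datetime, timezone
from typing import Optional

SAVE_PROMPT = (
    "Summarize our conversation and save to /workspace/contexts/{name}.md. "
    "Reply only with: Saved: {name}"
)


def auto_save_name(now: Optional[datetime] = None) -> str:
    """Builds context-YYYYMMDD-HHMMSS in UTC."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return moment.strftime("context-%Y%m%d-%H%M%S")


def build_save_prompt(name: Optional[str] = None) -> str:
    # Fall back to the auto-generated name when !save is called without arguments.
    return SAVE_PROMPT.format(name=name or auto_save_name())
```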

POST /reset endpoint

Confirmed absent in origin/main platform-agent. Only endpoint is GET /agent_ws/ (WebSocket). The main.py has no HTTP routes beyond what FastAPI provides by default (/docs, /openapi.json).

!reset with !yes → POST {AGENT_BASE_URL}/reset → expect 404 → return "Reset endpoint недоступен. Обратитесь к администратору." (in English: "Reset endpoint is unavailable. Contact the administrator.")

HTTP client for this: httpx (already in pyproject.toml):

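import httpx

# Runs inside the !yes branch of the reset handler (a coroutine).
# Named http_client so it does not shadow the Matrix client.
async with httpx.AsyncClient() as http_client:
    response = await http_client.post(f"{agent_base_url}/reset", timeout=5.0)
    if response.status_code == 404:
        # origin/main has no /reset route, so this is the expected branch today.
        return [OutgoingMessage(chat_id=..., text="Reset endpoint недоступен...")]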

Dockerfile

FROM python:3.11-slim
WORKDIR /app
COPY pyproject.toml .
RUN pip install -e .
COPY . .
ENV PYTHONUNBUFFERED=1
CMD ["python", "-m", "adapter.matrix.bot"]

lambda_agent_api must be installed in the container. Options:

  1. COPY external/platform-agent_api /app/external/platform-agent_api + pip install -e /app/external/platform-agent_api
  2. Include lambda_agent_api package directly in surfaces-bot package (copy source files)

Option 1 is cleaner.

docker-compose.yml structure

services:
  matrix-bot:
    build: .
    env_file: .env
    restart: unless-stopped

Platform-agent runs separately — not in this compose file.


Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| WebSocket lifecycle with reconnect | Custom WS manager | AgentApi from lambda_agent_api | Already handles connect/close/listen loop, error routing, queue management |
| Message deserialization | Custom JSON parsing | ServerMessage.validate_json() (Pydantic TypeAdapter) | Discriminated union handles all message types |
| HTTP async client | aiohttp.ClientSession directly | httpx.AsyncClient | Already in deps, cleaner API for one-shot POST |
| Concurrent request guard | Custom lock | AgentApi._request_lock | Already implemented, raises AgentBusyException |

Common Pitfalls

Pitfall 1: lambda_agent_api Python version mismatch

What goes wrong: lambda_agent_api/pyproject.toml declares requires-python = ">=3.14". The surfaces-bot runs on Python 3.11+. If pip install -e external/platform-agent_api is run with Python 3.11 it may fail or emit warnings.

Why it happens: The lambda_agent_api was developed under Python 3.14 (seen in .venv path: python3.14). The code itself uses no 3.14-specific syntax — it is pure aiohttp + pydantic which run on 3.11.

How to avoid: Set requires-python = ">=3.11" in external/platform-agent_api/pyproject.toml before building the Docker image, or install with pip's --ignore-requires-python flag. Alternatively, copy the three source files directly into the surfaces-bot package.

Warning signs: pip install failure with "requires Python >=3.14".

Pitfall 2: AgentApi.send_message() drops MsgEventEnd (tokens_used lost)

What goes wrong: The generator yields only MsgEventTextChunk objects and breaks on MsgEventEnd without yielding it. Any downstream code that tries to get tokens_used from the iterator gets nothing.

Why it happens: The generator breaks on MsgEventEnd (line 172 of agent_api.py) without yielding it. This is intentional for streaming UX but loses token info.

How to avoid: Reset self._last_tokens_used = 0 before streaming. In _listen(), MsgEventEnd is put into _current_queue (line 241); the send_message() generator reads it from that queue and then breaks, so the object is consumed but never returned to the caller. To capture it, subclass or wrap AgentApi, or record tokens_used at the point where MsgEventEnd is read from _current_queue, just before the break.

Practical fix: Add self.last_tokens_used: int = 0 to AgentApi and intercept the queue in the finally block of send_message() — or store it in a wrapper class.

Pitfall 3: AgentApi persistent connection vs sync_forever loop

What goes wrong: If agent_api.connect() is called inside _build_platform_from_env() (sync function), it creates an asyncio.Task for _listen() outside the event loop context.

Why it happens: _build_platform_from_env() is called synchronously from build_runtime(). connect() is a coroutine.

How to avoid: Do NOT call agent_api.connect() inside _build_platform_from_env(). Instead:

  1. _build_platform_from_env() creates RealPlatformClient with an unconnected AgentApi
  2. main() awaits agent_api.connect() explicitly after constructing runtime

Expose agent_api from RealPlatformClient via a property so main() can call connect() on it.
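The split between synchronous construction and asynchronous connection can be sketched as below. StubAgentApi, StubPlatformClient and build_platform are hypothetical stand-ins for AgentApi, RealPlatformClient and _build_platform_from_env(); the sketch only illustrates the ordering, not the real classes.

```python
import asyncio


class StubAgentApi:
    """Hypothetical stand-in for AgentApi: construction is sync, connect() is async."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def close(self) -> None:
        self.connected = False


class StubPlatformClient:
    def __init__(self, agent_api: StubAgentApi) -> None:
        self._agent_api = agent_api

    @property
    def agent_api(self) -> StubAgentApi:
        return self._agent_api


def build_platform(url: str) -> StubPlatformClient:
    # Sync factory: only constructs objects, never awaits.
    return StubPlatformClient(StubAgentApi(url))


async def main(url: str) -> list[bool]:
    states = []
    platform = build_platform(url)
    states.append(platform.agent_api.connected)  # not connected yet
    await platform.agent_api.connect()  # connect inside the running loop
    try:
        states.append(platform.agent_api.connected)
    finally:
        await platform.agent_api.close()
    states.append(platform.agent_api.connected)
    return states
```

In the real bot, the body of the try block would be the sync_forever() call.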

Pitfall 4: !load numeric input interception

What goes wrong: When user types 1 in response to !load menu, it is dispatched as IncomingMessage (not a command) and routed to the platform — the agent receives "1" as a user message.

Why it happens: The Matrix converter (from_room_event) produces IncomingMessage for plain text, IncomingCommand only for !-prefixed text.

How to avoid: In MatrixBot.on_room_message(), before calling dispatcher.dispatch(), check if load_pending state exists for this user+room. If yes and the message text is a digit (or 0/!cancel), handle it as a load selection instead of routing to agent.

Pitfall 5: platform-agent thread_id removal breaks existing tests

What goes wrong: tests/platform/test_agent_session.py imports build_thread_key and tests process_message with thread_id in query params. After the patch is removed, these tests will fail.

Why it happens: Tests were written against our patched external.py.

How to avoid: The plan must include updating test_agent_session.py — remove build_thread_key tests, update process_message tests to reflect origin/main signature (no thread_id param).

Pitfall 6: !reset dialog conflicts with existing !yes/!no flow

What goes wrong: The existing pending_confirm flow uses !yes/!no. If both reset_pending and pending_confirm are active simultaneously, !yes could trigger the wrong handler.

Why it happens: Both flows listen for the same commands.

How to avoid: !reset dialog uses a separate state key reset_pending:{user_id}:{room_id}. The handler for !yes must check reset_pending first, then pending_confirm. Document priority in handler code.
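The priority rule can be made explicit in a small routing function. This is a sketch with a plain dict standing in for the state store; route_yes is a hypothetical name, and the suffix format of the pending_confirm key is assumed to mirror the reset_pending key.

```python
def route_yes(state: dict[str, object], user_id: str, room_id: str) -> str:
    """Decides which flow a !yes belongs to; reset_pending wins over pending_confirm."""
    reset_key = f"reset_pending:{user_id}:{room_id}"
    # Assumed key layout for the existing PENDING_CONFIRM_PREFIX.
    confirm_key = f"matrix_pending_confirm:{user_id}:{room_id}"
    if state.pop(reset_key, None):
        return "reset"
    if state.pop(confirm_key, None) is not None:
        return "confirm"
    return "none"
```

Popping the key as part of the check means each pending state is consumed exactly once, so a second !yes falls through to the next flow rather than repeating the reset.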


Code Examples

Invoking AgentApi.send_message() in stream_message

# Sketch for RealPlatformClient.stream_message in sdk/real.py.
# It consumes AgentApi.send_message() from external/platform-agent_api/lambda_agent_api/agent_api.py.
async def stream_message(self, user_id: str, chat_id: str, text: str, ...) -> AsyncIterator[MessageChunk]:
    async for event in self._agent_api.send_message(text):
        if isinstance(event, MsgEventTextChunk):
            yield MessageChunk(
                message_id=user_id,
                delta=event.text,
                finished=False,
            )
    # After the loop ends, MsgEventEnd has been consumed internally.
    # last_tokens_used is the attribute proposed in Pitfall 2; AgentApi does not expose it today.
    yield MessageChunk(message_id=user_id, delta="", finished=True, tokens_used=self._agent_api.last_tokens_used)

Handler registration pattern

# Source: adapter/matrix/handlers/__init__.py
def register_matrix_handlers(dispatcher: EventDispatcher, client=None, store=None, agent_api=None) -> None:
    # existing...
    dispatcher.register(IncomingCommand, "save",    make_handle_save(agent_api, store))
    dispatcher.register(IncomingCommand, "load",    make_handle_load(agent_api, store))
    dispatcher.register(IncomingCommand, "reset",   make_handle_reset(store))
    dispatcher.register(IncomingCommand, "context", make_handle_context(store))

!load pending key

# New in adapter/matrix/store.py
LOAD_PENDING_PREFIX = "matrix_load_pending:"

async def get_load_pending(store: StateStore, user_id: str, room_id: str) -> dict | None:
    return await store.get(f"{LOAD_PENDING_PREFIX}{user_id}:{room_id}")

async def set_load_pending(store: StateStore, user_id: str, room_id: str, data: dict) -> None:
    await store.set(f"{LOAD_PENDING_PREFIX}{user_id}:{room_id}", data)

async def clear_load_pending(store: StateStore, user_id: str, room_id: str) -> None:
    await store.delete(f"{LOAD_PENDING_PREFIX}{user_id}:{room_id}")

platform-agent origin/main process_message (no thread_id)

# Source: git show origin/main:src/api/external.py in external/platform-agent
async def process_message(ws: WebSocket, msg, agent_service: AgentService):
    match msg:
        case MsgUserMessage():
            async for chunk in agent_service.astream(msg.text):  # no thread_id arg
                await ws.send_text(chunk.model_dump_json())
            await ws.send_text(MsgEventEnd(tokens_used=0).model_dump_json())

Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | tokens_used can be captured by storing it in an AgentApi.last_tokens_used attribute during _listen() before it is queued | Architecture Patterns | If _listen timing means the value is read before it is queued, the token count would be wrong; low risk, easy to test |
| A2 | Python 3.11 can run lambda_agent_api despite the >=3.14 constraint in pyproject.toml | Standard Stack | If the code used 3.14-specific syntax, it would fail at runtime; actual code inspected, no 3.14 syntax found |
| A3 | httpx is preferred over aiohttp for POST /reset (one-shot HTTP) | Standard Stack | Either works; httpx already in deps |

Open Questions

  1. tokens_used capture from AgentApi

    • What we know: MsgEventEnd.tokens_used is put into _current_queue but consumed (not yielded) by send_message() generator
    • What's unclear: Cleanest interception point without modifying lambda_agent_api source
    • Recommendation: Add last_tokens_used: int = 0 attribute to AgentApi and set it in send_message()'s finally block when draining orphan queue, OR set it in _listen() before putting MsgEventEnd in queue
  2. !load numeric input dispatch

    • What we know: Plain text 1, 2 arrives as IncomingMessage, not IncomingCommand
    • What's unclear: Where to intercept — in on_room_message() (bot layer) or in dispatcher pre-hook
    • Recommendation: Intercept in MatrixBot.on_room_message() before dispatcher.dispatch(). Keeps dispatcher clean.
  3. lambda_agent_api install in Docker

    • What we know: It's a local package in external/platform-agent_api/
    • What's unclear: Whether to install as editable or copy sources
    • Recommendation: COPY external/platform-agent_api /build/lambda_agent_api && pip install /build/lambda_agent_api in Dockerfile

Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Python 3.11+ | All | | System | |
| aiohttp | AgentApi WS | | >=3.9 in deps | |
| httpx | POST /reset | | >=0.27 in deps | aiohttp |
| matrix-nio | Matrix bot | | >=0.21 in deps | |
| lambda_agent_api | AgentApi | local only | 0.1.0 | |
| Docker | Container build | [ASSUMED] | standard dev env | |
| platform-agent (running) | Integration test | local clone | origin/main needed | |

Validation Architecture

Test Framework

| Property | Value |
|---|---|
| Framework | pytest + pytest-asyncio (asyncio_mode = "auto") |
| Config file | pyproject.toml [tool.pytest.ini_options] |
| Quick run command | pytest tests/platform/test_real.py tests/adapter/matrix/test_dispatcher.py -v |
| Full suite command | pytest tests/ -v |

Phase Requirements → Test Map

| Req | Behavior | Test Type | File |
|---|---|---|---|
| Remove build_thread_key | Function gone from sdk/ | unit | tests/platform/test_agent_session.py (update/remove) |
| AgentApi replaces AgentSessionClient | RealPlatformClient uses AgentApi | unit | tests/platform/test_real.py (update) |
| !save sends prompt to agent | Command dispatches agent message | unit | tests/adapter/matrix/test_dispatcher.py (add) |
| !load shows list | Command returns numbered list | unit | tests/adapter/matrix/test_dispatcher.py (add) |
| !load numeric select | Bot intercepts digit, sends load prompt | unit | tests/adapter/matrix/test_dispatcher.py (add) |
| !reset shows dialog | Command returns confirmation UI | unit | tests/adapter/matrix/test_dispatcher.py (add) |
| !context returns snapshot | Command returns session info | unit | tests/adapter/matrix/test_dispatcher.py (add) |
| PrototypeStateStore saved sessions | add/list saved sessions | unit | tests/platform/test_prototype_state.py (add) |

Wave 0 Gaps

  • tests/platform/test_agent_api_integration.py — unit tests for RealPlatformClient with mocked AgentApi
  • tests/adapter/matrix/test_context_commands.py — dedicated module for !save/!load/!reset/!context handlers

Sources

Primary (HIGH confidence — verified by file read in this session)

  • external/platform-agent_api/lambda_agent_api/agent_api.py — AgentApi constructor, connect/close/send_message, _listen loop
  • external/platform-agent_api/lambda_agent_api/server.py — MsgEventTextChunk, MsgEventEnd, MsgStatus, AgentEventUnion types
  • external/platform-agent_api/lambda_agent_api/client.py — MsgUserMessage type
  • external/platform-agent/src/api/external.py — current (patched) and origin/main versions verified via git show
  • adapter/matrix/handlers/__init__.py — handler registration pattern
  • adapter/matrix/store.py — pending_confirm key pattern
  • adapter/matrix/bot.py — MatrixBot, build_runtime, _build_platform_from_env
  • sdk/agent_session.py — current AgentSessionClient (to be replaced)
  • sdk/real.py — RealPlatformClient (to be updated)
  • sdk/prototype_state.py — PrototypeStateStore (to be extended)
  • core/protocol.py — IncomingCommand, OutgoingMessage types
  • pyproject.toml — dependency versions
  • external/platform-agent_api/pyproject.toml — Python version constraint

Tertiary (LOW confidence)

  • Docker best practices for Python apps [ASSUMED] — standard industry pattern

Metadata

Confidence breakdown:

  • AgentApi interface: HIGH — read source directly
  • platform-agent origin/main diff: HIGH — verified via git show origin/main
  • handler registration pattern: HIGH — read all handler files
  • pending_confirm key pattern: HIGH — read store.py directly
  • tokens_used interception: MEDIUM — pattern clear but implementation needs care
  • Docker/docker-compose: MEDIUM — standard pattern, not verified against specific matrix-nio requirements

Research date: 2026-04-16
Valid until: 2026-05-16 (lambda_agent_api is local, so it stays stable until the platform team updates it)