docs: clean up GSD planning state and remove outdated legacy phases

This commit is contained in:
Mikhail Putilovskij 2026-05-03 23:42:05 +03:00
parent 65445f516f
commit 9fc0b72ab1
12 changed files with 61 additions and 1386 deletions

View file

@ -2,56 +2,44 @@
## What This Is ## What This Is
Telegram и Matrix боты для взаимодействия пользователя с AI-агентом Lambda. Каждый бот — тонкий адаптер поверх общего ядра (`core/`), изолирующего бизнес-логику от транспорта. Платформа подключается через `sdk/interface.py` Protocol; сейчас используется `MockPlatformClient`. Surfaces (поверхности) — это тонкие адаптеры-клиенты, соединяющие мессенджеры с агентами платформы Lambda.
Текущая и главная реализация — **Matrix MVP**. Бот работает как stateless-прослойка: преобразует события Matrix во внутренний протокол `core/` и маршрутизирует их на внешние контейнеры агентов (через `AgentApi` по WebSocket).
## Core Value ## Core Value
Пользователь может вести диалог с Lambda-агентом через любой из поддерживаемых мессенджеров без изменения ядра системы. Пользователь может бесшовно взаимодействовать с изолированными AI-агентами через нативные интерфейсы мессенджеров (с поддержкой пересылки файлов и работы в комнатах), в то время как сама платформа агентов не зависит от транспорта.
## Requirements ## Requirements
### Validated ### Validated
- ✓ core/ — унифицированный протокол событий, EventDispatcher, StateStore, ChatManager, AuthManager, SettingsManager — existing - ✓ `core/` — унифицированный протокол событий, EventDispatcher, StateStore, ChatManager.
- ✓ adapter/telegram/ — forum-first адаптер (Threaded Mode), `/start`, `/new`, `/archive`, `/rename`, `/settings`, стриминг ответов — existing, QA passed - ✓ `adapter/matrix/` — Space+rooms адаптер. Прием инвайтов, автосоздание иерархии комнат, команды `!new`, `!archive`, `!clear`, `!yes`/`!no`.
- ✓ adapter/matrix/ — Space+rooms адаптер, invite flow, `!new`, `!archive`, `!rename`, `!settings`, room-per-chat — existing - ✓ `sdk/real.py` — интеграция с AgentApi. Поддержка WebSocket для обмена сообщениями и передачи вложений в обе стороны.
- ✓ sdk/mock.py — MockPlatformClient: `stream_message`, `get_or_create_user`, `get_settings`, `update_settings` — existing - ✓ Shared Volume — прямая передача файлов в локальные рабочие папки агентов (`/agents/`).
- ✓ Dynamic Routing — маршрутизация чатов к агентам на основе `config/matrix-agents.yaml`.
- ✓ Deployment — Разделение окружений на `docker-compose.prod.yml` (только бот) и `docker-compose.fullstack.yml` (бот + локальный агент для E2E).
### Active ### Out of Scope / Deferred
- [ ] Matrix QA — ручное тестирование Matrix адаптера, фиксация багов - E2EE для Matrix (отложено из-за сложностей сборки `python-olm` на кросс-платформенных средах).
- [ ] SDK integration — заменить MockPlatformClient реальным Lambda SDK (когда платформа готова) - Интеграция с Master-сервисом платформы (временно используется прямое соединение с `platform-agent` через AgentApi).
- [ ] Production hardening — конфиг для деплоя, логирование, мониторинг - Telegram-адаптер (вынесен в легаси ветку `feat/telegram-adapter`, MVP фокусируется на Matrix).
### Out of Scope
- E2EE для Matrix (python-olm не собирается на macOS/ARM) — инфраструктурная задача, отдельный трек
- Supergroup forum mode для Telegram — заменён Threaded Mode как основным режимом
- Telegram DM-first режим — заменён forum-first (Threaded Mode)
## Context ## Context
- Python 3.11+, aiogram 3.4+, matrix-nio 0.21+, SQLite, pytest-asyncio - Стек: Python 3.11+, `matrix-nio`, `uv`, `pydantic`.
- Threaded Mode — Bot API 9.3, Mac клиент имеет известные баги (новые топики не сразу видны в сайдбаре) - Бот хранит только локальную привязку (`room_id` <-> `platform_chat_id`) в SQLite. Вся долговременная память и история диалогов хранятся на стороне агента.
- Lambda platform SDK ещё не готов, всё работает через MockPlatformClient - Жизненный цикл контейнеров агентов управляется платформой, а не ботом.
- Архитектура: Hexagonal / Ports-and-Adapters; `core/` не зависит от транспорта
## Constraints
- **Tech stack**: aiogram 3.x для Telegram, matrix-nio для Matrix — не менять без обсуждения
- **Platform**: SDK подключается только через `sdk/interface.py` Protocol — core/ и adapters не трогаются при смене реализации
- **Telegram**: Threaded Mode — единственный поддерживаемый режим; `closeForumTopic`/`deleteForumTopic` не работают в personal chat forums
- **E2EE**: python-olm не собирается на текущей среде — Matrix работает только без шифрования
## Key Decisions ## Key Decisions
| Decision | Rationale | Outcome | | Decision | Rationale | Outcome |
|----------|-----------|---------| |----------|-----------|---------|
| Forum-first (Threaded Mode) для Telegram | Bot API 9.3 позволяет личный чат как форум — чище, без суперпруппы | ✓ Good | | Space+rooms для Matrix | Room-based UX и явные чаты (по одному на тред) удобнее, чем DM-каша | ✓ Good |
| (user_id, thread_id) как PK в chats | Изоляция контекстов по топику | ✓ Good | | Прямая интеграция AgentApi | Master API не был готов, прямое WebSocket соединение позволяет передавать стейт и файлы | ✓ Good |
| MockPlatformClient через sdk/interface.py | Не ждать SDK, разрабатывать независимо | ✓ Good | | Shared Volume для файлов | Избавляет от необходимости гонять base64 по сети, быстрый прямой доступ к файлам | ✓ Good |
| Space+rooms для Matrix | Room-based UX и явные чаты важнее DM-first упрощений | ✓ Good | | Stateless бот | Бот легко перезапускать и масштабировать, память изолирована в агентах | ✓ Good |
| Отказ от E2EE в Matrix | python-olm не собирается на macOS/ARM | — Pending |
## Evolution ## Evolution
@ -61,10 +49,5 @@ Telegram и Matrix боты для взаимодействия пользова
3. New requirements emerged? → Add to Active 3. New requirements emerged? → Add to Active
4. Decisions to log? → Add to Key Decisions 4. Decisions to log? → Add to Key Decisions
**After each milestone:**
1. Full review of all sections
2. Core Value check — still the right priority?
3. Update Context with current state
--- ---
*Last updated: 2026-04-02 after initialization* *Last updated: 2026-05-03 after codebase consolidation*

View file

@ -1,101 +1,32 @@
# Roadmap — v1.0 # Roadmap — v1.0
## Milestone: v1.0 — Production-ready surfaces ## Milestone: v1.0 — Production-ready Matrix MVP
### Phase 1: Matrix QA & Polish
**Goal:** Переработать Matrix адаптер с DM-first на Space+rooms, убрать реакции в пользу !yes/!no, довести до уровня "приемлемо работает" как Telegram.
**Depends on:** Telegram QA complete
**Plans:** 6 plans
Plans:
- [x] 01-01-PLAN.md — Space+rooms infrastructure (store helpers, handle_invite rewrite, room_router)
- [x] 01-02-PLAN.md — Chat command handlers (!new, !archive, !rename) Space-aware
- [x] 01-03-PLAN.md — Reaction removal + !yes/!no confirmation + settings dashboard
- [x] 01-04-PLAN.md — Test suite (fix 4 broken + 12 new MAT-01..MAT-12)
- [x] 01-05-PLAN.md — Gap closure for Matrix `!yes` / `!no` pending-confirm scope
- [x] 01-06-PLAN.md — Remaining Phase 01 gap closure work (completed 2026-04-03)
### Phase 01: Matrix QA & Polish
**Goal:** Переработать Matrix адаптер с DM-first на Space+rooms, убрать реакции в пользу `!yes`/`!no`.
**Status:** Completed
**Deliverables:** **Deliverables:**
- Space+rooms architecture for Matrix adapter - Space+rooms architecture for Matrix adapter
- !yes/!no text-based confirmation (no reactions) - !yes/!no text-based confirmation
- Read-only !settings dashboard - Test suite green
- 96+ tests green
---
### Phase 01.1: Matrix restart reconciliation and dev reset workflow (INSERTED)
**Goal:** Сделать Matrix-адаптер пригодным для повторяемого локального рестарта и ручного QA: бот восстанавливает минимальный local state из существующих Space/rooms и даёт явный dev reset workflow вместо ручного ritual reset.
**Requirements**: none explicitly mapped
**Depends on:** Phase 1
**Plans:** 3 plans
Plans:
- [ ] 01.1-01-PLAN.md — Non-destructive Matrix reconciliation module and tests
- [ ] 01.1-02-PLAN.md — Wire startup/bootstrap recovery into the Matrix runtime
- [ ] 01.1-03-PLAN.md — Dev reset CLI and updated Matrix restart runbook
### Phase 2: SDK Integration
**Goal:** Заменить MockPlatformClient реальным Lambda SDK — бот начинает работать с настоящим AI-агентом.
**Depends on:** Phase 1, Lambda platform SDK готов
### Phase 04: Matrix MVP: Agent Integration
**Goal:** Подключить реального агента через `AgentApi`, добавить команды управления контекстом (`!clear`).
**Status:** Completed
**Deliverables:** **Deliverables:**
- `sdk/real.py` — реализация PlatformClient через реальный SDK - `sdk/real.py` — реализация `PlatformClient` через реальный SDK (`AgentApi`).
- `bot.py` для обоих адаптеров переключается на реальный клиент через env var - Поддержка WebSocket стриминга.
- `stream_message` работает с реальным стримингом - Команды управления контекстом.
- Интеграционные тесты с реальным SDK (или staging) - Обертка в Docker.
### Phase 4: Matrix MVP: shared agent context and context management commands
**Goal:** Привести Matrix-бот к рабочему состоянию для MVP-деплоя: заменить AgentSessionClient на AgentApi, добавить !save/!load/!reset/!context команды управления контекстом агента, упаковать в Docker.
**Requirements**: Replace AgentSessionClient with AgentApi; Wire AgentApi lifecycle; Implement !save, !load, !reset, !context commands; Dockerfile + docker-compose
**Depends on:** Phase 1 (Matrix adapter complete)
**Plans:** 3 plans
Plans:
- [x] 04-01-PLAN.md — Replace AgentSessionClient with AgentApi; update sdk/real.py, bot.py, broken tests
- [x] 04-02-PLAN.md — !save, !load, !reset, !context handlers; PrototypeStateStore extensions; numeric interception
- [x] 04-03-PLAN.md — Dockerfile + docker-compose.yml + .env.example update
---
### Phase 05: MVP Deployment ### Phase 05: MVP Deployment
**Goal:** Подготовить Matrix-бот к реальному деплою на lambda.coredump.ru с маршрутизацией по агентам и передачей файлов.
**Goal:** Подготовить Matrix-бот к реальному деплою на lambda.coredump.ru без потери Space+rooms UX: закрепить per-room `platform_chat_id`, реальный `!clear`, reconciliation, file transfer через shared volume и разделение prod/fullstack compose. **Status:** Completed
**Depends on:** Phase 4
**Plans:** 4/4 plans complete
Plans:
- [x] 05-01-PLAN.md — Startup reconciliation from authoritative Matrix Space topology before live sync
- [x] 05-02-PLAN.md — Room-local `platform_chat_id` routing and real `!clear` semantics
- [x] 05-03-PLAN.md — Shared-volume attachment path hardening for `/agents` deployment
- [x] 05-04-PLAN.md — Split bot-only prod compose from internal fullstack compose and update docs
**Deliverables:** **Deliverables:**
- Space+rooms onboarding remains the primary Matrix UX - Загрузка `matrix-agents.yaml` для маппинга пользователей к агентам.
- Per-room `platform_chat_id` provides true context isolation and `!clear` - Per-room `platform_chat_id` routing.
- Reconciliation restores room metadata and routing after restart - File transfer через shared `/agents/` volume.
- File transfer uses shared `/agents/` volume with room-safe behavior - Разделение `docker-compose.prod.yml` и `docker-compose.fullstack.yml`.
- `docker-compose.prod.yml` is bot-only handoff; `docker-compose.fullstack.yml` is for internal E2E testing
--- ---
*Note: Легаси-фазы, связанные с Telegram, прототипами и Mock-платформой, были удалены из Roadmap после закрепления архитектуры MVP в ветке main.*
### Phase 3: Production Hardening
**Goal:** Подготовить боты к реальному деплою — конфиг, логирование, мониторинг, обработка ошибок.
**Depends on:** Phase 2
**Deliverables:**
- Docker / systemd конфиг для деплоя
- Структурированное логирование в production формате
- Health-check endpoint (если нужен)
- Rate limiting и защита от спама
- Graceful shutdown

View file

@ -2,12 +2,12 @@
gsd_state_version: 1.0 gsd_state_version: 1.0
milestone: v1.0 milestone: v1.0
milestone_name: — Production-ready surfaces milestone_name: — Production-ready surfaces
status: Phase 05 Paused status: MVP Deployed
last_updated: "2026-04-29T08:49:04Z" last_updated: "2026-05-03T23:00:00Z"
progress: progress:
total_phases: 6 total_phases: 3
completed_phases: 3 completed_phases: 3
total_plans: 16 total_plans: 13
completed_plans: 13 completed_plans: 13
--- ---
@ -15,115 +15,35 @@ progress:
## Project Reference ## Project Reference
See: .planning/PROJECT.md (updated 2026-04-02) See: `.planning/PROJECT.md` (updated 2026-05-03)
**Core value:** Пользователь ведёт диалог с Lambda через любой мессенджер без изменения ядра **Core value:** Пользователь бесшовно взаимодействует с изолированными агентами через нативные интерфейсы (Matrix), в то время как платформа агентов не зависит от транспорта.
**Current focus:** Phase 05 paused — latest file-contract change needs a new image build before platform redeploy **Current focus:** Итерационное развитие текущей архитектуры, добавление новых фич и поверхностей (по мере необходимости).
## Current Phase ## Current Phase
**Phase 05** paused: MVP deployment hardening is in place, but the latest attachment workspace-root change is not yet published Текущий MVP успешно завершен. Все базовые механизмы внедрены и работают:
- Маршрутизация к `AgentApi`
- Shared Volume файловый обмен (`/agents/`)
- Dynamic config через `matrix-agents.yaml`
- Изоляция контекстов через `platform_chat_id`
Deployment handoff follow-up is external. The last published image predates the latest file-handling change; the next step is to rebuild and publish a fresh image, then ask the platform to redeploy Matrix with the shared `/agents` volumes and `config/matrix-agents.yaml`. Проект находится в чистом состоянии для начала нового планирования. Неактуальные легаси фазы и прототипы (Telegram, MockPlatformClient) удалены из Roadmap и трекинга.
Plan `05-01` is complete. Matrix startup now reconciles managed Space rooms from synced topology before live traffic, restoring local metadata and deterministic legacy `platform_chat_id` bindings on restart.
- `a75b26a` — failing restart reconciliation regressions for recovery, idempotence, startup ordering, and legacy backfill
- `8a80d00` — startup reconciliation module and pre-sync wiring in the Matrix runtime
Verified with `MATRIX_AGENT_REGISTRY_PATH='' MATRIX_PLATFORM_BACKEND='' UV_CACHE_DIR=/tmp/uv-cache-surfaces uv run pytest tests/adapter/matrix/test_invite_space.py tests/adapter/matrix/test_chat_space.py tests/adapter/matrix/test_reconciliation.py tests/adapter/matrix/test_restart_persistence.py tests/adapter/matrix/test_dispatcher.py -v`.
Plan `05-02` is complete. Matrix room-local context commands now rely on repaired per-room `platform_chat_id` bindings, and `!clear` rotates only the active room's upstream context when prototype room state is available.
- `ae37476` — failing regressions for clear registration, room-local rotation, and strict routed-platform metadata requirements
- `85e2fda` — room-local clear semantics, compatibility alias wiring, and strict context resolution without shared chat fallbacks
Verified with `MATRIX_AGENT_REGISTRY_PATH='' MATRIX_PLATFORM_BACKEND='' UV_CACHE_DIR=/tmp/uv-cache-surfaces uv run pytest tests/adapter/matrix/test_context_commands.py tests/adapter/matrix/test_routed_platform.py tests/adapter/matrix/test_dispatcher.py -v`.
Plan `05-03` is complete. Shared-volume attachment handling now preserves relative agent paths while tolerating both `/workspace` and `/agents` absolute prefixes during normalization and Matrix file rendering.
- `7a12a71` — failing regressions for shared-volume path normalization and room-safe attachment handling
- `5eddf16``/agents` deployment path hardening for Matrix files and routed platform attachments
Verified with `uv run pytest tests/adapter/matrix/test_files.py tests/platform/test_real.py tests/adapter/matrix/test_send_outgoing.py -v`.
Plan `05-04` is complete. Production handoff now uses `docker-compose.prod.yml` for a bot-only runtime, while internal end-to-end verification uses `docker-compose.fullstack.yml` with shared `/agents` volume guidance and health-gated startup.
- `df6d8bf` — split prod and full-stack compose artifacts with the shared `/agents` contract
- `22a3a2b` — operator and deployment docs aligned to the split compose artifacts
Verified with `docker compose -f docker-compose.prod.yml config`, `docker compose -f docker-compose.fullstack.yml config`, and docs grep checks for `docker-compose.prod.yml`, `docker-compose.fullstack.yml`, and `/agents`.
## Decisions ## Decisions
- Продолжаем с Threaded Mode несмотря на баги Mac клиента (2026-04-02) - **Space+rooms для Matrix**: Разделение тредов на отдельные Matrix-комнаты, собранные в едином Space пользователя.
- Invite flow Matrix переведён на idempotent-проверку через `user_meta.space_id`, а не через invite-room metadata (2026-04-02) - **AgentApi**: Прямая интеграция с локальным агентом без Master-прослойки по WebSocket.
- Неизвестные Matrix rooms больше не auto-register в роутере; используется явный fallback `unregistered:{room_id}` с warning-логом (2026-04-02) - **Shared Volume**: Файлы кладутся напрямую в рабочую папку агента, избавляя от необходимости гонять их по сети в Base64.
- [Phase 01]: Use ChatContext.surface_ref as the Matrix room identifier for !rename updates. - **Статическая маршрутизация**: На данном этапе пользователи маппятся на агентов жестко через YAML.
- [Phase 01]: Keep !archive limited to core archive state in Phase 1; Space child removal remains deferred.
- [Phase 01]: Matrix OutgoingUI no longer emits reactions; confirmation state is persisted and resumed via `!yes` / `!no`.
- [Phase 01]: `!settings` now renders a dashboard snapshot instead of advertising mutable subcommands.
- [Phase 01]: Split Matrix regression coverage into dedicated invite/chat/send_outgoing/confirm test modules.
- [Phase 01]: Kept 01-04 scoped to test coverage without widening into production-code changes.
- [Phase 01]: Matrix command callbacks now include room_id in payload for !yes and !no so confirm handlers can resolve runtime state without changing core protocol types.
- [Phase 01]: Pending confirmations are stored under the D-08 composite key of matrix user id plus room id, with a narrow legacy fallback only for callers that omit room context.
- [Phase 01]: Removed Matrix reaction conversion entirely and kept command callbacks limited to !yes/!no.
- [Phase 01]: Kept !settings as a pure snapshot surface while preserving mutable subcommands outside the dashboard.
- [Phase 01]: Seeded invite and dispatcher tests with explicit next_chat_index and room ids instead of treating C1 as Matrix transport identity.
- [Phase 04]: Replaced AgentSessionClient with AgentApiWrapper and persistent agent connection lifecycle in Matrix runtime.
- [Phase 04]: Added !save, !load, !reset, and !context commands with pending-state interception and local prototype session metadata.
- [Phase 04]: Added Matrix-only Docker packaging for MVP deployment; platform services remain external to this compose setup.
- [Phase 04]: Replaced the Matrix prod path again with direct upstream `AgentApi` per request; removed the local runtime wrapper from the prod flow.
- [Phase 04]: Adopted `AGENT_BASE_URL` as the primary runtime contract and kept `AGENT_WS_URL` only as backward-compatible env fallback.
- [Phase 04 follow-up]: Kept shared PlatformClient unchanged; introduced Matrix-specific RoutedPlatformClient to avoid breaking Telegram adapter.
- [Phase 04 follow-up]: agent_routing_enabled flag on MatrixRuntime activates stale-room check only in real multi-agent mode (RoutedPlatformClient).
- [Phase 04 follow-up]: !new binds agent_id at room creation time using selected_agent_id from user metadata.
- [Phase 04 follow-up]: platform_chat_seq (PLATFORM_CHAT_SEQ_KEY) is stored in SQLiteStore and survives restart — confirmed by test.
- [Phase 05 reset]: Discard the single-chat / DM-first deployment direction. Replan around Space+rooms, per-room `platform_chat_id`, real `!clear`, reconciliation, and split prod/fullstack compose artifacts.
- [Phase 05]: Keep adapter/matrix/files.py as the sole path builder; sdk/real.py only normalizes shared-volume attachment references.
- [Phase 05]: Normalize /workspace and /agents absolute file paths back to relative workspace_path values before agent transport and Matrix file rendering.
- [Phase 05]: Treat synced Matrix topology as authoritative for startup recovery; keep SQLite rebuildable.
- [Phase 05]: Backfill missing platform_chat_id values during startup reconciliation before routed handling begins.
- [Phase 05]: Expose `clear` only when prototype room-context support is available, while keeping `reset` as a compatibility alias.
- [Phase 05]: Require recovered `platform_chat_id` for save/context/clear flows instead of falling back to shared local chat ids.
- [Phase 05]: Split Compose artifacts by runtime intent: bot-only prod handoff vs internal full-stack verification.
- [Phase 05]: Document /agents as the bot-side shared volume root while internal platform-agent keeps /workspace on the same named volume.
## Blockers ## Blockers
- Lambda platform SDK не готов — Phase 2 заблокирована до готовности платформы - Отсутствуют. Проект готов к деплою (см. `docker-compose.prod.yml`).
- Full production verification depends on the platform team's real multi-agent orchestration, production Matrix credentials, `config/matrix-agents.yaml`, and shared `/agents/N` volume mounts.
## Accumulated Context ## Accumulated Context
### Roadmap Evolution ### Roadmap Evolution
- Phase 01.1 inserted after Phase 01: Matrix restart reconciliation and dev reset workflow (URGENT) - Изначальный Roadmap включал множество ответвлений (прототипы Telegram, локальный mock-клиент). После закрепления MVP в `main` Roadmap был очищен, чтобы отражать только актуальный путь продукта.
- Phase 4 added: Matrix MVP: shared agent context and context management command - Следующие фазы будут добавляться по мере возникновения новых задач (например, переход от YAML-конфига к БД для реестра агентов).
- Phase 04 follow-up added inline: multi-agent routing (RoutedPlatformClient, !agent, stale room blocking, restart persistence)
- Phase 05 reset on 2026-04-28: erroneous single-chat deployment artifacts were removed before fresh planning.
## Performance Metrics
| Phase | Plan | Duration | Tasks | Files | Recorded |
| --- | --- | --- | --- | --- | --- |
| 01 | 01 | 1 min | 3 | 3 | 2026-04-02T19:50:50Z |
| 01 | 02 | 1 min | 2 | 2 | 2026-04-02 |
| 01 | 03 | 3 min | 2 | 5 | 2026-04-02T19:57:34Z |
| 01 | 04 | 3 min | 2 | 7 | 2026-04-02T20:03:38Z |
| 01 | 05 | 2 min | 2 | 7 | 2026-04-03T09:28:47Z |
| 01 | 06 | 4 min | 2 | 7 | 2026-04-03T09:35:47Z |
| 04 | 01 | 1 session | 1 wave | 8 | 2026-04-17 |
| 04 | 02 | 1 session | 2 commits + summary | 8 | 2026-04-17 |
| 04 | 03 | 1 session | 1 commit + summary | 4 | 2026-04-17 |
| 04 | follow-up | 1 session | 5 tasks | 10+ | 2026-04-24 |
| 05 | 03 | 3 min | 2 | 3 | 2026-04-27T22:06:43Z |
| 05 | 01 | 8 min | 2 | 4 | 2026-04-27T22:09:28Z |
| 05 | 02 | 16 min | 2 | 4 | 2026-04-27T22:15:58Z |
| 05 | 04 | 3 min | 2 | 5 | 2026-04-27T22:17:10Z |
## Session
- Last session: 2026-04-29T08:49:04Z
- Stopped at: Handoff updated after attachment workspace-root change; waiting for image rebuild and platform redeploy
- Resume file: .planning/phases/05-mvp-deployment/.continue-here.md

View file

@ -1,63 +0,0 @@
---
phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow
task: 1
total_tasks: 2
status: paused
last_updated: 2026-04-07T21:29:48.982Z
---
<current_state>
Formally, the most recently active execution artifact inside the roadmap is still `01.1-03-PLAN.md`, which has not been implemented yet. In parallel, the platform-integration track has moved forward: the direct-agent Matrix prototype design is now approved, the implementation plan is written, and the next useful session should evaluate that spec/plan pair against the live platform repos before starting execution.
</current_state>
<completed_work>
- Re-analysed live platform repos on 2026-04-07 by cloning `platform/agent`, `platform/agent_api`, `platform/master`, and `platform/docs`.
- Confirmed `master` is still only a thin HTTP skeleton with `/health` and `/users/{user_id}`, not a chat/session/settings backend for surfaces.
- Confirmed `agent` exposes a working `/agent_ws/` WebSocket and `agent_api` provides enough protocol/client code to stream model output.
- Identified the real technical gap for a prototype: `agent` currently uses a singleton service with a fixed `thread_id="default"`, so all conversations would share memory unless that is parameterized.
- Derived and got approval for the prototype path: keep Matrix adapter logic largely intact, add `sdk/agent_session.py`, `sdk/prototype_state.py`, and `sdk/real.py`, keep settings local, and use the direct `agent` WebSocket for real messaging.
- Resolved the repo-placement question: the prototype stays in this repo on its own branch, not in a separate prototype repo.
- Resolved the platform-change minimization question: prefer patching only `platform/agent`, not `platform/agent_api`, and use a tiny local WebSocket client in this repo.
- Wrote and committed the approved design spec: `docs/superpowers/specs/2026-04-08-matrix-direct-agent-prototype-design.md`.
- Wrote the implementation plan: `docs/superpowers/plans/2026-04-08-matrix-direct-agent-prototype.md`.
</completed_work>
<remaining_work>
- Task 1: Implement `adapter.matrix.reset` with `local-only`, `server-leave-forget`, and `--dry-run`, plus tests.
- Task 2: Update `README.md` so restart vs explicit reset workflow is documented and the old manual reset ritual is removed.
- Prototype evaluation follow-up: review the approved spec and plan against the platform repos before starting execution.
- Future prototype work: introduce `sdk/real.py` plus a narrow compatibility boundary that keeps Matrix adapter logic stable while allowing later expansion toward a fuller platform split.
</remaining_work>
<decisions_made>
- Do not integrate with `master` yet; it is still not the backend surfaces needs.
- Use the direct `agent` WebSocket as the only realistic path for a working prototype right now.
- Keep consumer-facing Matrix logic as intact as possible and absorb backend differences inside a shim under `sdk/`.
- Treat future platform evolution as likely to split into at least two concerns: control-plane access and direct agent session streaming.
- Keep the prototype in this repo on its own branch.
- Minimize platform-side changes by patching only `platform/agent` if possible.
</decisions_made>
<blockers>
- Phase 01.1 itself is not blocked; it is simply paused.
- Prototype blocker: the `agent` repo currently hardcodes a shared `thread_id`, so per-user/per-chat conversation isolation requires either a small upstream change or a careful workaround.
- Platform contract blocker remains for the longer-term Phase 02 direction: `master` still lacks stable user/chat/session/settings APIs for surfaces.
</blockers>
<context>
The important mental model is now stable enough to execute. Full SDK integration through `master` is still premature, but a working Matrix prototype can be built now by talking directly to the `agent` WebSocket and hiding the split backend reality behind `sdk/real.py`. The approved design keeps the prototype in this repo, keeps settings local, and minimizes platform changes by preferring a tiny `platform/agent` patch over broader protocol churn. For evaluation and implementation context, inspect:
- local spec: `docs/superpowers/specs/2026-04-08-matrix-direct-agent-prototype-design.md`
- local plan: `docs/superpowers/plans/2026-04-08-matrix-direct-agent-prototype.md`
- remote repos: `https://git.lambda.coredump.ru/platform/agent`, `https://git.lambda.coredump.ru/platform/master`, `https://git.lambda.coredump.ru/platform/agent_api`
- local clones: `/tmp/platform-agent`, `/tmp/platform-master`, `/tmp/platform-agent_api`
</context>
<next_action>
Resume with one of these depending on priority:
1. Evaluate the approved prototype spec and implementation plan against the live platform repos and decide whether to start in this repo or patch `platform/agent` first.
2. If staying on roadmap execution, implement `01.1-03-PLAN.md` Task 1 (`adapter.matrix.reset`) first.
3. If starting prototype execution immediately, begin with Task 1 of `docs/superpowers/plans/2026-04-08-matrix-direct-agent-prototype.md`.
</next_action>

View file

@ -1,157 +0,0 @@
---
phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- adapter/matrix/reconcile.py
- tests/adapter/matrix/test_reconcile.py
autonomous: true
requirements: []
must_haves:
truths:
- "A normal Matrix restart can rebuild missing local metadata from already joined Space/chat rooms instead of requiring a destructive reset."
- "Reconciliation restores the minimal local state needed for routing and chat operations: `matrix_user:*`, `matrix_room:*`, and missing `chat:{user}:{chat_id}` rows."
- "Reconciliation never provisions new Matrix rooms or Spaces while repairing local state."
- "Recovered users get `next_chat_index` advanced past the highest recovered `C*` chat id."
artifacts:
- path: "adapter/matrix/reconcile.py"
provides: "Matrix bootstrap reconciliation helpers and structured report objects."
- path: "tests/adapter/matrix/test_reconcile.py"
provides: "Regression coverage for startup and single-room reconciliation behavior."
key_links:
- from: "adapter/matrix/reconcile.py"
to: "adapter/matrix/store.py"
via: "set_user_meta and set_room_meta restore Matrix metadata"
pattern: "set_(user|room)_meta"
- from: "adapter/matrix/reconcile.py"
to: "core/chat.py"
via: "chat_mgr.get_or_create repairs missing `chat:*` rows"
pattern: "chat_mgr\\.get_or_create"
---
<objective>
Create the non-destructive Matrix reconciliation layer that Phase 01.1 depends on.
Purpose: Per D-01 through D-07, the adapter must stop treating local SQLite state as the only truth in dev. Startup and recovery code need a single helper module that can rebuild local metadata from the homeserver room graph without creating duplicate Spaces or chats.
Output: `adapter/matrix/reconcile.py` with full-run and single-room recovery helpers, plus targeted pytest coverage.
</objective>
<execution_context>
@/Users/a/.codex/get-shit-done/workflows/execute-plan.md
@/Users/a/.codex/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md
@.planning/phases/01-matrix-qa-polish/01-01-SUMMARY.md
@adapter/matrix/store.py
@adapter/matrix/handlers/auth.py
@core/chat.py
@tests/adapter/matrix/test_invite_space.py
<interfaces>
From `adapter/matrix/store.py`:
```python
async def get_room_meta(store: StateStore, room_id: str) -> dict | None
async def set_room_meta(store: StateStore, room_id: str, meta: dict) -> None
async def get_user_meta(store: StateStore, matrix_user_id: str) -> dict | None
async def set_user_meta(store: StateStore, matrix_user_id: str, meta: dict) -> None
```
From `core/chat.py`:
```python
async def get_or_create(
self,
user_id: str,
chat_id: str,
platform: str,
surface_ref: str,
name: str | None = None,
) -> ChatContext
```
From Phase 01 room metadata shape:
```python
{
"room_type": "chat",
"chat_id": "C4",
"display_name": "Чат 4",
"matrix_user_id": "@alice:example.org",
"space_id": "!space:example.org",
}
```
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Add reconciliation module for startup and single-room recovery</name>
<files>adapter/matrix/reconcile.py, tests/adapter/matrix/test_reconcile.py</files>
<read_first>adapter/matrix/store.py, adapter/matrix/handlers/auth.py, core/chat.py, tests/adapter/matrix/test_invite_space.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md</read_first>
<behavior>
- Test 1: `reconcile_matrix_state(...)` recreates missing `matrix_user:*`, `matrix_room:*`, and `chat:*` entries from joined Matrix rooms without calling `room_create`.
- Test 2: `reconcile_matrix_state(...)` leaves already-correct local metadata intact and reports restored vs skipped/conflicting rooms.
- Test 3: `reconcile_single_room(...)` can repair one `unregistered:{room_id}` chat room on demand and recompute `next_chat_index` for that user.
- Test 4: Space rooms or unrelated joined rooms are skipped, not converted into chat rows.
</behavior>
<action>
Create `adapter/matrix/reconcile.py` as the authoritative recovery module for this phase. Implement a small, explicit API that Plan 02 can wire directly:
```python
async def reconcile_matrix_state(client: Any, store: StateStore, chat_mgr: ChatManager) -> dict: ...
async def reconcile_single_room(
client: Any, store: StateStore, chat_mgr: ChatManager, room_id: str, matrix_user_id: str
) -> dict: ...
```
Inside this module, add focused private helpers as needed for room classification, extracting room names, parsing `C<number>` ids, and recomputing `next_chat_index`. Keep the logic non-destructive per D-04:
- never call `room_create`, `room_invite`, or provisioning code from `handlers/auth.py`
- prefer already-hydrated room data from the post-sync client object, and only fall back to explicit room-state fetches if required for room classification
- rebuild only the minimal metadata required by D-03: `matrix_user:*`, `matrix_room:*`, and missing `chat:{user}:{chat_id}` records
- if `chat:*` exists but points at the wrong `surface_ref`, repair it from Matrix room metadata and include the fix in the returned report
- derive `next_chat_index` from the highest recovered `C<number>` for that user instead of trusting stale local counters
Return a structured reconciliation report with stable keys such as:
`joined_rooms`, `restored_user_meta`, `restored_room_meta`, `restored_chat_rows`, `repaired_chat_rows`, `skipped_rooms`, and `conflicts`.
Write `tests/adapter/matrix/test_reconcile.py` with lightweight `SimpleNamespace`/fake-client fixtures following the existing Matrix test style. Cover both full startup reconciliation and `reconcile_single_room(...)`. Assert that no provisioning calls are made during reconciliation, because D-04 forbids creating new Space/room topology while recovering local state.
</action>
<verify>
<automated>cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_reconcile.py -q</automated>
</verify>
<acceptance_criteria>
- `adapter/matrix/reconcile.py` exports `reconcile_matrix_state` and `reconcile_single_room`.
- Reconciliation restores missing `matrix_user:*`, `matrix_room:*`, and `chat:*` entries for already-joined Matrix chat rooms per D-02 and D-03.
- Reconciliation does not call `room_create` or otherwise provision new server-side rooms per D-04.
- The report returned by reconciliation clearly distinguishes restored items, skipped rooms, and conflicts.
- `tests/adapter/matrix/test_reconcile.py` proves `next_chat_index` is recomputed from recovered chat ids rather than stale local state.
</acceptance_criteria>
<done>The repository has an executable, tested reconciliation layer that can rebuild local Matrix metadata after dev-state loss without duplicating server-side rooms.</done>
</task>
</tasks>
<verification>
Run `pytest tests/adapter/matrix/test_reconcile.py -q` and confirm startup and single-room reconciliation paths are covered.
</verification>
<success_criteria>
- Matrix recovery logic exists as a dedicated module instead of being scattered through handlers.
- Reconciliation is idempotent, non-destructive, and sufficient to restore routing/chat metadata from existing Matrix rooms.
- Plan 02 can wire startup and first-access recovery by calling exported functions rather than inventing new recovery logic.
</success_criteria>
<output>
After completion, create `.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-01-SUMMARY.md`
</output>

View file

@ -1,167 +0,0 @@
---
phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow
plan: 02
type: execute
wave: 2
depends_on: ["01.1-01"]
files_modified:
- adapter/matrix/bot.py
- tests/adapter/matrix/test_dispatcher.py
autonomous: true
requirements: []
must_haves:
truths:
- "The Matrix bot performs an initial sync and reconciliation before entering steady-state `sync_forever()`."
- "If a room still arrives as `unregistered:{room_id}` after startup, the bot makes one targeted recovery attempt before dispatching or failing."
- "When reconciliation cannot repair a room, the bot logs a clear diagnostic reason instead of crashing on downstream commands like `!rename`."
artifacts:
- path: "adapter/matrix/bot.py"
provides: "Startup bootstrap flow with initial sync, reconciliation, and targeted runtime retry."
- path: "tests/adapter/matrix/test_dispatcher.py"
provides: "Matrix runtime coverage for pre-sync reconcile and on-message recovery behavior."
key_links:
- from: "adapter/matrix/bot.py"
to: "adapter/matrix/reconcile.py"
via: "startup bootstrap and single-room recovery calls"
pattern: "reconcile_(matrix_state|single_room)"
- from: "adapter/matrix/bot.py"
to: "adapter/matrix/room_router.py"
via: "unregistered room detection before dispatch"
pattern: "unregistered:"
---
<objective>
Wire the new reconciliation layer into the actual Matrix runtime.
Purpose: D-05 through D-07 require restart recovery to be the default developer path. The bot must bootstrap itself from existing Matrix rooms on startup and make one on-demand repair attempt before routing an unknown room through the dispatcher.
Output: `adapter/matrix/bot.py` performs initial sync + reconciliation before `sync_forever()`, and runtime tests prove the bot recovers or logs clearly instead of blindly dispatching broken state.
</objective>
<execution_context>
@/Users/a/.codex/get-shit-done/workflows/execute-plan.md
@/Users/a/.codex/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-01-PLAN.md
@adapter/matrix/bot.py
@adapter/matrix/room_router.py
@adapter/matrix/reconcile.py
@tests/adapter/matrix/test_dispatcher.py
<interfaces>
From `adapter/matrix/bot.py`:
```python
class MatrixBot:
async def on_room_message(self, room: MatrixRoom, event: RoomMessageText) -> None
async def main() -> None
```
From `adapter/matrix/reconcile.py`:
```python
async def reconcile_matrix_state(client: Any, store: StateStore, chat_mgr: ChatManager) -> dict
async def reconcile_single_room(
client: Any, store: StateStore, chat_mgr: ChatManager, room_id: str, matrix_user_id: str
) -> dict
```
From `adapter/matrix/room_router.py`:
```python
async def resolve_chat_id(store: StateStore, room_id: str, matrix_user_id: str) -> str
```
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Run initial sync and reconciliation before the long-poll loop</name>
<files>adapter/matrix/bot.py, tests/adapter/matrix/test_dispatcher.py</files>
<read_first>adapter/matrix/bot.py, adapter/matrix/reconcile.py, tests/adapter/matrix/test_dispatcher.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md</read_first>
<behavior>
- Test 1: `main()` performs `client.sync(timeout=0, full_state=True)` before `sync_forever()`.
- Test 2: `main()` calls `reconcile_matrix_state(...)` after the initial sync and logs the returned report.
- Test 3: startup still reaches `sync_forever()` when reconciliation reports recoverable skips/conflicts instead of fatal failure.
</behavior>
<action>
Modify `adapter/matrix/bot.py` so normal startup follows the two-phase bootstrap recommended in research:
1. build client and runtime
2. authenticate
3. register callbacks
4. run `await client.sync(timeout=0, full_state=True)`
5. run `await reconcile_matrix_state(client, runtime.store, runtime.chat_mgr)`
6. log a structured `matrix_reconcile_complete` event with the report fields
7. enter `await client.sync_forever(timeout=30000)`
Do not move provisioning logic into startup. The startup step only rehydrates local state from server-side rooms per D-02 through D-04.
Update or add focused tests in `tests/adapter/matrix/test_dispatcher.py` using `monkeypatch`/fake-client patterns already used in the repo so the verify command proves the call order and logging-safe behavior. The test should fail if `sync_forever()` starts before reconciliation.
</action>
<verify>
<automated>cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_dispatcher.py -q</automated>
</verify>
<acceptance_criteria>
- `adapter/matrix/bot.py` runs an initial full-state sync before steady-state polling.
- `adapter/matrix/bot.py` invokes `reconcile_matrix_state(...)` exactly once during startup.
- Startup logs a structured reconciliation summary instead of silently skipping the recovery step.
- `tests/adapter/matrix/test_dispatcher.py` asserts the bootstrap order explicitly.
</acceptance_criteria>
<done>Normal Matrix bot startup now includes a recovery pass before the event loop begins handling user traffic.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Retry unknown-room routing once before dispatching broken state</name>
<files>adapter/matrix/bot.py, tests/adapter/matrix/test_dispatcher.py</files>
<read_first>adapter/matrix/bot.py, adapter/matrix/room_router.py, adapter/matrix/reconcile.py, tests/adapter/matrix/test_dispatcher.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md</read_first>
<behavior>
- Test 1: `MatrixBot.on_room_message(...)` detects `unregistered:{room_id}`, runs `reconcile_single_room(...)`, then retries `resolve_chat_id(...)`.
- Test 2: if retry succeeds, the event is dispatched against the recovered logical chat id.
- Test 3: if retry still fails, the bot does not crash; it logs a clear warning and sends a user-facing diagnostic message to that room.
</behavior>
<action>
Extend `MatrixBot.on_room_message(...)` so D-07 is satisfied even when startup could not repair a room yet. Keep `resolve_chat_id(...)` as the room-router source of truth, but treat `unregistered:{room_id}` as a recovery trigger rather than a stable runtime identity:
- first call `resolve_chat_id(...)`
- if the result starts with `unregistered:`, call `reconcile_single_room(client, runtime.store, runtime.chat_mgr, room.room_id, event.sender)`
- immediately retry `resolve_chat_id(...)`
- only dispatch once a concrete logical chat id exists
- if the retry still returns `unregistered:{room_id}`, log a structured warning with room id, matrix user id, and reconciliation report, then send a short `OutgoingMessage`-equivalent Matrix text explaining that local state could not be restored automatically and a dev reset/restart may be required
Do not invent a new fallback chat id and do not auto-create rooms here; that would violate D-04. Keep this change inside `adapter/matrix/bot.py` so file ownership stays isolated for this plan.
</action>
<verify>
<automated>cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_dispatcher.py -q</automated>
</verify>
<acceptance_criteria>
- Unknown Matrix rooms trigger one targeted reconciliation attempt before dispatch.
- Successful targeted recovery leads to normal dispatch with a real logical `chat_id`.
- Failed targeted recovery logs a clear diagnostic and avoids a handler crash on missing chat state per D-06.
- No code path in this task provisions new Matrix rooms or Spaces.
</acceptance_criteria>
<done>The runtime treats unknown rooms as recoverable state drift first, not as a silent routing failure or crash path.</done>
</task>
</tasks>
<verification>
Run `pytest tests/adapter/matrix/test_dispatcher.py -q` and confirm both startup-bootstrap and first-access recovery behaviors are covered.
</verification>
<success_criteria>
- A standard Matrix restart now attempts recovery before the bot starts processing live events.
- Unknown-room events are diagnosable and recoverable instead of falling straight into broken command handling.
- The runtime never provisions new server-side rooms during restart reconciliation.
</success_criteria>
<output>
After completion, create `.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-02-SUMMARY.md`
</output>

View file

@ -1,149 +0,0 @@
---
phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow
plan: 03
type: execute
wave: 1
depends_on: []
files_modified:
- adapter/matrix/reset.py
- tests/adapter/matrix/test_reset.py
- README.md
autonomous: true
requirements: []
must_haves:
truths:
- "Developers have an explicit dev-only reset command instead of relying on memory or ad hoc shell history."
- "The default reset mode clears only local Matrix state and explains the manual Matrix-client cleanup that may still be needed."
- "Optional server cleanup is clearly named around leave/forget semantics and supports dry-run output."
artifacts:
- path: "adapter/matrix/reset.py"
provides: "Dev reset CLI for local-only, server-leave-forget, and dry-run workflows."
- path: "tests/adapter/matrix/test_reset.py"
provides: "CLI coverage for local reset behavior and printed operator guidance."
- path: "README.md"
provides: "Updated developer instructions for normal restart vs explicit reset."
key_links:
- from: "adapter/matrix/reset.py"
to: "README.md"
via: "documented invocation and manual Matrix cleanup guidance"
pattern: "adapter\\.matrix\\.reset"
---
<objective>
Ship the dev reset workflow that complements normal restart reconciliation.
Purpose: D-08 through D-10 require a repeatable, explicit reset path for clean-room QA without making destructive cleanup the default restart flow. This plan creates the tool and updates the runbook developers actually use.
Output: `adapter/matrix/reset.py`, pytest coverage, and README instructions that replace the old `rm -f lambda_matrix.db` ritual.
</objective>
<execution_context>
@/Users/a/.codex/get-shit-done/workflows/execute-plan.md
@/Users/a/.codex/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md
@README.md
@adapter/matrix/bot.py
@core/store.py
<interfaces>
From `adapter/matrix/bot.py` env usage:
```python
db_path = os.environ.get("MATRIX_DB_PATH", "lambda_matrix.db")
store_path = os.environ.get("MATRIX_STORE_PATH", "matrix_store")
homeserver = os.environ.get("MATRIX_HOMESERVER")
user_id = os.environ.get("MATRIX_USER_ID")
```
From `core/store.py`:
```python
class SQLiteStore:
def __init__(self, db_path: str) -> None: ...
```
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Add a dev-only Matrix reset CLI with explicit modes</name>
<files>adapter/matrix/reset.py, tests/adapter/matrix/test_reset.py</files>
<read_first>adapter/matrix/bot.py, core/store.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md</read_first>
<behavior>
- Test 1: `--mode local-only` deletes the configured local DB/store paths or reports what would be deleted in dry-run mode.
- Test 2: `--mode server-leave-forget --dry-run` prints the exact rooms it would leave/forget and does not mutate local files.
- Test 3: when server cleanup is not executed, the command prints the manual Matrix-client steps required by D-10.
</behavior>
<action>
Create `adapter/matrix/reset.py` as a CLI entrypoint runnable via `uv run python -m adapter.matrix.reset`. Use `argparse` and keep the tool explicitly dev-only in its help text and logs.
Implement the following modes from research and locked decisions:
- `local-only` (default destructive mode for local QA): remove `MATRIX_DB_PATH` and `MATRIX_STORE_PATH` if they exist; if not, report that they were already absent
- `server-leave-forget`: for the bot account only, log in using the same Matrix env vars as `adapter/matrix/bot.py`, inspect joined rooms, and call `room_leave()` + `room_forget()` for each joined room; support `--dry-run` so the operator can inspect the target set before mutation
- `--dry-run` must work with both modes and print a structured summary instead of mutating files or Matrix membership
Always print a post-run summary that distinguishes:
- what local files/directories were deleted or would be deleted
- what server-side leave/forget actions were executed or would be executed
- the manual Matrix client steps still required for a true clean-room QA rerun (leave/archive old rooms or Space in Element, accept fresh invites, etc.) when those actions are outside this phase
Write `tests/adapter/matrix/test_reset.py` to cover local-only deletion, dry-run output, and server-leave-forget dry-run behavior with fake clients/temporary directories. Follow the repos existing lightweight async test style.
</action>
<verify>
<automated>cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_reset.py -q</automated>
</verify>
<acceptance_criteria>
- `adapter/matrix/reset.py` supports `local-only`, `server-leave-forget`, and `--dry-run`.
- `local-only` reset targets both `lambda_matrix.db` and `matrix_store` via env-aware paths per D-09.
- The tool never claims to globally delete Matrix rooms; it uses leave/forget semantics or prints manual cleanup instructions per D-10.
- `tests/adapter/matrix/test_reset.py` proves dry-run mode is non-destructive.
</acceptance_criteria>
<done>The repository contains a repeatable dev reset tool that replaces the undocumented shell ritual and names server-side cleanup honestly.</done>
</task>
<task type="auto">
<name>Task 2: Replace the README reset ritual with the new restart and reset workflow</name>
<files>README.md</files>
<read_first>README.md, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md</read_first>
<action>
Update `README.md` so Matrix development instructions reflect Phase 01.1 instead of the old destructive reset ritual. Replace the current manual QA block that tells developers to `rm -f lambda_matrix.db` with a short, explicit split:
- normal restart: `PYTHONPATH=. uv run python -m adapter.matrix.bot` now performs reconciliation automatically
- explicit clean-room reset: `PYTHONPATH=. uv run python -m adapter.matrix.reset --mode local-only`
- optional server cleanup preview: `PYTHONPATH=. uv run python -m adapter.matrix.reset --mode server-leave-forget --dry-run`
State clearly that normal restart is the default path per D-05, and that full server-side cleanup may still require manual steps in the Matrix client. Keep the README concise; do not add production guidance or Phase 2 SDK content.
</action>
<verify>
<automated>cd /Users/a/MAI/sem2/lambda/surfaces-bot && python -m adapter.matrix.reset --help >/tmp/matrix-reset-help.txt && rg -n "adapter.matrix.reset|local-only|server-leave-forget|reconciliation" README.md /tmp/matrix-reset-help.txt</automated>
</verify>
<acceptance_criteria>
- `README.md` no longer recommends raw `rm -f lambda_matrix.db` as the default Matrix restart workflow.
- `README.md` documents the normal restart path and the explicit reset path separately.
- The documented reset commands match the CLI implemented in `adapter/matrix/reset.py`.
</acceptance_criteria>
<done>Developers can follow a repeatable README workflow for ordinary restart and clean-room QA reset without relying on tribal knowledge.</done>
</task>
</tasks>
<verification>
Run `pytest tests/adapter/matrix/test_reset.py -q` and `python -m adapter.matrix.reset --help`, then confirm the README commands and help text stay aligned.
</verification>
<success_criteria>
- Dev reset is an explicit tool, not a remembered shell sequence.
- Local-only reset is automated and documented.
- Server cleanup semantics are honest, dry-runnable, and accompanied by manual Matrix-client guidance where needed.
</success_criteria>
<output>
After completion, create `.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-03-SUMMARY.md`
</output>

View file

@ -1,121 +0,0 @@
# Phase 01.1: Matrix restart reconciliation and dev reset workflow - Context
**Gathered:** 2026-04-03
**Status:** Ready for planning
<domain>
## Phase Boundary
Сделать Matrix-адаптер пригодным для повторяемой локальной разработки и ручного QA без ручного “ритуала” удаления БД, выхода из всех комнат и пересоздания пользователя.
В scope этой фазы:
- безопасный restart flow для Matrix-бота после потери локального state
- reconciliation локального store с уже существующими Matrix rooms / Space
- отдельный dev reset workflow для controlled clean-room QA
- диагностируемое поведение при несогласованности local state и server-side Matrix state
Вне scope:
- реальный Lambda SDK
- новые пользовательские Matrix features
- E2EE
- production-grade multi-user migration framework
</domain>
<decisions>
## Implementation Decisions
### Matrix state lifecycle
- **D-01:** Локальный SQLite store больше не должен считаться единственной точкой истины для Matrix runtime в dev workflow.
- **D-02:** При старте бот должен пытаться восстановить минимально необходимое локальное состояние из уже существующих Matrix rooms / Space, а не требовать full reset.
- **D-03:** Reconciliation должен восстанавливать как минимум `matrix_user:*`, `matrix_room:*` и missing `chat:{user}:{chat_id}` записи, если серверные комнаты уже существуют.
- **D-04:** Reconciliation не должен создавать новые Space/rooms, если задача — именно восстановление локального state после рестарта.
### Dev restart behavior
- **D-05:** Обычный restart бота должен быть основным путём для разработки; удаление `lambda_matrix.db` и `matrix_store` не должно быть обязательным для проверки workflow.
- **D-06:** Если local state неполон, бот должен либо восстановить его, либо логировать понятную причину, а не падать на командах вроде `!rename`.
- **D-07:** Несогласованность между `room_meta` и `ChatManager` должна обнаруживаться и устраняться автоматически на startup или при первом обращении.
### Dev reset workflow
- **D-08:** Нужен отдельный dev-only reset tool/script для controlled QA, вместо ручного набора shell-команд.
- **D-09:** Reset workflow должен как минимум поддерживать `local-only` reset: удаление `lambda_matrix.db` и `matrix_store` с понятной инструкцией, что делать с server-side Matrix rooms.
- **D-10:** Если full server-side cleanup не автоматизируется в этой фазе, tool должен явно печатать, какие ручные шаги обязательны в Matrix client.
### The agent's Discretion
- Точное место вызова reconciliation в startup flow
- Внутренняя структура helper-модуля (`bootstrap.py`, `reconcile.py` или аналог)
- Формат dev reset script и уровень автоматизации server-side cleanup
- Детали debug-logging и dry-run режима, если они помогают без раздувания scope
</decisions>
<specifics>
## Specific Ideas
- Главный критерий: после обычного restart бот не должен ломаться только потому, что local DB была сброшена или частично потеряна.
- Reset workflow должен быть явным и repeatable, а не завязанным на память разработчика.
- Нужно различать две ситуации:
- broken because code is wrong
- broken because local dev state was deliberately reset and requires reconciliation
</specifics>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
### Matrix phase artifacts
- `.planning/phases/01-matrix-qa-polish/01-CONTEXT.md` — locked Matrix decisions from Phase 1
- `.planning/phases/01-matrix-qa-polish/01-VERIFICATION.md` — what Phase 1 already validated and what manual QA still expects
- `.planning/phases/01-matrix-qa-polish/01-HUMAN-UAT.md` — remaining real-client Matrix checks
### Current Matrix runtime
- `adapter/matrix/bot.py` — startup flow, sync loop, runtime wiring, DB/store env vars
- `adapter/matrix/store.py` — persisted Matrix metadata and pending confirmation keys
- `adapter/matrix/room_router.py` — room_id to chat_id resolution and current unregistered fallback
- `adapter/matrix/handlers/auth.py` — invite bootstrap that creates Space and first chat room
- `core/chat.py``ChatManager` persistence contract that currently breaks when local state is missing
### Supporting docs
- `docs/matrix-prototype.md` — intended Matrix UX and architecture direction
- `README.md` — current run instructions and existing manual QA/reset habits
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- `adapter/matrix/store.py` already persists room/user metadata and is the obvious place to anchor reconciliation inputs.
- `adapter/matrix/room_router.py` already detects unknown rooms via `unregistered:{room_id}` fallback; this is a useful reconciliation trigger point.
- `core/chat.py` already has `get_or_create`, `rename`, `archive`, `list_active`; missing chat records can be rebuilt through this API instead of inventing a parallel format.
### Established Patterns
- Matrix runtime uses `SQLiteStore` for adapter-local metadata and `matrix-nio` room callbacks for transport events.
- Phase 1 already moved Matrix to Space+rooms and command-only confirmations, so this phase must preserve that model rather than reverting to DM-first simplifications.
### Integration Points
- Startup path in `adapter/matrix/bot.py:main()` is the natural place to run reconciliation before `sync_forever`.
- Invite/bootstrap path in `adapter/matrix/handlers/auth.py` is the existing source of truth for what metadata a healthy first room should have.
- `ChatManager` records and `matrix_room:*` metadata must stay consistent enough that commands like `!rename`, `!archive`, and `!chats` work after restart.
</code_context>
<deferred>
## Deferred Ideas
- Full production-grade migration of historical Matrix state across schema versions
- Automatic server-side deletion/leave for all Matrix rooms and Space during reset, if it requires broader admin semantics
- Any Phase 2 SDK integration work
</deferred>
---
*Phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow*
*Context gathered: 2026-04-03*

View file

@ -1,350 +0,0 @@
# Phase 01.1: Matrix restart reconciliation and dev reset workflow - Research
**Researched:** 2026-04-03
**Domain:** Matrix adapter restart reconciliation, local state recovery, dev reset workflow
**Confidence:** HIGH
<user_constraints>
## User Constraints (from CONTEXT.md)
### Locked Decisions
- **D-01:** Локальный SQLite store больше не должен считаться единственной точкой истины для Matrix runtime в dev workflow.
- **D-02:** При старте бот должен пытаться восстановить минимально необходимое локальное состояние из уже существующих Matrix rooms / Space, а не требовать full reset.
- **D-03:** Reconciliation должен восстанавливать как минимум `matrix_user:*`, `matrix_room:*` и missing `chat:{user}:{chat_id}` записи, если серверные комнаты уже существуют.
- **D-04:** Reconciliation не должен создавать новые Space/rooms, если задача — именно восстановление локального state после рестарта.
- **D-05:** Обычный restart бота должен быть основным путём для разработки; удаление `lambda_matrix.db` и `matrix_store` не должно быть обязательным для проверки workflow.
- **D-06:** Если local state неполон, бот должен либо восстановить его, либо логировать понятную причину, а не падать на командах вроде `!rename`.
- **D-07:** Несогласованность между `room_meta` и `ChatManager` должна обнаруживаться и устраняться автоматически на startup или при первом обращении.
- **D-08:** Нужен отдельный dev-only reset tool/script для controlled QA, вместо ручного набора shell-команд.
- **D-09:** Reset workflow должен как минимум поддерживать `local-only` reset: удаление `lambda_matrix.db` и `matrix_store` с понятной инструкцией, что делать с server-side Matrix rooms.
- **D-10:** Если full server-side cleanup не автоматизируется в этой фазе, tool должен явно печатать, какие ручные шаги обязательны в Matrix client.
### Claude's Discretion
- Точное место вызова reconciliation в startup flow
- Внутренняя структура helper-модуля (`bootstrap.py`, `reconcile.py` или аналог)
- Формат dev reset script и уровень автоматизации server-side cleanup
- Детали debug-logging и dry-run режима, если они помогают без раздувания scope
### Deferred Ideas (OUT OF SCOPE)
- Full production-grade migration of historical Matrix state across schema versions
- Automatic server-side deletion/leave for all Matrix rooms and Space during reset, if it requires broader admin semantics
- Any Phase 2 SDK integration work
</user_constraints>
## Summary
Phase 01.1 should be planned as a bootstrap/recovery phase, not as another chat-feature phase. The current Matrix adapter has no startup reconciliation path: `adapter/matrix/bot.py` logs in and goes directly to `sync_forever()`, while routing and command handlers assume `matrix_room:*`, `matrix_user:*`, and `chat:*` keys already exist. That means local DB loss currently produces logical corruption, not just missing cache.
The safe standard approach is: perform a first sync that hydrates joined-room state, inspect the bot's current joined rooms and room state from the homeserver, rebuild the minimal local metadata needed for command routing, and only then enter the long-running sync loop. Reconciliation should be non-destructive and idempotent: if local keys already exist and match server state, leave them alone; if they are missing, recreate them; if they conflict, prefer the server room topology for Matrix-specific metadata and recreate missing `ChatManager` rows from that.
For reset, separate two workflows explicitly. `local-only` reset is the default and should be automated. Optional server-side cleanup may leave/forget rooms for the bot account, but it cannot promise global deletion of Matrix rooms for all members; if that is not automated, the tool must print the exact manual steps for the Matrix client.
**Primary recommendation:** Add a startup `reconcile_matrix_state()` step before `sync_forever()`, and ship a dev-only reset CLI with `local-only`, `server-leave-forget`, and `dry-run` modes.
## Project Constraints (from CLAUDE.md)
- Do not treat missing Lambda SDK as a blocker.
- Keep all platform calls behind `platform/interface.py`.
- Current runtime implementation is `platform/mock.py`; recommendations must work with that.
- Prefer architecture changes in adapters and core without coupling to future SDK internals.
- Use pytest-based verification.
- Do not recommend committing `.env`.
- Respect dependency order: `core/` first, then `platform/`, then adapters.
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| Python | 3.14.3 installed | Runtime for bot and scripts | Already available locally; codebase targets `>=3.11`. |
| `matrix-nio` | 0.25.2, published 2024-10-04 | Matrix client, sync, room membership/state APIs | Already installed; exposes the exact bootstrap/reset APIs this phase needs. |
| `SQLiteStore` (repo) | local | Adapter/core KV persistence | Existing persistence contract for `matrix_user:*`, `matrix_room:*`, and `chat:*`. |
| Matrix Client-Server API | spec latest | Authoritative room membership/state semantics | Needed to reason about restart recovery and leave/forget behavior correctly. |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `pytest` | 9.0.2, published 2025-12-06 | Test runner | For targeted adapter/bootstrap regression tests. |
| `pytest-asyncio` | 1.3.0, published 2025-11-10 | Async test execution | For async reconciliation/reset flows. |
| `structlog` | 25.5.0, published 2025-10-27 | Diagnostics | For reconciliation summaries and conflict logging. |
| `python-dotenv` | 1.2.2, published 2026-03-01 | Env loading | Already used by `adapter/matrix/bot.py` for Matrix config. |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Startup reconciliation from joined rooms + state | Force developers to wipe local DB and recreate rooms | Simpler code, but directly violates D-01, D-02, D-05. |
| Non-destructive local rebuild | Full auto-recreate of Space/rooms on missing local state | Easier to implement, but causes duplicate Matrix rooms and breaks D-04. |
| Dev reset script | README-only manual ritual | Lower code cost, but not repeatable and fails D-08..D-10. |
**Installation:**
```bash
uv sync
```
**Version verification:** Verified via installed environment and PyPI metadata on 2026-04-03:
- `matrix-nio` `0.25.2` - 2024-10-04
- `pytest` `9.0.2` - 2025-12-06
- `pytest-asyncio` `1.3.0` - 2025-11-10
- `structlog` `25.5.0` - 2025-10-27
- `python-dotenv` `1.2.2` - 2026-03-01
## Architecture Patterns
### Recommended Project Structure
```text
adapter/matrix/
├── bot.py # startup flow calls reconciliation before sync loop
├── reconcile.py # bootstrap/rebuild logic from Matrix server state
├── reset.py # dev-only reset CLI / entrypoint
├── room_router.py # room_id -> chat_id with recovery hook
├── store.py # metadata helpers, prefix scans, derived counters
└── handlers/
├── auth.py # first-time provisioning only
└── chat.py # uses recovered state, no provisioning fallback
```
### Pattern 1: Two-Phase Startup Bootstrap
**What:** Split startup into `login -> initial sync/full_state -> reconcile -> steady-state sync_forever`.
**When to use:** Always for Matrix bot startup when local DB may be missing or stale.
**Example:**
```python
# Source: matrix-nio AsyncClient docs/source + repo startup flow
client = AsyncClient(...)
runtime = build_runtime(store=SQLiteStore(db_path), client=client)
await login_or_restore_session(client)
await client.sync(timeout=0, full_state=True)
report = await reconcile_matrix_state(client, runtime.store, runtime.chat_mgr)
logger.info("matrix_reconcile_complete", **report)
await client.sync_forever(timeout=30000)
```
### Pattern 2: Rebuild Local Metadata From Joined Rooms
**What:** Enumerate joined rooms, inspect local hydrated room objects or room state, and recreate missing `matrix_room:*`, `matrix_user:*`, and `chat:*` records.
**When to use:** On startup and optionally on `unregistered:{room_id}` fallback at runtime.
**Example:**
```python
# Source: matrix-nio AsyncClient.joined_rooms/room_get_state + repo store contracts
joined = await client.joined_rooms()
for room_id in joined.rooms:
state = await client.room_get_state(room_id)
# detect: space room vs chat room, owner user, child relationship, display name
# rebuild matrix_room:{room_id}
# rebuild chat:{matrix_user_id}:{chat_id} if absent
```
### Pattern 3: Non-Destructive Reconciliation Report
**What:** Return a structured report: scanned rooms, restored rooms, restored chats, conflicts, skipped rooms.
**When to use:** Every reconciliation run, including dry-run.
**Example:**
```python
{
"joined_rooms": 4,
"restored_user_meta": 1,
"restored_room_meta": 3,
"restored_chat_rows": 3,
"conflicts": [],
"skipped_rooms": ["!dm:example.org"],
}
```
### Pattern 4: Reset Modes Are Explicit
**What:** Separate `local-only`, `server-leave-forget`, and `dry-run`.
**When to use:** For dev/QA only. Never mix destructive server cleanup into normal startup.
**Example:**
```bash
uv run python -m adapter.matrix.reset --mode local-only
uv run python -m adapter.matrix.reset --mode server-leave-forget --dry-run
```
### Anti-Patterns to Avoid
- **Provisioning during reconciliation:** Do not create a new Space or new rooms while trying to recover missing local state.
- **Treating `next_chat_index` as primary truth:** Derive it from recovered `chat_id` values after scan; do not trust a missing or stale counter.
- **Routing unknown rooms straight through:** `unregistered:{room_id}` is a signal to reconcile, not a stable runtime identity.
- **Destructive reset by default:** Startup must never leave/forget rooms automatically.
- **Blindly trusting local `surface_ref`:** If `chat:*` and `matrix_room:*` disagree, rebuild from Matrix room metadata and repair the chat row.
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Room discovery | Custom DB-only reconstruction heuristics | `AsyncClient.joined_rooms()` plus synced room state | Server already knows which rooms the bot joined. |
| Space membership detection | Naming-convention parsing of room names | Matrix state: `m.room.create.type`, `m.space.child`, `m.space.parent` | Names are mutable and non-authoritative. |
| Room cleanup semantics | Custom “delete room” assumptions | `room_leave()` + `room_forget()` semantics | Client API supports leave/forget, not guaranteed global deletion. |
| Chat ID recovery | Hardcoded `C1/C2/...` reset | Rebuild from existing `matrix_room:*`/server state and compute next index | Prevents collisions after partial DB loss. |
| Diagnostic output | Ad hoc `print()` strings | Structured reconciliation/reset report via `structlog` | Easier manual QA and failure triage. |
**Key insight:** The homeserver already persists the bots room graph. This phase should rehydrate local cache from that graph, not attempt to replace it with a second custom truth model.
## Common Pitfalls
### Pitfall 1: Joining the sync loop before reconciliation
**What goes wrong:** Commands arrive while local metadata is still missing, producing `unregistered:{room_id}` routing or `ChatManager` misses.
**Why it happens:** Current `main()` enters `sync_forever()` immediately after login.
**How to avoid:** Perform initial sync and reconciliation first.
**Warning signs:** `unregistered_room` logs immediately after restart; `ValueError("Chat ... not found")` on `!rename` or `!archive`.
### Pitfall 2: Recovering room metadata but not chat rows
**What goes wrong:** Room routing works, but `ChatManager.rename/archive/list_active` still fails because `chat:{user}:{chat_id}` rows were not recreated.
**Why it happens:** Matrix adapter metadata and core chat metadata live in different keyspaces.
**How to avoid:** Reconciliation must repair both stores in one pass.
**Warning signs:** `matrix_room:*` exists but `chat:*` keys do not.
### Pitfall 3: Trusting stale `next_chat_index`
**What goes wrong:** New chats reuse existing `C` IDs after local recovery.
**Why it happens:** `next_chat_id()` increments a persisted counter that may be absent or behind.
**How to avoid:** After scan, set `next_chat_index = max(recovered_chat_numbers) + 1`.
**Warning signs:** New room gets `C1` even though Space already contains prior rooms.
### Pitfall 4: Assuming room names identify chat rooms safely
**What goes wrong:** Reconciliation binds the wrong room because a user renamed a room or Space.
**Why it happens:** Names are user-facing labels, not stable identifiers.
**How to avoid:** Prefer room state and existing `chat_id` metadata; use display names only as fallback.
**Warning signs:** Duplicate “Чат 1” names or renamed rooms break matching.
### Pitfall 5: Over-promising full cleanup
**What goes wrong:** Reset script claims a “clean slate” but rooms still exist in Element or for other members.
**Why it happens:** Leaving/forgetting affects the bot accounts membership/history, not necessarily global room deletion.
**How to avoid:** Name the mode accurately and print the manual client steps when needed.
**Warning signs:** QA reruns still show old rooms in the users client.
## Code Examples
Verified patterns from official sources and the installed library surface:
### Initial Sync Before Reconcile
```python
# Source: matrix-nio AsyncClient.sync/sync_forever
await client.sync(timeout=0, full_state=True)
report = await reconcile_matrix_state(client, store, chat_mgr)
await client.sync_forever(timeout=30000)
```
### Space Child Link Creation
```python
# Source: Matrix client-server API state event + current auth/new-chat flow
await client.room_put_state(
room_id=space_id,
event_type="m.space.child",
content={"via": [homeserver]},
state_key=chat_room_id,
)
```
### Bot-Side Leave/Forget Cleanup
```python
# Source: matrix-nio AsyncClient.room_leave / room_forget
for room_id in room_ids:
await client.room_leave(room_id)
await client.room_forget(room_id)
```
### Router Recovery Trigger
```python
# Source: repo room_router contract
chat_id = await resolve_chat_id(store, room_id, matrix_user_id)
if chat_id.startswith("unregistered:"):
await reconcile_single_room(client, store, chat_mgr, room_id, matrix_user_id)
```
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Local adapter DB treated as the operational truth | Rebuildable local cache from server room graph | Mature Matrix client practice; supported by current Matrix CS API and `matrix-nio` | Restart no longer requires destructive local reset. |
| Manual room cleanup in client after experiments | Scripted leave/forget plus explicit manual instructions | Current `matrix-nio` 0.25.x API surface | QA becomes repeatable and auditable. |
| Immediate steady-state sync after login | Initial sync/full-state bootstrap before long polling | Supported by current `AsyncClient.sync()` / `sync_forever()` behavior | Reconciliation can run before any user traffic is handled. |
**Deprecated/outdated:**
- `README.md` Matrix manual QA instruction `rm -f lambda_matrix.db` as the primary restart flow: outdated for this phase.
- DM-first Matrix recovery assumptions in `docs/matrix-prototype.md`: outdated relative to Phase 1 Space+rooms decisions.
## Open Questions
1. **How exactly should reconciliation identify the owning Matrix user for a recovered room when local `matrix_room:*` is gone?**
- What we know: the bot can enumerate joined rooms and fetch room state; current healthy metadata stores `matrix_user_id` and `space_id`.
- What's unclear: whether Phase 1-created rooms also expose enough server-side structure to recover owner deterministically without existing local metadata in every case.
- Recommendation: Plan a proof test against a real homeserver/client. If room-state-only ownership is ambiguous, persist a tiny bot-authored marker state event going forward, but keep that addition narrowly scoped.
2. **Should runtime recovery happen only on startup, or also lazily on first unknown room access?**
- What we know: startup repair satisfies D-02/D-07 for common restart loss; `room_router` already surfaces unknown rooms cleanly.
- What's unclear: whether partial DB corruption during runtime is common enough to justify lazy single-room repair in Phase 01.1.
- Recommendation: Make startup reconciliation required, lazy room repair optional if it stays small.
3. **How much of server cleanup should Phase 01.1 automate?**
- What we know: `room_leave()` and `room_forget()` are available; global room deletion is not what the client API guarantees.
- What's unclear: whether automating bot-side leave/forget is worth the extra risk for this urgent phase.
- Recommendation: Keep `local-only` mandatory. Make server cleanup optional and clearly labeled experimental/dev-only if included.
## Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Python | Runtime, scripts, tests | ✓ | 3.14.3 | — |
| `uv` | Standard install/run workflow | ✓ | 0.9.30 | `python -m` + existing venv |
| `pytest` | Automated verification | ✓ | 9.0.2 | `uv run pytest` |
| Matrix homeserver credentials | Real restart/reset manual QA | ✗ in current shell | — | Manual-only after `.env` is configured |
| Matrix bot local DB/store paths | Reset workflow | ✓ | defaults in code | Can override with `MATRIX_DB_PATH` / `MATRIX_STORE_PATH` |
**Missing dependencies with no fallback:**
- Live Matrix credentials for real manual reconciliation/reset QA.
**Missing dependencies with fallback:**
- None for repository-only implementation and tests.
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | `pytest 9.0.2` + `pytest-asyncio 1.3.0` |
| Config file | `pyproject.toml` |
| Quick run command | `pytest tests/adapter/matrix -v` |
| Full suite command | `pytest tests/ -v` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| PH01.1-BOOT | Startup rebuilds missing `matrix_user:*`, `matrix_room:*`, and `chat:*` from existing rooms without creating new rooms | unit/integration | `pytest tests/adapter/matrix/test_reconcile.py -v` | ❌ Wave 0 |
| PH01.1-ROUTER | Unknown room fallback can trigger repair or yields diagnosable warning without crashing commands | unit | `pytest tests/adapter/matrix/test_room_router_reconcile.py -v` | ❌ Wave 0 |
| PH01.1-COUNTER | Reconciliation resets `next_chat_index` to recovered max + 1 | unit | `pytest tests/adapter/matrix/test_reconcile.py -k next_chat_index -v` | ❌ Wave 0 |
| PH01.1-RESET | Dev reset `local-only` removes local DB/store paths and prints next steps | unit/smoke | `pytest tests/adapter/matrix/test_reset.py -v` | ❌ Wave 0 |
| PH01.1-NONDESTRUCTIVE | Reconciliation never calls room creation APIs | unit | `pytest tests/adapter/matrix/test_reconcile.py -k no_create -v` | ❌ Wave 0 |
### Sampling Rate
- **Per task commit:** `pytest tests/adapter/matrix -v`
- **Per wave merge:** `pytest tests/ -v`
- **Phase gate:** Full suite green before `/gsd:verify-work`
### Wave 0 Gaps
- [ ] `tests/adapter/matrix/test_reconcile.py` - startup reconciliation scenarios
- [ ] `tests/adapter/matrix/test_reset.py` - CLI/script reset modes and output
- [ ] `tests/adapter/matrix/test_room_router_reconcile.py` - lazy recovery or warning behavior
- [ ] Integration fixture for a fake `AsyncClient` response surface matching `joined_rooms()` and `room_get_state()`
## Sources
### Primary (HIGH confidence)
- Matrix Client-Server API - room state, leave, forget, joined rooms, Spaces semantics: https://spec.matrix.org/latest/client-server-api/index.html
- `matrix-nio` installed 0.25.2 API surface verified locally on 2026-04-03 via `AsyncClient.sync`, `sync_forever`, `joined_rooms`, `room_get_state`, `room_leave`, `room_forget`
- Repo code: [adapter/matrix/bot.py](/Users/a/MAI/sem2/lambda/surfaces-bot/adapter/matrix/bot.py), [adapter/matrix/store.py](/Users/a/MAI/sem2/lambda/surfaces-bot/adapter/matrix/store.py), [adapter/matrix/room_router.py](/Users/a/MAI/sem2/lambda/surfaces-bot/adapter/matrix/room_router.py), [adapter/matrix/handlers/auth.py](/Users/a/MAI/sem2/lambda/surfaces-bot/adapter/matrix/handlers/auth.py), [core/chat.py](/Users/a/MAI/sem2/lambda/surfaces-bot/core/chat.py)
- PyPI release metadata: https://pypi.org/project/matrix-nio/ , https://pypi.org/project/pytest/ , https://pypi.org/project/pytest-asyncio/ , https://pypi.org/project/structlog/ , https://pypi.org/project/python-dotenv/
### Secondary (MEDIUM confidence)
- [README.md](/Users/a/MAI/sem2/lambda/surfaces-bot/README.md) - current manual reset habit and run commands
- [docs/matrix-prototype.md](/Users/a/MAI/sem2/lambda/surfaces-bot/docs/matrix-prototype.md) - original Matrix UX intent, noting outdated DM/reaction sections
- [01-CONTEXT.md](/Users/a/MAI/sem2/lambda/surfaces-bot/.planning/phases/01-matrix-qa-polish/01-CONTEXT.md) - locked Phase 1 Matrix decisions
- [01-VERIFICATION.md](/Users/a/MAI/sem2/lambda/surfaces-bot/.planning/phases/01-matrix-qa-polish/01-VERIFICATION.md) - what has already been verified and what still needs human Matrix QA
### Tertiary (LOW confidence)
- None
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH - verified against installed environment, PyPI metadata, and official Matrix spec
- Architecture: HIGH - directly grounded in current repo flow plus current `matrix-nio`/Matrix capabilities
- Pitfalls: HIGH - derived from concrete gaps in current startup/store/router code
**Research date:** 2026-04-03
**Valid until:** 2026-05-03

View file

@ -1,80 +0,0 @@
---
phase: 01.1
slug: matrix-restart-reconciliation-and-dev-reset-workflow
status: draft
nyquist_compliant: false
wave_0_complete: false
created: 2026-04-03
---
# Phase 01.1 — Validation Strategy
> Per-phase validation contract for feedback sampling during execution.
---
## Test Infrastructure
| Property | Value |
|----------|-------|
| **Framework** | `pytest 9.0.2` + `pytest-asyncio 1.3.0` |
| **Config file** | `pyproject.toml` |
| **Quick run command** | `pytest tests/adapter/matrix -v` |
| **Full suite command** | `pytest tests/ -v` |
| **Estimated runtime** | ~20 seconds |
---
## Sampling Rate
- **After every task commit:** Run `pytest tests/adapter/matrix -v`
- **After every plan wave:** Run `pytest tests/ -v`
- **Before `$gsd-verify-work`:** Full suite must be green
- **Max feedback latency:** 20 seconds
---
## Per-Task Verification Map
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 01.1-01-01 | 01 | 1 | PH01.1-BOOT | unit/integration | `pytest tests/adapter/matrix/test_reconcile.py -v` | ❌ W0 | ⬜ pending |
| 01.1-01-01 | 01 | 1 | PH01.1-COUNTER | unit | `pytest tests/adapter/matrix/test_reconcile.py -k next_chat_index -v` | ❌ W0 | ⬜ pending |
| 01.1-01-01 | 01 | 1 | PH01.1-NONDESTRUCTIVE | unit | `pytest tests/adapter/matrix/test_reconcile.py -k no_create -v` | ❌ W0 | ⬜ pending |
| 01.1-02-01 | 02 | 2 | PH01.1-BOOT | unit | `pytest tests/adapter/matrix/test_dispatcher.py -k startup -v` | ✅ | ⬜ pending |
| 01.1-02-02 | 02 | 2 | PH01.1-ROUTER | unit | `pytest tests/adapter/matrix/test_dispatcher.py -k reconcile -v` | ✅ | ⬜ pending |
| 01.1-03-01 | 03 | 1 | PH01.1-RESET | unit/smoke | `pytest tests/adapter/matrix/test_reset.py -v` | ❌ W0 | ⬜ pending |
| 01.1-03-02 | 03 | 1 | PH01.1-RESET | smoke | `python -m adapter.matrix.reset --help` | ❌ W0 | ⬜ pending |
*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*
---
## Wave 0 Requirements
- [ ] `tests/adapter/matrix/test_reconcile.py` — startup reconciliation scenarios, `next_chat_index`, and no-provisioning assertions
- [ ] `tests/adapter/matrix/test_reset.py` — CLI reset modes, dry-run behavior, and operator guidance output
- [ ] `tests/adapter/matrix/test_dispatcher.py` — startup bootstrap order and targeted unknown-room recovery coverage
- [ ] Fake `AsyncClient` fixture surface for joined rooms, room state, leave, and forget behavior
---
## Manual-Only Verifications
| Behavior | Requirement | Why Manual | Test Instructions |
|----------|-------------|------------|-------------------|
| Reconciled Space/chat rooms render correctly in a real Matrix client after restart | PH01.1-BOOT | Client UX and homeserver state cannot be fully trusted from fake nio fixtures | 1. Start the bot with existing Space/chat rooms. 2. Verify the bot does not create duplicate Space or chat rooms. 3. Send a command in a recovered room and confirm it routes normally. |
| Server-side cleanup leaves the account in a usable Element state after `server-leave-forget` | PH01.1-RESET | Element/archive behavior and homeserver retention are client/server integration concerns | 1. Run `python -m adapter.matrix.reset --mode server-leave-forget --dry-run`. 2. Run without `--dry-run` on a test account. 3. Confirm joined rooms disappear for the bot and fresh invites can be accepted cleanly. |
---
## Validation Sign-Off
- [ ] All tasks have `<automated>` verify or Wave 0 dependencies
- [ ] Sampling continuity: no 3 consecutive tasks without automated verify
- [ ] Wave 0 covers all MISSING references
- [ ] No watch-mode flags
- [ ] Feedback latency < 20s
- [ ] `nyquist_compliant: true` set in frontmatter
**Approval:** pending

View file

@ -1,72 +0,0 @@
---
phase: 02-prototype
task: 4
total_tasks: 4
status: paused
last_updated: 2026-04-07T23:54:30.473Z
---
<current_state>
The Matrix direct-agent prototype is implemented and manually proven working on branch `feat/matrix-direct-agent-prototype`. The current code path can log into Matrix, accept invites, provision the first Space/chat tree for a fresh user, and send live text messages to a patched local `platform-agent` over WebSocket. The immediate remaining engineering gap is not feature delivery but resilience: backend/provider failures can still bubble up as `PlatformError` and crash the Matrix bot process.
</current_state>
<completed_work>
- Task 1: Added `sdk/agent_session.py` and transport tests for direct WebSocket messaging with collision-safe `thread_key` generation.
- Task 2: Added `sdk/prototype_state.py` and tests for stable local user mapping, settings defaults, and mutation-safe settings copies.
- Task 3: Added `sdk/real.py` as the `PlatformClient` implementation, fixed import-time dependency leakage, and aligned thread-key tests to the actual dispatcher contract.
- Task 4: Wired Matrix runtime selection through `MATRIX_PLATFORM_BACKEND=real`, documented usage in `README.md`, and added dispatcher coverage for real backend selection.
- Fixed repeat Matrix invites so the bot now `join()`s before the existing-user early return path.
- Added Russian runbook doc `docs/matrix-direct-agent-prototype-ru.md` and pushed the branch.
- Manually validated live bring-up using a local patched `external/platform-agent` on port 8000 plus the Matrix homeserver `https://matrix.lambda.coredump.ru`.
</completed_work>
<remaining_work>
- Add graceful degradation for backend/provider failures so `PlatformError` does not crash the Matrix process.
- Decide whether to upstream or separately push the required `external/platform-agent` patch (`1dca2c1`) that enables WebSocket `thread_id`.
- Optionally clean up repeat-invite UX if Space/chat reprovisioning should ever happen for already-known users.
- Optionally prepare a PR from `feat/matrix-direct-agent-prototype`.
</remaining_work>
<decisions_made>
- Keep the prototype in this repo, not a separate Matrix-only repo.
- Keep Matrix adapter logic intact and absorb backend differences inside `sdk/`.
- Split the real backend into `AgentSessionClient` and `PrototypeStateStore` behind `RealPlatformClient`.
- Patch only `platform-agent` for per-thread memory instead of changing both `agent` and `agent_api`.
- Use a serialized collision-safe thread key because Matrix user IDs contain colons.
- For repeat invites, join the room but do not recreate Space/chat state if the user is already provisioned locally.
</decisions_made>
<blockers>
- Technical: provider/backend errors still crash the Matrix bot instead of returning a user-facing failure reply.
- External: the required `platform-agent` patch exists only in the local clone under `external/` and is not yet upstream.
- Operational: credentials used during manual bring-up were exposed in-session and should be rotated.
</blockers>
<context>
The important mental model is stable. `platform/master` is still not the backend for surfaces, so the working prototype goes directly to `platform-agent` over `/agent_ws/`. The live setup that worked was:
- `surfaces-bot` branch: `feat/matrix-direct-agent-prototype`
- Matrix bot env: `MATRIX_PLATFORM_BACKEND=real`, `AGENT_WS_URL=ws://127.0.0.1:8000/agent_ws/`
- patched local `external/platform-agent` with `thread_id` support
- provider configured through OpenRouter using model `qwen/qwen3.5-122b-a10b`
Important files:
- `sdk/agent_session.py`
- `sdk/prototype_state.py`
- `sdk/real.py`
- `adapter/matrix/bot.py`
- `adapter/matrix/handlers/auth.py`
- `docs/matrix-direct-agent-prototype-ru.md`
Important local-only dependency:
- `external/platform-agent` commit `1dca2c1` (`feat: support websocket thread ids`)
Likely running background process at pause time:
- local `platform-agent` server on port 8000, PID 13499
</context>
<next_action>
Start with the failure path: catch `PlatformError` around Matrix message handling so a bad provider response becomes a normal reply like “backend unavailable, try again later” instead of killing the process. After that, either upstream `external/platform-agent` commit `1dca2c1` or document it as an explicit prerequisite in the platform repo.
</next_action>