---
phase: 01.1-matrix-restart-reconciliation-and-dev-reset-workflow
plan: 02
type: execute
wave: 2
depends_on: ["01.1-01"]
files_modified:
  - adapter/matrix/bot.py
  - tests/adapter/matrix/test_dispatcher.py
autonomous: true
requirements: []
must_haves:
  truths:
    - "The Matrix bot performs an initial sync and reconciliation before entering steady-state `sync_forever()`."
    - "If a room still arrives as `unregistered:{room_id}` after startup, the bot makes one targeted recovery attempt before dispatching or failing."
    - "When reconciliation cannot repair a room, the bot logs a clear diagnostic reason instead of crashing on downstream commands like `!rename`."
  artifacts:
    - path: "adapter/matrix/bot.py"
      provides: "Startup bootstrap flow with initial sync, reconciliation, and targeted runtime retry."
    - path: "tests/adapter/matrix/test_dispatcher.py"
      provides: "Matrix runtime coverage for pre-sync reconcile and on-message recovery behavior."
  key_links:
    - from: "adapter/matrix/bot.py"
      to: "adapter/matrix/reconcile.py"
      via: "startup bootstrap and single-room recovery calls"
      pattern: "reconcile_(matrix_state|single_room)"
    - from: "adapter/matrix/bot.py"
      to: "adapter/matrix/room_router.py"
      via: "unregistered room detection before dispatch"
      pattern: "unregistered:"
---

Wire the new reconciliation layer into the actual Matrix runtime.

Purpose: D-05 through D-07 require restart recovery to be the default developer path. The bot must bootstrap itself from existing Matrix rooms on startup and make one on-demand repair attempt before routing an unknown room through the dispatcher.

Output: `adapter/matrix/bot.py` performs initial sync + reconciliation before `sync_forever()`, and runtime tests prove the bot recovers or logs clearly instead of blindly dispatching broken state.
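As a rough illustration of the ordering the output above asks for, the sketch below records call order on a fake client and checks that reconciliation sits between the initial sync and `sync_forever()`. `FakeClient`, `bootstrap`, `run_demo`, and the report fields are hypothetical names introduced only for this example; the real `main()` and `reconcile_matrix_state(...)` take additional arguments (store, chat manager) that are omitted here.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable


class FakeClient:
    """Hypothetical stand-in for the Matrix client; it only records call order."""

    def __init__(self, calls: list[str]) -> None:
        self.calls = calls

    async def sync(self, timeout: int = 0, full_state: bool = False) -> None:
        self.calls.append(f"sync(timeout={timeout}, full_state={full_state})")

    async def sync_forever(self, timeout: int = 30000) -> None:
        # The fake returns immediately instead of long-polling.
        self.calls.append(f"sync_forever(timeout={timeout})")


async def bootstrap(
    client: FakeClient,
    reconcile: Callable[[FakeClient], Awaitable[dict]],
) -> dict:
    """Initial full-state sync, then reconciliation, then steady-state polling."""
    await client.sync(timeout=0, full_state=True)
    report = await reconcile(client)
    await client.sync_forever(timeout=30000)
    return report


def run_demo() -> tuple[list[str], dict]:
    calls: list[str] = []

    async def fake_reconcile(client: FakeClient) -> dict:
        calls.append("reconcile")
        # Assumed report shape, for illustration only.
        return {"recovered": 0, "skipped": 1}

    report = asyncio.run(bootstrap(FakeClient(calls), fake_reconcile))
    return calls, report
```

A test built this way fails as soon as `sync_forever` appears in the recorded list before `reconcile`, which is the property the plan wants the real tests to pin down.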
@/Users/a/.codex/get-shit-done/workflows/execute-plan.md
@/Users/a/.codex/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md
@.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-01-PLAN.md
@adapter/matrix/bot.py
@adapter/matrix/room_router.py
@adapter/matrix/reconcile.py
@tests/adapter/matrix/test_dispatcher.py

From `adapter/matrix/bot.py`:

```python
class MatrixBot:
    async def on_room_message(self, room: MatrixRoom, event: RoomMessageText) -> None

async def main() -> None
```

From `adapter/matrix/reconcile.py`:

```python
async def reconcile_matrix_state(client: Any, store: StateStore, chat_mgr: ChatManager) -> dict

async def reconcile_single_room(
    client: Any, store: StateStore, chat_mgr: ChatManager, room_id: str, matrix_user_id: str
) -> dict
```

From `adapter/matrix/room_router.py`:

```python
async def resolve_chat_id(store: StateStore, room_id: str, matrix_user_id: str) -> str
```

Task 1: Run initial sync and reconciliation before the long-poll loop

adapter/matrix/bot.py, tests/adapter/matrix/test_dispatcher.py

adapter/matrix/bot.py, adapter/matrix/reconcile.py, tests/adapter/matrix/test_dispatcher.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-RESEARCH.md

- Test 1: `main()` performs `client.sync(timeout=0, full_state=True)` before `sync_forever()`.
- Test 2: `main()` calls `reconcile_matrix_state(...)` after the initial sync and logs the returned report.
- Test 3: startup still reaches `sync_forever()` when reconciliation reports recoverable skips/conflicts instead of fatal failure.

Modify `adapter/matrix/bot.py` so normal startup follows the two-phase bootstrap recommended in research:

1. build client and runtime
2. authenticate
3. register callbacks
4. run `await client.sync(timeout=0, full_state=True)`
5. run `await reconcile_matrix_state(client, runtime.store, runtime.chat_mgr)`
6. log a structured `matrix_reconcile_complete` event with the report fields
7. enter `await client.sync_forever(timeout=30000)`

Do not move provisioning logic into startup. The startup step only rehydrates local state from server-side rooms per D-02 through D-04.

Update or add focused tests in `tests/adapter/matrix/test_dispatcher.py` using `monkeypatch`/fake-client patterns already used in the repo so the verify command proves the call order and logging-safe behavior. The test should fail if `sync_forever()` starts before reconciliation.

cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_dispatcher.py -q

- `adapter/matrix/bot.py` runs an initial full-state sync before steady-state polling.
- `adapter/matrix/bot.py` invokes `reconcile_matrix_state(...)` exactly once during startup.
- Startup logs a structured reconciliation summary instead of silently skipping the recovery step.
- `tests/adapter/matrix/test_dispatcher.py` asserts the bootstrap order explicitly.

Normal Matrix bot startup now includes a recovery pass before the event loop begins handling user traffic.

Task 2: Retry unknown-room routing once before dispatching broken state

adapter/matrix/bot.py, tests/adapter/matrix/test_dispatcher.py

adapter/matrix/bot.py, adapter/matrix/room_router.py, adapter/matrix/reconcile.py, tests/adapter/matrix/test_dispatcher.py, .planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-CONTEXT.md

- Test 1: `MatrixBot.on_room_message(...)` detects `unregistered:{room_id}`, runs `reconcile_single_room(...)`, then retries `resolve_chat_id(...)`.
- Test 2: if retry succeeds, the event is dispatched against the recovered logical chat id.
- Test 3: if retry still fails, the bot does not crash; it logs a clear warning and sends a user-facing diagnostic message to that room.
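The three behaviors listed for Task 2 can be sketched as a small resolve, recover once, resolve again helper. This is a minimal sketch under stated assumptions, not the repo's implementation: `resolve_with_recovery`, `demo`, the in-memory `registry`, and the report fields are hypothetical, and the `resolve`/`recover` callables stand in for `resolve_chat_id(...)` and `reconcile_single_room(...)` with their real arguments already bound.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable, Optional

UNREGISTERED_PREFIX = "unregistered:"


async def resolve_with_recovery(
    resolve: Callable[[], Awaitable[str]],
    recover: Callable[[], Awaitable[dict]],
) -> tuple[Optional[str], Optional[dict]]:
    """Return (chat_id, report); chat_id is None when one recovery attempt did not help."""
    chat_id = await resolve()
    if not chat_id.startswith(UNREGISTERED_PREFIX):
        return chat_id, None  # already registered, no recovery needed

    report = await recover()   # exactly one targeted recovery attempt
    chat_id = await resolve()  # immediate retry
    if chat_id.startswith(UNREGISTERED_PREFIX):
        # Caller should log a warning and send a diagnostic instead of dispatching.
        return None, report
    return chat_id, report


def demo(recoverable: bool) -> tuple[Optional[str], Optional[dict]]:
    registry: dict[str, str] = {}  # hypothetical stand-in for the state store
    room_id = "!room:example.org"

    async def resolve() -> str:
        return registry.get(room_id, f"{UNREGISTERED_PREFIX}{room_id}")

    async def recover() -> dict:
        if recoverable:
            registry[room_id] = "chat-1"
        return {"room_id": room_id, "recovered": recoverable}

    return asyncio.run(resolve_with_recovery(resolve, recover))
```

Returning `None` rather than a made-up chat id keeps the failure explicit, so the caller has to choose between dispatching and reporting; no fallback identity is invented along the way.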
Extend `MatrixBot.on_room_message(...)` so D-07 is satisfied even when startup could not repair a room yet. Keep `resolve_chat_id(...)` as the room-router source of truth, but treat `unregistered:{room_id}` as a recovery trigger rather than a stable runtime identity:

- first call `resolve_chat_id(...)`
- if the result starts with `unregistered:`, call `reconcile_single_room(client, runtime.store, runtime.chat_mgr, room.room_id, event.sender)`
- immediately retry `resolve_chat_id(...)`
- only dispatch once a concrete logical chat id exists
- if the retry still returns `unregistered:{room_id}`, log a structured warning with room id, matrix user id, and reconciliation report, then send a short `OutgoingMessage`-equivalent Matrix text explaining that local state could not be restored automatically and a dev reset/restart may be required

Do not invent a new fallback chat id and do not auto-create rooms here; that would violate D-04. Keep this change inside `adapter/matrix/bot.py` so file ownership stays isolated for this plan.

cd /Users/a/MAI/sem2/lambda/surfaces-bot && pytest tests/adapter/matrix/test_dispatcher.py -q

- Unknown Matrix rooms trigger one targeted reconciliation attempt before dispatch.
- Successful targeted recovery leads to normal dispatch with a real logical `chat_id`.
- Failed targeted recovery logs a clear diagnostic and avoids a handler crash on missing chat state per D-06.
- No code path in this task provisions new Matrix rooms or Spaces.

The runtime treats unknown rooms as recoverable state drift first, not as a silent routing failure or crash path.

Run `pytest tests/adapter/matrix/test_dispatcher.py -q` and confirm both startup-bootstrap and first-access recovery behaviors are covered.

- A standard Matrix restart now attempts recovery before the bot starts processing live events.
- Unknown-room events are diagnosable and recoverable instead of falling straight into broken command handling.
- The runtime never provisions new server-side rooms during restart reconciliation.

After completion, create `.planning/phases/01.1-matrix-restart-reconciliation-and-dev-reset-workflow/01.1-02-SUMMARY.md`.