[fix] restart gap

Azamat 2026-04-02 23:39:25 +03:00
parent 770af1fe76
commit 50af62b3fb
10 changed files with 348 additions and 4 deletions

@@ -0,0 +1,17 @@
# 007 Startup Sandbox Reconciliation
Context
- Active sandboxes outlive the process because Docker keeps containers running across master-service restarts.
- The in-memory session repository is rebuilt on each start and otherwise loses running sandbox state.
Decision
- Reconcile sandbox state during app startup, before the cleanup loop starts and before the service begins serving requests.
- Discover running sandbox containers through Docker and read each session from the container labels `session_id`, `chat_id`, and `expires_at`.
- Rebuild the in-memory registry from the reconciled sessions and prefer the newest session per `chat_id`.
- Let the normal cleanup flow handle reconciled sessions that are already expired.
- Do not stop healthy sandbox containers during service shutdown; shutdown only stops background control-plane work and closes local resources.
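
The reconciliation rule above can be sketched in a few lines. This is a minimal illustration, not the code from this commit: the `Session` type, the function names, and the input shape (pairs of container ID and label mapping) are hypothetical. It also assumes that `expires_at` is an ISO 8601 timestamp and that "newest" means the latest `expires_at`, since that is the only timestamp the labels carry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Session:
    # Hypothetical shape; the real session entity may carry more fields.
    session_id: str
    chat_id: str
    expires_at: datetime
    container_id: str


def session_from_labels(container_id: str, labels: Mapping[str, str]) -> Optional[Session]:
    """Build a Session from container labels, or None if a label is missing or malformed."""
    try:
        # Assumption: expires_at is stored as an ISO 8601 string.
        expires_at = datetime.fromisoformat(labels["expires_at"])
        session_id = labels["session_id"]
        chat_id = labels["chat_id"]
    except (KeyError, ValueError):
        return None
    if expires_at.tzinfo is None:
        # Treat naive timestamps as UTC so all sessions stay comparable.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return Session(session_id, chat_id, expires_at, container_id)


def reconcile(containers: Iterable[Tuple[str, Mapping[str, str]]]) -> Dict[str, Session]:
    """Rebuild the registry keyed by chat_id, keeping the newest session per chat."""
    registry: Dict[str, Session] = {}
    for container_id, labels in containers:
        session = session_from_labels(container_id, labels)
        if session is None:
            continue  # not a sandbox container, or its labels are unusable
        current = registry.get(session.chat_id)
        # Assumption: "newest" is approximated by the latest expires_at.
        if current is None or session.expires_at > current.expires_at:
            registry[session.chat_id] = session
    # Expired sessions are kept on purpose: the normal cleanup flow removes them.
    return registry
```

The sketch does not decide what happens to an older container that loses to a newer session for the same `chat_id`; the ADR leaves that open.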
Consequences
- A restarted master-service reuses existing sandboxes instead of starting duplicates for the same chat.
- Startup now depends on Docker state access and should fail fast if runtime state cannot be listed.
- The reconciliation rule stays local to outer layers and does not leak Docker into usecases.
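
The last two consequences can be illustrated with a small startup sketch. Every name here (`SandboxRuntime`, `RuntimeUnavailable`, `start_app`, and the two callbacks) is hypothetical and only shows one way to order the steps: list runtime state first, abort startup if that fails, and keep Docker behind a port so that inner layers never import it.

```python
from typing import Callable, List, Mapping, Protocol


class SandboxRuntime(Protocol):
    """Hypothetical port; a Docker-backed adapter would live in an outer layer."""

    def list_running(self) -> List[Mapping[str, str]]:
        """Return the labels of every running sandbox container."""
        ...


class RuntimeUnavailable(RuntimeError):
    """Raised when sandbox runtime state cannot be listed at startup."""


def start_app(
    runtime: SandboxRuntime,
    restore: Callable[[List[Mapping[str, str]]], None],
    start_cleanup_loop: Callable[[], None],
) -> None:
    """Reconcile first, then start background work; fail fast on runtime errors."""
    try:
        running = runtime.list_running()
    except Exception as exc:
        # Fail fast: starting with an empty registry would create duplicate sandboxes.
        raise RuntimeUnavailable("cannot list sandbox containers at startup") from exc
    restore(running)        # rebuild the in-memory registry
    start_cleanup_loop()    # only now may expired sessions be collected
```

Passing the runtime in as a parameter keeps the ordering testable with a fake, without a Docker daemon.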