diff --git a/docs/superpowers/specs/2026-04-20-matrix-staged-attachments-design.md b/docs/superpowers/specs/2026-04-20-matrix-staged-attachments-design.md
new file mode 100644
index 0000000..ae8a11a
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-20-matrix-staged-attachments-design.md
@@ -0,0 +1,262 @@
+# Matrix Staged Attachments Design
+
+## Goal
+
+Make file sending in the Matrix surface usable for an AI agent despite current Matrix client behavior, especially in Element where media is often sent immediately as separate events without a shared text composer.
+
+The result should be:
+
+- files can arrive before the user writes the actual instruction
+- the surface stages those files instead of immediately sending them to the agent
+- the next normal user message in the same chat commits all staged files as one agent turn
+- the user can inspect and remove staged files with short chat commands
+
+## Core Decision
+
+The selected UX model is:
+
+`incoming Matrix media -> staged attachments for (chat_id, user_id) -> next normal message commits them`
+
+This means:
+
+- attachment-only events do not immediately invoke the agent
+- the bot acknowledges staged files with a service message
+- the next normal user message sends text plus all currently staged files to the agent
+- staged files are then cleared
+
+## Why This Decision
+
+Matrix natively models messages as separate events, and common clients do not provide a reliable "one text message with many attachments" composer flow.
+
+In practice this causes two UX failures for an AI bot:
+
+- users may send files first and only then write the task
+- users may send multiple files as multiple independent Matrix events
+
+If the surface treats each incoming file as a full agent turn, the bot becomes noisy and context-fragmented. If it ignores file-only messages, file handling feels broken.
+
+Staging is the smallest surface-side abstraction that fixes both problems without fighting the Matrix event model.
+
+## Scope
+
+This design covers:
+
+- staging inbound Matrix attachments before agent submission
+- per-chat attachment state for a specific user
+- user-facing service messages for staged attachments
+- short commands for listing and removing staged files
+- commit behavior on the next normal message
+
+This design does not cover:
+
+- edits or redactions of original Matrix media events as attachment controls
+- cross-surface shared staging
+- thread-aware staging beyond the existing `chat_id` boundary
+- changes to the platform attachment contract
+
+## State Model
+
+### Staging key
+
+Staged attachments are isolated by:
+
+- `chat_id`
+- `user_id`
+
+This means:
+
+- files staged by a user in one chat never appear in another chat
+- files staged by one user do not mix with another user's files in the same room
+
+### Staged attachment record
+
+Each staged attachment must track at least:
+
+- stable internal id
+- display filename
+- workspace-relative path
+- MIME type if known
+- created timestamp
+
+User-visible commands operate on the current ordered list, not on internal ids.
+
+### Lifecycle
+
+A staged attachment is in exactly one of these states:
+
+1. `staged`
+2. `committed`
+3. `removed`
+
+Rules:
+
+- only `staged` attachments appear in `!list`
+- `committed` attachments are no longer user-removable
+- `removed` attachments are excluded from future commits
+
+## Inbound Behavior
+
+### Attachment-only event
+
+If the Matrix surface receives one or more file/media events from a user without a normal text message to commit them:
+
+1. download each file into shared `/workspace`
+2. add each file to the staged set for `(chat_id, user_id)`
+3. do not call the agent yet
+4. send a service acknowledgment message
+
+### Service acknowledgment
+
+The service message must communicate:
+
+- the current staged attachment list with indices
+- that the next normal message will be sent to the agent together with those files
+- available commands: `!list`, `!remove <index>`, `!remove all`
+
+Example shape:
+
+```text
+Staged attachments:
+1. screenshot.png
+2. invoice.pdf
+
+Your next message will be sent to the agent with these files.
+Commands: !list, !remove <index>, !remove all
+```
+
+### Burst handling
+
+Matrix clients may send multiple files as separate consecutive events.
+
+To avoid bot spam, service acknowledgments should be debounced over a short window and aggregated into one reply where feasible.
+
+The acknowledgment must reflect the full current staged set, not only the most recently received file.
+
+## Commit Behavior
+
+### Commit trigger
+
+The commit trigger is:
+
+- the next normal user message in the same `(chat_id, user_id)` scope
+
+Normal user message means:
+
+- not a staging control command
+- not a pure attachment event being staged
+
+### Commit action
+
+When a commit-triggering message arrives:
+
+1. collect all currently staged attachments for `(chat_id, user_id)`
+2. send the user text plus those attachments to the agent as one turn
+3. mark all included staged attachments as `committed`
+4. clear the staged set
+
+After commit:
+
+- the just-sent attachments must no longer appear in `!list`
+- a later file upload starts a new staged set
+
+## Commands
+
+### `!list`
+
+Shows the current staged attachment list for the user in the current chat.
+
+If the list is empty, the response should be short and explicit.
+
+### `!remove <index>`
+
+Removes the staged attachment at the current 1-based index.
+
+Behavior:
+
+- if the index is valid, remove that staged attachment and return the updated staged list
+- if the index is invalid, return a short error without repeating the list
+
+### `!remove all`
+
+Clears the entire staged set for the user in the current chat.
+
+The response should be short and explicit.
+
+## Ordering Rules
+
+The staged list is ordered by staging time.
+
+User-facing indices:
+
+- are 1-based
+- are recalculated from the current staged set
+- may change after removals
+
+Therefore:
+
+- `!list` always shows the current authoritative numbering
+- after a successful `!remove <index>`, the bot should reply with the refreshed list
+
+## Error Handling
+
+### Download failure
+
+If a file cannot be downloaded or stored:
+
+- do not add it to the staged set
+- do not pretend it will be sent later
+- send a short user-visible failure message
+
+### Invalid command
+
+If the command is malformed or uses an invalid index:
+
+- return a short error
+- do not commit staged attachments
+- do not clear the staged set
+
+### Agent submission failure
+
+If commit fails when sending the text plus staged files to the agent:
+
+- staged attachments must remain available for retry unless the failure is known to be irreversible
+- the user-visible error should make it clear that the files were not consumed
+
+This prevents silent loss of staged context.
+
+## Interaction with Shared Workspace Design
+
+This design assumes the shared-workspace contract defined in
+[2026-04-20-matrix-shared-workspace-file-flow-design.md](/Users/a/MAI/sem2/lambda/surfaces-bot/docs/superpowers/specs/2026-04-20-matrix-shared-workspace-file-flow-design.md).
+
+Specifically:
+
+- staged files are stored in shared `/workspace`
+- the final commit still passes workspace-relative paths to `platform-agent`
+- staging changes only when the surface chooses to invoke the agent, not how attachments are represented
+
+## Testing
+
+The implementation must cover:
+
+- file-only Matrix events are staged and do not immediately invoke the agent
+- service acknowledgment includes staged filenames and command hints
+- `!list` returns the current staged set for the correct `(chat_id, user_id)`
+- `!remove <index>` removes the correct staged attachment and refreshes numbering
+- `!remove all` clears the staged set
+- invalid `!remove <index>` returns a short error and keeps state unchanged
+- the next normal message commits all staged attachments with the text as one agent turn
+- committed attachments disappear from staging after success
+- failed commits preserve staged attachments
+- staging in one chat does not leak into another chat
+- staging for one user does not leak to another user in the same room
+
+## Non-Goals
+
+This design intentionally does not attempt to:
+
+- emulate Telegram-style albums in Matrix
+- rely on special support from Element or other Matrix clients
+- introduce a rich interactive attachment management UI
+
+The goal is a reliable chat-native workflow that works within Matrix's actual event model.
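
Review note (not part of the patch): the state model, ordering rules, and commit semantics in this spec can be sketched as a small in-memory store. This is only an illustration under assumptions. The spec does not name an implementation language, so Python is a guess, and `StagedAttachment`, `StagingStore`, and the `send_to_agent` callback are hypothetical names that do not come from the spec or from any Matrix library. Downloading, debounced acknowledgments, and persistence are left out.

```python
from __future__ import annotations

import itertools
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StagedAttachment:
    # Fields mirror the "Staged attachment record" list in the spec.
    id: int                          # stable internal id
    filename: str                    # display filename
    path: str                        # workspace-relative path
    mime_type: Optional[str] = None  # MIME type if known
    created_at: float = field(default_factory=time.time)
    state: str = "staged"            # "staged" | "committed" | "removed"


class StagingStore:
    """Hypothetical in-memory store keyed by (chat_id, user_id)."""

    def __init__(self) -> None:
        self._staged: dict[tuple[str, str], list[StagedAttachment]] = {}
        self._ids = itertools.count(1)

    def stage(self, chat_id: str, user_id: str, filename: str, path: str,
              mime_type: Optional[str] = None) -> StagedAttachment:
        # Appending keeps the list ordered by staging time.
        att = StagedAttachment(next(self._ids), filename, path, mime_type)
        self._staged.setdefault((chat_id, user_id), []).append(att)
        return att

    def staged(self, chat_id: str, user_id: str) -> list[StagedAttachment]:
        # Backs `!list`: only attachments still in the "staged" state live here.
        return list(self._staged.get((chat_id, user_id), []))

    def remove(self, chat_id: str, user_id: str, index: int) -> bool:
        # `index` is the user-facing 1-based position in the current list.
        items = self._staged.get((chat_id, user_id), [])
        if not 1 <= index <= len(items):
            return False  # invalid index: state stays unchanged
        items.pop(index - 1).state = "removed"
        return True

    def remove_all(self, chat_id: str, user_id: str) -> int:
        items = self._staged.pop((chat_id, user_id), [])
        for att in items:
            att.state = "removed"
        return len(items)

    def commit(self, chat_id: str, user_id: str, text: str,
               send_to_agent: Callable[[str, list[str]], None]) -> None:
        key = (chat_id, user_id)
        items = list(self._staged.get(key, []))
        # If send_to_agent raises, the lines below never run, so the
        # staged set survives for a retry.
        send_to_agent(text, [att.path for att in items])
        for att in items:
            att.state = "committed"
        self._staged.pop(key, None)
```

Clearing the staged set only after `send_to_agent` returns is one way to satisfy the "failed commits preserve staged attachments" test item. A real implementation would also need to decide what happens to files that arrive while a commit is in flight, which this synchronous sketch does not address.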