docs: add matrix staged attachments design

Mikhail Putilovskij 2026-04-20 16:05:28 +03:00
parent 8b04fcaf77
commit 105ecc68ed

@@ -0,0 +1,262 @@
# Matrix Staged Attachments Design
## Goal
Make sending files through the Matrix surface usable for an AI agent despite current Matrix client behavior. This matters most in Element, where media is often sent immediately as separate events, with no shared composer for adding text.
The result should be:
- files can arrive before the user writes the actual instruction
- the surface stages those files instead of immediately sending them to the agent
- the next normal user message in the same chat commits all staged files as one agent turn
- the user can inspect and remove staged files with short chat commands
## Core Decision
The selected UX model is:
`incoming Matrix media -> staged attachments for (chat_id, user_id) -> next normal message commits them`
This means:
- attachment-only events do not immediately invoke the agent
- the bot acknowledges staged files with a service message
- the next normal user message sends text plus all currently staged files to the agent
- staged files are then cleared
## Why This Decision
Matrix natively models messages as separate events, and common clients do not provide a reliable "one text message with many attachments" composer flow.
In practice this causes two UX failures for an AI bot:
- users may send files first and only then write the task
- users may send multiple files as multiple independent Matrix events
If the surface treats each incoming file as a full agent turn, the bot becomes noisy and context-fragmented. If it ignores file-only messages, file handling feels broken.
Staging is the smallest surface-side abstraction that fixes both problems without fighting the Matrix event model.
## Scope
This design covers:
- staging inbound Matrix attachments before agent submission
- attachment state scoped to one user in one chat
- user-facing service messages for staged attachments
- short commands for listing and removing staged files
- commit behavior on the next normal message
This design does not cover:
- edits or redactions of original Matrix media events as attachment controls
- cross-surface shared staging
- thread-aware staging beyond the existing `chat_id` boundary
- changes to the platform attachment contract
## State Model
### Staging key
Staged attachments are isolated by:
- `chat_id`
- `user_id`
This means:
- files staged by a user in one chat never appear in another chat
- files staged by one user do not mix with another user's files in the same room
### Staged attachment record
Each staged attachment must track at least:
- stable internal id
- display filename
- workspace-relative path
- MIME type if known
- created timestamp
User-visible commands operate on the current ordered list, not on internal ids.
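
The record and key above could be modeled roughly as follows. This is a minimal in-memory sketch, not the actual implementation: the names `StagedAttachment`, `StagingStore`, `stage`, and `staged` are hypothetical, as is the choice of a UUID for the internal id.

```python
from __future__ import annotations

import time
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StagedAttachment:
    """One staged file. Field names are illustrative, not a fixed schema."""

    filename: str                 # display filename
    path: str                     # workspace-relative path
    mime_type: str | None = None  # MIME type if known
    id: str = field(default_factory=lambda: uuid.uuid4().hex)  # stable internal id
    created_at: float = field(default_factory=time.time)       # created timestamp


class StagingStore:
    """In-memory staged sets keyed by (chat_id, user_id)."""

    def __init__(self) -> None:
        self._sets: dict[tuple[str, str], list[StagedAttachment]] = defaultdict(list)

    def stage(self, chat_id: str, user_id: str, att: StagedAttachment) -> None:
        # Appending keeps the list in staging order.
        self._sets[(chat_id, user_id)].append(att)

    def staged(self, chat_id: str, user_id: str) -> list[StagedAttachment]:
        # Return a copy so callers cannot mutate the store by accident.
        return list(self._sets.get((chat_id, user_id), []))
```

Keying on the `(chat_id, user_id)` tuple gives both isolation guarantees with a single lookup.
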
### Lifecycle
A staged attachment is in exactly one of these states:
1. `staged`
2. `committed`
3. `removed`
Rules:
- only `staged` attachments appear in `!list`
- `committed` attachments are no longer user-removable
- `removed` attachments are excluded from future commits
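
One way to enforce these rules is an explicit transition table. The sketch below assumes that `committed` and `removed` are terminal states, which the rules imply but do not state outright; `State`, `Attachment`, and `visible_in_list` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class State(Enum):
    STAGED = "staged"
    COMMITTED = "committed"
    REMOVED = "removed"


# Assumption: both transitions start from STAGED; the other two states are terminal.
ALLOWED = {
    State.STAGED: {State.COMMITTED, State.REMOVED},
    State.COMMITTED: set(),
    State.REMOVED: set(),
}


@dataclass
class Attachment:
    filename: str
    state: State = State.STAGED

    def transition(self, new_state: State) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state


def visible_in_list(attachments: list[Attachment]) -> list[Attachment]:
    """Only `staged` attachments appear in `!list`."""
    return [a for a in attachments if a.state is State.STAGED]
```
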
## Inbound Behavior
### Attachment-only event
If the Matrix surface receives one or more file/media events from a user without a normal text message to commit them:
1. download each file into shared `/workspace`
2. add each file to the staged set for `(chat_id, user_id)`
3. do not call the agent yet
4. send a service acknowledgment message
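
The four steps can be sketched as a single handler. Everything here is an assumption made for illustration: `download` and `send_service_message` are hypothetical injected callables, media is identified by plain strings, and the staged sets are a plain dict of workspace-relative paths.

```python
from typing import Callable

StagedSets = dict[tuple[str, str], list[str]]


def on_attachment_events(
    staged: StagedSets,
    chat_id: str,
    user_id: str,
    media_refs: list[str],
    download: Callable[[str], str],                # hypothetical: stores a file, returns its workspace-relative path
    send_service_message: Callable[[str], None],   # hypothetical: posts a bot message to the chat
) -> None:
    paths = [download(ref) for ref in media_refs]             # 1. download into /workspace
    staged.setdefault((chat_id, user_id), []).extend(paths)   # 2. add to the staged set
    # 3. the agent is deliberately not called here
    count = len(staged[(chat_id, user_id)])
    send_service_message(                                     # 4. acknowledge
        f"Staged {count} file(s). Your next message will be sent with them."
    )
```
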
### Service acknowledgment
The service message must communicate:
- the current staged attachment list with indices
- that the next normal message will be sent to the agent together with those files
- available commands: `!list`, `!remove <n>`, `!remove all`
Example shape:
```text
Staged attachments:
1. screenshot.png
2. invoice.pdf
Your next message will be sent to the agent with these files.
Commands: !list, !remove <n>, !remove all
```
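
A formatter producing exactly this shape might look like the following sketch; `format_ack` is a hypothetical helper name.

```python
def format_ack(filenames: list[str]) -> str:
    """Render the staged list with 1-based indices plus the command hints."""
    lines = ["Staged attachments:"]
    lines += [f"{i}. {name}" for i, name in enumerate(filenames, start=1)]
    lines.append("Your next message will be sent to the agent with these files.")
    lines.append("Commands: !list, !remove <n>, !remove all")
    return "\n".join(lines)
```
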
### Burst handling
Matrix clients may send multiple files as separate consecutive events.
To avoid bot spam, service acknowledgments should be debounced over a short window and aggregated into one reply where feasible.
The acknowledgment must reflect the full current staged set, not only the most recently received file.
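
Debouncing could be done with a resettable `asyncio` timer, one per `(chat_id, user_id)`. This is a sketch under assumptions: `AckDebouncer` is a hypothetical name, the window length is left to the caller because the design does not fix one, and `send_ack` is expected to render the full staged set at the moment it fires.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class AckDebouncer:
    """Send one acknowledgment after a quiet period following the last staged file."""

    def __init__(self, window: float, send_ack: Callable[[], Awaitable[None]]) -> None:
        self._window = window      # seconds; the design only says "short"
        self._send_ack = send_ack  # should read the current staged set when called
        self._task: Optional[asyncio.Task] = None

    def notify(self) -> None:
        """Call after each file is staged; restarts the quiet-period timer."""
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.create_task(self._fire())

    async def _fire(self) -> None:
        await asyncio.sleep(self._window)
        # Detach first so a notify() during sending cannot cancel an in-flight ack.
        self._task = None
        await self._send_ack()
```

Because the acknowledgment is built when the timer fires rather than when each file arrives, it naturally covers every file staged during the burst.
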
## Commit Behavior
### Commit trigger
The commit trigger is:
- the next normal user message in the same `(chat_id, user_id)` scope
Normal user message means:
- not a staging control command
- not a pure attachment event being staged
### Commit action
When a commit-triggering message arrives:
1. collect all currently staged attachments for `(chat_id, user_id)`
2. send the user text plus those attachments to the agent as one turn
3. mark all included staged attachments as `committed`
4. clear the staged set
After commit:
- the just-sent attachments must no longer appear in `!list`
- a later file upload starts a new staged set
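
The commit action can be sketched as below, assuming a hypothetical synchronous `send_to_agent` callable and staged sets stored as lists of workspace-relative paths. The staged set is cleared only after the send returns, so an exception leaves it untouched.

```python
from typing import Callable

StagedSets = dict[tuple[str, str], list[str]]


def commit(
    staged: StagedSets,
    chat_id: str,
    user_id: str,
    text: str,
    send_to_agent: Callable[[str, list[str]], None],  # hypothetical agent call
) -> list[str]:
    key = (chat_id, user_id)
    paths = list(staged.get(key, []))  # 1. collect the currently staged attachments
    send_to_agent(text, paths)         # 2. one agent turn: text plus attachments
    # 3 and 4. Reached only if the send did not raise: the paths now count as
    # committed and the staged set is cleared, so a later upload starts a new set.
    staged.pop(key, None)
    return paths
```
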
## Commands
### `!list`
Shows the current staged attachment list for the user in the current chat.
If the list is empty, the response should be short and explicit.
### `!remove <n>`
Removes the staged attachment at the given 1-based index in the current list.
Behavior:
- if the index is valid, remove that staged attachment and return the updated staged list
- if the index is invalid, return a short error without repeating the list
### `!remove all`
Clears the entire staged set for the user in the current chat.
The response should be short and explicit.
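
Recognizing these three commands, and telling them apart from a normal message, might look like this sketch. `Command` and `parse_command` are hypothetical names, and treating any other message that starts with `!list` or `!remove` as malformed is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    name: str                 # "list", "remove", "remove_all", or "invalid"
    index: int | None = None  # 1-based index, set only for "remove"


def parse_command(body: str) -> Command | None:
    """Return a Command for staging control messages, or None for a normal message."""
    parts = body.strip().split()
    if not parts or parts[0] not in ("!list", "!remove"):
        return None  # normal message: this is a commit trigger
    if parts == ["!list"]:
        return Command("list")
    if parts == ["!remove", "all"]:
        return Command("remove_all")
    if len(parts) == 2 and parts[0] == "!remove" and parts[1].isdecimal():
        return Command("remove", int(parts[1]))  # range check happens against the list
    return Command("invalid")  # malformed: short error, no commit
```
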
## Ordering Rules
The staged list is ordered by staging time.
User-facing indices:
- are 1-based
- are recalculated from the current staged set
- may change after removals
Therefore:
- `!list` always shows the current authoritative numbering
- after a successful `!remove <n>`, the bot should reply with the refreshed list
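
Because indices are derived from the current list rather than stored, removal and renumbering reduce to a list deletion followed by a fresh enumeration. A minimal sketch, with `remove_at` as a hypothetical helper operating on display filenames:

```python
def remove_at(staged: list[str], n: int) -> list[str]:
    """Remove the n-th (1-based) staged filename and return the refreshed listing lines."""
    if not 1 <= n <= len(staged):
        raise IndexError(f"No staged attachment #{n}")
    del staged[n - 1]
    # Indices are recomputed from what is left, so later entries shift down.
    return [f"{i}. {name}" for i, name in enumerate(staged, start=1)]
```
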
## Error Handling
### Download failure
If a file cannot be downloaded or stored:
- do not add it to the staged set
- do not pretend it will be sent later
- send a short user-visible failure message
### Invalid command
If the command is malformed or uses an invalid index:
- return a short error
- do not commit staged attachments
- do not clear the staged set
### Agent submission failure
If commit fails when sending the text plus staged files to the agent:
- staged attachments must remain available for retry unless the failure is known to be irreversible
- the user-visible error should make it clear that the files were not consumed
This prevents silent loss of staged context.
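
The retention rule can be sketched as a wrapper around the agent call. `IrreversibleSubmitError` is a hypothetical marker exception, the message wording is illustrative, and dropping the files in the irreversible case is an assumption, since the design leaves that branch open.

```python
from __future__ import annotations

from typing import Callable


class IrreversibleSubmitError(Exception):
    """Hypothetical marker for failures where a retry cannot succeed."""


def try_commit(
    staged: list[str],
    text: str,
    send_to_agent: Callable[[str, list[str]], None],  # hypothetical agent call
) -> str | None:
    """Return a user-visible error message, or None on success."""
    try:
        send_to_agent(text, list(staged))
    except IrreversibleSubmitError:
        staged.clear()  # assumption: nothing to retry, so the set is dropped
        return "Could not send your message. The staged files were dropped."
    except Exception:
        # Retryable or unknown failure: leave the staged set untouched.
        return "Could not send your message. Your staged files were not consumed; please try again."
    staged.clear()  # success: the files are committed
    return None
```
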
## Interaction with Shared Workspace Design
This design assumes the shared-workspace contract defined in
[2026-04-20-matrix-shared-workspace-file-flow-design.md](/Users/a/MAI/sem2/lambda/surfaces-bot/docs/superpowers/specs/2026-04-20-matrix-shared-workspace-file-flow-design.md).
Specifically:
- staged files are stored in shared `/workspace`
- the final commit still passes workspace-relative paths to `platform-agent`
- staging changes only the point at which the surface invokes the agent, not how attachments are represented
## Testing
The implementation must cover:
- file-only Matrix events are staged and do not immediately invoke the agent
- service acknowledgment includes staged filenames and command hints
- `!list` returns the current staged set for the correct `(chat_id, user_id)`
- `!remove <n>` removes the correct staged attachment and refreshes numbering
- `!remove all` clears the staged set
- invalid `!remove <n>` returns a short error and keeps state unchanged
- the next normal message commits all staged attachments with the text as one agent turn
- committed attachments disappear from staging after success
- failed commits preserve staged attachments
- staging in one chat does not leak into another chat
- staging for one user does not leak to another user in the same room
## Non-Goals
This design intentionally does not attempt to:
- emulate Telegram-style albums in Matrix
- rely on special support from Element or other Matrix clients
- introduce a rich interactive attachment management UI
The goal is a reliable chat-native workflow that works within Matrix's actual event model.