Revise TODO.md to introduce Subagent Architecture and Interactive Clarifying Questions Tool
- Updated the structure of the TODO list, renaming and expanding the "Context Management" section to "Subagent Architecture" with detailed problem and solution descriptions. - Added a new section for "Interactive Clarifying Questions Tool," outlining the problem of agent assumptions and proposing a multiple-choice prompt tool for user interaction. - Included implementation details and benefits for both features, enhancing clarity and direction for future development.
This commit is contained in:
parent
9c8d707530
commit
3db83b6824
1 changed files with 362 additions and 19 deletions
381
TODO.md
381
TODO.md
|
|
@ -39,7 +39,85 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 1. Context Management
|
## 1. Subagent Architecture (Context Isolation) 🎯
|
||||||
|
|
||||||
|
**Problem:** Long-running tools (terminal commands, browser automation, complex file operations) consume massive context. A single `ls -la` can add hundreds of lines. Browser snapshots, debugging sessions, and iterative terminal work quickly bloat the main conversation, leaving less room for actual reasoning.
|
||||||
|
|
||||||
|
**Solution:** The main agent becomes an **orchestrator** that delegates context-heavy tasks to **subagents**.
|
||||||
|
|
||||||
|
**Architecture:**
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────────┐
|
||||||
|
│ ORCHESTRATOR (main agent) │
|
||||||
|
│ - Receives user request │
|
||||||
|
│ - Plans approach │
|
||||||
|
│ - Delegates heavy tasks to subagents │
|
||||||
|
│ - Receives summarized results │
|
||||||
|
│ - Maintains clean, focused context │
|
||||||
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
|
│ │ │
|
||||||
|
▼ ▼ ▼
|
||||||
|
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
||||||
|
│ TERMINAL AGENT │ │ BROWSER AGENT │ │ CODE AGENT │
|
||||||
|
│ - terminal tool │ │ - browser tools │ │ - file tools │
|
||||||
|
│ - file tools │ │ - web_search │ │ - terminal │
|
||||||
|
│ │ │ - web_extract │ │ │
|
||||||
|
│ Isolated context│ │ Isolated context│ │ Isolated context│
|
||||||
|
│ Returns summary │ │ Returns summary │ │ Returns summary │
|
||||||
|
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
**How it works:**
|
||||||
|
1. User asks: "Set up a new Python project with FastAPI and tests"
|
||||||
|
2. Orchestrator plans: "I need to create files, install deps, write code"
|
||||||
|
3. Orchestrator calls: `terminal_task(goal="Create venv, install fastapi pytest", context="New project in ~/myapp")`
|
||||||
|
4. **Subagent spawns** with fresh context, only terminal/file tools
|
||||||
|
5. Subagent iterates (may take 10+ tool calls, lots of output)
|
||||||
|
6. Subagent completes → returns summary: "Created venv, installed fastapi==0.109.0, pytest==8.0.0"
|
||||||
|
7. Orchestrator receives **only the summary**, context stays clean
|
||||||
|
8. Orchestrator continues with next subtask
|
||||||
|
|
||||||
|
**Key tools to implement:**
|
||||||
|
- [ ] `terminal_task(goal, context, cwd?)` - Delegate terminal/shell work
|
||||||
|
- [ ] `browser_task(goal, context, start_url?)` - Delegate web research/automation
|
||||||
|
- [ ] `code_task(goal, context, files?)` - Delegate code writing/modification
|
||||||
|
- [ ] Generic `delegate_task(goal, context, toolsets=[])` - Flexible delegation
|
||||||
|
|
||||||
|
**Implementation details:**
|
||||||
|
- [ ] Subagent uses same `run_agent.py` but with:
|
||||||
|
- Fresh/empty conversation history
|
||||||
|
- Limited toolset (only what's needed)
|
||||||
|
- Smaller max_iterations (focused task)
|
||||||
|
- Task-specific system prompt
|
||||||
|
- [ ] Subagent returns structured result:
|
||||||
|
```python
|
||||||
|
{
|
||||||
|
"success": True,
|
||||||
|
"summary": "Installed 3 packages, created 2 files",
|
||||||
|
"details": "Optional longer explanation if needed",
|
||||||
|
"artifacts": ["~/myapp/requirements.txt", "~/myapp/main.py"], # Files created
|
||||||
|
"errors": [] # Any issues encountered
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- [ ] Orchestrator sees only the summary in its context
|
||||||
|
- [ ] Full subagent transcript saved separately for debugging
|
||||||
|
|
||||||
|
**Benefits:**
|
||||||
|
- 🧹 **Clean context** - Orchestrator stays focused, doesn't drown in tool output
|
||||||
|
- 📊 **Better token efficiency** - 50 terminal outputs → 1 summary paragraph
|
||||||
|
- 🎯 **Focused subagents** - Each agent has just the tools it needs
|
||||||
|
- 🔄 **Parallel potential** - Independent subtasks could run concurrently
|
||||||
|
- 🐛 **Easier debugging** - Each subtask has its own isolated transcript
|
||||||
|
|
||||||
|
**When to use subagents vs direct tools:**
|
||||||
|
- **Subagent**: Multi-step tasks, iteration likely, lots of output expected
|
||||||
|
- **Direct**: Quick one-off commands, simple file reads, user needs to see output
|
||||||
|
|
||||||
|
**Files to modify:** `run_agent.py` (add orchestration mode), new `tools/delegate_tools.py`, new `subagent_runner.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Context Management (complements Subagents)
|
||||||
|
|
||||||
**Problem:** Context grows unbounded during long conversations. Trajectory compression exists for training data post-hoc, but live conversations lack intelligent context management.
|
**Problem:** Context grows unbounded during long conversations. Trajectory compression exists for training data post-hoc, but live conversations lack intelligent context management.
|
||||||
|
|
||||||
|
|
@ -162,7 +240,43 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 6. Uncertainty & Honesty Calibration 🎚️
|
## 6. Interactive Clarifying Questions Tool ❓
|
||||||
|
|
||||||
|
**Problem:** Agent sometimes makes assumptions or guesses when it should ask the user. Currently can only ask via text, which gets lost in long outputs.
|
||||||
|
|
||||||
|
**Ideas:**
|
||||||
|
- [ ] **Multiple-choice prompt tool** - Let agent present structured choices to user:
|
||||||
|
```
|
||||||
|
ask_user_choice(
|
||||||
|
question="Should the language switcher enable only German or all languages?",
|
||||||
|
choices=[
|
||||||
|
"Only enable German - works immediately",
|
||||||
|
"Enable all, mark untranslated - show fallback notice",
|
||||||
|
"Let me specify something else"
|
||||||
|
]
|
||||||
|
)
|
||||||
|
```
|
||||||
|
- Renders as interactive terminal UI with arrow key / Tab navigation
|
||||||
|
- User selects option, result returned to agent
|
||||||
|
- Up to 4 choices + optional free-text option
|
||||||
|
|
||||||
|
- [ ] **Implementation:**
|
||||||
|
- Use `inquirer` or `questionary` Python library for rich terminal prompts
|
||||||
|
- Tool returns selected option text (or user's custom input)
|
||||||
|
- **CLI-only** - only works when running via `cli.py` (not API/programmatic use)
|
||||||
|
- Graceful fallback: if not in interactive mode, return error asking agent to rephrase as text
|
||||||
|
|
||||||
|
- [ ] **Use cases:**
|
||||||
|
- Clarify ambiguous requirements before starting work
|
||||||
|
- Confirm destructive operations with clear options
|
||||||
|
- Let user choose between implementation approaches
|
||||||
|
- Checkpoint complex multi-step workflows
|
||||||
|
|
||||||
|
**Files to modify:** New `tools/ask_user_tool.py`, `cli.py` (detect interactive mode), `model_tools.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Uncertainty & Honesty Calibration 🎚️
|
||||||
|
|
||||||
**Problem:** Sometimes confidently wrong. Should be better calibrated about what I know vs. don't know.
|
**Problem:** Sometimes confidently wrong. Should be better calibrated about what I know vs. don't know.
|
||||||
|
|
||||||
|
|
@ -179,7 +293,7 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 7. Resource Awareness & Efficiency 💰
|
## 8. Resource Awareness & Efficiency 💰
|
||||||
|
|
||||||
**Problem:** No awareness of costs, time, or resource usage. Could be smarter about efficiency.
|
**Problem:** No awareness of costs, time, or resource usage. Could be smarter about efficiency.
|
||||||
|
|
||||||
|
|
@ -197,7 +311,7 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 8. Collaborative Problem Solving 🤝
|
## 9. Collaborative Problem Solving 🤝
|
||||||
|
|
||||||
**Problem:** Interaction is command/response. Complex problems benefit from dialogue.
|
**Problem:** Interaction is command/response. Complex problems benefit from dialogue.
|
||||||
|
|
||||||
|
|
@ -216,7 +330,7 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 9. Project-Local Context 💾
|
## 10. Project-Local Context 💾
|
||||||
|
|
||||||
**Problem:** Valuable context lost between sessions.
|
**Problem:** Valuable context lost between sessions.
|
||||||
|
|
||||||
|
|
@ -236,7 +350,7 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 10. Graceful Degradation & Robustness 🛡️
|
## 11. Graceful Degradation & Robustness 🛡️
|
||||||
|
|
||||||
**Problem:** When things go wrong, recovery is limited. Should fail gracefully.
|
**Problem:** When things go wrong, recovery is limited. Should fail gracefully.
|
||||||
|
|
||||||
|
|
@ -257,17 +371,17 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 11. Tools & Skills Wishlist 🧰
|
## 12. Tools & Skills Wishlist 🧰
|
||||||
|
|
||||||
*Things that would need new tool implementations (can't do well with current tools):*
|
*Things that would need new tool implementations (can't do well with current tools):*
|
||||||
|
|
||||||
### High-Impact
|
### High-Impact
|
||||||
|
|
||||||
- [ ] **Audio/Video Transcription** 🎬
|
- [ ] **Audio/Video Transcription** 🎬 *(See also: Section 16 for detailed spec)*
|
||||||
- Transcribe audio files, podcasts, YouTube videos
|
- Transcribe audio files, podcasts, YouTube videos
|
||||||
- Extract key moments from video
|
- Extract key moments from video
|
||||||
- Currently blind to multimedia content
|
- Voice memo transcription for messaging integrations
|
||||||
- *Could potentially use whisper via terminal, but native tool would be cleaner*
|
- *Provider options: Whisper API, Deepgram, local Whisper*
|
||||||
|
|
||||||
- [ ] **Diagram Rendering** 📊
|
- [ ] **Diagram Rendering** 📊
|
||||||
- Render Mermaid/PlantUML to actual images
|
- Render Mermaid/PlantUML to actual images
|
||||||
|
|
@ -324,13 +438,169 @@ These items need to be addressed ASAP:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 13. Messaging Platform Integrations 💬
|
||||||
|
|
||||||
|
**Problem:** Agent currently only works via `cli.py` which requires direct terminal access. Users may want to interact via messaging apps from their phone or other devices.
|
||||||
|
|
||||||
|
**Architecture:**
|
||||||
|
- `run_agent.py` already accepts `conversation_history` parameter and returns updated messages ✅
|
||||||
|
- Need: persistent session storage, platform monitors, session key resolution
|
||||||
|
|
||||||
|
**Implementation approach:**
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────┐
|
||||||
|
│ Platform Monitor (e.g., telegram_monitor.py) │
|
||||||
|
│ ├─ Long-running daemon connecting to messaging platform │
|
||||||
|
│ ├─ On message: resolve session key → load history from disk│
|
||||||
|
│ ├─ Call run_agent.py with loaded history │
|
||||||
|
│ ├─ Save updated history back to disk (JSONL) │
|
||||||
|
│ └─ Send response back to platform │
|
||||||
|
└─────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
**Platform support (each user sets up their own credentials):**
|
||||||
|
- [ ] **Telegram** - via `python-telegram-bot` or `grammy` equivalent
|
||||||
|
- Bot token from @BotFather
|
||||||
|
- Easiest to set up, good for personal use
|
||||||
|
- [ ] **Discord** - via `discord.py`
|
||||||
|
- Bot token from Discord Developer Portal
|
||||||
|
- Can work in servers (group sessions) or DMs
|
||||||
|
- [ ] **WhatsApp** - via `baileys` (WhatsApp Web protocol)
|
||||||
|
- QR code scan to authenticate
|
||||||
|
- More complex, but reaches most people
|
||||||
|
|
||||||
|
**Session management:**
|
||||||
|
- [ ] **Session store** - JSONL persistence per session key
|
||||||
|
- `~/.hermes/sessions/{session_key}.jsonl`
|
||||||
|
- Session keys: `telegram:dm:{user_id}`, `discord:channel:{id}`, etc.
|
||||||
|
- [ ] **Session expiry** - Configurable reset policies
|
||||||
|
- Daily reset (default 4am) OR idle timeout (e.g., 2 hours)
|
||||||
|
- Manual reset via `/reset` or `/new` command in chat
|
||||||
|
- [ ] **Session continuity** - Conversations persist across messages until reset
|
||||||
|
|
||||||
|
**Files to create:** `monitors/telegram_monitor.py`, `monitors/discord_monitor.py`, `monitors/session_store.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 14. Scheduled Tasks / Cron Jobs ⏰
|
||||||
|
|
||||||
|
**Problem:** Agent only runs on-demand. Some tasks benefit from scheduled execution (daily summaries, monitoring, reminders).
|
||||||
|
|
||||||
|
**Ideas:**
|
||||||
|
- [ ] **Cron-style scheduler** - Run agent turns on a schedule
|
||||||
|
- Store jobs in `~/.hermes/cron/jobs.json`
|
||||||
|
- Each job: `{ id, schedule, prompt, session_mode, delivery }`
|
||||||
|
- Uses APScheduler or similar Python library
|
||||||
|
|
||||||
|
- [ ] **Session modes:**
|
||||||
|
- `isolated` - Fresh session each run (no history, clean context)
|
||||||
|
- `main` - Append to main session (agent remembers previous scheduled runs)
|
||||||
|
|
||||||
|
- [ ] **Delivery options:**
|
||||||
|
- Write output to file (`~/.hermes/cron/output/{job_id}/{timestamp}.md`)
|
||||||
|
- Send to messaging channel (if integrations enabled)
|
||||||
|
- Both
|
||||||
|
|
||||||
|
- [ ] **CLI interface:**
|
||||||
|
```bash
|
||||||
|
# List scheduled jobs
|
||||||
|
python cli.py --cron list
|
||||||
|
|
||||||
|
# Add a job (runs daily at 9am)
|
||||||
|
python cli.py --cron add "Summarize my email inbox" --schedule "0 9 * * *"
|
||||||
|
|
||||||
|
# Quick syntax for simple intervals
|
||||||
|
python cli.py --cron add "Check server status" --every 30m
|
||||||
|
|
||||||
|
# Remove a job
|
||||||
|
python cli.py --cron remove <job_id>
|
||||||
|
```
|
||||||
|
|
||||||
|
- [ ] **Agent self-scheduling** - Let the agent create its own cron jobs
|
||||||
|
- New tool: `schedule_task(prompt, schedule, session_mode)`
|
||||||
|
- "Remind me to check the deployment tomorrow at 9am"
|
||||||
|
- Agent can set follow-up tasks for itself
|
||||||
|
|
||||||
|
- [ ] **In-chat command:** `/cronjob {prompt} {frequency}` when using messaging integrations
|
||||||
|
|
||||||
|
**Files to create:** `cron/scheduler.py`, `cron/jobs.py`, `tools/schedule_tool.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 15. Text-to-Speech (TTS) 🔊
|
||||||
|
|
||||||
|
**Problem:** Agent can only respond with text. Some users prefer audio responses (accessibility, hands-free use, podcasts).
|
||||||
|
|
||||||
|
**Ideas:**
|
||||||
|
- [ ] **TTS tool** - Generate audio files from text
|
||||||
|
```python
|
||||||
|
tts_generate(text="Here's your summary...", voice="nova", output="summary.mp3")
|
||||||
|
```
|
||||||
|
- Returns path to generated audio file
|
||||||
|
- For messaging integrations: can send as voice message
|
||||||
|
|
||||||
|
- [ ] **Provider options:**
|
||||||
|
- Edge TTS (free, good quality, many voices)
|
||||||
|
- OpenAI TTS (paid, excellent quality)
|
||||||
|
- ElevenLabs (paid, best quality, voice cloning)
|
||||||
|
- Local options (Coqui TTS, Bark)
|
||||||
|
|
||||||
|
- [ ] **Modes:**
|
||||||
|
- On-demand: User explicitly asks "read this to me"
|
||||||
|
- Auto-TTS: Configurable to always generate audio for responses
|
||||||
|
- Long-text handling: Summarize or chunk very long responses
|
||||||
|
|
||||||
|
- [ ] **Integration with messaging:**
|
||||||
|
- When enabled, can send voice notes instead of/alongside text
|
||||||
|
- User preference per channel
|
||||||
|
|
||||||
|
**Files to create:** `tools/tts_tool.py`, config in `cli-config.yaml`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 16. Speech-to-Text / Audio Transcription 🎤
|
||||||
|
|
||||||
|
**Problem:** Users may want to send voice memos instead of typing. Agent is blind to audio content.
|
||||||
|
|
||||||
|
**Ideas:**
|
||||||
|
- [ ] **Voice memo transcription** - For messaging integrations
|
||||||
|
- User sends voice message → transcribe → process as text
|
||||||
|
- Seamless: user speaks, agent responds
|
||||||
|
|
||||||
|
- [ ] **Audio/video file transcription** - Existing idea, expanded:
|
||||||
|
- Transcribe local audio files (mp3, wav, m4a)
|
||||||
|
- Transcribe YouTube videos (download audio → transcribe)
|
||||||
|
- Extract key moments with timestamps
|
||||||
|
|
||||||
|
- [ ] **Provider options:**
|
||||||
|
- OpenAI Whisper API (good quality, cheap)
|
||||||
|
- Deepgram (fast, good for real-time)
|
||||||
|
- Local Whisper (free, runs on GPU)
|
||||||
|
- Groq Whisper (fast, free tier available)
|
||||||
|
|
||||||
|
- [ ] **Tool interface:**
|
||||||
|
```python
|
||||||
|
transcribe(source="audio.mp3") # Local file
|
||||||
|
transcribe(source="https://youtube.com/...") # YouTube
|
||||||
|
transcribe(source="voice_message", data=bytes) # Voice memo
|
||||||
|
```
|
||||||
|
|
||||||
|
**Files to create:** `tools/transcribe_tool.py`, integrate with messaging monitors
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Priority Order (Suggested)
|
## Priority Order (Suggested)
|
||||||
|
|
||||||
1. **Memory & Context Management** - Biggest impact on complex tasks
|
1. **🎯 Subagent Architecture** - Critical for context management, enables everything else
|
||||||
2. **Self-Reflection** - Improves reliability and reduces wasted tool calls
|
2. **Memory & Context Management** - Complements subagents for remaining context
|
||||||
3. **Project-Local Context** - Practical win, keeps useful info across sessions
|
3. **Self-Reflection** - Improves reliability and reduces wasted tool calls
|
||||||
4. **Tool Composition** - Quality of life, builds on other improvements
|
4. **Project-Local Context** - Practical win, keeps useful info across sessions
|
||||||
5. **Dynamic Skills** - Force multiplier for repeated tasks
|
5. **Messaging Integrations** - Unlocks mobile access, new interaction patterns
|
||||||
|
6. **Scheduled Tasks / Cron Jobs** - Enables automation, reminders, monitoring
|
||||||
|
7. **Tool Composition** - Quality of life, builds on other improvements
|
||||||
|
8. **Dynamic Skills** - Force multiplier for repeated tasks
|
||||||
|
9. **Interactive Clarifying Questions** - Better UX for ambiguous tasks
|
||||||
|
10. **TTS / Audio Transcription** - Accessibility, hands-free use
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@ -339,11 +609,13 @@ These items need to be addressed ASAP:
|
||||||
The following were removed because they're architecturally impossible:
|
The following were removed because they're architecturally impossible:
|
||||||
|
|
||||||
- ~~Proactive suggestions / Prefetching~~ - Agent only runs on user request, can't interject
|
- ~~Proactive suggestions / Prefetching~~ - Agent only runs on user request, can't interject
|
||||||
- ~~Session save/restore across conversations~~ - Agent doesn't control session persistence
|
|
||||||
- ~~User preference learning across sessions~~ - Same issue
|
|
||||||
- ~~Clipboard integration~~ - No access to user's local system clipboard
|
- ~~Clipboard integration~~ - No access to user's local system clipboard
|
||||||
- ~~Voice/TTS playback~~ - Can generate audio but can't play it to user
|
|
||||||
- ~~Set reminders~~ - No persistent background execution
|
The following **moved to active TODO** (now possible with new architecture):
|
||||||
|
|
||||||
|
- ~~Session save/restore~~ → See **Messaging Integrations** (session persistence)
|
||||||
|
- ~~Voice/TTS playback~~ → See **TTS** (can generate audio files, send via messaging)
|
||||||
|
- ~~Set reminders~~ → See **Scheduled Tasks / Cron Jobs**
|
||||||
|
|
||||||
The following were removed because they're **already possible**:
|
The following were removed because they're **already possible**:
|
||||||
|
|
||||||
|
|
@ -357,4 +629,75 @@ The following were removed because they're **already possible**:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🧪 Brainstorm Ideas (Not Yet Fleshed Out)
|
||||||
|
|
||||||
|
*These are early-stage ideas that need more thinking before implementation. Captured here so they don't get lost.*
|
||||||
|
|
||||||
|
### Remote/Distributed Execution 🌐
|
||||||
|
|
||||||
|
**Concept:** Run agent on a powerful remote server while interacting from a thin client.
|
||||||
|
|
||||||
|
**Why interesting:**
|
||||||
|
- Run on beefy GPU server for local LLM inference
|
||||||
|
- Agent has access to remote machine's resources (files, tools, internet)
|
||||||
|
- User interacts via lightweight client (phone, low-power laptop)
|
||||||
|
|
||||||
|
**Open questions:**
|
||||||
|
- How does this differ from just SSH + running cli.py on remote?
|
||||||
|
- Would need secure communication channel (WebSocket? gRPC?)
|
||||||
|
- How to handle tool outputs that reference remote paths?
|
||||||
|
- Credential management for remote execution
|
||||||
|
- Latency considerations for interactive use
|
||||||
|
|
||||||
|
**Possible architecture:**
|
||||||
|
```
|
||||||
|
┌─────────────┐ ┌─────────────────────────┐
|
||||||
|
│ Thin Client │ ◄─────► │ Remote Hermes Server │
|
||||||
|
│ (phone/web) │ WS/API │ - Full agent + tools │
|
||||||
|
└─────────────┘ │ - GPU for local LLM │
|
||||||
|
│ - Access to server files│
|
||||||
|
└─────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
**Related to:** Messaging integrations (could be the "server" that monitors receive from)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Multi-Agent Parallel Execution 🤖🤖
|
||||||
|
|
||||||
|
**Concept:** Extension of Subagent Architecture (Section 1) - run multiple subagents in parallel.
|
||||||
|
|
||||||
|
**Why interesting:**
|
||||||
|
- Independent subtasks don't need to wait for each other
|
||||||
|
- "Research X while setting up Y" - both run simultaneously
|
||||||
|
- Faster completion for complex multi-part tasks
|
||||||
|
|
||||||
|
**Open questions:**
|
||||||
|
- How to detect which tasks are truly independent?
|
||||||
|
- Resource management (API rate limits, concurrent connections)
|
||||||
|
- How to merge results when parallel tasks have conflicts?
|
||||||
|
- Cost implications of multiple parallel LLM calls
|
||||||
|
|
||||||
|
*Note: Basic subagent delegation (Section 1) should be implemented first, parallel execution is an optimization on top.*
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Plugin/Extension System 🔌
|
||||||
|
|
||||||
|
**Concept:** Allow users to add custom tools/skills without modifying core code.
|
||||||
|
|
||||||
|
**Why interesting:**
|
||||||
|
- Community contributions
|
||||||
|
- Organization-specific tools
|
||||||
|
- Clean separation of core vs. extensions
|
||||||
|
|
||||||
|
**Open questions:**
|
||||||
|
- Security implications of loading arbitrary code
|
||||||
|
- Versioning and compatibility
|
||||||
|
- Discovery and installation UX
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
*Last updated: $(date +%Y-%m-%d)* 🤖
|
*Last updated: $(date +%Y-%m-%d)* 🤖
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue