Enhance session logging and interactive sudo support

- Implemented automatic session logging, saving conversation trajectories to the `logs/` directory in JSON format, with each session having a unique identifier.
- Updated the CLI to display the session ID in the welcome banner for easy reference.
- Introduced an interactive sudo password prompt in CLI mode, allowing users to enter their password with a 45-second timeout, enhancing user experience during command execution.
- Documented session logging and interactive sudo features in `README.md`, `cli.md`, and `cli-config.yaml.example` for better user guidance.
teknium1 2026-02-01 15:36:26 -08:00
parent 971ed2bbdf
commit bbeed5b5d1
8 changed files with 503 additions and 30 deletions


@ -1,6 +1,76 @@
# Hermes Agent Environment Configuration
# Copy this file to .env and fill in your API keys
# =============================================================================
# LLM PROVIDER (OpenRouter)
# =============================================================================
# OpenRouter provides access to many models through one API
# All LLM calls go through OpenRouter - no direct provider keys needed
# Get your key at: https://openrouter.ai/keys
OPENROUTER_API_KEY=
# Default model to use (OpenRouter format: provider/model)
# Examples: anthropic/claude-sonnet-4, openai/gpt-4o, google/gemini-2.0-flash, zhipuai/glm-4-plus
LLM_MODEL=anthropic/claude-sonnet-4
# =============================================================================
# TOOL API KEYS
# =============================================================================
# Firecrawl API Key - Web search, extract, and crawl
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=
# Nous Research API Key - Vision analysis and multi-model reasoning
# Get at: https://inference-api.nousresearch.com/
NOUS_API_KEY=
# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=
# =============================================================================
# TERMINAL TOOL CONFIGURATION (mini-swe-agent backend)
# =============================================================================
# Backend type: "local", "singularity", "docker", "modal", or "ssh"
# - local: Runs directly on your machine (fastest, no isolation)
# - ssh: Runs on remote server via SSH (great for sandboxing - agent can't touch its own code)
# - singularity: Runs in Apptainer/Singularity containers (HPC clusters, no root needed)
# - docker: Runs in Docker containers (isolated, requires Docker + docker group)
# - modal: Runs in Modal cloud sandboxes (scalable, requires Modal account)
TERMINAL_ENV=local
# Container images (for singularity/docker/modal backends)
TERMINAL_DOCKER_IMAGE=python:3.11
TERMINAL_SINGULARITY_IMAGE=docker://python:3.11
TERMINAL_MODAL_IMAGE=python:3.11
# Working directory inside the container
TERMINAL_CWD=/tmp
# Default command timeout in seconds
TERMINAL_TIMEOUT=60
# Cleanup inactive environments after this many seconds
TERMINAL_LIFETIME_SECONDS=300
# =============================================================================
# SSH REMOTE EXECUTION (for TERMINAL_ENV=ssh)
# =============================================================================
# Run terminal commands on a remote server via SSH.
# Agent code stays on your machine, commands execute remotely.
#
# SECURITY BENEFITS:
# - Agent cannot read your .env file (API keys protected)
# - Agent cannot modify its own code
# - Remote server acts as isolated sandbox
# - Can safely configure passwordless sudo on remote
#
# TERMINAL_SSH_HOST=192.168.1.100
# TERMINAL_SSH_USER=agent
# TERMINAL_SSH_PORT=22
# TERMINAL_SSH_KEY=~/.ssh/id_rsa
# =============================================================================
# SUDO SUPPORT (works with ALL terminal backends)
# =============================================================================
@ -13,6 +83,74 @@
# - For SSH backend: Configure passwordless sudo on the remote server
# - For containers: Run as root inside the container (no sudo needed)
# - For local: Configure /etc/sudoers for specific commands
# - For CLI: Leave unset - you'll be prompted interactively with 45s timeout
#
# SUDO_PASSWORD=your_password_here
# =============================================================================
# MODAL CLOUD BACKEND (Optional - for TERMINAL_ENV=modal)
# =============================================================================
# Modal uses CLI authentication, not environment variables.
# Run: pip install modal && modal setup
# This will authenticate via browser and store credentials locally.
# No API key needed in .env - Modal handles auth automatically.
# =============================================================================
# BROWSER TOOL CONFIGURATION (agent-browser + Browserbase)
# =============================================================================
# Browser automation requires Browserbase cloud service for remote browser execution.
# This allows the agent to navigate websites, fill forms, and extract information.
#
# STEALTH MODES:
# - Basic Stealth: ALWAYS active (random fingerprints, auto CAPTCHA solving)
# - Advanced Stealth: Requires BROWSERBASE_ADVANCED_STEALTH=true (Scale Plan only)
# Browserbase API Key - Cloud browser execution
# Get at: https://browserbase.com/
BROWSERBASE_API_KEY=
# Browserbase Project ID - From your Browserbase dashboard
BROWSERBASE_PROJECT_ID=
# Enable residential proxies for better CAPTCHA solving (default: true)
# Routes traffic through residential IPs, significantly improves success rate
BROWSERBASE_PROXIES=true
# Enable advanced stealth mode (default: false, requires Scale Plan)
# Uses custom Chromium build to avoid bot detection altogether
BROWSERBASE_ADVANCED_STEALTH=false
# Browser session timeout in seconds (default: 300)
# Sessions are cleaned up after this duration of inactivity
BROWSER_SESSION_TIMEOUT=300
# Browser inactivity timeout - auto-cleanup inactive sessions (default: 120 = 2 min)
# Browser sessions are automatically closed after this period of no activity
BROWSER_INACTIVITY_TIMEOUT=120
# =============================================================================
# SESSION LOGGING
# =============================================================================
# Session trajectories are automatically saved to logs/ directory
# Format: logs/session_YYYYMMDD_HHMMSS_UUID.json
# Contains full conversation history in trajectory format for debugging/replay
# =============================================================================
# LEGACY/OPTIONAL API KEYS
# =============================================================================
# Morph API Key - For legacy Hecate terminal backend (terminal-hecate tool)
# Get at: https://morph.so/
MORPH_API_KEY=
# Hecate VM Settings (only if using terminal-hecate tool)
HECATE_VM_LIFETIME_SECONDS=300
HECATE_DEFAULT_SNAPSHOT_ID=snapshot_p5294qxt
# =============================================================================
# DEBUG OPTIONS
# =============================================================================
WEB_TOOLS_DEBUG=false
VISION_TOOLS_DEBUG=false
MOA_TOOLS_DEBUG=false
IMAGE_TOOLS_DEBUG=false


@ -257,6 +257,39 @@ Skills can include:
- `templates/` - Output formats, config files, boilerplate code
- `scripts/` - Executable helpers (Python, shell scripts)
## Session Logging
Every conversation is automatically logged to `logs/` for debugging and inspection:
```
logs/
├── session_20260201_143052_a1b2c3.json
├── session_20260201_150217_d4e5f6.json
└── ...
```
**Log Format:**
```json
{
"session_id": "20260201_143052_a1b2c3",
"model": "anthropic/claude-sonnet-4",
"session_start": "2026-02-01T14:30:52.123456",
"last_updated": "2026-02-01T14:35:12.789012",
"message_count": 8,
"conversations": [
{"from": "system", "value": "..."},
{"from": "human", "value": "..."},
{"from": "gpt", "value": "..."},
{"from": "tool", "value": "..."}
]
}
```
- **Automatic**: Logs are created and updated automatically after each conversation turn
- **Session ID in Banner**: The CLI displays the session ID in the welcome banner
- **Trajectory Format**: Uses the same format as batch processing for consistency
- **Git Ignored**: `logs/` is in `.gitignore` so logs aren't committed
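As a rough illustration of inspecting one of these logs, the sketch below loads a session file and counts messages per role. It assumes only the fields shown in the log format above; `summarize_session_log` is a hypothetical helper name, not part of the codebase.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict


def summarize_session_log(path: Path) -> Dict[str, Any]:
    """Load one session log and count messages per role (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        entry = json.load(f)

    # Each conversation item is assumed to look like {"from": ..., "value": ...}
    roles = Counter(msg["from"] for msg in entry.get("conversations", []))
    return {
        "session_id": entry.get("session_id"),
        "model": entry.get("model"),
        "message_count": entry.get("message_count"),
        "roles": dict(roles),
    }
```

Calling `summarize_session_log(Path("logs/session_20260201_143052_a1b2c3.json"))` would return the session metadata plus a per-role tally, which can be a quick sanity check before opening the full file.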
## Interactive CLI
The CLI provides a rich interactive experience for working with the agent.
@ -538,6 +571,7 @@ All environment variables can be configured in the `.env` file (copy from `.env.
- `TERMINAL_CWD`: Working directory inside containers (default: `/tmp`)
- `TERMINAL_SCRATCH_DIR`: Custom scratch directory for sandbox storage (optional, auto-detects `/scratch`)
- `SUDO_PASSWORD`: Enable sudo commands by piping password via `sudo -S` (works with all backends)
- If unset in CLI mode, you'll be prompted interactively when sudo is needed (45s timeout)
**SSH Backend Configuration (for remote execution):**
- `TERMINAL_SSH_HOST`: Remote server hostname or IP

72
TODO.md

@ -23,32 +23,62 @@ These items need to be addressed ASAP:
- [x] **Optional sudo support via `SUDO_PASSWORD` env var:**
- Shared `_transform_sudo_command()` helper used by all environments
- If set, auto-transforms `sudo cmd` → pipes password via `sudo -S`
- Documented in `.env.example` with security warnings - Documented in `.env.example`, `cli-config.yaml`, and README
- Works for chained commands: `cmd1 && sudo cmd2`
- [ ] **Optional future enhancements:** - [x] **Interactive sudo prompt in CLI mode:**
- Interactive password prompt in CLI mode only - When sudo detected and no password configured, prompts user
- Document passwordless sudo setup in /etc/sudoers for power users - 45-second timeout (auto-skips if no input)
- Hidden password input via `getpass` (password not visible)
- Password cached for session (don't ask repeatedly)
- Spinner pauses during prompt for clean UX
- Uses `HERMES_INTERACTIVE` env var to detect CLI mode
### 2. Fix `browser_get_images` Tool 🖼️ ### 2. Fix `browser_get_images` Tool 🖼️ ✅ VERIFIED WORKING
- [ ] **Problem:** `browser_get_images` tool is broken/not working correctly - [x] **Tested:** Tool works correctly on multiple sites
- [ ] **Debug:** Investigate what's failing - selector issues? async timing? - [x] **Results:** Successfully extracts image URLs, alt text, dimensions
- [ ] **Fix:** Ensure it properly extracts image URLs and alt text from pages - [x] **Note:** Some sites (Pixabay, etc.) have Cloudflare bot protection that blocks headless browsers - this is expected behavior, not a bug
### 3. Better Action Logging for Debugging 📝 ### 3. Better Action Logging for Debugging 📝 ✅ COMPLETE
- [ ] **Problem:** Need better logging of agent actions for debugging - [x] **Problem:** Need better logging of agent actions for debugging
- [ ] **Implementation:** - [x] **Implementation:**
- Log all tool calls with inputs/outputs - Save full session trajectories to `logs/` directory as JSON
- Timestamps for each action - Each session gets a unique file: `session_YYYYMMDD_HHMMSS_UUID.json`
- Structured log format (JSON?) for easy parsing - Logs all messages, tool calls with inputs/outputs, timestamps
- Log levels (DEBUG, INFO, ERROR) - Structured JSON format for easy parsing and replay
- Option to write to file vs stdout - Automatic on CLI runs (configurable)
### 4. Stream Thinking Summaries in Real-Time 💭 ### 4. Stream Thinking Summaries in Real-Time 💭 ⏸️ DEFERRED
- [ ] **Problem:** Thinking/reasoning summaries not shown while streaming
- [ ] **Implementation:** - [ ] **Complexity:** This is a significant refactor - leaving for later
- Use streaming API to show thinking summaries as they're generated
- Display intermediate reasoning before final response **OpenRouter Streaming Info:**
- Let user see the agent "thinking" in real-time - Uses `stream=True` with OpenAI SDK
- Reasoning comes in `choices[].delta.reasoning_details` chunks
- Types: `reasoning.summary`, `reasoning.text`, `reasoning.encrypted`
- Tool call arguments stream as partial JSON (need accumulation)
- Items paradigm: same ID emitted multiple times with updated content
**Key Challenges:**
- Tool call JSON accumulation (partial `{"query": "wea` → `{"query": "weather"}`)
- Multiple concurrent outputs (thinking + tool calls + text simultaneously)
- State management for partial responses
- Error handling if connection drops mid-stream
- Deciding when tool calls are "complete" enough to execute
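The JSON accumulation challenge can be sketched without any SDK: buffer argument fragments per tool-call index and treat the call as complete only once the buffer parses as a JSON object. This is a minimal sketch under that assumption; `ToolCallAccumulator` and its methods are hypothetical names, and a real implementation would also need to track tool names, call IDs, and the end-of-stream signal.

```python
import json
from typing import Dict, Optional


class ToolCallAccumulator:
    """Buffers streamed argument fragments per tool-call index (hypothetical)."""

    def __init__(self) -> None:
        self._buffers: Dict[int, str] = {}

    def add_fragment(self, index: int, fragment: str) -> None:
        # Append the newest partial-JSON chunk to this call's buffer
        self._buffers[index] = self._buffers.get(index, "") + fragment

    def try_parse(self, index: int) -> Optional[dict]:
        """Return the parsed arguments, or None if the buffer is not yet a JSON object."""
        try:
            parsed = json.loads(self._buffers.get(index, ""))
        except json.JSONDecodeError:
            return None  # Still incomplete (or empty)
        # Only a full object counts; a bare number or string could be a prefix
        return parsed if isinstance(parsed, dict) else None


acc = ToolCallAccumulator()
acc.add_fragment(0, '{"query": "wea')
print(acc.try_parse(0))   # None: buffer is still partial JSON
acc.add_fragment(0, 'ther"}')
print(acc.try_parse(0))   # {'query': 'weather'}
```

Parsing on every fragment is simple but wasteful for long arguments; checking only when the stream reports the call as finished may be the better trade-off.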
**UX Questions to Resolve:**
- Show raw thinking text or summarized?
- Live expanding text vs. spinner replacement?
- Markdown rendering while streaming?
- How to handle thinking + tool call display simultaneously?
**Implementation Options:**
- New `run_conversation_streaming()` method (keep non-streaming as fallback)
- Wrapper that handles streaming internally
- Big refactor of existing `run_conversation()`
**References:**
- https://openrouter.ai/docs/api/reference/streaming
- https://openrouter.ai/docs/guides/best-practices/reasoning-tokens#streaming-response
---


@ -89,6 +89,13 @@ terminal:
#
# SECURITY WARNING: Password stored in plaintext!
#
# INTERACTIVE PROMPT: If no sudo_password is set and the CLI is running,
# you'll be prompted to enter your password when sudo is needed:
# - 45-second timeout (auto-skips if no input)
# - Press Enter to skip (command fails gracefully)
# - Password is hidden while typing
# - Password is cached for the session
#
# ALTERNATIVES:
# - SSH backend: Configure passwordless sudo on the remote server
# - Containers: Run as root inside the container (no sudo needed)
@ -205,6 +212,21 @@ toolsets:
# toolsets:
# - safe
# =============================================================================
# Session Logging
# =============================================================================
# Session trajectories are automatically saved to logs/ directory.
# Each session creates: logs/session_YYYYMMDD_HHMMSS_UUID.json
#
# The session ID is displayed in the welcome banner for easy reference.
# Logs contain full conversation history in trajectory format:
# - System prompt, user messages, assistant responses
# - Tool calls with inputs/outputs
# - Timestamps for debugging
#
# No configuration needed - logging is always enabled.
# To disable, you would need to modify the source code.
# =============================================================================
# Display
# =============================================================================

20
cli.py
View file

@ -16,6 +16,7 @@ import os
import sys
import json
import atexit
import uuid
from pathlib import Path
from datetime import datetime
from typing import List, Dict, Any, Optional
@ -255,7 +256,7 @@ def _get_available_skills() -> Dict[str, List[str]]:
return skills_by_category
def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dict] = None, enabled_toolsets: List[str] = None): def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dict] = None, enabled_toolsets: List[str] = None, session_id: str = None):
""" """
Build and print a Claude Code-style welcome banner with caduceus on left and info on right.
@ -265,6 +266,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
cwd: Current working directory
tools: List of tool definitions
enabled_toolsets: List of enabled toolset names
session_id: Unique session identifier for logging
""" """
tools = tools or []
enabled_toolsets = enabled_toolsets or []
@ -284,6 +286,10 @@ def build_welcome_banner(console: Console, model: str, cwd: str, tools: List[dic
left_lines.append(f"[#FFBF00]{model_short}[/] [dim #B8860B]·[/] [dim #B8860B]Nous Research[/]")
left_lines.append(f"[dim #B8860B]{cwd}[/]")
# Add session ID if provided
if session_id:
left_lines.append(f"[dim #8B8682]Session: {session_id}[/]")
left_content = "\n".join(left_lines)
# Build right content: tools list grouped by toolset
@ -487,6 +493,12 @@ class HermesCLI:
self.conversation_history: List[Dict[str, Any]] = []
self.session_start = datetime.now()
# Generate session ID with timestamp for display and logging
# Format: YYYYMMDD_HHMMSS_shortUUID (e.g., 20260201_143052_a1b2c3)
timestamp_str = self.session_start.strftime("%Y%m%d_%H%M%S")
short_uuid = uuid.uuid4().hex[:6]
self.session_id = f"{timestamp_str}_{short_uuid}"
# Setup prompt_toolkit session with history
self._setup_prompt_session()
@ -528,6 +540,7 @@ class HermesCLI:
verbose_logging=self.verbose,
quiet_mode=True, # Suppress verbose output for clean CLI
ephemeral_system_prompt=self.system_prompt if self.system_prompt else None,
session_id=self.session_id, # Pass CLI's session ID to agent
)
return True
except Exception as e:
@ -555,6 +568,7 @@ class HermesCLI:
cwd=cwd,
tools=tools,
enabled_toolsets=self.enabled_toolsets,
session_id=self.session_id,
)
self.console.print()
@ -1064,6 +1078,10 @@ def main(
python cli.py -q "What is Python?" # Single query mode
python cli.py --list-tools # List tools and exit
"""
# Signal to terminal_tool that we're in interactive mode
# This enables interactive sudo password prompts with timeout
os.environ["HERMES_INTERACTIVE"] = "1"
# Handle query shorthand
query = query or q


@ -117,6 +117,29 @@ terminal:
modal_image: "python:3.11"
```
### Sudo Support
The CLI supports interactive sudo prompts:
```
┌──────────────────────────────────────────────────────────┐
│ 🔐 SUDO PASSWORD REQUIRED │
├──────────────────────────────────────────────────────────┤
│ Enter password below (input is hidden), or: │
│ • Press Enter to skip (command fails gracefully) │
│ • Wait 45s to auto-skip │
└──────────────────────────────────────────────────────────┘
Password (hidden):
```
**Options:**
- **Interactive**: Leave `sudo_password` unset - you'll be prompted when needed
- **Configured**: Set `sudo_password` in `cli-config.yaml` to auto-fill
- **Environment**: Set `SUDO_PASSWORD` in `.env` for all runs
Password is cached for the session once entered.
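The prompt is triggered by the sudo check in `_transform_sudo_command()`, which uses the word-boundary regex `r'\bsudo\b'`. The sketch below shows what that pattern does and does not match; `mentions_sudo` is a hypothetical helper name used only for illustration.

```python
import re

# Same pattern as the check in _transform_sudo_command()
SUDO_PATTERN = re.compile(r"\bsudo\b")


def mentions_sudo(command: str) -> bool:
    """Return True if 'sudo' appears as a standalone word (hypothetical helper)."""
    return SUDO_PATTERN.search(command) is not None


print(mentions_sudo("sudo apt install curl"))  # True
print(mentions_sudo("cmd1 && sudo cmd2"))      # True: chained commands match
print(mentions_sudo("cat /etc/sudoers"))       # False: no boundary after 'sudo'
print(mentions_sudo("echo 'sudo'"))            # True: quoted text also matches
```

The last case suggests the check can prompt for a command that merely mentions the word `sudo`, so an occasional unnecessary prompt is possible; pressing Enter skips it.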
### Toolsets
Control which tools are available:
@ -202,6 +225,30 @@ This allows you to have different terminal configs for CLI vs batch processing.
- **History**: Command history is saved to `~/.hermes_history`
- **Conversations**: Use `/save` to export conversations
- **Reset**: Use `/clear` for full reset, `/reset` to just clear history
- **Session Logs**: Every session automatically logs to `logs/session_{session_id}.json`
### Session Logging
Sessions are automatically logged to the `logs/` directory:
```
logs/
├── session_20260201_143052_a1b2c3.json
├── session_20260201_150217_d4e5f6.json
└── ...
```
The session ID is displayed in the welcome banner and follows the format: `YYYYMMDD_HHMMSS_UUID`.
Log files contain:
- Full conversation history in trajectory format
- Timestamps for session start and last update
- Model and message count metadata
This is useful for:
- Debugging agent behavior
- Replaying conversations
- Training data inspection
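Because the session ID starts with a zero-padded timestamp, log files sort chronologically by name. A minimal sketch of using that property, assuming the `YYYYMMDD_HHMMSS_UUID` format described above (`parse_session_id` and `latest_session_log` are hypothetical helpers, not part of the CLI):

```python
from datetime import datetime
from pathlib import Path
from typing import Optional, Tuple


def parse_session_id(session_id: str) -> Tuple[datetime, str]:
    """Split 'YYYYMMDD_HHMMSS_UUID' into a start time and the short UUID."""
    date_part, time_part, short_uuid = session_id.split("_")
    started = datetime.strptime(f"{date_part}_{time_part}", "%Y%m%d_%H%M%S")
    return started, short_uuid


def latest_session_log(logs_dir: Path) -> Optional[Path]:
    """Return the newest session log by filename, or None if there are none."""
    # Lexicographic order equals chronological order for this naming scheme
    logs = sorted(logs_dir.glob("session_*.json"))
    return logs[-1] if logs else None


started, short_uuid = parse_session_id("20260201_143052_a1b2c3")
print(started.isoformat(), short_uuid)  # 2026-02-01T14:30:52 a1b2c3
```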
## Quiet Mode


@ -27,6 +27,7 @@ import random
import sys
import time
import threading
import uuid
from typing import List, Dict, Any, Optional
from openai import OpenAI
import fire
@ -117,6 +118,11 @@ class KawaiiSpinner:
def _animate(self):
"""Animation loop that runs in background thread."""
while self.running:
# Check for pause signal (e.g., during sudo password prompt)
if os.getenv("HERMES_SPINNER_PAUSE"):
time.sleep(0.1)
continue
frame = self.spinner_frames[self.frame_idx % len(self.spinner_frames)]
elapsed = time.time() - self.start_time
@ -189,6 +195,7 @@ class AIAgent:
providers_ignored: List[str] = None,
providers_order: List[str] = None,
provider_sort: str = None,
session_id: str = None,
):
"""
Initialize the AI Agent.
@ -211,6 +218,7 @@ class AIAgent:
providers_ignored (List[str]): OpenRouter providers to ignore (optional)
providers_order (List[str]): OpenRouter providers to try in order (optional)
provider_sort (str): Sort providers by price/throughput/latency (optional)
session_id (str): Pre-generated session ID for logging (optional, auto-generated if not provided)
""" """
self.model = model
self.max_iterations = max_iterations
@ -337,6 +345,25 @@ class AIAgent:
if self.ephemeral_system_prompt and not self.quiet_mode:
prompt_preview = self.ephemeral_system_prompt[:60] + "..." if len(self.ephemeral_system_prompt) > 60 else self.ephemeral_system_prompt
print(f"🔒 Ephemeral system prompt: '{prompt_preview}' (not saved to trajectories)")
# Session logging setup - auto-save conversation trajectories for debugging
self.session_start = datetime.now()
if session_id:
# Use provided session ID (e.g., from CLI)
self.session_id = session_id
else:
# Generate a new session ID
timestamp_str = self.session_start.strftime("%Y%m%d_%H%M%S")
short_uuid = uuid.uuid4().hex[:6]
self.session_id = f"{timestamp_str}_{short_uuid}"
# Setup logs directory
self.logs_dir = Path(__file__).parent / "logs"
self.logs_dir.mkdir(exist_ok=True)
self.session_log_file = self.logs_dir / f"session_{self.session_id}.json"
# Track conversation messages for session logging
self._session_messages: List[Dict[str, Any]] = []
# Pools of kawaii faces for random selection
KAWAII_SEARCH = [
@ -755,6 +782,44 @@ class AIAgent:
except Exception as e:
print(f"⚠️ Failed to save trajectory: {e}")
    def _save_session_log(self, messages: List[Dict[str, Any]] = None):
        """
        Save the current session trajectory to the logs directory.

        Automatically called after each conversation turn to maintain
        a complete log of the session for debugging and inspection.

        Args:
            messages: Message history to save (uses self._session_messages if not provided)
        """
        messages = messages or self._session_messages
        if not messages:
            return

        try:
            # Convert to trajectory format (reuse existing method)
            # Use empty string as user_query since it's embedded in messages
            trajectory = self._convert_to_trajectory_format(messages, "", True)

            # Build the session log entry
            entry = {
                "session_id": self.session_id,
                "model": self.model,
                "session_start": self.session_start.isoformat(),
                "last_updated": datetime.now().isoformat(),
                "message_count": len(messages),
                "conversations": trajectory,
            }

            # Write to session log file (overwrite with latest state)
            with open(self.session_log_file, "w", encoding="utf-8") as f:
                json.dump(entry, f, indent=2, ensure_ascii=False)
        except Exception as e:
            # Silent fail - don't interrupt the user experience for logging issues
            if self.verbose_logging:
                logging.warning(f"Failed to save session log: {e}")

def run_conversation(
self,
user_message: str,
@ -1404,6 +1469,10 @@ class AIAgent:
if self.verbose_logging:
logging.warning(f"Failed to cleanup browser for task {effective_task_id}: {e}")
# Update session messages and save session log
self._session_messages = messages
self._save_session_log(messages)
return {
"final_response": final_response,
"messages": messages,


@ -204,6 +204,106 @@ def _check_disk_usage_warning():
return False
# Session-cached sudo password (persists until CLI exits)
_cached_sudo_password: str = ""
def _prompt_for_sudo_password(timeout_seconds: int = 45) -> str:
    """
    Prompt user for sudo password with timeout.

    Returns the password if entered, or empty string if:
    - User presses Enter without input (skip)
    - Timeout expires (45s default)
    - Any error occurs

    Only works in interactive mode (HERMES_INTERACTIVE=1).
    Uses getpass for hidden input with threading for timeout support.
    """
    import getpass
    import sys
    import time as time_module

    # ANSI escape codes for terminal control
    CLEAR_LINE = "\033[2K"  # Clear entire line
    CURSOR_START = "\r"  # Move cursor to start of line

    # Result container for thread
    result = {"password": None, "done": False}

    def get_password_thread():
        """Thread function to get password with getpass (hidden input)."""
        try:
            result["password"] = getpass.getpass(" Password (hidden): ")
        except (EOFError, KeyboardInterrupt):
            result["password"] = ""
        except Exception:
            result["password"] = ""
        finally:
            result["done"] = True

    try:
        # Pause the spinner animation while prompting for password
        os.environ["HERMES_SPINNER_PAUSE"] = "1"
        time_module.sleep(0.2)  # Give spinner time to pause

        # Clear any spinner/animation on current line
        sys.stdout.write(CURSOR_START + CLEAR_LINE)
        sys.stdout.flush()

        # Print a clear visual break with empty lines for separation
        print("\n")  # Extra spacing
        print("┌" + "─" * 58 + "┐")
        print("│ 🔐 SUDO PASSWORD REQUIRED" + " " * 30 + "│")
        print("├" + "─" * 58 + "┤")
        print("│ Enter password below (input is hidden), or: │")
        print("│ • Press Enter to skip (command fails gracefully) │")
        print(f"│ • Wait {timeout_seconds}s to auto-skip" + " " * 27 + "│")
        print("└" + "─" * 58 + "┘")
        print()
        sys.stdout.flush()

        # Start password input in a thread so we can timeout
        password_thread = threading.Thread(target=get_password_thread, daemon=True)
        password_thread.start()

        # Wait for either completion or timeout
        password_thread.join(timeout=timeout_seconds)

        if result["done"]:
            # Got input (or user pressed Enter/Ctrl+C)
            password = result["password"] or ""
            if password:
                print(" ✓ Password received (cached for this session)")
            else:
                print(" ⏭ Skipped - continuing without sudo")
            print()
            sys.stdout.flush()
            return password
        else:
            # Timeout - thread is still waiting for input
            print("\n ⏱ Timeout - continuing without sudo")
            print(" (Press Enter to dismiss the password prompt)")
            print()
            sys.stdout.flush()
            return ""
    except (EOFError, KeyboardInterrupt):
        print()
        print(" ⏭ Cancelled - continuing without sudo")
        print()
        sys.stdout.flush()
        return ""
    except Exception as e:
        print(f"\n [sudo prompt error: {e}] - continuing without sudo\n")
        sys.stdout.flush()
        return ""
    finally:
        # Always resume the spinner when done
        if "HERMES_SPINNER_PAUSE" in os.environ:
            del os.environ["HERMES_SPINNER_PAUSE"]

def _transform_sudo_command(command: str) -> str:
"""
Transform sudo commands to use -S flag if SUDO_PASSWORD is available.
@ -211,21 +311,36 @@ def _transform_sudo_command(command: str) -> str:
This is a shared helper used by all execution environments to provide
consistent sudo handling across local, SSH, and container environments.
If SUDO_PASSWORD is set, transforms: If SUDO_PASSWORD is set (via env, config, or interactive prompt):
'sudo apt install curl' -> password piped via sudo -S
If SUDO_PASSWORD is not set, command runs as-is (will fail gracefully If SUDO_PASSWORD is not set and in interactive mode (HERMES_INTERACTIVE=1):
with "sudo: a password is required" error due to stdin=DEVNULL). Prompts user for password with 45s timeout, caches for session.
If SUDO_PASSWORD is not set and NOT interactive:
Command runs as-is (fails gracefully with "sudo: a password is required").
""" """
sudo_password = os.getenv("SUDO_PASSWORD", "") global _cached_sudo_password
import re
# Check if command even contains sudo
if not re.search(r'\bsudo\b', command):
return command # No sudo in command, return as-is
# Try to get password from: env var -> session cache -> interactive prompt
sudo_password = os.getenv("SUDO_PASSWORD", "") or _cached_sudo_password
if not sudo_password:
# No password configured - check if we're in interactive mode
if os.getenv("HERMES_INTERACTIVE"):
# Prompt user for password
sudo_password = _prompt_for_sudo_password(timeout_seconds=45)
if sudo_password:
_cached_sudo_password = sudo_password # Cache for session
if not sudo_password:
return command # No password, let it fail gracefully
# Check if command contains sudo (simple detection)
# Handle: "sudo cmd", "sudo -flag cmd", "cmd && sudo cmd2", etc.
import re
def replace_sudo(match):
# Replace 'sudo' with password-piped version
# The -S flag makes sudo read password from stdin