---
name: browser-use
version: "1.0.0"
description: Use browser-use with a Chromium CDP endpoint to perform web tasks from Hermes.
triggers:
- "browser-use"
- "open website and extract"
- "automate browser task"
- "run browser task"
allowed-tools:
- terminal
- file
- memory
---

# Browser Use (Chromium)

This skill runs browser tasks via `browser-use` and connects to Chromium through the Chrome DevTools Protocol (CDP).

## Prerequisites

- The `hermes-agent` container is running
- The `chromium` service is running under Docker Compose
- `OPENAI_API_KEY` is present in the container environment (via the Compose `env_file` setting)
- If running outside the container, set `OPENAI_API_KEY` in your shell or `.env`

## Troubleshooting Environment Setup

If you get `{"success": false, "error": "OPENAI_API_KEY is not set"}`, check which variables the container actually sees:

```bash
docker compose exec -T hermes-agent python - <<'PY'
import os
# Report whether each variable is set without printing its value
print('OPENAI_API_KEY', '<set>' if os.getenv('OPENAI_API_KEY') else '<missing>')
print('OPENAI_BASE_URL', '<set>' if os.getenv('OPENAI_BASE_URL') else '<missing>')
PY
```

If `OPENAI_API_KEY` is missing, make sure the key is defined in one of the env files that Compose loads:
- `workspace/.env`
- `hermes_data/.env`

Then recreate the container:

```bash
docker compose up -d hermes-agent
```
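
To see which of the two env files defines the key, a small check like the one below may help. It is a minimal sketch, not part of the skill: the function name `defines_key` is hypothetical, the paths are assumed to be relative to the Compose project directory, and only plain `KEY=value` lines are recognized.

```python
from pathlib import Path

# Env files named in this document; assumed relative to the Compose project.
ENV_FILES = ("workspace/.env", "hermes_data/.env")


def defines_key(env_path, key="OPENAI_API_KEY"):
    """Return True if env_path has a `KEY=value` line with a non-empty value."""
    path = Path(env_path)
    if not path.is_file():
        return False
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        # Treat empty or quote-only values as "not set".
        if sep and name.strip() == key and value.strip().strip("'\""):
            return True
    return False


if __name__ == "__main__":
    for env_file in ENV_FILES:
        # Print only the status, never the key itself.
        print(env_file, "<set>" if defines_key(env_file) else "<missing>")
```

Like the container check above, it reports only whether the key is present, so the value never ends up in terminal output or logs.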

```bash
# Optional overrides when running outside Docker
export OPENAI_API_KEY="your-api-key-here"
export BROWSER_USE_CDP_URL="ws://chromium:3000/chromium?token=hermes-local"
```

**Common failure:** `{"success": false, "error": "OPENAI_API_KEY is not set"}`
- Cause: the key is absent from the container environment
- Fix: add the key to `workspace/.env` or `hermes_data/.env`, then run `docker compose up -d hermes-agent`

**Common failure:** 401 `key_model_access_denied`
- Cause: the API key cannot access the configured model (for example `gpt-4o-mini`)
- Fix: set `BROWSER_USE_MODEL` (or `OPENAI_MODEL`) to a model your provider key can use

**Common failure:** Connection refused to `chromium`
- Cause: the browser is not running or the CDP endpoint is wrong
- Fix: run `docker compose ps`, verify that the `chromium` service is up, and check that `BROWSER_USE_CDP_URL` points at it
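
The three failures above can be matched against the script's JSON output in code. The following is a rough sketch only: `diagnose` is a hypothetical helper, and it assumes that the 401 and connection errors also surface as text in the `error` field, which this document confirms only for the missing-key case.

```python
import json


def diagnose(raw_output):
    """Map the script's JSON output to one of the fixes listed above."""
    try:
        result = json.loads(raw_output)
    except json.JSONDecodeError:
        return "output is not valid JSON; inspect it manually"
    if not isinstance(result, dict):
        return "unexpected JSON shape; inspect it manually"
    if result.get("success"):
        return "ok"
    error = str(result.get("error", ""))
    lowered = error.lower()
    if "openai_api_key is not set" in lowered:
        return "add OPENAI_API_KEY to workspace/.env or hermes_data/.env, then recreate hermes-agent"
    if "key_model_access_denied" in lowered:
        return "set BROWSER_USE_MODEL (or OPENAI_MODEL) to a model the key can use"
    # Assumption: a refused CDP connection is reported with this wording.
    if "connection refused" in lowered:
        return "check that the chromium service is up and the CDP URL is right"
    return "unrecognized error: " + error
```

For example, `diagnose('{"success": false, "error": "OPENAI_API_KEY is not set"}')` returns the env-file hint.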

## Quick start

```bash
python-browser-use /root/.hermes/skills/autonomous-ai-agents/browser-use/scripts/run_browser_use.py \
  --task "Open example.com and return page title" \
  --max-steps 8
```

## How to use in Hermes

When the user asks for website automation, run:

```bash
python-browser-use /root/.hermes/skills/autonomous-ai-agents/browser-use/scripts/run_browser_use.py \
  --task "<user task in plain language>" \
  --max-steps 20
```

If the user gives a start URL, pass it with `--start-url`.
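
If the command is assembled programmatically, something like the sketch below could work. The helper names `build_command` and `run_task` are hypothetical; the flags and script path are the ones shown above. It assumes that `--start-url` takes the URL as its value and that the script prints only its JSON result to stdout.

```python
import json
import subprocess

SCRIPT = "/root/.hermes/skills/autonomous-ai-agents/browser-use/scripts/run_browser_use.py"


def build_command(task, max_steps=20, start_url=None, python="python-browser-use"):
    """Build the argv list; --start-url is added only when a URL is given."""
    cmd = [python, SCRIPT, "--task", task, "--max-steps", str(max_steps)]
    if start_url:
        cmd += ["--start-url", start_url]
    return cmd


def run_task(task, **kwargs):
    """Run the script and parse its output (assumes JSON only on stdout)."""
    proc = subprocess.run(
        build_command(task, **kwargs),
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)
```

Passing the task as a list element rather than through a shell string avoids quoting problems when the user's task text contains quotes or other shell metacharacters.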

## Notes

- Default CDP URL: `ws://chromium:3000/chromium?token=hermes-local`
- Override it by setting `BROWSER_USE_CDP_URL`
- Python runtime: set via `BROWSER_USE_PYTHON` (defaults to `python-browser-use`)
- The script outputs JSON for easy parsing
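
The defaults and overrides listed above could be resolved along these lines. This is an illustrative sketch, not the script's actual logic: `resolve_settings` is a hypothetical name, and both the precedence of `BROWSER_USE_MODEL` over `OPENAI_MODEL` and the treatment of empty values as unset are assumptions.

```python
import os

DEFAULT_CDP_URL = "ws://chromium:3000/chromium?token=hermes-local"
DEFAULT_PYTHON = "python-browser-use"


def resolve_settings(env=None):
    """Return effective settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "cdp_url": env.get("BROWSER_USE_CDP_URL") or DEFAULT_CDP_URL,
        "python": env.get("BROWSER_USE_PYTHON") or DEFAULT_PYTHON,
        # Assumption: BROWSER_USE_MODEL wins when both are set; no default
        # is given here because the document does not state one.
        "model": env.get("BROWSER_USE_MODEL") or env.get("OPENAI_MODEL"),
    }
```

Accepting a plain dictionary in place of `os.environ` keeps the function easy to exercise without touching the real environment.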