architecture skill created

Максим Туревич 2026-03-24 02:33:11 +03:00
parent 77cd8e04b0
commit 566dc54610
8 changed files with 574 additions and 2 deletions

5
.gitignore vendored

@ -3,9 +3,9 @@
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
Icon
# Thumbnails
._*
@ -52,3 +52,4 @@ $RECYCLE.BIN/
# Windows shortcuts
*.lnk
*.idea

50
SKILL.md Normal file

@ -0,0 +1,50 @@
---
name: browser-use
version: "1.0.0"
description: |
  Browser automation with Playwright and the browser_use library.
  Handles navigation, clicks, form filling, screenshots, and data extraction.
  Suitable for web application testing, scraping, and automating routine tasks.
triggers:
- "открой сайт"
- "нажми на кнопку"
- "заполни форму"
- "сделай скриншот"
- "спарси данные"
- "автоматизируй браузер"
- "browser use"
- "playwright"
license: MIT
compatibility:
- hermes
- claude
allowed-tools:
- bash
- python
- read_file
- write_file
---
# BrowserUse Skill
Browser automation using Playwright and browser_use.
## 🎯 Description
This skill lets the Hermes agent control a browser:
- Open URLs and navigate
- Click elements
- Fill in forms
- Extract data (text, attributes, HTML)
- Take screenshots
- Wait for elements to load
- Run custom JavaScript
- Work with dropdown lists
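
As a minimal sketch of how these capabilities might be combined, the snippet below builds one task description and serializes it for the command-line entry point. The action names (`goto`, `fill`, `click`, `screenshot`, `sequence`) are the ones dispatched in `scripts/browser_automation.py`; the URL, the selectors, the screenshot file name, and the `build_search_task` helper are hypothetical and only illustrate the shape of the data.

```python
import json
import shlex


def build_search_task(url: str, query: str) -> dict:
    """Hypothetical helper: describe a search flow as one 'sequence' task."""
    return {
        "action": "sequence",
        "steps": [
            {"action": "goto", "url": url},
            # Selectors below are assumptions about the target page.
            {"action": "fill", "selector": "input[name='q']", "value": query},
            {"action": "click", "selector": "button[type='submit']"},
            {"action": "screenshot", "path": "/tmp/browser-use-screenshots/search.png"},
        ],
    }


task = build_search_task("https://example.com", "playwright")
# The script reads the whole task from a single JSON argument,
# so the JSON is quoted for the shell.
command = "python3 browser_automation.py " + shlex.quote(json.dumps(task))
print(command)
```

Because `sequence` stops at the first failed step, it is usually worth placing the screenshot step last so that it only runs after the earlier steps succeeded.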
## 📦 Installing dependencies
Before first use, run:
```bash
cd ~/.hermes/skills/browser-use/scripts
chmod +x setup.sh
./setup.sh
```


@ -0,0 +1,30 @@
---
## File: assets/config.example.json
```json
{
  "browser": {
    "headless": true,
    "timeout": 30000,
    "viewport": {
      "width": 1280,
      "height": 720
    },
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
  },
  "screenshots": {
    "path": "/tmp/browser-use-screenshots",
    "format": "png",
    "full_page": true
  },
  "retry": {
    "max_attempts": 3,
    "delay_seconds": 2
  },
  "logging": {
    "level": "info",
    "save_screenshots_on_error": true
  }
}
```


@ -0,0 +1,27 @@
---
## 📚 File: references/common_patterns.md
````markdown
# Common Browser Automation Patterns
## Pattern 1: Login
### Scenario
The user wants to automate signing in to a system.
### Implementation
```json
{
  "action": "sequence",
  "steps": [
    {"action": "goto", "url": "https://example.com/login"},
    {"action": "wait", "selector": "form", "timeout": 5000},
    {"action": "fill", "selector": "input[name='email']", "value": "user@example.com"},
    {"action": "fill", "selector": "input[name='password']", "value": "password123"},
    {"action": "click", "selector": "button[type='submit']"},
    {"action": "wait", "selector": ".dashboard", "timeout": 10000},
    {"action": "screenshot", "path": "/tmp/after_login.png"}
  ]
}
```
````

52
references/selectors.md Normal file

@ -0,0 +1,52 @@
# CSS Selectors: Complete Cheat Sheet
## Quick reference
### Basic selectors
| Selector | Example | Description |
|----------|---------|-------------|
| `*` | `*` | All elements |
| `element` | `div` | Element by tag name |
| `#id` | `#main` | Element by ID |
| `.class` | `.button` | Element by class |
| `[attr]` | `[disabled]` | Element that has the attribute |
| `[attr=value]` | `[type="submit"]` | Attribute value matches exactly |
| `[attr^=value]` | `[href^="https"]` | Attribute value starts with |
| `[attr$=value]` | `[href$=".pdf"]` | Attribute value ends with |
| `[attr*=value]` | `[name*="user"]` | Attribute value contains |
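
The attribute operators in the table can be read as plain string tests. The sketch below is only an illustration of that reading in Python; `attr_matches` is a hypothetical helper, not part of Playwright or of this skill.

```python
from typing import Optional


def attr_matches(actual: Optional[str], op: Optional[str], expected: str = "") -> bool:
    """Illustrative model of attribute selectors for a single attribute value.

    `actual` is the element's attribute value, or None if the attribute is absent.
    `op` is None for `[attr]`, otherwise one of "=", "^=", "$=", "*=".
    """
    if actual is None:
        return False  # a missing attribute never matches
    if op is None:
        return True  # [attr]: presence is enough
    if op == "=":
        return actual == expected
    if expected == "":
        return False  # ^=, $= and *= match nothing for an empty value
    if op == "^=":
        return actual.startswith(expected)
    if op == "$=":
        return actual.endswith(expected)
    if op == "*=":
        return expected in actual
    raise ValueError(f"unsupported operator: {op}")


href = "https://example.com/report.pdf"
print(attr_matches(href, "^=", "https"))
print(attr_matches(href, "$=", ".pdf"))
print(attr_matches(href, "*=", "user"))
```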
### Combinators
| Selector | Example | Description |
|----------|---------|-------------|
| `A B` | `div p` | Descendant (any depth) |
| `A > B` | `div > p` | Direct child |
| `A + B` | `h1 + p` | Immediately following sibling |
| `A ~ B` | `h1 ~ p` | All following siblings |
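
To make the difference between the four combinators concrete, here is a small Python model over a toy element tree. The `Node` class and the four functions are hypothetical and exist only for this illustration; a real browser evaluates selectors against the DOM.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    tag: str
    children: List["Node"] = field(default_factory=list)


def child(parent: Node, tag: str) -> List[Node]:
    """`A > B`: direct children of `parent` with the given tag."""
    return [c for c in parent.children if c.tag == tag]


def descendant(parent: Node, tag: str) -> List[Node]:
    """`A B`: matches at any depth below `parent`, in document order."""
    found = []
    for c in parent.children:
        if c.tag == tag:
            found.append(c)
        found.extend(descendant(c, tag))
    return found


def adjacent(siblings: List[Node], first: str, second: str) -> List[Node]:
    """`A + B`: each B whose immediately preceding sibling is an A."""
    return [b for a, b in zip(siblings, siblings[1:])
            if a.tag == first and b.tag == second]


def following(siblings: List[Node], first: str, second: str) -> List[Node]:
    """`A ~ B`: each B that comes after some A among the same siblings."""
    seen_first = False
    found = []
    for node in siblings:
        if seen_first and node.tag == second:
            found.append(node)
        if node.tag == first:
            seen_first = True
    return found


# <div><h1/><p/><section><p/></section><p/></div>
div = Node("div", [Node("h1"), Node("p"), Node("section", [Node("p")]), Node("p")])
print(len(child(div, "p")), len(descendant(div, "p")))
print(len(adjacent(div.children, "h1", "p")), len(following(div.children, "h1", "p")))
```

In this tree `div > p` finds two paragraphs while `div p` finds three, and `h1 + p` finds one paragraph while `h1 ~ p` finds two.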
### Pseudo-classes
| Pseudo-class | Example | Description |
|--------------|---------|-------------|
| `:first-child` | `li:first-child` | First child |
| `:last-child` | `li:last-child` | Last child |
| `:nth-child(n)` | `tr:nth-child(2)` | n-th child |
| `:nth-of-type(n)` | `p:nth-of-type(2)` | n-th element of its type |
| `:not(selector)` | `div:not(.hidden)` | Exclusion |
| `:has(selector)` | `div:has(p)` | Contains a matching descendant |
| `:has-text(text)` | `a:has-text("Click")` | Contains text (Playwright extension; `:contains()` is a jQuery extension, not standard CSS) |
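
`:nth-child(n)` and `:nth-of-type(n)` are easy to confuse, so here is a rough Python model of the difference. It treats a parent's children as a list of tag names and handles only a plain integer `n` (the real syntax also accepts formulas such as `2n+1`); both function names are hypothetical.

```python
from typing import List, Optional


def nth_child(siblings: List[str], tag: str, n: int) -> Optional[int]:
    """`tag:nth-child(n)`: the n-th sibling overall, but only if it has that tag."""
    if 1 <= n <= len(siblings) and siblings[n - 1] == tag:
        return n - 1  # zero-based index of the matched sibling
    return None


def nth_of_type(siblings: List[str], tag: str, n: int) -> Optional[int]:
    """`tag:nth-of-type(n)`: the n-th sibling when counting only that tag."""
    positions = [i for i, t in enumerate(siblings) if t == tag]
    return positions[n - 1] if 1 <= n <= len(positions) else None


siblings = ["h1", "p", "div", "p"]
print(nth_child(siblings, "p", 2))    # second child is a <p>
print(nth_child(siblings, "p", 3))    # third child is a <div>, so no match
print(nth_of_type(siblings, "p", 2))  # second <p> is the fourth child
```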
## XPath: An Alternative
### Basic XPath
```xpath
//element # All elements with this tag name
//div[@id='main'] # By attribute
//div[contains(@class, 'btn')] # Partial class match
//button[text()='Submit'] # By text
//a[contains(text(), 'Learn')] # Partial text match
//div[@id='main']//p # Nesting (any depth)
//div[1] # Every div that is the first div child of its parent
//div[last()] # Every div that is the last div child of its parent
```


@ -0,0 +1,338 @@
## 🐍 File: scripts/browser_automation.py
#!/usr/bin/env python3
"""
Browser automation core module for Hermes Agent Skill.
Automates the browser using Playwright.
"""
import asyncio
import json
import sys
import os
from typing import Dict, Any, Optional, List

from playwright.async_api import async_playwright, Page, Browser, Playwright


class BrowserAutomation:
    """Main class for browser automation."""

    def __init__(self, headless: bool = True, timeout: int = 30000):
        self.headless = headless
        self.timeout = timeout  # default timeout in milliseconds
        self.playwright: Optional[Playwright] = None
        self.browser: Optional[Browser] = None
        self.page: Optional[Page] = None

    async def __aenter__(self):
        await self.start()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        await self.close()

    async def start(self):
        """Launch the browser and open a page."""
        self.playwright = await async_playwright().start()
        self.browser = await self.playwright.chromium.launch(
            headless=self.headless,
            # Flags commonly needed when running Chromium in containers
            args=[
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-dev-shm-usage',
                '--disable-accelerated-2d-canvas',
                '--disable-gpu'
            ]
        )
        self.page = await self.browser.new_page()
        self.page.set_default_timeout(self.timeout)

    async def close(self):
        """Close the browser and stop Playwright."""
        if self.browser:
            await self.browser.close()
        if self.playwright:
            await self.playwright.stop()

    async def goto(self, url: str) -> Dict[str, Any]:
        """Navigate to a URL."""
        try:
            response = await self.page.goto(url, wait_until='networkidle')
            status = response.status if response else None
            return {
                "success": True,
                "url": self.page.url,
                "status": status
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to navigate to {url}: {str(e)}"
            }

    async def click(self, selector: str) -> Dict[str, Any]:
        """Click an element."""
        try:
            await self.page.wait_for_selector(selector, timeout=self.timeout)
            await self.page.click(selector)
            return {
                "success": True,
                "selector": selector,
                "message": f"Clicked on {selector}"
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to click on {selector}: {str(e)}"
            }

    async def fill(self, selector: str, value: str) -> Dict[str, Any]:
        """Fill in a field."""
        try:
            await self.page.wait_for_selector(selector, timeout=self.timeout)
            await self.page.fill(selector, value)
            return {
                "success": True,
                "selector": selector,
                "value": value,
                "message": f"Filled {selector} with '{value}'"
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to fill {selector}: {str(e)}"
            }

    async def screenshot(self, path: str = "/tmp/screenshot.png") -> Dict[str, Any]:
        """Take a full-page screenshot."""
        try:
            # Make sure the target directory exists; a bare file name has
            # no directory part, and os.makedirs("") would raise.
            directory = os.path.dirname(path)
            if directory:
                os.makedirs(directory, exist_ok=True)
            await self.page.screenshot(path=path, full_page=True)
            return {
                "success": True,
                "path": path,
                "message": f"Screenshot saved to {path}"
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to take screenshot: {str(e)}"
            }

    async def get_text(self, selector: str) -> Dict[str, Any]:
        """Get the text of an element."""
        try:
            await self.page.wait_for_selector(selector, timeout=self.timeout)
            text = await self.page.text_content(selector)
            return {
                "success": True,
                "text": text.strip() if text else "",
                "selector": selector
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to get text from {selector}: {str(e)}"
            }

    async def get_text_all(self, selector: str) -> Dict[str, Any]:
        """Get the text of all matching elements."""
        try:
            await self.page.wait_for_selector(selector, timeout=self.timeout)
            elements = await self.page.query_selector_all(selector)
            texts = []
            for el in elements:
                text = await el.text_content()
                if text:
                    texts.append(text.strip())
            return {
                "success": True,
                "texts": texts,
                "count": len(texts),
                "selector": selector
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to get texts from {selector}: {str(e)}"
            }

    async def evaluate(self, js_code: str) -> Dict[str, Any]:
        """Run JavaScript in the page."""
        try:
            result = await self.page.evaluate(js_code)
            return {
                "success": True,
                "result": result,
                "code": js_code[:100]  # truncated for output
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to evaluate JavaScript: {str(e)}"
            }

    async def select(self, selector: str, value: str) -> Dict[str, Any]:
        """Select an option from a dropdown list."""
        try:
            await self.page.wait_for_selector(selector, timeout=self.timeout)
            await self.page.select_option(selector, value)
            return {
                "success": True,
                "selector": selector,
                "value": value,
                "message": f"Selected '{value}' from {selector}"
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to select from {selector}: {str(e)}"
            }

    async def wait_for_selector(self, selector: str, timeout: Optional[int] = None) -> Dict[str, Any]:
        """Wait for an element to appear."""
        # Fall back to the default only when no timeout was given, so an
        # explicit 0 (no timeout in Playwright) is passed through.
        timeout_ms = self.timeout if timeout is None else timeout
        try:
            await self.page.wait_for_selector(selector, timeout=timeout_ms)
            return {
                "success": True,
                "selector": selector,
                "timeout": timeout_ms,
                "message": f"Element {selector} appeared"
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Timeout waiting for {selector}: {str(e)}"
            }

    async def get_html(self) -> Dict[str, Any]:
        """Get the page HTML."""
        try:
            html = await self.page.content()
            return {
                "success": True,
                "html": html,
                "size": len(html)
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to get HTML: {str(e)}"
            }

    async def get_title(self) -> Dict[str, Any]:
        """Get the page title."""
        try:
            title = await self.page.title()
            return {
                "success": True,
                "title": title
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to get title: {str(e)}"
            }

    async def get_url(self) -> Dict[str, Any]:
        """Get the current URL."""
        try:
            url = self.page.url
            return {
                "success": True,
                "url": url
            }
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to get URL: {str(e)}"
            }

    async def execute_sequence(self, steps: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Run a sequence of actions, stopping at the first failure."""
        results = []
        for i, step in enumerate(steps):
            result = await self.execute_task(step)
            results.append({
                "step": i + 1,
                "action": step.get("action"),
                "result": result
            })
            # If a step fails, stop and report the results collected so far
            if not result.get("success"):
                return {
                    "success": False,
                    "error": f"Sequence failed at step {i + 1}",
                    "results": results
                }
        return {
            "success": True,
            "results": results,
            "total_steps": len(steps)
        }

    async def execute_task(self, task: Dict[str, Any]) -> Dict[str, Any]:
        """Run a single task described by a dict with an "action" key."""
        action = task.get("action")
        # Lambdas defer the call so that only the requested action runs
        actions_map = {
            "goto": lambda: self.goto(task.get("url")),
            "click": lambda: self.click(task.get("selector")),
            "fill": lambda: self.fill(task.get("selector"), task.get("value")),
            "screenshot": lambda: self.screenshot(task.get("path", "/tmp/screenshot.png")),
            "get_text": lambda: self.get_text(task.get("selector")),
            "get_text_all": lambda: self.get_text_all(task.get("selector")),
            "evaluate": lambda: self.evaluate(task.get("code")),
            "select": lambda: self.select(task.get("selector"), task.get("value")),
            "wait": lambda: self.wait_for_selector(task.get("selector"), task.get("timeout")),
            "get_html": lambda: self.get_html(),
            "get_title": lambda: self.get_title(),
            "get_url": lambda: self.get_url(),
            "sequence": lambda: self.execute_sequence(task.get("steps", []))
        }
        if action not in actions_map:
            return {
                "success": False,
                "error": f"Unknown action: {action}. Available: {', '.join(actions_map.keys())}"
            }
        return await actions_map[action]()


async def run_from_args():
    """Run a task passed on the command line."""
    if len(sys.argv) < 2:
        print(json.dumps({
            "success": False,
            "error": "No task provided. Usage: python3 browser_automation.py '<JSON_TASK>'"
        }))
        return
    try:
        task = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        # Not JSON: treat the argument as the URL of a goto command
        task = {"action": "goto", "url": sys.argv[1]}
    # Headless mode can be overridden with the BROWSER_HEADLESS environment variable
    headless = os.environ.get("BROWSER_HEADLESS", "true").lower() == "true"
    async with BrowserAutomation(headless=headless) as browser:
        result = await browser.execute_task(task)
        print(json.dumps(result, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    asyncio.run(run_from_args())

2
scripts/requirements.txt Normal file

@ -0,0 +1,2 @@
playwright>=1.40.0,<2.0.0
browser-use>=0.1.0,<1.0.0

72
scripts/setup.sh Normal file

@ -0,0 +1,72 @@
#!/bin/bash
# Setup script for BrowserUse skill
# Installs the dependencies and browsers for Playwright
set -e

echo "🔧 Installing BrowserUse skill dependencies..."
echo "================================================"

# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Check Python
echo -n "Checking Python... "
if command -v python3 &> /dev/null; then
    PYTHON_VERSION=$(python3 --version)
    echo -e "${GREEN}OK${NC} ($PYTHON_VERSION)"
else
    echo -e "${RED}FAILED${NC}"
    echo "Python 3 is required but not installed."
    exit 1
fi

# Check pip
echo -n "Checking pip... "
if command -v pip3 &> /dev/null; then
    echo -e "${GREEN}OK${NC}"
else
    echo -e "${RED}FAILED${NC}"
    echo "pip3 is required but not installed."
    exit 1
fi

# Install Python packages
echo ""
echo "📦 Installing Python packages..."
pip3 install --upgrade pip
pip3 install -r "$(dirname "$0")/requirements.txt"

# Install Playwright browsers
echo ""
echo "🌐 Installing Playwright browsers..."
python3 -m playwright install chromium
python3 -m playwright install-deps # System dependencies for Linux

# Verify the installation
echo ""
echo -n "✅ Verifying installation... "
if python3 -c "import playwright" 2>/dev/null; then
    echo -e "${GREEN}OK${NC}"
else
    echo -e "${RED}FAILED${NC}"
    echo "Playwright installation verification failed."
    exit 1
fi

# Create a temporary directory for screenshots
mkdir -p /tmp/browser-use-screenshots
echo "📁 Created screenshot directory: /tmp/browser-use-screenshots"

echo ""
echo "================================================"
echo -e "${GREEN}✅ BrowserUse skill successfully installed!${NC}"
echo ""
echo "📖 Quick test:"
echo " python3 $(dirname "$0")/browser_automation.py '{\"action\":\"goto\",\"url\":\"https://example.com\"}'"
echo ""
echo "📚 For more examples, see SKILL.md"
echo "================================================"