refactor: rewrite duckduckgo-search skill for accuracy and usability
Follow-up to PR #267 merge:
- Fix CLI syntax: -k is keywords, -m is max results (was reversed)
- Add clear trigger condition: use only when web_search tool unavailable
- Remove misleading curl fallback (DuckDuckGo Instant Answer API is not a web search endpoint)
- Fix package name: ddgs (renamed from duckduckgo-search)
- Add workflow section for search → web_extract pipeline
- Add pitfalls and limitations sections
- Fix author attribution to actual contributor
- Rewrite shell script as simple ddgs wrapper with availability check
parent d19109742e
commit 2af2f148ab
2 changed files with 82 additions and 108 deletions
|
|
@@ -1,133 +1,111 @@
|
||||||
---
|
---
|
||||||
name: duckduckgo-search
|
name: duckduckgo-search
|
||||||
description: Get web search results from DuckDuckGo. Use as fallback when Firecrawl unavailable. No API key needed.
|
description: Free web search via DuckDuckGo when Firecrawl is unavailable. No API key needed. Use ddgs CLI or Python library to find URLs, then web_extract for content.
|
||||||
version: 1.0.0
|
version: 1.1.0
|
||||||
author: Hermes Agent
|
author: gamedevCloudy
|
||||||
license: MIT
|
license: MIT
|
||||||
metadata:
|
metadata:
|
||||||
hermes:
|
hermes:
|
||||||
tags: [Search, DuckDuckGo, Web Search, API, Free]
|
tags: [search, duckduckgo, web-search, free, fallback]
|
||||||
related_skills: [arxiv, ocr-and-documents]
|
related_skills: [arxiv]
|
||||||
---
|
---
|
||||||
|
|
||||||
# DuckDuckGo Search
|
# DuckDuckGo Search (Firecrawl Fallback)
|
||||||
|
|
||||||
Fast, free web search. No API key required. Use when Firecrawl is unavailable.
|
Free web search using DuckDuckGo. **No API key required.**
|
||||||
|
|
||||||
## Quick Reference
|
## When to Use This
|
||||||
|
|
||||||
| Action | Command |
|
Use this skill ONLY when the `web_search` tool is not available (i.e., `FIRECRAWL_API_KEY` is not set). If `web_search` works, prefer it — it returns richer results with built-in content extraction.
|
||||||
|--------|---------|
|
|
||||||
| Web search | `ddgs text "python async" -k 5` |
|
|
||||||
| Images | `ddgs images "cat"` |
|
|
||||||
| News | `ddgs news "AI"` |
|
|
||||||
| Videos | `ddgs videos "tutorial"` |
|
|
||||||
| **Curl fallback** | `curl "https://api.duckduckgo.com/?q=QUERY&format=json"` |
|
|
||||||
|
|
||||||
## Prerequisites
|
Signs you need this fallback:
|
||||||
|
- `web_search` tool is not listed in your available tools
|
||||||
|
- `web_search` returns an error about missing FIRECRAWL_API_KEY
|
||||||
|
|
||||||
### Option 1: Python Library (Recommended)
|
## Setup
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
# Install the ddgs package (one-time)
|
||||||
pip install ddgs
|
pip install ddgs
|
||||||
ddgs --help
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Option 2: Curl Only (No Dependencies)
|
## Web Search (Primary Use Case)
|
||||||
|
|
||||||
|
### Via Terminal (ddgs CLI)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Verify curl is available
|
# Basic search — returns titles, URLs, and snippets
|
||||||
curl --version
|
ddgs text -k "python async programming" -m 5
|
||||||
```
|
|
||||||
|
|
||||||
No installation needed — curl is standard on all platforms.
|
|
||||||
|
|
||||||
## Web Search
|
|
||||||
|
|
||||||
### Library (ddgs)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Basic search
|
|
||||||
ddgs text "python async programming" -k 5
|
|
||||||
|
|
||||||
# With region filter
|
# With region filter
|
||||||
ddgs text "best restaurants Tokyo" -k 3 -r jp-jp
|
ddgs text -k "best restaurants" -m 5 -r us-en
|
||||||
|
|
||||||
# Safe search
|
# Recent results only (d=day, w=week, m=month, y=year)
|
||||||
ddgs text "medical advice" -k 5 -s off
|
ddgs text -k "latest AI news" -m 5 -t w
|
||||||
|
|
||||||
|
# JSON output for parsing
|
||||||
|
ddgs text -k "fastapi tutorial" -m 5 -o json
|
||||||
```
|
```
|
||||||
|
|
||||||
### Parameters
|
### Via Python (in execute_code)
|
||||||
|
|
||||||
|
```python
|
||||||
|
from hermes_tools import terminal
|
||||||
|
|
||||||
|
# Search and get results
|
||||||
|
result = terminal("ddgs text -k 'python web framework comparison' -m 5")
|
||||||
|
print(result["output"])
|
||||||
|
```
|
||||||
|
|
||||||
|
### CLI Flags
|
||||||
|
|
||||||
| Flag | Description | Example |
|
| Flag | Description | Example |
|
||||||
|------|-------------|---------|
|
|------|-------------|---------|
|
||||||
| `-k` | Max results | `-k 5` |
|
| `-k` | Keywords (query) — **required** | `-k "search terms"` |
|
||||||
|
| `-m` | Max results | `-m 5` |
|
||||||
| `-r` | Region | `-r us-en` |
|
| `-r` | Region | `-r us-en` |
|
||||||
|
| `-t` | Time limit | `-t w` (week) |
|
||||||
| `-s` | Safe search | `-s off` |
|
| `-s` | Safe search | `-s off` |
|
||||||
|
| `-o` | Output format | `-o json` |
|
||||||
### Curl Fallback
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Basic search
|
|
||||||
curl -s "https://api.duckduckgo.com/?q=python+async&format=json&limit=5"
|
|
||||||
|
|
||||||
# Parse results
|
|
||||||
curl -s "..." | jq -r '.RelatedTopics[] | "\(.Text) - \(.FirstURL)"'
|
|
||||||
```
|
|
||||||
|
|
||||||
## Other Search Types
|
## Other Search Types
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Images
|
# Image search
|
||||||
ddgs images "landscape" -k 10
|
ddgs images -k "landscape photography" -m 10
|
||||||
|
|
||||||
# News
|
# News search
|
||||||
ddgs news "artificial intelligence" -k 5
|
ddgs news -k "artificial intelligence" -m 5
|
||||||
|
|
||||||
# Videos
|
# Video search
|
||||||
ddgs videos "python tutorial" -k 5
|
ddgs videos -k "python tutorial" -m 5
|
||||||
```
|
```
|
||||||
|
|
||||||
## Integration
|
## Workflow: Search → Extract
|
||||||
|
|
||||||
After finding URLs, retrieve full content with `web_extract`:
|
DuckDuckGo finds URLs. To get full page content, follow up with `web_extract`:
|
||||||
|
|
||||||
|
1. **Search** with ddgs to find relevant URLs
|
||||||
|
2. **Extract** content using the `web_extract` tool (if available) or curl
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Find with DDG, then extract
|
# Step 1: Find URLs
|
||||||
ddgs text "fastapi tutorial" -k 3
|
ddgs text -k "fastapi tutorial" -m 3
|
||||||
# Copy URL from output
|
|
||||||
web_extract(urls=["https://fastapi.tiangolo.com/tutorial/"])
|
# Step 2: Extract full content from a result URL
|
||||||
|
# (use web_extract tool if available, otherwise curl)
|
||||||
|
curl -s "https://example.com/article" | head -200
|
||||||
```
|
```
|
||||||
|
|
||||||
This is the standard pattern:
|
## Limitations
|
||||||
1. **DuckDuckGo** → finds URLs
|
|
||||||
2. **web_extract** → retrieves full content
|
|
||||||
|
|
||||||
## Use Cases
|
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add `sleep 1` between searches if needed.
|
||||||
|
- **No content extraction**: ddgs only returns titles, URLs, and snippets — not full page content. Use `web_extract` or curl for that.
|
||||||
|
- **Results quality**: Generally good but less configurable than Firecrawl's search.
|
||||||
|
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or add a short delay.
|
||||||
|
|
||||||
| Scenario | Tool | Reason |
|
## Pitfalls
|
||||||
|----------|------|--------|
|
|
||||||
| "Find tutorials on X" | `ddgs text` + `web_extract` | Need full content |
|
|
||||||
| Firecrawl unavailable | `ddgs text` | Free fallback |
|
|
||||||
| Quick image search | `ddgs images` | Find images |
|
|
||||||
| Latest news | `ddgs news` | News results |
|
|
||||||
|
|
||||||
## Error Handling
|
- **Don't confuse `-k` and `-m`**: `-k` is for keywords (the query), `-m` is for max results count.
|
||||||
|
- **Package name**: The package is `ddgs` (was previously `duckduckgo-search`). Install with `pip install ddgs`.
|
||||||
```bash
|
- **Empty results**: If ddgs returns nothing, it may be rate-limited. Wait a few seconds and retry.
|
||||||
# Check if library installed
|
|
||||||
ddgs --help 2>/dev/null || echo "Using curl fallback"
|
|
||||||
|
|
||||||
# Rate limiting - add delay
|
|
||||||
sleep 1
|
|
||||||
|
|
||||||
# No results - try different query
|
|
||||||
ddgs text "different keywords" -k 5
|
|
||||||
```
|
|
||||||
|
|
||||||
## Notes
|
|
||||||
|
|
||||||
- **No API key required** — completely free
|
|
||||||
- **Rate limit**: ~1 request/second recommended
|
|
||||||
- **Always follow up** with `web_extract` for full content
|
|
||||||
- Curl fallback has limited results (DDG API restrictions)
|
|
||||||
|
|
|
||||||
|
|
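The new Limitations and Pitfalls sections recommend pausing between searches and retrying when ddgs returns nothing. A minimal Python sketch of that retry pattern follows. It is not part of this commit. The helper names (`search_with_retry`, `ddgs_text`, `urls_from`) are hypothetical, and `ddgs_text` assumes the `ddgs` package exposes `DDGS().text(query, max_results=...)` returning dicts with an `href` key. Only the empty-result case is handled; exceptions raised by the library are left to the caller.

```python
import time
from typing import Callable, Dict, List

Result = Dict[str, str]


def search_with_retry(
    search_fn: Callable[[str, int], List[Result]],
    query: str,
    max_results: int = 5,
    attempts: int = 3,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Result]:
    """Call search_fn until it returns results or attempts run out.

    An empty result list is treated as a possible rate limit, so the
    wait grows linearly: delay_s, 2 * delay_s, and so on.
    """
    for attempt in range(attempts):
        results = search_fn(query, max_results)
        if results:
            return results
        if attempt < attempts - 1:
            sleep(delay_s * (attempt + 1))
    return []


def ddgs_text(query: str, max_results: int) -> List[Result]:
    """Assumed adapter around the ddgs Python library (not verified here)."""
    from ddgs import DDGS  # imported lazily so the sketch loads without ddgs

    return list(DDGS().text(query, max_results=max_results))


def urls_from(results: List[Result]) -> List[str]:
    """Collect result URLs to hand to web_extract (or curl) afterwards."""
    return [r["href"] for r in results if r.get("href")]
```

With ddgs installed, `urls_from(search_with_retry(ddgs_text, "fastapi tutorial", 3))` would cover the search half of the Search → Extract workflow; the extract half still needs `web_extract` or curl.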
@@ -1,32 +1,28 @@
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
# DuckDuckGo Search Helper Script
|
# DuckDuckGo Search Helper Script
|
||||||
# Fallback for when ddgs library is unavailable
|
# Wrapper around ddgs CLI with sensible defaults
|
||||||
# Usage: ./duckduckgo.sh [text|images|news|videos] <query> [limit]
|
# Usage: ./duckduckgo.sh <query> [max_results]
|
||||||
|
|
||||||
set -e
|
set -e
|
||||||
|
|
||||||
MODE="${1:-text}"
|
QUERY="$1"
|
||||||
QUERY="$2"
|
MAX_RESULTS="${2:-5}"
|
||||||
LIMIT="${3:-5}"
|
|
||||||
|
|
||||||
if [ -z "$QUERY" ]; then
|
if [ -z "$QUERY" ]; then
|
||||||
echo "Usage: $0 [text|images|news|videos] <query> [limit]"
|
echo "Usage: $0 <query> [max_results]"
|
||||||
|
echo ""
|
||||||
echo "Examples:"
|
echo "Examples:"
|
||||||
echo " $0 text 'python async' 5"
|
echo " $0 'python async programming' 5"
|
||||||
echo " $0 images 'cat' 10"
|
echo " $0 'latest AI news' 10"
|
||||||
|
echo ""
|
||||||
|
echo "Requires: pip install ddgs"
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# URL encode query
|
# Check if ddgs is available
|
||||||
ENCODED_QUERY=$(echo "$QUERY" | sed 's/ /+/g' | sed 's/&/%26/g' | sed 's/=/%3D/g')
|
if ! command -v ddgs &> /dev/null; then
|
||||||
|
echo "Error: ddgs not found. Install with: pip install ddgs"
|
||||||
case "$MODE" in
|
|
||||||
text|images|news|videos)
|
|
||||||
curl -s "https://api.duckduckgo.com/?q=${ENCODED_QUERY}&format=json&limit=${LIMIT}"
|
|
||||||
;;
|
|
||||||
*)
|
|
||||||
echo "Unknown mode: $MODE"
|
|
||||||
echo "Valid modes: text, images, news, videos"
|
|
||||||
exit 1
|
exit 1
|
||||||
;;
|
fi
|
||||||
esac
|
|
||||||
|
ddgs text -k "$QUERY" -m "$MAX_RESULTS"
|
||||||
|
|
|
||||||
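For comparison, the rewritten wrapper can be sketched in Python. This is an illustration only, not code from the commit. The function names are hypothetical, and the flags used (`-k`, `-m`, `-r`, `-t`) are taken from the CLI Flags table in the SKILL.md diff above rather than checked against a specific ddgs release.

```python
import shutil
import subprocess
import sys
from typing import List, Optional

# Time limits listed in the skill: day, week, month, year.
VALID_TIMELIMITS = {"d", "w", "m", "y"}


def build_ddgs_command(
    query: str,
    max_results: int = 5,
    region: Optional[str] = None,
    timelimit: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `ddgs text`, keeping -k and -m in their roles."""
    if not query:
        raise ValueError("query must not be empty")
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    cmd = ["ddgs", "text", "-k", query, "-m", str(max_results)]
    if region:
        cmd += ["-r", region]
    if timelimit:
        if timelimit not in VALID_TIMELIMITS:
            raise ValueError("timelimit must be one of d, w, m, y")
        cmd += ["-t", timelimit]
    return cmd


def run_search(query: str, max_results: int = 5) -> int:
    """Mirror the shell wrapper: check that ddgs exists, then run the search."""
    if shutil.which("ddgs") is None:
        print("Error: ddgs not found. Install with: pip install ddgs", file=sys.stderr)
        return 1
    return subprocess.run(build_ddgs_command(query, max_results)).returncode
```

Passing the query as a single list element avoids shell quoting problems, which is the same reason the script quotes `"$QUERY"`.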