docs: enhance WhatsApp setup instructions and introduce mode selection

Update the README and messaging documentation to describe the two WhatsApp integration modes: `bot` mode (recommended) and `self-chat` mode. The setup instructions now walk users through mode selection, allowlist management, and dependency installation. The CLI commands are adjusted to match, and the WhatsApp bridge is modified to support the new mode setting.
teknium1 2026-03-02 17:51:33 -08:00
parent 221e4228ec
commit 14b0ad95c6
6 changed files with 148 additions and 55 deletions

@ -271,22 +271,30 @@ SLACK_ALLOWED_USERS=U01234ABCDE # Comma-separated Slack user IDs
### WhatsApp Setup ### WhatsApp Setup
WhatsApp doesn't have a simple bot API like Telegram or Discord. Hermes includes a built-in bridge using [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web. The agent links to your WhatsApp account and responds to incoming messages. WhatsApp doesn't have a simple bot API like Telegram or Discord. Hermes includes a built-in bridge using [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web.
1. **Run the setup command:** **Two modes are supported:**
| Mode | How it works | Best for |
|------|-------------|----------|
| **Separate bot number** (recommended) | Dedicate a phone number to the bot. People message that number directly. | Clean UX, multiple users |
| **Personal self-chat** | Use your own WhatsApp. You message yourself to talk to the agent. | Quick setup, single user |
**Setup:**
```bash ```bash
hermes whatsapp hermes whatsapp
``` ```
This will: The wizard will:
- Enable WhatsApp in your config 1. Ask which mode you want
- Ask for your phone number (for the allowlist) 2. For **bot mode**: guide you through getting a second number (WhatsApp Business app on a dual-SIM, Google Voice, or a cheap prepaid SIM)
- Install bridge dependencies (Node.js required) 3. Configure the allowlist
- Display a QR code — scan it with your phone (WhatsApp → Settings → Linked Devices → Link a Device) 4. Install bridge dependencies (Node.js required)
- Exit automatically once paired 5. Display a QR code — scan from WhatsApp (or WhatsApp Business) → Settings → Linked Devices → Link a Device
6. Exit once paired
2. **Start the gateway:** **Start the gateway:**
```bash ```bash
hermes gateway # Foreground hermes gateway # Foreground
@ -295,7 +303,7 @@ hermes gateway install # Or install as a system service (Linux)
The gateway starts the WhatsApp bridge automatically using the saved session. The gateway starts the WhatsApp bridge automatically using the saved session.
> **Note:** WhatsApp Web sessions can disconnect if WhatsApp updates their protocol. The gateway reconnects automatically. If you see persistent failures, re-pair with `hermes whatsapp`. Agent responses are prefixed with "⚕ Hermes Agent" so you can distinguish them from your own messages in self-chat. > **Note:** WhatsApp Web sessions can disconnect if WhatsApp updates their protocol. The gateway reconnects automatically. If you see persistent failures, re-pair with `hermes whatsapp`. Agent responses are prefixed with "⚕ Hermes Agent" for easy identification.
See [docs/messaging.md](docs/messaging.md) for advanced WhatsApp configuration. See [docs/messaging.md](docs/messaging.md) for advanced WhatsApp configuration.
@ -1635,6 +1643,7 @@ All variables go in `~/.hermes/.env`. Run `hermes config set VAR value` to set t
| `SLACK_ALLOWED_USERS` | Comma-separated Slack user IDs | | `SLACK_ALLOWED_USERS` | Comma-separated Slack user IDs |
| `SLACK_HOME_CHANNEL` | Default Slack channel for cron delivery | | `SLACK_HOME_CHANNEL` | Default Slack channel for cron delivery |
| `WHATSAPP_ENABLED` | Enable WhatsApp bridge (`true`/`false`) | | `WHATSAPP_ENABLED` | Enable WhatsApp bridge (`true`/`false`) |
| `WHATSAPP_MODE` | `bot` (separate number, recommended) or `self-chat` (message yourself) |
| `WHATSAPP_ALLOWED_USERS` | Comma-separated phone numbers (with country code) | | `WHATSAPP_ALLOWED_USERS` | Comma-separated phone numbers (with country code) |
| `MESSAGING_CWD` | Working directory for terminal in messaging (default: ~) | | `MESSAGING_CWD` | Working directory for terminal in messaging (default: ~) |
| `GATEWAY_ALLOW_ALL_USERS` | Allow all users without allowlist (`true`/`false`, default: `false`) | | `GATEWAY_ALLOW_ALL_USERS` | Allow all users without allowlist (`true`/`false`, default: `false`) |
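
The table above documents two valid values for `WHATSAPP_MODE`. A minimal sketch of how a consumer might read and validate the variable is shown below. The helper name `resolve_whatsapp_mode` and the choice to raise on unknown values are assumptions for illustration; the `self-chat` fallback mirrors the default used by the adapter and bridge in this commit.

```python
import os

# The two values documented for WHATSAPP_MODE.
VALID_MODES = ("bot", "self-chat")


def resolve_whatsapp_mode(env=None, default="self-chat"):
    """Return a validated WHATSAPP_MODE value (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = (env.get("WHATSAPP_MODE") or "").strip().lower()
    if not raw:
        # Unset or empty: fall back to the same default the adapter and bridge use.
        return default
    if raw not in VALID_MODES:
        # Assumption: failing loudly is preferable to silently picking a mode.
        raise ValueError(f"WHATSAPP_MODE must be one of {VALID_MODES}, got {raw!r}")
    return raw
```

Passing a plain dict as `env` keeps the helper easy to exercise without touching the real environment.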

@ -141,7 +141,12 @@ pip install discord.py>=2.0
### WhatsApp ### WhatsApp
WhatsApp uses a built-in bridge powered by [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web. The agent links to your WhatsApp account and responds to incoming messages. WhatsApp uses a built-in bridge powered by [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web.
**Two modes:**
- **`bot` mode (recommended):** Use a dedicated phone number for the bot. Other people message that number directly. All `fromMe` messages are treated as bot echo-backs and ignored.
- **`self-chat` mode:** Use your own WhatsApp account. You talk to the agent by messaging yourself (WhatsApp → "Message Yourself").
**Setup:** **Setup:**
@ -149,12 +154,7 @@ WhatsApp uses a built-in bridge powered by [Baileys](https://github.com/WhiskeyS
hermes whatsapp hermes whatsapp
``` ```
This will: The wizard walks you through mode selection, allowlist configuration, dependency installation, and QR code pairing. For bot mode, you'll need a second phone number with WhatsApp installed on some device (dual-SIM with WhatsApp Business app is the easiest approach).
- Enable WhatsApp in your `.env`
- Ask for your phone number (for the allowlist)
- Install bridge dependencies (Node.js required)
- Display a QR code — scan it with your phone (WhatsApp → Settings → Linked Devices → Link a Device)
- Exit automatically once paired
Then start the gateway: Then start the gateway:
@ -162,16 +162,23 @@ Then start the gateway:
hermes gateway hermes gateway
``` ```
The gateway starts the WhatsApp bridge automatically using the saved session credentials in `~/.hermes/whatsapp/session/`.
**Environment variables:** **Environment variables:**
```bash ```bash
WHATSAPP_ENABLED=true WHATSAPP_ENABLED=true
WHATSAPP_MODE=bot # "bot" (separate number) or "self-chat" (message yourself)
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers with country code WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers with country code
``` ```
Agent responses are prefixed with "⚕ **Hermes Agent**" so you can distinguish them from your own messages when messaging yourself. **Getting a second number for bot mode:**
| Option | Cost | Notes |
|--------|------|-------|
| WhatsApp Business app + dual-SIM | Free (if you have dual-SIM) | Install alongside personal WhatsApp, no second phone needed |
| Google Voice | Free (US only) | voice.google.com, verify WhatsApp via the Google Voice app |
| Prepaid SIM | $3-10/month | Any carrier; verify once, phone can go in a drawer on WiFi |
Agent responses are prefixed with "⚕ **Hermes Agent**" for easy identification.
> **Re-pairing:** If WhatsApp Web sessions disconnect (protocol updates, phone reset), re-pair with `hermes whatsapp`. > **Re-pairing:** If WhatsApp Web sessions disconnect (protocol updates, phone reset), re-pair with `hermes whatsapp`.
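
`WHATSAPP_ALLOWED_USERS` is described above as a comma-separated list of phone numbers with country code. The sketch below shows one way such a value could be parsed and checked in Python, following the same split, trim, and drop-empty steps the bridge applies in JavaScript. The function names are hypothetical, and treating an empty allowlist as "allow everyone" is an assumption based on the wizard's warning text, not on enforcement code shown in this commit.

```python
def parse_allowed_users(raw):
    """Split a comma-separated allowlist, trimming whitespace and dropping blanks."""
    return [part.strip() for part in (raw or "").split(",") if part.strip()]


def is_allowed(sender_number, allowed_users):
    """Check a sender against the parsed allowlist (hypothetical helper)."""
    if not allowed_users:
        # Assumption: no allowlist means every sender gets a response,
        # as the setup wizard's warning suggests.
        return True
    return sender_number in allowed_users
```

For example, `parse_allowed_users("15551234567, 15557654321")` yields a two-element list that `is_allowed` can then test sender numbers against.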

@ -160,12 +160,14 @@ class WhatsAppAdapter(BasePlatformAdapter):
pass pass
# Start the bridge process in its own process group # Start the bridge process in its own process group
whatsapp_mode = os.getenv("WHATSAPP_MODE", "self-chat")
self._bridge_process = subprocess.Popen( self._bridge_process = subprocess.Popen(
[ [
"node", "node",
str(bridge_path), str(bridge_path),
"--port", str(self._bridge_port), "--port", str(self._bridge_port),
"--session", str(self._session_path), "--session", str(self._session_path),
"--mode", whatsapp_mode,
], ],
stdout=subprocess.DEVNULL, stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
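
The hunk above appends `--mode` to the bridge's argument list. As a rough sketch, the same argv could be assembled in a small standalone function so it can be inspected without spawning a process. `build_bridge_command` is a hypothetical helper and not part of the adapter; the argument order and the `self-chat` fallback follow the diff.

```python
import os
from pathlib import Path


def build_bridge_command(bridge_path, port, session_path, mode=None):
    """Assemble the argv list for the Node bridge (hypothetical helper)."""
    if mode is None:
        # Same environment lookup and fallback as the adapter in this commit.
        mode = os.getenv("WHATSAPP_MODE", "self-chat")
    return [
        "node",
        str(bridge_path),
        "--port", str(port),
        "--session", str(session_path),
        "--mode", mode,
    ]
```

The returned list can be passed directly to `subprocess.Popen`, which is how the adapter launches the bridge.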

@ -168,7 +168,7 @@ def cmd_gateway(args):
def cmd_whatsapp(args): def cmd_whatsapp(args):
"""Set up WhatsApp: enable, configure allowed users, install bridge, pair via QR.""" """Set up WhatsApp: choose mode, configure, install bridge, pair via QR."""
import os import os
import subprocess import subprocess
from pathlib import Path from pathlib import Path
@ -177,12 +177,55 @@ def cmd_whatsapp(args):
print() print()
print("⚕ WhatsApp Setup") print("⚕ WhatsApp Setup")
print("=" * 50) print("=" * 50)
print()
print("This will link your WhatsApp account to Hermes Agent.")
print("The agent will respond to messages sent to your WhatsApp number.")
print()
# Step 1: Enable WhatsApp # ── Step 1: Choose mode ──────────────────────────────────────────────
current_mode = get_env_value("WHATSAPP_MODE") or ""
if not current_mode:
print()
print("How will you use WhatsApp with Hermes?")
print()
print(" 1. Separate bot number (recommended)")
print(" People message the bot's number directly — cleanest experience.")
print(" Requires a second phone number with WhatsApp installed on a device.")
print()
print(" 2. Personal number (self-chat)")
print(" You message yourself to talk to the agent.")
print(" Quick to set up, but the UX is less intuitive.")
print()
try:
choice = input(" Choose [1/2]: ").strip()
except (EOFError, KeyboardInterrupt):
print("\nSetup cancelled.")
return
if choice == "1":
save_env_value("WHATSAPP_MODE", "bot")
wa_mode = "bot"
print(" ✓ Mode: separate bot number")
print()
print(" ┌─────────────────────────────────────────────────┐")
print(" │ Getting a second number for the bot: │")
print(" │ │")
print(" │ Easiest: Install WhatsApp Business (free app) │")
print(" │ on your phone with a second number: │")
print(" │ • Dual-SIM: use your 2nd SIM slot │")
print(" │ • Google Voice: free US number (voice.google) │")
print(" │ • Prepaid SIM: $3-10, verify once │")
print(" │ │")
print(" │ WhatsApp Business runs alongside your personal │")
print(" │ WhatsApp — no second phone needed. │")
print(" └─────────────────────────────────────────────────┘")
else:
save_env_value("WHATSAPP_MODE", "self-chat")
wa_mode = "self-chat"
print(" ✓ Mode: personal number (self-chat)")
else:
wa_mode = current_mode
mode_label = "separate bot number" if wa_mode == "bot" else "personal number (self-chat)"
print(f"\n✓ Mode: {mode_label}")
# ── Step 2: Enable WhatsApp ──────────────────────────────────────────
print()
current = get_env_value("WHATSAPP_ENABLED") current = get_env_value("WHATSAPP_ENABLED")
if current and current.lower() == "true": if current and current.lower() == "true":
print("✓ WhatsApp is already enabled") print("✓ WhatsApp is already enabled")
@ -190,18 +233,28 @@ def cmd_whatsapp(args):
save_env_value("WHATSAPP_ENABLED", "true") save_env_value("WHATSAPP_ENABLED", "true")
print("✓ WhatsApp enabled") print("✓ WhatsApp enabled")
# Step 2: Allowed users # ── Step 3: Allowed users ────────────────────────────────────────────
current_users = get_env_value("WHATSAPP_ALLOWED_USERS") or "" current_users = get_env_value("WHATSAPP_ALLOWED_USERS") or ""
if current_users: if current_users:
print(f"✓ Allowed users: {current_users}") print(f"✓ Allowed users: {current_users}")
try:
response = input("\n Update allowed users? [y/N] ").strip() response = input("\n Update allowed users? [y/N] ").strip()
except (EOFError, KeyboardInterrupt):
response = "n"
if response.lower() in ("y", "yes"): if response.lower() in ("y", "yes"):
phone = input(" Phone number(s) (e.g. 15551234567, comma-separated): ").strip() if wa_mode == "bot":
phone = input(" Phone numbers that can message the bot (comma-separated): ").strip()
else:
phone = input(" Your phone number (e.g. 15551234567): ").strip()
if phone: if phone:
save_env_value("WHATSAPP_ALLOWED_USERS", phone.replace(" ", "")) save_env_value("WHATSAPP_ALLOWED_USERS", phone.replace(" ", ""))
print(f" ✓ Updated to: {phone}") print(f" ✓ Updated to: {phone}")
else: else:
print() print()
if wa_mode == "bot":
print(" Who should be allowed to message the bot?")
phone = input(" Phone numbers (comma-separated, or * for anyone): ").strip()
else:
phone = input(" Your phone number (e.g. 15551234567): ").strip() phone = input(" Your phone number (e.g. 15551234567): ").strip()
if phone: if phone:
save_env_value("WHATSAPP_ALLOWED_USERS", phone.replace(" ", "")) save_env_value("WHATSAPP_ALLOWED_USERS", phone.replace(" ", ""))
@ -209,7 +262,7 @@ def cmd_whatsapp(args):
else: else:
print(" ⚠ No allowlist — the agent will respond to ALL incoming messages") print(" ⚠ No allowlist — the agent will respond to ALL incoming messages")
# Step 3: Install bridge deps # ── Step 4: Install bridge dependencies ──────────────────────────────
project_root = Path(__file__).resolve().parents[1] project_root = Path(__file__).resolve().parents[1]
bridge_dir = project_root / "scripts" / "whatsapp-bridge" bridge_dir = project_root / "scripts" / "whatsapp-bridge"
bridge_script = bridge_dir / "bridge.js" bridge_script = bridge_dir / "bridge.js"
@ -234,13 +287,16 @@ def cmd_whatsapp(args):
else: else:
print("✓ Bridge dependencies already installed") print("✓ Bridge dependencies already installed")
# Step 4: Check for existing session # ── Step 5: Check for existing session ───────────────────────────────
session_dir = Path.home() / ".hermes" / "whatsapp" / "session" session_dir = Path.home() / ".hermes" / "whatsapp" / "session"
session_dir.mkdir(parents=True, exist_ok=True) session_dir.mkdir(parents=True, exist_ok=True)
if (session_dir / "creds.json").exists(): if (session_dir / "creds.json").exists():
print("✓ Existing WhatsApp session found") print("✓ Existing WhatsApp session found")
try:
response = input("\n Re-pair? This will clear the existing session. [y/N] ").strip() response = input("\n Re-pair? This will clear the existing session. [y/N] ").strip()
except (EOFError, KeyboardInterrupt):
response = "n"
if response.lower() in ("y", "yes"): if response.lower() in ("y", "yes"):
import shutil import shutil
shutil.rmtree(session_dir, ignore_errors=True) shutil.rmtree(session_dir, ignore_errors=True)
@ -251,11 +307,16 @@ def cmd_whatsapp(args):
print(" Start the gateway with: hermes gateway") print(" Start the gateway with: hermes gateway")
return return
# Step 5: Run bridge in pair-only mode (no HTTP server, exits after QR scan) # ── Step 6: QR code pairing ──────────────────────────────────────────
print() print()
print("" * 50) print("" * 50)
print("📱 Scan the QR code with your phone:") if wa_mode == "bot":
print(" WhatsApp → Settings → Linked Devices → Link a Device") print("📱 Open WhatsApp (or WhatsApp Business) on the")
print(" phone with the BOT's number, then scan:")
else:
print("📱 Open WhatsApp on your phone, then scan:")
print()
print(" Settings → Linked Devices → Link a Device")
print("" * 50) print("" * 50)
print() print()
@ -267,11 +328,27 @@ def cmd_whatsapp(args):
except KeyboardInterrupt: except KeyboardInterrupt:
pass pass
# ── Step 7: Post-pairing ─────────────────────────────────────────────
print() print()
if (session_dir / "creds.json").exists(): if (session_dir / "creds.json").exists():
print("✓ WhatsApp paired successfully!") print("✓ WhatsApp paired successfully!")
print() print()
print("Start the gateway with: hermes gateway") if wa_mode == "bot":
print(" Next steps:")
print(" 1. Start the gateway: hermes gateway")
print(" 2. Send a message to the bot's WhatsApp number")
print(" 3. The agent will reply automatically")
print()
print(" Tip: Agent responses are prefixed with '⚕ Hermes Agent'")
else:
print(" Next steps:")
print(" 1. Start the gateway: hermes gateway")
print(" 2. Open WhatsApp → Message Yourself")
print(" 3. Type a message — the agent will reply")
print()
print(" Tip: Agent responses are prefixed with '⚕ Hermes Agent'")
print(" so you can tell them apart from your own messages.")
print()
print(" Or install as a service: hermes gateway install") print(" Or install as a service: hermes gateway install")
else: else:
print("⚠ Pairing may not have completed. Run 'hermes whatsapp' to try again.") print("⚠ Pairing may not have completed. Run 'hermes whatsapp' to try again.")
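
One detail of the wizard above is easy to miss: only the answer `1` selects bot mode, and any other input, including an empty line, falls through to self-chat. The sketch below isolates that mapping and the mode label so the behavior can be checked on its own. Both function names are hypothetical, and the extra `strip()` is harmless here because the wizard already strips its input.

```python
MODE_LABELS = {
    "bot": "separate bot number",
    "self-chat": "personal number (self-chat)",
}


def mode_from_choice(choice):
    """Map the wizard's menu answer to a WHATSAPP_MODE value."""
    # Only "1" selects bot mode; everything else becomes self-chat,
    # matching the if/else in the wizard.
    return "bot" if choice.strip() == "1" else "self-chat"


def mode_label(mode):
    """Return the label printed for a mode; unknown values read as self-chat."""
    return MODE_LABELS["bot"] if mode == "bot" else MODE_LABELS["self-chat"]
```

A stricter wizard could re-prompt on anything other than `1` or `2`; that would be a behavior change and is not what the commit does.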

@ -1382,21 +1382,13 @@ def run_setup_wizard(args):
existing_whatsapp = get_env_value('WHATSAPP_ENABLED') existing_whatsapp = get_env_value('WHATSAPP_ENABLED')
if not existing_whatsapp and prompt_yes_no("Set up WhatsApp?", False): if not existing_whatsapp and prompt_yes_no("Set up WhatsApp?", False):
print_info("WhatsApp connects via a built-in bridge (Baileys).") print_info("WhatsApp connects via a built-in bridge (Baileys).")
print_info("Requires Node.js (already installed if you have browser tools).") print_info("Requires Node.js. Run 'hermes whatsapp' for guided setup.")
print_info("On first gateway start, you'll scan a QR code with your phone.")
print() print()
if prompt_yes_no("Enable WhatsApp?", True): if prompt_yes_no("Enable WhatsApp now?", True):
save_env_value("WHATSAPP_ENABLED", "true") save_env_value("WHATSAPP_ENABLED", "true")
print_success("WhatsApp enabled") print_success("WhatsApp enabled")
print_info("Run 'hermes whatsapp' to choose your mode (separate bot number")
allowed_users = prompt(" Your phone number (e.g. 15551234567, comma-separated for multiple)") print_info("or personal self-chat) and pair via QR code.")
if allowed_users:
save_env_value("WHATSAPP_ALLOWED_USERS", allowed_users.replace(" ", ""))
print_success("WhatsApp allowlist configured")
else:
print_info("⚠️ No allowlist set — anyone who messages your WhatsApp will get a response!")
print_info("Start the gateway with 'hermes gateway' and scan the QR code.")
# Gateway reminder # Gateway reminder
any_messaging = ( any_messaging = (

@ -34,6 +34,7 @@ function getArg(name, defaultVal) {
const PORT = parseInt(getArg('port', '3000'), 10); const PORT = parseInt(getArg('port', '3000'), 10);
const SESSION_DIR = getArg('session', path.join(process.env.HOME || '~', '.hermes', 'whatsapp', 'session')); const SESSION_DIR = getArg('session', path.join(process.env.HOME || '~', '.hermes', 'whatsapp', 'session'));
const PAIR_ONLY = args.includes('--pair-only'); const PAIR_ONLY = args.includes('--pair-only');
const WHATSAPP_MODE = getArg('mode', process.env.WHATSAPP_MODE || 'self-chat'); // "bot" or "self-chat"
const ALLOWED_USERS = (process.env.WHATSAPP_ALLOWED_USERS || '').split(',').map(s => s.trim()).filter(Boolean); const ALLOWED_USERS = (process.env.WHATSAPP_ALLOWED_USERS || '').split(',').map(s => s.trim()).filter(Boolean);
mkdirSync(SESSION_DIR, { recursive: true }); mkdirSync(SESSION_DIR, { recursive: true });
@ -110,11 +111,16 @@ async function startSocket() {
const isGroup = chatId.endsWith('@g.us'); const isGroup = chatId.endsWith('@g.us');
const senderNumber = senderId.replace(/@.*/, ''); const senderNumber = senderId.replace(/@.*/, '');
// Skip own messages UNLESS it's a self-chat ("Message Yourself") // Handle fromMe messages based on mode
if (msg.key.fromMe) { if (msg.key.fromMe) {
// Always skip in groups and status
if (isGroup || chatId.includes('status')) continue; if (isGroup || chatId.includes('status')) continue;
// In DMs: only allow self-chat (remoteJid matches our own number)
if (WHATSAPP_MODE === 'bot') {
// Bot mode: separate number. ALL fromMe are echo-backs of our own replies — skip.
continue;
}
// Self-chat mode: only allow messages in the user's own self-chat
const myNumber = (sock.user?.id || '').replace(/:.*@/, '@').replace(/@.*/, ''); const myNumber = (sock.user?.id || '').replace(/:.*@/, '@').replace(/@.*/, '');
const chatNumber = chatId.replace(/@.*/, ''); const chatNumber = chatId.replace(/@.*/, '');
const isSelfChat = myNumber && chatNumber === myNumber; const isSelfChat = myNumber && chatNumber === myNumber;
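
The `fromMe` handling above can be restated as a small decision function. The sketch below is a Python rendering of the JavaScript logic for illustration only. `should_process` and `bare_number` are hypothetical names, and because the hunk ends before `isSelfChat` is used, the final step (process the message only when it is the self-chat) is an assumption.

```python
import re


def bare_number(jid):
    """Strip device suffix and domain: '1555:12@s.whatsapp.net' -> '1555'."""
    without_device = re.sub(r":.*@", "@", jid or "", count=1)
    return re.sub(r"@.*", "", without_device, count=1)


def should_process(from_me, chat_id, own_jid, mode):
    """Decide whether a message passes the fromMe filter (hypothetical helper)."""
    if not from_me:
        # Inbound messages pass this filter; allowlist checks are not modelled here.
        return True
    # Own messages in groups and status updates are always skipped.
    if chat_id.endswith("@g.us") or "status" in chat_id:
        return False
    # Bot mode: every fromMe message is an echo of the bot's own reply.
    if mode == "bot":
        return False
    # Self-chat mode: accept only the user's own "Message Yourself" chat.
    my_number = bare_number(own_jid)
    chat_number = re.sub(r"@.*", "", chat_id, count=1)
    return bool(my_number) and chat_number == my_number
```

Writing the filter as a pure function makes the difference between the two modes explicit: in bot mode no `fromMe` message is ever processed, while self-chat mode admits exactly one chat.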
@ -270,7 +276,7 @@ if (PAIR_ONLY) {
startSocket(); startSocket();
} else { } else {
app.listen(PORT, () => { app.listen(PORT, () => {
console.log(`🌉 WhatsApp bridge listening on port ${PORT}`); console.log(`🌉 WhatsApp bridge listening on port ${PORT} (mode: ${WHATSAPP_MODE})`);
console.log(`📁 Session stored in: ${SESSION_DIR}`); console.log(`📁 Session stored in: ${SESSION_DIR}`);
if (ALLOWED_USERS.length > 0) { if (ALLOWED_USERS.length > 0) {
console.log(`🔒 Allowed users: ${ALLOWED_USERS.join(', ')}`); console.log(`🔒 Allowed users: ${ALLOWED_USERS.join(', ')}`);