feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.
This commit is contained in:
teknium1 2026-03-16 05:48:45 -07:00
parent 7d2c786acc
commit c51e7b4af7
6 changed files with 252 additions and 6 deletions

View file

@ -742,3 +742,14 @@ display:
# tool_prefix: "╎" # Tool output line prefix (default: ┊) # tool_prefix: "╎" # Tool output line prefix (default: ┊)
# #
skin: default skin: default
# =============================================================================
# Privacy
# =============================================================================
# privacy:
# # Redact PII from the LLM context prompt.
# # When true, user and chat IDs (including phone numbers used as IDs)
# # are replaced with deterministic hashes before being sent to the model.
# # Names and usernames are NOT affected (user-chosen, publicly visible).
# # Routing/delivery still uses the original values internally.
# redact_pii: false

View file

@ -1452,8 +1452,17 @@ class GatewayRunner:
# Set environment variables for tools # Set environment variables for tools
self._set_session_env(context) self._set_session_env(context)
# Read privacy.redact_pii from config (re-read per message)
_redact_pii = False
try:
with open(_config_path, encoding="utf-8") as _pf:
_pcfg = yaml.safe_load(_pf) or {}
_redact_pii = bool((_pcfg.get("privacy") or {}).get("redact_pii", False))
except Exception:
pass
# Build the context prompt to inject # Build the context prompt to inject
context_prompt = build_session_context_prompt(context) context_prompt = build_session_context_prompt(context, redact_pii=_redact_pii)
# If the previous session expired and was auto-reset, prepend a notice # If the previous session expired and was auto-reset, prepend a notice
# so the agent knows this is a fresh conversation (not an intentional /reset). # so the agent knows this is a fresh conversation (not an intentional /reset).

View file

@ -8,9 +8,11 @@ Handles:
- Dynamic system prompt injection (agent knows its context) - Dynamic system prompt injection (agent knows its context)
""" """
import hashlib
import logging import logging
import os import os
import json import json
import re
import uuid import uuid
from pathlib import Path from pathlib import Path
from datetime import datetime, timedelta from datetime import datetime, timedelta
@ -19,6 +21,41 @@ from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# PII redaction helpers
# ---------------------------------------------------------------------------
_PHONE_RE = re.compile(r"^\+?\d[\d\-\s]{6,}$")
def _hash_id(value: str) -> str:
"""Deterministic 12-char hex hash of an identifier."""
return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
def _hash_sender_id(value: str) -> str:
"""Hash a sender ID to ``user_<12hex>``."""
return f"user_{_hash_id(value)}"
def _hash_chat_id(value: str) -> str:
"""Hash the numeric portion of a chat ID, preserving platform prefix.
``telegram:12345`` ``telegram:<hash>``
``12345`` ``<hash>``
"""
colon = value.find(":")
if colon > 0:
prefix = value[:colon]
return f"{prefix}:{_hash_id(value[colon + 1:])}"
return _hash_id(value)
def _looks_like_phone(value: str) -> bool:
"""Return True if *value* looks like a phone number (E.164 or similar)."""
return bool(_PHONE_RE.match(value.strip()))
from .config import ( from .config import (
Platform, Platform,
GatewayConfig, GatewayConfig,
@ -146,7 +183,11 @@ class SessionContext:
} }
def build_session_context_prompt(context: SessionContext) -> str: def build_session_context_prompt(
context: SessionContext,
*,
redact_pii: bool = False,
) -> str:
""" """
Build the dynamic system prompt section that tells the agent about its context. Build the dynamic system prompt section that tells the agent about its context.
@ -154,6 +195,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
- Where messages are coming from - Where messages are coming from
- What platforms are connected - What platforms are connected
- Where it can deliver scheduled task outputs - Where it can deliver scheduled task outputs
When *redact_pii* is True, user and chat IDs (including phone numbers
used as IDs) are replaced with deterministic hashes before being sent to the LLM.
Routing still uses the original values (they stay in SessionSource).
""" """
lines = [ lines = [
"## Current Session Context", "## Current Session Context",
@ -165,7 +210,25 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.platform == Platform.LOCAL: if context.source.platform == Platform.LOCAL:
lines.append(f"**Source:** {platform_name} (the machine running this agent)") lines.append(f"**Source:** {platform_name} (the machine running this agent)")
else: else:
lines.append(f"**Source:** {platform_name} ({context.source.description})") # Build a description that respects PII redaction
src = context.source
if redact_pii:
# Build a safe description without raw IDs
_uname = src.user_name or (
_hash_sender_id(src.user_id) if src.user_id else "user"
)
_cname = src.chat_name or _hash_chat_id(src.chat_id)
if src.chat_type == "dm":
desc = f"DM with {_uname}"
elif src.chat_type == "group":
desc = f"group: {_cname}"
elif src.chat_type == "channel":
desc = f"channel: {_cname}"
else:
desc = _cname
else:
desc = src.description
lines.append(f"**Source:** {platform_name} ({desc})")
# Channel topic (if available - provides context about the channel's purpose) # Channel topic (if available - provides context about the channel's purpose)
if context.source.chat_topic: if context.source.chat_topic:
@ -175,7 +238,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.user_name: if context.source.user_name:
lines.append(f"**User:** {context.source.user_name}") lines.append(f"**User:** {context.source.user_name}")
elif context.source.user_id: elif context.source.user_id:
lines.append(f"**User ID:** {context.source.user_id}") uid = context.source.user_id
if redact_pii:
uid = _hash_sender_id(uid)
lines.append(f"**User ID:** {uid}")
# Platform-specific behavioral notes # Platform-specific behavioral notes
if context.source.platform == Platform.SLACK: if context.source.platform == Platform.SLACK:
@ -210,7 +276,8 @@ def build_session_context_prompt(context: SessionContext) -> str:
lines.append("") lines.append("")
lines.append("**Home Channels (default destinations):**") lines.append("**Home Channels (default destinations):**")
for platform, home in context.home_channels.items(): for platform, home in context.home_channels.items():
lines.append(f" - {platform.value}: {home.name} (ID: {home.chat_id})") hc_id = _hash_chat_id(home.chat_id) if redact_pii else home.chat_id
lines.append(f" - {platform.value}: {home.name} (ID: {hc_id})")
# Delivery options for scheduled tasks # Delivery options for scheduled tasks
lines.append("") lines.append("")
@ -220,7 +287,10 @@ def build_session_context_prompt(context: SessionContext) -> str:
if context.source.platform == Platform.LOCAL: if context.source.platform == Platform.LOCAL:
lines.append("- `\"origin\"` → Local output (saved to files)") lines.append("- `\"origin\"` → Local output (saved to files)")
else: else:
lines.append(f"- `\"origin\"` → Back to this chat ({context.source.chat_name or context.source.chat_id})") _origin_label = context.source.chat_name or (
_hash_chat_id(context.source.chat_id) if redact_pii else context.source.chat_id
)
lines.append(f"- `\"origin\"` → Back to this chat ({_origin_label})")
# Local always available # Local always available
lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)") lines.append("- `\"local\"` → Save to local files only (~/.hermes/cron/output/)")

View file

@ -207,6 +207,11 @@ DEFAULT_CONFIG = {
"show_reasoning": False, "show_reasoning": False,
"skin": "default", "skin": "default",
}, },
# Privacy settings
"privacy": {
"redact_pii": False, # When True, hash user IDs and strip phone numbers from LLM context
},
# Text-to-speech configuration # Text-to-speech configuration
"tts": { "tts": {

View file

@ -0,0 +1,132 @@
"""Tests for PII redaction in gateway session context prompts."""
from gateway.session import (
SessionContext,
SessionSource,
build_session_context_prompt,
_hash_id,
_hash_sender_id,
_hash_chat_id,
_looks_like_phone,
)
from gateway.config import Platform, HomeChannel
# ---------------------------------------------------------------------------
# Low-level helpers
# ---------------------------------------------------------------------------
class TestHashHelpers:
    """Unit tests for the low-level hashing/detection helpers."""

    def test_hash_id_deterministic(self):
        first, second = _hash_id("12345"), _hash_id("12345")
        assert first == second

    def test_hash_id_12_hex_chars(self):
        digest = _hash_id("user-abc")
        assert len(digest) == 12
        assert set(digest) <= set("0123456789abcdef")

    def test_hash_sender_id_prefix(self):
        hashed = _hash_sender_id("12345")
        assert hashed.startswith("user_")
        assert len(hashed) == 17  # "user_" + 12 hex chars

    def test_hash_chat_id_preserves_prefix(self):
        hashed = _hash_chat_id("telegram:12345")
        assert hashed.startswith("telegram:")
        assert "12345" not in hashed

    def test_hash_chat_id_no_prefix(self):
        hashed = _hash_chat_id("12345")
        assert len(hashed) == 12
        assert "12345" not in hashed

    def test_looks_like_phone(self):
        for candidate in ("+15551234567", "15551234567", "+1-555-123-4567"):
            assert _looks_like_phone(candidate)
        for candidate in ("alice", "user-123", ""):
            assert not _looks_like_phone(candidate)
# ---------------------------------------------------------------------------
# Integration: build_session_context_prompt
# ---------------------------------------------------------------------------
def _make_context(
    user_id="user-123",
    user_name=None,
    chat_id="telegram:99999",
    platform=Platform.TELEGRAM,
    home_channels=None,
):
    """Build a minimal DM SessionContext for the redaction tests."""
    return SessionContext(
        source=SessionSource(
            platform=platform,
            chat_id=chat_id,
            chat_type="dm",
            user_id=user_id,
            user_name=user_name,
        ),
        connected_platforms=[platform],
        home_channels=home_channels or {},
    )
class TestBuildSessionContextPromptRedaction:
    """End-to-end checks of PII redaction in build_session_context_prompt."""

    def test_no_redaction_by_default(self):
        prompt = build_session_context_prompt(_make_context(user_id="user-123"))
        assert "user-123" in prompt

    def test_user_id_hashed_when_redact_pii(self):
        prompt = build_session_context_prompt(
            _make_context(user_id="user-123"), redact_pii=True
        )
        assert "user-123" not in prompt
        assert "user_" in prompt  # hashed ID present

    def test_user_name_not_redacted(self):
        prompt = build_session_context_prompt(
            _make_context(user_id="user-123", user_name="Alice"), redact_pii=True
        )
        assert "Alice" in prompt
        # Name takes priority over the ID, so the raw user_id never appears.
        assert "user-123" not in prompt

    def test_home_channel_id_hashed(self):
        channels = {
            Platform.TELEGRAM: HomeChannel(
                platform=Platform.TELEGRAM,
                chat_id="telegram:99999",
                name="Home Chat",
            )
        }
        prompt = build_session_context_prompt(
            _make_context(home_channels=channels), redact_pii=True
        )
        assert "99999" not in prompt
        assert "telegram:" in prompt  # platform prefix survives
        assert "Home Chat" in prompt  # display name is never redacted

    def test_home_channel_id_preserved_without_redaction(self):
        channels = {
            Platform.TELEGRAM: HomeChannel(
                platform=Platform.TELEGRAM,
                chat_id="telegram:99999",
                name="Home Chat",
            )
        }
        prompt = build_session_context_prompt(
            _make_context(home_channels=channels), redact_pii=False
        )
        assert "99999" in prompt

    def test_redaction_is_deterministic(self):
        ctx = _make_context(user_id="+15551234567")
        outputs = {
            build_session_context_prompt(ctx, redact_pii=True) for _ in range(2)
        }
        assert len(outputs) == 1

    def test_different_ids_produce_different_hashes(self):
        prompts = [
            build_session_context_prompt(_make_context(user_id=uid), redact_pii=True)
            for uid in ("user-A", "user-B")
        ]
        assert prompts[0] != prompts[1]

View file

@ -832,6 +832,25 @@ display:
| `all` | Every tool call with a short preview (default) | | `all` | Every tool call with a short preview (default) |
| `verbose` | Full args, results, and debug logs | | `verbose` | Full args, results, and debug logs |
## Privacy
```yaml
privacy:
  redact_pii: false # Redact PII (hash user/chat IDs) in LLM context (gateway only)
```
When `redact_pii` is `true`, the gateway redacts personally identifiable information from the system prompt before sending it to the LLM:
| Field | Treatment |
|-------|-----------|
| Phone numbers (user ID on WhatsApp/Signal) | Hashed to `user_<12-char-sha256>` |
| User IDs | Hashed to `user_<12-char-sha256>` |
| Chat IDs | Numeric portion hashed, platform prefix preserved (`telegram:<hash>`) |
| Home channel IDs | Numeric portion hashed |
| User names / usernames | **Not affected** (user-chosen, publicly visible) |
Hashes are deterministic — the same user always maps to the same hash, so the model can still distinguish between users in group chats. Routing and delivery use the original values internally.
## Speech-to-Text (STT) ## Speech-to-Text (STT)
```yaml ```yaml