feat: simple fallback model for provider resilience

When the primary model/provider fails after retries (rate limit, overload, auth errors, connection failures), Hermes automatically switches to a configured fallback model for the remainder of the session. Config (in ~/.hermes/config.yaml): fallback_model: provider: openrouter model: anthropic/claude-sonnet-4 Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together, Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and api_key_env overrides. Design principles: - Dead simple: one fallback model, not a chain - One-shot: switches once, doesn't ping-pong back - Zero new dependencies: uses existing OpenAI client - Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway - Three trigger points: max retries exhausted, non-retryable client errors, and invalid response exhaustion Does NOT trigger on context overflow or payload-too-large errors (those are handled by the existing compression system). Addresses #737. 25 new tests, 2492 total passing.
2026-03-08 20:22:33 -07:00 · 2026-03-08 20:22:33 -07:00 · 161436cfdd
commit 161436cfdd
parent 4d7d9d9715
6 changed files with 410 additions and 0 deletions
--- a/cli.py
+++ b/cli.py
@ -1118,6 +1118,10 @@ class HermesCLI:
        self._provider_require_params = pr.get("require_parameters", False)
        self._provider_data_collection = pr.get("data_collection")
        
+        # Fallback model config — tried when primary provider fails after retries
+        fb = CLI_CONFIG.get("fallback_model") or {}
+        self._fallback_model = fb if fb.get("provider") and fb.get("model") else None
+
        # Agent will be initialized on first use
        self.agent: Optional[AIAgent] = None
        self._app = None  # prompt_toolkit Application (set in run())
@ -1349,6 +1353,7 @@ class HermesCLI:
                session_db=self._session_db,
                clarify_callback=self._clarify_callback,
                honcho_session_key=self.session_id,
+                fallback_model=self._fallback_model,
            )
            # Apply any pending title now that the session exists in the DB
            if self._pending_title and self._session_db: