feat: implement code execution sandbox for programmatic tool calling

- Introduced a new `execute_code` tool that lets the agent run Python scripts that call Hermes tools via RPC, reducing the number of LLM round trips needed for multi-step tool use.
- Added configuration options for timeout and maximum tool calls in the sandbox environment.
- Updated the toolset definitions: added a standalone `code_execution` toolset and registered `execute_code` in the existing CLI and messaging-platform toolsets.
- Implemented comprehensive tests for the code execution sandbox, covering various scenarios including tool call limits and error handling.
- Enhanced the CLI and documentation to reflect the new functionality, providing users with clear guidance on using the code execution tool.
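
The timeout and tool-call limits described above could be enforced roughly as follows. This is a minimal sketch, not the code from this commit: `SandboxLimits`, `LimitedToolDispatcher`, the two exception classes, and the default values are all hypothetical names and placeholders, and the RPC transport between the script and the agent is left out entirely.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict


class ToolCallLimitExceeded(RuntimeError):
    """Raised when a script asks for more tool calls than its budget allows."""


class SandboxTimeout(RuntimeError):
    """Raised when a script makes a tool call after its deadline has passed."""


@dataclass
class SandboxLimits:
    # Placeholder values for illustration; NOT the defaults used by the commit.
    timeout_seconds: float = 300.0
    max_tool_calls: int = 50


class LimitedToolDispatcher:
    """Hypothetical dispatcher that checks both limits before every tool call."""

    def __init__(
        self,
        handlers: Dict[str, Callable[..., Any]],
        limits: SandboxLimits,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._handlers = handlers
        self._limits = limits
        self._clock = clock
        # The deadline is fixed once, when the sandbox session starts.
        self._deadline = clock() + limits.timeout_seconds
        self.calls_made = 0

    def call(self, tool_name: str, **kwargs: Any) -> Any:
        if self._clock() > self._deadline:
            raise SandboxTimeout(
                f"script exceeded {self._limits.timeout_seconds}s"
            )
        if self.calls_made >= self._limits.max_tool_calls:
            raise ToolCallLimitExceeded(
                f"script exceeded {self._limits.max_tool_calls} tool calls"
            )
        if tool_name not in self._handlers:
            raise KeyError(f"unknown tool: {tool_name}")
        # Count the call before running it, so failed calls also use budget.
        self.calls_made += 1
        return self._handlers[tool_name](**kwargs)
```

With a budget of two calls, a third `dispatcher.call(...)` raises `ToolCallLimitExceeded` instead of reaching the tool, so a runaway loop in a script fails fast rather than consuming resources until the timeout.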
teknium1 2026-02-19 23:23:43 -08:00
parent 748f0b2b5f
commit 783acd712d
10 changed files with 1598 additions and 18 deletions


@@ -138,6 +138,12 @@ TOOLSETS = {
"includes": []
},
"code_execution": {
"description": "Run Python scripts that call tools programmatically (reduces LLM round trips)",
"tools": ["execute_code"],
"includes": []
},
# Scenario-specific toolsets
@@ -189,6 +195,8 @@ TOOLSETS = {
"session_search",
# Clarifying questions
"clarify",
# Code execution sandbox (programmatic tool calling)
"execute_code",
# Cronjob management (CLI-only)
"schedule_cronjob", "list_cronjobs", "remove_cronjob"
],
@@ -227,6 +235,8 @@ TOOLSETS = {
"memory",
# Session history search
"session_search",
# Code execution sandbox (programmatic tool calling)
"execute_code",
# Cronjob management - let users schedule tasks
"schedule_cronjob", "list_cronjobs", "remove_cronjob",
# Cross-channel messaging
@@ -263,6 +273,8 @@ TOOLSETS = {
"memory",
# Session history search
"session_search",
# Code execution sandbox (programmatic tool calling)
"execute_code",
# Cronjob management - let users schedule tasks
"schedule_cronjob", "list_cronjobs", "remove_cronjob",
# Cross-channel messaging
@@ -299,6 +311,8 @@ TOOLSETS = {
"memory",
# Session history search
"session_search",
# Code execution sandbox (programmatic tool calling)
"execute_code",
# Cronjob management
"schedule_cronjob", "list_cronjobs", "remove_cronjob",
# Cross-channel messaging
@@ -335,6 +349,8 @@ TOOLSETS = {
"memory",
# Session history search
"session_search",
# Code execution sandbox (programmatic tool calling)
"execute_code",
# Cronjob management - let users schedule tasks
"schedule_cronjob", "list_cronjobs", "remove_cronjob",
# Cross-channel messaging