Add support for Atropos Agentic RL environments (requires branch tool_call_support in Atropos atm)

- Added new environments for reinforcement learning, including `HermesSweEnv` for software engineering tasks and `TerminalTestEnv` for inline testing.
- Introduced `ToolContext` for unrestricted access to tools during reward computation.
- Updated `.gitignore` to exclude `wandb/` directory.
- Enhanced `README.md` with detailed architecture and usage instructions for Atropos environments.
- Added configuration files for SWE and terminal test environments to streamline setup.
- Removed unnecessary compiled Python files from `__pycache__`.
This commit is contained in:
teknium 2026-02-07 09:17:16 +00:00
parent ac79725923
commit 07b615e96e
30 changed files with 2851 additions and 965 deletions

View file

@ -0,0 +1,19 @@
"""
Qwen 2.5 tool call parser.
Uses the same <tool_call> format as Hermes.
Registered as a separate parser name for clarity when using --tool-parser=qwen.
"""
from environments.tool_call_parsers import register_parser
from environments.tool_call_parsers.hermes_parser import HermesToolCallParser
@register_parser("qwen")
class QwenToolCallParser(HermesToolCallParser):
"""
Parser for Qwen 2.5 tool calls.
Same <tool_call>{"name": ..., "arguments": ...}</tool_call> format as Hermes.
"""
pass # Identical format -- inherits everything from Hermes