diff --git a/architecture/agents.md b/architecture/agents.md
new file mode 100644
index 00000000..625d4ccf
--- /dev/null
+++ b/architecture/agents.md
@@ -0,0 +1,55 @@
+# Agents
+
+An agent can be viewed as an FSM that uses an LLM to generate inputs into a system operating over a DAG.
+
+What this really means is that an agent is just a function without memory that takes text inputs and produces text
+outputs in a defined order.
+
+```python
+def my_agent(*args, **kwargs) -> str:
+    # do whatever you want!
+    return "Hi I'm an agent!"
+```
+
+Now obviously, that's like saying water's wet, but we're going to use that definition to inform our design of the
+library: namely, that we should *not* store agent state outside the function call.
+
+## The Agent Class
+
+So if we don't have state, why are we using a class?
+
+Well, we want to initialize things, we want to have some configuration, and we want to have some helper functions,
+preferably all in a single place.
+
+```python
+class BaseAgent:
+    def agent_primitives(self) -> list[BaseAgent]:
+        # Returns a list of agents that this agent uses to generate inputs.
+        # We call these agent primitives rather than subagents because they become
+        # part of the message graph, not a subagent tool call.
+        raise NotImplementedError
+
+    def tools(self) -> list[BaseTool]:
+        # Returns a list of tools that the agent needs to run
+        raise NotImplementedError
+
+    def run(self, config, *args, **kwargs) -> ConversationGraph:
+        llm = get_llm(config)
+        # Merge this agent's tools with its primitives' tools, then dedupe
+        tools = self.tools()
+        for agent in self.agent_primitives():
+            tools.extend(agent.tools())
+        tools = remove_duplicates(tools)
+        tools = initialize_tools(tools, config)
+        return self(llm, tools, config, *args, **kwargs)
+
+    def __call__(self, llm, tools, config, *args, **kwargs) -> ConversationGraph:
+        # Returns a ConversationGraph that can be parsed to get the output of the agent.
+        # Use whatever args/kwargs you want, as long as llm/tools/config are satisfied.
+        raise NotImplementedError
+```
+
+Doesn't seem too bad (I hope). It is a bit annoying that we don't initialize everything in the constructor, but
+hopefully we all kinda like it :)
+
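+To make the contract concrete, here's a minimal sketch of a leaf subclass. `GreeterAgent` is made up for
+illustration, the `MessageGraph` from `message_graph.md` stands in for `ConversationGraph` (whose exact shape isn't
+pinned down yet), `config` is assumed to be dict-like, and the llm is assumed to expose an OpenAI-style
+`chat.completions.create`, per `llm_client.md`:
+
+```python
+class GreeterAgent(BaseAgent):
+    def agent_primitives(self) -> list[BaseAgent]:
+        return []  # leaf agent: no other agents feed its message graph
+
+    def tools(self) -> list[BaseTool]:
+        return []  # nothing stateful to keep alive
+
+    def __call__(self, llm, tools, config, *args, name: str = "world", **kwargs):
+        graph = MessageGraph()
+        graph.append({'role': 'user', 'content': f'Say hi to {name}'})
+        # list(graph) works because MessageGraph supports __getitem__/__len__
+        completion = llm.chat.completions.create(model=config['model'], messages=list(graph))
+        graph.append({'role': 'assistant', 'content': completion.choices[0].message.content})
+        return graph
+```
+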
diff --git a/architecture/llm_client.md b/architecture/llm_client.md
new file mode 100644
index 00000000..fe15d23a
--- /dev/null
+++ b/architecture/llm_client.md
@@ -0,0 +1,14 @@
+# LLM Client
+
+A quick wrapper over the OpenAI APIs.
+
+## Responsibilities
+
+- Transform "normal" chat/completions requests into graphs
+- Translate graphs into LLM requests
+- Keep a history of the graphs it has parsed
+  - On-policy data
+  - Deduplicating graphs, so previous history isn't kept as separate graphs
+
+## How to use
+
+Exactly the same as the OpenAI API, just with additional support for graph inputs and outputs.
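+
+As a sketch of what that drop-in compatibility could look like — `LLMClient` is an assumed name for the wrapper, and
+the call shape simply mirrors `openai.OpenAI().chat.completions.create`:
+
+```python
+llm = LLMClient(api_key="...")
+
+# A plain list of dicts behaves exactly like the OpenAI client...
+completion = llm.chat.completions.create(
+    model="gpt-4o",
+    messages=[{'role': 'user', 'content': 'hello'}],
+)
+
+# ...and a MessageGraph (see message_graph.md) is accepted in the same slot.
+# The client records each graph it sees, so a conversation that extends a
+# previous prefix is deduplicated rather than stored as a separate graph.
+graph = MessageGraph()
+graph.append({'role': 'user', 'content': 'hello'})
+completion = llm.chat.completions.create(model="gpt-4o", messages=graph)
+```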
diff --git a/architecture/message_graph.md b/architecture/message_graph.md
new file mode 100644
index 00000000..251a3a41
--- /dev/null
+++ b/architecture/message_graph.md
@@ -0,0 +1,114 @@
+# Message Graph
+
+```mermaid
+graph TD
+    %% Message nodes
+    SystemMsg["📋 System Message<br/>Role: System<br/>Content: Messages are nodes in a graph"]
+    UserMsg["👤 User Message<br/>Role: User<br/>Content: But messages aren't the only thing in the graph"]
+    subgraph PrevMessages["Previous Messages"]
+        PrevSystemMsg["📋 System Message<br/>Role: System<br/>Content: Edits are kept in the graph as context"]
+        PrevUserMsg["👤 User Message<br/>Role: User<br/>Content: So we can ensure they're immutable while keeping them editable"]
+    end
+
+    %% Chat Response as a subgraph
+    subgraph ChatResponseBox["💬 Chat Response"]
+        ChatMetadata["📊 Metadata<br/>Temp: 1.0<br/>..."]
+        ChatResponseText["📝 Response<br/>Hello, Here's a subagent call: <tool>subagent</tool>"]
+        ChatContent["Content: Hello, Here's a subagent call..."]
+    end
+
+    %% Tool Response as a subgraph
+    subgraph ToolResponseBox["🔧 Tool Response"]
+        subgraph ToolMetadata["📊 Tool Metadata"]
+            ToolMetadataLength["Length: 3"]
+            subgraph ToolChat["💭 Subagent Chat"]
+                SubagentSystem["📋 System<br/>Content: Subagent call received"]
+                SubagentUser["👤 User<br/>Content: Process this request"]
+                SubagentAssistant["🤖 Assistant<br/>Content: Processing..."]
+                SubagentSystem --> SubagentUser
+                SubagentUser --> SubagentAssistant
+            end
+        end
+        ToolContent["Content: Subagent call output"]
+    end
+
+    %% Graph flow connections
+    SystemMsg --> UserMsg
+    PrevSystemMsg --> PrevUserMsg
+    PrevMessages -.-> UserMsg
+    UserMsg --> ChatResponseBox
+    ChatResponseBox --> ToolResponseBox
+
+    class SystemMsg,UserMsg messageNode
+    class ChatResponseBox responseNode
+    class ToolResponseBox responseNode
+    class ChatMetadata,ChatResponseText,ChatContent,ToolMetadata,ToolChat,ToolContent,ToolMetadataLength metadataNode
+```
+
+Messages should be a graph (a DAG, specifically) of immutable elements.
+
+## Why immutable elements?
+
+We want to train on policy.
+- This means the context cannot change after we sample a response.
+
+## Why a graph?
+
+Nodes and connections are a natural way to represent the flow of information in an agent conversation.
+
+## Will this be annoying to deal with?
+
+It shouldn't be! The internals may get hairy, but the interface should be as simple as your normal context window
+edits, e.g. `message_history[2]['content'] = my_edit`. Internally we'll handle the recordkeeping and how this ends up
+parsing into on-policy training data, if requested.
+
+## Edges
+
+Edges are the connections between nodes, and there are two types we are concerned with:
+- **Sequential edges**: These represent the flow of conversation, connecting messages in the order they were sent. For example, a user message followed by an assistant response.
+- **Parallel edges**: These represent versioning, e.g. edit history, context squishing, etc.
+
+We only care about a parallel edge, however, when it breaks the prefix; any other parallel edges are ignored.
+
+## So what does this look like in practice?
+
+```python
+import copy
+
+
+class MessageGraph:
+    def __init__(self):
+        self.messages = []
+        self.prev_graph = None
+
+    def append(self, message):
+        self.messages.append(message)
+
+    def __getitem__(self, index):
+        return self.messages[index]
+
+    def __setitem__(self, key, value):
+        # Writing back an identical assistant message is a no-op
+        if (value['role'] == 'assistant') and (value['content'] == self.messages[key]['content']):
+            return
+        # If an assistant message sits at or after this index, the edit breaks
+        # the prefix, so snapshot the old graph as a parallel version first
+        if any(msg['role'] == 'assistant' for msg in self.messages[key:]):
+            self.prev_graph = copy.deepcopy(self)
+        self.messages[key] = value
+
+    def __len__(self):
+        return len(self.messages)
+
+    def __eq__(self, other):
+        # Two graphs are equal if their rendered transcripts match
+        return "\n\n".join(f"{msg['role']}: {msg['content']}" for msg in self) == "\n\n".join(
+            f"{msg['role']}: {msg['content']}" for msg in other)
+
+
+# in use
+messages = MessageGraph()
+messages.append({'role': 'system', 'content': 'Hello, I am a system message'})
+messages[0] = {'role': 'user', 'content': 'Hello, I am a user message'}
+```
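+
+To see the recordkeeping in action, here's a quick continuation of the example above (the values are arbitrary):
+
+```python
+messages = MessageGraph()
+messages.append({'role': 'user', 'content': 'What is 2 + 2?'})
+messages.append({'role': 'assistant', 'content': '4'})
+
+# Editing index 0 rewrites context that an assistant reply already depended
+# on, so __setitem__ snapshots the old graph first (a parallel edge)
+messages[0] = {'role': 'user', 'content': 'What is 3 + 3?'}
+assert messages.prev_graph is not None
+assert messages.prev_graph[0]['content'] == 'What is 2 + 2?'
+
+# Writing back the identical assistant content is recognized as a no-op,
+# so no new version is recorded
+before = messages.prev_graph
+messages[1] = {'role': 'assistant', 'content': '4'}
+assert messages.prev_graph is before
+```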
diff --git a/architecture/tools.md b/architecture/tools.md
new file mode 100644
index 00000000..b899c5eb
--- /dev/null
+++ b/architecture/tools.md
@@ -0,0 +1,16 @@
+# Tools
+
+Not much on this, yet. Tools are just a stateful wrapper around a function, so we can do things like:
+- Keep a docker container running
+- Keep a game online
+
+```python
+from typing import Any, Dict, List
+
+
+class BaseTool:
+    def definitions(self) -> List[Dict[str, Any]]:
+        # OpenAI API compatible tool definitions
+        raise NotImplementedError
+
+    def __call__(self, *args, **kwargs) -> Dict[str, Any]:
+        # Returns at minimum {'role': 'tool', 'content': '...'}
+        raise NotImplementedError
+```
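+
+As a sketch of why tools are objects rather than bare functions — `CounterTool` is made up, with a counter standing
+in for heavier state like a docker container or a game session:
+
+```python
+class CounterTool(BaseTool):
+    def __init__(self):
+        self.count = 0  # state that survives across calls
+
+    def definitions(self) -> List[Dict[str, Any]]:
+        # Standard OpenAI function-calling schema
+        return [{
+            'type': 'function',
+            'function': {
+                'name': 'increment',
+                'description': 'Increment and return a persistent counter',
+                'parameters': {'type': 'object', 'properties': {}},
+            },
+        }]
+
+    def __call__(self, *args, **kwargs) -> Dict[str, Any]:
+        self.count += 1
+        return {'role': 'tool', 'content': str(self.count)}
+```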
diff --git a/requirements.txt b/requirements.txt
index d0a1bb73..f9b9514d 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,8 @@
 firecrawl-py
 openai
 fal-client
+git+ssh://git@github.com/NousResearch/hecate.git
+tenacity
 python-dotenv
 fire
-httpx
\ No newline at end of file
+httpx