Automate evaluation of instruction adherence and task completion for your traced Pirate agent.
Pirate Example
    from divi import obs_openai, observable
    from divi.evaluation import Score
    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv()


    class Pirate:
        def __init__(self):
            self.client = obs_openai(
                OpenAI(),
                name="Pirate",
                scores=[Score.instruction_adherence, Score.task_completion],
            )

        @observable(name="Talk with pirate")
        def talk(self, message: str):
            """Talk like a pirate."""
            res = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[
                    {"role": "developer", "content": "Talk like a pirate."},
                    {
                        "role": "user",
                        "content": message,
                    },
                ],
            )
            return res.choices[0].message.content


    pirate = Pirate()
    pirate.talk("How do I check if a Python object is an instance of a class?")
By default, api_key and base_url are read from environment variables, and the other options use the defaults shown in the table above. You can customize the settings as follows:
Pirate Example
    from divi import obs_openai, observable
    from divi.evaluation import EvaluatorConfig, Score
    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv()


    class Pirate:
        def __init__(self):
            self.client = obs_openai(
                OpenAI(),
                name="Pirate",
                scores=[Score.instruction_adherence, Score.task_completion],
                eval=EvaluatorConfig(
                    model="gpt-4.1",
                    n_rounds=2,
                ),
            )

        @observable(name="Talk with pirate")
        def talk(self, message: str):
            """Talk like a pirate."""
            res = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[
                    {"role": "developer", "content": "Talk like a pirate."},
                    {
                        "role": "user",
                        "content": message,
                    },
                ],
            )
            return res.choices[0].message.content


    pirate = Pirate()
    pirate.talk("How do I check if a Python object is an instance of a class?")
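Because the defaults come from environment variables, a missing variable may only surface once an evaluation runs. The following is a minimal pre-flight sketch, not part of the library: the helper missing_env_vars is hypothetical, and the variable name OPENAI_API_KEY is an assumption (it is the name the OpenAI Python SDK reads; the evaluator may use a different one).

```python
import os
from typing import Iterable, Mapping


def missing_env_vars(
    names: Iterable[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not env.get(name)]


# Assumption: the evaluator reads the same variable as the OpenAI SDK.
REQUIRED_VARS = ["OPENAI_API_KEY"]

if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Call it after load_dotenv() and before constructing the client, so values from a .env file are taken into account.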
The current evaluation functionality depends on OpenAI’s structured output. Please ensure that your chosen model supports this feature. We strongly recommend using gpt-4o or newer models to ensure evaluation effectiveness and compatibility.
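To illustrate why structured output matters here: an evaluator has to turn the judge model's reply into fixed fields, and a free-form reply cannot be parsed reliably. The sketch below uses only the standard library. The Judgment fields (score, reasoning) and the parse_judgment helper are hypothetical and do not reflect the library's actual schema.

```python
import json
from dataclasses import dataclass


@dataclass
class Judgment:
    """Hypothetical shape of one evaluation result (illustrative only)."""

    score: float
    reasoning: str


def parse_judgment(raw: str) -> Judgment:
    """Parse a judge reply, raising ValueError if it is not the expected JSON."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    score = data.get("score")
    reasoning = data.get("reasoning")
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("'score' must be a number")
    if not isinstance(reasoning, str):
        raise ValueError("'reasoning' must be a string")
    return Judgment(score=float(score), reasoning=reasoning)
```

A model with structured output support returns a reply that passes this kind of check every time, whereas a model without it may answer in prose and fail at the json.loads step.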