Auditor Tools¶
This page documents the tools available to auditor agents for interacting with and manipulating target models during alignment evaluations.
Tool Collection¶
default_auditor_tools¶
petri.tools.tools.default_auditor_tools ¶
Create the default set of tools available to the auditor agent.
This function instantiates all the standard auditing tools with the specified instance identifier. These tools enable the auditor to interact with and manipulate the target model during alignment evaluations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
instance
|
str | None
|
Optional instance identifier for tool isolation |
None
|
prefill
|
bool
|
Whether to allow prefill for send_message and send_tool_call_result |
True
|
Returns:
| Type | Description |
|---|---|
list[Tool]
|
list[Tool]: List of instantiated auditor tools including: - create_target_tool: Create synthetic tools for the target - set_target_system_message: Configure target's system prompt - send_message: Send user messages to the target - send_tool_call_result: Respond to target's tool calls - rollback_conversation: Undo conversation turns - end_conversation: Terminate the audit session |
Individual Tools¶
create_target_tool¶
set_target_system_message¶
petri.tools.tools.set_target_system_message ¶
send_message¶
petri.tools.tools.send_message ¶
Creates a tool that can send a message to the target model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
prefill
|
bool
|
Whether to allow the auditor to prefill the target model's response. |
True
|
instance
|
str | None
|
The instance of the auditor store to read and write messages to. |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
Tool |
Tool
|
A tool that can send a message to the target model. |