Auditor Tools
This page documents the tools available to auditor agents for interacting with and manipulating target models during alignment evaluations.
Tool Collection
default_auditor_tools
petri.tools.tools.default_auditor_tools
default_auditor_tools(instance: str | None = None, prefill: bool = True, include_resource_tools: bool = False, resource_paths: list[str] | None = None) -> list[Tool]
Create the default set of tools available to the auditor agent.
This function instantiates all the standard auditing tools with the specified instance identifier. These tools enable the auditor to interact with and manipulate the target model during alignment evaluations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `instance` | `str \| None` | Optional instance identifier for tool isolation | `None` |
| `prefill` | `bool` | Whether to allow prefill for send_message and send_tool_call_result | `True` |
| `include_resource_tools` | `bool` | Whether to include list_directory and read_file tools for accessing reference materials | `False` |
| `resource_paths` | `list[str] \| None` | Optional list of additional directories the auditor can access. By default, only the package's resources directory is accessible. | `None` |
Returns:
| Type | Description |
|---|---|
| `list[Tool]` | List of instantiated auditor tools, including the following: |

- create_target_tool: Create synthetic tools for the target
- remove_target_tool: Remove previously created synthetic tools
- set_target_system_message: Configure target's system prompt
- send_message: Send user messages to the target
- send_tool_call_result: Respond to target's tool calls
- rollback_conversation: Undo conversation turns
- end_conversation: Terminate the audit session
- list_directory: Browse available reference materials (if enabled)
- read_file: Read reference materials (if enabled)
- send_file_as_tool_result: Send file contents as tool result (if enabled)
- search_in_files: Search resources and send results as tool result (if enabled)
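The effect of include_resource_tools on the returned list can be sketched without petri installed. The helper below is a hypothetical stand-in, not petri code: `expected_tool_names`, `CORE_TOOL_NAMES`, and `RESOURCE_TOOL_NAMES` are names invented for illustration, and it assumes that all four tools marked "(if enabled)" are gated by include_resource_tools.

```python
# Illustrative stand-in, not petri code: models which tool names the
# documentation says default_auditor_tools returns.
CORE_TOOL_NAMES = [
    "create_target_tool",
    "remove_target_tool",
    "set_target_system_message",
    "send_message",
    "send_tool_call_result",
    "rollback_conversation",
    "end_conversation",
]

# Assumption: every "(if enabled)" tool is controlled by include_resource_tools.
RESOURCE_TOOL_NAMES = [
    "list_directory",
    "read_file",
    "send_file_as_tool_result",
    "search_in_files",
]


def expected_tool_names(include_resource_tools: bool = False) -> list[str]:
    """Return the documented tool names for a given flag value (hypothetical helper)."""
    names = list(CORE_TOOL_NAMES)
    if include_resource_tools:
        names.extend(RESOURCE_TOOL_NAMES)
    return names


if __name__ == "__main__":
    print(expected_tool_names())
    print(expected_tool_names(include_resource_tools=True))
    # With petri installed, the documented call would look like:
    #   from petri.tools.tools import default_auditor_tools
    #   tools = default_auditor_tools(instance="audit_1", include_resource_tools=True)
```

The commented call at the end uses only the module path and parameters shown in the signature above; the instance value "audit_1" is a placeholder.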
Individual Tools
create_target_tool
set_target_system_message
petri.tools.tools.set_target_system_message
send_message
petri.tools.tools.send_message
Creates a tool that can send a message to the target model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prefill` | `bool` | Whether to allow the auditor to prefill the target model's response. | `True` |
| `instance` | `str \| None` | The instance of the auditor store to read and write messages to. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Tool` | `Tool` | A tool that can send a message to the target model. |