Skip to content

Auditor Tools

This page documents the tools available to auditor agents for interacting with and manipulating target models during alignment evaluations.

Tool Collection

default_auditor_tools

petri.tools.tools.default_auditor_tools

default_auditor_tools(instance: str | None = None, prefill: bool = True) -> list[Tool]

Create the default set of tools available to the auditor agent.

This function instantiates all the standard auditing tools with the specified instance identifier. These tools enable the auditor to interact with and manipulate the target model during alignment evaluations.

Parameters:

Name Type Description Default
instance str | None

Optional instance identifier for tool isolation

None
prefill bool

Whether to allow prefill for send_message and send_tool_call_result

True

Returns:

Type Description
list[Tool]

list[Tool]: List of instantiated auditor tools including: - create_target_tool: Create synthetic tools for the target - set_target_system_message: Configure target's system prompt - send_message: Send user messages to the target - send_tool_call_result: Respond to target's tool calls - rollback_conversation: Undo conversation turns - end_conversation: Terminate the audit session

Individual Tools

create_target_tool

petri.tools.tools.create_target_tool

create_target_tool(instance: str | None = None) -> Tool

set_target_system_message

petri.tools.tools.set_target_system_message

set_target_system_message(instance: str | None = None) -> Tool

send_message

petri.tools.tools.send_message

send_message(prefill: bool = True, instance: str | None = None) -> Tool

Creates a tool that can send a message to the target model.

Parameters:

Name Type Description Default
prefill bool

Whether to allow the auditor to prefill the target model's response.

True
instance str | None

The instance of the auditor store to read and write messages to.

None

Returns:

Name Type Description
Tool Tool

A tool that can send a message to the target model.

send_tool_call_result

petri.tools.tools.send_tool_call_result

send_tool_call_result(prefill: bool = True, instance: str | None = None) -> Tool

rollback_conversation

petri.tools.tools.rollback_conversation

rollback_conversation(instance: str | None = None) -> Tool

end_conversation

petri.tools.tools.end_conversation

end_conversation() -> Tool