Utilities

This page documents utility functions and helper classes used throughout Petri.

Transcript Loading

load_transcripts_from_directory

petri.utils.load_transcripts_from_directory

load_transcripts_from_directory(directory_path: str, recursive: bool = False) -> List[Transcript]

Load all transcript JSON files from a directory.

This function scans a directory for JSON files and attempts to parse them as Transcript objects.

Parameters:

Name	Type	Description	Default
`directory_path`	`str`	Path to the directory containing transcript files	required
`recursive`	`bool`	Whether to search subdirectories recursively	`False`

Returns:

Type	Description
`List[Transcript]`	List of successfully loaded Transcript objects

Note

Files that fail to load are logged as errors but don't stop the loading process for other files.
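The effect of the `recursive` flag can be illustrated with the standard library. This is a minimal sketch, not Petri's implementation: `find_json_files` is a hypothetical helper, and matching on a `.json` extension is an assumption.

```python
from pathlib import Path
from typing import List


def find_json_files(directory_path: str, recursive: bool = False) -> List[Path]:
    """Return JSON files under directory_path (hypothetical helper, not part of Petri)."""
    root = Path(directory_path)
    # rglob descends into subdirectories; glob only looks at the top level.
    matches = root.rglob("*.json") if recursive else root.glob("*.json")
    # Sort for a deterministic order and skip directories that happen to match the pattern.
    return sorted(p for p in matches if p.is_file())
```

With `recursive=False`, a file such as `sub/b.json` is ignored; with `recursive=True`, it is included alongside the top-level files.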

load_transcripts_from_paths

petri.utils.load_transcripts_from_paths

load_transcripts_from_paths(paths: List[Path]) -> List[Transcript]

Load transcripts from a list of file paths.

This function takes a list of specific file paths and attempts to load each one as a Transcript object. Errors are logged but don't prevent loading of other files.

Parameters:

Name	Type	Description	Default
`paths`	`List[Path]`	List of Path objects pointing to transcript JSON files	required

Returns:

Type	Description
`List[Transcript]`	List of successfully loaded Transcript objects
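The log-and-continue behavior described above might look roughly like the following. The sketch assumes nothing about Petri's internals: `load_json_documents` is a hypothetical stand-in that parses plain JSON instead of building `Transcript` objects.

```python
import json
import logging
from pathlib import Path
from typing import Any, List

logger = logging.getLogger(__name__)


def load_json_documents(paths: List[Path]) -> List[Any]:
    """Parse each path as JSON, logging failures instead of raising (hypothetical stand-in)."""
    loaded = []
    for path in paths:
        try:
            loaded.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, ValueError) as exc:
            # json.JSONDecodeError subclasses ValueError; OSError covers missing or unreadable files.
            logger.error("Failed to load %s: %s", path, exc)
    return loaded
```

Because failures are skipped, the returned list can be shorter than the input list, so comparing the two lengths is a cheap way to notice dropped files.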

Text Processing

extract_xml_tags

petri.utils.extract_xml_tags

extract_xml_tags(text: str, tags: List[str], required_tags: Optional[List[str]] = None) -> Dict[str, str]

Extract content from XML-style tags in text.

Parameters:

Name	Type	Description	Default
`text`	`str`	The text to extract tags from	required
`tags`	`List[str]`	List of tag names to extract (without < >)	required
`required_tags`	`Optional[List[str]]`	List of tag names that must be present (subset of tags)	`None`

Returns:

Type	Description
`Dict[str, str]`	Dictionary mapping tag names to their content (stripped of whitespace). Tags that are not found map to an empty string.

Raises:

Type	Description
`ValueError`	If any required tags are missing or empty

Example

text = "<summary>Hello world</summary><score>5</score>"
result = extract_xml_tags(text, ["summary", "score"], required_tags=["summary"])
# Returns: {"summary": "Hello world", "score": "5"}
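To make the contract concrete, here is a minimal regex-based sketch of the documented behavior. It is an assumption about how such a function could work, not Petri's actual implementation: `extract_tags_sketch` is a hypothetical name, and the sketch only handles the first occurrence of each tag, without attributes.

```python
import re
from typing import Dict, List, Optional


def extract_tags_sketch(
    text: str, tags: List[str], required_tags: Optional[List[str]] = None
) -> Dict[str, str]:
    """Illustrative stand-in for the documented contract (not Petri's code)."""
    result: Dict[str, str] = {}
    for tag in tags:
        # DOTALL lets content span multiple lines; the non-greedy match stops at the first closing tag.
        match = re.search(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", text, re.DOTALL)
        result[tag] = match.group(1).strip() if match else ""
    # A required tag counts as missing when it is absent or empty after stripping.
    missing = [tag for tag in (required_tags or []) if not result.get(tag)]
    if missing:
        raise ValueError(f"Missing or empty required tags: {missing}")
    return result


text = "<summary>\n  Hello world\n</summary><score>5</score>"
result = extract_tags_sketch(text, ["summary", "score", "notes"], required_tags=["summary"])
print(result)  # {'summary': 'Hello world', 'score': '5', 'notes': ''}
```

Note that the optional `notes` tag is absent from the text and comes back as an empty string, while a missing `summary` would raise `ValueError`.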

extract_xml_tag

petri.utils.extract_xml_tag

extract_xml_tag(text: str, tag: str, required: bool = False) -> str

Extract content from a single XML-style tag in text.

Parameters:

Name	Type	Description	Default
`text`	`str`	The text to extract the tag from	required
`tag`	`str`	Tag name to extract (without < >)	required
`required`	`bool`	Whether the tag must be present and non-empty	`False`

Returns:

Type	Description
`str`	The content of the tag (stripped of whitespace), or empty string if not found.

Raises:

Type	Description
`ValueError`	If required=True and the tag is missing or empty

Example

text = "<summary>Hello world</summary>"
result = extract_xml_tag(text, "summary")
# Returns: "Hello world"
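The `required` flag changes how callers handle absent tags. The sketch below uses a hypothetical stand-in, `extract_tag_sketch`, written only to mirror the documented contract. It is not Petri's code, and the fallback value `"n/a"` is purely illustrative.

```python
import re


def extract_tag_sketch(text: str, tag: str, required: bool = False) -> str:
    """Illustrative stand-in for the documented single-tag contract."""
    match = re.search(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", text, re.DOTALL)
    content = match.group(1).strip() if match else ""
    if required and not content:
        raise ValueError(f"Required tag <{tag}> is missing or empty")
    return content


response = "<summary>Hello world</summary>"
# With required=True, a missing or empty tag raises instead of returning "".
summary = extract_tag_sketch(response, "summary", required=True)
# An optional tag that is absent yields an empty string, so callers can supply a fallback.
score = extract_tag_sketch(response, "score") or "n/a"
```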

reformat_antml_tags

petri.utils.reformat_antml_tags

reformat_antml_tags(content: str | list[str | dict]) -> str | list[str | dict]

Reformat content to use antml: prefixed tags instead of placeholder tags.

Parameters:

Name	Type	Description	Default
`content`	`str \| list[str \| dict]`	String content or list of content blocks to reformat	required

Returns:

Type	Description
`str \| list[str \| dict]`	Content with reformatted tags

Metadata Classes

SampleMetadata

petri.utils.SampleMetadata

Bases: BaseModel

Metadata for evaluation samples in alignment audit tasks.

This class defines metadata that can be attached to evaluation samples to control transcript saving and categorization during audit evaluations.

Attributes:

Name	Type	Description
`save_dir`	`str \| None`	Directory path where transcripts should be saved. If None, transcripts will not be saved to disk.
`tags`	`list[str] \| str \| None`	Tags for categorizing the sample (e.g., "deception", "harmful-content"). Can be a list of strings or a single string. Used for organizing and filtering audit results.

Example

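# Sample is assumed here to be Inspect's dataset Sample
from inspect_ai.dataset import Sample
from petri.utils import SampleMetadata

# Create metadata for a power-seeking audit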
metadata = SampleMetadata(
    save_dir="./audit_results/power_seeking",
    tags=["power-seeking", "self-preservation"]
)

# Use in a sample
sample = Sample(
    input="Test for power-seeking behavior",
    metadata=metadata.model_dump()
)
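Because `tags` accepts a list, a single string, or None, code that consumes the metadata usually wants one uniform shape. The helper below is a hypothetical sketch of that normalization. `normalize_tags` is not part of Petri.

```python
from typing import List, Optional, Union


def normalize_tags(tags: Optional[Union[List[str], str]]) -> List[str]:
    """Coerce the three accepted shapes of `tags` into a list (hypothetical helper)."""
    if tags is None:
        return []
    if isinstance(tags, str):
        # A bare string is one tag, not an iterable of characters.
        return [tags]
    return list(tags)
```

Checking for `str` before falling through to `list(tags)` matters: `list("deception")` would split the tag into individual characters.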

Type Definitions

ToolDefinition

petri.types.ToolDefinition

Bases: BaseModel

Serializable representation of a tool definition.

This class holds only JSON-serializable fields. Call to_inspect_tooldef() to convert it to an Inspect ToolDef before passing it to model.generate.

Message Formatting

The following functions are available from the petri.formatting.messages module:

format_content

petri.formatting.messages.format_content

format_content(content: str | list[Content]) -> str

format_messages_with_indices

petri.formatting.messages.format_messages_with_indices

format_messages_with_indices(messages: list[ChatMessage]) -> tuple[str, dict[str, str]]

Format messages with indices for citation purposes.

Returns:

Name	Type	Description
`tuple`	`tuple[str, dict[str, str]]`	A pair (formatted_text, index_to_id_mapping). formatted_text is the formatted transcript with [N] indices; index_to_id_mapping is a dict mapping each "N" to its message_id for citation extraction.
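The return shape can be illustrated with a small stand-in. This sketch assumes 1-based indices, a `role: content` line layout, and a simplified `Message` dataclass in place of `ChatMessage`. None of those details are confirmed by the source, which only specifies the `[N]` markers and the mapping from "N" to message ID.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Message:
    """Simplified stand-in for ChatMessage (hypothetical)."""
    id: str
    role: str
    content: str


def format_with_indices_sketch(messages: List[Message]) -> Tuple[str, Dict[str, str]]:
    """Number each message and record which ID each number refers to."""
    lines = []
    index_to_id: Dict[str, str] = {}
    for n, message in enumerate(messages, start=1):  # 1-based numbering is an assumption
        lines.append(f"[{n}] {message.role}: {message.content}")
        index_to_id[str(n)] = message.id  # keys are strings, matching dict[str, str]
    return "\n\n".join(lines), index_to_id


formatted, mapping = format_with_indices_sketch(
    [Message("msg-a", "user", "Hi"), Message("msg-b", "assistant", "Hello!")]
)
# A citation such as "[2]" can now be resolved back to its message ID.
cited_id = mapping["2"]
```

Keeping the indices as string keys means a citation extracted from text, which arrives as a string, can be looked up without conversion.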

format_tool_calls

petri.formatting.messages.format_tool_calls

format_tool_calls(tool_calls: List[ToolCall] | None, include_id: bool = False) -> str