Utilities

This page documents utility functions and helper classes used throughout Petri.

Transcript Loading

load_transcripts_from_directory

petri.utils.load_transcripts_from_directory

load_transcripts_from_directory(directory_path: str, recursive: bool = False) -> List[Transcript]

Load all transcript JSON files from a directory.

This function scans a directory for JSON files and attempts to parse them as Transcript objects.

Parameters:

directory_path (str, required): Path to the directory containing transcript files.

recursive (bool, default: False): Whether to search subdirectories recursively.

Returns:

List[Transcript]: List of successfully loaded Transcript objects.

Note

Files that fail to load are logged as errors but don't stop the loading process for other files.
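The log-and-continue behavior described in the note can be illustrated with a small standard-library sketch. This is not Petri's implementation: `load_json_files_tolerantly` is a hypothetical name, and it parses files into plain Python objects rather than Transcript objects. It only shows the scanning and error-handling pattern, including the effect of `recursive`.

```python
import json
import logging
from pathlib import Path
from typing import Any, List

logger = logging.getLogger(__name__)


def load_json_files_tolerantly(directory_path: str, recursive: bool = False) -> List[Any]:
    """Hypothetical stand-in: parse every *.json file, skipping files that fail."""
    root = Path(directory_path)
    # rglob descends into subdirectories; glob only looks at the top level.
    candidates = root.rglob("*.json") if recursive else root.glob("*.json")

    loaded: List[Any] = []
    for path in sorted(candidates):
        try:
            loaded.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, ValueError) as exc:
            # JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
            # Log and continue so one bad file does not abort the whole batch.
            logger.error("Failed to load %s: %s", path, exc)
    return loaded
```

With this pattern the caller always gets back the files that parsed, and has to check the log to learn which ones were skipped.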

load_transcripts_from_paths

petri.utils.load_transcripts_from_paths

load_transcripts_from_paths(paths: List[Path]) -> List[Transcript]

Load transcripts from a list of file paths.

This function takes a list of specific file paths and attempts to load each one as a Transcript object. Errors are logged but don't prevent loading of other files.

Parameters:

paths (List[Path], required): List of Path objects pointing to transcript JSON files.

Returns:

List[Transcript]: List of successfully loaded Transcript objects.
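One way to build the `paths` argument is to select files with pathlib first. The sketch below is only an illustration: `collect_transcript_paths`, its substring filter, and the `./outputs` directory in the comment are hypothetical and not part of Petri.

```python
from pathlib import Path
from typing import List


def collect_transcript_paths(root: Path, name_contains: str) -> List[Path]:
    """Hypothetical helper: pick JSON files whose file name contains a substring."""
    return sorted(
        path
        for path in root.rglob("*.json")
        if path.is_file() and name_contains in path.name
    )


# The result is a List[Path], the shape load_transcripts_from_paths expects, e.g.:
#   paths = collect_transcript_paths(Path("./outputs"), "audit")
#   transcripts = load_transcripts_from_paths(paths)
```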

Text Processing

extract_xml_tags

petri.utils.extract_xml_tags

extract_xml_tags(text: str, tags: List[str], required_tags: Optional[List[str]] = None) -> Dict[str, str]

Extract content from XML-style tags in text.

Parameters:

text (str, required): The text to extract tags from.

tags (List[str], required): List of tag names to extract (without < >).

required_tags (Optional[List[str]], default: None): List of tag names that must be present (subset of tags).

Returns:

Dict[str, str]: Dictionary mapping tag names to their content (stripped of whitespace). Tags that are not found map to an empty string.

Raises:

ValueError: If any required tags are missing or empty.

Example

text = "<summary>Hello world</summary><score>5</score>"
result = extract_xml_tags(text, ["summary", "score"], required_tags=["summary"])
# Returns: {"summary": "Hello world", "score": "5"}
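The contract described above (content stripped of whitespace, an empty string for tags that are not found, ValueError for required tags that are missing or empty) can be sketched with a regular expression. This is an illustrative re-implementation under those stated assumptions, not Petri's code; `extract_tags_sketch` is a hypothetical name, and the sketch ignores tag attributes and nested tags of the same name.

```python
import re
from typing import Dict, List, Optional


def extract_tags_sketch(
    text: str, tags: List[str], required_tags: Optional[List[str]] = None
) -> Dict[str, str]:
    """Illustrative version of the documented contract, not Petri's code."""
    result: Dict[str, str] = {}
    for tag in tags:
        # Non-greedy match that may span newlines; first occurrence wins.
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, text, re.DOTALL)
        result[tag] = match.group(1).strip() if match else ""

    # A required tag counts as missing if it is absent or empty after stripping.
    missing = [tag for tag in (required_tags or []) if not result.get(tag)]
    if missing:
        raise ValueError(f"Missing or empty required tags: {missing}")
    return result


text = "<summary>\nHello world\n</summary><score>5</score>"
print(extract_tags_sketch(text, ["summary", "score"], required_tags=["summary"]))
# prints {'summary': 'Hello world', 'score': '5'}
```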

extract_xml_tag

petri.utils.extract_xml_tag

extract_xml_tag(text: str, tag: str, required: bool = False) -> str

Extract content from a single XML-style tag in text.

Parameters:

text (str, required): The text to extract the tag from.

tag (str, required): Tag name to extract (without < >).

required (bool, default: False): Whether the tag must be present and non-empty.

Returns:

str: The content of the tag (stripped of whitespace), or an empty string if not found.

Raises:

ValueError: If required=True and the tag is missing or empty.

Example

text = "<summary>Hello world</summary>"
result = extract_xml_tag(text, "summary")
# Returns: "Hello world"

reformat_antml_tags

petri.utils.reformat_antml_tags

reformat_antml_tags(content: str | list[str | dict]) -> str | list[str | dict]

Reformat content to use antml: prefixed tags instead of placeholder tags.

Parameters:

content (str | list[str | dict], required): String content or list of content blocks to reformat.

Returns:

str | list[str | dict]: Content with reformatted tags.

Metadata Classes

SampleMetadata

petri.utils.SampleMetadata

Bases: BaseModel

Metadata for evaluation samples in alignment audit tasks.

This class defines metadata that can be attached to evaluation samples to control transcript saving and categorization during audit evaluations.

Attributes:

save_dir (str | None): Directory path where transcripts should be saved. If None, transcripts will not be saved to disk.

tags (list[str] | str | None): Tags for categorizing the sample (e.g., "deception", "harmful-content"). Can be a list of strings or a single string. Used for organizing and filtering audit results.

Example
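# Sample comes from Inspect's dataset module
from inspect_ai.dataset import Sample
from petri.utils import SampleMetadata

# Create metadata for a power-seeking audit
metadata = SampleMetadata(
    save_dir="./audit_results/power_seeking",
    tags=["power-seeking", "self-preservation"]
)

# Use in a sample; model_dump() converts the model to a plain dict
sample = Sample(
    input="Test for power-seeking behavior",
    metadata=metadata.model_dump()
)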
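Because `tags` may be a list, a single string, or None, code that filters audit results by tag usually wants one uniform shape. The following is a minimal sketch of that normalization; `normalize_tags` and `has_tag` are hypothetical helpers, not part of Petri, and whether Petri normalizes tags internally is not stated here.

```python
from typing import List, Union


def normalize_tags(tags: Union[List[str], str, None]) -> List[str]:
    """Hypothetical helper: coerce the three accepted shapes of `tags` into a list."""
    if tags is None:
        return []
    if isinstance(tags, str):
        # A bare string is one tag, not a sequence of characters.
        return [tags]
    return list(tags)


def has_tag(tags: Union[List[str], str, None], wanted: str) -> bool:
    """Check membership without caring which shape the caller used."""
    return wanted in normalize_tags(tags)
```

Checking for `str` before treating the value as a sequence matters: iterating over a single string directly would yield individual characters rather than one tag.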

Type Definitions

ToolDefinition

petri.types.ToolDefinition

Bases: BaseModel

Serializable representation of a tool definition.

This holds only JSON-serializable fields. Use to_inspect_tooldef() to convert it to an Inspect ToolDef when calling model.generate.

Message Formatting

The following functions are available from the petri.formatting.messages module:

format_content

petri.formatting.messages.format_content

format_content(content: str | list[Content]) -> str

format_messages_with_indices

petri.formatting.messages.format_messages_with_indices

format_messages_with_indices(messages: list[ChatMessage]) -> tuple[str, dict[str, str]]

Format messages with indices for citation purposes.

Returns:

tuple[str, dict[str, str]]: A pair (formatted_text, index_to_id_mapping):

- formatted_text: The formatted transcript with [N] indices.
- index_to_id_mapping: Dict mapping "N" -> message_id for citation extraction.
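To make the shape of the return value concrete, here is a small self-contained sketch of the indexing idea. It is not Petri's formatter: `SimpleMessage` is a hypothetical stand-in for a chat message, the line layout is invented for illustration, and starting the indices at 1 is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SimpleMessage:
    """Hypothetical stand-in for a chat message; only the fields used here."""
    id: str
    role: str
    text: str


def index_messages(messages: List[SimpleMessage]) -> Tuple[str, Dict[str, str]]:
    """Prefix each message with [N] and record which message id N refers to."""
    lines: List[str] = []
    index_to_id: Dict[str, str] = {}
    for n, message in enumerate(messages, start=1):  # 1-based indexing is an assumption
        lines.append(f"[{n}] {message.role}: {message.text}")
        index_to_id[str(n)] = message.id  # keys are strings, matching dict[str, str]
    return "\n".join(lines), index_to_id


formatted, mapping = index_messages(
    [
        SimpleMessage(id="msg-a", role="user", text="Hello"),
        SimpleMessage(id="msg-b", role="assistant", text="Hi there"),
    ]
)
```

A citation such as [2] found in downstream text can then be resolved back to a message with `mapping["2"]`.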

format_tool_calls

petri.formatting.messages.format_tool_calls

format_tool_calls(tool_calls: List[ToolCall] | None, include_id: bool = False) -> str