Utilities

This page documents utility functions and helper classes used throughout Petri.

Transcript Loading

load_transcripts_from_directory

petri.utils.load_transcripts_from_directory

load_transcripts_from_directory(directory_path: str, recursive: bool = False) -> list[Transcript]

Load all transcript JSON files from a directory.

This function scans a directory for JSON files and attempts to parse them as Transcript objects.

Parameters:

Name            Type  Description                                        Default
directory_path  str   Path to the directory containing transcript files  required
recursive       bool  Whether to search subdirectories recursively       False

Returns:

Type              Description
list[Transcript]  List of successfully loaded Transcript objects

Note

Files that fail to load are logged as errors but don't stop the loading process for other files.
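The log-and-continue behavior described in the note can be sketched with the standard library alone. This is an illustration under stated assumptions, not Petri's implementation: `load_json_files` is a hypothetical name, and it returns plain dictionaries rather than Transcript objects.

```python
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)


def load_json_files(directory_path: str, recursive: bool = False) -> list[dict]:
    """Hypothetical helper: load every parseable *.json file, skipping bad ones."""
    root = Path(directory_path)
    # rglob descends into subdirectories; glob stays at the top level.
    candidates = root.rglob("*.json") if recursive else root.glob("*.json")
    loaded = []
    for path in sorted(candidates):
        try:
            loaded.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, json.JSONDecodeError) as exc:
            # Log the failure and keep going so one bad file does not abort the batch.
            logger.error("Failed to load %s: %s", path, exc)
    return loaded


# Small demonstration with one valid file, one broken file, and one nested file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "good.json").write_text('{"id": 1}', encoding="utf-8")
    (root / "broken.json").write_text("{not json", encoding="utf-8")
    (root / "nested").mkdir()
    (root / "nested" / "deep.json").write_text('{"id": 2}', encoding="utf-8")

    top_level = load_json_files(tmp)
    everything = load_json_files(tmp, recursive=True)
```

In this sketch the broken file is reported through the logger and omitted from both results, while the nested file appears only when `recursive=True`.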

load_transcripts_from_paths

petri.utils.load_transcripts_from_paths

load_transcripts_from_paths(paths: list[Path]) -> list[Transcript]

Load transcripts from a list of file paths.

This function takes a list of specific file paths and attempts to load each one as a Transcript object. Errors are logged but don't prevent loading of other files.

Parameters:

Name   Type        Description                                             Default
paths  list[Path]  List of Path objects pointing to transcript JSON files  required

Returns:

Type              Description
list[Transcript]  List of successfully loaded Transcript objects
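Because this function takes an explicit list of Path objects, the caller decides which files to include. One possible way to build such a list is sketched below, using only the standard library. `newest_json_paths` is a hypothetical helper that is not part of Petri, and selecting by modification time is just one example criterion.

```python
import os
import tempfile
from pathlib import Path


def newest_json_paths(directory: Path, limit: int) -> list[Path]:
    """Hypothetical helper: return up to `limit` *.json paths, newest first."""
    paths = [p for p in directory.glob("*.json") if p.is_file()]
    paths.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return paths[:limit]


# Demonstration with three files whose modification times are set explicitly.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for index, name in enumerate(["a.json", "b.json", "c.json"]):
        file_path = root / name
        file_path.write_text("{}", encoding="utf-8")
        # Fixed timestamps make the ordering deterministic.
        os.utime(file_path, (1_000 + index, 1_000 + index))
    selected = [p.name for p in newest_json_paths(root, limit=2)]
```

A list produced this way has the `list[Path]` shape that `load_transcripts_from_paths` expects for its `paths` argument.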

Text Processing

extract_xml_tags

petri.utils.extract_xml_tags

extract_xml_tags(text: str, tags: list[str], required_tags: list[str] | None = None) -> dict[str, str]

Extract content from XML-style tags in text.

Parameters:

Name           Type              Description                                              Default
text           str               The text to extract tags from                            required
tags           list[str]         List of tag names to extract (without < >)               required
required_tags  list[str] | None  List of tag names that must be present (subset of tags)  None

Returns:

Type            Description
dict[str, str]  Dictionary mapping tag names to their content (stripped of whitespace). Tags that are not found map to an empty string.

Raises:

Type        Description
ValueError  If any required tags are missing or empty

Example

text = "<summary>Hello world</summary><score>5</score>"
result = extract_xml_tags(text, ["summary", "score"], required_tags=["summary"])

Returns: {"summary": "Hello world", "score": "5"}
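The documented contract (stripped content, an empty string for absent tags, ValueError for missing or empty required tags) can be approximated with a regular expression. The following is a minimal sketch under those assumptions, not Petri's actual implementation; `extract_tags_sketch` is a hypothetical name, and the real function may treat nested tags, attributes, or repeated tags differently.

```python
import re
from typing import Optional


def extract_tags_sketch(
    text: str, tags: list[str], required_tags: Optional[list[str]] = None
) -> dict[str, str]:
    """Hypothetical re-implementation of the documented behavior."""
    result = {}
    for tag in tags:
        # Non-greedy match of the first <tag>...</tag> pair; DOTALL lets content span lines.
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, text, re.DOTALL)
        result[tag] = match.group(1).strip() if match else ""
    # A required tag counts as missing when it is absent or its content is empty.
    missing = [tag for tag in (required_tags or []) if not result.get(tag)]
    if missing:
        raise ValueError(f"Missing or empty required tags: {missing}")
    return result


sample = "<summary>  Hello world </summary><score>5</score>"
parsed = extract_tags_sketch(sample, ["summary", "score", "notes"], required_tags=["summary"])

try:
    extract_tags_sketch(sample, ["notes"], required_tags=["notes"])
    raised = False
except ValueError:
    raised = True
```

Here `parsed` contains an empty string for the absent `notes` tag, and requiring `notes` raises ValueError.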

extract_xml_tag

petri.utils.extract_xml_tag

extract_xml_tag(text: str, tag: str, required: bool = False) -> str

Extract content from a single XML-style tag in text.

Parameters:

Name      Type  Description                                    Default
text      str   The text to extract the tag from               required
tag       str   Tag name to extract (without < >)              required
required  bool  Whether the tag must be present and non-empty  False

Returns:

Type  Description
str   The content of the tag (stripped of whitespace), or an empty string if not found.

Raises:

Type        Description
ValueError  If required=True and the tag is missing or empty

Example

text = "<summary>Hello world</summary>"
result = extract_xml_tag(text, "summary")

Returns: "Hello world"

Metadata Classes

SampleMetadata

petri.utils.SampleMetadata

Bases: BaseModel

Metadata for evaluation samples in alignment audit tasks.

This class defines metadata that can be attached to evaluation samples for categorization during audit evaluations.

Attributes:

Name  Type                    Description
tags  list[str] | str | None  Tags for categorizing the sample (e.g., "deception", "harmful-content"). Can be a list of strings or a single string. Used for organizing and filtering audit results.

Example
from inspect_ai.dataset import Sample
from petri.utils import SampleMetadata

metadata = SampleMetadata(tags=["power-seeking", "self-preservation"])

sample = Sample(
    input="Test for power-seeking behavior",
    metadata=metadata.model_dump(),
)
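Since `tags` may be a list, a single string, or None, code that filters samples by tag usually needs to normalize the value first. The helper below is a hypothetical sketch of that step, not something Petri is known to provide. The sample dictionaries are illustrative data only.

```python
from typing import Union


def normalize_tags(tags: Union[list[str], str, None]) -> list[str]:
    """Hypothetical helper: coerce the documented tags union into a list."""
    if tags is None:
        return []
    if isinstance(tags, str):
        # Wrap a single string; iterating over it would yield characters.
        return [tags]
    return list(tags)


# Illustrative metadata records covering all three shapes of `tags`.
samples = [
    {"id": "s1", "tags": ["power-seeking", "self-preservation"]},
    {"id": "s2", "tags": "deception"},
    {"id": "s3", "tags": None},
]

deception_ids = [s["id"] for s in samples if "deception" in normalize_tags(s["tags"])]
```

Checking for the string case before treating the value as a list avoids matching substrings or single characters by accident.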

Type Definitions

ToolDefinition

petri.types.ToolDefinition

Bases: BaseModel

Serializable representation of a tool definition.

This class holds only JSON-serializable fields. Call to_inspect_tooldef() to convert it to an Inspect ToolDef before passing it to model.generate.
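The value of keeping only JSON-serializable fields is that a definition can be written to disk or embedded in a transcript and rebuilt later. The sketch below illustrates that idea with a standard-library dataclass. `ToolSpecSketch` and its field names (`name`, `description`, `parameters`) are hypothetical and are not taken from Petri's ToolDefinition.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ToolSpecSketch:
    """Hypothetical stand-in; these field names are assumptions, not Petri's."""

    name: str
    description: str
    parameters: dict[str, Any] = field(default_factory=dict)


spec = ToolSpecSketch(
    name="get_weather",
    description="Look up the weather for a city.",
    parameters={"city": {"type": "string"}},
)

# Only plain data is stored, so the object survives a JSON round trip.
payload = json.dumps(asdict(spec))
restored = ToolSpecSketch(**json.loads(payload))
```

A runtime tool object that wraps a Python callable could not be serialized this way, which is presumably why the conversion to an Inspect ToolDef is deferred until generation time.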

Message Formatting

The following functions are available from the petri.formatting.messages module:

format_content

petri.formatting.messages.format_content

format_content(content: str | list[Content]) -> str

Format message content, handling both string and structured content.
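A function accepting `str | list[Content]` typically returns strings unchanged and renders each structured part in turn. The sketch below shows one plausible shape for that logic. `TextPart`, `ImagePart`, `format_content_sketch`, and the placeholder format are all hypothetical, and the output of Petri's format_content may differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPart:
    """Hypothetical text content part."""

    text: str


@dataclass
class ImagePart:
    """Hypothetical non-text content part."""

    url: str


def format_content_sketch(content: Union[str, list]) -> str:
    """Hypothetical formatter for plain-string or structured content."""
    if isinstance(content, str):
        return content
    rendered = []
    for part in content:
        if isinstance(part, TextPart):
            rendered.append(part.text)
        else:
            # Non-text parts are replaced by a short placeholder naming their type.
            rendered.append(f"[{type(part).__name__}]")
    return "\n".join(rendered)


plain = format_content_sketch("hello")
mixed = format_content_sketch(
    [TextPart("Look at this:"), ImagePart("https://example.com/cat.png")]
)
```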

format_tool_calls

petri.formatting.messages.format_tool_calls

format_tool_calls(tool_calls: list[ToolCall] | None, include_id: bool = False) -> str