Utilities¶
This page documents utility functions and helper classes used throughout Petri.
Transcript Loading¶
load_transcripts_from_directory¶
petri.utils.load_transcripts_from_directory ¶
Load all transcript JSON files from a directory.
This function scans a directory for JSON files and attempts to parse them as Transcript objects.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
directory_path
|
str
|
Path to the directory containing transcript files |
required |
recursive
|
bool
|
Whether to search subdirectories recursively (default: False) |
False
|
Returns:
| Type | Description |
|---|---|
List[Transcript]
|
List[Transcript]: List of successfully loaded Transcript objects |
Note
Files that fail to load are logged as errors but don't stop the loading process for other files.
load_transcripts_from_paths¶
petri.utils.load_transcripts_from_paths ¶
Load transcripts from a list of file paths.
This function takes a list of specific file paths and attempts to load each one as a Transcript object. Errors are logged but don't prevent loading of other files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
paths
|
List[Path]
|
List of Path objects pointing to transcript JSON files |
required |
Returns:
| Type | Description |
|---|---|
List[Transcript]
|
List[Transcript]: List of successfully loaded Transcript objects |
Text Processing¶
extract_xml_tags¶
petri.utils.extract_xml_tags ¶
extract_xml_tags(text: str, tags: List[str], required_tags: Optional[List[str]] = None) -> Dict[str, str]
Extract content from XML-style tags in text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The text to extract tags from |
required |
tags
|
List[str]
|
List of tag names to extract (without < >) |
required |
required_tags
|
Optional[List[str]]
|
List of tag names that must be present (subset of tags) |
None
|
Returns:
| Type | Description |
|---|---|
Dict[str, str]
|
Dictionary mapping tag names to their content (stripped of whitespace). |
Dict[str, str]
|
Returns empty string for tags that are not found. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If any required tags are missing or empty |
Example
text = "Hello world
Returns:¶
extract_xml_tag¶
petri.utils.extract_xml_tag ¶
Extract content from a single XML-style tag in text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The text to extract the tag from |
required |
tag
|
str
|
Tag name to extract (without < >) |
required |
required
|
bool
|
Whether the tag must be present and non-empty |
False
|
Returns:
| Type | Description |
|---|---|
str
|
The content of the tag (stripped of whitespace), or empty string if not found. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If required=True and the tag is missing or empty |
reformat_antml_tags¶
petri.utils.reformat_antml_tags ¶
Reformat content to use antml: prefixed tags instead of placeholder tags.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
content
|
str | list[str | dict]
|
String content or list of content blocks to reformat |
required |
Returns:
| Type | Description |
|---|---|
str | list[str | dict]
|
str | list[str | dict]: Content with reformatted tags |
Metadata Classes¶
SampleMetadata¶
petri.utils.SampleMetadata ¶
Bases: BaseModel
Metadata for evaluation samples in alignment audit tasks.
This class defines metadata that can be attached to evaluation samples to control transcript saving and categorization during audit evaluations.
Attributes:
| Name | Type | Description |
|---|---|---|
save_dir |
str | None
|
Directory path where transcripts should be saved. If None, transcripts will not be saved to disk. |
tags |
list[str] | str | None
|
Tags for categorizing the sample (e.g., "deception", "harmful-content"). Can be a list of strings or a single string. Used for organizing and filtering audit results. |
Example
Type Definitions¶
ToolDefinition¶
petri.types.ToolDefinition ¶
Bases: BaseModel
Serializable representation of a tool definition.
This holds only JSON-serializable fields. Convert to the library's
Inspect ToolDef with to_inspect_tooldef() when calling model.generate.
Message Formatting¶
The following functions are available from the petri.formatting.messages module:
format_content¶
format_messages_with_indices¶
petri.formatting.messages.format_messages_with_indices ¶
Format messages with indices for citation purposes.
Returns:
| Name | Type | Description |
|---|---|---|
tuple |
tuple[str, dict[str, str]]
|
(formatted_text, index_to_id_mapping) - formatted_text: The formatted transcript with [N] indices - index_to_id_mapping: Dict mapping "N" -> message_id for citation extraction |