General Tips for Prompt Engineering¶
The overarching theme of using Instructor and Pydantic for function calling is to make the models as self-descriptive, modular, and flexible as possible, while maintaining data integrity and ease of use.
- Modularity: Design self-contained components for reuse.
- Self-Description: Use Pydantic's Field for clear field descriptions.
- Optionality: Use Python's Optional type for nullable fields and set sensible defaults.
- Standardization: Employ enumerations for fields with a fixed set of values; include a fallback option.
- Dynamic Data: Use key-value pairs for arbitrary properties and limit list lengths.
- Entity Relationships: Define explicit identifiers and relationship fields.
- Contextual Logic: Optionally add a "chain of thought" field in reusable components for extra context.
Modular Chain of Thought¶
Adding a "chain of thought" field improves data quality, and it does not have to be global: you can attach it to individual, modular components instead of applying CoT to the whole model.
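As a minimal sketch of a modular chain of thought, the reasoning field can live inside the one component that needs it. The Role and UserDetail models and their field names are illustrative assumptions, not definitions taken from the text above.

```python
from pydantic import BaseModel, Field


class Role(BaseModel):
    # The reasoning field is scoped to this component only.
    chain_of_thought: str = Field(
        ..., description="Think step by step to determine the correct title."
    )
    title: str


class UserDetail(BaseModel):
    age: int
    name: str
    role: Role  # only the role extraction carries a chain of thought
```

Keeping the reasoning inside Role means other fields such as age and name are extracted without any extra reasoning text.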
Utilize Optional Attributes¶
Use Python's Optional type and set a default value to prevent undesired defaults like empty strings.
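A short sketch of an optional attribute, assuming a hypothetical UserDetail model with a role field that may be missing from the source text:

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    # None signals "not found"; an empty string would be ambiguous.
    role: Optional[str] = Field(
        default=None, description="The user's role, if one is mentioned."
    )
```

With this definition, a user with no stated role ends up with role set to None rather than a placeholder string.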
Handling Errors Within Function Calls¶
You can create a wrapper class to hold either the result of an operation or an error message. This allows you to remain within a function call even if an error occurs, facilitating better error handling without breaking the code flow.
With the MaybeUser class, you can either receive a UserDetail object in result or get an error message in message.
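A wrapper like the MaybeUser class described above could be sketched as follows. The result and message fields come from the text; the error flag and the __bool__ helper are assumptions added for convenience.

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    role: Optional[str] = Field(default=None)


class MaybeUser(BaseModel):
    # Holds the extracted user when extraction succeeded.
    result: Optional[UserDetail] = Field(default=None)
    # Assumed flag: set when no user could be extracted.
    error: bool = Field(default=False)
    # Human-readable explanation when extraction failed.
    message: Optional[str] = Field(default=None)

    def __bool__(self) -> bool:
        # Truthy only when a result is present.
        return self.result is not None
```

Callers can then branch on the wrapper itself, for example `if maybe_user: ...`, and read message when the check fails.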
Simplification with the Maybe Pattern¶
You can further simplify this by using instructor to create the Maybe pattern dynamically. This lets you quickly create a Maybe type for any class, streamlining the process.
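To illustrate what creating a Maybe type dynamically involves, here is a hand-rolled sketch built on Pydantic's create_model. It is not instructor's own helper: make_maybe is a hypothetical name, and the result, error, and message fields are assumed to mirror the wrapper described earlier.

```python
from typing import Optional, Type

from pydantic import BaseModel, Field, create_model


class UserDetail(BaseModel):
    age: int
    name: str


def make_maybe(model: Type[BaseModel]) -> Type[BaseModel]:
    """Build a wrapper model holding either a `model` instance or an error."""
    return create_model(
        f"Maybe{model.__name__}",
        result=(Optional[model], Field(default=None)),
        error=(bool, Field(default=False)),
        message=(Optional[str], Field(default=None)),
    )


# One call produces a wrapper for any model class.
MaybeUser = make_maybe(UserDetail)
```

The generated class behaves like a hand-written wrapper, so the same factory can be reused for every model you extract.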
Tips for Enumerations¶
To prevent data misalignment, use Enums for standardized fields. Always include an "Other" option as a fallback so the model can signal uncertainty.
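A small sketch of an enumeration with a fallback member. The specific role names are illustrative assumptions; only the "Other" fallback comes from the text.

```python
from enum import Enum

from pydantic import BaseModel, Field


class Role(Enum):
    PRINCIPAL = "PRINCIPAL"
    TEACHER = "TEACHER"
    STUDENT = "STUDENT"
    OTHER = "OTHER"  # fallback the model can pick when unsure


class UserDetail(BaseModel):
    age: int
    name: str
    role: Role = Field(
        description="Correctly assign one of the predefined roles to the user."
    )
```

Any value outside the enumeration fails validation, so the OTHER member gives the model a valid way to express uncertainty.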
If you're having a hard time with Enum, an alternative is to use
To improve performance further, you can reiterate the requirements in the field descriptions or in the docstrings.
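One option that can stand in for an Enum is typing.Literal, which Pydantic validates against a fixed set of values. Treating it as the alternative meant above is an assumption, and the role values below are illustrative.

```python
from typing import Literal

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    # The allowed values are spelled out inline; "OTHER" is the fallback.
    role: Literal["PRINCIPAL", "TEACHER", "STUDENT", "OTHER"] = Field(
        description="Correctly assign one of the predefined roles to the user."
    )
```

The field stays a plain string at runtime, which can be simpler to work with than Enum members.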
Reiterate Long Instructions¶
For complex attributes, it helps to reiterate the instructions in the field's description.
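A sketch of reiterating instructions, assuming a hypothetical Role component. The rules in the docstring are invented for illustration; the point is that an instructions field asks the model to restate them before filling in the answer.

```python
from pydantic import BaseModel, Field


class Role(BaseModel):
    """
    Extract the user's role. Illustrative rules (assumed for this sketch):
    prefer the most recent title, and use "unknown" if none is stated.
    """

    # Asking for a restatement keeps the rules close to the answer.
    instructions: str = Field(
        ...,
        description="Restate the instructions and rules to correctly determine the title.",
    )
    title: str


class UserDetail(BaseModel):
    age: int
    name: str
    role: Role
```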
Handle Arbitrary Properties¶
When you need to extract undefined attributes, use a list of key-value pairs.
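A minimal sketch of the key-value approach, assuming hypothetical Property and UserDetail models:

```python
from typing import List

from pydantic import BaseModel, Field


class Property(BaseModel):
    key: str
    value: str


class UserDetail(BaseModel):
    age: int
    name: str
    # Open-ended attributes go here instead of into ad hoc fields.
    properties: List[Property] = Field(
        ..., description="Extract any other properties that might be relevant."
    )
```

Because every property has the same shape, downstream code can iterate over the list without knowing the keys in advance.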
Limiting the Length of Lists¶
When dealing with lists of attributes, especially arbitrary properties, it's crucial to manage the length. You can use prompting and enumeration to limit the list length, ensuring a manageable set of properties.
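One way to combine prompting and enumeration is to number each item and state the limit in the description. This is a sketch under assumptions: the index field, the models, and the limit of 6 are illustrative, and nothing here enforces the limit at validation time.

```python
from typing import List

from pydantic import BaseModel, Field


class Property(BaseModel):
    # Numbering the items nudges the model to keep track of the count.
    index: str = Field(..., description="Monotonically increasing ID.")
    key: str
    value: str


class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property] = Field(
        ...,
        description="Numbered list of arbitrary extracted properties, should be less than 6.",
    )
```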
Using Tuples for Simple Types
For simple types, tuples can be a more compact alternative to custom classes, especially when the properties don't require additional descriptions.
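A sketch of the tuple variant, assuming each property is just a number paired with a short string:

```python
from typing import List, Tuple

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    # Each entry is (index, property text); no separate Property class needed.
    properties: List[Tuple[int, str]] = Field(
        ...,
        description="Numbered list of arbitrary extracted properties, should be less than 6.",
    )
```

The trade-off is that tuple positions carry no per-element descriptions, so this fits best when the meaning of each position is obvious.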
Advanced Arbitrary Properties¶
For multiple users, aim to use consistent key names when extracting properties.
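A sketch of how the consistency requirement might be stated, assuming a hypothetical UserDetails container whose docstring carries the instruction:

```python
from typing import List

from pydantic import BaseModel, Field


class Property(BaseModel):
    key: str
    value: str


class UserDetail(BaseModel):
    id: int
    age: int
    name: str
    properties: List[Property] = Field(
        ..., description="Extract any other properties that might be relevant."
    )


class UserDetails(BaseModel):
    """
    Extract information for multiple users.
    Use consistent key names for properties across users.
    """

    users: List[UserDetail]
```

The docstring is a prompt-level hint only; if consistency matters downstream, check the keys after extraction.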
This refined guide should offer a cleaner and more organized approach to structure engineering in Python.
Defining Relationships Between Entities¶
In cases where relationships exist between entities, it's vital to define them explicitly in the model. The following example demonstrates how to define relationships between users by incorporating an id and a friends field:
from typing import List

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    # IDs of other users this user is connected to.
    friends: List[int] = Field(
        ...,
        description="Correct and complete list of friend IDs, representing relationships between users.",
    )


class UserRelationships(BaseModel):
    users: List[UserDetail] = Field(
        ...,
        description="Collection of users, correctly capturing the relationships among them.",
    )
Reusing Components with Different Contexts¶
You can reuse the same component for different contexts within a model. In this example, the TimeRange component is used for both work_time and leisure_time.
from pydantic import BaseModel, Field


class TimeRange(BaseModel):
    start_time: int = Field(..., description="The start time in hours.")
    end_time: int = Field(..., description="The end time in hours.")


class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    # The same component is reused; only the description changes per context.
    work_time: TimeRange = Field(..., description="Time range during which the user is working.")
    leisure_time: TimeRange = Field(..., description="Time range reserved for leisure activities.")
Sometimes, a component like TimeRange may require some context or additional logic to be used effectively. Employing a "chain of thought" field within the component can help in understanding or optimizing the time range allocations.
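A sketch of TimeRange with a "chain of thought" field added. The field name chain_of_thought and its description are assumptions; the rest follows the component defined above.

```python
from pydantic import BaseModel, Field


class TimeRange(BaseModel):
    # Reasoning is produced before the values it justifies.
    chain_of_thought: str = Field(
        ..., description="Step by step reasoning to get the correct time range."
    )
    start_time: int = Field(..., description="The start time in hours.")
    end_time: int = Field(..., description="The end time in hours.")


class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    work_time: TimeRange = Field(..., description="Time range during which the user is working.")
    leisure_time: TimeRange = Field(..., description="Time range reserved for leisure activities.")
```

Because the field sits inside TimeRange, both work_time and leisure_time get their own reasoning without any change to UserDetail.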