Result Processing & Hydration
After the consensus phase produces a clean JSON object, the ResultProcessor is responsible for converting it back into Python objects and persisting them.
Hydration Strategies
The processor employs two strategies depending on the extraction mode:
Direct Hydration (Structured Output)
When using use_structured_output=True (default), the JSON structure exactly mirrors the Pydantic models derived from your SQLModels.
* Mechanism: Recursive Pydantic instantiation (Model(**data)).
* Pros: Fast, type-safe, handles nested objects automatically.
* Cons: Requires the LLM to strictly follow the schema.
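As a rough illustration of direct hydration, the sketch below instantiates a nested Pydantic model from a JSON-like dict. The Department and Employee models and their fields are hypothetical stand-ins, not the models the library derives from your SQLModels, and the snippet assumes pydantic is installed.

```python
from typing import List

from pydantic import BaseModel, ValidationError


# Hypothetical models standing in for the Pydantic models derived from SQLModels.
class Employee(BaseModel):
    name: str
    title: str


class Department(BaseModel):
    name: str
    employees: List[Employee] = []


# JSON object as it might arrive from the consensus phase (illustrative data).
data = {
    "name": "Engineering",
    "employees": [{"name": "Ada", "title": "Engineer"}],
}

# Model(**data): nested dicts are converted into Employee instances recursively.
dept = Department(**data)

# The strictness trade-off: a payload missing a required field is rejected.
try:
    Department(**{"employees": []})
    rejected = False
except ValidationError:
    rejected = True
```

The second call shows the cost listed under Cons: if the LLM omits a required field, instantiation fails instead of producing a partial object.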
Graph Reconstruction (Flat/Legacy)
When working with flat JSON output (used in some legacy modes or specific prompt configurations), the processor must reconstruct the graph.
* Mechanism: It identifies objects by their keys and reconstructs relationships manually.
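The documentation does not specify the flat format, so the following is only a sketch of the general idea of key-based reconstruction. The record layout (the _type, _key, and department_key fields) and the rebuild_graph helper are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical flat output: every record carries a type tag and a temporary key,
# and children point at their parent's key.
flat_records = [
    {"_type": "Department", "_key": "d1", "name": "Engineering"},
    {"_type": "Employee", "_key": "e1", "name": "Ada", "department_key": "d1"},
    {"_type": "Employee", "_key": "e2", "name": "Grace", "department_key": "d1"},
]


def rebuild_graph(records):
    """Nest employees under their department by matching keys."""
    departments = {}
    children = defaultdict(list)

    # First pass: index parents by key and bucket children by parent key.
    for rec in records:
        if rec["_type"] == "Department":
            departments[rec["_key"]] = {"name": rec["name"], "employees": []}
        elif rec["_type"] == "Employee":
            children[rec["department_key"]].append({"name": rec["name"]})

    # Second pass: attach each bucket to its parent, skipping orphaned children.
    for key, employees in children.items():
        if key in departments:
            departments[key]["employees"].extend(employees)

    return list(departments.values())


graph = rebuild_graph(flat_records)
```

Two passes keep the logic independent of record order, so a child that appears before its parent in the flat output is still attached correctly.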
Foreign Key Recovery
In Hierarchical Extraction, child objects (like Employees) are extracted in separate steps from their parents (Departments).
The Problem: The child objects generated by the LLM don’t know the real database IDs of their parents (since the parents might have just been inserted).
The Solution:
1. The BatchPipeline tracks the “Parent Context” for every child batch.
2. When hydrating the child, the ResultProcessor injects the correct parent_id based on this context.
3. This ensures that even though parents and children were extracted separately, the database relationships are correctly preserved.
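The injection step can be sketched as follows. This is a minimal illustration using standard-library dataclasses rather than the library's actual classes: ParentContext, hydrate_children, and the department_id field name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    name: str
    department_id: Optional[int] = None  # foreign key, unknown to the LLM


@dataclass
class ParentContext:
    # Hypothetical: the real database ID of the parent tracked for a child batch.
    parent_id: int


def hydrate_children(raw_children, context, fk_field="department_id"):
    """Build child objects, overwriting the foreign key with the tracked parent ID."""
    hydrated = []
    for raw in raw_children:
        data = dict(raw)  # copy so the LLM output is left untouched
        data[fk_field] = context.parent_id  # replaces any value the LLM guessed
        hydrated.append(Employee(**data))
    return hydrated


# The parent was inserted first and received ID 42; the child batch was extracted later.
context = ParentContext(parent_id=42)
raw_batch = [{"name": "Ada"}, {"name": "Grace", "department_id": 999}]
employees = hydrate_children(raw_batch, context)
```

Overwriting rather than filling only missing values is a deliberate choice in this sketch: any ID the LLM emitted on its own is a guess, so the tracked context is treated as the single source of truth.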