llm_extraction_consensus.utils package

Submodules

llm_extraction_consensus.utils.flattening_utils module

flatten_json(nested_json: Dict[str, Any] | List[Any], parent_path: Tuple[int | str, ...] = (), separator: str = '.') -> Dict[Tuple[int | str, ...], str | int | float | bool | None]

Flattens a nested JSON-like dictionary or list into a flat dictionary.

Keys in the flat dictionary are tuples representing the path to the value. List elements are accessed by their integer index in the path.

Parameters:
  • nested_json – The nested dictionary or list to flatten.

  • parent_path – The base path for the current level of recursion. Used internally for recursive calls.

  • separator – Currently unused. Paths are returned as tuples, so no joining takes place; the parameter is kept for a possible future string-path representation, where it would join the path elements.

Returns:

A flat dictionary where keys are path tuples and values are primitive JSON values.

Examples

>>> flatten_json({'a': 1, 'b': {'c': 2, 'd': [3, 4]}})
{('a',): 1, ('b', 'c'): 2, ('b', 'd', 0): 3, ('b', 'd', 1): 4}
>>> flatten_json([{'x': 5}, {'y': 6}])
{(0, 'x'): 5, (1, 'y'): 6}
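
The recursion described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the name flatten_sketch and the type aliases are hypothetical, the unused separator parameter is omitted, and the sketch simply drops empty dicts and lists, which may differ from what flatten_json does.

```python
from typing import Any, Dict, List, Tuple, Union

# Hypothetical aliases, introduced only for this sketch.
Path = Tuple[Union[int, str], ...]
Primitive = Union[str, int, float, bool, None]


def flatten_sketch(
    nested: Union[Dict[str, Any], List[Any]],
    parent_path: Path = (),
) -> Dict[Path, Primitive]:
    """Illustrative stand-in for flatten_json (assumed behaviour)."""
    flat: Dict[Path, Primitive] = {}
    # Dicts contribute their keys to the path; lists contribute integer indices.
    items = nested.items() if isinstance(nested, dict) else enumerate(nested)
    for key, value in items:
        path = parent_path + (key,)
        if isinstance(value, (dict, list)):
            # Recurse, passing the path built so far as parent_path.
            flat.update(flatten_sketch(value, path))
        else:
            flat[path] = value
    return flat


print(flatten_sketch({'a': 1, 'b': {'c': 2, 'd': [3, 4]}}))
```

Tuple paths keep integer list indices distinct from string keys such as '0', which a joined string path like 'b.d.0' could not do.
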

unflatten_json(flat_json: Dict[Tuple[int | str, ...], str | int | float | bool | None]) -> Dict[str, Any] | List[Any] | str | int | float | bool | None

Unflattens a flat dictionary (with tuple paths) back into a nested JSON-like structure.

Parameters:

flat_json – A flat dictionary where keys are path tuples and values are primitives.

Returns:

The reconstructed nested dictionary or list. Returns None if flat_json is empty. If the only key is the empty path (e.g. {(): "value"}), returns that single value unchanged.

Examples

>>> unflatten_json({('a',): 1, ('b', 'c'): 2, ('b', 'd', 0): 3, ('b', 'd', 1): 4})
{'a': 1, 'b': {'c': 2, 'd': [3, 4]}}
>>> unflatten_json({(0, 'x'): 5, (1, 'y'): 6})
[{'x': 5}, {'y': 6}]
>>> unflatten_json({(): "hello"})
'hello'

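One way the reconstruction might work is sketched below, under stated assumptions: the name unflatten_sketch is hypothetical, a level whose keys are all integers is treated as a list, and gaps in list indices are closed up rather than padded. The package's own handling of sparse indices or conflicting paths is not documented here and may differ.

```python
from typing import Any, Dict, Tuple, Union

# Hypothetical alias, introduced only for this sketch.
Path = Tuple[Union[int, str], ...]


def unflatten_sketch(flat: Dict[Path, Any]) -> Any:
    """Illustrative stand-in for unflatten_json (assumed behaviour)."""
    if not flat:
        return None
    if () in flat:
        # Empty path: the value itself is the whole document.
        return flat[()]

    # Step 1: build a tree of plain dicts keyed by path elements.
    tree: Dict[Any, Any] = {}
    for path, value in flat.items():
        node = tree
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value

    # Step 2: convert every dict whose keys are all integers into a list.
    def to_lists(node: Any) -> Any:
        if not isinstance(node, dict):
            return node
        converted = {k: to_lists(v) for k, v in node.items()}
        if converted and all(isinstance(k, int) for k in converted):
            return [converted[i] for i in sorted(converted)]
        return converted

    return to_lists(tree)


print(unflatten_sketch({('a',): 1, ('b', 'c'): 2, ('b', 'd', 0): 3, ('b', 'd', 1): 4}))
```

Building with dicts first and converting afterwards means the input keys can arrive in any order, since list positions are only fixed once all indices are known.
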
llm_extraction_consensus.utils.json_validation_utils module

is_json_valid(json_data_to_validate: Any, json_schema_definition: Dict[str, Any]) -> bool

Validates JSON data against a JSON schema.

Parameters:
  • json_data_to_validate – The Python object (e.g., dict, list) to validate. This should be the result of json.loads() if the input was a string.

  • json_schema_definition – The JSON schema as a Python dictionary.

Returns:

True if the JSON data is valid against the schema, False otherwise.
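
The sketch below illustrates the calling pattern: parse the string with json.loads() first, then pass the resulting Python object and the schema dictionary to a validator that returns a bool. The validator here, is_valid_sketch, is a hypothetical stdlib-only stand-in that understands only a small subset of JSON Schema (type as a single string, required, properties, items). It is not how is_json_valid is implemented; a real implementation would more likely delegate to a full schema validation library.

```python
import json
from typing import Any, Dict

# Maps JSON Schema type names to Python types (subset only).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def is_valid_sketch(data: Any, schema: Dict[str, Any]) -> bool:
    """Hypothetical stand-in for is_json_valid; handles a tiny schema subset."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if isinstance(data, bool) and expected in ("integer", "number"):
            return False
        if not isinstance(data, _TYPES[expected]):
            return False
    if isinstance(data, dict):
        # Every required key must be present.
        if any(key not in data for key in schema.get("required", [])):
            return False
        # Present properties must satisfy their sub-schemas.
        for key, subschema in schema.get("properties", {}).items():
            if key in data and not is_valid_sketch(data[key], subschema):
                return False
    if isinstance(data, list) and "items" in schema:
        return all(is_valid_sketch(item, schema["items"]) for item in data)
    return True


schema = {
    "type": "object",
    "required": ["name"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
}

# Parse the raw string first; the validator receives Python objects, not text.
print(is_valid_sketch(json.loads('{"name": "Ada", "age": 36}'), schema))  # True
print(is_valid_sketch(json.loads('{"age": "36"}'), schema))  # False
```

Returning a bool instead of raising keeps the call usable directly in a condition, at the cost of not reporting which rule failed.
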

llm_extraction_consensus.utils.llm_output_processing module

Module contents

Utility modules for the Extrai project.

Provides common helper functions and classes, such as ID management.