API Reference
PyGraft: Configurable generation of Schemas & Knowledge Graphs.
PyGraft provides APIs for generating synthetic RDF knowledge graphs from OWL ontologies with configurable size, typing, and constraint enforcement.
Exported Functions:
- create_config: Create a configuration file with default values.
- generate_schema: Generate OWL schema (classes + relations).
- extract_ontology: Extract metadata from an existing ontology.
- generate_kg: Generate instance-level KG triples.
- explain_kg: Analyze KG for logical inconsistencies.
Exported Types:
- ClassInfoDict: Type for class_info.json structure.
- RelationInfoDict: Type for relation_info.json structure.
- PyGraftConfigDict: Type for configuration dictionary.
- KGInfoDict: Type for kg_info.json structure.
create_config
create_config(*, config_format: str = 'json', output_dir: str | Path | None = None) -> Path
Create a new PyGraft configuration file with default values.
Generates a template configuration file that can be customized for schema and knowledge graph generation.
Parameters:
- config_format (str, default: 'json') – Output format for the configuration file. Must be one of "json", "yml", or "yaml".
- output_dir (str | Path | None, default: None) – Directory where the configuration file will be written. If None, uses the current working directory.
Returns:
- Path – Path to the created configuration file.
Raises:
- ValueError – If config_format is not a supported format.
Example
from pygraft import create_config
# Create JSON config in current directory
config_path = create_config()
# Create YAML config in specific directory
config_path = create_config(config_format="yaml", output_dir="./configs")
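Because create_config() writes a template meant to be customized, a small helper for editing the JSON variant may be useful. The following is a minimal sketch using only the standard library. The helper name update_json_config is hypothetical, and it assumes the settings to change are top-level keys of the JSON object. The real template may nest its settings, so check the generated file first.

```python
import json
from pathlib import Path
from typing import Any


def update_json_config(config_path: Path, overrides: dict[str, Any]) -> dict[str, Any]:
    """Load a JSON config, apply top-level overrides, and write it back.

    Hypothetical helper, not part of PyGraft. Unknown keys are rejected so
    that a typo does not silently add a setting the template never defined.
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"Keys not present in the template: {unknown}")
    config.update(overrides)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

A call would look like update_json_config(config_path, {"some_key": 42}), where some_key stands in for whatever key the generated template actually contains.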
generate_schema
generate_schema(config_path: str, *, output_root: str | Path | None = None) -> tuple[Path, bool]
Generate a synthetic OWL schema from configuration parameters.
Creates an ontology with classes, relations, and OWL constraints based on the statistical parameters defined in the configuration file. The generated schema is validated for consistency using the HermiT reasoner.
This is typically the first step in the fully synthetic workflow,
followed by generate_kg().
Parameters:
- config_path (str) – Path to a PyGraft configuration file (JSON or YAML).
- output_root (str | Path | None, default: None) – Base directory for output files. If None, uses ./output_pygraft/ in the current working directory.
Returns:
- tuple[Path, bool] – A tuple containing:
  - schema_path: Path to the generated schema file (.ttl, .rdf, or .nt).
  - is_consistent: True if HermiT confirms the schema is logically consistent.
Raises:
- ValueError – If the configuration file is invalid or contains incompatible parameter combinations.
- FileNotFoundError – If the configuration file does not exist.
Example
from pygraft import generate_schema
schema_path, is_consistent = generate_schema("pygraft.config.json")
if is_consistent:
    print(f"Schema generated at: {schema_path}")
else:
    print("Warning: Schema has logical inconsistencies")
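The documented exceptions and the consistency flag can be handled in one place before moving on to generate_kg(). The sketch below is one possible wrapper, not part of PyGraft. The name run_schema_step is hypothetical, and the generator is passed in as a callable so the snippet runs without PyGraft installed. In practice you would pass generate_schema itself.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable

# Anything with the documented shape: config path in, (schema_path, is_consistent) out.
SchemaGenerator = Callable[[str], "tuple[Path, bool]"]


def run_schema_step(generate: SchemaGenerator, config_path: str) -> Path | None:
    """Return the schema path only if generation succeeded and is consistent."""
    try:
        schema_path, is_consistent = generate(config_path)
    except FileNotFoundError:
        print(f"Configuration file not found: {config_path}")
        return None
    except ValueError as exc:
        print(f"Invalid configuration: {exc}")
        return None
    if not is_consistent:
        print("Schema is inconsistent; skipping KG generation")
        return None
    return schema_path
```

With PyGraft available, run_schema_step(generate_schema, "pygraft.config.json") would return a path only when it is safe to continue with the next step.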
extract_ontology
extract_ontology(ontology_path: str | Path, *, output_root: str | Path | None = None) -> tuple[Path, Path, Path]
Extract metadata from an existing ontology into PyGraft JSON artefacts.
Analyzes the structure of an OWL ontology and extracts statistics about its classes and relations. The extracted metadata is written as JSON files that can be used to configure KG generation.
This is the first step in the ontology-based workflow, followed by
generate_kg().
Parameters:
- ontology_path (str | Path) – Path to the ontology file (.ttl, .rdf, .owl, or .xml).
- output_root (str | Path | None, default: None) – Base directory for output files. If None, uses ./output_pygraft/ in the current working directory.
Returns:
- tuple[Path, Path, Path] – A tuple of paths to the generated JSON artefacts:
  - namespaces_path: Namespace prefix mappings (namespaces_info.json).
  - class_info_path: Class hierarchy statistics (class_info.json).
  - relation_info_path: Relation statistics (relation_info.json).
Raises:
- FileNotFoundError – If the ontology file does not exist.
Note
The original ontology is also copied to the output directory as
schema.ttl or schema.rdf for use with generate_kg().
Example
from pygraft import extract_ontology
namespaces, classes, relations = extract_ontology("./ontologies/pizza.ttl")
print(f"Extracted class info to: {classes}")
# Now run: pygraft kg pygraft.config.json
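Since the three artefacts are plain JSON files, they can be inspected with the standard library before running KG generation. The sketch below lists the top-level keys of each file. The helper name top_level_keys is hypothetical, and nothing is assumed about the actual keys PyGraft writes.

```python
import json
from pathlib import Path
from typing import Iterable, Union


def top_level_keys(artefact_paths: Iterable[Union[str, Path]]) -> dict[str, list[str]]:
    """Map each JSON artefact's file name to its sorted top-level keys.

    Hypothetical helper, not part of PyGraft. Files whose root is not a
    JSON object are reported with an empty key list.
    """
    summary: dict[str, list[str]] = {}
    for raw_path in artefact_paths:
        path = Path(raw_path)
        data = json.loads(path.read_text(encoding="utf-8"))
        summary[path.name] = sorted(data) if isinstance(data, dict) else []
    return summary
```

Passing the tuple returned by extract_ontology() directly, as in top_level_keys((namespaces, classes, relations)), gives a quick overview of what was extracted.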
generate_kg
generate_kg(config_path: str, *, output_root: str | Path | None = None) -> tuple[KGInfoDict, str, bool | None]
Generate a Knowledge Graph from an existing schema.
Creates entity instances and relation triples based on the configuration
parameters. Requires a schema to already exist in the project folder,
either from generate_schema() (synthetic workflow) or extract_ontology()
(ontology-based workflow).
Parameters:
- config_path (str) – Path to a PyGraft configuration file (JSON or YAML).
- output_root (str | Path | None, default: None) – Base directory for output files. If None, uses ./output_pygraft/ in the current working directory.
Returns:
- tuple[KGInfoDict, str, bool | None] – A tuple containing:
  - kg_info: Dictionary with generation statistics (entity counts, triple counts, type distributions).
  - kg_path: Path to the generated KG file (.ttl, .rdf, or .nt).
  - is_consistent: Consistency check result:
    - True if HermiT confirms the KG is logically consistent.
    - False if HermiT detects inconsistencies.
    - None if consistency checking was disabled in the configuration.
Raises:
- ValueError – If the configuration is invalid.
- FileNotFoundError – If the configuration file or required schema does not exist.
Example
from pygraft import generate_kg
kg_info, kg_path, is_consistent = generate_kg("pygraft.config.json")
print(f"Generated {kg_info['num_triples']} triples")
print(f"KG written to: {kg_path}")
if is_consistent is False:
    print("Warning: KG has inconsistencies. Run 'pygraft explain' for details.")
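The third return value has three states, so a plain truthiness test such as `if not is_consistent` would treat "not checked" (None) the same as "inconsistent" (False). The sketch below makes the three cases explicit. The function name describe_consistency is hypothetical and not part of PyGraft.

```python
from typing import Optional


def describe_consistency(is_consistent: Optional[bool]) -> str:
    """Turn the tri-state consistency result into a readable label."""
    if is_consistent is None:
        # Consistency checking was disabled in the configuration.
        return "not checked"
    return "consistent" if is_consistent else "inconsistent"
```

This is why the example above compares with `is False` rather than relying on truthiness.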
explain_kg
explain_kg(kg_path: str | Path, *, reasoner: str = 'pellet') -> tuple[bool, str | None]
Analyze a Knowledge Graph for logical inconsistencies.
Runs OWL reasoners to check consistency and provide detailed explanations when inconsistencies are found. The schema is automatically inferred from the same directory as the KG file.
Parameters:
- kg_path (str | Path) – Path to the knowledge graph file (.ttl, .rdf, or .nt).
- reasoner (str, default: 'pellet') – Which reasoner(s) to use:
  - "hermit": Fast consistency check only (no explanation).
  - "pellet": Detailed explanations (default).
  - "both": HermiT first, then Pellet if inconsistent.
Returns:
- tuple[bool, str | None] – A tuple containing:
  - is_consistent: True if consistent, False if inconsistent.
  - explanation: Human-readable explanation of inconsistencies, or None if consistent or using HermiT-only mode.
Raises:
- FileNotFoundError – If the KG file or inferred schema file does not exist.
- ValueError – If the KG file has an unsupported extension or reasoner is not one of the valid options.
Example
from pygraft import explain_kg
is_consistent, explanation = explain_kg("./output_pygraft/my-project/kg.ttl")
if not is_consistent:
    print("Inconsistency detected:")
    print(explanation)
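The "both" option is described as running a fast check first and requesting an explanation only when that check fails. The control flow can be sketched as below. This illustrates the described strategy and is not PyGraft's implementation. The names check_then_explain, fast_check, and detailed_explain are hypothetical stand-ins for the two reasoner calls.

```python
from typing import Callable, Optional


def check_then_explain(
    kg_path: str,
    fast_check: Callable[[str], bool],
    detailed_explain: Callable[[str], str],
) -> tuple[bool, Optional[str]]:
    """Run the cheap consistency check first; explain only on failure."""
    if fast_check(kg_path):
        # Consistent: the slower explanation step is skipped entirely.
        return True, None
    return False, detailed_explain(kg_path)
```

The return shape mirrors the documented (is_consistent, explanation) tuple, with explanation set to None whenever the graph is consistent.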