Consistency Checking#
Consistency checking validates that generated schemas and knowledge graphs are logically coherent according to OWL semantics.
On this page:
- How It Works - Understanding HermiT and Pellet reasoners
- Checking Consistency - Automatic vs manual validation
- Performance Considerations - Runtime expectations and optimization
- Configuration - Enabling/disabling consistency checks
- Java Requirements - Memory configuration for reasoners
Consistency ≠ Correctness
A consistent result only means no ontology rules were violated. It does NOT mean your KG is correct.
Critical Limitation #1: Consistent ≠ Correct
A KG can pass consistency checking but still contain errors or nonsensical data. "Consistent" only means no constraints were broken.
Example: An empty ontology with no constraints will validate any KG as "consistent" since there are no rules to violate, regardless of whether the data is actually correct.
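The point can be illustrated with a toy sketch in plain Python. This is not PyGraft-gen code and not an OWL reasoner: the function name find_disjointness_violations and the sample data are hypothetical, and only one kind of constraint (class disjointness) is modelled.

```python
def find_disjointness_violations(entity_types, disjoint_pairs):
    """Return (entity, class_a, class_b) for every entity typed with two disjoint classes."""
    violations = []
    for entity, types in sorted(entity_types.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in types and class_b in types:
                violations.append((entity, class_a, class_b))
    return violations


# Nonsensical data: a city typed as both a Person and a City.
entity_types = {"Paris": {"Person", "City"}}

# With no constraints there is nothing to violate, so the data counts as "consistent".
no_constraints = find_disjointness_violations(entity_types, disjoint_pairs=[])

# Once Person and City are declared disjoint, the same data fails the check.
with_constraint = find_disjointness_violations(
    entity_types, disjoint_pairs=[("Person", "City")]
)
```

The data is identical in both calls. Only the set of declared constraints decides whether a problem is reported.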
The KG generator is designed to produce correct data, but errors are possible.
Critical Limitation #2: Generation vs Validation Gap
This applies to the ontology extraction workflow:
- KG generation uses: class_info.json, relation_info.json, namespaces_info.json
- Consistency checking uses: Full ontology + KG
If your ontology contains constraints not captured in the info files (unsupported OWL constructs), those constraints are:
- NOT enforced during generation
- BUT validated during consistency checking
Result: A KG can be generated "correctly" from the info files yet still fail consistency checking against the full ontology.
As PyGraft-gen evolves to support additional OWL constructs (see What's Supported), this gap will narrow.
How It Works#
PyGraft-gen uses two OWL reasoners via Owlready2 to validate your schemas and knowledge graphs. Each reasoner serves a different purpose: HermiT provides fast yes/no answers, while Pellet explains what went wrong.
- HermiT - Fast consistency validation (yes/no answer)
- Pellet - Detailed inconsistency explanations (identifies problematic axioms)
Technical Process
- HermiT and Pellet only accept RDF/XML format
- If your schema/KG uses Turtle (.ttl) or N-Triples (.nt), PyGraft-gen automatically creates a temporary RDF/XML file
- For KG validation, schema and KG are merged into a single temporary RDF/XML file
- Reasoner runs on the temporary file
- Temporary files are cleaned up automatically after reasoning
This conversion is transparent - you can use any RDF format and PyGraft-gen handles the rest.
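The conversion and merge steps above could look roughly like the following sketch. It uses rdflib (a third-party library) and is not PyGraft-gen's actual implementation. The helper names rdflib_format and merge_to_temp_rdfxml and the extension-to-format mapping are assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path

# File extension -> rdflib parser name. This mapping is an assumption for the sketch.
RDFLIB_FORMATS = {".ttl": "turtle", ".nt": "nt", ".rdf": "xml", ".owl": "xml", ".xml": "xml"}


def rdflib_format(path):
    """Map a file extension to the rdflib format name used to parse it."""
    suffix = Path(path).suffix.lower()
    try:
        return RDFLIB_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported RDF file extension: {suffix!r}") from None


def merge_to_temp_rdfxml(paths):
    """Parse every input file into one graph and write it to a temporary RDF/XML file."""
    from rdflib import Graph  # third-party; imported lazily so the helper above needs no install

    merged = Graph()
    for path in paths:
        merged.parse(str(path), format=rdflib_format(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".rdf")
    os.close(fd)
    merged.serialize(destination=tmp_path, format="xml")
    return Path(tmp_path)


# Hypothetical usage; the caller deletes the temporary file after reasoning:
# tmp = merge_to_temp_rdfxml(["schema.ttl", "kg.ttl"])
# try:
#     ...  # run the reasoner on tmp
# finally:
#     tmp.unlink()
```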
HermiT Reasoner#
HermiT performs fast consistency checking and reports whether an ontology is consistent or inconsistent.
What it validates:
See the OWL 2 specification or HermiT documentation for details on OWL reasoning.
Behavior with datatypes:
HermiT is configured with ignore_unsupported_datatypes=True, meaning it will skip over datatypes it doesn't recognize rather than failing. This allows reasoning to proceed even with custom or uncommon datatype definitions.
Limitations:
- Cannot provide details - HermiT only reports consistent/inconsistent, not which axioms are problematic
- Validates constraints only - If your schema has few constraints, most KGs will pass validation regardless of data quality
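For reference, a yes/no HermiT check made directly through Owlready2 might look like the sketch below. This is an illustration, not PyGraft-gen's code. The function name hermit_is_consistent is hypothetical, the input is assumed to be an RDF/XML file, and a Java runtime must be installed for the reasoner to run.

```python
from pathlib import Path


def hermit_is_consistent(rdfxml_path):
    """Return True if HermiT finds no contradiction in the RDF/XML file, else False."""
    # Imported lazily: Owlready2 is third-party and needs Java to run its reasoners.
    from owlready2 import (
        OwlReadyInconsistentOntologyError,
        World,
        sync_reasoner_hermit,
    )

    world = World()  # isolated world, so repeated checks do not share state
    world.get_ontology(Path(rdfxml_path).resolve().as_uri()).load()
    try:
        # Skip datatypes HermiT does not recognize instead of failing on them.
        sync_reasoner_hermit(world, ignore_unsupported_datatypes=True)
    except OwlReadyInconsistentOntologyError:
        return False  # HermiT says "inconsistent" but gives no further detail
    return True
```

The boolean return value mirrors the limitation described above: the exception signals that a contradiction exists, not which axioms caused it.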
Pellet Reasoner#
Pellet provides detailed explanations when inconsistencies are detected, identifying which specific axioms cause contradictions.
When to use:
- KG reported as inconsistent by HermiT
- Need to identify which specific axioms are problematic
- Debugging schema or generation issues
Pellet Performance
Pellet is significantly slower than HermiT. On large KGs, Pellet can take hours or may not complete. Only use on small to medium KGs for debugging.
Checking Consistency#
Now that you understand the reasoners, let's look at how to use them. Consistency checking happens in two contexts: automatically during generation, or manually as a standalone process.
During Generation (Automatic)#
When check_kg_consistency: true in your config, HermiT runs automatically after KG generation:
{
  "kg": {
    "check_kg_consistency": true
  }
}
pygraft kg pygraft.config.json
# HermiT runs automatically if check_kg_consistency: true
from pygraft import generate_kg

kg_info, kg_file, is_consistent = generate_kg("pygraft.config.json")
if is_consistent:
    print("KG is consistent!")
else:
    print("KG is inconsistent - use pygraft explain to debug")
Result: You get a boolean answer (consistent/inconsistent) but no details about what's wrong.
Standalone Checking (Manual)#
You can check consistency of any KG file at any time using the pygraft explain command.
When to use:
- Debugging inconsistent KGs without re-running generation
- Testing existing KG files
- Choosing which reasoner to use (HermiT, Pellet, or both)
# Default: uses Pellet (slow but detailed)
pygraft explain path/to/kg.ttl
# Check with HermiT only (fast, yes/no answer)
pygraft explain path/to/kg.ttl --reasoner hermit
# Explicit Pellet (same as default)
pygraft explain path/to/kg.ttl --reasoner pellet
# Run both reasoners (HermiT first, then Pellet if inconsistent)
pygraft explain path/to/kg.ttl --reasoner both
from pygraft import explain_kg
# Default: uses Pellet
is_consistent = explain_kg("path/to/kg.ttl")
# Check with HermiT only
is_consistent = explain_kg("path/to/kg.ttl", reasoner="hermit")
# Explicit Pellet (same as default)
is_consistent = explain_kg("path/to/kg.ttl", reasoner="pellet")
# Run both
is_consistent = explain_kg("path/to/kg.ttl", reasoner="both")
Reasoner options:
| Option | Behavior |
|---|---|
| pellet (default) | Run Pellet only (slow, detailed explanation) |
| hermit | Run HermiT only (fast, yes/no answer) |
| both | Run HermiT first; if inconsistent, run Pellet for explanation |
Recommended Workflow
- Generate KG with check_kg_consistency: true to get quick HermiT validation
- If inconsistent, run pygraft explain kg.ttl (uses Pellet by default) to debug
- This avoids redundant HermiT execution and lets you explain any KG without regenerating
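In Python, the recommended workflow could be wrapped as in the sketch below. The helper generate_then_explain is hypothetical. It takes the two functions as arguments and assumes they behave like generate_kg and explain_kg as shown on this page, with check_kg_consistency set to true in the config.

```python
def generate_then_explain(config_path, generate, explain):
    """Generate a KG, and ask for an explanation only if it is inconsistent.

    `generate` and `explain` are expected to behave like pygraft.generate_kg
    and pygraft.explain_kg as documented on this page.
    """
    kg_info, kg_file, is_consistent = generate(config_path)
    if not is_consistent:
        # explain_kg defaults to Pellet, so HermiT is not run a second time.
        explain(kg_file)
    return kg_file, is_consistent


# Hypothetical usage:
# from pygraft import generate_kg, explain_kg
# kg_file, ok = generate_then_explain("pygraft.config.json", generate_kg, explain_kg)
```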
Performance Considerations#
Consistency checking performance varies dramatically based on your graph size and schema complexity. Understanding these trade-offs helps you choose the right validation strategy.
Typical Runtime#
| Graph Size | HermiT | Pellet |
|---|---|---|
| Small schemas | Seconds | Seconds |
| Small KGs (10K entities) | Seconds to minutes | Minutes |
| Medium KGs (100K entities) | Minutes to tens of minutes | Tens of minutes to hours |
| Large KGs (1M+ entities) | Hours or may not complete | Likely will not complete |
Factors Affecting Performance#
- Number of entities and triples
- Schema complexity (disjointness, property characteristics)
- Number of constraints to validate
Large KG Workflow
For large KGs (1M+ entities):
- Test configuration on small KG (1K-10K entities) with checking enabled
- Verify consistency
- Test on medium KG (100K entities) with checking enabled
- Once validated, disable checking for large production KGs
- Generate large KG with "check_kg_consistency": false
This validates generation logic without waiting hours on massive graphs.
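One way to apply this staged approach is to derive each run's config from a shared base, as in the sketch below. The helper with_kg_check and the 100,000-entity threshold are assumptions for illustration. The target entity count is passed as an argument because the config key that controls KG size is not shown on this page.

```python
import copy
import json


def with_kg_check(config, n_entities, max_checked_entities=100_000):
    """Return a copy of config with kg.check_kg_consistency set from the target KG size."""
    updated = copy.deepcopy(config)
    updated.setdefault("kg", {})["check_kg_consistency"] = n_entities <= max_checked_entities
    return updated


base = {"kg": {"check_kg_consistency": True}}
small = with_kg_check(base, n_entities=10_000)      # checking stays enabled
large = with_kg_check(base, n_entities=1_000_000)   # checking is disabled
print(json.dumps(large, indent=2))
```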
Understanding Results#
Once validation completes, you'll get one of two outcomes:
Consistent
(HermiT) Consistent schema
No logical contradictions detected. Your KG respects all ontology constraints.
Inconsistent
(HermiT) Inconsistent schema
Logical contradictions found. Use pygraft explain with Pellet to identify specific issues:
pygraft explain kg.ttl --reasoner pellet
Common causes:
- Entities with disjoint types
- Property domain/range violations
- Functional property with multiple values
- Conflicting property characteristics
- Transitive property contradictions
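To show what one of these causes looks like in data, the toy sketch below flags functional properties that carry more than one distinct literal value. It is plain Python with hypothetical names and data, not a substitute for a reasoner. For object properties the situation is subtler: OWL has no unique name assumption, so two values contradict each other only when the individuals are known to be different.

```python
from collections import defaultdict


def functional_violations(triples, functional_properties):
    """Return {(subject, property): values} where a functional property has >1 distinct literal."""
    values = defaultdict(set)
    for subject, prop, literal in triples:
        if prop in functional_properties:
            values[(subject, prop)].add(literal)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}


triples = [
    ("alice", "birthYear", 1990),
    ("alice", "birthYear", 1991),  # second, different value for a functional property
    ("bob", "birthYear", 1985),
    ("bob", "nickname", "Bobby"),
    ("bob", "nickname", "Rob"),    # fine: nickname is not declared functional
]
violations = functional_violations(triples, functional_properties={"birthYear"})
```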
Configuration#
Consistency checking behavior is controlled through your configuration file, with different rules for schemas and KGs.
Schema Consistency Checking
Always runs and cannot be disabled. Every generated schema is validated for logical coherence.
KG Consistency Checking
Controlled in your config file:
{
  "kg": {
    "check_kg_consistency": true
  }
}
| Setting | When to Use |
|---|---|
| true | Development and validation - automatic HermiT check after generation |
| false | Production after validation - skip automatic checking |
Flexibility
Even with check_kg_consistency: false, you can still manually check any KG later using pygraft explain.
Java Requirements#
Both reasoners run in a Java Virtual Machine, which requires proper memory configuration to avoid crashes on medium to large ontologies.
Automatic Heap Configuration#
PyGraft-gen automatically configures the JVM heap to 85% of system RAM by default. This prevents OutOfMemoryError failures that occur when the default heap is too small for the ontology being reasoned over.
How it works:
- PyGraft-gen checks if you've already set -Xmx via environment variables
- If not configured, it detects system RAM and sets heap to 85%
- Minimum heap: 1GB
- Configuration happens transparently before the first reasoner run
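The rules above can be expressed as a small sketch. This is not PyGraft-gen's implementation: the function name auto_heap_flag and the list of environment variables inspected are assumptions, and total RAM is passed in by the caller (it could come from, for example, psutil.virtual_memory().total).

```python
JAVA_ENV_VARS = ("JAVA_TOOL_OPTIONS", "_JAVA_OPTIONS")  # assumed list of variables to inspect


def auto_heap_flag(total_ram_bytes, env):
    """Return an -Xmx flag for 85% of RAM (minimum 1 GB), or None if one is already set."""
    if any("-Xmx" in env.get(name, "") for name in JAVA_ENV_VARS):
        return None  # respect the user's explicit heap setting
    heap_mb = max(1024, total_ram_bytes * 85 // 100 // (1024 * 1024))
    return f"-Xmx{heap_mb}m"


# Hypothetical usage:
# import os
# flag = auto_heap_flag(16 * 1024**3, os.environ)
```

Integer arithmetic keeps the result an exact number of megabytes, and the max() call enforces the 1GB floor on machines with little RAM.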
Manual Override#
If you need to set a specific heap size:
export JAVA_TOOL_OPTIONS="-Xmx8g" # 8GB heap
pygraft explain kg.ttl --reasoner pellet
This will override PyGraft-gen's automatic configuration.
Why This Matters
Without sufficient heap, reasoners will crash with OutOfMemoryError: Java heap space, especially on medium to large ontologies. PyGraft-gen's automatic configuration handles this for you.
What's Next#
- Schema Generation - How ontologies are built
- KG Generation - How instances are created
- OWL Constraints - Understanding the constraints being validated