Consistency Checking#

Consistency checking validates that generated schemas and knowledge graphs are logically coherent according to OWL semantics.

On this page:

How It Works - Understanding HermiT and Pellet reasoners
Checking Consistency - Automatic vs manual validation
Performance Considerations - Runtime expectations and optimization
Configuration - Enabling/disabling consistency checks
Java Requirements - Memory configuration for reasoners

Consistency ≠ Correctness

A consistent result only means no ontology rules were violated. It does NOT mean your KG is correct.

Critical Limitation #1: Consistent ≠ Correct

A KG can pass consistency checking and still contain errors or nonsensical data, because the reasoner only checks the constraints the ontology actually declares.

Example: An empty ontology with no constraints will validate any KG as "consistent" since there are no rules to violate, regardless of whether the data is actually correct.

The KG generator is designed to produce correct data, but errors are possible.
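The empty-ontology example can be made concrete with a toy model. The sketch below is not an OWL reasoner and does not use PyGraft-gen; the `is_consistent` and `disjoint` helpers and the triple format are hypothetical, chosen only to show that a check with no rules accepts anything.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Constraint = Callable[[Set[Triple]], bool]


def is_consistent(triples: Set[Triple], constraints: Iterable[Constraint]) -> bool:
    """Toy check: the data is 'consistent' if no constraint is violated."""
    return all(check(triples) for check in constraints)


def disjoint(class_a: str, class_b: str) -> Constraint:
    """Toy disjointness rule: no subject may be typed with both classes."""
    def check(triples: Set[Triple]) -> bool:
        in_a = {s for s, p, o in triples if p == "type" and o == class_a}
        in_b = {s for s, p, o in triples if p == "type" and o == class_b}
        return not (in_a & in_b)
    return check


# Factually wrong data that breaks no declared rule.
wrong_but_consistent = {
    ("paris", "type", "City"),
    ("paris", "capitalOf", "germany"),
}

# Data that breaks the disjointness rule, if that rule is declared.
clash = wrong_but_consistent | {("paris", "type", "Person")}

rules = [disjoint("City", "Person")]
print(is_consistent(wrong_but_consistent, rules))  # no rule covers capitalOf
print(is_consistent(clash, rules))                 # violates disjointness
print(is_consistent(clash, []))                    # empty rule set accepts anything
```

The second KG is rejected only because a rule covering it exists; with an empty rule set both KGs pass.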

Critical Limitation #2: Generation vs Validation Gap

This applies to the ontology extraction workflow:

KG generation uses: class_info.json, relation_info.json, namespaces_info.json
Consistency checking uses: Full ontology + KG

If your ontology contains constraints not captured in the info files (unsupported OWL constructs), those constraints are:

NOT enforced during generation
BUT validated during consistency checking

Result: A KG can be generated "correctly" from the info files yet still fail consistency checking against the full ontology.

As PyGraft-gen evolves to support additional OWL constructs (see What's Supported), this gap will narrow.

How It Works#

PyGraft-gen uses two OWL reasoners via Owlready2 to validate your schemas and knowledge graphs. Each reasoner serves a different purpose: HermiT provides fast yes/no answers, while Pellet explains what went wrong.

HermiT - Fast consistency validation (yes/no answer)
Pellet - Detailed inconsistency explanations (identifies problematic axioms)

Technical Process

HermiT and Pellet only accept RDF/XML format
If your schema/KG uses Turtle (.ttl) or N-Triples (.nt), PyGraft-gen automatically creates a temporary RDF/XML file
For KG validation, schema and KG are merged into a single temporary RDF/XML file
Reasoner runs on the temporary file
Temporary files are cleaned up automatically after reasoning

This conversion is transparent - you can use any RDF format and PyGraft-gen handles the rest.
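The merge-and-convert steps above can be sketched with rdflib. This is an illustration under assumptions, not PyGraft-gen's actual implementation: the function names and the `reasoner` callable are hypothetical, and rdflib is assumed to be installed.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def merge_to_rdfxml(schema_path: str, kg_path: str) -> Path:
    """Merge a schema and a KG into one temporary RDF/XML file."""
    # Third-party dependency, imported here so the module loads without it.
    from rdflib import Graph

    merged = Graph()
    merged.parse(schema_path)  # input format is guessed from the file extension
    merged.parse(kg_path)      # triples from both files land in the same graph

    fd, tmp_name = tempfile.mkstemp(suffix=".rdf")
    os.close(fd)  # rdflib reopens the file by name
    merged.serialize(destination=tmp_name, format="xml")
    return Path(tmp_name)


def run_on_merged(schema_path: str, kg_path: str, reasoner: Callable[[Path], bool]) -> bool:
    """Run a reasoner callable (hypothetical) on the merged file, then clean up."""
    tmp_path = merge_to_rdfxml(schema_path, kg_path)
    try:
        return reasoner(tmp_path)
    finally:
        tmp_path.unlink(missing_ok=True)  # remove the temp file even on failure
```

The `try`/`finally` mirrors the cleanup step: the temporary file is removed whether the reasoner succeeds or raises.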

HermiT Reasoner#

HermiT performs fast consistency checking and reports whether an ontology is consistent or inconsistent.

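What it validates:

HermiT checks whether all axioms and assertions in the input can hold together under OWL 2 semantics. See the OWL 2 specification or HermiT documentation for details on OWL reasoning.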

Behavior with datatypes:

HermiT is configured with ignore_unsupported_datatypes=True, meaning it will skip over datatypes it doesn't recognize rather than failing. This allows reasoning to proceed even with custom or uncommon datatype definitions.

Limitations:

Cannot provide details - HermiT only reports consistent/inconsistent, not which axioms are problematic
Validates constraints only - If your schema has few constraints, most KGs will pass validation regardless of data quality

Pellet Reasoner#

Pellet provides detailed explanations when inconsistencies are detected, identifying which specific axioms cause contradictions.

When to use:

KG reported as inconsistent by HermiT
Need to identify which specific axioms are problematic
Debugging schema or generation issues

Pellet Performance

Pellet is significantly slower than HermiT. On large KGs, Pellet can take hours or may not complete. Use it only on small to medium KGs, and only for debugging.

Checking Consistency#

Now that you understand the reasoners, let's look at how to use them. Consistency checking happens in two contexts: automatically during generation, or manually as a standalone process.

During Generation (Automatic)#

When check_kg_consistency is set to true in your config, HermiT runs automatically after KG generation:

{
  "kg": {
    "check_kg_consistency": true
  }
}

CLI:

pygraft kg pygraft.config.json
# HermiT runs automatically if check_kg_consistency: true

Python API:

from pygraft import generate_kg

kg_info, kg_file, is_consistent = generate_kg("pygraft.config.json")

if is_consistent:
    print("KG is consistent!")
else:
    print("KG is inconsistent - use pygraft explain to debug")

Result: You get a boolean answer (consistent/inconsistent) but no details about what's wrong.

Standalone Checking (Manual)#

You can check consistency of any KG file at any time using the pygraft explain command.

When to use:

Debugging inconsistent KGs without re-running generation
Testing existing KG files
Choosing which reasoner to use (HermiT, Pellet, or both)

CLI:

# Default: uses Pellet (slow but detailed)
pygraft explain path/to/kg.ttl

# Check with HermiT only (fast, yes/no answer)
pygraft explain path/to/kg.ttl --reasoner hermit

# Explicit Pellet (same as default)
pygraft explain path/to/kg.ttl --reasoner pellet

# Run both reasoners (HermiT first, then Pellet if inconsistent)
pygraft explain path/to/kg.ttl --reasoner both

Python API:

from pygraft import explain_kg

# Default: uses Pellet
is_consistent = explain_kg("path/to/kg.ttl")

# Check with HermiT only
is_consistent = explain_kg("path/to/kg.ttl", reasoner="hermit")

# Explicit Pellet (same as default)
is_consistent = explain_kg("path/to/kg.ttl", reasoner="pellet")

# Run both
is_consistent = explain_kg("path/to/kg.ttl", reasoner="both")

Reasoner options:

Option	Behavior
`pellet` (default)	Run Pellet only (slow, detailed explanation)
`hermit`	Run HermiT only (fast, yes/no answer)
`both`	Run HermiT first; if inconsistent, run Pellet for explanation
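The behavior in the table can be sketched as a small dispatch function. This is an assumption about the control flow, not PyGraft-gen's source: `run_reasoners` is a hypothetical name, and `hermit` and `pellet` are placeholder callables that take a KG path and return True when the KG is consistent.

```python
from typing import Callable

Check = Callable[[str], bool]  # takes a KG path, returns True if consistent


def run_reasoners(kg_path: str, reasoner: str = "pellet", *, hermit: Check, pellet: Check) -> bool:
    """Dispatch to one or both reasoners, following the options table."""
    if reasoner == "hermit":
        return hermit(kg_path)
    if reasoner == "pellet":
        return pellet(kg_path)
    if reasoner == "both":
        if hermit(kg_path):
            return True  # consistent: the slow Pellet run is skipped
        # Inconsistent: run Pellet so it can explain the contradiction.
        return pellet(kg_path)
    raise ValueError(f"unknown reasoner: {reasoner!r}")
```

With `both`, the expensive Pellet run happens only when HermiT has already reported an inconsistency, which is why this option costs little on consistent KGs.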

Recommended Workflow

Generate KG with check_kg_consistency: true to get quick HermiT validation
If inconsistent, run pygraft explain kg.ttl (uses Pellet by default) to debug
This avoids redundant HermiT execution and lets you explain any KG without regenerating

Performance Considerations#

Consistency checking performance varies dramatically based on your graph size and schema complexity. Understanding these trade-offs helps you choose the right validation strategy.

Typical Runtime#

Graph Size	HermiT	Pellet
Small schemas	Seconds	Seconds
Small KGs (10K entities)	Seconds to minutes	Minutes
Medium KGs (100K entities)	Minutes to tens of minutes	Tens of minutes to hours
Large KGs (1M+ entities)	Hours or may not complete	Likely will not complete

Factors Affecting Performance#

Number of entities and triples
Schema complexity (disjointness, property characteristics)
Number of constraints to validate

Large KG Workflow

For large KGs (1M+ entities):

Test configuration on small KG (1K-10K entities) with checking enabled
Verify consistency
Test on medium KG (100K entities) with checking enabled
Once validated, disable checking for large production KGs
Generate large KG with "check_kg_consistency": false

This validates generation logic without waiting hours on massive graphs.
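One way to script this staged workflow is to derive each run's config from a validated base config and switch checking off only at the large scale. The helper below is a hypothetical sketch: `config_for_size` is not a PyGraft-gen function, the 1M threshold is taken from the runtime table, and the entity count is passed in by the caller because the config key that controls KG size is not shown on this page.

```python
import copy
import json

LARGE_KG_ENTITIES = 1_000_000  # threshold taken from the runtime table above


def config_for_size(base_config: dict, n_entities: int) -> dict:
    """Return a copy of the config with KG checking disabled for large KGs."""
    cfg = copy.deepcopy(base_config)  # leave the validated base config untouched
    cfg.setdefault("kg", {})["check_kg_consistency"] = n_entities < LARGE_KG_ENTITIES
    return cfg


base = {"kg": {"check_kg_consistency": True}}
for size in (10_000, 100_000, 2_000_000):
    print(size, json.dumps(config_for_size(base, size)))
```

Each resulting dict can be written to its own JSON file and passed to `pygraft kg`.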

Understanding Results#

Once validation completes, you'll get one of two outcomes:

Consistent

(HermiT) Consistent schema

No logical contradictions detected. Your KG respects all ontology constraints.

Inconsistent

(HermiT) Inconsistent schema

Logical contradictions found. Use pygraft explain with Pellet to identify specific issues:

pygraft explain kg.ttl --reasoner pellet

Common causes:

Entities with disjoint types
Property domain/range violations
Functional property with multiple values
Conflicting property characteristics
Transitive property contradictions

Configuration#

Consistency checking behavior is controlled through your configuration file, with different rules for schemas and KGs.

Schema Consistency Checking

Schema consistency checking always runs and cannot be disabled. Every generated schema is validated for logical coherence.

KG Consistency Checking

Controlled in your config file:

{
  "kg": {
    "check_kg_consistency": true
  }
}

Setting	When to Use
`true`	Development and validation - automatic HermiT check after generation
`false`	Production after validation - skip automatic checking

Flexibility

Even with check_kg_consistency: false, you can still manually check any KG later using pygraft explain.

Java Requirements#

Both reasoners run in a Java Virtual Machine, which requires proper memory configuration to avoid crashes on medium to large ontologies.

Automatic Heap Configuration#

PyGraft-gen automatically configures the JVM heap to 85% of system RAM by default. This reduces the risk of OutOfMemoryError failures, which can occur when the JVM's default maximum heap (commonly a quarter of physical memory) is too small for the reasoner.

How it works:

PyGraft-gen checks if you've already set -Xmx via environment variables
If not configured, it detects system RAM and sets heap to 85%
Minimum heap: 1GB
Configuration happens transparently before the first reasoner run
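The sizing rule described above can be sketched as a pure function. This is an illustration, not PyGraft-gen's code: `heap_option` is a hypothetical name, and checking only `JAVA_TOOL_OPTIONS` is an assumption, since the page does not list which environment variables are inspected.

```python
from typing import Mapping, Optional

MIN_HEAP_MB = 1024   # 1GB floor, as stated above
RAM_FRACTION = 0.85  # share of system RAM given to the JVM heap


def heap_option(env: Mapping[str, str], ram_mb: int) -> Optional[str]:
    """Return a -Xmx flag, or None if the user has already set one."""
    # Assumption: only JAVA_TOOL_OPTIONS is inspected in this sketch.
    if "-Xmx" in env.get("JAVA_TOOL_OPTIONS", ""):
        return None  # respect the manual override
    heap_mb = max(MIN_HEAP_MB, int(ram_mb * RAM_FRACTION))
    return f"-Xmx{heap_mb}m"


print(heap_option({}, 16 * 1024))                             # 85% of 16GB
print(heap_option({}, 512))                                   # clamped to the 1GB floor
print(heap_option({"JAVA_TOOL_OPTIONS": "-Xmx8g"}, 16 * 1024))  # user override wins
```

Passing the environment and RAM size as arguments keeps the rule testable without touching the real machine; in practice the caller would supply `os.environ` and the detected RAM.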

Manual Override#

If you need to set a specific heap size:

export JAVA_TOOL_OPTIONS="-Xmx8g"  # 8GB heap
pygraft explain kg.ttl --reasoner pellet

This will override PyGraft-gen's automatic configuration.

Why This Matters

Without sufficient heap, reasoners will crash with OutOfMemoryError: Java heap space, especially on medium to large ontologies. PyGraft-gen's automatic configuration handles this for you.

To install Java

See Java Installation

What's Next#

Schema Generation - How ontologies are built
KG Generation - How instances are created
OWL Constraints - Understanding the constraints being validated