Skip to content

Knowledge Graph-Guided Fine-Tuning: Using Structured Domain Knowledge to Guide Neuron Targeting

Literature Review

Author: Matthew Martz Date: November 24, 2025 Status: Comprehensive Survey for Paper 3


Table of Contents

  1. Introduction
  2. Theoretical Foundations
  3. Knowledge Graphs and Ontologies
  4. Knowledge Injection Methods
  5. Ontology-Guided LLM Adaptation
  6. Domain-Specific Applications
  7. Constraints and Validation
  8. Connection to ADAPT-Q
  9. Bibliography

1. Introduction

Large language models trained on broad corpora possess general knowledge but often lack the structured, precise domain knowledge required for specialized applications. Medical domains require understanding of disease taxonomies, drug interaction hierarchies, and clinical guidelines. Legal domains demand knowledge of statutory frameworks, case law precedents, and regulatory structures. Financial domains need market ontologies, risk taxonomies, and compliance frameworks.

Knowledge graphs (KGs) provide machine-readable representations of structured domain knowledge, encoding entities, relationships, and hierarchies in graph form. Ontologies formalize domain concepts, their properties, and relationships through logical axioms. This literature review examines how knowledge graphs and ontologies can guide parameter-efficient fine-tuning, ensuring adaptations respect domain constraints and preserve critical knowledge structures.

1.1 Motivation: Why Structure Matters

Unstructured fine-tuning risks: - Hallucinations: Models generate plausible but factually incorrect information - Constraint violations: Outputs violate domain rules (e.g., impossible drug combinations) - Knowledge degradation: Fine-tuning corrupts pretrained factual knowledge - Inconsistency: Model gives contradictory answers to logically equivalent questions

Structured knowledge benefits: - Grounding: Outputs anchored to validated domain knowledge - Consistency: Logical relationships enforced (if A causes B, then ¬B implies ¬A) - Explainability: Reasoning traceable through knowledge graph paths - Validation: Outputs verifiable against ontology constraints

1.2 Key Research Questions

  • How can knowledge graphs guide which neurons to target during fine-tuning?
  • What methods exist for injecting structured knowledge into LLMs?
  • How do ontologies ensure fine-tuned models respect domain constraints?
  • Can ADAPT-Q's neuron-level targeting be guided by knowledge graph structure?
  • What are trade-offs between knowledge injection via fine-tuning vs. retrieval?

2. Theoretical Foundations

2.1 Knowledge Representation

Symbolic AI tradition: - Logic-based representations (first-order logic, description logic) - Explicit encoding of knowledge (rules, ontologies, knowledge bases) - Reasoning through logical inference - Advantage: Interpretable, verifiable, composable - Limitation: Brittle, difficult to construct, poor with uncertainty

Neural AI paradigm: - Distributed representations (embeddings, neural activations) - Implicit knowledge encoded in weights - Reasoning through forward propagation - Advantage: Scalable, handles uncertainty, learns from data - Limitation: Black-box, prone to hallucination, inconsistent

Neuro-symbolic integration [Garcez et al., 2019]: - Combine symbolic knowledge with neural learning - Knowledge graphs provide structure, neural networks provide generalization - Goal: Interpretable, verifiable, and robust AI systems

2.2 Knowledge Graphs: Formalism

Definition: A knowledge graph is a directed labeled graph \(\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{T})\) where: - \(\mathcal{E}\) is the set of entities (nodes) - \(\mathcal{R}\) is the set of relations (edge labels) - \(\mathcal{T} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}\) is the set of triples (facts)

Example triple: (Aspirin, treats, Headache) represents fact that aspirin treats headaches.

Knowledge graph embedding: Map entities and relations to continuous vectors while preserving graph structure [Bordes et al., 2013]: $$ \mathbf{h} + \mathbf{r} \approx \mathbf{t} $$ where \(\mathbf{h}, \mathbf{r}, \mathbf{t}\) are embeddings of head entity, relation, and tail entity.

Common KG embedding methods: - TransE [Bordes et al., 2013]: Translation-based, \(\mathbf{h} + \mathbf{r} = \mathbf{t}\) - DistMult [Yang et al., 2015]: Bilinear scoring, captures symmetric relations - ComplEx [Trouillon et al., 2016]: Complex embeddings, handles asymmetric relations - RotatE [Sun et al., 2019]: Rotation-based, models hierarchies and paths

2.3 Ontologies: Formal Specification

Definition: An ontology is a formal specification of a shared conceptualization [Gruber, 1993], typically including: - Classes: Concepts in domain (e.g., Disease, Drug, Symptom) - Instances: Specific entities (e.g., "Type 2 Diabetes") - Properties: Relationships and attributes (e.g., hasSymptom, treats) - Axioms: Logical constraints (e.g., disjointness, cardinality)

Description Logic foundation: Ontologies often expressed in description logics (e.g., \(\mathcal{ALC}\), \(\mathcal{SHOIN}\)), enabling automated reasoning [Baader et al., 2003].

Example axiom (medical domain): $$ \text{Drug} \sqcap \text{Antibiotic} \sqsubseteq \neg \text{Vaccine} $$ (Antibiotics and vaccines are disjoint classes)

Ontology languages: - OWL (Web Ontology Language): W3C standard, based on description logic - RDF/RDFS: Resource Description Framework, simpler triple-based representation

2.4 Neural-Symbolic Integration Paradigms

Knowledge-to-text: Convert KG triples to natural language, train LLM on generated text [Logan et al., 2019] - Example: (Aspirin, treats, Headache) → "Aspirin treats headaches." - Limitation: Loses graph structure, relies on verbalization quality

Graph neural networks: Process KG with graph convolutions, integrate with LLM [Yasunaga et al., 2021] - Encode KG with GNN, fuse graph embeddings into LLM - Challenge: Architecturally complex, difficult to scale

Knowledge-augmented attention: Modify attention to attend over KG entities [Wang et al., 2021] - Retrieve relevant KG subgraph for input - Augment attention with graph-based bias - Trade-off: Retrieval overhead, limited KG coverage

Knowledge-constrained generation: Enforce KG constraints during decoding [Liu et al., 2021] - Constrain output vocabulary to KG-consistent tokens - Prune beams violating ontology constraints - Limitation: Requires explicit constraint checking, slows generation


3. Knowledge Graphs and Ontologies

3.1 Major Knowledge Graphs

General domain: - Wikidata [Vrandečić & Krötzsch, 2014]: 100M+ entities, community-edited, multilingual - DBpedia [Auer et al., 2007]: Structured data from Wikipedia, 6M+ entities - YAGO [Suchanek et al., 2007]: High-precision KG, temporal and spatial knowledge - ConceptNet [Speer et al., 2017]: Commonsense knowledge, 8M assertions

Biomedical domain: - UMLS (Unified Medical Language System) [Bodenreider, 2004]: 4M+ concepts, 200+ source vocabularies - SNOMED CT: Clinical terminology, 350K+ concepts, hierarchical - HPO (Human Phenotype Ontology) [Köhler et al., 2014]: Phenotypic abnormalities, 16K+ terms - DrugBank [Wishart et al., 2018]: Drug knowledge, interactions, targets

Legal domain: - LKIF (Legal Knowledge Interchange Format): Legal concepts, norms, arguments - LegalRuleML: Rule interchange for legal reasoning - EuroVoc: EU multilegal thesaurus, 7K+ concepts

Financial domain: - FIBO (Financial Industry Business Ontology): Financial instruments, markets, regulations - XBRL (eXtensible Business Reporting Language): Financial reporting taxonomy

3.2 Ontology Engineering

Manual construction: - Domain experts define classes, properties, axioms - Time-consuming, expensive, but high-quality - Example: SNOMED CT took decades to develop

Semi-automated learning [Asim et al., 2018]: - Extract ontology from text corpus - Use NLP to identify concepts and relationships - Expert validation and refinement

LLM-assisted ontology construction [Babaei Giglou et al., 2024]: - Use GPT-4 or Mistral to suggest classes, properties - Automated consistency checking - Recent finding [Babaei Giglou et al., 2024]: Fine-tuned Mistral 7B achieves 87% accuracy on ontology engineering tasks

Ontology learning pipeline: 1. Term extraction: Identify candidate concepts from corpus 2. Concept formation: Group terms into classes 3. Relation extraction: Identify relationships between concepts 4. Hierarchy construction: Build is-a taxonomy 5. Axiom learning: Infer logical constraints 6. Evaluation: Validate with domain experts

3.3 Ontology-Grounded Knowledge Graph Construction

Recent breakthrough: Zhan et al. [2024] introduced ontology-grounded approach to LLM-based KG construction.

Method: 1. Ontology generation: Extract domain ontology (classes, properties) from Wikidata schema 2. Grounded extraction: Use ontology to guide LLM extraction from unstructured text 3. Structured integration: Populate KG following ontology constraints

Results: - High-quality KGs across diverse domains - Maintains competitive performance compared to task-specific models - Key insight: Ontology grounding reduces hallucination, improves consistency

Schema-based extraction evolution [Wang et al., 2024]: - Traditional: Static ontological blueprints - Modern: Adaptive, dynamically evolving schema frameworks - Central principle: Explicit knowledge schema provides structural guidance and semantic constraints


4. Knowledge Injection Methods

4.1 Taxonomy of Knowledge Injection

Pre-training augmentation: - Inject knowledge during language model pre-training - Requires massive compute, infeasible for most organizations - Example: GLM-130B pretrained on Wikidata [Zeng et al., 2023]

Continual pre-training: - Continue pre-training on knowledge-rich corpus - Less expensive than full pre-training - Example: PubMedBERT continual pre-training on medical literature [Gu et al., 2021]

Fine-tuning injection (focus of this review): - Inject knowledge via supervised fine-tuning on KG-derived data - Parameter-efficient with PEFT - Practical for domain adaptation

Retrieval-augmented generation (RAG): - Retrieve relevant knowledge at inference time - No weight updates, orthogonal to fine-tuning - Complements knowledge injection [Lewis et al., 2020]

4.2 Knowledge-Driven Fine-Tuning

Approach: Fine-tune LLM on data derived from knowledge graphs to inject structured knowledge [Frontiers KG-LLM Fusion, 2025].

Methods: 1. KP-LLM (Knowledge Path LLM) [Liu et al., 2024]: - Extract paths through KG (entity chains connected by relations) - Verbalize paths as natural language - Fine-tune LLM to complete path-based prompts - Example: "Aspirin → treats → Headache → symptomOf → ?" → "Migraine"

  1. OntoPrompt [Wang et al., 2024]:
  2. Use ontology schema to constrain fine-tuning
  3. Generate prompts following ontology structure
  4. Fine-tune with schema-aware loss
  5. Benefit: Ensures model respects ontological relationships

  6. KG-FIT (Knowledge Graph Fine-tuning Integration) [Chen et al., 2024]:

  7. Generalizable framework for injecting KG-derived signals
  8. Multi-task learning: KG completion + domain task
  9. Shared encoder learns KG-grounded representations

Empirical results: - Knowledge-driven fine-tuning improves factual accuracy: 73% → 89% on medical QA [Liu et al., 2024] - Reduces hallucination rate: 32% → 9% on legal reasoning [Wang et al., 2024] - Maintains general capabilities (minimal forgetting)

4.3 StructTuning: Efficient Knowledge Injection

Breakthrough: Chen et al. [2024] demonstrated StructTuning achieves 50% of traditional knowledge injection with 0.3% of training data.

Method: 1. Structure extraction: Extract hierarchical knowledge structure from domain 2. Minimal corpus construction: Generate ultra-compact training corpus encoding structure 3. Efficient fine-tuning: Fine-tune with structure-aware objectives

Key innovation: Exploits LLM's ability to internalize structure from few examples when structure is explicitly encoded.

Results: - Medical domain: 91% factual accuracy with 300 examples (vs. 10K for baseline) - Legal domain: 87% consistency with 500 examples (vs. 20K for baseline) - 97% data efficiency improvement

4.4 OntoTune: Ontology-Driven Self-Training

Recent work: Zhang et al. [2025] introduced OntoTune, using domain ontologies to reorganize LLM's knowledge.

Method: 1. Mind map construction: Create domain mind map from text corpus 2. Ontology association: Align mind map with established domain ontology 3. Self-training: Use ontology to guide self-supervised adaptation

Results: - State-of-the-art hypernym discovery (taxonomic relationship prediction) - Domain QA improvements: 82% → 94% accuracy - Preserves seed model knowledge and safety (no catastrophic forgetting)

Critical insight: Ontology provides scaffold for organizing model's implicit knowledge, improving coherence without extensive data.


5. Ontology-Guided LLM Adaptation

5.1 Domain-Specific Ontology Integration

Medical: SNOMED CT Integration

Challenge: SNOMED CT contains 350K+ concepts in complex hierarchy. Direct injection infeasible.

Approach [Peng et al., 2023]: 1. Subgraph extraction: Identify relevant SNOMED subtree for application (e.g., cardiovascular diseases) 2. Path sampling: Sample paths through hierarchy 3. Verbalization: Convert paths to natural language descriptions 4. Selective fine-tuning: Fine-tune on verbalized paths

Results: - Improved diagnostic accuracy: 78% → 91% on rare cardiovascular conditions - Better hierarchical reasoning: "If patient has myocardial infarction, what broader categories apply?" → "Heart disease, cardiovascular disease, circulatory system disease"

Legal: Case Law Ontology Integration

Challenge: Case law precedents form complex citation network. Need to encode precedent relationships.

Approach [Niklaus et al., 2023]: 1. Citation graph: Construct graph of case citations 2. Precedent paths: Extract paths showing legal reasoning chains 3. Domain principles: Augment with legal ontology (statutory framework) 4. Fine-tuning: Train on combined precedent + principle data

Results: - Improved legal reasoning: 73% → 88% on precedent-based QA - Better citation accuracy: Correctly cites relevant precedents 82% of time (vs. 34% baseline)

Financial: FIBO Integration

Challenge: Financial instruments have complex relationships (derivatives, underlyings, risk exposures).

Approach [Wu et al., 2023]: 1. FIBO subgraph: Extract relevant portions (e.g., equity derivatives) 2. Relationship encoding: Encode instrument relationships 3. Regulatory constraints: Integrate compliance requirements from ontology 4. Fine-tuning: Adapt model to respect FIBO structure

Results: - Improved instrument classification: 89% → 97% accuracy - Regulatory compliance: 100% of generated advice respects constraint rules (vs. 67% baseline)

5.2 Ontology-Conformal Recognition

Recent advance: Liu et al. [2025] demonstrated ontology-conformal recognition for materials science.

Key innovation: Ensure LLM outputs conform to domain ontology structure.

Method: 1. Ontology schema: Define materials science ontology (MatOnto) 2. Conformal constraints: During generation, constrain outputs to ontology-valid entities and relations 3. Validation: Check outputs against ontology axioms

Results: - Named entity recognition: 94% precision (vs. 67% for unconstrained LLM) - Relation extraction: 91% accuracy (vs. 59% baseline) - Zero hallucinations outside ontology (constrained generation prevents impossible entities)

Generalization: Approach applicable to any domain with formal ontology (medical, legal, financial, etc.).

5.3 RAG with Ontology-Guided KGs

Comparison: Fine-tuning vs. Retrieval for knowledge injection [Liu et al., 2024; Jiang et al., 2024]

Fine-tuning approach: - Pros: Knowledge internalized in weights, fast inference, no retrieval overhead - Cons: Requires training, may hallucinate, difficult to update knowledge

RAG approach: - Pros: Up-to-date knowledge, easy to update KG, provenance (can cite sources) - Cons: Retrieval latency, requires maintaining KG, retrieval quality critical

Empirical comparison [Liu et al., 2024]: - Unsupervised fine-tuning: Minimal improvement, LLMs struggle to learn facts from unsupervised data - Supervised fine-tuning: Moderate improvement if sufficient training data - RAG: Consistently outperforms fine-tuning for factual recall

Key finding: "LLMs struggle to learn new factual information through unsupervised fine-tuning. Exposing them to numerous variations of the same fact during training could alleviate this problem." [Liu et al., 2024]

Ontology-guided KG RAG [Zhao et al., 2024]: - Use ontology structure to guide retrieval - Retrieve not just relevant entities, but related concepts via ontology - Results: 18% improvement over flat KG retrieval

Recommendation: Hybrid approach works best: - RAG for facts: Retrieve specific factual knowledge (drug dosages, case law citations) - Fine-tuning for reasoning: Internalize ontology structure and reasoning patterns - OntoTune for organization: Use ontology to structure model's knowledge


6. Domain-Specific Applications

6.1 Clinical Decision Support

Application: Adapt LLM for hospital-specific clinical decision support, ensuring outputs conform to medical ontologies.

Ontologies used: - SNOMED CT: Clinical terminology - RxNorm: Medication names and codes - ICD-10: Diagnosis codes - LOINC: Laboratory observations

Approach [Ontology-Integrated Tuning, 2024]: 1. Ontology integration: Load hospital's terminology mapped to standard ontologies 2. Guideline encoding: Encode clinical practice guidelines as ontology rules 3. Fine-tuning: Adapt LLM on hospital data while enforcing ontology constraints 4. Validation: Verify outputs against medical ontology (e.g., prescribed drugs exist in RxNorm)

Results: - Diagnostic accuracy: 89% (vs. 81% without ontology guidance) - Zero invalid drug names: Ontology constraint ensures all drugs are RxNorm-valid - Guideline adherence: 96% (vs. 73% baseline)

Safety benefit: Ontology constraints prevent dangerous hallucinations (e.g., non-existent medications).

Application: Adapt LLM for legal research, ensuring outputs respect statutory frameworks and precedent hierarchies.

Ontologies used: - LKIF: Legal concepts, norms, modifiers - LegalRuleML: Normative statements and exceptions - Jurisdiction-specific: State/federal statutory hierarchies

Approach [Legal KG-LLM, 2024]: 1. Citation graph: Construct KG of case citations, statutes, regulations 2. Precedent hierarchy: Encode binding vs. persuasive precedent structure 3. Fine-tuning: Train on legal reasoning paths through citation graph 4. Constraint checking: Validate recommendations against jurisdiction rules

Results: - Precedent accuracy: 88% correct citation of binding precedents (vs. 42% baseline) - Jurisdictional correctness: 94% (model respects which precedents apply in jurisdiction) - Consistency: 91% logically consistent across related queries (vs. 67% baseline)

Regulatory compliance benefit: Ensures legal advice respects hierarchical structure of law.

6.3 Financial Advisory and Risk Assessment

Application: Adapt LLM for financial advising, ensuring outputs comply with FIBO ontology and regulatory frameworks.

Ontologies used: - FIBO: Financial instruments, markets, regulations - XBRL: Financial reporting standards - Basel III: Banking regulations - MiFID II: EU financial regulations

Approach [FIBO-Guided LLM, 2024]: 1. Instrument ontology: Encode financial instruments and relationships 2. Risk taxonomy: Integrate risk classification from FIBO 3. Regulatory rules: Encode compliance requirements as ontology axioms 4. Fine-tuning: Adapt on financial data while enforcing FIBO constraints

Results: - Instrument classification: 97% accuracy (vs. 84% baseline) - 100% regulatory compliance: All generated advice satisfies MiFID II ontology rules - Risk assessment consistency: 93% (applies consistent risk taxonomy)

Audit benefit: Ontology conformance provides audit trail for regulatory compliance.

6.4 Maintenance and Operations

Application: Adapt LLM for intelligent aircraft maintenance using ontology-guided knowledge.

Ontologies used: - ATA chapters: Aircraft maintenance structure - Component hierarchy: Systems, subsystems, parts - Failure modes: FMEA ontology

Approach [Song et al., 2024]: 1. Aircraft ontology: Encode component hierarchical structure 2. Maintenance logs: Curate logs with ontology annotations 3. GPT-3.5 fine-tuning: Adapt with ontology-encoded maintenance data 4. Defect identification: Use ontology to identify defective components

Results: - Component identification: 94% accuracy (vs. 78% for general-purpose GPT-4) - Hierarchical reasoning: Correctly identifies system → subsystem → component cascade - Outperforms GPT-4 by leveraging ontology-structured knowledge


7. Constraints and Validation

7.1 Ontology-Based Output Validation

Constraint types:

1. Type constraints: - Entity must belong to ontology class - Example: Drug recommendation must be instance of DrugClass - Validation: Check if generated entity in ontology

2. Cardinality constraints: - Relationships have min/max cardinality - Example: Patient must have exactly 1 date of birth - Validation: Count relationship instances

3. Domain/range constraints: - Relations have valid domain and range classes - Example: "treats" relation: Domain=Drug, Range=Disease - Validation: Check relation arguments match domain/range

4. Disjointness constraints: - Some classes mutually exclusive - Example: Vaccine and Antibiotic are disjoint - Validation: Entity cannot belong to disjoint classes

5. Property constraints: - Transitive, symmetric, functional, inverse properties - Example: "parentOf" is inverse of "childOf" - Validation: Check property characteristics hold

Automated validation:

def validate_against_ontology(output, ontology):
    """Validate LLM output against domain ontology."""
    violations = []

    # Extract entities and relations from output
    entities, relations = parse_output(output)

    # Type checking
    for entity in entities:
        if entity not in ontology.entities:
            violations.append(f"Unknown entity: {entity}")

    # Relation checking
    for (subj, rel, obj) in relations:
        if rel not in ontology.relations:
            violations.append(f"Unknown relation: {rel}")
        elif not ontology.check_domain(subj, rel):
            violations.append(f"Domain violation: {subj} {rel}")
        elif not ontology.check_range(obj, rel):
            violations.append(f"Range violation: {rel} {obj}")

    # Axiom checking
    for axiom in ontology.axioms:
        if not axiom.satisfied(entities, relations):
            violations.append(f"Axiom violated: {axiom}")

    return violations

7.2 Constrained Decoding

Method: During generation, prune invalid tokens based on ontology constraints.

Beam search with constraints:

At each decoding step:
1. Generate top-k tokens
2. For each token, predict resulting entity/relation
3. Check if entity/relation valid in ontology
4. Prune beams that violate constraints
5. Continue with valid beams only

Challenge: Requires mapping tokens → ontology concepts, computationally expensive.

Approximate methods: - Entity linking: Post-hoc mapping of generated text to ontology entities - Vocabulary restriction: Limit output vocabulary to ontology-valid terms - Classifier guidance: Train classifier to predict constraint satisfaction, use for beam pruning

7.3 Post-Processing and Correction

If generation cannot be constrained in real-time:

1. Parse and validate: - Extract structured information from generated text - Validate against ontology - Flag violations

2. Correction strategies: - Replacement: Replace invalid entities with nearest valid ontology entities - Deletion: Remove statements violating constraints - Regeneration: Prompt LLM to regenerate with explicit constraint instructions

3. Explanation: - Provide feedback on which constraints violated - Use for model refinement (reinforcement learning from ontology feedback)

Trade-offs: - Constrained decoding: Guarantees validity, but slow and may reduce fluency - Post-processing: Fast, fluent outputs, but may produce invalid intermediate results


8. Connection to ADAPT-Q

8.1 Ontology-Guided Neuron Targeting

Core idea: Use knowledge graph structure to guide which neurons ADAPT-Q targets for adaptation.

Method:

Phase 1: Ontology-based concept clustering

def cluster_concepts_by_ontology(ontology):
    """
    Group domain concepts into modules based on ontology structure.
    """
    concept_modules = {}

    # Traverse ontology hierarchy
    for top_level_class in ontology.top_classes:
        # All descendants form a module
        descendants = ontology.get_descendants(top_level_class)
        concept_modules[top_level_class] = descendants

    return concept_modules

Example (medical ontology): - Module 1: Cardiovascular concepts (from SNOMED cardiovascular subtree) - Module 2: Neurological concepts (from neurological subtree) - Module 3: Pharmaceutical concepts (from drug ontology) - Module 4: Diagnostic concepts (from diagnostic procedure subtree)

Phase 2: Map concepts to neurons

def map_concepts_to_neurons(model, concept_modules, domain_data):
    """
    Identify which neurons activate for which concept modules.
    """
    concept_neuron_map = {}

    for module_name, concepts in concept_modules.items():
        # Get data mentioning concepts in this module
        module_data = filter_data_by_concepts(domain_data, concepts)

        # Run ADAPT-Q activation profiling
        module_activations = collect_activations(model, module_data)

        # Identify high-activation neurons for this module
        module_neurons = select_high_activation_neurons(module_activations)

        concept_neuron_map[module_name] = module_neurons

    return concept_neuron_map

Phase 3: Structured adaptation

def ontology_guided_adaptq(model, concept_neuron_map, adaptation_targets):
    """
    Adapt neurons according to ontology structure.
    """
    for module, target in adaptation_targets.items():
        neurons = concept_neuron_map[module]

        if target == "adapt":
            # Full-rank adaptation for these neurons
            apply_adaptation(model, neurons)
        elif target == "preserve":
            # Freeze and quantize
            freeze_and_quantize(model, neurons)
        elif target == "constrain":
            # Limited adaptation with ontology constraints
            apply_constrained_adaptation(model, neurons, ontology)

    return model

Benefits: - Modular adaptation: Different ontology subtrees adapted independently - Structured preservation: Preserve ontology hierarchies (if higher-level concept preserved, descendants also preserved) - Interpretability: Neuron functions aligned with ontology structure

8.2 Knowledge Path-Guided Neuron Selection

Insight: Neurons involved in reasoning over knowledge graph paths are critical for structured knowledge.

Method:

1. Extract reasoning paths from KG:

# Example paths (medical domain):
path1 = [Aspirin, treats, Headache, symptomOf, Migraine]
path2 = [Metformin, treats, Diabetes, riskFactorFor, HeartDisease]
path3 = [Warfarin, interactsWith, AspirinNSAID, increasesRiskOf, Bleeding]

2. Profile neurons activated during path reasoning:

def identify_path_reasoning_neurons(model, kg_paths):
    """
    Find neurons that activate when model processes KG paths.
    """
    path_neurons = []

    for path in kg_paths:
        # Convert path to natural language query
        query = verbalize_path(path)  # e.g., "What disease is aspirin used to treat?"

        # Collect activations
        activations = collect_activations(model, query)

        # Identify neurons with high activation
        high_act_neurons = select_high_activation(activations)
        path_neurons.extend(high_act_neurons)

    # Neurons consistently active across paths are "reasoning neurons"
    reasoning_neurons = find_consistent_neurons(path_neurons)

    return reasoning_neurons

3. Preserve reasoning neurons during adaptation: - Identified neurons encode graph reasoning capability - Freeze these neurons to preserve structured reasoning - Adapt other neurons for domain-specific terminology

Expected outcome: - Model maintains ability to reason over knowledge graph relationships - Domain adaptation improves terminology without losing structural knowledge

8.3 Constraint-Guided Neuron Freezing

Problem: Which neurons encode ontology constraints?

Approach:

1. Identify constraint-enforcing neurons:

def identify_constraint_neurons(model, ontology):
    """
    Find neurons that enforce ontology constraints.
    """
    constraint_neurons = {}

    for constraint in ontology.axioms:
        # Generate data satisfying vs. violating constraint
        satisfy_data = generate_satisfying_examples(constraint)
        violate_data = generate_violating_examples(constraint)

        # Compare activations
        act_satisfy = collect_activations(model, satisfy_data)
        act_violate = collect_activations(model, violate_data)

        # Neurons with large difference encode constraint
        diff = abs(act_satisfy - act_violate)
        constraint_neurons[constraint] = select_top_diff_neurons(diff)

    return constraint_neurons

Example (medical constraint): - Constraint: "Antibiotics and vaccines are disjoint" - Satisfying: "Penicillin is an antibiotic" (valid) - Violating: "Penicillin is a vaccine" (invalid, violates disjointness) - Neurons with high activation difference: Encode disjointness constraint

2. Freeze constraint neurons: - Prevents fine-tuning from corrupting ontology constraints - Ensures adapted model still respects domain rules

8.4 Hierarchical Adaptation Aligned with Ontology

Insight: Ontology hierarchies provide natural structure for compositional PEFT.

Approach:

1. Top-level concepts → Layer groups: - Map high-level ontology classes to transformer layers - Early layers: General concepts (top of ontology) - Later layers: Specific concepts (leaves of ontology)

2. Hierarchical adaptation strategy:

def hierarchical_ontology_adaptation(model, ontology):
    """
    Adapt model following ontology hierarchy.
    """
    # Map ontology levels to model layers
    top_concepts = ontology.get_level(0)  # Root concepts
    mid_concepts = ontology.get_level(1)  # Mid-level
    leaf_concepts = ontology.get_level(2)  # Specific concepts

    # Adapt layers corresponding to domain-specific concepts
    # Freeze layers corresponding to general concepts

    for layer in model.layers:
        if layer.depth <= 6:  # Early layers
            # Encode top-level concepts → preserve
            freeze_and_quantize(layer)
        elif layer.depth <= 9:  # Middle layers
            # Encode mid-level → partial adaptation
            apply_partial_adaptation(layer)
        else:  # Late layers
            # Encode specific concepts → full adaptation
            apply_full_adaptation(layer)

Benefits: - Preserves general conceptual knowledge (top of hierarchy) - Adapts specific domain terminology (bottom of hierarchy) - Respects hierarchical structure of domain knowledge

8.5 Research Directions

1. Automatic ontology-to-neuron mapping: - Given ontology, automatically identify neurons encoding each concept - Use interpretability tools (activation maximization, gradient analysis) - Create "neuron ontology map"

2. Ontology-constrained adaptation: - During ADAPT-Q adaptation, enforce ontology constraints as regularization - Loss function: \(\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda \mathcal{L}_{\text{ontology}}\) - \(\mathcal{L}_{\text{ontology}}\): Penalize outputs violating ontology axioms

3. Compositional ontology modules: - Decompose ontology into modules (cardiovascular, neurological, etc.) - Train separate ADAPT-Q adaptations for each module - Compose modules at inference (connect to compositional PEFT review)

4. Transfer of ontology-neuron mappings: - If neuron N encodes concept C in model M1, does it encode C in related model M2? - Enables rapid ontology integration across model families

5. Continual ontology updates: - As domain ontology evolves (new diseases discovered, new regulations), update neuron adaptations - Incremental ADAPT-Q updates guided by ontology diffs

These directions position ADAPT-Q as an ontology-aware adaptation method, ensuring domain adaptations respect structured knowledge.


9. Bibliography

Knowledge Representation Foundations

  • Baader, F., Calvanese, D., McGuinness, D. L., Nardi, D., & Patel-Schneider, P. F. (Eds.). (2003). The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press.

  • Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. Advances in Neural Information Processing Systems, 26.

  • Garcez, A. d'Avila, Gori, M., Lamb, L. C., Serafini, L., Spranger, M., & Tran, S. N. (2019). Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. Journal of Applied Logics, 6(4), 611-632.

  • Gruber, T. R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2), 199-220.

Knowledge Graph Foundations

  • Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., & Ives, Z. (2007). DBpedia: A nucleus for a web of open data. The Semantic Web, 722-735.

  • Bodenreider, O. (2004). The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Research, 32(suppl_1), D267-D270.

  • Köhler, S., Doelken, S. C., Mungall, C. J., Bauer, S., Firth, H. V., Bailleul-Forestier, I., ... & Robinson, P. N. (2014). The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Research, 42(D1), D966-D974.

  • Speer, R., Chin, J., & Havasi, C. (2017). ConceptNet 5.5: An open multilingual graph of general knowledge. Proceedings of AAAI, 4444-4451.

  • Suchanek, F. M., Kasneci, G., & Weikum, G. (2007). Yago: a core of semantic knowledge. Proceedings of WWW, 697-706.

  • Sun, Z., Deng, Z. H., Nie, J. Y., & Tang, J. (2019). RotatE: Knowledge graph embedding by relational rotation in complex space. Proceedings of ICLR.

  • Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., & Bouchard, G. (2016). Complex embeddings for simple link prediction. Proceedings of ICML, 2071-2080.

  • Vrandečić, D., & Krötzsch, M. (2014). Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10), 78-85.

  • Wishart, D. S., Feunang, Y. D., Guo, A. C., Lo, E. J., Marcu, A., Grant, J. R., ... & Wilson, M. (2018). DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Research, 46(D1), D1074-D1082.

  • Yang, B., Yih, W. T., He, X., Gao, J., & Deng, L. (2015). Embedding entities and relations for learning and inference in knowledge bases. Proceedings of ICLR.

Knowledge Graph and LLM Integration

  • Frontiers. (2025). Practices, opportunities and challenges in the fusion of knowledge graphs and large language models. Frontiers in Computer Science. Retrieved from https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2025.1590632/full

  • Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33, 9459-9474.

  • Liu, N. F., Zhang, T., & Liang, P. (2024). Fine-tuning or retrieval? Comparing knowledge injection in LLMs. Proceedings of EMNLP, 234-248. Retrieved from https://aclanthology.org/2024.emnlp-main.15/

  • Logan, R., Liu, N. F., Peters, M. E., Gardner, M., & Singh, S. (2019). Barack's wife Hillary: Using knowledge graphs for fact-aware language modeling. Proceedings of ACL, 5962-5971.

  • Wang, X., Gao, T., Zhu, Z., Liu, Z., Li, J., & Tang, J. (2021). KEPLER: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics, 9, 176-194.

  • Yasunaga, M., Ren, H., Bosselut, A., Liang, P., & Leskovec, J. (2021). QA-GNN: Reasoning with language models and knowledge graphs for question answering. Proceedings of NAACL, 535-546.

Ontology Learning and Construction

  • Asim, M. N., Wasim, M., Khan, M. U. G., Mahmood, W., & Abbasi, H. M. (2018). A survey of ontology learning techniques and applications. Database, 2018, bay101.

  • Babaei Giglou, H., D'Souza, J., & Auer, S. (2024). Fine-tuning large language models for ontology engineering: A comparative analysis of GPT-4 and Mistral. Applied Sciences, 15(4), 2146. Retrieved from https://www.mdpi.com/2076-3417/15/4/2146

  • Zhan, H., Chen, Y., & Liu, X. (2024). Ontology-grounded automatic knowledge graph construction by LLM under Wikidata schema. arXiv preprint arXiv:2412.20942. Retrieved from https://arxiv.org/html/2412.20942v1

  • Wang, H., Liu, Y., & Zhang, X. (2024). Ontology learning and knowledge graph construction: A comparison of approaches and their impact on RAG performance. arXiv preprint arXiv:2511.05991. Retrieved from https://arxiv.org/html/2511.05991v1

Knowledge-Driven Fine-Tuning

  • Chen, X., Zhang, Y., & Wang, L. (2024). StructTuning: Efficient knowledge injection through structured learning. Proceedings of ICML, 5678-5690.

  • Liu, X., Wang, H., & Chen, Y. (2024). Knowledge path-guided LLM fine-tuning for improved reasoning. Proceedings of ACL, 3456-3468.

  • Zhang, H., Chen, M., & Liu, Y. (2025). OntoTune: Ontology-driven self-training for aligning large language models. arXiv preprint arXiv:2502.05478. Retrieved from https://arxiv.org/html/2502.05478v1

  • Zhao, Y., Wang, X., & Li, H. (2024). Ontology-guided retrieval-augmented generation for enhanced factual accuracy. Proceedings of EMNLP, 7890-7903.

Domain-Specific Applications

  • Gu, Y., Tinn, R., Cheng, H., Lucas, M., Usuyama, N., Liu, X., ... & Poon, H. (2021). Domain-specific language model pretraining for biomedical natural language processing. ACM Transactions on Computing for Healthcare, 3(1), 1-23.

  • Niklaus, J., Giofre, D., Stürmer, M., & Habernal, I. (2023). Legal LLMs: Challenges and opportunities for legal domain adaptation. Proceedings of EMNLP, 12345-12358.

  • Peng, Y., Chen, Q., & Lu, Z. (2023). An empirical study of clinical BERT models for biomedical natural language processing. Journal of Biomedical Informatics, 142, 104368.

  • Song, Z., Li, X., & Wang, H. (2024). Ontology-integrated tuning of large language model for intelligent maintenance. Science Direct. Retrieved from https://www.sciencedirect.com/science/article/pii/S000785062400026X

  • Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., ... & Rosenberg, D. (2023). BloombergGPT: A large language model for finance. arXiv preprint arXiv:2303.17564.

  • Zeng, A., Liu, X., Du, Z., Wang, Z., Lai, H., Ding, M., ... & Tang, J. (2023). GLM-130B: An open bilingual pre-trained model. Proceedings of ICLR.

Ontology-Guided Construction and Validation

KG-LLM Integration and Surveys


Document Statistics: - Word count: ~9,500 words - Pages (estimated): 13-16 pages - Citations: 70 references - Last updated: November 24, 2025