---
tags:
- ontology-embedding
- hyperbolic-space
- hierarchical-reasoning
- biomedical-ontology
- generated_from_trainer
- dataset_size:150000
- loss:HierarchyTransformerLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: cellular response to stimulus
sentences:
- response to stimulus
- medial transverse frontopolar gyrus
- biological regulation
- source_sentence: regulation of cell differentiation involved in embryonic placenta
development
sentences:
- thoracic wall
- ectoderm-derived structure
- regulation of cell differentiation
- source_sentence: regulation of hippocampal neuron apoptotic process
sentences:
- external genitalia morphogenesis
- compact layer of ventricle
- biological regulation
- source_sentence: transitional myocyte of internodal tract
sentences:
- secretory epithelial cell
- internodal tract myocyte
- insect haltere disc
- source_sentence: alveolar atrium
sentences:
- organ part
- superior recess of lesser sac
- foramen of skull
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# OnT: Language Models as Ontology Encoders
This is an OnT (Ontology Transformer) model trained on the GO (Gene Ontology) dataset and based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). OnT is a language-model-based framework for ontology embedding: it represents concepts as points in hyperbolic space and models axioms as hierarchical relationships between those concept embeddings.
## Model Details
### Model Description
- **Model Type:** Ontology Transformer (OnT)
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Training Dataset:** GO
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Embedding Space:** Hyperbolic Space
- **Key Features:**
- Hyperbolic embeddings for ontology concept encoding (illustrated in the sketch after this list)
- Modeling of hierarchical relationships between concepts
- Support for role embeddings as rotations over hyperbolic spaces
- Concept rotation, transition, and existential quantifier representation
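As a rough illustration of the hyperbolic embedding space, the sketch below computes geodesic distances in a unit-curvature Poincaré ball; in such a space, general concepts tend to sit near the origin and specific concepts near the boundary, so hierarchical relations can be scored geometrically. The helper name `poincare_distance` and the unit-curvature assumption are illustrative only and not part of the OnT API; the actual model may use a different ball radius.
```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Geodesic distance in the unit Poincaré ball (curvature -1).

    Illustrative sketch only: the OnT model may embed concepts in a ball of
    different curvature/radius, in which case inputs must be rescaled first.
    """
    sq_dist = ((u - v) ** 2).sum(dim=-1)
    denom = (1 - (u * u).sum(dim=-1)) * (1 - (v * v).sum(dim=-1))
    x = 1 + 2 * sq_dist / (denom + eps)
    return torch.acosh(torch.clamp(x, min=1 + eps))
```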
### Model Sources
- **Repository:** [OnT on GitHub](https://github.com/HuiYang1997/OnT)
- **Paper:** [Language Models as Ontology Encoders](https://arxiv.org/abs/2507.14334)
### Available Versions
This model is available in **4 versions** (Git branches) to suit different use cases:
| Branch | Training Type | Role Embedding | Use Case |
|--------|------------|----------------|----------|
| **`main`** (default) | Prediction Dataset | ✅ With role embedding | Default version: trained on the prediction dataset with role embedding |
| **`role-free`** | Prediction Dataset | ❌ Without role embedding | Trained on the prediction dataset without role embedding |
| **`inference-default`** | Inference Dataset | ✅ With role embedding | Trained on the inference dataset with role embedding |
| **`inference-role-free`** | Inference Dataset | ❌ Without role embedding | Trained on the inference dataset without role embedding |
**How to use different versions:**
```python
from OnT import OntologyTransformer
# Default version (main branch - OnTr with role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go")
# Role-free version (without role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="role-free")
# Inference version with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-default")
# Inference version without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-role-free")
```
### Full Model Architecture
```
OntologyTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
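Since the card declares `library_name: sentence-transformers` and the stack above is a standard Transformer encoder with mean pooling, the checkpoint can presumably also be loaded as a plain Sentence Transformers model for a quick sanity check. Note that this loading path is an assumption (not documented by the authors) and bypasses OnT's hyperbolic mapping and role embeddings, so it is not the intended way to use the model:
```python
from sentence_transformers import SentenceTransformer

# Assumption: the repository loads as a standard Sentence Transformers model.
# This returns plain mean-pooled embeddings and skips OnT's hyperbolic machinery.
model = SentenceTransformer("Hui97/OnT-MPNet-go")
embeddings = model.encode(["alveolar atrium", "organ part"])
print(embeddings.shape)  # (2, 768)
```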
## Usage
### Installation
First, install the required dependencies:
```bash
pip install sentence-transformers==3.4.0.dev0
```
You also need to install [HierarchyTransformers](https://github.com/KRR-Oxford/HierarchyTransformers) following the instructions in their repository.
### Direct Usage
Load the model and use it for ontology concept encoding:
```python
import torch
from OnT import OntologyTransformer
# Load the OnT model
path = "Hui97/OnT-MPNet-go"
ont = OntologyTransformer.from_pretrained(path)
# Entity names to be encoded
entity_names = [
'alveolar atrium',
'organ part',
'superior recess of lesser sac',
]
# Get the entity embeddings in hyperbolic space
entity_embeddings = ont.encode_concept(entity_names)
print(entity_embeddings.shape)
# [3, 768]
# Role sentences to be encoded
role_sentences = [
"application attribute",
"attribute",
"chemical modifier"
]
# Get the role embeddings (rotations and scalings)
role_rotations, role_scalings = ont.encode_roles(role_sentences)
```
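Because the concept embeddings live in hyperbolic space, one natural downstream use is ranking candidate parent concepts for a query concept by hyperbolic distance. The snippet below is a minimal sketch of that idea, reusing the illustrative `poincare_distance` helper from the Key Features section together with `entity_names` and `entity_embeddings` from the example above; it is not part of the OnT API, and any scoring utilities shipped with the library should be preferred.
```python
# Minimal sketch (not part of the OnT API): rank the candidate concepts by
# their Poincaré distance to 'alveolar atrium'. Assumes unit curvature.
embs = torch.as_tensor(entity_embeddings)   # ensure a torch tensor
query, candidates = embs[0:1], embs[1:]

dists = poincare_distance(query, candidates)
for rank, idx in enumerate(torch.argsort(dists), start=1):
    print(rank, entity_names[1 + int(idx)], float(dists[idx]))
```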
## Citation
### BibTeX
If you use this model, please cite:
```bibtex
@article{yang2025language,
title={Language Models as Ontology Encoders},
author={Yang, Hui and Chen, Jiaoyan and He, Yuan and Gao, Yongsheng and Horrocks, Ian},
journal={arXiv preprint arXiv:2507.14334},
year={2025}
}
```