---
tags:
- ontology-embedding
- hyperbolic-space
- hierarchical-reasoning
- biomedical-ontology
- generated_from_trainer
- dataset_size:150000
- loss:HierarchyTransformerLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: cellular response to stimulus
  sentences:
  - response to stimulus
  - medial transverse frontopolar gyrus
  - biological regulation
- source_sentence: regulation of cell differentiation involved in embryonic placenta
    development
  sentences:
  - thoracic wall
  - ectoderm-derived structure
  - regulation of cell differentiation
- source_sentence: regulation of hippocampal neuron apoptotic process
  sentences:
  - external genitalia morphogenesis
  - compact layer of ventricle
  - biological regulation
- source_sentence: transitional myocyte of internodal tract
  sentences:
  - secretory epithelial cell
  - internodal tract myocyte
  - insect haltere disc
- source_sentence: alveolar atrium
  sentences:
  - organ part
  - superior recess of lesser sac
  - foramen of skull
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# OnT: Language Models as Ontology Encoders

This is an OnT (Ontology Transformer) model trained on the GO (Gene Ontology) dataset, based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). OnT is a language-model-based framework for ontology embedding that represents concepts as points in hyperbolic space and models axioms as hierarchical relationships between those concepts.

## Model Details

### Model Description
- **Model Type:** Ontology Transformer (OnT)
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Training Dataset:** GO
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Embedding Space:** Hyperbolic Space
- **Key Features:**
  - Hyperbolic embeddings for ontology concept encoding (see the distance sketch after this list)
  - Modeling of hierarchical relationships between concepts
  - Support for role embeddings as rotations over hyperbolic spaces
  - Concept rotation, transition, and existential quantifier representation
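
For intuition, the sketch below shows the standard geodesic distance on a unit Poincaré ball, the kind of metric used to compare points in hyperbolic space (more general concepts tend to lie closer to the origin). It is purely illustrative: `poincare_distance` is a hypothetical helper, not part of the OnT API, and the curvature/radius OnT actually uses may differ (see the paper and repository).

```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Geodesic distance on the unit Poincare ball (illustrative only)."""
    sq_dist = torch.sum((u - v) ** 2, dim=-1)
    denom = (1 - torch.sum(u ** 2, dim=-1)) * (1 - torch.sum(v ** 2, dim=-1))
    return torch.acosh(1 + 2 * sq_dist / denom.clamp_min(1e-12))
```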

### Model Sources

- **Repository:** [OnT on GitHub](https://github.com/HuiYang1997/OnT)
- **Paper:** [Language Models as Ontology Encoders](https://arxiv.org/abs/2507.14334)

### Available Versions

This model is available in **4 versions** (Git branches) to suit different use cases:

| Branch | Training Dataset | Role Embedding | Use Case |
|--------|------------------|----------------|----------|
| **`main`** (default) | Prediction dataset | ✅ With role embedding | Default version: trained on the prediction dataset, with role embedding |
| **`role-free`** | Prediction dataset | ❌ Without role embedding | Trained on the prediction dataset, without role embedding |
| **`inference-default`** | Inference dataset | ✅ With role embedding | Trained on the inference dataset, with role embedding |
| **`inference-role-free`** | Inference dataset | ❌ Without role embedding | Trained on the inference dataset, without role embedding |

**How to use different versions:**

```python
from OnT import OntologyTransformer

# Default version (main branch - OnTr with role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go")

# Role-free version (without role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="role-free")

# Inference version with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-default")

# Inference version without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-role-free")
```

### Full Model Architecture

```
OntologyTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
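
The `Pooling` module mean-pools the token embeddings into a single 768-dimensional vector. As a minimal sketch of what masked mean pooling computes (illustrative, not the library's internal implementation):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average the token vectors, ignoring padding positions flagged by the mask.
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp_min(1e-9)
```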

## Usage

### Installation

First, install the required dependencies:

```bash
pip install sentence-transformers==3.4.0.dev0
```

You also need to install [HierarchyTransformers](https://github.com/KRR-Oxford/HierarchyTransformers) following the instructions in their repository.
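
For example (check the HierarchyTransformers README for the current instructions; the package name below is the one used by that repository at the time of writing):

```bash
pip install hierarchy_transformers
```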

### Direct Usage

Load the model and use it for ontology concept encoding:

```python
import torch
from OnT import OntologyTransformer

# Load the OnT model
path = "Hui97/OnT-MPNet-go"
ont = OntologyTransformer.from_pretrained(path)

# Entity names to be encoded
entity_names = [
    'alveolar atrium',
    'organ part',
    'superior recess of lesser sac',
]

# Get the entity embeddings in hyperbolic space
entity_embeddings = ont.encode_concept(entity_names)
print(entity_embeddings.shape)
# [3, 768]

# Role sentences to be encoded
role_sentences = [
    "application attribute",
    "attribute",
    "chemical modifier"
]

# Get the role embeddings (rotations and scalings)
role_rotations, role_scalings = ont.encode_roles(role_sentences)
```
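
Because the concept embeddings live in hyperbolic space, the Poincaré distance sketched earlier is a more natural comparison than cosine similarity. A hypothetical continuation of the example above, assuming the embeddings lie on a unit Poincaré ball and reusing the illustrative `poincare_distance` helper:

```python
# Illustrative only: rank the candidate concepts by hyperbolic distance
# to the query. If encode_concept returns a NumPy array, torch.as_tensor
# converts it; if it already returns a tensor, this is a no-op.
embs = torch.as_tensor(entity_embeddings)
query, candidates = embs[0], embs[1:]

dists = poincare_distance(query.unsqueeze(0), candidates)
closest = dists.argmin().item()
print(f"closest to '{entity_names[0]}': {entity_names[1 + closest]}")
```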


## Citation

### BibTeX

If you use this model, please cite:

```bibtex
@article{yang2025language,
  title={Language Models as Ontology Encoders},
  author={Yang, Hui and Chen, Jiaoyan and He, Yuan and Gao, Yongsheng and Horrocks, Ian},
  journal={arXiv preprint arXiv:2507.14334},
  year={2025}
}
```