VictorMorand committed on
Commit
c1ad718
·
verified ·
1 Parent(s): b515921

Push model using huggingface_hub.

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +244 -0
  3. definition.json +1 -0
  4. parameters +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ parameters filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,244 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ library_name: llm2ner
6
+ base_model: EleutherAI/pythia-70m
7
+ tags:
8
+ - ner
9
+ - span-detection
10
+ - llm
11
+ - pytorch
12
+ pipeline_tag: token-classification
13
+ model_name: ToMMeR-pythia-70m_L1_R64
14
+ source: https://github.com/VictorMorand/llm2ner
15
+ paper: https://arxiv.org/abs/2510.19410
16
+ ---
17
+
18
+ # ToMMeR-pythia-70m_L1_R64
19
+
20
+ ToMMeR is a lightweight probing model that extracts emergent mention-detection capabilities from the early-layer representations of an LLM backbone, achieving high zero-shot recall across 13 NER benchmarks.
21
+
22
+ ## Checkpoint Details
23
+
24
+ | Property | Value |
25
+ |-----------|-------|
26
+ | Base LLM | `EleutherAI/pythia-70m` |
27
+ | Layer | 1|
28
+ | #Params | 66.1K |
29
+
30
+
31
+ # Usage
32
+
33
+ ## Installation
34
+
35
+ Our code can be installed with pip and git. Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details.
36
+
37
+ ```bash
38
+ pip install git+https://github.com/VictorMorand/llm2ner.git
39
+ ```
40
+
41
+ ## Fancy Outputs
42
+
43
+ ```python
44
+ import llm2ner
45
+ from llm2ner import ToMMeR
46
+
47
+ tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-pythia-70m_L1_R64")
48
+ # Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
49
+ llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
50
+ tommer.to(llm.device)
51
+
52
+ text = "Large language models are awesome. While trained on language modeling, they exhibit emergent Zero Shot abilities that make them suitable for a wide range of tasks, including Named Entity Recognition (NER). "
53
+
54
+ # Fancy interactive output
55
+ outputs = llm2ner.plotting.demo_inference(text, tommer, llm,
56
+     decoding_strategy="threshold",  # or "greedy" for flat segmentation
57
+     threshold=0.5,  # default 50%
58
+     show_attn=True,
59
+ )
60
+ ```
61
+ <div>
62
+ <span class="tex2jax_ignore"><div class="spans" style="line-height: 2.5; direction: ltr">
63
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
64
+ Large
65
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
66
+ </span>
67
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
68
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
69
+ PRED
70
+ </span>
71
+ </span>
72
+ </span>
73
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
74
+ language
75
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
76
+ </span>
77
+ <span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
78
+ </span>
79
+ <span style="background: lightblue; top: 57px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
80
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
81
+ PRED
82
+ </span>
83
+ </span>
84
+ </span>
85
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
86
+ models
87
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
88
+ </span>
89
+ <span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
90
+ </span>
91
+ </span>
92
+ are awesome . While trained on
93
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
94
+ language
95
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
96
+ </span>
97
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
98
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
99
+ PRED
100
+ </span>
101
+ </span>
102
+ </span>
103
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
104
+ modeling
105
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
106
+ </span>
107
+ </span>
108
+ , they exhibit
109
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
110
+ emergent
111
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
112
+ </span>
113
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
114
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
115
+ PRED
116
+ </span>
117
+ </span>
118
+ </span>
119
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
120
+ abilities
121
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
122
+ </span>
123
+ </span>
124
+ that make them suitable for a wide range of
125
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
126
+ tasks
127
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
128
+ </span>
129
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
130
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
131
+ PRED
132
+ </span>
133
+ </span>
134
+ </span>
135
+ , including
136
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
137
+ Named
138
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
139
+ </span>
140
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
141
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
142
+ PRED
143
+ </span>
144
+ </span>
145
+ </span>
146
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
147
+ Entity
148
+
149
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
150
+ </span>
151
+ </span>
152
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
153
+ Recognition
154
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
155
+ </span>
156
+ </span>
157
+ (
158
+ <span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
159
+ NER
160
+ <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
161
+ </span>
162
+ <span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
163
+ <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
164
+ PRED
165
+ </span>
166
+ </span>
167
+ </span>
168
+ ) . </div></span>
169
+ </div>
170
+
171
+
172
+ ## Raw inference
173
+ By default, ToMMeR outputs span probabilities, but it also provides built-in options for decoding entities.
174
+
175
+ - Inputs:
176
+   - `tokens` (batch, seq): tokens to process,
177
+   - `model`: LLM to extract representations from.
178
+ - Outputs: (batch, seq, seq) matrix (masked outside valid spans)
179
+
180
+ ```python
181
+ import llm2ner
182
+ tommer = llm2ner.ToMMeR.from_pretrained("llm2ner/ToMMeR-pythia-70m_L1_R64")
183
+ # Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
184
+ llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
185
+ tommer.to(llm.device)
186
+
187
+ #### Raw Inference
188
+ text = ["Large language models are awesome"]
189
+ print(f"Input text: {text[0]}")
190
+
191
+ # Tokenize into shape (1, seq_len)
192
+ tokens = llm.tokenizer(text, return_tensors="pt")["input_ids"].to(llm.device)
193
+ # Output raw scores
194
+ output = tommer.forward(tokens, llm)  # (batch_size, seq_len, seq_len)
195
+ print(f"Raw Output shape: {output.shape}")
196
+
197
+ # Use the given decoding strategy to infer entities
198
+ entities = tommer.infer_entities(tokens=tokens, model=llm, threshold=0.5, decoding_strategy="greedy")
199
+ str_entities = [llm.tokenizer.decode(tokens[0, b:e+1]) for b, e in entities[0]]
200
+ print(f"Predicted entities: {str_entities}")
201
+
202
+ # >>> Input text: Large language models are awesome
203
+ # >>> Raw Output shape: torch.Size([1, 6, 6])
204
+ # >>> Predicted entities: ['Large language models']
205
+ ```
206
+
207
+ Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details and a demo notebook.
208
+
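To make the two decoding options mentioned in the Raw inference section concrete, here is a minimal pure-Python sketch of how a span-score matrix could be turned into entity spans. It is an illustration under assumptions, not the `llm2ner` implementation: it assumes `scores[b][e]` holds the score of the span from token `b` to token `e` (inclusive), and the helper names `threshold_decode` and `greedy_decode` are hypothetical. The index order and tie-breaking used by the library may differ.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (begin, end), end inclusive


def threshold_decode(scores: List[List[float]], threshold: float = 0.5) -> List[Span]:
    """Keep every span whose score reaches the threshold (overlaps allowed)."""
    n = len(scores)
    return [(b, e) for b in range(n) for e in range(b, n) if scores[b][e] >= threshold]


def greedy_decode(scores: List[List[float]], threshold: float = 0.5) -> List[Span]:
    """Pick spans by decreasing score, skipping any span that overlaps a kept one."""
    candidates = sorted(
        threshold_decode(scores, threshold),
        key=lambda span: scores[span[0]][span[1]],
        reverse=True,
    )
    kept: List[Span] = []
    for b, e in candidates:
        # Two spans are disjoint when one ends before the other begins.
        if all(e < kb or b > ke for kb, ke in kept):
            kept.append((b, e))
    return sorted(kept)


# Toy 4-token score matrix with made-up values (upper triangle only).
scores = [
    [0.1, 0.6, 0.9, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
print(threshold_decode(scores))  # [(0, 1), (0, 2), (1, 1), (3, 3)]
print(greedy_decode(scores))     # [(0, 2), (3, 3)]
```

In this sketch, thresholding can return nested or overlapping spans, while the greedy variant yields a flat segmentation, which matches the comment `# or "greedy" for flat segmentation` in the usage example.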
209
+ ## Evaluation Results
210
+
211
+ | dataset | precision | recall | f1 | n_samples |
212
+ |---------------------|-------------|----------|--------|-------------|
213
+ | MultiNERD | 0.119 | 0.9622 | 0.2118 | 154144 |
214
+ | CoNLL 2003 | 0.1496 | 0.7175 | 0.2476 | 16493 |
215
+ | CrossNER_politics | 0.1696 | 0.9468 | 0.2876 | 1389 |
216
+ | CrossNER_AI | 0.19 | 0.922 | 0.3151 | 879 |
217
+ | CrossNER_literature | 0.1824 | 0.9039 | 0.3035 | 916 |
218
+ | CrossNER_science | 0.19 | 0.9316 | 0.3156 | 1193 |
219
+ | CrossNER_music | 0.1921 | 0.9247 | 0.3181 | 945 |
220
+ | ncbi | 0.0801 | 0.8658 | 0.1466 | 3952 |
221
+ | FabNER | 0.226 | 0.8228 | 0.3546 | 13681 |
222
+ | WikiNeural | 0.1125 | 0.938 | 0.2009 | 92672 |
223
+ | GENIA_NER | 0.1539 | 0.937 | 0.2644 | 16563 |
224
+ | ACE 2005 | 0.1658 | 0.41 | 0.2361 | 8230 |
225
+ | Ontonotes | 0.1503 | 0.7275 | 0.2491 | 42193 |
226
+ | Aggregated | 0.1299 | 0.8953 | 0.2268 | 353250 |
227
+ | Mean | 0.1601 | 0.8469 | 0.2654 | 353250 |
228
+
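The relationship between the columns of the table can be checked with a few lines of Python. The sketch below recomputes F1 as the harmonic mean of precision and recall for each dataset and averages the per-dataset values. That the `Mean` row is an unweighted (macro) average is an inference from the numbers, not something the model card states; `f1_score` is a hypothetical helper name.

```python
from statistics import mean

# (precision, recall, f1) copied from the evaluation table.
rows = {
    "MultiNERD": (0.119, 0.9622, 0.2118),
    "CoNLL 2003": (0.1496, 0.7175, 0.2476),
    "CrossNER_politics": (0.1696, 0.9468, 0.2876),
    "CrossNER_AI": (0.19, 0.922, 0.3151),
    "CrossNER_literature": (0.1824, 0.9039, 0.3035),
    "CrossNER_science": (0.19, 0.9316, 0.3156),
    "CrossNER_music": (0.1921, 0.9247, 0.3181),
    "ncbi": (0.0801, 0.8658, 0.1466),
    "FabNER": (0.226, 0.8228, 0.3546),
    "WikiNeural": (0.1125, 0.938, 0.2009),
    "GENIA_NER": (0.1539, 0.937, 0.2644),
    "ACE 2005": (0.1658, 0.41, 0.2361),
    "Ontonotes": (0.1503, 0.7275, 0.2491),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Largest gap between the reported F1 and the F1 recomputed from P and R.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())
print(f"largest F1 discrepancy: {max_gap:.4f}")

# Unweighted average of each column across the 13 datasets.
macro_p, macro_r, macro_f1 = (mean(col) for col in zip(*rows.values()))
print(f"macro precision={macro_p:.4f} recall={macro_r:.4f} f1={macro_f1:.4f}")
```

The recomputed values agree with the table up to rounding of the published four-digit figures. The low precision alongside high recall is consistent with the model card's framing of ToMMeR as a high-recall mention detector.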
229
+ ## Citation
230
+ If using this model or the approach, please cite the associated paper:
231
+ ```
232
+ @misc{morand2025tommerefficiententity,
233
+ title={ToMMeR -- Efficient Entity Mention Detection from Large Language Models},
234
+ author={Victor Morand and Nadi Tomeh and Josiane Mothe and Benjamin Piwowarski},
235
+ year={2025},
236
+ eprint={2510.19410},
237
+ archivePrefix={arXiv},
238
+ primaryClass={cs.CL},
239
+ url={https://arxiv.org/abs/2510.19410},
240
+ }
241
+ ```
242
+
243
+ ## License
244
+ Apache-2.0 (see repository for full text).
definition.json ADDED
@@ -0,0 +1 @@
1
+ {"objects": [{"id": 140521472283872, "module": "llm2ner.models.tommer", "type": "ToMMeR", "typename": "llm2ner.models.tommer.ToMMeR", "identifier": "770d17b72a95550e6e3d24c07e1e9bededfe33d429c819d99b709f263219dc3b", "fields": {"llm_name": "EleutherAI/pythia-70m", "layer": 1, "rank": 64, "causal_mask": true, "sliding_window": 25, "use_cosine": true, "normalize_scores": ""}}, {"id": 140521470739168, "module": "llm2ner.xpmModel", "type": "xpmTorchHubModule.Loader", "typename": "llm2ner.xpmModel.xpmTorchHubModule.Loader", "identifier": "19539fb51a70bb08ec071e0bacf7c2ccb6fc7f110760bae40ac2563b3f1e1959", "fields": {"model": {"type": "python", "value": 140521472283872}, "parameters": {"type": "path.serialized", "value": "parameters", "is_folder": false}}}], "data": [{"type": "python", "value": 140521472283872}, [{"type": "python", "value": 140521470739168}]]}
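For readers who want to inspect the checkpoint configuration without installing `llm2ner`, the following sketch reads the hyperparameters out of a `definition.json`-style document with the standard library only. The embedded JSON is an excerpt of the file above (first object, with the `id`, `typename` and `identifier` keys omitted for brevity), and `find_fields` is a hypothetical helper, not part of any library.

```python
import json

# Excerpt of definition.json from this commit (first object only).
definition = json.loads("""
{"objects": [{"module": "llm2ner.models.tommer", "type": "ToMMeR",
  "fields": {"llm_name": "EleutherAI/pythia-70m", "layer": 1, "rank": 64,
             "causal_mask": true, "sliding_window": 25, "use_cosine": true,
             "normalize_scores": ""}}]}
""")


def find_fields(definition: dict, type_name: str) -> dict:
    """Return the `fields` mapping of the first object with the given type."""
    for obj in definition["objects"]:
        if obj.get("type") == type_name:
            return obj["fields"]
    raise KeyError(f"no object of type {type_name!r}")


fields = find_fields(definition, "ToMMeR")
print(fields["llm_name"], fields["layer"], fields["rank"])
```

The `layer` and `rank` values (1 and 64) appear to be what the `L1_R64` suffix of the model name encodes, although the commit does not say so explicitly.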
parameters ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:52c301b052353672dd87fca715f5cb761bc679233555cabcf9694e99b5a9b5d7
3
+ size 267002
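The `parameters` entry committed here is a Git LFS pointer, not the weights themselves: it records the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch, a downloaded `parameters` file could be checked against that pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and the code uses only the standard library.

```python
import hashlib

# Pointer text exactly as committed in this diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:52c301b052353672dd87fca715f5cb761bc679233555cabcf9694e99b5a9b5d7
size 267002
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that `data` has the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])  # sha256 267002
```

To verify a local copy, one would read the file in binary mode and pass the bytes to `matches_pointer`, for example `matches_pointer(open("parameters", "rb").read(), pointer)`.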