Push model using huggingface_hub.

Files changed (4)

.gitattributes +1 -0
README.md +244 -0
definition.json +1 -0
parameters +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+parameters filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,244 @@

+---
+language:
+- en
+license: apache-2.0
+library_name: llm2ner
+base_model: EleutherAI/pythia-70m
+tags:
+- ner
+- span-detection
+- llm
+- pytorch
+pipeline_tag: token-classification
+model_name: ToMMeR-pythia-70m_L1_R64
+source: https://github.com/VictorMorand/llm2ner
+paper: https://arxiv.org/abs/2510.19410
+---
+# ToMMeR-pythia-70m_L1_R64
+ToMMeR is a lightweight probing model that extracts emergent mention-detection capabilities from the early-layer representations of any LLM backbone, achieving high zero-shot recall across a set of 13 NER benchmarks.
+## Checkpoint Details
+| Property  | Value |
+|-----------|-------|
+| Base LLM  | `EleutherAI/pythia-70m` |
+| Layer     | 1 |
+| #Params   | 66.1K |
+# Usage
+## Installation
+Our code can be installed with pip and git. Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details.
+```bash
+pip install git+https://github.com/VictorMorand/llm2ner.git
+```
+## Fancy Outputs
+```python
+import llm2ner
+from llm2ner import ToMMeR
+tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-pythia-70m_L1_R64")
+# Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
+llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
+tommer.to(llm.device)
+text = "Large language models are awesome. While trained on language modeling, they exhibit emergent Zero Shot abilities that make them suitable for a wide range of tasks, including Named Entity Recognition (NER). "
+#fancy interactive output
+outputs = llm2ner.plotting.demo_inference( text, tommer, llm,
+    decoding_strategy="threshold",  # or "greedy" for flat segmentation
+    threshold=0.5, # default 50%
+    show_attn=True,
+)
+```
+<div>
+<span class="tex2jax_ignore"><div class="spans" style="line-height: 2.5; direction: ltr">
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    Large
+    <span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
+    language
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 57px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 77px;">
+    models
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 57px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+</span>
+are awesome . While trained on
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    language
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    modeling
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+</span>
+, they exhibit
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    emergent
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    abilities
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+</span>
+that make them suitable for a wide range of
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    tasks
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+, including
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    Named
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    Entity
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+</span>
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    Recognition
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+</span>
+(
+<span style="font-weight: bold; display: inline-block; position: relative; height: 60px;">
+    NER
+<span style="background: lightblue; top: 40px; height: 4px; left: -1px; width: calc(100% + 2px); position: absolute;">
+</span>
+<span style="background: lightblue; top: 40px; height: 4px; border-top-left-radius: 3px; border-bottom-left-radius: 3px; left: -1px; width: calc(100% + 2px); position: absolute;">
+    <span style="background: lightblue; z-index: 10; color: #000; top: -0.5em; padding: 2px 3px; position: absolute; font-size: 0.6em; font-weight: bold; line-height: 1; border-radius: 3px">
+        PRED
+    </span>
+</span>
+</span>
+) . </div></span>
+</div>
+## Raw inference
+By default, ToMMeR outputs span probabilities, but it also provides built-in options for decoding entities.
+- Inputs:
+  - tokens (batch, seq): tokens to process,
+  - model: LLM to extract representation from.
+- Outputs: (batch, seq, seq) matrix (masked outside valid spans)
+```python
+import llm2ner
+from llm2ner import ToMMeR
+tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-pythia-70m_L1_R64")
+# Load the backbone LLM, optionally cutting the unused layers to save GPU memory.
+llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
+tommer.to(llm.device)
+#### Raw Inference
+text = ["Large language models are awesome"]
+print(f"Input text: {text[0]}")
+# Tokenize into shape (1, seq_len) and move the tokens to the LLM's device
+tokens = llm.tokenizer(text, return_tensors="pt")["input_ids"].to(llm.device)
+# Output raw span scores
+output = tommer.forward(tokens, llm)  # (batch_size, seq_len, seq_len)
+print(f"Raw Output shape: {output.shape}")
+# Use the given decoding strategy to infer entities
+entities = tommer.infer_entities(tokens=tokens, model=llm, threshold=0.5, decoding_strategy="greedy")
+# Each entity is a (begin, end) pair of token indices; the end index is inclusive
+str_entities = [llm.tokenizer.decode(tokens[0, b:e+1]) for b, e in entities[0]]
+print(f"Predicted entities: {str_entities}")
+# Output:
+# Input text: Large language models are awesome
+# Raw Output shape: torch.Size([1, 6, 6])
+# Predicted entities: ['Large language models']
+```
+Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details and a demo notebook.
+## Evaluation Results
+| dataset             |   precision |   recall |     f1 |   n_samples |
+|---------------------|-------------|----------|--------|-------------|
+| MultiNERD           |      0.119  |   0.9622 | 0.2118 |      154144 |
+| CoNLL 2003          |      0.1496 |   0.7175 | 0.2476 |       16493 |
+| CrossNER_politics   |      0.1696 |   0.9468 | 0.2876 |        1389 |
+| CrossNER_AI         |      0.19   |   0.922  | 0.3151 |         879 |
+| CrossNER_literature |      0.1824 |   0.9039 | 0.3035 |         916 |
+| CrossNER_science    |      0.19   |   0.9316 | 0.3156 |        1193 |
+| CrossNER_music      |      0.1921 |   0.9247 | 0.3181 |         945 |
+| ncbi                |      0.0801 |   0.8658 | 0.1466 |        3952 |
+| FabNER              |      0.226  |   0.8228 | 0.3546 |       13681 |
+| WikiNeural          |      0.1125 |   0.938  | 0.2009 |       92672 |
+| GENIA_NER           |      0.1539 |   0.937  | 0.2644 |       16563 |
+| ACE 2005            |      0.1658 |   0.41   | 0.2361 |        8230 |
+| Ontonotes           |      0.1503 |   0.7275 | 0.2491 |       42193 |
+| Aggregated          |      0.1299 |   0.8953 | 0.2268 |      353250 |
+| Mean                |      0.1601 |   0.8469 | 0.2654 |      353250 |
+## Citation
+If you use this model or the approach, please cite the associated paper:
+```
+@misc{morand2025tommerefficiententity,
+      title={ToMMeR -- Efficient Entity Mention Detection from Large Language Models},
+      author={Victor Morand and Nadi Tomeh and Josiane Mothe and Benjamin Piwowarski},
+      year={2025},
+      eprint={2510.19410},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2510.19410},
+}
+```
+## License
+Apache-2.0 (see repository for full text).
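
The README above describes the raw output as a (batch, seq, seq) matrix of span probabilities and names two decoding strategies, "threshold" and "greedy" (flat segmentation). The following is a minimal pure-Python sketch of what such decoding could look like for a single sentence. It is not the llm2ner implementation: the function names, the `scores[start][end]` orientation, and the rule for resolving overlaps are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the llm2ner implementation.
# Assumption: scores[start][end] holds the probability that tokens
# start..end (inclusive) form a mention, and only start <= end is valid.

def threshold_decode(scores, threshold=0.5):
    """Return every (start, end, score) whose score exceeds the threshold.

    Overlapping and nested spans are all kept.
    """
    spans = []
    for start, row in enumerate(scores):
        for end in range(start, len(row)):
            if row[end] > threshold:
                spans.append((start, end, row[end]))
    return spans


def greedy_flat_decode(scores, threshold=0.5):
    """Keep the highest-scoring spans that do not overlap (flat segmentation).

    Assumption: ties and overlaps are resolved by score alone.
    """
    candidates = sorted(
        threshold_decode(scores, threshold), key=lambda s: s[2], reverse=True
    )
    kept = []
    for start, end, score in candidates:
        # A candidate is accepted only if it is disjoint from every kept span.
        if all(end < s or start > e for s, e, _ in kept):
            kept.append((start, end, score))
    return sorted(kept)


# Hypothetical 4-token score matrix (upper triangle = valid spans).
scores = [
    [0.1, 0.6, 0.9, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
nested = threshold_decode(scores)
flat = greedy_flat_decode(scores)
print("threshold:", [(s, e) for s, e, _ in nested])
print("greedy:   ", [(s, e) for s, e, _ in flat])
```

In this toy example the threshold strategy keeps the nested spans (0, 1) and (1, 1) alongside (0, 2), while the greedy variant retains only the disjoint spans (0, 2) and (3, 3).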

definition.json ADDED

	@@ -0,0 +1 @@

+ {"objects": [{"id": 140521472283872, "module": "llm2ner.models.tommer", "type": "ToMMeR", "typename": "llm2ner.models.tommer.ToMMeR", "identifier": "770d17b72a95550e6e3d24c07e1e9bededfe33d429c819d99b709f263219dc3b", "fields": {"llm_name": "EleutherAI/pythia-70m", "layer": 1, "rank": 64, "causal_mask": true, "sliding_window": 25, "use_cosine": true, "normalize_scores": ""}}, {"id": 140521470739168, "module": "llm2ner.xpmModel", "type": "xpmTorchHubModule.Loader", "typename": "llm2ner.xpmModel.xpmTorchHubModule.Loader", "identifier": "19539fb51a70bb08ec071e0bacf7c2ccb6fc7f110760bae40ac2563b3f1e1959", "fields": {"model": {"type": "python", "value": 140521472283872}, "parameters": {"type": "path.serialized", "value": "parameters", "is_folder": false}}}], "data": [{"type": "python", "value": 140521472283872}, [{"type": "python", "value": 140521470739168}]]}
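
To make the structure of `definition.json` easier to read, here is a small sketch that uses only the standard `json` module to pull out the model configuration and the name of the weights file. The embedded string is a trimmed copy of the file above (the `module`, `type`, `identifier`, and `data` entries are omitted). Treating a `{"type": "python", "value": <id>}` entry as a reference to another object's `id` is an interpretation of the file's layout, not a documented schema.

```python
import json

# Trimmed copy of definition.json from this commit (some keys omitted).
DEFINITION = """
{"objects": [
  {"id": 140521472283872,
   "typename": "llm2ner.models.tommer.ToMMeR",
   "fields": {"llm_name": "EleutherAI/pythia-70m", "layer": 1, "rank": 64,
              "causal_mask": true, "sliding_window": 25, "use_cosine": true,
              "normalize_scores": ""}},
  {"id": 140521470739168,
   "typename": "llm2ner.xpmModel.xpmTorchHubModule.Loader",
   "fields": {"model": {"type": "python", "value": 140521472283872},
              "parameters": {"type": "path.serialized", "value": "parameters",
                             "is_folder": false}}}
]}
"""

definition = json.loads(DEFINITION)

# Index objects by id so that references can be followed.
objects_by_id = {obj["id"]: obj for obj in definition["objects"]}

# Assumption: the Loader object points at the model object through its id.
loader = next(
    obj for obj in definition["objects"] if obj["typename"].endswith(".Loader")
)
model_obj = objects_by_id[loader["fields"]["model"]["value"]]

config = model_obj["fields"]
weights_path = loader["fields"]["parameters"]["value"]

print(f"backbone: {config['llm_name']}, layer: {config['layer']}, rank: {config['rank']}")
print(f"weights file: {weights_path}")
```

The layer (1) and rank (64) recovered here match the `L1_R64` suffix in the model name.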

parameters ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:52c301b052353672dd87fca715f5cb761bc679233555cabcf9694e99b5a9b5d7
+size 267002
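
The `parameters` file committed here is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes of the real file. As a minimal sketch, the pointer can be parsed with the standard library alone; the helper name `parse_lfs_pointer` is hypothetical.

```python
# Pointer text copied from the diff above (without the leading "+" markers).
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:52c301b052353672dd87fca715f5cb761bc679233555cabcf9694e99b5a9b5d7
size 267002
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dictionary."""
    # Each non-empty line is "<key> <value>", separated by a single space.
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)
print(f"{pointer['hash_algo']} object of {pointer['size']} bytes")
```

Once the real file has been downloaded, its byte length and SHA-256 digest can be compared with these values to confirm that the download is intact.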