Update README.md
Browse files
README.md
CHANGED
|
@@ -1,3 +1,31 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
datasets:
|
| 4 |
+
- codefuse-ai/F2LLM
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
base_model:
|
| 8 |
+
- Qwen/Qwen3-1.7B
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
F2LLM (Foundation to Feature Large Language Models) are foundation models directly finetuned on 6 million high-quality query-document pairs (available in [codefuse-ai/F2LLM](https://huggingface.co/datasets/codefuse-ai/F2LLM)) covering a diverse range of retrieval, classification, and clustering data, curated solely from open-source datasets without any synthetic data. These models are trained with homogeneous macro batches in a single stage, without sophisticated multi-stage pipelines.
|
| 12 |
+
|
| 13 |
+
To evaluate F2LLMs on MTEB:
|
| 14 |
+
|
| 15 |
+
```
|
| 16 |
+
import mteb
|
| 17 |
+
import logging
|
| 18 |
+
logging.basicConfig(level=logging.INFO)
|
| 19 |
+
|
| 20 |
+
task_names = ['AmazonCounterfactualClassification', 'ArXivHierarchicalClusteringP2P', 'ArXivHierarchicalClusteringS2S', 'ArguAna', 'AskUbuntuDupQuestions', 'BIOSSES', 'Banking77Classification', 'BiorxivClusteringP2P.v2', 'CQADupstackGamingRetrieval', 'CQADupstackUnixRetrieval', 'ClimateFEVERHardNegatives', 'FEVERHardNegatives', 'FiQA2018', 'HotpotQAHardNegatives', 'ImdbClassification', 'MTOPDomainClassification', 'MassiveIntentClassification', 'MassiveScenarioClassification', 'MedrxivClusteringP2P.v2', 'MedrxivClusteringS2S.v2', 'SCIDOCS', 'SICK-R', 'STS12', 'STS13', 'STS14', 'STS15', 'STS17', 'STS22.v2', 'STSBenchmark', 'SprintDuplicateQuestions', 'StackExchangeClustering.v2', 'StackExchangeClusteringP2P.v2', 'SummEvalSummarization.v2', 'TRECCOVID', 'Touche2020Retrieval.v3', 'ToxicConversationsClassification', 'TweetSentimentExtractionClassification', 'TwentyNewsgroupsClustering.v2', 'TwitterSemEval2015', 'TwitterURLCorpus', 'MindSmallReranking']
|
| 21 |
+
|
| 22 |
+
tasks = [
|
| 23 |
+
mteb.get_task(task_name, languages = ["eng"], eval_splits=["test"], exclusive_language_filter=True)
|
| 24 |
+
for task_name in task_names
|
| 25 |
+
]
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
model = mteb.get_model("codefuse-ai/F2LLM-1.7B", device="cuda:0")
|
| 29 |
+
evaluation = mteb.MTEB(tasks=tasks)
|
| 30 |
+
evaluation.run(model, encode_kwargs={"batch_size": 16})
|
| 31 |
+
```
|