codelion nielsr HF Staff committed on
Commit d0f3f95 · verified · 1 Parent(s): ec5143a

Add `pipeline_tag: text-classification` and improve link visibility (#1)

- Add `pipeline_tag: text-classification` and improve link visibility (18d5728f61b0b46d66a744db9c8f67474227e4f7)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +40 -38
README.md CHANGED
@@ -1,5 +1,11 @@
  ---
  library_name: adaptive-classifier
  tags:
  - llm
  - routing
@@ -7,30 +13,32 @@ tags:
  - bert
  - router-arena
  - model-selection
- language:
- - en
- metrics:
- - accuracy
- license: apache-2.0
  ---

  # Chayan: Multi-Model LLM Router

- **Chayan** is a high-performance LLM router that intelligently routes between 4 models (gpt-4o-mini, gemini-2.5-flash-lite, gemini-2.5-flash, and gpt-4o) to optimize the accuracy-cost tradeoff.

  ## 🏆 RouterArena Performance

  **Official Leaderboard Results** (8,400 queries):
- - 🥇 **#1 Optimal Accuracy Score: 88.7%** - SOTA! (Best routing decision quality)
- - 🥈 **#2 Optimal Selection Score: 43.0%** - Silver! (Second-best model selection)
- - **#7 Overall** (#5 open-source): 64.9% accuracy, 63.8 arena score
- - **$0.60 per 1K queries** - Cost-efficient routing

  ![RouterArena Leaderboard](routerarena_leaderboard.png)

  **What do these metrics mean?**
- - **Optimal Accuracy**: When Chayan routes to a model, that model gives the correct answer 88.7% of the time
- - **Optimal Selection**: Chayan selects the best available model 43% of the time

  View full leaderboard: [RouterArena](https://routeworks.github.io/) | [PR #24](https://github.com/RouteWorks/RouterArena/pull/24)

@@ -73,9 +81,9 @@ selected_model = max(calibrated_scores.items(), key=lambda x: x[1])[0]
  ## Architecture

  **Core Components:**
- - **Base Model**: BERT-base-uncased embeddings
- - **Classifier**: Adaptive K-NN with prototype memory (FAISS-backed)
- - **Innovation**: Calibrated confidence scores to correct training data imbalance

  **Supported Models:**

@@ -89,18 +97,18 @@ selected_model = max(calibrated_scores.items(), key=lambda x: x[1])[0]
  ## How It Works

  ### Training
- - **Dataset**: RouterArena sub_10 (809 queries)
- - **Oracle Labels**: 4-model cascade strategy (select cheapest successful model)
- - **Training Time**: 19.2 minutes
- - **Method**: K-NN classifier with 3000 prototypes, temperature 0.4

  ### The Calibration Breakthrough

  The uncalibrated router achieved 61.76% accuracy but was biased toward gpt-4o-mini (83% routing). This happened because the training data had class imbalance:
- - 57% gpt-4o-mini examples
- - 27% gpt-4o examples
- - 12% gemini-flash-lite examples
- - 4% gemini-flash examples

  **Solution**: Apply post-training calibration factors to correct the bias without retraining.

@@ -121,10 +129,10 @@ The uncalibrated router achieved 61.76% accuracy but was biased toward gpt-4o-mi
  **Key Insight**: Chayan achieves 99% of perfect oracle performance at 57% lower cost.

  **Full Dataset (8,400 queries):**
- - **Optimal Accuracy**: 88.7% (🥇 #1)
- - **Optimal Selection**: 43.0% (🥈 #2)
- - **Overall Accuracy**: 64.9% (#7 overall, #5 open-source)
- - **Cost**: $0.60/1K queries

  ## Advanced Usage

@@ -144,10 +152,10 @@ predictions = router.predict(augmented, k=4)

  ## Limitations

- - Calibration factors optimized on RouterArena sub_10; may require adjustment for other domains
- - Requires the 4 specific models to be available via API
- - Performance depends on query distribution similar to RouterArena benchmark
- - Cost estimates assume ~500 tokens per query

  ## Citation

@@ -159,10 +167,4 @@ predictions = router.predict(augmented, k=4)
  publisher = {GitHub},
  url = {https://github.com/codelion/adaptive-classifier}
  }
- ```
-
- ## Links
-
- - **Library**: https://github.com/codelion/adaptive-classifier
- - **RouterArena**: https://routeworks.github.io/
- - **RouterArena Paper**: https://arxiv.org/abs/2510.00202
 
  ---
+ language:
+ - en
  library_name: adaptive-classifier
+ license: apache-2.0
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
  tags:
  - llm
  - routing
 
  - bert
  - router-arena
  - model-selection
  ---

  # Chayan: Multi-Model LLM Router

+ This model is a high-performance LLM router presented in the paper [RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers](https://huggingface.co/papers/2510.00202).
+
+ - 📚 Paper (Hugging Face): [RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers](https://huggingface.co/papers/2510.00202)
+ - 📚 Paper (arXiv): https://arxiv.org/abs/2510.00202
+ - 💻 Library Code: https://github.com/codelion/adaptive-classifier
+ - 🌐 RouterArena Project Page: https://routeworks.github.io/
+
+ **Chayan** intelligently routes between 4 models (gpt-4o-mini, gemini-2.5-flash-lite, gemini-2.5-flash, and gpt-4o) to optimize the accuracy-cost tradeoff.

  ## 🏆 RouterArena Performance

  **Official Leaderboard Results** (8,400 queries):
+ - 🥇 **#1 Optimal Accuracy Score: 88.7%** - SOTA! (Best routing decision quality)
+ - 🥈 **#2 Optimal Selection Score: 43.0%** - Silver! (Second-best model selection)
+ - **#7 Overall** (#5 open-source): 64.9% accuracy, 63.8 arena score
+ - **$0.60 per 1K queries** - Cost-efficient routing

  ![RouterArena Leaderboard](routerarena_leaderboard.png)

  **What do these metrics mean?**
+ - **Optimal Accuracy**: When Chayan routes to a model, that model gives the correct answer 88.7% of the time
+ - **Optimal Selection**: Chayan selects the best available model 43% of the time

  View full leaderboard: [RouterArena](https://routeworks.github.io/) | [PR #24](https://github.com/RouteWorks/RouterArena/pull/24)

 
  ## Architecture

  **Core Components:**
+ - **Base Model**: BERT-base-uncased embeddings
+ - **Classifier**: Adaptive K-NN with prototype memory (FAISS-backed)
+ - **Innovation**: Calibrated confidence scores to correct training data imbalance

  **Supported Models:**

 
  ## How It Works

  ### Training
+ - **Dataset**: RouterArena sub_10 (809 queries)
+ - **Oracle Labels**: 4-model cascade strategy (select cheapest successful model)
+ - **Training Time**: 19.2 minutes
+ - **Method**: K-NN classifier with 3000 prototypes, temperature 0.4
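
The oracle-labeling rule in the Training list (select the cheapest model that answers correctly) can be sketched as follows. This is a minimal illustration, not the training script used for Chayan: the `oracle_label` helper, the cascade order in `MODELS_BY_COST` (taken from the order in which the README lists the models, which may not match real pricing), and the fallback when no model succeeds are all assumptions.

```python
from typing import Dict, List

# Assumed cascade order, cheapest first. This follows the order in which the
# README lists the models; actual per-token prices may rank them differently.
MODELS_BY_COST: List[str] = [
    "gpt-4o-mini",
    "gemini-2.5-flash-lite",
    "gemini-2.5-flash",
    "gpt-4o",
]


def oracle_label(correct: Dict[str, bool]) -> str:
    """Return the first (cheapest) model in the cascade that answered correctly.

    `correct` maps model name -> whether that model solved the query.
    Hypothetical helper, for illustration only.
    """
    for model in MODELS_BY_COST:
        if correct.get(model, False):
            return model
    # Assumption: when no model succeeds, label with the last model in the cascade.
    return MODELS_BY_COST[-1]


# Example: the two cheapest models fail, so the label is the third one.
example = {
    "gpt-4o-mini": False,
    "gemini-2.5-flash-lite": False,
    "gemini-2.5-flash": True,
    "gpt-4o": True,
}
label = oracle_label(example)
```

A labeling rule of this shape would naturally over-represent the first model in the cascade, since it wins every query it can solve.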

  ### The Calibration Breakthrough

  The uncalibrated router achieved 61.76% accuracy but was biased toward gpt-4o-mini (83% routing). This happened because the training data had class imbalance:
+ - 57% gpt-4o-mini examples
+ - 27% gpt-4o examples
+ - 12% gemini-flash-lite examples
+ - 4% gemini-flash examples

  **Solution**: Apply post-training calibration factors to correct the bias without retraining.
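
One way to picture post-training calibration is as a per-model multiplier applied to the classifier's raw scores before taking the argmax. The sketch below is illustrative only: the factor values in `CALIBRATION` and the `calibrate` helper are hypothetical and are not the factors shipped with Chayan. Only the final `max(...)` selection line mirrors the README's own usage snippet.

```python
from typing import Dict

# Hypothetical calibration factors (NOT the values used by Chayan):
# below 1.0 damps the over-represented class, above 1.0 boosts the others.
CALIBRATION: Dict[str, float] = {
    "gpt-4o-mini": 0.9,
    "gpt-4o": 1.5,
    "gemini-2.5-flash-lite": 1.5,
    "gemini-2.5-flash": 1.8,
}


def calibrate(raw_scores: Dict[str, float]) -> Dict[str, float]:
    """Scale each raw score by its factor, then renormalize to sum to 1."""
    scaled = {m: s * CALIBRATION.get(m, 1.0) for m, s in raw_scores.items()}
    total = sum(scaled.values())
    if total <= 0:
        return scaled
    return {m: s / total for m, s in scaled.items()}


# Made-up raw scores where the uncalibrated argmax would be gpt-4o-mini.
raw_scores = {
    "gpt-4o-mini": 0.45,
    "gpt-4o": 0.35,
    "gemini-2.5-flash-lite": 0.15,
    "gemini-2.5-flash": 0.05,
}
calibrated_scores = calibrate(raw_scores)
selected_model = max(calibrated_scores.items(), key=lambda x: x[1])[0]
```

Because the factors are applied at inference time, they can be retuned for a new query distribution without retraining the classifier.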
 
  **Key Insight**: Chayan achieves 99% of perfect oracle performance at 57% lower cost.

  **Full Dataset (8,400 queries):**
+ - **Optimal Accuracy**: 88.7% (🥇 #1)
+ - **Optimal Selection**: 43.0% (🥈 #2)
+ - **Overall Accuracy**: 64.9% (#7 overall, #5 open-source)
+ - **Cost**: $0.60/1K queries

  ## Advanced Usage

 

  ## Limitations

+ - Calibration factors optimized on RouterArena sub_10; may require adjustment for other domains
+ - Requires the 4 specific models to be available via API
+ - Performance depends on query distribution similar to RouterArena benchmark
+ - Cost estimates assume ~500 tokens per query
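
To see what the ~500-token assumption implies, the reported $0.60 per 1K queries can be converted into a blended per-token price and rescaled for other workloads. This is back-of-the-envelope arithmetic on the README's two figures, assuming cost scales linearly with tokens; `estimate_cost` is a hypothetical helper, not part of the library.

```python
COST_PER_1K_QUERIES_USD = 0.60  # reported leaderboard cost
TOKENS_PER_QUERY = 500          # approximate, from the Limitations list

# Blended price implied by the two figures above.
cost_per_query_usd = COST_PER_1K_QUERIES_USD / 1000
implied_usd_per_million_tokens = cost_per_query_usd / TOKENS_PER_QUERY * 1_000_000


def estimate_cost(n_queries: int, tokens_per_query: float = TOKENS_PER_QUERY) -> float:
    """Rough cost in USD, assuming cost scales linearly with total tokens."""
    total_tokens = n_queries * tokens_per_query
    return total_tokens * implied_usd_per_million_tokens / 1_000_000


# Doubling the average query length doubles the estimate.
baseline = estimate_cost(1000)
longer_queries = estimate_cost(1000, tokens_per_query=1000)
```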

  ## Citation

 
  publisher = {GitHub},
  url = {https://github.com/codelion/adaptive-classifier}
  }
+ ```