nazdridoy committed · verified
Commit dd78fc1 · 1 Parent(s): a6a8ac0

refactor(ui): separate model and provider selection


- [docs] Update `README` for new model/provider format and examples (README.md)
- [refactor] Remove `parse_model_and_provider` import (chat_handler.py:17)
- [refactor] Add `provider_override` parameter to `chat_respond` (chat_handler.py:32)
- [refactor] Enforce explicit provider selection in `chat_respond` (chat_handler.py:57-58)
- [refactor] Update `handle_chat_submit` and `handle_chat_retry` signatures to include `provider` (chat_handler.py:179,214)
- [ui] Modify `chat_model_name` textbox placeholder and info (ui_components.py:47-48)
- [ui] Add `chat_provider` dropdown for explicit provider selection (ui_components.py:50-55)
- [ui] Update `chat_submit.click`, `chat_input.submit`, and `chatbot_display.retry` inputs with `chat_provider` (ui_components.py:85,98,122)
- [docs] Update `create_chat_tips` and `create_footer` markdown for provider selection (ui_components.py:137-159,663)
- [remove] Delete `parse_model_and_provider` function (utils.py:196-204)
- [refactor] Consolidate and expand provider lists into `PROVIDERS_UNIFIED` (utils.py:41-63)
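The core of the change can be sketched as a small stand-in (the function name `resolve_model_and_provider` is illustrative, not from the codebase): the model textbox value is now used verbatim, and the provider comes solely from the dropdown, defaulting to `auto` when unset.

```python
def resolve_model_and_provider(model_name, provider_override):
    """Resolve the (model, provider) pair under the new UI contract:
    the textbox supplies only the model id, and the dropdown supplies
    the provider, falling back to "auto" when unset."""
    model = model_name  # no more ":provider" suffix parsing
    provider = provider_override or "auto"
    return model, provider

# The dropdown default (None/empty -> "auto") and an explicit choice:
print(resolve_model_and_provider("openai/gpt-oss-20b", None))   # ('openai/gpt-oss-20b', 'auto')
print(resolve_model_and_provider("openai/gpt-oss-20b", "fireworks-ai"))
```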

Files changed (4)
  1. README.md +9 -10
  2. chat_handler.py +10 -7
  3. ui_components.py +17 -16
  4. utils.py +22 -13
README.md CHANGED
@@ -70,7 +70,7 @@ The app requires:
70
  4. **Automatic Rotation**: HF-Inferoxy handles token rotation and error management
71
 
72
  ### Chat Assistant
73
- 1. **Model Selection**: Choose any HuggingFace model with optional provider specification
74
  2. **Conversation**: Engage in natural conversations with streaming responses
75
  3. **Customization**: Adjust the AI's personality with system messages and parameters
76
 
@@ -123,16 +123,11 @@ The application automatically works with all Hugging Face inference providers:
123
 
124
  ### 💡 How It Works
125
 
126
- 1. **Model Format**: Use `model_name` or `model_name:provider` format
127
- 2. **Auto Provider**: When no provider is specified, HF-Inferoxy automatically selects the best available provider
128
  3. **Fallback System**: If one provider fails, the system automatically tries alternatives
129
  4. **Token Management**: HF-Inferoxy handles token rotation and quota management automatically
130
 
131
- **Examples:**
132
- - `openai/gpt-oss-20b` (auto provider selection)
133
- - `openai/gpt-oss-20b:fireworks-ai` (specific provider)
134
- - `Qwen/Qwen-Image:fal-ai` (image model with specific provider)
135
-
136
  ## 🎨 Usage Examples
137
 
138
  ### Chat Assistant
@@ -147,9 +142,11 @@ The application automatically works with all Hugging Face inference providers:
147
  ```
148
  # Auto provider (default - let HF choose best)
149
  Model Name: openai/gpt-oss-20b
 
150
 
151
  # Specific provider
152
- Model Name: openai/gpt-oss-20b:fireworks-ai
 
153
  System Message: You are a helpful coding assistant specializing in Python.
154
  ```
155
 
@@ -223,10 +220,12 @@ System Message: You are a helpful coding assistant specializing in Python.
223
  ```
224
  # Using auto provider (default)
225
  Model: openai/gpt-oss-20b
 
226
  Prompt: "Explain quantum computing in simple terms"
227
 
228
  # Using specific provider
229
- Model: openai/gpt-oss-20b:fireworks-ai
 
230
  Prompt: "Help me debug this Python code: [paste code]"
231
 
232
  # Other example prompts:
 
70
  4. **Automatic Rotation**: HF-Inferoxy handles token rotation and error management
71
 
72
  ### Chat Assistant
73
+ 1. **Model Selection**: Choose any HuggingFace model and select a provider from the dropdown (default: Auto)
74
  2. **Conversation**: Engage in natural conversations with streaming responses
75
  3. **Customization**: Adjust the AI's personality with system messages and parameters
76
 
 
123
 
124
  ### 💡 How It Works
125
 
126
+ 1. **Model Format**: Enter the model name only (e.g., `openai/gpt-oss-20b`)
127
+ 2. **Provider**: Select the provider from the dropdown (default: Auto)
128
  3. **Fallback System**: If one provider fails, the system automatically tries alternatives
129
  4. **Token Management**: HF-Inferoxy handles token rotation and quota management automatically
130
 
 
131
  ## 🎨 Usage Examples
132
 
133
  ### Chat Assistant
 
142
  ```
143
  # Auto provider (default - let HF choose best)
144
  Model Name: openai/gpt-oss-20b
145
+ Provider: auto
146
 
147
  # Specific provider
148
+ Model Name: openai/gpt-oss-20b
149
+ Provider: fireworks-ai
150
  System Message: You are a helpful coding assistant specializing in Python.
151
  ```
152
 
 
220
  ```
221
  # Using auto provider (default)
222
  Model: openai/gpt-oss-20b
223
+ Provider: auto
224
  Prompt: "Explain quantum computing in simple terms"
225
 
226
  # Using specific provider
227
+ Model: openai/gpt-oss-20b
228
+ Provider: fireworks-ai
229
  Prompt: "Help me debug this Python code: [paste code]"
230
 
231
  # Other example prompts:
chat_handler.py CHANGED
@@ -14,7 +14,6 @@ from requests.exceptions import ConnectionError, Timeout, RequestException
14
  from hf_token_utils import get_proxy_token, report_token_status
15
  from utils import (
16
  validate_proxy_key,
17
- parse_model_and_provider,
18
  format_error_message,
19
  check_org_access,
20
  format_access_denied_message,
@@ -30,6 +29,7 @@ def chat_respond(
30
  history: list[dict[str, str]],
31
  system_message,
32
  model_name,
 
33
  max_tokens,
34
  temperature,
35
  top_p,
@@ -52,8 +52,9 @@ def chat_respond(
52
  token, token_id = get_proxy_token(api_key=proxy_api_key)
53
  print(f"✅ Chat: Got token: {token_id}")
54
 
55
- # Parse model name and provider if specified
56
- model, provider = parse_model_and_provider(model_name)
 
57
 
58
  print(f"🤖 Chat: Using model='{model}', provider='{provider if provider else 'auto'}'")
59
 
@@ -168,14 +169,14 @@ def chat_respond(
168
  yield format_error_message("Unexpected Error", f"An unexpected error occurred: {error_msg}")
169
 
170
 
171
- def handle_chat_submit(message, history, system_msg, model_name, max_tokens, temperature, top_p, hf_token: gr.OAuthToken = None):
172
  """
173
  Handle chat submission and manage conversation history with streaming.
174
  """
175
  if not message.strip():
176
  yield history, ""
177
  return
178
-
179
  # Enforce org-based access control via HF OAuth token
180
  access_token = getattr(hf_token, "token", None) if hf_token is not None else None
181
  is_allowed, access_msg, _username, _matched = check_org_access(access_token)
@@ -194,7 +195,8 @@ def handle_chat_submit(message, history, system_msg, model_name, max_tokens, tem
194
  message,
195
  history[:-1], # Don't include the current message in history for the function
196
  system_msg,
197
- model_name,
 
198
  max_tokens,
199
  temperature,
200
  top_p
@@ -209,7 +211,7 @@ def handle_chat_submit(message, history, system_msg, model_name, max_tokens, tem
209
  yield current_history, ""
210
 
211
 
212
- def handle_chat_retry(history, system_msg, model_name, max_tokens, temperature, top_p, hf_token: gr.OAuthToken = None, retry_data=None):
213
  """
214
  Retry the assistant response for the selected message.
215
  Works with gr.Chatbot.retry() which provides retry_data.index for the message.
@@ -268,6 +270,7 @@ def handle_chat_retry(history, system_msg, model_name, max_tokens, temperature,
268
  prior_history,
269
  system_msg,
270
  model_name,
 
271
  max_tokens,
272
  temperature,
273
  top_p
 
14
  from hf_token_utils import get_proxy_token, report_token_status
15
  from utils import (
16
  validate_proxy_key,
 
17
  format_error_message,
18
  check_org_access,
19
  format_access_denied_message,
 
29
  history: list[dict[str, str]],
30
  system_message,
31
  model_name,
32
+ provider_override,
33
  max_tokens,
34
  temperature,
35
  top_p,
 
52
  token, token_id = get_proxy_token(api_key=proxy_api_key)
53
  print(f"✅ Chat: Got token: {token_id}")
54
 
55
+ # Enforce explicit provider selection via dropdown
56
+ model = model_name
57
+ provider = provider_override or "auto"
58
 
59
  print(f"🤖 Chat: Using model='{model}', provider='{provider if provider else 'auto'}'")
60
 
 
169
  yield format_error_message("Unexpected Error", f"An unexpected error occurred: {error_msg}")
170
 
171
 
172
+ def handle_chat_submit(message, history, system_msg, model_name, provider, max_tokens, temperature, top_p, hf_token: gr.OAuthToken = None):
173
  """
174
  Handle chat submission and manage conversation history with streaming.
175
  """
176
  if not message.strip():
177
  yield history, ""
178
  return
179
+
180
  # Enforce org-based access control via HF OAuth token
181
  access_token = getattr(hf_token, "token", None) if hf_token is not None else None
182
  is_allowed, access_msg, _username, _matched = check_org_access(access_token)
 
195
  message,
196
  history[:-1], # Don't include the current message in history for the function
197
  system_msg,
198
+ model_name,
199
+ provider,
200
  max_tokens,
201
  temperature,
202
  top_p
 
211
  yield current_history, ""
212
 
213
 
214
+ def handle_chat_retry(history, system_msg, model_name, provider, max_tokens, temperature, top_p, hf_token: gr.OAuthToken = None, retry_data=None):
215
  """
216
  Retry the assistant response for the selected message.
217
  Works with gr.Chatbot.retry() which provides retry_data.index for the message.
 
270
  prior_history,
271
  system_msg,
272
  model_name,
273
+ provider,
274
  max_tokens,
275
  temperature,
276
  top_p
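The signature changes above simply thread the dropdown value through to the responder. A minimal runnable sketch of that plumbing (function bodies here are hypothetical stand-ins; only the parameter order mirrors the diff):

```python
def chat_respond(message, history, system_message, model_name,
                 provider_override, max_tokens, temperature, top_p):
    # Stand-in for the real streaming handler: it only echoes which
    # model/provider pair the new signature receives.
    provider = provider_override or "auto"
    yield f"[{model_name} via {provider}] {message}"

def handle_chat_submit(message, history, system_msg, model_name,
                       provider, max_tokens, temperature, top_p):
    # The UI passes the dropdown value straight through, in the same
    # position it occupies in the updated inputs=[...] lists.
    for chunk in chat_respond(message, history, system_msg, model_name,
                              provider, max_tokens, temperature, top_p):
        yield chunk

for chunk in handle_chat_submit("hi", [], "", "openai/gpt-oss-20b",
                                "groq", 512, 0.7, 0.95):
    print(chunk)  # [openai/gpt-oss-20b via groq] hi
```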
ui_components.py CHANGED
@@ -44,7 +44,14 @@ def create_chat_tab(handle_chat_submit_fn, handle_chat_retry_fn=None):
44
  chat_model_name = gr.Textbox(
45
  value=DEFAULT_CHAT_MODEL,
46
  label="Model Name",
47
- placeholder="e.g., openai/gpt-oss-20b or openai/gpt-oss-20b:fireworks-ai"
48
  )
49
  chat_system_message = gr.Textbox(
50
  value=CHAT_CONFIG["system_message"],
@@ -82,7 +89,7 @@ def create_chat_tab(handle_chat_submit_fn, handle_chat_retry_fn=None):
82
  chat_send_event = chat_submit.click(
83
  fn=handle_chat_submit_fn,
84
  inputs=[chat_input, chatbot_display, chat_system_message, chat_model_name,
85
- chat_max_tokens, chat_temperature, chat_top_p],
86
  outputs=[chatbot_display, chat_input]
87
  )
88
 
@@ -97,7 +104,7 @@ def create_chat_tab(handle_chat_submit_fn, handle_chat_retry_fn=None):
97
  chat_enter_event = chat_input.submit(
98
  fn=handle_chat_submit_fn,
99
  inputs=[chat_input, chatbot_display, chat_system_message, chat_model_name,
100
- chat_max_tokens, chat_temperature, chat_top_p],
101
  outputs=[chatbot_display, chat_input]
102
  )
103
 
@@ -119,7 +126,7 @@ def create_chat_tab(handle_chat_submit_fn, handle_chat_retry_fn=None):
119
  chatbot_display.retry(
120
  fn=handle_chat_retry_fn,
121
  inputs=[chatbot_display, chat_system_message, chat_model_name,
122
- chat_max_tokens, chat_temperature, chat_top_p],
123
  outputs=chatbot_display
124
  )
125
 
@@ -132,8 +139,8 @@ def create_chat_tips():
132
  ### 💡 Chat Tips
133
 
134
  **Model Format:**
135
- - Single model: `openai/gpt-oss-20b` (uses auto provider)
136
- - With provider: `openai/gpt-oss-20b:fireworks-ai`
137
 
138
  **Popular Models:**
139
  - `openai/gpt-oss-20b` - Fast general purpose
@@ -146,16 +153,10 @@ def create_chat_tips():
146
  gr.Markdown("""
147
  ### 🚀 Popular Providers
148
 
149
- - **auto** - Let HF choose best provider (default)
150
- - **fireworks-ai** - Fast and reliable
151
- - **cerebras** - High performance
152
- - **groq** - Ultra-fast inference
153
- - **together** - Wide model support
154
- - **cohere** - Advanced language models
155
 
156
- **Examples:**
157
- - `openai/gpt-oss-20b` (auto provider)
158
- - `openai/gpt-oss-20b:fireworks-ai` (specific provider)
159
  """)
160
 
161
 
@@ -662,7 +663,7 @@ def create_footer():
662
 
663
  **Chat Tab:**
664
  - Enter your message and customize the AI's behavior with system messages
665
- - Choose models and providers using the format `model:provider`
666
  - Adjust temperature for creativity and top-p for response diversity
667
 
668
  **Image Tab:**
 
44
  chat_model_name = gr.Textbox(
45
  value=DEFAULT_CHAT_MODEL,
46
  label="Model Name",
47
+ placeholder="e.g., openai/gpt-oss-20b (provider via dropdown)",
48
+ info="Do not include :provider in model name"
49
+ )
50
+ chat_provider = gr.Dropdown(
51
+ choices=IMAGE_PROVIDERS,
52
+ value="auto",
53
+ label="Provider",
54
+ interactive=True
55
  )
56
  chat_system_message = gr.Textbox(
57
  value=CHAT_CONFIG["system_message"],
 
89
  chat_send_event = chat_submit.click(
90
  fn=handle_chat_submit_fn,
91
  inputs=[chat_input, chatbot_display, chat_system_message, chat_model_name,
92
+ chat_provider, chat_max_tokens, chat_temperature, chat_top_p],
93
  outputs=[chatbot_display, chat_input]
94
  )
95
 
 
104
  chat_enter_event = chat_input.submit(
105
  fn=handle_chat_submit_fn,
106
  inputs=[chat_input, chatbot_display, chat_system_message, chat_model_name,
107
+ chat_provider, chat_max_tokens, chat_temperature, chat_top_p],
108
  outputs=[chatbot_display, chat_input]
109
  )
110
 
 
126
  chatbot_display.retry(
127
  fn=handle_chat_retry_fn,
128
  inputs=[chatbot_display, chat_system_message, chat_model_name,
129
+ chat_provider, chat_max_tokens, chat_temperature, chat_top_p],
130
  outputs=chatbot_display
131
  )
132
 
 
139
  ### 💡 Chat Tips
140
 
141
  **Model Format:**
142
+ - Model only: `openai/gpt-oss-20b`
143
+ - Select provider via the Provider dropdown (default: `auto`)
144
 
145
  **Popular Models:**
146
  - `openai/gpt-oss-20b` - Fast general purpose
 
153
  gr.Markdown("""
154
  ### 🚀 Popular Providers
155
 
156
+ - Select from dropdown. Default is **auto**.
157
 
158
+ **Example:**
159
+ - Model: `openai/gpt-oss-20b`, Provider: `groq`
 
160
  """)
161
 
162
 
 
663
 
664
  **Chat Tab:**
665
  - Enter your message and customize the AI's behavior with system messages
666
+ - Enter model and select provider from the dropdown (default: `auto`)
667
  - Adjust temperature for creativity and top-p for response diversity
668
 
669
  **Image Tab:**
utils.py CHANGED
@@ -35,9 +35,28 @@ IMAGE_CONFIG = {
35
  "negative_prompt": "blurry, low quality, distorted, deformed, ugly, bad anatomy"
36
  }
37
 
38
- # Supported providers
39
- CHAT_PROVIDERS = ["auto", "fireworks-ai", "cerebras", "groq", "together", "cohere"]
40
- IMAGE_PROVIDERS = ["hf-inference", "fal-ai", "nebius", "nscale", "replicate", "together"]
41
 
42
  # Popular models for quick access
43
  POPULAR_CHAT_MODELS = [
@@ -196,16 +215,6 @@ def validate_proxy_url():
196
  return True, ""
197
 
198
 
199
- def parse_model_and_provider(model_name):
200
- """
201
- Parse model name and provider from a string like 'model:provider'.
202
- Returns (model, provider) tuple. Provider is None if not specified.
203
- """
204
- if ":" in model_name:
205
- model, provider = model_name.split(":", 1)
206
- return model, provider
207
- else:
208
- return model_name, None
209
 
210
 
211
  def format_error_message(error_type, error_message):
 
35
  "negative_prompt": "blurry, low quality, distorted, deformed, ugly, bad anatomy"
36
  }
37
 
38
+ # Supported providers (unified across tasks)
39
+ PROVIDERS_UNIFIED = [
40
+ "auto",
41
+ "cerebras",
42
+ "cohere",
43
+ "fal-ai",
44
+ "featherless-ai",
45
+ "fireworks-ai",
46
+ "groq",
47
+ "hf-inference",
48
+ "hyperbolic",
49
+ "nebius",
50
+ "novita",
51
+ "nscale",
52
+ "replicate",
53
+ "sambanova",
54
+ "together",
55
+ ]
56
+
57
+ # Backwards compatibility exported lists
58
+ CHAT_PROVIDERS = PROVIDERS_UNIFIED
59
+ IMAGE_PROVIDERS = PROVIDERS_UNIFIED
60
 
61
  # Popular models for quick access
62
  POPULAR_CHAT_MODELS = [
 
215
  return True, ""
216
 
217
 
 
 
219
 
220
  def format_error_message(error_type, error_message):
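Since the compatibility names are plain rebindings of the same list object, any code still importing `CHAT_PROVIDERS` or `IMAGE_PROVIDERS` (as the new chat dropdown does) sees the full unified set. A quick check:

```python
PROVIDERS_UNIFIED = [
    "auto", "cerebras", "cohere", "fal-ai", "featherless-ai",
    "fireworks-ai", "groq", "hf-inference", "hyperbolic", "nebius",
    "novita", "nscale", "replicate", "sambanova", "together",
]

# Backwards-compatibility aliases: same object, not copies, so the
# old import sites stay in sync with the unified list automatically.
CHAT_PROVIDERS = PROVIDERS_UNIFIED
IMAGE_PROVIDERS = PROVIDERS_UNIFIED

assert CHAT_PROVIDERS is IMAGE_PROVIDERS
assert len(PROVIDERS_UNIFIED) == 15 and PROVIDERS_UNIFIED[0] == "auto"
```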