zoharzaig committed on
Commit 42cdcd8 · verified · 1 Parent(s): 993d540

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,493 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:166106
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ widget:
+ - source_sentence: Not another call about this...
+   sentences:
+   - The heart with ribbon emoji is commonly used to express love, affection, or gratitude.
+     It can be sent to a loved one to show appreciation or as a symbol of a special
+     bond. It is also used in celebrations like Valentine's Day or birthdays.
+   - The ox emoji is often used to symbolize strength, power, and determination. It
+     can also represent hard work, reliability, and persistence. In Chinese culture,
+     the ox is associated with prosperity and hard work. This emoji can be used in
+     contexts related to agriculture, farming, and the zodiac sign of the Ox.
+   - The 😾 emoji, known as pouting cat, is often used to express annoyance, displeasure,
+     or stubbornness. It can convey a sense of sulking or being upset about something.
+ - source_sentence: A peaceful beach day sounds amazing right now.
+   sentences:
+   - This emoji depicts a woman fairy with medium-light skin tone. It is often used
+     to represent magic, fantasy, and whimsical themes. It can also symbolize playfulness,
+     enchantment, and fairy tales.
+   - The backpack emoji is used to symbolize carrying items while traveling, hiking,
+     or going to school. It can also represent adventure and exploration.
+   - The spiral shell emoji is often used to represent the beach, ocean, sea life,
+     or vacation. It can also symbolize relaxation, tranquility, and summer vibes.
+ - source_sentence: Do you know how far it is to the nearest ladies room?
+   sentences:
+   - The women's room emoji is used to indicate a restroom designated for females.
+     It is commonly used to specify the location of women's bathrooms in public spaces
+     or buildings.
+   - The grinning face emoji is used to express happiness, joy, or a friendly greeting.
+     It can also be used to show excitement or amusement. This emoji is commonly used
+     in casual conversations.
+   - The person walking facing right emoji with light skin tone is used to represent
+     someone walking to the right. It can symbolize movement, walking, or simply the
+     action of going in the direction of the right.
+ - source_sentence: Let's fix this together.
+   sentences:
+   - The 💯 emoji is used to emphasize perfection, excellence, or a job well done. It
+     can also be used to indicate that something is top-notch or 100% correct. This
+     emoji is commonly used to show approval or admiration for something.
+   - The rightwards pushing hand emoji is used to indicate pushing or movement to the
+     right. It can also be used in a friendly manner to encourage someone to go in
+     a certain direction or to express agreement or approval.
+   - This emoji represents a man kneeling with light skin tone. It is commonly used
+     to show humility, gratitude, or to ask for forgiveness. It can also be used in
+     the context of prayer or protest.
+ - source_sentence: I love how the sound of popping bubbles is so relaxing
+   sentences:
+   - The man surfing emoji with dark skin tone is used to represent someone enjoying
+     surfing in the ocean. It can be used in the context of beach activities, water
+     sports, or simply to convey a sense of relaxation and fun.
+   - This emoji is often used to signify raising a hand in agreement, asking a question,
+     or volunteering for something. It can also be used to show excitement or cheerfulness.
+   - The 🫧 emoji is commonly used to represent bubbles or to express a feeling of lightness
+     and playfulness. It can also be used in contexts related to cleanliness, baths,
+     or water.
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+ 
+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
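The Pooling module above uses mean pooling (`pooling_mode_mean_tokens: True`): token embeddings are averaged, counting only non-padding positions. A minimal numpy sketch of that idea, with illustrative array names and shapes rather than the library's internals:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    # Average only over real tokens; padding positions are masked out.
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

tokens = np.array([[[2.0, 4.0], [6.0, 8.0], [9.0, 9.0]]])  # last position is padding
mask = np.array([[1, 1, 0]])
print(mean_pool(tokens, mask))  # [[4. 6.]] — the padded position is ignored
```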
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference:
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("zoharzaig/emoji-prediction-model")
+ # Run inference
+ sentences = [
+     'I love how the sound of popping bubbles is so relaxing',
+     'The \U0001fae7 emoji is commonly used to represent bubbles or to express a feeling of lightness and playfulness. It can also be used in contexts related to cleanliness, baths, or water.',
+     'The man surfing emoji with dark skin tone is used to represent someone enjoying surfing in the ocean. It can be used in the context of beach activities, water sports, or simply to convey a sense of relaxation and fun.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[1.0000, 0.6450, 0.2089],
+ #         [0.6450, 1.0000, 0.3229],
+ #         [0.2089, 0.3229, 1.0000]])
+ ```
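Because the model ends with a Normalize module, the cosine-similarity matrix printed above is simply the matrix product of the unit-length embeddings. A minimal sketch of that equivalence in plain numpy (random vectors stand in for real model outputs):

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    # Normalise each row to unit length; pairwise cosine similarity is
    # then a plain matrix product, mirroring what model.similarity()
    # computes when similarity_fn_name is "cosine".
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T

rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 384))  # stand-in for model.encode(...) output
sim = cosine_similarity_matrix(emb)
print(np.allclose(np.diag(sim), 1.0))  # True: each vector matches itself
```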
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 166,106 training samples
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 |
+   |:--------|:-----------|:-----------|
+   | type    | string     | string     |
+   | details | <ul><li>min: 5 tokens</li><li>mean: 11.95 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>min: 26 tokens</li><li>mean: 45.67 tokens</li><li>max: 78 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 |
+   |:-----------|:-----------|
+   | <code>Solidarity and strength united.</code> | <code>The left-facing fist emoji with medium skin tone is often used to represent solidarity, unity, and strength. It can also be used to show support or encouragement.</code> |
+   | <code>Have you ever seen anything strange while stargazing?</code> | <code>The alien emoji is often used to represent anything extraterrestrial or outer-space related. It can also be used to convey a sense of otherness or weirdness. Some people use it to refer to conspiracy theories or unidentified flying objects.</code> |
+   | <code>Can't wait until I can hold you again.</code> | <code>This emoji represents a kiss between two men, it is often used to symbolize love, affection, or a romantic relationship between two male individuals.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
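MultipleNegativesRankingLoss treats every other positive in the batch as a negative for a given anchor: the cosine-similarity matrix between anchors and positives is scaled (here by 20.0) and scored with cross-entropy, with the matching pair on the diagonal as the target. A hedged numpy sketch of that objective, not the library's implementation:

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    # Scaled cosine similarities between each anchor and every in-batch positive.
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = scale * (unit(anchors) @ unit(positives).T)
    # Cross-entropy with the matching pair (the diagonal) as the label.
    row_max = sims.max(axis=1, keepdims=True)
    log_probs = sims - row_max - np.log(np.exp(sims - row_max).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()

rng = np.random.default_rng(1)
a = rng.normal(size=(4, 8))
loss_matched = mnr_loss(a, a)                        # correct anchor/positive pairing
loss_shuffled = mnr_loss(a, np.roll(a, 1, axis=0))   # deliberately wrong pairing
print(loss_matched < loss_shuffled)  # True: matched pairs score lower loss
```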
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 5
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+ 
+ </details>
327
+
328
+ ### Training Logs
329
+ <details><summary>Click to expand</summary>
330
+
331
+ | Epoch | Step | Training Loss |
332
+ |:------:|:-----:|:-------------:|
333
+ | 0.0482 | 500 | 1.2811 |
334
+ | 0.0963 | 1000 | 1.014 |
335
+ | 0.1445 | 1500 | 0.874 |
336
+ | 0.1926 | 2000 | 0.7975 |
337
+ | 0.2408 | 2500 | 0.7261 |
338
+ | 0.2890 | 3000 | 0.6903 |
339
+ | 0.3371 | 3500 | 0.6547 |
340
+ | 0.3853 | 4000 | 0.6412 |
341
+ | 0.4334 | 4500 | 0.5815 |
342
+ | 0.4816 | 5000 | 0.5765 |
343
+ | 0.5298 | 5500 | 0.5461 |
344
+ | 0.5779 | 6000 | 0.5402 |
345
+ | 0.6261 | 6500 | 0.5347 |
346
+ | 0.6742 | 7000 | 0.5102 |
347
+ | 0.7224 | 7500 | 0.488 |
348
+ | 0.7706 | 8000 | 0.4968 |
349
+ | 0.8187 | 8500 | 0.4758 |
350
+ | 0.8669 | 9000 | 0.4725 |
351
+ | 0.9150 | 9500 | 0.4564 |
352
+ | 0.9632 | 10000 | 0.4563 |
353
+ | 1.0114 | 10500 | 0.4213 |
354
+ | 1.0595 | 11000 | 0.381 |
355
+ | 1.1077 | 11500 | 0.376 |
356
+ | 1.1558 | 12000 | 0.3991 |
357
+ | 1.2040 | 12500 | 0.3845 |
358
+ | 1.2522 | 13000 | 0.377 |
359
+ | 1.3003 | 13500 | 0.3752 |
360
+ | 1.3485 | 14000 | 0.3648 |
361
+ | 1.3966 | 14500 | 0.3914 |
362
+ | 1.4448 | 15000 | 0.3665 |
363
+ | 1.4930 | 15500 | 0.3867 |
364
+ | 1.5411 | 16000 | 0.3606 |
365
+ | 1.5893 | 16500 | 0.3706 |
366
+ | 1.6374 | 17000 | 0.3462 |
367
+ | 1.6856 | 17500 | 0.3616 |
368
+ | 1.7338 | 18000 | 0.3424 |
369
+ | 1.7819 | 18500 | 0.3465 |
370
+ | 1.8301 | 19000 | 0.3433 |
371
+ | 1.8783 | 19500 | 0.336 |
372
+ | 1.9264 | 20000 | 0.3448 |
373
+ | 1.9746 | 20500 | 0.3463 |
374
+ | 2.0227 | 21000 | 0.3171 |
375
+ | 2.0709 | 21500 | 0.3087 |
376
+ | 2.1191 | 22000 | 0.2961 |
377
+ | 2.1672 | 22500 | 0.2991 |
378
+ | 2.2154 | 23000 | 0.3007 |
379
+ | 2.2635 | 23500 | 0.2982 |
380
+ | 2.3117 | 24000 | 0.2886 |
381
+ | 2.3599 | 24500 | 0.2881 |
382
+ | 2.4080 | 25000 | 0.2867 |
383
+ | 2.4562 | 25500 | 0.2998 |
384
+ | 2.5043 | 26000 | 0.2942 |
385
+ | 2.5525 | 26500 | 0.2941 |
386
+ | 2.6007 | 27000 | 0.2938 |
387
+ | 2.6488 | 27500 | 0.2776 |
388
+ | 2.6970 | 28000 | 0.2705 |
389
+ | 2.7451 | 28500 | 0.2949 |
390
+ | 2.7933 | 29000 | 0.2856 |
391
+ | 2.8415 | 29500 | 0.2724 |
392
+ | 2.8896 | 30000 | 0.2891 |
393
+ | 2.9378 | 30500 | 0.2835 |
394
+ | 2.9859 | 31000 | 0.2896 |
395
+ | 3.0341 | 31500 | 0.2571 |
396
+ | 3.0823 | 32000 | 0.2541 |
397
+ | 3.1304 | 32500 | 0.2527 |
398
+ | 3.1786 | 33000 | 0.2587 |
399
+ | 3.2267 | 33500 | 0.251 |
400
+ | 3.2749 | 34000 | 0.2437 |
401
+ | 3.3231 | 34500 | 0.252 |
402
+ | 3.3712 | 35000 | 0.2533 |
403
+ | 3.4194 | 35500 | 0.2388 |
404
+ | 3.4675 | 36000 | 0.2391 |
405
+ | 3.5157 | 36500 | 0.2488 |
406
+ | 3.5639 | 37000 | 0.2442 |
407
+ | 3.6120 | 37500 | 0.2398 |
408
+ | 3.6602 | 38000 | 0.254 |
409
+ | 3.7083 | 38500 | 0.2427 |
410
+ | 3.7565 | 39000 | 0.2396 |
411
+ | 3.8047 | 39500 | 0.2409 |
412
+ | 3.8528 | 40000 | 0.2406 |
413
+ | 3.9010 | 40500 | 0.2529 |
414
+ | 3.9491 | 41000 | 0.2485 |
415
+ | 3.9973 | 41500 | 0.2427 |
416
+ | 4.0455 | 42000 | 0.2133 |
417
+ | 4.0936 | 42500 | 0.2337 |
418
+ | 4.1418 | 43000 | 0.2307 |
419
+ | 4.1899 | 43500 | 0.2172 |
420
+ | 4.2381 | 44000 | 0.2137 |
421
+ | 4.2863 | 44500 | 0.235 |
422
+ | 4.3344 | 45000 | 0.2198 |
423
+ | 4.3826 | 45500 | 0.2275 |
424
+ | 4.4307 | 46000 | 0.2339 |
425
+ | 4.4789 | 46500 | 0.227 |
426
+ | 4.5271 | 47000 | 0.2265 |
427
+ | 4.5752 | 47500 | 0.2232 |
428
+ | 4.6234 | 48000 | 0.2248 |
429
+ | 4.6715 | 48500 | 0.223 |
430
+ | 4.7197 | 49000 | 0.2287 |
431
+ | 4.7679 | 49500 | 0.2168 |
432
+ | 4.8160 | 50000 | 0.2262 |
433
+ | 4.8642 | 50500 | 0.2207 |
434
+ | 4.9123 | 51000 | 0.2043 |
435
+ | 4.9605 | 51500 | 0.2233 |
436
+
437
+ </details>
+ 
+ ### Framework Versions
+ - Python: 3.9.6
+ - Sentence Transformers: 5.0.0
+ - Transformers: 4.53.1
+ - PyTorch: 2.7.1
+ - Accelerate: 1.8.1
+ - Datasets: 3.6.0
+ - Tokenizers: 0.21.2
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.53.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "__version__": {
+     "sentence_transformers": "5.0.0",
+     "transformers": "4.53.2",
+     "pytorch": "2.7.1"
+   },
+   "model_type": "SentenceTransformer",
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f5726386c25d380821b8bb997a6b62feb1d27e1a16a9c609e490cfb87f941dfb
+ size 90864192
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
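The Normalize module at idx 2 rescales each pooled sentence vector to unit L2 norm, which is what lets dot product and cosine similarity coincide downstream. A minimal sketch of that step (not the library's implementation):

```python
import numpy as np

def l2_normalize(embeddings, eps=1e-12):
    # Rescale each row to unit Euclidean length, as the Normalize
    # module does to the pooled sentence embeddings.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)

vecs = np.array([[3.0, 4.0], [0.0, 2.0]])
unit = l2_normalize(vecs)
print(unit)  # rows become [0.6, 0.8] and [0.0, 1.0]
```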
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 256,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
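The `truncation_strategy` of `"longest_first"` trims a sentence pair to a length budget by repeatedly dropping a token from whichever sequence is currently longer. A simplified sketch of the idea, ignoring special tokens and working on plain lists of token strings:

```python
def truncate_longest_first(tokens_a, tokens_b, max_length):
    # Drop from the tail of the currently longer sequence until the
    # combined length fits the budget (special tokens not modelled here).
    a, b = list(tokens_a), list(tokens_b)
    while len(a) + len(b) > max_length:
        if len(a) >= len(b):
            a.pop()
        else:
            b.pop()
    return a, b

a, b = truncate_longest_first(["t1", "t2", "t3", "t4"], ["u1", "u2"], 4)
print(a, b)  # ['t1', 't2'] ['u1', 'u2'] — only the longer sequence was trimmed
```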
vocab.txt ADDED
The diff for this file is too large to render. See raw diff