Update README.md
README.md CHANGED
@@ -5,16 +5,16 @@ datasets:
 language:
 - kk
 metrics:
-- name: F1
-  type: F1 Score
+- name: F1 (Before Training)
+  type: F1 Score
   value: 31.405
-- name: Exact Match (
+- name: Exact Match (Before Training)
   type: Exact Match
   value: 14.675
-- name: F1 (
+- name: F1 (After Training)
   type: F1 Score
   value: 56.819
-- name: Exact Match (
+- name: Exact Match (After Training)
   type: Exact Match
   value: 35.454
 base_model:
@@ -33,13 +33,18 @@ This model was developed by **Kundyz Maksutova**, PhD Candidate, as part of rese
 - **Dataset**: `Kundyzka/informatics_kaz`
 - **Language**: Kazakh (`kk`)
 - **Task**: Question Answering
-
-
-
-
-
-
-
+
+### Performance:
+This model demonstrates significant improvements after fine-tuning, as shown by the following metrics:
+
+- **Before Training**:
+  - F1 Score: 31.405
+  - Exact Match (EM): 14.675
+- **After Training**:
+  - F1 Score: 56.819
+  - Exact Match (EM): 35.454
+
+These metrics highlight the enhanced ability of the model to handle domain-specific questions after training on the `Kundyzka/informatics_kaz` dataset.
 
 ### Dataset:
 The `Kundyzka/informatics_kaz` dataset is curated to provide a diverse set of questions and answers in Kazakh, primarily targeting topics in computer science. This dataset ensures the model handles domain-specific terminology effectively.
@@ -54,3 +59,10 @@ This model is designed for answering questions in the Kazakh language, with appl
 - **Domain-Specific Bias**: Performance may drop on topics outside computer science.
 - **Dataset Bias**: Potential biases from the dataset can influence model outputs.
 - **Language Support**: The model is optimized for Kazakh and does not support other languages.
+
+### Tags:
+- `computerscience`
+- `question-answering`
+- `Kazakh`
+
+This model represents a significant step toward advancing natural language processing tools for low-resource languages like Kazakh. For further details or customization, refer to the model repository.
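
Note on the reported metrics: the README does not say how F1 and Exact Match were computed. The sketch below assumes the common SQuAD-style definitions (token-overlap F1 and normalized string equality), which may differ from the evaluation actually used. The helper names `normalize`, `exact_match`, `token_f1`, and `gains` are hypothetical and are not taken from the model repository. The block also works out the before/after gains from the four values in the diff.

```python
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, and split on whitespace.

    Assumption: simplified SQuAD-style normalization. The English-specific
    article removal of the original SQuAD script is omitted for Kazakh.
    """
    # \w is Unicode-aware in Python 3, so Cyrillic letters are preserved.
    return re.sub(r"[^\w\s]", " ", text.lower()).split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    delta = after - before
    return round(delta, 3), round(100 * delta / before, 1)


# Values copied from the README diff (before, after), in percent.
reported = {
    "F1": (31.405, 56.819),
    "Exact Match": (14.675, 35.454),
}

for name, (before, after) in reported.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative}% relative)")
```

Under these assumptions, the reported numbers correspond to a gain of 25.414 F1 points and 20.779 Exact Match points after training. Per-example scores from `exact_match` and `token_f1` would be averaged over the evaluation set and multiplied by 100 to get figures on the same scale as the README.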