Spaces: codealchemist01/turkish-sentiment-analysis-finetuned


codealchemist01 committed 7 days ago

Commit 458db64 (verified) · 1 parent: ac12c6d

Upload README.md with huggingface_hub

Files changed (1): README.md (+149 -12)

@@ -1,12 +1,149 @@
----
-title: Turkish Sentiment Analysis Finetuned
-emoji: 😻
-colorFrom: red
-colorTo: yellow
-sdk: gradio
-sdk_version: 5.49.1
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+---
+title: Turkish Sentiment Analysis (Fine-tuned)
+emoji: 🚀
+colorFrom: purple
+colorTo: blue
+sdk: gradio
+sdk_version: 4.44.0
+app_file: app.py
+pinned: false
+license: apache-2.0
+base_model: codealchemist01/turkish-sentiment-analysis
+---
+# Turkish Sentiment Analysis (Fine-tuned) 🇹🇷
+Fine-tuned Turkish sentiment analysis model with improved neutral class detection. This model is based on [codealchemist01/turkish-sentiment-analysis](https://huggingface.co/codealchemist01/turkish-sentiment-analysis) and fine-tuned on a balanced dataset.
+## Model Information
+- **Model:** [codealchemist01/turkish-sentiment-analysis-finetuned](https://huggingface.co/codealchemist01/turkish-sentiment-analysis-finetuned)
+- **Base Model:** [codealchemist01/turkish-sentiment-analysis](https://huggingface.co/codealchemist01/turkish-sentiment-analysis)
+- **Task:** Text Classification (Sentiment Analysis)
+- **Language:** Turkish
+- **Labels:** positive, negative, neutral
+- **Fine-tuning Type:** Continued fine-tuning on balanced dataset
+## 🎯 Key Features
+### Improvements:
+- ✅ **Neutral class detection:** accuracy up by 80 percentage points on the 15-example test (0% → 80%)
+- ✅ **More balanced dataset:** 556,888 examples (37.6% neutral)
+- ✅ **Real-world performance:** better generalization
+- ✅ **Ambiguous expressions:** more accurate predictions
+### Performance:
+- **Accuracy:** 91.96%
+- **Neutral F1:** 90.57% ⬆️
+- **Positive F1:** 94.61%
+- **Negative F1:** 88.68%
+## 📊 Training Data
+### Fine-tuning Dataset:
+- **Total:** 556,888 examples
+- **Positive:** 237,966 (42.7%)
+- **Neutral:** 209,668 (37.6%) ⬆️
+- **Negative:** 109,254 (19.6%) ⬆️
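The percentages above follow directly from the class counts. A minimal sketch that recomputes them; the variable names are illustrative and not taken from the training code:

```python
# Class counts as reported for the fine-tuning dataset.
class_counts = {"positive": 237_966, "neutral": 209_668, "negative": 109_254}

total = sum(class_counts.values())

# Share of each class as a percentage of the whole dataset.
class_share = {label: 100 * n / total for label, n in class_counts.items()}

print(f"Total: {total:,}")
for label, share in class_share.items():
    print(f"{label}: {share:.1f}%")
```

The counts sum to the reported total of 556,888, so the three shares account for the whole dataset up to rounding.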
+### Datasets Used:
+1. **Original datasets:**
+   - `winvoker/turkish-sentiment-analysis-dataset`
+   - `WhiteAngelss/Turkce-Duygu-Analizi-Dataset`
+2. **Additional datasets:**
+   - `maydogan/Turkish_SentimentAnalysis_TRSAv1` (150,000 samples)
+   - `turkish-nlp-suite/MusteriYorumlari` (73,920 samples)
+   - `W4nkel/turkish-sentiment-dataset` (4,800 samples)
+   - `mustfkeskin/turkish-movie-sentiment-analysis-dataset` (Kaggle, 83,227 samples)
+## 🚀 Usage
+### With Python:
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+# Load the fine-tuned model and its tokenizer from the Hub
+model_name = "codealchemist01/turkish-sentiment-analysis-finetuned"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+model.eval()  # inference mode: disables dropout
+# Example text ("The product is normal, as I expected. Nothing special.")
+text = "Ürün normal, beklediğim gibi. Özel bir şey yok."
+# Tokenize, truncating to the 128-token length used during fine-tuning
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+# Predict class probabilities without tracking gradients
+with torch.no_grad():
+    outputs = model(**inputs)
+    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+    predicted_label_id = predictions.argmax(dim=-1).item()
+# Map the class index to its label
+id2label = {0: "negative", 1: "neutral", 2: "positive"}
+predicted_label = id2label[predicted_label_id]
+confidence = predictions[0][predicted_label_id].item()
+print(f"Label: {predicted_label}")
+print(f"Confidence: {confidence:.4f}")
+```
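The post-processing step in the snippet above (softmax, argmax, label lookup) can be illustrated without downloading the model. This is a minimal sketch in plain Python: the logits are made-up values, `softmax` and `decode` are hypothetical helpers, and the label order is the one the snippet above assumes.

```python
import math

# Label order assumed by the usage snippet; not read from the model config.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Turn raw scores into probabilities (max-shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for a single sentence, in (negative, neutral, positive) order.
example_logits = [0.2, 2.1, -0.5]
label, confidence = decode(example_logits)
print(f"Label: {label}")
print(f"Confidence: {confidence:.4f}")
```

Here the largest logit sits at index 1, so the helper returns the neutral label. The reported confidence is just the softmax probability of the winning class, not a calibrated certainty.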
+### Gradio Space:
+You can test the model interactively in this Space!
+## 📈 Improvement Results
+### Test Results (15-example test):
+- **Overall Accuracy:** 66.7% → 86.7% (+20.0 percentage points)
+- **Neutral:** 0% → 80% (+80.0 percentage points) 🚀
+- **Negative:** 100% → 80%
+- **Positive:** 100% → 100%
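The overall figures are consistent with a test of 5 examples per class. That split is an assumption, since the README does not state it. A short sketch of the arithmetic under that assumption:

```python
# Assumption: the 15 examples are split evenly, 5 per class (not stated in the README).
PER_CLASS = 5

# Correct predictions per class implied by the reported per-class accuracies.
correct_before = {"neutral": 0, "negative": 5, "positive": 5}  # 0%, 100%, 100%
correct_after = {"neutral": 4, "negative": 4, "positive": 5}   # 80%, 80%, 100%


def overall_accuracy(correct, per_class=PER_CLASS):
    """Overall accuracy in percent across equally sized classes."""
    return 100 * sum(correct.values()) / (per_class * len(correct))


before = overall_accuracy(correct_before)
after = overall_accuracy(correct_after)
print(f"Before: {before:.1f}%  After: {after:.1f}%  Gain: {after - before:.1f} points")
```

With only 15 examples, each prediction moves overall accuracy by about 6.7 points, so these figures are best read as a spot check rather than a benchmark.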
+### Test Set Performance (55,689 examples):
+- **Accuracy:** 91.96%
+- **Weighted F1:** 91.93%
+- **Neutral F1:** 90.57%
+- **Positive F1:** 94.61%
+- **Negative F1:** 88.68%
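Weighted F1 is the support-weighted mean of the per-class F1 scores. The README does not give per-class test-set counts, so the sketch below assumes the test set has the same class proportions as the full fine-tuning dataset. Under that assumption, the per-class scores roughly reproduce the reported weighted value.

```python
# Per-class F1 scores (percent) as reported for the test set.
f1_scores = {"positive": 94.61, "neutral": 90.57, "negative": 88.68}

# Assumption: test-set class proportions match the full dataset's counts.
assumed_support = {"positive": 237_966, "neutral": 209_668, "negative": 109_254}

total_support = sum(assumed_support.values())

# Support-weighted average of the per-class F1 scores.
weighted_f1 = sum(
    f1_scores[label] * assumed_support[label] / total_support
    for label in f1_scores
)

print(f"Weighted F1 (assumed supports): {weighted_f1:.2f}%")
```

The result lands within rounding distance of the reported 91.93%, which suggests the test split was roughly stratified. The exact supports remain unknown.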
+## 🔧 Fine-tuning Details
+- **Base Model:** codealchemist01/turkish-sentiment-analysis
+- **Epochs:** 2
+- **Learning Rate:** 1e-5 (tuned for fine-tuning)
+- **Batch Size:** 12
+- **Max Length:** 128 tokens
+- **Optimizer:** AdamW
+## 💡 Usage Notes
+- ✅ Detects neutral expressions more reliably
+- ✅ Correctly classifies expressions such as "normal", "standart" (standard), and "orta seviye" (mid-level)
+- ✅ More balanced performance across classes
+- ✅ Better generalization on real-world text
+## ⚠️ Limitations
+- Performance may drop on very short texts (< 3 words)
+- Performance may vary across domains (social media, news, reviews)
+- Some ambiguous expressions may still be misclassified
+## 📝 Citation
+```bibtex
+@misc{turkish-sentiment-analysis-finetuned,
+  title={Turkish Sentiment Analysis Model (Fine-tuned)},
+  author={codealchemist01},
+  year={2024},
+  base_model={codealchemist01/turkish-sentiment-analysis},
+  howpublished={\url{https://huggingface.co/codealchemist01/turkish-sentiment-analysis-finetuned}}
+}
+```
+## 📄 License
+Apache 2.0