Update README.md
README.md
CHANGED
@@ -3,4 +3,57 @@ license: apache-2.0
language:
- en
base_model: urchade/gliner_small-v2
datasets:
- gretelai/synthetic_pii_finance_multilingual
---

# GLiNER-Finance-PII-Detection

## Training and evaluation data

The model was fine-tuned from `urchade/gliner_small-v2` on `gretelai/synthetic_pii_finance_multilingual` for 0.5 epochs.

## Training procedure notebook

https://github.com/mit1280/fined-tuning/blob/main/Fine_Tune_GLiNER_Token_Classification.ipynb

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-5

### Inference Code

```python
# Install the library first, e.g. `pip install -q gliner` (prefix with `!` in a notebook).
from gliner import GLiNER

# Load the fine-tuned checkpoint from the Hugging Face Hub
fine_tuned_model = GLiNER.from_pretrained("Mit1208/gliner-fine-tuned-pii-finance-multilingual")

# Sample document to scan for PII
text = "Loan Application\n\nFull Legal Name: Luigi Clelia Togliatti\nDate of Birth: 11/27/1967\n\nMailing Address:\n4893 Justin Terrace\n[City, State, Zip Code]\n\nPhone Number: [(123) 456-7890]\nEmail Address: [[email protected]]\n\nEducational Institution: University of Toronto\nExpected Graduation Date: [Graduation Year]\n\nProgram of Study: Bachelor of Science in Computer Science\n\nFuture Career Plans: After graduation, I plan to pursue a career as a software engineer at a tech company. I am particularly interested in the field of artificial intelligence and machine learning.\n\nLoan Amount Requested: $20,000\n\nPersonal Financial Information:\n\n* Monthly Income: $2,500\n* Monthly Expenses: $1,500\n* Total Assets: $10,000\n* Total Debts: $5,000\n\nI confirm that all the information provided is true and accurate to the best of my knowledge.\n\nSignature: Luigi Clelia Togliatti\nDate: [Today's Date]"

# Labels for entity prediction
labels = ["street_address", "company", "date_of_birth", "email", "date", "name"]

# Perform entity prediction, keeping only spans with confidence above the threshold
entities = fine_tuned_model.predict_entities(text, labels, threshold=0.85)

# Display predicted entities and their labels
for entity in entities:
    print("(", entity["text"], "=>", entity["label"], ") (start & end ==>", entity["start"], "&", entity["end"], ")")

# Output
'''
( Luigi Clelia Togliatti => name ) (start & end ==> 35 & 57 )
( 11/27/1967 => date_of_birth ) (start & end ==> 73 & 83 )
( 4893 Justin Terrace => street_address ) (start & end ==> 102 & 121 )
( [email protected] => email ) (start & end ==> 194 & 219 )
( Luigi Clelia Togliatti => name ) (start & end ==> 842 & 864 )
'''
```
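
The `start` and `end` offsets in the output above can be used to mask the detected PII in the original text. The following is a minimal sketch, not part of the model or the `gliner` library: `redact` is a hypothetical helper, and the sample entities are hand-written in the same shape as the printed output (keys `text`, `label`, `start`, `end`) rather than produced by the model. It assumes the spans do not overlap.

```python
def redact(text, entities):
    """Replace each predicted span with a [LABEL] placeholder (hypothetical helper)."""
    redacted = text
    # Work from the last span to the first so earlier offsets stay valid
    # after each replacement changes the string length.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{entity['label'].upper()}]"
        redacted = redacted[:entity["start"]] + placeholder + redacted[entity["end"]:]
    return redacted


# Hand-written entities for illustration only (not model output).
sample_text = "Full Legal Name: Luigi Clelia Togliatti\nDate of Birth: 11/27/1967"
sample_entities = [
    {"text": "Luigi Clelia Togliatti", "label": "name", "start": 17, "end": 39},
    {"text": "11/27/1967", "label": "date_of_birth", "start": 55, "end": 65},
]

print(redact(sample_text, sample_entities))
```

With real predictions, the list returned by `predict_entities` could be passed in place of `sample_entities`, provided its entries carry the same keys.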