---
datasets:
- mvasiliniuc/iva-kotlin-codeint-clean-train
- mvasiliniuc/iva-kotlin-codeint-clean-valid
language:
- code
tags:
- gpt2
- code
- kotlin
- mobile
- generation
widget:
- text: "/**\n\t* A function that returns the version of the current operating system.\n*/\n"
  example_title: "Get current device operating system"
- text: "/**\n\t* A function that returns the current TimeZone.\n*/\n"
  example_title: "Get current timezone"
- text: "/**\n\t* A data class representing a Bank Account.\n*/\n"
  example_title: "Data Class - BankAccount"
---

iva-codeint-kotlin-small is a GPT-2 model (small version, 239.4M parameters) trained from scratch on the text-to-code task, tailored to the Kotlin language as used in native mobile (Android) development.

## Usage

```Python
from transformers import pipeline

# Load the text-generation pipeline with the model from the Hugging Face Hub.
pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-kotlin-small")

# Prompt with the start of a Kotlin declaration (or a KDoc comment, as in the widget examples).
outputs = pipe("fun printToConsole()")
```
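
The pipeline returns a list of dictionaries, one per generated sequence; each holds the prompt plus its completion under the `generated_text` key:

```Python
# Print the first completion (prompt + generated Kotlin code).
print(outputs[0]["generated_text"])
```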

### Inference
```Python
import pprint
import requests

API_URL = "https://api-inference.huggingface.co/models/mvasiliniuc/iva-codeint-kotlin-small"
headers = {"Authorization": "Bearer <key>"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": """
/**
* A public function that returns the current version of the operating system.
*/
"""
})
pprint.pprint(output, compact=True)
```

## Training

| Config | Value |
|------------------------------|---------|
| seq length | 1024 |
| weight decay | 0.1 |
| learning rate | 0.0005 |
| max eval steps | -1 |
| shuffle buffer | 10000 |
| max train steps | 150000 |
| mixed precision | fp16 |
| num warmup steps | 2000 |
| train batch size | 5 |
| valid batch size | 5 |
| lr scheduler type | cosine |
| save checkpoint steps | 15000 |
| gradient checkpointing | false |
| gradient accumulation steps | 1 |
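
The configuration follows the CodeParrot-style recipe referenced under Resources. Below is a minimal sketch of how these hyperparameters could be wired together with `transformers` and `datasets`; it is illustrative only, and the tokenizer, dataset split name, seed, and architecture sizing are assumptions rather than the exact training script:

```Python
from datasets import load_dataset
from torch.optim import AdamW
from transformers import AutoConfig, AutoTokenizer, GPT2LMHeadModel, get_scheduler

# Fresh GPT-2 configuration, trained from scratch (sizing here is illustrative).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
config = AutoConfig.from_pretrained("gpt2", vocab_size=len(tokenizer))
model = GPT2LMHeadModel(config)

# Stream the training data and shuffle with the buffer size from the table.
train_data = load_dataset(
    "mvasiliniuc/iva-kotlin-codeint-clean-train", split="train", streaming=True
).shuffle(buffer_size=10_000, seed=0)

# Optimizer and cosine schedule with warmup, matching the table above.
optimizer = AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)
lr_scheduler = get_scheduler(
    "cosine",
    optimizer=optimizer,
    num_warmup_steps=2_000,
    num_training_steps=150_000,
)
```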

## Resources

Resources used for research:
* [Training a causal language model from scratch](https://huggingface.co/learn/nlp-course/chapter7/6)
* [CodeParrot, a GPT-2 model (1.5B parameters) trained to generate Python code](https://huggingface.co/codeparrot/codeparrot)