Tanaybh/gpt2-rlhf-anthropic
Text Generation
Transformers
Safetensors
Anthropic/hh-rlhf
gpt2
rlhf
reinforcement-learning-from-human-feedback
anthropic-hh-rlhf
chatgpt-style-training
ppo
supervised-fine-tuning
human-preferences
ai-alignment
text-generation-inference
License: mit
Branch: main
Repository size: 499 MB
1 contributor
History: 6 commits
Latest commit: Tanaybh, "Upload RLHF-trained GPT-2 model", 822364b (verified), 25 days ago
File                     Size       Badge  Last commit                      Updated
.gitattributes           1.52 kB    Safe   initial commit                   about 1 month ago
README.md                4.63 kB    -      Update README.md                 29 days ago
config.json              874 Bytes  Safe   Upload RLHF-trained GPT-2 model  about 1 month ago
generation_config.json   119 Bytes  Safe   Upload RLHF-trained GPT-2 model  about 1 month ago
merges.txt               456 kB     Safe   Upload RLHF-trained GPT-2 model  about 1 month ago
model.safetensors        498 MB     xet    Upload RLHF-trained GPT-2 model  25 days ago
special_tokens_map.json  470 Bytes  Safe   Upload RLHF-trained GPT-2 model  about 1 month ago
tokenizer_config.json    556 Bytes  Safe   Upload RLHF-trained GPT-2 model  about 1 month ago
training_metadata.json   532 Bytes  -      Add training metadata            about 1 month ago
vocab.json               999 kB     Safe   Upload RLHF-trained GPT-2 model  about 1 month ago