DenseRewardRLHF-PPO
Collection
This repository contains the released models for our paper "Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model".
18 items