SII-Yibin Wang
CodeGoat24
AI & ML interests
I'm part of the Shanghai Innovation Institute, focusing on multimodal RL and generation.
Recent Activity
- upvoted a paper about 6 hours ago: UniREditBench: A Unified Reasoning-based Image Editing Benchmark
- commented on a paper about 6 hours ago: UniREditBench: A Unified Reasoning-based Image Editing Benchmark
- liked a dataset about 22 hours ago: maplebb/UniREdit-Data-100K
Organizations
UnifiedReward 1.0 Qwen Models
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
  Paper • 2505.03318 • Published • 93
- CodeGoat24/UnifiedReward-Think-qwen-7b
  8B • Updated • 562 • 3
- CodeGoat24/UnifiedReward-qwen-32b
  33B • Updated • 18 • 1
UnifiedReward 1.0 LLaVA Model
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
  Paper • 2505.03318 • Published • 93
- CodeGoat24/UnifiedReward-Think-7b
  8B • Updated • 16 • 10
- CodeGoat24/UnifiedReward-7b-v1.5
  8B • Updated • 3.18k • 6
UnifiedReward 2.0 Models
UnifiedReward 1.0 Qwen Models GGUF
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
  Paper • 2505.03318 • Published • 93
- mradermacher/UnifiedReward-qwen-32b-i1-GGUF
  33B • Updated • 288 • 1
- mradermacher/UnifiedReward-Think-qwen-7b-i1-GGUF
  8B • Updated • 247
UnifiedReward Training Data
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
  Paper • 2505.03318 • Published • 93
- CodeGoat24/UnifiedReward-2.0-T2X-score-data
  Viewer • Updated • 337k • 657
- CodeGoat24/ImageGen-CoT-Reward-5K
  Viewer • Updated • 5.54k • 74 • 1
Pref-GRPO & UniGenBench
- UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
  Paper • 2510.18701 • Published • 66
- Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
  Paper • 2508.20751 • Published • 89
- CodeGoat24/UniGenBench-Eval-Images
  Viewer • Updated • 762k • 424 • 4
- CodeGoat24/UniGenBench-EvalModel-qwen-72b-v1
  Image-to-Text • 73B • Updated • 89 • 3