improve readme :) (#1)
- improve readme :) (48c03470ed4e962d2db455a72b0b94b9577df65c)
- Create assets/ (4add9a73fb8e9f906b48a96468c964d89a4aa39d)
- Delete assets (cefdfafa153c08ac71bfd1d9f2ce2b2b07d86428)
- Update README.md (e71fb976b3d146dd309876ef97f8b065f083ef02)
- Upload show.jpg (637097afbf1ea1a25420c60b469b0a1163cfcb17)
- Delete show.jpg (8ff7c42dda4499150f6fdb03977c2e964963f810)
Co-authored-by: Linoy Tsaban <[email protected]>
- .gitattributes +1 -0
- README.md +16 -3
.gitattributes
CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+show.jpg filter=lfs diff=lfs merge=lfs -text
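(For context: `show.jpg filter=lfs diff=lfs merge=lfs -text` is the attribute line that `git lfs track "show.jpg"` appends, so the uploaded image is stored via Git LFS rather than in the regular object store.)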
README.md
CHANGED

@@ -1,9 +1,22 @@
 ---
+license: cc-by-nc-4.0
 base_model:
+- black-forest-labs/FLUX.1-Fill-dev
 - bytedance-research/OneReward
+language:
+- en
+pipeline_tag: image-to-image
 ---
-
+# OneReward - ComfyUI
 
-
+**ComfyUI community** checkpoint for **[OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning](https://arxiv.org/abs/2508.21066)**.
 
-
+[arXiv](https://arxiv.org/abs/2508.21066) [GitHub](https://github.com/bytedance/OneReward) [Project page](https://one-reward.github.io/)
+<br>
+
+This repo contains the checkpoint from [OneReward](https://huggingface.co/bytedance-research/OneReward) processed into a single model suitable for ComfyUI use.
+
+**OneReward** is a novel RLHF methodology for the visual domain that employs Qwen2.5-VL as a generative reward model to enhance multi-task reinforcement learning, significantly improving the policy model’s generation ability across multiple subtasks. Building on OneReward, **FLUX.1-Fill-dev-OneReward** (based on FLUX Fill [dev]) outperforms the closed-source FLUX Fill [Pro] on inpainting and outpainting tasks, serving as a powerful new baseline for future research in unified image editing.
+
+
+For more details and examples, see the original model repo: [**OneReward**](https://huggingface.co/bytedance-research/OneReward)