Update README.md
README.md
For the codebase, refer to: https://github.com/bangx7/code_aesthetics

## 🎉 News

- __[2025.10.29]__: We release the [AesCoder-4B](https://huggingface.co/SamuelBang/AesCoder-4B/) model (quick-start sketch below).
- __[2025.10.27]__: We release the [Project Page](https://bangx7.github.io/code-aesthetics/) and the [arXiv](https://arxiv.org/abs/2510.23272) version of the paper.
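
As a quick start, below is a minimal sketch of loading the released AesCoder-4B checkpoint with 🤗 Transformers. The prompt, chat-template usage, and generation settings are illustrative assumptions, not settings documented in this card; see the codebase linked above for the recommended setup.

```python
# Minimal quick-start sketch for AesCoder-4B (assumed: standard causal-LM loading
# via transformers; prompt and generation settings are illustrative only).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SamuelBang/AesCoder-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Write a single-file HTML/CSS landing page with a clean, modern pricing section."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Strip the prompt tokens and print only the newly generated code.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```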

## 📷 Abstract

Large Language Models (LLMs) have become valuable assistants for developers in code-related tasks. While LLMs excel at traditional programming tasks such as code generation and bug fixing, they struggle with visually-oriented coding tasks, often producing suboptimal aesthetics. In this paper, we introduce a new pipeline to enhance the aesthetic quality of LLM-generated code. We first construct AesCode-358K, a large-scale instruction-tuning dataset focused on code aesthetics. Next, we propose agentic reward feedback, a multi-agent system that evaluates executability, static aesthetics, and interactive aesthetics. Building on this, we develop GRPO-AR, which integrates these signals into the GRPO algorithm for joint optimization of functionality and code aesthetics. Finally, we develop OpenDesign, a benchmark for assessing code aesthetics. Experimental results show that combining supervised fine-tuning on AesCode-358K with reinforcement learning using agentic reward feedback significantly improves performance on OpenDesign and also enhances results on existing benchmarks such as PandasPlotBench. Notably, our AesCoder-4B surpasses GPT-4o and GPT-4.1, and achieves performance comparable to large open-source models with 480B-685B parameters, underscoring the effectiveness of our approach.
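
To make the reward design above concrete, the toy sketch below shows one plausible way the three agentic feedback signals could be folded into a single scalar reward for a GRPO-style update. The gating, weights, and linear combination are purely illustrative assumptions, not the paper's GRPO-AR implementation.

```python
# Illustrative only: one plausible way to collapse the three agentic feedback
# signals (executability, static aesthetics, interactive aesthetics) into a
# single scalar reward per rollout. Weights, gating, and the linear form are
# assumptions, not the GRPO-AR implementation from the paper.
from dataclasses import dataclass


@dataclass
class AgenticFeedback:
    executable: float              # 1.0 if the generated code runs, else 0.0
    static_aesthetics: float       # judge score in [0, 1] for the rendered page
    interactive_aesthetics: float  # judge score in [0, 1] after interaction tests


def combined_reward(fb: AgenticFeedback,
                    w_static: float = 0.5,
                    w_inter: float = 0.5) -> float:
    """Gate aesthetic credit on executability, then mix the two aesthetic scores."""
    if fb.executable == 0.0:  # non-running code earns no aesthetic reward
        return 0.0
    return w_static * fb.static_aesthetics + w_inter * fb.interactive_aesthetics


# Example: a rollout that runs, looks good statically, and is decent interactively.
print(combined_reward(AgenticFeedback(1.0, 0.9, 0.7)))  # -> 0.8
```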