On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub
Abstract
Agent-assisted pull requests generated by Claude Code are largely accepted in open-source projects, with most requiring minimal human modification.
Large language models (LLMs) are increasingly being integrated into software development processes. The ability to generate code and submit pull requests with minimal human intervention, through the use of autonomous AI agents, is poised to become a standard practice. However, little is known about the practical usefulness of these pull requests and the extent to which their contributions are accepted in real-world projects. In this paper, we empirically study 567 GitHub pull requests (PRs) generated using Claude Code, an agentic coding tool, across 157 diverse open-source projects. Our analysis reveals that developers tend to rely on agents for tasks such as refactoring, documentation, and testing. The results indicate that 83.8% of these agent-assisted PRs are eventually accepted and merged by project maintainers, with 54.9% of the merged PRs integrated without further modification. The remaining 45.1% require additional changes and benefit from human revisions, especially for bug fixes, documentation, and adherence to project-specific standards. These findings suggest that while agent-assisted PRs are largely acceptable, they still benefit from human oversight and refinement.
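The abstract does not spell out the mining procedure, but its headline measurement, the share of agent-assisted PRs that end up merged, can be illustrated with a small script. The sketch below is an assumption-laden illustration, not the authors' pipeline: it assumes agent PRs can be located by searching PR bodies for the "Generated with Claude Code" attribution that the tool typically appends, that a GITHUB_TOKEN environment variable holds an API token, and that GitHub's issue-search results expose merged_at under each item's pull_request field; helper names such as fetch_agent_prs and merge_rate are invented for this example.

```python
# Minimal sketch (assumptions noted above), not the paper's actual pipeline:
# estimate the merge rate of closed PRs that carry a Claude Code marker.
import os

import requests

SEARCH_URL = "https://api.github.com/search/issues"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # assumed token
    "Accept": "application/vnd.github+json",
}


def fetch_agent_prs(max_pages: int = 5) -> list[dict]:
    """Search closed PRs whose body mentions the Claude Code attribution line."""
    query = '"Generated with Claude Code" in:body is:pr is:closed'
    items: list[dict] = []
    for page in range(1, max_pages + 1):
        resp = requests.get(
            SEARCH_URL,
            headers=HEADERS,
            params={"q": query, "per_page": 100, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("items", [])
        if not batch:
            break
        items.extend(batch)
    return items


def merge_rate(prs: list[dict]) -> float:
    """Fraction of closed PRs that were merged (merged_at is set)."""
    if not prs:
        return 0.0
    merged = sum(
        1 for pr in prs
        if (pr.get("pull_request") or {}).get("merged_at") is not None
    )
    return merged / len(prs)


if __name__ == "__main__":
    prs = fetch_agent_prs()
    print(f"{len(prs)} closed agent-assisted PRs, merge rate {merge_rate(prs):.1%}")
```

Note that GitHub's search index caps results at roughly 1,000 items per query, so a study at the paper's scale would need to slice queries (e.g., by date range) and confirm each hit through the pull requests endpoint.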
Community
🚀 Agentic PRs are shipping: 83.8% merge rate 🚀
Not a demo, not a toy. We study Claude Code PRs on GitHub: agentic PRs are merged 83.8% of the time vs. 91.0% for human-authored PRs, with similar median merge speeds (1.23 hrs vs. 1.04 hrs).
• The story on the ground: agents accelerate setup and routine improvements; humans carry context, enforce quality, and keep scope tight.
• What agents do more: refactoring, tests, and docs.
• Why rejections happen: alternative solutions, oversized PRs, or obsolescence, not simply "bad AI code".
• What reviewers still fix: bugs (45.1%), docs (27.4%), refactoring (25.7%), style (22.1%) before merge.
• Where they still stumble: legacy-heavy codebases or cross-cutting PRs.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API.
- The Impact of Large Language Models (LLMs) on Code Review Process (2025)
- Does AI Code Review Lead to Code Changes? A Case Study of GitHub Actions (2025)
- What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects (2025)
- On the Use of Agentic Coding Manifests: An Empirical Study of Claude Code (2025)
- AutoCodeSherpa: Symbolic Explanations in AI Coding Agents (2025)
- Benchmarking and Studying the LLM-based Code Review (2025)
- An Empirical Study on the Amount of Changes Required for Merge Request Acceptance (2025)