TL;DR: Tell your AI to split work into small reviewable pull requests before it writes any code.

Common Mistake ❌

You ask your AI agent to build a feature.

The agent opens a 2,000-line pull request that touches twelve files across the backend, the frontend, and the tests.

You stare at the diff and you have no idea where to start reviewing.

Reviewer attention is the scarcest resource on your team, and it has become the bottleneck.

You skim, you trust the tests, you click approve.

That pull request ships with defects you never saw.

Problems Addressed 😔

Reviewers put off huge AI-generated PRs or skim them.
AI co-authored code already contains roughly 1.7 times more issues per change than human-only code, so large diffs hide more defects.
Merge conflicts multiply while the PR sits open.
You lose the ability to revert one logical change without ripping out the rest.
A failing CI run on a giant change blocks every other branch behind it.
You sacrifice the second-pair-of-eyes guarantee that code review provides.
Human reviewers are the scarcest resource on your team.
Every oversized PR drains the attention they can never get back.

How to Do It 🛠️

Write a short spec before you prompt the agent.
Ask the AI to read the spec and propose a plan that splits the work into small reviewable pull requests.
Give the agent a concrete size cap, for example 100 lines per pull request.
Tell the agent each pull request must do one logical thing and stand on its own.
Review the plan first and reject any step that mixes refactoring with new behavior.
Let the agent open each pull request immediately so CI starts running early.
Reference related past pull requests as context for the agent.
Reject any PR that grows beyond the cap and ask the AI to split it again.
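
The size cap from the steps above can be checked mechanically before a human ever opens the diff. The sketch below is one possible approach, not a prescribed tool: it sums the added and deleted lines from the output of `git diff --numstat` and compares the total with the cap. The names `DiffSize`, `measure_numstat`, and `exceeds_cap` are hypothetical, and the sample file paths are made up for illustration.

```python
from dataclasses import dataclass

DEFAULT_CAP = 100  # lines per pull request, the example cap used in this tip


@dataclass
class DiffSize:
    files: int
    lines: int  # added + deleted lines across all files


def measure_numstat(numstat: str) -> DiffSize:
    """Sum the changed lines in `git diff --numstat` output.

    Every row holds three tab-separated fields: added, deleted, path.
    Binary files report "-" for both counts and add zero lines here.
    """
    files = 0
    lines = 0
    for row in numstat.splitlines():
        if not row.strip():
            continue
        added, deleted, _path = row.split("\t", 2)
        files += 1
        if added != "-":
            lines += int(added) + int(deleted)
    return DiffSize(files=files, lines=lines)


def exceeds_cap(numstat: str, cap: int = DEFAULT_CAP) -> bool:
    """True when the diff is larger than the cap and should be split again."""
    return measure_numstat(numstat).lines > cap


# Hypothetical numstat output for a branch that grew too large.
sample = "10\t5\tapp/auth.py\n-\t-\tlogo.png\n90\t0\ttests/test_auth.py\n"
print(measure_numstat(sample), exceeds_cap(sample))
```

In practice you could feed it the text produced by `git diff --numstat main...HEAD` and use the result to decide whether to send the PR back to the agent. Treat the number as a prompt for a conversation, since a mechanical change can exceed the cap and still be easy to review.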

Benefits 🎯

Reviewable code: A human can finish reading the diff without losing focus.
Faster feedback: Smaller PRs get reviewed in hours, not days.
Easier rollbacks: You revert one small commit instead of untangling a megachange.
Fewer merge conflicts: Short-lived branches rarely collide with main.
Earlier CI signals: Each PR triggers its own pipeline as soon as it opens.
Higher review quality: Reviewers stay engaged on small diffs and catch real defects.
Cheaper context: The agent loads a focused task instead of the whole codebase.

Context 🧠

Michael Bolin, the Tech Lead for the Codex CLI repo at OpenAI, recently described his workflow for building a permissions system.

The first thing he asked Codex was to create a plan and break the work into right-sized pull requests.

The initial output was about six PRs.

He explicitly reminded the agent that a human still has to review the code.

A 2025 CodeRabbit study analyzed 470 open-source pull requests and found that AI co-authored PRs contain roughly 1.7 times more issues per change than human-only PRs.

Critical issues rose about 40 percent.

Major issues rose about 70 percent.

Big AI PRs hide more defects per change than big human PRs do.

Salesforce reported that AI-assisted coding pushed their average pull request past 1,000 lines and 20 files.

Review latency went up.

Worst of all, review time for the largest pull requests started to plateau, a clear signal that reviewers had stopped engaging.

Reviewer attention is finite and scarce.

Once it's depleted, no tool can restore it.

Small PRs solve the math.

A SmartBear study of 2,500 pull requests found smaller PRs ship with fewer defects.

Teams that keep PRs near 50 lines ship roughly 40 percent more code than teams that routinely exceed 200.

Small PRs are a form of functional slicing.

Each slice cuts vertically through the feature so it compiles, tests, and deploys on its own.

A spec-driven approach naturally produces sliceable work: the spec defines the boundary, and the AI splits the implementation into shippable increments.

Incrementalism and baby steps keep main green at every commit.

The AI doesn't practice incremental delivery by default.

Prompt Reference 📝

Bad Prompt 🚫

Build the complete user authentication feature: login,
registration, password reset, email verification, OAuth
with Google and GitHub, session management, rate limiting,
and all the tests. Make it production-ready.

Good Prompt 👉

Here is the spec for the user authentication feature:
[spec content]

Before writing any code, read the spec and propose a
plan that splits the work into pull requests of at most
100 lines each.

Each pull request must do one logical thing and pass CI
on its own.

Keep refactoring and new behavior in separate PRs.

Show me the plan only. Don't write any code until I
approve the plan.

Considerations ⚠️

A 100-line cap is a guideline, not a law.

A mechanical rename across 50 files can exceed the cap and still be trivial to review.

Never mix a refactor with a functional change in the same pull request.

A reviewer can't tell if a behavior change is intentional or a side effect of the refactor.

Keep structural changes and behavior changes in separate PRs, even when the AI wants to bundle them, to avoid divergent change.
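
If you ask the agent to return its plan in a structured form, you can screen it for mixed steps before you read it in detail. The following is a minimal sketch under that assumption: `PlannedPR`, its fields, the `"refactor"` and `"behavior"` labels, and `review_plan` are all hypothetical names, and the line counts are the agent's own estimates rather than measurements.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedPR:
    title: str
    estimated_lines: int
    # Hypothetical labels the agent attaches to each step.
    kinds: set[str] = field(default_factory=set)


def review_plan(plan: list[PlannedPR], cap: int = 100) -> list[str]:
    """Return one readable objection per problem found in the plan."""
    objections = []
    for step in plan:
        # A step tagged as both structural and behavioral should be split.
        if {"refactor", "behavior"} <= step.kinds:
            objections.append(f"{step.title}: mixes refactoring with new behavior")
        if step.estimated_lines > cap:
            objections.append(
                f"{step.title}: {step.estimated_lines} lines exceeds the cap of {cap}"
            )
    return objections


plan = [
    PlannedPR("Extract PasswordHasher", 60, {"refactor"}),
    PlannedPR("Add login endpoint and rename User fields", 180, {"refactor", "behavior"}),
]
for objection in review_plan(plan):
    print(objection)
```

The objections are meant to be pasted back to the agent as the reason for rejecting a step. They do not replace your own reading of the plan.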

Some features have tight internal coupling and resist clean splits.

Be honest about that limit instead of forcing artificial splits that confuse reviewers.

Avoid long-lived feature branches.

A branch that lives for weeks drifts from main and accumulates merge conflicts.

When the feature isn't ready to ship but the code is ready to merge, use a feature toggle instead.

Each PR merges to main behind the toggle and the feature activates when all pieces are in place.
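
A feature toggle can be as small as one function that reads a setting. This is a sketch, assuming the toggle lives in an environment variable named `FEATURE_NEW_AUTH`; that variable name and the functions `is_enabled`, `login`, `legacy_login`, and `new_login` are hypothetical placeholders, and many teams use a dedicated flag service instead.

```python
import os


def is_enabled(flag: str, env=None) -> bool:
    """Read a toggle; anything other than an explicit "on" value counts as off."""
    env = os.environ if env is None else env
    value = env.get(f"FEATURE_{flag.upper()}", "")
    return value.strip().lower() in {"1", "true", "on"}


def legacy_login(username: str, password: str) -> str:
    # Existing behavior that stays live until the feature is complete.
    return f"legacy session for {username}"


def new_login(username: str, password: str) -> str:
    # New code path, merged to main in small PRs but dormant by default.
    return f"new session for {username}"


def login(username: str, password: str, env=None) -> str:
    """Route to the new path only when the toggle is switched on."""
    if is_enabled("new_auth", env):
        return new_login(username, password)
    return legacy_login(username, password)


print(login("ada", "secret", {}))
print(login("ada", "secret", {"FEATURE_NEW_AUTH": "true"}))
```

Defaulting to off keeps every intermediate PR safe to merge: the unfinished path exists in main but no user reaches it until you flip the setting.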

Stacked pull requests work well when one change depends on another.

Each layer should compile and pass tests on its own.
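
When a plan contains dependent changes, the review and merge order follows from the dependencies. The sketch below illustrates that idea with Python's standard `graphlib` module and is not how any particular stacking tool works internally. The PR names and the `merge_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def merge_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return PR names so that every PR comes after the PRs it builds on.

    Raises graphlib.CycleError when two PRs depend on each other, which
    usually means the plan needs to be split differently.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical stack: each key lists the PRs it depends on.
stack = {
    "pr-3-reset-password": {"pr-2-login"},
    "pr-2-login": {"pr-1-user-model"},
    "pr-1-user-model": set(),
}
print(merge_order(stack))
```

Reviewing in this order means each layer is read with its foundations already understood, and CI for a layer only needs the layers beneath it.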

If your team takes two days to review a small PR, the workflow collapses.

Fix the review SLA before you push smaller PRs on people.

Type 📝

[X] Semi-Automatic

Limitations ⚠️

Some agents resist splitting and try to ship everything in one PR even after you ask.

You may need to repeat the cap or paste it into your project skill file or AGENTS.md harness so the rule survives a fresh context.
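
One way to make the rule survive a fresh context is to write it into the instructions file once and keep that write idempotent. The sketch below assumes an `AGENTS.md` file at the project root. The wording of `RULE` is my paraphrase of the good prompt in this tip, and `ensure_rule` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# Paraphrase of the size-cap instruction; adapt the wording to your team.
RULE = (
    "Before writing code, propose a plan that splits the work into pull "
    "requests of at most 100 lines each. Each pull request does one logical "
    "thing and passes CI on its own."
)


def ensure_rule(path: Path, rule: str = RULE) -> bool:
    """Append the rule unless it is already present. Returns True if the file changed."""
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if rule in existing:
        return False
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + rule + "\n", encoding="utf-8")
    return True


# Demonstration in a throwaway directory instead of a real project.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "AGENTS.md"
    print(ensure_rule(target))  # first call writes the rule
    print(ensure_rule(target))  # second call leaves the file alone
```

Running the helper twice leaves a single copy of the rule, so it is safe to call from a project setup script.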

Tags 🏷️

Planning

Level 🔋

[X] Intermediate

Related Tips 🔗

https://maximilianocontieri.com/ai-coding-tip-003-force-read-only-planning

https://maximilianocontieri.com/ai-coding-tip-006-review-every-line-before-commit

https://maximilianocontieri.com/ai-coding-tip-013-use-progressive-disclosure

https://maximilianocontieri.com/ai-coding-tip-022-give-ai-a-harness-to-work-with

Conclusion 🏁

Your AI doesn't care how big the pull request is.

You do.

Human reviewers are the scarcest resource in your pipeline.

The AI can write code all day.

Reviewers can't review all day.

Set the size cap before the agent starts, and ask for the split plan first.

That single instruction protects the human who has to read the code. 🚀

More Information ℹ️

How OpenAI Codex Tech Lead Does AI-Assisted Engineering, by Gregor Ojstersek

State of AI vs Human Code Generation Report, CodeRabbit

Scaling Code Reviews: Adapting to a Surge in AI-Generated Code, Salesforce Engineering

Why small pull requests are better, Swarmia

Agent pull requests are everywhere, GitHub Blog

GitHub Targets Large Merge Problem with Stacked PRs, InfoQ

Also Known As 🎭

Right-Sized-AI-PRs
AI-PR-Budgeting
Reviewable-AI-Commits
Pre-Coding-PR-Plan

Tools 🧰

Codex CLI
GitHub Stacked PRs (gh-stack)
Graphite
CodeRabbit

Disclaimer 📢

The views expressed here are my own.

I am a human who writes as well as I can for other humans.

I use AI proofreading tools to improve some texts.

I welcome constructive criticism and dialogue.

I shape these insights through 30 years in the software industry, 25 years of teaching, and writing over 500 articles and a book.

This article is part of the AI Coding Tip series.

https://maximilianocontieri.com/ai-coding-tips

AI Coding Tip 023 - Shrink your AI's Pull Request
