Why 'Monorepo vs Polyrepo' Is the Wrong Question
The framing that misleads
When leadership asks “monorepo or polyrepo?” they are really asking three things that matter to the build pipeline: who owns which pieces of code, how quickly those pieces evolve, and how much compute budget the CI system can sustain. The binary choice hides trade‑offs that become visible only once you map ownership boundaries, version‑change velocity, and CI cost onto concrete metrics. Strip away the jargon and the binary question collapses into a handful of measurable ones.
Ownership granularity
Ownership is not a sociological notion; it is a contract enforced by code‑review gates, test suites, and deployment pipelines. If a single team writes 80 % of the codebase, a flat directory layout with a shared .gitignore and a single README.md is cheap and fast. The team can push a change, run its own test harness, and merge without negotiating with strangers.
When ownership is split across many squads—each responsible for a microservice, a library, or a domain‑specific module—the friction of a monorepo grows quadratically. Every PR now touches a shared CI configuration, a global lint setup, and possibly a shared Dockerfile. The cost of “just another change” becomes the sum of all downstream gate checks, even if the modification is isolated to a single service. In our experience at a fintech startup with eight teams, the average time‑to‑merge ballooned from 45 minutes to 2 hours after we forced all teams into a single repository without any change‑detection tooling.
Thus the first test is simple: count the number of distinct owners that would need to coordinate on every commit. If the number exceeds three, you must either invest heavily in tooling or accept the coordination tax.
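That counting exercise is easy to mechanize. The sketch below, with a hypothetical directory‑to‑team mapping (in practice you would derive it from your CODEOWNERS file), estimates how many teams a given change set drags into review:

```python
# Hypothetical mapping from directory prefix to owning team; in a real
# repo this would be parsed from CODEOWNERS.
OWNERS = {
    "services/payments/": "team-payments",
    "services/search/": "team-search",
    "libs/ui/": "team-frontend",
}

def owners_touched(changed_files):
    """Return the set of teams that must coordinate on this change set."""
    teams = set()
    for path in changed_files:
        for prefix, team in OWNERS.items():
            if path.startswith(prefix):
                teams.add(team)
    return teams

touched = owners_touched([
    "services/payments/api.py",
    "libs/ui/button.tsx",
])
print(sorted(touched))  # prints ['team-frontend', 'team-payments']
```

Run this over a month of merged PRs: if the median change set touches more than three teams, you have measured the coordination tax directly rather than guessed at it.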
Dependency velocity
Monorepos excel when a shared library is upgraded in lockstep. A single commit can bump lodash from 4.17.20 to 4.17.21, and every consumer automatically re‑links, re‑tests, and redeploys. This is a huge win when the library is internal and follows a rapid release cadence—think a feature‑flag framework that ships weekly.
Conversely, when downstream services have divergent upgrade windows, lockstep becomes a liability. Imagine a payment‑processing service that must certify every third‑party dependency against PCI‑DSS before a version change. Forcing it to rebuild on every unrelated UI change adds compliance risk and wasted compute. In a recent migration at a logistics firm, the internal @core/utils package moved from version 1.2 to 1.3 over a six‑month window. The monorepo forced the analytics team to run a full integration test suite on each UI PR, consuming 3 k CPU‑hours per week for no functional gain.
Measure velocity by counting version bumps per month for each shared package. If the standard deviation across packages exceeds 1.5, the monorepo’s uniformity is likely a net cost.
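A minimal sketch of that measurement, using hypothetical bump counts (in practice you would extract them from git tags or CHANGELOG entries):

```python
import statistics

# Monthly version bumps per shared package over six months (illustrative data).
bumps_per_month = {
    "@core/utils": [0, 1, 0, 0, 1, 0],          # slow, compliance-bound
    "@core/feature-flags": [4, 5, 3, 6, 4, 5],  # weekly release cadence
}

# Average bump rate per package, then the spread across packages.
monthly_rates = [statistics.mean(v) for v in bumps_per_month.values()]
spread = statistics.pstdev(monthly_rates)

print(f"std dev of bump rate across packages: {spread:.2f}")
if spread > 1.5:
    print("divergent cadences: lockstep upgrades are likely a net cost")
```

With the sample data the spread is about 2.08, well past the 1.5 threshold: the feature‑flag package ships weekly while the utilities package barely moves, so forcing both onto one upgrade cadence penalizes somebody.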
CI cost and tooling maturity
A naïve monorepo runs the entire test matrix on every commit. With 100 services, each with a 5‑minute test suite, a single push can cost 500 minutes of CI time. Cloud‑based CI providers charge per minute; at $0.015 per minute that is $7.50 per push, a cost that quickly exhausts the CI budget of high‑frequency teams.
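The arithmetic is worth writing down, because it compounds with push frequency. A back‑of‑the‑envelope model using the numbers above (the per‑minute rate and push volume are illustrative, not a provider quote):

```python
# Back-of-the-envelope CI cost for a naive "build everything" monorepo.
services = 100
minutes_per_suite = 5
cost_per_minute = 0.015   # USD; illustrative cloud-CI rate

minutes_per_push = services * minutes_per_suite      # 500 CI minutes
cost_per_push = minutes_per_push * cost_per_minute   # $7.50

pushes_per_day = 40       # hypothetical org-wide activity
monthly = cost_per_push * pushes_per_day * 22        # ~22 working days
print(f"${cost_per_push:.2f} per push, ${monthly:,.0f} per month")
```

At 40 pushes a day the naive strategy costs thousands of dollars a month before a single flaky retry is counted.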
Modern tooling—Bazel, Nx, TurboRepo—mitigates this by constructing a directed acyclic graph (DAG) of file‑level dependencies and executing only the affected nodes. In our own benchmark, a 120‑service monorepo using Nx’s affected command reduced average CI runtime from 45 minutes to 7 minutes, an 84 % saving. The trade‑off is upfront engineering: you must codify every import relationship, maintain a nx.json or BUILD.bazel file, and educate teams on cache invalidation semantics.
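The core of that “affected” computation is not magic; it is a reverse reachability walk over the dependency graph. A simplified sketch (the project graph and names are hypothetical; real tools also hash file contents for caching):

```python
from collections import deque

# node -> direct dependencies (hypothetical project graph)
deps = {
    "app-web": ["lib-ui", "lib-api-client"],
    "app-admin": ["lib-ui"],
    "lib-api-client": ["lib-core"],
    "lib-ui": ["lib-core"],
    "lib-core": [],
}

def affected(changed, deps):
    """Walk reverse edges from changed projects to find all rebuild targets."""
    rdeps = {n: [] for n in deps}   # dependency -> dependents
    for node, targets in deps.items():
        for t in targets:
            rdeps[t].append(node)
    seen, queue = set(changed), deque(changed)
    while queue:
        for dependent in rdeps[queue.popleft()]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(sorted(affected({"lib-api-client"}, deps)))
# prints ['app-web', 'lib-api-client']: app-admin, lib-ui, lib-core are skipped
```

A change to lib-api-client triggers only its one dependent, not the other four projects; that pruning is the entire economic argument for graph‑aware tooling.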
If your organization cannot allocate a full‑time build engineer to maintain such a graph, the hidden cost of broken caches, spurious cache hits, and flaky builds will outweigh the raw compute savings.
A practical heuristic
Take three dimensions—team count (T), service count (S), and average weekly CI minutes (C). The following matrix gives a starting point; adjust only after measuring the three variables in your own environment.
- T ≤ 1 && S ≤ 5: flat monorepo, no extra tooling. Simple git workflows, a shared Makefile, and a single CI pipeline suffice.
- 1 < T ≤ 3 && 5 < S ≤ 30: monorepo with change‑detection. Adopt Nx (for JavaScript/TypeScript) or Bazel (for polyglot). Configure affected scripts, enable remote caching, and enforce per‑team CODEOWNERS entries.
- T > 3 || S > 30 || C > 10 000 min/week: polyrepo. Give each team a dedicated repository, publish internal packages to an artifact registry (e.g., GitHub Packages, Nexus, or Artifactory), and use a version‑bump policy that tolerates semantic‑version drift.
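The matrix above is small enough to encode directly, which makes the thresholds easy to revisit once you have your own measurements. A sketch, treating the article's numbers as starting points rather than hard rules:

```python
def repo_shape(teams, services, weekly_ci_minutes):
    """Map (T, S, C) onto the heuristic matrix. Thresholds are starting
    points calibrated elsewhere; recalibrate with your own metrics."""
    if teams > 3 or services > 30 or weekly_ci_minutes > 10_000:
        return "polyrepo"
    if teams <= 1 and services <= 5:
        return "flat monorepo, no extra tooling"
    return "monorepo with change detection (Nx or Bazel)"

print(repo_shape(teams=2, services=12, weekly_ci_minutes=4_000))
# prints 'monorepo with change detection (Nx or Bazel)'
```

Note the ordering: the polyrepo conditions are checked first because any single one of them (too many teams, too many services, or an unsustainable CI bill) is disqualifying on its own.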
These thresholds are not hard rules; they are calibrated from dozens of migrations at midsize SaaS companies. When you cross a boundary, revisit the three underlying questions rather than forcing a binary answer.
Tooling choices that matter
Even within a monorepo, the selection of build orchestration tools determines whether you pay for the promise or the reality of incremental builds.
- Bazel: language‑agnostic, hermetic builds, strong cache invalidation. Ideal for mixed‑language stacks (Go, Java, C++). Requires a steep learning curve and a WORKSPACE file that mirrors every external dependency.
- Nx: optimized for JavaScript/TypeScript ecosystems, integrates with eslint, jest, and webpack. Its nx run-many and nx affected commands are trivial to adopt for teams already using npm or yarn.
- TurboRepo: zero‑config for Vite‑based front‑ends, leverages file‑system caches. Less flexible for back‑end services but shines when the majority of the repo is UI code.
- GitHub Actions matrix: cheap for small repos, but matrix expansion multiplies jobs across every configuration dimension, so job counts balloon as services are added. Use only as a fallback when dedicated tooling is overkill.
Pick the tool that matches the dominant language and the size of the DAG. Mixing Bazel for back‑end services and Nx for front‑end bundles works, but you must enforce a single source of truth for version pins to avoid “Bazel sees v1.2, Nx sees v1.3”.
Migration patterns
Switching from polyrepo to monorepo—or the reverse—should be treated as a multi‑stage release, not a single git merge. A safe migration path includes:
- Audit ownership: extract a CODEOWNERS file for every directory. Use git log --format='%aN' -- to compute historical contributors per path.
- Define a dependency graph: run madge --json src/ (for JS/TS) or go list -json ./... (for Go) and feed the output into a graph visualizer. Identify cycles that will break incremental builds.
- Establish a shared CI baseline: create a “smoke” pipeline that builds all services without running tests. This proves the build system can locate every entry point.
- Introduce change detection: enable nx affected or bazel query on a per‑branch basis. Verify that a change in service-a/ does not trigger service-b/ tests.
- Roll out incrementally: start with a pilot team, monitor CI minutes, and adjust cache sizes. Use the pilot’s metrics to validate the heuristic thresholds above.
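The cycle check in the second step is a plain depth‑first search over the module graph. A minimal sketch, assuming a graph shaped like madge's JSON output (each module mapped to the list of modules it imports; module names here are hypothetical):

```python
def find_cycle(graph):
    """Return one import cycle as a list of nodes, or None if the graph
    is a DAG. `graph` maps each module to the modules it imports."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on current path / done
    color = {n: WHITE for n in graph}
    stack = []

    def dfs(node):
        color[node] = GREY
        stack.append(node)
        for dep in graph.get(node, []):
            if color.get(dep, WHITE) == GREY:       # back edge: cycle found
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                found = dfs(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for n in list(graph):
        if color[n] == WHITE:
            cycle = dfs(n)
            if cycle:
                return cycle
    return None

print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))  # prints ['a', 'b', 'c', 'a']
print(find_cycle({"a": ["b"], "b": []}))                 # prints None
```

Run this before enabling incremental builds: a single import cycle turns two independently cacheable nodes into one oversized unit that invalidates together.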
Never merge the entire history of a polyrepo into a monorepo in one giant commit; the resulting mega‑commit destroys per‑file history, makes git blame and git bisect useless, and can exceed the push and object‑size limits of hosting providers. Instead, use git fast-export/git fast-import with --no-tags to preserve commit granularity while rewriting history into the target monorepo.
When the “wrong question” becomes right
Sometimes the dichotomy is useful as a rallying cry—especially when a team is stuck in a “repo‑hell” where every PR triggers a full build and nobody can tell why. In those cases, framing the problem as “monorepo vs polyrepo” forces a quick inventory of ownership, dependency velocity, and CI budget. The answer, however, should always be “we need to revisit X, Y, Z”, not “we’ll move everything into one repo”.
In practice, we’ve seen three outcomes:
- Monorepo wins: a startup with two squads, three services, and a shared @company/common library. The monorepo eliminated duplicate package-lock.json files and cut CI minutes by 70 %.
- Polyrepo wins: a large e‑commerce platform with 45 teams, each owning a distinct payment gateway. Independent repos allowed each team to adopt different Go versions without breaking the global build.
- Hybrid wins: a SaaS platform that kept core infrastructure (auth, billing, metrics) in a monorepo, while external client‑facing services lived in separate repos. Shared libraries were published to an internal npm registry, preserving version isolation.
The pattern that emerges is consistent: the repo shape is a symptom, not a cause.
Key takeaways
Stop treating “monorepo vs polyrepo” as a philosophical debate. Replace it with three concrete checkpoints:
- Ownership count: how many distinct teams must coordinate on each commit?
- Dependency change rate: how often do shared packages move, and can all consumers keep pace?
- CI budget: what is the per‑week compute cost you can sustain, and do you have tooling to prune unnecessary work?
Answer these with data—git shortlog -s -n for ownership, git log --since='1 month ago' -- for version churn, and CI provider dashboards for compute minutes. The resulting matrix will point you to a monorepo, a polyrepo, or a hybrid without any guesswork.
This is part of the Scaling Engineering Practices cornerstone series.