- Mar 20, 2026
Why Companies Are Quitting Microservices (And Going Modular Instead)
Microservices were sold as the solution to every scaling and team coordination problem. For most teams — below 30 engineers, without a dedicated platform team, without distributed tracing infrastructure — they delivered 100 services, debugging nightmares, and 3× infra bills. This guide is for backend engineers and engineering leads who need to understand what actually broke, what the industry data says, and how modular architecture fixes it without throwing away the good ideas microservices introduced.
The Architecture Consolidation Trend Is Real
Something measurable is happening in software architecture right now. Teams that spent 2017–2022 decomposing their systems into dozens — or hundreds — of microservices are reversing course. Not with flashy rewrite announcements, but steadily and pragmatically: consolidating services and rebuilding internal module boundaries that should never have been network boundaries in the first place.
This is the microservices consolidation trend of 2026, and it is not a failure of the pattern. It is the industry correcting a systematic case of premature optimization at the architecture level.
The CNCF Q3 2025 State of Cloud Native Development report, conducted across 12,021 developers, found that while 46% of developers are actively building microservices, service mesh adoption dropped from 18% in Q3 2023 to just 8% in Q3 2025. The operational tooling around distributed systems is being cut — even where the services themselves remain.
The core problem: Microservices were applied as an industry default — not as a solution to a specific, verified organizational problem. The architecture designed for 200-engineer teams at Netflix was copy-pasted into 10-person product teams with zero operational infrastructure to support it.
Donald Knuth’s warning that premature optimization is the root of all evil applies at the architecture level just as much as the code level. Distributing your system before you have the team size, scale, or operational maturity to support it adds every cost of microservices with none of the benefit.
The monolith first strategy, articulated by Martin Fowler, makes this argument clearly: almost all successful microservice stories started with a monolith that grew too large. Teams that start with microservices from day one almost always run into serious trouble.
Related on Wishyor: Domain-Driven Design: A Practical Intro for Backend Engineers · Evolutionary Architecture: Designing Systems That Can Change
The Four Failure Modes Nobody Warned You About
1. Service Proliferation Without a Governor
There is no natural limit on service count in a microservices organization. The service proliferation problem emerges when every domain decomposition feels locally justified, but the cumulative system becomes unmanageable.
“User service” becomes “auth service,” which becomes “permissions service,” which spawns “roles service.” At 100 services, the system exceeds the cognitive capacity of any single engineer — including the ones who built it. The result is a distributed monolith: all the operational overhead, none of the independence, because services are still so tightly coupled they deploy together anyway.
2. Microservices Debugging Overhead
In a monolith, a stack trace is a complete story. In a distributed system, a single failing request might touch fifteen services. The microservices debugging overhead in practice:
# Correlate a single failed request across services
kubectl logs -n gateway deploy/kong --since=30m | grep "trace-7f3a"
kubectl logs -n payments deploy/payment-svc --since=30m | grep "trace-7f3a"
kubectl logs -n orders deploy/order-svc --since=30m | grep "trace-7f3a"
kubectl logs -n notify deploy/notify-svc --since=30m | grep "trace-7f3a"
# Check the event queue
rabbitmqctl list_queues | grep order.completed
# Verify the service mesh isn't silently dropping traffic
istioctl proxy-config routes deploy/order-svc -n orders
# Elapsed: 45 minutes. Customer filed a support ticket 40 minutes ago.
Compare that to a modular monolith:
# One log. One place.
grep "trace-7f3a" /var/log/app/application.log
# Root cause in under 5 minutes.
3. Microservices Local Development Problems
These problems hit every engineer, every day. Running the full system locally means running every service, plus the message brokers, service registries, and mock external APIs they depend on. Docker Compose files grow to hundreds of lines.
New engineers spend their first two weeks learning how to start the infrastructure, not learning the business domain. Microservices onboarding complexity directly extends time-to-first-commit and erodes the throughput the architecture was supposed to improve.
4. Ownership Diffusion
The promise: one team, one service, full accountability. The reality: services outlive the teams that built them. Organizational changes create orphaned codebases. Documentation drifts from reality. The architecture designed to enforce ownership scattered it instead.
Research on DORA metrics and architecture consistently finds a microservices team size threshold of approximately 10 developers — below that, coordination overhead dominates and microservices produce net-negative outcomes on deployment frequency and change failure rate.
Related on Wishyor: DORA Metrics Explained: What They Measure and Why They Matter
Microservices ROI: The Real Cost Comparison
The microservices vs monolith cost comparison is rarely done honestly before adoption. Here is what the full picture looks like:
| Cost Category | Over-distributed Microservices | Modular Monolith |
|---|---|---|
| Infrastructure | Scales with service count — container per service, mesh, observability stack | Scales with actual traffic load |
| CI/CD | N independent pipelines for N services | One to a handful of pipelines |
| Observability | Distributed tracing required; expensive at scale | Standard APM and logging stay sufficient for much longer |
| On-call | Rotation surface per service or service group | Smaller, manageable rotation |
| Platform engineering | Dedicated Kubernetes operations team | Substantially reduced |
| Developer onboarding | Weeks to get local environment functional | Days |
| Debugging | Trace IDs, distributed logs, topology knowledge required | Stack traces |
The Kubernetes Complexity Tax
Kubernetes operational cost at small team scale is the most underestimated line item. Engineers spend meaningful portions of every week on cluster management, certificate rotation, ingress configuration, and upgrade cycles — not product development.
The CNCF 2025 Annual Cloud Native Survey found that while Kubernetes production adoption hit 82%, the top ongoing challenges are security (36%), lack of training (36%), and complexity (34%) — costs that compound with every additional service boundary.
The Service Mesh Overhead Signal
Service mesh adoption dropped from 18% in Q3 2023 to just 8% in Q3 2025 — a 55% decline in two years. Teams are not abandoning distributed systems. They are abandoning the complexity layer that was supposed to manage them.
The Amazon Prime Video Proof Point
Amazon Prime Video published a case study documenting how the Video Quality Analysis team consolidated a distributed serverless monitoring system — built on AWS Step Functions and S3 inter-component data transfer — into a single process on EC2 and ECS:
- Infrastructure cost dropped over 90%
- Scaling capability improved
- The original architecture hit a hard scaling ceiling at 5% of expected load
The team’s conclusion: “Microservices and serverless components are tools that do work at high scale, but whether to use them over monolith has to be made on a case-by-case basis.”
Related on Wishyor: Cloud Architecture Best Practices
The Modular Monolith First Approach Explained
The modular monolith first approach is not a return to spaghetti code. It is the discipline of microservices applied at the code level rather than the infrastructure level.
Bounded Contexts Without Network Boundaries
The core idea from domain-driven design (DDD) — formalized by Eric Evans and expanded in Martin Fowler’s microservices guide — is that your system should be organized around bounded contexts: coherent domains with explicit public interfaces and isolated internal logic.
In microservices, each bounded context becomes a network service. In a modular monolith, each bounded context becomes a strongly-typed module with a public API and enforced boundaries — without the network hop.
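As a minimal sketch of what that looks like in code (the billing names and shapes are illustrative, not drawn from any real codebase), everything a sibling module may touch sits behind one exported interface, while the implementation stays private:

```typescript
// public/BillingService.ts: the only surface other modules may import.
export interface Invoice {
  id: string;
  customerId: string;
  amountCents: number;
  status: "draft" | "paid";
}

export interface BillingService {
  createInvoice(customerId: string, amountCents: number): Invoice;
  markPaid(invoiceId: string): Invoice;
}

// internal/services/DefaultBillingService.ts: private implementation.
// Importing this from another module would be flagged as a boundary violation.
class DefaultBillingService implements BillingService {
  private invoices = new Map<string, Invoice>();
  private nextId = 1;

  createInvoice(customerId: string, amountCents: number): Invoice {
    const invoice: Invoice = {
      id: `inv-${this.nextId++}`,
      customerId,
      amountCents,
      status: "draft",
    };
    this.invoices.set(invoice.id, invoice);
    return invoice;
  }

  markPaid(invoiceId: string): Invoice {
    const invoice = this.invoices.get(invoiceId);
    if (!invoice) throw new Error(`unknown invoice: ${invoiceId}`);
    invoice.status = "paid";
    return invoice;
  }
}

// index.ts: the barrel exports only the interface and a factory.
export function createBillingService(): BillingService {
  return new DefaultBillingService();
}
```

In a real repository the three commented sections would be separate files, and the barrel would re-export only the public directory; collapsing them here keeps the sketch self-contained.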
What Enforcement Actually Looks Like
src/
├── modules/
│ ├── billing/
│ │ ├── public/ # Public API — what other modules may call
│ │ │ └── BillingService.ts
│ │ ├── internal/ # Private — cross-module import = CI violation
│ │ │ ├── domain/
│ │ │ ├── repository/
│ │ │ └── services/
│ │ └── index.ts # Barrel — exports only the public surface
│ ├── orders/
│ │ ├── public/
│ │ └── internal/
│ └── notifications/
│ ├── public/
│ └── internal/
├── shared/
│ └── kernel/ # Cross-cutting primitives only — no domain logic
└── app.ts
# Boundary enforcement tooling by platform:
# Ruby → Packwerk (Shopify open-source)
# JVM → ArchUnit, Spring Modulith
# Node → ESLint import rules, dependency-cruiser
# Python → import-linter
# Violations caught at CI time, not discovered in production incidents.
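As one hedged example of CI-time enforcement in a Node codebase, a dependency-cruiser rule can forbid reaching into another module's internal directory. The paths below assume the src/modules layout sketched above; adapt them to your tree:

```javascript
// .dependency-cruiser.js: a minimal boundary rule (sketch, not a full config).
module.exports = {
  forbidden: [
    {
      name: "no-cross-module-internals",
      severity: "error",
      comment:
        "Modules talk to each other through public/ only; reaching into " +
        "another module's internal/ directory fails the CI lint step.",
      from: { path: "^src/modules/([^/]+)/" },
      to: {
        path: "^src/modules/([^/]+)/internal/",
        // Group matching: a module may freely use its own internals.
        pathNot: "^src/modules/$1/internal/",
      },
    },
  ],
};
```

Running `depcruise src` in CI then turns a cross-boundary import into a failing build rather than a production surprise.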
Hexagonal Architecture at the Module Level
This maps directly to hexagonal architecture (ports and adapters): each module exposes ports (interfaces) and depends on adapters — never on the internals of sibling modules. Business logic stays decoupled from infrastructure, external APIs, and other modules.
The result: modular monolith core plus extracted services only where the evidence justifies it. Same internal discipline as microservices. Without the universal network overhead.
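A minimal TypeScript sketch of that port-and-adapter shape (all names invented): the core service depends on an interface it owns, and infrastructure is just one swappable implementation of it:

```typescript
// Port: an interface owned by the orders module's core logic.
interface Order {
  id: string;
  totalCents: number;
  placed: boolean;
}

interface OrderRepository {
  save(order: Order): void;
  find(id: string): Order | undefined;
}

// Core business logic depends only on the port, never on infrastructure,
// external APIs, or sibling modules' internals.
class OrderService {
  constructor(private readonly repo: OrderRepository) {}

  placeOrder(id: string, totalCents: number): Order {
    const order: Order = { id, totalCents, placed: true };
    this.repo.save(order);
    return order;
  }
}

// Adapter: one interchangeable implementation of the port. Swapping in a
// Postgres-backed or HTTP-backed adapter requires no change to OrderService.
class InMemoryOrderRepository implements OrderRepository {
  private orders = new Map<string, Order>();
  save(order: Order): void {
    this.orders.set(order.id, order);
  }
  find(id: string): Order | undefined {
    return this.orders.get(id);
  }
}
```

The adapter boundary is exactly where a future network boundary would go, which is why this discipline makes later extraction cheap.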
Conway’s Law in Both Directions
Conway’s Law states that organizations design systems mirroring their communication structures. It is the real organizational argument for microservices — genuine team independence for genuinely separate teams.
But it is also the most misapplied justification for premature distribution. A 15-person team does not have the Conway’s Law coordination problem. The hybrid architecture microservices monolith approach acknowledges this: apply service boundaries where real Conway’s Law pressure exists, keep the rest modular.
The Inverse Conway Maneuver works in both directions — you can structure your codebase to support a team organization that does not yet require distribution.
Related on Wishyor: Domain-Driven Design: A Practical Intro for Backend Engineers
The Service Extraction Pattern: Earning Distribution
The service extraction pattern is the mechanism by which a modular monolith graduates specific modules to independent services. The key shift in thinking: distribution is earned by evidence, not assumed by default.
As Martin Fowler’s guide on breaking a monolith into microservices documents, start with capabilities that are fairly decoupled from the rest of the system — and don’t require changes to many client-facing systems — before tackling deeper extractions.
Evidence Thresholds That Justify Extraction
EXTRACT a module into its own service when:
1. SCALING MISMATCH IS REAL AND CURRENT
The module needs 10x+ more compute than the rest of the system TODAY.
Not "might need to scale differently someday."
2. DEPLOYMENT INDEPENDENCE IS GENUINELY REQUIRED
The team ships on a fundamentally different cadence with different risk —
e.g., payments needs zero-downtime hotfix capability independent of a CMS
team on a weekly release cycle.
3. COMPLIANCE MANDATES INFRASTRUCTURE ISOLATION
PCI-DSS scope reduction, HIPAA data segregation, SOC2 audit boundaries.
Compliance is a hard requirement, not an architectural preference.
4. THE API HAS BEEN STABLE FOR 6+ MONTHS
The module's public interface is unlikely to churn under a network boundary.
Extracting an unstable interface layers versioning overhead on top of the design churn you are already absorbing.
DO NOT extract because:
→ It "feels cleaner" on a whiteboard
→ A conference talk said microservices are best practice
→ The module is large (size is not a distribution criterion)
→ You anticipate needing it "someday"
→ Teams want to feel more ownership
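Read as a predicate, the checklist collapses to a few lines. This sketch is one interpretation of it (the names are invented, and compliance is treated as sufficient on its own because it is a hard requirement rather than a preference):

```typescript
interface ExtractionEvidence {
  computeRatioVsRest: number;       // measured today, not projected
  needsIndependentDeploys: boolean; // genuinely different cadence and risk
  complianceRequiresIsolation: boolean;
  apiStableMonths: number;          // how long the public interface has held
}

function shouldExtract(e: ExtractionEvidence): boolean {
  // Compliance mandates infrastructure isolation regardless of anything else.
  if (e.complianceRequiresIsolation) return true;
  // Otherwise there must be real, current pressure...
  const pressure = e.computeRatioVsRest >= 10 || e.needsIndependentDeploys;
  // ...and a public API stable enough to survive becoming a network contract.
  return pressure && e.apiStableMonths >= 6;
}
```

Note what is absent: module size, whiteboard aesthetics, and anticipated future needs never appear as inputs.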
Why Clean Extraction Is Possible
When extraction happens at this point, it is a targeted architectural decision with a clear ROI justification. Because the module boundary already exists in code with a stable public interface, the extraction is clean: you are promoting a well-defined module to a separate deployment, not untangling implicit coupling while simultaneously introducing a new network boundary.
This is the Strangler Fig pattern applied correctly: strangle along real bounded-context boundaries that already exist in the codebase. Not ad-hoc cuts through undifferentiated business logic.
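Concretely, when the boundary already exists as an interface, extraction amounts to writing a second implementation of it. A sketch under that assumption (NotificationService, the endpoint path, and every name here are hypothetical):

```typescript
// The module's existing public interface: consumers never see which
// implementation sits behind it.
interface NotificationService {
  send(userId: string, message: string): Promise<void>;
}

// Before extraction: the in-process implementation inside the monolith.
class InProcessNotifications implements NotificationService {
  public sent: Array<{ userId: string; message: string }> = [];
  async send(userId: string, message: string): Promise<void> {
    this.sent.push({ userId, message });
  }
}

// After extraction: same interface, now a thin client for the new service.
// (The URL is a placeholder; production code would add timeouts and retries.)
class RemoteNotifications implements NotificationService {
  constructor(private readonly baseUrl: string) {}
  async send(userId: string, message: string): Promise<void> {
    await fetch(`${this.baseUrl}/notifications`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ userId, message }),
    });
  }
}

// Consumers are wired against the interface, so promoting the module to a
// service is a one-line change at the composition root, not a refactor.
async function notifyOrderShipped(notifier: NotificationService, userId: string) {
  await notifier.send(userId, "Your order shipped.");
}
```

The strangling happens at the composition root: traffic moves to RemoteNotifications incrementally while InProcessNotifications remains as the fallback.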
Related on Wishyor: Evolutionary Architecture: Designing Systems That Can Change
Production Evidence: What Real Companies Did
Amazon Prime Video — 90% Infrastructure Cost Reduction
In 2023, Amazon Prime Video’s senior SDE Marcin Kolny published a detailed post on primevideotech.com documenting how the Video Quality Analysis team consolidated a distributed serverless monitoring system into a single process on EC2 and ECS.
The original architecture used AWS Step Functions for orchestration — performing multiple state transitions per second of video stream — and S3 as intermediate storage for video frames between components. These two design choices hit a hard scaling ceiling at just 5% of expected load and were generating enormous per-request costs.
Moving to in-process communication eliminated both bottlenecks simultaneously:
- Infrastructure cost dropped over 90%
- Scaling capacity improved, not degraded
- The architecture designed for scale was itself the scaling bottleneck
The team’s direct conclusion: “Microservices and serverless components are tools that do work at high scale, but whether to use them over monolith has to be made on a case-by-case basis.”
Shopify — Modular Monolith at $5B Black Friday Scale
Rather than migrating to microservices as their codebase crossed 2.8 million lines and 500,000 commits, Shopify chose to evolve into a modular monolith — keeping all code in one codebase while enforcing domain boundaries with Packwerk, their open-source Ruby dependency enforcement tool that flags cross-boundary violations at CI time.
The scale this architecture supports today:
- $5 billion in Gross Merchandise Volume on a single Black Friday
- 284 million requests per minute at peak
- 45 million database queries per second
GitHub — Monolith Architecture at Developer Infrastructure Scale
GitHub has maintained a Ruby on Rails monolith as its core application since inception, serving tens of millions of daily active developers globally. Specialized services (Git storage, Actions infrastructure, notifications) are extracted where the evidence justifies independent scaling; the core product remains a well-structured monolith.
Stack Overflow & Basecamp — The Boring Tech Benchmark
Stack Overflow’s infrastructure is the canonical example of the boring tech movement: one of the highest-traffic developer platforms running on a small cluster of well-tuned servers with a structured monolithic core. No service mesh, no distributed tracing infrastructure — just excellent code and good engineering discipline.
Basecamp has argued publicly for the Majestic Monolith for nearly two decades and runs one of the most profitable SaaS products per engineer in the industry.
None of these companies failed to understand distributed systems. They understood them well enough to know when not to use them.
Decision Framework: Which Architecture Fits Your Team
Team Size Thresholds
TEAM < 10 ENGINEERS
→ Modular monolith. No debate.
→ Microservices coordination overhead dominates every sprint.
→ Strong internal module boundaries. One deployment unit.
TEAM 10–30 ENGINEERS, SINGLE PRODUCT
→ Modular monolith with a deliberate extraction roadmap.
→ Identify which 1–2 modules are extraction candidates and why.
→ Do not extract until the four evidence criteria are met.
TEAM 30–100 ENGINEERS, MULTIPLE PRODUCT AREAS
→ Evaluate per area: which teams have genuinely different release cadences?
→ Apply Conway's Law analysis: where are the real deployment bottlenecks?
→ Extract those specific modules. Keep the rest modular within a monolith.
TEAM 100+ ENGINEERS, MULTIPLE INDEPENDENT TEAMS
→ Microservices make organizational sense for genuine team independence.
→ But: still enforce strong module boundaries inside each service.
→ At this scale, microservices solve a real coordination problem.
OVERRIDE SIGNALS (apply regardless of team size):
→ A module needs 10x+ more compute than the rest of the system TODAY.
→ Compliance mandates infrastructure-level isolation.
→ A team genuinely needs zero-coordination deployment independence.
→ Any of the above → targeted extraction of that specific module only.
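The framework above is mechanical enough to restate as code. This sketch merely encodes the thresholds as written (the function name and return strings are invented labels, not an API):

```typescript
// Maps team size to the default architecture from the framework above.
// An override signal (current 10x+ compute mismatch, compliance isolation,
// genuine deploy independence) trumps the default for that one module.
function defaultArchitecture(engineers: number, overrideSignal = false): string {
  if (overrideSignal) return "targeted extraction of the affected module";
  if (engineers < 10) return "modular monolith";
  if (engineers <= 30) return "modular monolith with an extraction roadmap";
  if (engineers <= 100) return "per-area evaluation with targeted extractions";
  return "microservices with internal module boundaries";
}
```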
The Honest Questions to Ask Before Distributing
Before introducing any new network boundary, answer these four questions:
| Question | If Yes | If No |
|---|---|---|
| Does this module see fundamentally different traffic today? | Consider extraction | Keep as internal module |
| Does this team have a truly different release cadence? | Consider extraction | Coordinate at the monolith level |
| Does compliance require infrastructure isolation? | Must extract | Keep as internal module |
| Has the module’s API been stable for 6+ months? | Safe to extract | Wait for stability |
Compliance is the one answer that forces extraction on its own. For everything else, multiple answers should point toward extraction before you proceed: a single "consider extraction" is not enough, because the full operational overhead must be justified by the specific constraint it solves.
Serverless vs Microservices
Serverless vs microservices is a separate decision that often gets conflated. Serverless functions make sense for:
- Event-driven, intermittent workloads (webhooks, async processing, scheduled jobs)
- Workloads with genuinely unpredictable bursty traffic
- Isolated tasks where cold start latency is acceptable
They make the distribution problem worse, not better, for high-throughput synchronous paths: exactly the scenario behind Amazon Prime Video's 90% cost reduction after consolidating its serverless pipeline into a single process. The distributed tracing, cold start, and state management overhead of serverless compounds every microservices anti-pattern.
Related on Wishyor: DORA Metrics Explained: What They Measure and Why They Matter
Further Reading
External References
| Source | Why It Matters |
|---|---|
| Martin Fowler — MonolithFirst | The canonical case for starting with a monolith; cites real project failure data |
| Martin Fowler — Microservices Guide | Original definition, bounded contexts, and organizational tradeoffs |
| Martin Fowler — Breaking a Monolith into Microservices | When and how to extract services correctly — sequencing matters |
| Prime Video — 90% Cost Reduction Case Study | Amazon’s own documentation of microservices consolidation |
| Shopify Engineering — Deconstructing the Monolith | Modular monolith at $5B Black Friday scale |
| CNCF — State of Cloud Native Development Q3 2025 | Industry data on microservices and service mesh adoption trends |
| CNCF — 2025 Annual Cloud Native Survey | Kubernetes at 82% production adoption; complexity remains top challenge |
| Stack Exchange — Performance | The boring tech benchmark in concrete numbers |
| DHH — The Majestic Monolith | Basecamp’s original argument for monolithic simplicity |
Architecture decisions compound — choose the right defaults early. Module boundaries are cheap; network boundaries are expensive. Distribute when real evidence demands it: scaling mismatches, compliance requirements, or genuine team independence needs. The companies winning in 2026 are not the ones with the most services — they are the ones whose engineers can understand, operate, and ship without the architecture being the bottleneck.
Ready to audit your architecture? Work with our team to identify where microservices overhead is slowing you down — and build the modular boundaries that let you scale without chaos at Wishyor.