• Mar 20, 2026
  • 2 min read

Why Companies Are Quitting Microservices (And Going Modular Instead)

Microservices were sold as the solution to every scaling and team coordination problem. For most teams — below 30 engineers, without a dedicated platform team, without distributed tracing infrastructure — they delivered 100 services, debugging nightmares, and 3× infra bills. This guide is for backend engineers and engineering leads who need to understand what actually broke, what the industry data says, and how modular architecture fixes it without throwing away the good ideas microservices introduced.

The Architecture Consolidation Trend Is Real

Something measurable is happening in software architecture right now. Teams that spent 2017–2022 decomposing their systems into dozens — or hundreds — of microservices are reversing course. Not with a flashy rewrite announcement. But steadily, pragmatically, consolidating services and rebuilding internal module boundaries that should never have been network boundaries in the first place.

This is the microservices consolidation trend of 2026, and it is not a failure of the pattern. It is the industry correcting a systematic case of premature optimization applied at the architecture level.

The CNCF Q3 2025 State of Cloud Native Development report, based on responses from 12,021 developers, found that while 46% of developers are actively building microservices, service mesh adoption dropped from 18% in Q3 2023 to just 8% in Q3 2025. The operational tooling around distributed systems is being cut — even where the services themselves remain.

The core problem: Microservices were applied as an industry default — not as a solution to a specific, verified organizational problem. The architecture designed for 200-engineer teams at Netflix was copy-pasted into 10-person product teams with zero operational infrastructure to support it.

Donald Knuth’s warning that premature optimization is the root of all evil applies at the architecture level just as much as the code level. Distributing your system before you have the team size, scale, or operational maturity to support it adds every cost of microservices with none of the benefit.

The monolith first strategy, articulated by Martin Fowler, makes this argument clearly: almost all successful microservice stories started with a monolith that grew too large. Teams that start with microservices from day one almost always run into serious trouble.

Related on Wishyor: Domain-Driven Design: A Practical Intro for Backend Engineers · Evolutionary Architecture: Designing Systems That Can Change


The Four Failure Modes Nobody Warned You About

1. Service Proliferation Without a Governor

There is no natural limit on service count in a microservices organization. The service proliferation problem emerges when every domain decomposition feels locally justified, but the cumulative system becomes unmanageable.

“User service” becomes “auth service,” which becomes “permissions service,” which spawns “roles service.” At 100 services, the system exceeds the cognitive capacity of any single engineer — including the ones who built it. The result is a distributed monolith: all the operational overhead, none of the independence, because services are still so tightly coupled they deploy together anyway.


2. Microservices Debugging Overhead

In a monolith, a stack trace is a complete story. In a distributed system, a single failing request might touch fifteen services. Here is what that overhead looks like in practice:

# Correlate a single failed request across services
kubectl logs -n gateway  deploy/kong         --since=30m | grep "trace-7f3a"
kubectl logs -n payments deploy/payment-svc  --since=30m | grep "trace-7f3a"
kubectl logs -n orders   deploy/order-svc    --since=30m | grep "trace-7f3a"
kubectl logs -n notify   deploy/notify-svc   --since=30m | grep "trace-7f3a"

# Check the event queue
rabbitmqctl list_queues | grep order.completed

# Verify the service mesh isn't silently dropping traffic
istioctl proxy-config routes deploy/order-svc -n orders

# Elapsed: 45 minutes. Customer filed a support ticket 40 minutes ago.

Compare that to a modular monolith:

# One log. One place.
grep "trace-7f3a" /var/log/app/application.log
# Root cause in under 5 minutes.

3. Microservices Local Development Problems

Local development pain shows up daily. Running the full system locally means running every service — including message brokers, service registries, and mock external APIs. Docker Compose files grow to hundreds of lines.

New engineers spend their first two weeks learning how to start the infrastructure, not learning the business domain. Microservices onboarding complexity directly extends time-to-first-commit and erodes the throughput the architecture was supposed to improve.


4. Ownership Diffusion

The promise: one team, one service, full accountability. The reality: services outlive the teams that built them. Organizational changes create orphaned codebases. Documentation drifts from reality. The architecture designed to enforce ownership scattered it instead.

Research on DORA metrics and architecture consistently finds a microservices team size threshold of approximately 10 developers — below that, coordination overhead dominates and microservices produce net-negative outcomes on deployment frequency and change failure rate.

Related on Wishyor: DORA Metrics Explained: What They Measure and Why They Matter


Microservices ROI: The Real Cost Comparison

The microservices vs monolith cost comparison is rarely done honestly before adoption. Here is what the full picture looks like:

Cost Category        | Over-distributed Microservices                                               | Modular Monolith
-------------------- | ---------------------------------------------------------------------------- | ----------------
Infrastructure       | Scales with service count: container per service, mesh, observability stack  | Scales with actual traffic load
CI/CD                | N independent pipelines for N services                                        | One to a handful of pipelines
Observability        | Distributed tracing required; expensive at scale                              | Standard APM and logging sufficient for longer
On-call              | Rotation surface per service or service group                                 | Smaller, manageable rotation
Platform engineering | Dedicated Kubernetes operations team                                          | Substantially reduced
Developer onboarding | Weeks to get a local environment functional                                   | Days
Debugging            | Trace IDs, distributed logs, topology knowledge required                      | Stack traces

The Kubernetes Complexity Tax

Kubernetes operational cost at small team scale is the most underestimated line item. Engineers spend meaningful portions of every week on cluster management, certificate rotation, ingress configuration, and upgrade cycles — not product development.

The CNCF 2025 Annual Cloud Native Survey found that while Kubernetes production adoption hit 82%, the top ongoing challenges are security (36%), lack of training (36%), and complexity (34%) — costs that compound with every additional service boundary.

The Service Mesh Overhead Signal

Service mesh adoption dropped from 18% in Q3 2023 to just 8% in Q3 2025, falling by more than half in two years. Teams are not abandoning distributed systems. They are abandoning the complexity layer that was supposed to manage them.

The Amazon Prime Video Proof Point

Amazon Prime Video published a case study documenting how the Video Quality Analysis team consolidated a distributed serverless monitoring system — built on AWS Step Functions and S3 inter-component data transfer — into a single process on EC2 and ECS. Infrastructure cost dropped over 90%, scaling capability improved, and the original architecture had hit a hard scaling ceiling at just 5% of expected load. The full breakdown, including the team's own conclusion, appears in the production evidence section below.

Related on Wishyor: Cloud Architecture Best Practices


The Modular Monolith First Approach Explained

The modular monolith first approach is not a return to spaghetti code. It is the discipline of microservices applied at the code level rather than the infrastructure level.

Bounded Contexts Without Network Boundaries

The core idea from domain-driven design (DDD) — formalized by Eric Evans and expanded in Martin Fowler’s microservices guide — is that your system should be organized around bounded contexts: coherent domains with explicit public interfaces and isolated internal logic.

In microservices, each bounded context becomes a network service. In a modular monolith, each bounded context becomes a strongly-typed module with a public API and enforced boundaries — without the network hop.

What Enforcement Actually Looks Like

src/
├── modules/
│   ├── billing/
│   │   ├── public/          # Public API — what other modules may call
│   │   │   └── BillingService.ts
│   │   ├── internal/        # Private — cross-module import = CI violation
│   │   │   ├── domain/
│   │   │   ├── repository/
│   │   │   └── services/
│   │   └── index.ts         # Barrel — exports only the public surface
│   ├── orders/
│   │   ├── public/
│   │   └── internal/
│   └── notifications/
│       ├── public/
│       └── internal/
├── shared/
│   └── kernel/              # Cross-cutting primitives only — no domain logic
└── app.ts

# Boundary enforcement tooling by platform:
#   Ruby   → Packwerk (Shopify open-source)
#   JVM    → ArchUnit, Spring Modulith
#   Node   → ESLint import rules, dependency-cruiser
#   Python → importlinter
# Violations caught at CI time, not discovered in production incidents.
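
To make the tree concrete, here is a minimal TypeScript sketch of the billing module's surface, collapsed into one file for brevity. The names (Invoice, InMemoryBilling) are illustrative, not from any real codebase:

```typescript
// One-file sketch of a module boundary. In the real tree, the interface lives
// in modules/billing/public/ and the class in modules/billing/internal/;
// index.ts re-exports only the public surface.

// --- public/BillingService.ts: what other modules are allowed to import ---
interface Invoice {
  id: string;
  amountCents: number;
  status: "draft" | "paid";
}

interface BillingService {
  createInvoice(customerId: string, amountCents: number): Invoice;
  markPaid(invoiceId: string): Invoice;
}

// --- internal/services/: importing this from another module fails CI ---
// --- (via dependency-cruiser or ESLint rules), not at runtime.        ---
class InMemoryBilling implements BillingService {
  private invoices = new Map<string, Invoice>();
  private seq = 0;

  createInvoice(customerId: string, amountCents: number): Invoice {
    const invoice: Invoice = {
      id: `inv-${++this.seq}-${customerId}`,
      amountCents,
      status: "draft",
    };
    this.invoices.set(invoice.id, invoice);
    return invoice;
  }

  markPaid(invoiceId: string): Invoice {
    const invoice = this.invoices.get(invoiceId);
    if (invoice === undefined) throw new Error(`unknown invoice: ${invoiceId}`);
    invoice.status = "paid";
    return invoice;
  }
}
```

Other modules see only the BillingService interface; the Map, the ID scheme, and the class itself are free to change without touching any caller.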

Hexagonal Architecture at the Module Level

This maps directly to hexagonal architecture (ports and adapters): each module exposes ports (interfaces) and depends on adapters — never on the internals of sibling modules. Business logic stays decoupled from infrastructure, external APIs, and other modules.

The result: modular monolith core plus extracted services only where the evidence justifies it. Same internal discipline as microservices. Without the universal network overhead.
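
A sketch of that shape, with hypothetical names: the orders module defines the port it needs in its own terms, and an adapter satisfies it from the outside, so order logic never imports an email library or a sibling module's internals.

```typescript
// Port: defined BY the orders module, in the module's own vocabulary.
interface NotificationPort {
  orderShipped(orderId: string, recipient: string): void;
}

// Core business logic depends only on the port, never on infrastructure.
class OrderShipper {
  constructor(private readonly notifier: NotificationPort) {}

  ship(orderId: string, customerEmail: string): string {
    // ... mark the order shipped in the orders module's own storage ...
    this.notifier.orderShipped(orderId, customerEmail);
    return `shipped:${orderId}`;
  }
}

// Adapter: lives at the edge. It could wrap SMTP, a queue, or (after
// extraction) an HTTP call; the core logic never changes either way.
class RecordingNotificationAdapter implements NotificationPort {
  readonly sent: string[] = [];
  orderShipped(orderId: string, recipient: string): void {
    this.sent.push(`${orderId} -> ${recipient}`);
  }
}
```

The same wiring works in tests: inject a recording adapter and assert on what the module tried to send, with no broker or SMTP server running.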

Conway’s Law in Both Directions

Conway’s Law states that organizations design systems mirroring their communication structures. It is the real organizational argument for microservices — genuine team independence for genuinely separate teams.

But it is also the most misapplied justification for premature distribution. A 15-person team does not have a Conway's Law coordination problem. The hybrid approach acknowledges this: apply service boundaries where real Conway's Law pressure exists, and keep the rest modular inside a monolith.

The Inverse Conway Maneuver works in both directions — you can structure your codebase to support a team organization that does not yet require distribution.

Related on Wishyor: Domain-Driven Design: A Practical Intro for Backend Engineers


The Service Extraction Pattern: Earning Distribution

The service extraction pattern is the mechanism by which a modular monolith graduates specific modules to independent services. The key shift in thinking: distribution is earned by evidence, not assumed by default.

As Martin Fowler’s guide on breaking a monolith into microservices documents, start with capabilities that are fairly decoupled from the rest of the system — and don’t require changes to many client-facing systems — before tackling deeper extractions.

Evidence Thresholds That Justify Extraction

EXTRACT a module into its own service when:

1. SCALING MISMATCH IS REAL AND CURRENT
   The module needs 10x+ more compute than the rest of the system TODAY.
   Not "might need to scale differently someday."

2. DEPLOYMENT INDEPENDENCE IS GENUINELY REQUIRED
   The team ships on a fundamentally different cadence with different risk —
   e.g., payments needs zero-downtime hotfix capability independent of a CMS
   team on a weekly release cycle.

3. COMPLIANCE MANDATES INFRASTRUCTURE ISOLATION
   PCI-DSS scope reduction, HIPAA data segregation, SOC2 audit boundaries.
   Compliance is a hard requirement, not an architectural preference.

4. THE API HAS BEEN STABLE FOR 6+ MONTHS
   The module's public interface is unlikely to churn once a network boundary
   sits under it. Extracting an unstable interface stacks versioning overhead
   on top of ongoing design churn.

DO NOT extract because:
  → It "feels cleaner" on a whiteboard
  → A conference talk said microservices are best practice
  → The module is large (size is not a distribution criterion)
  → You anticipate needing it "someday"
  → Teams want to feel more ownership

Why Clean Extraction Is Possible

When extraction happens at this point, it is a targeted architectural decision with a clear ROI justification. Because the module boundary already exists in code with a stable public interface, the extraction is clean — you are promoting a well-defined module to a separate deployment, not untangling implicit coupling while simultaneously introducing a new network boundary.

This is the Strangler Fig pattern applied correctly: strangle along real bounded-context boundaries that already exist in the codebase. Not ad-hoc cuts through undifferentiated business logic.
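
A sketch of why the swap is clean, assuming the module already exposes an interface (BillingClient and the transport function here are illustrative, not a real API):

```typescript
// Extraction as an implementation swap behind an existing interface.
interface BillingClient {
  charge(customerId: string, amountCents: number): string; // returns a charge id
}

// Before extraction: the module runs in-process.
class InProcessBilling implements BillingClient {
  private seq = 0;
  charge(customerId: string, amountCents: number): string {
    return `chg-${++this.seq}-${customerId}-${amountCents}`;
  }
}

// After extraction: same interface, but calls cross the network.
// The transport is injected so the shape is testable without a live
// service; in production it would wrap an HTTP client pointed at the
// newly extracted billing service.
type Transport = (path: string, body: object) => string;

class RemoteBilling implements BillingClient {
  constructor(private readonly post: Transport) {}
  charge(customerId: string, amountCents: number): string {
    return this.post("/v1/charges", { customerId, amountCents });
  }
}

// Callers depend on BillingClient, so graduating the module to a
// service is one wiring change; no call sites are rewritten.
```

This is the payoff of enforcing the boundary in code first: the network hop is added to an interface that already behaves like a remote contract.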

Related on Wishyor: Evolutionary Architecture: Designing Systems That Can Change


Production Evidence: What Real Companies Did

Amazon Prime Video — 90% Infrastructure Cost Reduction

In 2023, Amazon Prime Video’s senior SDE Marcin Kolny published a detailed post on primevideotech.com documenting how the Video Quality Analysis team consolidated a distributed serverless monitoring system into a single process on EC2 and ECS.

The original architecture used AWS Step Functions for orchestration — performing multiple state transitions per second of video stream — and S3 as intermediate storage for video frames between components. These two design choices hit a hard scaling ceiling at just 5% of expected load and were generating enormous per-request costs.

Moving to in-process communication eliminated both bottlenecks simultaneously:

  • Infrastructure cost dropped over 90%
  • Scaling capacity improved, not degraded
  • The architecture designed for scale was itself the scaling bottleneck

The team’s direct conclusion: “Microservices and serverless components are tools that do work at high scale, but whether to use them over monolith has to be made on a case-by-case basis.”


Shopify — Modular Monolith at $5B Black Friday Scale

Rather than migrating to microservices as their codebase crossed 2.8 million lines and 500,000 commits, Shopify chose to evolve into a modular monolith — keeping all code in one codebase while enforcing domain boundaries with Packwerk, their open-source Ruby dependency enforcement tool that flags cross-boundary violations at CI time.

The scale this architecture supports today:

  • $5 billion in Gross Merchandise Volume on a single Black Friday
  • 284 million requests per minute at peak
  • 45 million database queries per second

GitHub — Monolith Architecture at Developer Infrastructure Scale

GitHub has maintained a Ruby on Rails monolith as its core application since inception, serving tens of millions of daily active developers globally. Specialized services — Git storage, Actions infrastructure, notifications — are extracted where the evidence justifies independent scaling. The core product itself remains a well-structured monolith.


Stack Overflow & Basecamp — The Boring Tech Benchmark

Stack Overflow’s infrastructure is the canonical example of the boring tech movement: one of the highest-traffic developer platforms running on a small cluster of well-tuned servers with a structured monolithic core. No service mesh, no distributed tracing infrastructure — just excellent code and good engineering discipline.

Basecamp has argued publicly for the Majestic Monolith for nearly two decades and runs one of the most profitable SaaS products per engineer in the industry.

None of these companies failed to understand distributed systems. They understood them well enough to know when not to use them.


Decision Framework: Which Architecture Fits Your Team

Team Size Thresholds

TEAM < 10 ENGINEERS
  → Modular monolith. No debate.
  → Microservices coordination overhead dominates every sprint.
  → Strong internal module boundaries. One deployment unit.

TEAM 10–30 ENGINEERS, SINGLE PRODUCT
  → Modular monolith with a deliberate extraction roadmap.
  → Identify which 1–2 modules are extraction candidates and why.
  → Do not extract until the four evidence criteria are met.

TEAM 30–100 ENGINEERS, MULTIPLE PRODUCT AREAS
  → Evaluate per area: which teams have genuinely different release cadences?
  → Apply Conway's Law analysis: where are the real deployment bottlenecks?
  → Extract those specific modules. Keep the rest modular within a monolith.

TEAM 100+ ENGINEERS, MULTIPLE INDEPENDENT TEAMS
  → Microservices make organizational sense for genuine team independence.
  → But: still enforce strong module boundaries inside each service.
  → At this scale, microservices solve a real coordination problem.

OVERRIDE SIGNALS (apply regardless of team size):
  → A module needs 10x+ more compute than the rest of the system TODAY.
  → Compliance mandates infrastructure-level isolation.
  → A team genuinely needs zero-coordination deployment independence.
  → Any of the above → targeted extraction of that specific module only.
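
The team-size ladder above can be sketched as a lookup. The cutoffs mirror the framework text; real decisions weigh the override signals on top of this:

```typescript
// Sketch of the team-size defaults above. Cutoff values come straight
// from the framework; they are guidance, not hard rules.
type ArchitectureDefault =
  | "modular-monolith"
  | "modular-monolith-with-extraction-roadmap"
  | "hybrid-evaluate-per-area"
  | "microservices-with-internal-modularity";

function defaultArchitecture(engineers: number): ArchitectureDefault {
  if (engineers < 10) return "modular-monolith";
  if (engineers <= 30) return "modular-monolith-with-extraction-roadmap";
  if (engineers <= 100) return "hybrid-evaluate-per-area";
  return "microservices-with-internal-modularity";
}
```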

The Honest Questions to Ask Before Distributing

Before introducing any new network boundary, answer these four questions:

Question                                                    | If Yes             | If No
----------------------------------------------------------- | ------------------ | -----
Does this module see fundamentally different traffic today?  | Consider extraction | Keep as internal module
Does this team have a truly different release cadence?       | Consider extraction | Coordinate at the monolith level
Does compliance require infrastructure isolation?            | Must extract        | Keep as internal module
Has the module's API been stable for 6+ months?              | Safe to extract     | Wait for stability

Outside the compliance override (which mandates extraction on its own), the answers should collectively point toward extraction before you proceed. A single "yes" rarely justifies the full operational overhead of a new network boundary; that overhead has to be paid for by the specific constraint it solves.
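
One way the table and the override signals might be encoded as a function — a sketch with illustrative field names, not a substitute for judgment:

```typescript
// Sketch of the extraction decision. Field names are illustrative.
interface ExtractionSignals {
  trafficMismatchToday: boolean;      // 10x+ compute need, now (not "someday")
  independentReleaseCadence: boolean; // genuinely different risk/cadence
  complianceIsolationRequired: boolean;
  apiStableSixMonths: boolean;
}

type Verdict = "extract" | "keep-as-module" | "wait-for-stability";

function evaluateExtraction(s: ExtractionSignals): Verdict {
  // Compliance is a hard requirement: isolate regardless of other answers.
  if (s.complianceIsolationRequired) return "extract";
  // A real scaling or cadence driver still needs a stable API first,
  // otherwise versioning overhead stacks on top of design churn.
  if (s.trafficMismatchToday || s.independentReleaseCadence) {
    return s.apiStableSixMonths ? "extract" : "wait-for-stability";
  }
  // No current driver: the module stays inside the monolith.
  return "keep-as-module";
}
```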

Serverless vs Microservices

Serverless vs microservices is a separate decision that often gets conflated. Serverless functions make sense for:

  • Event-driven, intermittent workloads (webhooks, async processing, scheduled jobs)
  • Workloads with genuinely unpredictable bursty traffic
  • Isolated tasks where cold start latency is acceptable

They make the distribution problem worse, not better, for high-throughput synchronous paths: exactly the scenario that drove Amazon Prime Video's 90% cost reduction once Step Functions orchestration and S3-based data transfer were replaced with in-process communication. The cold start, distributed tracing, and state management overhead of serverless compounds every microservices anti-pattern.

Related on Wishyor: DORA Metrics Explained: What They Measure and Why They Matter


Further Reading

External References

Source                                                   | Why It Matters
-------------------------------------------------------- | --------------
Martin Fowler — MonolithFirst                             | The canonical case for starting with a monolith; cites real project failure data
Martin Fowler — Microservices Guide                       | Original definition, bounded contexts, and organizational tradeoffs
Martin Fowler — Breaking a Monolith into Microservices    | When and how to extract services correctly; sequencing matters
Prime Video — 90% Cost Reduction Case Study               | Amazon's own documentation of microservices consolidation
Shopify Engineering — Deconstructing the Monolith         | Modular monolith at $5B Black Friday scale
CNCF — State of Cloud Native Development Q3 2025          | Industry data on microservices and service mesh adoption trends
CNCF — 2025 Annual Cloud Native Survey                    | Kubernetes at 82% production adoption; complexity remains a top challenge
Stack Exchange — Performance                              | The boring tech benchmark in concrete numbers
DHH — The Majestic Monolith                               | Basecamp's original argument for monolithic simplicity

Architecture decisions compound — choose the right defaults early. Module boundaries are cheap; network boundaries are expensive. Distribute when real evidence demands it: scaling mismatches, compliance requirements, or genuine team independence needs. The companies winning in 2026 are not the ones with the most services — they are the ones whose engineers can understand, operate, and ship without the architecture being the bottleneck.


Ready to audit your architecture? Work with our team to identify where microservices overhead is slowing you down — and build the modular boundaries that let you scale without chaos at Wishyor.