- Mar 22, 2024
Evolutionary Architecture: Designing Systems That Can Change
Traditional architecture assumes you can predict the future. You can’t. Evolutionary architecture embraces change — building systems that evolve incrementally through strangler patterns, resilient distributed designs, and clear service boundaries, without the risk and cost of full rewrites.
The Problem: Architecture That Cannot Scale
Legacy System Modernization Challenges
Your system is showing growing pains.
Symptoms:
- ❌ Monolith scaling limitations make it hard to grow
- ❌ Breaking changes in production happen constantly
- ❌ Deployment risk increases with each change
- ❌ Architecture that cannot scale beyond current size
- ❌ Feature development slows due to coupled code
- ❌ Team velocity decreases month over month
- ❌ Can’t deploy frequently without breaking things
Why Traditional Architecture Fails
Old Approach:
- Predict everything upfront
- Build complete architecture on day 1
- Hope requirements don’t change
- Rewrite when they do (disaster)
Reality:
- Requirements ALWAYS change
- Predictions are often wrong
- Rewrites are expensive and risky
- Technical debt compounds daily
Evolutionary Architecture Solution:
- Build something good today
- Design for change
- Refactor continuously
- Adapt as business evolves
Evolutionary Architecture Overview
Designing Systems for Continuous Change
Evolutionary architecture means your system can evolve without complete rewrites.
Key Principles:
- Modular Design - Independent, replaceable parts
- Clear Boundaries - Well-defined interfaces
- Replaceability - Swap components easily
- Minimal Coupling - Services depend only on each other's published interfaces, never on internals
- Continuous Refactoring - Safe, ongoing improvements
- Architectural Fitness - Test that architecture stays correct
- Incremental Change - Small steps, not big rewrites
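The "Architectural Fitness" principle above can be made concrete with a small automated check. What follows is a minimal sketch, not a definitive implementation: the layer names, the `forbiddenImports` rule set, and the hand-written dependency map are all assumptions for illustration. In a real project the dependency map would come from a dependency-analysis tool.

```javascript
// Hypothetical rule set: a layer may not import from the layers listed for it.
const forbiddenImports = {
  domain: ['infrastructure', 'controllers'],
  services: ['controllers'],
};

// Returns the layer name from a module path such as "domain/order".
function layerOf(modulePath) {
  return modulePath.split('/')[0];
}

// dependencyMap: { "domain/order": ["domain/money", ...], ... }
function findBoundaryViolations(dependencyMap, rules = forbiddenImports) {
  const violations = [];
  for (const [moduleName, imports] of Object.entries(dependencyMap)) {
    const banned = rules[layerOf(moduleName)] || [];
    for (const imported of imports) {
      if (banned.includes(layerOf(imported))) {
        violations.push(`${moduleName} must not import ${imported}`);
      }
    }
  }
  return violations;
}

// Example input, made up for illustration.
const exampleDependencies = {
  'domain/order': ['domain/money'],
  'domain/invoice': ['infrastructure/sqlClient'], // breaks the domain rule
  'services/checkout': ['domain/order'],
};

const violations = findBoundaryViolations(exampleDependencies);
console.log(violations);
```

Run in CI, a check like this fails the build as soon as a boundary is crossed, which is what keeps the architecture "correct" as the code changes.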
Architecture for High Availability Systems
Building systems that stay up requires:
Redundancy:
- Multiple instances
- No single points of failure
- Automatic failover
Resilience:
- Circuit breakers (fail safely)
- Retry logic (handle transient failures)
- Timeouts (prevent cascading failures)
- Fallbacks (graceful degradation)
Observability:
- Comprehensive monitoring
- Structured logging
- Distributed tracing
- Alerting on business metrics
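To illustrate the redundancy items above, here is a minimal failover sketch. It assumes each instance is a hypothetical object with a `name` and an async `handle(request)` method; real systems usually delegate this to a load balancer or service mesh rather than application code.

```javascript
// Tries each redundant instance in order and returns the first successful
// response. Instance shape ({ name, handle }) is an assumption for this sketch.
async function callWithFailover(instances, request) {
  const errors = [];
  for (const instance of instances) {
    try {
      return await instance.handle(request);
    } catch (error) {
      errors.push(`${instance.name}: ${error.message}`);
    }
  }
  throw new Error(`All instances failed (${errors.join('; ')})`);
}

// Stand-in instances for illustration: the first one is down.
const exampleInstances = [
  { name: 'orders-a', handle: async () => { throw new Error('connection refused'); } },
  { name: 'orders-b', handle: async (req) => ({ servedBy: 'orders-b', id: req.id }) },
];
```

Calling `callWithFailover(exampleInstances, { id: 42 })` skips the failed instance and resolves with the response from `orders-b`.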
Core Design Principles
1. Software Architecture Scalability Strategies
Horizontal vs Vertical Scaling:
// Vertical Scaling (Limited)
// Bigger server = temporary fix, capped by the largest machine available
const server = {
  cpu: '32 cores',
  memory: '256 GB',
  limit: 'Still maxes out'
};
// Horizontal Scaling
// More servers behind a load balancer; capacity grows by adding nodes,
// as long as shared state (database, sessions) can scale with them
const loadBalancer = {
  servers: ['server1', 'server2', 'server3', '...'],
  scale: 'Add more servers anytime'
};
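As a rough illustration of how a load balancer spreads requests across horizontally scaled servers, here is a minimal round-robin sketch. The function name `createRoundRobin` is an assumption; production load balancers also track health and load, which this sketch ignores.

```javascript
// Returns a function that hands out servers in rotation.
function createRoundRobin(servers) {
  if (servers.length === 0) throw new Error('At least one server is required');
  let next = 0;
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length;
    return server;
  };
}

const pickServer = createRoundRobin(['server1', 'server2', 'server3']);
```

Each call to `pickServer()` returns the next server in the list and wraps around after the last one.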
Database Scaling:
- Vertical: Bigger database (limited)
- Horizontal: Sharding by customer, region, or data
- Caching: Redis/Memcached reduces load
- Read replicas: Distribute read traffic
- Event sourcing: Store an append-only log of events instead of mutating state in place
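The sharding and read-replica items above can be sketched as a small routing function. This is an illustrative sketch only: the shard names and the `-primary` / `-replica` suffixes are hypothetical, and the hash is a 32-bit FNV-1a applied to character codes, chosen simply because it is stable and short.

```javascript
// Deterministic 32-bit FNV-1a hash over the string's character codes.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Hypothetical shard names.
const shards = ['orders-db-0', 'orders-db-1', 'orders-db-2'];

// The same customer always maps to the same shard.
function shardFor(customerId, shardNames = shards) {
  return shardNames[fnv1a(String(customerId)) % shardNames.length];
}

// Reads go to a replica, writes go to the primary of the chosen shard.
function connectionFor(customerId, operation) {
  const shard = shardFor(customerId);
  return operation === 'read' ? `${shard}-replica` : `${shard}-primary`;
}
```

One design caveat: plain modulo sharding remaps most keys when the number of shards changes, which is why schemes such as consistent hashing are often used when shards are added over time.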
2. Service Decomposition Strategies
How to break apart monoliths:
Step 1: Identify Bounded Contexts
- Domain-Driven Design analysis
- Where does language change?
- What teams work on what?
Step 2: Extract the First Context
- Start with the least-coupled context
- Build an anti-corruption layer between the old and new models
- Run old and new side by side (parallel run)
Step 3: Repeat
- Extract the next context (for example, one per month)
- Modularize inside the monolith first, then move the module out as a service
- Remove old code as confidence grows
Timeline (illustrative; percentages are the share of traffic served by extracted services):
Month 1: Extract Orders → 10% traffic
Month 2: Extract Payments → 25% traffic
Month 3: Extract Inventory → 50% traffic
Month 4-5: Extract remaining → 100% traffic
Month 6: Delete monolith
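The anti-corruption layer mentioned in Step 2 is easiest to see in code. Below is a minimal sketch, assuming a hypothetical legacy row with abbreviated column names (`ORD_NO`, `CUST_NO`, `STAT_CD`, `TOT_CENTS`) and single-letter status codes; none of these names come from a real schema.

```javascript
// Hypothetical mapping from legacy status codes to the new domain vocabulary.
const LEGACY_STATUS = { P: 'PENDING', S: 'SHIPPED', C: 'CANCELLED' };

// Translates a legacy row into the shape the new Orders service works with,
// so legacy naming and encoding never leak into the new domain model.
function toDomainOrder(legacyRow) {
  const status = LEGACY_STATUS[legacyRow.STAT_CD];
  if (!status) {
    throw new Error(`Unknown legacy status: ${legacyRow.STAT_CD}`);
  }
  return {
    id: String(legacyRow.ORD_NO),
    customerId: String(legacyRow.CUST_NO),
    status,
    totalCents: legacyRow.TOT_CENTS,
  };
}
```

Because all translation lives in one function, the legacy schema can later be retired by deleting the adapter rather than hunting for legacy field names across the new service.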
3. Data Consistency in Microservices
Problem: Each service owns its data (no shared database), so a single database transaction cannot span services.
Solution 1: Event-Driven Microservices with Kafka
// Order Service publishes event
class OrderService {
  async createOrder(order) {
    const savedOrder = await this.repository.save(order);
    // Publish event
    // Note: save + publish is not atomic. If publish fails, the order exists
    // but no event was sent; a transactional outbox is one common remedy.
    await this.eventBus.publish('OrderCreated', {
      orderId: savedOrder.id,
      customerId: order.customerId,
      items: order.items
    });
    return savedOrder;
  }
}
// Inventory Service subscribes
class InventoryService {
  async onOrderCreated(event) {
    // Reserve stock based on event
    const reservation = new Reservation(
      event.orderId,
      event.items
    );
    await this.repository.save(reservation);
  }
}
// Shipping Service subscribes independently
class ShippingService {
  async onOrderCreated(event) {
    // Create shipment plan
    const plan = new ShippingPlan(event.orderId, event.items);
    await this.repository.save(plan);
  }
}
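The `eventBus` used by these services is left abstract. As a minimal sketch of the publish/subscribe contract it implies, here is an in-memory stand-in. The class name `InMemoryEventBus` is an assumption, and it is not a substitute for Kafka: a real broker adds durability and typically at-least-once delivery, so handlers should be idempotent.

```javascript
class InMemoryEventBus {
  constructor() {
    this.handlers = new Map(); // event name -> array of handler functions
  }

  subscribe(eventName, handler) {
    const list = this.handlers.get(eventName) || [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }

  // Delivers the payload to every subscriber. One failing subscriber does not
  // block the others. Resolves with the number of handlers that failed.
  async publish(eventName, payload) {
    const list = this.handlers.get(eventName) || [];
    const results = await Promise.allSettled(
      list.map((handler) => Promise.resolve().then(() => handler(payload)))
    );
    return results.filter((result) => result.status === 'rejected').length;
  }
}

// Example wiring: two independent subscribers to the same event.
const bus = new InMemoryEventBus();
const received = [];
bus.subscribe('OrderCreated', async (event) => received.push(`inventory:${event.orderId}`));
bus.subscribe('OrderCreated', async (event) => received.push(`shipping:${event.orderId}`));
```

Publishing `OrderCreated` once reaches both subscribers without the publisher knowing either of them exists, which is the loose coupling the event-driven approach is after.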
Solution 2: Saga Pattern (Distributed Transactions), covered in "Implementing Saga Pattern Step by Step" later in this article
How to Build Resilient Distributed Systems
Architecture for High Availability
1. Distributed System Failure Handling
// Circuit Breaker Pattern
class CircuitBreaker {
  constructor(service, threshold = 5, resetTimeoutMs = 30000) {
    this.service = service;
    this.threshold = threshold;
    this.resetTimeoutMs = resetTimeoutMs; // How long to stay OPEN
    this.failureCount = 0;
    this.state = 'CLOSED'; // Normal operation
  }
  async call(request) {
    if (this.state === 'OPEN') {
      // Too many failures: fail fast instead of calling the service
      throw new Error('Circuit breaker is OPEN');
    }
    try {
      const response = await this.service.call(request);
      this.failureCount = 0; // Reset on success
      this.state = 'CLOSED';
      return response;
    } catch (error) {
      this.failureCount++;
      // A failed trial call in HALF_OPEN reopens the circuit immediately
      if (this.state === 'HALF_OPEN' || this.failureCount >= this.threshold) {
        this.state = 'OPEN'; // Stop making calls
        this.scheduleHalfOpen();
      }
      throw error;
    }
  }
  scheduleHalfOpen() {
    setTimeout(() => {
      this.state = 'HALF_OPEN'; // Allow a trial call
      this.failureCount = 0;
    }, this.resetTimeoutMs); // Default: wait 30 seconds
  }
}
// Usage
const breaker = new CircuitBreaker(paymentService);
async function processPayment(order) {
  try {
    return await breaker.call({ orderId: order.id });
  } catch (error) {
    // Payment service is down
    // Use fallback: queue for later or notify customer
    return handlePaymentLater(order);
  }
}
2. Retry with Exponential Backoff
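// Helper: resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// maxRetries is the total number of attempts, including the first one
async function retryWithBackoff(fn, maxRetries = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Don't retry on permanent errors
      if (error.isPermanent) throw error;
      // No point waiting after the final attempt
      if (attempt === maxRetries - 1) break;
      // Wait before the next attempt: 1s, 2s, 4s, 8s...
      const delayMs = Math.pow(2, attempt) * 1000;
      await sleep(delayMs);
    }
  }
  throw lastError;
}
// Usage
async function fetchUserData(userId) {
  return retryWithBackoff(() => userService.getUser(userId));
}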
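When many clients retry on the same schedule, their retries can arrive in synchronized waves. A common refinement is to randomize the delay. The sketch below shows one such variant, often called "full jitter"; the function name, the `baseMs` and `capMs` defaults, and the injectable `random` parameter are assumptions for illustration.

```javascript
// Returns a delay in milliseconds for the given zero-based attempt:
// a random value between 0 and min(capMs, baseMs * 2^attempt).
function jitteredDelayMs(attempt, options = {}) {
  const { baseMs = 1000, capMs = 30000, random = Math.random } = options;
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Passing a fixed `random` function makes the behavior easy to check in tests, while the default `Math.random` spreads real retries out over time. The cap keeps late attempts from waiting unreasonably long.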
3. Timeout and Fallback
async function callWithTimeout(fn, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('Timeout')), timeoutMs);
  });
  try {
    // Whichever settles first wins; fn() itself is not cancelled
    return await Promise.race([fn(), timeout]);
  } finally {
    clearTimeout(timer); // Don't leave the timer running after fn() settles
  }
}
async function getUserWithFallback(userId) {
  try {
    return await callWithTimeout(
      () => userService.getUser(userId),
      5000 // 5 second timeout
    );
  } catch (error) {
    // Service is slow or down
    // Use cached data or defaults
    return getCachedUser(userId) || getDefaultUser(userId);
  }
}
Strangler Pattern Migration Example
How to Evolve Legacy Systems Safely
The strangler fig pattern replaces a legacy system piece by piece while the old code keeps serving traffic.
// Phase 1: Identify boundary (where to intercept)
// Old monolith handles all requests
app.get('/orders/:id', (req, res) => {
  return legacyMonolith.handleOrderRequest(req, res);
});
// Phase 2: Create new service alongside
// Build new order service (DDD structured)
class NewOrderService {
  async getOrder(orderId) {
    // Clean implementation
  }
}
const newOrderService = new NewOrderService();
// Phase 3: Route percentage of traffic to new service
// (replaces the Phase 1 handler; hashUserId and migrationPercentage are
// assumed to be defined elsewhere, e.g. a stable hash and a config value)
app.get('/orders/:id', async (req, res) => {
  const userId = req.user.id;
  const shouldUseNew = (hashUserId(userId) % 100) < migrationPercentage;
  if (shouldUseNew) {
    try {
      const order = await newOrderService.getOrder(req.params.id);
      return res.json(order);
    } catch (error) {
      // Fall back to old if new fails
      console.error('New service failed, using legacy', error);
      return legacyMonolith.handleOrderRequest(req, res);
    }
  }
  return legacyMonolith.handleOrderRequest(req, res);
});
// Timeline:
// Week 1: 5% traffic → new service
// Week 2: 25% traffic
// Week 3: 50% traffic
// Week 4: 100% traffic
// Week 5: Delete old code
Why Strangler Works:
- ✅ Both versions run simultaneously
- ✅ Easy rollback (just change percentage)
- ✅ Verify new implementation before full cutover
- ✅ Zero downtime migration
- ✅ Can abort easily if issues appear
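The percentage-based routing relies on a hash that always puts the same user in the same bucket, so a user does not flip between old and new implementations from one request to the next. Here is a minimal sketch of such a function; the simple multiply-by-31 hash is an assumption chosen for brevity, and any stable hash would work.

```javascript
// Maps a user id to a stable unsigned 32-bit number.
function hashUserId(userId) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// True when the user's bucket (0-99) falls below the rollout percentage.
function shouldUseNewService(userId, migrationPercentage) {
  return hashUserId(userId) % 100 < migrationPercentage;
}
```

A useful property of this scheme is that raising the percentage only adds users to the new service: anyone routed to it at 25% is still routed to it at 50%, and setting the percentage back to 0 is the rollback.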
Microservices Communication Patterns
Designing API Contracts for Microservices
Contract-First Development:
// Define contract first (simplified sketch; in practice an OpenAPI/Swagger document)
const orderServiceContract = {
  createOrder: {
    request: {
      customerId: 'string',
      items: [{
        sku: 'string',
        quantity: 'number',
        price: 'Money'
      }]
    },
    response: {
      orderId: 'string',
      status: 'PENDING',
      createdAt: 'ISO-8601'
    },
    errors: {
      400: 'Invalid request',
      422: 'Business rule violation'
    }
  }
};
// Service implements contract
class OrderService {
  async createOrder(request) {
    // Must match contract
    if (!request.customerId || !Array.isArray(request.items)) {
      throw new ValidationError(); // 400: Invalid request
    }
    if (request.items.length === 0) {
      throw new BusinessRuleError(); // 422: Business rule violation
    }
    const order = new Order(request.customerId, request.items);
    await this.repository.save(order);
    return {
      orderId: order.id,
      status: 'PENDING',
      createdAt: order.createdAt.toISOString()
    };
  }
}
// Client uses contract
class CheckoutClient {
  async placeOrder(customerId, items) {
    const response = await this.orderService.createOrder({ customerId, items });
    // Trust contract - response has orderId, status, createdAt
    this.trackOrderCreation(response.orderId);
    return response;
  }
}
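"Trusting the contract" works only if something verifies it. A contract test does that by checking real responses against the agreed shape. The sketch below is a simplified, hand-rolled check, assuming a hypothetical contract expressed as field-level predicates; dedicated contract-testing tools or an OpenAPI validator would normally do this job.

```javascript
// Hypothetical, simplified response contract: field name -> validity check.
const createOrderResponseContract = {
  orderId: (value) => typeof value === 'string' && value.length > 0,
  status: (value) => value === 'PENDING',
  createdAt: (value) => typeof value === 'string' && !Number.isNaN(Date.parse(value)),
};

// Returns the names of the fields that break the contract (empty if none).
function contractViolations(contract, response) {
  return Object.entries(contract)
    .filter(([field, isValid]) => !isValid(response[field]))
    .map(([field]) => field);
}
```

Both the provider's test suite and the consumer's test suite can run the same check, so a breaking change to the response shape is caught before it reaches production.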
API Gateway vs Backend for Frontend
API Gateway Pattern:
- Single entry point
- Centralized authentication
- Rate limiting
- Request routing
- Service discovery
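Two of the gateway responsibilities listed above, request routing and rate limiting, can be sketched in a few lines. This is an illustrative sketch only: the route table, the `.internal` hostnames, and the fixed-window limiter are assumptions, and real gateways offer far more (authentication, retries, service discovery).

```javascript
// Hypothetical route table: path prefix -> internal service base URL.
const routes = [
  { prefix: '/orders', target: 'http://orders.internal' },
  { prefix: '/payments', target: 'http://payments.internal' },
];

// Returns the upstream URL for a path, or null when no route matches.
function resolveRoute(path, table = routes) {
  const route = table.find(
    (r) => path === r.prefix || path.startsWith(r.prefix + '/')
  );
  return route ? route.target + path : null;
}

// Allows at most `limit` requests per client in each fixed time window.
class FixedWindowRateLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, mainly for tests
    this.counts = new Map(); // clientId -> { windowStart, count }
  }

  allow(clientId) {
    const windowStart = Math.floor(this.now() / this.windowMs);
    const entry = this.counts.get(clientId);
    if (!entry || entry.windowStart !== windowStart) {
      this.counts.set(clientId, { windowStart, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count++;
      return true;
    }
    return false;
  }
}
```

A gateway handler would call `allow(clientId)` first and reject the request when it returns false, then forward to the URL from `resolveRoute`.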
Backend for Frontend (BFF) Pattern:
- Each UI has its own backend
- Optimized for specific client
- Different data shapes
- Mobile vs Web vs Desktop
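// BFF for Mobile
// (orderService is an assumed downstream client injected into each BFF)
class MobileOrderAPI {
  constructor(orderService) {
    this.orderService = orderService;
  }
  async getOrder(orderId) {
    const order = await this.orderService.getOrder(orderId);
    // Mobile needs minimal data
    return {
      orderId: order.id,
      total: order.total,
      status: order.status
    };
  }
}
// BFF for Web Admin
class AdminOrderAPI {
  constructor(orderService) {
    this.orderService = orderService;
  }
  async getOrder(orderId) {
    const order = await this.orderService.getOrder(orderId);
    // Admin needs detailed data
    return {
      orderId: order.id,
      customerId: order.customerId,
      items: order.items,
      payments: order.payments,
      shipments: order.shipments,
      notes: order.notes
    };
  }
}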
Implementing Saga Pattern Step by Step
Distributed Transactions Without Locks
Problem: Order → Reserve Inventory → Process Payment (need all-or-nothing)
Solution: Saga Pattern
// Choreography Saga (Event-Driven)
class OrderSaga {
  async executeCreateOrderSaga(order) {
    // Step 1: Create order
    const createdOrder = await this.orderService.create(order);
    // Publish event - triggers other services
    await this.eventBus.publish('OrderCreated', {
      orderId: createdOrder.id,
      items: order.items
    });
  }
}
// Inventory Service subscribes
class InventoryService {
  async onOrderCreated(event) {
    try {
      await this.reserve(event.items);
      // Success - publish event
      await this.eventBus.publish('InventoryReserved', {
        orderId: event.orderId,
        items: event.items
      });
    } catch (error) {
      // Failure - publish compensation event
      await this.eventBus.publish('OrderFailed', {
        orderId: event.orderId,
        reason: 'InventoryUnavailable'
      });
    }
  }
}
// Payment Service subscribes
class PaymentService {
  async onInventoryReserved(event) {
    try {
      const payment = await this.charge(event.orderId);
      // Success - order complete
      await this.eventBus.publish('OrderConfirmed', {
        orderId: event.orderId,
        paymentId: payment.id
      });
    } catch (error) {
      // Failure - need to undo inventory
      await this.eventBus.publish('OrderFailed', {
        orderId: event.orderId,
        reason: 'PaymentFailed'
      });
    }
  }
}
// Compensation (undo on failure)
class CompensationService {
  async onOrderFailed(event) {
    if (event.reason === 'PaymentFailed') {
      // Release reserved inventory
      await this.inventoryService.release(event.orderId);
    }
    // InventoryUnavailable: nothing was reserved, so nothing to release.
    // In both cases the order itself must not stay pending
    // (orderService.cancel is an assumed method, not shown above).
    await this.orderService.cancel(event.orderId);
  }
}
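The choreography above spreads the saga across event handlers. The other common style is orchestration, where one coordinator runs the steps and triggers compensations itself. The following is a minimal sketch of that alternative, not the approach the code above takes: `runSaga`, `buildOrderSaga`, and the step shape (`name`, `action`, `compensate`) are assumptions, and the steps only write to a log instead of calling real services.

```javascript
// Runs steps in order. If one fails, compensates the completed steps in
// reverse order and reports which step failed.
async function runSaga(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action(context);
      completed.push(step);
    } catch (error) {
      for (const done of [...completed].reverse()) {
        await done.compensate(context);
      }
      return { status: 'FAILED', failedStep: step.name, reason: error.message };
    }
  }
  return { status: 'CONFIRMED' };
}

// Builds the Order -> Inventory -> Payment saga with stand-in actions.
function buildOrderSaga(log, { paymentSucceeds }) {
  return [
    {
      name: 'createOrder',
      action: async () => { log.push('order created'); },
      compensate: async () => { log.push('order cancelled'); },
    },
    {
      name: 'reserveInventory',
      action: async () => { log.push('inventory reserved'); },
      compensate: async () => { log.push('inventory released'); },
    },
    {
      name: 'chargePayment',
      action: async () => {
        if (!paymentSucceeds) throw new Error('card declined');
        log.push('payment charged');
      },
      compensate: async () => { log.push('payment refunded'); },
    },
  ];
}
```

With a failing payment, the log ends with the inventory being released and then the order being cancelled, which is the "all-or-nothing" outcome the problem statement asks for. This sketch does not handle a compensation that itself fails; real implementations persist saga state and retry.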
Hexagonal vs Clean vs Onion Architecture
Architecture Comparison: Which Should You Use?
Hexagonal Architecture (Ports & Adapters)
Focus: Isolate domain from external systems
External Systems
↕️
(Adapters)
↕️
┌──────────┐
│ Domain │ (Core business logic)
│ Core │
└──────────┘
↕️
(Adapters)
↕️
External Systems
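In code, the diagram above comes down to the domain depending on a "port" (an interface it defines) while adapters implement that port for a specific technology. Here is a minimal sketch, assuming a hypothetical `PlaceOrder` use case and a repository port with `save` and `findById`; the in-memory adapter stands in for a real database adapter.

```javascript
// Domain core: knows only the port it is given, not the storage technology.
class PlaceOrder {
  constructor(orderRepository) {
    this.orderRepository = orderRepository; // port: { save, findById }
  }

  async execute(customerId, items) {
    if (!Array.isArray(items) || items.length === 0) {
      throw new Error('An order needs at least one item');
    }
    return this.orderRepository.save({ customerId, items, status: 'PENDING' });
  }
}

// Adapter: one implementation of the port. A SQL or HTTP adapter could
// replace it without touching PlaceOrder.
class InMemoryOrderRepository {
  constructor() {
    this.orders = new Map();
    this.nextId = 1;
  }

  async save(order) {
    const saved = { ...order, id: `order-${this.nextId++}` };
    this.orders.set(saved.id, saved);
    return saved;
  }

  async findById(id) {
    return this.orders.get(id) || null;
  }
}
```

Because `PlaceOrder` receives its repository from outside, the same business rule can be tested with the in-memory adapter and deployed with a database-backed one.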
Clean Architecture
Focus: Independence from frameworks and databases
Frameworks & Drivers
↓
Controllers / Presenters
↓
Use Cases / Business Rules
↓
Entities (Core Domain)
(Arrows show source-code dependencies, which always point inward.)
Onion Architecture
Focus: Dependencies point inward, toward the domain core (layers listed from outermost to innermost)
Infrastructure
Controllers
Services
Domain Models
Hexagonal Architecture vs Clean Architecture
| Aspect | Hexagonal | Clean |
|---|---|---|
| Focus | Isolation from externals | Framework independence |
| Layers | 3 (Domain, Ports, Adapters) | 4-5 layers |
| Complexity | Moderate | Higher |
| Learning Curve | Easier | Steeper |
| Best For | Microservices | Large enterprises |
Onion Architecture vs Layered Architecture
Layered (Traditional):
Presentation
↓
Business Logic
↓
Data Access
↓
Database
- ❌ Can become tightly coupled
- ❌ Hard to test core logic
- ❌ Database changes affect layers above
Onion (Better):
Domain (Core) ← depends on nothing
↑
Services ← depends on domain
↑
Controllers ← depends on services
↑
Infrastructure ← depends on everything else
- ✅ Core isolated
- ✅ Easy to test
- ✅ Framework/DB changes don’t affect domain
Real-World Case Studies & FAQ
Case Study 1: Evolutionary Architecture Real World Example
Company: Mid-size SaaS Platform
Before:
Monolith (10 years old)
├─ Orders
├─ Payments
├─ Inventory
├─ Shipping
├─ Analytics
└─ (1000s of files)
Problems:
- 6-month deployment cycles
- Any change breaks multiple parts
- Team velocity declining
- Can't hire/scale team
Evolutionary Transformation (18 months):
Phase 1: Strangler Pattern (Month 1-3)
- Extract Orders → Deploy alongside monolith
- Anti-corruption layer translates between systems
- 10% traffic to new Orders service
- Success → 50% → 100%
Phase 2: Extract Payments (Month 4-6)
- Payments become independent
- Saga pattern coordinates with Orders
- Event-driven communication
Phase 3: Extract Remaining Services (Month 7-12)
- Inventory, Shipping, Analytics
- Each follows strangler pattern
Phase 4: Final Cleanup (Month 13-18)
- Remove monolith dependencies
- Optimize communication patterns
- Decommission legacy code
After:
Microservices (Event-Driven)
├─ Orders Service
├─ Payments Service
├─ Inventory Service
├─ Shipping Service
├─ Analytics Service
└─ Message Queue (Kafka)
Results:
✅ Deployments: 2/day (up from one release every 6 months)
✅ Lead time: 2 hours (from weeks)
✅ Incidents: Down 50%
✅ Team size: 5x larger (clear ownership)
✅ Feature velocity: 3x faster
FAQ & Resources
Q: When should we start evolutionary architecture?
A: Now. It’s not something you “start” later. Build for evolution from day 1.
Q: Can we apply this to existing systems?
A: Absolutely. Use strangler pattern to extract services while old system runs.
Q: What about data consistency?
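A: Use event-driven messaging (for example, Kafka) for eventual consistency between services, and the saga pattern when a multi-service operation must either complete or be compensated as a whole.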
Q: How do we test evolving systems?
A: Contract testing, architectural fitness functions, integration tests.
External Learning Resources
Further reading on evolutionary architecture:
- Building Evolutionary Architectures - The definitive book
- ThoughtWorks Technology Radar - Architecture trends
- Strangler Fig Pattern - Fowler’s explanation
- Architectural Fitness Functions - Testing architecture
- Domain-Driven Design - Domain modeling
- Microservices Patterns - Distributed patterns
- Release It! - Production reliability
- Building Microservices - Microservices guide
Key Takeaways
- Design for evolution - requirements will change
- Modularity is foundational - independent, replaceable parts
- Clear boundaries - interfaces prevent chaos
- Strangler pattern - replace gradually, not all at once
- Event-driven - loose coupling via messages
- Saga pattern - distributed transactions safely
- Resilience matters - circuit breakers, retries, timeouts
- Architecture fitness - test that architecture stays correct
- Culture enables - safe to refactor when team trusts process
- Iterate and improve - continuous evolution
Design for evolution, not perfection. The best architecture is the one that can change safely when the business demands it. Start modular, extract when evidence demands it, and let architectural fitness functions guard the boundaries.