Migrating the Monolith II: Design • Jairus Joer

Recap and Alignment

In the previous article, we discussed how monolithic architectures can hinder innovation and scalability when the scope of teams and products exceeds their original boundaries. We then explored a conceptual approach to refactoring.

Migrating the Monolith I: Conceptualization

We identified several challenges, including tightly coupled APIs, diverging product requirements, and a loss of cohesion due to cross-domain entanglement. We also identified ways to address these challenges. This installment focuses on translating that vision into a practical design.

Mapping Domains

A successful migration starts with identifying clear, product-aligned domains within the existing monolith. Defining these boundaries allows teams to understand how components interact and where unhealthy dependencies have formed over time. Use tools and techniques, such as detailed dependency graphs and thorough static code analysis, to reveal cycles and identify volatile sections or risky “hot spots.” Additional runtime tracing may also reveal areas of latency and potential bottlenecks.

By cataloging these domains and assigning ownership to teams, organizations can direct their efforts toward detaching low-risk, self-contained components as the first candidates for migration. This initial groundwork exposes the most urgent risks in the system and builds a tangible, prioritized roadmap for iterative modularization. With clear domain boundaries, teams gain the confidence and autonomy to operate more independently, thereby accelerating technical and organizational transformation.

Technical Roadmap

A migration to modular architecture should never be attempted suddenly or all at once. Instead, develop a technical roadmap that introduces change in manageable increments. Use patterns such as the strangler fig, which introduces a façade and gradually routes specific traffic to newly extracted services, alongside feature flags and parallel runs for risky or high-impact extractions. This flexibility enables teams to build confidence in the new system incrementally and apply lessons learned as the migration progresses.

Initial efforts should focus on extracting smaller, low-impact features that quickly demonstrate value and build momentum. Over time, the successful completion of these migrations paves the way for tackling more critical milestones and unlocking broader business value. Robust planning requires aligning each step of the migration with existing CI/CD capabilities, observability standards, and recovery procedures. Before redirecting user traffic away from the monolith, teams must prepare cutover playbooks and establish effective health metrics and dashboards to ensure reliability and transparency throughout the process.

API Layer Redesign

As the migration progresses, the API layer must transition from serving as a direct pass-through for backend logic to functioning as a robust composition layer. In this orchestrator role, the API layer coordinates interactions among newly separated, independent services. Decoupling the API from internal logic is essential for managing accidental complexity, supporting scalability, and improving maintainability. To ensure ongoing reliability, strategies such as service discovery and circuit breakers must be embedded directly into the service design.

Consistent handling of partial failures and proper fallback mechanisms are equally vital so that no single point of failure can disrupt the entire interaction chain. Rigorous versioning practices formalized through OpenAPI specifications or protocol buffers ensure seamless communication across service boundaries. Automated contract testing further protects consumers from breaking changes. Strategically placing adapters and normalizers masks legacy quirks while enforcing clear, predictable interactions at every interface.

Product Isolation

Real modularity begins when each product or domain assumes full responsibility for its logic, database schema, and persistent storage, bringing an end to the “shared everything” era. To prevent accidental data coupling, which can undermine independence, shared write paths must gradually be eliminated. The outbox pattern and distributed sagas facilitate eventual consistency, enabling business processes to span domains without requiring risky distributed transactions.

To optimize for high performance and separation of concerns, read models can be tailored for specific consumers using Command Query Responsibility Segregation. This design divides the system into components that handle commands (writes) and queries (reads) separately. This allows each component to scale according to its needs. Product isolation improves maintainability and allows teams to evolve their domains independently, supporting different delivery and innovation speeds.

Shared Infrastructure

Despite the emphasis on modularity, not every concern should be broken apart. Some, such as authentication, error monitoring, logging, and core UI components, are best managed as centralized services. Centrally governing these cross-cutting concerns ensures consistency, increases reliability, and prevents unnecessary duplication of effort across teams.

Implementing robust governance frameworks, utilizing compatibility matrices, and maintaining strict version control within shared services reduces the risk of accidentally reintroducing unhealthy coupling. To accelerate developer onboarding and maintain high engineering standards, provide standardized project templates, reusable continuous integration pipelines, and a suite of linting and testing tools. Continuous end-to-end observability, supported by distributed tracing, incident SLOs, and transparent error budgets, is essential for effectively diagnosing problems and recovering from incidents.

Testing and Cutover

Testing must take center stage for every service extraction. Multi-tiered testing, including unit, integration, and comprehensive contract validation, is essential for ensuring that each new service is ready to safely take on live responsibilities. Before exposing actual user traffic to new implementations, confirm operational correctness under realistic load using realistic data sets and mirrored traffic.

Progressive rollout tactics provide an additional safeguard against large-scale failures by letting teams deploy changes to a small subset of users through canary releases and activate features with toggles for staged exposure. Automated rollback mechanisms enable teams to respond instantly if discrepancies are found. Regular rehearsals of failure scenarios and reconciliation exercises prepare teams for the realities of distributed system operations and strengthen organizational readiness at every stage.

Performance, Cost, and Risk Management

Migrating from a monolith to distributed services introduces new challenges, such as increased network latency, operational complexity, and infrastructure costs. Implementing local caching strategies and carefully planning the placement of closely related services can help minimize network round trips and address these issues. Bundling related requests and optimizing the serialization process can also reduce communication overhead.

Capacity planning is essential. Establish autoscaling rules to handle unpredictable loads without wasting resources, and implement load-shedding and backpressure mechanisms to prevent cascading service failures during spikes. During migration, pay special attention to temporary cost increases resulting from running the monolith and new services in parallel. Regularly updating the risk register and conducting ongoing monitoring and retrospectives that incorporate real-world metrics ensure the migration remains on track and enable teams to respond quickly to threats as they emerge.

Organizational Enablement

Migration is as much about empowering people and teams as it is about updating technology. Clearly assigning domain ownership gives teams responsibility and authority to innovate. Well-defined escalation paths reduce bottlenecks when issues arise. Synchronized release schedules coupled with planned integration and testing windows help ensure that all critical milestones are met without unintended disruption.

Comprehensive cross-team training broadens the shared understanding of best practices for distributed systems, and living documentation keeps knowledge accessible, preventing the formation of isolated information silos. When teams are actively involved in the migration process, they evolve technical and organizational practices in tandem, becoming agents of change for long-term success.

Roadmap & Next Steps

During the initial migration phase, focus on establishing clear boundaries, deploying the appropriate API gateways and façades, and rolling out distributed tracing frameworks to gain in-depth visibility into the system. Next, extract low-risk features to achieve quick wins, rigorously implement contract testing, and validate new services in production using shadow traffic routed by feature flags.

In the long term, increase the number of services, systematically retire parts of the monolith, and refine metrics that measure the organization’s software delivery performance and operational resilience. Tracking deployment frequency, change failure rates, and mean time to recovery will help sustain executive buy-in and create a cycle of continuous improvement.

In a follow-up article, we will continue our story with an in-depth look at the complexities of migration execution. You can expect practical advice on phased extraction, rollouts, and adapting architectural plans to unpredictable production realities.

TL;DR

Designing a modularized monolithic system begins with mapping domains and progresses through careful transition planning and staged cutover. This process requires robust engineering, cross-team collaboration, and resilient organizational processes.