Devlog #0: The Profiles Bounded Context RFC

This is the first post in a devlog documenting the journey of extracting a service from an existing monolith. Over the series I'll cover every step, from initial planning to final implementation: how we collaborated across teams, handled system dependencies, and safely refactored code, with an honest account of what worked, what didn't, and everything in between.

I believe sharing this experience openly can offer valuable insights to others facing similar challenges.

Whether you're a seasoned architect or a curious engineer, buckle up for a bumpy ride through the journey of monolith decomposition.

Setting the Stage

TransferGo is a digital money service that aims to simplify financial services for individuals. Founded in 2012, the organisation has grown significantly and now has over 100 people in engineering alone. Operating for more than a decade in a complex, highly regulated industry can make architectural changes challenging.

My main focus lately has been separating a new bounded context from a large, complex monolith: a system which, over the years, has collected a fair amount of legacy code. That may sound negative, but I certainly don't mean it that way. The terms “legacy” and “monolith” have a bad reputation these days, but legacy is only natural when a system has been successful for a long time, and a monolith is a perfectly viable system architecture.

Unraveling complexity is a formidable challenge. Making a significant architectural change to a mature system carries substantial risk and requires careful planning and execution. Successfully extracting a part of the system will demand attention to detail, a deep understanding of dependencies, and keeping everyone aligned.

The journey begins at the planning phase. While planning and documenting a major change may be less exciting to read about than hands-on coding, it's a crucial step — especially in organisations with complex systems and regulatory requirements. To give an authentic view of the project, I feel it's important to include these early stages. Skipping the project's origins would leave out an essential part of the story.

If you're eager for the technical implementation details — don't worry! I'll cover them throughout the series. First though, let's explore how a significant change (one with substantial risk and complexity) can be proposed to the wider company.

Why does TransferGo use RFCs?

We use Request for Comment (RFC) documents to plan and explain proposed system changes, outlining what we want to do and why we want to do it, before going ahead. This gives everyone a chance to review the plan and share their input.

Instead of requiring a design review meeting, RFC documents let team members review and provide feedback on their own schedule. This asynchronous approach makes it easier to get input from everyone, regardless of their location or time zone. As a remote and distributed company, this is critically important to how we operate.

Each team has a member on our Architecture Board who reviews proposals. They check for potential issues or identify dependencies that may require further discussion or planning. The Board provide guidance, but our decision making is transparent — with all documents being open to everyone. Anyone can contribute ideas to ensure we arrive at the best possible technical design.

I want to take a moment to touch on the topic of organisational memory. RFCs are a good way to improve collaboration and technical design, but also serve as a historical record of technical evolution. I can't emphasise enough how important this can be when onboarding a new team member, working in unfamiliar areas of the codebase, breaking down knowledge silos or trying to reduce the bus factor.

Reviewing previous RFCs helps us understand why certain parts of our system were designed the way they were, helping us make more informed decisions. They follow a standardised structure with each document including specific details — from quick summaries to explicit links to company OKRs. Having a structured format helps authors provide all required details for reviewers to assess the change, verify alignment with organisational goals, and determine if the project is worth the investment.

Overall, we find that RFCs significantly enhance communication, alignment, and technical design. By providing a clear and structured framework for proposing and discussing technical changes, they ensure everyone is aligned on a project's goals and potential impacts. RFCs have become a key part of how we collaborate to build and improve our systems effectively.

Now that you're more familiar with our process, let's examine the proposal at the heart of this devlog: the Profiles Bounded Context RFC.

The Profiles Bounded Context RFC

The Problem

Defining the problem provides the needed context for technical proposals. Reviewers need to understand the issues before assessing if changes make sense. Jumping to the solution, without first explaining the underlying problem, would make it difficult for reviewers to evaluate the proposal.

This section explained the problems caused by the tight coupling between the Profile and User components. It showed how the two concepts have become entangled, the complexity and lack of boundaries that resulted, and how difficult the code has become to maintain.

It also described the problems and risks of working in this area of the codebase. For example, even simple changes require extensive investigation and coordination with other teams. Since these components predate our RFC process, we lack documentation explaining the historical decisions. We often spend time trying to understand how and why the system works as it does, which slows down our ability to deliver new features and improvements.

The complexity was also mentioned, as it increases the potential for introducing bugs. Given the critical nature of these components, a bug in this area could lead to a significant incident.

Perhaps the most pressing challenge is that the current state of the system directly impacts our ability to deliver upcoming business initiatives. Addressing these issues has become a priority for our team to ensure we can meet our goals.

Beyond technical considerations, the RFC provided a business justification linked directly to our company OKRs. While technical improvements matter in their own right, it emphasised the importance of addressing these concerns before they hinder the development of future functionality.

The Solution

Once the problems have been clearly articulated, we can move on to the proposed solution.

We will address the coupling and ownership issues by moving Profile functionality into a separate Bounded Context, while keeping it within the monolith. Using a technique called Branch by Abstraction, we will perform a true refactoring and carefully migrate the code without changing any behaviour.

Why a Bounded Context?

Deciding to stay within the monolith, rather than create a new microservice, wasn't popular with everyone. However, I see it simply as a case of sequencing. Regardless of where we end up, we need to decouple the components and implement the new boundary. This is all possible without introducing a network boundary, additional infrastructure, a new code repository, deployment pipelines, monitoring systems, etc.

Moving to a microservice would mean taking on data ownership, which would require migrating data between the old and new system. This data migration would increase the complexity, scope and risk of the project.

However, we believe we can establish a new boundary without creating a microservice or migrating the data. By focusing purely on refactoring at this stage, we can make smaller, regular changes that decouple the components.

Once complete, we will have a separate Profile service with an established boundary between it and the rest of the monolith. With an abstraction for communication, we will be able to swap the implementation if we later decide to continue with extracting a microservice.

If extracting a microservice was truly the right architectural decision from the start, it will still be the right decision after we've established the bounded context.

It’s all about sequencing and avoiding premature decisions. Separating these tightly coupled components is already complex enough; there’s no need to complicate it further. Whether we eventually create a microservice or not, the real issue lies in addressing the volume of coupled code. We can always extract a microservice later if needed, but attempting to tackle everything at once would only add unnecessary complexity and delay delivering value.
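
To make the swapping point concrete, here's a minimal sketch of such a boundary abstraction. It's written in TypeScript purely for illustration, and the names (ProfileFacade, InProcessProfileFacade, HttpProfileFacade) are hypothetical rather than our actual code. The idea is simply that callers depend on the abstraction, so the in-process implementation could later be replaced by a remote one without touching them.

```typescript
// Hypothetical boundary abstraction: callers depend on this interface,
// never on the Profile context's internals.
interface Profile {
  id: string;
  displayName: string;
}

interface ProfileFacade {
  getProfile(profileId: string): Promise<Profile>;
}

// Today: the bounded context lives inside the monolith, so the
// implementation is a plain in-process call to the Profile module.
class InProcessProfileFacade implements ProfileFacade {
  async getProfile(profileId: string): Promise<Profile> {
    // ...delegate to the Profile context's own code and persistence...
    return { id: profileId, displayName: "example" };
  }
}

// Later, only if we decide to extract a microservice: the same
// abstraction satisfied by a remote call. Callers don't change.
class HttpProfileFacade implements ProfileFacade {
  constructor(private readonly baseUrl: string) {}

  async getProfile(profileId: string): Promise<Profile> {
    const response = await fetch(`${this.baseUrl}/profiles/${profileId}`);
    return (await response.json()) as Profile;
  }
}
```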

Why Branch by Abstraction?

The Strangler Fig pattern is a common approach where a new system is gradually built around an existing one: requests at the boundary are intercepted and routed to new functionality until the inner system is fully replaced. The problem is that we don't have a boundary to intercept. We have a clear idea of what we're aiming for, as proposed in the RFC, but that boundary doesn't exist yet, and parts of it will only be discovered as the project progresses.

Strangler Fig influenced our initial approach, but we moved towards Branch by Abstraction (BbA). BbA, one of the Trunk-Based Development techniques, is like having a long-lived feature branch inside the codebase itself rather than in version control. Strangler Fig and BbA share the same goal of improving a system over time, and neither is inherently better; it's about choosing what works best for your needs. The key difference lies in their approach: Strangler Fig typically wraps the old system from the outside in, while Branch by Abstraction transforms it from the inside out.

Our approach will follow these steps. First, we'll define an abstraction point and identify the behaviour we want inside the new boundary. After encapsulating the old behaviour, we'll refactor the codebase to flow through this abstraction point. Once we've verified no issues were introduced, we'll implement the replacement functionality and gradually enable the new flow while monitoring for problems. We'll repeat this process for every behaviour we want inside the boundary.
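
As a rough sketch of those steps (again illustrative TypeScript with invented names, not our real code), moving a single behaviour might look something like this:

```typescript
// Step 1: define an abstraction point for one behaviour we want
// inside the new boundary.
interface AddressUpdater {
  updateAddress(profileId: string, newAddress: string): Promise<void>;
}

// Step 2: encapsulate the old behaviour behind it, unchanged.
class LegacyAddressUpdater implements AddressUpdater {
  async updateAddress(profileId: string, newAddress: string): Promise<void> {
    // ...the existing, tightly coupled User/Profile code, moved here as-is...
  }
}

// Step 3: refactor callers so they flow through the abstraction
// instead of reaching into the coupled code directly.
class AccountController {
  constructor(private readonly addressUpdater: AddressUpdater) {}

  async changeAddress(profileId: string, newAddress: string): Promise<void> {
    await this.addressUpdater.updateAddress(profileId, newAddress);
  }
}

// Step 4: implement the replacement inside the Profile bounded context.
// It satisfies the same abstraction, so callers are untouched.
class ProfileContextAddressUpdater implements AddressUpdater {
  async updateAddress(profileId: string, newAddress: string): Promise<void> {
    // ...new, decoupled implementation owned by the Profile context...
  }
}
```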

Feature Flags will allow us to run the old and new flows side by side, switch between them instantly, and roll changes out to a small group of users first. This helps us manage the risk: if something goes wrong, we can quickly switch back to the old code without needing a new deployment.

Throughout this process, both old and new code will coexist, with the abstraction point determining which path to execute. This gradual (and reversible) approach allows us to maintain system stability while we establish the new boundary through small, manageable changes. BbA aims to provide the benefits of a feature branch - where you can work on changes separately - but without getting stuck with messy merges and integration problems that can happen when branches live too long.
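
Here's a hedged sketch of what that switching could look like at the abstraction point. The FeatureFlags interface and the flag name are hypothetical stand-ins for whatever flag provider is actually in place:

```typescript
// A minimal feature-flag lookup; in reality this would be whatever
// flag provider the monolith already uses.
interface FeatureFlags {
  isEnabled(flag: string, subjectId: string): boolean;
}

interface AddressUpdater {
  updateAddress(profileId: string, newAddress: string): Promise<void>;
}

// The abstraction point decides, per call, which path executes.
// Both implementations stay deployed, so rolling back is a flag change
// rather than a new deployment.
class TogglingAddressUpdater implements AddressUpdater {
  constructor(
    private readonly flags: FeatureFlags,
    private readonly legacyFlow: AddressUpdater,
    private readonly profileContextFlow: AddressUpdater,
  ) {}

  async updateAddress(profileId: string, newAddress: string): Promise<void> {
    // Enable the new flow for a small group first, then widen gradually.
    if (this.flags.isEnabled("profiles-bounded-context", profileId)) {
      await this.profileContextFlow.updateAddress(profileId, newAddress);
    } else {
      await this.legacyFlow.updateAddress(profileId, newAddress);
    }
  }
}
```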

Our goal will be to make changes gradually while keeping everything running smoothly.

I'll dig into how we'll implement Branch by Abstraction in detail in my next post.

Risks / Challenges

We will be dealing with complex and tightly coupled code. This means we're likely to run into some unexpected issues along the way. At least it could make for an entertaining devlog.

The code we'll be working with is among the oldest in our system and is deeply integrated with our core User functionality. Since it predates our documentation practices, the reasoning behind many of its decisions was never recorded. Before implementing changes, we'll first need to make sure we understand the existing system.

Regular communication with all teams will be essential. We'll need to provide frequent updates about progress and changes, share timelines, discuss potential impacts, and gather feedback to maintain smooth coordination across the organization.

Our system has an event-driven architecture, which will require careful attention. Profile updates trigger downstream events that affect multiple system components, so we must ensure these event flows remain intact and function properly both throughout and after the changes.

Even with the new bounded context in place, we'll face important technical considerations. The service calls will need to work within our existing transactions, requiring careful attention to data consistency and atomicity. These changes could impact multiple layers of our application stack, including the user interface.
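
As a hypothetical illustration of the transaction concern (the UnitOfWork and ProfileWriter types are invented for this sketch, not our real persistence layer): because the Profile context stays in-process, its writes can join the same database transaction as the surrounding User changes, which a remote call could not do.

```typescript
// Hypothetical stand-ins for the monolith's persistence layer.
interface Transaction {
  execute(sql: string, params: unknown[]): Promise<void>;
}

interface UnitOfWork {
  run<T>(work: (tx: Transaction) => Promise<T>): Promise<T>;
}

// The Profile boundary accepts the caller's transaction, so in-process
// calls stay atomic with the surrounding User changes.
interface ProfileWriter {
  rename(tx: Transaction, profileId: string, displayName: string): Promise<void>;
}

async function renameUserAndProfile(
  uow: UnitOfWork,
  profiles: ProfileWriter,
  userId: string,
  profileId: string,
  displayName: string,
): Promise<void> {
  await uow.run(async (tx) => {
    // The User update and the Profile update commit or roll back together.
    await tx.execute("UPDATE users SET display_name = ? WHERE id = ?", [displayName, userId]);
    await profiles.rename(tx, profileId, displayName);
  });
}
```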

Outcome

The RFC received the required approval for TransferGo to invest in the project. With this milestone achieved, we’re ready to move forward.

The next phase of the project is what we are calling "discovery-by-implementation". Rather than getting stuck in an endless loop of planning, we recognise the importance of staying flexible and adapting our approach as we learn. We have sufficient knowledge and understanding to begin the work, but accept that we can't have all the answers yet. Some challenges will only become apparent once we start implementing.

It's not about creating a flawless plan, or designing the perfect system. It's about continuously learning and improving as we go.

Perfect is a verb, not a noun.

Should be fun!