This is the third devlog in the series documenting the extraction of a Profiles service from our monolith. You can read about The Profiles RFC and Implementing Branch by Abstraction in the previous posts.
The timestamp was wrong. Not by much, just a few seconds, but unexpected payloads are worthy of investigation - particularly in a system driven by events.
I was deep in a routine refactoring session, examining events destined for our new Profiles service boundary, when the UserWasMovedIntoAddress event caught my attention.
The event contained a timestamp, but our logs clearly showed the action happened a few seconds earlier.
I examined the payload for another event of the same type.
Different data entirely!
What started as curiosity about the timestamp became a deeper investigation when I realized the event payload was changing between creation and publication. Not just timestamps - entire field values were different from what they should have been when the event was first raised.
This was our legacy monolith, battle-tested for over a decade.
Events had been working fine, yet here I was, staring at evidence that a critical part of our architecture was... broken?
I needed to understand the scope.
A quick investigation showed UserWasMovedIntoAddress wasn't the only event with a payload different from when it was raised.
Similar inconsistencies appeared across other events, but only sporadically.
The pattern wasn't immediately obvious.
The first clue came when I noticed something the events had in common:
final class UserWasMovedIntoAddress implements Event
{
    public function __construct(
        private User $user,
        private Address $address
    ) {
    }

    public function getPayload(): array
    {
        return [
            'user_id' => $this->user->getId(),
            'address' => [
                'id' => $this->address->getId(),
                'street' => $this->address->getStreet(),
                // ... other fields
                'updated_at' => $this->address->getUpdatedAt(),
            ]
        ];
    }
}
The events I was investigating were those migrating to our new Profiles Bounded Context.
Many of them were dependent on an instance of User and Address.
But these are not data objects; they are references to entities. Not immutable bags of data, but mutable objects.
I traced the event lifecycle to understand when and why payload data was changing:
getPayload() is only called at publication, using the current state of the entities.
The gap between creation and publication was where corruption occurred. In this area of our legacy system, events were held in-memory during the request, and published if the operation was successful.
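To make that gap concrete, here is a minimal sketch of the drift. The trimmed-down `Address` entity, the `MutableAddressEvent` class, the `changeStreet()` method and the integer timestamps are all assumptions made for illustration, not our real classes:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the Address entity; only what the demo needs.
final class Address
{
    public function __construct(
        private string $street,
        private int $updatedAt,
    ) {
    }

    public function changeStreet(string $street, int $now): void
    {
        $this->street = $street;
        $this->updatedAt = $now;
    }

    public function getStreet(): string
    {
        return $this->street;
    }

    public function getUpdatedAt(): int
    {
        return $this->updatedAt;
    }
}

// Hypothetical event that, like the original, keeps a reference to the entity.
final class MutableAddressEvent
{
    public function __construct(private Address $address)
    {
    }

    public function getPayload(): array
    {
        // Reads the entity's state at call time, not at creation time.
        return [
            'street' => $this->address->getStreet(),
            'updated_at' => $this->address->getUpdatedAt(),
        ];
    }
}

$address = new Address('1 Old Street', 1000);
$event = new MutableAddressEvent($address);   // event raised and held in memory
$payloadWhenRaised = $event->getPayload();    // updated_at is 1000

$address->changeStreet('2 New Street', 1003); // entity changes later in the request
$payloadWhenPublished = $event->getPayload(); // updated_at is now 1003
```

The event object itself never changed, yet the two payloads differ because the payload is derived from whatever the entity looks like when it is read.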
It didn't take long to notice that these events were mutable, which was clearly why the timestamp was a few seconds later than expected.
However, remember when I said the system has functioned properly for over a decade?
How can this be possible?
I had to continue digging to fully understand what had been going on.
Unfortunately, this wasn't just a data consistency bug - it was something which had shaped many workarounds over the years:
// Handle duplicate events
if (!$this->hasEventBeenRecorded(UserWasMovedIntoAddress::class)) {
    $this->recordEvent(
        new UserWasMovedIntoAddress($this->user, $this->address)
    );
}
When symptoms started to manifest, they appeared as "duplicate events". Teams noticed this issue and some simple deduplication logic was introduced because, at that time, only the "freshest" data was needed.
But these events were not duplicates, and the deduplication logic masked the real problem. The events happened to be the same type, but were recorded at different points in time during the process.
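A rough sketch of why a type-based check hides this. The `EventRecorder` and `AddressChanged` classes are hypothetical, loosely modelled on the hasEventBeenRecorded() and recordEvent() calls in the workaround, and the event carries a snapshot so the difference between the two occurrences is visible:

```php
<?php
declare(strict_types=1);

// Hypothetical event carrying a snapshot of the street at the time it was raised.
final class AddressChanged
{
    public function __construct(public readonly string $street)
    {
    }
}

// Hypothetical in-memory recorder that deduplicates purely on event type.
final class EventRecorder
{
    /** @var list<object> */
    private array $events = [];

    public function hasEventBeenRecorded(string $class): bool
    {
        foreach ($this->events as $event) {
            if ($event instanceof $class) {
                return true;
            }
        }

        return false;
    }

    public function recordEvent(object $event): void
    {
        $this->events[] = $event;
    }

    /** @return list<object> */
    public function recordedEvents(): array
    {
        return $this->events;
    }
}

$recorder = new EventRecorder();

// Two distinct moments in the same request, same event type.
foreach (['1 Old Street', '2 New Street'] as $street) {
    if (!$recorder->hasEventBeenRecorded(AddressChanged::class)) {
        $recorder->recordEvent(new AddressChanged($street));
    }
}

$recorded = $recorder->recordedEvents(); // only the first occurrence survives
```

With snapshot payloads the second occurrence is plainly a different event that gets dropped. With entity-backed payloads, the single surviving event simply reported whatever state the entity was in at publication, which is why the check looked harmless for so long.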
They were never intended to be mutable and, had they contained different data payloads, this would have been obvious.
Because immutability is fundamental for a reliable event-driven architecture, we decided to refactor events being migrated to our new Profiles service boundary.
Depending upon the event, the implementation looked a little different.
For some, we could simply inject the data needed:
final readonly class UserWasMovedIntoAddress implements Event
{
    public function __construct(
        private UserId $userId,
        private array $address
    ) {
    }

    public function getPayload(): array
    {
        return [
            'user_id' => $this->userId->value(),
            'address' => $this->address
        ];
    }
}
Here's an example of how this event could be created:
$event = new UserWasMovedIntoAddress(
    userId: $user->id,
    address: [
        'id' => $address->getId(),
        'street' => $address->getStreet(),
        // other data
        'updated_at' => $address->getUpdatedAt(),
    ]
);
This approach worked fine for many events - particularly those fortunate to be using value objects or DTOs.
But, in this instance, the shape of the address key would now be determined outside of the event.
Without encapsulating this logic inside the event, we risk the shape of the event payload being inconsistent.
For events whose creation was more complex, we could encapsulate it within the event:
final readonly class UserWasMovedIntoAddress implements Event
{
    private function __construct(
        private UserId $userId,
        private array $address
    ) {
    }

    public static function create(User $user, Address $address): self
    {
        // Read from the $address argument: there is no $this in a static method.
        return new self(
            userId: $user->id,
            address: [
                'id' => $address->getId(),
                'street' => $address->getStreet(),
                // other data
                'updated_at' => $address->getUpdatedAt(),
            ]
        );
    }

    public function getPayload(): array
    {
        return [
            'user_id' => $this->userId->value(),
            'address' => $this->address
        ];
    }
}
These events are still immutable, yet we can still leverage the entities:
$event = UserWasMovedIntoAddress::create(
    user: $user,
    address: $address
);
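As a quick sanity check of the factory approach, the sketch below shows that a payload captured at creation no longer follows the entity. It is self-contained and simplified: the `Address` entity, its `rename()` method and the plain integer user id are assumptions for illustration, and `final readonly class` needs PHP 8.2 or later:

```php
<?php
declare(strict_types=1);

// Hypothetical, trimmed-down Address entity.
final class Address
{
    public function __construct(private int $id, private string $street)
    {
    }

    public function rename(string $street): void
    {
        $this->street = $street;
    }

    public function getId(): int
    {
        return $this->id;
    }

    public function getStreet(): string
    {
        return $this->street;
    }
}

final readonly class UserWasMovedIntoAddress
{
    private function __construct(private int $userId, private array $address)
    {
    }

    public static function create(int $userId, Address $address): self
    {
        // Copy scalar values out of the entity at creation time.
        return new self($userId, [
            'id' => $address->getId(),
            'street' => $address->getStreet(),
        ]);
    }

    public function getPayload(): array
    {
        return ['user_id' => $this->userId, 'address' => $this->address];
    }
}

$address = new Address(7, '1 Old Street');
$event = UserWasMovedIntoAddress::create(42, $address);

$address->rename('2 New Street'); // later mutation no longer leaks into the event
$payload = $event->getPayload();  // still reports '1 Old Street'
```

The entity is only read inside create(); nothing but copied values is stored on the event.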
These examples are somewhat simplified, but they demonstrate the majority of the changes we made to iterate towards where we want to go.
But, for full transparency, some events were more challenging and required repaying too much Technical Debt to refactor.
For these events, to still make them immutable, a little shortcut was taken:
final readonly class UserWasMovedIntoAddress implements Event
{
    private array $payload;

    // Public so the event can be instantiated; the entities are read here but never stored.
    public function __construct(
        User $user,
        Address $address
    ) {
        // Snapshot the payload immediately so later entity changes cannot leak in.
        $this->payload = $this->generatePayload($user, $address);
    }

    public function getPayload(): array
    {
        return $this->payload;
    }

    private function generatePayload(User $user, Address $address): array
    {
        return [
            'user_id' => $user->id,
            'address' => [
                'id' => $address->getId(),
                'street' => $address->getStreet(),
                // other data
                'updated_at' => $address->getUpdatedAt(),
            ]
        ];
    }
}
Is this solution ideal? No.
Are these events still coupled to entities? Yep.
Extracting a microservice from a large legacy monolith is a marathon. Sometimes you need to focus on progress over perfection and keep in mind that you're iterating towards the solution.
We'll be exploring the many coupling issues in a future post, no doubt.
Finally, let's wrap up with the softer side of software development.
Implementing immutable events requires more than code changes. As a Staff Engineer, it's my responsibility to re-establish the practice:
Architecture Decision Record: documenting the requirement for new events to be immutable, providing clear guidance to help prevent regression to "old ways".
Guild Presentation: presented this investigation at our back-end guild meetup, focusing on sharing knowledge, the decision-making process and the impact of mutable and immutable events.
Code Review Guidelines: introducing a new event is now a dedicated code review checkpoint, with reviewers checking for immutability.
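Part of that review checkpoint could be backed by an automated check. The following is only a sketch of one possible heuristic, not something described above as being in place: the `immutabilityViolations()` function, its allow-list parameter and the example classes are all hypothetical. It uses PHP's reflection API (`ReflectionProperty::isReadOnly()` needs PHP 8.1+, `readonly class` needs 8.2+):

```php
<?php
declare(strict_types=1);

/**
 * Lists the reasons why $class does not look immutable.
 * Heuristic only: every property must be readonly and either a built-in type
 * (int, string, array, ...) or an allow-listed value object such as a UserId.
 *
 * @param list<string> $allowedTypes class names of known immutable value objects
 * @return list<string>
 */
function immutabilityViolations(string $class, array $allowedTypes = []): array
{
    $violations = [];

    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $name = $property->getName();
        $type = $property->getType();

        if (!$property->isReadOnly()) {
            $violations[] = "{$name} is not readonly";
        }

        $isSafeType = $type instanceof ReflectionNamedType
            && ($type->isBuiltin() || in_array($type->getName(), $allowedTypes, true));

        if (!$isSafeType) {
            $violations[] = "{$name} is not a built-in or allow-listed type";
        }
    }

    return $violations;
}

// Hypothetical classes used only to exercise the check.
final class SomeEntity
{
}

final class EntityBackedEvent
{
    public function __construct(private SomeEntity $entity)
    {
    }
}

final readonly class SnapshotEvent
{
    public function __construct(private int $userId, private array $address)
    {
    }
}

$entityBacked = immutabilityViolations(EntityBackedEvent::class);
$snapshot = immutabilityViolations(SnapshotEvent::class);
```

Here `$snapshot` comes back empty while `$entityBacked` reports two violations. An array property can still hold objects, so a check like this supports the reviewer rather than replacing them.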
Even though many services had become dependent upon the workarounds, once the events were immutable, we were able to remove the workarounds without any negative side effects.
Have you discovered similar architectural issues hiding in your legacy systems? Reach out and let me know about your investigation process and what problems were revealed.