Event Sourcing Explained - part 2 - some things you should know

By Mattias Holmqvist - September 27, 2018

In part 1 we had a look at some of the fundamental ideas behind Event Sourcing and why it is a powerful technique that brings us lots of benefits. In this post we’ll have a look at some skills that are important to know in order to succeed with Event Sourcing.

1. Use Event Storming to figure out what to do

A great methodology for breaking down and understanding any current or new system is Event Storming - coined by Alberto Brandolini. It deserves a post of its own but meanwhile you can start by looking here: http://eventstorming.com

Basic outcome from Event Storming

Event Storming helps us find the boundaries of different subsystems, identify possible knowledge gaps, highlight the most important business events and build software that makes sense for the user and your business.

Besides figuring out the big picture of what our system is supposed to do, Event Storming is also a great starting point for carving out the basic building blocks for creating a solution using Domain-Driven Design and Event Sourcing.

Note: You can use Event Storming even if you don’t aim to use Event Sourcing, but the two techniques match each other very well.

Event Storming is a simple, yet effective modelling technique and I encourage you to take a few minutes to try it out before jumping into coding. Also, make sure to involve people from different parts of your organization - not only developers.

2. Break down the problem into Bounded Contexts and Aggregates

Domain-Driven Design brings some extremely useful concepts to the table for us when designing and structuring more complex software. As we’ll see, Event Sourcing is a great fit for Domain-Driven Design. Let’s straighten out some basic terminology.

There are a lot of golden nuggets in DDD, but the most important fundamental concepts to grasp in order to effectively use Event Sourcing are Bounded Contexts, Aggregates and Domain Events.

Bounded Contexts

A Bounded Context is a system boundary designed to solve a particular problem. You decide where to draw the lines of your own bounded contexts, but most of the time you also need to identify bounded contexts that are out of your control, such as external APIs or other external parties.

When drawing up the map of a complete system it quickly becomes clear that some concepts occur in different situations and have different characteristics depending on where the concept occurs. A single concept (or word) can exist in multiple bounded contexts, for different purposes and with different meanings.

That a single word can have different meanings in different contexts and situations is not hard to understand if you think more closely about it. An order at a restaurant and an order on an e-commerce site selling clothes are two different things. The order also means one thing to the finance department handling payments and another thing to the logistics department shipping the orders.

The same aggregate (Order) in two different contexts

Historically, we as software developers have been too quick to find commonalities between concepts that sound the same but are actually quite different when you look more closely at the behavior around them.

Designing a software system where the code is shared between two different things just because they have the same name is a big problem. If we instead embrace the fact that different things can have the same name and create different models and solutions for them, our lives as developers will be so much better and we will produce less complex and buggy software.
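As a rough illustration of giving each context its own model, here is a minimal Python sketch. The context names (Payments, Shipping) and all fields are assumptions made up for this example; the point is only that the two Order types share an identifier and nothing else.

```python
from dataclasses import dataclass
from decimal import Decimal


class Payments:
    """Hypothetical finance context: an order is something to be paid."""

    @dataclass(frozen=True)
    class Order:
        order_id: str
        amount_due: Decimal
        paid: bool = False


class Shipping:
    """Hypothetical logistics context: an order is something to be delivered."""

    @dataclass(frozen=True)
    class Order:
        order_id: str
        delivery_address: str
        parcel_count: int


# Both models describe order "o-1", but neither inherits from a shared Order class.
invoice_view = Payments.Order(order_id="o-1", amount_due=Decimal("49.90"))
parcel_view = Shipping.Order(order_id="o-1", delivery_address="Storgatan 1", parcel_count=2)
```

Each context can now change its Order without forcing a change in the other one.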

Aggregates

A common problem for developers is the mismatch between the real world (sloppy, inexact, muddled, unclear, full of exceptions) and the computing world (deterministic, rule-based, exact). This mismatch requires us to make decisions about what to model in our software and what to leave out.

Our goal is to build something useful rather than perfect. An aggregate is a concept in your business that models a number of rules (or invariants) that should always hold in our system.

A fancy word for rule is invariant

So, our aggregate is a piece of code (in OO typically a cluster of classes) that says “yes” or “no” to requests sent to our system. When the request is successful, the aggregate also updates the state of the system.

We call a request to perform a system action a command.

The flow from request to events

An aggregate defines a strong consistency boundary. Ideally, no invalid requests performed from an outside actor should slip through the aggregate.
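To make the command, aggregate and event flow concrete, here is a minimal sketch in Python. All names (Concert, BookTicket, TicketBooked, CommandRejected) are hypothetical and chosen only for illustration; a real aggregate would be shaped by your own domain.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BookTicket:
    """Command: a request to do something."""
    concert_id: str
    customer: str


@dataclass(frozen=True)
class TicketBooked:
    """Domain event: something that has happened."""
    concert_id: str
    customer: str


class CommandRejected(Exception):
    """Raised when the aggregate says "no" to a command."""


@dataclass
class Concert:
    concert_id: str
    capacity: int
    booked: list = field(default_factory=list)

    def handle(self, command: BookTicket) -> list:
        # Invariant: never sell more tickets than the venue holds.
        if len(self.booked) >= self.capacity:
            raise CommandRejected("sold out")
        event = TicketBooked(self.concert_id, command.customer)
        self.apply(event)
        return [event]

    def apply(self, event: TicketBooked) -> None:
        # State changes only as a consequence of an event.
        self.booked.append(event.customer)


concert = Concert("c-1", capacity=1)
events = concert.handle(BookTicket("c-1", "alice"))  # accepted, one event emitted
try:
    concert.handle(BookTicket("c-1", "bob"))         # rejected, nothing emitted
    second_accepted = True
except CommandRejected:
    second_accepted = False
```

The decision (handle) is kept apart from the state change (apply) so that the same apply method can later be reused when replaying stored events.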

Therefore, we need some way to handle simultaneous requests to the same aggregate from different users, since in some situations the requests might contend with each other (imagine two simultaneous requests to book a single remaining concert ticket).

This situation requires some kind of structure around the aggregate to make sure that we’re not creating an inconsistency in our data.

A common pattern is to use a single-writer model for each aggregate, but it can also be effective to use some kind of concurrency check (optimistic or pessimistic) in the storage system. Both alternatives could be viable, depending on your requirements.
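One way to picture the optimistic variant is an event store that only accepts an append if the stream is still at the version the writer last read. The sketch below is an in-memory toy under that assumption; InMemoryEventStore, ConcurrencyError and the expected_version parameter are illustrative names, not the API of any particular product, and a real store would have to perform the check and the append atomically.

```python
class ConcurrencyError(Exception):
    """Another writer appended to the stream first."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}

    def load(self, stream_id):
        events = list(self._streams.get(stream_id, []))
        # Return the events together with the version they represent.
        return events, len(events)

    def append(self, stream_id, new_events, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        # Optimistic check: refuse the write if the stream has moved on since it was read.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}"
            )
        stream.extend(new_events)


store = InMemoryEventStore()
_, version_seen_by_alice = store.load("concert-1")
_, version_seen_by_bob = store.load("concert-1")

store.append("concert-1", ["TicketBooked(alice)"], version_seen_by_alice)
try:
    store.append("concert-1", ["TicketBooked(bob)"], version_seen_by_bob)
    bob_won = True
except ConcurrencyError:
    bob_won = False  # bob has to reload the stream and retry against the new state
```

With a single-writer model the check becomes unnecessary, because only one process ever appends to a given aggregate's stream.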

3. Separate your writes from your reads

Since writes and reads have very different requirements, we should make sure not to mix them in our code.

Aggregates take care of our business rules and emit the events that are the single source of truth for our system - this is our write side. Therefore, aggregates should not be designed to answer any queries that users or external systems may need.

However, very few systems are useful without some kind of external representation. Instead of letting our aggregates answer these queries, we build separate models for them. We refer to these models as projections or read models. With Event Sourcing we can easily create projections from the events emitted by the aggregates and build whatever representation we need.
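A projection can be as simple as a function that folds a list of events into the shape a query needs. The following Python sketch assumes two hypothetical event types (TicketBooked, BookingCancelled) and builds a "tickets sold per concert" read model from them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketBooked:
    concert_id: str
    customer: str


@dataclass(frozen=True)
class BookingCancelled:
    concert_id: str
    customer: str


def project_tickets_sold(events):
    """Fold a sequence of events into a {concert_id: tickets sold} read model."""
    sold = Counter()
    for event in events:
        if isinstance(event, TicketBooked):
            sold[event.concert_id] += 1
        elif isinstance(event, BookingCancelled):
            sold[event.concert_id] -= 1
    return dict(sold)


history = [
    TicketBooked("c-1", "alice"),
    TicketBooked("c-1", "bob"),
    TicketBooked("c-2", "carol"),
    BookingCancelled("c-1", "bob"),
]
tickets_sold = project_tickets_sold(history)
```

Because the read model is derived entirely from the events, it can be thrown away and rebuilt, or a second projection with a different shape can be added later without touching the aggregate.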

An illustration of CQRS

Separating writes from reads is nothing revolutionary in software development. The CQS (Command Query Separation) principle is inherent in functional programming languages, where functions are pure calculations that return data (reads) and side-effects are handled explicitly (writes). Applying this pattern at the system level, as described above, gives us what Greg Young named CQRS (Command Query Responsibility Segregation).

Our aggregates emit domain events when a request to the aggregate is accepted. These events are stored in a log for each aggregate, which is used both when handling future commands and as a source of data for our read models.
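Using the log when handling future commands usually means rebuilding the aggregate's state by replaying its events before making the next decision. Here is a minimal sketch of that idea; the Account aggregate and its Deposited and Withdrawn events are assumptions picked for brevity, and the plain Python list stands in for a real per-aggregate event log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


class Account:
    def __init__(self):
        self.balance = 0

    @classmethod
    def from_history(cls, events):
        # Rebuild current state purely from the stored events.
        account = cls()
        for event in events:
            account.apply(event)
        return account

    def apply(self, event):
        if isinstance(event, Deposited):
            self.balance += event.amount
        elif isinstance(event, Withdrawn):
            self.balance -= event.amount

    def withdraw(self, amount):
        # The decision is based on state derived from earlier events.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        return [Withdrawn(amount)]


log = [Deposited(100), Withdrawn(30)]    # the aggregate's event log
account = Account.from_history(log)      # replay the log: balance is 70
log.extend(account.withdraw(50))         # accepted, new event appended to the log
balance_after = Account.from_history(log).balance
```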

4. Make sure your events are Domain Events

As described in the previous section, our aggregates emit domain events when a request is accepted, and those events are stored in a log that feeds both future command handling and our read models.

There are different kinds of events in a system, and it is not always clear whether Event Sourcing is a good idea for all of them. Even though it might be interesting to save all the events we see, for various reasons, not every event should be used for Event Sourcing.

Application (technical) events

Since Domain Events come from our domain model they should not be technical but rather make sense in the domain - your business. Take care when naming and organizing them.

Don’t use internal technical events in your domain model

Application events can be things such as logging messages, system notifications or other technical notifications that are internal to your system and useful for the system to work. They will however mean little to the people outside the development teams and should therefore not be used for the event sourced domain model.

External events and observations

Our domain model is needed to keep the invariants and business rules intact and to keep a strong consistency. If you cannot say anything but OK to the messages sent to the model, you don’t have a domain model but instead an event sink/processor.

For example, there is no reason we should reject a TemperatureChangedEvent that is sent to our model from a temperature sensor, so there is no reason for our domain model to handle this event. It might be interesting to store this event or to process it in some way but in most scenarios these external observations are not used for Event Sourcing.

5. Embrace eventual consistency

In larger systems, many different subsystems can be interested in the events, for example to create a chain of commands to drive more advanced business flows or to update a view for a mobile app or to send a notification to a user.

We need to acknowledge that updates between aggregates and sub-systems do not happen immediately. Especially if your system involves a lot of users or entities, connecting different parts and requiring them to be strongly consistent will make the system slow and cripple the user experience. Therefore, we accept that it takes a while for the entire system to react and become consistent. This is what we mean when we say that the system is eventually consistent.

Actually, this is also how the real world works if you think about it. For a great (and a bit philosophical) talk on the topic, watch Rich Hickey’s Are We There Yet?

Using events as the source of change for our aggregates, it is possible to create a mechanism where other parts of the system can subscribe to these events and handle them according to their individual requirements.
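The subscription mechanism, and the delay that makes the system eventually consistent, can be sketched with a tiny in-process event bus. Everything here (EventBus, deliver_pending, the two handlers) is a hypothetical stand-in; in a real system delivery would typically happen asynchronously through a message broker or by subscribers reading the event log.

```python
from collections import deque


class EventBus:
    def __init__(self):
        self._subscribers = []
        self._pending = deque()

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        # Publishing only enqueues; the writer does not wait for subscribers.
        self._pending.append(event)

    def deliver_pending(self):
        # Stand-in for asynchronous delivery happening some time later.
        while self._pending:
            event = self._pending.popleft()
            for handler in self._subscribers:
                handler(event)


bus = EventBus()
tickets_sold = {"count": 0}
notifications = []


def update_view(event):
    tickets_sold["count"] += 1


def notify_customer(event):
    notifications.append(f"ticket confirmed for {event['customer']}")


bus.subscribe(update_view)
bus.subscribe(notify_customer)

bus.publish({"type": "TicketBooked", "customer": "alice"})
count_before_delivery = tickets_sold["count"]  # the view has not caught up yet
bus.deliver_pending()
count_after_delivery = tickets_sold["count"]   # now the view is consistent
```

Between publish and deliver_pending the write side is already up to date while the view is stale, which is exactly the window that eventual consistency asks us to accept.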

Create read models from events

Summary

This was a summary of some concepts that we have found important to keep top-of-mind in order to succeed with Event Sourcing.

I have only scratched the surface of these concepts, but hopefully knowing about their relevance in the context of Event Sourcing will be useful for you too!

In part 3 of this series we’ll dig into examples of implementations of different components in an event sourced system. Stay tuned!