A (Hopefully) Gentle Introduction to Serialized and Event Sourcing

In this article, Raymond Camden explores Serialized and walks us through how it could be used to build a CMS using Event Sourcing.

Raymond Camden

3/10/2022

I am — shall we say — somewhat new to the world of CQRS and Event Stores, to the point of having to google them, read something, scratch my head, read another thing, scratch my head even harder, and then finally simply ask for help. It took a long conversation with Jaden Baptista, during which he (very patiently, I might add) helped me gather a slim inkling of an understanding of what this space is all about.

A quick note on my background: I have a very good model in my mind of data storage, whether in a traditional DBMS or a NoSQL-style solution. I’m very familiar with working with data and writing queries against it to get what I need.

Event Sourcing changes this model to one where the event, not the content, defines your data.

Consider this hypothetical situation: In a content management system, especially for a large organization, the act of creating an article like this one consists of more than just a title, an author, and a blob of text. Instead, there's a series of events that defines the process by which the article is created, goes through editorial, and is then published:

  1. We may begin this process by defining an idea. We can call this event “IdeaCreation” and the data of that event is the idea itself, "Top Ten Ways Cats are Better than Humans."
  2. Then, the idea gets reviewed and processed. A follow up event could simply be an approval with the data pointing to the editor who signed off.
  3. After this, an event could be created “Draft” that includes the body of the text.
  4. Then, we might have multiple Review and Edit events that update the body of text.
  5. We could have an event for Approval, which logs the people who signed off on the article.
  6. A Scheduled event would be nice, including information about the date when it will go live.
  7. And to top it all off, a truly final Published event with the actual date and link is key.
  8. If we’re contracting external authors, we might even go further with an event to mark when we’ve received their invoice or when they’ve been paid.
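Spelled out as data, that series might look something like the stream below. The event names and payload fields here are purely illustrative, not a fixed schema:

```javascript
// A hypothetical event stream for one article, oldest event first.
// Event names and payload fields are illustrative, not a fixed schema.
const articleEvents = [
  { eventType: 'IdeaCreation', data: { idea: 'Top Ten Ways Cats are Better than Humans' } },
  { eventType: 'IdeaApproval', data: { approvedBy: 'some-editor' } },
  { eventType: 'Draft', data: { body: 'Cats are better than humans because...' } },
  { eventType: 'Edit', data: { body: 'Cats are, in fact, better than humans because...' } },
  { eventType: 'Approval', data: { signedOffBy: ['some-editor', 'another-editor'] } },
  { eventType: 'Scheduled', data: { goLiveDate: '2022-03-10' } },
  { eventType: 'Published', data: { publishedAt: '2022-03-10', link: 'https://example.com/cats' } }
];

// The article's current state is simply whatever the events add up to:
const isPublished = articleEvents.some(e => e.eventType === 'Published');
console.log(isPublished); // true
```

Notice that nothing here is "the article" in the traditional single-row sense; the history itself is the data.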

All of the above could be represented by one row in a database, but in the event-sourcing model, we have an incredible amount of detail describing how the data changes over time.

A real-world example of this approach to data storage would be Git. While you can think of Git as simply "storage for my files", the true power of Git (and other source control systems) is the event history of information in the repository.

What Serialized does

Serialized fits in as a provider for storing your events. It provides simple-to-use APIs that not only let you record events, but also create projections that provide a customizable view of your data (for example, what state an article is in) and set up reactions that let you execute logic on other servers/systems based on event triggers. All of this can be tested for free under their very generous free tier; check their detailed pricing information for specifics, and if my explanations above made no sense, you can watch a quick two-minute introduction video too.

Serialized provides a good set of docs as well as resources for Java, JavaScript, and TypeScript developers, but their core API is ridiculously simple to use once you've signed up for a free account and gotten your credentials. Let's look at some examples. If you want to replicate any of this yourself, be sure to sign up first and get your access key and secret access key. (To be clear, both should be kept secret, but one’s really secret!)

Content Management Event Sourcing

For our first demo, we’re going to play out our article status-tracker idea. Serialized calls a category of objects like our articles an “aggregate type”, and each individual article an aggregate — that’ll be important later. Let’s just build the first event for now, called ArticleSubmission, to represent the idea of suggesting a new article. It will consist of a title and a short abstract.

As a raw network call, you would post a JSON object consisting of an array of events, even if you’re sending only one. You would include your keys in two headers, Serialized-Access-Key and Serialized-Secret-Access-Key. I’m using Node so I’ll make use of the dotenv package as a simple way to make my keys available in my environment.
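If you're following along, the .env file that dotenv reads could look something like this sketch — the key names match the headers above, and the values are placeholders for your own credentials:

```
# .env - keep this file out of source control!
Serialized-Access-Key=your-access-key-here
Serialized-Secret-Access-Key=your-secret-access-key-here
```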

We begin by crafting the URL. The core URL is the Serialized API endpoint for aggregates: https://api.serialized.io/aggregates. We then add the name of our aggregate type, which in this case is articles. Next, we add a unique identifier. This can be anything, but we’ll use the npm uuid package to generate these for us — something like 3df1d8fa-fbd8-4666-adc1-c68d672e7494. Finally, as this is an event, the complete URL will then end with /events, bringing the total composite URL to this: https://api.serialized.io/aggregates/articles/3df1d8fa-fbd8-4666-adc1-c68d672e7494/events
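Since we'll be building this URL a few times, the composition is easy to capture in a small helper. (buildEventUrl is my own name for illustration, not part of any Serialized SDK.)

```javascript
// Compose the event-recording URL: base + aggregate type + aggregate id + /events.
// buildEventUrl is an illustrative helper, not part of Serialized's API.
function buildEventUrl(aggregateType, aggregateId) {
  return `https://api.serialized.io/aggregates/${aggregateType}/${aggregateId}/events`;
}

console.log(buildEventUrl('articles', '3df1d8fa-fbd8-4666-adc1-c68d672e7494'));
// https://api.serialized.io/aggregates/articles/3df1d8fa-fbd8-4666-adc1-c68d672e7494/events
```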

Now that we have our URL, let’s start off with a (mostly) static example that adds an event for a new article.

require('dotenv').config();
const { v4: uuidv4 } = require('uuid');
const fetch = require('node-fetch');

// Credentials loaded from the environment by dotenv.
const ACCESS_KEY = process.env['Serialized-Access-Key'];
const SECRET_ACCESS_KEY = process.env['Serialized-Secret-Access-Key'];

const type = 'articles';
const eventType = 'ArticleSubmission';

(async () => {
  // A fresh id for the new article aggregate.
  let id = uuidv4();

  let data = {
    events: [
      {
        eventId: uuidv4(),
        eventType: eventType,
        data: {
          title: 'An article about cats',
          abstract: 'A deep look into the world of cats.'
        }
      }
    ]
  };

  let resp = await fetch(`https://api.serialized.io/aggregates/${type}/${id}/events`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Serialized-Access-Key': ACCESS_KEY,
      'Serialized-Secret-Access-Key': SECRET_ACCESS_KEY
    },
    body: JSON.stringify(data)
  });

  let result = await resp.json();
  console.log(result);
})();

I’m using the node-fetch library to make my POST request, since it’s what I’m most familiar with and it demonstrates the underlying API. If you prefer an abstraction, you can always use Serialized’s TypeScript driver. You can see my URL is composed of the parts I mentioned above. My data is an array of one event with a hardcoded title and abstract. When run in the terminal, we get:

{
  result: 'SUCCESS',
  aggregateVersion: 1,
  taskId: '361fb394-d387-4b0d-b33e-8aec6f7bfb30'
}

In the Serialized dashboard, we get a nice overview of our data, which for now is pretty minimal:

We can drill down into aggregates to see a list of items:

And then into the event itself:

Note the timestamps that Serialized automatically added.

Now, let’s approve the idea. This will be another event, called ArticleApproval, and we need to use the ID from the previous script execution. I’ve hardcoded it in the script below.

require('dotenv').config();
const { v4: uuidv4 } = require('uuid');
const fetch = require('node-fetch');

const ACCESS_KEY = process.env['Serialized-Access-Key'];
const SECRET_ACCESS_KEY = process.env['Serialized-Secret-Access-Key'];

const type = 'articles';
const eventType = 'ArticleApproval';

(async () => {
  // id from the first creation test
  let id = '3df1d8fa-fbd8-4666-adc1-c68d672e7494';

  let data = {
    events: [
      {
        eventId: uuidv4(),
        eventType: eventType
      }
    ]
  };

  let resp = await fetch(`https://api.serialized.io/aggregates/${type}/${id}/events`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Serialized-Access-Key': ACCESS_KEY,
      'Serialized-Secret-Access-Key': SECRET_ACCESS_KEY
    },
    body: JSON.stringify(data)
  });

  let result = await resp.json();
  console.log(result);
})();

This is virtually the same code, with only the event type changing. Back in the Serialized dashboard, if we go into the detail for our object, we can see the new event.

We can simplify things a bit with a few utility functions. For example, we can abstract out the operation of creating a new article idea:

async function createIdea(title, abstract) {
  const type = 'articles';
  const eventType = 'ArticleSubmission';

  // Generate the new article's aggregate id here, and return it below
  // so callers can reference the article in later events (like approval).
  let id = uuidv4();

  let data = {
    events: [
      {
        eventId: uuidv4(),
        eventType: eventType,
        data: {
          title,
          abstract
        }
      }
    ]
  };

  let resp = await fetch(`https://api.serialized.io/aggregates/${type}/${id}/events`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Serialized-Access-Key': ACCESS_KEY,
      'Serialized-Secret-Access-Key': SECRET_ACCESS_KEY
    },
    body: JSON.stringify(data)
  });

  let result = await resp.json();
  return { id, ...result };
}

And while we’re at it, simplify the process of approving:

async function approveIdea(id) {
  const type = 'articles';
  const eventType = 'ArticleApproval';

  let data = {
    events: [
      {
        eventId: uuidv4(),
        eventType: eventType
      }
    ]
  };

  let resp = await fetch(`https://api.serialized.io/aggregates/${type}/${id}/events`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Serialized-Access-Key': ACCESS_KEY,
      'Serialized-Secret-Access-Key': SECRET_ACCESS_KEY
    },
    body: JSON.stringify(data)
  });

  let result = await resp.json();
  return result;
}

We can run these utilities a few times to create and approve some articles, but how do we get an overview of our articles and their status? This is where projections come in.

You can think of a projection as a high-level view of what the logged events sum up to. Projections can apply some pretty complex rules to your data, and I highly recommend reading the docs on them, but let’s consider an example to get some hands-on experience: one helpful view of our article pipeline data would be the current state of a particular topic, either in the queue or approved. Before we get into any code, let’s formalize this example. It might seem a bit pedantic, but bear with me for a moment because this will be very helpful. A new idea is considered to have a property called status, set to a value of QUEUE. An approved idea has it set to APPROVED.
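Before touching the API, here's the intuition in plain JavaScript: a projection is essentially a reduce over the event stream. This is just a local sketch of the idea (Serialized computes projections server-side, declaratively), but the status values match the plan above.

```javascript
// A local sketch of what a projection computes: fold the event stream
// into a current-state view. Serialized does this server-side; this
// code only illustrates the idea.
function projectStatus(events) {
  return events.reduce((projection, event) => {
    switch (event.eventType) {
      case 'ArticleSubmission':
        // New ideas enter the queue, carrying their title and abstract.
        return { ...projection, status: 'QUEUE', ...event.data };
      case 'ArticleApproval':
        return { ...projection, status: 'APPROVED' };
      default:
        return projection;
    }
  }, {});
}

const view = projectStatus([
  { eventType: 'ArticleSubmission', data: { title: 'Cats', abstract: 'All about cats.' } },
  { eventType: 'ArticleApproval' }
]);
console.log(view); // { status: 'APPROVED', title: 'Cats', abstract: 'All about cats.' }
```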

With that in mind, let’s figure out how to make this plan actually happen. The first step is creating the projection with an API call. While you can view, test, and delete projections via your dashboard, initial creation has to be done in code, and this is a feature I’d love to see Serialized add in the future. A web-based interface to design new projections would be the cherry-on-top here.

The data for our API call will consist of a name, the aggregate data type it’s based on (called a feed name), and then a set of handlers. Handlers map events to particular actions on your data. This is done via a JSONPath syntax that maps event data to the eventual view of your data that you want. Here’s a projection for what we described above:

{
  projectionName: 'articles_status',
  feedName: 'articles',
  handlers: [
    {
      eventType: 'ArticleSubmission',
      functions: [
        {
          function: 'set',
          targetSelector: '$.projection.status',
          rawData: 'QUEUE'
        },
        {
          function: 'set',
          targetSelector: '$.projection.title',
          eventSelector: '$.event.title'
        },
        {
          function: 'set',
          targetSelector: '$.projection.abstract',
          eventSelector: '$.event.abstract'
        }
      ]
    },
    {
      eventType: 'ArticleApproval',
      functions: [
        {
          function: 'set',
          targetSelector: '$.projection.status',
          rawData: 'APPROVED'
        }
      ]
    }
  ]
}

Note the two objects in the handlers array:

  1. The first handler maps to the submission event and sets three values: status (hardcoded to QUEUE for now), title, and abstract.
  2. The second handler maps to the approval process and only needs to set the status.

All of this is POSTed to https://api.serialized.io/projections/definitions and when it’s done, it’ll show up in your dashboard:

To run a projection, you make an API call to https://api.serialized.io/projections/single/${name}/${id}, where ${name} represents the name of your projection and ${id} is the ID of the aggregate you’re inspecting. We can think of a call to this endpoint as a command to apply this projection onto my data based on the events that we have recorded.
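As a sketch, reading a projection from Node could look like this. The helper names are my own, and I'm assuming a global fetch (Node 18+); on older versions you'd pull in node-fetch as before.

```javascript
// projectionUrl and getProjection are illustrative helpers, not part of an SDK.
function projectionUrl(name, id) {
  return `https://api.serialized.io/projections/single/${name}/${id}`;
}

// Fetch the current projected view for one aggregate.
// Assumes a global fetch (Node 18+); use node-fetch on older versions.
async function getProjection(name, id) {
  const resp = await fetch(projectionUrl(name, id), {
    headers: {
      'Serialized-Access-Key': process.env['Serialized-Access-Key'],
      'Serialized-Secret-Access-Key': process.env['Serialized-Secret-Access-Key']
    }
  });
  return resp.json();
}

console.log(projectionUrl('articles_status', 'fc2beff3-0e74-4089-bf34-b78bbc4da2bd'));
// https://api.serialized.io/projections/single/articles_status/fc2beff3-0e74-4089-bf34-b78bbc4da2bd
```

Calling getProjection('articles_status', id) with an article's ID should then return the status view we defined.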

To test our new setup, I made a new article using the ArticleSubmission event. Here’s the projection version of this new article:

{
  projectionId: 'fc2beff3-0e74-4089-bf34-b78bbc4da2bd',
  createdAt: 1645734831361,
  updatedAt: 1645734831362,
  data: {
    status: 'QUEUE',
    title: 'Article on Thursday',
    abstract: 'Thursday article'
  }
}

After running the approval event, the same projection returns:

{
  projectionId: 'fc2beff3-0e74-4089-bf34-b78bbc4da2bd',
  createdAt: 1645735213175,
  updatedAt: 1645735213185,
  data: {
    status: 'APPROVED',
    title: 'Article on Thursday',
    abstract: 'Thursday article'
  }
}

While it’s easy to gush about all the cool features of a new approach like this one, I want to point out the thing we’re actually missing here: a record for the article! The object returned from our projection represents our article, but there’s no single row in a database somewhere for it. We just described it using events, so instead of a single article row, we have a rich history of the article’s journey.

Now projections can get much more complex than shown here, and there are more operations we can do with them, but that about wraps up my initial discovery of this burgeoning method of storing data. Really, it’s just been a taste of what Serialized offers, but hopefully you find it as intriguing as I did. There is definitely a bit of a mental shift in terms of how you think about your data, but the event-based approach actually feels like it could make a lot more sense for many applications.

If you’re interested in going further, I’d recommend checking out their Getting Started guide for another fantastic walkthrough of these features and the many I left out.