[DevTip] Is a 100% Provable Audit Log Possible? – Hint, with Event Sourcing it Is and I’ll Show You How

If you’ve been developing for any length of time you’ve probably had to create an audit log. Here’s the question though: are any of these logs a 100% provable audit of all changes in the system? I’m guessing not. With event sourcing, a 100% provable event log just happens to be a handy by-product. So how does it work and how can you implement it?
In this post I’m going to assume a web-based line of business application. Event sourcing and CQRS commonly go together, but today I’m going to use CRUD (create, read, update and delete). Yes, that’s event sourcing with CRUD! You can grab a Visual Studio solution with working code using the download button at the end of this article.

1. Request To Change

In a typical MVC architecture, a controller receives a request in the form of a model. This model represents the change (often some form of viewmodel).  Assuming it passes basic validation, it is then handled in the controller. In my opinion this violates the ‘Single Responsibility’ principle found in the SOLID guidelines. This extra responsibility is better delegated to another class. If you are learning CQRS, you would create a command. The command  would then be sent via a bus or message router and handled elsewhere. For simplicity I’m going to handle the business logic within the controller.

2. Apply the Business Logic and Publish the Event

This is a 3 step process. The first step is to ensure the action can happen. In other words, there is no reason within the application that the action should fail. This is also when any processing, lookups or calculations take place. Assuming the action can happen, the second step is to create the event message and save it to disk. Only if the save is successful does the final step take place: publishing the event to the message router or bus. At this point, rather than creating a new class, you could just use the original model as the message. This differs from what we usually see in a controller, where the update request flows down to a business logic layer, which in turn passes it down to a data access layer to update the underlying data source via a repository.

[HttpPost]
public ActionResult CreateCustomer(Customer customer)
{
    // Check business logic here (but no persistence)
    if (DB.Customers.Values.Any(c => c.Name == customer.Name))
    {
        ModelState.AddModelError(string.Empty, "Duplicate Name Detected");
    }
    if (!ModelState.IsValid)
    {
        return View(customer);
    }
    // Assuming all is good, create and save the event message
    var createdEventMessage = new CustomerCreated { Id = Guid.NewGuid(), Name = customer.Name };
    DB.SaveEvent(createdEventMessage);
    // Send it off to be routed around the system
    _router.Handle(createdEventMessage); // This could be handling multiple tasks if needed – all de-coupled and simple
    return RedirectToAction("Index");
}

3. The Message Router

The role of the router is to inspect the message and deliver it to the appropriate handlers. Because all messages go through this ‘gateway’, it offers an opportunity to add extra steps, rather like a pipeline. If you are familiar with ActionFilters in ASP.NET MVC then this should sound familiar. You can inspect incoming events and apply pre-processing to them. For example, using the specification pattern, you can apply security rules before forwarding the message. I have also used this approach in other projects to carry out system logging and performance monitoring. It is also a useful place to handle exceptions. If the flexibility of a message router is not required then you do not strictly need to use one. In this case you could just call all the appropriate message handlers directly. For a simple system this may in fact be a more pragmatic approach.

public class SimpleRouter
{
    protected Dictionary<Type, List<Action<IEvent>>> Routes;

    public SimpleRouter()
    {
        Routes = new Dictionary<Type, List<Action<IEvent>>>();
    }

    public void Register<T>(Action<T> handler) where T : IEvent
    {
        // I’ve included the DelegateAdjuster code in the download, but you could just use dynamic to simplify this
        var adjusted = DelegateAdjuster.CastArgument<IEvent, T>(x => handler(x));
        // Add to the existing list if there is one, otherwise start a new list
        if (Routes.ContainsKey(typeof(T))) Routes[typeof(T)].Add(adjusted);
        else Routes[typeof(T)] = new List<Action<IEvent>> { adjusted };
    }

    public void Handle(IEvent message)
    {
        Before(message);
        // Events with no registered handlers are skipped rather than throwing
        List<Action<IEvent>> actions;
        if (Routes.TryGetValue(message.GetType(), out actions))
        {
            foreach (Action<IEvent> action in actions)
            {
                action.Invoke(message);
            }
        }
        After(message);
    }

    protected virtual void Before(IEvent message)
    {
    }

    protected virtual void After(IEvent message)
    {
    }
}
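
The Before and After hooks are where the pipeline-style extras mentioned above would plug in. As a rough sketch only (not taken from the download), the block below times each message. MiniRouter is a hypothetical stand-in for SimpleRouter that uses a plain cast instead of DelegateAdjuster so the example compiles on its own, and TimingRouter and its Log list are invented names. It is not thread-safe, and After is skipped if a handler throws.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public interface IEvent { }

public class CustomerCreated : IEvent
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical stand-in for SimpleRouter so this sketch is self-contained.
public class MiniRouter
{
    private readonly Dictionary<Type, List<Action<IEvent>>> _routes =
        new Dictionary<Type, List<Action<IEvent>>>();

    public void Register<T>(Action<T> handler) where T : IEvent
    {
        List<Action<IEvent>> handlers;
        if (!_routes.TryGetValue(typeof(T), out handlers))
        {
            handlers = new List<Action<IEvent>>();
            _routes[typeof(T)] = handlers;
        }
        // Wrap the typed handler so it can be stored as Action<IEvent>.
        handlers.Add(e => handler((T)e));
    }

    public void Handle(IEvent message)
    {
        Before(message);
        List<Action<IEvent>> handlers;
        if (_routes.TryGetValue(message.GetType(), out handlers))
        {
            foreach (var handle in handlers) handle(message);
        }
        After(message);
    }

    protected virtual void Before(IEvent message) { }
    protected virtual void After(IEvent message) { }
}

// Hypothetical subclass: records how long each message took to handle.
public class TimingRouter : MiniRouter
{
    private readonly Stopwatch _watch = new Stopwatch();
    public readonly List<string> Log = new List<string>();

    protected override void Before(IEvent message)
    {
        _watch.Restart();
    }

    protected override void After(IEvent message)
    {
        _watch.Stop();
        Log.Add(message.GetType().Name + " handled in " + _watch.ElapsedMilliseconds + " ms");
    }
}
```

Registering a handler and calling Handle on a TimingRouter leaves one entry in Log per message. The same two hooks could host security checks or exception handling instead.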

4. Message Handlers

Message handlers are the only part of the system that changes the read model. In a CQRS architecture they are referred to as de-normalisers. They receive event messages and then update the read models based on the data in the message. The flow would be:
a) Receive an update Person request
b) Load Person by id from the data store
c) Change the appropriate fields
d) Save the Person object to disk

public class CustomerHandler
{
    public void Handle(CustomerCreated customer)
    {
        // Do Insert
        DB.Customers.Add(customer.Id, new Customer { Id = customer.Id, Name = customer.Name });
    }
}

// ######################
// The messages
public interface IEvent { }

public class CustomerCreated : IEvent
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public class Customer
{
    public Guid Id { get; set; }
    [Required] // from System.ComponentModel.DataAnnotations
    public string Name { get; set; }
}

5. The Read Model

The read model is the basis of any view models. An advantage of CQRS would be that you could do away with view models and ensure that there would always be a read model of the required shape. In the absence of CQRS, many MVC applications end up converting between viewmodels, models and finally tables in an RDBMS. This is often done via tools like AutoMapper and various ORMs such as Entity Framework or NHibernate. Over time, this creeping complexity has the bad habit of turning what was once a simple architecture into a complex beast.

Conclusion

I made the claim that this approach is 100% provable, so how would you prove it? Answer – re-run all the events and write out the changes to a new database. Then compare the new database with the original. If there are no sneaky updates happening outside of the event messages, then both databases will be identical. There are all sorts of other interesting benefits to this approach beyond the 100% audit log, such as:
  • Event playback – This allows you to create new read models from existing data. You can even project into new types of data store like a graph, document or even object database.
  • Debugging – Roll back to just before the error occurred and then step through the action that failed.
  • Scaling – You can send event messages over the wire to read models stored on other load balanced machines. This makes sense when you consider the number of reads in a line of business application compared to the writes. These other machines can also be located close to the source of the traffic, which helps improve response times. This does impact the UI and you should ensure you have a good strategy in place to handle **eventual consistency**.
  • Security – All events go through one gateway and can be security filtered. I suggest using the specification pattern here to make readable and composable security rules.
  • Performance monitoring – Again it’s trivial to attach timers to messages. Monitoring this timing data can provide early warnings of performance issues. This can trigger alerts if outside stated bounds.
  • No data loss – Traditional systems store the current state. Event sourcing in contrast stores the state change. The state change can hold valuable data which is otherwise hard to capture.
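
The replay check described at the start of this conclusion can be sketched in a few lines. This is a minimal illustration under assumptions: the read model is reduced to an id-to-name dictionary, only CustomerCreated events exist, and AuditCheck, Replay and Matches are invented names rather than anything from the download.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IEvent { }

public class CustomerCreated : IEvent
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public static class AuditCheck
{
    // Rebuilds a customer read model (id -> name) purely from the stored events.
    public static Dictionary<Guid, string> Replay(IEnumerable<IEvent> events)
    {
        var rebuilt = new Dictionary<Guid, string>();
        foreach (var e in events)
        {
            var created = e as CustomerCreated;
            if (created != null) rebuilt[created.Id] = created.Name;
        }
        return rebuilt;
    }

    // True only when the live read model holds exactly what the events produce.
    public static bool Matches(IEnumerable<IEvent> events, IDictionary<Guid, string> live)
    {
        var rebuilt = Replay(events);
        return rebuilt.Count == live.Count
            && rebuilt.All(kv =>
            {
                string name;
                return live.TryGetValue(kv.Key, out name) && name == kv.Value;
            });
    }
}
```

If anything edits the live model without going through an event, Matches returns false, which is the "sneaky update" case. A real check would compare two databases rather than two dictionaries.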

It is also interesting to note that event sourcing can be used without CQRS.

Have you ever tried to build an audit log? Would event sourcing have solved the issues you faced? What other approaches are there to solve this problem?

DOWNLOAD EXAMPLE EVENT SOURCING WITH CRUD PROJECT


[DevTip] 6 Code Smells with your CQRS Events – and How to Avoid Them

When starting out with CQRS, designing the ‘events’ is not always obvious or easy. The more experienced you are, the more likely your habits will lead you astray.  This post will help you spot the problems early and stay on the right track.

1. Watch out for the ‘Updated’ word

Watch out for event names that include the word ‘Updated’ or ‘Edited’. These are usually associated with state-based operations like those found in REST (Representational State Transfer) or CRUD (Create, Read, Update and Delete). Through heavy use of ORMs like Entity Framework and NHibernate, we have trained ourselves to think in CRUD terms. In the classic bank account scenario, when depositing money, what would make a good event name? A sensible answer may be ‘AccountUpdated’. Why not? After all, you have just updated the account, right? So here’s the issue. Can you read from the name of the event what just happened? AccountUpdated could refer to a deposit, a correction of the account holder name, a credit or the application of a charge etc. Event message names should reflect what just happened, and the content should describe the actual change. In fact, the words you use for event names are important and should reflect the ‘ubiquitous language’ of the domain.

2. Event Streams that don’t make Sense to a Domain Expert

Another indicator of bad event naming can be found when looking at the event stream. Would a domain expert be able to infer from the names alone what has just happened in the system? The following 2 event streams clearly illustrate the difference:

Stream 1 – Badly named events    Stream 2 – Well named events
AccountInserted                  AccountCreated
AccountUpdated                   FeeCharged
AccountUpdated                   MoneyWithdrawn
AccountUpdated                   AccountCredited
AccountDeleted                   AccountClosed

This also makes it easier to debug your application. Out of place events stand out in an event stream when they are clearly named.

3. Event Messages are not View Models

Another common temptation is to put the view model fields into your event. The key purpose of an event message is to represent what has just happened. The message should contain the information needed to rebuild the state of the domain object. It may well end up looking like a view model but that should not be the driving force behind its design. Given the bank account example above, you can spot this kind of issue when looking at the fields. A deposit event may include the current balance, but it must include the amount deposited, i.e. the change in state!

Events are also subscribed to by de-normalisers which can use the information to build out a highly optimised read model. By keeping the contents of the event focused on describing the state change, it gives you greater scope to produce better and potentially more diverse read models.
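
To illustrate events that carry the change rather than a view model, here is a small sketch. The MoneyDeposited event and the BalanceProjection class are hypothetical; MoneyWithdrawn borrows its name from the well named stream above. The balance is never stored in the events, it is derived by folding over the stream.

```csharp
using System;
using System.Collections.Generic;

public interface IEvent { }

// Each event names what happened and carries only the change in state.
public class MoneyDeposited : IEvent
{
    public readonly Guid AccountId;
    public readonly decimal Amount;

    public MoneyDeposited(Guid accountId, decimal amount)
    {
        AccountId = accountId;
        Amount = amount;
    }
}

public class MoneyWithdrawn : IEvent
{
    public readonly Guid AccountId;
    public readonly decimal Amount;

    public MoneyWithdrawn(Guid accountId, decimal amount)
    {
        AccountId = accountId;
        Amount = amount;
    }
}

public static class BalanceProjection
{
    // Derives the current balance of one account from the event stream.
    public static decimal BalanceOf(Guid accountId, IEnumerable<IEvent> stream)
    {
        decimal balance = 0m;
        foreach (var e in stream)
        {
            var deposit = e as MoneyDeposited;
            if (deposit != null && deposit.AccountId == accountId) balance += deposit.Amount;

            var withdrawal = e as MoneyWithdrawn;
            if (withdrawal != null && withdrawal.AccountId == accountId) balance -= withdrawal.Amount;
        }
        return balance;
    }
}
```

Because the events only describe deltas, a different read model (say, a monthly statement) could be built from the same stream without changing the events.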

4. Event Names should be Past Tense

This seems obvious but remember that an ‘Event’ is always something that has happened. Its name should therefore be in the past tense.

5. Missing Commonly Required Fields

Here is a list of common fields found in a typical event and what they are used for:

  • AggregateId – This field is used to associate the particular event to a specific aggregate root.
  • Date Time Stamp – Ordering of events is crucial. Replaying events in the wrong order can result in unpredictable outcomes.
  • UserId – This field is commonly required in line of business applications and can be used to build audit logs. It is a common field, but not always necessary and depends on the specific domain.
  • Version – The version number allows the developer to handle concurrency conflicts and partial connection scenarios. For more information, take a look at Handling Concurrency Issues in a CQRS Event Sourced system.
  • ProcessId – At its simplest, this field can be used to tie a series of events back to their originating command. However, it can also be used to ensure the idempotence* of the event.

* Idempotence refers to the ability of a system to produce the same outcome, even if an event or message is received more than once.
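
As a rough sketch of idempotent handling, the de-normaliser below ignores an event it has already processed. It assumes that the pair (ProcessId, Version) identifies a single event, which is an assumption of this example rather than a rule from the article. IdempotentBalanceHandler and this shape of MoneyDeposited are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class MoneyDeposited
{
    public readonly Guid AggregateId;
    public readonly int Version;
    public readonly Guid ProcessId;
    public readonly decimal Amount;

    public MoneyDeposited(Guid aggregateId, int version, Guid processId, decimal amount)
    {
        AggregateId = aggregateId;
        Version = version;
        ProcessId = processId;
        Amount = amount;
    }
}

public class IdempotentBalanceHandler
{
    // Keys of events already applied (assumed unique per event in this sketch).
    private readonly HashSet<Tuple<Guid, int>> _seen = new HashSet<Tuple<Guid, int>>();

    public decimal Balance { get; private set; }

    public void Handle(MoneyDeposited e)
    {
        // HashSet.Add returns false when the key is already present,
        // so a redelivered event leaves the balance untouched.
        if (!_seen.Add(Tuple.Create(e.ProcessId, e.Version))) return;
        Balance += e.Amount;
    }
}
```

In a real system the set of processed keys would need to be persisted alongside the read model, otherwise a restart would forget what has been seen.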

There are no hard and fast rules as to what to include; however, the list above should point you in the right direction.

6. Mutable Events

Given the reliance on events as the source of truth, it is vitally important to ensure they are immutable. That is to say, once an event object is created, it should not be possible to change any of its fields. This property of messaging in general has some profound implications for your overall system. Immutable objects are far easier to test with, can be reliably sent around a system or communications bus and play well in a multi-threaded environment. I follow a very simple pattern when creating my events:

public class CustomerCreated : Event
{
    // readonly fields can only be assigned in the constructor
    public readonly Guid AggregateId;
    public readonly int Version;
    public readonly Guid UserId;
    public readonly Guid ProcessId;
    // Other fields specific to this event

    public CustomerCreated(Guid aggregateId, int version, Guid userId, Guid processId)
    {
        AggregateId = aggregateId;
        Version = version;
        UserId = userId;
        ProcessId = processId;
    }
}

Conclusion

The 6 indicators described above are just rules of thumb. Use them as a guide, not a rule book. They should help you avoid some common early mistakes. The process of naming events gets easier as your knowledge of the specific domain grows and matures. It is often harder to find the right names early in a project, and difficulty here usually hints at missing concepts and a lack of understanding. When you do get it right and your events form the basis of a common language, domain experts are able to reason clearly about your system and communicate their ideas and requirements far more clearly.

REF: http://danielwhittaker.me/2014/10/18/6-code-smells-cqrs-events-avoid/

[DevTip] CQRS – A Step-by-Step Guide to the Flow of a typical Application

A common issue I see is understanding the flow of commands, events and queries within a typical CQRS ES based system. The following post is designed to clear up what happens at each step. Hopefully this will help you to reason about your code and what each part does.

[Figure: CQRS flow]

1. A Command is generated from the UI

Typically the UI displays data to the user. The user generates a command by requesting some form of change.

2. The Message Router or Bus receives the command

The message bus is responsible for routing the command to its handler.

3. The handler prepares the aggregate root

The handler gets the aggregate root and applies all previous events to it. This brings the AR (aggregate root) up to its current state. This is typically very fast, even with thousands of events. If it does become a performance bottleneck, a snapshotting process can be adopted to overcome the issue.

4. The command is issued

Once the AR is up to date the command is issued. The AR then ensures it can run the command and works out what would need to change but DOESN’T at this point change anything. Instead, it builds up the event or events that need to be applied to actually change its state. It then applies them to itself. This part is crucial and is what allows ‘events’ to be re-run in the future. The command phase can be thought of as the behaviour and the apply phase as the state transition.
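
The split between behaviour and state transition can be sketched as follows. This is an illustrative aggregate under assumptions, not code from a particular framework: Account, FromHistory, Raise and GetUncommittedChanges are hypothetical names, and the events are reduced to a single Amount field.

```csharp
using System;
using System.Collections.Generic;

public interface IEvent { }

public class MoneyDeposited : IEvent
{
    public readonly decimal Amount;
    public MoneyDeposited(decimal amount) { Amount = amount; }
}

public class MoneyWithdrawn : IEvent
{
    public readonly decimal Amount;
    public MoneyWithdrawn(decimal amount) { Amount = amount; }
}

public class Account
{
    private readonly List<IEvent> _uncommitted = new List<IEvent>();

    public decimal Balance { get; private set; }

    // Step 3: rebuild current state by applying all previous events.
    public static Account FromHistory(IEnumerable<IEvent> history)
    {
        var account = new Account();
        foreach (var e in history) account.Apply(e);
        return account;
    }

    // Step 4 (behaviour): validate and raise an event; no field is changed here.
    public void Deposit(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        Raise(new MoneyDeposited(amount));
    }

    public void Withdraw(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        if (amount > Balance) throw new InvalidOperationException("Insufficient funds");
        Raise(new MoneyWithdrawn(amount));
    }

    // Events raised since loading, waiting to be persisted by the command handler.
    public IReadOnlyList<IEvent> GetUncommittedChanges()
    {
        return _uncommitted.AsReadOnly();
    }

    private void Raise(IEvent e)
    {
        Apply(e);
        _uncommitted.Add(e);
    }

    // State transition: the only place where fields change, so replay is safe.
    private void Apply(IEvent e)
    {
        var deposited = e as MoneyDeposited;
        if (deposited != null) Balance += deposited.Amount;

        var withdrawn = e as MoneyWithdrawn;
        if (withdrawn != null) Balance -= withdrawn.Amount;
    }
}
```

Because Apply contains no validation, replaying history can never fail a business rule that was already satisfied when the event was first raised.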

5. The command handler requests the changes

At this point, assuming no exceptions have been raised, the command handler requests the uncommitted changes. Note that no persistence has actually taken place yet. The domain classes have no dependencies on external services. This makes them much easier to write and ensures they are not polluted by persistence requirements (I’m looking at you, Entity Framework).

6. The command handler requests persistence of the uncommitted events

This is where an event storage service comes into play. Its responsibility is to persist the events and also to ensure that no concurrency conflicts occur. You can read up on how to do this in a previous post of mine: How to handle concurrency issues in a CQRS Event Sourced system.
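
One common way to detect such conflicts is an expected-version check on save. The sketch below is a hypothetical in-memory store, not the approach from the linked post: InMemoryEventStore, ConcurrencyException and CustomerRenamed are invented names, the version is simply the number of events in the stream, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public interface IEvent { }

// Sample event used only to exercise the store.
public class CustomerRenamed : IEvent
{
    public readonly string NewName;
    public CustomerRenamed(string newName) { NewName = newName; }
}

public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

public class InMemoryEventStore
{
    private readonly Dictionary<Guid, List<IEvent>> _streams =
        new Dictionary<Guid, List<IEvent>>();

    // expectedVersion is the number of events the caller saw when it loaded the aggregate.
    public void Save(Guid aggregateId, int expectedVersion, IEnumerable<IEvent> newEvents)
    {
        List<IEvent> stream;
        if (!_streams.TryGetValue(aggregateId, out stream))
        {
            stream = new List<IEvent>();
            _streams[aggregateId] = stream;
        }

        // Someone else appended events since the caller loaded: reject the write.
        if (stream.Count != expectedVersion)
        {
            throw new ConcurrencyException(
                "Expected version " + expectedVersion + " but stream is at " + stream.Count);
        }

        stream.AddRange(newEvents);
    }

    // Returns a read-only copy of the stream, or an empty list for an unknown aggregate.
    public IReadOnlyList<IEvent> Load(Guid aggregateId)
    {
        List<IEvent> stream;
        return _streams.TryGetValue(aggregateId, out stream)
            ? new List<IEvent>(stream).AsReadOnly()
            : new List<IEvent>().AsReadOnly();
    }
}
```

A caller that loses the race can reload the aggregate, re-run the command against the newer state and try again.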

7. The events are published onto the Bus or Message Router

Unlike commands, which only trigger one command handler, events can be routed to multiple de-normalisers. This enables you to build up very flexible, optimised read models.

8. De-normalisers build up the Read Model

The concept of a de-normaliser can at first be a little tricky. The problem is that we are all trained to think in ‘entities’, ‘models’ or ‘tables’. Generally these are derived from normalised data and glued together into the form required for the front end. This process often involves complex joins, views and other database query techniques. A de-normaliser, on the other hand, translates certain events into the perfect form required for the various screens in your system. No joins required at all, ever! This makes reads very fast and is the basis behind the claim that this style of architecture is almost linearly scalable. Most people begin to get twitchy at this point when they realise that duplicate data may exist in the read model. The important thing to remember is that the ‘event stream’ is the only source of truth and there is no (or should be no) accidental duplication within it. This allows you to re-create the entire read model, or just parts of it, at will.
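
As an illustration of a de-normaliser, the sketch below keeps one pre-shaped row per customer for a hypothetical summary screen. OrderPlaced, CustomerSummaryDto and CustomerSummaryDenormaliser are invented for this example, and a dictionary stands in for the read database.

```csharp
using System;
using System.Collections.Generic;

public class CustomerCreated
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical second event, used to show two events feeding one read model.
public class OrderPlaced
{
    public Guid CustomerId { get; set; }
    public decimal Total { get; set; }
}

// Read-model row shaped for one screen: nothing needs joining to display it.
public class CustomerSummaryDto
{
    public Guid CustomerId { get; set; }
    public string Name { get; set; }
    public int OrderCount { get; set; }
    public decimal TotalSpent { get; set; }
}

public class CustomerSummaryDenormaliser
{
    public readonly Dictionary<Guid, CustomerSummaryDto> ReadModel =
        new Dictionary<Guid, CustomerSummaryDto>();

    public void Handle(CustomerCreated e)
    {
        ReadModel[e.Id] = new CustomerSummaryDto { CustomerId = e.Id, Name = e.Name };
    }

    public void Handle(OrderPlaced e)
    {
        CustomerSummaryDto row;
        // Orders for unknown customers are ignored in this sketch.
        if (!ReadModel.TryGetValue(e.CustomerId, out row)) return;
        row.OrderCount += 1;
        row.TotalSpent += e.Total;
    }
}
```

The customer name is duplicated into this row on purpose. If the screen later needs a different shape, the row can be thrown away and rebuilt by replaying the events.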

9. Data Transfer Objects are persisted to the Read Model

The final phase of the de-normaliser is to persist the simple DTOs (data transfer objects) to the database. These objects are essentially property buckets and usually contain the ID of the aggregate they are associated with and a version number to aid in concurrency checking. These DTOs provide the information the user requires in order to form new commands and start the cycle over again.

All this results in a Highly Optimised Read Side Model

The read/query side is entirely independent of the commands and events, hence CQRS (Command Query Responsibility Segregation). The query side of the application is designed to issue queries against the read model for DTOs. This process is made trivial by the de-normalisation of the read data.

A. User requests data

All the data is optimised for reading. This makes querying very simple. If you require values for a ‘type ahead drop down list’, just get the items from an optimised list designed especially for the task. No extra data need be supplied apart from that required to drive the drop down. This helps keep the data payload light, which in turn helps the application remain responsive to the user.

B. Simple Data Transfer Objects

The read model just returns simple and slim DTOs that are, as I said before, easy to work with on the front end.

In Conclusion

CQRS’s biggest hurdle is its perceived complexity. Don’t be fooled by all the steps above. Unlike a ‘simple CRUD’ approach, which starts off simple but quickly gains in complexity over time, this approach remains relatively resistant to increased complexity as the scope of the application grows.

REF: http://danielwhittaker.me/2014/10/02/cqrs-step-step-guide-flow-typical-application/