Building a Meaningful Audit Trail in Your System

I have seen two primary approaches to building audit trails for systems – entity-level audits and property-level audits.

An entity-level audit maintains an audit log for every entity and makes an audit entry for ‘any’ operation done on that entity, whether a single property changes or the whole entity changes as part of an operation that affects a bunch of other entities in the same transaction. At the database level, this is often implemented using triggers that copy the row being changed to an audit table along with some audit information.

A property-level audit, on the other hand, goes to the other extreme and makes an audit entry every time any property on any entity changes. At the database level, this often translates into one big fat table that keeps track of every field in every table that has changed, along with some general audit information.
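To make the difference in granularity concrete, here is a minimal sketch of what the two kinds of audit entry might look like. All type and member names (EntityAuditEntry, PropertyAuditEntry, PropertyAudit.Diff) are hypothetical and not taken from any particular system, and a row is represented as a simple dictionary for brevity.

```csharp
using System;
using System.Collections.Generic;

// Entity-level: a full copy of the changed row plus general audit information.
public class EntityAuditEntry
{
    public string Table { get; set; }
    public Dictionary<string, object> RowSnapshot { get; set; }
    public string ChangedBy { get; set; }
    public DateTime ChangedAtUtc { get; set; }
}

// Property-level: one entry per changed field.
public class PropertyAuditEntry
{
    public string Table { get; set; }
    public string Field { get; set; }
    public object OldValue { get; set; }
    public object NewValue { get; set; }
    public string ChangedBy { get; set; }
    public DateTime ChangedAtUtc { get; set; }
}

public static class PropertyAudit
{
    // Compares two versions of a row and emits one entry per field whose
    // value differs. Fields that exist only in 'before' are ignored here.
    public static List<PropertyAuditEntry> Diff(
        string table,
        IDictionary<string, object> before,
        IDictionary<string, object> after,
        string changedBy)
    {
        var entries = new List<PropertyAuditEntry>();
        var now = DateTime.UtcNow;

        foreach (var field in after.Keys)
        {
            before.TryGetValue(field, out var oldValue);
            var newValue = after[field];

            if (!object.Equals(oldValue, newValue))
            {
                entries.Add(new PropertyAuditEntry
                {
                    Table = table,
                    Field = field,
                    OldValue = oldValue,
                    NewValue = newValue,
                    ChangedBy = changedBy,
                    ChangedAtUtc = now
                });
            }
        }

        return entries;
    }
}
```

Neither shape says anything about why the change happened: both record what the data looked like, not what the user was trying to do.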

While these are two widely adopted implementation approaches for system audits, what both fail to capture is the actual intent of the user. The granularity at which the system captures the audit data has an impedance mismatch with the business, and it can get difficult to tell what was actually done to the system to create a given audit entry. Hence, this audit data has very limited business meaning and use.

Think about a modelling strategy where the actual operations that can be performed on the system are expressed explicitly. Wouldn’t it be a lot easier to just record these operations, or the fact that they were performed on the system? This kind of audit data would have a granularity that makes sense to the business, and creating business-meaningful audit reports out of such audit logs would be a lot easier, wouldn’t it?
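As a rough sketch of this idea, assume each business operation is modelled as an explicit command object. The audit log then records one entry per command performed. Everything here (ChangeCustomerAddressCommand, AuditEntry, AuditLog) is a hypothetical illustration, and the log is kept in memory only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command: one explicit business operation.
public sealed class ChangeCustomerAddressCommand
{
    public Guid CustomerId { get; }
    public string NewAddress { get; }

    public ChangeCustomerAddressCommand(Guid customerId, string newAddress)
    {
        CustomerId = customerId;
        NewAddress = newAddress;
    }
}

// One audit entry per operation performed, not per row or per property changed.
public sealed class AuditEntry
{
    public string Operation { get; }
    public string PerformedBy { get; }
    public DateTime PerformedAtUtc { get; }
    public object Command { get; }

    public AuditEntry(string operation, string performedBy, DateTime performedAtUtc, object command)
    {
        Operation = operation;
        PerformedBy = performedBy;
        PerformedAtUtc = performedAtUtc;
        Command = command;
    }
}

// Hypothetical in-memory log; a real system would persist the entries.
public sealed class AuditLog
{
    private readonly List<AuditEntry> entries = new List<AuditEntry>();

    public IReadOnlyList<AuditEntry> Entries => entries;

    // Records the fact that a command was performed, named after its type.
    public void Record(object command, string performedBy)
    {
        entries.Add(new AuditEntry(command.GetType().Name, performedBy, DateTime.UtcNow, command));
    }
}
```

Recording a ChangeCustomerAddressCommand for user "alice" adds a single entry whose Operation is "ChangeCustomerAddressCommand", which already reads like a line in a business report.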

Implementing Repository in DDD – Part 1

Implementing a Repository is fairly straightforward in DDD. There are generally two styles of implementation:
1. aggregate-dedicated repository and
2. generic repository.

An aggregate-dedicated repository has a method per command or query, while a generic repository has a standard interface for every aggregate. In both cases, however, the repository is supposed to operate at the aggregate-root level. This post demonstrates the implementation of an aggregate-dedicated repository.

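In your domain layer,

public class Order : IAggregateRoot
{
    // Repository handed in by the application layer.
    private readonly IOrderRepository repository;

    public Order(IOrderRepository repository)
    {
        this.repository = repository;
        ..
    }

    // Named PlaceOrder because a C# member cannot share its enclosing type's name.
    public void PlaceOrder(PlaceOrderCommand command)
    {
        ..
        repository.Add(this);
    }

    public Order Get(Guid id)
    {
        return repository.Single(id);
    }

    public IList<Order> GetOrdersByCustomer(Guid customerId)
    {
        // '==' compares; the single '=' would be an assignment.
        return repository.Where(o => o.CustomerId == customerId);
    }
}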

In your application layer,

public class OrderService
{
    public void PlaceOrder(PlaceOrderCommand command)
    {
        this.Transaction();
        // Hand the repository to the aggregate through its constructor.
        var order = new Order(new OrderRepository());
        order.PlaceOrder(command);
        this.CompleteTransaction();
    }
}

The application layer injects an instance of the repository into the aggregate through its constructor. The aggregate root then calls the repository internally to persist and retrieve data.

In this style of implementation, a dedicated repository is designed for each aggregate root. This means a little more coding up front compared to generic repositories, as there is one repository per aggregate. Writing unit tests might be a little more difficult. But overall, the simplicity and flexibility that this implementation brings is huge, and depending on your taste, choice of tech stack and data-access needs, it might be a very good fit for you. Definitely worth considering.
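To make the contrast between the two repository styles concrete, here is a minimal sketch using a hypothetical Customer aggregate (chosen so it does not clash with the Order listing in this post). The Id member on IAggregateRoot, the method names on ICustomerRepository, and the in-memory implementation are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Marker interface for aggregate roots (the Id member is an assumption of this sketch).
public interface IAggregateRoot
{
    Guid Id { get; }
}

// Hypothetical aggregate used only for this illustration.
public class Customer : IAggregateRoot
{
    public Guid Id { get; }
    public string Email { get; }

    public Customer(Guid id, string email)
    {
        Id = id;
        Email = email;
    }
}

// Generic style: one standard interface shared by every aggregate root.
public interface IRepository<T> where T : IAggregateRoot
{
    void Add(T aggregate);
    T Single(Guid id);
    IList<T> Where(Func<T, bool> predicate);
}

// Aggregate-dedicated style: one method per command or query the aggregate needs.
public interface ICustomerRepository
{
    void Add(Customer customer);
    Customer GetById(Guid id);
    Customer FindByEmail(string email);
}

// In-memory implementation of the dedicated style, e.g. for unit tests.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<Guid, Customer> store = new Dictionary<Guid, Customer>();

    public void Add(Customer customer) => store[customer.Id] = customer;

    // Throws KeyNotFoundException when the id is unknown.
    public Customer GetById(Guid id) => store[id];

    // Returns null when no customer has the given email.
    public Customer FindByEmail(string email) =>
        store.Values.FirstOrDefault(c => c.Email == email);
}
```

The dedicated interface exposes only the queries this aggregate actually needs, at the cost of writing one such interface per aggregate.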

Command Query Separation

Have a look at your current application for a moment and try to split it into two parts – the ‘read-only’ side and the write side. Please notice that I said ‘read-only’ side. As the name suggests, the read-only side is the side of the application that is responsible for reading data from the database and displaying it on the UI. Nowhere during the request is it supposed to save any change back to the database. Displaying data in a pageable, sortable grid falls under the ‘read-only’ side of the application. So do reports. These are the requests that do not make any change to the database. In other words, they do not change the state of the system.

The other side of the application, i.e. the write side, persists the changes made by users into the database.

Let’s say that you were to architect the read-only and the write sides of the application separately, as though they were two different systems. What are some of the design goals that you would come up with? Try this exercise with your current application and put down a few design goals for each side, in order of priority.

As far as I am concerned, eighty percent of my users are going to use twenty percent of my application. Moreover, this twenty percent consists primarily of the read-only side of the application. Hence, for me, some of the high-priority design goals for the read-only side of my application would be:

1. Performance
2. Scalability

By no means am I suggesting that the write side of the application is any less important. In my case, it is actually the foundation for the read side and hence is critically important for the success of the system. But it has different design priorities. Since it mostly deals with complex business rules and guards against data corruption, some of the design goals that would top the list are as follows:

1. Data Integrity
2. Maintainability
3. Extensibility
4. Flexibility

Clearly, the architectural needs of the read-only and write sides of my application are different in nature. Is it the same for you? If it is, then the question we should be asking ourselves is whether it is justified to apply, or rather impose, the same architectural patterns on both sides of the application just for the sake of symmetry.

Rich object graphs are ideal for the write side of the application. They result in highly maintainable code. But they start to play nasty in situations where complex joins are needed and high performance is a priority, something that the read-only side of the application needs a lot. And really, old-fashioned stored procedures and inline queries work like a charm in these kinds of situations. The problem with the stored procedure and inline query approach is that it does not provide the same kind of maintainability and data integrity that rich object graphs do. Hence, separating the read-only and the write sides of the application and applying different architectural patterns to each can very well be the answer.
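A small sketch of how the two sides might differ in shape. On the write side, a rich object enforces a business rule before changing state. On the read side, a flat row is sorted and paged for a grid with no domain behaviour involved. The names (Order, OrderGridRow, OrderGridQuery) and the rule itself are hypothetical, and LINQ over an in-memory list merely stands in for the stored procedure or inline query that a real read side would run against the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Write side: a rich object that guards its own business rule.
public class Order
{
    public Guid Id { get; }
    public bool IsShipped { get; private set; }
    public bool IsCancelled { get; private set; }

    public Order(Guid id)
    {
        Id = id;
    }

    public void Ship()
    {
        if (IsCancelled)
            throw new InvalidOperationException("A cancelled order cannot be shipped.");
        IsShipped = true;
    }

    public void Cancel()
    {
        if (IsShipped)
            throw new InvalidOperationException("A shipped order cannot be cancelled.");
        IsCancelled = true;
    }
}

// Read side: a flat, behaviour-free row shaped for a grid or report.
public class OrderGridRow
{
    public Guid OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

public static class OrderGridQuery
{
    // Returns one page of rows sorted by Total, highest first.
    // pageNumber is 1-based. In a real system the sorting and paging
    // would be pushed down into the database query.
    public static List<OrderGridRow> Page(IEnumerable<OrderGridRow> rows, int pageNumber, int pageSize)
    {
        return rows
            .OrderByDescending(r => r.Total)
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

The write side never needs to know how the grid is sorted, and the read side never needs to load an Order object just to display a total.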

Bertrand Meyer, in his book “Object-Oriented Software Construction”, separates an object’s methods into two categories:

Queries: Return a result and do not change the observable state of the system (are free of side effects). In other words, the read-only methods.

Commands: Change the state of a system but do not return a value. The write methods.

Meyer calls this principle Command Query Separation. Applied at the architectural level, it leads to a clear segregation of the commands (write operations) and the queries (read-only operations), and allows different architectural patterns to be applied to the very different design needs of the command and query sides of the application.
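At the level of a single object, the principle might look like the following minimal sketch. The Account class and its members are hypothetical and are not taken from Meyer’s book.

```csharp
using System;

public class Account
{
    private decimal balance;

    // Query: returns a result and leaves the observable state unchanged.
    public decimal GetBalance()
    {
        return balance;
    }

    // Command: changes the state and returns nothing.
    public void Deposit(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");
        balance += amount;
    }
}
```

Calling GetBalance any number of times never changes what the next call returns; only a command such as Deposit can do that.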