Tuesday, June 10, 2008

Idempotent Messages

I know in my last post I said we'd be continuing our outsourcing example. However, before doing so I need to explain the concept of idempotent messages (you'll understand why when you read my next post).

Idempotence is not so much a property of the message as a property of how the receiving service handles it. A message is idempotent if the service operation that processes it yields the same result regardless of the number of times the message is received.

Some operations are idempotent by nature, whereas others require special treatment in order to become idempotent. Read-only operations are naturally idempotent because they don't have any lasting effect. An "update customer" operation is also idempotent, provided the message carries the new values rather than a relative change: no matter how many times you update the customer with the same information, it yields the same result.

Operations such as "transfer $100 from account X to account Y", however, are not idempotent. If the same message is replayed 10 times, then $1,000 will be transferred over 10 transactions. In these cases we need a mechanism to detect the duplicate messages and ignore them.
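To make the difference concrete, here is a minimal sketch in Python. The in-memory dictionaries and the function names `update_customer` and `transfer` are made up for illustration; a real service would of course sit on top of a database.

```python
# Hypothetical in-memory state, for illustration only.
customers = {"C1": {"email": "old@example.com"}}
balances = {"X": 1000, "Y": 0}


def update_customer(customer_id, email):
    """Naturally idempotent: sets the state to an absolute value."""
    customers[customer_id]["email"] = email


def transfer(source, target, amount):
    """Not idempotent: every call changes the state relative to its current value."""
    balances[source] -= amount
    balances[target] += amount


# Replay each message 10 times.
for _ in range(10):
    update_customer("C1", "new@example.com")
    transfer("X", "Y", 100)

print(customers["C1"]["email"])  # new@example.com
print(balances)                  # {'X': 0, 'Y': 1000}
```

The customer ends up in the same state as after a single update, whereas the accounts have moved $1,000 rather than $100.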

In some cases duplicates are easy to detect. For example, if we receive a ShipOrderRequest message containing an order number and store the order number in the Shipping service database, then all we need to do when receiving a ShipOrderRequest message is check the Shipping database for the given order number and if found, disregard the request.
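As a rough sketch of that check, assuming a Shipping database with a hypothetical `shipments` table keyed on order number (SQLite in memory stands in for the real database, and the handler name is my own):

```python
import sqlite3

# Stand-in for the Shipping service database (assumed schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shipments (order_number TEXT PRIMARY KEY)")
db.commit()


def handle_ship_order_request(order_number):
    """Return True if the order was shipped, False if the request was a duplicate."""
    with db:  # commit on success, roll back if an exception is raised
        found = db.execute(
            "SELECT 1 FROM shipments WHERE order_number = ?", (order_number,)
        ).fetchone()
        if found:
            return False  # already shipped: disregard the request
        db.execute(
            "INSERT INTO shipments (order_number) VALUES (?)", (order_number,)
        )
        # ... the actual shipping work would happen here ...
    return True


print(handle_ship_order_request("ORD-1001"))  # True
print(handle_ship_order_request("ORD-1001"))  # False
```

Making the order number a primary key is a design choice on my part: if two copies of the message were ever processed concurrently, the second insert would fail rather than quietly create a second shipment.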

Some scenarios require a bit more effort from the service consumer. Consider the account transfer operation described above. In this case, there is nothing in the message to identify that we have already processed that message. We cannot differentiate between a duplicate and another legitimate request to transfer $100 between the same accounts.

In such cases we require that the service consumer place a unique message ID in each request message. A GUID works well for this. The receiving service can then store the message ID against the resultant account transfer transaction record in the database. Before processing a message, the receiving service checks the transaction table to see if the given message ID is already present. If so, the request message is discarded.
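Here is one way this could look, again as a sketch rather than a definitive implementation. The `accounts` and `transfers` tables and the `handle_transfer` function are assumptions of mine, and SQLite in memory stands in for the service's database. The message ID is stored on the transfer record in the same database transaction as the balance updates, so the two cannot get out of step.

```python
import sqlite3
import uuid

# Assumed schema: the message ID is stored against each transfer record.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
db.execute(
    "CREATE TABLE transfers ("
    "message_id TEXT PRIMARY KEY, source TEXT, target TEXT, amount INTEGER)"
)
db.executemany("INSERT INTO accounts VALUES (?, ?)", [("X", 1000), ("Y", 0)])
db.commit()


def handle_transfer(message_id, source, target, amount):
    """Return True if the transfer was applied, False if the message was a duplicate."""
    with db:  # the record and both balance updates commit or roll back together
        seen = db.execute(
            "SELECT 1 FROM transfers WHERE message_id = ?", (message_id,)
        ).fetchone()
        if seen:
            return False  # duplicate: discard the request
        db.execute(
            "INSERT INTO transfers VALUES (?, ?, ?, ?)",
            (message_id, source, target, amount),
        )
        db.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, source)
        )
        db.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, target)
        )
    return True


def balance(account_id):
    row = db.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


# The consumer generates one GUID per logical request and reuses it on resends.
first_request = str(uuid.uuid4())
handle_transfer(first_request, "X", "Y", 100)
handle_transfer(first_request, "X", "Y", 100)      # duplicate: ignored
handle_transfer(str(uuid.uuid4()), "X", "Y", 100)  # new request: processed

print(balance("X"), balance("Y"))  # 800 200
```

Note that a second legitimate request to transfer $100 between the same accounts goes through, because it carries a different message ID.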

So why go to all this effort? Under what circumstances do we need idempotent messages? So far in our discussions I have assumed the use of a transactional guaranteed message delivery transport (such as MSMQ). Such transports handle the detection and removal of duplicate messages as part of the messaging infrastructure.

Furthermore, a transactional transport allows us to remove a message from a queue or topic as part of a broader distributed transaction. This means that the message is not lost if the service fails to process it. A failure results in the message being placed back on the queue or topic. I'll cover transactional services in more detail in a future post.

However, such transports are not always available. For example, when integrating with third party organisations, we tend to rely on Web services over an HTTP transport. HTTP does not guarantee delivery.

The problem with this is that when a failure occurs (e.g. the connection fails), the service consumer cannot determine whether or not the request message was actually delivered and processed. In some situations, losing a message isn't very important. For example, if someone is sending us weather updates every minute, it may not matter if we lose one because there'll be another along shortly.

In other situations, however, we require a guaranteed message delivery service level agreement. This is only achievable over an unreliable transport if the consumer keeps resending the message until it receives confirmation from the service provider, in the form of a response message, that the original message has been successfully processed.

Now this is fine if the message is lost en route to the service provider. But what if the message was successfully processed and the confirmation response message is lost on its way back to the service consumer? The consumer will resend the request message and the service provider will receive and process the message twice.
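The following sketch simulates that scenario. Everything in it is hypothetical (the `Provider`, the `FlakyChannel` that stands in for an unreliable transport, and the `send_until_confirmed` helper); the point is only to show the resend loop and what happens when the confirmation, rather than the request, goes missing.

```python
class LostResponse(Exception):
    """Raised when the consumer receives no confirmation (hypothetical)."""


class Provider:
    """A service provider with no duplicate detection."""

    def __init__(self):
        self.processed_count = 0

    def handle(self, message):
        self.processed_count += 1
        return "OK"


class FlakyChannel:
    """Simulated transport: delivers every request, drops the first N responses."""

    def __init__(self, provider, responses_to_drop):
        self.provider = provider
        self.responses_to_drop = responses_to_drop

    def send(self, message):
        response = self.provider.handle(message)  # the request arrives and is processed
        if self.responses_to_drop > 0:
            self.responses_to_drop -= 1
            raise LostResponse()  # the confirmation is lost on the way back
        return response


def send_until_confirmed(channel, message, max_attempts=5):
    """Resend the message until a confirmation arrives; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            channel.send(message)
            return attempt
        except LostResponse:
            continue  # no confirmation, so the consumer has to try again
    raise RuntimeError("no confirmation received")


provider = Provider()
channel = FlakyChannel(provider, responses_to_drop=1)
attempts = send_until_confirmed(channel, {"transfer": 100})

print(attempts, provider.processed_count)  # 2 2
```

From the consumer's point of view the message was delivered once, on the second attempt. From the provider's point of view it was processed twice.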

When the operation performed by the service provider in response to receiving this message is not naturally idempotent, the service provider must detect the duplicate message and disregard it.

Of course this is a lot of extra effort to go to when implementing your service logic. So use transactional guaranteed delivery transports where available and appropriate. They'll save you a lot of time.
