The textbooks are full of integration patterns and best practices, all worthy of study and the Holy Grail to which we aspire as solution architects, designers and developers.

Then we encounter the real world, where things just don’t work quite the way our training courses and industry gurus told us they should.

Let me lead with the fact that best practices are there for a reason and we should always strive to ensure that we implement those patterns and methods as far as we can. However, the Integration World is — by definition — one in which we will encounter unexpected, and less than perfect, external interfaces and APIs over which we have no control.

When this happens, we can behave like a novice and rage against the machine. We can tell our stakeholders that the third-party suppliers need to fix their interface and make it conform to industry standards (for the record, they won’t).

We can rant about how poor the external systems are and complain about the shocking implementations. We can even list all the reasons why it is inadvisable or even impossible to integrate any solution with such a badly designed interface…

Or we can behave like a Mule Whisperer — and find a solution!

 

Creating a Delta on Inbound Data

So it was when a client came to me with this problem.

They needed to transfer data from one source system to two target systems. The challenge was that the source data is supplied once daily in the form of a fixed-length file — and the legacy source system dumps its entire data set into that file each day.

Of course, we don’t want to make thousands of unnecessary updates; we would prefer to filter the data set down to just the records that have been added or changed since the last transfer. Now for the challenge: there is no way to identify from the source data whether a record is new or has been changed.

Nor is there any date or incremented identity value which we can use as a watermark on the inbound data. The only way to establish whether a record has changed is to compare every field in the source data against every field in the target data. If any one of the fields does not match, then we can determine that the data has changed.

The proposed solution (from the client’s technical team) was that we perform this check for every record in the source data before actually adding or updating each record…

… in order to reduce the performance hit on the target systems.

I politely pointed out that this would be even more expensive, from a performance point of view, than simply writing thousands of unnecessary updates on each daily run.   

 

Solution — Playing the Hand You’re Dealt

 

 

One of the target systems stores its data in an underlying Oracle database. A bit of research revealed that I could concatenate all the required fields and perform an MD5 hash on the concatenated result using a handful of out-of-the-box Oracle functions — all in a single SQL statement.

 

SELECT my_record_id,
       LOWER(RAWTOHEX(STANDARD_HASH(col1 || col2 || col3 || col4, 'MD5'))) AS hash_data
FROM my_table tbl
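
For anyone wondering about the wrapping functions: STANDARD_HASH returns a RAW value, RAWTOHEX converts that into an uppercase hexadecimal string, and LOWER brings it down to the lowercase form the Mule side will produce, so the two values can be compared as plain strings. It also helps that Oracle treats NULL columns as zero-length strings in a || concatenation, which is behaviour we will need to mirror on the Mule side.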

 

I whispered…

… and Mule delivered, with a little co-operation from Oracle’s PL/SQL!

By crafting the SQL statement just so, I was able to produce an identical hash value to one I could generate from my source data in a DataWeave 2.0 transform.

 

%dw 2.0
output application/json
import dw::Crypto
// The source record's fields, in the same order as the columns in the SQL statement
var dataArray = payload.dataArray
---
{
    recordId: "<my-record-id>",
    // Drop the nulls, concatenate the remaining fields and MD5 the result
    hashValue: Crypto::MD5((dataArray filter ($ != null) joinBy "") as Binary)
}
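
Two details matter here: the filter drops the null fields so the concatenation lines up with Oracle’s handling of NULLs in ||, and the fields in dataArray have to be concatenated in exactly the same order as the columns in the SQL statement, otherwise the hashes will never agree.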

 

So… by making a single call to fetch all of the hashed target data records as a (relatively small) data set and then performing the DataWeave hash on each source record, I was able to add a simple Validation component to my Mule flow and filter out any records that had matching hash values.
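
To make that last step concrete, here is a minimal sketch of the comparison expressed as a single DataWeave filter. The names are mine rather than the client’s: it assumes the id/hash rows from the SQL query have been stashed in a variable I’ve called targetHashes, and that each source record exposes its fields as a dataArray, as in the transform above. (In the real flow the check sat behind a Validation component, but the logic is the same.)

%dw 2.0
output application/json
import dw::Crypto
// Assumed shape of vars.targetHashes (the rows returned by the SQL query above):
// [ { my_record_id: "...", hash_data: "..." }, ... ]
var existingHashes = vars.targetHashes map ((row) -> row.hash_data)
// Hash a source record in exactly the same way as the transform above
fun sourceHash(record) =
    Crypto::MD5((record.dataArray filter ($ != null) joinBy "") as Binary)
---
// Keep only the records whose hash is not already present in the target system
payload filter ((record) -> not (existingHashes contains sourceHash(record)))

Anything that survives the filter is a new or changed record and can be passed on to the target systems; everything else is an unnecessary update we no longer have to write.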

Was this an ideal solution? No; the target data set is still larger than I would like. Would it still work if the data grew to hundreds of thousands of records? Probably not.

But it achieved a solution that worked for the client’s use case and solved a problem that would have been extremely difficult without the ability to leverage Mule’s power.

When it comes to integration in the real world, we often have to play the hand we are dealt rather than one we would like — and a Mule Whisperer needs to know how to play a bad hand well.

 
