No matter how hard I try, I can never take my focus off some of the non-functional requirements (NFRs) while implementing MDM solutions. At the top of that list of NFRs is the performance of the MDM system.

I often get looked down upon and met with sarcastic smiles whenever I try to bring up performance-related topics while designing a solution. I have also seen the same mocking faces turn grim when the master data hub fails to deliver acceptable speed. The truth is that performance engineering is one of the big things to consider during any software design, and MDM is no exception. In fact, performance and scalability are two very important aspects of MDM deliverables.

So, now that we know how much emphasis performance deserves, let’s see what stops the system from delivering optimal performance.

The first important challenge is the quantity of data an MDM system needs to manage. Depending on the domain being addressed and the size of the organization, the data volume varies anywhere from a few million to a few hundred million records. Because the data is sourced from distributed applications, the volume continues to grow over time, which hampers the performance of the system.

The next challenge is the overhead involved in cleaning, grooming, validating and standardizing the data coming into MDM. Whenever a persistent transaction is carried out, MDM needs to compare, match and merge the incoming data. This ensures that master data from different sources is consolidated effectively, so that a single, true version of the master data is delivered.
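To make that overhead concrete, here is a minimal sketch of the standardize, match and merge steps in Python. Everything in it is an assumption made for illustration: the `CustomerRecord` fields, the 0.85 similarity threshold, the "keep existing values, fill gaps" merge rule, and the in-memory list standing in for the hub. Real MDM products use far richer matching rules, but the shape of the work is similar.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class CustomerRecord:
    # Hypothetical master-data fields, chosen only for illustration.
    source: str
    name: str
    email: Optional[str] = None
    phone: Optional[str] = None


def standardize(record: CustomerRecord) -> CustomerRecord:
    """Normalize whitespace, case and phone formatting before matching."""
    return CustomerRecord(
        source=record.source,
        name=" ".join(record.name.split()).title(),
        email=record.email.strip().lower() if record.email else None,
        phone="".join(ch for ch in record.phone if ch.isdigit()) if record.phone else None,
    )


def is_match(a: CustomerRecord, b: CustomerRecord, threshold: float = 0.85) -> bool:
    """Assumed rule: identical email, or sufficiently similar names."""
    if a.email and a.email == b.email:
        return True
    similarity = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    return similarity >= threshold


def merge(golden: CustomerRecord, incoming: CustomerRecord) -> CustomerRecord:
    """Assumed survivorship rule: keep existing values, fill gaps from incoming."""
    return CustomerRecord(
        source=golden.source,
        name=golden.name,
        email=golden.email or incoming.email,
        phone=golden.phone or incoming.phone,
    )


def upsert(hub: List[CustomerRecord], incoming: CustomerRecord) -> List[CustomerRecord]:
    """Standardize the incoming record, then merge it or add it as new."""
    candidate = standardize(incoming)
    # Linear scan: every write compares against every record already in the hub.
    for i, existing in enumerate(hub):
        if is_match(existing, candidate):
            hub[i] = merge(existing, candidate)
            return hub
    hub.append(candidate)
    return hub


hub: List[CustomerRecord] = []
upsert(hub, CustomerRecord("crm", "  jane   DOE ", email="Jane.Doe@Example.com "))
upsert(hub, CustomerRecord("billing", "Jane Doe", phone="+1 (555) 010-0000"))
upsert(hub, CustomerRecord("web", "John Smith"))
```

Even in this toy version, each write triggers a scan of the hub, which is why matching cost grows along with the data volume.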

The third major factor affecting performance is the heterogeneity of client infrastructure and the variety of hardware configurations. Believe me, there are as many variations as the number of projects you have worked on. Simply put, each customer is different. There are different database management systems, different middleware, different partitions, and hundreds of varying parameters.

Setting up an MDM product and tuning it to achieve optimal performance in these diverse environments remains the biggest challenge today. The technical teams need to be more competent than ever before.

Some of the other aspects that impact performance are poor data model design, the architecture team’s inability to look at the big picture, and the absence of best practices.

In order to provide reliable, secure and scalable MDM solutions that meet customers’ evolving needs for master data across the enterprise, a lot of attention should be given to performance. The growth of data over time deserves special attention while designing the data model. The data validation and matching algorithms need to be efficient. The current and future state of the data needs to be thought through in advance. Engaging a performance evangelist at the beginning of the initiative and designing the system with performance in mind will eliminate a lot of rework and re-engineering in the later stages of the project.
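As one illustration of what "efficient matching" can mean, the sketch below compares naive all-pairs matching with blocking, a common record-linkage technique in which only records sharing a cheap key are compared. Blocking is my example here, not something any particular MDM product is claimed to do, and the `postcode` key and sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Tuple

Record = Dict[str, str]


def naive_pair_count(n: int) -> int:
    """Number of comparisons when every record is compared with every other."""
    return n * (n - 1) // 2


def blocked_pairs(
    records: Iterable[Record],
    block_key: Callable[[Record], Hashable],
) -> Iterator[Tuple[Record, Record]]:
    """Yield only the pairs of records that fall into the same block."""
    blocks: Dict[Hashable, List[Record]] = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    for members in blocks.values():
        yield from combinations(members, 2)


# Hypothetical sample data: four records spread over two postcodes.
records = [
    {"name": "Jane Doe", "postcode": "10001"},
    {"name": "Jane  Doe", "postcode": "10001"},
    {"name": "John Smith", "postcode": "94105"},
    {"name": "Jon Smith", "postcode": "94105"},
]

naive = naive_pair_count(len(records))
blocked = list(blocked_pairs(records, lambda r: r["postcode"]))
```

With all-pairs matching, the comparison count grows roughly with the square of the record count, so one million records already imply about 500 billion comparisons. The trade-off with blocking is that records in different blocks are never compared, so a badly chosen key will miss true duplicates.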

I hope you liked this post. Please share your thoughts in the comments. I would like to hear about your experience implementing MDM with an eye toward performance.