What is Master Data?
Master data provides context for the operational and analytical aspects of an organization’s data. It is non-transactional in nature and makes it possible to maintain a single (or 360) version of the truth. Proper management of master data improves data quality and can help eliminate the need to maintain the same data across multiple sources.
So why do organizations wait to master data? Often, it is easier to accrue technical debt than tackle this area, especially because it can feel like a daunting task and companies don’t know where to start.
The best place to start is to build a business case around MDM, because it is an investment in the enterprise. In the short term its implementation will increase costs, but in the long run it will reduce costs as well as lessen risks and errors. Implementing mastered data across multiple, disparate software applications is not easy; it takes time and patience to do well.
Additionally, an organization might not realize that it has a master data issue. It may have a single ERP, or it may have cleaned the input data prior to an ERP migration. But this kind of effort is typically focused at the department level, not the enterprise level. It’s great that the data was cleaned prior to ERP implementation, but master data is not a one-and-done activity. It needs to become part of the overall governance strategy and be consistently managed and monitored.
How is data mastered? It is relatively easy for IT to buy and install a best-of-breed MDM tool, but this doesn’t guarantee success. MDM is more about governance than a specific piece of software. It is critical to have policies and processes in place to help ensure uniformity, accuracy, consistency, and completeness of the master data. Security must also be considered, such as the handling of potentially sensitive PII/PHI data. Tagging may also be necessary for privacy and regulatory compliance (e.g., CCPA, GDPR). Thus, MDM becomes a component of the overall data management or governance practice.
Ensuring data quality (DQ) is an important aspect of an MDM initiative. As an example, a previous client had a requirement to master customer data. Data came from retail kiosks and from customers, and it passed through address standardization plus custom rules before being routed to the MDM. Since the “state” field was not a drop-down selection in the kiosk, we found 50+ permutations of spelling for a single state. It is crucial that users be able to trust the data. Once trust is lost, it is difficult to regain. Thus DQ and standardization are prerequisites for an MDM implementation.
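As a rough illustration of the kind of standardization rule described above, the sketch below collapses free-text spellings of a state into one canonical code. The alias table and the function name normalize_state are hypothetical, made up for this example; a real project would more likely rely on an address standardization service plus a reviewed reference list.

```python
import re
from typing import Optional

# Hypothetical alias table: cleaned spelling -> canonical two-letter code.
# In practice this would be far larger and maintained by data stewards.
STATE_ALIASES = {
    "ny": "NY",
    "newyork": "NY",
    "nwyork": "NY",
    "nj": "NJ",
    "newjersey": "NJ",
}


def normalize_state(raw: str) -> Optional[str]:
    """Return the canonical code for a free-text state, or None if unknown."""
    # Lowercase and drop everything except letters, so "N.Y." and " New  York "
    # reduce to the same lookup key as "ny" and "newyork".
    key = re.sub(r"[^a-z]", "", raw.lower())
    return STATE_ALIASES.get(key)
```

Values that come back as None are better routed to a steward for review than guessed at, since a wrong automatic correction is what erodes trust in the data.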
Another important aspect of an MDM initiative is handling match and merge of duplicate data. This prevents having multiple versions of an entity within the domain. For example, many of us have experienced receiving duplicate US Mail offers. The problem can be as simple as slight variations on a name recorded in the system as separate entries.
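A minimal sketch of match and merge follows, assuming records are plain dictionaries with name, zip, and email fields. The 0.85 similarity threshold, the "same zip plus similar name" match rule, and the "longest value wins" survivorship rule are all illustrative assumptions, not the behavior of any particular MDM product.

```python
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def is_match(a, b, threshold=0.85):
    """Assumed rule: same zip code and sufficiently similar names."""
    if a["zip"] != b["zip"]:
        return False
    ratio = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return ratio >= threshold


def merge(cluster):
    """Assumed survivorship rule: keep the longest value seen for each field."""
    golden = dict(cluster[0])
    for rec in cluster[1:]:
        for field, value in rec.items():
            if len(str(value)) > len(str(golden.get(field, ""))):
                golden[field] = value
    return golden


def match_and_merge(records, threshold=0.85):
    """Greedily cluster records against each cluster's first member, then merge."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if is_match(cluster[0], rec, threshold):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return [merge(cluster) for cluster in clusters]
```

Greedy clustering like this is quadratic in the worst case and sensitive to record order; real implementations typically add blocking keys and steward review for borderline scores.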
There are many tools that can be used for mastering data. However, one tool that is not recommended is the spreadsheet. It never ceases to surprise us how many established companies master their data via spreadsheets. Anyone who has been around data for very long has probably run across this as well.
Master Data Implementation Styles
Gartner recently described four standard MDM implementation types or styles: Consolidation, Registry, Coexistence, and Centralized.
Consolidation is probably the easiest. It is analytics or data warehouse focused and does not attempt to clean up data in the upstream systems. It is often used for customer 360 analytics. I recently had a marketing client who wanted to expand the information available for model scoring using 3rd party data, clickstream data, DoubleClick ads, Salesforce data, and other sources. Since each of those upstream systems was a distinct 3rd party application with its own customer identifiers, and the business case was analytics, a consolidation of customer 360 within the DW was the obvious solution.
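One way to picture consolidation in the warehouse is a crosswalk table that maps each upstream identifier to a single master ID while leaving the source systems untouched. The sketch below assumes, purely for illustration, that a cleaned email address is a good enough match key; the function name build_crosswalk and the tuple layout are hypothetical.

```python
from itertools import count


def build_crosswalk(source_rows):
    """Map (source_system, source_id, email) rows to shared master IDs.

    Returns a list of (master_id, source_system, source_id) tuples.
    Assumption for this sketch: rows with the same cleaned email
    refer to the same customer.
    """
    master_ids = {}      # cleaned email -> master ID
    next_id = count(1)   # simple surrogate key generator
    crosswalk = []
    for system, source_id, email in source_rows:
        key = email.strip().lower()
        if key not in master_ids:
            master_ids[key] = next(next_id)
        crosswalk.append((master_ids[key], system, source_id))
    return crosswalk
```

Analytics queries can then join each source's facts to the crosswalk and group by master ID, which gives the customer 360 view without writing anything back upstream.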
The second style is Registry. The initial lookup of a record is against a central repository but data is pushed and further augmented in the local systems (think pub/sub model). Governance is more challenging as it is distributed. But this may mitigate major changes to the applications consuming the data as they can still query data locally. Data is not written back to the MDM hub directly from the applications.
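The flow described above can be sketched with a tiny in-memory publish/subscribe model. The class names Registry and LocalSystem and their methods are hypothetical and stand in for whatever hub and messaging technology is actually used; the point is only that core data flows outward, local systems augment their own copies, and nothing is written back to the hub.

```python
class Registry:
    """Central repository that publishes core master records to subscribers."""

    def __init__(self):
        self._records = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, master_id, core):
        # Store the core record, then push a copy to every subscriber.
        self._records[master_id] = dict(core)
        for callback in self._subscribers:
            callback(master_id, dict(core))

    def lookup(self, master_id):
        return dict(self._records[master_id])


class LocalSystem:
    """Application that keeps a local copy and augments it locally."""

    def __init__(self, registry):
        self.store = {}
        registry.subscribe(self._on_publish)

    def _on_publish(self, master_id, core):
        # Core attributes overwrite local values; local extras are kept.
        self.store.setdefault(master_id, {}).update(core)

    def augment(self, master_id, **extra):
        # Local-only attributes; these are never sent back to the registry.
        self.store[master_id].update(extra)
```

Because each local system holds attributes the hub never sees, this sketch also hints at why distributed governance is harder in this style.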
Coexistence is a distributed model with master data centralized plus mirrored to applications. Data is mastered and stored redundantly across the applications. Security is challenging, as any change needs to be applied at both the central and the distributed layers.
Centralized provides the best control and fits the common services paradigm. Data is mastered centrally and applications read/write directly to and from the master hub. Very recently, we had a client begin an initiative to re-engineer their applications with direct integration to a central customer hub using standard API gets/puts. It is an excellent style for centralized governance but is probably the most time-consuming to fully implement, depending on the number of applications that need to be refactored.
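To make the get/put idea concrete, here is a minimal in-memory stand-in for a central customer hub. The CustomerHub class and its method signatures are hypothetical; a real hub would sit behind an authenticated service API with validation, auditing, and persistent storage.

```python
class CustomerHub:
    """Hypothetical central hub: every application reads and writes here."""

    def __init__(self):
        self._records = {}
        self._versions = {}

    def put(self, customer_id, attributes):
        """Merge attributes into the master record and return its new version."""
        current = self._records.get(customer_id, {})
        self._records[customer_id] = {**current, **attributes}
        self._versions[customer_id] = self._versions.get(customer_id, 0) + 1
        return self._versions[customer_id]

    def get(self, customer_id):
        """Return a copy of the master record, or None if it does not exist."""
        record = self._records.get(customer_id)
        return dict(record) if record is not None else None
```

Two applications sharing one hub instance see each other's writes immediately, which is the governance benefit of this style; the cost is that every application has to be refactored to call the hub instead of its own tables.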
We’ve also seen clients implement a hybrid solution, blending two or more of these styles or implementing one style for the product domain and another for the customer domain. It could be a mix of on-prem and cloud, or off-the-shelf (OTS) software blended with some custom development.
It is important to keep in mind that MDM is NOT one-size-fits-all. Capturing requirements before embarking on the solution is critical, and Caserta can help guide you through this process as well as the implementation of the MDM hubs.