John F. Kennedy

Ask Not What Your Data Can Do For You

While attending and speaking at the TDWI Strategy Summit recently, I got to chatting over lunch with a couple of data and analytics luminaries I’ve known for years: Professor Hugh Ryan (University of Georgia, Terry College of Business), and Marie Clark (GE Entrepreneur in Residence, and founder of Ambient Intelligence). We were noting how quick many organizations are to hand the keys to BI tools over to business professionals who do not yet have a sufficient appreciation of the data itself. And in a moment of clumsily adapted political nostalgia, I blurted “Ask not what your data can do for you, ask what you can do for your data.” We had a quick laugh, then got down to outlining what this means.

Our contention is that business professionals can do themselves and their organization’s data a great service by better curating, understanding, managing, and preparing their data before attempting any kind of analysis. It’s often been said that you can’t manage what you don’t measure. And it follows that you can’t monetize well what you’re not managing well. Nowhere is this more evident than in today’s data landscape.

Curating Data

Business leaders are desperate to generate more economic value from the information assets available to them. This means that they shouldn’t focus only on their own data, but also on data available externally. So first, do your data a favor by identifying external/alternative data assets that may enhance or improve upon the value of those you already have. This may be in the form of enhanced customer data from any of a variety of data brokers, competitor pricing data harvested from competitors’ websites, social media insights to augment your own customer support/feedback data, or global economic indicators to enhance your own forecast data. Unfortunately, in a room of about 100 data and analytics leaders I presented to at the event, not one person acknowledged that their organization has anyone dedicated to identifying and arranging access to external data sources. Yet each of their organizations has an entire department dedicated to procuring office furniture or other material assets.

Understanding Your Data

Also, do your data a favor by getting to know it better. This is a key first component of what’s called “data literacy” or “data fluency”. Understand its origin or provenance, its age, its lineage, its limits, its regulatory usage restrictions, its biases, how it may be the combination of other data sources, and its real (not assumed) business meaning. Does your data represent a complete set of something such as sales transactions, or only a subset such as sales of items that were not returned? In fact, most organizations have a dozen or more definitions of what constitutes a “customer” (e.g. individual, household, current/former, shipping, billing, etc.), and at least as many data sets for each version. Learn how others have used similar data, both inside and outside your organization. Moreover, understand how any given data set relates to others. Reading a conceptual or logical data model isn’t really that difficult, presuming one exists. And data profiling tools also exist to help you understand the ranges, averages, outliers, and other measures of the data itself.
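
To give a feel for what data profiling reports, here is a minimal sketch in Python using only the standard library. The `profile` function, the `order_totals` sample, and the 1.5 x interquartile-range outlier rule are illustrative assumptions on my part, not a description of any particular profiling tool.

```python
import statistics

def profile(values):
    """Return basic profile measures for a list of numbers (None = missing)."""
    present = [v for v in values if v is not None]
    # Quartiles of the non-missing values (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Assumed rule of thumb: flag values beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical order totals with one missing value and one suspicious entry.
order_totals = [20, 22, 19, 21, None, 23, 20, 500]
summary = profile(order_totals)
print(summary)
```

Even a profile this small raises the right questions: why is one value missing, and is the 500 a genuine large order or a data entry error?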

Managing Your Data

Be circumspect about moving or copying data. Every time data is extracted or copied or moved from one place to another, your organization incurs certain risks and expenses. You incur the risk that the data becomes out of sync with the source, thereby rendering it inaccurate or incomplete. You increase your attack surface for data breaches. And you incur additional expenses to sync, store, and manage yet another data set. So do your data another favor, and leave it where it lives…either in an accessible operational database, or more likely a data warehouse or data lake. Technologies abound for creating virtual databases that are easier to understand, manipulate, and analyze than an enterprise data warehouse is to navigate. Work with your IT or data organization to set these up. And resist creating dangerous Excel extracts of valuable corporate data assets, regardless of how expedient it may be to do so.
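
The difference between a copied extract and a virtual layer over the source can be sketched with Python’s built-in `sqlite3` module. This is only an illustration under assumed names (the `sales` table, `sales_extract`, and `sales_by_region` are hypothetical), and a plain SQL view stands in here for the richer data virtualization products your IT organization may offer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 50.0)],
)

summary_sql = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# An extract is a physical copy, frozen at the moment it was created.
conn.execute(f"CREATE TABLE sales_extract AS {summary_sql}")
# A view stores only the query, so it is recomputed from the source each time.
conn.execute(f"CREATE VIEW sales_by_region AS {summary_sql}")

# The source changes after the extract was taken.
conn.execute("INSERT INTO sales VALUES (3, 'East', 25.0)")

extract_total = conn.execute(
    "SELECT total FROM sales_extract WHERE region = 'East'"
).fetchone()[0]
view_total = conn.execute(
    "SELECT total FROM sales_by_region WHERE region = 'East'"
).fetchone()[0]

print("extract:", extract_total)  # stale: still reflects the old data
print("view:   ", view_total)     # current: includes the new order
```

The extract silently drifts away from the source, while the view stays current without anyone having to store or sync a second data set.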

Preparing Your Data

Your data may not be ready to serve you in its current form. Don’t expect that raw data, or even data that exists in the data warehouse, will serve your analytic needs as-is. So do your data a favor by integrating it with other data, subselecting it, filtering it, tagging it, cleansing it, or otherwise transforming it for simpler or improved analysis. Ideally you want to do this within the existing data warehouse environment if possible. Data leaders such as CDOs should ensure that these kinds of basic data prep services or technologies are available to business professionals, and that sufficient training on them exists.
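
As a rough sketch of what cleansing, filtering, and tagging look like in practice, consider the following Python example. The field names, the `prepare` function, and the 100-unit threshold for a “large” order are all hypothetical, and in a real environment the same logic would more likely be expressed in the data warehouse’s own prep tooling than in a standalone script.

```python
def prepare(rows):
    """Cleanse, filter, and tag raw sales rows (hypothetical field names)."""
    prepared = []
    seen_orders = set()
    for row in rows:
        # Cleanse: skip duplicate order records.
        if row["order_id"] in seen_orders:
            continue
        seen_orders.add(row["order_id"])
        # Filter: this analysis only wants orders that were not returned.
        if row["returned"]:
            continue
        prepared.append({
            "order_id": row["order_id"],
            # Cleanse: normalize whitespace and capitalization in names.
            "customer": row["customer"].strip().title(),
            "amount": row["amount"],
            # Tag: label each order by size (assumed threshold of 100).
            "size": "large" if row["amount"] >= 100 else "small",
        })
    return prepared

raw_rows = [
    {"order_id": 1, "customer": "  acme corp ", "amount": 250.0, "returned": False},
    {"order_id": 1, "customer": "Acme Corp", "amount": 250.0, "returned": False},
    {"order_id": 2, "customer": "globex", "amount": 40.0, "returned": True},
    {"order_id": 3, "customer": "initech ", "amount": 75.5, "returned": False},
]

clean_rows = prepare(raw_rows)
for row in clean_rows:
    print(row)
```

Note that the filter step bakes in a business decision (excluding returns), which is exactly the kind of choice that should be documented so the next analyst knows the set is a subset.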

Analyzing Your Data

Additionally, do your data a favor by learning a bit more than basic analytic functions and operations. Also, learn how to develop and test hypotheses, and which types of visual representations are appropriate for enabling different types of insights. If you only know how to create pie charts and bar charts, or a basic linear regression, then you’re doing your data and your organization a disservice. The better prepared you are to perform diagnostic, predictive, or prescriptive analyses upon the data, the more your data will do for you and your organization.

So before you jump headlong into any kind of analytics, ask what you can do for your data. And if you are a data & analytics leader such as a CDO, consider establishing an analytics center of excellence (ACE) and/or a formal data literacy and certification program to prepare your business professionals to do more with their data.