Higher Ed Gets Serious About Data Warehousing

Higher Education and Big Data

For more than a decade, a well-designed data warehouse has been an essential component in the IT arsenal of nearly every successful large corporation.  Providing managers with rapid access to critical business intelligence is the only way for large corporations to stay relevant in a competitive world.

While it would be romantic to think of colleges and universities as exalted institutions that are immune to the chaos and pressure of market competition, the reality is that they fight for resources just as any organization does.  Some universities struggle to attract students.  Others battle to meet diversity goals.  Many have trouble with faculty retention.  Most don’t have good visibility into their grant and fundraising programs.  All know that having good data is necessary to address these problems.

Fortunately, many institutions of higher learning are beginning to understand the role that the data warehouse plays in making sure that good data is available to the people who need it when they need it.  Many major universities, both in the US and around the world, are beginning to take their data warehouse initiatives more seriously and to invest adequate resources for them to thrive.

It’s not enough for a university to have a data warehouse. 

The data warehouse has to be designed in such a way that it can gather data from disparate systems across the university and serve it up in a form that’s intuitive and easy to digest for non-technical users.

A simple question that a leader at a large university might ask is, “Which of my departments and which faculty members within those departments are best at generating research grants?”  The leader knows that all of this data is collected by the university and resides in its systems.  Surely it can’t be difficult to get a report.

Providing a useful answer to this question is much harder than it seems, however.  The problem is that the data is almost always scattered across multiple systems: some lives in the finance system, some is in the HR system, some comes from the ERP system, and some of it is in that “other system,” you know, the one with the funny acronym that’s only used by one department.  Some isn’t in any system at all; it’s simply captured in Excel spreadsheets or an Access database maintained by the admin of one of the department heads.  Worse still, none of these systems talks to the others, and each has its own peculiar way of keeping track of things.  One system refers to SMITH_JOHN, ID 45682.  Another refers to the same person as jsmith, ID 2130.

So we need a solution that can pull all of this scattered data together.

The solution is to bring all of this data together and let these disparate elements “shake hands” with one another.  This can be done with a university-wide data warehouse, or what we term a UDW.  Just as enterprises have built out enterprise-wide data warehouses or EDWs to get holistic views of their operations, universities need to think in terms of UDWs.

To do this, data needs to be extracted from the various source systems, including the spreadsheets, then cleaned to remove garbage and fix mistakes, conformed (so that we recognize SMITH_JOHN and jsmith as the same person), and finally delivered into the UDW in a form that’s easy to query.  Once this heavy lifting is done, it becomes a simple matter to use a business intelligence tool to answer the question about research grants.  In fact, with a UDW in place, that question and a whole host of related questions can be answered as a series of easy-to-read, pretty-to-look-at, drillable dashboards.
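
To make the conform step concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the sample rows, the field names, the crosswalk table, and the person keys (P001, P002) are hypothetical, and a real UDW would do this work in an ETL tool or database rather than in in-memory lists.  The idea is simply that each source-system ID is mapped to one shared person key, after which grant totals can be rolled up by department.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems (all values are illustrative).
finance_grants = [
    {"pi": "SMITH_JOHN", "pi_id": 45682, "amount": 250_000},
    {"pi": "DOE_JANE", "pi_id": 51877, "amount": 400_000},
    {"pi": "SMITH_JOHN", "pi_id": 45682, "amount": 100_000},
]
hr_faculty = [
    {"username": "jsmith", "hr_id": 2130, "department": "Chemistry"},
    {"username": "jdoe", "hr_id": 2207, "department": "Physics"},
]

# Hypothetical crosswalk: maps (source system, local ID) to one conformed key.
crosswalk = {
    ("finance", 45682): "P001",
    ("hr", 2130): "P001",
    ("finance", 51877): "P002",
    ("hr", 2207): "P002",
}


def conform(rows, system, id_field):
    """Return copies of rows tagged with the conformed person key."""
    conformed = []
    for row in rows:
        key = crosswalk.get((system, row[id_field]))
        if key is None:
            continue  # a real pipeline would send unmatched rows for review
        conformed.append({**row, "person_key": key})
    return conformed


def grants_by_department(grants, faculty):
    """Sum grant amounts per department using the shared person key."""
    dept_of = {f["person_key"]: f["department"] for f in faculty}
    totals = defaultdict(int)
    for grant in grants:
        totals[dept_of.get(grant["person_key"], "Unknown")] += grant["amount"]
    return dict(totals)


grants = conform(finance_grants, "finance", "pi_id")
faculty = conform(hr_faculty, "hr", "hr_id")
totals = grants_by_department(grants, faculty)
print(totals)
```

Without the crosswalk, SMITH_JOHN (ID 45682) and jsmith (ID 2130) would never line up, and the Chemistry grants in this sample would have no department to roll up to.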

Conceptually, it’s easy to build a data warehouse, but the devil is in the details.  If you don’t have a top-notch design and if you don’t execute the data conforming steps correctly, you’ll wind up with results that at best aren’t useful, and at worst could even be misleading.

A UDW should be designed and built in phases.  The first phase should answer a particular set of critical questions, such as the one posed above about grants, or more generally, university fundraising efforts.  If the warehouse is designed properly, it will then be a simple matter to add functionality to answer other important questions, such as ones about student and faculty performance or the admissions process.  This incremental method is known as Agile Data Warehouse Design, and it can provide a quick win for all stakeholders.  As each phase is completed, the team has a tangible success it can point to, which generates the momentum needed to continue building out the UDW.

Much has been said recently about the coming change in higher education.  Many believe that new online universities will fundamentally reshape the way that education is delivered.  Others are more circumspect, believing that today’s universities will remain largely intact as long as they incorporate online courseware appropriately.  Whatever your view on this weighty question, one thing seems sure: having a good data strategy built atop a solid data warehouse foundation will be key to survival.