The MVDP framework enables data science projects to fail fast and recover fast.
The initial stage of developing a product is determining whether the idea is viable, feasible and serves a need. A critical requirement is to do this with minimal time and resources while maximizing what we learn. This approach is termed the build-measure-learn feedback loop, and it is one of the central principles of Lean Startup (Eric Ries). It provides an iterative way to develop an MVP, or “minimum viable product”, and it can be a powerful tool to apply to “data products” as well.
The first step in creating an MVP is to determine what problem needs to be solved (creating the hypothesis). Then we define metrics and measures to determine the degree of success and the need for improvements or modifications (testing the hypothesis). We then build the product and test our hypothesis on it. With every iteration we can increase the size of the audience.
This approach lets us develop products in a lean manner, in small iterations, making sure the product matches the users’ needs. Another part of this process is to fail fast and recover fast: we use approaches that help us determine very quickly whether our product isn’t viable, and we quickly gather insights from the failure to make the next iteration better.
Data products use data in innovative ways, through machine learning and statistical analysis, to provide unique insights or meaning. This may involve anomaly detection pipelines, recommendation engines, community detection, causal analysis or some combination of several such techniques. Examples of data products include retail recommenders as with Amazon, text classification as with Gmail, and content recommendation as with Netflix or Spotify.
The MVDP or Minimum Viable Data Product
Like the standard MVP, a successful MVDP starts out with small steps. For example, let’s say we have an online music library. We already have a few million subscribers and several million tracks. We also have user data, track data and track-play data containing time, track, artist, album and user.
We determine the need for a recommender that helps users discover new artists that would appeal to them. We start with a first iteration that is targeted only at the data science / product team. The measure is whether a recommendation engine can increase the number of plays on new artists without decreasing the users’ time on the app. In other words, we want to shift users toward new artists without reducing the time they spend on the app (1. build).
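The success measure described above can be sketched in a few lines. This is only an illustration: the function names, the tuple layout of the play log and the summary keys (`new_artist_share`, `avg_minutes_on_app`) are hypothetical and not part of any existing system.

```python
def new_artist_play_share(plays, known_artists):
    """Share of plays that land on artists the user had not played before.

    plays: list of (user, artist) tuples from the measurement window.
    known_artists: dict mapping user -> set of artists played before the window.
    Both structures are assumptions made for this sketch.
    """
    if not plays:
        return 0.0
    new_plays = sum(
        1 for user, artist in plays
        if artist not in known_artists.get(user, set())
    )
    return new_plays / len(plays)


def meets_success_criteria(baseline, candidate, max_time_drop=0.0):
    """True if discovery went up while time on the app did not go down.

    baseline / candidate: dicts with the hypothetical keys
    'new_artist_share' and 'avg_minutes_on_app'.
    max_time_drop: tolerated decrease in average minutes (assumed to be 0).
    """
    more_discovery = candidate["new_artist_share"] > baseline["new_artist_share"]
    time_held = (
        candidate["avg_minutes_on_app"]
        >= baseline["avg_minutes_on_app"] - max_time_drop
    )
    return more_discovery and time_held
```

Treating time on the app as a guardrail rather than a target keeps the hypothesis narrow: the iteration counts as a success only when discovery improves and the guardrail holds.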
Next we develop a recommendation engine with the existing data and use offline methods to measure the improvement (2. measure). We use the results to determine what we could improve and use those insights to improve our initial product (3. learn).
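One possible shape for such an offline measurement is a hold-out evaluation: hide each user's most recent plays, generate recommendations from the rest, and check how often a hidden artist shows up. The sketch below uses a deliberately naive popularity baseline as a stand-in for the recommender; the toy data, the function names and the choice of hit rate as the metric are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy play log: (user, artist) pairs.
TRAIN_PLAYS = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "A"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"), ("u3", "C"),
    ("u4", "C"), ("u4", "D"),
]
# Held-out plays: the new artist each user actually went on to play.
TEST_PLAYS = [("u1", "C"), ("u2", "D"), ("u3", "A"), ("u4", "B")]


def recommend_new_artists(train_plays, user, k=2):
    """Naive baseline: the k most-played artists the user has not played yet."""
    popularity = Counter(artist for _, artist in train_plays)
    seen = {artist for u, artist in train_plays if u == user}
    ranked = [artist for artist, _ in popularity.most_common() if artist not in seen]
    return ranked[:k]


def hit_rate(train_plays, test_plays, k=2):
    """Fraction of test users with at least one held-out artist in their top-k."""
    held_out = defaultdict(set)
    for user, artist in test_plays:
        held_out[user].add(artist)
    if not held_out:
        return 0.0
    hits = sum(
        1 for user, artists in held_out.items()
        if artists & set(recommend_new_artists(train_plays, user, k))
    )
    return hits / len(held_out)
```

Calling `hit_rate(TRAIN_PLAYS, TEST_PLAYS, k=1)` and then again with `k=2` shows how the score changes with the length of the recommendation list. A real engine would replace `recommend_new_artists`, while the evaluation harness could stay the same between iterations.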
Once this product performs satisfactorily we can expand it to the next step, which could be rolling it out as a part of the app to selected users or internal users (build). We then measure user engagement through A/B testing, alongside accuracy measures such as RMSE and other approaches (measure). We then analyze the data to derive actionable insights (learn), make improvements to our recommender, and iterate until we get satisfactory results.
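As a rough sketch of the measure step, the snippet below compares an engagement rate between a control group and a group that sees the recommender using a two-proportion z-test, and computes RMSE for predicted versus observed values. The choice of test, the function names and the example counts are assumptions for illustration, not results from a real experiment.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in rates between groups A and B.

    success_*: number of users who, for example, played a new artist.
    n_*: number of users in each group.
    Assumes both groups are non-empty and the pooled rate is not 0 or 1.
    Returns (z, p_value); a positive z means B has the higher rate.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical counts: 100 of 1000 control users vs. 130 of 1000 treated users.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
```

The z-test addresses the engagement question (did behavior change?), while RMSE addresses prediction quality. The two can move independently, which is one reason to track both.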
The same process can be repeated until we can take the app to a public beta and then a public release. At this point the recommender is still in a very basic state. We are just verifying the customer response and gaining an understanding of how the market reacts.
Once we gather insights we can decide on improvements to the app and repeat a similar process.
In this process, the first iteration of the app that we could release to users is our MVDP.
There may also be several steps in the process where we realize our approach or product isn’t viable and we potentially have to restart. This saves us from spending large amounts of time and resources before realizing we have failed.
The MVDP approach helps us put a product in the hands of users and become the leader in the space while gaining insight to improve the product. The biggest hurdle this poses to competitors is the insights we have already gained and the customer base we have retained.