Building a Data Lake for Digital Dominance at Strata Data Conference

How a global entertainment company successfully built a data lake for continued digital dominance.

Joe Caserta, CEO at Caserta, and Elliott Cordo, Chief Architect at Caserta, deliver a customer use-case keynote.

Understanding ongoing digital innovation is essential to this music giant’s continued success.

With a broad roster of new stars and legendary artists, this global record company has long been considered a technology innovator and progressive force in the music business. Ahead of the industry with its development of a dedicated digital strategy team, the company is recognized for adopting and harnessing digital technology for music creation and distribution.

The company needed to replace legacy applications with a fully integrated framework based on open source big data technologies. That meant seamlessly integrating data from more than 15 record labels and a global publishing catalog containing more than 100 million copyrights held worldwide.

To make the transition and embrace the benefits of a rapidly evolving technology ecosystem, the company and Caserta worked together to create a comprehensive roadmap and implement a new data platform in the cloud.

In this video, you’ll learn about the strategy, process, challenges, and solution:
• The strategy, built with a laser focus on capacity models to ensure system scalability
• The process for assessing the current landscape, and strategic recommendations for re-architecting critical components of the platform to gain stability, performance, and resiliency
• The core components required to build a production-worthy data lake
• The framework for integrating data feeds from real-time and streaming sources such as Pandora, Spotify, and iTunes, each in different formats and time sequences
• The task of systematically onboarding more than 140 unique data feeds
• How data ingestion bottlenecks were resolved with a new, open, and scalable framework that seamlessly accommodates existing and new data sources
• A complete overview of the data ecosystem's core components and moving parts

Joe and Elliott also cover emerging alternatives such as Spark, and how, where, when, and why particular technologies are relevant in the new data lake architecture.