Firms are increasingly using alternative data like satellite imagery to inform their investing decisions.
The financial services industry has always used data to inform its investing decisions. After all, data-backed decision making mitigates risk and gives clients confidence in their investors. Until recently, however, that data came from “traditional” sources like press releases, SEC filings, earnings reports, and credit scores.
Firms can now ingest, process, and analyze large amounts of structured and unstructured data from new, previously untapped sources. They can use these alternative data insights to inform their investing decisions, make more accurate investments, and ultimately generate alpha.
What you’ll learn in this article:
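This article explains what alternative data is, how it is generated, the benefits of using it, how firms have used it, what to look for in a provider, and what it takes to build an alternative data program.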
Alternative data is data that does not come from “traditional” financial data sources and that is used in the investment process to inform investing decisions. It can take the form of anything from scraped web content to social media sentiment analysis to satellite imagery, and it can provide unique and timely insights into investment opportunities that investors can’t get from traditional data sources.
Is alternative data worth the hype?
Short answer: yes. Spending on alternative data is expected to surpass $1 billion by 2020 and with more than 400 alternative data suppliers on the market today, demand and use are only expected to increase.
The number of employees dedicated to alternative data full time has grown by 450% in the past five years. In fact, according to a 2018 study, almost 80% of investors turn to alternative data to inform their investing decisions. Given that investors want access to the most granular insights available to make the best possible investing decisions, alternative data is here to stay.
How is alternative data generated?
Over the past two years alone, we have generated 90% of the data in the world. Humans and businesses are more connected to technology than ever before. By the year 2022, there will be 28.5 billion networked devices and connections and 4.8 billion global internet users according to Cisco.
There is no comprehensive list of what counts as alternative data. Technology today can track social media, media, sentiment, IoT, geolocation, eCommerce buying habits, airline bookings, retail inventory data, mortgage data, entertainment events data, hotel bookings, satellite images, and more.
Alternative data can be divided roughly into three categories: data generated by individuals, data generated through business processes, and data generated by sensors.
Data generated by individuals is traditionally unstructured. Coming primarily from web traffic, app usage, and social media, this data is valuable for detecting sentiment and consumer behavior. With 1.56 billion daily active users on Facebook and 126 million daily active users on Twitter, scraping social media for sentiment and feedback has become commonplace to understand brand performance.
Businesses also generate data in the form of banking records, credit card transaction records, commercial transactions, supply chain data, and government and corporation data. This data is usually structured and can be a good overall indicator of business performance and a good predictor of company sales.
Data generated by sensors is the third major source of alternative data. This data comes from satellite images, weather forecasts, and geolocation data gathered through Wi-Fi signals. Sensor data is usually the largest in volume and is unstructured. It can be used to track foot traffic and gauge the health of stores.
Benefits of using alternative data
You know what it is and how it’s generated, but what are the benefits of using alternative data?
Alternative data can save traditional money managers time by sifting through news and data on their behalf, leading to more accurate and unbiased decision-making. In addition, analysts can get a better signal about what’s going on in the market in real time by using these larger and more dynamic alternative data sets.
Alternative viewpoints and unforeseen insight
Using alternative data gives analysts access to viewpoints they would not normally have when using traditional data sources. With thousands of new data sources that were not available a decade ago, alternative data can generate new investment ideas, lead to unforeseen insight, and even allow investors to predict future market moves.
Transparency into company performance
Traditional data gave portfolio managers only historical insight into company performance. Having to wait for quarterly earnings reports and financial statements meant being reactive with investments and not knowing the truth until after the fact. Integrating alternative data allows portfolio managers and investors to get real-time signals about company performance from sources like live market reports and consumer behavior and sentiment.
Using alternative data gives you a competitive edge over other firms in your industry. Your firm will be able to know everything it can about a possible deal or investment. While almost 80% of firms today use some form of alternative data to inform their investment strategy, there is still time to take advantage of what alternative data has to offer.
Alternative data can be used in several different ways. Below are just a few of the ways alternative data has been used by firms.
A hedge fund wanted to predict which industries would be affected by Brexit. It built bots to scrape Bloomberg, the Financial Times, and other publications, searched the scraped articles for “Brexit” alongside SEC-delineated industries to see what was being covered, and then narrowed its focus to specific industries.
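The co-mention idea behind this use case can be sketched in a few lines of Python. This is only an illustration, not the fund's actual method: the keyword lists, the function name count_topic_industry_mentions, and the sample headlines are all invented for the example, and a real program would map articles to an official industry classification rather than a handful of keywords.

```python
import re
from collections import Counter

# Hypothetical keyword lists (assumption); not an official industry taxonomy.
INDUSTRY_KEYWORDS = {
    "banking": ["bank", "lender"],
    "automotive": ["carmaker", "automotive"],
    "airlines": ["airline", "carrier"],
}


def count_topic_industry_mentions(articles, topic="brexit"):
    """Count, per industry, the articles that mention both the topic and an industry keyword."""
    counts = Counter()
    topic_pattern = re.compile(rf"\b{re.escape(topic)}\b", re.IGNORECASE)
    for text in articles:
        if not topic_pattern.search(text):
            continue  # the article does not mention the topic at all
        lowered = text.lower()
        for industry, keywords in INDUSTRY_KEYWORDS.items():
            # Match each keyword as a whole word, allowing a simple plural "s".
            if any(re.search(rf"\b{re.escape(k)}s?\b", lowered) for k in keywords):
                counts[industry] += 1
    return counts


# Invented sample headlines standing in for scraped article text.
sample_articles = [
    "Brexit uncertainty weighs on UK banks and lenders.",
    "Carmakers warn that Brexit tariffs could disrupt supply chains.",
    "Airline profits rise on strong summer demand.",
    "Brexit: banks move staff to Frankfurt.",
]
mention_counts = count_topic_industry_mentions(sample_articles)
print(mention_counts.most_common())
```

Industries with the highest counts are the ones receiving the most topic-related coverage, which is a starting point for deeper analysis rather than a signal on its own.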
By using publicly available flight information, a company was able to track the flight patterns of corporate jets. It trained a machine learning model on repeated flight patterns for certain companies to predict M&A deals.
For example, executives from Cisco were flying repeatedly to Carlsbad, CA, where Luxtera is headquartered. Cisco later acquired Luxtera. HCA Healthcare executives were visiting Asheville, NC often, and HCA later acquired Mission Health, which is headquartered in Asheville.
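The company in this example trained a machine learning model. As a much simpler stand-in, the sketch below flags company and destination pairs that recur in flight records using plain counting. The record layout, the min_visits threshold, the company names, and the airport codes are all assumptions made for illustration.

```python
from collections import Counter


def repeated_destinations(flights, min_visits=3):
    """Return {(company, destination): visits} for pairs seen at least min_visits times."""
    visits = Counter((f["company"], f["destination"]) for f in flights)
    return {pair: n for pair, n in visits.items() if n >= min_visits}


# Invented sample records with hypothetical company names and placeholder airport codes.
sample_flights = [
    {"company": "AcmeCorp", "destination": "CRQ"},
    {"company": "AcmeCorp", "destination": "CRQ"},
    {"company": "AcmeCorp", "destination": "CRQ"},
    {"company": "AcmeCorp", "destination": "JFK"},
    {"company": "Globex", "destination": "AVL"},
    {"company": "Globex", "destination": "AVL"},
]
flagged = repeated_destinations(sample_flights)
print(flagged)
```

A counting heuristic like this only surfaces candidates. Deciding whether a repeated route actually suggests a deal would need further context, such as which companies are headquartered near the destination.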
Testing new ideas and identifying new opportunities
A leading long/short equity hedge fund created a flexible and scalable alternative data program that continuously and rapidly digests any number, size, or format of data sources, converts unstructured data into machine-readable formats, and incorporates data quality checks.
Analysts at the hedge fund now onboard new data in less than one hour. With access to high-powered and quickly integrated alternative data insights through its custom-made data platform, the hedge fund has completely changed how it processes and investigates data. Its investment strategies are now enhanced with next-gen insights gathered from alternative data, enabling analysts to make better-informed investment decisions.
Alternative data providers
What to look for in an alternative data provider:
When looking for an alternative data provider, it is important to find those that have the data that you need, and enough of it to inform your investing decisions. Other important questions to ask when looking at alternative data vendors:
Can the source be integrated with your existing system?
How much will the data cost?
Will the alternative data deliver a good ROI, or will it just add noise to your analysis?
How long will it take to integrate the data into your system?
Alternative data provider examples:
Today, there are over 400 alternative data providers in the marketplace. This list is not exhaustive, but it highlights some of the top alternative data providers in the market that feature different data sources and purposes.
Quandl: Quandl is a data aggregator that leverages relationships to source alternative data from IoT, consumers, natural resources, logistics, B2B, and more to enhance trading strategies for its users.
YipitData: YipitData is a data aggregator that sources web data, [anonymized] consumer receipts, and survey data weekly and monthly from over 70 companies across multiple industries and locations.
Dataminr: Dataminr scrapes public tweets and turns them into real-time alerts and sentiment analysis to assist its clients across several industries in trading, market awareness, client advisory, and thesis generation.
UBS Evidence Lab: UBS Evidence Lab sells insight-ready datasets of quality and vetted data for other financial services firms to integrate with their own data.
1010data: 1010data aggregates credit card data from third-party providers.
S&P Global Market Intelligence: S&P Global Market Intelligence uses natural language processing to understand sentiment from earnings calls. It has data on 8,300 companies dating back to 2004.
AppAnnie: AppAnnie collects mobile app usage data and trends to enable its users to make more informed decisions on consumer behavior.
Key Requirements of an Alternative Data Platform
In order for an alternative data program to be effective, it needs to do more than just bring in data. Your alternative data platform needs to be able to:
continuously ingest any number of raw data sources of any size, volume, or structure, including structured, semi-structured, and unstructured data; large quantities of small data files; large quantities of large data files, etc.
rapidly onboard new data sources of any kind
enable discovery and hypothesis testing on ingested data in order to create and refine use cases based on a subset of the available data
productionalize the preparation of newly ingested data towards all established use cases and rapidly deploy new use cases
update a visualization layer that enables analysis and real-time tracking of use cases
Challenges in Implementing an Alternative Data Program
Using alternative data to enhance and inform investing decisions might sound great, but establishing and implementing an alternative data program comes with unique challenges. When creating an alternative data program, companies should ensure it can handle each of the following.
Onboarding and enabling new data sources with minimal effort
In order to take advantage of the potential of alternative data, your system needs to be able to onboard and enable new data sources with minimal effort. You’ll want to avoid the need for additional development or code refactoring to accommodate something unforeseen in a new data source that you want to onboard.
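One way to keep onboarding cheap is a small registry that maps file types to reader functions, so supporting a new format means adding one function rather than refactoring the pipeline. The sketch below is a minimal illustration under that assumption. The names READERS, register_reader, and load_records are hypothetical, and the file contents are passed in as strings to keep the example self-contained.

```python
import csv
import io
import json

# Registry of reader functions keyed by file extension (hypothetical design).
READERS = {}


def register_reader(extension):
    """Decorator that registers a parser function for a file extension."""
    def decorator(func):
        READERS[extension] = func
        return func
    return decorator


@register_reader(".csv")
def read_csv(text):
    # Each CSV row becomes a dict keyed by the header row; values stay strings.
    return list(csv.DictReader(io.StringIO(text)))


@register_reader(".jsonl")
def read_jsonl(text):
    # One JSON object per non-empty line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_records(filename, text):
    """Dispatch to the registered reader based on the file extension."""
    for extension, reader in READERS.items():
        if filename.endswith(extension):
            return reader(text)
    raise ValueError(f"No reader registered for {filename}")


csv_rows = load_records("stores.csv", "store,visits\nA,10\nB,7\n")
jsonl_rows = load_records("events.jsonl", '{"store": "A", "visits": 10}\n')
print(csv_rows, jsonl_rows)
```

With this layout, a new source format is onboarded by writing one decorated function, and load_records itself never changes.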
The creation of a universal data extraction process
Using alternative data requires your system to be flexible. An extraction process has to be capable of extracting all the data points that are contained within a given file format. Relying on a hard-coded schema for each individual data source will lead to problems down the line.
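As a rough illustration of schema-free extraction, the sketch below flattens an arbitrarily nested record into dotted key paths, so every data point is captured without a per-source schema. The function name flatten and the sample record are assumptions for the example, and real file formats would need their own parsing step first.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts and lists into a single dict keyed by dotted paths."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)  # list positions become path segments
    else:
        return {prefix: record}  # a leaf value: store it under its full path
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


# Invented sample record; no schema is declared anywhere.
sample = {
    "store": {"id": 7, "geo": {"lat": 40.7, "lon": -74.0}},
    "tags": ["mall", "outlet"],
}
flat_sample = flatten(sample)
print(flat_sample)
```

Note that empty dicts and lists produce no keys in this sketch, so a production version would need to decide how to represent them.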
Accommodating schema shifts in the source data
Schema shifts are especially common in unstructured or self-gathered datasets, where fields can appear, disappear, or change type without notice. Your system needs to detect and absorb these changes rather than break when the data changes.
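A simple way to notice a schema shift is to infer the fields and value types of each delivery and compare them with the previous one. The sketch below assumes records arrive as lists of dicts. The function names schema_of and diff_schemas and the sample data are hypothetical.

```python
def schema_of(rows):
    """Infer a simple schema: field name -> set of Python type names seen."""
    schema = {}
    for row in rows:
        for field, value in row.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def diff_schemas(old, new):
    """Report fields added, removed, or whose observed types changed."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "retyped": sorted(f for f in old.keys() & new.keys() if old[f] != new[f]),
    }


# Invented deliveries: store_id changes from int to str and a new field appears.
last_week = [{"store_id": 1, "visits": 120}]
this_week = [{"store_id": "1", "visits": 95, "region": "NE"}]
drift = diff_schemas(schema_of(last_week), schema_of(this_week))
print(drift)
```

A report like this can gate a pipeline run or trigger an alert, so a shifted source is reviewed before it reaches downstream use cases.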
Scalability and avoidance of bottlenecks
You want to be able to digest and process each individual data source in a timely manner and be able to accomplish this in parallel with different data sources that will be landed at the same time. If your system needs to suddenly ingest a new, large dataset while simultaneously updating datasets already in its arsenal, it will need to be flexible and scalable to avoid bottlenecks and possible downtime.
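Processing sources in parallel, and isolating failures so one bad feed does not stall the rest, can be sketched with Python's standard concurrent.futures module. The ingest function here is a placeholder that only counts rows, and the source names and data are invented. A real ingestion step would read, validate, and store the data.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def ingest(source_name, rows):
    """Placeholder ingestion step (assumption): it only counts rows."""
    return source_name, len(rows)


def ingest_all(sources, max_workers=4):
    """Ingest every source concurrently; collect failures instead of stopping."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(ingest, name, rows): name for name, rows in sources.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                _, count = future.result()
                results[name] = count
            except Exception as exc:  # record the failure and keep going
                errors[name] = exc
    return results, errors


# Invented sources; "bad_feed" is deliberately malformed to show error isolation.
sources = {
    "web_traffic": [1, 2, 3],
    "card_transactions": [1, 2],
    "bad_feed": None,
}
results, errors = ingest_all(sources)
print(results, errors)
```

Threads suit I/O-bound ingestion such as downloads and file reads. CPU-heavy transformations would more likely call for processes or a distributed engine.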
Privacy, sensitivity, and confidentiality
Alternative data systems need effective data governance in place to be used safely and efficiently. Companies should also keep laws like GDPR in mind when collecting, ingesting, and using data.
Architecting for alternative data: important considerations
Data governance is essential to ensuring your alternative data program is an effective part of your overall enterprise. An effective strategy prevents organizational issues and conflicts resulting from the mismanagement of data and allows the appropriate people to access certain levels of the data.
In addition, when building your alternative data program, you need to keep the following best practices in mind:
Create an alternative data framework
Automate mundane tasks
Embed data cleansing and integration
Use a big data ecosystem
Sift through the noise and find data that will give your firm an edge over the competition
Caserta’s Alternative Data Supply Chain
At Caserta, our alternative data engagements all follow the same process, modeled on a supply chain. The alternative data supply chain ensures that a cohesive, effective, alpha-generating alternative data program is created with the shortest possible time-to-value.
Plan: We identify the business initiatives, plan the alternative data architecture, including data platform, data pipelines, ML/AI tools, and data visualization.
Source: We define an implementation plan and success criteria. Next, we source the alternative data, make decisions such as build vs. buy and which change data capture (CDC) techniques to use, and persist raw data in the internal data lake.
Construct: We integrate the data through testing, transform it into usable formats and structures, integrate new data assets with enterprise data, train and refine models, and create new data assets for delivery.
Deliver & Consume: We deliver the alternative data by managing the governance of new data assets, making them available to business users, tracking data usage patterns, refining dissemination methods, and coordinating data sharing with external partners.
Caserta is a leading strategic technology consulting and implementation firm with a reputation for creating bold, state-of-the-art solutions. Our business is built around creative thinking and harmonious collaboration with our clients.