IA Ventures was founded on the belief that managing and extracting value from massive, occasionally unstructured, often real-time data sets is a competitive advantage.
Most data generated today is simply treated as exhaust—lost forever along with the valuable insights held in it. This is purely because all but the most sophisticated organizations are overwhelmed by the massive datasets that are now commonplace. And this is just structured data. Messy unstructured data is everywhere and goes largely unexplored. Unlocking even a small amount of the information held within it will transform entire industries.
We believe many factors, including commodity clusters, cloud platforms, and advanced statistical algorithms, are making it possible to store, manage, and extract insight from massive datasets, creating a Big Data revolution. We invest in talented early-stage teams fueling this revolution by developing innovative tools, technologies, and analytics for managing and extracting value from Big Data, both structured and unstructured. We are interested in a wide variety of verticals, including government, healthcare, and financial services, and our investments are often directly or indirectly relevant to quantitative trading. Specific investment areas include:
Big Data is going mainstream. Where traditionally Big Data was the exclusive domain of large organizations, distributed systems built from on-site commodity hardware and public clouds are allowing everyone to participate. Ironically, these low-cost techniques are “trickling-up” into traditional enterprise IT departments and industries with “high-value” information. This creates additional requirements in a number of areas including management and security (particularly for cloud-based systems). We are interested in companies producing tools, services, and infrastructure to support large-scale distributed systems / cloud computing. This includes but is not limited to system and job distribution/management, security and privacy.
The world is quietly undergoing a Big Data storage crisis. Over the last decade the data requirements in most industries have increased exponentially. Dataset sizes that were unimaginable only a few years ago are now the norm. Off-the-shelf storage systems are near the breaking point, unable to handle today’s needs quickly and efficiently. What should be transparent technological plumbing is in fact limiting the types of analytics one can perform further up the stack, which ultimately constrains the insights one can uncover from available data. To overcome this bottleneck, companies are rolling their own technology. This is inefficient and unsustainable as Big Data storage and analysis become mainstream. As has historically been the case in the technology industry, these one-off solutions will become standardized, commercially available platforms, often based on existing open source technologies.
We are interested in scalable structured storage systems, including approaches to scaling traditional RDBMSs, enterprise-ready NoSQL systems, and novel approaches such as cloud-based structured storage. Additional areas of interest include embedded, highly parallelized analytics processing frameworks that enable analytics to happen as close to the data as possible, and approaches to efficient data security and leakage prevention.
Clean data is the lifeblood of all Big Data applications. While technology has lowered barriers to entry in many areas of the Big Data space, access to clean data (in particular clean financial data) has not yet enjoyed the same benefits. The rise of low-cost scalable structured storage, including cloud storage, is making it feasible for data middlemen to provide access cost-effectively. However, access is only half the challenge: the data must also be clean. Clean data is critical to any algorithmic Big Data system; without it, the most sophisticated prediction algorithms are rendered worthless. We are interested in providers of clean, novel, tradable data, delivered either locally or in the cloud. Also of interest are systems that can intelligently and automatically clean and normalize tradable data.
The web is a gold mine of information waiting to be unlocked. Historically it has been difficult for computers to extract valuable, meaningful and actionable information because it is buried in large, disparate, and potentially unstructured data including text, images, and video. This is changing. Increasingly sophisticated algorithms and analytics (enabled by low-cost hardware including GPUs and commodity clusters) are beginning to extract meaningful information from a sea of data. Much of this information could be valuable to a wide variety of markets and fields including quantitative trading. We are interested in providers of this automatically derived information whether they are local or cloud-based. It is strongly preferred that the information be directly relevant or easily re-purposable to trading.
The internet combined with Big Data technology is transforming many markets, making them increasingly liquid, transparent, and time-sensitive. For example, display ad markets have historically been inefficient, lacking price transparency and reliable metrics for valuing ad inventory. With the advent of real-time per-impression bidding and exchange-based buying and selling, these markets are poised to become liquid markets with search-like price discovery. This is a harbinger of what will happen in other markets, not to mention new markets that may be created. We are interested in companies using Big Data technologies to create and/or gain an “edge” in modern markets.