10 Hot Big Data Companies You Should Watch In 2021

Here’s a roundup of 10 big data technology companies to keep an eye on in 2021: some well-known (looking at you, Snowflake), some likely to break out this year, and some just starting to appear on industry watchers’ radar screens.


The Ones To Watch In 2021

IDC estimates that the “global datasphere” in 2020—the total amount of data created and consumed during the year—reached 59 zettabytes and will continue to grow at a 26 percent CAGR through 2024.
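To put that growth rate in perspective, the short sketch below compounds the 59-zettabyte figure forward at 26 percent a year. It assumes 2020 is the base year and that growth compounds annually for four periods through 2024; the function name is illustrative only, and the output is simple arithmetic on the numbers quoted above, not an IDC forecast.

```python
def project_datasphere(base_zb: float, cagr: float, years: int) -> float:
    """Compound a starting volume (in zettabytes) forward at a fixed annual rate."""
    return base_zb * (1 + cagr) ** years


if __name__ == "__main__":
    BASE_YEAR = 2020   # assumption: the 59 ZB estimate is the starting point
    BASE_ZB = 59.0     # from the IDC estimate quoted in the article
    CAGR = 0.26        # 26 percent compound annual growth rate

    # Print the implied volume for each year from 2020 through 2024.
    for offset in range(5):
        volume = project_datasphere(BASE_ZB, CAGR, offset)
        print(f"{BASE_YEAR + offset}: {volume:.1f} ZB")
```

Under those assumptions the datasphere would roughly multiply by 2.5 over four years, reaching nearly 150 zettabytes in 2024.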

Businesses and organizations are increasingly challenged to manage and govern all that data and to find ways to collect, prepare and analyze it for insights that can guide business strategies and provide a competitive edge. Gartner estimates that data and analytics leaders are spending 36 percent of their time on data preparation and data integration, more than on any other data management task.

To address those challenges, businesses can turn to a broad range of technologies and services from numerous big data companies—some still in startup mode and some that have been around for a while. Here’s a look at 10 big data companies that we will be keeping our eye on through 2021.


Ahana

CEO: Steven Mih

Ahana is one of two companies on our “To Watch” list that are working with Presto, an open-source, high-performance SQL engine that speeds up queries across multiple distributed data sources. That is a key issue for business analytics tasks that involve huge volumes of data scattered across on-premises and cloud systems.

Ahana exited stealth in 2020, offering software and services around Presto. In September the company debuted its Ahana Cloud for Presto, a fully managed service designed to simplify the adoption, use and management of Presto and make it easier to connect the big data technology with databases, data catalogs and data lake systems.

Ahana, based in San Mateo, Calif., has raised $4.8 million in venture financing.


Ataccama

CEO: Michael Klaus

Numerous big data companies, both established players and startups, are vying to develop software that automates and speeds up the complex, time-consuming data preparation and management tasks that must be done at the start of any business analytics initiative.

The Ataccama One system offers a single, AI-powered platform that carries out a number of data management and data governance tasks including data discovery and profiling, metadata management, data cataloging, data quality management, master data management and data integration.

In November the Toronto-based company launched a beta release of Ataccama One Gen2 with increased emphasis on automated, self-driving data management capabilities.

Cockroach Labs

CEO: Spencer Kimball

Cockroach Labs, founded in 2015, develops CockroachDB, a cloud-native, distributed SQL database that’s designed to handle workloads with huge volumes of transactional data. The company’s motto of “scale fast, survive anything, thrive anywhere” (hence the Cockroach name) stems from the database’s elasticity, failure-resistant architecture and multi-cloud flexibility.

Cockroach Labs said it more than doubled both its revenue and its customer roster in 2020. More than half of Cockroach Labs’ customers are running their critical applications on CockroachCloud, the fully managed cloud instance of CockroachDB that became generally available on AWS and the Google Cloud Platform last year.

The database management system (DBMS) market is currently valued at $55.4 billion and a recent Gartner report forecast that by 2022, 75 percent of all databases will be deployed or migrated to cloud platforms.

There are a lot of database software vendors, both established players and startups, competing for that business. Cockroach Labs appears well-positioned to lead the pack: The company just raised $160 million in Series E funding, bringing its total funding to $355 million and putting its valuation at $2 billion.


Confluent

CEO: Jay Kreps

Confluent is one of several big data companies that investors are eagerly expecting to launch an initial public offering, possibly repeating the huge success of Snowflake’s 2020 IPO.

Confluent’s flagship product, the Confluent Platform, organizes and manages massive volumes of streaming data and makes it available to operational applications, business analytics tools and information workers. The event stream processing software is based on Apache Kafka, open-source stream processing technology originally developed by Confluent’s founders while working at LinkedIn.

In April 2020 Confluent launched “Project Metamorphosis” to develop new products and capabilities to help customers adopt the Confluent platform. The Mountain View, Calif.-based company certainly has the resources: Confluent raised $250 million in Series E funding that month, bringing its total funding to more than $455 million and vaulting its market valuation to $4.5 billion.


Databricks

CEO: Ali Ghodsi

Databricks, founded in 2013 by the developers of the popular Spark big data processing engine, has been one of the hottest IT startups in recent years.

The San Francisco-based company’s product portfolio includes the Databricks Unified Data Service and the Databricks Lakehouse Platform. The company also develops big data tools such as the Delta Engine query execution software, introduced in June, and SQL Analytics for running SQL queries against massive data lakes, which debuted in November.

The company has raised $897 million in financing—including a stunning $400 million Series F round in 2019—and some observers are looking for the company to go public this year in an IPO to possibly rival last year’s Snowflake blockbuster.


Promethium

CEO: Kaycee Lai

Before information workers and analysts can engage in business analytics tasks, the data they need must often go through a number of costly, complex, time-consuming data operations including data discovery, preparation and assembly.

Startup Promethium has developed an all-in-one data management system that provides data discovery, preparation, query and visualization capabilities, fulfilling the promise of self-serve analytics by utilizing natural language processing and automating the entire data preparation and analysis process.

Founded in 2018, Menlo Park, Calif.-based Promethium raised $6 million in funding in early 2020. The company’s Promethium Data Navigation System was a finalist in the CRN 2020 Tech Innovator Awards.


SingleStore

CEO: Raj Verma

SingleStore develops a distributed relational SQL database management system that the company says can handle both operational and real-time analytical workloads. Like other next-generation database developers (such as Cockroach Labs in this roundup), SingleStore is pitching itself as a replacement for mainstream databases from Oracle and other vendors.

SingleStore (founded in 2011 as MemSQL) raised $80 million in Series E financing in December, bringing its total funding to $238 million. At the same time the San Francisco-based company struck a strategic alliance with data analytics software giant SAS under which the massively parallel analytics engine in the SAS Viya artificial intelligence and analytics platform is being integrated with the SingleStore database.

SingleStore also recently made some moves in the channel, naming CRG Solutions as a VAR for SingleStore software in Indian and ASEAN (Southeast Asian) markets and tapping NextGen as a value-added distributor in Asia-Pacific.


Snowflake

CEO: Frank Slootman

Data cloud platform provider Snowflake made a huge splash in 2020 when the company’s September IPO turned out to be one of the biggest of the year, giving it a bigger market capitalization (currently around $82 billion) than such leading IT vendors as Dell Technologies and VMware.

So, what do you do for an encore?

Snowflake initially positioned itself as a cloud-based data warehouse service provider, but the company has broadened its offerings to include a range of cloud-based data management services including data science and data sharing. That means it’s competing with established giants like Amazon Web Services and Microsoft as well as smaller companies and startups offering various cloud-based big data products and services.

In November Snowflake launched a number of new platform features and capabilities that enable customers to work with more data types, exert better control over data and more easily access data services. New development capabilities called Snowpark allow data engineers, data scientists and developers to write code in languages of their choice and execute workloads such as data preparation and ETL (extract, transform, load) on Snowflake.

Those are the kinds of innovations Snowflake, based in San Mateo, Calif., must continue to provide in 2021. Partners will also be watching to gauge the progress of its channel efforts, following the launch of the Snowflake Partner Network in June 2020.


Starburst Data

CEO: Justin Borgman

Starburst Data develops Starburst Enterprise for Presto, the company’s commercial offering of the Presto open-source, distributed SQL query engine for finding and analyzing data that resides in a variety of distributed data sources.

Presto is capable of querying data where it resides without having to move it—a major advantage in cloud and hybrid IT environments with increasingly scattered data sources. That makes it a more cost-efficient alternative to traditional data warehouse systems.

Founded in 2017, Starburst just raised $100 million in Series C funding, bringing the Boston-based company’s total funding to $164 million and its valuation to $1.2 billion. In November the company launched Starburst Orbit, its first partner program.


TIBCO

CEO: Dan Streetman

TIBCO entered 2021 by closing its acquisition of business analytics pioneer Information Builders, announced in October, for an undisclosed sum. Now begins the challenging work of combining the two companies’ operations and product lines.

TIBCO plans to add Information Builders’ flagship WebFOCUS business analytics and reporting platform to its product lineup and enrich its Hyperconverged Analytics business analytics strategy. Information Builders’ data quality, preparation and integration products are being added to the TIBCO Any Data Hub and TIBCO Responsive Application Mesh strategies.

TIBCO is already a player in the business analytics and big data management arena thanks to previous acquisitions such as Spotfire, Jaspersoft, Statistica, Alpine Data Labs and SnappyData. But the Information Builders acquisition was TIBCO’s largest, and the industry will be watching to see how well the company leverages its expanded product portfolio.