
Apache Flink vs Databricks comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 17, 2024

 

Categories and Ranking

Apache Flink
Ranking in Streaming Analytics: 4th
Average Rating: 7.8
Reviews Sentiment: 6.7
Number of Reviews: 19
Ranking in other categories: None

Databricks
Ranking in Streaming Analytics: 1st
Average Rating: 8.2
Reviews Sentiment: 7.0
Number of Reviews: 94
Ranking in other categories: Cloud Data Warehouse (4th), Data Science Platforms (1st), Data Management Platforms (DMP) (5th)
 

Mindshare comparison

As of June 2026, Apache Flink holds 8.2% of the mindshare in the Streaming Analytics category, down from 13.7% the previous year. Databricks holds 7.9%, down from 14.5% the previous year. Mindshare is calculated from PeerSpot user engagement data.
Streaming Analytics Mindshare Distribution
Product: Mindshare (%)
Databricks: 7.9%
Apache Flink: 8.2%
Other: 83.9%
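The mindshare figures above can be cross-checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes the percentages are exactly as quoted, and the helper names (point_change, other_share) are made up for this example and are not part of any PeerSpot tooling.

```python
# Mindshare figures quoted in the text (percent of the Streaming Analytics category).
mindshare = {
    "Apache Flink": {"previous": 13.7, "current": 8.2},
    "Databricks": {"previous": 14.5, "current": 7.9},
}


def point_change(previous, current):
    """Year-over-year change in percentage points (negative means a decline)."""
    return round(current - previous, 1)


def other_share(shares):
    """Share left for all other products once the listed ones are subtracted."""
    return round(100.0 - sum(shares), 1)


changes = {
    name: point_change(v["previous"], v["current"]) for name, v in mindshare.items()
}
other = other_share(v["current"] for v in mindshare.values())

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f} points")
print(f"Other: {other:.1f}%")
```

Under these assumptions, Apache Flink is down 5.5 points and Databricks 6.6 points year over year, and the remaining share matches the 83.9% reported for "Other".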
 

Featured Reviews

Sanjay Srivastava - PeerSpot reviewer
Software Architect at IBM
Streaming workflows have improved data integration and support real-time pipelines across platforms
We are not using Apache Flink in its advanced window capabilities. We are using the Apache Flink job in Apache SeaTunnel, meaning we can write the code inside Apache SeaTunnel. Currently, we are moving; both solutions are there. We are doing it on-premises with the help of Kubernetes and OpenShift. The main reason why Apache Flink is better is that it has more functions, and being open source with easy code in Apache SeaTunnel helps us achieve that. Cost is a major issue. I would rate the stability of the product as an eight. For Apache Flink, the final point can be rated an eight. I can recommend Apache Flink to other users for streaming support, and I am recommending it. I would rate this review an eight overall.
SimonRobinson - PeerSpot reviewer
Governance And Engagement Lead
Improved data governance has enabled sensitive data tracking but cost management still needs work
I believe we could improve Databricks integration with cloud service providers. The impact of our current integration has not been particularly good, and it's becoming very expensive for us. The inefficiencies in our implementation, such as not shutting down warehouses when they're not in use or reserving the right number of credits, have led to increased costs. We made several beginner mistakes, such as not taking advantage of incremental loading and running overly complicated queries all the time. We should be using ETL tools to help us instead of doing it directly in Databricks. We need more experienced professionals to manage Databricks effectively, as it's not as forgiving as other platforms such as Snowflake. I think introducing customer repositories would facilitate easier implementation with Databricks.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The ease of usage, even for complex tasks, stands out."
"With Flink, it provides out-of-the-box checkpointing and state management. It helps us in that way. When Storm used to restart, sometimes we would lose messages. With Flink, it provides guaranteed message processing, which helped us. It also helped us with maintenance or restarts."
"We value this solution's intricate system because it comes with a state inside the mechanism and product, allowing us to process batch data, stream to real-time and build pipelines, and we do not need to process data from the beginning when we pause as we can continue from the same point where we stopped, helping us save time as 95% of our pipelines will now be on Amazon and we'll save money by saving time."
"Another feature is how Flink handles its radiuses. It has something called the checkpointing concept. You're dealing with billions and billions of requests, so your system is going to fail in large storage systems. Flink handles this by using the concept of checkpointing and savepointing, where they write the aggregated state into some separate storage. So in case of failure, you can basically recall from that state and come back."
"It provides us the flexibility to deploy it on any cluster without being constrained by cloud-based limitations."
"The product helps us to create both simple and complex data processing tasks. Over time, it has facilitated integration and navigation across multiple data sources tailored to each client's needs. We use Apache Flink to control our clients' installations."
"Apache Flink is meant for low latency applications. You take one event opposite if you want to maintain a certain state. When another event comes and you want to associate those events together, in-memory state management was a key feature for us."
"This is truly a real-time solution."
"It is a cost-effective solution."
"The most valuable feature is the versatility of the ecosystem."
"The capacity of use of the different types of coding is valuable. Databricks also has good performance because it is running in spark extra storage, meaning the performance and the capacity use different kinds of codes."
"It's easy to increase performance as required."
"Databricks is a scalable solution. It is the largest advantage of the solution."
"Prior to using Azure Databricks in the cloud, we had Databricks installed in clusters, and since our implementation, the performance has increased and our cost has been reduced."
"The solution is built from Spark and has integration with MLflow, which is important for our use case."
"Imageflow is a visual tool that helps make it easier for business people to understand complex workflows."
 

Cons

"Amazon's CloudFormation templates don't allow for direct deployment in the private subnet."
"In terms of improvement, there should be better reporting. You can integrate with reporting solutions but Flink doesn't offer it themselves."
"The state maintains checkpoints and they use RocksDB or S3. They are good but sometimes the performance is affected when you use RocksDB for checkpointing."
"Failure is another area where it is a bit rigid or not that flexible."
"There is a learning curve. It takes time to learn."
"One way to improve Flink would be to enhance integration between different ecosystems."
"The technical support from Apache is not good; support needs to be improved. I would rate them from one to ten as not good."
"There are more libraries that are missing and also maybe more capabilities for machine learning."
"The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration."
"When I used the support, I had communication problems because of the language barrier with the agent. The accent was difficult to understand."
"The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."
"The biggest problem associated with the product is that it is quite pricey."
"I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times."
"If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS and Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. It is very essential to creating a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks."
"They release patches that sometimes break our code. These patches are supposed to fix issues, but sometimes they cause disruptions."
"In the next release, I would like to see more optimization features."
 

Pricing and Cost Advice

"It's an open source."
"Apache Flink is open source so we pay no licensing for the use of the software."
"It's an open-source solution."
"This is an open-source platform that can be used free of charge."
"The solution is open-source, which is free."
"I am based in South Africa, where it is expensive adapting to the cloud, and then there is the price for the tool itself."
"The solution is based on a licensing model."
"The cost is around $600,000 for 50 users."
"The basic version of this solution is now open-source, so there are no license costs involved. However, there is a charge for any advanced functionality and this can be quite expensive."
"The cost for Databricks depends on the use case. I work on it as a consultant, so I'm using the client's Databricks, so it depends on how big the client is."
"I rate the price of Databricks as eight out of ten."
"Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
"We're charged on what the data throughput is and also what the compute time is."
 

Top Industries

By visitors reading reviews
Financial Services Firm: 19%
Retailer: 13%
Computer Software Company: 9%
Manufacturing Company: 5%

Financial Services Firm: 18%
Manufacturing Company: 10%
Computer Software Company: 7%
Healthcare Company: 5%
 

Company Size

By reviewers

Apache Flink
Small Business: 5
Midsize Enterprise: 3
Large Enterprise: 12

Databricks
Small Business: 27
Midsize Enterprise: 12
Large Enterprise: 57
 

Questions from the Community

What is your experience regarding pricing and costs for Apache Flink?
The solution is expensive. I rate the product’s pricing a nine out of ten, where one is cheap and ten is expensive.
What needs improvement with Apache Flink?
Apache could improve Apache Flink by providing more functionality, as they need to fully support data integration. The connectors are still very few for Apache Flink. There is a lack of functionali...
What is your primary use case for Apache Flink?
I am working with Apache Flink, which is the tool we use for data integration. Apache Flink is for data, and we are working on the data integration project, not big data, using Apache Flink and Apa...
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their orga...
 


Also Known As

Apache Flink: Flink
Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
 


Sample Customers

LogRhythm, Inc., Inter-American Development Bank, Scientific Technologies Corporation, LotLinx, Inc., Benevity, Inc.
Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Find out what your peers are saying about Apache Flink vs. Databricks and other solutions. Updated: June 2026.
900,644 professionals have used our research since 2012.