Rupal Sharma - PeerSpot reviewer
Data Architect at Three Ireland (Hutchison) - Infrastructure
Real User
Processes large data for data science and data analytics purposes
Pros and Cons
  • "Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours."
  • "There is room for improvement in visualization."

What is our primary use case?

It's mainly used for data science, data analytics, visualization, and industrial analytics.

What is most valuable?

Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours.

That's why it's quite convenient for data science and for training machine learning models. By using more computing power, you can make it even faster.

What needs improvement?

There is room for improvement in visualization.

For how long have I used the solution?

I used it for two years. I worked with the latest update. 

Buyer's Guide
Databricks
September 2025
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: September 2025.
868,787 professionals have used our research since 2012.

What do I think about the stability of the solution?

I would rate the stability a nine out of ten. I didn't face performance drops.

What do I think about the scalability of the solution?

I would rate the scalability an eight out of ten.

How are customer service and support?

Databricks' support is great. If we need any support, they are very quick with it. And they genuinely want you to use Databricks. So, whatever we ask them, they come up with multiple solutions to our problem statements. That's really good.

Overall, the customer service and support are very good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I personally prefer using Databricks. However, we also considered using Snowflake, but the pricing was different. It's priced per query.

A data scientist or a data analytics team needs to query the stored data again and again, so per-query pricing does not suit a data-heavy organization.

What was our ROI?

It's a good return on investment for Databricks from a delivery perspective. We delivered multiple dashboards, so it's quite a good return on investment. And being a small organization, everyone can use Databricks, and cost-wise, it's also good for small organizations.

Which other solutions did I evaluate?

If the company is a startup, Databricks might be suitable. If a big company needs a lot of storage, Teradata might be best for them. It depends on the situation.

What other advice do I have?

Overall, I would rate the solution an eight out of ten. I would definitely recommend this solution for small organizations.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Sahil Taneja - PeerSpot reviewer
Principal Consultant/Manager at Tenzing
Real User
Processes tremendous data easily
Pros and Cons
  • "The processing capacity is tremendous in the database."
  • "There is room for improvement in the documentation of processes and how it works."

What is our primary use case?

In our project, we are dealing with geospatial data, where we need a lot of computing resources. A traditional warehouse cannot handle the amount of data we are using, and this is where Databricks comes into the picture.

What is most valuable?

The processing capacity is tremendous in the database. We are using Azure as storage, and we have not faced any challenges. The connectors to different data sources are also valuable. Moreover, it is not a language-dependent tool, so development takes place faster. It is one of the best features of Databricks.

What needs improvement?

There is room for improvement in the documentation of processes and how it works. I was trying to get one of the certifications, so I saw an area of improvement there. 

For how long have I used the solution?

I have been using Databricks for eight to nine months.

What do I think about the stability of the solution?

It is a stable product for us. We didn't see any challenges. 

What do I think about the scalability of the solution?

There are around 30 to 35 users in our organization. 

How was the initial setup?

The initial setup was easy because the third-party team made the clusters for us. 

What about the implementation team?

A third-party team enabled the cluster to make the setup easy for us. 

What other advice do I have?

I would advise using it based on the use case because it easily handles big data. It is your go-to tool if you are dealing with massive data. 

Overall, I would rate the solution a nine out of ten. The tool performs well in various use cases, documentation is available online, and it is compatible with big data systems like GCP, Azure, or AWS.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Tajinder_Singh - PeerSpot reviewer
Senior Software Engineer at a computer software company with 201-500 employees
Real User
Leaderboard
Valuable data analysis and engineering features with an easy setup
Pros and Cons
  • "The setup is quite easy."
  • "Can be improved by including drag-and-drop features."

What is our primary use case?

Our primary use case for the solution is data analysis. It provides a Spark cluster environment with a driver to analyze huge amounts of data, gigabytes of it, and we can create Notebooks in Databricks. We can write SQL commands, Python code, Scala, or Spark with Python. With Databricks, we get a cluster hosted in the public cloud and we adjust it based on how much we use it.

What is most valuable?

The most valuable features are data engineering and data science because we can create Notebooks on them. We can use any Python library to build data science models, or we can use libraries like Seaborn or Matplotlib to create charts based on data for data analysis. It is a really valuable capability.

What needs improvement?

Microsoft Azure has its learning environment on the Microsoft website, where we can complete certifications, but the Databricks certification is more expensive than Microsoft's. It costs between $2,000 and $2,500, and the knowledge is linked. Users are also charged even if they don't want to analyze large amounts of data. Hence, we would like free capacity for student users so that people can learn and build their professional skills.

For how long have I used the solution?

We have been using the solution for approximately one year.

What do I think about the stability of the solution?

The solution is stable. Microsoft offers a public service, and we can get it from the Databricks website. Additionally, many companies use it to analyze their data or create a Spark cluster to run Python or SQL scripts based on their data. I rate the stability a nine out of ten.

How was the initial setup?

The setup is quite easy, and Databricks has also partnered with Microsoft, so we get this service on Microsoft Azure.

What was our ROI?

We have seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

We have a pay-as-you-go subscription and pay for it based on our usage.

Which other solutions did I evaluate?

We chose this solution because my company uses Microsoft Azure for a project, and my role as a data engineer primarily focuses on data-related services. For storing data, we use Data Lake; similarly, for the data processing engine, we use Spark, which Databricks provides.

What other advice do I have?

I rate the solution an eight out of ten. The solution is good but can be improved by including drag-and-drop features because it can be helpful for users who are unfamiliar with coding. I advise new users to have prior experience with Python or SQL before utilizing this solution if they use it for data science or model building. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2041779 - PeerSpot reviewer
Principal at a computer software company with 5,001-10,000 employees
Real User
Has advanced modeling and machine-learning features; highly scalable, with no stability issues
Pros and Cons
  • "What I like about Databricks is that it's one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that."
  • "I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement."

What is our primary use case?

I've worked with Databricks primarily in the pharmaceuticals and life sciences space, which means a lot of work on patient-level data and the predictive analytics around that.

Another use case for Databricks is in the manufacturing industry. I'm a consultant, so the use cases for the product vary, but my primary use case for it is in the pharma space.

What is most valuable?

From a data science and applied analytics perspective, what I like about Databricks is that it's probably one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that, and then go ahead and make that available for dissemination of insights. For example, you can save all data and build out endpoints, so business analysts and users can access that data through a dashboard.

During the process, I also like that Databricks allows you to do version control to keep track of your operations on the data and maintain that lineage to create reproducible results.

The most significant Databricks advantage is that you can do everything within the platform. You don't need to exit the platform because it's a one-stop shop that can help you do all processes.

The solution is top-notch from a data science, applied ML, or advanced analytics perspective.

What needs improvement?

I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement. Still, I am generally unaware of any super-critical issues.

For how long have I used the solution?

My experience with Databricks is two and a half years.

What do I think about the stability of the solution?

Databricks stability is an eight out of ten because I never had issues with its stability.

What do I think about the scalability of the solution?

Databricks has high scalability. Most of my work on the solution has been in the pharma space, which has massive data sets, so it's a nine out of ten, scalability-wise.

How are customer service and support?

I've never dealt with the Databricks technical support team.

How was the initial setup?

I don't have experience setting up Databricks because that's generally taken care of by the IT, data, or software engineering team before the data science team comes in and starts leveraging the platform. I have yet to experience setting up the Databricks environment personally. However, I have had experience setting up clusters, which was pretty straightforward. Still, in the overall environment of an enterprise-wide system, I have yet to gain experience setting Databricks up.

What's my experience with pricing, setup cost, and licensing?

The cost for Databricks depends on the use case. I work on it as a consultant, so I'm using the client's Databricks, so it depends on how big the client is. If it's a global organization, that cost varies versus a smaller organization that has just adopted the platform and is trying to onboard a small team of five people. It depends.

What other advice do I have?

I'm a data scientist, so I frequently use Databricks and Domino Data Science Platform.

I'm a consultant, so every client has a different version or a different runtime in Databricks, so the versions used would vary per client.

The deployment for the solution is on the cloud, predominantly on AWS or Azure.

My clients adopted Databricks as the platform of choice, and with different use cases and more teams coming on board, the usage of Databricks will increase. I don't see that going down. It can only go up.

My advice to anyone looking into implementing Databricks is that it should be one of your top choices, especially if you're looking to focus on data processing, standard ETL operations, advanced analytics, or the ML type of work.

I'd rate the solution as nine out of ten. It checks almost all the boxes that modern applications need to have.

My organization is an active partner and implementer of Databricks, but it doesn't resell the solution.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user
Lead Data Scientist at a manufacturing company with 10,001+ employees
Real User
A great solution that has allowed for collaboration within our organization
Pros and Cons
  • "We have the ability to scale, collaborate and do machine learning."
  • "The product cannot be integrated with a popular coding IDE."

What is our primary use case?

Our primary use case for this solution is research for data scientists. The solution is deployed on cloud.

How has it helped my organization?

It has allowed our data engineers, data scientists, and analysts to collaborate and work on the same platform. 

What is most valuable?

We have the ability to scale, collaborate and do machine learning.

What needs improvement?

The product cannot be integrated with a popular coding IDE.

For how long have I used the solution?

We have been using this solution for approximately three years.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

The solution is scalable. There are five people using it in our organization.

How are customer service and support?

I rate my experience with customer service and support an eight out of ten.

Which solution did I use previously and why did I switch?

We previously used H2O.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

Implementation was done in-house.

What was our ROI?

We have seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

Licensing is charged on a yearly basis and costs between 25,000 and 30,000.

Which other solutions did I evaluate?

We evaluated other options but this solution was the best fit for what we required.

What other advice do I have?

I rate this solution nine out of ten. The solution is good but can be improved by integrating with a popular coding IDE.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Sudhendra Umarji - PeerSpot reviewer
Technical Architect at Infosys
MSP
Enables us to find anomalies and apply rules to the streaming data
Pros and Cons
  • "The ability to stream data and the windowing feature are valuable."
  • "Support for Microsoft technology and the compatibility with the .NET framework is somewhat missing."

What is our primary use case?

We use this solution for finding anomalies and applying the rules to the streaming data.

There are around 50 people using this solution in my organization, including data scientists.

What is most valuable?

The ability to stream data and the windowing feature are valuable. There are a number of targeted integration points, which is a difference between Stream Analytics and Databricks. The input and output integrations are better in Databricks. It's easy to use Python or even Java. I can take a third-party component, deploy it, and use it.
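
As a rough illustration of what windowing plus a rule on streaming data can look like, here is a minimal plain-Python sketch. It is not Databricks or Spark code. The event layout, the 60-second window, and the threshold rule are all assumptions made up for the example: readings are grouped into fixed, non-overlapping (tumbling) windows per device, and a window is flagged when its mean crosses the threshold.

```python
from collections import defaultdict
from statistics import mean


def tumbling_window_means(events, window_seconds):
    """Group (timestamp_seconds, device_id, value) events into fixed,
    non-overlapping windows and return the mean per (window_start, device_id)."""
    buckets = defaultdict(list)
    for ts, device_id, value in events:
        # Align each timestamp to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        buckets[(window_start, device_id)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def flag_anomalies(window_means, threshold):
    """Hypothetical rule: a window is anomalous if its mean exceeds the threshold."""
    return sorted(key for key, avg in window_means.items() if avg > threshold)


# Made-up readings: (timestamp in seconds, device id, temperature)
events = [
    (0, "dev-1", 20.0), (30, "dev-1", 22.0),
    (65, "dev-1", 80.0), (90, "dev-1", 90.0),
    (10, "dev-2", 21.0), (70, "dev-2", 19.0),
]

window_means = tumbling_window_means(events, window_seconds=60)
anomalies = flag_anomalies(window_means, threshold=50.0)
print(anomalies)  # prints [(60, 'dev-1')]
```

On a real stream the same grouping would be expressed with the platform's own windowing functions rather than an in-memory dictionary.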

What needs improvement?

Support for Microsoft technology and the compatibility with the .NET framework is somewhat missing. There should be reliable compatibility between the two. Databricks is based on open source. It would be better if it were more in sync with Microsoft technology and its programming languages. Python is better supported, but .NET compatibility would be a great help.

I would like to have better support for Microsoft technology and better language components.

With Azure or Cosmos DB, I can store other data links or time series data tables. That would be a great help for analytics in real time.

For how long have I used the solution?

I have been using Databricks for eight months.

What do I think about the scalability of the solution?

The scalability is fine. We had thousands of devices and were sending data infrequently, so that worked for us. If the amount increases, the windowing function and job schedule may not perform as expected.

How are customer service and support?

I would rate technical support 4 out of 5. We had some issues with setup, and they were finally solved but it was after following up a few times.

Which solution did I use previously and why did I switch?

Azure Stream Analytics is easy to use and easy to deploy. It's a little bit better. Databricks is still having some stability issues. Azure Stream Analytics has a few input and output sources, and it's scalable to all types of third-party interfaces.

How was the initial setup?

Setup was complex. There were some issues with setting up a database and installing the third party component on top of services. I would rate the setup 3 out of 5.

What about the implementation team?

Implementation was done in-house.

What's my experience with pricing, setup cost, and licensing?

The cost is around $600,000 for 50 users.

I would rate the price 2 out of 5.

What other advice do I have?

I would rate this solution 8 out of 10.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Jithin James - PeerSpot reviewer
Financial Analyst 4 (Supply Chain & Financial Analytics) at Juniper Networks
MSP
Top 5
Easy to collaborate with other team members who are working on it
Pros and Cons
  • "Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy."
  • "Databricks could have more collaborative features than it has. It should have some more customization for the jobs."

What is our primary use case?

We use the solution for reliability engineering, where we apply ML and Deep Learning models to identify the failure patterns across different geographies and products.

What is most valuable?

Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy.

What needs improvement?

Databricks could have more collaborative features than it has. It should have some more customization for the jobs. Also, it has an average dashboarding tool. They could add advanced features so we don't depend on other BI tools to build a dashboard. We are using Tableau to create dashboards. If Databricks had more advanced features, we could use Databricks entirely.

For how long have I used the solution?

I have been using Databricks for one year.

What do I think about the stability of the solution?

The product is stable. It has been giving consistent outputs without any major issues.

What do I think about the scalability of the solution?

The solution is hosted on the cloud. It supports high scalability features.

10-20 users are using this solution.

How are customer service and support?

There was a training session from Databricks where they explained how to use it. We never had to contact them because they had already given us proper training on the platform.

Which solution did I use previously and why did I switch?

I have used Alteryx before. We switched to Databricks because it can compute and turn your code into production-ready code in a few seconds. Also, the stability is relatively high.

How was the initial setup?

The initial setup is easy.

What about the implementation team?

We have a dedicated team for the deployment.

What other advice do I have?

Delta Lake is a free system. We practically work on the data that we get from Snowflake. In Databricks, the model outputs are written back to Delta Lake. It is easy for us to collaborate using Delta Lake, and the computation speed is also quite high for Delta Lake.

The learning curve for Databricks is not very steep. It's pretty easy, and you will find a lot of materials online. So, if you are comfortable coding in Python, it's very straightforward. There is nothing to worry about when using Databricks.

Overall, I rate the solution a ten out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
MILTON FERREIRA - PeerSpot reviewer
Co-founder/Senior Data Scientist at Hence
Real User
Responsive support, integrates and scales well
Pros and Cons
  • "The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
  • "The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks, which primarily support Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."

What is our primary use case?

We are using Databricks for machine learning workloads specifically.

Databricks aligns well with our skillset and overall approach. We sought out their solution specifically for a big data application we are currently working on, as we needed a platform capable of handling large amounts of data and building models. Additionally, the fact that they use open-source software and can integrate data warehouse and data lake systems was particularly appealing, as we have encountered such issues in the past. We determined that Databricks would be an effective solution for our needs.

What is most valuable?

The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production. 

What needs improvement?

The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks, which primarily support Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team.

The most important feature other than the Jupyter interface would be to have the RStudio interface inside Databricks. This would be perfect.

For how long have I used the solution?

We have been using Databricks for approximately one year.

What do I think about the stability of the solution?

The stability of Databricks is good.

I rate the stability of Databricks a nine out of ten.

What do I think about the scalability of the solution?

Databricks is scalable.

I rate the scalability of Databricks a nine out of ten.

How are customer service and support?

I have been receiving responsive answers from Databricks's support. I have been pleased with the support.

I rate the support from Databricks a ten out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup of Databricks is simple. I did not experience any challenges. The time it takes for the deployment is approximately four hours.

I rate the initial setup of Databricks.

What about the implementation team?

We did the deployment of the solution in-house. Three people were involved in the deployment: a data engineer, a data analyst, and a machine learning engineer.

What's my experience with pricing, setup cost, and licensing?

We have only incurred the cost of our AWS cloud services. This is because during this period, Databricks provided us with an extended evaluation period, and we have not spent much money yet. We are just starting to incur costs this month, so I will know more about the full cost later.

We only pay standard fees for the solution. 

What other advice do I have?

We use a data engineer, data analyst, and machine learning engineer for the maintenance of the solution.

I rate Databricks a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: September 2025