I like the simplicity and ease of use.
You can deploy the solution to many clouds easily.
The initial setup is straightforward.
The solution offers a free community version.
The auto models can be improved.
We can create auto models like Microsoft Azure Machine Learning. In Azure Machine Learning, they have these features, for example, for auto models or code, or by code. They need this in Databricks.
We need more connectors between on-premises and the cloud.
We'd like a more visual dashboard for analysis. It needs a better UI.
I've used the solution for one and a half months.
The solution is very stable. There are no bugs or glitches. It doesn't crash or freeze.
Scalability is no problem. At the beginning, we created a cluster, for example, and if we need more performance in the future, for example, or to accelerate the training, we can change the cluster. It's quite straightforward.
We have five people using the solution.
In one or two years, we'd like to promote the solution to clients and increase usage. Right now, the way it is used is limited. I know that some banks and aeronautics companies use it.
In terms of technical support, for now, we use the community.
We are also aware of KNIME, Azure Machine Learning, and Anaconda. In Anaconda, we use many frameworks, for example.
We started with other platforms, such as Azure Machine Learning, because AutoML makes it easy to use. However, now that we have more skills, we need other tools or platforms like Databricks. It's a good platform to deploy and develop machine learning in employees.
The implementation is quite easy. It's not complex or difficult. The first time, I did it using a tutorial which was quite helpful. Later, I took a course. I know it quite well.
The deployment only takes a few days.
You only need to deploy or maintain the solution.
We did not need any outside assistance in terms of setting up the solution.
For us, this product is free. We use the community version.
I am interested in using the enterprise version, however. Whether we use it or not depends on the projects and customers we get.
I work with a solution provider. We are a Databricks customer.
We are not partners of Databricks. We are only partnered with Microsoft Azure and Amazon AWS.
We are using the latest version of the solution. However, I do not know the exact version number.
I still need time with the solution before providing advice to others. I need to prepare the capacity internally. So far, it's been great.
I'd rate the solution eight out of ten.
We use Databricks to define tool data and have many use cases to analyze and distribute the data.
Data is open to everyone; they can access it through many channels, including notebooks or SQL. That on its own democratizes the data.
I like cloud scalability and data access for any type of user.
It would be better if it were faster. It can be slow, and it can be super fast for big data. But for small data, sometimes there is a sub-second response, which can be considered slow.
In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them.
I have been using Databricks for roughly one and a half years.
Stability is excellent.
Databricks is scalable. You can use the power of the cloud to scale your cluster size, either CPU or memory. The data doesn't work like a standard database, so you don't have it based on files, and you don't copy the data. It's super scalable. It's only the computing that you have to scale with the data.
We probably have 40 users with roles like developers, business analysts, and data scientists. We have big plans to increase the usage and have more departments using it.
Technical support has helped us.
On a scale from one to ten, I would give technical support a five.
Positive
We used Cloudera before switching to Databricks.
The initial setup was fairly okay. It takes about two minutes to deploy this solution. It's all code, so we click a button, and then it's done.
On a scale from one to five, I would give the initial setup a four.
We set up and deployed this solution.
On a scale from one to five, I would give our ROI a three.
We only pay for the Azure compute behind the solution. If you want to compute, you have to have a database layer and Azure below.
On a scale from one to five, I would give their pricing a two.
We looked at other options such as Snowflake and Cloudera on the cloud.
I would tell potential users that they need proper cloud engineers and a cloud infrastructure team to use this solution.
On a scale from one to ten, I would give Databricks a nine.
I use Databricks for customer marketing analytics.
Databricks lets you schedule jobs pretty easily, and you can use SQL, Spark SQL, Python, or R. It also allows you to save a table or view.
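The difference between saving a table and saving a view can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in rather than Spark SQL on Databricks, and the names orders, totals_table, and totals_view are hypothetical. The general idea carries over: a saved table is a snapshot of the query result, while a view is a stored query that is re-evaluated each time it is read.

```python
import sqlite3

# In-memory database used purely as a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10.0), ("b", 25.0), ("a", 5.0)],
)

query = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"

# Saving a table materializes the result as it is right now.
conn.execute(f"CREATE TABLE totals_table AS {query}")
# Saving a view stores only the query; it runs again on every read.
conn.execute(f"CREATE VIEW totals_view AS {query}")

# New data arrives after both objects were saved.
conn.execute("INSERT INTO orders VALUES ('b', 100.0)")

table_rows = dict(conn.execute("SELECT customer, total FROM totals_table").fetchall())
view_rows = dict(conn.execute("SELECT customer, total FROM totals_view").fetchall())

print("table:", table_rows)  # snapshot, does not include the new order
print("view: ", view_rows)   # re-computed, includes the new order
```

For analytics work, the choice usually comes down to whether you want a fixed result to report on (table) or a result that always reflects the latest source data (view).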
I like that you can connect to multiple data sources. Most of our data is stored in the Azure data lake, but my previous company connected to SQL databases or even blob storage.
They've improved on many features. I don't do data engineering, but I had an issue a couple of years ago, two companies ago. It took a long time to read and save tables, but I think the new Delta feature helped.
I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature.
I would like it if Databricks adopted an interface more like RStudio. When I create a data frame or a table, RStudio provides a preview of the data. In RStudio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data.
Because I work in analytics and not data engineering, I think that's probably the biggest one. There are better graphical tools, so I don't think Databricks can compete. You can do a simple graph, and it's not that great. However, I don't think they can ever stack up to Tableau, so it's probably not worth it to improve upon that.
I've been using Databricks for two years.
Databricks is stable.
Databricks is scalable.
Databricks tech support has been great every time I've dealt with them. Their team is highly knowledgeable.
Setting up Databricks is easy. I set it up at my previous company. That was on Azure as well, but they utilized a third-party team with expertise in Databricks to ensure everything was optimized.
I rate Databricks 10 out of 10. I recommend taking advantage of Databricks support or a third-party provider to ensure it's set up optimally. I don't know if it's an additional service you must pay for, but we always had access to Databricks support in my last company.
I think that's worth the money because there are so many different scenarios with distributed computing. Even people who study analytics may not understand the ins and outs of Spark. It's worth it to have a service contract for support.
I use Databricks to explore new features and provide the industry visibility and scalability of Databricks to the companies that I work with.
I create proof of concepts for companies. As a consultant, I also create training courses on Databricks. If a company wants to leverage a service provided by Databricks and needs to train people, they use our courses.
Databricks has a scalable Spark cluster creation process. The creators of Databricks are also the creators of Spark, and they are the industry leaders in terms of performance.
Databricks has made great strides in terms of performance.
It is very user friendly. I like the ease of creating a Spark cluster, submitting a job, or creating a notebook.
The UI has also changed for the better compared to what it was two years ago.
If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS and Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. It is very essential to creating a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks.
It's a big ask to have people jump through a lot of hoops to get approval to create a Databricks cluster just to explore it, but if they can try it on their own with a free trial without an underlying cloud account it would be more convenient.
Documentation can be improved as well. There are so many versions of documents. For example, when I tried to create a DBU vault and secrets file, I had to go through multiple versions of documents. This could be improved so that the documentation is easy to use.
I've been using this solution for about two years.
Stability wise, it's quite okay. In my experience, it doesn't crash.
I have not used autoscaling because it consumes a lot of money and because my experience has been alright. In some cases, though, it is tied to the quota of the underlying infrastructure. I have not tested the scalability to its fullest extent, but with the workloads I run, it has been fine.
When I wanted to create an AWS account and contacted technical support via email, I never received a response. Recently, however, I think they have improved their support a little bit, and I did get a call in response to my question. Overall, I've not faced any issues with the person I had to contact directly.
The initial setup is not very easy; it's medium in complexity.
Databricks is a very expensive solution. Pricing is an area that could definitely be improved. They could provide a lower end compute and probably reduce the price.
I would rate Databricks at seven on a scale from one to ten. If you compare it to Snowflake, for example, Snowflake doesn't mandate an underlying cloud account. It creates one on its own. That's a subtle convenience that Snowflake has and one that Databricks could also build.
Snowflake's documentation is easy to use in comparison to that of Databricks.
We use Databricks for video streaming and security purposes.
Databricks' Lakehouse architecture has been most useful for us. The data governance has been absolutely efficient in between other kinds of solutions.
I would like it if Databricks made it easier to set up a project. The use case determines which services we are going to use. You have the application engine, and you generate a potential budget for your workloads, so you can understand what you are going to do, what you are going to use, and what you will invest in.
Because I'm deploying on the Google Cloud Platform, measuring the investment, value, and use case is extremely difficult. So I leave it and move on without the risk. It would be easier if I had one page where you can see three columns: one for the use cases of a specific architecture, a second one for the prices based on the volume of data or machine time, and the third column for the budget. That would make it easier to know if I am using the appropriate architecture for the right solution.
I have seen something like that in Microsoft Azure, but obviously Microsoft Azure costs a lot of money. Amazon has something like that, but it's very complicated to use.
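The three-column page I have in mind could be sketched roughly as below. This is a minimal illustration, not anything Databricks or a cloud provider offers: the Estimate class, the use-case names, the hours, the hourly rates, and the budgets are all hypothetical placeholders, not real prices.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    use_case: str          # column 1: the use case for a given architecture
    machine_hours: float   # hypothetical monthly machine time
    price_per_hour: float  # hypothetical rate, not a real price
    budget: float          # column 3: hypothetical monthly budget

    @property
    def cost(self) -> float:
        # Column 2: price derived from machine time.
        return self.machine_hours * self.price_per_hour

    @property
    def within_budget(self) -> bool:
        return self.cost <= self.budget


def render(estimates):
    """Return the estimates as a plain-text table, one use case per row."""
    lines = [f"{'Use case':<20}{'Est. cost':>12}{'Budget':>12}  Fits"]
    for e in estimates:
        fits = "yes" if e.within_budget else "no"
        lines.append(f"{e.use_case:<20}{e.cost:>12.2f}{e.budget:>12.2f}  {fits}")
    return "\n".join(lines)


# Illustrative values only.
estimates = [
    Estimate("video streaming", machine_hours=200, price_per_hour=1.50, budget=400),
    Estimate("security analytics", machine_hours=80, price_per_hour=2.00, budget=100),
]
print(render(estimates))
```

Seeing the estimated cost next to the budget for each use case is what would tell me whether an architecture is appropriate before I commit to it.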
We've been using Databricks for about five years.
Databricks is very stable and powerful.
It was simple to make Databricks scalable. We found that we could set up an alert to tell us if we needed more resources, money, or time from our team. We're alerted when the system detects some trigger for any use of the instance. If you have another alert from your side, that would be extremely useful because it takes a lot of time to develop that kind of trigger.
Databricks technical support was lovely. We don't need it so much, but the few questions we had were answered immediately.
I am not a data engineer because I just started data science at the company, but it was straightforward and clear for the architect to set up. He provided me with that idea because he realized it would take time if we had use cases. You can select and change the data or add some modules or products. You have all the technology to do so.
I rate Databricks eight out of 10. I like to move my customers into Databricks, but I take care of the internal system infrastructure so they can continue to use familiar software or operating systems and databases. They have a lot of doubts because they don't know the solution. We need to train them, explain things, and show the solution's potential value.
Generally, companies try to keep the same flavor when they migrate. For example, if they are using many Microsoft products, they want to work with Azure. If they are open to other options, they go with GCP or AWS. However, Databricks doesn't have enough customers here in my market because it's not a visible brand. Azure, GCP, and AWS are highly visible here, so the local teams are friendly with the three brands.
We use Databricks for data science work in projects that involve creating data pipelines, pre-processing, data wrangling, big data cluster management, and machine learning and deep learning tasks.
Databricks collaborates very well with the Azure platform and with Dataiku, an enterprise AI tool. Databricks is a new connection to pull the data or connect to the Spark cluster. It is helpful for us to instance it or distribute the load through the Spark cluster, and it is very user-friendly.
The most valuable feature is the Spark cluster, which is very fast for heavy loads, big data processing, and PySpark.
Databricks as a solution is integrated with Azure, but Google Cloud has some restrictions. I'm not sure about AWS Cloud, but it would be great if Databricks could integrate all the cloud platforms. Regarding additional features, we would like to see them mostly on the data engineering side, where we have a Spark cluster and some inbuilt ML. In addition, pre-processing steps will be useful.
We have been using this solution for two years and are using the latest update.
It is a stable solution as long as the Microsoft Azure Platform is stable too.
It is a scalable solution, both vertically and horizontally, which is good. My organization is big, and we have a lot of users. In my department, we have about 15 people using Databricks.
We have not escalated any issues to technical support. We initially struggled with the configuration and settings of the Hive metastore, but we resolved it ourselves. I rate the technical support a nine out of ten.
Positive
We were using EMR (Elastic MapReduce) from AWS before using Databricks. We switched to Databricks because the whole platform changed from AWS to the Azure platform, and Databricks comes as a package.
The initial setup was easy to complete and not complex. It may initially be challenging for a new user, but it improves over time. The CI/CD pipeline works well with the Microsoft Azure platform because continuous integration, development, and deployment come with the Git integration. That makes CI/CD easier with Databricks. The deployment should be improved from the perspective of AutoML functionality, since it doesn't have intensive automated machine learning capability.
We don't use Databricks directly because we work on a data science project. It requires an auto ML and inbuilt machine learning capability. We found capabilities like the large language model using NLP and other deep learning models that are not that intensive. It is meant for data engineering purposes rather than data science purposes. It'll be great if Databricks could be intensive for data science.
We used a third-party, Dataiku platform for the deployment, where we connected to Databricks and completed the ML ops. We required about three people for deployment, and it is easy to maintain the solution.
We have seen an ROI but cannot differentiate because it also comes with the Azure platform.
I do not have details about the pricing.
I rate this solution a nine out of ten. Regarding advice, Databricks is a very good platform, popular and easy to use daily for data engineers and data scientists who rely on a large dataset to do advanced analytics reporting. It's a very good tool.
We are using Databricks for machine learning workloads specifically.
Databricks aligns well with our skillset and overall approach. We sought out their solution specifically for a big data application we are currently working on, as we needed a platform capable of handling large amounts of data and building models. Additionally, the fact that they use open-source software and can integrate data warehouse and data lake systems was particularly appealing, as we have encountered such issues in the past. We determined that Databricks would be an effective solution for our needs.
The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production.
The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team.
The most important feature other than the Jupyter interface would be to have the RStudio interface inside Databricks. This would be perfect.
We have been using Databricks for approximately one year.
The stability of Databricks is good.
I rate the stability of Databricks a nine out of ten.
Databricks is scalable.
I rate the scalability of Databricks a nine out of ten.
I have been receiving responsive answers from Databricks's support. I have been pleased with the support.
I rate the support from Databricks a ten out of ten.
Positive
The initial setup of Databricks is simple. I did not experience any challenges. The time it takes for the deployment is approximately four hours.
I rate the initial setup of Databricks.
We did the deployment of the solution in-house. There were three people involved in the deployment. A data engineer, data analyst, and machine learning engineer.
We have only incurred the cost of our AWS cloud services. This is because during this period, Databricks provided us with an extended evaluation period, and we have not spent much money yet. We are just starting to incur costs this month, so I will know more about the full cost later on.
We only pay standard fees for the solution.
We use a data engineer, data analyst, and machine learning engineer for the maintenance of the solution.
I rate Databricks a nine out of ten.
Our team is currently utilizing machine learning for various applications, and a few members are also exploring the use of Databricks for ML operations.
In the manufacturing industry, Databricks can be beneficial to use because of machine learning. It is useful for tasks, such as product analysis or predictive maintenance.
I have been using Databricks for approximately six months.
The stability of Databricks clusters and instances could be better. We've had issues with crashes.
The scalability of Databricks is good as long as you have a data lake, and it's easy to scale.
We have approximately 50 users using this solution in my company.
We have a different team who handles the support. I do not have contact with Databricks support.
I have not used a similar solution to Databricks.
I have seen an ROI using Databricks.
I rate the price of Databricks as eight out of ten.
Having a good understanding of physical security in relation to cybersecurity in an OT (Operational Technology) environment would be beneficial, and utilizing an existing data lake prior to implementing a Databricks initiative would greatly aid in its success.
I rate Databricks an eight out of ten.
