Technical Presales Engineer at a tech services company with 51-200 employees
Provides extensive data storage capacity and ensures better performance
Pros and Cons
- "The solution's most valuable feature is the enterprise data platform."
- "They should focus on upgrading their technical capabilities in the market."
What is our primary use case?
We use the solution to maintain our legacy data warehouse for better performance and more extensive storage.
What is most valuable?
The solution's most valuable feature is the enterprise data platform.
What needs improvement?
They should work on the solution's pricing. Also, finding resources with good experience in the solution is difficult. Thus, they should upgrade their technical capabilities in the market.
They should add features like AutoML and AutoDev for enhanced machine-learning experiences. In addition, they should consider developing an integration capability similar to Informatica for an end-to-end enterprise solution.
For how long have I used the solution?
We have been using the solution for one year.
Buyer's Guide
Cloudera Distribution for Hadoop
June 2026
Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: June 2026.
902,456 professionals have used our research since 2012.
How are customer service and support?
The solution's customer support team could be better. We received their assistance only with installation and configuration.
What's my experience with pricing, setup cost, and licensing?
The solution is expensive. The license costs around 10k.
What other advice do I have?
Cloudera is a cost-effective solution if you need more storage space. In this case, I advise you to opt for it. I rate the solution as an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer. Reseller
Great being able to manage the security layer using the shared SDX which provides flexibility
Pros and Cons
- "With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
- "For enterprise organizations that can bear the cost, it's a good solution."
- "This is a very expensive solution."
- "The only thing that needs improvement is the cost, it's a very expensive solution and one of the main reasons companies are not attracted to the product."
What is our primary use case?
This product is a framework for edge AI. It comes with multiple ecosystems as a project. I'm a senior data architect manager and we are consultants. We offer Cloudera to our customers, but we don't have a partnership with them.
What is most valuable?
The best feature is the Shared Data Experience (SDX) layer. If you have a cluster available on-prem or in the cloud, you can manage the security layer using the shared SDX, which provides flexibility. New features are constantly being added.
What needs improvement?
The only thing that needs improvement is the cost. It's a very expensive solution, which is one of the main reasons companies are not attracted to the product.
What do I think about the stability of the solution?
This product has been around for a long time so it's very mature and stable.
What do I think about the scalability of the solution?
The scalability is very good.
How was the initial setup?
The initial setup has become easier, although you need a dedicated admin to maintain and manage the solution because it's a framework and not a single product. Deployment nowadays is much smoother with the PaaS offering in the public cloud, so you can carry out the deployment with an in-house team. The deployment itself only takes a day, but a company is unlikely to go with the defaults, so the solution needs fine-tuning, which can take a couple of weeks.
What's my experience with pricing, setup cost, and licensing?
For enterprise organizations that can bear the cost, it's a good solution. A smaller company wouldn't be able to afford the licensing fees. You can get a free trial for 60 days. They'll never have a community version because they're the only ones in the market offering this kind of framework.
What other advice do I have?
I rate this solution nine out of 10.
Disclosure: My company has a business relationship with this vendor other than being a customer. Consultant
Engineering Manager/Solution architect at a computer software company with 201-500 employees
Preferred solution for on-prem
Pros and Cons
- "Cloudera is a very manageable solution with good support."
- "Cloudera is one of the best solutions for on-prem."
- "The initial setup of Cloudera is difficult."
- "Cloudera is not as easy, as it requires more DevOps resources than other solutions."
What is our primary use case?
We are a distributor for Hadoop. Our customers choose whether they would like to use Cloudera or another product.
Cloudera Distribution is deployed on-premises as well as on bare metal servers in AWS.
What is most valuable?
Cloudera is a very manageable solution with good support.
What needs improvement?
When you compare Cloudera with EMR, EMR has a lot of administrative features, so you don't need to manage the solution. Cloudera is not as easy, as it requires more DevOps resources than other solutions.
For how long have I used the solution?
We have been offering this solution for five years.
What do I think about the stability of the solution?
Cloudera Distribution is stable.
What do I think about the scalability of the solution?
This is a scalable solution. We have clients that have a large installation of Cloudera.
How are customer service and support?
Technical support from Cloudera is fine.
How was the initial setup?
The initial setup of Cloudera is difficult. After you have installed it once, it is not difficult to reproduce.
What about the implementation team?
For a POC deployment, we required only one DevOps engineer. For a larger-scale implementation, we also require a data engineer.
What's my experience with pricing, setup cost, and licensing?
Cloudera requires a license to use.
Which other solutions did I evaluate?
We looked at EMR; however, Cloudera is better for on-prem deployments.
What other advice do I have?
Cloudera is one of the best solutions for on-prem.
I would rate this solution an 8 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
IT expert at a comms service provider with 201-500 employees
Reliable, stable, but difficult to use
Pros and Cons
- "The solution is reliable and stable, it fits our requirements."
- "The procedure for operations could be simplified."
What is our primary use case?
We are in the testing phase of Cloudera Distribution for Hadoop, and we will be in production soon.
What needs improvement?
The procedure for operations could be simplified.
For how long have I used the solution?
I have used Cloudera Distribution for Hadoop within the past 12 months.
What do I think about the stability of the solution?
The solution is reliable and stable, and it fits our requirements.
How was the initial setup?
The implementation of Cloudera Distribution for Hadoop is not easy. It works on multiple nodes and can be complex for testing. The whole process took us one and a half days.
What about the implementation team?
We used a local system integrator for the implementation. We had approximately five people for the implementation.
We have not had to do maintenance of the solution because we are still in the testing phase.
What other advice do I have?
My advice to others is to be aware that this solution can be complex.
I rate Cloudera Distribution for Hadoop a seven out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Associate Manager at a consultancy with 501-1,000 employees
Easy to install, good technical support, and with a single script we can run jobs within minutes
Pros and Cons
- "I don't see any performance issues."
- "With a single script, we are able to run the jobs within minutes, which is an advantage."
- "It could be faster and more user-friendly."
What is our primary use case?
We use this solution to process data.
When using an SQL Server you have to build indexes and you need to fine-tune the data.
We import the data that is in the SQL Source.
With a single script, we are able to run the jobs within minutes, which is an advantage.
We are using the Power BI model for the business convention. The performance in Power BI will be reduced if you incorporate more calculations. Those calculations are captured in the Hadoop layer and processed.
What needs improvement?
It could be faster and more user-friendly.
For how long have I used the solution?
I have been using this solution for seven months.
What do I think about the stability of the solution?
It's a stable product. I don't see any performance issues.
What do I think about the scalability of the solution?
This solution is scalable. We have 40 users for different projects in our organization.
We will continue to use this solution.
How are customer service and technical support?
Technical support is good.
Which solution did I use previously and why did I switch?
I didn't use any other product.
How was the initial setup?
The installation is straightforward.
Our clients provide us with access to use it directly.
Once we have been given access to the edge nodes, we are able to run the scripts in the Hadoop layer.
What's my experience with pricing, setup cost, and licensing?
We do not pay for licensing because our customers forward it, so there is no need to purchase the license for the project.
What other advice do I have?
I would recommend this solution.
I would rate Cloudera Distribution for Hadoop a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Chief Executive Officer at a financial services firm with 51-200 employees
Overall operational, stable but price could be better
Pros and Cons
- "The product as a whole is good."
- "There are better solutions out there that have more features than this one."
What is our primary use case?
We use the solution for data warehousing.
What is most valuable?
The product as a whole is good.
What needs improvement?
There are better solutions out there that have more features than this one.
For how long have I used the solution?
I have just started using the solution.
What do I think about the stability of the solution?
I do not know of any issues with the stability of the solution.
What about the implementation team?
I have an internal team that does maintenance for the solution.
What's my experience with pricing, setup cost, and licensing?
The price could be better for the product.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Performs cost analysis tasks for our customers in the financial industry
Pros and Cons
- "The most valuable feature is Kubernetes."
- "Cloudera really has no competition."
- "The price of this solution could be lowered."
What is our primary use case?
We are a solution provider and this is one of the systems that we implement for our clients.
Our clients for this product are in the financial industry and they use it to perform cost analysis tasks.
What is most valuable?
The most valuable feature is Kubernetes.
What needs improvement?
The price of this solution could be lowered.
For how long have I used the solution?
We have been using the Cloudera Distribution for Hadoop for five years.
What do I think about the stability of the solution?
It is a stable solution.
What do I think about the scalability of the solution?
The Cloudera Distribution for Hadoop can be scaled. Our customers are enterprise-level companies and they have about 100 users for this solution.
How are customer service and technical support?
We offer technical support for this solution to our customers.
Which solution did I use previously and why did I switch?
We did not use another solution prior to this one.
How was the initial setup?
The initial setup is straightforward.
What's my experience with pricing, setup cost, and licensing?
The pricing is expensive.
Which other solutions did I evaluate?
Cloudera really has no competition.
What other advice do I have?
I would rate this solution a nine out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer. Reseller
Data engineer at a tech services company with 11-50 employees
Supports a wide range of tools and has a good support community
Pros and Cons
- "We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
- "Cloudera is always developing new tools and supports a wide range of tools."
- "Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. It's not an option actually for us to have a big data environment."
- "I subscribe to Cloudera to get an enterprise version but I have found that I can get some of its features from other vendors that would be at a lower cost than Cloudera."
What is our primary use case?
Our primary use case for this solution is to host a large amount of data on our platform and to carry out processing and analysis of that data there.
What is most valuable?
Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can have any question and will have your answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.
What needs improvement?
We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. For us, having a big data environment is not optional. Cloudera is trying to adopt new technologies.
I think the idea of open source tools now is dominating. So Cloudera has to decide how to deal with open-source tools. I subscribe to Cloudera to get an enterprise version but I have found that I can get some of its features from other vendors that would be at a lower cost than Cloudera. They should lower the price.
For how long have I used the solution?
We have been using Cloudera for a year.
What do I think about the stability of the solution?
It's stable. I have no issue regarding the stability.
What do I think about the scalability of the solution?
It's scalable. You can add more nodes and you can expand your cluster easily.
How are customer service and technical support?
After we open a ticket, the issue can be resolved very quickly. They have a management portal. I don't contact them directly, but I haven't heard of anybody having any problems with it.
How was the initial setup?
The initial setup is complicated. We needed the vendor to install it themselves. The deployment took around three weeks. Three people were involved on our side, but only to follow up and supervise the deployment; they did not deploy anything. The vendor does it.
What other advice do I have?
In terms of the advice, I would say to focus on what tools are available on the market. In terms of open-source, most companies are delivering open source technologies and providing support to these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I was the decision-maker, I'd invest in open-source tools. Cloudera and all of these companies are trying to adapt to these big data technologies and open source tools. Cloudera is trying to put it inside their platform so that we can have a compatible solution.
I would rate it an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Senior Software Engineer at a tech services company with 10,001+ employees
Performs well and the technical support is helpful, but the upgrade process needs to be consolidated
Pros and Cons
- "The most valuable feature is Impala, the querying engine, which is very fast."
- "There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."
What is our primary use case?
We are dealing with data from the telecom industry. We were using an Oracle system but our volume has increased. We now have a lot of real-time data that needs to be transformed so that it can be made available and used.
What is most valuable?
The most valuable feature is Impala, the querying engine, which is very fast. We have been able to work with one terabyte of data in less than 20 minutes. The speed makes it easy for us to process all of the data that comes in, in time.
The support is very good.
All of the data has automatic triple replication in order to secure integrity.
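As a rough illustration of the figures above, the following sketch works out the minimum sustained throughput implied by processing one terabyte in under 20 minutes, and the raw capacity consumed when each block is stored three times. The function names are hypothetical, and the use of decimal units (1 TB = 1,000 GB) is an assumption; this is back-of-the-envelope arithmetic, not a benchmark of Impala or HDFS.

```python
# Back-of-the-envelope arithmetic only; names and decimal units are assumptions.

def min_throughput_gb_per_s(data_gb: float, minutes: float) -> float:
    """Lower bound on sustained throughput needed to process data_gb in the given time."""
    return data_gb / (minutes * 60)


def raw_storage_gb(logical_gb: float, replication_factor: int = 3) -> float:
    """Raw disk capacity consumed when every block is stored replication_factor times."""
    return logical_gb * replication_factor


if __name__ == "__main__":
    # 1 TB (taken as 1,000 GB) in 20 minutes
    print(f"{min_throughput_gb_per_s(1000, 20):.2f} GB/s minimum")
    # 1 TB of logical data under triple replication
    print(f"{raw_storage_gb(1000):.0f} GB of raw storage")
```

Under these assumptions, triple replication means that every terabyte of data needs roughly three terabytes of raw disk, which is worth keeping in mind when sizing a cluster.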
What needs improvement?
There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon.
When we are upgrading CDH, there are many things that need to be upgraded and it would be helpful if it were bundled. As it is now, we have to upgrade many different things separately.
For how long have I used the solution?
I have been working with the Cloudera Distribution for Hadoop for around two years.
What do I think about the stability of the solution?
It is a stable solution.
What do I think about the scalability of the solution?
The scalability is good and it works on commodity hardware. One of the problems we have right now is that there is a lot of data and we're moving it from our Oracle solution. This means that there is a double cost, in terms of storage, during our transition to working with big data.
We are using a data lake that is a store for all of the data in our organization. There are more than 25 projects, with between 25 and 30 people in each one, for a total of almost 1,000 people. All of them are dependent on this solution.
Most of our users are technicians who have problems to solve using the data available to them. A couple of them are data scientists and the remainder are upper management, who do the analysis.
How are customer service and technical support?
The technical support is very good. Whenever we open a ticket, we get support right away.
Which solution did I use previously and why did I switch?
We did use another solution prior to this one but it could not keep up with our increase in data.
What other advice do I have?
The suitability of this solution depends on the size of the data that you are going to be working with. If you are going to be working with a huge dataset that contains many gigabytes of data, then this is a good solution. For smaller datasets, you should also consider other technologies.
My advice for anybody who is implementing this solution is to take some time to learn it. Beyond that, be sure to contact support if you have any problems because they are very helpful.
I would rate this solution a seven out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data Management at BCX
Offers big data support for analytical applications but the technical support needs improvement
Pros and Cons
- "In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues."
- "The pricing is very competitive, it's not bad."
- "The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."
- "The stability is problematic. We did encounter quite a lot of issues with the cluster going down quite frequently."
What is our primary use case?
We primarily use it only for big data support for analytical applications.
What is most valuable?
The feature that we've used quite intensively is Spark, in how it specifically can speed up some of the data to assist with processing.
What needs improvement?
The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it.
In the next release, I think it would be helpful if there was easier integration into all the other existing data back corners. It will be a big plus as it's a favorite capability. We had to go with a third-party application in order to achieve that.
For how long have I used the solution?
I've been using the solution since 2016.
What do I think about the stability of the solution?
The stability is problematic. We did encounter quite a lot of issues with the cluster going down quite frequently.
What do I think about the scalability of the solution?
In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues. Currently, only about 10 people in total are using the solution. So we have about four business users and then four technical people. It's only limited to two environments.
How are customer service and technical support?
I think there's a lot of room for improvement on the technical support side. Mostly because we don't have a lot of local skills in South Africa that could have supported the solution. It was an issue.
Which solution did I use previously and why did I switch?
This is our first solution. We tested a bunch of other technologies, but that was our first one and we're still using it.
How was the initial setup?
The initial implementation was straightforward from an application side. There weren't any hiccups. In terms of deployment time, it's going to be difficult to say, because most of it was related to hardware problems. Software took about two months to deploy. We required four people for deployment.
What's my experience with pricing, setup cost, and licensing?
The pricing is very competitive. It's not bad.
Which other solutions did I evaluate?
We considered working with a few other companies, including IBM Bluemix.
What other advice do I have?
I would recommend the solution provided that the business case and the technology have both been proven. We have found that if you don't address the right business case, you end up buying a technology that doesn't necessarily solve your business problems.
I would rate the solution seven out of ten. The main reasons for not rating it higher are that the overall support is not great and that we've found some limitations in functionality. It wasn't mature when we started. It's getting there. It's getting better.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Buyer's Guide
Download our free Cloudera Distribution for Hadoop Report and get advice and tips from experienced pros
sharing their opinions.
Updated: June 2026
Popular Comparisons
Microsoft Azure Cosmos DB
MongoDB Enterprise Advanced
Apache Spark
IBM Netezza Performance Server
Couchbase Enterprise
IBM Spectrum Computing
Neo4j Graph Database
HPE Data Fabric
Apache HBase
DataStax Enterprise
Aerospike Database