

Cloudera Distribution for Hadoop (CDH) and Cloudera Data Platform (CDP) are major players in managing big data environments, with CDP offering more modern features and flexibility, particularly in security and scalability.
Features: CDP includes enterprise security, efficient cluster management, and easy integration with tools like Ranger, enhancing scalability in hybrid cloud settings. CDH is recognized for its administration via Cloudera Manager, query optimization with Impala, and comprehensive documentation.
Room for Improvement: CDP could benefit from improved Spark and AI workload support, easier integration, and better API documentation. Enhancements in Ranger's security implementation are needed for complex environments. CDH requires better performance in HBase, simpler installation, and improved API customization, while lacking advanced cloud functionality compared to CDP.
Ease of Deployment and Customer Service: CDP supports versatile deployments across on-premises and various cloud configurations, offering better integration and support, whereas CDH focuses on traditional on-premises setups with less flexibility.
Pricing and ROI: CDH's high per-node cost can be challenging at scale, but it offers value in complex setups despite higher initial costs. CDP has a more complex pricing model based on resources like cores and terabytes. Its flexibility in cloud environments can lead to cost efficiency but may also lead to higher costs depending on usage.
We saved licensing costs when we decommissioned some of our data platforms and moved to this platform.
In terms of return on investment, I see great changes in operational effectiveness measured by RTO when comparing on-premises solutions with cloud solutions.
The main positive impact of Cloudera Data Platform has been clear time savings and improved performance.
I would rate the customer support of Cloudera Data Platform ten out of ten.
I have communicated with technical support, and they are responsive and helpful.
Cloudera support is timely and responsive, adhering to the SLAs they provide.
The technical support is quite good and better than IBM.
CDP allows for easy, mostly automated scalability where I can schedule job workflows, fine-tune system resource metrics, and add nodes with just a click.
They have the cloud burst feature available where if the on-premises capacity is not sufficient at a point in time, you can run that Spark job on the cloud itself.
The ability to scale processing capacity on demand for batch jobs without impacting other workloads, and support for a growing number of concurrent users and teams accessing the platform simultaneously are significant advantages.
Sometimes the end user is not experienced or lacks Cloudera-specific expertise, which makes the platform very difficult to manage properly.
Sometimes a node goes down, but it automatically returns to a healthy state.
Cloudera Data Platform is pretty stable in my experience; there are not any downtime or reliability issues.
We faced challenges but overcame those challenges successfully.
We aim to address these issues with a Kubernetes-based platform that will simplify the task of upgrading services.
Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.
Cloudera Data Platform could be improved for cloud use; some of the components it relies on in the cloud are complex and not really convenient.
Integrating with Active Directory, managing security, and configuration are the main concerns.
Initially, CDH had a straightforward pricing model based on nodes, but CDP includes factors like processors, cores, terabytes, and drives, making it difficult to calculate costs.
We find Cloudera Data Platform to be cost-effective.
So far, I would say the pricing we have received is competitive.
It can be deployed on-premises, unlike competitors' cloud-only solutions.
By using the Hadoop Distributed File System (HDFS) for distributed storage, we have 1.5 petabytes of physical storage, which gives 500 terabytes of effective storage because of a replication factor of three.
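The arithmetic behind that figure can be sketched as follows. This is a minimal illustration, assuming 1 PB = 1000 TB and ignoring any space reserved for non-HDFS use or temporary data; the function name `effective_capacity_tb` is hypothetical and not part of any Cloudera or Hadoop API.

```python
def effective_capacity_tb(raw_tb: float, replication_factor: int) -> float:
    """Usable capacity when every block is stored `replication_factor` times.

    Hypothetical helper for illustration only; it ignores reserved space
    and other overhead that a real cluster would have.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb / replication_factor


raw_tb = 1500.0  # 1.5 PB of physical storage, assuming 1 PB = 1000 TB
print(effective_capacity_tb(raw_tb, 3))  # 500.0
```

With a replication factor of three, each block occupies three times its size on disk, so usable capacity is roughly one third of the physical total.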
The Ranger integration makes it more flexible and reliable for me by allowing control over data access, specifying who can access at what level, such as table level, masking, or data layer level.
What stands out the most in Cloudera Manager is SDX, which provides centralized control for governance, security, and data lineage across multiple sources.
This is the only solution that can be installed on-premises.

| Product | Market Share (%) |
|---|---|
| Cloudera Data Platform | 7.6% |
| Palantir Foundry | 15.6% |
| Informatica Intelligent Data Management Cloud (IDMC) | 10.8% |
| Other | 66.0% |

| Product | Market Share (%) |
|---|---|
| Cloudera Distribution for Hadoop | 15.1% |
| HPE Data Fabric | 14.9% |
| Apache Spark | 13.9% |
| Other | 56.1% |


| Company Size | Count |
|---|---|
| Small Business | 8 |
| Midsize Enterprise | 7 |
| Large Enterprise | 26 |

| Company Size | Count |
|---|---|
| Small Business | 16 |
| Midsize Enterprise | 9 |
| Large Enterprise | 31 |

Cloudera Data Platform offers a powerful fusion of Hadoop technology and user-centric tools, enabling seamless scalability and open-source flexibility. It supports large-scale data operations with tools like Ranger and Cloudera Data Science Workbench, offering efficient cluster management and containerization capabilities.
Designed to support extensive data needs, Cloudera Data Platform encompasses a comprehensive Hadoop stack, which includes HDFS, Hive, and Spark. Its integration with Ambari provides user-friendliness in management and configuration. Despite its strengths in scalability and security, Cloudera Data Platform requires enhancements in multi-tenant implementation, governance, and UI; attribute-level encryption and better HDFS NameNode support are also needed. Stability (especially of the Hue UI), cost, and disaster recovery are notable challenges. Integration with cloud storage and deployment methods could also be more intuitive, and support and community engagement could be more effective.
Cloudera Data Platform is implemented extensively across industries like hospitality for data science activities, including managing historical data. Its adaptability extends to operational analytics for sectors like oil & gas, finance, and healthcare, often enhanced by Hortonworks Data Platform for data ingestion and analytics tasks.