

ClickHouse and Cloudera Data Platform compete in big data management and analytics. ClickHouse has the upper hand in terms of speed and cost-efficiency, whereas Cloudera excels in scalability and integration capabilities.
Features: ClickHouse is known for its exceptional speed and cost-efficiency, real-time query performance, and OLAP optimization. It integrates well with tools like Kafka and Tableau and boasts a supportive community. Cloudera Data Platform provides extensive data management, scalability, and integration with HDFS and Hive. Its ecosystem supports complex analytics with strong security features through Ranger, and offers centralized management for distributed storage and compute.
Room for Improvement: ClickHouse users report a need for better integration with third-party tools, more comprehensive documentation, and improved handling of cloud services and cluster management. There are also limitations in mutations and SQL syntax compatibility. Cloudera Data Platform users face a complex setup and call for more innovation in AI and machine learning. It needs more intuitive deployment processes and better handling of cloud storage integration.
Ease of Deployment and Customer Service: ClickHouse supports flexible deployment across hybrid cloud, on-premises, and cloud-based setups, with a strong open-source community. However, support may be limited for non-cloud users. Cloudera Data Platform offers diverse deployment models, with support typically tied to cloud subscriptions. Its support is prompt, yet user feedback suggests improvement is needed during critical issues.
Pricing and ROI: ClickHouse provides a free open-source version and relatively low-cost cloud services, offering strong ROI through cost savings and developer efficiency. It is competitive for self-hosted deployments. Cloudera Data Platform, despite its complex pricing model covering nodes and processors, is seen as cost-effective for large-scale data management. Both deliver high ROI, though ClickHouse's licensing and hosting flexibility often make it more cost-efficient.
I estimate we save four to five hours per person per week due to this efficiency, translating to around 20 to 25 hours saved monthly for each individual.
We were able to reduce the number of employees needed when we migrated to ClickHouse Cloud.
With ClickHouse, we didn't need to spend much on resources, cutting costs by around 25 to 30%.
We saved on licensing costs when we decommissioned some of our data platforms and moved their workloads to this platform.
In terms of return on investment, I see great changes in operational effectiveness measured by RTO when comparing on-premises solutions with cloud solutions.
A specific example of the positive impact of Cloudera Data Platform is the time saved and the improved performance, which are its main results.
If more timely support could be provided during critical issues, situations could have been resolved much more quickly, saving considerable time.
When we faced any challenges, the ClickHouse support team provided helpful resolutions.
We utilize AVN ClickHouse, which is effectively managed by AVN, providing bug fixes and developing new functionalities along with architecture reviews.
I would rate the customer support of Cloudera Data Platform ten out of ten.
I have communicated with technical support, and they are responsive and helpful.
Cloudera support is timely and responsive, adhering to the SLAs they provide.
The vertical scalability is impressive, with high insert throughput, allowing millions of rows per second with low latency.
ClickHouse is highly scalable.
The scalability of ClickHouse is great.
CDP allows for easy, mostly automated scalability where I can schedule job workflows, fine-tune system resource metrics, and add nodes with just a click.
They offer a cloud burst feature: if on-premises capacity is not sufficient at a given point in time, you can run that Spark job in the cloud instead.
The ability to scale processing capacity on demand for batch jobs without impacting other workloads, and support for a growing number of concurrent users and teams accessing the platform simultaneously are significant advantages.
I can confidently say that it is very consistent and stable even when handling high volume loads and real-time streaming analytics across financial and operational domains.
ClickHouse handles large volumes of data efficiently.
ClickHouse is stable, as we did not encounter stability issues in production.
Sometimes the end user is not experienced or does not have all the expertise related to Cloudera specifically, making it very difficult to manage properly.
Sometimes a node goes down, but it automatically returns to a healthy state.
Cloudera Data Platform is pretty stable in my experience; there have been no downtime or reliability issues.
Another challenge is the lack of robust support for transactional databases, which limits its use as a primary database.
ClickHouse should be able to import data from other types of sources like Parquet and Iceberg tables and all the new upcoming data formats.
My experience with ClickHouse's documentation is that it needs improvement; I think it can be made more beginner-friendly, while the community support is really good.
We aim to address these issues with a Kubernetes-based platform that will simplify the task of upgrading services.
Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.
Cloudera Data Platform could be improved by making it more practical to use in the cloud; some of the components it relies on in cloud deployments are complex and inconvenient.
My experience with pricing, setup cost, and licensing indicates that it is very expensive; ClickHouse is the most expensive option.
ClickHouse is open source with no hidden fees, offering cost-effective data management.
I found ClickHouse's pricing to be efficient in comparison to other services such as Redshift.
Initially, CDH had a straightforward pricing model based on nodes, but CDP includes factors like processors, cores, terabytes, and drives, making it difficult to calculate costs.
We find Cloudera Data Platform to be cost-effective.
So far, I would say that the pricing we have received is competitive.
ClickHouse has reduced our storage cost and improved our 99th percentile latency by 40%.
For cost optimization, after deploying the cluster on-premises and using S3 Express, approximately 5x cost savings were achieved on data storage.
ClickHouse positively impacted our organization by absorbing the whole logging system without hassle, storing logs for six months efficiently.
By using the Hadoop File System for distributed storage, we have 1.5 petabytes of physical storage with 500 terabytes of effective storage due to a replication factor of three.
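As a rough check of the storage figures in the quote above, the short sketch below divides raw capacity by the replication factor. It assumes 1.5 petabytes means 1,500 terabytes and ignores overhead such as reserved space or temporary data; the function name `effective_capacity_tb` is hypothetical and not part of any Hadoop or Cloudera API.

```python
def effective_capacity_tb(physical_tb: float, replication_factor: int) -> float:
    """Usable capacity when every block is stored `replication_factor` times."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return physical_tb / replication_factor


# Figures from the review: 1.5 PB physical (taken here as 1,500 TB), replication factor 3.
usable_tb = effective_capacity_tb(1500, 3)
print(f"{usable_tb:.0f} TB usable")
```

Under these assumptions the result matches the 500 terabytes of effective storage the reviewer reports.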
The Ranger integration makes it more flexible and reliable for me by allowing control over data access, specifying who can access at what level, such as table level, masking, or data layer level.
What stands out the most in Cloudera Manager is SDX, which provides centralized control for governance, security, and data lineage across multiple sources.
| Product | Market Share (%) |
|---|---|
| ClickHouse | 6.5% |
| PostgreSQL | 14.7% |
| Firebird SQL | 13.3% |
| Other | 65.5% |

| Product | Market Share (%) |
|---|---|
| Cloudera Data Platform | 7.6% |
| Palantir Foundry | 15.6% |
| Informatica Intelligent Data Management Cloud (IDMC) | 10.8% |
| Other | 66.0% |


| Company Size | Count |
|---|---|
| Small Business | 13 |
| Midsize Enterprise | 4 |
| Large Enterprise | 8 |

| Company Size | Count |
|---|---|
| Small Business | 8 |
| Midsize Enterprise | 7 |
| Large Enterprise | 26 |
ClickHouse is renowned for its speed, scalability, and real-time query performance. Its compatibility with SQL standards enhances flexibility while enabling integration with popular tools.
ClickHouse uses a column-based architecture for efficient data compression and real-time analytics. It integrates with tools like Kafka and Tableau and handles large datasets effectively thanks to its cost-efficient aggregation capabilities. With robust data deduplication and strong community backing, users can access comprehensive documentation and up-to-date functionality. However, users note that third-party integration, cloud deployment, and handling of SQL syntax differences need improvement, since these affect ease of use and migration from other databases.
ClickHouse is deployed in sectors like telecommunications for passive monitoring and is used for data analytics, logging clickstream data, and as an ETL engine. Organizations also use it for machine learning applications when combined with GPT. Because it can be installed independently, it is an attractive option for avoiding cloud service costs.
Cloudera Data Platform offers a powerful fusion of Hadoop technology and user-centric tools, enabling seamless scalability and open-source flexibility. It supports large-scale data operations with tools like Ranger and Cloudera Data Science Workbench, offering efficient cluster management and containerization capabilities.
Designed to support extensive data needs, Cloudera Data Platform encompasses a comprehensive Hadoop stack, which includes HDFS, Hive, and Spark. Its integration with Ambari provides user-friendliness in management and configuration. Despite its strengths in scalability and security, Cloudera Data Platform requires enhancements in multi-tenant implementation, governance, and UI, while attribute-level encryption and better HDFS namenode support are also needed. Stability, especially regarding the Hue UI, financial costs, and disaster recovery are notable challenges. Additionally, integration with cloud storage and deployment methods could be more intuitive to enhance user experience, along with more effective support and community engagement.
Cloudera Data Platform is implemented extensively across industries like hospitality for data science activities, including managing historical data. Its adaptability extends to operational analytics for sectors like oil & gas, finance, and healthcare, often enhanced by Hortonworks Data Platform for data ingestion and analytics tasks.