We use Cloudera Distribution for file storage.
This solution is deployed on-premise.
The file system is a valuable feature.
The security of this solution could be improved. There should also be a way to have blockchain-enabled storage on HDFS.
I have been working with Cloudera Distribution for Hadoop for 11 years.
This solution is stable.
This solution is scalable enough for us.
We have created a product using HDFS, and when our engineers install it for themselves or for customers, we use this solution. There are about 15 to 20 people using it at any given time.
The installation is straightforward. We use command-line-based installation and we have created our own way of installing with our product.
Depending on the customer or depending on internal usage, our DevOps engineer will install it or my development team will install it.
We are well versed in these tools, so we implemented it ourselves.
I haven't bought a license for this solution. I'm only using the Apache-licensed version.
I rate this solution an eight out of ten. Cloudera is a great product and, overall, there are many features.
We actually use Cloudera HDFS underneath, and we build our product on top of it. So, we don't use the Cloudera versions of all the other products, we just use the Cloudera HDFS, nothing else.
CDH has a wide variety of proprietary tools that we use, like Impala. So from that perspective, it's quite useful as opposed to something open-source. We get a lot of value from Cloudera's proprietary tools.
Integration is one of the main things we struggle with because we're working with several other environments. For example, we've got an MPP environment outside the Hadoop environment. Many cloud-based platforms like Azure are fully integrated, with technology that gives you MPP, machine learning, and data lakes all in one environment. We've got on-premises IBM solutions and Cloudera, so it isn't easy to integrate. It would be useful if Cloudera had more tools, like SQL engines, that offer traditional relational database functionality. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform. Ideally, we should get as much raw data as possible into the platform before we do the engineering, so that it is available for machine learning and model training.
I've been using CDH for about two years, or rather, I manage the team that uses it.
We haven't had any issues with Cloudera. It's a solid product.
Cloudera is dependable, and it's completely scalable.
We have engaged the technical support based in the UK. My team hasn't worked with them directly, but the administration team has. To my knowledge, they're fairly responsive.
I rate Cloudera Distribution for Hadoop eight out of 10.
We are using this solution for storing Big Data in one centralized location.
It has been helpful in allowing data storage in one centralized location with data lakes and all of the surrounding applications.
All of the data from our processes is stored in the big data lake.
It allows us to store huge amounts of data, which is an advantage.
They have BI (Business Intelligence) tools. There are many AI tools.
We are able to connect and analyze the data to get reports. The reports are very good.
The main advantage is the storage is less expensive.
The performance can be improved. We have experienced some performance issues. It is not as sophisticated as Oracle Sybase.
Currently, we are using many other tools such as Spark and Blade Job to improve the performance.
The setup is complex and could be simplified.
The security needs to be improved.
I have been using this solution since 2015.
It's a stable solution.
Scalability is good. Data is replicated: by default, the Big Data platform applies a replication factor.
We have grown over the years. When we started we had 10 nodes, and we have since increased to a large number of nodes.
Technical support is good. I have been able to learn from them. As a developer, I am learning every day.
I would rate the technical support a ten out of ten.
Previously, we were using Oracle Sybase SQL. We switched because we have now introduced Big Data.
The initial setup was complex.
It's not as simple as Oracle Sybase.
It's a complex architecture because you have raw data and many engines.
Compared with Oracle Sybase and SQL, it's cheaper. It's not expensive.
I am a part of security and software development.
We are currently considering migrating to the cloud, and planning on using Microsoft Azure, mainly for the Big Data component.
I would rate this solution a five out of ten.
It is a good enterprise platform. It is easier and more stable. Additionally, it has the best proxy, security, and support features compared to open-source products.
The areas of improvement depend on the scale of the project. For banking customers, security features and an essential budget for commercial licenses would be the top priority. Data regulation could be the most crucial for a project with extensive data or an extra use case.
We have been using Cloudera Distribution for Hadoop for a few years.
I rate the product’s stability a ten out of ten.
We have ten customers using the product. They include data engineers, performance engineers, and environment engineers.
I rate its stability a ten out of ten.
The product has a support subscription for one year. We use technical support only for complex use cases. We work with their team as we have direct and quick access to contact them. It helps us better understand the technical and business-related queries of the customers.
The cloud version is easy to set up, although processing a large amount of data in an on-premises or hybrid setup is complicated. It is not a ready-to-use solution for telecom or finance technology. It requires the deployment of robust technology relying on network infrastructure.
The product’s price varies from project to project. It is more expensive than open-source solutions and could be cheaper. However, in some cases, it is less costly than open-source.
It is the best solution in the world at the moment. I advise others to go for it if they have an enterprise customer. I rate it a ten out of ten.
We use the solution to maintain our legacy data warehouse for better performance and more extensive storage.
The solution's most valuable feature is the enterprise data platform.
They should work on the solution's pricing. Also, finding resources with good experience in the solution is difficult. Thus, they should upgrade their technical capabilities in the market.
They should add features like AutoML and AutoDev for enhanced machine-learning experiences. In addition, they should consider developing an integration capability similar to Informatica for an end-to-end enterprise solution.
We have been using the solution for one year.
The solution's customer support team could be better. We received their assistance only with installation and configuration.
The solution is expensive. The license costs around 10k.
Cloudera is a cost-effective solution if you need more storage space. In this case, I advise you to opt for it. I rate the solution as an eight out of ten.
This product is a framework for edge AI. It comes with multiple ecosystems as a project. I'm a senior data architect manager, and we are consultants. We offer Cloudera to our customers, but we don't have a partnership with them.
The best feature is the Shared Data Experience (SDX) layer. If you have a cluster available on-prem or in the cloud, you can manage the security layer using the shared SDX, which provides flexibility. New features are constantly being added.
The only thing that needs improvement is the cost. It's a very expensive solution, which is one of the main reasons companies are not attracted to the product.
This product has been around for a long time, so it's very mature and stable.
The scalability is very good.
The initial setup has become easier, although you need a dedicated admin to maintain and manage the solution because it's a framework and not a single product. Deployment nowadays is much smoother with the PaaS offering in the public cloud, so you can carry out the deployment with an in-house team. The deployment itself only takes a day, but a company is unlikely to go with the defaults, so the solution needs fine-tuning, which can take a couple of weeks.
For enterprise organizations that can bear the cost, it's a good solution. A smaller company wouldn't be able to afford the licensing fees. You can get a free trial for 60 days. They'll never have a community version because they're the only ones in the market offering this kind of framework.
I rate this solution nine out of 10.
We are a distributor for Hadoop. Our customers choose whether they would like to use Cloudera or another product.
Cloudera Distribution is deployed on-premise as well as on bare metal servers in AWS.
Cloudera is a very manageable solution with good support.
When you compare Cloudera with EMR, EMR has a lot of administrative features, so you don't need to manage the solution. Cloudera is not as easy, as it requires more DevOps resources than other solutions.
We have been offering this solution for five years.
Cloudera Distribution is stable.
This is a scalable solution. We have clients that have a large installation of Cloudera.
Technical support from Cloudera is fine.
The initial setup of Cloudera is difficult. After you have installed it once, it is not difficult to reproduce.
For a POC deployment, we required only one DevOps engineer. For a larger-scale implementation, we also require a data engineer.
Cloudera requires a license to use.
We looked at EMR; however, Cloudera is better for on-premises use.
Cloudera is one of the best solutions for on-prem.
I would rate this solution an 8 out of 10.
We are in the testing phase of Cloudera Distribution for Hadoop, and we will be in production soon.
The procedure for operations could be simplified.
I have used Cloudera Distribution for Hadoop within the past 12 months.
The solution is reliable and stable, and it fits our requirements.
The implementation of Cloudera Distribution for Hadoop is not easy. It works on multiple nodes and can be complex for testing. The whole process took us one and a half days.
We used a local system integrator for the implementation. We had approximately five people for the implementation.
We have not had to do maintenance of the solution because we are still in the testing phase.
My advice to others is that this solution can be complex.
I rate Cloudera Distribution for Hadoop a seven out of ten.
