Miodrag Milojevic - PeerSpot reviewer
Senior Data Architect at Yettel
Real User
Aug 9, 2023
A file system for data collection that contains needed information and files
Pros and Cons
  • "It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database."
  • "The stability of the solution needs improvement."

What is our primary use case?

I have been using the latest version of Apache Hadoop. It is a file system for data collection. There are nodes in this cluster that contain all the information, directories, and other files. The nodes are based on the MySQL database.

What needs improvement?

Hadoop itself isn't very problematic. It deals with file storage and maintenance; essentially, it is a network of file operations.

The stability of the solution needs improvement.

For how long have I used the solution?

I have been using Apache Hadoop for three to four years.

What do I think about the stability of the solution?

There are some issues with file retention and its stability but they can be worked through. There are a lot of things that are based on disk space that require the preparation of different and sophisticated controls. The software itself is not unstable, but sometimes its options can cause stability issues.

Buyer's Guide
Apache Hadoop
May 2026
Learn what your peers think about Apache Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: May 2026.
893,311 professionals have used our research since 2012.

What do I think about the scalability of the solution?

Scaling means adding nodes, and that is not so easy to do. It is a detailed process that requires precision.

There are almost 25 users, including data engineers and others, but no specialists. We plan to increase the number of end users and introduce automated reports or reports based on other tools.

How are customer service and support?

Apache Hadoop is open-source software, so it has only a community rather than customer support. Cloudera provides Apache Hadoop under license and offers support. At Cloudera, consultants with less knowledge handle small issues; as a case escalates, they provide support with better technical expertise.

Which solution did I use previously and why did I switch?

We checked a few solutions and tried solutions from Azure. There were pros and cons but this solution was more acceptable.

How was the initial setup?

The setup depends on the data; vast amounts of data can be hard to set up. You might have some issues with the setup, but it depends on the number of nodes. More nodes can cause more issues and take more time to resolve. Reshuffling is also complex and can cause problems.

The on-premises setup can be difficult because it requires setting up additional nodes as you expand. Cloud deployment can be easier, but it only supports other software.
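To give a sense of the per-node configuration work involved in an on-premises setup like the reviewer describes, a minimal HDFS deployment typically starts from a core-site.xml along these lines. This is a sketch; the hostname and port are placeholders, not values from the review:

```xml
<!-- core-site.xml: minimal setting pointing clients and daemons at the
     NameNode. "namenode.example.com" is a placeholder hostname. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>
```

Each node added to the cluster needs consistent copies of this and related configuration files, which is part of why scaling out on-premises takes precision.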

What was our ROI?

The ROI is very hard to calculate. The source of data for the company can help to run different technologies and make many decisions based on the data, but it's very hard to calculate the return on investment.

What's my experience with pricing, setup cost, and licensing?

I am not up to date on the licensing cost, but you need to pay for a license if it is purchased from Cloudera.

What other advice do I have?

If you plan to use Apache Hadoop, purchase the license from Cloudera because they provide you with technical support.

I rate the overall solution an eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Juliet Hoimonthi - PeerSpot reviewer
Manager at Robi Axiata Limited
Real User
Jul 25, 2022
Has good analysis and processing features for AI/ML use cases, but isn't as user-friendly and requires an advanced level of coding or programming
Pros and Cons
  • "What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies."
  • "What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."

What is our primary use case?

I'm from the data governance team, and this is how my team uses Apache Hadoop: there's a GUI called Apache Atlas, which has an option called the "business glossary". My team uses the business glossary from Apache Atlas and also uses Apache Ranger. Apache Ranger is another GUI where you can check who is using which data source through the Apache Hadoop platform. My team also uses the Apache Hadoop platform for AI-related use cases: the data required for any AI use case is processed with ETL, specifically with the Talend tool. My team then loads the data into Apache Hadoop, builds clusters from it, and uses the data for AI/ML cases.

What is most valuable?

What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies.

What needs improvement?

What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly.

What do I think about the stability of the solution?

Apache Hadoop has good stability.

What do I think about the scalability of the solution?

I'm not sure how scalable Apache Hadoop is.

How are customer service and support?

In terms of technical support from Apache Hadoop, we are working with an external vendor and they are the ones helping us in every case. They are helpful.

Which solution did I use previously and why did I switch?

We used Oracle Exadata before using Apache Hadoop. It was one or two years ago when we started using the Apache Hadoop platform. We're still thinking about using both platforms in parallel or choosing one of the two. We're still looking into the benefits of each platform, but currently, we're using both Oracle Exadata and Apache Hadoop.

How was the initial setup?

I wasn't part of the team that set up Apache Hadoop, but using it after it was set up was very easy. The solution was ready immediately, and the GUI was smooth and fast, with no issues.

What about the implementation team?

Apache Hadoop was implemented by the IT team, so it was an in-house implementation.

What's my experience with pricing, setup cost, and licensing?

If my company could use the cloud version of Apache Hadoop, particularly the cloud storage feature, it would be easier and would cost less, because an on-premises deployment has higher storage costs. However, I don't know exactly how much Apache Hadoop costs.

What other advice do I have?

My company is using both Apache Hadoop and Oracle Exadata.

I'm unsure which version of Apache Hadoop I'm using, but it could be the latest version.

Currently, the solution is deployed on-premises because here in Bangladesh, there's a limitation with transferring data outside of the country. As far as I know, there's no cloud solution internally in Bangladesh, so if you want to use a cloud solution here, you'll have to move your data outside Bangladesh, and this is why Apache Hadoop is still deployed on-premises.

More than fifty people use Apache Hadoop directly, particularly the IT and analytics expert teams. The solution is being used by developers, people in operations, and people who maintain security.

In my company, Apache Hadoop is not fully implemented yet. It's still in the implementation phase and at least for the next two to three years, there isn't any plan of discarding it.

I'm giving Apache Hadoop a rating of seven out of ten.

I don't have any recommendations currently for people who want to implement Apache Hadoop because I'm still in the learning phase and I don't have much knowledge yet. The IT team in my company is also struggling every time in terms of preparing everything and still needs help from external vendors because the team isn't an expert on Apache Hadoop yet. My company's expertise is in Oracle Exadata because usage of that product started in 2002 or 2003.

My company is a customer of Apache Hadoop.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Teodor Muraru - PeerSpot reviewer
Developer at Emag
Real User
Top 5 Leaderboard
May 26, 2024
Helps to store and retrieve information
Pros and Cons
  • "Apache Hadoop is crucial in projects that save and retrieve data daily. Its valuable features are scalability and stability. It is easy to integrate with the existing infrastructure."

    What is our primary use case?

    The solution helps to store and retrieve information.

    What is most valuable?

    Apache Hadoop is crucial in projects that save and retrieve data daily. Its valuable features are scalability and stability. It is easy to integrate with the existing infrastructure. 

    For how long have I used the solution?

    I have been using the tool for a few years. 

    What do I think about the stability of the solution?

    I rate the tool's stability a nine out of ten. 

    How are customer service and support?

    I take support from the DevOps team. 

    What other advice do I have?

    I recommend the tool to others since it is good. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    reviewer2324613 - PeerSpot reviewer
    Data Architect at a computer software company with 51-200 employees
    Real User
    Jan 6, 2024
    Allows for customization and optimization of applications and performance using in-house resources but lacks community support
    Pros and Cons
    • "It's open-source, so it's very cost-effective."
    • "The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."

    What is our primary use case?

    We work on Apache Hadoop for various customers. 

    What is most valuable?

    It's open-source, so it's very cost-effective. Apache Hadoop has its strengths. For example, in my previous organization, which was a small startup, we used it because it was cost-effective. 

    We only had to pay for the servers, and we could optimize applications and performance using our employees, which was especially cost-effective in India. So, human resources were the main investment, not software. 

    That was five years ago, though. In the last five years, I've mainly seen Redshift, Azure, and Oracle in the market.

    What needs improvement?

    The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support. 

    And then there's the server issue. You have to create and maintain servers on your own, which can be hectic. Sometimes, the configurations in the documentation don't work, and without a strong community to turn to, you can get stuck. That's where cloud services play a vital role.

    In future releases, the community needs to be improved a lot. We need a better community, and the documentation should be more accurate for the setup process.

    Sometimes, we face errors even when following the documentation for server setup and configuration. We need better support. 

    Even if we raise a ticket, it takes a long time to get addressed, and they don't offer online support. They ask for screenshots instead of direct screen sharing or hopping on a call, which takes even more time. But it's free, so we can't complain too much.

    For how long have I used the solution?

    I've been working with Apache Hadoop for ten years. I started my career with Hadoop. I've worked with it at Infinia, Microsoft, and AWS, for a total of about eight years.

    What do I think about the stability of the solution?

    I would rate the stability a seven out of ten. There is room for improvement in performance.

    What do I think about the scalability of the solution?

    It can be scalable in certain cases. Typically, for startups or product-based companies with limited budgets during product development, Apache Hadoop is often the only viable option. They cannot afford the costs of other cloud-based systems, so Apache Hadoop plays a main role in those scenarios.

    Which solution did I use previously and why did I switch?

    For some customers, we use Oracle Autonomous Database. Now, I cannot compare Apache Hadoop with Oracle Autonomous Data Warehouse when it comes to value for money. They're not directly comparable.

    How was the initial setup?

    The initial setup is a hectic task. Configuring servers and nodes takes a long time. That's one of the big advantages of an Autonomous Data Warehouse. You can start implementing within half the time. 

    With Apache Hadoop, you have to wait for the setup, architecture, and data evaluation. But with Autonomous, those things are automated. It scales as you use more data, so you can focus on the business rather than infrastructure.

    What's my experience with pricing, setup cost, and licensing?

    We just use the free version.

    What other advice do I have?

    We can't use Apache Hadoop for everything, like storage and data errors. But we can use some tools that are native to Hadoop, like Kafka.

    For the current situation, I'd rate it a seven out of ten. 

    However, five years ago, I would have rated it a nine out of ten. Back then, I was working with it fully. But now we're used to working with cloud systems. Creating servers is more difficult nowadays.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    reviewer1976262 - PeerSpot reviewer
    Credit & Fraud Risk Analyst at a financial services firm with 10,001+ employees
    Real User
    Oct 16, 2022
    Has the ability to take a large amount of data and deliver the necessary splices and summary charts
    Pros and Cons
    • "Apache Hadoop can manage large amounts and volumes of data with relative ease, which is a feature that is beneficial."
    • "I mentioned this before, and it is probably the only feature we can improve a little bit, because the terminal and coding screen on Hadoop is a little outdated and looks like an old C++ BIOS screen. If the UI and UX can be improved slightly, I believe it will go a long way toward increasing adoption and effectiveness."
    • "In terms of processing speed, I believe that some of this software as well as the Hadoop-linked software can be better."

    What is our primary use case?

    We use Apache Hadoop for analytics purposes.

    What is most valuable?

    The ability to take a large amount of data and deliver the appropriate splices and summary charts is the most crucial function that I have discovered.

    This stands in contrast to some of the other tools that are available, such as SQL and SAS, which are likely incapable of handling such a large volume of data. Even R, for instance, is unable to handle such data volumes. 

    Apache Hadoop can manage large amounts and volumes of data with relative ease, which is a feature that is beneficial.

    What needs improvement?

    In terms of processing speed, I believe that some of this software as well as the Hadoop-linked software can be better. While analyzing massive amounts of data, you also want it to happen quickly. Faster processing speed is definitely an area for improvement.

    I am not sure about the cloud's technical aspects, whether there are things that happen in the cloud architecture that essentially make it a little slow, but speed could be one. And, second, the Hadoop-linked programs and Hadoop-linked software that are available could do much more and much better in terms of UI and UX.

    I mentioned this before, and it is probably the only feature we can improve a little bit, because the terminal and coding screen on Hadoop is a little outdated and looks like an old C++ BIOS screen.

    If the UI and UX can be improved slightly, I believe it will go a long way toward increasing adoption and effectiveness.

    For how long have I used the solution?

    I have been using Apache Hadoop for six months.

    What do I think about the stability of the solution?

    It is far more stable than some of the other software that I have tried. The current version of the Hadoop software is also becoming increasingly stable.

    When a new version is released, the subsequent ones are always more stable and easier to use.

    What do I think about the scalability of the solution?

    Based on what I have seen in my current enterprise, it was fairly simple to get access as a new employee once I joined, and this is true for everyone who has been onboarded in my designation. I would imagine that it is fairly scalable across an enterprise.

    I am fairly certain that we have between 10,000 and 15,000 employees who use it.

    How are customer service and support?

    I have not had any direct experience with technical support.

    We have an in-house technical support team that handles it.

    Which solution did I use previously and why did I switch?

    I have since changed careers; I no longer use any automation tools, nor does my job require me to compare the capabilities of other tools.

    I am working with Risk Analytic tools. I work with data these days, therefore I use technologies like Hive, Shiny R, and other data-intensive programs.

    Shiny is a plugin that you can have on R. As a result of changing my profiles, I am now working in a position that is more data-centric and less focused on process automation.

    We currently have proprietary tools and proprietary cloud software, so I don't really need to employ any external cloud vendors. Aside from that, I only use the third-party technologies I've already mentioned, primarily Hadoop and R.

    This is one of the prime, cornerstone pieces of software that we use. I have never been in a position to make a like-for-like comparison with another software product.

    How was the initial setup?

    As it is proprietary software for the enterprise that I am currently working on, I had no trouble setting it up.

    What's my experience with pricing, setup cost, and licensing?

    I am not sure about the price, but in terms of usability and utility of the software as a whole, I would rate it a three and a half to four out of five.

    Which other solutions did I evaluate?

    When I was a digital transformation consultant for my prior employer, I downloaded and read the reviews.

    It involved learning about workflow automation tools as well as process automation. I looked at a number of these platforms as part of that, but I have never actually used them.

    What other advice do I have?

    I would recommend this solution for data professionals who have to work hands-on with big data.

    For instance, if you work with smaller or more finite data sets, that is, data sets that do not keep updating themselves, I would most likely recommend R or even Excel, where you can do a lot of analysis. However, for data professionals who work with large amounts of data, I would strongly recommend Hadoop. It's a little more technical, but it does the job.

    I would rate Apache Hadoop an eight out of ten. I would like to see some improvements, but I appreciate the utility it provides.

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Abhik Ray - PeerSpot reviewer
    Co-Founder at Quantic
    Real User
    Aug 5, 2022
    Has good processing power and speed and is capable of handling large volumes of data and doing online analysis
    Pros and Cons
    • "The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable."
    • "It involves a steep learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities overlap, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."

    What is our primary use case?

    Its main use case is to create a data warehouse or data lake, which is a collection of data from multiple product processors used by a banking organization. They have core banking, which has savings accounts or deposits as one system, and they have a CRM or customer information system. They also have a credit card system. All of them are separate systems in most cases, but there is a linkage between the data. So, the main motivation is to consolidate all that data in one place and link it wherever required so that it acts as a single version of the truth, which is used for management reporting, regulatory reporting, and various forms of analyses.

    We have done two or three projects with Hadoop, and we have taken the latest version available at that time. So far, it was deployed on-premises.

    What is most valuable?

    The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable.

    Another feature that I like is online analysis. In some cases, data requires online analysis. We like using Hadoop for that.
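The processing power the reviewer praises comes from Hadoop's MapReduce model, which splits work into map, shuffle, and reduce phases that run in parallel across nodes. As a rough illustration only, here is a minimal single-process sketch of that model in Python, using the classic word-count task; it mimics the programming model, not Hadoop's actual API:

```python
# Single-process sketch of the MapReduce model Hadoop parallelizes:
# map -> shuffle (group by key) -> reduce. Illustrative only.
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data needs big clusters", "data data everywhere"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["data"])  # 3
```

In a real cluster, the map and reduce functions run on many nodes at once over HDFS blocks, which is what lets Hadoop handle the core and incremental volumes the reviewer describes.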

    What needs improvement?

    It involves a steep learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities overlap, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it. However, when it comes to power, I have nothing to say. It is really good.

    For how long have I used the solution?

    We have been working with this solution for two and a half to three years.

    What do I think about the stability of the solution?

    The core file system and the offline data ingestion are extremely stable. In my experience, there is a bit less stability during online data ingestion; when you have incremental online data, it sometimes stops or aborts before finishing. It is rare, but it happens. The offline data ingestion and the basic processing are very stable.

    What do I think about the scalability of the solution?

    Its scalability is very good. Most of our clients have used it on-prem. So, to a large extent, it is up to them to provide hardware for large data, which they have. Its scalability is linear. As long as the hardware is given to it, there are no complaints.

    About 70% of its users are from a client's IT in terms of setting it up and providing support to make sure that the pipeline is there. Business users are about 30%. They are the people who use the analytics derived from the warehouse or data lake. Collectively, there are about 120 users. The size of the data is mostly in terms of the number of records it handles, which could be 30 or 40 million.

    How are customer service and support?

    We have not dealt with them too many times. I would rate them a four out of five. There are no complaints.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    Some of our clients are using Teradata, and some of them are using Hadoop.

    How was the initial setup?

    After the hardware is available, getting the environment and software up and running has taken us a minimum of a week to 10 days. Sometimes it has taken longer, but this is usually the minimum to get everything up. That includes the downloads as well as setting everything up and making the pieces work together so we can start using the solution.

    For the original deployment, because there are so many components and not everyone knows everything pretty well, we have seen that we had to deploy four or five people in various areas at the initial deployment stage. However, once it is running, one or two people are required for maintenance.

    What was our ROI?

    Different clients derive different levels of return based on the sophistication of the analytics that they derive out of it and how they use it. I don't know how much ROI they have got, but I can say that some clients have not got a decent ROI, but some of our clients are happy with it. It is very much client-dependent.

    What's my experience with pricing, setup cost, and licensing?

    We don't directly pay for it. Our clients pay for it, and they usually don't complain about the price. So, it is probably acceptable.

    What other advice do I have?

    I would rate it a nine out of ten; it loses a point because of the complexity, but technically, it is okay.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Satya Raju - PeerSpot reviewer
    Architect - Software Engineering at Innominds
    Reseller
    Top 5
    Oct 22, 2024
    Robust data processing and analytics with potential improvements for streaming capabilities
    Pros and Cons
    • "Hadoop can store any kind of data—structured, unstructured, and semi-structured—and presents it using the relational model through Hive."
    • "Hadoop lacks OLAP capabilities."

    What is our primary use case?

    I use Hadoop as a data lake in an AIML solution, where it connects to various data sources and ingests data into Hadoop. It is utilized for processing large data volumes with various data sources such as RDBMS, file systems, Kafka for real-time streaming data, IoT, web sockets, and API metadata.

    How has it helped my organization?

    Hadoop provides a robust data lake functionality, allowing for the ingestion and processing of varied data types. It ensures no data loss through data replication and efficient transformation jobs handled in parallel, enhancing data analytics.
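The "no data loss through data replication" the reviewer mentions is governed in HDFS by the block replication factor. As a hedged sketch (3 is the conventional default, not a value taken from this review), the relevant hdfs-site.xml setting looks like this:

```xml
<!-- hdfs-site.xml: HDFS keeps this many copies of each block across
     DataNodes, so losing a single node does not lose data. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

With three replicas spread over different nodes, transformation jobs can keep running in parallel even while a failed DataNode's blocks are re-replicated.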

    What is most valuable?

    Hadoop can store any kind of data—structured, unstructured, and semi-structured—and presents it using the relational model through Hive. The combination with Spark enhances data analytics capabilities.

    What needs improvement?

    Hadoop lacks OLAP capabilities. I recommend adding a Delta Lake feature to make the data compatible with ACID properties. Also, video and audio streaming import issues could be improved to ensure proper data validation.

    For how long have I used the solution?

    I have been working with Apache Hadoop for the last ten years.

    What do I think about the stability of the solution?

    The stability of Hadoop is good. I have been working with it for the last ten years and have not encountered significant stability issues.

    What do I think about the scalability of the solution?

    Hadoop offers good scalability with horizontal scaling, especially when deployed on cloud platforms like Cloudera, which takes care of scaling the infrastructure. On-premises requires the maintenance of data nodes.

    How are customer service and support?

    I rate the customer support at eight out of ten. It is satisfactory.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    I have worked with traditional RDBMSs, which do not support analytics, querying, and data replication as efficiently as Hadoop.

    How was the initial setup?

    Setting up Apache Hadoop involves a learning curve and is of medium complexity. I rate the setup a seven out of ten.

    What about the implementation team?

    We need to provision the cluster deployment, including a master node, a coordinator node, and the setup of Spark and Hive for development.

    Which other solutions did I evaluate?

    I have also worked with HDFS, Hive, and Apache Spark with Scala and Java.

    What other advice do I have?

    As cloud technology matures, it is advisable to transition from traditional Hadoop to cloud-based solutions like AWS EMR and Azure, which offer managed, largely maintenance-free infrastructure.

    I'd rate the solution seven out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Anand Viswanath - PeerSpot reviewer
    Project Manager at Unimity Solutions
    Real User
    Apr 3, 2024
    Offers reasonable integration features but needs to improve the setup process
    Pros and Cons
    • "The tool's stability is good."
    • "The load optimization capabilities of the product are an area of concern where improvements are required."

    What is our primary use case?

    I use the solution in my company for security purposes.

    In my company, we have intranet portals that we need to ensure are not accessible to outsiders. All the data within the internal applications is accessible only with valid credentials within the domain. In general, my company uses Apache Hadoop to secure our internal applications.

    What needs improvement?

    Tools like Apache Hadoop are knowledge-intensive. Unlike many other tools currently on the market, they cannot be understood straight away: using Apache Hadoop requires in-depth knowledge that not everybody can pick up easily. It would be beneficial if navigating tools like Apache Hadoop were made more user-friendly. If the tool were easier for non-technical users to navigate, they would not have to depend on experts.

    The load optimization capabilities of the product are an area of concern where improvements are required.

    The complex setup phase can be made easier in the future.

    For how long have I used the solution?

    I have four years of experience with Apache Hadoop.

    What do I think about the stability of the solution?

    The tool's stability is good.

    What do I think about the scalability of the solution?

    I am not sure about the scalability features of the product.

    There are around 500 users of the product in my company.

    When there is a huge load or a huge number of people accessing the product simultaneously, there is a visible delay in the loading of pages.

    How was the initial setup?

    The product's initial setup phase is complex.

    I have not handled the setup phase myself; I rely on the infrastructure person in my company who knows Apache Hadoop.

    The solution is deployed on the cloud.

    What about the implementation team?

    The product can be deployed with the help of the in-house infra team at my company.

    What other advice do I have?

    There was a scenario when the product was essential for my company's data analytics needs. Before my company makes any web solution available in production, we have prototypes and replicas of the application in lower environments. My company uses Apache Hadoop to ensure that the lower environments in which we operate are secure and accessible only by those people in our company with valid credentials.

    I suggest that those planning to use the product first understand the tool's features and capabilities and then choose the right configuration to avoid misconfigurations.

    The product's integration capabilities are good; we have not faced any timeouts or downtime in our company when using the tool.

    My company started using the tool expecting it to offer security and the right availability, meaning availability to the right people at the right time. The tool met that expectation, and we got the value we wanted from it.

    I rate the tool a seven out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.