Apache Hadoop provides a scalable, cost-effective open-source platform capable of handling vast data volumes, with features like HDFS, distributed processing, and broad integration capabilities.
| Product | Mindshare |
|---|---|
| Apache Hadoop | 3.3% |
| Snowflake | 9.3% |
| Teradata | 8.8% |
| Other | 78.6% |
| Product | Rating | Mindshare | Would Recommend | Interviews |
|---|---|---|---|---|
| Dell PowerStore | 4.4 | 1.4% | 97% | 187 |
| Teradata | 4.1 | 8.8% | 88% | 83 |
| Company Size | Count |
|---|---|
| Small Business | 11 |
| Midsize Enterprise | 7 |
| Large Enterprise | 16 |
| Company Size | Count |
|---|---|
| Small Business | 64 |
| Midsize Enterprise | 39 |
| Large Enterprise | 150 |
Apache Hadoop is known for its distributed file system, HDFS, which stores large data volumes efficiently. Its open-source nature allows cost-effective scaling, and compatibility with tools like Spark enables enhanced analytics. While it offers significant processing power, areas for improvement include user-friendliness, interface design, security measures, and real-time data handling. Its distributed architecture supports storage of both structured and unstructured data, and data replication provides fault tolerance. Integration with tools like Apache Atlas and Talend highlights its versatility.
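The storage and replication behavior described above can be sketched with the standard HDFS command-line interface. This is an illustrative session only: the paths, file name, and replication factor are example values, not taken from any deployment discussed on this page, and the commands require a running Hadoop cluster.

```shell
# Copy a local file into HDFS; it is split into blocks and replicated
# across DataNodes according to dfs.replication (default 3).
hdfs dfs -mkdir -p /data/events
hdfs dfs -put events.log /data/events/

# Adjust the replication factor for an important file; -w waits until
# the target number of replicas is in place (fault tolerance).
hdfs dfs -setrep -w 3 /data/events/events.log

# Inspect block placement and replication health.
hdfs fsck /data/events/events.log -files -blocks
```

The `-setrep` step is what gives HDFS its fault tolerance: if a DataNode holding one replica fails, the NameNode re-replicates the affected blocks from the surviving copies.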
What are the key features of Apache Hadoop?

Industries leverage Apache Hadoop for Big Data analytics, data lakes, ETL tasks, and enterprise data hubs, handling unstructured and structured data from IoT, RDBMS, and real-time streams. Its applications extend to data warehousing, AI/ML projects, and data migration, employing tools like Apache Ranger, Hive, and Talend for effective data management and analysis.
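As a sketch of the ETL pattern mentioned above, raw files already sitting in HDFS can be exposed to Hive as an external table and then transformed into a managed table for analytics. The table names, columns, and HDFS path below are hypothetical, chosen only to illustrate the pattern; running this requires Hive on a Hadoop cluster.

```shell
# Hypothetical Hive ETL step: expose raw CSV data in HDFS as an
# external table, then load a cleaned subset into a managed ORC table.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
  event_time STRING,
  device_id  STRING,
  payload    STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/events';

CREATE TABLE IF NOT EXISTS clean_events (
  event_time TIMESTAMP,
  device_id  STRING
)
STORED AS ORC;

INSERT OVERWRITE TABLE clean_events
SELECT CAST(event_time AS TIMESTAMP), device_id
FROM raw_events
WHERE device_id IS NOT NULL;
"
```

Keeping the raw data in an external table means dropping it from Hive does not delete the underlying HDFS files, which is the usual choice for landing zones in a data lake.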
| Author info | Rating | Review Summary |
|---|---|---|
| Financial Advisor at a financial services firm with 10,001+ employees | 4.0 | We had a limited on-premises deployment of Apache Hadoop which scaled well and was reliable, but maintaining it was challenging due to a lack of resources and expertise after the original team left. We prefer solutions with structured support. |
| Principal Network and Database Engineer at Parsons Corporation | 4.5 | I use Apache Hadoop daily to analyze unstructured incident data, benefiting from its AI and machine learning capabilities, strong failover support, and fault tolerance, especially useful in our dual-server setup and field environments prone to hardware failures. |
| Database Administrator at Lacoste | 4.5 | I am working on migrating a customer's data warehouse from Oracle to Hadoop, specifically considering Cloudera, due to its data warehouse capabilities and scalability. The integration of reporting tools like Power BI with Hadoop poses some challenges. |
| Software developer at Fiserv | 4.0 | I use Apache Hadoop in my company for its efficient analytical processing and organized data distribution. However, it struggles with incremental data processing and has high licensing costs. Setup and technical support need improvement for better user experience and resolution times. |
| Head of Data at an energy/utilities company with 51-200 employees | 4.5 | Apache Hadoop's distributed computing capability efficiently accelerates data processing by distributing tasks across multiple nodes. While it offers cost savings on compute resources through optimization, the availability of comprehensive training materials could be improved to enhance onboarding and skill development. |
| Senior Associate Consultant at Applied Materials | 3.0 | I use Apache Hadoop for data storage and report generation, appreciating its open-source nature, ability to handle large data volumes, and effectiveness in data processing and storage. However, limited support requires deeper knowledge and improvisation. |
| IT Support Specialist at Convergys Corporation | 4.5 | We used Apache Hadoop mainly for data analysis and storage, benefiting from its low cost, open-source nature, and efficient performance on commodity hardware. While its flexibility and resilience are advantageous, improved security measures would enhance its capabilities. |
| Head Of Data Governance at Alibaba Group | 4.0 | We use the Hadoop File System for big data due to its open-source nature and cost-effectiveness. However, dealing with data skewness requires custom solutions, unlike Spark, which is more efficient and faster due to in-memory processing. |