Amazon EC2 Auto Scaling vs Apache Spark comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

 

Categories and Ranking

Amazon EC2 Auto Scaling
Ranking in Compute Service: 3rd
Average Rating: 9.0
Reviews Sentiment: 8.2
Number of Reviews: 45
Ranking in other categories: None

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)
 

Mindshare comparison

As of April 2025, in the Compute Service category, the mindshare of Amazon EC2 Auto Scaling is 10.8%, down from 12.6% a year earlier. The mindshare of Apache Spark is 11.2%, up from 9.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
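To make the year-over-year movement easier to compare, the short sketch below works out the change for each product in percentage points and as a relative change. It uses only the four mindshare figures quoted above; the function and variable names are illustrative and not part of any PeerSpot tooling.

```python
# Mindshare figures quoted above (percent of the Compute Service category).
mindshare = {
    "Amazon EC2 Auto Scaling": {"previous": 12.6, "current": 10.8},
    "Apache Spark": {"previous": 9.7, "current": 11.2},
}


def mindshare_change(previous, current):
    """Return (percentage-point change, relative change in percent), each rounded to one decimal."""
    points = current - previous
    relative = points / previous * 100
    return round(points, 1), round(relative, 1)


changes = {
    name: mindshare_change(values["previous"], values["current"])
    for name, values in mindshare.items()
}

for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

Under this arithmetic, Amazon EC2 Auto Scaling lost 1.8 points (about 14.3% of its earlier share) while Apache Spark gained 1.5 points (about 15.5%).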
 

Featured Reviews

Erick Karanja - PeerSpot reviewer
Scaling is as easy as hitting a button and setup is straightforward
AWS has already made improvements. In the past, if you provisioned a large EC2 instance and underutilized it, you still paid a premium. Now, AWS encourages using Kubernetes, where you primarily pay for the compute power you actually use in production. There is room for improvement. You might end up paying a high price if you're not careful and you provision a server that's underutilized. AWS has left it to engineers to figure out solutions. If you find the cost too high, you can move to Kubernetes, which might be a better solution for you than large EC2 instances. So, the improvements need to come from the user side, not the provider. Software engineers and engineering teams need to know their limits with EC2 instances. They need to recognize when it's time to transition their applications to Kubernetes. This means building with the cloud in mind from the start, making it easier to move solutions to the cloud without suffering upgrades and integration issues.
Ilya Afanasyev - PeerSpot reviewer
Reliable, able to expand, and handle large amounts of data well
We use batch processing. It works well with our formats and file versions. There's a lot of functionality. In our pipeline, each hour we copy the changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, whether or not anything had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.
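The saving described in this review comes from moving from a full copy on every run to copying only what changed. The review does not say which MongoDB feature was used, so the sketch below is only a plain-Python illustration of the general idea, not MongoDB or Spark API code. The `updated_at` field, the sample records, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def full_copy(records):
    """Copy every record on every run, changed or not."""
    return list(records)


def incremental_copy(records, last_run):
    """Copy only the records modified after the previous run."""
    return [r for r in records if r["updated_at"] > last_run]


# Hypothetical sample data: three documents with a last-modified timestamp.
records = [
    {"_id": 1, "updated_at": datetime(2025, 1, 1, 9, 30, tzinfo=timezone.utc)},
    {"_id": 2, "updated_at": datetime(2025, 1, 1, 10, 15, tzinfo=timezone.utc)},
    {"_id": 3, "updated_at": datetime(2025, 1, 1, 10, 45, tzinfo=timezone.utc)},
]
last_run = datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc)

changed = incremental_copy(records, last_run)
print(len(full_copy(records)), "records in a full copy,", len(changed), "in an incremental copy")
```

With large tables and few changes per hour, the incremental run processes a small fraction of the data, which is where a smaller cluster and lower cost could plausibly come from.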

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The solution is scalable."
"The Amazon EC2 Auto Scaling features are simple to use."
"The most useful feature is elasticity. You can scale up or down based on traffic."
"The tool helps me to process large data sets while scaling up."
"We appreciate that this solution allows us to run all of our servers through it, meaning that our workloads are mainly on the EC2 instance only."
"The solution is highly scalable."
"Amazon EC2 Auto Scaling has good integration."
"Auto-scaling is a good feature."
"I appreciate everything about the solution, not just one or two specific features. The solution is highly stable. I rate it a perfect ten. The solution is highly scalable. I rate it a perfect ten. The initial setup was straightforward. I recommend using the solution. Overall, I rate the solution a perfect ten."
"The product’s most valuable features are lazy evaluation and workload distribution."
"Provides a lot of good documentation compared to other solutions."
"The most valuable feature of this solution is its capacity for processing large amounts of data."
"The fault tolerant feature is provided."
"The tool's most valuable feature is its speed and efficiency. It's much faster than other tools and excels in parallel data processing. Unlike tools like Python or JavaScript, which may struggle with parallel processing, it allows us to handle large volumes of data with more power easily."
"The most valuable feature of Apache Spark is its flexibility."
"The product's deployment phase is easy."
 

Cons

"The licensing cost is expensive."
"The spinning up in the solution can be much faster...The product should have a faster scalability option."
"When creating a new instance there is a set of questions that have to be answered, and this is something that can be simplified."
"The documentation for this solution could be improved. For example, it is difficult to find documentation for integration with applications."
"The product does not explain why a particular instance is terminated."
"There is room for improvement. You might end up paying a high price if you're not careful and you provision a server that's underutilized."
"Sometimes the configuration is not intuitive."
"There should be an AWS instance in South Africa, where the latency would be even lower. It might happen soon since AWS has recently opened more data centres in Nigeria. AWS may extend its reach to South Africa, and offer hosted CLI servers there. Most of the problems with AWS are not to do with the solution itself but with configuration. It is something on design, more or less."
"The Spark solution could improve in scheduling tasks and managing dependencies."
"More ML based algorithms should be added to it, to make it algorithmic-rich for developers."
"There could be enhancements in optimization techniques, as there are some limitations in this area that could be addressed to further refine Spark's performance."
"The solution’s integration with other platforms should be improved."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"The initial setup was not easy."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
 

Pricing and Cost Advice

"The licences for this solution are based on the number of instances. This determines the EC2 type that is used and this is then priced accordingly."
"The solution's licensing is based on a pay-as-you-go model. You only pay for the resources you use, whether it's RAM, processing power, or storage. So, it's calculated based on the time you use those resources, typically billed in hours or minutes."
"When we want to use more services, we need to pay more. It's a monthly subscription, rather than licensed-based. Pricing or fees for Amazon EC2 Auto Scaling could be improved."
"Compared to the performance, the price is quite high. I would rate it a ten because it is expensive. There are additional costs including bandwidth costs, data transfer costs, and load balancing costs."
"The product is quite expensive."
"AWS offered some credits, so we have been able to enjoy some of those benefits. The pricing was fair."
"Pricing could be a little bit more competitive."
"The pricing is not fixed and it is based on usage."
"Apache Spark is an expensive solution."
"It is an open-source solution, it is free of charge."
"Apache Spark is an open-source tool."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"It is an open-source platform. We do not pay for its subscription."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
845,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Amazon EC2 Auto Scaling
Financial Services Firm: 24%
Computer Software Company: 14%
Government: 8%
University: 7%

Apache Spark
Financial Services Firm: 28%
Computer Software Company: 13%
Manufacturing Company: 8%
Comms Service Provider: 5%
 

Company Size

By reviewers (chart; categories: Large Enterprise, Midsize Enterprise, Small Business)
 

Questions from the Community

What do you like most about Amazon EC2 Auto Scaling?
The solution removes the need for hardware. We can easily create servers or machines. Just by clicking or specifying our requirements, like memory size or disk space, it's set up for us. The tool e...
What is your experience regarding pricing and costs for Amazon EC2 Auto Scaling?
The pricing structure from AWS is really complex and depends on factors like the region and specific services used. Prices can vary significantly even within the same service across different locat...
What needs improvement with Amazon EC2 Auto Scaling?
There is a need for improvement in understanding the pricing structure, as it is complex and depends on several factors such as the location of data centers.
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...
What needs improvement with Apache Spark?
The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...
 

Also Known As

Amazon EC2 Auto Scaling: AWS RAM
Apache Spark: No data available
 


Sample Customers

Amazon EC2 Auto Scaling: Expedia, Intuit, Royal Dutch Shell, Brooks Brothers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Find out what your peers are saying about Amazon EC2 Auto Scaling vs. Apache Spark and other solutions. Updated: March 2025.