Apache Spark Streaming Reviews

Name: Apache Spark Streaming
Brand: Apache
Rating: 3.9 (17 reviews)

94% willing to recommend


What is Apache Spark Streaming?

Apache Spark Streaming efficiently processes real-time data with features like micro-batching and native Python support. It's scalable and integrates with many services, ideal for reducing data latency and enabling real-time analytics across industries.


Apache Spark Streaming is the #10 ranked solution in Streaming Analytics tools. PeerSpot users give Apache Spark Streaming an average rating of 7.8 out of 10. Apache Spark Streaming is most commonly compared to Databricks. Apache Spark Streaming is popular among the large enterprise segment, which accounts for 50% of users researching this solution on PeerSpot. The top industry researching this solution is financial services, accounting for 22% of all views.


Featured Apache Spark Streaming reviews

Himansu Jena

Sr Project Manager at Raj Subhatech

There are various ways we can improve Apache Spark Streaming through best practices. The first is batch interval tuning: choosing small micro-batch intervals based on latency requirements helps prevent back pressure. We can use data formats such as Parquet or ORC for storage that needs faster reads, and leverage predicate push-down optimizations. We can implement serialization with Kryo, which helps in terms of .NET or Java. We have boxing and unboxing serialization for XML and JSON for converting key-value pairs stored in the browser. We can also implement caching mechanisms to store results and avoid recomputing them across multiple operations. We can use specified joins which help with smaller databases, and distributed joins can minimize users. We can use memory and CPU efficiency optimizations, known as Project Tungsten. Additionally, load balancing, checkpointing, and schema evaluation are areas to consider based on performance and bottlenecks. We can use Bugzilla tools for tracking and Splunk to monitor the performance of process systems, utilization, and performance based on data frames or data sets.


Kuldeep Pal

Data Engineer at Walmart Global Tech

The positive impact from Apache Spark Streaming is its near real-time capability. It has a good ecosystem that provides good support. However, if you need purely real-time data, you would be going with Flink. Apache Spark Streaming is good for near-to-real-time data and requires less maintenance, which is beneficial for developers and companies. The new feature coming in Apache Spark Streaming 4 is continuous streaming. If continuous streaming becomes stable and performs comparably to Flink, then Apache Spark Streaming would be preferred everywhere due to its good maintenance and support system. While it is reliable, there are some issues with Apache Spark Streaming as it is not 100% reliable. Sometimes it fails, requiring numerous configurations such as checkpointing, watermarking, and other features. If you select a 10-minute window and the data arrives at the 30th minute, it sometimes loses data in between. You also have to apply back pressure when numerous messages are coming in. It requires constant monitoring and maintenance. I would say it is 90-95% reliable, but multiple configurations and frequent maintenance make it slightly less reliable. The continuous deployment feature being in beta phase could benefit everyone if released earlier.


Khoa Dang Le

Principal AI Engineer at IMT Solutions

I find the fault tolerance feature beneficial because I use it for serving data from a landing area. I understand all of the structures we have for Spark SQL, Spark Streaming, and MLlib. The ability of Apache Spark Streaming to handle out-of-order data using watermarking and windowing is something we use in our pipeline. Nearly 50% of our usage is based on that because we use it for landing data, and we appreciate that we can work with it. The main benefits of Apache Spark Streaming include cost savings, time savings, and efficiency improvements about data storage. The fast storage capability is crucial because Apache Spark replaces Hadoop's MapReduce, allowing us to manage our data more efficiently.


Apache Spark Streaming mindshare

As of June 2026, the mindshare of Apache Spark Streaming in the Streaming Analytics category stands at 4.6%, up from 2.6% a year earlier, according to calculations based on PeerSpot user engagement data.

Streaming Analytics Mindshare Distribution
Product	Mindshare (%)
Apache Spark Streaming	4.6%
Apache Flink	8.2%
Databricks	7.9%
Other	79.3%


PeerResearch reports based on Apache Spark Streaming reviews

Type	Title	Date
Category	Streaming Analytics	Jun 23, 2026
Product	Reviews, tips, and advice from real users	Jun 23, 2026
Comparison	Apache Spark Streaming vs Databricks	Jun 23, 2026
Comparison	Apache Spark Streaming vs Azure Stream Analytics	Jun 23, 2026
Comparison	Apache Spark Streaming vs Apache Kafka	Jun 23, 2026

Valuable Features

Valuable features of Apache Spark Streaming include real-time analytics, scalability, efficient data processing, integration with other technologies, and low latency. Users appreciate its versatility in supporting multiple languages, stability, fault tolerance, and ease of deployment. Features like checkpointing, windowing, and watermarking enhance its capability to manage out-of-order data. It enables handling large-scale data, provides native Python support, and facilitates integration with databases and other data sources. Users benefit from cost efficiency and strong community support.

"It is the most scalable tool that I have seen before."
"The main benefits of Apache Spark Streaming include cost savings, time savings, and efficiency improvements about data storage."
"For Apache Spark Streaming, the feature I appreciated most is that it provides live data delivery; additionally, it provides the capability to send a larger amount of data in parallel."

Room for Improvement

Reviewers say Apache Spark Streaming needs more business-friendly user configuration, easier installation, and improved cloud-native support. Further improvements include handling real-time analytics, memory management, and latency issues. Users desire more robust monitoring, a better UI, support for arbitrary stateful functions in Python, and handling of unstructured data. Auto-tuning, continuous deployment features, and the ability to stop jobs instantly from the UI need attention. Improvements in Spark SQL, scalability, debugging, and adaptation to real-time processing are also necessary.

"The problem is we need to use it in a certain manner. After that, we need to apply another pipeline for the machine learning processes, and that's what we work on."
"The downside is when you have this the other way around in the columns, it becomes really hard to use."
"Monitoring is an area where they could definitely improve Apache Spark Streaming. When you have a streaming application, it generates numerous logs. After some time, the logs become meaningless because they're quite large and impossible to open."

Pricing

Enterprise users find Apache Spark Streaming cost-effective due to its open-source nature. While using Apache Spark as a service incurs costs, the open-source version has no licensing fees. Cloud setups like AWS EMR, Google Cloud's DataProc, and similar services provide managed solutions, influencing cost through additional features. On-premises setups are generally more expensive. Costs can vary based on additional service dependencies, but the core software remains free, offering a flexible pricing structure for businesses.

"Spark is an affordable solution, especially considering its open-source nature."
"On a scale from one to ten, where one is expensive, or not cost-effective, and ten is cheap, I rate the price a seven."
"I was using the open-source community version, which was self-hosted."

Popular Use Cases

Apache Spark Streaming is utilized for predictive maintenance, anomaly detection, and real-time streaming. Entities employ it for healthcare data processing, ETL tasks, GIS, and IoT-data processing. It's integrated with diverse data ecosystems like Kafka, Google Cloud, and HDFS, enhancing real-time decision-making in telecommunications and order management. Organizations use Apache Spark Streaming with micro-batching for tasks like fraud detection and building Customer 360 profiles, benefiting from reduced latency and improved data integration.

Service and Support

Apache Spark Streaming offers strong documentation and robust open-source community support. Users frequently rely on available online resources and community assistance rather than directly contacting Apache's team. Many appreciate the community, especially major contributions from Databricks. External consulting is sometimes used to enhance support. Apache's technical support is often unnecessary as users find ample guidance through public information and community channels for managing databases or storage-related tasks.

Deployment

Apache Spark Streaming's initial setup varies; some find it developer-focused and straightforward, while others see it as complex, requiring Java or Scala knowledge. Installation can be done in minutes in hosted or hybrid cloud environments. There is extensive documentation and community support, facilitating ease in smaller-scale setups. Scaling is simple in cloud settings, and maintenance benefits from active development and version migration. Multiple users highlight easy deployment if supported by other tools like Hadoop, Data Lake, or Kafka.

Scalability

Apache Spark Streaming appears highly scalable with support for large-scale data processing and many active users. Its distributed compute architecture, horizontal scalability, and features like auto-scaling, adaptive query planning, and handling of data skewness enhance performance. Adaptable across several domains, it manages workload balancing globally. Users didn't face issues scaling operations, even with significant data loads, and its capabilities extend to handling real-time processing, machine learning, and data visualizations.

Stability

Apache Spark Streaming is regarded as stable and mature, with no significant bugs noted. Users report it as reliable, especially version 3.0.1. Crashes may occur with large datasets or improper configuration but are infrequent and manageable. It benefits from being open and transparent, allowing users to understand its operations. Although maintenance on platforms like EMR, EKS, or Azure Databricks is necessary, its stability is highly rated, and it suits various use cases.

These insights are based on the in-depth reviews provided by peers to help you make a better buying decision.


Review data by company size

By reviewers
Company Size	Count
Small Business	7
Midsize Enterprise	2
Large Enterprise	5

By visitors reading reviews
Company Size	Count
Small Business	47
Midsize Enterprise	7
Large Enterprise	54

Top industries

By visitors reading reviews

Financial Services Firm: 22%

Other industries listed (percentages not shown): Outsourcing Company, Computer Software Company, Comms Service Provider, University, Marketing Services Firm, Healthcare Company, Manufacturing Company, Real Estate/Law Firm, Construction Company, Performing Arts, Government, Legal Firm, Insurance Company, Media Company, Logistics Company, Religious Institution, Retailer, Aerospace/Defense Firm, Wholesaler/Distributor, Educational Organization, Sports Company, Pharma/Biotech Company, Recreational Facilities/Services Company.

Learn more about Apache Spark Streaming

Apache Spark Streaming is a powerful tool for real-time data processing and analytics, offering support for multiple languages and robust integration capabilities. Its open-source nature, combined with features like checkpointing and watermarking, makes it a reliable choice for managing data streams with low latency. However, it faces challenges with Kubernetes deployments and requires improvements in memory management and latency. The installation process and handling of structured and unstructured data also present complexities. Despite these challenges, it's heavily utilized in building data pipelines and leveraging machine learning algorithms.

What are Apache Spark Streaming's key features?

Native Python Support: Efficient processing with Python language integration.
Micro-Batching: Handles streams in small batches for real-time processing.
Real-Time Analytics: Enables instant data insights.
Scalability: Adapts to varying data loads.
Low Latency: Processes data with minimal delays.

What benefits or ROI should users expect?

Efficiency: Streamlined real-time data processing.
Reliability: Consistent performance across tasks.
Integration: Seamless connection with other services.
Cost Optimization: Reduces processing expenses over time.

In industries like healthcare, telecommunications, and logistics, Apache Spark Streaming is implemented for real-time data processing and machine learning. It aids in predictive maintenance, anomaly detection, and fraud detection by reducing data latency with comprehensive analytics. Organizations frequently use it alongside Kafka and cloud storage solutions to enhance GIS, predictive analytics, and Customer 360 profiling.

Apache Spark Streaming was previously known as Spark Streaming.

Apache Spark Streaming customers

UC Berkeley AMPLab, Amazon, Alibaba Taobao, Kenshoo, eBay Inc.

Product Categories

Streaming Analytics

Popular Comparisons

Databricks vs Apache Spark Streaming

Qlik Talend Cloud vs Apache Spark Streaming

Confluent vs Apache Spark Streaming

Azure Stream Analytics vs Apache Spark Streaming

Apache Flink vs Apache Spark Streaming

Spring Cloud Data Flow vs Apache Spark Streaming

Amazon Kinesis vs Apache Spark Streaming

Amazon MSK vs Apache Spark Streaming

Starburst Enterprise vs Apache Spark Streaming

Striim vs Apache Spark Streaming

Apache Pulsar vs Apache Spark Streaming

Aiven Platform vs Apache Spark Streaming

Altair Panopticon vs Apache Spark Streaming

Redpanda vs Apache Spark Streaming

Software AG Apama vs Apache Spark Streaming


Apache Spark Streaming Reviews Summary
Author info	Rating	Review Summary
Sr Project Manager at Raj Subhatech	4.0	I've used Apache Spark Streaming for real-time GIS and data processing, benefiting from its scalability, integration with Python tools, and predictive analytics, though handling varied data types sometimes presents challenges with missing or incomplete values.
Data Engineer at Walmart Global Tech	4.0	I've used Apache Spark Streaming for near real-time fraud detection with Kafka. Its flexible windowing, checkpointing, and scalability work well, though it requires careful configuration. It's reliable but not perfect, and continuous monitoring is essential.
Principal AI Engineer at IMT Solutions	4.0	I've used Apache Spark Streaming for three years for real-time data processing and machine learning, appreciating its fault tolerance and scalability, though retraining MLlib models for each pipeline remains a notable limitation.
Sr. Manager Data Engineer at a tech consulting company with 51-200 employees	3.5	I've used Apache Spark Streaming for years to process network data in near real-time. It's scalable and easy to deploy on AWS, but lacks support for certain features, monitoring, and handling of slowly changing dimensions.
Data Engineer III at a tech consulting company with 10,001+ employees	4.0	I've used Apache Spark Streaming to improve data latency for real-time customer profiling and ML features, though I’d like true real-time processing instead of micro-batches; setup was easy, and scalability and community support are excellent.
Gen AI Lead/Architect at Alvaria	3.5	I used Apache Spark Streaming during my academics for live data transmission and appreciated its real-time capabilities, though it lacks support for unstructured data, which limits some use cases; overall, I’d rate it seven out of ten.
Chief Data Strategist And Director at theworkshop.es	4.5	I use Apache Spark Streaming for processing real-time data in web analytics. Its versatility in supporting multiple languages makes it ideal for integrating diverse data sources. While the UI could improve, it effectively handles various scenarios and requires careful use case consideration.
Engineering Leader at Walmart	4.0	I use Apache Spark Streaming for near real-time analytics, appreciating its scalability. However, its latency and memory management issues prevent true real-time use, requiring complex setup and significant maintenance, despite offering good integration.
Data Engineer at a comms service provider with 201-500 employees	4.0	I use Apache Spark Streaming to handle industry-related use cases, like managing orders from our system. Key features include checkpointing and the Streaming API, though it could improve in cost and load optimizations. We previously used Apache NiFi.
Chief Technology Officer at Teslon Technologies Pvt Ltd	4.0	We used Spark Streaming for streaming IoT data and applied Spark ML in healthcare. Its native Python support, ease of deployment, good documentation, and community support were valuable, though transitioning it to cloud environments was challenging compared to Storm.

Himansu Jena

Sr Project Manager at Raj Subhatech

Aug 19, 2025

Efficient real-time data management and analysis with advanced features

What is our primary use case?

I use Apache Spark Streaming for GIS (Geographic Information System) work: satellite image processing, image processing, longitude and latitude data, and predicting electricity, road, and transformations in these areas.

I process all the information in real time where I can get lots of petabyte data, terabyte data from any type of XML, Excel, structured data, semi-structured data, and unstructured data. I use micro-batching, streams, transformations, and this information. Based on that, I predict and create models that can be used for regular expressions and image processing. Then using TensorFlow, I create dynamic views. Additionally, I create models which provide accuracy of predictive analytics.

With Apache Spark Streaming's integration with Anaconda and Miniconda with Python, I interact with databases using data frames or data sets in micro versions. I create solutions based on what business is expecting for decision-making, logistic regression, linear regression, or machine learning which will give image or voice record, graphical data that will provide more accuracy. These features are implemented based on client requirements. We ensure we are on track using AML and real-time processing from various data sources, whether structured, unstructured, or semi-structured data.

What is most valuable?

I use Apache Spark Streaming's checkpoint and debugging features, along with Splunk, which provides error information, health performance, and fault tolerance. In the driver nodes, we check query progress logs with checkpoint locations, recovery areas, memory streaming, processing unit duration, and resource utilization. We monitor CPU and memory usage, identify bottlenecks, optimize applications, and display this information in Tableau dashboards. This makes it more predictable and allows end clients to see issues so they can provide more data for improved accuracy.

With Apache Spark Streaming's integration with Anaconda and Miniconda with Python, I interact with databases using data frames or data sets in micro versions. I create solutions based on business expectations for decision-making, logistic regression, linear regression, or machine learning which provides image or voice record and graphical data for improved accuracy. These features are implemented based on client requirements. We ensure we stay on track using AML and real-time processing from various data sources, including structured, unstructured, or semi-structured data.

What needs improvement?

We can implement serialization with Kryo, which helps in terms of .NET or Java. We have boxing and unboxing serialization for XML and JSON for converting key-value pairs stored in the browser. We can also implement caching mechanisms to store results and avoid recomputing them across multiple operations.

We can use specified joins which help with smaller databases, and distributed joins can minimize users. We can use memory and CPU efficiency optimizations, known as Project Tungsten. Additionally, load balancing, checkpointing, and schema evaluation are areas to consider based on performance and bottlenecks. We can use Bugzilla tools for tracking and Splunk to monitor the performance of process systems, utilization, and performance based on data frames or data sets.

For how long have I used the solution?

I have been working with Apache Spark Streaming for the last seven years as a Data Science Project Manager.

What do I think about the stability of the solution?

Apache Spark Streaming is stable with regular maintenance and updated versions such as three and four. It continues to grow and improve.

What do I think about the scalability of the solution?

In terms of scalability, Apache Spark Streaming ranks at the top due to its distributed compute architecture which provides horizontal scalability.

Using RDDs (resilient distributed datasets) or DataFrames improves the performance of input-output operations. It helps handle large data efficiently and assists with workload balancing. When performing load balancing across servers in different locations such as the UK, US, Singapore, Japan, or Russia, they can coordinate without any performance issues when processing large-scale data across the globe.

It supports unified analytical engine capabilities including real-time processing, machine learning, graph analysis, and data visualizations using tools such as Matplotlib, ggplot, Tableau, or D3.js. We have various visualization options which help process the data and meet requirements.

What other advice do I have?

Most features in Apache Spark Streaming are used for database operations, focusing on speed, fault tolerance, scalability in terms of batch, real-time, SQL analytics, machine learning, graph processing, lazy evaluation, and compatibility.

Distributed systems provide more accuracy and clustering of machines across large data sets. The data is divided into portions, partitions, or small pieces and processed in parallel across multiple work nodes, significantly accelerating processing time compared to single solutions. It helps with in-memory computing, storing memory, reducing frequent disk input-output, and enabling faster algorithms.
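The partition-and-process-in-parallel idea described above can be sketched in plain Python. This is only an illustration of the structure (split, process each piece independently, combine), not Spark code; the function names and the sum-of-squares workload are hypothetical, and threads in CPython will not reproduce the speedup a real cluster gives.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n):
    """Split data into n contiguous partitions of nearly equal size."""
    size, extra = divmod(len(data), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def sum_of_squares(part):
    """Work done independently on one partition."""
    return sum(x * x for x in part)


def parallel_sum_of_squares(data, n_partitions=4):
    """Process each partition in its own worker, then combine the partial results."""
    parts = partition(data, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(sum_of_squares, parts))
    return sum(partials)


# Hypothetical input, for illustration only.
data = list(range(10))
parts = partition(data, 4)
total = parallel_sum_of_squares(data, 4)
print(parts)
print(total)
```

Each partition is processed without reference to the others, which is what lets a distributed engine place partitions on different worker nodes.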

We use NumPy and Pandas for matrix operations, creating algorithms that generate models fitting our deep learning or machine learning techniques. The accuracy level typically reaches 90% and above based on the data quality.

When dealing with various data types including COBOL, Excel, JSON, video, audio, and MPG files, challenges can arise with incomplete or missing values. This particularly affects GIS data accuracy, such as predicting transport routes or electrical pole placements. While we achieve 90% efficiency, working with historical data versus current data presents challenges in business growth predictions.

When encountering fault tolerance issues, we communicate directly with the Apache Spark Streaming development team through LinkedIn channels or their on-site team. They provide customer support where issues can be reported via SMS or email with the file name for solution assistance. The team helps address issues with data frames, data sets, RDD functionality, version migrations, and integration with tools such as Miniconda, Anaconda, and Node.js server.

I rate Apache Spark Streaming 9 out of 10.

Kuldeep Pal

Data Engineer at Walmart Global Tech

Aug 22, 2025

Efficient data handling empowers near-real-time fraud detection and robust recovery mechanisms

What is our primary use case?

I am an end user of Apache Spark Streaming as a developer. I use this technology to build streaming pipelines.

Our usual use case for Apache Spark Streaming is mostly on the fraud side. Whatever data is coming in, we use Apache Spark Streaming with Kafka to stream all the real-time messages, and in real-time we are trying to find out the frauds from the real-time data. It is currently near to real-time because Apache Spark Streaming works window by window.

The real-time data messages coming in are processed by Apache Spark Streaming. While we have real-time messages, which are actually near to real-time, we are able to find out frauds. If somebody is conducting a scam or fraud, we match it based on our rules.

What is most valuable?

With Apache Spark Streaming, you can have multiple kinds of windows. Depending on your use case, you can select either a tumbling window, a sliding window, or a static window. According to the use case, you can select the windows to determine how much data you want to process at a single point of time.
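To make the difference between tumbling and sliding windows concrete, here is a minimal plain-Python sketch of how an event timestamp maps to windows. It is not Spark's implementation; the function names and the epoch alignment are assumptions made for illustration. In PySpark this kind of grouping is typically expressed with the `window` function from `pyspark.sql.functions`.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # windows are aligned to this origin (an assumption)


def tumbling_window(ts, size):
    """Return the one [start, end) window of length `size` that contains ts."""
    start = ts - ((ts - EPOCH) % size)
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every [start, end) window of length `size`, spaced `slide` apart, that contains ts."""
    start = ts - ((ts - EPOCH) % slide)  # latest window start at or before ts
    windows = []
    while start > ts - size:  # the window covers ts as long as start + size > ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


# Hypothetical event timestamp, for illustration only.
event_time = datetime(2025, 1, 1, 12, 7)
tumbling = tumbling_window(event_time, timedelta(minutes=10))
sliding = sliding_windows(event_time, timedelta(minutes=10), timedelta(minutes=5))
print(tumbling)
print(sliding)
```

With a tumbling window each event belongs to exactly one window; with a sliding window whose slide is shorter than its size, the same event is counted in several overlapping windows.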

There are multiple features such as watermarking and checkpointing, which are already integrated into the solution. Processing time interval and trigger time interval are used for handling large scale data. For example, if today 100 records are coming in and tomorrow 10,000 records suddenly arrive, there could be an issue in the pipeline. It can handle this automatically if we set all the configurations of processing time interval and trigger time interval.
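The 100-to-10,000 record spike above can be illustrated with a small simulation. This is a plain-Python sketch under the assumption that each trigger processes at most a fixed number of records and leaves the rest queued; `run_micro_batches` and `max_per_trigger` are hypothetical names. Spark's Kafka source offers a comparable cap through its `maxOffsetsPerTrigger` option.

```python
def run_micro_batches(arrivals, max_per_trigger):
    """Simulate fixed-interval triggers with a per-trigger record cap.

    arrivals[i] is the number of records that arrive before trigger i.
    Each trigger processes at most max_per_trigger records; the rest wait.
    Returns one (processed, backlog_after) tuple per trigger, continuing
    until the backlog is drained.
    """
    if max_per_trigger <= 0:
        raise ValueError("max_per_trigger must be positive")
    backlog = 0
    history = []
    i = 0
    while i < len(arrivals) or backlog > 0:
        if i < len(arrivals):
            backlog += arrivals[i]
        processed = min(backlog, max_per_trigger)
        backlog -= processed
        history.append((processed, backlog))
        i += 1
    return history


# Hypothetical load: a quiet interval, a sudden spike, then quiet again.
history = run_micro_batches([100, 10_000, 100], max_per_trigger=4_000)
for trigger, (processed, backlog) in enumerate(history):
    print(trigger, processed, backlog)
```

The cap spreads the spike over several triggers instead of handing one oversized batch to the pipeline, at the cost of temporarily higher latency while the backlog drains.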

Checkpointing in Apache Spark Streaming is crucial when you have pipeline failures. If a pipeline fails, you don't know until what point your messages have been processed. Checkpointing helps in getting the offset number and everything. You can either process it from that offset number or go to the latest offset and then push a message. If your pipeline fails, you do not have to risk anything as your data is not lost.
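The recovery behavior described above can be sketched with a file that stores the last processed offset. This is a simplified plain-Python model, not Spark's checkpoint format; the helper names, the JSON layout, and the simulated crash are all assumptions for illustration.

```python
import json
import os
import tempfile


def load_offset(path):
    """Return the last committed offset, or -1 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return -1
    with open(path) as f:
        return json.load(f)["offset"]


def commit_offset(path, offset):
    """Record that every message up to `offset` has been processed."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, path)  # swap the file in one step so a crash cannot leave it half-written


def process(messages, path, handle, fail_at=None):
    """Process messages after the checkpointed offset, committing after each one."""
    start = load_offset(path) + 1
    for offset in range(start, len(messages)):
        if offset == fail_at:
            raise RuntimeError(f"simulated crash at offset {offset}")
        handle(messages[offset])
        commit_offset(path, offset)


# Hypothetical run: crash at offset 2, then restart from the checkpoint.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
messages = ["m0", "m1", "m2", "m3"]
seen = []
try:
    process(messages, checkpoint_path, seen.append, fail_at=2)
except RuntimeError:
    pass  # the pipeline "crashed" after committing offsets 0 and 1
offset_after_crash = load_offset(checkpoint_path)
process(messages, checkpoint_path, seen.append)  # the restart resumes at offset 2
print(offset_after_crash, seen)
```

Because the offset is committed after the message is handled, a crash between those two steps would replay one message on restart. That is at-least-once behavior, so downstream writes need to tolerate duplicates.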

For out-of-order data, you have the window concept plus watermarking in Apache Spark Streaming. With watermarking, you can easily handle out-of-order data or late-arriving data. We can handle these things, but it depends on your use case and windows that you have selected.
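How a watermark decides between accepting and dropping late data can be shown with a simplified model. The sketch below is plain Python, not Spark; it assumes 10-minute tumbling windows, a hypothetical 5-minute lateness allowance, and a watermark that advances after every event, whereas Spark advances its watermark between micro-batches.

```python
def count_with_watermark(event_minutes, window=10, allowed_lateness=5):
    """Count events per tumbling window, dropping events that arrive too late.

    event_minutes holds event-time minutes, listed in arrival order.
    The watermark is the largest event time seen so far minus allowed_lateness.
    An event is dropped when its window already ended at or before the watermark.
    Returns (counts_by_window_start, dropped_events).
    """
    counts = {}
    dropped = []
    max_seen = None
    for t in event_minutes:
        watermark = None if max_seen is None else max_seen - allowed_lateness
        window_start = (t // window) * window
        window_end = window_start + window
        if watermark is not None and window_end <= watermark:
            dropped.append(t)  # the window was already finalized
        else:
            counts[window_start] = counts.get(window_start, 0) + 1
        max_seen = t if max_seen is None else max(max_seen, t)
    return counts, dropped


# Hypothetical arrival order: minute 8 is slightly late, minute 9 is very late.
counts, dropped = count_with_watermark([1, 4, 12, 8, 31, 9])
print(counts)
print(dropped)
```

The event at minute 8 arrives out of order but is still counted because its window has not passed the watermark. The event at minute 9 arrives after an event from minute 31 has pushed the watermark past the end of the 0 to 10 window, so it is discarded. A longer lateness allowance keeps more late data at the cost of holding window state longer.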

What needs improvement?

The new feature coming in Apache Spark Streaming 4 is continuous streaming. If continuous streaming becomes stable and performs comparably to Flink, then Apache Spark Streaming would be preferred everywhere due to its good maintenance and support system.

While it is reliable, there are some issues with Apache Spark Streaming as it is not 100% reliable. Sometimes it fails, requiring numerous configurations such as checkpointing, watermarking, and other features. If you select a 10-minute window and the data arrives at the 30th minute, it sometimes loses data in between. You also have to apply back pressure when numerous messages are coming in. It requires constant monitoring and maintenance. I would say it is 90-95% reliable, but multiple configurations and frequent maintenance make it slightly less reliable.

The continuous deployment feature being in beta phase could benefit everyone if released earlier.

For how long have I used the solution?

I have been working with Apache Spark Streaming for close to three years.

What was my experience with deployment of the solution?

There weren't many challenges. Most issues were handled, with only some minor technical challenges here and there.

What do I think about the stability of the solution?

The technical challenges were mostly on the failure side when something fails in Apache Spark Streaming. On the deployment side, there weren't many issues. We faced configuration issues when we had to scale down or scale up the pipeline, requiring proper configuration settings.

What do I think about the scalability of the solution?

It is scalable and can be used for any amount of data.

How was the initial setup?

I handled the initial setup and deployment. We are currently running it on Dataproc.

Which other solutions did I evaluate?

I have used Dataflow but it was very costly, so we didn't continue with it.

We had options including Flink and Apache Beam. We chose Apache Spark Streaming because of its good ecosystem. Every developer would know Spark and can work on it. Otherwise, you have to hire a separate developer, which is another challenge for a team and company.

What other advice do I have?

Apache Spark Streaming helps in processing real-time data messages. While it's near to real-time rather than true real-time, we are able to identify frauds based on our matching rules.

We learned from the documentation available on the internet.

It is currently free as it is an open source system. You can download and use it.

On a scale of 1-10, I rate Apache Spark Streaming an 8.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google

Khoa Dang Le

Principal AI Engineer at IMT Solutions

Sep 26, 2025

Have faced challenges with complex data handling and seek smoother integration for machine learning workflows

What is our primary use case?

We work with Apache Spark Streaming for our project because we use that as one of the landing data sources, and we work with it to ensure we can get all of the data before it goes through our data warehouse. Apache Spark is one of the solutions we work with for that, especially for applying machine learning to our data.

We use Apache Spark Streaming for real-time processing, and we utilize MLlib for machine learning. Those are two components we use for that.

I find the setup of Apache Spark Streaming straightforward. I run it on the local machine. The complexity doesn't stem from our end but rather from how we integrate data. Using SQL with Apache Spark can be quite easy, and that's all I engage with regarding Apache Spark Streaming without complex issues.

What is most valuable?

I find the fault tolerance feature beneficial because I use it for serving data from a landing area. I understand all of the structures we have for Spark SQL, Spark Streaming, and MLlib.

The ability of Apache Spark Streaming to handle out-of-order data using watermarking and windowing is something we use in our pipeline. Nearly 50% of our usage is based on that because we use it for landing data, and we appreciate that we can work with it.

The main benefits of Apache Spark Streaming include cost savings, time savings, and efficiency improvements about data storage. The fast storage capability is crucial because Apache Spark replaces Hadoop's MapReduce, allowing us to manage our data more efficiently.

What needs improvement?

One of the improvements we need is in Spark SQL and the machine learning library. I don't think there is too much to work on, but the issue is when we want to use machine learning, we always need to retrain MLlib in Apache Spark for it to run their pipeline.

They cannot change the transformational machine learning aspect significantly. The problem is we need to use it in a certain manner. After that, we need to apply another pipeline for the machine learning processes, and that's what we work on.

For how long have I used the solution?

I have used Apache Spark Streaming for about three years now.

What do I think about the stability of the solution?

Stability is what we aim for with Spark SQL, and we have had no stability issues.

What do I think about the scalability of the solution?

I find Apache Spark Streaming to be scalable. We factor in our client projects, and we don't change the driver configurations; we focus on needing servers with adequate RAM and storage capacity to serve every data request. The problem doesn't lie with scaling because we have enough tools for that.

How are customer service and support?

Regarding Apache's technical support for Apache Spark Streaming, I have always just worked with the driver behind it. We discuss everything we want to do or manage in our database or storage.

How would you rate customer service and support?

Negative

Which solution did I use previously and why did I switch?

Before using Apache Spark Streaming, I worked with Kafka and Hadoop. Those are two solutions I have experience with, but before that, I utilized Informatica and its features, and we used everything before I started using Apache Spark.

How was the initial setup?

I find the setup of Apache Spark Streaming straightforward. I didn't face challenges during the setup; I run it on the local machine.

Which other solutions did I evaluate?

I prefer Apache Spark Streaming over Informatica or Hadoop technologies because it is really fast and can retrieve data efficiently. The key difference is that I always have the ongoing challenge of ensuring we have our data in our database, which is tough to manage because of the vast amounts of big data we deal with.

What other advice do I have?

One thing I would share with other organizations considering Apache Spark Streaming is the necessity of having effective data storage. We want to ensure we acquire and manage our data storage effectively.

I would always recommend using Apache Spark Streaming as a solution for managing our entire data pipeline.

On a scale of 1-10, I rate Apache Spark Streaming an 8.

Which deployment model are you using for this solution?

On-premises

Shahzad Munir

Sr. Manager Data Engineer at a tech consulting company with 51-200 employees

Aug 25, 2025

Effectively processes network data with micro-batching but struggles with monitoring and file handling

What is our primary use case?

I work for a telecommunication company where we process network data in near real-time. All of our use cases revolve around that.

What is most valuable?

I appreciate Apache Spark Streaming's micro-batching capabilities. The watermarking functionality and related features are quite good, though I do notice some gaps.

What needs improvement?

With Apache Storm, which is also a real-time processing engine, and with Flink, there was the ability to append to small data files. With those solutions, you can append to the same file until reaching a certain limit, then start writing to another file. I miss this feature in Apache Spark Streaming. From an architecture point of view, it's not possible for Apache Spark Streaming, but this is a feature I really miss compared to Flink or Apache Storm.
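The append-then-roll behavior described here can be pictured with a small sketch. It is plain Python, not Storm, Flink, or Spark code; the class name `RollingFileWriter`, the `part-NNNNN.txt` naming, and the 20-byte limit are assumptions made up for the illustration.

```python
import os
import tempfile


class RollingFileWriter:
    """Append records to one file until it would exceed max_bytes, then roll over.

    Hypothetical helper for illustration only; not an API of any streaming engine.
    """

    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes
        self.index = 0

    def _path(self):
        return os.path.join(self.directory, f"part-{self.index:05d}.txt")

    def write(self, record):
        data = (record + "\n").encode("utf-8")
        path = self._path()
        size = os.path.getsize(path) if os.path.exists(path) else 0
        # Roll to a new file only if the current one is non-empty and would overflow.
        if size > 0 and size + len(data) > self.max_bytes:
            self.index += 1
            path = self._path()
        with open(path, "ab") as f:
            f.write(data)
        return path


def demo():
    # Five 8-byte records with a 20-byte limit: two records fit per file.
    with tempfile.TemporaryDirectory() as d:
        writer = RollingFileWriter(d, max_bytes=20)
        for i in range(5):
            writer.write(f"event-{i}")
        return sorted(os.listdir(d))
```

Calling `demo()` writes two records into each of the first two files and the fifth record into a third file, which is the "fill up, then start another file" pattern the reviewer misses.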

Monitoring is an area where they could definitely improve Apache Spark Streaming. When you have a streaming application, it generates numerous logs. After some time, the logs become meaningless because they're quite large and impossible to open. Monitoring how the streaming is progressing, including record rejection rates, failures, and successes, is crucial. The rejected records are most critical, being those that cannot be compacted or processed in the streaming line. While monitoring features exist in the Spark UI with graphs showing input limits, incoming rates, and output rates, accessing this information requires navigating through Spark's UI. Furthermore, if the Spark application runs for an extended period, the UI becomes inaccessible, making it impossible to monitor your Apache Spark Streaming application.

Another significant missing feature is the handling of slowly changing dimensions. When dealing with big data in Apache Spark Streaming, there are two different types of datasets: static data and streaming data. Apache Spark Streaming doesn't provide a way to automatically update static data when joining it with streaming data. For example, if you have customer data as static data and network data as streaming data, the application starts consuming network data but loads customer data from a previous snapshot. After 24 hours, Apache Spark Streaming cannot reload the customer data independently. The application must be stopped and restarted to consume the latest customer snapshot for joining with streaming data.
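To make the gap concrete, here is a sketch of the kind of periodic reload the reviewer would like to have. It is a plain-Python model and not a Spark feature; `RefreshingSnapshot`, `enrich`, the 24-hour TTL, and the injected clock are all hypothetical names and values chosen for the example.

```python
import time


class RefreshingSnapshot:
    """Hold a static lookup table and reload it once it is older than ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = loader()
        self._loaded_at = clock()

    def get(self):
        # Reload lazily, on the first access after the snapshot has expired.
        if self._clock() - self._loaded_at >= self._ttl:
            self._data = self._loader()
            self._loaded_at = self._clock()
        return self._data


def enrich(events, snapshot):
    """Join each (customer_id, bytes_used) event with the current customer snapshot."""
    customers = snapshot.get()
    return [(cid, customers.get(cid, "unknown"), used) for cid, used in events]


def demo():
    now = [0.0]  # fake clock so the example does not have to wait 24 hours
    versions = iter([{1: "alice"}, {1: "alice", 2: "bob"}])
    snapshot = RefreshingSnapshot(lambda: next(versions), ttl_seconds=86400,
                                  clock=lambda: now[0])
    before = enrich([(2, 500)], snapshot)
    now[0] = 86400.0  # a day later, a newer customer snapshot is available
    after = enrich([(2, 500)], snapshot)
    return before, after
```

In the model, customer 2 is unknown on day one and resolved after the snapshot expires, without restarting anything. Whether and how this maps onto a given Spark deployment is outside what the review states.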

For how long have I used the solution?

I have been using Apache Spark Streaming for approximately six to seven years.

What was my experience with deployment of the solution?

In the past, deployment was quite difficult. Now it's quite simple. On AWS, you have managed Apache Spark Streaming instances such as EMR over EKS or EMR over EC2. You can simply click and spin up those instances to have the Apache Spark Streaming cluster running. In the past, setting up Hadoop and various Hive files was necessary before running Apache Spark Streaming. With these cloud-managed solutions, it's easier not just for Apache Spark Streaming but for many other applications as well.

What do I think about the stability of the solution?

Maintenance requirements exist for Apache Spark Streaming. For example, if you're running on EMR over EKS, your cluster running on EKS needs to be maintained. Apache Spark Streaming is an engine with serverless options available. If somebody chooses the serverless option, maintenance requirements are reduced, but it has its own limitations. This is why people often choose other solutions such as EMR, EKS, or Azure Databricks.

Apart from community support, there isn't any Apache support available with Apache Spark Streaming. If you're using a managed solution from AWS, AWS becomes your contact point. If you're using Databricks, then Databricks is your contact point.

What do I think about the scalability of the solution?

Scaling with Apache Spark Streaming is fine as it's a distributed system. The latest versions have introduced features such as adaptive query execution, which handles data skew automatically. When it comes to scalability, it depends on your data. Adding multiple compute engines or executors isn't an issue. However, determining how to effectively combine data with Apache Spark Streaming is an art that depends on your specific dataset.

What's my experience with pricing, setup cost, and licensing?

We use EMR over EKS, which is a managed Apache Spark Streaming solution from AWS. I'm not using Databricks, so I cannot speak to their pricing. With EMR over EKS, we pay for the service with Apache Spark bundled in. On AWS, we cannot specifically determine how much Apache Spark Streaming is costing us.

What other advice do I have?

I'm a user of Apache Spark Streaming. I would give Apache Spark Streaming a rating of seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Venkata Phaneendra Reddy Janga

Data Engineer III at a tech consulting company with 10,001+ employees

Aug 18, 2025

Improved data latency and integration with diverse data sources enables robust real-time processing

What is our primary use case?

We have used Apache Spark Streaming for ingestion from streaming sources such as Apache Kafka. Another use case is for building a Customer 360, specifically C360 in real-time. Additionally, we used it to build a real-time feature store to build features that will be used for machine learning models and AI models.

When building this customer profile with Apache Spark Streaming, we create a micro-batch of one minute for customer profiling. For example, we track email changes, contact changes, and address changes. This approach is near-real-time due to the micro-batch.

By using Apache Spark Streaming, data freshness has improved and latency has decreased significantly. Earlier we used to run 24-hour batches, and now it is less than one minute, allowing us to communicate any customer changes, such as address or email, to downstream systems such as Adobe Experience Platform for marketing within one minute, capturing all changes in near real-time.

We have integrated Apache Spark Streaming with Google's Cloud Storage (GCS) and Google BigQuery. Additionally, we have integrated with native HDFS and Hive as well.

What is most valuable?

The best feature of Apache Spark Streaming is that it's built upon the Spark SQL engine. This makes it easy for someone coming from a SQL background to work with real-time data, even if they are new to real-time processing. They can quickly get started using the Spark SQL engine.

Another valuable feature is that we can control many aspects, such as the configuration of the engine and memory management, and there is a checkpointing mechanism that allows us to manually start or restart jobs from a specific point. This is particularly useful for restoring messages of a Kafka topic from a specific date and time using the checkpointing mechanism.
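The idea behind restarting from a checkpoint can be shown with a deliberately small model. This is plain Python, not Spark's checkpoint format (which tracks more than a single offset); the file name `offsets.json`, the `next_offset` key, and the helper names are assumptions for the sketch.

```python
import json
import os
import tempfile


def load_offset(path):
    """Return the next offset to read, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_offset"]


def save_offset(path, next_offset):
    """Write the checkpoint via a temp file so a crash never leaves it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_offset": next_offset}, f)
    os.replace(tmp, path)


def run_batch(messages, checkpoint_path, batch_size):
    """Process one micro-batch starting at the checkpointed offset."""
    start = load_offset(checkpoint_path)
    batch = messages[start:start + batch_size]
    processed = [m.upper() for m in batch]  # stand-in for real processing
    save_offset(checkpoint_path, start + len(batch))
    return processed


def demo():
    messages = ["a", "b", "c", "d", "e"]
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "offsets.json")
        first = run_batch(messages, ckpt, batch_size=2)
        # Simulated restart: nothing is kept in memory, only the checkpoint file.
        second = run_batch(messages, ckpt, batch_size=2)
        return first, second, load_offset(ckpt)
```

Because the position lives in the file rather than in the process, the second call picks up exactly where the first one stopped. Editing the stored offset by hand is the model's equivalent of restarting a job from a chosen point.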

The integration with Spark's ecosystems such as MLlib and GraphX has significant potential, although I have not worked on that part as we focus mainly on data engineering.

We can handle late-arriving data with Apache Spark Streaming. Sometimes aggregation results might be missed if data arrives out of order, but features such as windowing allow us to manage out-of-order data by specifying a watermark time. Recently released mechanisms to query the state make it easier to handle data programmatically.
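How a watermark decides which late events still count can be sketched in a few lines. This is a simplified plain-Python model: it updates the watermark after every event, whereas a micro-batch engine would typically advance it between batches, and the function name and parameters are illustrative assumptions.

```python
from collections import defaultdict


def windowed_counts(events, window_seconds, watermark_delay):
    """Count events per tumbling event-time window, dropping events that are too late.

    events: event-time timestamps in seconds, listed in arrival order.
    The watermark trails the largest event time seen so far by watermark_delay.
    An event is dropped when its window ended at or before the current watermark.
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = float("-inf")
    for ts in events:
        watermark = max_event_time - watermark_delay
        window_start = (ts // window_seconds) * window_seconds
        window_end = window_start + window_seconds
        if window_end <= watermark:
            dropped.append(ts)
        else:
            counts[window_start] += 1
        max_event_time = max(max_event_time, ts)
    return dict(counts), dropped


# Timestamp 3 arrives out of order but within the delay, so it is counted.
# Timestamp 4 arrives after the watermark has passed its window, so it is dropped.
result = windowed_counts([1, 12, 3, 25, 4], window_seconds=10, watermark_delay=5)
```

With these inputs the window starting at 0 ends up with two events, the windows starting at 10 and 20 with one each, and the final event is rejected as too late. A larger delay keeps more late data at the cost of holding window state longer.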

What needs improvement?

One improvement I would expect is real-time processing instead of micro-batch or near real-time. Frameworks such as Apache Beam or Apache Flink process data in real-time, and integrating this capability into Apache Spark Streaming would be beneficial.

Another improvement could be in the job stopping process. In the DataProc environment, we have to place a file for the Apache Spark Streaming job to detect and stop gracefully. A feature to stop the job directly from the UI would be helpful.
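The marker-file approach mentioned here boils down to checking for a file between batches. The sketch below is plain Python with made-up names (`run_until_marker`, a file called `STOP`); it models the pattern and is not the reviewer's actual job code.

```python
import os
import tempfile


def run_until_marker(batches, marker_path):
    """Process batches one at a time and stop cleanly once the marker file exists.

    The marker is checked only between batches, so a batch in flight always finishes.
    """
    done = []
    for batch in batches:
        if os.path.exists(marker_path):
            break
        done.append(sum(batch))  # stand-in for real batch processing
    return done


def demo():
    with tempfile.TemporaryDirectory() as d:
        marker = os.path.join(d, "STOP")

        def batches():
            yield [1, 2]
            yield [3, 4]
            # An operator drops the marker file while the job is running.
            open(marker, "w").close()
            yield [5, 6]

        return run_until_marker(batches(), marker)
```

The first two batches complete and the third is never started. In a Structured Streaming job, a similar check could sit in a loop that alternates `query.awaitTermination(timeout)` with a test for the file and then calls `query.stop()`, though the exact wiring depends on the deployment.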

For how long have I used the solution?

I have worked with Apache Spark Streaming for over one year. The last time I worked with it was eight months ago.

What was my experience with deployment of the solution?

Setting up Apache Spark Streaming is not difficult because there is good documentation on their website. If we follow it step-by-step, challenges won't arise; it's pretty straightforward.

What do I think about the stability of the solution?

We have not experienced downtime with Apache Spark Streaming, but we have had crashes. Sometimes the state memory keeps piling up, necessitating tuning. We had crashes, but not very often, and we were able to resolve these issues by checking load times.

What do I think about the scalability of the solution?

Apache Spark Streaming is extremely scalable. We used it in production to process around 120 million events in a day. In seconds, we used to get 120-130 events. It's a very good framework for scalability.

How are customer service and support?

The community around Apache Spark Streaming is great. Compared to other frameworks, Apache Spark has the best community with major committers from Databricks. You have a very good community for Apache Spark Streaming.

How would you rate customer service and support?

Which solution did I use previously and why did I switch?

Before Apache Spark, I hadn't worked on anything in real-time data processing. Apache Spark introduced me to real-time data processing. After Apache Spark, I worked with Apache Beam, which I'm currently working on.

When we initially thought of doing real-time processing, we considered Apache Spark Structured Streaming, Flink, and Apache Beam. We chose Apache Spark because it has many Python contributions, and our team consists mostly of Python engineers. Apache Spark was also more mature compared to Flink and Apache Beam at that point in time.

How was the initial setup?

Setting up Apache Spark Streaming is not difficult, as there is good documentation on their website on how to get started. Following it step-by-step, challenges won't arise; it's pretty straightforward.

What about the implementation team?

We hosted Apache Spark Streaming on GCP, using Cloud DataProc for running the Spark cluster. We deployed using GCloud commands or had our Python Airflow DAG for deployment, starting and stopping jobs.

What was our ROI?

By integrating Apache Spark Streaming, data freshness and latency have significantly improved, from 24-hour batch processing to less than one minute. This improvement facilitated faster communication to downstream systems, aiding marketing campaigns.

What's my experience with pricing, setup cost, and licensing?

Apache Spark Streaming is completely open source, and there are no license fees.

Running on the managed clusters using DataProc, we leverage Google's best practices for cluster management, scaling, and autoscaling. Cost depends on Google's feature set, although, on-premises, monitoring would be necessary for optimizing resource use.

Which other solutions did I evaluate?

For real-time processing, Apache Spark is a good start due to its strong open source community and contributions. However, for serverless processing, exploring other frameworks might be beneficial.

What other advice do I have?

I would suggest Apache Spark for stream processing to teams that want to manage clusters on their own. For serverless options, exploring other use cases could be beneficial. Apache Spark is a good starting point, considering its strong open source community contributions.

We have not experienced downtime with Apache Spark Streaming, but we have had crashes. Sometimes the state memory keeps piling up, so we have to do some tuning. We had crashes, but not very often. We were able to check specific load times and resolve those issues.

On a scale of 1-10, I rate Apache Spark Streaming an 8.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google

Ajay Hiremath

Gen AI Lead/Architect at Alvaria

Sep 9, 2025

Handles real-time data transfers during projects but struggles with high column datasets

What is our primary use case?

My use cases for Apache Spark Streaming were during my academics. During that time, I used Apache Spark Streaming to transmit data live from one source to another.

What is most valuable?

For Apache Spark Streaming, the feature I appreciated most is that it provides live data delivery. Additionally, it provides the capability to send a larger amount of data in parallel.

What needs improvement?

I believe the downsides of Apache Spark Streaming are that it primarily supports structured data. Currently, in my organization, we require thousands of transcripts that need to be handled during live conversations. If Apache Spark Streaming allowed all unstructured data to transfer, that would be a really great use case.

For how long have I used the solution?

I have been using Apache Spark Streaming for just about one to two years.

What do I think about the stability of the solution?

Regarding stability in Apache Spark Streaming, I recall one major drawback from my academics, when I was working on a really large dataset. It easily transforms the data when it has thousands of columns or millions of rows. However, the downside is that when the shape is the other way around, with a very large number of columns, it becomes really hard to use.

What do I think about the scalability of the solution?

I believe Apache Spark Streaming has auto-scaling; Apache handles auto-scaling on its own. In the big data ecosystem, the worker and master nodes might have the auto-scaling feature, though I am not entirely certain about this.

How was the initial setup?

The initial deployment of Apache Spark Streaming was actually easy because I had experience with different tools. While I cannot recall exactly, I used Facebook Flume to transfer the data live. Then, using Apache Spark Streaming, I transformed this data.

Which other solutions did I evaluate?

For big data use cases, I have not used any alternatives to Apache Spark Streaming. I started with Apache Spark Streaming and normal Spark.

What other advice do I have?

The product I discussed was Apache Spark Streaming.

For academics, I used the free version.

Regarding maintenance, I do not think it requires any maintenance on my part.

I rate Apache Spark Streaming a seven out of ten.

Oscar Estorach

Chief Data Strategist And Director at theworkshop.es

Jan 25, 2024

Versatile and flexible when dealing with large-scale data streams

What is our primary use case?

As a data engineer, I use Apache Spark Streaming to process real-time data for web page analytics and integrate diverse data sources into centralized data warehouses.

What is most valuable?

What I like about Spark is its versatility in supporting multiple languages and that makes it my preferred choice for building scalable and efficient systems, whether it is hooking databases with web applications or handling large-scale data transformations.

Apache Spark Streaming is versatile. You can use it for competitive intelligence, gathering data from competitors, or for internal tasks like monitoring workflows. It works well in the cloud, and you can structure data using Databricks or Spark, providing flexibility for different projects.

Spark Streaming's flexibility shines when dealing with large-scale data streams. It caters to different needs, offering real-time insights for tasks like online sales analytics. The ability to prioritize data streams is valuable, especially for monitoring competitor prices online.

What needs improvement?

In terms of improvement, the UI could be better. Additionally, Spark Streaming works well for various use cases, but improvements could be made for ultra-fast scenarios where seconds matter. While some business processes require real-time data every second, not all projects demand such speed. For instance, batch processing, short intervals for competitive intelligence, or operational intelligence actions might not need sub-second precision. Streaming is versatile but needs careful consideration based on the specific use case and problem at hand.

For how long have I used the solution?

I have been working with Apache Spark Streaming for almost eight years.

What do I think about the scalability of the solution?

It is a fairly scalable solution.

How was the initial setup?

Setting up Apache Spark Streaming is straightforward and it involves deploying simple statements. The process is not complex and can be done easily.

What's my experience with pricing, setup cost, and licensing?

Apache Spark Streaming is affordable.

What other advice do I have?

For those starting with Apache Spark Streaming, I recommend studying and understanding data relationships. While it might seem complex at first, there are helpful resources available. Overall, I would rate Apache Spark Streaming as a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

AbhishekGupta

Engineering Leader at Walmart

Oct 8, 2022

Easy integration, beneficial auto-scaling, and good open-sourced support community

What is our primary use case?

We have built services around Apache Spark Streaming. We use it for real-time streaming use cases. There are many last-minute delivery use cases. We are trying to build on Apache Spark Streaming, but the latency has to be better.

How has it helped my organization?

The solution has helped our organization with easy integration.

What is most valuable?

Apache Spark Streaming's most valuable feature is near real-time analytics. The developers can build APIs easily for a code-streaming pipeline. The solution has an ecosystem of integration with other stock services.

What needs improvement?

The service structure of Apache Spark Streaming can improve. There are a lot of issues with memory management and latency. There is no real-time analytics. We recommend it for use cases where a five-second latency is acceptable, but not for millisecond-latency, IoT-based, or anomaly-detection use cases. Flink as a service is much better.

Apache Spark Streaming does not have auto-tuning. A customer needs to invest a lot, in terms of management and maintenance.

For how long have I used the solution?

I have been using Apache Spark Streaming for more than six months.

What do I think about the stability of the solution?

Apache Spark Streaming is stable.

What do I think about the scalability of the solution?

Apache Spark Streaming is scalable. There are good auto-scale features.

We have approximately 1,000 users using the solution.

This is a heavily used solution.

How are customer service and support?

Apache Spark Streaming has a good open-source support community.

Which solution did I use previously and why did I switch?

I have used Apache Storm, Apache Flume, and Flink. Each of the solutions has use cases where it works better.

How was the initial setup?

The initial setup of Apache Spark Streaming is complex.

What was our ROI?

We have received a return on investment.

What's my experience with pricing, setup cost, and licensing?

People pay for Apache Spark Streaming as a service.

What other advice do I have?

There are 18 to 20 people needed for maintenance with our 1,000 users.

My advice to others is they need to fine-tune their job so the testing becomes important. Fine-tuning becomes important. It would be beneficial to have consulting or some background in the solution before using it.

I rate Apache Spark Streaming an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

Prashast Tripathi

Data Engineer at a comms service provider with 201-500 employees

Jul 24, 2023

A robust solution that is configurable based on one's requirements with features like checkpointing and API

What is our primary use case?

The solution has industry-related use cases, with orders flowing from the order management system. We use Apache Spark Streaming to collect and store these orders in our database.

How has it helped my organization?

Before the introduction of Apache Spark Streaming, we primarily relied on cloud-related tools. Apache Spark Streaming is a robust solution configurable based on our requirements. The batches we deal with are very use-case-specific, and we tune those batches accordingly.

What is most valuable?

Apache Spark Streaming has features like checkpointing and Streaming API that are useful.

What needs improvement?

Apache Spark Streaming lacks native integration of some libraries for cost and load-related optimizations. These optimizations are areas where the tool falls short and needs improvement.

For how long have I used the solution?

I have been using Apache Spark Streaming for a year. We use Apache Spark Streaming 3 in our company.

What do I think about the stability of the solution?

The solution is stable.

I rate the solution's stability an eight out of ten.

What do I think about the scalability of the solution?

The solution's scalability is very good. We have more than ten teams with around 100 consumers.

We have some alternative solutions.

I rate the solution's scalability a nine out of ten.

How are customer service and support?

We have not been in touch with Apache's support team. For support, we use the information available to the public.

Which solution did I use previously and why did I switch?

We have used Apache NiFi before in my company.

How was the initial setup?

I rate the initial setup a five on a scale from one to ten, where one is difficult, and ten is easy.

Apache Spark Streaming's deployment usually takes two to three minutes.

The solution deployment is a fully automated process, and we have a CI/CD process in place, so we trigger Jenkins Pipeline for the deployment.

The solution is deployed on a hybrid cloud.

The deployment can be done with just one click, so not even a person is needed for deployment.

What's my experience with pricing, setup cost, and licensing?

On a scale from one to ten, where one is expensive, or not cost-effective, and ten is cheap, I rate the price a seven.

What other advice do I have?

Apache Spark Streaming has very specific use cases and needs to be evaluated based on the needs of an individual before choosing it.

Overall, I rate the solution an eight out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud

Daleep R

Chief Technology Officer at Teslon Technologies Pvt Ltd

Jun 8, 2023

Easy deployment as a cluster and good documentation

What is our primary use case?

We used Spark and Spark Streaming, as well as Spark ML, for multiple use cases, particularly streaming IoT-related data. Additionally, we applied Spark ML for various machine learning algorithms on the streaming data, mainly in the healthcare space. So, primarily in the healthcare domain.

What is most valuable?

With Spark Streaming, there was native Python support, which was beneficial for us. It was easy to deploy as a cluster, and the website was user-friendly. The documentation was also pretty good, and there was strong community support. Overall, it was considered an industry standard at the time.

What needs improvement?

In terms of disadvantages, it was a bit cumbersome due to its size. It wasn't quite cloud-native back then, meaning it wasn't easy to deploy it in a Kubernetes cluster and similar environments. I found it a bit challenging, but I'm not sure if that's still the case now. It probably has better support.

It was on-prem, and when we wanted to migrate it to the cloud, especially on Kubernetes, I remember facing some difficulties in successfully migrating the system.

For how long have I used the solution?

I explored it as part of a pilot project some time ago. We were using Spark Streaming, and I explored Pulsar as a replacement for Spark Streaming for that use case. Overall, I've used Spark Streaming for around five years or so.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

Scalability is pretty good. However, I must mention that I haven't tested it extensively with large-scale production scenarios. The testing I conducted was more of a pilot nature, and the scale was not very high. But based on what I've read, scalability shouldn't be an issue.

In the pilot project, there were around a thousand users. I didn't encounter any issues while scaling to that level.

How are customer service and support?

I mainly relied on the documentation and community support. There was sufficient support available for me during various times. I didn't actually contact Apache for any support-related activities.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Spark Streaming was more widely used and had better documentation. It had frequent releases and active development compared to Storm, which had limited language support and stopped active development at some point. Spark Streaming also had top-level consultant support, which was beneficial for the team I was working with. That's why I made a switch.

How was the initial setup?

It was easy to install. I didn't find any difficulties while installing and trying it out, at least on a smaller scale.

Apache Spark Streaming was straightforward in terms of maintenance. It was actively developed, and migrating from an older to a newer version was quite simple. That was the main aspect of maintenance, and overall, it was a straightforward process. The documentation was good, and there was good community support. So I didn't face any problems while deploying and maintaining the solution.

What's my experience with pricing, setup cost, and licensing?

I was using the open-source community version, which was self-hosted. I'm not familiar with the pricing of the commercial version.

Which other solutions did I evaluate?

I had previously used Apache Storm, which is an open-source solution. I later switched to Spark Streaming and also tried Pulsar for similar use cases in the healthcare domain.

What other advice do I have?

I would highly recommend Spark Streaming for standard streaming or IoT use cases. The entire Spark ecosystem, including Spark Core, streaming, ML, and other components, can be highly beneficial. It's better to stick with the Spark ecosystem rather than use other platforms and frameworks. For streaming and IoT, Spark Streaming is a great choice.

Overall, I would rate the solution an eight out of ten. The only issue I found, at least during the time I actively worked with it, was that it was resource-intensive, even for small-scale applications. In comparison, some other platforms, like Pulsar, had lighter resource consumption and, at least to begin with, performed better in terms of resource usage and the associated dollar cost. Spark Streaming is a bit heavy and resource-intensive to begin with, which is why I rate it an eight.

Which deployment model are you using for this solution?

On-premises

Title	Rating	Mindshare	Recommending	Interviews
Databricks	4.1	7.9%	96%	94
Qlik Talend Cloud	4.0	3.1%	89%	56
