
Azure Data Factory vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 19, 2024

 

Categories and Ranking

Azure Data Factory
Ranking in Data Integration: 1st
Average Rating: 8.0
Reviews Sentiment: 6.9
Number of Reviews: 90
Ranking in other categories: Cloud Data Warehouse (3rd)

StreamSets
Ranking in Data Integration: 15th
Average Rating: 8.4
Reviews Sentiment: 7.0
Number of Reviews: 21
Ranking in other categories: none
 

Mindshare comparison

As of April 2025, the mindshare of Azure Data Factory in the Data Integration category is 9.5%, down from 12.7% the previous year. The mindshare of StreamSets is 1.6%, up from 1.3% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Joy Maitra - PeerSpot reviewer
Facilitates seamless data pipeline creation with good analytics and thorough monitoring
Azure Data Factory is a low-code, no-code platform, which is helpful. It provides many prebuilt functionalities that assist in building data pipelines. It also facilitates easy transformation, with all the functionality required for analytics. Furthermore, it connects to different sources out of the box, making integration much easier. The monitoring is very thorough, though a more readable version would be appreciated.
Nantabo Jackie - PeerSpot reviewer
Simplified pipelines and helped us break down data silos within our organization
The design experience when implementing batch, streaming, or ETL pipelines is very easy and straightforward. When we initially attempted to integrate StreamSets with Kafka, it was somewhat challenging until we consulted the documentation, after which it became straightforward. We use StreamSets to move data into modern analytics platforms. Moving the data into modern analytics platforms is still complex. It requires a lot of understanding of logic.

StreamSets enables us to build data pipelines without knowing how to code. StreamSets' ability to build data pipelines without requiring us to know complex programming is very important, as it allows us to focus on our projects without spending time writing code.

StreamSets' Transformer for Snowflake is simple to use for designing both simple and complex transformation logic. It is extremely important to me, as it helps me connect external data sources and keep my internal workflow organized. Transformer for Snowflake's functionality is a perfect ten out of ten. It is important and cost-effective that Transformer for Snowflake is a serverless engine embedded within the platform, as without this feature, it would be very expensive. This feature helps us to sell at lower budget costs, which would otherwise be at a high cost with other servers.

StreamSets has helped improve our organization. StreamSets simplified pipelines for our organization. It is easier to complete a project when we know where and how to start, and working with the team remotely makes it more efficient. This helps us to save time and be more organized when creating data pipelines. Being a structured company that produces reliable resources for our application benefits both our clients and contacts.

StreamSets' built-in data drift resilience plays a part in our ETL operations. With prior knowledge, the built-in data drift resilience is very effective, but it can be challenging to implement without that preexisting knowledge. The built-in data drift resilience reduced the time it takes us to fix data drift breakages by 45 percent.

StreamSets helped us break down data silos within our organization. Using StreamSets to break down data silos enabled us to be confident in the services and products we provide, as well as the real-time streaming we offer. This has had a positive impact on our business, as it allowed us to accurately determine the analytics we need to present to stakeholders, clients, and our sources while ensuring that the process is secure and transparent.

StreamSets saved us time because anyone can use it, not just developers. We can save around 40 percent of our time. StreamSets' reusable assets helped us reduce workload by around 25 percent. StreamSets saved us money by not having to hire developers with specialized skills. We saved around $2,000 US.

StreamSets helped us scale our data operations. Since StreamSets makes it easy to scale our data operations, it enabled us to know exactly where to start at any time. We are aware of the timeline for completing the project, and depending on our familiarity with the software, we can come up with a solution quickly.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The most valuable features are data transformations."
"The solution includes a feature that increases the number of processors used which makes it very powerful and adds to the scalability."
"The most valuable feature is the copy activity."
"The trigger scheduling options are decently robust."
"What I like best about Azure Data Factory is that it allows you to create pipelines, specifically ETL pipelines. I also like that Azure Data Factory has connectors and solves most of my company's problems."
"Data Flow and Databricks are going to be extremely valuable services, allowing data solutions to scale as the business grows and new data sources are added."
"It is easy to deploy workflows and schedule jobs."
"Data Factory's best feature is the ease of setting up pipelines for data and cloud integrations."
"The entire user interface is very simple and the simplicity of creating pipelines is something that I like very much about it. The design experience is very smooth."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."
"It is really easy to set up and the interface is easy to use."
"The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customize it to do what you need. Many other tools have started to use features similar to those introduced by StreamSets, like automated workflows that are easy to set up."
"The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"StreamSets' data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, and then we needed to spend about a week fixing them. Right now, we are saving one to two weeks. Though it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
 

Cons

"There are performance issues, particularly with the underlying compute, which should be configurable."
"You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats."
"The setup and configuration process could be simplified."
"Real-time replication is required, and this is not a simple task."
"The number of standard adaptors could be extended further."
"Data Factory would be improved if it were a little more configuration-oriented and not so code-oriented and if it had more automated features."
"When working with AWS, we have noticed that the difference between ADF and AWS is that AWS is more customer-focused. They're more responsive compared to any other company. ADF is not as good as AWS, but it should be. If AWS is ten out of ten, ADF is around eight out of ten. I think AWS is easier to understand from the GUI perspective compared to ADF."
"The solution should offer better integration with Azure machine learning. We should be able to embed the cognitive services from Microsoft, for example as a web API. It should allow us to embed Azure machine learning in a more user-friendly way."
"One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infrastructure."
"We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered."
"I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."
"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there."
"The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"StreamSets works great for batch processing, but we are looking for something that is more real-time. We need latency in numbers below milliseconds."
 

Pricing and Cost Advice

"Pricing appears to be reasonable in my opinion."
"ADF is cheaper compared to AWS."
"The licensing model for Azure Data Factory is good because you won't have to overpay. Pricing-wise, the solution is a five out of ten. It was not expensive, and it was not cheap."
"It's not particularly expensive."
"Our licensing fees are approximately 15,000 ($150 USD) per month."
"Understanding the pricing model for Data Factory is quite complex."
"The solution's fees are based on a pay-per-minute use plus the amount of data required to process."
"Azure products generally offer competitive pricing, suitable for diverse budget considerations."
"There are two editions, Professional and Enterprise, and there is a free trial. We're using the Professional edition and it is competitively priced."
"It's not so favorable for small companies."
"The overall cost is very flexible so it is not a burden for our organization... However, the cost should be improved. For small and mid-size organizations it might be a challenge."
"The licensing is expensive, and there are other costs involved too. I know from using the software that you have to buy new features whenever there are new updates, which I don't really like. But initially, it was very good."
"Its pricing is pretty much up to the mark. For smaller enterprises, it could be a big price to pay at the initial stage of operations, but the moment you have the Seed B or Seed C funding and you want to scale up your operations and aren't much worried about the funds, at that point in time, you would need a solution that could be scaled."
"StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
"StreamSets is an expensive solution."
"It has a CPU core-based licensing, which works for us and is quite good."
845,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 14%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 7%

Financial Services Firm: 14%
Computer Software Company: 11%
Manufacturing Company: 10%
Insurance Company: 8%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

How do you select the right cloud ETL tool?
AWS Glue and Azure Data Factory are the best-performing cloud services for ELT.
How does Azure Data Factory compare with Informatica PowerCenter?
Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability and reliability I am looking for, good scalability, and is easy to set up an...
How does Azure Data Factory compare with Informatica Cloud Data Integration?
Azure Data Factory is a solid product offering many transformation functions. It has pre-load and post-load transformations, allowing users to apply transformations either in code by using Power Q...
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which ...
What is your primary use case for StreamSets?
StreamSets is used for data transformation rather than ETL processes. It focuses on transforming data directly from sources without handling the extraction part of the process. The transformed data...
 


Sample Customers

Adobe, BMW, Coca-Cola, General Electric, Johnson & Johnson, LinkedIn, Mastercard, Nestle, Pfizer, Samsung, Siemens, Toyota, Unilever, Verizon, Walmart, Accenture, American Express, AT&T, Bank of America, Cisco, Deloitte, ExxonMobil, Ford, General Motors, IBM, JPMorgan Chase, Microsoft (Azure Data Factory is developed by Microsoft), Oracle, Procter & Gamble, Salesforce, Shell, Visa
Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Azure Data Factory vs. StreamSets and other solutions. Updated: March 2025.