We primarily use IBM InfoSphere DataStage for ETL scenarios, extracting, transforming, and loading data across various systems efficiently. Although we experimented with using it for real-time or on-time integrations in the past few years, our main focus remains on its strength in traditional batch data integration processes.
Industrial IoT Architect at Xignux SA de CV
Effectively handles large volumes of records for ETL processes
Pros and Cons
- "The most valuable feature for our data processing needs is IBM InfoSphere DataStage's capability to handle ETL tasks with large record volumes."
- "Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data."
What is our primary use case?
What is most valuable?
The most valuable feature for our data processing needs is IBM InfoSphere DataStage's capability to handle ETL tasks with large record volumes.
What needs improvement?
Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data.
For how long have I used the solution?
I have been working with IBM InfoSphere DataStage for almost four years.
Buyer's Guide
IBM InfoSphere DataStage
March 2026
Learn what your peers think about IBM InfoSphere DataStage. Get advice and tips from experienced pros sharing their opinions. Updated: March 2026.
885,264 professionals have used our research since 2012.
What do I think about the stability of the solution?
I would rate the stability of DataStage at around eight out of ten. While it is generally stable, occasional issues arise due to our team not consistently following best practices during process development, impacting server installations.
What do I think about the scalability of the solution?
I would rate the scalability of DataStage at around seven out of ten for our organization because our licensing is based on CPUs, which complicates scaling without hardware adjustments.
How are customer service and support?
I would rate IBM's support for InfoSphere at around five out of ten. It is complicated due to the reliance on our partner and integrator for support, which sometimes affects the quality of assistance we receive directly from IBM.
How was the initial setup?
The initial setup of DataStage, implemented through an IBM partner in Mexico, faced challenges due to partner skill gaps, leading to some complications. Deployment took around one to two years until the system became more stable, and currently, a team of three handles maintenance and support, with two individuals at level two communicating with IBM for assistance.
Which other solutions did I evaluate?
Before choosing IBM, we evaluated Data Factory from Azure. The main difference lies in DataStage's strength in handling large-scale scenarios for data integration, while Data Factory is more suitable for specific service scenarios and may require complementary tools for broader use cases, like email integration.
What other advice do I have?
We have used IBM InfoSphere DataStage effectively for managing Big Data within our products, particularly in scenarios involving large volumes of records for ETL processes. However, we have seen that for near real-time or on-time integration tasks, DataStage may not be optimal due to its resource-intensive nature.
DataStage's scalability has indeed supported our data growth, particularly for ETL tasks involving large volumes of data, enabling us to manage increased data loads effectively.
The scalability of DataStage supported our data growth by allowing us to manage increased data loads effectively, primarily through optimizing the usage of the tool rather than inherent scalability features. However, we faced challenges with real-time processing as DataStage could not trigger processes based on events like emails, requiring us to schedule tasks at intervals, which limited its suitability for real-time scenarios.
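To make the scheduling limitation concrete, here is a minimal sketch in Python (not DataStage code) of how long an event waits when jobs can only start at fixed intervals. The function names and the 15-minute interval are assumptions chosen for illustration; the review does not state the interval actually used.

```python
def next_run_after(event_time_s: int, interval_s: int) -> int:
    """Start time of the first scheduled run at or after the event.

    Runs are assumed to start at 0, interval_s, 2 * interval_s, and so on.
    """
    # Ceiling division using integers only, to avoid float rounding.
    runs_needed = -(-event_time_s // interval_s)
    return runs_needed * interval_s


def pickup_delay(event_time_s: int, interval_s: int) -> int:
    """Seconds an event waits before a scheduled run can process it."""
    return next_run_after(event_time_s, interval_s) - event_time_s


if __name__ == "__main__":
    interval = 15 * 60  # hypothetical 15-minute schedule
    for arrival in (61, 900, 901):
        print(arrival, pickup_delay(arrival, interval))
```

With interval scheduling, the worst-case wait approaches the full interval, whereas an event-triggered process would start as soon as the event arrives.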
DataStage integrates with our existing IT infrastructure by connecting to our manufacturing processes and systems like ERP and SAP. It facilitates integration by consolidating data from various sources, enabling us to view unified information across our systems.
I would recommend DataStage for data integration, especially for SQL data and ETL tasks.
Overall, I would rate DataStage at a seven out of ten. While it is a robust solution for data integration and ETL tasks, there is room for improvement in adopting more modern architectures to meet evolving needs.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Allows for the integration of multiple data sources into a single data warehouse but there is potential for scalability improvement
Pros and Cons
- "In IBM DataStage, the Transformer is the most valuable feature for me. It enables me to apply complex transformations, generate the gateway key, and map source tables into the session table."
- "So, there are some features that are missing. If I compare DataStage to Talend, Talend allows you to write custom code in Java or use these tools in your applications as well if you are building a job application. But in DataStage, it does not allow you to write custom code for any component."
What is our primary use case?
We have integrated multiple data sources into a single data warehouse. For this, we used to build complex ETL jobs and datasets to integrate data from multiple sources into a single data warehouse. So, these are basically the use cases.
What is most valuable?
In IBM DataStage, the Transformer is the most valuable feature for me. It enables me to apply complex transformations, generate the gateway key, and map source tables into the session table.
What needs improvement?
So, there are some features that are missing. If I compare DataStage to Talend, Talend allows you to write custom code in Java or use these tools in your applications as well if you are building a job application. But in DataStage, it does not allow you to write custom code for any component.
Moreover, Talend allows you to extract Java code and call it in your APIs or applications; DataStage does not have this feature.
In future releases, DataStage could benefit from the ability to save metadata into a database. Then, if the database crashes or you lose the data in it, you could recover the metadata, unlike files, which are harder to manage.
For how long have I used the solution?
I have been using this solution for five years.
What do I think about the stability of the solution?
I would rate the stability of this solution a seven out of ten.
What do I think about the scalability of the solution?
I would rate the scalability of this solution a five out of ten. It should be improved. We have almost eight end users in our area. Some are engineers, one is an administrator, and two of them monitor the pipelines. The rest are developers.
We plan to increase our usage further.
How are customer service and support?
In terms of documentation and support, IBM is reliable for providing support to its partners or those with licenses. You can easily find problem resolution support online.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
I have previously worked on SSIS. After using SSIS, I moved to DataStage. We explored IBM DataStage for our specific needs.
How was the initial setup?
I would rate my experience with the initial setup a seven out of ten, where one is difficult and ten is easy. I have worked both on-premises and deployed it on a private cloud.
The deployment process usually takes a day.
What about the implementation team?
For deployment, you first need to install DataStage on the desired server. Then, you have to take a backup of the development and deploy it on the server. After importing, you need to execute and schedule it through your job application.
The number of people required for the deployment depends on the scenario. Sometimes, one person is more than enough.
What's my experience with pricing, setup cost, and licensing?
Pricing is handled by the procurement department. But compared to other enterprise tools like Informatica or Pentaho, IBM DataStage is considerably cheaper.
What other advice do I have?
I would highly recommend this solution because of the shared-nothing architecture it uses, the capabilities it offers, and the fact that every feature has its own use. For example, it has a Director for creating jobs, clients for monitoring and scheduling jobs, and an Administrative client for administration purposes. This is something well managed by IBM.
Overall, I would rate the solution a seven out of ten. There are certain areas of improvement.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data engineer at nust
A scalable ETL tool with a slow connection that can make it time-consuming to work on
Pros and Cons
- "The solution's scalability is really good...we are using multi-instance jobs where you can scale them easily."
- "It takes a lot of time to actually trigger your job and then go into the logs and other stuff. So all of this is really time-consuming."
What is our primary use case?
Right now, I'm working for a telecom company. We are using IBM InfoSphere DataStage to construct ETL jobs for them so that they can load data from their various sources into their warehouse.
What is most valuable?
The valuable feature of the solution is, I think, its functionality. So, there are a lot of transformations that you can apply by just using a transformer. Also, you don't need to complicate your SQL queries while trying to transform your data. Hence, the transformer is something I like in the solution.
What needs improvement?
I don't know if it's just a problem with me, but the issue I see is that when we connect to the server from the client, especially when you're going to run a job or something, the whole connection is really slow. It takes a lot of time to actually trigger your job and then go into the logs and other stuff. So all of this is really time-consuming.
For how long have I used the solution?
I have been using IBM InfoSphere DataStage for five years. Also, I am using IBM InfoSphere DataStage Version 11.7. My company is a consultant for DataStage.
What do I think about the stability of the solution?
Most of the time, it is stable. Sometimes there are issues that you don't understand and that go away when you have a read-only job, but that is quite rare. Otherwise, it seems quite stable.
What do I think about the scalability of the solution?
The solution's scalability is really good. In terms of parallel jobs, we are using multi-instance jobs where you can scale them easily.
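As a rough illustration of scaling with multi-instance jobs, the sketch below builds one `dsjob -run` command line per invocation ID, so the same job design can be launched several times in parallel. It assumes the usual `dsjob -run ... project job.invocationid` form of the DataStage command-line client; the project name, job name, parameter, and invocation IDs are hypothetical, and nothing is executed.

```python
def build_run_commands(project, job, invocation_ids, params=None):
    """Build one `dsjob -run` argument list per invocation ID.

    The commands are only constructed, never executed. Project, job, and
    parameter names passed in are placeholders supplied by the caller.
    """
    commands = []
    for invocation_id in invocation_ids:
        cmd = ["dsjob", "-run"]
        # Each job parameter is passed as -param NAME=VALUE.
        for name, value in (params or {}).items():
            cmd += ["-param", f"{name}={value}"]
        # A multi-instance job is addressed as job.invocationid.
        cmd += ["-jobstatus", project, f"{job}.{invocation_id}"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in build_run_commands(
        "telco_dw", "LoadCdr", ["north", "south"], {"RUN_DATE": "2024-01-31"}
    ):
        print(" ".join(cmd))
```

Each resulting command could then be handed to a scheduler or to `subprocess.run` on a machine where the DataStage client is installed.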
In my company, my team is spread across multiple countries, including Pakistan and India.
How are customer service and support?
I haven't contacted IBM's technical support.
How was the initial setup?
The solution's initial setup is straightforward. Also, it's a one-time activity. It is better to have a competent person for deployment since newbies cannot do it themselves.
When I started using IBM InfoSphere DataStage, it was already deployed on the server. So I did not have to go through the installation phase.
What was our ROI?
ROI is something that the client takes care of, and I think they must be keeping track of it and getting a certain result indicating a good ROI. So, that's why they may have continued using it over the years.
Which other solutions did I evaluate?
Before DataStage, I did not evaluate other options. Our client was already comfortable with DataStage, so that's what we had to use.
What other advice do I have?
I recommend that other people who want to use it go for DataStage on the cloud. The on-prem version of the solution looks and feels old, and it is also time-consuming to work with. Overall, I rate the solution a six out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
Team Lead at Tata Consultancy Services
User-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features
Pros and Cons
- "The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities."
- "What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well."
What is our primary use case?
IBM InfoSphere DataStage was mostly used for ETL (extract, transform, and load) and data integration purposes, including some data quality use cases. My team used the solution to extract data from various sources, do some business transformations, and load the data into a target database or generate files.
What is most valuable?
The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities.
What needs improvement?
What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag.
Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources.
The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well.
For how long have I used the solution?
I've been working with IBM InfoSphere DataStage for more than seven years.
What do I think about the stability of the solution?
IBM InfoSphere DataStage is a stable product and it's been in the market for quite some time, but in its latest version, there's been some instability caused by the new features introduced in the solution. The architecture was changed a lot, and that caused issues and frequent outages for which my company had to go back to IBM for troubleshooting. My team didn't face issues in the earlier version of IBM InfoSphere DataStage. It was the latest version that had instability issues.
What do I think about the scalability of the solution?
IBM InfoSphere DataStage is a very scalable product.
How are customer service and support?
IBM InfoSphere DataStage has pretty good technical support. With the new version, particularly the new architecture and the microservice concept, it sometimes takes a while, even for the IBM team, to figure out what's wrong. Once that's been figured out, the team comes up with a solution or a patch.
How was the initial setup?
Setting up IBM InfoSphere DataStage was easy.
How long the deployment takes would depend on certain factors, but it usually takes just two to three hours.
What's my experience with pricing, setup cost, and licensing?
I have no information on the exact pricing for IBM InfoSphere DataStage because the solution is usually procured by the clients my company works with, though the pricing is higher compared to other solutions, so many clients choose to go with a different solution rather than IBM InfoSphere DataStage.
What other advice do I have?
The last version of IBM InfoSphere DataStage which I've worked with was version 11.7.
I work for an IT service company that works with multiple clients on multiple projects, so close to two hundred people use IBM InfoSphere DataStage for various clients.
Per project, on average, three people take care of IBM InfoSphere DataStage deployment, maintenance, and support-related activities.
My advice to people looking into implementing IBM InfoSphere DataStage is that it's a very good product. A lot of similar products have come up nowadays, but this product has a pretty good reputation as it's been in the market for quite a while. I do think other products such as Talend, Informatica PowerCenter, and Informatica Data Quality are better than IBM InfoSphere DataStage.
My rating for IBM InfoSphere DataStage is eight out of ten.
My company has a partnership with IBM.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
Technical Data Analyst at Swedish Armed Forces
Helps to migrate old databases to newer ones
Pros and Cons
- "We can view what we want to do. We can transform data and put them on tables."
- "I want the tool to continue with the on-prem version, not the cloud one."
What is our primary use case?
We migrate data from old databases to new databases.
What is most valuable?
We can view what we want to do. We can transform data and put them on tables.
What needs improvement?
I want the tool to continue with the on-prem version, not the cloud one.
For how long have I used the solution?
I have been using the solution for six years.
What do I think about the stability of the solution?
I rate the solution's stability a seven out of ten.
What do I think about the scalability of the solution?
My company has 12 users for IBM InfoSphere DataStage. I rate its scalability a nine out of ten.
How are customer service and support?
It takes time to get answers from the support. The responsiveness varies. Sometimes it is fast, and other times it is slow.
How would you rate customer service and support?
Neutral
How was the initial setup?
IBM has good installation documents. The tool's deployment took a day to complete.
What other advice do I have?
As part of the tool's maintenance, we install the updates and take backups. We rely on two engineers to help with the process. I rate IBM InfoSphere DataStage an eight out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
CEO at DELOMID IT
Powerful and agile with good support
Pros and Cons
- "It works with multiple servers and offers high availability."
- "I'd like to be able to do more with the data and metadata, including copy and pasting, et cetera."
What is our primary use case?
We use the solution for data warehousing and data migration.
What is most valuable?
The solution is one of the best solutions.
It is very powerful. It's quite agile.
It works with multiple servers and offers high availability. It can handle very complex architecture. It has good active-passive capabilities. It makes migrations very easy.
It is stable.
The solution can scale.
It offers pretty good support services, at least in France.
What needs improvement?
A lot about the solution could be improved.
I'd like to be able to do more with the data and metadata, including copy and pasting, et cetera. It has become easier with the cloud, however.
I'd like to have the ability to customize code.
For how long have I used the solution?
I've been using the solution for more than ten years. I've used it since 2006. I started with version 7.
What do I think about the stability of the solution?
The product is stable and reliable. I'd rate it eight out of ten. There are no bugs or glitches. It doesn't crash.
What do I think about the scalability of the solution?
The solution is scalable. I'd rate the scalability eight out of ten.
We have six or seven developers on the solution.
How are customer service and support?
I tend to do most of the troubleshooting. I do not need the assistance of technical support as I am quite knowledgeable. That said, my understanding is that support in France is very good.
Which solution did I use previously and why did I switch?
I've also used SQL and Talend. I've also used Informatica, Spark, and AWS Glue. I use a variety of solutions for various clients.
How was the initial setup?
The installation is pretty fast. It doesn't take too long. We did a deployment a few years ago, and it only took maybe two to three days. We updated it to version 11 at that point. The length of time depends on the architecture. It can vary a bit.
First, we have to install it on the web server. After that, we have to set up the repository with Oracle or DB2. That takes a lot of time. When you are a big organization, it takes a lot of people. There are configurations and prerequisites that have to be considered.
Only one person is needed to manage the maintenance.
What about the implementation team?
We handle the installation ourselves. I handle it mostly on my own.
What was our ROI?
I have not noted any ROI statistics.
What's my experience with pricing, setup cost, and licensing?
The pricing depends on the setup. However, we paid $100,000 as a one-time cost for an on-premises setup.
You do have extra costs when using the product on-premises. For example, you need to have servers to host it.
What other advice do I have?
I used to be a partner with IBM. I have to reset the partnership.
I would recommend the solution for on-premises setups.
I would rate the solution eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Program Manager at a consultancy with 10,001+ employees
The solution can incorporate very complex business rules, is moderately scalable, and is stable
Pros and Cons
- "The most valuable feature of the solution is the ability to incorporate very complex business rules in DataStage."
- "The solution can be a bit more user-friendly, similar to Informatica."
What is our primary use case?
The solution is mainly used for marketing campaigns, customer segmentation, and home loans.
What is most valuable?
The most valuable feature of the solution is the ability to incorporate very complex business rules in DataStage.
What needs improvement?
The solution can be a bit more user-friendly, similar to Informatica.
I would like the solution to have some basic streaming functionality added.
For how long have I used the solution?
I have been using the solution for one year.
What do I think about the stability of the solution?
We don't currently have much in our production environment. We are gradually moving into production, so the small setup we have is adequate for now. Taking the overall perspective into account, our jobs depend on other jobs. Based on the feedback we receive, we are sometimes unable to run our process because of these dependencies, and jobs don't complete on time. We also received feedback that a data-handling problem caused jobs to fail, and they needed to be rerun. The cause could be product-specific, design-specific, or something else, but I think there is room for improvement in terms of stability. I would give the solution a seven out of ten.
What do I think about the scalability of the solution?
I think that scalable systems should also have good performance. The scalability of this solution in my opinion may not be on the same level as Informatica Power Exchange Data Integration.
I give the scalability of the solution a seven out of ten. We are facing problems whenever we have huge amounts of data and there are job failures. We need to take care of how to tackle that situation.
How was the initial setup?
We didn't need to do anything because the customer with whom we are working on the project had already set everything up for us. The initial setup was not in our purview.
What other advice do I have?
I give the solution a seven out of ten.
We have a separate platform team or support team. Any query was routed to this team, which dealt internally with the DataStage people.
I'm not a technical expert because I haven't been a developer for 12 years. This is what I understand from the feedback I've received. Informatica Power Exchange Data Integration is much better from a scalability perspective than IBM InfoSphere DataStage. Scalability, user-friendliness, and inclusion of different business rules are all important, but I think Informatica Power Exchange Data Integration goes one step further on that.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: My company has a business relationship with this vendor other than being a customer.
Project Manager at Blue Technology
A highly-stable and scalable solution with seamless integration capabilities
Pros and Cons
- "The concept of integration is a valuable feature of the product."
- "The graphical user interface (GUI) feels a lot like the interfaces from the 1980s."
What is our primary use case?
We use it for many kinds of projects. For example, we use it for business intelligence, master data management, data quality, data governance, data integration to SAP, Oracle retail, and real-time integration.
What is most valuable?
The concept of integration is a valuable feature of the product. The product excels in this area. In fact, it is one of the two leading products in the integration field.
Based on my experience, interacting with data through integration has been more profitable than any other products or projects. Data integration is a reliable feature.
What needs improvement?
The graphical user interface (GUI) feels a lot like the interfaces from the 1980s. Regarding IBM, they initially indicated that version 8.11.7 would be their final release. We require more connectors to establish clear connections to cloud services. Connecting to the cloud is not easy or transparent within the product. Although the product has the potential to connect to the cloud, configuring and setting it up is challenging.
In future releases, connecting to the cloud should be easy and transparent.
For how long have I used the solution?
I’ve been using this solution since 2005. I’m currently using version 11.7.
What do I think about the stability of the solution?
I rate the stability of the solution ten out of ten.
What do I think about the scalability of the solution?
I rate the scalability of the solution nine out of ten.
How was the initial setup?
Setting up the system is difficult and requires a lot of technical expertise. It demands a high level of experience during the setup process.
However, despite the difficulty, our team, possibly the only one in Mexico, is good at quickly completing the setup.
Most of our implementations or installations are on-premises, accounting for approximately 90% of the total. The remaining 10% of our implementations are on the cloud.
What was our ROI?
The product has proven to be exceptional and highly competitive in the market. As a result, I have experienced substantial financial success with the product. There are many examples we have for the ROI.
What's my experience with pricing, setup cost, and licensing?
The price of the product is reasonable, especially for mid-sized companies. We have been successfully selling to several mid-sized companies.
Which other solutions did I evaluate?
The market is indeed filled with numerous solutions. Currently, I am exploring Microsoft's offerings, particularly their integration capabilities. Also, the purpose and functionalities of Azure are pretty interesting. I have been working with Quick Link, and specifically, I have been working with Azure Database.
What other advice do I have?
If you want to learn about integrating data, this product offers a valuable learning opportunity. Then you can consider transitioning to the next product in line. The underlying concept remains the same.
I believe this solution is more reliable and provides a better understanding of data integration, data quality management, and master data management. It covers all aspects of data management, making it easier to learn.
This is my perspective, possibly influenced by my extensive experience working with this product for many years. It is the main strength of this solution.
Overall, I would rate the solution a ten out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Buyer's Guide
Download our free IBM InfoSphere DataStage Report and get advice and tips from experienced pros
sharing their opinions.
Updated: March 2026
Product Categories
Data Integration
Popular Comparisons
Informatica Intelligent Data Management Cloud (IDMC)
Azure Data Factory
Informatica PowerCenter
Qlik Talend Cloud
Oracle Data Integrator (ODI)
Oracle GoldenGate
SAP Data Services
Qlik Replicate
Pentaho Data Integration and Analytics
Quick Links
Learn More: Questions:
- How do you compare Informatica PowerCenter with IBM DataStage?
- Would you upgrade to more premium versions of IBM InfoSphere DataStage?
- Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
- Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
- When evaluating Data Integration, what aspect do you think is the most important to look for?
- Microsoft SSIS vs. Informatica PowerCenter - which solution has better features?
- What are the best on-prem ETL tools?
- Which integration solution is best for a company that wants to integrate systems between sales, marketing, and project development operations systems?
- Experiences with Oracle GoldenGate vs. Oracle Data Integrator?
- What are the must-have features for a Data integration system?
















