Pentaho Data Integration and Analytics Reviews and Pricing

Manager, Systems Development at a manufacturing company with 5,001-10,000 employees

Aug 7, 2022

An affordable solution that makes it simple to do some fairly complicated things, but it could be improved in terms of consistency of different transformation steps

Pros and Cons

"It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there."

"Its basic functionality doesn't need a whole lot of change. There could be some improvement in the consistency of the behavior of different transformation steps. The software did start as open-source and a lot of the fundamental, everyday transformation steps that you use when building ETL jobs were developed by different people. It is not a seamless paradigm. A table input step has a different way of thinking than a data merge step."

What is our primary use case?

Our primary use case is to populate a data warehouse and data marts, but we also use it for all kinds of data integration scenarios and file movement. It is almost like middleware between different enterprise solutions. We take files from our legacy app system, do some work on them, and then call SAP BAPIs, for example.

It is deployed on-premises. It gives you the flexibility to deploy it in any environment, whether on-premises or in the cloud, but this flexibility is not that important to us. We could deploy it on the cloud by spinning up a new server in AWS or Azure, but as a manufacturing facility, it is not important to us. Our customer preference is primarily to deploy things on-premises.

We usually stay one version behind the latest one. We're a manufacturing facility. So, we're very sensitive to any bugs or issues. We don't do automatic upgrades. They're a fairly manual process.

How has it helped my organization?

We've had it for a long time. So, we've realized a lot of the improvements that anybody would realize from almost any data integration product.

The speed of developing solutions has been the best improvement. It has reduced the development time and improved the speed of getting solutions deployed. The reduced ETL development time varies by the size and complexity of the project. We probably spend days or weeks less than then if we were using a different tool.

It is tremendously flexible in terms of adding custom code by using a variety of different languages if you want to, but we had relatively few scenarios where we needed it. We do very little custom coding. Because of the tool we're using, it is not critical. We have developed thousands of transformations and jobs in the tool.

What is most valuable?

It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there.

Its performance is a pretty close second. It is a pretty highly performant system. Its query performance on large data sets is very good.

What needs improvement?

Its basic functionality doesn't need a whole lot of change. There could be some improvement in the consistency of the behavior of different transformation steps. The software did start as open-source and a lot of the fundamental, everyday transformation steps that you use when building ETL jobs were developed by different people. It is not a seamless paradigm. A table input step has a different way of thinking than a data merge step.

Buyer's Guide

Pentaho Data Integration and Analytics

September 2025

Free Report: Pentaho Data Integration and Analytics Reviews and More

Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: September 2025.

DOWNLOAD NOW

868,787 professionals have used our research since 2012.

For how long have I used the solution?

We have been using this solution for more than 10 years.

What do I think about the stability of the solution?

Its stability is very good.

What do I think about the scalability of the solution?

Its scalability is very good. We've been running it for a long time, and we've got dozens, if not hundreds, of jobs running a day.

We probably have 200 or 300 people using it across all areas of the business. We have people in production control, finance, and what we call materials management. We have people in manufacturing, procurement, and of course, IT. It is very widely and extensively used. We're increasing its usage all the time.

How are customer service and support?

They are very good at quickly and effectively solving the issues we have brought up. Their support is well structured. They're very responsive.

Because we're very experienced in it, when we come to them with a problem, it is usually something very obscure and not necessarily easy to solve. We've had cases where when we were troubleshooting issues, they applied just a remarkable amount of time and effort to troubleshoot them.

Support seems to have very good access to development and product management as a tier-two. So, it is pretty good. I would give their technical support an eight out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We didn't have another data integration product before Pentaho.

How was the initial setup?

I installed it. It was straightforward. It took about a day and a half to get the production environment up and running. That was probably because I was e-learning as I was going. With a services engagement, I bet you would have everything up in a day.

What about the implementation team?

We used Pentaho services for two days. Our experience was very good. We worked with Andy Grohe. I don't know if he is still there or not, but he was excellent.

What was our ROI?

We have absolutely seen an ROI, but I don't have the metrics. There are analytic cases that we just weren't able to do before. Due to the relatively low cost compared to some of the other solutions out there, it has been a no-brainer.

What's my experience with pricing, setup cost, and licensing?

We did a two or three-year deal the last time we did it. As compared to other solutions, at least so far in our experience, it has been very affordable. The licensing is by component. So, you need to make sure you only license the components that you really intend to use.

I am not sure if we have relicensed after the Hitachi acquisition, but previously, multi-year renewals resulted in a good discount. I'm not sure if this is still the case.

We've had the full suite for a lot of years, and there is just the initial cost. I am not aware of any additional costs.

What other advice do I have?

If you haven't used it before, it is worth engaging services with Pentaho for initial implementation. They'll just point out a number of small foibles related to perhaps case sensitivity. They'll just save you a lot of runs through the documentation to identify different configuration points that might be relevant to you.

I would highly recommend the Data Integration product, particularly for anyone with a Java background. Most of our BI developers at this point do not have a Java background, which isn't really that important. Particularly, if you're a Java business and you're looking for extensibility, the whole solution is built in Java, which just makes certain aspects of it a little more intuitive at first.

On the data integration side, it is really a good tool. A lot of investment dollars go into big data and new tech, and often, those are not very compelling for us. We're in an environment where we have medium data, not big data.

It provides a single end-to-end data management experience from ingestion to insights, but at this point, that's not critical to us. We mostly do the data integration work in Pentaho, and then we do the visualization in another tool. The single data management experience hasn't enabled us to discontinue the use of other data management analysis delivery tools just because we didn't really have them.

We take an existing job or transformation and use that as a test. It is certainly easy enough to copy one object to another. I am not aware of a specific templating capability, but we are not really missing anything there. It is very easy for us to clone a job or transformation just by doing a Save As, and we do that extensively.

Vantara's roadmap is a little fuzzy for me. There has been quite a bit of turnover in the customer-facing roles over the last five years. We understand that there is a roadmap to move to a pure web-based solution, but it hasn't been well communicated to us.

In terms of our decision to purchase Hitachi's product services or solutions, our satisfaction level is average or on balance.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.

reviewer1855218

Data Architect at a consumer goods company with 1,001-5,000 employees

May 12, 2022

Download

I can extend and customize existing pipeline templates for changing requirements, saving time

Pros and Cons

"I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source."

"I would like to see improvement when it comes to integrating structured data with text data or anything that is unstructured. Sometimes we get all kinds of different files that we need to integrate into the warehouse."