Name: Pentaho Data Integration and Analytics
Brand: Hitachi Vantara
Rating: 4.0 (59 reviews)

reviewer1872000

Senior Data Analyst at a tech services company with 51-200 employees

Jun 8, 2022

Download

We're able to query large data sets without affecting performance

Pros and Cons

"One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."

"Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."

What is our primary use case?

I use it for ETL. We receive data from our clients and we join the most important information and do many segmentations to help with communication between our product and our clients.

How has it helped my organization?

Before we used Pentaho, our processes were in Microsoft Excel and the updates from databases had to be done manually. Now all our routines are done automatically and we have more time to do other jobs. It saves us four or five hours daily.

In terms of ETL development time, it depends on the complexity of the job, but if the job is simple it saves two or three hours.

What is most valuable?

One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results.

I'm working with large data sets. One of the clients I'm working with is a large credit card company and the database from this client is very large. Pentaho allows me to query large data sets without affecting its performance.

I use Pentaho with Jenkins to schedule the jobs. I'm using the jobs and transformations in Pentaho to create many links.

I always find ways to have minimal code and create the processes with many parameters. I am able to reuse processes that I have created before.

Creating jobs and putting them into production, as well as the visibility that Pentaho gives, are both very simple.

What needs improvement?

Parallel execution could be better in Pentaho. It's very simple but I don't think it works well.

Buyer's Guide

Pentaho Data Integration and Analytics

January 2026

Free Report: Pentaho Data Integration and Analytics Reviews and More

Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: January 2026.

DOWNLOAD NOW

881,757 professionals have used our research since 2012.

For how long have I used the solution?

I've been working with Pentaho for four or five years.

What do I think about the stability of the solution?

The stability is good.

What do I think about the scalability of the solution?

It's scalable.

How are customer service and support?

I find help on the forums.

Which solution did I use previously and why did I switch?

I used SQL Server Integration Services, but I have much more experience with Pentaho. I have also worked with Apache NiFi but it is more focused on single data processes but I'm always working with batch processes and large data sets.

How was the initial setup?

The first deployment was very complex because we didn't have experience with the solution, but the next deployment was simpler.

We create jobs weekly in Pentaho. The development time takes, on average, one week and the deployment takes just one day or so.

We just put it on Git and pull a server and schedule the execution.

We use it on-premises while the infrastructure is Amazon and Azure.

What other advice do I have?

I always recommend Pentaho for working with automated processes and to do API integrations.

Which deployment model are you using for this solution?

On-premises

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Aqeel UR Rehman

BI Analyst at a computer software company with 51-200 employees

May 19, 2022

Download

Simple to use, supports custom transformations, and the open-source version can be used free of charge

Pros and Cons

"This solution allows us to create pipelines using a minimal amount of custom coding."

"I have been facing some difficulties when working with large datasets. It seems that when there is a large amount of data, I experience memory errors."