Lead, Data and BI Architect at a financial services firm with 201-500 employees
Real User
Jan 13, 2022
We can use the same tool on all our environments. The patching is buggy.
Pros and Cons
  • "Flexible deployment, in any environment, is very important to us. That is the key reason why we ended up with these tools. Because we have a very highly secure environment, we must be able to install it in multiple environments on multiple different servers. The fact that we could use the same tool in all our environments, on-prem and in the cloud, was very important to us."
  • "The testing and quality could really improve. Every time that there is a major release, we are very nervous about what is going to get broken. We have had a lot of experience with that, as even the latest one was broken. Some basic things get broken. That doesn't look good for Hitachi at all. If there is one place I would advise them to spend some money and do some effort, it is with the quality. It is not that hard to start putting in some unit tests so basic things don't get broken when they do a new release. That just looks horrible, especially for an organization like Hitachi."

What is our primary use case?

We run the payment systems for Canada. We use it as a typical ETL tool to transfer and modify data into a data warehouse. We have many different pipelines that we have built with it.

How has it helped my organization?

I love the fact that we haven't come up with a problem yet that we haven't been able to address with this tool. I really appreciate its maturity and the breadth of its capabilities.

If we did not have this tool, we would probably have to use a whole different variety of tools, then our environment would be a lot more complicated.

We develop metadata pipelines and use them.

Flexible deployment, in any environment, is very important to us. That is the key reason why we ended up with these tools. Because we have a very highly secure environment, we must be able to install it in multiple environments on multiple different servers. The fact that we could use the same tool in all our environments, on-prem and in the cloud, was very important to us. 

What is most valuable?

Because it comes from an open-source background, it has so many different plugins. It is just extremely broad in what it can do. I appreciate that it has a very broad, wide spectrum of things that it can connect to and do. It has been around for a while, so it is mature and has a lot of things built into it. That is the biggest thing. 

The visual nature of its development is a big plus. You don't need to have very strong developers to be able to work with it.

We often have to drop down to JavaScript, but that is fine. I appreciate that it has the capability built-in. When you need to, you can drop down to a scripting language. This is important to us.

What needs improvement?

The documentation is very basic.

The testing and quality could really improve. Every time that there is a major release, we are very nervous about what is going to get broken. We have had a lot of experience with that, as even the latest one was broken. Some basic things get broken. That doesn't look good for Hitachi at all. If there is one place I would advise them to spend some money and do some effort, it is with the quality. It is not that hard to start putting in some unit tests so basic things don't get broken when they do a new release. That just looks horrible, especially for an organization like Hitachi.

Buyer's Guide
Pentaho Data Integration and Analytics
January 2026
Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: January 2026.
881,757 professionals have used our research since 2012.

For how long have I used the solution?

Overall, I have been using it for about 10 years. At my current organization, I have been using it for about seven years. It was used a little bit at my previous organization as well.

What do I think about the stability of the solution?

The stability is not great, especially when you start patching it a lot because things get broken. That is not a great look. When you start patching, you are expecting things to get fixed, not new things to get broken.

With modern programming, you build a lot of automated testing around your solution, and it is specifically for that. I changed this piece of code. Well, what else got broken? Obviously they don't have a lot of unit tests built into their code. They need to start doing that because it looks horrible when they change one thing, then two other things get broken. Then, they released that as a commercial product, which is horrible. Last time, somehow they broke the ability to connect with databases. That is something incredibly basic. How could you release this product without even testing for that?

What do I think about the scalability of the solution?

We don't have a huge amount of data, so I can't really answer how we could scale up to very large solutions.

How are customer service and support?

Lumada’s ability to quickly and effectively solve issues we have brought up is not great. We have a service for the solution with Hitachi. I don't get the sense that Pentaho, and Hitachi still calls it Pentaho, is a huge center of focus for them. 

You kind of get help, but the people from whom you get help aren't necessarily super strong. It often goes around in circles forever. I eventually have to find my own solution. 

I haven't found that the Hitachi support site has a depth of understanding for the solution. They can answer simple questions, but when it gets more in-depth, they have a lot of trouble answering questions. I don't think the support people have the depth of expertise to really deal with difficult questions.

I would rate them as five out of 10. They are responsive and polite. I don't feel ignored or anything like that, just the depth of knowledge isn't there.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

It has always been here. There was no solution like it until I got to the company.

How was the initial setup?

The initial setup was complex because we had to integrate with SAML. Even though they had some direction on that, it was really a do-it-yourself kind of thing. That was pretty complicated, so if they want to keep this product fresh, I think they have to work on making it integrate more with modern technology, like single sign-on and stuff like that. Every organization has that now and Pentaho doesn't have a good story for that. However, it is the platform that they don't give a lot of love to.

It took us a long time to figure it out, something like two weeks.

What was our ROI?

This has reduced our ETL development time. If it wasn't for this solution, we would be doing custom coding. The reason why we are using the solution is because of its simplicity of development.

What's my experience with pricing, setup cost, and licensing?

These types of solutions are expensive, so we really appreciate what we get for our money. That said, we don't think of the solution as a top-of-the-line solution or anything like that.

Which other solutions did I evaluate?

Apache has a project going on called Apache Hop. Because Pentaho was open source, people have taken it and forked it. They are really modernizing the solution. As far as I know, Hitachi is not involved yet. I would highly advise them to get involved in that open-source project. It will be the next generation of Pentaho. If they get left behind, they're not going to have anything. It would be a very bad move to just ignore it. Hitachi should not ignore Apache Hop.

What other advice do I have?

I really like the data integration tool. However, it is part of a whole platform of tools, and it is obvious the other tools just don't get a lot of love. We are in it for Pentaho Data Integration (PDI) because that is what we want as our ETL tool. We use their reporting platform and stuff like that, but it is obvious that they just don't get a lot of love or concern.

I haven't looked at the roadmap that much. We are also a Google customer using BigQuery, etc. Hitachi is really just a very niche part of what we do. Therefore, we are not generally looking very seriously at what Hitachi is doing with their products, nor are we a big investor in what Hitachi is doing.

I would recommend this specific Hitachi product to a friend or colleague, depending on their use case and need. If they have a very similar need, I would recommend it. I wouldn't be saying, "Oh, this is the best thing next to sliced bread," but say, "Hey, if this is what you need, this works well for us."

On a scale of one to 10 for recommending the product, I would rate it as seven out of 10. Overall, I would also rate it as seven out of 10.

We really appreciated the breadth of its capabilities. It is not the top-of-the-line solution, but you really get a lot for what you pay for.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
PeerSpot user
Analytics Team Leader at a healthcare company with 11-50 employees
Real User
Jan 3, 2022
Enables us to manage our workload and generate a high volume of reporting
Pros and Cons
  • "We're using the PDI and the repository function, and they give us the ability to easily generate reporting and output, and to access data. We also like the ability to schedule."
  • "Since Hitachi took over, I don't feel that the documentation is as good within the solution. It used to have very good help built right in."

What is our primary use case?

We use it to connect to multiple databases and generate reporting. We also have ETL processes running on it.

Portions of it are in AWS, but we also have desktop access.

How has it helped my organization?

The solution has allowed us to automate reporting by automating its scheduling. 

It is also important to us that the solution enables you to leverage metadata to automate data pipeline templates and reuse them. It allows us to generate reports with fewer resources.

If we didn't have this solution, we wouldn't be able to manage our workload or generate the volume of reporting that we currently do. It's very important for us that it provides a single, end-to-end data management experience from ingestion to insights. We are a high-volume department and without those features, we wouldn't be able to manage the current workload.

What is most valuable?

We're using the PDI and the repository function, and they give us the ability to easily generate reporting and output, and to access data. We also like the ability to schedule.

What needs improvement?

Since Hitachi took over, I don't feel that the documentation is as good within the solution. It used to have very good help built right in. There's good documentation when you go to the site but the help function within the solution hasn't been as good since Hitachi took over.

For how long have I used the solution?

I've been using Lumada Data Integration since 2016, but the company has been using it much longer.

We are currently on version 8.3, but we're going to be doing an upgrade to 9.2 next month.

What do I think about the stability of the solution?

The stability is good. We haven't had any issues related to Pentaho.

What do I think about the scalability of the solution?

Its scalability is very good. We use it with multiple, large databases. We've added to it over time and it scales.

We have about 10 users of the solution including a data quality manager, clinical analyst, healthcare informatics analysts, senior healthcare informatics analyst, and an analytics team leader. It's used very extensively by all of those job roles in their day-to-day work. When we add additional staff members, they routinely get access to and are trained on the solution.

How are customer service and support?

Their ability to quickly and effectively solve issues we have brought up is very good. They have a ticketing system and they're very responsive to any tickets we enter. And that's true not only for issues but if we have questions about functionality.

How would you rate customer service and support?

Positive

How was the initial setup?

The solution is very flexible. It's pretty easy to set up connections within the solution.

Maintenance isn't required day-to-day. Our technical staff does the upgrades. They also, on occasion, have to do things like restarting the services, but that's typically related to server issues, not Pentaho itself.

What other advice do I have?

My advice would be to take advantage of the training that's offered.

The query performance of Lumada on large data sets is good, but the query performance is really only as good as the server.

In terms of Hitachi's roadmap, we haven't seen it in a little while. We did have a concern that they're going to be going away from Pentaho and rolling it into another product and we're not quite sure what the result of that is going to be. We don't have a good understanding of what's going to change. That's the concern.

We currently only use Pentaho. We don't have other Hitachi products but we're satisfied with it. We would recommend Pentaho.

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
PeerSpot user
it_user1510395 - PeerSpot reviewer
Technical Manager at a computer software company with 51-200 employees
Real User
Mar 11, 2021
Quite simple to learn and there is a lot of information available online
Pros and Cons
  • "Pentaho Data Integration is quite simple to learn, and there is a lot of information available online."
  • "I'm still in the very recent stage concerning Pentaho Data Integration, but it can't really handle what I describe as "extreme data processing" i.e. when there is a huge amount of data to process. That is one area where Pentaho is still lacking."

What is our primary use case?

We have an event planning system, which enables us to obtain a large report. It includes data mart or data warehouse data. This is where we take data from the IT online system and pass it to the data warehouse. Then, from the data warehouse, they generate reports. We have 6 developers who are using the Pentaho Data Integrator, but there are no end users. We deploy the product, and the customer uses it for reporting. We have one person who undertakes regular maintenance activity when it is required.

How has it helped my organization?

As we are a software company, we are using the tools provided with the Pentaho Data Integration for our various teams.

What is most valuable?

Pentaho Data Integration is quite simple to learn, and there is a lot of information available online. It is not a steep learning curve. It also integrates easily with other databases and that is great. We use the provided documentation, which is a simple process for integration compared to other proprietary tools.

What needs improvement?

I don't think they market it that well. We can make suggestions for improvements but they don't seem to take the feedback on board. This contrasts with Informatica who are really helpful and seem to listen more to their customer feedback. I would also really like to see improved data capture. At the moment the emphasis seems to be on data processing. I would like to see a real-time processing data integration tool. This would provide instant reporting whenever the data changes. I'm still in the very recent stage concerning Pentaho Data Integration, but it can't really handle what I describe as "extreme data processing" i.e. when there is a huge amount of data to process. That is one area where Pentaho is still lacking.

For how long have I used the solution?

We have been using Pentaho Data Integration for 6 years. The customer is using Mirabilis Cloud, which is a public cloud. We are currently using version A.3.

How are customer service and support?

Technical support is really good. It only takes a little bit of time to get our answers.

Which solution did I use previously and why did I switch?

One of our customers was completely into the Microsoft core framework. We had to use SSIS because it was readily available with them and is part of the system. We had to use it for five years.

As mentioned, one of our teams has worked with Informatica in the past. In terms of integration, Informatica isn't more powerful, but it is more accurate in some respects. The community is also quite strong.

How was the initial setup?

The setup of Pentaho Data Integration is straightforward. 

What about the implementation team?

We implemented Pentaho Data Integration in-house. The current deployment has taken three months for the current set of requirements. We have another deployment in the pipeline where we are connecting other different data sources. These projects usually take a few months to complete.

What's my experience with pricing, setup cost, and licensing?

Sometimes we provide the licenses or the customer can procure their own licenses. Previously, we had an enterprise license. Currently, we are on a community license as this is adequate for our needs.

What other advice do I have?

For newcomers to the product, it is best to start with something simple. You can then scale it up fast as it is not a steep learning curve. If somebody wants to set up a good inbound integration platform, they can use the Pentaho Data Integrator. It's really simple and easy to use. The online community really helps you with numerous issues, such as licensing and a lot of other things. I would rate Pentaho Data Integration 8 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user
Assistant General Manager at a consultancy with 10,001+ employees
Real User
Jan 8, 2021
Scales well with data and processes, but the cost should be lower and real-time processing capabilities improved
Pros and Cons
  • "The amount of data that it loads and processes is good."
  • "I would like to see improvements made for real-time data processing."

What is our primary use case?

We are using just the simple features of this product.

We're using it as a data warehouse and then for building dimensions.

What needs improvement?

The shortcoming in version 7 is that we are unable to connect to Google Cloud Storage (GCS), where I can write the results from Pentaho. I'm able to connect to S3 using Pentaho 8, but when using it for GCS, I'm unable to connect. With people moving from on-premises deployments to the cloud, be it S3, Azure, or Google, we need a plugin where we can interact with these cloud vendors.

I would like to see improvements made for real-time data processing. It is something that I will be looking out for.

For how long have I used the solution?

We have been using Pentaho Data Integration for three years.

What do I think about the stability of the solution?

For all of the features that we have been using, it is a stable product.

What do I think about the scalability of the solution?

In terms of data loading and processes, the scalability is good.

We have a team of four people who are using it for analytics.

How are customer service and technical support?

As we are using the Community Version, we have not been in contact with technical support. Instead, we rely on forums and websites when we need to resolve a problem.

Which solution did I use previously and why did I switch?

In the past, I have worked with Talend, as well as SAP BO Data Services (BODS). However, that was with another company. This organization started with Pentaho and we are still using it.

How was the initial setup?

It is a straightforward setup process. It took between three and four hours to complete.

What's my experience with pricing, setup cost, and licensing?

We are using the Community Version, which is available free of charge.

The price of the regular version is not reasonable and it should be lower.

What other advice do I have?

My advice for anybody who is researching this product is that if they want to do batch processing, then this is a good choice. The amount of data that it loads and processes is good.

Based on the features that I have used and my experience, I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Nirmal Kosuru - PeerSpot reviewer
Nirmal Kosuru, Data Architect at a tech services company with 201-500 employees
Top 10, Real User

Yes, the integration tool should be made available as Professional or Community / Standard / Enterprise editions, and pricing should be set accordingly on an industry-by-industry or case-by-case basis. There should also be transparency in the pricing and availability of the Community Edition, as was the case earlier when Pentaho management released it into the market.

IT-Services Manager & Solution Architect at a tech services company with 1-10 employees
Real User
Jul 14, 2021
Free to use, easy to set up, and has great UI
Pros and Cons
  • "It's my understanding that the product can scale."
  • "The product needs more plugins."

What is our primary use case?

We basically receive information from our clients via Excel. We take this information and transform it in order to create some data marts.

For the processes we are running right now, we receive new data every day. The solution processes the Excel files and creates a data mart for them.

While we read the data and transform it as well as put it in a database, in order to explore the information, we need an analytics solution for that - and that is typically Microsoft's solution, Power BI.

What is most valuable?

Running the ETL itself was very fast. It makes it very easy to transform the information we have. We found that very useful.

The UI is very easy to understand and learn.

The solution offers lots of documentation.

The initial setup is easy.

It's my understanding that the product can scale.

We've found the solution to be stable. 

The product is free to use if you choose the free version.

What needs improvement?

The solution needs better, higher-quality documentation, similar to AWS. Right now, we find that although documentation exists, it's not easy to find the answers we seek.

I have tried some cloud services with the ETL, so perhaps that would be good to add.

The product needs more plugins. Right now, it just has a standard database connection, and there are other solutions out there that have straightforward connections for Oracle, MySQL, and so on. More plugins would make it a much better product.

For how long have I used the solution?

We recently finished two projects with Pentaho.

What do I think about the stability of the solution?

The product is stable. There are no bugs or glitches. It doesn't crash or freeze. It's reliable. 

What do I think about the scalability of the solution?

According to the documentation, it's quite scalable. That said, I haven't tried to expand it. We just use a single server and that's all we need right now. We don't have plans to increase usage.

We have three people who use the solution currently.

How are customer service and technical support?

We don't really use support. We tend to do everything on our own and solve any problems we have ourselves. We basically have just read the manuals and that's about it. 

How was the initial setup?

The initial setup is not complex or difficult. It's straightforward. 

The deployment process takes about two weeks. 

We had two people who handled the deployment process. They were an AWS DevOps person and a Pentaho expert.

What's my experience with pricing, setup cost, and licensing?

We do not pay any license costs. We use a free version of the product.

What other advice do I have?

I'm a consultant and an end-user.

I downloaded the latest version of the solution. I can't speak to the version number. 

I'd rate the solution at an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1384743 - PeerSpot reviewer
Specialist in Relational Databases and Nosql at a computer software company with 5,001-10,000 employees
Real User
Jul 17, 2020
Free to use, easy to set up, and has a great metadata injection feature
Pros and Cons
  • "The solution has a free to use community version."
  • "It's not very stable, at least not in the case of the community edition. I'm working with the community edition right now, and perhaps that is why it is not very stable; it sometimes causes the system to hang. I'm not sure if this is the case for the paid tiers."

What is our primary use case?

The most common use for the solution is gathering data from our databases or files and loading it into a different database. Another common use is comparing data between different databases. Where the data lacks integrity, you can usually attribute the differences to synchronization issues.

What is most valuable?

One important feature, in my opinion, is Metadata Injection. It gives flexibility to the scripts because they don't depend on a fixed structure or a fixed data model. Instead, you can develop transformations that are independent of any fixed structure or data model.

Let me give a couple of examples. Sometimes your tables change, adding fields or dropping some of them. When this happens, a transformation that does not use Metadata Injection fails or doesn't handle all the information from the table. If you use Metadata Injection instead, the new fields are included and the dropped columns are excluded from the transformation. Other times you have a complex transformation to apply to a lot of different tables. Traditionally, without the Metadata Injection feature, you had to repeat the transformation for each table, adapting it to the concrete structure of each table. Fortunately, with Metadata Injection, the same transformation is valid for all the tables you want to treat. A little bit of effort gives you a great benefit.
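The idea behind a metadata-driven transformation can be illustrated outside of Pentaho. The following is a minimal sketch in Python with SQLite, not PDI's Metadata Injection step itself: the function name `copy_table` and the sample `customers` table are hypothetical, and the only point shown is that the column list is read from the source at run time instead of being hard-coded.

```python
import sqlite3


def copy_table(src, dst, table):
    """Copy `table` from src to dst using whatever columns exist at run time."""
    # Read the column names from the source metadata instead of hard-coding them.
    columns = [row[1] for row in src.execute(f'PRAGMA table_info("{table}")')]
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    # Rebuild the target table so it always mirrors the current source structure.
    dst.execute(f'DROP TABLE IF EXISTS "{table}"')
    dst.execute(f'CREATE TABLE "{table}" ({col_list})')
    rows = src.execute(f'SELECT {col_list} FROM "{table}"').fetchall()
    dst.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    dst.commit()
    return columns


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    src.execute("INSERT INTO customers VALUES (1, 'Ada')")
    print(copy_table(src, dst, "customers"))

    # The source table gains a column; the same function still works unchanged.
    src.execute("ALTER TABLE customers ADD COLUMN email TEXT")
    print(copy_table(src, dst, "customers"))
```

The same function can be called for any table name, which mirrors the second example above: one transformation reused across many tables.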

Furthermore, the solution has a free to use community version.

The solution is easy to set up, very intuitive, clear to understand and easy to maintain.

What needs improvement?

I'm currently looking at a new competitor that's got some interesting features that this solution doesn't have. I have found this competitor has a feature braking system that is not present in the Pentaho Data Integration approach. Their system keeps track of the last executions and stores the state, which lets you resume a run from the point where it ended the last time. It's very interesting. It would be nice if Pentaho had this type of feature.
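A resume-from-last-state mechanism of this kind could look roughly like the sketch below. It is an assumption-laden illustration in Python, not the competitor's implementation and not a Pentaho feature: `run_with_checkpoint`, the JSON state file, and the `next_index` key are all hypothetical names chosen for the example.

```python
import json
import os
import tempfile


def run_with_checkpoint(items, process, state_path):
    """Process items in order, resuming from the index stored in state_path."""
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next_index"]

    for index in range(start, len(items)):
        process(items[index])
        # Persist progress after each item so a failure loses at most one item of work.
        with open(state_path, "w") as f:
            json.dump({"next_index": index + 1}, f)
    return start


if __name__ == "__main__":
    state_path = os.path.join(tempfile.mkdtemp(), "state.json")
    attempts = []

    def flaky(item):
        attempts.append(item)
        # Simulate a failure the first time item "c" is processed.
        if item == "c" and attempts.count("c") == 1:
            raise RuntimeError("simulated failure")

    try:
        run_with_checkpoint(["a", "b", "c", "d"], flaky, state_path)
    except RuntimeError:
        pass
    # The second run skips "a" and "b" and picks up again at "c".
    run_with_checkpoint(["a", "b", "c", "d"], flaky, state_path)
    print(attempts)
```

Writing the state after every item keeps the sketch simple; a real pipeline would more likely checkpoint per batch to limit the I/O overhead.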

Often you are required to install plugins. If you need access to, in my case, Neo4j databases, you need a plugin to do it.

For how long have I used the solution?

Between my current role and the role at my last company, I've been working with the solution for over five years.

What do I think about the stability of the solution?

It's not very stable, at least not in the case of the community edition. I'm working with the community edition right now, and perhaps that is why it is not very stable; it sometimes causes the system to hang. I'm not sure if this is the case for the paid tiers.

What do I think about the scalability of the solution?

I am the only person using the solution currently. There are two other people that occasionally also assist in it. I'm helping them understand the tool and they are beginning to use it. In that sense, we're slowly scaling.

I don't know if the solution scales well on a large scale, however.

It scales very well overall, thanks to the very useful option to set the number of copies to start for every step, although this has the side effect of consuming a lot of memory and CPU resources.
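The trade-off of running several copies of one step can be sketched generically. The example below is a hedged illustration in Python using a thread pool, not how Pentaho schedules step copies internally; `run_step_in_copies` and `clean` are hypothetical names, and each worker stands in for one copy of the step.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step_in_copies(step, rows, copies):
    """Apply `step` to every row using `copies` parallel workers."""
    # Each worker plays the role of one copy of the step. map() returns the
    # results in input order even though rows are processed concurrently.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(step, rows))


if __name__ == "__main__":
    def clean(row):
        return row.strip().upper()

    print(run_step_in_copies(clean, [" a ", " b ", " c "], copies=2))
```

More workers can raise throughput for steps that wait on I/O, but every extra worker holds its own working data, which is where the additional memory and CPU use comes from.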

How are customer service and support?

We haven't really contacted technical support in the past. We try to handle any issues ourselves in-house. I can't speak to the quality of the technical support, having never directly dealt with them.

Which solution did I use previously and why did I switch?

We've never really used another solution like this in our organization. This is the first.

How was the initial setup?

The solution is pretty simple to set up. It's not complex.

For us, deployment took about one month.

Maintenance is easy. The only maintenance tasks are upgrading to newer versions and backing up the repository frequently.

What about the implementation team?

I handled the implementation on my own. I didn't need any help from a reseller or consultant.

What's my experience with pricing, setup cost, and licensing?

We're using the community edition, which is free to use. I'm not sure how much their paid services cost. We haven't purchased any licensing.

What other advice do I have?

We're just users of the solution. We don't have a professional relationship with the company.

The solution is great to use and easy to share with teams via the central repository. It's very functional overall. I'd recommend the solution to other companies.

I'd rate the solution eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user254223 - PeerSpot reviewer
Project Manager - Business Intelligence at a consultancy with 51-200 employees
Consultant
Feb 5, 2018
It has improved our data integration capabilities
Pros and Cons
  • "It has improved our data integration capabilities."
  • "Provides a good open source option."
  • "There is not a data quality or MDM solution in the Pentaho DI suite."
  • "I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse."
  • "I work with the Community Edition, therefore I do not have support. There was an issue that I could not resolve with community support."

How has it helped my organization?

Developed ETL processes to load a data warehouse. Has improved our data integration capabilities.

What is most valuable?

  • Easy to use
  • Development of the product
  • A lot of predefined steps
  • Good open source option

What needs improvement?

There is not a data quality or MDM solution in the Pentaho DI suite.

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

No issues.

What do I think about the scalability of the solution?

I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse.

How are customer service and technical support?

I work with the Community Edition, therefore I do not have support. There was an issue that I could not resolve with community support.

Which solution did I use previously and why did I switch?

I switched from our previous solution for cost reasons.

How was the initial setup?

It was not complex.

What's my experience with pricing, setup cost, and licensing?

There is a good open source option (Community Edition).

Which other solutions did I evaluate?

No.

What other advice do I have?

There is a lack of support if you work with the Community Edition.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Consultant at a comms service provider with 11-50 employees
Consultant
Sep 26, 2017
Simple to install and simple to use and helps us mine, clean, and arrange terabytes of data
Pros and Cons
  • "It's very simple compared to other products out there."
  • "One thing that I don't like, just a little, is the backward compatibility."

What is most valuable?

It's very simple compared to other products out there.

How has it helped my organization?

We use Pentaho for data integration, but also PI to implement data mining. That has improved the intelligence behind the data, so we are able to give our customer the ability to understand their data. Our customer produces terabytes of data, so arranging and cleaning the data through data integration has helped our customer understand the data and improve their business.

What needs improvement?

One thing that I don't like, just a little, is the backward compatibility. I have used Pentaho since version 4, and version 6 does not work with the whole of our ETL design. So backward compatibility is a problem.

For how long have I used the solution?

I have worked with this product for seven years.

What do I think about the stability of the solution?

It's a stable product. In fact, it contains some mocks, where you can write your own Java software and do an ETL specific to your needs.

How are customer service and technical support?

The support is very fast, but there are also a lot of forums to address problems, so you can find the solution to your issue easily. There is also the possibility to buy support, and when we bought support they resolved our problem in 24 hours.

How was the initial setup?

It was very, very simple. I copied the integration folder, started the tool to design the ETL, and it worked. Time was required to design the ETL, just to understand how each block works. Once you understand how each block works, you need to spend no more time learning to use the product.

Which other solutions did I evaluate?

Before using Pentaho, I analyzed other products to understand which is the best ETL product. I tested Talend and Oracle Data Integrator. It is a little more difficult to understand how Oracle Data Integrator works.

So, I preferred Pentaho Data Integration because you just have to drag and drop the block, draw a line to connect the block, write the query, and connect to the DB. There's nothing else you need to do. For Oracle Data Integrator, and also for Talend, you spend more time installing the product. By contrast, with Pentaho, you just have to copy the folder, launch the product, and then you just need the Java machine and it works.

What other advice do I have?

When you start to use this product, if you have just a little experience and know about ETL, you will have to spend little time learning it. The product is very, very simple to understand. You can build functionality by yourself.

For anyone thinking about an ETL product who wants high productivity in data cleaning and data movement, Pentaho Data Integration is, in my opinion, the best tool.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Buyer's Guide
Download our free Pentaho Data Integration and Analytics Report and get advice and tips from experienced pros sharing their opinions.
Updated: January 2026