Owner with 51-200 employees
Pentaho BI Suite Review: Pentaho Reporting – Part 3 of 6

This is the third of a six-part review of the Pentaho BI suite. In each part of the review, we will take a look at the components that make up the BI suite, according to how they would be used in the real world.

In this third part, we'll be discussing the tools and facilities with which reports are designed, generated, and served. A full BI suite should offer several reporting facilities, usable by users with different levels of technical and database knowledge.

Why is this important? Because in the real world, the owners of data (the people who consume reports to make business decisions) range from accountants, customer account managers, and supply-chain managers to C-level executives and manufacturing managers. Notice that proficiency in writing SQL queries is not a prerequisite for any of those positions.

In the Pentaho BI Suite, we have these reporting components:

  1. Pentaho Report Designer (PRD) – A stand-alone program on par with the Jasper or iReport designers and, to a lesser extent, the Crystal Reports designer.
  2. Pentaho Model Data Source – A way to encapsulate data sources, including the most flexible of all: a SQL query. Once this is set up by the data personnel, data owners can use it to generate ad-hoc reports – and dashboards too, which we'll discuss in Part 5 of this review series.
  3. Saiku Reporting Tool – A convenient way to create ad-hoc reports based on the Pentaho Data Sources (see number 2 above).

Let's discuss each of these components individually. The screenshots below are sanitized to remove references to our actual clients. A fictitious company called “DonutWorld” is used to illustrate and relate the concepts.

Pentaho Report Designer (PRD)

This Java standalone program feels like the Eclipse IDE, because they share the same UI library. If you are already familiar with Jasper Reports, iReport, or Crystal Reports, the concepts are similar (bands, groups, details, sub-reports). You start with a master report in which you can combine different data sources (SQL and MDX queries in this case) into a layout that is managed via a set of properties.

Learning experience: As with any report designer – complex software, given the sheer number of tweakable properties governing each element of a report – one has to be prepared to learn PRD. While the tools are laid out logically, it will take some time for new personnel to absorb the main concepts. The sub-report facility is one of the most powerful features of this program, and it is the key to creating reports that drill into more than one axis (or dimension) of data.

Usage experience: Element placement on the page is not 100% precise, and there were times when I had to work around quirks and inconsistencies in setting default values for properties, especially the ones containing formulas. Be prepared to have dedicated personnel (either a permanent employee or a consultant) who can be reached for report designs *and* subsequent modifications. In addition, aesthetic considerations are important for creating visually engaging reports (who wants to read a boring, bland report?).

Figure 1. The typical look of PRD when designing a report.

Pentaho Model Data Source

The Data Source facility is accessible from within the Pentaho BI Server UI (the PUC; see Part 2 of this review series for more information). Once you have logged in, look for a section on the screen that allows you to create or manage existing data sources.

This feature allows data personnel to set up “models”, constructed from various data sources, that present a flat view of the data from which non-technical data owners can create ad-hoc reports or dashboards. Obviously, this feature does not remove the need to learn the tools for creating those reports and dashboards; it simply detaches the crafting of SQL/MDX queries and the intricacies of OLAP data structures from the creation of an ad-hoc report.

Learning experience: Data personnel who are familiar with the Data Warehouse (DW) can easily create models out of SQL queries against existing tables within the DW, or by using MDX queries against existing OLAP cubes. Data owners who are familiar with the data itself can then start to use the Saiku Ad-hoc Reporting tool or the CDE (Community Dashboard Editor) to create dashboards. In reality, expect a couple of weeks for personnel to get accustomed to this feature, assuming a knowledgeable BI teacher or consultant is available during that time.

Usage experience: By separating technical database skills from the ability to generate ad-hoc reports, Pentaho has provided a way for organizations to move their business decision-making further away from the technical minutiae that tend to bog down the process with details that are not relevant to the business goals. I rate this feature as one of the more innovative contributions of the Pentaho BI Suite to the area of Business Process Management.


Figure 2. Creating a model out of a SQL query

NOTE: The most important part of using this facility has more to do with business process than with familiarity with the data itself. Without a good process in place, reports can easily get out of sync with the underlying data model. This is where the construction and maturity of the Data Warehouse are tested. For example, a sufficiently mature DW will notify the data personnel of any data model change, which should trigger an update of the Model Data Source, which may or may not affect the ad-hoc reports.

If the DW is designed correctly, there should be quite a few fact tables that can readily be translated into Model Data Sources. That is the first step. Now let's look at how to use these models.

Saiku Ad-hoc Reporting

Saiku is the name of two tools available from the PUC. The first is the Saiku Analytics tool, which allows us to drill into an OLAP cube and perform analysis using aggregated measures (we'll review it in Part 4). The second is the Saiku Ad-hoc Reporting tool, which is the one we are going to look at here. Using modern UI libraries such as jQuery, the developers of Saiku give us a convenient drag-and-drop UI that is easy to learn and use.

Once a model is published, it becomes available in the drop-down list at the top left of the Saiku Ad-hoc Reporting tool. See the screenshot below:
Figure 3. A Saiku report in progress

You can then choose fields from the model's list of available fields and place them in either the Columns list or the Groups list. From that same list, you can also designate fields as filters; the most obvious example is a transaction date/time range, which determines the period the report covers.

As you assign fields to the proper report elements, the tool starts populating the preview area with what the report will look like. You can also specify an aggregation for each of the groupings, which is very handy.

There is limited control over the templates that govern the appearance of the report – not enough for serious use. The best remedy, however, is to export the report to a .prpt file, which you can open in PRD and tweak to your heart's content.

Once you are happy with the report, you can save it for later editing – another thoughtful design decision by the Pentaho team.

Overall, the Saiku Ad-hoc Reporting tool is a handy facility for crafting quick reports that answer specific questions based on the available model data sources. If your data personnel diligently update and maintain the models, this tool can be invaluable in supporting your business decisions.

Report Delivery

None of the above would mean much without a practical and useful way to deliver reports to their requesters. Here, the comprehensive nature of the Pentaho BI Suite helps by providing facilities such as xactions and input UI controls for report parameters.

For example, a report designed in PRD can be published on the PUC. At some point a user opens it on the PUC and supplies the necessary parameters; the xaction script then fires an ETL that renders the .prpt file into a .pdf and either emails it to the requester or drops it into a shared folder.
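
To make that pipeline concrete, here is a minimal sketch of the rendering half using the Pentaho Reporting engine SDK outside the server. The file names and the REGION parameter are made up for illustration; only the engine calls themselves are real API.

```java
import java.io.File;
import java.io.FileOutputStream;

import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfReportUtil;
import org.pentaho.reporting.libraries.resourceloader.Resource;
import org.pentaho.reporting.libraries.resourceloader.ResourceManager;

public class RenderReport {
  public static void main(String[] args) throws Exception {
    // Boot the reporting engine once per JVM.
    ClassicEngineBoot.getInstance().start();

    // Load the .prpt design produced by PRD (hypothetical file name).
    ResourceManager manager = new ResourceManager();
    manager.registerDefaults();
    Resource resource = manager.createDirectly(
        new File("sales-by-region.prpt"), MasterReport.class);
    MasterReport report = (MasterReport) resource.getResource();

    // Supply the parameter the user would normally enter on the PUC.
    report.getParameterValues().put("REGION", "East");

    // Render to PDF; an xaction or ETL could then email the file
    // to the requester or drop it into a shared folder.
    PdfReportUtil.createPDF(report, new FileOutputStream("sales-by-region.pdf"));
  }
}
```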

Reports can also be “burst” via an ETL script that uses the Pentaho Reporting Output step available within Spoon (the ETL editor). I have used this method to distribute periodically generated reports to different recipients, each containing only the data appropriate to that recipient's access permission level. This saves a lot of time and increases the efficiency of distributing up-to-date information inside a company.

Summary

The reporting tools in the Pentaho BI Suite are designed to allow different users within a company to generate reports that are either pre-designed or ad-hoc. The reports are made available on the Pentaho User Console (PUC), where users log in and initiate report generation. Reports can also be scheduled and generated via ETL scripts.

PRD will be instantly recognizable to anyone who has used tools like Crystal Reports and its derivatives. You can also specify MDX queries against any OLAP cube schema published on the Pentaho BI Server as a data source.

The Model Data Source facility allows data owners who are not data personnel to create ad-hoc reports quickly and save them for future use and modification.

The Saiku Ad-hoc Reporting tool is the UI with which the available models can be used to generate reports on the fly. These reports can also be saved for later use.

Next, in part four, we will discuss Pentaho Mondrian (the MDX query engine) and the OLAP cube schema tools.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Owner with 51-200 employees
Pentaho BI Suite Review: Pentaho BI Server – Part 2 of 6

Introduction

This is the second of a six-part review of the Pentaho BI suite. In each part of the review, we will take a look at the components that make up the BI suite, according to how they would be used in the real world.

In this second part, we'll be discussing the Pentaho BI Server, from which all of the reports, dashboards, and analytic tools are served to users. A BI suite usually has a central place where users log in using their assigned credentials. In this case, the server is a standalone web server (an Apache Tomcat instance) augmented by various tools that provide the functionality – most of these tools are written by Webdetails (webdetails.pt). We'll visit these tools in subsequent parts of the review; for now, let's focus on the server itself.

In the case of Pentaho BI Server, it has two components:

  • The Pentaho User Console (a.k.a. PUC) – this is what we usually associate with the main BI Server in the Pentaho world; it is where users spend the majority of their time generating reports (both real-time and scheduled), using the analytic tools, and building and publishing dashboards. This is also where administrators manage who can access which reports, either by User or by Role – and obviously, Role-based ACL is cleaner and easier to maintain.

  • The Administration Console (a.k.a. PAC) – this is where admin users go to create new Users and Roles and to schedule jobs. It is another standalone web server that can be started and stopped as needed; it is totally independent of the main PUC server.

Is it Corporate-Ready?

BI servers are considered ready for corporate “demands” based on the number of users they can support and the facilities available to manage them. The Pentaho BI Suite Enterprise Edition is without a doubt ready for corporate use, because it comes with the support that will make sure that is the case.

The Community Edition is more interesting: it is definitely corporate-ready, but the personnel who set it up need to be intimately familiar with the ins and outs of the server itself. Having installed three of these, I am confident that the BI Server, thanks to its built-in ACL management, is ready for prime time in the corporate world.

Although the Pentaho BI Server includes a scheduler – another “corporate” feature – I find myself using cron (or Windows Task Scheduler) for the most part. The built-in scheduler is based on the Quartz library for Java. It is a good facility with a decent UI for scheduling reports or ETL runs from within the PUC.
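
The PUC scheduler itself is point-and-click, but because it sits on Quartz, its cron expressions follow Quartz syntax, including the leading seconds field. For the curious, this standalone sketch (a hypothetical job class, not Pentaho's internal code) shows what Quartz cron scheduling looks like:

```java
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

// Hypothetical job body, for illustration only; Pentaho wires in its own
// job classes that run reports and ETL scripts.
public class NightlyReportJob implements Job {
  @Override
  public void execute(JobExecutionContext context) {
    System.out.println("Generate and distribute the nightly report here.");
  }

  public static void main(String[] args) throws SchedulerException {
    JobDetail job = JobBuilder.newJob(NightlyReportJob.class)
        .withIdentity("nightlyReport").build();

    // Quartz cron has a leading seconds field: this fires 06:30 every weekday.
    Trigger trigger = TriggerBuilder.newTrigger()
        .withSchedule(CronScheduleBuilder.cronSchedule("0 30 6 ? * MON-FRI"))
        .build();

    Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
    scheduler.scheduleJob(job, trigger);
    scheduler.start();
  }
}
```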

Is it Easy to Use?

The PAC is very easy to use. The UI is simple, thanks to the minimal number of menus and options. In a sense, it is a simple facility for managing users, roles, and scheduling – not ACLs, just users and roles.

The PUC is more involved, but by adopting a familiar file-folder look and feel in the left panel, it is quite easy to get into and start using. Administrators will love the way they can set who can Execute, Edit, or Schedule each report, saved analytic view, and dashboard – by the way, Pentaho calls these Solutions.

Setting up the BI server is better left to consultants who are used to doing it. If in-house personnel will be doing it instead, it is worth their time to participate in the training webinars that Pentaho holds periodically. The steps to set up a BI server are far from simple, but that is the case for all BI servers, regardless of brand.

The collapsible left panel serves as the directory of solutions, with the top part showing the folders and the bottom part showing the individual solutions. The bigger panel on the right is where you actually see the content of a solution. In some cases, that is also where you'd create a dashboard using the CDE tool (we'll revisit this in a later part of the review).

Is it Easy to Create Solutions?

Remember that the concept of a “solution” here refers to the different types of reports, dashboards, and analytic views. The Pentaho BI Server employs a “glue” scripting facility called xactions. These are XML documents that contain a sequence of actions that can do various things, such as:

  1. Asking users for input parameters

  2. Issuing a SQL query based on user input

  3. Triggering an ETL that produces reports

Once you are familiar with this facility, it is not that hard to start producing solutions, but it pays to install the included examples and study them to learn how to do certain things with xactions and/or to copy snippets into your own scripts. A heavily simplified skeleton is sketched below.
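
To show the shape of these scripts, here is an illustrative xaction skeleton that asks for a parameter and issues a SQL query with it. It is modeled from memory on the bundled samples; element and component names (such as SQLLookupRule) vary between versions, so treat the specifics as assumptions and rely on the installed examples for the authoritative structure.

```xml
<!-- Illustrative only: verify element/component names against the
     samples that ship with your server version. -->
<action-sequence>
  <title>Sales by Region</title>
  <inputs>
    <!-- 1. Ask the user for an input parameter -->
    <region type="string">
      <default-value>East</default-value>
      <sources><request>region</request></sources>
    </region>
  </inputs>
  <actions>
    <!-- 2. Issue a SQL query based on that input -->
    <action-definition>
      <component-name>SQLLookupRule</component-name>
      <action-inputs><region type="string"/></action-inputs>
      <action-outputs><query-result type="result-set"/></action-outputs>
      <component-definition>
        <jndi>DonutWorldDW</jndi>
        <query>SELECT store, SUM(amount) FROM fact_sales
               WHERE region = '{region}' GROUP BY store</query>
      </component-definition>
    </action-definition>
    <!-- 3. A further action-definition (e.g. a KettleComponent)
         could trigger an ETL that renders a report. -->
  </actions>
</action-sequence>
```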

On the PUC, we can build these solutions:

  1. Dashboards using CDE

  2. Ad-hoc reports and data models using the built-in model generator (very handy for accessing those BI tables that are populated by ETL runs)

  3. Analytic views using tools like Saiku, or its equivalent in the Professional and Enterprise editions. NOTE: This requires a pre-published schema, built using another tool called Schema Workbench (we will see this in later parts of this review series)

Is it Customizable?

Being the user-facing tool, one of the requirements is the ability to customize the appearance via themes; at the very least, a BI server needs to allow companies to change the logo to their own.

The good news is that you can do all of that with the Pentaho BI Server. If you opt for the Professional or Enterprise editions, you can rely on the support you have already paid for. For those using the Community Edition, customizing the appearance requires knowledge of how a typical Java web server is structured. Again, any good BI consultant should be able to tackle this without much difficulty.

Here is an example of a customized PUC login page:

In case you are wondering: yes, you can customize the PUC interface as well, and it even comes with a theme structure in which your graphic artists can redefine the CSS elements.

Summary

The Pentaho BI Server is the central place where users interact with the Pentaho BI Suite. It brings together the solutions (what Pentaho calls content) produced by the other tools in the suite and exposes them to users, protected by a robust ACL.

On the balance between ease of use and the ability to customize, the Pentaho BI Server scores well, provided that the personnel in charge are familiar with the Java enterprise environment. To illustrate: in one project, I managed to tweak the security framework to make the PUC part of a single-sign-on Liferay portal, alongside other applications such as Opentaps and Alfresco.

Next, in part three, we will discuss the wide array of Pentaho reporting tools.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Owner with 51-200 employees
Pentaho BI Suite Review: PDI – Part 1 of 6

Introduction

The Pentaho BI Suite is one of the more comprehensive BI suites, and it is also available as an open-source project (the Community Edition). Interestingly, the absence of license fees is far from the only factor in choosing this particular tool to build your Data Warehouses (OLAP systems).

This is the first of a six-part review of the BI suite. In each part of the review, we will take a look at the components that make up the BI suite, according to how they would be used in the real world.

In this first part, we'll be discussing Pentaho Data Integration (from here on referred to as PDI), which is the ETL tool that comes with the suite. An ETL tool is the means by which you take in data from various sources – typically out of some transactional systems – then transform the format and flow the data into another data model that is OLAP-friendly. It therefore acts as the gateway to the other parts of the BI suite.

In the case of PDI, it has two components:

  • Spoon (the GUI), where you string together a set of Steps within a Transformation and, optionally, string multiple Transformations together within a single Job. This is where you will spend the bulk of your time developing ETL scripts.

  • The accompanying set of command-line scripts that can be configured to launch from a scheduler like cron or Windows Task Scheduler – notably pan, the single-Transformation runner; kitchen, the Job runner; and carte, the slave-server runner. These tools give us the flexibility to create our own network of multi-tiered notification systems, should we need to. A typical cron entry is sketched below.
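
For example, here is the kind of crontab entry that drives kitchen on a schedule. The paths and the parameter name are hypothetical; -file, -level, and -param are standard kitchen options.

```sh
# Run the nightly warehouse load at 02:00 (note: % must be escaped in crontab).
0 2 * * * /opt/pdi/kitchen.sh -file=/etl/jobs/nightly_load.kjb \
    -level=Basic -param:RUN_DATE=$(date +\%F) \
    >> /var/log/etl/nightly_load.log 2>&1
```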

Is it Feature-Complete?

ETL tools are interesting because anyone who has implemented a BI system has a standard list of major features they expect to be available, and this list does not change from one brand of tool to another. Let's see how PDI fares:

  1. Serialized vs. parallel ETL processing: PDI handles parallel (asynchronous) steps within Transformations, which can be strung together in a Job when a serialized sequence is needed.

  2. Parameter handling: PDI has a properties file that allows us to parameterize things that are specific to different platforms (dev/test/prod), such as database names, credentials, and external servers (see the kettle.properties sketch after this list). It also features parameters that can be created during the ETL run out of the data in the stream, then passed from one Transformation to another within a Job.

  3. Script management: Just like any other IT documents (or artifacts, as some call them), ETL scripts need to be managed, version-controlled, and documented. PDI scores high on this front – not because of some specific feature, but due to design decisions that favor simplicity: the scripts are plain XML documents. That makes them very easy to manage, version-control, and, if necessary, batch-edit. NOTE: For those who want enterprise-level script management and version control built into the tool, Pentaho offers it as part of the Enterprise edition. But for the rest of us who already have a document management process – because we also develop software using other tools – it is not as crucial.

  4. Clustering: PDI supports round-robin-style load balancing across a given set of slave servers. For those using Hadoop clusters, Pentaho recently added support for running Jobs on those.
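
As a concrete illustration of point 2, here is what a minimal kettle.properties might look like, with hypothetical names. The file lives in the directory pointed to by KETTLE_HOME (~/.kettle by default); steps and database connections reference the values as ${DB_HOST} and so on, so the same Transformation runs unchanged on dev, test, and prod.

```properties
# ~/.kettle/kettle.properties -- hypothetical dev-box values
DB_HOST=dev-db.donutworld.local
DB_NAME=dw_dev
DB_USER=etl_dev
REPORT_OUTPUT_DIR=/srv/reports/dev
```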

Is it Easy to Use?

With its drag-and-drop graphical UI, ease of use is a given. It is quite easy to string steps together to accomplish an ETL process. The trick is knowing which steps to use, and when to use them.

The documentation on how to use each step could stand improvement, though fortunately it has slowly started to catch up over the years – and should you have the budget, you can always pay for the support that comes with the Enterprise Edition. Overall, though, it is a matter of using the steps enough to become familiar with their use cases.

This is why competent BI consultants are worth their weight in gold: they have been in the trenches and have accumulated ways to deal with the quirks that are bound to be encountered in a software system this complex (not just Pentaho; this applies to any BI suite product out there).



NOTE: I feel obligated to point out one (very) annoying fact: I cannot hit the Enter key to edit the selected step. Think about how many times you would use that functionality in any ETL tool.

Aside from that, in the few years that I've used various versions of the GUI, I've never encountered severe data loss due to stability problems.

Another measure of ease of use that I evaluate a tool by is how easy it is to debug its ETL scripts. With PDI, the logical structure of a script can be easily followed, so it is quite debug-friendly.

Is it Extensible?

It may seem a strange question at first, but think about it: one of the purposes of using an ETL tool is to deal with a variety of data sources. No matter how comprehensive the included data format readers/writers, sooner or later you will have to talk to a proprietary system that is not widely known. We had to do this once for one of our clients: we ended up writing a custom PDI step that communicates with the XML-RPC backend of an ERP system.

The good news is that, with PDI, anyone with some Java SDK development experience can readily implement the published interfaces and thus create their own custom Transformation steps. In this regard, I am quite impressed with the modular design, which allows users to extend the functionality, and consequently the usefulness, of the tool.
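
To give a feel for those interfaces, here is a trimmed sketch of the row-processing class of such a step, following the standard BaseStep pattern. A real plugin also needs the accompanying Meta, Data, and Dialog classes plus plugin registration, and the XML-RPC call is reduced to a comment here.

```java
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;
import org.pentaho.di.trans.step.BaseStep;
import org.pentaho.di.trans.step.StepDataInterface;
import org.pentaho.di.trans.step.StepInterface;
import org.pentaho.di.trans.step.StepMeta;
import org.pentaho.di.trans.step.StepMetaInterface;

public class XmlRpcLookupStep extends BaseStep implements StepInterface {

  public XmlRpcLookupStep(StepMeta stepMeta, StepDataInterface stepDataInterface,
                          int copyNr, TransMeta transMeta, Trans trans) {
    super(stepMeta, stepDataInterface, copyNr, transMeta, trans);
  }

  @Override
  public boolean processRow(StepMetaInterface smi, StepDataInterface sdi)
      throws KettleException {
    Object[] row = getRow();          // read one row from the incoming stream
    if (row == null) {                // no more input: tell downstream steps, stop
      setOutputDone();
      return false;
    }
    if (first) {
      first = false;
      // First row: normally clone getInputRowMeta() here and add the
      // fields this step produces to the output row metadata.
    }
    // The original custom step called the ERP's XML-RPC backend at this
    // point and appended the response values to the row.
    putRow(getInputRowMeta(), row);   // hand the (possibly enriched) row onward
    return true;                      // ask the engine to call us again
  }
}
```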

The scripting ability built into the steps is also one of the ways to handle proprietary – or extremely complex – data. PDI allows us to write JavaScript (or Java, should you want faster performance) programs to manipulate the data at the row level as well as pre- and post-run, which comes in very handy for variable initialization or for sending notifications that contain statistics about all of the rows.

Summary

PDI is one of the jewels in the Pentaho BI Suite. Aside from some minor inconveniences in the GUI tool, the simplicity, extensibility, and stability of the whole package make PDI a good tool for building a network of ETLs marshaling data from one end of your systems to another. In some cases, it even serves well as a development tool for the batch-processing side of an OLTP system.

Next, in part two, we will discuss the Pentaho BI Server.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Engineer at a marketing services firm with 51-200 employees
It does a lot of what we need but off-the-shelf solutions often can’t do exactly what you need

Being in the business of online-to-offline ad attribution and advertising analytics, we need tools to help us analyze billions of records to discover interesting insights for our clients. One of the tools we use is Pentaho, an open source business intelligence platform that allows us to manage, transform, and explore our data. It offers some nice GUI tools, can be quickly set up on top of existing data, and has the advantage of being on our home team.

But for all the benefits of Pentaho, making it work for us has required tweaking and in some cases replacing Pentaho with other solutions. Don’t take this the wrong way: we like Pentaho, and it does a lot of what we need. But at the edges, any off-the-shelf solution often can’t do exactly what you need.

Perhaps the biggest problem we faced was getting queries against our cubes to run quickly. Because Pentaho is built around Mondrian, and Mondrian is a ROLAP engine, every query against our cubes requires building dozens of SQL queries that join tables with billions of rows. In some cases this meant that Mondrian queries could take hours to run. Our fix has been to make extensive use of summary tables, i.e., summarizing counts of raw data at the levels we know our cubes will need to execute queries. By doing the summarization for all queries once in advance, queries that took hours now run in seconds. At worst, our Mondrian queries can take a couple of minutes to complete if we ask for really complicated things.
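
The summary tables themselves are just SQL rollups built ahead of time. A minimal sketch of the idea, wrapped in JDBC with hypothetical connection details and table/column names:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BuildDailySummary {
  public static void main(String[] args) throws Exception {
    try (Connection con = DriverManager.getConnection(
             "jdbc:postgresql://dw-host/attribution", "etl", "secret");
         Statement st = con.createStatement()) {

      // Rebuild the day-level rollup the cubes actually ask for, so queries
      // read thousands of pre-aggregated rows instead of joining billions
      // of raw impression rows.
      st.executeUpdate("DROP TABLE IF EXISTS agg_impressions_daily");
      st.executeUpdate(
          "CREATE TABLE agg_impressions_daily AS "
          + "SELECT impression_date, campaign_id, region_id, "
          + "       COUNT(*) AS impressions, COUNT(DISTINCT user_id) AS reach "
          + "FROM fact_impressions "
          + "GROUP BY impression_date, campaign_id, region_id");
    }
  }
}
```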

Early on, we tried to extend our internal use of Pentaho to our clients by using Action Sequences, also known as xactions after the Action Sequence file extension. Our primary use of xactions was to create simple interfaces for getting the results of Mondrian queries that could then be displayed to clients in our Rails web application. But in addition to sometimes slow Mondrian queries (in the world of client-facing solutions, even 15 seconds is extremely slow), xactions introduce considerable latency as they start up and execute, adding as much as 5 seconds on top of the time it takes to execute the query.

Ultimately we couldn’t make xactions fast enough to deliver data to the client interface, so we instead took the approach we use today. We first discover what is useful in Pentaho internally, then build solutions that query directly against our RDBMS to quickly deliver results to clients. Although, to be fair to Mondiran, some of these solutions require us to summarize data in advance of user requests to get the speed we want because that data is just that big and the queries are just that complex.

We’ve also made extensive use of Pentaho Data Integration, also known as Kettle. One of the nice features about Kettle is Spoon, a GUI editor for writing Kettle jobs and transforms. Spoon made it easy for us to set up ETL processes in Kettle and take advantage of Kettle’s ability to easily spread load across processing resources. The tradeoff, as we soon learned, was that Spoon makes the XML descriptions of Kettle jobs and transforms difficult to work on concurrently, a major problem for us since we use distributed version control. Additionally, Kettle files don’t have a really good, general way of reusing code short of writing custom Kettle steps in Java, so it makes maintaining our large collection of Kettle jobs and transforms difficult. On the whole, Kettle was great for getting things up and running quickly, but over time we find its rapid development advantages are outweighed by the advantages of using a general programming language for our ETL. The result is that we are slowly transitioning to writing ETL in Ruby, but only transitioning 0n an as-needed basis since our existing Kettle code works well.

As we move forward, we may find additional places where Pentaho does not fully meet our needs and we must find other solutions to our unique problems. But on the whole, Pentaho has proven to be a great starting platform for getting our analytics up and running and has allowed us to iteratively build out our technologies without needing to develop custom solutions from scratch for everything we do. And, I expect, Pentaho will long have a place at our company as an internal tool for initial development of services we will offer to our clients.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Comment from it_user108285, who works at a financial services firm:

Have you looked into using Talend? It has a great user interface, very similar to Kettle's, and the paid version has version control that works very well, plus the ability to run "joblets", which are basically reusable pieces of code. The free version has version control too, although it's pretty clumsy; it lacks joblets and is difficult to get working with GitHub.

Ricardo Díaz, COO at a tech services company with 11-50 employees
Fast Development (Agile BI), Good Charts and Visualization, Good Security, Good User Interface

What is most valuable?

Pentaho Analyzer (EE)

Saiku (CE)

Marketplace (CE)

R (EE and CE)

Community Dashboard Framework (CE)

Dashboard Editor (EE)

How has it helped my organization?

Powerful Analytics, Fast KPI Analysis

For how long have I used the solution?

4 Years

What was my experience with deployment of the solution?

Integration with GeoServer (especially shapefile layers on maps).

What do I think about the stability of the solution?

None

What do I think about the scalability of the solution?

Migrate old version of Reports (.prpt) to a new version

How are customer service and technical support?

Customer Service:

5/10

Technical Support:

9/10

Which solution did I use previously and why did I switch?

Yes, QlikView.

How was the initial setup?

Difficulty: medium

What was our ROI?

45%

Which other solutions did I evaluate?

QlikView

Tableau, SpagoBI

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Director of Technology
Increases productivity and lowers costs, though it should improve the construction of its dashboards
Pros and Cons
  • "I use the BI Server, CDE Dashboards, Saiku, and Kettle, because these tools are very good and highly experienced."
  • "Pentaho, at the general level, should greatly improve the easy construction of its dashboards and easy integration of information from different sources without technical user intervention."

What is most valuable?

I use the BI Server, CDE Dashboards, Saiku, and Kettle, because these tools are very good and very mature.

How has it helped my organization?

For the first eight years, I used this tool at one company. Now I have customers who hire me to give them advice. I have a couple of great customers in my country, and they are very satisfied because they have increased productivity and lowered costs.

What needs improvement?

Pentaho, at the general level, should make the construction of its dashboards much easier, along with the integration of information from different sources, without technical-user intervention.

For how long have I used the solution?

For 12 years. I have been using Pentaho CE 6.0 and 7.0. Last year, I implemented Pentaho CE 5.0.

What do I think about the stability of the solution?

I am currently trying out Pentaho CE 7.0 to determine whether it has any issues. I used Pentaho EE for several years without having issues.

What do I think about the scalability of the solution?

No; it is a very mature tool. It can do anything.

How are customer service and technical support?

Honestly, I don't know about the support for Pentaho EE. As for the support for Pentaho CE, it is bad. Fortunately, I am highly experienced and need it very little.

How was the initial setup?

To start with, the first configurations were very difficult. I began with the CE version, without good documentation or support, and spent years learning on my own.

What other advice do I have?

Hire specialized support for Pentaho. If customers want a professional tool and have the money, they should invest in the Enterprise edition of Pentaho, or hire a highly experienced company in their country that specializes in Pentaho.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Identity and Access Management Engineer at a financial services firm with 10,001+ employees
Easy to install, easy to use, the free edition meets our needs
Pros and Cons
  • "Easy to use components to create the job."
  • "Logging capability is needed."
  • "Version control would be a good addition."

What is most valuable?

Easy to use components to create the job.

What needs improvement?

  • Logging capability.
  • Version control would be a good addition.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

Jobs frequently get stuck, causing them to lock up and fail to run until we kill them.

What do I think about the scalability of the solution?

I have not needed to scale this product so far.

How are customer service and technical support?

There is an open community, and you can find good responses at a high level for the free edition. I have not used the commercial version, which includes support.

Which solution did I use previously and why did I switch?

This is the first solution I have used, and I like it.

How was the initial setup?

Simple, easy to install.

What's my experience with pricing, setup cost, and licensing?

Free and commercial versions are available.

Which other solutions did I evaluate?

We did not evaluate other options as this one is free.

What other advice do I have?

Good for any size organization. There are other products and vendors available to better handle errors and logging, but for us, the free version of Pentaho is good enough to satisfy our needs.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Scientist at a tech services company with 501-1,000 employees
It became a lot easier for our developers to switch between or join the different development projects.

What is most valuable?

I found Pentaho Data Integration the most valuable component, since it is the most mature open-source ETL tool available. Compared to other, proprietary products, it has a less steep learning curve due to its very intuitive user interface. Besides that, it has a pluggable architecture, which makes it quite easy to extend with custom functionality and features.

Another thing worth mentioning is the very active user community around the products, which provides some great resources for community support.

How has it helped my organization?

On the data integration side, each development team used to write its own integration scripts, parsers, and interfaces from scratch on every project, over and over again. With Pentaho Data Integration, which offers all of these common tasks out of the box, we reduced development time significantly. Also, by using such a universal tool and introducing a uniform architecture, it became a lot easier for our developers to switch between and/or join the different development projects.

On the business intelligence side, we moved from developing custom solutions on each track to using the standard functionality of the BI Server, thus cutting down both complexity and development time.

What needs improvement?

Since most of our projects start off as a proof of concept with the Community Edition of the products, we found that the differences between the Community and Enterprise Editions are too big on certain levels. It would be a big gain if the Community Edition were a full representation of the Enterprise Edition, making it easier to move on to the Enterprise Edition and its support.

For how long have I used the solution?

I started using Pentaho Data Integration around seven years ago and moved on to the full stack about five years ago.

What was my experience with deployment of the solution?

I have seen many different (custom-built) deployment solutions for Pentaho throughout the years, each having its own pros and cons.

What do I think about the stability of the solution?

We've had no issues with its stability.

What do I think about the scalability of the solution?

Since Pentaho supports everything from running as a single process to a clustered architecture, and has a big focus on (distributed) big data environments, scalability hasn't been an issue for us.

How are customer service and technical support?

The open-source strategy of Pentaho has resulted in a very active community, which has provided us all the support we need. Compared to other big vendors, my personal experience is that response times are a lot shorter.

Which solution did I use previously and why did I switch?

Most of our previously used solutions were custom-built. We evaluated both open-source and proprietary competing products, but found that Pentaho was the easiest to adopt.

How was the initial setup?

Depending on the nature of the solution, the initial setup for a basic data warehouse architecture is quite straightforward. But as with all solutions, the complexity increases as the landscape grows and user requirements evolve. I think Pentaho suits today's demand for a continuous-integration approach well. With this in mind, the initial setup is crucial if you don't want to find yourself spending a lot of time and effort refactoring the complete solution over and over again.

What about the implementation team?

We implemented it in-house. Keep your development and implementation cycles short and small if possible. Users demand fast implementation of requirements, so the continuous-integration approach becomes more crucial, as does self-service functionality – though the latter is not yet the strongest use case for Pentaho.

What was our ROI?

The decrease in development time, compared to our traditional development cycles in pure enterprise Java solutions, should be estimated at around 60%.

What's my experience with pricing, setup cost, and licensing?

Unfortunately, I can't provide exact figures, but using the Community Edition for the development and test cycles brings down the licensing costs for the complete OTAP street.

What other advice do I have?

As mentioned before, there is a great community of users, developers, and other enthusiasts, which I recommend consulting for your particular use case. Check the latest Gartner report (2016) on BI vendors, and ultimately visit one of the Pentaho Community Meetups to get more insight.

Disclosure: I am a real user, and this review is based on my own experience and opinions.