Data Scientist at a wellness & fitness company with 51-200 employees
Real User
Leaderboard
Data ingestion has reduced manual effort to import data

What is our primary use case?

The primary use case is data ingestion. We currently have HDP 2.6 installed on Ubuntu 16.04.

How has it helped my organization?

It has reduced the manual effort needed to import data.

What is most valuable?

Data ingestion

What needs improvement?

Not enough material is available for beginners.

Buyer's Guide
Talend Data Quality
June 2025
Learn what your peers think about Talend Data Quality. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
860,592 professionals have used our research since 2012.

For how long have I used the solution?

Less than one year.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user827655 - PeerSpot reviewer
Principal Developer
Real User
It lowers the amount of time in development from weeks to a day
Pros and Cons
  • "It lowers the amount of time in development from weeks to a day."
  • "If the SQL input controls could dynamically determine the schema based on the SQL alone, it would simplify the steps of having to use a manually created and saved schema for use in the tMap for the Postgres and Redshift components. This would make things even easier."

What is our primary use case?

We use it to load our big data system with S3 and Redshift. We also use it to process HL7 data from hospitals in real time.

How has it helped my organization?

It lowers the amount of time in development from weeks to a day.

What is most valuable?

The ease of transforming data with inputs to tMap and tJavaRow components makes life so easy.

What needs improvement?

There is one place where I would appreciate an upgrade, if it is possible. If the SQL input controls could dynamically determine the schema based on the SQL alone, it would simplify the steps of having to use a manually created and saved schema for use in the tMap for the Postgres and Redshift components. This would make things even easier. When it does guess the schema, it tends to bring back every column from every table, or every column from the table specified in the component's table name. Sometimes, the SQL draws on multiple tables and includes some data transformations.

I do not know if it would even be possible, but if this could be figured out automatically for the column names and types, that would be amazing.
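
As an illustration of the idea only (not how Talend works), the sketch below derives column names and types from the query result itself rather than from the underlying tables. It assumes Python with the standard-library sqlite3 module standing in for Postgres or Redshift; the tables, the query, and the infer_schema helper are hypothetical, and taking types from the first row is a simple heuristic.

```python
import sqlite3

def infer_schema(conn, sql):
    """Return [(column_name, python_type_name)] for an arbitrary SELECT.

    Names come from the cursor metadata; types are guessed from the
    first returned row (a heuristic, and "unknown" if there are no rows).
    """
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    first_row = cur.fetchone()
    if first_row is None:
        return [(name, "unknown") for name in names]
    return [(name, type(value).__name__) for name, value in zip(names, first_row)]

# Hypothetical tables used only for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER, name TEXT);
    CREATE TABLE visits (patient_id INTEGER, cost REAL);
    INSERT INTO patients VALUES (1, 'Ann');
    INSERT INTO visits VALUES (1, 12.5), (1, 7.5);
""")

# A query that joins two tables and transforms the data.
query = """
    SELECT p.name AS patient, COUNT(*) AS visit_count, SUM(v.cost) AS total_cost
    FROM patients p JOIN visits v ON v.patient_id = p.id
    GROUP BY p.name
"""
schema = infer_schema(conn, query)
print(schema)
```

Only the three columns the query actually returns appear in the result, not every column of every table involved.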

For how long have I used the solution?

More than five years.

What other advice do I have?

I have not run into a problem that we could not solve with Talend.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user826299 - PeerSpot reviewer
Junior ETL Developer at a marketing services firm with 51-200 employees
Real User
Heap space issues plague us consistently. However, the file fetch process is impeccable.
Pros and Cons
  • "The file fetch process is impeccable."
  • "We are able to get emails from URLs very easily using this function when others fail."
  • "tLogRows are also great for finding bad data."
  • "NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down."
  • "Heap space issues plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does."
  • "Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using."

What is our primary use case?

We are a marketing and advertising company. We use this tool to fetch data from Google, Bing, and Adobe. We receive marketing data daily via email, FTP, and API, then process the data into MySQL tables.

How has it helped my organization?

I came into the department with no knowledge of Talend, and the interface has been user-friendly enough to bring me up to speed on almost all of its functions in four to five months and let me use it like a pro.

What is most valuable?

  • The file fetch process is impeccable. 
  • We are able to get emails from URLs very easily using this function when others fail. 
  • tLogRows are also great for finding bad data.

What needs improvement?

NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows, then the next day, it will find one blank cell and it breaks down. When we are dealing with millions of rows of data, the offending cell can be super hard to find.

Heap space issues also plague us consistently. We maxed it out and it runs fine, then it doesn’t, then it does. 

Finding assistance with issues can be spotty. With Python, there are literally millions of open source answers which are recent and apply to the version that we are using. 

Inconsistency is a big issue.
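
One generic way to make blank cells easier to locate is to scan for them before the transformation step, so that the row and column are reported instead of the job failing partway through. The sketch below is a minimal illustration in Python, not a Talend feature; the find_blank_cells helper and the sample rows are hypothetical.

```python
def find_blank_cells(rows, required):
    """Yield (row_index, column) for each required column that is None or blank."""
    for index, row in enumerate(rows):
        for column in required:
            value = row.get(column)
            if value is None or str(value).strip() == "":
                yield index, column

# Hypothetical marketing rows; one has a missing value, one a whitespace-only value.
rows = [
    {"campaign": "spring", "clicks": "120"},
    {"campaign": None, "clicks": "45"},
    {"campaign": "fall", "clicks": "  "},
]

blanks = list(find_blank_cells(rows, ["campaign", "clicks"]))
bad_rows = {index for index, _ in blanks}

# Keep only rows with no blank required cells for downstream processing.
clean = [row for index, row in enumerate(rows) if index not in bad_rows]
print(blanks)
```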

For how long have I used the solution?

Three to five years.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user826677 - PeerSpot reviewer
Technical Consultant
Consultant
Provides a flexible development environment to the coder
Pros and Cons
  • "It has definitely streamlined certain processes."
  • "Provides a flexible development environment to the coder."
  • "The ability to change the code when debugging the JavaScript could be improved."

What is our primary use case?

Data migration (database to database using direct DB access and commands or using web services).

How has it helped my organization?

It has definitely streamlined certain processes.

What is most valuable?

The ability to build the interface using clear components and to access the underlying Java code to validate and trace any error. The wide range of components suits a variety of purposes and provides a flexible development environment for the coder.

What needs improvement?

The ability to change the code when debugging the JavaScript could be improved.

For how long have I used the solution?

One to three years.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Practice Manager
Real User
It reduces the QA effort immensely by handling most of the test scenarios in a reusable way
Pros and Cons
  • "It reduces the QA effort immensely by handling most of the test scenarios in a reusable way."
  • "This product speeds up the unit testing and QA for specific test scenarios. As a result, the development output quality can be evaluated and adjusted."
  • "I like the idea of storing the results of Data Quality jobs in a DB and having the ability to run reports in the DB to show a dashboard of quality metrics."
  • "There are too many functions which could be streamlined."
  • "Many functions are presented in a non-streamlined manner and could be refined into better off-the-shelf functions."

What is our primary use case?

Data Quality is used to automate the quality control check on the data loaded from batch jobs. This includes BCA for field level data quality and cross table checks for key column mismatches.

The data is in Redshift and the load volume is around 10 million records per batch load over more than 100 tables in a Data Vault model.

This is for a short three-month project. I have used it from the dev phase through QA. It reduces the QA effort immensely by handling most of the test scenarios in a reusable way.
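
The two kinds of check described above, a field-level quality check and a cross-table check for key mismatches, can be sketched generically as follows. This is a minimal illustration in plain Python, not Talend Data Quality itself; the table names, columns, and helper functions are hypothetical.

```python
def null_rate(rows, column):
    """Field-level check: fraction of rows where the column is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

def orphan_keys(child_rows, child_key, parent_rows, parent_key):
    """Cross-table check: child key values that have no match in the parent."""
    parent_values = {row[parent_key] for row in parent_rows}
    child_values = {row[child_key] for row in child_rows}
    return sorted(child_values - parent_values)

# Hypothetical hub and satellite tables from a batch load.
hub_customer = [{"customer_id": 1}, {"customer_id": 2}]
sat_customer = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 3, "email": "c@example.com"},
]

rate = null_rate(sat_customer, "email")
orphans = orphan_keys(sat_customer, "customer_id", hub_customer, "customer_id")
print(rate, orphans)
```

Writing the checks as functions of table and column names is what makes them reusable across many tables in a batch.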

How has it helped my organization?

This product speeds up the unit testing and QA for specific test scenarios. As a result, the development output quality can be evaluated and adjusted.

What is most valuable?

I like the components provided by Data Quality, such as:

  • Address standardization
  • Fuzzy match
  • Schema compliance check

These components pack a lot of the code that is required to perform these standard data operations. Doing the same by hand-coding would be error-prone, take a lot of time, and produce output whose quality is biased.

Apart from specific components, I like the idea of storing the results of Data Quality jobs in a DB and having the ability to run reports in the DB to show a dashboard of quality metrics.

What needs improvement?

  • The report generation and using the report in DI job steps could be improved. 
  • There are too many functions which could be streamlined. 
  • The report generated often has too many pages to go through, if not loaded into a DB.
  • Many functions are presented in a non-streamlined manner and could be refined into better off-the-shelf functions.

For how long have I used the solution?

Trial/evaluations only.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user497733 - PeerSpot reviewer
Executive Director and Business Unit Manager at a tech company with 51-200 employees
Vendor
It helps more accurately identify data-quality issues, and it is simple to install.

What is most valuable?

  • Analysing data trends: This works when you add a column to analyse. It shows you max, min, nulls, etc. per field. It allows a snapshot of your data.
  • Duplication
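
The per-field snapshot described above (max, min, nulls, and so on) can be illustrated with a small generic sketch. It is written in plain Python under the assumption of an in-memory list of values; the profile_column helper and the sample data are hypothetical and are not part of Talend.

```python
def profile_column(values):
    """Return a simple snapshot of one column: counts, nulls, distinct, min, max."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }

# Hypothetical column with two nulls and one duplicated value.
ages = [34, None, 27, 34, 51, None]
summary = profile_column(ages)
print(summary)
```

A distinct count lower than the number of non-null values is a quick hint that the column contains duplicates.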

How has it helped my organization?

  • More accurate data-quality issue identification
  • Reporting

What needs improvement?

I would like to see them add a configuration wizard.

For how long have I used the solution?

I have been using it for two years.

What do I think about the stability of the solution?

I did not encounter any stability issues.

What do I think about the scalability of the solution?

I encountered scalability issues.

How are customer service and technical support?

I consulted a lot of product forums, but I did not ask for support from Talend.

How was the initial setup?

The Talend software is very simple to install. Because it runs on the Java platform, you need to make sure you have a JRE installed. Then, you download the ZIP file from the Talend website. You extract the file, and the software is ready to use by executing the EXE file.

What's my experience with pricing, setup cost, and licensing?

Try the free version first!

What other advice do I have?

It is a good tool; include it in your planning.

Disclosure: My company has a business relationship with this vendor other than being a customer. We are a Talend distribution partner.
PeerSpot user
it_user158814 - PeerSpot reviewer
Developer with 51-200 employees
Vendor
Has allowed us to organise & deploy our staged ETL transformation processes; toolbox integration could be better.

What is most valuable?

Fuzzy matching lookups.
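
For readers unfamiliar with the technique, a fuzzy lookup matches an incoming value against a reference list even when the spelling differs slightly. The sketch below shows the general idea using Python's standard-library difflib; it is not Talend's implementation, and the reference data, the 0.8 cutoff, and the fuzzy_lookup helper are assumptions chosen for illustration.

```python
from difflib import get_close_matches

def fuzzy_lookup(value, reference, cutoff=0.8):
    """Return the reference ID whose name best matches value, or None.

    reference maps lowercase names to IDs; cutoff is the minimum
    similarity ratio (0 to 1) required to accept a match.
    """
    matches = get_close_matches(value.lower(), list(reference), n=1, cutoff=cutoff)
    return reference[matches[0]] if matches else None

# Hypothetical reference table of company names.
reference = {"acme corporation": "C001", "globex inc": "C002"}

print(fuzzy_lookup("Acme Corporaton", reference))  # misspelled input
print(fuzzy_lookup("Initech", reference))          # no close match
```

Raising the cutoff reduces false matches at the cost of missing more misspellings, so the threshold usually needs tuning against real data.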

How has it helped my organization?

Talend has allowed us to systematically organise, structure, and deploy our staged ETL transformation processes from development into production. We have tracked our data quality efforts during our runs and supplied comprehensive feedback during our development.

What needs improvement?

Toolbox/component integration could be better. Performance benchmarks and a manual for optimal memory performance across 64-bit and 32-bit architectures are nonexistent.

For how long have I used the solution?

2-4 years.

What was my experience with deployment of the solution?

No.

What do I think about the stability of the solution?

Sometimes there were stability issues when working with larger datasets, possibly due to insufficient memory.

What do I think about the scalability of the solution?

No.

How are customer service and technical support?

Customer Service:

Excellent

Technical Support:

Excellent

Which solution did I use previously and why did I switch?

Yes, I have. I found Talend less fussy with different data and with debugging tools. It is a superior solution once you are acquainted with it.

How was the initial setup?

Straightforward.

What about the implementation team?

In-house.

What was our ROI?

100%.

What's my experience with pricing, setup cost, and licensing?

There are no setup or usage costs with Talend Open Studio.

Which other solutions did I evaluate?

Yes, SSIS and Pentaho.

What other advice do I have?

Platform/Technology specific decisions need to be made upfront before considering this solution.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Information Architect at a healthcare company
Vendor
Good and easy debugging functions while better tools for geo-data are needed.

Valuable Features

Maybe the best thing is how easy the product is to get started with when you are familiar with Java. Job creation is also fast compared to some other tools. Another good thing is that table metadata is easy to bring into the tool and use. The last thing to mention is the flexibility to use Java code inside the job.

Improvements to My Organization

Fast job creation from start to finish, which improves ROI, and good, easy debugging functions.

Room for Improvement

First, we faced problems with the stability of the products. Also, some components were clearly not tested well, which meant that there were bugs. Better tools for geo-data are needed. Documentation was poor in the beginning, but it got better over time.

Use of Solution

Talend Enterprise Data Integration 5.1 (1) and Talend Platform for Data Services (2)

Two years at one customer (1, without Data Quality) and six months at another customer (2, with Data Quality).

Deployment Issues

At the customer, deployment from the test environment to production was a bit exhausting. This could be because they didn't know or use best practices.

Stability Issues

Yes, we had issues. Quite often the server needed rebooting, as if there were memory leaks. Sometimes the CVS version management got stuck.

Scalability Issues

No issues. The only issues were with Java memory, which can be scaled and changed in the job settings.

Customer Service and Technical Support

Customer Service:

Customer service was good most of the time. Answers came in a timely fashion.

Technical Support:

It was good most of the time. Answers came in a timely fashion.

Initial Setup

It was pretty straightforward. Memory settings on the client needed some modification at first. I cannot speak to the server side.

Implementation Team

In house team.

Other Solutions Considered

Yes. We evaluated IBM DataStage.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user