Andre Luis Tiago Soares - PeerSpot reviewer
Developer-Data Engineer at Collab
Real User
Top 20
Good large data processing and scalable but must overcome pipeline challenges
Pros and Cons
  • "The best thing about AWS Glue is its scalability and how easy it is to process a large amount of data."
  • "Setting up pipelines is challenging, especially with version control and testing requirements."

What is our primary use case?

I use AWS Glue primarily for ETL jobs. In my organization, it's just me using it as we are a small company. The IT team consists of four people, and I am the data engineering specialist.

What is most valuable?

The best thing about AWS Glue is its scalability and how easy it is to process a large amount of data. It integrates well with Redshift, S3, and AWS Glue catalog. 

For processing extensive data, having a managed Spark service fulfills that role. If you're already working on AWS and you need to process a lot of data that can't be handled on a single node or server, AWS Glue will serve you well. While it's quite expensive, it's valuable for large data processing needs.

What needs improvement?

Setting up pipelines is challenging, especially with version control and testing requirements. While the initial setup is easy, it doesn't accommodate more complex development needs. You might feel hesitant about changing pipelines that are already running and processing business-critical data due to limited versioning and testing capabilities.

For how long have I used the solution?

I've been using AWS Glue since 2022, so for two years.

Buyer's Guide
AWS Glue
August 2025
Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: August 2025.
865,295 professionals have used our research since 2012.

What do I think about the stability of the solution?

The stability of AWS Glue is fine. I haven't had any problems with it.

What do I think about the scalability of the solution?

The scalability of AWS Glue is commendable.

Which solution did I use previously and why did I switch?

Previously, in different jobs, I have worked with Databricks for ETL processes. I've also utilized Lambda functions for handling smaller data. I didn’t switch to AWS Glue, but used it in a different context.

How was the initial setup?

The initial setup of AWS Glue is easy, yet not adequate for more complex requirements. If you need to do something simple, like creating a notebook, it is straightforward.

However, when dealing with complex pipelines handling critical business data, it's hard to set up versioning and testing.

What other advice do I have?

AWS Glue receives a hesitant five out of ten from me. I recommend it if you're already on AWS and need to process large data sets. However, for smaller data volumes, I would suggest Airflow because AWS Glue can be quite expensive.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Muthuvel Sivaraman - PeerSpot reviewer
AVP at a manufacturing company with 10,001+ employees
Real User
Top 10
Handles a huge volume of data and is serverless, but it can be considered costly by some users
Pros and Cons
  • "It is a very scalable solution."
  • "The drawbacks associated with the product stem from the fact that, based on the data volume, it can become very costly."

What is our primary use case?

I use the solution in my company to build a data lake from a variety of data sources such as Oracle, MongoDB, and SQL Server, with AWS S3 buckets as the data lake storage. We then use AWS Glue to process the data and move it to AWS' search engine, which works like a lakehouse solution.

What is most valuable?

AWS Glue handles a huge volume of data, and it is a serverless tool. We don't need to provision extra servers while a job runs on AWS Glue. Maintaining other options like AWS EMR can be very costly when handling the company's functions. AWS Glue also creates a data catalog.

What needs improvement?

The drawbacks associated with the product stem from the fact that, based on the data volume, it can become very costly. There is a huge cost if the source system is not properly designed. If the changes are frequent and not valid, then, initially, you will use huge amounts of data in the ETL. The biggest challenges are associated with AWS Glue's costs, and it takes one-third of my entire pipeline cost.

For how long have I used the solution?

I have been using AWS Glue for two years. I am a customer of the tool.

What do I think about the scalability of the solution?

It is a very scalable solution. The problem with the tool is that it has a huge set of options, so there are a lot of hidden costs involved. For example, if you want to use the data quality features, they are billed separately. From what I have learned, we need to scrap AWS Glue and look for other solutions.

How are customer service and support?

We rely on Amazon for technical support for the product. Oracle and other vendors offer a single point of support, and other tools have a direct support window. With Amazon, we need to pay 10 percent of our billing amount for the tool to get support services. Because ten percent is a huge amount, whether to raise a support ticket at all becomes an issue, and my company ends up using all the options without help from support. It is very difficult for an ordinary customer to understand why there is a need to pay ten percent for support: if I find an issue in the product and need AWS to fix it, I still have to pay ten percent of the tool's bill to Amazon.

AWS is a tricky platform because everything keeps evolving. AWS engineers are getting hired from other places, and if I still cannot get technical support after that, things become very difficult. There are some good engineers who help users outside the normal support cycle, but that does not meet users' needs.

I rate the technical support a four out of ten.

How would you rate customer service and support?

Neutral

How was the initial setup?

The product's installation phase is easy, especially since it is serverless.

The solution can be deployed in a day or two since it is serverless. AWS Glue alone doesn't work perfectly. If we have the right data model, then we can use AWS Glue. Inside AWS Glue, we use PySpark.

What was our ROI?

My company is in the process of scrapping AWS Glue. My company is not approving the budget for us to use AWS Glue. I am looking at solutions that are not costly, like the ClickHouse database, which is an open-source tool.

What's my experience with pricing, setup cost, and licensing?

The costs of the tool are huge, especially when moving from the source to the datalake.

I rate the tool an eight on a scale of one to ten, where one is cheap and ten is expensive. I cannot predict the costs associated with the tool.

What other advice do I have?

The main piece of AWS Glue is the ETL part. AWS Glue is for ETL to deal with S3 data sources to Redshift. We use AWS Glue for the CDC.

As the product is serverless, the tool runs fine. Maintenance and monitoring are among the biggest challenges of the tool.

In terms of the product's ability to handle data volumes during scaling needs, I would say that though it can handle the data volumes, the challenges are associated with costs.

There will be latency if the source has a huge amount of data coming in, because the source system is read sequentially due to the way CDC works. I need to capture the changes in the source in order. If you have a better network, you can also scale up by bringing the source data to S3 or AWS Glue. When you can scale up, it is not really relevant for the group. The latency is not caused by AWS Glue itself: with any ETL or CDC tool, I would need to process the changes the same way I do with AWS Glue. I cannot do parallel processing, and I need to do it sequentially.
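
To illustrate why ordered change capture limits parallelism, here is a minimal sketch in plain Python. It is not how AWS Glue or any specific CDC tool is implemented; the event layout (`seq`, `op`, `key`, `row`) and the function name are hypothetical and chosen only for the example.

```python
def apply_changes(table, events):
    """Apply change events to `table` (a dict keyed by primary key) in sequence order."""
    for event in sorted(events, key=lambda e: e["seq"]):
        if event["op"] == "delete":
            table.pop(event["key"], None)
        else:  # "insert" and "update" both write the latest row image
            table[event["key"]] = event["row"]
    return table


# Hypothetical change stream captured from a source table
events = [
    {"seq": 1, "op": "insert", "key": 7, "row": {"status": "new"}},
    {"seq": 2, "op": "update", "key": 7, "row": {"status": "shipped"}},
    {"seq": 3, "op": "delete", "key": 7, "row": None},
    {"seq": 4, "op": "insert", "key": 8, "row": {"status": "new"}},
]
final_state = apply_changes({}, events)
print(final_state)
```

If the delete for key 7 were applied before its insert and update, a stale row would remain in the target, which is why changes to the same key cannot simply be processed in parallel.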

I don't see any AI capabilities in the product, and it is more of an ETL solution.

As the product has many problems, people are moving to Bare Metal and other cloud services. Our company has spent a lot of time investigating what AWS Glue does, including the time required to use it to maintain the servers.

I need to spend on the product's maintenance along with the other activities for which I need to make payments to use the solution. Once you are able to predict the data volumes and other factors that are there over the cloud, it is possible to predict what my server will cost for the next five years and then get the servers at a very low price instead of depending on AWS.

Though it is a good solution, it is not cost-effective.

I rate the tool a six out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
ParamShah - PeerSpot reviewer
Engineering Manager at Milestone Technologies
MSP
Top 10
A cloud solution with easy configuration with output limitation
Pros and Cons
  • "Our entire use case was very easily handled or solved using this solution."
  • "It is not clear how the partition discovery would have been affected by more data coming in."

What is our primary use case?

We use the solution to build tables on CSV data. We get data from some different sources, pull it in S3, and then create tables using Glue to get some metrics out of that data.

How has it helped my organization?

The entire use case was very easily handled or solved using AWS Glue. We had to get the files available in S3. The workflow was seamlessly integrated with the data, landing in S3, and then it detected changes made in the data. Configuring it was really easy. It gave us what we were looking for without going through a lot of hassle and that too within our budget.

What is most valuable?

The AWS Glue crawlers are valuable features. They are very versatile and can detect the nature of the underlying data. They are quite smart and offload a lot of work, so we don't have to worry about configuring or managing them.

What needs improvement?

There are output limitations and configuration of its three parts. There was a lot of trial and error that we had to go through. It is not clear how the partition discovery would have been affected by more data coming in. We made some expensive mistakes that could have been avoided if tutorials or easy documentation with FAQs had been available. There is documentation, but it doesn't cover everything.
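
For readers unfamiliar with partition discovery, the sketch below shows how partition columns can be read from Hive-style `key=value` folder names in S3 object keys, a layout that crawlers can use to name partitions. It is only an illustration of the layout, not the crawler's actual logic, and the object keys and function name are hypothetical.

```python
def partition_values(s3_key):
    """Return the Hive-style partition columns (key=value folders) found in an S3 object key."""
    parts = {}
    for segment in s3_key.split("/")[:-1]:  # skip the file name itself
        if "=" in segment:
            name, value = segment.split("=", 1)
            parts[name] = value
    return parts


# Hypothetical object keys under a single table prefix
keys = [
    "metrics/year=2024/month=01/data.csv",
    "metrics/year=2024/month=02/data.csv",
    "metrics/2024/03/data.csv",  # no key=value folders, so no named partitions here
]
for key in keys:
    print(key, "->", partition_values(key))
```

Mixing the two folder styles under one prefix, as in the last key, is the kind of inconsistency that makes it hard to predict how new data will be partitioned.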

There are three specific partition changes, and AWS Glue is tightly tied to Athena. We don't have much flexibility in managing Athena.

AWS Glue could integrate with an AI model or a more advanced version that processes chat-based inputs rather than configuration. This would align it more closely with the functionalities of chat-based interfaces, making it easier to adopt.

For how long have I used the solution?

I have been using AWS Glue for four to five years.

What do I think about the stability of the solution?

The product is stable. There are no issues, errors, or downtime. It is managed by AWS.

I rate the solution’s stability a nine out of ten.

What do I think about the scalability of the solution?

The solution’s scalability is quite high. It is flexible and scalable. We haven't seen any challenges.

I rate the solution’s scalability an eight or nine out of ten.

How are customer service and support?

The customer support is nice and very helpful, even without AWS premium support. There is community support, along with Medium articles and AWS knowledge base articles, which provide help with AWS.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I have used Spark before. We switched to AWS Glue since it is a managed solution. We were writing lines of code, managing it separately on our own, and sending the jobs on a cluster. It is taken care of by the managed service itself. We have to configure it, and then it takes care of all.

How was the initial setup?

AWS Glue catalog was complex to understand. Deployment was very quick. We did some POCs. We were able to take it to production in about six to eight weeks.

We were using the console directly. There is no automated CI/CD. You can manage, create, and set up everything via the console using the AWS Glue service.

I rate the initial setup an eight out of ten, where one is easy, and ten is difficult.

What about the implementation team?

Deployment was done in-house with the help of two data analysts.

What's my experience with pricing, setup cost, and licensing?

The solution is expensive. It has a pay-as-you-go model. Whatever you are using, you are paying for that.

I rate the product’s pricing a six out of ten, where one is cheap and ten is expensive.

Which other solutions did I evaluate?

We looked at Databricks. It is comparable to Spark, but it was excessive for our use case. We didn't have the workload. It has a lot of additional features which we don't need. The cost is not justifiable.

What other advice do I have?

Two of us are sufficient for the solution’s maintenance.

The solution is easy to set up and is a good starting point for many standard data analytics use cases where we extract data. If you want something customizable, then look at other solutions because cost might be a factor for more advanced solutions.

Overall, I rate the solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Rajesh Ramadoss - PeerSpot reviewer
Technology Specialist at a consultancy with 10,001+ employees
Real User
Top 20
An interpreted language that does not need compilation, but it is very difficult to learn
Pros and Cons
  • "You do not need many frameworks to run Glue."
  • "It is very difficult to learn the tool and remember the syntaxes comparatively."

What is our primary use case?

We have a lot of microservices written in Glue, which are responsible for triggering based on certain events. The solution will be responsible for another container to containerize them and run over the cloud. We use the solution for different purposes, including data computing.

What is most valuable?

You do not need many frameworks to run Glue. It's an interpreted language that does not need to be compiled at all.

What needs improvement?

It is very difficult to learn the tool and remember the syntaxes comparatively. Sometimes, I face issues integrating the solution with some third-party services or services that are not a part of Glue. Such integrations take a lot of time, and not much content is available over the internet for the same.

For how long have I used the solution?

I have been using the solution for three to four years.

What do I think about the scalability of the solution?

We have eight developers on our team. My team works on almost four to five Glue services. We have four team members working on Glue, including me.

How are customer service and support?

I once faced an issue with Glue. There was a scenario where I wanted Glue to pick certain images, containerize them, and run over the code. That containerization integration wasn't happening successfully. I dropped a couple of messages in the community channel. I got good support from them, which helped me resolve my issue as quickly as possible. The community is very small, but the people are very helpful.

Which solution did I use previously and why did I switch?

I previously worked in Java, .NET, and Python. I have extensive experience with Python and .NET. Since my organization is language-independent, we have microservices written in almost all the languages, including Glue, Python, Java, and .NET.

How was the initial setup?

I'm not handling the solution's end-to-end deployment, but we have a CI/CD pipeline set up for that. The CI/CD pipeline will remain the same. It's all about how you containerize your Glue application. That is the only challenge we have faced while setting up the deployment. The rest of the configuration was pretty smooth.

What other advice do I have?

Glue is not a must-have tool. You can choose Glue if you have the capability to learn Glue as quickly as possible. There are other alternatives where you will find a lot of articles, study material, and certificates over the internet apart from Glue. If you do not have any other option, go for Glue.

If Glue is not mandatory for you, go for something else because it is difficult to learn Glue and remember the syntaxes. You will need support whenever you have a bigger integration or connectivity with third-party libraries or services. You will not receive many articles or help over the internet. Although the community is available, you need to spend some time with them to make them understand the issue.

It is not easy for a beginner to learn to use the solution for the first time. There are a few videos and courses available, but it's difficult. It's not as easy as other languages in terms of content. It's hard, but you can use it once you understand the concept.

Overall, I rate the solution seven and a half out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
UjjwalGupta - PeerSpot reviewer
Module Lead at Mphasis
Real User
Top 5
Provides inbuilt data quality and cataloging features, but it is costly compared to other tools
Pros and Cons
  • "The most valuable feature of AWS Glue is that it provides a GUI format with a drag-and-drop feature."
  • "AWS Glue is more costly compared to other tools like Airflow."

What is our primary use case?

We use AWS Glue for building ETL pipelines.

What is most valuable?

The most valuable feature of AWS Glue is that it provides a GUI format with a drag-and-drop feature. The solution provides a codeless feature or no code feature, where you can write a pipeline without adding code in AWS Glue.

If you are working on AWS services for your pipeline, AWS Glue better interacts with the AWS services than other third-party tools. The solution provides inbuilt data quality and cataloging features. AWS Glue provides a complete package, and you can do all things in one place.

What needs improvement?

AWS Glue is more costly compared to other tools like Airflow. It would be better if the solution's pricing could be reduced. The default scheduling that AWS Glue provides is not as good as Airflow. The scheduler of AWS Glue could be improved because you cannot customize it.

For how long have I used the solution?

I have been using AWS Glue for more than three years.

What do I think about the stability of the solution?

AWS Glue is a stable product because it's an AWS-managed service. We can directly contact AWS for any issues we face. If there are any glitches in the version, AWS solves the issue by creating patches or version upgrades to the solution.

What do I think about the scalability of the solution?

Around 100 to 200 people are using the solution in our organization.

How was the initial setup?

As AWS Glue is a SaaS product, you don't have to set up anything. You can just create a new job, write your script, and run your job. You have to select the particular cluster or nodes you want to run and write the code. Not much admin part is required for the solution's setup.

What's my experience with pricing, setup cost, and licensing?

AWS Glue is a paid service that doesn't come under the free trial of AWS. You have to pay a charge for using the solution.

What other advice do I have?

If you run a job once or twice a day, AWS Glue will not cost you much. If you run jobs four to five times a day, or hourly, it will be costlier compared to other tools. If you need to run jobs five to six times a day or every hour, then using other tools would be a better option. You can choose AWS Glue if you are running jobs only one or two times a day.
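
As a rough illustration of how run frequency drives the bill, here is a back-of-the-envelope sketch. It assumes billing proportional to DPU-hours (Data Processing Units multiplied by run time); the rate of 0.44 USD per DPU-hour, the 15-minute run time, and the 10 DPUs are assumptions for the example, not figures from the review, so check current pricing for your region.

```python
# Hypothetical estimate; the rate and job sizes below are assumptions.
ASSUMED_RATE_PER_DPU_HOUR = 0.44  # USD, assumed rate


def monthly_job_cost(runs_per_day, minutes_per_run, dpus, days=30,
                     rate=ASSUMED_RATE_PER_DPU_HOUR):
    """Estimate monthly cost as runs x duration (hours) x DPUs x rate."""
    hours_per_run = minutes_per_run / 60
    return runs_per_day * days * hours_per_run * dpus * rate


twice_daily = monthly_job_cost(runs_per_day=2, minutes_per_run=15, dpus=10)
hourly = monthly_job_cost(runs_per_day=24, minutes_per_run=15, dpus=10)
print(f"2 runs/day:  ${twice_daily:.2f} per month")
print(f"24 runs/day: ${hourly:.2f} per month")
```

Under these assumptions, the hourly schedule costs twelve times as much as the twice-daily one, because the estimate scales linearly with the number of runs.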

Our company decided to go with AWS Glue because the tools we were using in the pipeline were AWS services only. AWS Glue easily interacts with AWS services. The jobs we were running were also not frequent.

You can use AWS Glue for learning purposes. AWS Glue is a paid service that doesn't come under the free trial of AWS. You have to pay a charge for using the solution. You can learn the code by directly testing basic Spark code on any local system. Once you are comfortable that your code is working fine, you can run it in AWS Glue jobs. You should test the code on the local system first and then run it in AWS Glue. Testing on AWS Glue will be costly.
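
One way to follow this advice is to keep the transformation logic in a plain function that can be checked on a laptop before it is wired into a paid job. The sketch below is a minimal example under that assumption; the record layout and the function names are hypothetical, and in a real job the same function would be applied from the Spark or Python script.

```python
def clean_order(record):
    """Hypothetical transformation: normalize one raw order record."""
    return {
        "order_id": int(record["order_id"]),
        "country": record["country"].strip().upper(),
        "amount": round(float(record["amount"]), 2),
    }


def test_clean_order():
    # A small local check that needs no cluster and no AWS account
    raw = {"order_id": "42", "country": " br ", "amount": "19.999"}
    assert clean_order(raw) == {"order_id": 42, "country": "BR", "amount": 20.0}


test_clean_order()
print("local checks passed")
```

Because the function has no dependency on the job runtime, it can also be kept under version control and exercised by an ordinary test runner.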

If a person is familiar with Spark jobs or Python jobs, they can easily learn AWS Glue. A new user will take the same amount of time to learn AWS Glue as they would take to get comfortable with Spark. Since it provides the GUI and a no-code option, users can directly start using AWS Glue without having to learn any code. It's much easier to learn to use the solution.

Overall, I rate the solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
AWS DATA ENGINEER at Coforge Growth Agency
Real User
Top 5
Intuitive with a good user interface and ETL integration capabilities
Pros and Cons
  • "The two features I find most valuable in AWS Glue are its user interface and ease of use."
  • "Beginners need additional support as it currently lacks some features required for complex transformations, often necessitating custom Python coding."

What is our primary use case?

I have been working as a data engineer, where dealing with the ETL process is essential. We are using AWS Glue as a primary ETL tool to serve our organization's needs. I have implemented several Glue jobs still in production.

How has it helped my organization?

AWS Glue has enabled us to perform ETL processes efficiently, with ease of use for AWS cloud users, providing a serverless service that eliminates the need for infrastructure maintenance.

What is most valuable?

The two features I find most valuable in AWS Glue are its user interface and ease of use. The user interface is intuitive, and navigating through the Glue console is seamless. 

Additionally, its ability to integrate with other AWS services is excellent, providing flawless coordination with services such as SNS, S3, and Lambda.

What needs improvement?

I see scope for improvement in the drag-and-drop feature of AWS Glue. Beginners need additional support as it currently lacks some features required for complex transformations, often necessitating custom Python coding.

For how long have I used the solution?

I have been using Glue for more than five years now.

What do I think about the stability of the solution?

Overall, the stability of AWS Glue is excellent. I would rate it a nine out of ten. Some network-related issues may arise. That said, they are rare and do not affect its functionality significantly.

What do I think about the scalability of the solution?

Regarding scalability, AWS Glue is nearly perfect. I would rate it a nine out of ten, although there is always room for improvement.

How are customer service and support?

AWS customer service is great, but there is room for improvement. The issue I face is the inconsistency in dealing with different customer service representatives for the same issue, which disrupts personal touch.

How would you rate customer service and support?

Neutral

What's my experience with pricing, setup cost, and licensing?

On an organizational level, the pricing of AWS Glue does not pose a concern. It is in line with other ETL tools in the market. However, AWS Glue's cost to free-tier users is an issue because it is not entirely free, even for trial purposes.

What other advice do I have?

I advise potential users to adopt AWS Glue primarily due to its user-friendly interface, extensive documentation, and seamless integration with other AWS services, making it ideal for data engineers.

I'd rate the solution nine out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2290962 - PeerSpot reviewer
VP- Cloud Data/ Solution Architect at a financial services firm with 10,001+ employees
Real User
Top 5
Offers metadata management, logging, and ETL processing capabilities
Pros and Cons
  • "AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3."
  • "I have encountered challenges with multi-region support."

What is most valuable?

AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3. 

What needs improvement?

I have encountered challenges with multi-region support. 

How are customer service and support?

AWS Glue has standard procedures for tech support. 

How would you rate customer service and support?

Positive

How was the initial setup?

AWS Glue is very user-friendly. However, if you are not a UI person, you can use infrastructure as code (IaC). AWS provides a cloud automation service and a service catalog.

What's my experience with pricing, setup cost, and licensing?

I rate the tool's pricing a four out of ten. 

What other advice do I have?

I rate AWS Glue an eight out of ten. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Neelabh Sharma - PeerSpot reviewer
Data Engineer at Scania
Real User
Top 10
Provides good scalability and has an easy setup process
Pros and Cons
  • "The product has a valuable feature for data catalog."
  • "The product is expensive for data streaming. This area needs improvement."

What is our primary use case?

We use AWS Glue for ETL batch processing purposes.

What is most valuable?

The product has a valuable feature for data catalog.

What needs improvement?

The product is expensive for data streaming compared to EMR. This area needs improvement.

For how long have I used the solution?

We have been using AWS Glue for one and a half years.

What do I think about the stability of the solution?

I rate the product's stability a ten out of ten.

What do I think about the scalability of the solution?

We have five to six AWS Glue users. I rate its scalability a nine out of ten.

Which solution did I use previously and why did I switch?

We have used Cloudera before. We switched to AWS Glue for better pricing, scalability, and innovation.

How was the initial setup?

The initial setup is easy. I rate the process an eight or nine out of ten. It could be deployed on-premises and on the cloud as well. We have a team of five executives to carry out the implementation.

What's my experience with pricing, setup cost, and licensing?

It is an expensive product. I rate its pricing a nine out of ten.

What other advice do I have?

I rate AWS Glue a nine out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Product Categories
Cloud Data Integration