What is our primary use case?
With AWS Lake Formation, our use case is about the different ways AWS can help create, manage, and support the data transformation that clients are looking for.
There are many use cases around Lake Formation. Typically, clients are reimagining their entire process from the standpoint of user experience, infrastructure duplication, and data duplication. Often, data isn’t organized properly, which affects both the user experience and infrastructure. So, when they’re reimagining their data strategy, one of the conversations revolves around Lake Formation.
Lake Formation is essentially about creating a centralized repository. It allows you to structure your back-end repository, whether the data is structured or unstructured. You can scale it and manage it effectively. It also covers securing your data and implementing various cataloging mechanisms for quick data retrieval. So, Lake Formation is crucial when clients are reimagining their data landscape.
Another product is Amplify. Amplify is more focused on the developer experience, especially when transitioning from a traditional waterfall model to more agile methodologies like DevOps or DevSecOps. It supports full-stack development for both web and mobile applications.
AWS Amplify offers a lot of automation capabilities. It keeps the developer experience at the forefront, providing automation for CI/CD pipelines, many out-of-the-box features, and seamless API integrations. Amplify handles these with native capabilities, making it a powerful tool for developers.
What needs improvement?
There are a couple of areas for improvement with Lake Formation. One of the main challenges is its lack of versatility, especially when dealing with rich media content, as in MarTech (Marketing Technology) or ad agencies. Some clients feel that Lake Formation doesn’t meet their needs, and they tend to prefer competitor products for those specific use cases.
The second area for improvement is in data governance. Specifically, Lake Formation could enhance its capabilities in audit logs, real-time monitoring, and advanced data governance. This includes managing the entire data lineage—where the data originated, how it moves, and where it’s currently stored. The visibility of the data as it evolves is crucial, and that’s where more advanced governance capabilities would be beneficial.
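To make the lineage point concrete, here is a minimal Python sketch of the kind of information such tracking might record: where the data originated, how it moved, and where it currently sits. This is not a Lake Formation API. The class names, fields, bucket paths, and job names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class LineageEvent:
    """One movement or transformation of a dataset (hypothetical schema)."""
    source: str        # where the data was read from
    destination: str   # where the data was written to
    operation: str     # e.g. "ingest", "transform", "copy"
    actor: str         # job or user that performed the step
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class DatasetLineage:
    """Ordered history of events for a single dataset."""
    dataset: str
    events: List[LineageEvent] = field(default_factory=list)

    def record(self, source: str, destination: str,
               operation: str, actor: str) -> None:
        """Append one step to the dataset's history."""
        self.events.append(LineageEvent(source, destination, operation, actor))

    def origin(self) -> Optional[str]:
        """Where the data originated, or None if nothing is recorded."""
        return self.events[0].source if self.events else None

    def current_location(self) -> Optional[str]:
        """Where the data is currently stored."""
        return self.events[-1].destination if self.events else None

    def path(self) -> List[str]:
        """Every location the data has passed through, in order."""
        if not self.events:
            return []
        return [self.events[0].source] + [e.destination for e in self.events]


# Hypothetical example: a rich-media dataset moving through three buckets.
lineage = DatasetLineage("campaign_videos")
lineage.record("s3://raw-bucket/videos/", "s3://curated-bucket/videos/",
               "transform", "hypothetical-etl-job")
lineage.record("s3://curated-bucket/videos/", "s3://analytics-bucket/videos/",
               "copy", "hypothetical-analyst")
```

With records like these, an auditor could answer the three lineage questions directly through `origin()`, `path()`, and `current_location()`.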
For how long have I used the solution?
I have a pretty good understanding of the entire AWS suite. I’ve been working with AWS for many, many years, if not almost a decade.
When cloud providers like AWS started, there were a lot of skeptics, including myself. For instance, we believed the data center was never going to move to AWS.
Then, slowly but surely, we changed our own perspective and our clients’ perspective. So it’s been quite a long journey, and I’ve been on it for more than ten years.
What do I think about the scalability of the solution?
As AWS customers, we typically handle the initial sale. From what I understand, Lake Formation is highly scalable.
How was the initial setup?
It's fairly easy to use for people who have been trained on it. But again, this is where the divide between technical and non-technical users starts to show.
What my teams tell me is that when we go to sell and the client has AWS-certified staff, they say, "Hey, this is easy to use." Lake Formation is fairly intuitive. However, as AWS expands and more users start using Lake Formation, there is a need for some democratization. This would allow business users, like product managers, to access the system without needing developers or technical staff.
I think there's a need for more pre-configured templates to really democratize the usage, not for the formation of the lake itself, but for running complex queries against it. More templates and pre-configured queries could allow more users to interact with the system.
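As a rough illustration of what a pre-configured query could look like, here is a minimal Python sketch in which a business user supplies only a few validated parameters and never writes SQL. It is an assumption about how such templates might work, not a Lake Formation or Athena feature. The template name, the `sales.orders` table, its columns, and the parameter patterns are all hypothetical.

```python
import re
from string import Template

# Hypothetical catalog of pre-configured queries; the template name, table,
# and columns are illustrative and not taken from any real deployment.
QUERY_TEMPLATES = {
    "top_products_by_revenue": Template(
        "SELECT product_id, SUM(revenue) AS total_revenue "
        "FROM sales.orders "
        "WHERE region = '$region' AND order_date >= DATE '$start_date' "
        "GROUP BY product_id ORDER BY total_revenue DESC LIMIT $limit"
    ),
}

# Allow-list patterns so user-supplied values cannot change the SQL structure.
PARAM_PATTERNS = {
    "region": re.compile(r"[a-z]{2}-[a-z]+-\d"),
    "start_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "limit": re.compile(r"\d{1,4}"),
}


def render_query(name: str, **params: object) -> str:
    """Fill a named template after validating every parameter."""
    template = QUERY_TEMPLATES[name]
    for key, value in params.items():
        pattern = PARAM_PATTERNS.get(key)
        if pattern is None or not pattern.fullmatch(str(value)):
            raise ValueError(f"invalid value for {key!r}: {value!r}")
    # substitute() raises KeyError if a required parameter is missing.
    return template.substitute(params)


# A product manager picks a template and fills in three values.
sql = render_query(
    "top_products_by_revenue",
    region="eu-west-1",
    start_date="2024-01-01",
    limit=10,
)
```

The resulting string would then be handed to whatever query engine sits on top of the lake. The validation step matters because the values are substituted directly into the SQL text.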
What's my experience with pricing, setup cost, and licensing?
The pricing really depends on the kind of contract you have with AWS and who is selling you the AWS services. There are many ways to consume AWS services. It depends on whether you have an enterprise license, if you're buying direct, if you're using Lake Formation or other products. It's not an easy answer to give because there are so many variables.
In general, though, AWS Lake Formation works hand in hand with other products. You're not just paying for Lake Formation; you also need S3, Glue, Athena, and other services to get the full value out of the data lake. No one buys a data lake just to store data—it’s actively used to retrieve, query, and gain insights that provide a competitive edge. So, the pricing becomes a bit complex because it involves more than just Lake Formation.
What other advice do I have?
Overall, I would rate the solution a seven out of ten, with ten being the best and one being the worst, because there are a few shortcomings. Many clients are moving away from using just one hyperscaler to multiple hyperscalers, so hybrid cloud scenarios are becoming more common. Lake Formation would need to integrate not just with AWS but also with other hyperscalers, whether the data sits on-premises, off-premises, or in the cloud. Scalability and integration with competitors are important factors.
Then, there is data governance and the need to support more data formats. As the use cases expand from core IT data to other business functions, like CAD/CAM drawings or marketing with rich content and videos, Lake Formation needs to support these versatile data formats. It seems limited in terms of the data formats it can support.
Disclosure: My company has a business relationship with this vendor other than being a customer: partner/customer.