What is our primary use case?
My main use case for Amazon S3 is storing company data, and I have worked with it extensively. I am currently migrating our legacy data to Amazon S3, and we have built AppX on top of DynamoDB and Amazon S3.
We use Amazon S3 to store large volumes of structured and unstructured data, including raw, processed, and backup data for analytics and pipeline workflows. Because AppX is a proprietary product built on top of DynamoDB and Amazon S3, we initially expected it to be faster, but we found that it runs a little slowly. My main job is to migrate all our databases, business objects, and their related tables to Amazon S3.
We chose Amazon S3 for the migration because it is cost-efficient. I work extensively with ETL pipelines, and Amazon S3 is well suited to hosting intermediate data. Before the migration work, I found it very easy to integrate Amazon S3 with AWS Glue, Athena, and Spark for analytics, and it is excellent for maintaining versioned data.
What is most valuable?
The best features of Amazon S3 are high durability and availability with virtually unlimited storage; lifecycle management for automated tiering and cost control; versioning to retain and recover data changes; integration with services such as Glue, Athena, EMR, and Lambda; and strong security through IAM roles, bucket policies, and encryption.
These features help by ensuring data is always available, securely stored, and easily accessible for pipelines. Lifecycle policies reduce manual cleanup, versioning safeguards against accidental deletions, and integrations allow us to directly run analytics or trigger workflows without moving data around.
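As a rough illustration of the lifecycle tiering and versioning cleanup described above, here is a minimal sketch that builds one lifecycle rule in the dictionary shape accepted by boto3's `put_bucket_lifecycle_configuration`. The prefix, the day thresholds, and the helper names `build_lifecycle_rule` and `apply_lifecycle` are assumptions for illustration, not settings from our environment.

```python
def build_lifecycle_rule(prefix, ia_after_days=30, glacier_after_days=90,
                         expire_noncurrent_after_days=180):
    """Build one lifecycle rule dict (thresholds are illustrative assumptions)."""
    if not ia_after_days < glacier_after_days:
        raise ValueError("Glacier transition must come after the IA transition")
    return {
        "ID": "tier-" + prefix.strip("/").replace("/", "-"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Move current versions to cheaper storage classes as they age.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # On a versioned bucket, clean up old versions automatically.
        "NoncurrentVersionExpiration": {
            "NoncurrentDays": expire_noncurrent_after_days
        },
    }


def apply_lifecycle(bucket, rules):
    """Send the rules to S3. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the builder works without AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration={"Rules": rules}
    )


# Hypothetical prefix; only builds the rule, nothing is sent to AWS here.
rule = build_lifecycle_rule("processed/")
```

Keeping the rule builder separate from the API call makes it possible to review or unit test the policy before it touches a real bucket.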
Amazon S3 event notifications help automate tasks such as triggering ETL jobs or Lambda functions when new data arrives. Cross-region replication provides data redundancy and supports compliance, while fine-grained access control makes it easy to manage permissions securely across teams.
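To show what triggering work on new data arrivals can look like, here is a minimal sketch of a Lambda handler that reads an S3 event notification and collects the newly created objects. The bucket name, the keys, and the `print` placeholder standing in for an ETL job are assumptions for illustration.

```python
from urllib.parse import unquote_plus


def extract_new_objects(event):
    """Return (bucket, key) pairs for object-created records in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        # Skip anything that is not an object-creation event.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((s3_info["bucket"]["name"], key))
    return objects


def lambda_handler(event, context):
    """Lambda entry point; the print call is a placeholder for an ETL trigger."""
    new_objects = extract_new_objects(event)
    for bucket, key in new_objects:
        print(f"New object: s3://{bucket}/{key}")
    return {"processed": len(new_objects)}


# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-landing-bucket"},
                "object": {"key": "raw/sales+report/2024.csv"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-landing-bucket"},
                "object": {"key": "raw/old.csv"},
            },
        },
    ]
}
result = lambda_handler(sample_event, None)
```

Decoding the key matters in practice, because spaces and special characters in object names are encoded in the notification.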
My organization has seen improvements in data accessibility, reduced infrastructure costs, and simplified data management through Amazon S3. We now have a unified scalable storage layer that supports analytics, backups, and migrations seamlessly, leading to faster workflows and better decision-making.
What needs improvement?
Amazon S3 could be improved with more intuitive cost visualization tools that include forecasting, faster restore times from Glacier tiers, better real-time monitoring of data access patterns, and perhaps simpler cross-account sharing and fine-grained access management at scale.
The main pain point that I have faced is the need for faster restore times from Glacier tiers, which could definitely be improved.
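For context on the restore step, here is a minimal sketch of how a restore from a Glacier tier can be requested through boto3's `restore_object`, where the retrieval tier trades cost for speed. The helper names and the default values are assumptions for illustration; the tier names are the three values the API accepts.

```python
VALID_TIERS = ("Expedited", "Standard", "Bulk")


def build_restore_request(days, tier="Standard"):
    """Build the RestoreRequest dict for an archived object."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Days": days,  # how long the restored copy stays available
        "GlacierJobParameters": {"Tier": tier},
    }


def request_restore(bucket, key, days, tier="Standard"):
    """Start the restore job. Needs boto3 and AWS credentials to run."""
    import boto3  # imported lazily so the builder works without AWS access

    s3 = boto3.client("s3")
    s3.restore_object(
        Bucket=bucket, Key=key,
        RestoreRequest=build_restore_request(days, tier),
    )


# Only builds the request; nothing is sent to AWS here.
urgent_request = build_restore_request(days=2, tier="Expedited")
```

Even with the fastest tier, the restore is an asynchronous job, so pipelines that depend on archived data have to wait or poll for completion.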
For how long have I used the solution?
I have been working in my current field for three years and have used Amazon S3 throughout that time.
What do I think about the stability of the solution?
Amazon S3 itself is stable. The one exception we experienced was about a week or two ago, when a DynamoDB outage in us-west-1 or us-east-1 brought our whole system down. Otherwise, it offers 99.99% availability.
What do I think about the scalability of the solution?
The scalability of Amazon S3 is excellent: it handles virtually unlimited data and supports growth seamlessly, without manual provisioning or performance degradation.
How are customer service and support?
The customer support for Amazon S3 is very good; they are responsive and knowledgeable, providing quick resolutions, clear documentation, and proactive guidance during migrations or performance tuning.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Previously, we used on-premises storage and Azure Blob Storage for some workloads before fully transitioning to Amazon S3. The main reasons for switching were better scalability and integration with AWS analytics tools.
How was the initial setup?
The setup for Amazon S3 was straightforward, with minimal upfront cost. Pricing is fully pay-as-you-go, which keeps it flexible, and licensing is not very complex because the cost mainly depends on storage class, data transfer, and API requests. Managing lifecycle policies helps control expenses.
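To make the three cost drivers concrete, here is a minimal sketch of a monthly estimate that adds up storage by class, data transfer, and API requests. Every price in it is a placeholder assumption for the arithmetic, not a current AWS rate, and the function name `estimate_monthly_cost` is hypothetical.

```python
# Placeholder prices in USD, for illustration only; check current AWS pricing.
PLACEHOLDER_STORAGE_PRICES = {  # per GB-month
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER": 0.004,
}
PLACEHOLDER_TRANSFER_PRICE = 0.09   # per GB transferred out
PLACEHOLDER_REQUEST_PRICE = 0.005   # per 1,000 requests


def estimate_monthly_cost(storage_gb_by_class, transfer_out_gb,
                          requests_thousands):
    """Sum storage, transfer, and request costs using the placeholder prices."""
    storage = sum(
        gb * PLACEHOLDER_STORAGE_PRICES[storage_class]
        for storage_class, gb in storage_gb_by_class.items()
    )
    transfer = transfer_out_gb * PLACEHOLDER_TRANSFER_PRICE
    requests = requests_thousands * PLACEHOLDER_REQUEST_PRICE
    return round(storage + transfer + requests, 2)


# Same 6,000 GB, with and without tiering most of it to an archive class.
all_standard = estimate_monthly_cost({"STANDARD": 6000}, 100, 2000)
tiered = estimate_monthly_cost({"STANDARD": 1000, "GLACIER": 5000}, 100, 2000)
```

Comparing the two estimates shows why lifecycle tiering is the main lever: the transfer and request terms stay the same, and only the storage term shrinks.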
What was our ROI?
I have seen a return on investment. In our organization, we reduced manpower by 20% and achieved zero downtime for storage expansion, which has improved overall productivity.
What's my experience with pricing, setup cost, and licensing?
Since we adopted Amazon S3, storage costs have fallen by around 35% to 40% through lifecycle tiering, and data availability has improved to 99.9%, which minimizes downtime. ETL processing time has decreased by approximately 25% to 30% due to faster data access, and backup retrieval time has been cut by 50% by using Amazon S3 Glacier for archival.
Which other solutions did I evaluate?
We evaluated Azure Blob Storage and GCP's Google Cloud Storage, and we chose Amazon S3 for its superior ecosystem integration, proven durability, and extensive automation features.
What other advice do I have?
My advice for others looking into using Amazon S3 is to start with clear bucket organization and tagging from day one and to use lifecycle rules to control costs.
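As one way to picture clear bucket organization and tagging, here is a minimal sketch that builds date-partitioned object keys and a tag set in the shape boto3's `put_bucket_tagging` accepts. The stage names, the `year=/month=/day=` layout, and the tag keys are assumptions for illustration, not a required convention.

```python
from datetime import date

# Hypothetical top-level prefixes, one per kind of data kept in the bucket.
ALLOWED_STAGES = {"raw", "processed", "backup"}


def build_object_key(stage, dataset, day, filename):
    """Build a predictable, date-partitioned key for an object."""
    if stage not in ALLOWED_STAGES:
        raise ValueError(f"stage must be one of {sorted(ALLOWED_STAGES)}")
    return (
        f"{stage}/{dataset}/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/{filename}"
    )


def build_tag_set(**tags):
    """Build the Tagging dict from keyword arguments, sorted by tag key."""
    return {
        "TagSet": [{"Key": k, "Value": v} for k, v in sorted(tags.items())]
    }


# Hypothetical dataset, file, and tag values.
key = build_object_key("raw", "sales", date(2024, 1, 5), "orders.csv")
tagging = build_tag_set(team="analytics", environment="prod")
```

A consistent prefix layout also makes lifecycle rules easier to write, because each rule can target one prefix instead of many ad hoc paths.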
I gave this review an overall rating of 9.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?