Data Engineer at a financial services firm with 1,001-5,000 employees
Real User
Top 10
Nov 15, 2024
AWS offers a vector database feature, and something similar could improve this product's versatility. Currently, Azure Data Lake Storage has blob, table, and file share storage, but no vector feature. With the emergence of AI technology, it would be convenient for storing vector indexes, which are essential for AI solutions.
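In the meantime, a common workaround is to persist a serialized vector index as an ordinary file in ADLS Gen2, since the service has no native vector type. Below is a minimal sketch of that pattern, assuming the azure-storage-file-datalake and azure-identity Python packages; the account URL, container, and path are hypothetical placeholders.

```python
# Hedged sketch: store a serialized embedding matrix as a plain file in
# ADLS Gen2. Account/container/path names are placeholders, not real values.
import io
import numpy as np
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client(file_system="vectors")  # hypothetical container

# Serialize a toy embedding matrix; a real index (e.g. a FAISS index file)
# would be serialized and uploaded the same way, as opaque bytes.
embeddings = np.random.rand(1000, 384).astype("float32")
buf = io.BytesIO()
np.save(buf, embeddings)

file_client = fs.get_file_client("indexes/embeddings.npy")
file_client.upload_data(buf.getvalue(), overwrite=True)
```

The index still has to be downloaded and loaded into memory to query it, which is exactly the gap the reviewer is pointing at compared with a managed vector database.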
DevOps Manager at a computer software company with 5,001-10,000 employees
MSP
Top 5
Nov 11, 2024
Improvement is needed in the migration process from Lakehouse to Enterprise Data Lake (EDL). Currently, migration is only possible in one direction, and it would be beneficial if this could be improved. Additionally, increasing the retention period beyond the current seven days would be helpful.
Version control would be a great improvement. Currently, there is no version control, and if something is deleted, it's permanently gone. Adding a trash feature would help in recovering data deleted by mistake.
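One partial workaround at the blob layer is soft delete, which behaves like the trash the reviewer asks for. A minimal sketch, assuming the azure-storage-blob SDK and suitable permissions on the account (the account URL is a placeholder):

```python
# Hedged sketch: enable blob soft delete so deleted data stays recoverable
# for a retention window instead of being permanently gone.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, RetentionPolicy

client = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)

# Keep deleted blobs recoverable for 14 days.
client.set_service_properties(
    delete_retention_policy=RetentionPolicy(enabled=True, days=14)
)
```

This addresses accidental deletion but not true version control, which remains the reviewer's core request.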
Independent consultant at a hospitality company with 1-10 employees
Real User
Top 20
Sep 11, 2024
If I had to nitpick, maybe the throughput could be faster: how quickly you can access data and how fast data can be written to Azure Data Lake Storage.
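Before any service-side change, client-side tuning often recovers some of that throughput. A minimal sketch, assuming the azure-storage-blob SDK (block sizes and concurrency are illustrative values; names are placeholders):

```python
# Hedged sketch: larger transfer blocks plus parallel uploads to improve
# write throughput from the client side.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
    max_block_size=8 * 1024 * 1024,       # 8 MiB blocks for large uploads
    max_single_put_size=8 * 1024 * 1024,  # switch to block uploads sooner
)
blob = service.get_blob_client(container="raw", blob="big-file.parquet")

with open("big-file.parquet", "rb") as f:
    # max_concurrency parallelizes block uploads across connections.
    blob.upload_blob(f, overwrite=True, max_concurrency=8)
```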
I suggest enhancing the connectors. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields, which is why we use REST API calls instead. It would be better if the connectors could retrieve all data.
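For context, the REST workaround the reviewer describes looks roughly like the sketch below: pulling the dropped fields directly from HubSpot's public CRM v3 API. The endpoint shape and property names reflect that API as documented but should be verified against current HubSpot docs; the token is a placeholder.

```python
# Hedged sketch: fetch contact fields a connector omits, directly over REST.
import requests

resp = requests.get(
    "https://api.hubapi.com/crm/v3/objects/contacts",
    headers={"Authorization": "Bearer <access-token>"},  # placeholder token
    params={
        "limit": 100,
        # Request the exact properties the connector drops.
        "properties": "email,firstname,lastname,lifecyclestage",
    },
    timeout=30,
)
resp.raise_for_status()
for contact in resp.json().get("results", []):
    print(contact["id"], contact["properties"].get("email"))
```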
Strategy Consultant at a computer software company with 201-500 employees
Consultant
Top 20
May 27, 2024
There is room for improvement in Microsoft support; I didn't have a good experience with it. Microsoft, in general, also needs to simplify its licensing model. That's one of its biggest issues: the licensing model is either quite difficult to understand or constantly evolving. I like the move from Data Lake to Lakehouse. Beyond that, it's up to Microsoft to keep up with market trends and what organizations need.
Data Engineer at Universidad Peruana de Ciencias Aplicadas
Real User
Top 20
May 10, 2024
When you store your files manually, you can't ensure complete data integrity, which can impact data security. Improvements in this area would enhance the data's stability. The solution could also include features like security integration with Active Directory for data access and ensure compatibility with various integrations. This would support both structured and unstructured data, making it more suitable for big data solutions.
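For what it's worth, Azure AD (Entra ID) identities already work for ADLS Gen2 data access via the azure-identity package. A minimal sketch of that integration (account URL, container, and path are placeholders):

```python
# Hedged sketch: authenticate to ADLS Gen2 with an Azure AD identity
# instead of account keys.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),  # picks up AD identity (env, CLI, managed identity)
)

fs = service.get_file_system_client(file_system="curated")
# List only the paths the signed-in AD identity is authorized to see
# (enforced via RBAC roles and/or POSIX-style ACLs).
for path in fs.get_paths(path="sales/2024"):
    print(path.name)
```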
In my company, we are not facing any slowness or other issues with the product. Each day, we create new directories and put the current files into them, so segregation is taken care of, and because of this, there are no issues with the tool. One of our teams uses Azure Databricks to read data from the Azure Data Lake Storage account and, depending on the business use case, moves the data onward. My current project is limited in scope, so I haven't explored everything the tool offers. The high price of the product is an area of concern where improvements are required.
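The daily-directory pattern this reviewer describes typically looks like the sketch below in a Databricks/PySpark environment already configured for the storage account. Container, account, and table names here are hypothetical.

```python
# Hedged sketch: read the current day's directory from ADLS Gen2 and move
# the data onward, per the pattern described in the review.
from datetime import date
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# e.g. abfss://landing@<account>.dfs.core.windows.net/2024/11/15/
today = date.today()
path = (
    "abfss://landing@<storage-account>.dfs.core.windows.net/"
    f"{today:%Y/%m/%d}/"
)

# Read the day's files and append them to a downstream table.
df = spark.read.option("header", "true").csv(path)
df.write.mode("append").format("delta").saveAsTable("bronze.daily_files")
```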
Maybe the solution could be a bit more user-friendly.
We have not explored the AI features. It would be beneficial if some AI features were added.
The solution needs to improve APIs and make them more accessible.