Several features are most valuable for us:
- Hue
- Hive
- Spark
- S3
With it, we have faster processing times for our apps.
It needs to be quicker and to have the ability to automate deployment on multiple nodes.
I've used it for two years.
Sometimes there were issues.
I've not had to use it.
No solution had been used previously, but we are using it alongside AWS EMR.
It was complex to configure.
It was done in-house.
We provide services for product implementation, so people looking for such products can contact me.
Ease of deployment and management of the Hadoop cluster are features we've found most valuable.
It allows our organization to collect data from different databases and, where the data is similar, to run a detailed analysis from a single data store.
The ability to update data is an area where the product could improve.
I've used it for one year.
We had an issue during deployment. You have to be sure that your base image is perfect and that your infrastructure is properly configured or issues will occur.
We don't have their paid support, but I have had discussions with their engineers and they have been extremely helpful. So based on that, I would give them 8/10.
The initial setup is complex. The complexity mainly stems from small issues that typically pop up and from a lack of experience in deploying the product. I highly suggest taking the Hortonworks Training prior to deploying.
We used an in-house team. Take your time and utilize the free resources provided by Hortonworks.
At this point, I don't believe I could provide an ROI, as we aren't fully utilizing the product.
If possible, I would suggest paying for the professional services which would give you on-site engineers to help deploy the cluster.
We did look at Cloudera, but due to having literally no money to spend for the project, we chose Hortonworks due to its being completely free and open source.
Take your time and script as much as you can so that all base images are the same.
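As a minimal sketch of what "scripting it" could look like, the snippet below compares package manifests from several nodes and reports any drift from the first node. The node names, package names, and versions are hypothetical, and how the manifests are collected (for example, from each node's package manager) is left as an assumption.

```python
from typing import Dict, List


def diff_manifests(manifests: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Report, per package, the nodes whose version differs from the reference node.

    `manifests` maps a node name to {package name: version}. The first node
    in the mapping is treated as the reference image.
    """
    nodes = list(manifests)
    if not nodes:
        return {}
    reference = manifests[nodes[0]]

    # Collect every package seen on any node, so missing packages count as drift.
    all_packages = set()
    for packages in manifests.values():
        all_packages.update(packages)

    drift: Dict[str, List[str]] = {}
    for package in sorted(all_packages):
        expected = reference.get(package)
        mismatched = [n for n in nodes[1:] if manifests[n].get(package) != expected]
        if mismatched:
            drift[package] = mismatched
    return drift


if __name__ == "__main__":
    # Hypothetical manifests, used only for illustration.
    example = {
        "node1": {"openjdk": "1.8.0", "ntp": "4.2.6"},
        "node2": {"openjdk": "1.8.0", "ntp": "4.2.6"},
        "node3": {"openjdk": "1.7.0", "ntp": "4.2.6"},
    }
    print(diff_manifests(example))
```

An empty result means every node matches the reference image for the packages listed.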
It is a different paradigm from standard relational databases. We can also process tasks other than those related to the standard database world. As a result, we are capable of processing various data science tasks, e.g. natural language processing or log processing.
I've used it for three years alongside MapR and Cloudera.
Almost every part of the Hadoop ecosystem has its problems and bugs.
The paid service is pretty good. If you don't pay, the documentation available in the community is also pretty good.
I have some experience with Cloudera, which is very similar to Hortonworks, but some of its parts are not open source. I'm working more with Hortonworks because all of its parts are open source and my company has a longer partnership with Hortonworks.
It is easy if you have good administrators. It is also easy if you want to just play with it on your laptop. For real work and stability, I definitely recommend some paid support.
I was involved in multiple projects. Usually, it was done in-house with paid support.
Every project is different. Since Hadoop is a long-term infrastructure investment, there is no simple ROI. Also, each customer has different expectations.
Its flexibility is the most valuable feature because you can leverage any Hadoop component and take full advantage of its open source capabilities.
We're able to perform sentiment analysis on Twitter data.
It needs a better UI.
I used it for five months, one year ago.
It requires too much coding work; we're not good Java and Python developers.
No issues.
They are detailed and informational.
Technical Support: We got stuck with Java job development and were able to get assistance from tech support.
No previous solution was used.
It was straightforward since we have Linux resources.
We did it in-house.
You need to be tech savvy.
Its ability to scale out seamlessly with little to no effort is very valuable to us. All the tools in the stack are built from the ground up to support massive amounts of data.
It allows us to provide our customers with data insights that they previously were unable to obtain.
There have been some governance initiatives, but they are far from production ready. I would like to see a big improvement in that space, as governance is critical in many regulated industries.
I've been using it for one year.
Stability is good if configured properly, but for some tools, such as HBase, configuration is extremely hard to get right.
Scalability is superb.
I never interacted with customer support.
Cloudera and vanilla Big Data tech. We continue to use them alongside Hortonworks, depending on our clients' preferences and needs.
With Ambari, it is pretty straightforward, but I have no idea why they prefer FQDN over IP.
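If a host list has to use FQDNs rather than IP addresses, a small pre-check can flag offending entries before setup. This is only a sketch: the function name and the example hosts are hypothetical, and it checks the form of each entry, not whether it resolves in DNS.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def classify_host(host: str) -> str:
    """Return "ip", "fqdn", or "other" for a host entry (syntactic check only)."""
    try:
        ipaddress.ip_address(host)
        return "ip"
    except ValueError:
        pass
    labels = host.rstrip(".").split(".")
    if len(labels) >= 2 and all(_LABEL.match(label) for label in labels):
        return "fqdn"
    return "other"


if __name__ == "__main__":
    # Hypothetical host entries, used only for illustration.
    for entry in ["10.0.0.5", "node1.example.com", "node1"]:
        print(entry, classify_host(entry))
```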
My colleagues and I are the implementation team. The general advice is to start out with a small enough scope. Try to get an MVP up and running before bringing out the big guns.
Licensing is on a per-node basis, which encourages people to scale vertically rather than horizontally, yet the whole purpose of the tools they sell is to scale horizontally. I do like that everything is also freely available for those who do not require support.
Make sure you understand what happens under the hood. Out-of-the-box tools are sub-par. Customisation is the way to go for now.
The HDFS (Java-based file system) and Hive Utilities are proving to be most useful.
Hortonworks has allowed my organization to increase the amount of data that we regularly store from sensor data and weblogs, which in turn gives us a greater scope of data to analyze.
I would like to see an increase in usability for the Apache Storm engine within the data platform.
I have been using it for less than a year.
When initializing our cluster, we did not allocate enough space to our VAR partition and that ended up causing some issues with the networking to our onsite Tomcat server.
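A pre-flight check along these lines might catch an undersized partition before the cluster is initialized. It is a sketch only: the 50 GiB threshold is a hypothetical placeholder, not a Hortonworks recommendation.

```python
import os
import shutil


def has_enough_space(path: str, required_gib: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gib` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * 1024 ** 3


if __name__ == "__main__":
    # Hypothetical threshold; pick values that match your own sizing.
    required = {"/var": 50.0}
    for mount, needed in required.items():
        if not os.path.exists(mount):
            print(f"{mount}: not present on this machine, skipped")
            continue
        status = "ok" if has_enough_space(mount, needed) else "TOO SMALL"
        print(f"{mount}: need {needed} GiB free -> {status}")
```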
It's fairly low customer service.
Technical Support: It's fairly low technical support.
We started off with this product.
It was both straightforward and complex. Everything was easy to set up, but a lot of the behind-the-scenes configuration changes for customization could be rather time consuming.
We used an in-house team. My advice is to study hard and read all the documentation thoroughly before starting any implementation. It is paramount that one understands the system before implementing it.
Current ROI is none as we are still in the POC phase with most of our products.
Be sure that the product is necessary for the situation.
There isn't just one; the whole Hadoop stack is valuable: the distributed file system HDFS, Spark, Kafka, HBase, etc. Hortonworks has certainly got the most up-to-date version of each component of Hadoop.
Compared to the other Hadoop distributions, the Ambari server gives the user an easy way to manage, administer, and configure their cluster. Ambari also provides a single view that lets you use different Hadoop components from the same web interface.
This product lets the organization easily and quickly install and configure a Hadoop cluster. With this cluster, the organization is able to store and process its data and bring out specific patterns in it, for example, previously unknown common points between its clients, or key factors that increase or decrease client churn.
It would be interesting to have an easy way to implement multi-tenancy for HDFS with federation. At the moment, you have to do it manually from the command line.
Also, it needs to support having more than two HDFS namenodes. HDFS supports more than two namenodes, but Hortonworks doesn't.
I have worked with it on different projects and POCs for two years now.
The only issue that I had was when I tried to reinstall the software on every node. You have to manually clean up everything, as Hortonworks doesn't provide the ability to perform a clean uninstall (software, libraries, logs, configuration files, etc.). In some cases, this can cause problems if the uninstall has not been done correctly.
I never had to open a case with support, so I don't know. I always find the answers to my questions on the web (forums or blogs). There's a big community that can support you.
I also used Cloudera, MapR, and Microsoft HDInsight.
The first time, I didn’t know anything about Big Data and Hadoop, so yes it was difficult because I did not clearly understand what I was doing.
The implementation was at the client's datacenter. My advice is to perform a POC on premise or via a virtual machine to learn how to use it and how to tune the configuration of each Hadoop component.
When implementing it in production, you first need a clear view of the requirements for performing the install. For example, if you are using a local repository to install the software, it has to be updated with Hortonworks sources, especially if there are security rules (firewall access, root access limitation, etc.).
My last piece of advice is that if you have a heavy load, it is really important to implement the solution on premise, not in a virtualized environment. If you do both, you will see the difference in performance.
Hortonworks is free to use and there is no license, but support is available if you want it. It's up to you to decide whether you need it (you certainly do) and perhaps to negotiate the price.
I did not really make the choice, as the client made it based on their experience, the functionality of each distribution, the privacy of the data, and the licensing/support price.
First, perform a POC to learn and to get an idea of the load of your future applications. Then you should be able to correctly design the needed infrastructure.
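Once a POC gives a data volume estimate, a rough node count can be worked out with simple arithmetic. The sketch below assumes the HDFS default replication factor of 3. The headroom fraction, per-node capacity, and example figures are hypothetical placeholders, and it covers storage only, not CPU or memory.

```python
import math


def required_data_nodes(
    data_tb: float,
    usable_tb_per_node: float,
    replication: int = 3,      # HDFS default replication factor
    headroom: float = 0.25,    # hypothetical fraction of disk kept free
) -> int:
    """Estimate the number of data nodes needed to store `data_tb` of data."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    raw_tb = data_tb * replication / (1 - headroom)
    return math.ceil(raw_tb / usable_tb_per_node)


if __name__ == "__main__":
    # Hypothetical figures: 100 TB of data, 48 TB of usable disk per node.
    print(required_data_nodes(100, 48))
```

With these example figures, the raw requirement is 100 × 3 / 0.75 = 400 TB, which rounds up to 9 nodes.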
