I am a solution architect and this is one of the products that I implement for my customers.
Kafka works well when subscribers want to stream data for specific topics.
The most valuable feature is the performance.
Kafka is complex and there is a little bit of a learning curve.
I have been using Apache Kafka for between one and two years.
Resilience-wise, Kafka is very good.
Kafka is a very scalable system. You can build multiple scalable architectures on top of it.
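Part of what makes Kafka scale is that each topic is split into partitions spread across brokers, and records with the same key always land in the same partition. The sketch below is a simplified illustration of that idea only; Kafka's actual default partitioner uses murmur2 hashing, not MD5, which is used here merely as a deterministic stand-in.

```python
# Simplified illustration of key-based partitioning.
# Kafka's real default partitioner uses murmur2 hashing; md5 is
# used here only to get a deterministic hash across runs.
import hashlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition deterministically."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All records for the same key go to the same partition,
# which preserves per-key ordering.
p1 = choose_partition(b"user-42", 6)
p2 = choose_partition(b"user-42", 6)
assert p1 == p2
```

Because ordering is only guaranteed within a partition, choosing a good key (for example, a user id) is what lets a cluster grow horizontally without losing per-key order.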
I have not seen any problems with technical support. There is licensed support available, which is not the case with all open-source solutions. Open-source products often have issues when it comes to getting support.
I have customers who were using IBM MQ but they have been switching to open-source.
The initial setup was straightforward for me. However, it is not straightforward for everyone because there are some tricky things to implement. In single-node mode it is a little bit easier, but when it is set up as a distributed system it is more complex because there are a lot of things to consider.
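To give a flavour of what the distributed setup involves, each broker in a cluster needs its own identity and wiring, along the lines of the following (hostnames and values are placeholders for illustration, not a recommended production configuration):

```properties
# server.properties for one broker in a three-node cluster (example values)
broker.id=1
listeners=PLAINTEXT://kafka1.example.com:9092
zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
```

Every broker gets a unique `broker.id`, and all of them must agree on the coordination ensemble, which is one of the "things to consider" that a single-node setup avoids.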
Kafka is open-source and it is cheaper than any other product.
There is a competing open-source solution called NATS but I see that Apache Kafka is widely used in many places.
Performance-wise, Kafka is better than any of the other products.
This is currently the product that I am recommending to customers. Some customers want an open-source solution.
There are some newer products that are coming on to the market that are even faster than Kafka but this solution is very resilient.
In the long run, I think that open-source will dominate the space.
I would rate this solution a seven out of ten.
We primarily use the solution for big data. We often get a million messages per second, and with such a high output we use Kafka to help us handle it.
When we're working with big data, we need high-throughput processing, which is something that Kafka provides and something we find extremely valuable. It helps us support our computing and ensures there's no loss of data. It can even replicate the data.
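The no-data-loss behaviour described here comes from combining replication on the broker side with acknowledgements on the producer side. A hedged example of the settings typically involved (example values only, not tuning advice):

```properties
# Broker/topic side (server.properties): keep copies on multiple brokers
default.replication.factor=3
min.insync.replicas=2

# Producer side: wait until all in-sync replicas have the record
acks=all
enable.idempotence=true
```

With `acks=all` and `min.insync.replicas=2`, a write is only acknowledged once at least two replicas have it, so losing a single broker does not lose acknowledged data.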
The delivery of data is its most valuable aspect.
It's an easy-to-use product overall.
The solution is quite mature.
It's an open-source product, which means it doesn't cost us anything to use it.
We're still going through the solution. Right now, I can't suggest any features that might be missing. I don't see where there can be an improvement in that regard.
The speed isn't as fast as RabbitMQ, even though the solution touts itself as very quick. It could be faster. They should work to make it at least as fast as RabbitMQ.
The UI is based on command line. It would be helpful if they could come up with a simpler user interface.
They should make it easier to configure items on the solution.
The solution would benefit from the addition of better monitoring tools.
I've been using the solution for six months.
The solution is a bit slow in comparison to RabbitMQ. It's supposed to be a very fast solution, and it has okay performance, but speed-wise, it's quite slow.
The scaling of the solution is quite good.
In terms of technical support, we don't get that directly from Apache Kafka. We have certain cloud data distribution so we get assistance from our cloud data support.
We're continuously deploying the product. We're still in the process of deployment.
It's an open-source product, so the pricing isn't an issue. It's free to use. We don't have costs associated with it.
I'm not the product owner, so I didn't have a say in what should be chosen. We were seeing a high throughput with Kafka which is why we ultimately chose it.
I'd rate the solution eight out of ten. It's good at scaling and, performance-wise, it's excellent. If they could improve the UI and allow for easier configuration, I'd rate it higher.
I'm a software architect. The use case depends on my customers. They usually use it for data transfer from static files to a legacy system.
In the next release, I would like there to be some authorization features and TLS security.
We also need better tooling and better monitoring.
I have been using Apache Kafka for the last ten years.
The stability is good. We've never had any issues.
Scalability is very good.
I have never needed to contact technical support. My colleagues get support from here, in Morocco.
The setup is not a big deal for us. We can handle it. After the system is set up, whoever administers it does so through Apache Kafka itself.
Depending on the setup, it will usually take two weeks.
I would rate it a nine out of ten. Not a ten because of the monitoring and admin improvement I'd like for them to make.
We are currently using this solution on our cloud-based clusters.
We use Kafka as part of our services. Our product (cloud clusters) has many components and Kafka is one of them.
For example, we use Kafka as a data integration tool. If you take Oracle GoldenGate as a typical use case, what happens is GoldenGate collects the data for the replication and sends this data to the Kafka servers. We collect the data on the Kafka servers, and we create some transformations, some operations, from that data. We then copy the data to the HTTP or hub site.
Previously, when I worked at Nokia, we collected data using Kafka and stored it on the Kafka servers. We did all transformations through Kafka Streams. Later, Kafka moved the data over to the HP site.
Kafka has a good storage layer on its side. I can store the data as it streams and, if we encounter any error, for example on the network or a server, we can later use that data to do some analytics on it from the Kafka server.
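The storage-and-replay behaviour described above can be pictured as an append-only log that consumers read by offset; after a failure, a consumer simply resumes (or rewinds) from a saved offset. The following is a toy model of that idea, not the actual Kafka implementation:

```python
# Toy append-only log illustrating offset-based replay.
class Log:
    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Store a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int):
        """Re-read everything from a given offset (replay)."""
        return self._records[offset:]

log = Log()
for event in ["login", "search", "purchase"]:
    log.append(event)

# A consumer that crashed after committing offset 1 can replay from there.
assert log.read_from(1) == ["search", "purchase"]
```

Because the broker keeps the records rather than deleting them on delivery, the same data can feed both the live pipeline and a later analytics pass.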
Kafka provides us with a way to store the data used for analytics. That's the big selling point. There's very good log management.
Kafka provides many APIs that can be flexible and can be placed or expanded using the development life cycle. For example, using Java, I can customize the API according to our customers' demands. I can expand the functionality according to our customer demands as well. It's also possible to create some models. It allows for more flexibility than much of the competition.
If the graphical user interface for Kafka administration were easier, it would be a much better product. Right now, you need to manage it through a command-line interface.
I've been using the solution for more than three years.
The solution is quite stable. We haven't encountered any issues on the Kafka side. However, the ability to create custom stabilizations would be good for dealing with stability issues when they do arise.
The scalability of the solution is very good. You can scale horizontally, and the cluster can be brought over to the cloud side with the Kafka servers.
We use the solution for both small and medium-sized organizations, but also larger enterprises. Some of our clients are in the banking and financial sector.
Officially, I did not create any Kafka support tickets through the configuration support that is offered. I have posted some questions on Stack Overflow, however. Technical support is very good and I've found the responses very quick, typically giving you an answer within a day.
We didn't previously use a different solution. We did some applications with Java for the consumer content but not the application function within that. We did objects instead.
The initial setup isn't too complex. I know Kafka very well and don't find it to be overly difficult. There's also very good documentation which users can take advantage of.
Deployment, including security integration, only took about one day.
Two people handled the deployment. One person created the authentication group and, after creating groups and users, another handled topic authentication and user definition for the customer.
I handled the implementation for cloud-based clusters. I defined the broker nodes and other nodes for Kafka. We are a cloud integrator, so we handled it ourselves.
I'm unaware of the costs surrounding licensing and setup.
We're using version 2.1.30 of the solution for our cloud-based clusters. We use the on-premises deployment model; most customers use the on-premises solution for their cloud-based clusters.
Kafka is a very good solution for log management. If you need anything done related to log management, Kafka can do it. Kafka can also store the data in the brokers. This prevents data loss as well as the duplication of data. It's quite comprehensive.
I'd rate the solution seven out of ten. If the solution could provide a user interface, I'd rate it higher. This is important for managing Kafka's clusters on the administration side. It would also be helpful if the two or three configuration files could be consolidated into one.
In my previous company, we had a proprietary implementation and we replaced it with Kafka. We changed it because Kafka had many different connectors available and it also allowed us to create a window into our products for the client. It was an on-premise product and it allowed the client to take the data out without us developing anything.
You can connect in any language, and there are a lot of connectors available, which helps a lot. It also creates visibility into the data, and it's stable. There are several alternatives, but this is one of the best options for this.
It's very easy to install and it's pretty stable.
The possibility to have connectors is very helpful. Another valuable aspect is that it's mature and open-source.
From a scalability point of view, you just add servers and it's scalable. The whole architecture is very scalable.
There is a feature that we're currently using called MirrorMaker. We use it to combine the information from different Kafka servers into another server. It's very flexible and covers a very generic scenario. I think it would be great if this capability existed out of the box rather than as a third-party component. The third-party component is not very stable and sometimes you have problems with it. There are some developments in newer versions and we're about to try them out, but I'm not sure if they close the gap.
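For reference, newer Kafka releases ship MirrorMaker 2, built on Kafka Connect, which addresses some of the cross-cluster replication the reviewer describes. A minimal replication configuration might look like this (cluster names and addresses are placeholders):

```properties
# mm2.properties: replicate topics from cluster A to cluster B (example)
clusters=A, B
A.bootstrap.servers=kafka-a.example.com:9092
B.bootstrap.servers=kafka-b.example.com:9092
A->B.enabled=true
A->B.topics=.*
```

This is run with the `connect-mirror-maker.sh` tool shipped with Kafka; whether it fully closes the stability gap the reviewer mentions depends on the version in use.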
I have been using this solution for six months. I also worked with it additionally in my previous company but not so intensively.
I have never needed to use technical support. I know it's available but we haven't needed it because there's a lot of information on the internet that has helped us to solve our issues.
I would definitely recommend Kafka. In our current position, we use it to move a lot of data and I think it's definitely working well. I would definitely recommend it.
I would rate it an eight out of ten.
It's convenient and flexible for almost all kinds of data producers. We integrated it with Kafka Streams, which can perform some easy data processing, such as summaries, counts, grouping, etc.
It eases our current data flow and framework, which ingests all types of sources, regardless of whether the data is structured or not.
With such a large ingest, I was genuinely impressed that the processing is almost real-time.
Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc.
Through its publisher-subscriber pattern, Kafka has allowed our applications to access and consume data in real time.
I like the performance and reliability of Kafka. I needed a data streaming buffer that could handle thousands of messages per second with at-least-once processing for an analytics pipeline. Kafka fits this requirement very well, as it is a fast, distributed message broker. It definitely does exactly what it is designed to do.
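At-least-once delivery means a consumer may see the same message twice after a retry, so downstream processing is typically made idempotent. A minimal sketch of deduplication by message id (the id field and the uppercase "work" are illustrative assumptions, not part of Kafka itself):

```python
# Idempotent processing under at-least-once delivery:
# track already-seen message ids and skip duplicates.
def process_all(messages, seen=None):
    """Apply processing once per unique message id."""
    seen = set() if seen is None else seen
    results = []
    for msg_id, payload in messages:
        if msg_id in seen:
            continue  # duplicate redelivery, already handled
        seen.add(msg_id)
        results.append(payload.upper())  # stand-in for real work
    return results

# The redelivery of message 1 is ignored, so output is not duplicated.
assert process_all([(1, "a"), (2, "b"), (1, "a")]) == ["A", "B"]
```

In a real pipeline the "seen" set would live in durable storage (or the work itself would be naturally idempotent, such as an upsert), but the principle is the same.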
As an open-source project, Kafka is still fairly young and has not yet built out the stability and features that other open-source projects have acquired over many years. If done correctly, Kafka could also take over the stream-processing space that technologies such as Apache Storm cover.
Currently, as it is in the big/fast data integration world, you need to piece together many different open-source technologies. For example, creating a reliable, fault-tolerant stream-processing system that ingests data already requires combining several of them.
This is simply to ingest the data and does not necessarily account for the analytical pieces, which may consist of Spark ML, SystemML, ElasticSearch, Mahout, etc.
What I'm getting at is basically the need for a Spring framework for big data.
The only stability issues we had were mostly a result of the evolving APIs and existing bugs.
Kafka is designed to be very easily scalable so I did not have any trouble here.
We used the open-source version and did not buy support from Confluent.
We did not have any other previous solutions. Our project was green field and a new type of project development.
Initial setup was straightforward. We simply hosted multiple Kafka brokers and ZooKeeper servers on AWS EC2 instances.
We implemented it in-house and then went with the Hortonworks Data Platform distribution.
We evaluated AWS Kinesis as well.
Kafka is open source and requires an administrator to maintain the servers.
The most valuable features are performance, persistent messaging, and reliability. It allows us to persist the message for a configurable number of days, even after it has been delivered to the consumer. The message delivery is also fast.
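The configurable retention mentioned here is controlled by broker-level or per-topic settings; for example (values are illustrative, not a recommendation):

```properties
# Keep messages for 7 days, whether or not they have been consumed
log.retention.hours=168
# Or cap the retained size per partition (1 GiB here)
log.retention.bytes=1073741824
```

Because retention is time- or size-based rather than tied to delivery, a consumer can re-read a message days after it was first delivered.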
We wanted to track customer activities on our application and store those details on another system (RDBMS/Apache Hadoop). We do extensive analysis with that data. This helps the company analyze customer activities, such as search terms, and improve accordingly.
It’s perfect for our requirements.
I have been using Apache Kafka for two years.
We have had no issues with stability.
We have had no issues with scalability.
We use the open source one, so we did not opt for any technical support.
We started to use Apache Kafka with our application from scratch.
The initial setup was straightforward. We faced some issues during development in areas such as the message producer and consumer. We rectified those by tweaking the producer and consumer configurations. The documentation is very good.
I don’t have any idea, as we use the open source version.
It's a high-performance distributed system. If you want to track user activities or do any stream processing, then this is perfect. We have used Kafka in Docker for our implementation. It's very easy to set up and test. You could also try the same.
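As the reviewer notes, running Kafka in Docker is a convenient way to set up a local test environment. One common approach is a small Compose file along these lines (image names, versions, and ports are assumptions; check the image's own documentation):

```yaml
# docker-compose.yml: single-broker Kafka for local testing (example)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.4.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

A single-broker setup like this is only for development; the replication factor of 1 means it offers none of the durability guarantees of a real cluster.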