What is our primary use case?
My main use case for ClickHouse is data ingestion, along with its OLAP properties. We had use cases where database locks were slowing us down, and because ClickHouse does not have that problem, we chose it.
A specific example is parsing indicators of compromise (IOCs), among other things. A lot of that data has to be processed very quickly and ingested into the database for further processing and for identifying which IOCs are compromised, and the lack of database locks helped us there.
ClickHouse helped us solve that problem. It was one of the two databases we used at Cyware: one was Postgres, the other was ClickHouse.
What is most valuable?
The best features ClickHouse offers are its OLAP features. The absence of database locks, together with eventual consistency, is the biggest feature for us and the one that solved our problems.
The lack of database locks and the eventual consistency benefit my team in terms of speed and reliability. Once data is ingested, we have to process it quickly and show the output to the user. Say there are ten indicators of compromise: we have our own database where we tally whether the IPs or IOCs we are scanning right now have been marked or red-flagged before, so we have to scan them, process them, and return an output quickly. That is where ClickHouse helped with reliability and speed, while eventual consistency is used on a different side of the product.
ClickHouse has positively impacted my organization. We ran an entire exercise to decide which database to use for our problems, found that ClickHouse performed the best, and adopted it on that basis.
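A minimal sketch of the tally step described above, for illustration only. It is not the production code, it does not talk to ClickHouse, and every name in it (`tally_iocs`, `TallyResult`, the sample addresses) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TallyResult:
    indicator: str
    previously_flagged: bool


def tally_iocs(incoming, known_flagged):
    """Mark each incoming indicator as previously flagged or not.

    Both arguments are hypothetical stand-ins: `incoming` is a batch of
    freshly parsed indicators, `known_flagged` holds the indicators that
    were already red-flagged in the team's own database.
    """
    flagged = set(known_flagged)  # set membership is O(1) on average
    return [TallyResult(ioc, ioc in flagged) for ioc in incoming]


# Sample data uses documentation-reserved IP ranges, not real indicators.
batch = ["203.0.113.7", "198.51.100.23", "192.0.2.44"]
known = {"198.51.100.23"}

results = tally_iocs(batch, known)
hits = [r.indicator for r in results if r.previously_flagged]
```

In practice the known set would live in the database rather than in memory; the sketch only shows the shape of the check that has to run quickly after each ingest.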
What needs improvement?
ClickHouse can be improved on the documentation side. There is also one small constraint mentioned in the ClickHouse documentation, a partition limit of ten thousand, that we hit. If that limit could be increased, or if workarounds were available, that would be great.
I chose nine out of ten because of the improvement areas I mentioned. We hit the ten-thousand-partition limit quite frequently. With some schema manipulations we did manage to find a workaround, but it took some bandwidth, and it could have been avoided had the documentation better explained how to approach the problem differently.
Besides the documentation and the partition limit, I do not have any other improvements to suggest for ClickHouse.
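One general way a partition count can be brought down is a coarser partition key. The sketch below is illustrative only and is not the schema change referred to above, which is not detailed here. It simply counts how many distinct partitions a date range would produce when partitioned by day versus by month. The date range is an arbitrary assumption, and the ten thousand figure is the one quoted in this review, not independently verified.

```python
from datetime import date, timedelta


def count_partitions(start, end, granularity):
    """Count distinct partition values for every date in [start, end].

    `granularity` is "day" or "month", mirroring a table partitioned by
    calendar day versus by calendar month.
    """
    keys = set()
    current = start
    while current <= end:
        if granularity == "day":
            keys.add(current)
        elif granularity == "month":
            keys.add((current.year, current.month))
        else:
            raise ValueError(f"unknown granularity: {granularity}")
        current += timedelta(days=1)
    return len(keys)


PARTITION_LIMIT = 10_000  # figure quoted in the review, treated as an assumption

# Arbitrary 30-year range chosen only to make the arithmetic visible.
start, end = date(2000, 1, 1), date(2029, 12, 31)
daily = count_partitions(start, end, "day")
monthly = count_partitions(start, end, "month")

print(f"daily partitions:   {daily} (over limit: {daily > PARTITION_LIMIT})")
print(f"monthly partitions: {monthly} (over limit: {monthly > PARTITION_LIMIT})")
```

The same counting applies to compound keys, where the number of partitions is the product of the distinct values in each part of the key, so counts grow quickly.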
For how long have I used the solution?
I have been using ClickHouse for about a year, maybe slightly more than that.
What do I think about the stability of the solution?
ClickHouse is stable, and we did not encounter stability issues in production. In the dev environment, one of the senior engineers did flag one specific point where we found some inconsistencies, although I think they found a workaround. Overall, it was stable for us.
What do I think about the scalability of the solution?
ClickHouse's scalability is good.
Which solution did I use previously and why did I switch?
We previously used Postgres. The issues we encountered with it, as I mentioned, are why we ran a study on switching.
What was our ROI?
I have seen a return on investment: on the engineering side, we had improvements in database performance. For the metrics asked about, such as time saved or fewer employees needed, I do not have figures.
What's my experience with pricing, setup cost, and licensing?
The setup cost was just my own bandwidth. Licensing and pricing were handled by other members of the team, so they were abstracted away from me and I am not aware of the details.
Which other solutions did I evaluate?
Before choosing ClickHouse, we evaluated other options such as Apache Druid and Apache Pinot as part of a study.
What other advice do I have?
My advice to others looking into ClickHouse is that, on the engineering side, it is worth considering if there is an OLAP use case, if data needs to be ingested at very high rates, or if there is a use case for eventual consistency. I would rate ClickHouse nine out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Disclosure: My company does not have a business relationship with this vendor other than being a customer.