What is our primary use case?
We use Datadog primarily for observability and monitoring. Cross-functional teams, including developers, QA, DevOps, and SRE, have built a variety of dashboards.
There are also some dashboards created for senior leadership to keep tabs on day-to-day activities such as cost, scale, and issues.
We've also set up monitors and alarms that fire when a metric goes beyond its threshold. Through the Slack and PagerDuty integrations, the right team members get alerted and follow the relevant runbooks to resolve the issue.
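To make that flow concrete, here is a minimal sketch of threshold monitors routing alerts to the right people. It is an illustration only: it does not use Datadog's API, and the `Monitor` class, the `evaluate` function, the metric name, the channel handles, and the runbook path are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """A hypothetical threshold monitor (not a Datadog object)."""
    name: str
    metric: str        # metric key to watch (assumed name)
    threshold: float   # alert when the value goes beyond this
    notify: list       # assumed channel handles, e.g. Slack or PagerDuty targets
    runbook: str       # assumed pointer to the runbook the responder follows


def evaluate(monitors, latest_values):
    """Return one alert per notification target for every breached monitor."""
    alerts = []
    for monitor in monitors:
        value = latest_values.get(monitor.metric)
        # Skip metrics with no data; alert only when strictly above the threshold.
        if value is None or value <= monitor.threshold:
            continue
        for target in monitor.notify:
            alerts.append({
                "target": target,
                "monitor": monitor.name,
                "value": value,
                "runbook": monitor.runbook,
            })
    return alerts


if __name__ == "__main__":
    monitors = [
        Monitor(
            name="High DB CPU",
            metric="db.cpu.percent",
            threshold=90.0,
            notify=["slack:#db-oncall", "pagerduty:db-team"],
            runbook="runbooks/db-high-cpu",
        )
    ]
    for alert in evaluate(monitors, {"db.cpu.percent": 95.0}):
        print(alert)
```

In practice the monitors, thresholds, and notification targets are defined in Datadog itself; the sketch only shows the shape of the logic: one breached threshold fans out to every configured channel and carries a runbook reference with it.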
How has it helped my organization?
Datadog metrics have helped the organization in several ways. With one centralized place for monitoring, it takes far less effort to keep track of the health of our systems and applications.
It also helps teams be proactive and deal with issues before customers escalate them.
Lastly, the large number of integrations makes life a lot easier for the DevOps and SRE teams when automating the detection and resolution of issues hidden in the system or applications. Overall, it has helped a lot.
What is most valuable?
My favorite feature is dashboard creation, as it lets me sleep calmly at night without having to keep watch on critical system metrics. Whether it's DB metrics or computer-related metrics, they are always easy to view.
The ease of correcting these dashboards and widgets when needed is amazing.
The only issue I face is that when more than one person is editing a dashboard simultaneously, one of them sometimes loses their work. That said, they will resolve that soon. With the variety of widgets, it's easy to plot the data in a timely manner, which makes monitoring a lot easier.
What needs improvement?
The solution can be improved in a few areas.
First, parallel editing of a dashboard should not cause any user to lose their work.
Second, we would like to see more demos of tools that are in beta when they go live. I am sure they will help us a lot.
For how long have I used the solution?
I've been using the solution for slightly over two years.
What do I think about the stability of the solution?
I find the solution to be very stable.
What do I think about the scalability of the solution?
I totally love it. It is scalable.
Which solution did I use previously and why did I switch?
We previously used Sumo Logic.
How was the initial setup?
The initial setup is not so difficult.
What about the implementation team?
We implemented the solution in-house.
What was our ROI?
The ROI is very fair so far.
What's my experience with pricing, setup cost, and licensing?
I can't recommend the licensing.
Which other solutions did I evaluate?
I was not involved in any pre-evaluation process.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.