Datadog Reviews and Pricing

Ramon Snir

CTO at a tech vendor with 1-10 employees

Oct 31, 2022

Increases delivery velocity with less manual testing and good integrations

Pros and Cons

"Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity."
"From the number of outages stopped or shortened (which lead to lost revenue from non-renewals) and the number of hours saved on investigations (which correlates to engineering salaries), I estimate the ROI of the implementation time and monthly charges to be between 10x and 20x."

"Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products."

What is our primary use case?

We use Datadog for three main use cases, including:

Infrastructure and application monitoring. This ensures that our services are available and performant at all times, and allows us to proactively address incidents and outages without customers contacting us. It includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions to customer experience).
Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes.
End-to-end regression testing for APIs and browser-based experiences. Datadog's synthetic testing periodically checks that the system behaves exactly as expected. This is often used as a canary to detect issues even before users reach them organically.

How has it helped my organization?

Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity.

We have seen time after time that the monitors we have carefully created based on all ingested data are detecting issues quickly and accurately.

This means we allow ourselves to manually test things less frequently. We have also had an easier time investigating application errors and slowness using Datadog's APM and log explorer products which allow us to introspect any part of the system, in its execution context.

What is most valuable?

The most valuable features include:

Integrated observability data ingestion. All data that Datadog collects is connected. This allows us to easily connect logs with failed requests, and slow database queries with services and requests.
Broad integrations allow us to monitor our entire production environment in a single place, not just cloud resources. Since all parts stream metrics, logs, and events to Datadog, we can have unified dashboards and manage monitors and incidents all from the same page.
A high level of configuration. We can configure and modify many parts, from how data is collected from our applications to how Datadog parses and visualizes it. This means that we always get the best experience, and we don't need to find ten different products that do small things well or settle on one product that does everything badly.

What needs improvement?

Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products.

Older, more mature products tend to be complete (many features, customization, broad integrations, etc.), while newer products will often be at a "just above minimum viable product" phase for a long time, doing what's intended yet missing valuable customizations and integrations.

Buyer's Guide

Datadog

June 2026

Free Report: Datadog Reviews and More

Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: June 2026.

DOWNLOAD NOW

900,747 professionals have used our research since 2012.

For how long have I used the solution?

We've used the solution for 12 months.

What do I think about the scalability of the solution?

The solution scales very well on technical aspects, being able to ingest large quantities of data from many services. However, the pricing often doesn't scale naturally, and effort has to be put in to keep ongoing costs at a reasonable amount.

How are customer service and support?

Customer service and support are generally very high-quality. In most cases, they reply very quickly and offer well-researched and relevant responses. This is contrasted with many vendors who take a long time to reply and send links to documentation instead of understanding the problem.

However, we had cases where support took several weeks to reply to a complicated request and sometimes eventually responded that the issue cannot be resolved. These are rare edge-case occurrences.

How was the initial setup?

A large part of the initial setup was straightforward. We were able to collect about 80% of the relevant and 90% of the meaningful insights from just a couple of hours of connecting the AWS integration and the Datadog APM agent.

Getting it to 100%, and configuring and customizing things to our unique situation, took about two weeks. Datadog's documentation and support team were extremely helpful during both phases.

What about the implementation team?

We handled the setup in-house.

What was our ROI?

From the number of outages stopped or shortened (which lead to lost revenue from non-renewals) and the number of hours saved on investigations (which correlates to engineering salaries), I estimate the ROI of the implementation time and monthly charges to be between 10x and 20x.

What other advice do I have?

We use the solution as a SaaS deployment.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

reviewer2003202

Architect at a comms service provider with 10,001+ employees

Oct 31, 2022

Good for monitoring and following metrics with a helpful flame graph

Pros and Cons

"Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests in our microservice environment with 170 services."
"The APM has made observability and tracing more accessible to developers."

"I often have issues with the UI in my browser."
"I often have issues with the UI in my browser. I tend to have a lot of tabs open, yet have issues with it not responding or not showing data."

What is our primary use case?

We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems.

We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work.

However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.

How has it helped my organization?

Previously, the team only had Instana, and few people used it. The main barriers to entry were the access (since it was not integrated into our SSO) and the user experience, which made it hard to follow. We had an on-prem version, and it wasn't the snappiest. The APM has made observability and tracing more accessible to developers.

What is most valuable?

Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests. In our microservice environment with 170 services, there are complex transactions over the course of a single user request, since we essentially operate as a middle layer integrating with 90 back-office systems.

What needs improvement?

I often have issues with the UI in my browser. I tend to have a lot of tabs open, yet have issues with it not responding or not showing data. A couple of times, pasting the URL into an incognito window showed the data that was there.

For how long have I used the solution?

I've used the solution for two years.

How was the initial setup?

The initial setup was complex and required a bit of tweaking to get everything configured correctly and into our pipelines.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

reviewer2000466

Senior Cloud Engineer, Vice President of Monitoring at a financial services firm with 10,001+ employees

Oct 31, 2022

Good ServiceNow integration, helpful API crawlers, and useful APM metrics

Pros and Cons

"The seamless integration between Datadog and hundreds of apps makes onboarding new products and teams a breeze."
"Using the product has caused a paradigm shift in how we deploy monitoring."

"It seems that admin cost control granularity is an afterthought."
"The real issue with this product is cost control."

What is our primary use case?

We are using the solution for migrating out of the data center. Old apps need to be re-architected. We are planning on moving to multi-cloud for disaster recovery and to avoid vendor lock-in.

The migration is a mix between an MSP (Infosys) and in-house developers. The hard part is ensuring these apps run the same in the cloud as they do on-premises. Then we also need to ensure that we improve performance when possible. With deadlines approaching quickly, it's important not to cut corners, which is why we needed observability.

How has it helped my organization?

Using the product has caused a paradigm shift in how we deploy monitoring. Before, we had a one-to-one lookup in ServiceNow. This wouldn't scale, as teams wouldn't be able to create monitors on the fly and would have to wait on us to contact the ServiceNow team to create a custom lookup. Now, in real-time, as new instances are spun up and down, they are still guaranteed to be covered by monitoring. This used to require a change request, and now it is automatic.

What is most valuable?

For us, the most valuable features are infrastructure and APM metrics.

The seamless integration between Datadog and hundreds of apps makes onboarding new products and teams a breeze.

We rely heavily on the API crawlers Datadog uses for cloud integrations. These allow us to pick up and leverage the tags teams have already deployed without having to also make them add the tags at the agent level. Then we use Datadog's conditionals in the monitor to dynamically alert hundreds of teams.

With the ServiceNow integration, we can also assign tickets based on the environment. Now our top teams are using the APM/profiler to find bottlenecks and improve the speed of our apps.

What needs improvement?

The real issue with this product is cost control. For example, when logs first came out they didn't have any index cuts. This caused runaway logs and exploding costs.

It seems that admin cost control granularity is an afterthought. For example, synthetics have been out for over four years, yet there is no way to limit teams from creating tests that fire off every minute. If we could say you can't test more than once every five minutes, that would save us 5X on our bill.
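
As a rough illustration of the arithmetic behind that estimate, here is a minimal Python sketch. It assumes the synthetics bill is proportional to the number of test runs, which the review implies but does not state. The function name, the 30-day month, and the single test location are hypothetical choices made only for this example.

```python
MINUTES_PER_DAY = 24 * 60


def runs_per_month(interval_minutes: int, days: int = 30, locations: int = 1) -> int:
    """Count executions of one synthetic test over a billing period.

    Assumes the test fires at a fixed interval from each location
    (an illustrative model, not Datadog's actual billing formula).
    """
    return (MINUTES_PER_DAY // interval_minutes) * days * locations


every_minute = runs_per_month(1)   # test fires once a minute
every_five = runs_per_month(5)     # test capped at once every five minutes

# Ratio of run counts; under the proportional-cost assumption this is
# also the ratio of the two bills.
reduction_factor = every_minute / every_five
print(every_minute, every_five, reduction_factor)
```

Under these assumptions, capping the frequency at one run every five minutes cuts the number of runs, and therefore the cost, by a factor of five.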

For how long have I used the solution?

I've used the solution for about three years.

What do I think about the stability of the solution?

The solution is very stable. There are not too many outages, and they fix them fast.

What do I think about the scalability of the solution?

It is easy to scale. That is why we adopted it.

How are customer service and support?

Before we had premium support, I avoided contacting support because it was so bad.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We previously used AppDynamics. It isn't built for the cloud and is hard to deploy at scale.

How was the initial setup?

The initial setup was not difficult. We just had to teach teams the concept of tags.

What about the implementation team?

We did the implementation in-house. It was me. I am the SME for Datadog at the company.

What was our ROI?

The solution has saved months of time and reduced blindspots for all app teams.

What's my experience with pricing, setup cost, and licensing?

I'd advise users to be careful with logs and the APM as those are the ones that can get expensive fast.

Which other solutions did I evaluate?

We looked into Dynatrace. However, we found the cost to be high.

Which deployment model are you using for this solution?

Hybrid Cloud

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

reviewer1996521

Engineering Manager at Indeed.com

Oct 30, 2022

Transparent, easy to use, and integrates well with Slack

Pros and Cons

"Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack)."
"Datadog APM immensely improved our ability to understand the reasons behind production issues."

"I would like better navigability across pages."
"Particularly as Datadog starts offering more platform capabilities like APM, Watchdog, shift left initiatives like instrumentation, continuous testing, intelligent test runner, and Synthetic and real user monitoring, the UI can become more and more clunky, giving users a very frustrating experience."

What is our primary use case?

I primarily use the solution to learn, watch and monitor business and engineering metrics in the production and QA environments of my team.

We create monitors on key business metrics and observe regressions and anomalies.

Less often, I leverage the events ability in Datadog to get notified about significant activities happening in my teams' deployments.

We learn about Datadog monitor alerts through Slack and often attempt to create SLOs using Terraform.

We use APM for observability.

Most recently, I learned about Watchdog alerts, which I will be looking into closely.

How has it helped my organization?

Datadog made it easy for me to watch and add monitors on any metric emitted by any team at my organization.

Datadog APM immensely improved our ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly to understand the time spent at each critical stage of a production request is helpful. This, combined with Datadog's historical ability to show business metrics aside, helped get more powerful insights much more quickly.

Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack).

What is most valuable?

The most valuable aspects include:

The ability to monitor any team's metric in my company (transparency)
The ability to create/clone dashboards for myself (ease of use)
Its integration with Slack (it is very powerful)
The ability to add monitors on any metric emitted by any team at my organization
(Through Datadog APM) the ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly in order to understand the time spent at each critical stage of a production request is key. This, combined with Datadog's historical ability to show business metrics aside, helped me get more powerful insights much more quickly.
(Through integrations like Slack and PagerDuty) the ability to receive alerts right to the most common notification method we use (our mobile devices and Slack), which saves a lot of time and helps us maintain focus.

What needs improvement?

I would like better navigability across pages. The UI/UX is powerful, yet less intuitive. A lot of times, I somehow navigate across buttons and pages, and I end up forgetting how to get back to a particular view that was more insightful.

Particularly as Datadog starts offering more platform capabilities like APM, Watchdog, shift left initiatives like instrumentation, continuous testing, intelligent test runner, and Synthetic and real user monitoring, the UI can become more and more clunky, giving users a very frustrating experience.

For how long have I used the solution?

I've used the solution for five to six years.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

it_user1043778

Senior Engineer at an educational organization with 5,001-10,000 employees

Aug 23, 2022

I like the amount of tooling and the number of solutions they sold with their monitoring.

Pros and Cons

"I like the amount of tooling and the number of solutions they sold with their monitoring. Datadog was highly intuitive to use."
"Datadog is a complete solution with easy-to-use templates and excellent scalability."

"Datadog needs more local Asia-Pacific support, and if they don't have a SaaS solution in Asia-Pacific, they should offer an on-prem version. I'm told that's not possible."
"I'd rate Datadog support four out of 10. It was primarily an issue with support in the Asia-Pacific region."

What is our primary use case?

Datadog is a SaaS solution we tried for URL and synthetic monitoring. You record a transaction going into a website and replay that transaction from various locations. Datadog is mainly used by the admin, but three or four other guys had access to the reports and notifications, so it's five altogether.

We probably tried no more than 8 percent of what Datadog can do. There are so many other bits and modules. I've only gone into about half of what APM can do in the Datadog stack.

How has it helped my organization?

We could detect outages on particular websites or problems in specific locations. If I had paid for the full solution, I'm sure I could get a lot of value out of Datadog.

What is most valuable?

I like the amount of tooling and the number of solutions they sold with their monitoring. Datadog was highly intuitive to use.

What needs improvement?

Datadog needs more local Asia-Pacific support, and if they don't have a SaaS solution in Asia-Pacific, they should offer an on-prem version. I'm told that's not possible.

For how long have I used the solution?

I have used Datadog for about two or three years.

What do I think about the scalability of the solution?

I was only using Datadog to monitor on a small scale.

How are customer service and support?

I'd rate Datadog support four out of 10. It was primarily an issue with support in the Asia-Pacific region. I sent them several emails, and they responded around three weeks later.

They said it went around the houses. Nobody knew who to respond to. That's not good enough. They should have at least told me they'd received the email. I used to work in support.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We were just trying Datadog, and we've switched temporarily to Site24x7. We're looking for one of the bigger ones. They've all given us proposals, whereas Datadog hasn't come forward with a proposal for what they could do.

I used Datadog because I already had a relationship with them at a previous company. However, that guy's moved on now, and I wanted to see how good they were.

How was the initial setup?

Setting up Datadog is pretty straightforward. I have a lot of experience doing that sort of thing. It took maybe a day and a half to deploy because I was picking externally facing websites.

I deployed it by myself. One person is enough for the small system we had. However, if we were moving forward, I'd recommend at least two or three people to manage it.

What's my experience with pricing, setup cost, and licensing?

Datadog would've cost around $850 a month based on the loads we were doing, and you could estimate roughly what you would be paying monthly. I liked their pricing model. It was flexible, so you only paid for what you used. I rate Datadog pricing eight out of 10.

Which other solutions did I evaluate?

We looked at several URL and APM monitoring solutions like Site24x7 and Pingdom. They weren't big players like Dynatrace or any of those that had already provided us a request for information.

What other advice do I have?

Even with our negative experiences, I'd still give Datadog an eight out of 10. Datadog is a complete solution with easy-to-use templates and excellent scalability. People should know exactly what they're going to configure before they try it out. The trial is brief. Don't start a trial until you know exactly what you're going to do.

You must be certain that you can meet any internal security requirements. If you're in the Asia-Pacific region, you might not be able to run something that's running abroad.

Which deployment model are you using for this solution?

Public Cloud

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Nuno Rosa

Principal Consultant at a tech vendor with 10,001+ employees

Jun 17, 2022

Easy to set up and good UI but needs better customization capabilities

Pros and Cons

"The many dozens of integrations that the solution brings out of the box are excellent."
"The UI, basically, is the most valuable aspect of the solution."

"Deploying the agents is still very manual."

What is our primary use case?

The solution is basically used for servers and applications.

What is most valuable?

The UI, basically, is the most valuable aspect of the solution. I really like the look and feel of the solution. It's not very distinctive now since other players have caught up; however, they were the first in the market to present such an effective UI.

The many dozens of integrations that the solution brings out of the box are excellent.

It's easy to set up.

What needs improvement?

Deploying the agents is still very manual.

Network monitoring could be better or rolled into this solution so that you do not have to buy a different product.

Customization of the tool itself should be taken into account. At the moment, although what they provide out of the box is good, they don't offer many customization possibilities. I know it's difficult; however, it's something that they need to look at. When customers have specific requirements, they want customization, and we cannot do it.

For how long have I used the solution?

I've been dealing with the solution for five years.

What do I think about the stability of the solution?

It's quite stable. I have never had an issue in regard to reliability, so it's very stable.

What do I think about the scalability of the solution?

It's very scalable. I have never reached the solution's limits. I've never seen any performance degradation in large environments. I would say it's very scalable.

Each client has its own instance. We do not share instances with multiple customers. There's usually between 20 and 30, depending on the customer.

How are customer service and support?

I never use technical support, to be honest.

How was the initial setup?

The initial setup for the solution itself is quite straightforward. You just set it up and that's it. However, when it comes to, for instance, deploying the agents to the servers, or at least the target machines, it's still a manual task. They still do not have centralized management of the Datadog agents, which delays the deployment of the solution. It's very manual still.

How long it takes to deploy is difficult to pin down. It will vary based on the environment size. If it's ten servers, it will take half an hour to an hour. If it's 5,000, other considerations besides the number of nodes will need to be taken into account. If it's a large environment, it will take much longer. We would need to develop a solution or an effective process to deploy the agents and configure them in a standardized manner. This is something that the tool itself or the tool provider does not offer out of the box. You need to build it. That's a drawback.

How many people you need for the deployment and maintenance processes depends on the environment's size and geographical area. On average, I would usually require one resource for implementation for every 500 nodes. Then for overall support, I usually put one resource per 1,500 nodes.
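
To make that rule of thumb concrete, here is a minimal Python sketch that turns a node (monitored machine) count into a headcount estimate. The ratios of one person per 500 nodes for implementation and one per 1,500 for support come from the paragraph above. The function name and the choice to round up to whole people are assumptions made for illustration.

```python
import math

# Ratios quoted by the reviewer.
NODES_PER_IMPLEMENTER = 500
NODES_PER_SUPPORT_PERSON = 1500


def staffing_estimate(nodes: int) -> dict:
    """Estimate headcount for a deployment of the given size.

    Rounds up because a partial person still means one more hire
    (an assumption; the review does not say how to round).
    """
    return {
        "implementation": math.ceil(nodes / NODES_PER_IMPLEMENTER),
        "support": math.ceil(nodes / NODES_PER_SUPPORT_PERSON),
    }


# The 5,000-server environment mentioned earlier in the review.
print(staffing_estimate(5000))
```

For the 5,000-server example, this works out to ten people for implementation and four for ongoing support, before accounting for the geographical spread the reviewer also mentions.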

What was our ROI?

Before, the ROI was much higher as you would not have to compete with any kind of tool since they were very good in the space. However, with time, other companies have picked up the slack. Now, you have other tools which provide a higher ROI. I cannot give a specific ROI percentage since I don't deploy it for our own use; we deploy it on behalf of customers. Depending on the deal and the size, the ROI will vary. If people are looking for a global monitoring solution that includes network monitoring in the same tool, they are always hindered, as Datadog does not provide an adequate solution for it. That decreases the ROI since you still need to get another tool to do the network monitoring.

What's my experience with pricing, setup cost, and licensing?

The licensing is a bit complicated. When you pay for it on a per-node basis, that's perfectly fine. However, when you put log analytics on top of it, it's based on traffic. This is actually an issue. It gets complicated.

What other advice do I have?

I'm providing Datadog. I'm a retailer.

I would recommend the solution.

If a company's environment is in a public cloud such as GCP, Azure, or AWS, Datadog is a very good candidate to provide an overview of the monitoring. If you are considering a hybrid solution, it also works well for systems, servers, and applications, and has a lot of APM capabilities; the only drawback is network monitoring. If you want a tool that monitors the entire environment from a single point, that's possible with Datadog; however, it does not have an effective tool for network monitoring.

I'd rate the solution seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

Disclosure: My company has a business relationship with this vendor other than being a customer.

reviewer9637683

Software Engineer at Liberis Limited

Oct 2, 2024

Great for logging and tracing but needs better customization

Pros and Cons

"Real user monitoring has made triaging any possible bugs our users might face a lot easier."

"They need to offer better/more customization on what logs we get and making tracing possible on Edge runtime logs is a real requirement."

What is our primary use case?

We're using the product for logging and monitoring of various services in production environments.

It excels at providing real-time observability across a wide range of metrics, logs, and traces, making it ideal for DevOps teams and enterprises managing complex environments.

The platform integrates seamlessly with our cloud services, but browser-side logging lags a little.

Dashboards are very useful for quick insights, but can be time-consuming to create, and the learning curve is steep. Documentation is vast, but not as detailed as I'd like.

How has it helped my organization?

The solution has made logging and tracing a lot easier, and the RUM sessions are something we did not have previously. Datadog’s real-time alerting and anomaly detection help reduce downtime by allowing us to identify and address performance issues quickly.

The platform’s intelligent alert system minimises noise, ensuring your team focuses on critical incidents. This results in faster Mean Time to Resolution (MTTR), improving service availability.

It consolidates monitoring for infrastructure, applications, logs, and security into a single platform. This enables us to view and analyse data across the entire stack in one place, reducing the time spent jumping between tools.

What is most valuable?

Real user monitoring has made triaging any possible bugs our users might face a lot easier. RUM tracks actual user interactions, including page load times, clicks, and navigation flows. This gives our organization a clear picture of how our users are experiencing our application in real-world conditions, including slow-loading pages, errors, and other performance issues that affect user satisfaction. We can then easily prioritize these and make sure we offer our users the best possible experience.

What needs improvement?

I'm not sure if this is on Datadog; however, the Vercel integration is very limited.

They need to offer better/more customization on what logs we get, and making tracing possible on Edge runtime logs is a real requirement. It is extremely difficult, if not completely impossible, to get working traces and logs displayed in Datadog with our stack of Vercel, Next.js, and Datadog. This is a very common stack in front-end development, and the difficulty of implementing it is unacceptable. Please do something about it soon. Front-end logs matter.

For how long have I used the solution?

I've used the solution for a little over a year.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Hoon Kang

Full Stack Engineer at K HEALTH, INC

Sep 21, 2024

Good alerting and issue detection for many valuable features

Pros and Cons

"Thanks to frequent concurrent deployments, the Datadog alert monitors allow us to quickly detect issues if anything occurs."

"The monitors can be improved."

What is our primary use case?

Our company has a microservice architecture, with different teams in charge of different services. It is also a startup, which means that we have to build and move very fast. Before we were properly using Datadog, we often had things break without much information on where in our system the break happened. This was quite a big time sink, as teams were unfamiliar with other teams' code and needed their help to debug. This slowed our building down a lot. Implementing Datadog traces fixed this.

What is most valuable?

Datadog has many features, but the most valuable have become our primary uses.

Also, thanks to frequent concurrent deployments, the Datadog alert monitors allow us to quickly detect issues if anything occurs.

What needs improvement?

The monitors can be improved. The chart in the monitors only goes back a couple of hours, which is clunky. Also, the monitors could provide more info, like traces. We have many alerts connected to different notification systems, such as Slack and Opsgenie.

When the on-call engineer receives notifications fired by the alerts, we are taken to the monitors. Yet often, we have to open up many different tabs to see logs, traces, and info that is not accessible on the monitors. I think it would make all of the on-call engineers' lives easier if the monitor had more data.

For how long have I used the solution?

We've used the solution for three years.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Akshay Manchalwar

Technical Support Engineer at Cybage Software

Apr 19, 2024

Helps to set up alerts and thresholds to monitor real-time metrics

Pros and Cons

"Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable."

"For three to four months, we have been experiencing delays in real-time data. For example, if we're monitoring incoming traffic, the real-time status should be displayed up to a certain point. However, due to delays or issues with Datadog, the data might only be current as of an earlier time. We are experiencing consistent delays in data updates from Datadog, with the most recent data often being delayed by about an hour. This issue has been ongoing for the past four months."

What is our primary use case?

Datadog is mainly used to set up alerts and thresholds to monitor real-time metrics and checks.

What is most valuable?

Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable.

What needs improvement?

For three to four months, we have been experiencing delays in real-time data. For example, if we're monitoring incoming traffic, the real-time status should be displayed up to a certain point. However, due to delays or issues with Datadog, the data might only be current as of an earlier time. We are experiencing consistent delays in data updates from Datadog, with the most recent data often being delayed by about an hour. This issue has been ongoing for the past four months.

For how long have I used the solution?

I have been using the product for a year.

What do I think about the scalability of the solution?

My company has 50 users for Datadog.

How was the initial setup?

The tool's deployment is difficult and time-consuming.

What's my experience with pricing, setup cost, and licensing?

The tool is open-source.

What other advice do I have?

If you're thinking about using Datadog for the first time, I suggest getting some basic training in data operations. It'll help you navigate Datadog more easily. Learning it for the first time is not overly difficult, but it's also not very easy.

I would rate the tool a seven out of ten. While it's a useful tool, we've experienced some issues that haven't been resolved yet. Additionally, setting up dashboards and utilizing all the features requires some training.

Which deployment model are you using for this solution?

On-premises

Disclosure: My company does not have a business relationship with this vendor other than being a customer.

reviewer2044965

Senior Site Reliability Engineer at a comms service provider with 501-1,000 employees

Dec 8, 2022

Great centralized dashboards and telemetry capabilities with a helpful visualization of performance metrics

Pros and Cons

"Datadog has proven to be easy to set up and legible for both development and operational teams."

"If there were a more cost-effective manner of deploying the tool, we'd be more likely to adopt it more widely."

What is our primary use case?

We primarily use the solution for centralized dashboarding and telemetry viewing for teams across the organization.

We're focused on ensuring that both development teams and leadership can reasonably gain insights into the status of various systems.

At the end of the day, managing various dashboards and metrics aggregators like Prometheus, Kubernetes server, AWS CloudWatch, and Grafana has led to some confusion, and we've had issues with teams not knowing where their data exists and where they can view their system metrics.

Datadog has proven to be easy to set up and legible for both development and operational teams.

How has it helped my organization?

The solution has been useful in generally ensuring that teams are able to better visualize and think about their application's impact on data centers/cloud performance. Having centralized tooling for observability means that each team can be on the same page when discussing monitoring.

There have been some issues where teams have been unable to find metrics within the tool, and some behaviors of the tagging and grouping functionality are not as easy to understand as one might expect. That said, overall, the experience has been positive.

What is most valuable?

The dashboards have proven most helpful in ensuring that teams can track the performance of their apps. On a more practical scale, the alerts have proved invaluable for triaging and bringing services back online.

Being able to tie the alerts generated through Datadog monitors has allowed us to quickly and effectively respond to infrastructure and software issues that would have otherwise hamstrung the organization and prevented us from accomplishing our day-to-day tasks. This is naturally invaluable.

What needs improvement?

I'm sure that this is said all the time; however, the pricing model has led us to restrict our usage of the service. If there were a more cost-effective manner of deploying the tool, we'd be more likely to adopt it more widely.

Aside from the cost, the nature of the tagging and grouping features within the monitoring dashboards has often caused headaches when creating new dashboards for aggregate services and infrastructure stacks. It would be nice to ensure that this feature is supported long-term and made more accessible.