Operations Manager at a tech services company with 201-500 employees
Good dashboards, easy troubleshooting, and integrations
Pros and Cons
- "The dashboards are super convenient to us for a more zoomed out view of what is going on with each integration that we utilize."
- "There could be more easily identifiable documentation on how to find different things on the platform."
What is our primary use case?
We utilize Datadog mainly to monitor our API integrations and all of the inventory that comes in from our API partners. Each event has its own ID, so we can trace all activity related to each event and troubleshoot where needed.
How has it helped my organization?
Datadog gives non-dev teams insight into everything that is happening with a particular event and flags any errors so that we can troubleshoot more efficiently.
What is most valuable?
The dashboards are super convenient to us for a more zoomed out view of what is going on with each integration that we utilize.
What needs improvement?
There could be more easily identifiable documentation on how to find different things on the platform. It can be overwhelming at first glance, and it's hard to find appropriate documentation on the site to lead you to where you need to be.
Buyer's Guide
Datadog
January 2026
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: January 2026.
881,082 professionals have used our research since 2012.
For how long have I used the solution?
I've used the solution for about 1.5 years.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
System Engineer at a financial services firm with 10,001+ employees
A stable and scalable infrastructure monitoring solution
Pros and Cons
- "Datadog has flexibility."
- "The product needs to have more enterprise approach to configuration."
What is most valuable?
Datadog has flexibility.
What needs improvement?
The product needs to have more enterprise approach to configuration.
For how long have I used the solution?
We use the tool to monitor our whole infrastructure. CPU, memory, and disk space are the types of things we use it for.
What do I think about the stability of the solution?
It is a stable solution.
What do I think about the scalability of the solution?
It is a scalable solution.
How are customer service and support?
The technical support team is good and responsive.
How would you rate customer service and support?
Positive
How was the initial setup?
The initial setup is not very easy, and the deployment took eight months. It took quite a few teams to get it all accomplished. I rate it a six out of ten.
What other advice do I have?
I rate the solution eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
AWS Cloud Architect Consultant at a manufacturing company with 10,001+ employees
A very solid option with flexible features for analyzing data and enhancing observability
Pros and Cons
- "The solution allows flexibility and heightened observability for presenting data, creating indicators, and setting service-level objectives."
- "The solution should provide alerts for cloud outages."
What is our primary use case?
Our company deploys the solution for our customers as an observability tool to define SLOs and SLIs along with logs and metrics.
The solution includes incident, post-mortem, and root cause analysis that provides a level of truth for incidents and issues with applications.
We have SREs and teams in operations, management, and applications who all have access to the solution and ensure proper integrations.
How has it helped my organization?
Our company is adopting SRE practices and the solution helps us to align the practices with our site reliability. We get more insights about issues at the outset which helps us to make better decisions such as continuing with agility or stopping to fix issues.
We are at the beginning stages of using the solution but are defining it as our company standard for use by all teams.
What is most valuable?
The solution allows flexibility and heightened observability for presenting data, creating indicators, and setting service-level objectives. There are interesting options for monitors and features that offer flexible ways to analyze data.
What needs improvement?
The solution should provide alerts for cloud outages that would allow us to report potential service impacts directly to applications or on the dashboard. Alerts are important because there is a need to determine the impact on your SLAs, SLOs, and SLIs to decide whether to move toward disaster recovery or another environment.
I would like the ability to share dashboard screenshots via email rather than having to direct others to the dashboard because it sometimes requires permissions.
For how long have I used the solution?
I have been using the solution for one year.
What do I think about the stability of the solution?
Our SRE teams report that the solution is very solid, stable, and reliable.
What do I think about the scalability of the solution?
We are using the SaaS model so do not manage scalability because the product takes care of our scaling needs with no issues.
How are customer service and support?
I don't have direct access to support, but our SRE teams work with them and are satisfied.
Which solution did I use previously and why did I switch?
Our company used other products in the past but wanted to move to the cloud. We found that the solution was a very good fit for us.
How was the initial setup?
The initial setup is not that easy because there are many choices for configuration, workloads, servers, and containers.
We utilized technical support to help us understand integration and prepare patterns for other applications.
What about the implementation team?
We created small configurations and then utilized technical support to configure an application selected from our portfolio.
We utilize a team approach for implementations that sometimes includes SREs.
What's my experience with pricing, setup cost, and licensing?
The solution is fairly priced but history and log storage can get costly depending on your needs.
I rate the cost a four out of ten.
What other advice do I have?
The solution is appropriate for companies that are moving to the cloud and want a very solid tool for observability, logging, and everything related to SRE practices.
I rate the solution a nine out of ten.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Other
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
Principal Consultant at a tech vendor with 10,001+ employees
Easy to set up and good UI but needs better customization capabilities
Pros and Cons
- "The many dozens of integrations that the solution brings out of the box are excellent."
- "Deploying the agents is still very manual."
What is our primary use case?
The solution is basically used for servers and applications.
What is most valuable?
The UI is the most valuable aspect of the solution. I really like its look and feel. It's not very distinctive now, since other players have caught up. However, they were the first in the market to present such an effective UI.
The many dozens of integrations that the solution brings out of the box are excellent.
It's easy to set up.
What needs improvement?
Deploying the agents is still very manual.
Network monitoring could be better or rolled into this solution so that you do not have to buy a different product.
Customization of the tool itself should be taken into account. At the moment, although what they provide out of the box is good, they don't offer many customization possibilities. I know it's difficult. However, it's something that they would need to look at. When customers come to us with customized requirements, we cannot meet them.
For how long have I used the solution?
I've been dealing with the solution for five years.
What do I think about the stability of the solution?
It's quite stable. I have never had an issue in regard to reliability, so it's very stable.
What do I think about the scalability of the solution?
It's very scalable. I have not reached the limits at any time, never in the solution. I've never seen any performance degradation in large environments. I would say it's very scalable.
Each client has its own instance. We do not share instances with multiple customers. There's usually between 20 and 30, depending on the customer.
How are customer service and support?
I never use technical support, to be honest.
How was the initial setup?
The initial setup for the solution itself is quite straightforward. You just set it up and that's it. However, deploying the agents to the servers, or at least the target machines, is still a manual task. They still do not have centralized management of the Datadog agents, which delays the deployment of the solution.
How long it takes to deploy is difficult to pin down. It will vary based on the environment size. If it's ten servers, it will take half an hour or one hour. If it's 5,000, other considerations besides the number of nodes will need to be taken into account. If it's a large environment, it will take much longer. We would need to develop a solution, or an effective process, to deploy the agents and configure them in a standardized manner. This is something that the tool itself or the tool provider does not offer out of the box. You need to build it. That's a drawback.
How many people you need for the deployment and maintenance processes depends on the environment's size and geographical area. On average, I would usually require one resource for implementation for every 500 nodes. Then, for overall support, I usually put one resource per 1,500 nodes.
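The staffing ratios quoted above can be turned into a quick estimate. The following is only a sketch of that rule of thumb: the function name and the choice to round up to whole people are assumptions, not something the reviewer specified.

```python
import math


def staffing_estimate(nodes: int, per_impl: int = 500, per_support: int = 1500) -> dict:
    """Estimate headcount from the reviewer's ratios.

    One implementation resource per 500 nodes and one support resource
    per 1,500 nodes. Rounding up is an assumption made for this sketch.
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return {
        "implementation": math.ceil(nodes / per_impl),
        "support": math.ceil(nodes / per_support),
    }


# The 5,000-server environment mentioned in the review.
estimate = staffing_estimate(5000)
print(estimate)
```

Environment complexity and geography, which the reviewer also mentions, are not modeled here.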
What was our ROI?
Before, the ROI was much higher, as Datadog did not have to compete with other tools and was very good in the space. However, with time, other companies have picked up the slack. Now, you have other tools which provide a higher ROI. I cannot give a specific ROI percentage since I don't deploy it for our own use. We deploy it on behalf of customers. The ROI will vary depending on the deal and its size. People who are looking for a global monitoring solution that includes network monitoring in the same tool are always hindered, as Datadog does not provide an adequate solution for it. That decreases the ROI, since you still need to get another tool to do the network monitoring.
What's my experience with pricing, setup cost, and licensing?
The licensing is a bit complicated. When you pay for it on a per-node basis, that's perfectly fine. However, when you put log analytics on top of it, it's based on traffic. This is actually an issue. It gets complicated.
What other advice do I have?
I'm providing Datadog. I'm a retailer.
I would recommend the solution.
If a company has its environment in the public cloud, such as GCP, Azure, or AWS, Datadog is a very good candidate to provide an overview of the monitoring. If you are considering a hybrid environment with systems, servers, and applications, it also provides a good solution and has a lot of APM capabilities. The only drawback will be network monitoring. If you want a tool that monitors the entire environment from a single point of contact, that is possible with Datadog. However, it does not have an effective tool for network monitoring.
I'd rate the solution seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer.
Head of Product Design at a tech vendor with 51-200 employees
Good alerts and detailed data but needs UI improvements
Pros and Cons
- "Session recordings have been the most valuable to me as it helps me gain insights into user behaviour at scale."
- "In terms of UI, everything is very small, which makes it quite difficult to navigate at times."
What is our primary use case?
I work in product design, and although we use Datadog for monitoring, my use case is different: I mostly review and watch session recordings from users to gain insight into user feedback.
We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.
How has it helped my organization?
Using Datadog has allowed us to review detailed interactions of users at a scale that leads us to make informed data-driven UX improvements as mentioned above.
Being able to pinpoint specific users via filtering is also very useful as it means when we have direct feedback from a specific user, we can follow up by watching their session back.
The engineering team uses Datadog for alerting, which is also very useful for us as it gives us visibility into how stable our platform is through various lenses.
What is most valuable?
Session recordings have been the most valuable to me as it helps me gain insights into user behaviour at scale. By capturing real-time interactions, such as clicks, scrolls, and navigation paths, we can identify patterns and trends across a large user base. This helps us pinpoint usability issues, optimize the user experience, and improve the overall experience for our users. Analyzing these recordings enables us to make data-driven decisions that enhance both functionality and user satisfaction.
What needs improvement?
I'd like the ability to see more in-depth actions on user sessions, such as where specific problems occur. Rather than having to watch numerous session recordings to understand where this happens, I'd like to get alerts/notifications about specific areas where users are struggling, such as rage clicks.
In terms of UI, everything is very small, which makes it quite difficult to navigate at times, especially in terms of accessibility, so I'd love for there to be more attention on this.
For how long have I used the solution?
I've used the solution for over one year.
Which solution did I use previously and why did I switch?
We did not evaluate other options.
What's my experience with pricing, setup cost, and licensing?
I wasn't part of the decision-making process during licensing.
Which other solutions did I evaluate?
I wasn't part of the decision-making process during the evaluation stage.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Vice President of SaaS Infrastructure at a tech services company with 51-200 employees
Enhances efficiency with robust alerting and visualization tools
Pros and Cons
- "The real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance."
- "The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs."
What is our primary use case?
Our primary use case for Datadog is to monitor and manage our fully cloud-native infrastructure. We utilize Datadog to gain real-time visibility into our cloud environments, ensuring that all our services are running smoothly and efficiently.
The platform’s extensive integration capabilities allow us to seamlessly track performance metrics across various cloud services, containers, and microservices.
With Datadog’s robust alerting and visualization tools, we can proactively identify and resolve issues, minimizing downtime and optimizing our system’s performance. This has been crucial in maintaining the reliability and scalability of our cloud-native applications.
How has it helped my organization?
Datadog has significantly enhanced our organization’s operational efficiency and reliability. By providing real-time visibility into our cloud-native infrastructure, Datadog enables us to monitor performance metrics, detect anomalies, and resolve issues swiftly.
The platform’s robust alerting system ensures that potential problems are addressed before they impact our services, reducing downtime and improving overall system stability. Additionally, Datadog’s comprehensive dashboards and reporting tools have streamlined our troubleshooting processes and facilitated better decision-making.
What is most valuable?
The most valuable feature of Datadog for our organization has been its real-time monitoring capabilities. This feature provides us with instant visibility into our cloud-native infrastructure, allowing us to track performance metrics and detect anomalies as they occur. The ability to monitor our systems in real-time means we can quickly identify and address issues before they escalate, minimizing downtime and ensuring the reliability of our services.
Additionally, the real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance.
What needs improvement?
While Datadog has been instrumental in enhancing our operational efficiency, there are areas where it could be improved.
One area is the user interface, which could be more intuitive and user-friendly, especially for new users.
Additionally, the pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs.
For future releases, it would be beneficial to include more advanced machine learning capabilities for predictive analytics, helping us anticipate issues before they occur.
More third-party tools would also be valuable additions.
For how long have I used the solution?
I've used the solution for six years.
What do I think about the stability of the solution?
Datadog has proven to be a highly stable solution for our monitoring needs. Throughout our usage, we have experienced minimal downtime and consistent performance, even during peak traffic periods. The platform’s reliability ensures that we can continuously monitor our cloud-native infrastructure without interruptions, which is crucial for maintaining the health and performance of our services.
What do I think about the scalability of the solution?
Datadog’s scalability has been impressive and instrumental in supporting our growing cloud-native infrastructure. The platform effortlessly handles increased workloads and scales alongside our expanding services without compromising performance. Its ability to integrate with a wide range of cloud services and technologies ensures that as we grow, Datadog continues to provide comprehensive monitoring and insights.
How are customer service and support?
Our experience with Datadog’s customer service and support has been exceptional. The support team is highly responsive and knowledgeable, providing timely assistance whenever we’ve encountered issues or had questions.
Their proactive approach to offering solutions and guidance has been invaluable in helping us maximize the platform’s capabilities.
How would you rate customer service and support?
Positive
How was the initial setup?
The setup is straightforward.
What about the implementation team?
We handled the setup in-house.
What's my experience with pricing, setup cost, and licensing?
The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs.
What other advice do I have?
One area is the user interface, which could be more intuitive and user-friendly, especially for new users.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Senior Software Engineer at a transportation company with 51-200 employees
Good dashboard, excellent monitoring, and easy to expand
Pros and Cons
- "Datadog has helped us a ton by allowing us to set up a multitude of easily configurable alarms across our tech stack and infrastructure."
- "I found the documentation can sometimes be confusing."
What is our primary use case?
We primarily use Datadog for alerts. If we're running out of database connections or CPU credits we want to find out in Slack. Datadog provides nice features for that.
Secondarily, we use Datadog for analyzing historical trends and forecasting potential issues.
I'm trying to learn how to add in Continuous Profiler in our primary backend servers and set up Synthetic Tests for monitoring our front end.
Everything is mostly on AWS, and the Datadog integrations help a ton.
How has it helped my organization?
Datadog has helped us a ton by allowing us to set up a multitude of easily configurable alarms across our tech stack and infrastructure. It doesn't matter if it's in AWS Lambda or a Docker container in AWS EC2, Datadog's intuitive interface makes alarms incredibly easy to configure, reducing our resolution time for incidents.
A lot of the value comes from how frictionless the integrations are. Adding in a Datadog agent or flipping a switch on the Datadog UI to start streaming Lambda data makes the product so incredibly appealing for my company.
What is most valuable?
The monitoring feature has been the most valuable.
I really like the dashboard. Monitoring has a straightforward tie-in to business value at my company (e.g., declaring incidents). Things like having a dashboard and APM make my job easier. That said, DevX is a little bit of a harder sell to executives in my company.
The dashboard feature makes it so easy to inspect multiple metrics at once across services. It's truly been a lifesaver when I'm personally trying to understand why performance degradation is happening.
What needs improvement?
I found the documentation can sometimes be confusing. I tried configuring APM for some of our Python containers, and I had to cross-reference multiple blog posts and the official documentation to figure out which Datadog agent to use, whether I needed a ddtrace trace, what environment variables I should set, etc.
Furthermore, to generate my own traces, I wasn't aware that ddtrace adds its own "monkey patching," which led to headaches with respect to configuring the service for RabbitMQ.
A more unified and up-to-date documentation suite would be greatly appreciated.
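For readers unfamiliar with the term, "monkey patching" means replacing a library's functions at runtime with wrappers, which is how automatic instrumentation can add tracing without code changes. The sketch below illustrates the general technique only. It does not use ddtrace, and `FakeChannel`, `instrument`, and `recorded_spans` are hypothetical names invented for this example.

```python
import functools
import time


class FakeChannel:
    """Stand-in for a messaging client class (hypothetical, not a real library)."""

    def publish(self, body: str) -> str:
        return f"sent:{body}"


# Collected (operation name, duration in seconds) pairs.
recorded_spans = []


def instrument(cls, method_name: str) -> None:
    """Replace cls.method_name with a wrapper that records call duration."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            recorded_spans.append((f"{cls.__name__}.{method_name}", elapsed))

    # The patch itself: every caller of publish() now goes through wrapper().
    setattr(cls, method_name, wrapper)


instrument(FakeChannel, "publish")
result = FakeChannel().publish("hello")
```

Because the patch changes the class itself, any code that subclasses or wraps the same method is affected too, which is the kind of interaction that can cause surprises when configuring a messaging client.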
For how long have I used the solution?
I've used the solution for about two years.
What do I think about the stability of the solution?
I don't recall seeing an incident from Datadog in the past couple of years and that's been wonderful.
What do I think about the scalability of the solution?
The solution is incredibly scalable! To be fair, our data throughput to Datadog isn't super huge. However, we have never seen issues as it scaled to handle more of our data.
Which solution did I use previously and why did I switch?
We used to use Amazon CloudWatch for a lot of our monitoring needs. That said, the interface felt clunky, confusing, and limited.
What was our ROI?
We don't have hard numbers on ROI. That said, overall, it has been a wonderful addition to our tooling suite.
Which other solutions did I evaluate?
We also looked at Honeycomb and are currently using both in production.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Senior Site Reliability Engineer at a tech vendor with 10,001+ employees
Good alerts and monitoring with a relatively simple setup
Pros and Cons
- "The management of SLOs and their related burn-rate monitors has allowed us to onboard teams to on-call fast."
- "Managing dashboards as IaC is a bit hard to work out at times."
What is our primary use case?
Datadog provides us with a solution for ingesting all of our application metrics, resource metrics, APM/tracing data, etc.
We use it for dashboards, monitoring/alerting, SLO targets, incident response, etc.
We have a lot of applications across multiple languages/frameworks etc., and have deployed in Kubernetes across multiple regions in AWS, along with underlying managed resources such as SQS, Aurora, etc.
Datadog makes understanding the state of these seamless. We are a company with millions of daily active users, and this level of detail is excellent.
How has it helped my organization?
Datadog has allowed us to rapidly spin up alerting and monitoring that helps our incident responders get alerted quickly when our SLOs are in danger and helps to quickly resolve issues.
It is the single most important tool we have from an SRE perspective.
It also provides us with an easy way to get information at a glance for all of our services through APM and create unified dashboards that track our underlying resources, such as databases, queues, etc., alongside application data.
It has been invaluable to our organization.
What is most valuable?
The management of SLOs and their related burn-rate monitors has allowed us to onboard teams to on-call fast.
Management of resources using infrastructure-as-code has been a recent game-changer for us. Combining the two has allowed us to provide product teams with a total solution for getting their applications attached to user-focused alerting and monitoring within a matter of days rather than months - and has clearly impacted our ability to discover and respond to significant production incidents.
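As background on burn-rate monitoring, the arithmetic can be sketched in a few lines. This uses the common SRE definition (observed error rate divided by the error budget, where the budget is one minus the SLO target). The function names, the 30-day window, and the sample numbers are assumptions for illustration, not the reviewer's configuration.

```python
def burn_rate(error_rate: float, slo_target: float) -> float:
    """Return how fast the error budget is being consumed.

    A value of 1.0 means the budget lasts exactly the SLO window;
    higher values exhaust it proportionally sooner.
    """
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_rate / budget


def hours_to_exhaustion(rate: float, window_days: int = 30) -> float:
    """Hours until the full budget is spent at a constant burn rate."""
    if rate <= 0:
        return float("inf")
    return window_days * 24 / rate


# Hypothetical example: 99.9% SLO with 1.44% of requests failing.
rate = burn_rate(error_rate=0.0144, slo_target=0.999)
remaining = hours_to_exhaustion(rate)
```

With these sample numbers the burn rate is roughly 14.4, so a 30-day budget would be gone in about 50 hours, which is the sort of condition a burn-rate alert is meant to catch early.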
What needs improvement?
Managing dashboards as IaC is a bit hard to work out at times. I use custom tools to convert JSON dashboards to Terraform resources. Ideally, I'd like for some sort of building tool for this to be built into the app. For example, a templating system that can easily be exported to IaC would be transformative for us.
There are also some aspects of the API that can be a bit verbose - especially in the area of new features like SLOs - and take some time to understand. That said, overall, they're well-documented enough to be a minor concern for us.
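One possible shape for such a conversion tool is sketched below. It is not the reviewer's tooling: instead of translating widgets into native Terraform blocks, it wraps an exported dashboard in the Datadog provider's `datadog_dashboard_json` resource using Terraform's JSON syntax. The `READ_ONLY_KEYS` set, the function name, and the sample dashboard are assumptions for illustration.

```python
import json

# Fields assumed to be server-managed in an exported dashboard.
# This list is an assumption; adjust it to match your own exports.
READ_ONLY_KEYS = {"id", "url", "author_handle", "author_name", "created_at", "modified_at"}


def dashboard_to_tf_json(dashboard: dict, resource_name: str) -> str:
    """Wrap an exported dashboard dict as a Terraform JSON document."""
    body = {k: v for k, v in dashboard.items() if k not in READ_ONLY_KEYS}
    tf = {
        "resource": {
            "datadog_dashboard_json": {
                # The resource takes the whole dashboard as one JSON string.
                resource_name: {"dashboard": json.dumps(body, sort_keys=True)}
            }
        }
    }
    return json.dumps(tf, indent=2)


# Hypothetical export with a single server-managed field.
exported = {"id": "abc-123", "title": "Queue health", "layout_type": "ordered", "widgets": []}
tf_document = dashboard_to_tf_json(exported, "queue_health")
```

The resulting string could be saved to a file ending in `.tf.json`. The trade-off of this approach is that the dashboard stays an opaque JSON blob, so Terraform plans show string diffs rather than per-widget changes.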
For how long have I used the solution?
I've been using the solution for over five years.
What do I think about the stability of the solution?
I have never seen a major outage that prevented us from using Datadog, although I can't speak for other teams/time zones.
What do I think about the scalability of the solution?
This product is massively scalable. I haven't seen any issues as we continue to onboard new technologies and teams.
How are customer service and support?
Datadog provides us with a number of direct lines to support, although I haven't personally required their assistance.
Which solution did I use previously and why did I switch?
We previously used LightStep for APM and switched to Datadog to unify all of our application data.
How was the initial setup?
Most elements are quite simple to set up. However, some types of data collection require organization-wide engineering buy-in.
What about the implementation team?
We handled the initial setup in-house.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.