Reviewer 76 - PeerSpot reviewer
Vice President of SaaS Infrastructure at a tech services company with 51-200 employees
User
Top 20
Sep 30, 2024
Enhances efficiency with robust alerting and visualization tools
Pros and Cons
  • "The real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance."
  • "The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs."

What is our primary use case?

Our primary use case for Datadog is to monitor and manage our fully cloud-native infrastructure. We use Datadog to gain real-time visibility into our cloud environments, ensuring that all our services are running smoothly and efficiently.

The platform’s extensive integration capabilities allow us to seamlessly track performance metrics across various cloud services, containers, and microservices. 

With Datadog’s robust alerting and visualization tools, we can proactively identify and resolve issues, minimizing downtime and optimizing our system’s performance. This has been crucial in maintaining the reliability and scalability of our cloud-native applications.

How has it helped my organization?

Datadog has significantly enhanced our organization’s operational efficiency and reliability. By providing real-time visibility into our cloud-native infrastructure, Datadog enables us to monitor performance metrics, detect anomalies, and resolve issues swiftly. 

The platform’s robust alerting system ensures that potential problems are addressed before they impact our services, reducing downtime and improving overall system stability. Additionally, Datadog’s comprehensive dashboards and reporting tools have streamlined our troubleshooting processes and facilitated better decision-making.

What is most valuable?

The most valuable feature of Datadog for our organization has been its real-time monitoring capabilities. This feature provides us with instant visibility into our cloud-native infrastructure, allowing us to track performance metrics and detect anomalies as they occur. The ability to monitor our systems in real-time means we can quickly identify and address issues before they escalate, minimizing downtime and ensuring the reliability of our services. 

Additionally, the real-time data helps us make informed decisions and optimize our operations, ultimately enhancing our overall efficiency and performance.

What needs improvement?

While Datadog has been instrumental in enhancing our operational efficiency, there are areas where it could be improved. 

One area is the user interface, which could be more intuitive and user-friendly, especially for new users. 

Additionally, the pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs. 

For future releases, it would be beneficial to include more advanced machine learning capabilities for predictive analytics, helping us anticipate issues before they occur. 

More third-party tools would also be valuable additions.

Buyer's Guide
Datadog
March 2026
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: March 2026.
884,873 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for six years.

What do I think about the stability of the solution?

Datadog has proven to be a highly stable solution for our monitoring needs. Throughout our usage, we have experienced minimal downtime and consistent performance, even during peak traffic periods. The platform’s reliability ensures that we can continuously monitor our cloud-native infrastructure without interruptions, which is crucial for maintaining the health and performance of our services.

What do I think about the scalability of the solution?

Datadog’s scalability has been impressive and instrumental in supporting our growing cloud-native infrastructure. The platform effortlessly handles increased workloads and scales alongside our expanding services without compromising performance. Its ability to integrate with a wide range of cloud services and technologies ensures that as we grow, Datadog continues to provide comprehensive monitoring and insights.

How are customer service and support?

Our experience with Datadog’s customer service and support has been exceptional. The support team is highly responsive and knowledgeable, providing timely assistance whenever we’ve encountered issues or had questions. 

Their proactive approach to offering solutions and guidance has been invaluable in helping us maximize the platform’s capabilities.

How was the initial setup?

The setup is straightforward.

What about the implementation team?

We handled the setup in-house.

What's my experience with pricing, setup cost, and licensing?

The pricing model can be quite complex and might benefit from more flexible options tailored to different organizational needs.

What other advice do I have?

One area is the user interface, which could be more intuitive and user-friendly, especially for new users.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Delivery Manager, DBA Services at a manufacturing company with 10,001+ employees
Real User
Jan 26, 2023
It combines tracing and logging in one tool
Pros and Cons
  • "Datadog provides tracing and logging, whereas Dynatrace focuses on tracing, and Splunk is more of a logging tool. Datadog's advantage is that we don't need two tools."
  • "Datadog isn't as mature as some of the established players like Dynatrace or Splunk. It's a new product, so they are constantly releasing new features, and I don't have much to complain about."

What is our primary use case?

We use Datadog for monitoring to get the traces and logs of all our applications. Datadog provides dashboard and alert capabilities that let various teams identify when something is wrong. More than 200 users, mostly software engineers, work with Datadog.

What is most valuable?

Datadog provides tracing and logging, whereas Dynatrace focuses on tracing, and Splunk is more of a logging tool. Datadog's advantage is that we don't need two tools. 

What needs improvement?

Datadog isn't as mature as some of the established players like Dynatrace or Splunk. It's a new product, so they are constantly releasing new features, and I don't have much to complain about.

For how long have I used the solution?

We have used Datadog for seven months.

What do I think about the stability of the solution?

We haven't experienced any issues so far, so it's a highly stable platform.

What do I think about the scalability of the solution?

We are a unit within a much larger entity that is using Datadog. It can scale up to meet your needs. 

How are customer service and support?

We have regular calls with the Datadog team. They take feedback and bring in the product managers to quickly answer questions and fix issues. They help you deal with some of the issues you have with any new product, but Datadog is one of the fastest-growing products in the monitoring space.

How was the initial setup?

You don't need to install anything because it's a SaaS product with a web-based UI. They provide you the credentials to give you admin access. You only need to install the agents where you need monitoring. The time required to deploy the agent depends on what you're monitoring, but the solution itself works like Office 365 or any other SaaS product. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2004174 - PeerSpot reviewer
Senior Software Engineer at an insurance company with 10,001+ employees
Real User
Oct 31, 2022
Very good RUM, synthetics, and infrastructure host maps
Pros and Cons
  • "Overall, the Data UI and the usability of customer features continue to improve."
  • "It is very difficult to make the solutions fit perfectly for large organizations, especially in terms of high cardinality objects and multi-tenancy, where the data needs to be rolled up to a summarized level while maintaining its individual data granularity and identifiers."

What is our primary use case?

I have been using Datadog products and capabilities increasingly over the last 4 years, from POC to widespread adoption. 

The capabilities we use are unique for each use case and can be combined in various ways to provide the full observability coverage needed to maintain stable operations and shift from being reactive to being proactive.

Our organization uses Datadog for site/service reliability across the range of backend and frontend services, along with custom monitoring and dashboards that are dynamic and can be reused by multiple teams.

How has it helped my organization?

The capabilities we use are unique for each use case. They can be combined in various ways to provide the full observability coverage needed to maintain stable operations in order to become more proactive. 

Our organization uses Datadog for site/service reliability across backend and frontend services, along with custom monitoring and dashboards that are dynamic and can be reused by multiple teams.

We continue to increase the size of our footprint as we get more and more positive experiences.

What is most valuable?

The APM, RUM, synthetics, and infrastructure host maps have been some of the most popular and commonly used features. 

Overall, the Data UI and the usability of customer features continue to improve. 

The RUM session data and replays are much more convenient and applicable than other tools I have worked with in the past. By combining multiple capabilities or features, we get full visibility across the technology stack and can identify specific bottlenecks or areas where risks and vulnerabilities are likely to exist.

Watchdog insights take the work out of the hardest part, helping us identify the issues before our customers.

What needs improvement?

It is very difficult to make the solutions fit perfectly for large organizations, especially in terms of high cardinality objects and multi-tenancy, where the data needs to be rolled up to a summarized level while maintaining its individual data granularity and identifiers. Tagging is imperative. However, the solutions could be improved for these needs in the future.
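
To illustrate the roll-up problem described above, here is a minimal sketch in plain Python. It does not use any Datadog API; the metric name, the `tenant` and `region` tags, the values, and the `roll_up` helper are all hypothetical. It sums values to a summarized level while keeping a per-identifier breakdown alongside the total.

```python
from collections import defaultdict

# Hypothetical tagged data points: (metric name, tags, value).
points = [
    ("requests.count", {"tenant": "a", "region": "us"}, 120),
    ("requests.count", {"tenant": "b", "region": "us"}, 80),
    ("requests.count", {"tenant": "a", "region": "eu"}, 50),
]


def roll_up(points, group_by):
    """Sum values per value of the `group_by` tag.

    The remaining tags are kept as a breakdown so that individual
    identifiers (for example, the tenant) are not lost in the summary.
    """
    summary = defaultdict(lambda: {"total": 0, "detail": defaultdict(int)})
    for _name, tags, value in points:
        key = tags[group_by]
        # Remaining tags, sorted so the breakdown key is stable.
        rest = tuple(sorted((k, v) for k, v in tags.items() if k != group_by))
        summary[key]["total"] += value
        summary[key]["detail"][rest] += value
    return summary


by_region = roll_up(points, "region")
```

With many tenants the `detail` mapping grows with the number of distinct tag combinations, which is the high-cardinality cost the reviewer refers to.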

For how long have I used the solution?

I've used the solution for over four years now.

What do I think about the stability of the solution?

The stability is excellent.

What do I think about the scalability of the solution?

You can work with engineering to make it work for your needs. They are excellent at supporting their customers.

How are customer service and support?

Technical support is excellent.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I previously used New Relic, App Dynamics, Heap, Clicktale, and more. Datadog has incorporated many of the features we were looking for into a one-stop shop.

How was the initial setup?

The initial setup is simple and straightforward.

What about the implementation team?

We had an in-house team working directly with Datadog engineering support and technical enablement.

Which other solutions did I evaluate?

We looked into New Relic, App Dynamics, Heap, Clicktale, and more. Datadog has many of the features we were looking for in one place.

What other advice do I have?

We use all versions of the solution.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2004024 - PeerSpot reviewer
SRE at a financial services firm with 10,001+ employees
Real User
Oct 31, 2022
Excellent synthetic monitoring, APM, and alert features
Pros and Cons
  • "The monitoring functionality, in general, and tagging infrastructure are great."
  • "While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation."

What is our primary use case?

We deploy various services for our main platform on AWS across multiple regions. We have a development environment, a staging environment, a QA environment, and a production environment. We deploy our many services across hundreds of instances. 

We have many server farms, all responsible for various services on our market intelligence platform. The deployment of each server farm or even individual instances varies depending on what is being stood up. We have instances built in three different ways, with two different pipelines and some even on user data scripts.

How has it helped my organization?

My team has a 24/7 on-call schedule where we need to be ready to handle and mitigate incidents with the platform at any moment. 

We have countless monitors set up on Datadog that alert directly to our queue using an email that generates a ticket. 

The actionable steps for each type of monitor and its associated incident are easily included in the alerts whenever something is triggered. We generate links to the Datadog monitors and can instantly drill down into what went wrong and for how long.

What is most valuable?

The features I have found most helpful are synthetic monitoring, APM, and alert features. The monitoring functionality, in general, and tagging infrastructure are great.

Synthetics have become bread and butter for us as we have migrated many tests over to Datadog. We have simplified and consolidated our synthetic tests while also making them more robust with the help of Datadog's tagging.

A large portion of our monitoring is based on synthetics results, and alerts integrate seamlessly with our incident queue system. We use dashboards heavily.

The metrics capabilities are extremely helpful, and we use virtually all of the widgets.

What needs improvement?

My main place of improvement for Datadog would be the documentation. While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation. 

The number of current code snippets available in the docs is not enough, and some need to be updated even today. 

One function I would add would be a button to generate a report of the performance of a synthetic test and the performance of each of the steps in the test over time.

For how long have I used the solution?

How long we've used the solution varies by platform. We have one platform completely in the cloud and one still on-premises. We've had the solution for many years on AWS.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Architect at SEI Investments
Real User
Oct 31, 2022
Great support with a helpful APM and profiler
Pros and Cons
  • "The most valuable aspects of the product include the APM and profiler."
  • "I find the training great. That said, it is set for the LCD (lowest common denominator). Of course, this is very helpful to sell the product, yet, to really utilize the product, you need to get more detailed."

What is our primary use case?

We primarily use Datadog for:

  • Native memory
  • Logging
  • APM
  • Context switching
  • RUM
  • Synthetic
  • Databases
  • Java
  • JVM settings
  • File I/O
  • Socket I/O
  • Linux
  • Kubernetes
  • Kafka
  • Pods
  • Sizing

We are testing Datadog as a way to reduce our operational time to fix things (mean time to repair). This is step one. We hope to use Datadog as a way to be proactive instead of reactive (mean time to failure).

So far, Datadog has shown very good options to work on all of our operational and development issues. We are also trying to use Datadog to shift left, and fix things before they break (MTTF increase).

How has it helped my organization?

We are currently in a POC and do not own Datadog at the moment. 

So far, there have been a few issues due to security. There are two main security issues. 

The first is moving data off-prem. This has been resolved to a point (filtering logs, etc). However, there is still an issue with moving a JFR as a JFR potentially contains data that is not allowed off-prem.

The second security issue is more internal: the main installation requires root access or using an ACL. Our company does not use ACLs on our Linux platform. This is problematic since the install sets a no-login on the Datadog user.

What is most valuable?

The most valuable aspects of the product include the APM and profiler.

These two have given us insights into things that are very difficult to track down given the standard OS (Linux) tools. 

With native memory, it is very difficult to see exactly where the usage comes from. I attended a course (continuous profiling), and it showed me capabilities that are potentially very important.

If you add these details to a standard dashboard, or a sub-dashboard for techy people, or even just a notebook, it would be easy to identify issues before they occur.

Combining these details with the basic tools (infra, logging, APM, and good rules), Datadog can easily show the details that a true engineer would need. It isn't just for monitoring; I see the value in it for engineers.

What needs improvement?

I have done every training offered (and in a short period of time: two days for 20 courses).

I find the training great. That said, it is set for the LCD (lowest common denominator). Of course, this is very helpful to sell the product, yet, to really utilize the product, you need to get more detailed.

If I did the training as it is written and I cut/paste a bunch of stuff and see the cut/paste work, I didn't really learn anything. In later sessions, I quit using the editor, switched to vi, stopped cutting and pasting, and learned much more.

For how long have I used the solution?

I've used the solution for one month.

What do I think about the stability of the solution?

I'd give stability a thumbs up.

What do I think about the scalability of the solution?

We are not sure yet in terms of scalability. The off-prem solution seems to scale well (although we had issues with the training slowing down).

How are customer service and support?

Technical support is great.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I previously used Dynatrace and Elastic. We didn't switch. We are in a POC.

How was the initial setup?

The initial setup is simple yet complex. Too many teams are needed.

What about the implementation team?

We did the initial setup in-house.

What was our ROI?

In terms of ROI, the labor saving is probably the biggest. The NPR is probably second - although management would probably reverse these.

What's my experience with pricing, setup cost, and licensing?

Pricing and licensing are fairly complicated. A GB for .1 sounds great; however, once you put all 16 or so prices together, it adds up fast. A cost model sheet on the main site would be very helpful.
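
As a rough illustration of how separate prices add up, a cost model sheet could be as simple as the sketch below. Every line item, unit price, and quantity here is a placeholder and not a real Datadog price; only the ".1 per GB" figure echoes the review, and the currency is left unspecified. Substitute the figures from your own quote.

```python
# Hypothetical line items: (name, unit price, monthly quantity).
# None of these are real Datadog prices.
line_items = [
    ("log ingestion (per GB)", 0.10, 5_000),
    ("infrastructure hosts (per host)", 15.00, 200),
    ("APM hosts (per host)", 30.00, 50),
]


def monthly_cost(items):
    """Total monthly cost: sum of unit price times quantity per line item."""
    return sum(price * qty for _name, price, qty in items)


def cost_breakdown(items):
    """Per-item cost, largest first, to show which items dominate the bill."""
    costs = [(name, price * qty) for name, price, qty in items]
    return sorted(costs, key=lambda pair: pair[1], reverse=True)


total = monthly_cost(line_items)
breakdown = cost_breakdown(line_items)
```

In this made-up example the cheap-sounding per-GB item is not the largest cost; the per-host items are, which is the kind of effect a published cost sheet would make visible.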

Which other solutions did I evaluate?

We are currently in a POC.

What other advice do I have?

We work with all product versions.

Which deployment model are you using for this solution?

On-premises

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Ramon Snir - PeerSpot reviewer
CTO at a tech vendor with 1-10 employees
Real User
Oct 31, 2022
Increases delivery velocity with less manual testing and good integrations
Pros and Cons
  • "Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity."
  • "Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products."

What is our primary use case?

We use Datadog for three main use cases, including:

  • Infrastructure and application monitoring. This ensures that our services are available and performant at all times. This allows us to proactively address incidents and outages without customers contacting us. This includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions to customer experience).
  • Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes.
  • End-to-end regression testing for APIs and browser-based experiences. Datadog's synthetic testing periodically checks that the system behaves exactly as expected. This is often used as a canary to detect issues even before users reach them organically.

How has it helped my organization?

Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity. 

We have seen time after time that the monitors we have carefully created based on all ingested data are detecting issues quickly and accurately. 

This means we allow ourselves to manually test things less frequently. We have also had an easier time investigating application errors and slowness using Datadog's APM and log explorer products which allow us to introspect any part of the system, in its execution context.

What is most valuable?

The most valuable features include:

  • Integrated observability data ingestion: All data that Datadog collects is connected. This makes it easy to connect logs with failed requests, and slow database queries with services and requests.
  • Broad integrations allow us to monitor our entire production environment in a single place, not just cloud resources. Since all parts stream metrics, logs, and events to Datadog, we can have unified dashboards and manage monitors and incidents all from the same page.
  • A high level of configuration. We can configure and modify many parts, from how data is collected from our applications to how Datadog parses and visualizes it. This means that we always get the best experience, and we don't need to find ten different products that do small things well or settle on one product that does everything badly.

What needs improvement?

Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products. 

Older, more mature products tend to be complete (many features, customization, broad integrations, etc.), while newer products will often be at a "just above minimum viable product" phase for a long time, doing what's intended yet missing valuable customizations and integrations.

For how long have I used the solution?

We've used the solution for 12 months.

What do I think about the scalability of the solution?

The solution scales very well on technical aspects, being able to ingest large quantities of data from many services. However, the pricing often doesn't scale naturally, and effort has to be put in to keep ongoing costs at a reasonable amount.

How are customer service and support?

Customer service and support are generally very high-quality. In most cases, they reply very quickly and offer well-researched and relevant responses. This is contrasted with many vendors who take a long time to reply and send links to documentation instead of understanding the problem.

However, we had cases where support took several weeks to reply to a complicated request and sometimes eventually responded that the issue cannot be resolved. These are rare edge-case occurrences.

How would you rate customer service and support?

Positive

How was the initial setup?

A large part of the initial setup was straightforward. We were able to collect about 80% of the relevant and 90% of the meaningful insights from just a couple of hours of connecting the AWS integration and the Datadog APM agent. 

Getting to 100%, including configuring and customizing things for our unique situation, took about two weeks. Datadog's documentation and support team were extremely helpful during both phases.

What about the implementation team?

We handled the setup in-house.

What was our ROI?

From the number of outages stopped or shortened (which lead to lost revenue from non-renewals) and the number of hours saved on investigations (which correlates to engineering salaries), I estimate the ROI of the implementation time and monthly charges to be between 10x and 20x.
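
The estimate above can be read as a simple ratio of benefits to costs. The sketch below is one way to write that down, not the reviewer's actual calculation; the function name, its parameters, and all input figures are illustrative assumptions.

```python
def roi_multiple(revenue_protected, hours_saved, hourly_cost,
                 implementation_cost, monthly_charges, months):
    """Return benefits divided by costs over the given period.

    Benefits: revenue not lost to outages plus engineering hours saved.
    Costs: one-off implementation effort plus recurring monthly charges.
    """
    benefit = revenue_protected + hours_saved * hourly_cost
    cost = implementation_cost + monthly_charges * months
    return benefit / cost


# Placeholder inputs for a 12-month period (not figures from the review).
example = roi_multiple(
    revenue_protected=100_000,
    hours_saved=500,
    hourly_cost=100,
    implementation_cost=5_000,
    monthly_charges=500,
    months=12,
)
```

The result is sensitive to how revenue protected from outages is estimated, which is usually the least certain input.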

What other advice do I have?

We use the solution as a SaaS deployment.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2003202 - PeerSpot reviewer
Architect at a comms service provider with 10,001+ employees
Real User
Oct 31, 2022
Good for monitoring and following metrics with a helpful flame graph
Pros and Cons
  • "Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests. In our microservice environment with 170 services."
  • "I often have issues with the UI in my browser."

What is our primary use case?

We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems. 
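
As a sketch of the idea of deriving availability from outbound service calls, the plain-Python example below counts successes and failures per back-office system. It does not use the Datadog client; in practice the counts would be emitted as custom metrics. The class name, method names, and the system name `billing` are hypothetical.

```python
from collections import Counter


class OutboundCallTracker:
    """Counts outbound call outcomes per back-office system."""

    def __init__(self):
        self.success = Counter()
        self.failure = Counter()

    def record(self, system, ok):
        """Record one outbound call to `system`; `ok` is True on success."""
        if ok:
            self.success[system] += 1
        else:
            self.failure[system] += 1

    def availability(self, system):
        """Fraction of successful calls, or None if no calls were recorded."""
        total = self.success[system] + self.failure[system]
        return self.success[system] / total if total else None


tracker = OutboundCallTracker()
for outcome in (True, True, True, False):
    tracker.record("billing", outcome)
```

Here `tracker.availability("billing")` is the success ratio over the recorded calls, and a system with no recorded calls returns `None` rather than a misleading 0 or 1.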

We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work. 

However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.

How has it helped my organization?

Previously, the team only had Instana, and few people used it. The main barriers to entry were the access (since it was not integrated into our SSO) and the user experience, which made it hard to follow. We had an on-prem version, and it wasn't the snappiest. The APM has made observability and tracing more accessible to developers.

What is most valuable?

Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests. In our microservice environment with 170 services, there are complex transactions over the course of a single user request, since we essentially operate as a middle layer with 90 back-office systems we integrate with.

What needs improvement?

I often have issues with the UI in my browser. I tend to have a lot of tabs open, yet have issues with it not responding or not showing data. A couple of times, pasting the URL into an incognito window shows the data that's there.

For how long have I used the solution?

I've used the solution for two years. 

How was the initial setup?

The initial setup was complex and required a bit of tweaking to get everything configured correctly and into our pipelines.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1996521 - PeerSpot reviewer
Engineering Manager at Indeed.com
User
Oct 30, 2022
Transparent, easy to use, and integrates well with Slack
Pros and Cons
  • "Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack)."
  • "I would like better navigability across pages."

What is our primary use case?

I primarily use the solution to learn, watch and monitor business and engineering metrics in the production and QA environments of my team. 

We create monitors on key business metrics and observe regressions and anomalies.

Less often, I leverage the events capability in Datadog to get notified about significant activities happening in my teams' deployments.

We learn about Datadog monitor alerts through Slack and often attempt to create SLOs using Terraform.

We use APM for observability.

Most recently, I learned about Watchdog alerts, which I will be looking into heavily.

How has it helped my organization?

Datadog made it easy for me to watch and add monitors on any metric emitted by any team at my organization.

Datadog APM immensely improved our ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly to understand the time spent at each critical stage of a production request is helpful. This, combined with Datadog's ability to show historical business metrics alongside, helped us get more powerful insights much more quickly.

Datadog's seamless integration with Slack and PagerDuty helped us to receive alerts right to the most common notification methods we use (our mobile devices and Slack).

What is most valuable?

The most valuable aspects include:

  • The ability to monitor any team's metric in my company (transparency)
  • The ability to create/clone dashboards for myself (ease of use)
  • Its integration with Slack (it is very powerful)
  • The ability to add monitors on any metric emitted by any team at my organization
  • (Through Datadog APM) the ability to understand the reasons behind production issues. Its ability to navigate across services seamlessly in order to understand the time spent at each critical stage of a production request is key. This, combined with Datadog's ability to show historical business metrics alongside, helped me get more powerful insights much more quickly.
  • (Through integrations like Slack and PagerDuty) the ability to receive alerts right to the most common notification method we use (our mobile devices and Slack), which saves a lot of time and helps us maintain focus. 

What needs improvement?

I would like better navigability across pages. The UI/UX is powerful, yet less intuitive. A lot of times, I somehow navigate across buttons and pages, and I end up forgetting how to get back to a particular view that was more insightful. 

Particularly as Datadog starts offering more platform capabilities like APM, Watchdog, Shift left initiatives like instrumentation, continuous testing, intelligent test runner, and Synthetic and real user monitoring, the UI can become more and more clunky, giving users a very frustrating experience. 

For how long have I used the solution?

I've used the solution for five to six years.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user