CTO at a tech vendor with 501-1,000 employees
MSP
Top 20
Aug 7, 2025
Alerting and metrics improve monitoring efficiency while pricing presents challenges
Pros and Cons
    • "The pricing nowadays is quite complex."

    What is our primary use case?

    The primary purposes for which Datadog is used include infrastructure monitoring and application monitoring.

    Our main use of Datadog's integration capabilities was monitoring workloads in the public cloud, and the integrations that pull public cloud metrics natively were helpful, even critical, for us. We do not use Datadog for AI-driven data analysis; we rely on cloud-native and vendor-native tools instead, and at my last employer we didn't use Datadog for the AI piece at all.

    What is most valuable?

    I find alerting and metrics to be the most effective features of Datadog for system monitoring. Datadog was also cheaper to run than the alternatives: because it is SaaS and quite easy to use, the running costs were lower.

    Datadog is available only as SaaS.

    What needs improvement?

    The pricing nowadays is quite complex.

    In future updates, I would like to see AI features included in Datadog for monitoring AI spend and usage to make the product more versatile and appealing for the customer.

    For how long have I used the solution?

    I have been using Datadog since 2014.

    Buyer's Guide
    Datadog
    January 2026
    Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: January 2026.
    881,082 professionals have used our research since 2012.

    What was my experience with deployment of the solution?

    There were no problems with the deployment of Datadog.

    The deployment of Datadog just took a few hours.

    What do I think about the stability of the solution?

    The challenges I encountered while using Datadog were in the early days when the product was missing the ability to monitor Kubernetes and similar features, but they have since added those features. At the moment, I don't think there are too many challenges that I am worrying about.

    How was the initial setup?

    One person is enough to do the installation.

    What other advice do I have?

    I am not working with any of these solutions currently because I'm on sabbatical, but I was working with Datadog until six months ago.

    We used the native tools that AWS and Azure provide to monitor the AI workflows on their platforms.

    I used to work as the CTO at Northcloud, but I no longer work there.

    On a scale of one to ten, I rate Datadog an eight out of ten.

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Other
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Aug 7, 2025
    PeerSpot user
    Andrei Mita - PeerSpot reviewer
    Service Manager at a tech vendor with 10,001+ employees
    Real User
    Top 5
    Oct 2, 2024
    Easy to configure with synthetic testing and offers a consolidated approach to monitoring
    Pros and Cons
    • "Synthetic testing is by far the most valuable feature in our organization."
    • "One area where the product could be improved is Application Performance Monitoring (APM)."

    What is our primary use case?

    We use this solution for enterprise monitoring across a large number of applications in multiple environments like production, development, and testing. It helps us track application performance, uptime, and resource usage in real time, providing alerts for issues like downtime or performance bottlenecks. 

    Our hybrid environment includes cloud and on-premise infrastructure. The solution is crucial for ensuring reliability, compliance, and high availability across our diverse application landscape.

    How has it helped my organization?

    Datadog has greatly improved our organization by centralizing all monitoring into one platform, allowing us to consolidate data from a wide range of sources. 

    From infrastructure metrics and application logs to end-user experience and device monitoring, everything is now collected and displayed in one place. This has simplified our monitoring processes, improved visibility, and allowed for faster issue detection and resolution. 

    By streamlining these operations, Datadog has enhanced both efficiency and collaboration across teams.

    What is most valuable?

    Synthetic testing is by far the most valuable feature in our organization. It’s highly requested since the setup process is both quick and straightforward, allowing us to simulate user interactions across our applications with minimal effort. 

    The ease of configuring tests and interpreting the results makes it accessible even to non-technical team members. This feature provides valuable insights into user experience, helps identify performance bottlenecks, and ensures that our critical workflows are functioning as expected, enhancing reliability and uptime.

    What needs improvement?

    One area where the product could be improved is Application Performance Monitoring (APM). While it's a powerful feature, many in our organization find it difficult to fully understand and utilize to its maximum potential. 

    The data provided is comprehensive, yet it can sometimes be overwhelming, especially for those who are less familiar with the intricacies of application performance metrics. 

    Simplifying the interface, offering clearer guidance, or providing more intuitive visualizations would make it easier for users to extract valuable insights quickly and efficiently.

    For how long have I used the solution?

    I've used the solution for four years.

    What do I think about the stability of the solution?

    The solution is very stable. Issues happen once or twice a year and are usually solved before we have any real impact on the service.

    What do I think about the scalability of the solution?

    Scalability has never been a bottleneck for us; we've never felt any issues here.

    How are customer service and support?

    Support was slow at the beginning; however, they are much better and more responsive now.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    Datadog offered the most consolidated approach to our monitoring needs.

    How was the initial setup?

    This was a migration project, so it was rather complex.

    What about the implementation team?

    We implemented the solution with our in-house team.

    What's my experience with pricing, setup cost, and licensing?

    I'd recommend new users look down the road and decide on at least a three-year plan.

    Which other solutions did I evaluate?

    We evaluated AppDynamics and Dynatrace.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Tony Martinez1 - PeerSpot reviewer
    Works at a transportation company with 201-500 employees
    Real User
    Top 20
    Oct 1, 2024
    Great logging, session replays, and alerting
    Pros and Cons
    • "Dashboards are helpful for reviewing occasionally to get a higher-level overview of what's happening."
    • "The UI has a lot going on. It should be simpler and have a better way to onboard someone new to using Datadog."

    What is our primary use case?

    Our primary use cases include:

    • Alert on errors customers encounter in our product. We've set up logs that go to Slack to tell us when a certain error threshold is hit.
    • Investigate slow page load times. We have pages in our app that are loading slowly and the logs help us figure out which queries are taking the longest time.
    • Metrics. We collect metrics on product usage.
    • Session replays. We watch session replays to see what a user was doing when a page took a long time to load or hit an error. This is helpful.
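To illustrate the error-threshold alerting described in the list above, here is a minimal Python sketch of a sliding-window check. It is not Datadog's monitor implementation; the class name `ErrorThresholdAlert`, the threshold, and the window size are hypothetical and chosen only for the example.

```python
from collections import deque


class ErrorThresholdAlert:
    """Hypothetical helper: signals when enough errors land in a sliding window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()  # timestamps of recent errors, oldest first

    def record_error(self, timestamp: float) -> bool:
        """Record one error and return True if the threshold is now reached."""
        self._timestamps.append(timestamp)
        # Drop errors that are older than the window relative to this one.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Example: alert once 3 errors occur within 60 seconds (assumed values).
alert = ErrorThresholdAlert(threshold=3, window_seconds=60)
results = [alert.record_error(t) for t in (0, 10, 20, 100)]
```

In a real setup the `True` result would be the point at which a notification is sent to a channel such as Slack; that step is left out here because it depends on the team's own webhook configuration.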

    How has it helped my organization?

    It's helped us find bugs that customers are experiencing before they're reported to us. Sometimes, customers don't report errors, so being able to catch errors before they're reported helps us investigate before other users run into them.

    Datadog has helped us investigate slow page loading times and even see the specific queries that are taking a long time to load.

    Logging lets us see the context around an error. For example, see if a backend service had an error before it surfaced on the frontend.

    Dashboards are helpful for reviewing occasionally to get a higher-level overview of what's happening.

    What is most valuable?

    The most valuable aspects include: 

    • Logging. Being able to view detailed logs helps debug issues.
    • Session replays. They are helpful for seeing what a customer was doing before they saw an error or had a slow page load.
    • Alerting. This is an important part of our on-call process to send alerts to Slack when an error threshold is crossed. Alerts/monitors are easy to configure to only alert when we want them to alert.
    • Dashboards. It's helpful to pull up dashboards that show our most common errors or page performance. It's a good way to see how the app is performing from a bird's-eye view.

    What needs improvement?

    The UI has a lot going on. It should be simpler and have a better way to onboard someone new to using Datadog.

    The log querying syntax can be confusing. Usually, I filter by finding a facet in a log and selecting to filter by that facet, but I'm not sure how to write the filter myself.

    The monitor/alert syntax is also somewhat hard to understand.

    Overall, it should be easier to learn how to use the product while you're using the product. Perhaps tooltips or a link to learn more about whatever section you're using.

    For how long have I used the solution?

    I've used the solution for two years.

    Which solution did I use previously and why did I switch?

    We did not previously use a different solution.

    Which other solutions did I evaluate?

    We did not evaluate other options. 

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Akshay Manchalwar - PeerSpot reviewer
    Technical Support Engineer at a tech vendor with 10,001+ employees
    Real User
    Top 5
    Leaderboard
    Apr 19, 2024
    Helps to set up alerts and thresholds to monitor real-time metrics
    Pros and Cons
    • "Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable."
    • "For three to four months, we have been experiencing real-time delays. For example, if we're monitoring incoming traffic, the real-time status should be displayed up to a certain point. However, due to delays or issues with Datadog, the real-time data might only be updated at an earlier time. We are experiencing consistent delays in data updates from Datadog, with the most recent data often being delayed by about an hour. This issue has been ongoing for the past four months."

    What is our primary use case?

    Datadog is mainly used to set up alerts and thresholds to monitor real-time metrics and checks.

    What is most valuable?

    Integrating Datadog with other platforms has made our monitoring processes a bit easier. It's not super simple, but it's manageable.

    What needs improvement?

    For three to four months, we have been experiencing delays in real-time data. For example, when we monitor incoming traffic, the status shown should be current up to the present moment. However, due to delays or issues with Datadog, the data may only be current as of an earlier time. The delays are consistent, with the most recent data often lagging by about an hour. This issue has been ongoing for the past four months.

    For how long have I used the solution?

    I have been using the product for a year. 

    What do I think about the scalability of the solution?

    My company has 50 users for Datadog. 

    How was the initial setup?

    The tool's deployment is difficult and time-consuming. 

    What's my experience with pricing, setup cost, and licensing?

    The Datadog Agent is open-source, although the platform itself is a paid commercial service.

    What other advice do I have?

    If you're thinking about using Datadog for the first time, I suggest getting some basic training in data operations. It'll help you navigate Datadog more easily. 
    Learning it for the first time is not overly difficult, but it's also not very easy.

    I would rate the tool a seven out of ten. While it's a useful tool, we've experienced some issues that haven't been resolved yet. Additionally, setting up dashboards and utilizing all the features requires some training. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Victor Chen1 - PeerSpot reviewer
    Software Engineer at an agriculture company with 51-200 employees
    Real User
    Top 20
    Sep 30, 2024
    Good for log ingestion and analyzing logs with easy searchability of data
    Pros and Cons
    • "The feature I've found most valuable is the log search feature."
    • "More helpful log search keywords/tips would be helpful in improving Datadog's log dashboard."

    What is our primary use case?

    We use Datadog as our main log ingestion source, and Datadog is one of the first places we go to for analyzing logs. 

    This is especially true for cases of debugging, monitoring, and alerting on errors and incidents, as we send traffic logs from K8s, Amazon Web Services, and many other services at our company to Datadog. In addition, many products and teams at our company have dashboards for monitoring statistics (sometimes based on these logs directly, other times we set queries for these metrics) to alert us if there are any errors or health issues.

    How has it helped my organization?

    Overall, at my company, Datadog has made it easy to search for and look up logs impressively quickly across a large volume of logs.

    It lets you set up monitoring and alerting directly from log queries, which is convenient and makes for a good user experience. While there is a bit of a learning curve, given enough time, a majority of my company now uses Datadog as the first place to check when there are errors or bugs.

    However, the cost aspect of Datadog is tricky to gauge because it's related to usage, and thus, it is hard to tell the relative value of Datadog year to year.

    What is most valuable?

    The feature I've found most valuable is the log search feature. It's set up with our ingestion to be a quick one-stop shop, is reliable and quick, and seamlessly integrates into building custom monitors and alerts based on log volume and timeframes. 

    As a result, it's easy to leverage this to triage bugs and errors, since we can pinpoint the logs around the time that they occur and get metadata/context around the issue. This is the main feature that I use the most in my workflow with Datadog to help debug and triage issues.

    What needs improvement?

    More helpful log search keywords/tips would improve Datadog's log dashboard. I recently struggled a lot to parse text from raw line logs that didn't seem to match directly with facets. There are presumably smart searching capabilities; however, it's not intuitive to learn how to leverage them, and I instead had to resort to a Python script to do some simple regex parsing. (I was trying to parse "file:folder/*/*" from the logs and couldn't seem to do this in Datadog. Maybe I'm just not familiar enough with the logs, but I couldn't easily find resources on how to do this either.)
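For readers in the same situation, the kind of regex parsing mentioned above might look like the following minimal Python sketch. The reviewer's actual script is not shown in the review, so this is only an assumption about its shape: it treats each `*` in "file:folder/*/*" as exactly one path segment, and the function name `extract_paths` and the sample log lines are hypothetical.

```python
import re

# Assumption: each "*" stands for one path segment without slashes or whitespace.
PATH_PATTERN = re.compile(r"file:folder/[^/\s]+/[^/\s]+")


def extract_paths(lines):
    """Return every 'file:folder/<segment>/<segment>' match found in raw log lines."""
    matches = []
    for line in lines:
        matches.extend(PATH_PATTERN.findall(line))
    return matches


# Hypothetical raw log lines, for illustration only.
sample_logs = [
    "INFO loaded file:folder/a/b.txt ok",
    "WARN nothing to see here",
    "ERROR could not read file:folder/x/y",
]
paths = extract_paths(sample_logs)
```

The script works on exported raw log text, which sidesteps the facet question entirely; it does not show how to express the same search in Datadog's own query syntax.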

    For how long have I used the solution?

    I've used the solution for 10 months.

    What's my experience with pricing, setup cost, and licensing?

    Beware that the cost will fluctuate (and it often only gets more expensive very quickly).

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    reviewer2045004 - PeerSpot reviewer
    Software Engineering Manager at a hospitality company with 1,001-5,000 employees
    Real User
    Dec 7, 2022
    Easy to implement with great passive and active monitoring
    Pros and Cons
    • "It is easy to implement and scale applications with standardized visibility, monitoring and alerting"
    • "Datadog is so feature-rich that it is often hard to onboard new folks and tough to decide where to invest time."

    What is our primary use case?

    We primarily use the solution for application monitoring (APM, logs, metrics, alerts).

    It's useful for active monitoring (static monitors, threshold monitors). We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add.

    In terms of metrics, the out-of-the-box infrastructure metrics that come with the Datadog agent installation are great. We have made use of both the custom metrics implementation as well as the log-based metrics which are extremely convenient.

    We also use Datadog for RUM and want to explore session replay.

    How has it helped my organization?

    It is easy to implement and scale applications with standardized visibility, monitoring and alerting.

    We get a lot of value out of passive and active monitoring. While different teams across our organization have used different services (metrics, logs, APM, RUM), almost all teams have been able to use the dashboards to report and track high-level metrics and active monitoring. 

    Active monitoring (static monitors, threshold monitors) is great. We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add for our organization.

    What is most valuable?

    The APM and tracing provide visibility and the ability to get right to the root cause of issues, while letting us quickly deploy new services without much need for custom instrumentation.

    The active monitoring (static monitors, threshold monitors) has been very helpful. We get a lot of value out of anomaly detection. SLOs and monitoring of SLOs have been extremely valuable.

    The metrics and out-of-the-box infrastructure metrics that come with the Datadog agent installation are quite helpful to the organization. We have made use of both the custom metric implementation as well as the log-based metrics which are extremely convenient.

    What needs improvement?

    Datadog is so feature-rich that it is often hard to onboard new folks and tough to decide where to invest time. 

    The APM is a perfect example of this. This feature alone has so much (profiling, tracing, span summary, flame graphs). I would love to see more of the insight and automation-focused features, such as the log patterns, where I can spend time more efficiently.

    The cost of Datadog at scale can get very expensive very quickly. I would like to see a better usage/cost dashboard with breakdowns like the AWS cost explorer.

    For how long have I used the solution?

    I've used the solution for three years.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    reviewer2561892 - PeerSpot reviewer
    Principal. Performance Engineering at a real estate/law firm with 1,001-5,000 employees
    User
    Top 20
    Oct 2, 2024
    A go-to tool for analyzing, understanding, and investigating application performance
    Pros and Cons
    • "Log analytics give us a powerful mechanism for error tracking, research, and analysis."
    • "Network device and performance monitoring could be improved, as we've faced some limitations in this area."

    What is our primary use case?

    The solution is used for full stack enterprise performance monitoring for our primarily cloud-based stack on AWS. We have implemented monitoring coverage using RUM for critical apps and websites and utilize APM (integrated with RUM) for full stack traceability.

    We use Datadog as our primary log repository for all apps and platforms, and the advanced log analytics enable accurate log-based monitoring/alerting and investigations. 

    Additionally, we use some advanced RUM capabilities and metrics to track and optimize client-side user experience. We track SLOs for our critical apps and platforms using Datadog.

    How has it helped my organization?

    We now have full-stack observability, which allows us to better understand application behavior, quickly alert users about issues, and proactively manage application performance.  

    We've seen value by implementing observability coordinated across multiple applications, allowing us to track things like customer shopping and orders across multiple applications and services.  

    For critical application launches, we've built dashboards that can track user activity and confirm users are able to successfully utilize new features, tracking user activities in real-time in a war-room situation.  

    Datadog is our go-to tool for analyzing, understanding, and investigating application performance and behavior.

    What is most valuable?

    APM accurately tracks our service performance across our ecosystem. RUM gives us client-side performance and user experience visibility, and the rate of new features implemented in the Digital Experience area recently has been high. Log analytics give us a powerful mechanism for error tracking, research, and analysis.  

    Custom metrics that we've created allow us to track KPIs in real-time on dashboards. All of these have proven valuable in our organization.  Additionally, Datadog product support teams are responsive and have provided timely support when needed.

    What needs improvement?

    Agent remote configuration should be provided/improved and streamlined, allowing for config changes/upgrades to be performed via the portal instead of at the host.   

    Cost tracking via the admin portal is a bit lacking, even though it has gotten better.  I'm looking for usage trends (that drive cost) across time and better visibility or notifications about on-demand charges.  

    Network device and performance monitoring could be improved, as we've faced some limitations in this area.  

    The Datadog usage-based cost model, while giving us better transparency, is difficult to follow at times and is constantly evolving.  

    For how long have I used the solution?

    I've used the solution for three years.

    How are customer service and support?

    Support has been responsive and helpful.  

    How would you rate customer service and support?

    Positive

    What's my experience with pricing, setup cost, and licensing?

    Pricing is straightforward. That said, it's sometimes difficult to estimate usage volumes.

    Which other solutions did I evaluate?

    We evaluated Datadog and New Relic in detail and chose Datadog for its straightforward and competitive pricing model, its full coverage of the monitoring features we wanted, and its easy-to-use UI.

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    SecOps Engineer at a media company with 51-200 employees
    User
    Top 20
    Sep 30, 2024
    Helpful support, with centralized pipeline tracking and error logging
    Pros and Cons
    • "Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most."
    • "While the documentation is very good, there are areas that need a lot of focus to pick up on the key details."

    What is our primary use case?

    Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting. 

    How has it helped my organization?

    Through the use of Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps, and Datadog ties them all together in cohesive dashboards. 

    What is most valuable?

    The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

    Synthetic testing is great, allowing us to catch potential problems before they impact real users. Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

    What needs improvement?

    While the documentation is very good, there are areas that need a lot of focus to pick up on the key details. In some cases the screenshots don't match the text when updates are made. 

    I spent longer than I should have trying to figure out how to correlate logs to traces, mostly because of environment variables.
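As a rough illustration of where environment variables come into log-to-trace correlation, the Python sketch below builds a JSON log record tagged with `DD_ENV`, `DD_SERVICE`, and `DD_VERSION` plus trace and span IDs. It assumes the attribute names `dd.env`, `dd.service`, `dd.version`, `dd.trace_id`, and `dd.span_id` that Datadog's correlation guidance uses; the function name `correlated_log` and all values are hypothetical, and the reviewer's own setup (.NET on IIS) would be configured differently.

```python
import json
import os


def correlated_log(message, trace_id, span_id, environ=None):
    """Build a JSON log line carrying the tags a backend could use to join logs to traces."""
    env = os.environ if environ is None else environ
    record = {
        "message": message,
        # Assumed tagging variables; empty string if unset.
        "dd.env": env.get("DD_ENV", ""),
        "dd.service": env.get("DD_SERVICE", ""),
        "dd.version": env.get("DD_VERSION", ""),
        # In a real service these IDs come from the active trace; here they are passed in.
        "dd.trace_id": str(trace_id),
        "dd.span_id": str(span_id),
    }
    return json.dumps(record)


# Hypothetical values, for illustration only.
line = correlated_log(
    "checkout failed",
    trace_id=1234,
    span_id=5678,
    environ={"DD_ENV": "staging", "DD_SERVICE": "web", "DD_VERSION": "1.0.0"},
)
```

If the variables differ between the process that emits logs and the one that emits traces, the two sets of tags will not match, which is one plausible way this kind of setup goes wrong.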

    For how long have I used the solution?

    I've used the solution for about three years.

    What do I think about the stability of the solution?

    We have been impressed with the uptime.

    What do I think about the scalability of the solution?

    It's scalable and customizable. 

    How are customer service and support?

    Support is helpful. They help us tune our committed costs and alert us when we start spending out of the on-demand budget.

    Which solution did I use previously and why did I switch?

    We used a mix of SolarWinds, UptimeRobot, and GitHub Actions. We switched to find one platform that could give deep app visibility.

    How was the initial setup?

    Setup is generally simple. .NET Profiling of IIS and aligning logs to traces and profiles was a challenge.

    What about the implementation team?

    We implemented the solution in-house.

    What was our ROI?

    There has been significant time saved by the development team in terms of assessing bugs and performance issues.

    What's my experience with pricing, setup cost, and licensing?

    I'd advise others to set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling.

    Which other solutions did I evaluate?

    New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

    What other advice do I have?

    We are excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.