Lead Data Ops Engineer at Wipro Limited
Real User
Top 20
May 31, 2026
Incident workflows have transformed and now reduce downtime for critical gaming services
Pros and Cons
  • "We have seen a positive return on investment from PagerDuty Operations Cloud through improved operational efficiencies, faster incident response, and reduced downtime."

    What is our primary use case?

    My name is Dinesh Singh Negi and I currently work as a Lead DataOps Engineer in the online gaming industry. My primary responsibility is ensuring the reliability, availability, and performance of our data platform and the complete production system. I work extensively with AWS services, Prometheus, Grafana, and PagerDuty for monitoring, alerting, and incident management. My team supports critical gaming workloads and data pipelines that require high uptime and quick incident response. A significant part of my role involves setting up monitoring strategies, managing on-call operations, handling production incidents, and performing root cause analysis. We drive operational improvements, and we use PagerDuty Operations Cloud as our central incident management platform to ensure alerts are routed to the right team and escalated appropriately. I have been working in operations and reliability for nine to ten years and have hands-on experience managing large-scale customer-facing environments where minimizing downtime and reducing mean time to resolution are key priorities. We also use PagerDuty Operations Cloud to track the maximum time to acknowledgment and maximum time to resolution, so that we can derive meaningful analysis from the incidents triggered to different teams.

    I have worked for nine to ten years in operations, production support, reliability engineering, and mixed roles. During this time, I have worked extensively on monitoring, incident management, system reliability, and operational excellence, particularly in support of large-scale online platforms and data operations. For five to six years, my focus has been on ensuring high availability, managing production incidents, optimizing monitoring and alerting strategies, and improving operational processes. Throughout these years, I have gained hands-on experience with AWS Cloud, Prometheus, Grafana, and PagerDuty Operations Cloud, which are the core tools we use for monitoring, alerting, and incident response.

    What is most valuable?

    The best features are those we have been using for incident management. We have been using PagerDuty Operations Cloud for on-call scheduling, escalation policies, and integration capabilities. Incident management is extremely valuable because it ensures critical alerts are delivered to the right people immediately. On-call scheduling and escalation policies are very helpful because we can define clear ownership for the services and automatically escalate incidents if they are not acknowledged within a specific timeframe. Another key strength is the integration ecosystem. We can integrate it with our monitoring stack, including Prometheus, Grafana, and AWS services, which helps us automate alert ingestion and incident creation without manual intervention. The most valuable features are automated alerting, escalations, on-call management, integrations, and incident analytics.
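
    The time-based escalation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PagerDuty's implementation: the function name, the three-level policy, and the timeout values are all hypothetical, and real escalation policies have more options (for example, repeating the policy) than this simplification shows.

```python
def current_escalation_level(minutes_unacknowledged, level_timeouts):
    """Return the 1-based escalation level that should currently be notified.

    level_timeouts[i] is how many minutes level i+1 has to acknowledge
    before the incident moves to the next level. The last level keeps the
    incident once every earlier timeout has expired (a simplification).
    """
    elapsed = 0
    for level, timeout in enumerate(level_timeouts, start=1):
        elapsed += timeout
        if minutes_unacknowledged < elapsed:
            return level
    return len(level_timeouts)


# Hypothetical policy: primary has 15 minutes, secondary 15, tertiary 30.
policy = [15, 15, 30]
for minutes in (0, 14, 15, 30, 100):
    print(minutes, "min unacknowledged -> level",
          current_escalation_level(minutes, policy))
```

    Keeping the timeouts in a plain list makes it easy to compare different policies per service or per severity before configuring them in the real tool.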

    One example that stands out was a production incident where we experienced a sudden spike in database latency during peak gaming hours. This started impacting player transactions and causing delays in some backend services. Our Prometheus and Grafana monitoring detected the abnormal latency and an error rate that had crossed its threshold, and the alert was automatically routed to PagerDuty Operations Cloud. PagerDuty Operations Cloud immediately notified our team's on-call engineer and triggered the escalation workflow based on the incident severity. Because the issue occurred during peak traffic, a quick response was critical, and we achieved it. PagerDuty Operations Cloud helped us coordinate multiple teams, including DataOps, application, and other infrastructure teams. The platform helped ensure everyone was engaged quickly and that no critical notifications were missed. During the investigation, we identified a resource bottleneck in the database layer caused by an unexpected traffic surge. With the help of the database team, we scaled the required AWS resources and optimized a few long-running queries. This restored normal performance.
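
    For readers who want to see what an alert handed to PagerDuty can look like, here is a minimal sketch that builds the JSON body of a "trigger" event for the PagerDuty Events API v2. The review does not say how the Prometheus and Grafana alerts were actually wired in, so treat this as one possible shape rather than the setup described above. The routing key, source name, summary, and dedup key are hypothetical placeholders, and nothing is sent over the network.

```python
import json

# Public Events API v2 endpoint; shown for context only, no request is made.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"

ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity, dedup_key=None):
    """Build the JSON-serializable body of an Events API v2 trigger event."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    event = {
        "routing_key": routing_key,   # integration key of the target service
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }
    if dedup_key is not None:
        # Events sharing a dedup_key update the same alert instead of
        # opening a new one.
        event["dedup_key"] = dedup_key
    return event


# All values below are made-up placeholders.
event = build_trigger_event(
    routing_key="HYPOTHETICAL_INTEGRATION_KEY",
    summary="Database latency above threshold during peak hours",
    source="hypothetical-db-cluster",
    severity="critical",
    dedup_key="db-latency-hypothetical-db-cluster",
)
body = json.dumps(event)
print(body)
```

    In a real setup, this body would be sent as an HTTP POST to the endpoint above, or the monitoring tool's own PagerDuty integration would produce an equivalent event.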

    What needs improvement?

    A significant positive impact is improved incident response efficiency and overall service reliability. Before we had a mature incident management process, coordinating responses during critical issues often required manual communication and follow-ups. PagerDuty Operations Cloud automated all of that, including alert ownership and escalation, ensuring that incidents are routed to the right team members immediately. One of the most measurable benefits is the reduction in mean time to acknowledge and mean time to resolve. Faster detection and response help minimize service disruptions and maintain a stable experience for our users, which is especially important in the online gaming industry, where availability and performance directly affect customer satisfaction. The platform has also helped us mature our operational practices by analyzing incident trends, alert volumes, and escalation patterns. We have been able to refine our monitoring, reduce alert fatigue, and proactively address recurring issues before they become major bottlenecks in production.
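
    As a small worked example of the two metrics mentioned above, the sketch below computes mean time to acknowledge (MTTA) and mean time to resolve (MTTR) from incident timestamps. The incident records are invented for illustration, the field names are hypothetical, and MTTR is assumed here to be measured from trigger to resolution; teams sometimes define it differently.

```python
from datetime import datetime
from statistics import mean

# Invented sample data; not taken from any real incident history.
incidents = [
    {"triggered": "2026-01-10T02:00:00",
     "acknowledged": "2026-01-10T02:04:00",
     "resolved": "2026-01-10T02:50:00"},
    {"triggered": "2026-01-12T18:30:00",
     "acknowledged": "2026-01-12T18:36:00",
     "resolved": "2026-01-12T19:40:00"},
]


def minutes_between(start, end):
    """Minutes elapsed between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mtta_minutes(records):
    """Mean time from trigger to acknowledgment, in minutes."""
    return mean([minutes_between(r["triggered"], r["acknowledged"]) for r in records])


def mttr_minutes(records):
    """Mean time from trigger to resolution, in minutes."""
    return mean([minutes_between(r["triggered"], r["resolved"]) for r in records])


print("MTTA (min):", mtta_minutes(incidents))  # 5.0 for the sample data
print("MTTR (min):", mttr_minutes(incidents))  # 60.0 for the sample data
```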

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for more than five years.

    Buyer's Guide
    PagerDuty Operations Cloud
    June 2026
    Learn what your peers think about PagerDuty Operations Cloud. Get advice and tips from experienced pros sharing their opinions. Updated: June 2026.
    900,747 professionals have used our research since 2012.

    What do I think about the stability of the solution?

    PagerDuty Operations Cloud is stable.

    What do I think about the scalability of the solution?

    When you are using a tool for incident response, you need to trust that notifications and escalations work when a critical event occurs. PagerDuty Operations Cloud has been very dependable in that regard. Another aspect we have found valuable is the flexibility to support different teams and services as our environment grows. We have added new applications, data pipelines, and AWS service resources. We are able to extend our PagerDuty Operations Cloud configuration without major challenges or changes to our overall operational model.

    Which solution did I use previously and why did I switch?

    I have not used any solution previously. Since the beginning of 2021, I have been using PagerDuty Operations Cloud.

    How was the initial setup?

    The setup and customization process was relatively straightforward. The integrations were one of the easiest parts. PagerDuty Operations Cloud provides well-documented integrations for monitoring tools and cloud platforms. Connecting it with our Prometheus, Grafana, and AWS monitoring stack did not require significant development efforts. The initial setup involved configuring alert routing, defining service ownership, and mapping severity levels to appropriate escalation policies. Customizing on-call schedules and escalation workflows was also quite flexible. We were able to create different schedules for various teams, define escalation paths based on incident severity, and establish notification rules that match our operational requirements. As our team and environment grew, we refined the configuration further by tuning alert thresholds and reducing noise to avoid alert fatigue. It is important to ensure engineers receive only actionable alerts rather than excessive notifications.

    What about the implementation team?

    PagerDuty Operations Cloud's AI and automation capabilities are primarily used for alert correlation, event intelligence, noise reduction, incident prioritization, and providing operational context to responders. These capabilities help engineers identify and respond to issues more quickly while keeping humans in control of critical decisions. We see value in the direction of autonomous operations. If AI agents continue to improve in areas such as incident triage, root cause analysis, and automated remediation for well-understood scenarios, they could further reduce response times and operational overhead.

    What was our ROI?

    We have seen a positive return on investment from PagerDuty Operations Cloud through improved operational efficiencies, faster incident response, and reduced downtime. I cannot share financial figures, but I can speak to operational outcomes we have observed. Since implementing PagerDuty Operations Cloud and integrating it with our AWS, Prometheus, and Grafana monitoring stack, we have seen measurable improvements in incident metrics such as MTTA and MTTR, and we have reduced alert fatigue by using event correlation and alert deduplication. These improvements have helped us a great deal.

    Which other solutions did I evaluate?

    I did not get a chance to evaluate any other applications. When I joined the company, they were using only PagerDuty Operations Cloud, so I started with that.

    What other advice do I have?

    My advice would be to start with a clear incident management strategy rather than focusing only on the tool itself. PagerDuty Operations Cloud delivers the most value when you have well-defined service ownership, escalation policies, severity levels, and monitoring practices in place. The platform is very powerful, but its effectiveness depends on the quality of the alerts and operational processes behind it. I would also recommend investing time in alert tuning early on and integrating PagerDuty Operations Cloud with your monitoring stack, whether it is AWS, Prometheus, Grafana, or any other observability tool. Make sure the alerts being sent are actionable. Reducing noise from the beginning will help prevent alert fatigue and improve adoption among engineering teams. I would rate this product an eight out of ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: May 31, 2026
    Divyajyoti Ghosh - PeerSpot reviewer
    CSO C Apac at Autodesk, Inc.
    Real User
    Top 10
    Mar 18, 2026
    Reliable on-call workflows have supported incident response and integrations across teams
    Pros and Cons
    • "PagerDuty Operations Cloud has been reliable as it has never gone down in my experience; I have never seen it fail."
    • "Our developers noted that integrating incident bots or chatbots created with AI tools occasionally presents challenges with PagerDuty Operations Cloud."

    What is our primary use case?

    We took a subscription and started integrating many applications with it. We have integrated it with ServiceNow and also use our wiki repository by Spotify called Beacon. We onboard multiple services and integrate them with PagerDuty so that different engineering teams can update their line of escalation policy and primary, secondary, and tertiary users who are on call 24/7 or following a Follow the Sun model.

    Additionally, we have integrated PagerDuty for incident commanders, who can acknowledge any incident page they receive and update their responses. If they need to hand it over to someone else, they can do that as well. This is how we are actually using PagerDuty. We primarily leverage it for service onboarding. For instance, if you have created a product and need to support it, which might run on any cloud such as AWS (for example, on EC2) or Azure, the service teams or triage engineering teams have their members added into PagerDuty along with different playbooks, runbooks, and SOPs integrated with any ticketing tool such as ServiceNow or Jira, whichever you are using. On these fronts, we effectively use PagerDuty.

    PagerDuty Operations Cloud has been reliable as it has never gone down in my experience; I have never seen it fail. I rely on PagerDuty Operations Cloud for on-call support for any high severity incidents or sev zero scenarios. This is a great feature, and I can update incidents from my phone because we also use the PagerDuty Operations Cloud mobile app. Occasionally, I may be on call during weekends, and something might come up. For example, on October 20th, I was on leave but used PagerDuty Operations Cloud to stay in sync, even without my laptop. I could join Teams and Zoom calls and simultaneously update required documentation via Copilot and ChatGPT on PagerDuty Operations Cloud regarding different incidents. This capability was extremely helpful.

    What is most valuable?

    The integration with ServiceNow is one of the most valuable features. I rely on PagerDuty Operations Cloud for on-call support for high severity incidents or sev zero scenarios, which is a great feature. PagerDuty Operations Cloud has been reliable as it has never gone down, and I trust it for incident response. Additionally, I find the capability to update using the mobile app extremely helpful.

    Overall, I am very satisfied with PagerDuty Operations Cloud, and my team is also pleased with it. We have onboarded multiple teams using this tool, and it functions well.

    What needs improvement?

    In terms of integration, while I cannot speak for all developers, some have encountered anomalies, but I expect they will resolve over time. Our developers noted that integrating incident bots or chatbots created with AI tools occasionally presents challenges with PagerDuty Operations Cloud. This is the only negative feedback I have encountered.

    From a security perspective, I believe there could be more layers. When scheduling on-call rotations for different team members, access should be restricted to specific users to prevent unauthorized changes to the on-call module. Despite this, the security features have been functioning well.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud since 2021.

    What do I think about the stability of the solution?

    I cannot recall ever having to contact support. I have been using PagerDuty Operations Cloud for more than seven years across two organizations without ever needing assistance. It never breaks down for us, and considering I have devoted 20 years of my career to IT infrastructure operations, where everything typically breaks down, including Jira and ServiceNow, it is impressive to say that PagerDuty Operations Cloud has not caused disruptions. I can also mention MongoDB Atlas as a vendor we subscribe to and similarly, I have never experienced disruptions in their services, aside from scheduled maintenance.

    PagerDuty Operations Cloud performs maintenance, which we are notified of in advance, usually two weeks prior, and this occurs during low-activity periods such as holidays. We adapt our workflows accordingly but typically, these notifications for maintenance are infrequent, occurring once or twice a year, making them manageable.

    What do I think about the scalability of the solution?

    PagerDuty Operations Cloud is scalable, but I emphasize I am a user. In our field, whether supporting applications or web technologies, it is very scalable. However, if developers assess it from an AI perspective, I cannot comment due to our newness to AI. Nevertheless, I find it scalable in all other aspects and have worked with some of the top tools available, apart from Remedy, Siebel, or Lotus Notes. Overall, it is highly scalable.

    How are customer service and support?

    I cannot recall ever having to contact support.

    How would you rate customer service and support?

    Negative

    Which solution did I use previously and why did I switch?

    I have never used any alternatives to PagerDuty Operations Cloud. I have been a bit old school, where we used to get paged on our phone numbers. Aside from PagerDuty Operations Cloud, I cannot recall using anything else.

    Which other solutions did I evaluate?

    Some alternatives to PagerDuty Operations Cloud could be Automation Anywhere, Tray.io, IBM RPA, and a few solutions from Hyland and SAP that also do automation.

    What other advice do I have?

    While I cannot provide specific pricing details, I can share my perspective as an operations professional. Though we use Jira and initially relied on ServiceNow, we have transitioned more towards Vulcan. We never considered moving away from PagerDuty Operations Cloud. I believe that whatever the cost is, it is beneficial because the IT infrastructure operations industry cannot function without PagerDuty Operations Cloud or a similar product. Furthermore, PagerDuty Operations Cloud has an excellent reputation.

    New users are onboarded to ServiceNow or Jira, and they immediately create a PagerDuty Operations Cloud account profile that goes through a verification and approval process by a hierarchy. Once approved, they can set up their numbers and build their profiles to reflect their department, area of expertise, and time zone, allowing them to track incidents outside their shifts while remaining informed about ongoing schedules.

    OpenScape is one product that I used before. I worked with Siemens in healthcare IT infrastructure operations, and during that period, around 2013 to 2016, we used BMC Remedy integrated with OpenScape. Back then, our phone numbers were connected to it, but it was not particularly helpful. If you were not in front of your laptop, you received a call, and the automated IVR provided a brief description of the incident logged. You could only acknowledge or resolve the incident, without the option to assign it. OpenScape was what I had used before adopting PagerDuty Operations Cloud at Salesforce starting in 2021, and I continued with PagerDuty Operations Cloud at Autodesk in 2022, where its use became more widespread. As I mentioned earlier, some developers have indicated that integrating bots presents a challenge that has been somewhat resolved over time, but that is the only negative feedback I have heard about PagerDuty Operations Cloud. I rate this product overall a nine out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Mar 18, 2026
    Hempreet Singh - PeerSpot reviewer
    Digital Specialist Engineer at a consultancy with 10,001+ employees
    MSP
    Top 20
    Dec 10, 2025
    On-call automation has reduced downtime and has enabled faster incident response at scale
    Pros and Cons
    • "PagerDuty Operations Cloud has positively impacted my organization by helping in faster incident detection and resolution with less downtime."
    • "Even though PagerDuty Operations Cloud is a strong platform, many things can be improved."

    What is our primary use case?

    PagerDuty Operations Cloud is a platform that helps teams manage incidents, automate operations, and ensure system reliability by bringing alerts, on-call schedules, and real-time responses into one place. When we had to push things into production, we set up PagerDuty schedules on a weekly or biweekly basis. If an issue occurred at night, a roster would pop up, and the respective engineer would have to handle that use case.

    A specific incident where PagerDuty Operations Cloud helped my team was during the peak season in America, when lakhs of orders were placed in December and a major S1 severity production issue suddenly occurred. If no monitoring tool had been in place, the company would have faced dire consequences, incurring lakhs of dollars in losses. PagerDuty came to our rescue at the last moment when nothing was happening. At 3:00 a.m. my time, I received a message and then a call while I was sleeping, and I learned that this issue had occurred. I logged in quickly and fixed it, and within an hour or so the issue was resolved with minimal damage. I even received appreciation for my quick response.

    PagerDuty Operations Cloud helps in similar situations because whenever some issue happens and we are not aware of it, PagerDuty comes with a flag telling us that there is an issue that needs to be fixed before it becomes a major problem.

    What is most valuable?

    Some of the best features PagerDuty Operations Cloud offers are comprehensive incident management, automation, and AI operations, all integrated into one platform. It provides noise reduction and smarter alert grouping through global intelligent alert grouping, which uses machine learning to group and correlate alerts across services. It also provides automation to reduce toil and speed up resolution, and artificial intelligence, including generative AI assistance, to help teams respond faster and smarter. Additionally, it has built-in workflows with standardized, repeatable processes; improved visibility, collaboration, and a unified operations view; and support for bridging customer-facing teams with engineering and SRE teams. Finally, it provides scalability for enterprise environments.

    The AI-powered alert grouping and automation have made a difference in my day-to-day work by reducing alert noise. It automatically groups multiple related alerts into a single incident, so instead of 20 separate alerts, I get one meaningful alert, which prevents on-call engineers from being spammed. It also helps with faster root cause understanding because the AI looks at patterns across systems, including logs, metrics, alarms, and graphs, and then provides a broad summary. This cuts down the response time, helps in prioritization, and reduces the burnout of on-call teams.
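
    To make the "20 alerts become one incident" idea concrete, here is a deliberately simple sketch. It is not PagerDuty's machine-learning grouping; it is a rule-based stand-in that merges alerts for the same service when they arrive within a fixed time window of each other. The service names, the five-minute window, and the sample alerts are all hypothetical.

```python
from datetime import datetime, timedelta


def group_alerts(alerts, window=timedelta(minutes=5)):
    """Group alerts per service.

    An alert joins the most recent group for its service if it arrives
    within `window` of that group's latest alert; otherwise it starts a
    new group. Each returned group would correspond to one incident.
    """
    groups = []
    latest_group = {}  # service name -> index of its most recent group
    for alert in sorted(alerts, key=lambda a: a["time"]):
        idx = latest_group.get(alert["service"])
        if idx is not None and alert["time"] - groups[idx][-1]["time"] <= window:
            groups[idx].append(alert)
        else:
            groups.append([alert])
            latest_group[alert["service"]] = len(groups) - 1
    return groups


# Invented sample: 20 alerts from one service, 10 seconds apart,
# plus one unrelated alert from another service.
start = datetime(2025, 12, 1, 3, 0, 0)
alerts = [{"service": "checkout-db", "time": start + timedelta(seconds=10 * i)}
          for i in range(20)]
alerts.append({"service": "auth-api", "time": start + timedelta(seconds=35)})

groups = group_alerts(alerts)
print([len(g) for g in groups])  # [20, 1]: two incidents instead of 21 pages
```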

    PagerDuty Operations Cloud has positively impacted my organization by helping in faster incident detection and resolution with less downtime. It has reduced noise and false alerts, which means on-call engineers can focus on real and important issues rather than duplicate and negligible ones. It has helped with automation and efficiency, better collaboration and communication among teams, and improved post-incident learning and prevention. It has helped not only with operational cost savings and a better return on investment, but also with scalability and readiness for growth.

    What needs improvement?

    Even though PagerDuty Operations Cloud is a strong platform, many things can be improved. The depth of analytics and reporting can be improved. Noise suppression and alert grouping can be made more robust, because sometimes the grouping becomes vague and somewhat unclear. The user interface and user experience can improve, because the product is quite complex for new users. Integration and ecosystem limitations can be addressed, as can cost, because the solution is quite expensive for small or mid-sized organizations. The complexity for smaller teams or simpler needs can also be reduced.

    I think we can have richer analytics, and the reporting dashboards can improve. More robust noise suppression can help us. Native support for alert attachment can help us. A simpler user interface and user experience can be implemented, and pricing tiers and models should be more favorable. Accessible documents and easier onboarding can help a lot.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud since the first year of my job, and I have worked on four projects, using it in all of them.

    What do I think about the stability of the solution?

    PagerDuty Operations Cloud is one of the most stable platforms.

    What do I think about the scalability of the solution?

    The scalability of PagerDuty Operations Cloud is quite great. I have seen it scale in a very easy and robust manner.

    PagerDuty Operations Cloud has met my needs as my team and workload have grown. The workload will inevitably grow because, once we go online, production issues may occur, but PagerDuty has helped reduce that workload.

    How are customer service and support?

    I never faced an issue that would make me have to reach out to PagerDuty customer support because I think it worked fantastically. However, if that happens in the future, I would be happy to share my experience.

    How would you rate customer service and support?

    Which solution did I use previously and why did I switch?

    This is my first company, and I have been working here since the beginning of my career, so PagerDuty Operations Cloud is the only solution I have worked with.

    How was the initial setup?

    Before using PagerDuty Operations Cloud, my team often took longer to identify the root cause of incidents because alerts were scattered across different tools. After moving to PagerDuty Operations Cloud, AI-powered alert grouping and automated flows have helped us detect issues much faster. We now mobilize the right team within minutes, and our overall incident resolution time has dropped significantly, which has directly reduced our downtime and improved service reliability.

    What about the implementation team?

    Whether to evaluate other options before choosing PagerDuty Operations Cloud was not a team-level decision.

    What was our ROI?

    Cost savings happened since losses were prevented. Time savings also occurred, response time reduced, and many such things happened which I have already mentioned.

    What's my experience with pricing, setup cost, and licensing?

    Pricing, setup cost, and licensing were not my headaches, and the organization already provided me with everything set up. I just had to log in and start using it.

    Which other solutions did I evaluate?

    I did not purchase PagerDuty Operations Cloud through the AWS Marketplace because it is an organization-wide decision, so my company would have done that.

    What other advice do I have?

    I would definitely recommend trying this solution. If you are planning to go to production in the near future, definitely give it a try. If someone is going to production and wants reduced service level agreements, reduced time for root cause analysis, and so on, they should definitely give it a try. It is a tool that you should work with, and I rate this product a 10 out of 10.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Dec 10, 2025
    reviewer2848911 - PeerSpot reviewer
    Vice President – IT, Enterprise Operations Tools at a tech vendor with 10,001+ employees
    Real User
    Top 20
    Jun 5, 2026
    Centralized incident workflows have reduced outage windows and improved response coordination
    Pros and Cons
    • "With one solution, we are able to do the triaging, and that definitely reduced the outage window and the average outage window."
    • "Dynamic scheduling is something I was waiting for almost three or four years."

    What is our primary use case?

    PagerDuty is predominantly used for our enterprise notifications for all of the incident management processes, especially the major incident management. We have many applications and infrastructure components. Earlier, we used a solution that only provided text-based communication. When we wanted to look for something with multi-channel notification and correlation capability, that is where we leveraged PagerDuty Operations Cloud.

    I am currently going through the governance process to get additional capabilities onboarded. GenAI is not yet enabled because I am from a regulated organization and have to secure approvals before enabling any AI-related components. Most probably in the next two or three months, we will be enabling GenAI, the SRE agent, and the other AI capabilities of PagerDuty Operations Cloud.

    What is most valuable?

    The ease of use is one of the key strengths. Creating the escalation policies and notification channels per user is straightforward, and it is not a requirement that everyone has the same notification rules. Users have flexibility in getting the communication they need. Event orchestration is the other part which works well for us.

    Primarily, we were able to get the right people at the right time through our escalation mechanism, which is an automated switch from level one to level two. This helped us improve the overall MTTA, and the acknowledgment rate has drastically improved. For the major incidents, we were able to triage everything with PagerDuty Operations Cloud itself instead of switching between multiple tools such as Teams or other orchestration platforms. With one solution, we are able to do the triaging, and that definitely reduced the outage window and the average outage window.

    We do have automations in two main ways. One is the incoming automation where we have multiple monitoring tools and systems that generate events. We ingest them into PagerDuty Operations Cloud and then using event orchestration, we create all of the respective incidents, whether they are PagerDuty Operations Cloud-only incidents, ServiceNow incidents, or different methods we use. The other automation method is incident workflows where we are able to call out to respective endpoints for the remaining automations. This is growing at this point in time, but event orchestration is mainly what we use for the automation of the triaging.
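
    The incoming automation described above, where events are ingested and rules decide what kind of incident to create, can be pictured as an ordered list of rules in which the first match wins. The sketch below is a generic illustration of that pattern in Python, not PagerDuty's event orchestration rule syntax. The rule conditions, action names, and event fields are all hypothetical.

```python
# Ordered (condition, action) pairs; the first matching rule decides.
# Every condition and action name here is a made-up example.
RULES = [
    (lambda e: e.get("severity") == "info",
     "suppress"),
    (lambda e: e.get("source", "").startswith("prod-")
               and e.get("severity") == "critical",
     "pagerduty_and_servicenow_incident"),
    (lambda e: e.get("severity") in {"critical", "error"},
     "pagerduty_incident"),
]
DEFAULT_ACTION = "pagerduty_low_urgency"


def route_event(event, rules=RULES, default=DEFAULT_ACTION):
    """Return the action of the first rule whose condition matches the event."""
    for condition, action in rules:
        if condition(event):
            return action
    return default


samples = [
    {"source": "prod-db", "severity": "critical"},
    {"source": "dev-api", "severity": "error"},
    {"source": "prod-db", "severity": "info"},
    {"source": "batch-job", "severity": "warning"},
]
for sample in samples:
    print(sample, "->", route_event(sample))
```

    Because the rules are evaluated in order, more specific conditions should sit above more general ones; otherwise the general rule would claim the event first.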

    Our MTTA used to be a two-digit figure, and it is now less than one hour.

    Getting the right people on board whenever there is a major issue and dialing them individually took a longer time. Now with PagerDuty Operations Cloud, having all of the predefined rules and the orchestrations we can create, it is definitely bringing value. Bringing the right people at the right time and improving the restoration time so that we do not impact any of the business end-user services is where PagerDuty Operations Cloud definitely plays a key role in delivering the business value.

    What needs improvement?

    I have submitted a few enhancement requests. Dynamic scheduling is something I was waiting for almost three or four years. Finally, I believe they are coming up in a few weeks with dynamic scheduling. In day-to-day operations, shift rosters are not static. People may rotate between different shifts, and setting this up in PagerDuty Operations Cloud was a challenging task. They are in the early access stage of dynamic rosters, and I believe that will address this issue. From the reporting perspective, there is a wide variety of reports, but the out-of-the-box reports can be matured further. We are getting customized reports through professional services, which is beneficial, but it would help further if they were available out of the box. There are plenty of reports, but there is still room for them to mature.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for almost four years.

    What do I think about the stability of the solution?

    There are not many issues except during Cloudflare or major AWS issues. Otherwise, we do not have any performance issues. The platform is performing well.

    What do I think about the scalability of the solution?

    PagerDuty Operations Cloud is scalable, but the licensing model you choose matters. We are on a per-user license, so we know how many users we can onboard to PagerDuty Operations Cloud. Everything else is definitely scalable, depending on what you agree with them contractually. There is no challenge unless you have not calculated or forecasted your requirements.

    How are customer service and support?

    I used PagerDuty Operations Cloud support.

    I would say they are pretty good, with regular support scoring eight or nine out of ten, and professional services scoring around nine out of ten. Both are pretty good for our business requirements.

    Which solution did I use previously and why did I switch?

    We were using different HP tools for all of the alerting and also a solution from OnSolve, earlier called TelAlert. Those solutions were distributed and not one central solution for incident management and alerts. Now it is centralized with one of our ITSM tools and PagerDuty Operations Cloud for both alerting and incident management.

    How was the initial setup?

    The initial setup was comparatively easy. We had to train the people because it was a new solution altogether. We got professional services support, and they helped us move forward. We did not have many challenges on the system level. Only user experience took more time as the team needed to learn how to use and operate the solution.

    What about the implementation team?

    I used PagerDuty Operations Cloud support.

    What was our ROI?

    On pricing, we got a good deal. When we selected the tool, we compared and evaluated the competitors, and we are satisfied with the pricing. We have not had a chance to calculate the ROI specific to the tool. But overall, looking at the end-to-end process where PagerDuty Operations Cloud is involved, I think we are close to achieving ROI.

    Which other solutions did I evaluate?

    We evaluated Twilio and two other solutions at that time.

    What other advice do I have?

    I would definitely advise others to run a PoC, integrate it with their existing ITSM tools or whatever systems they rely on, and thoroughly complete one end-to-end test. They should also simulate a major incident and compare the metrics they track internally against the additional metrics PagerDuty Operations Cloud could provide. I think these two things should definitely be tested.

    I would give PagerDuty Operations Cloud an eight out of ten as a product. The only reason I put eight instead of ten is the handling of enhancement requests and new features: the time to market has to be much faster than it is at this point. Some flexibility on customization should also be provided. My overall review rating for this product is eight out of ten.

    Which deployment model are you using for this solution?

    On-premises

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
    Last updated: Jun 5, 2026
    RajbhushanSharma - PeerSpot reviewer
    Operations & Delivery Lead at Tavlex
    Real User
    Top 20
    Jun 13, 2026
    Centralized alerting has streamlined on-call workflows and reduced incident response times
    Pros and Cons
    • "Together, these features help our team respond faster, stay organized during an incident, and reduce service disruptions for our customers."
    • "The initial setup and configuration can be complex, especially for teams managing multiple services, escalation policies, and integrations."

    What is our primary use case?

    My main use case for PagerDuty Operations Cloud is to manage critical alerts and incidents across our production systems. It helps our team route alerts to the right people, manage on-call schedules, coordinate responses, and reduce downtimes. We also use its integrations with our monitoring and collaboration tools, so issues are identified and addressed quickly before they impact our customers.

    What is most valuable?

    PagerDuty Operations Cloud is part of our daily operational workflow. It sits between monitoring tools and response teams, ensuring alerts reach the right people without delay. We use it for on-call scheduling, incident escalations, and coordinating responses across teams. Having everything centralized has reduced alert fatigue and helped us respond to issues more consistently, especially during off-hours and high-priority incidents.

    PagerDuty Operations Cloud offers intelligent alerting, on-call scheduling, automated escalations, and incident management as its best features. The platform makes it easy to ensure alerts reach the right person, and escalation policies prevent critical issues from being missed. We also rely heavily on its integration with monitoring and collaboration tools and its real-time visibility into operations. Together, these features help our team respond faster, stay organized during an incident, and reduce service disruptions for our customers.

    What needs improvement?

    While PagerDuty Operations Cloud is strong overall, there are a few areas for improvement. The initial setup and configuration can be complex, especially for teams managing multiple services, escalation policies, and integrations. Some reporting and analytics features could offer more customization without requiring additional configurations. The mobile app works well for alerting, but managing more advanced settings is generally easier from the web interface. It would also be helpful to have more out-of-the-box workflow templates and automation recommendations to simplify onboarding for new teams.

    To make the daily workflow smoother, simplifying the user interface for certain administrative tasks would be a significant improvement. Sometimes, navigating the settings to adjust on-call schedules or escalation policies can take a few extra steps, particularly for large environments. More customizable dashboards and easier reporting for non-technical stakeholders, along with additional guided recommendations for alert tuning, could help teams get even more value from the platform. These are relatively minor points, but addressing them would make an already great tool even more user-friendly.

    I did not give PagerDuty Operations Cloud a perfect rating because there is still room for improvement in areas such as reporting flexibility, dashboard customization, and simplifying certain administrative tasks. Overall, it is a mature and dependable platform that positively impacts our work.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for a year and a half.

    What do I think about the stability of the solution?

    PagerDuty Operations Cloud has been very stable for us.

    What do I think about the scalability of the solution?

    PagerDuty Operations Cloud scales well. As our teams, services, and integrations have grown, the platform has handled the increased workload without requiring any major changes on our side.

    How are customer service and support?

    The customer support has been really quick and responsive. I would give the customer support a 10 out of 10 because the support team has been responsive.

    Which solution did I use previously and why did I switch?

    We used a combination of monitoring tools, but we have not used any particular software regarding incident management.

    How was the initial setup?

    Pricing for PagerDuty Operations Cloud was reasonable for the value provided. Setup was straightforward, and licensing was flexible enough to scale as our team grew. Overall, there were no major concerns in that area.

    What about the implementation team?

    We purchased PagerDuty Operations Cloud directly through the PagerDuty sales team.

    What was our ROI?

    We have seen a positive return on investment. The biggest gains have come from faster incident resolution, less time spent managing alerts, and reduced downtime. This has helped the team work more efficiently without needing additional operational resources.

    Which other solutions did I evaluate?

    We did not evaluate any other option while choosing PagerDuty Operations Cloud.

    What other advice do I have?

    Take the time to properly set up your alerting rules, escalation policies, and integrations from the beginning. PagerDuty Operations Cloud provides the most value when it is aligned with your team's workflows and requirements. Once configured well, it can significantly improve incident response and reduce alert fatigue while making on-call management much easier. I would rate this product a 9 out of 10.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Jun 13, 2026
    Ganesh Singh - PeerSpot reviewer
    Network Operations Center Engineer at Samsung R&D Institute India
    Real User
    Top 20
    May 25, 2026
    Incident workflows have reduced alert noise and now prioritize critical issues faster
    Pros and Cons
    • "The impact of PagerDuty Operations Cloud's alert reduction feature on our organization helps determine which incidents are business-critical, the impacted services, which teams should respond first, and what customer business impacts exist, focusing on the prioritization of critical incidents, enabling faster incident responses and reducing alert noise for better business visibility."
    • "I have faced outages and latency in PagerDuty Operations Cloud, including notification delays, web dashboard issues, and majorly API integration failures, along with mobile notification problems such as latency and missed pushes, especially affecting specific regional users."

    What is our primary use case?

    My usual use cases of PagerDuty Operations Cloud include acknowledging and escalating any P1, P2, or P3 incidents to the upper management or the service team, and also troubleshooting for L1 or L2 issues.

    For features of PagerDuty Operations Cloud, I find it easy to use and understand, and it is very easy to differentiate the alerts.

    What is most valuable?

    PagerDuty Operations Cloud is easy to understand, allowing on-call service teams to escalate priority alerts to the required development team or service team, which helps reduce alert downtime by the SLA, benefiting the organization's requirements.

    The reliability of PagerDuty Operations Cloud includes high availability architecture, multiple alert channels, strong enterprise adoption, and a reliable escalation system, making it easy to see the escalation matrix on the page dashboard.

    The impact of PagerDuty Operations Cloud's alert reduction feature on our organization helps determine which incidents are business-critical, the impacted services, which teams should respond first, and what customer business impacts exist, focusing on the prioritization of critical incidents, enabling faster incident responses and reducing alert noise for better business visibility.

    What needs improvement?

    For improving PagerDuty Operations Cloud, I think there should be a feature for faster incident response, better on-call management, and centralized alert management, including organizing priority alerts into separate buckets for P1, P2, P3, and P4.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for three years.

    What do I think about the stability of the solution?

    I have faced outages and latency in PagerDuty Operations Cloud, including notification delays, web dashboard issues, and majorly API integration failures, along with mobile notification problems such as latency and missed pushes, especially affecting specific regional users.

    How are customer service and support?

    I did not often communicate with the technical support of PagerDuty Operations Cloud because I did not have access.

    Which solution did I use previously and why did I switch?

    Before PagerDuty Operations Cloud, I used Tomcat for the same use cases.

    My previous organization used Tomcat, and when I moved to an organization that used PagerDuty Operations Cloud, I could compare the two. I was not a customer or owner, so I do not know the pricing, but I have used it for managing alerts and reducing downtime.

    How was the initial setup?

    When I started working at my current organization, PagerDuty Operations Cloud was already installed.

    What about the implementation team?

    In some instances, I have implemented automation through PagerDuty Operations Cloud for incident response, for example to automatically reconnect services that go down during an incident when there is no major underlying issue.

    What was our ROI?

    I have not noticed any return on investment from PagerDuty Operations Cloud.

    Which other solutions did I evaluate?

    I last worked with PagerDuty Operations Cloud in February of this year.

    I did not decide to stop working with PagerDuty Operations Cloud because there was something wrong with it. I was contracted by Samsung, and in February, after the project closed, they moved all the contract employees. PagerDuty Operations Cloud simply worked during my time with Samsung.

    What other advice do I have?

    I used PagerDuty Operations Cloud internally in my company.

    I did not use PagerDuty Operations Cloud's autonomous AI agents.

    I did not use generative AI in PagerDuty Operations Cloud.

    I have not implemented AI functionality of PagerDuty Operations Cloud.

    Before we used PagerDuty Operations Cloud, the SLA for any alert was 30 minutes. After implementing it, that time dropped to 15 minutes for non-major issues or for closing alerts.

    I used official documentation from the organization, such as Confluence pages detailing how to use PagerDuty Operations Cloud, uploaded by the upper management.

    I consider PagerDuty Operations Cloud highly scalable for enterprise incident management and operation responses, with technical scalability, team and organizational scalability, and integration scalability.

    I find the scalability important because we can easily transfer knowledge to newcomers, allowing them to understand without complex solutions.

    I gave this review a rating of 9 out of 10.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: May 25, 2026
    Mukesh Ts - PeerSpot reviewer
    Software Developer at Webspruce
    Real User
    Top 20
    Dec 5, 2025
    Automated on-call scheduling has reduced manual effort and now keeps holiday coverage reliable
    Pros and Cons
    • "Now, we don't have to think about it; we can simply set it up in PagerDuty and it works, with the escalation and everything still functioning reliably based on the configuration we set up six months to one year ago."
    • "I think the view on the website regarding how we see the chart and graph of who is on-call at what time could be improved."

    What is our primary use case?

    My main use case for PagerDuty Operations Cloud is to set up shifts for people on-call.

    A specific example of how I use PagerDuty Operations Cloud for setting up shifts is for when we need to set up shifts for holidays. In our team, we'll assign people who will be on-call and create an Excel sheet and upload it to PagerDuty. It works normally, gives notifications, and everything else functions properly. It is very easy to set up and manage.

    I usually discuss with my team who will be on-call during holidays, and we will set up how many people are needed. We create an Excel sheet, upload it to PagerDuty, and set up the line of who is the first person to reach, and if they miss it, then whom to escalate to. The web view and website are also very easy to use. I think this is the normal use case. Perhaps other teams are using it differently, but this works well for us. Before, it was very manual, and it was quite difficult.
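    The "first person to reach, then whom to escalate to" ordering described above can be pictured as an ordered chain that is walked until someone acknowledges. This is a minimal sketch under that assumption; the function name, the responder names, and the `acknowledges` callback are hypothetical and do not reflect how PagerDuty implements escalation policies.

```python
def escalate(chain, acknowledges):
    """Walk the escalation chain in order.

    `chain` is an ordered list of responders; `acknowledges` is a callable
    that reports whether a given responder picked up the alert.
    Returns (level, responder) for the first acknowledgment, or None
    if nobody in the chain responds.
    """
    for level, responder in enumerate(chain, start=1):
        if acknowledges(responder):
            return level, responder
    return None


# Hypothetical holiday roster: the primary misses the page, the secondary answers.
holiday_chain = ["primary", "secondary", "manager"]
print(escalate(holiday_chain, lambda r: r == "secondary"))  # (2, 'secondary')
```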

    What is most valuable?

    The best features PagerDuty Operations Cloud offers are that it is simple to set up and supports Excel sheet uploads, which was very helpful. Setting up notifications and the integration with Datadog was excellent. We can automate many things.

    PagerDuty Operations Cloud has positively impacted my organization because the support team is very happy. Before, setting up everything was very difficult. Now, we don't have to think about it. We can simply set it up in PagerDuty and it works. The escalation and everything simply works with the configuration we set up six months to one year ago, and it still functions. We make only minor changes. I think a lot of manual effort has been reduced, and the system is more reliable.

    Before implementing PagerDuty Operations Cloud, the L1 team had to stay online at night, and if someone fell asleep and missed an issue, it would quickly escalate to a manager or someone higher up, creating a lot of fuss. That is almost gone now. Deciding who would be on-call and setting that up was not as foolproof when we did it manually, and someone had to invest a lot of time, around one or two hours weekly. Now it takes less than five minutes. Every week, we simply discuss it and it's done. I think a lot of time and a lot of mental effort have been saved.

    What needs improvement?

    I think the view on the website regarding how we see the chart and graph of who is on-call at what time could be improved. We could make that line more expressive to show who will get escalated if someone misses.

    What do I think about the stability of the solution?

    PagerDuty Operations Cloud is stable; we didn't find any bugs or unintended behavior.

    What do I think about the scalability of the solution?

    PagerDuty Operations Cloud is scalable; we can easily add teams, manage tags, and create teams. It is very easy to manage, and adding the line of priority and deciding whom to go first was very easy.

    How are customer service and support?

    The customer support is adequate; usually, they respond and help us fix issues during integration. It was helpful.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    Before using PagerDuty Operations Cloud, there was no solution in place. The L1 team was the one who checked the issues and called the developers, asking them if the error was related to them. This involved manually calling fifteen to twenty developers, which would take half an hour, and the issue would have persisted long enough, reducing the reliability of the site. Now it is automatic and very effective.

    What was our ROI?

    I have seen a return on investment; a lot of time has been saved. As I mentioned earlier, scheduling used to take a lot of manual effort. Sometimes two or more people would be assigned on-call by mistake, so it was not foolproof. Escalation was not possible at all, which put the L1 team under too much stress: they had to coordinate with many people and call them from their phones whenever they got an error. It was actually very bad. Now it is not that severe. PagerDuty escalates and calls the developers, and if the issue belongs to them, they join. It is much more efficient and much less stressful.

    Which other solutions did I evaluate?

    We were not involved in evaluating other options; I think the higher team decided to go with PagerDuty, and we are happy with it.

    What other advice do I have?

    I don't have anything to add about the features; we use this much, and it's great. We don't need anything more for now, and I can't think of anything to improve: we use PagerDuty Operations Cloud to set up on-call duty, and it works. I chose a rating of nine because there may be some improvements in the future. My advice to others looking into PagerDuty Operations Cloud is that the on-call duty feature and the setup of the on-call person are excellent. You can simply proceed with it, and even if teams are big, it will not feel annoying or overwhelming. Just set it up and forget it. It is very effective. You can adopt it if you don't have any special needs; it is widely accepted and effective. I gave this review a rating of nine out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Dec 5, 2025
    PeerSpot user
    Senior Software Engineer at LTM
    Real User
    Top 20
    Jun 14, 2026
    On-call alerts have ensured critical issues are addressed faster and teams focus on core work
    Pros and Cons
    • "PagerDuty Operations Cloud improved my team's ability to focus on core tasks rather than routine issues by having the notification feature that was very helpful to monitor and trigger high and critical issues directly to team members."
    • "There was agent alert fatigue with more granular root cause analysis that can be done."

    What is our primary use case?

    I usually use PagerDuty Operations Cloud for the notification of high-priority incidents within the infrastructure.

    I also use it for escalating to the on-call members, scheduling the priority of incidents or issues within the infrastructure, and creating scheduled rotations for team members.

    What is most valuable?

    The most valuable feature of PagerDuty Operations Cloud is that even though my device is on silent, it still rings and lets me know that something happened for the organization.

    On-call schedules for team members are very helpful to find out who is currently on call to get help with incidents or to get tickets routed to them. At the same time, it pushes me notifications, gives me a call on my mobile number, and triggers emails on my email address, so the multiple notification service of PagerDuty Operations Cloud is excellent.

    From a user perspective, the most valuable part of PagerDuty Operations Cloud is the notification feature that continuously contacts me until I acknowledge it. High and critical incidents are totally valuable for the organization because something is failing and I need to repair it on priority to not lose the business.

    PagerDuty Operations Cloud improved my team's ability to focus on core tasks rather than routine issues by having the notification feature that was very helpful to monitor and trigger high and critical issues directly to team members.

    What needs improvement?

    I am not using PagerDuty Operations Cloud's autonomous AI agents now because we have not gotten into that yet.

    I have not used generative AI yet.

    The integration with ServiceNow is very good: when I add notes there, it directly sends an email and also posts the notes on the ServiceNow tickets.

    PagerDuty Operations Cloud also provides me information about how many incidents with the same errors I have encountered, as it does have the analysis engine running with incoming tickets.

    There was alert fatigue among agents, and more granular root cause analysis could be done. Reducing false positive alerts and reporting the real number of issues would be beneficial.

    I believe there is always room for improvement, and since technology is changing day by day, I will rate PagerDuty Operations Cloud as a nine.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for two-plus years, and I am still actively using it.

    What do I think about the stability of the solution?

    PagerDuty Operations Cloud is stable, but I did have one issue where services were down for about ten to twelve minutes. I consider it highly stable and reliable overall.

    What do I think about the scalability of the solution?

    The scalability of PagerDuty Operations Cloud is good and I have not encountered any problems with it.

    How are customer service and support?

    I did not have to reach customer service because the product has been stable and reliable, and I can say it is really good.

    Which solution did I use previously and why did I switch?

    I found PagerDuty Operations Cloud to be more stable than other solutions, so I directly went with PagerDuty Operations Cloud.

    How was the initial setup?

    Another team integrated PagerDuty Operations Cloud into the system and set it up.

    We did refer to the PagerDuty Operations Cloud documents for setting up teams and creating schedules.

    What about the implementation team?

    Another team integrated PagerDuty Operations Cloud into the system and set it up.

    What was our ROI?

    PagerDuty Operations Cloud improved my team's ability to focus on core tasks rather than routine issues by having the notification feature that was very helpful to monitor and trigger high and critical issues directly to team members.

    Regarding cost saving, PagerDuty Operations Cloud provides the feature but is not really reducing the cost of other operations.

    What's my experience with pricing, setup cost, and licensing?

    I do not usually focus on pricing for PagerDuty Operations Cloud at the moment, but for smaller teams, I believe it is costlier, while for multi-million dollar companies, it is still affordable. For smaller teams who want to improve their operations, the cost is an issue.

    Which other solutions did I evaluate?

    I found PagerDuty Operations Cloud to be more stable than other solutions, so I directly went with PagerDuty Operations Cloud.

    What other advice do I have?

    I am satisfied with PagerDuty Operations Cloud and really appreciate the product, so I do not have any questions at the moment, but I do have interest in whether PagerDuty Operations Cloud has implemented agents to help with any issues that happen. I rate this product a nine overall.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: Jun 14, 2026
    Yash Dhawan - PeerSpot reviewer
    Tech Ops Engineer at Zeller
    Real User
    Top 5
    May 22, 2026
    Automated incident alerts have improved response time and keep critical issues under control
    Pros and Cons
    • "PagerDuty Operations Cloud is a very strong platform to improve the incident response time and operational efficiency through automation and intelligent alert management."
    • "The noise aspect of PagerDuty Operations Cloud could be better."

    What is our primary use case?

    The main purpose of PagerDuty Operations Cloud is to receive alarms for real incident production critical issues. Whenever an incident happens, we get an alarm call or a phone call on our phone or through an application call, so that we are aware of the situation and know that we have to be vigilant about it and resolve that particular issue. This is the main use case of PagerDuty Operations Cloud, and it is the use case where most of the company is using it.

    What is most valuable?

    PagerDuty Operations Cloud has improved the incident management and operational responses workflows. It has the capabilities of intelligent alerting and automated escalation policies, which help reduce manual intervention. The most useful aspects would be the noise reduction if that can be improved. Overall, PagerDuty Operations Cloud is a very strong platform to improve the incident response time and operational efficiency through automation and intelligent alert management.

    What needs improvement?

    The noise aspect of PagerDuty Operations Cloud could be better. What happens is if there is some sort of an issue occurring, it keeps on repeating and calling again and again. Once the alert is acknowledged, it should stop calling because it creates noise. When we are handling production critical issues, once we have acknowledged an alert, we are still receiving calls, which becomes a distraction because we are also checking an incident and trying to resolve that particular issue.
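    The behavior requested here, where repeat calls stop once an alert is acknowledged, can be sketched as a small piece of notifier state. This is only an illustration of the desired logic, not a description of how PagerDuty works or can be configured; the `Notifier` class, its methods, and the incident keys are hypothetical.

```python
class Notifier:
    """Toy notifier that stops paging once an incident is acknowledged."""

    def __init__(self):
        self.acknowledged = set()  # incident keys that someone has acknowledged
        self.calls = []            # record of (incident_key, responder) pages sent

    def acknowledge(self, incident_key):
        self.acknowledged.add(incident_key)

    def resolve(self, incident_key):
        # After resolution, a new occurrence of the same key may page again.
        self.acknowledged.discard(incident_key)

    def on_alert(self, incident_key, responder):
        """Page the responder unless the incident is already acknowledged.

        Returns True if a page was sent, False if it was suppressed.
        """
        if incident_key in self.acknowledged:
            return False
        self.calls.append((incident_key, responder))
        return True
```

    Keying suppression on the incident rather than on the individual alert is what keeps a repeating alert from interrupting the responder who is already working on it.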

    Generative AI does provide significant value to the incident management workflow in PagerDuty Operations Cloud. It helps with faster incident triaging, reducing the time spent understanding alerts and identifying the root cause. It helps up to a point, but when an issue falls outside the capabilities of that functionality, it does not help. However, if we have integrated our internal services with the system, it uses them to provide context and actionable information.

    For how long have I used the solution?

    I have been using PagerDuty Operations Cloud for four or five years.

    What do I think about the stability of the solution?

    The PagerDuty Operations Cloud application and website are operational all the time. There is no lag, and the reliability is very good. Even if we are not at the system, we get the call on time. There is never a lag in alerting. This is a product we can rely on and trust.

    What do I think about the scalability of the solution?

    Scalability for PagerDuty Operations Cloud is something which is very good. We can put unlimited numbers of alerts and unlimited numbers of things integrated with this system.

    How are customer service and support?

    I have never contacted the technical support team of PagerDuty Operations Cloud. The application has been reliable and operational all the time, so I never had to. If it were needed, I believe the support would be great, but so far it has not been needed.

    How was the initial setup?

    I was not involved in the initial deployment of PagerDuty Operations Cloud.

    What about the implementation team?

    I have not been involved with the implementation at PagerDuty Operations Cloud. The reliability team, like the SRE team, used to do it. I have not done that, but I have used it, so I can only give feedback about using it.

    Which other solutions did I evaluate?

    I have used OpsGenie once, which is an Atlassian product. It is similar, but its user interface is not as helpful as that of PagerDuty Operations Cloud.

    I am not aware of the pricing for PagerDuty Operations Cloud, but I believe it is quite good. The other product, OpsGenie, has an advantage because it is offered free of cost if you are using other products like Jira. I believe that is where the differentiation is happening.

    What other advice do I have?

    PagerDuty handles maintenance of PagerDuty Operations Cloud itself. I give PagerDuty Operations Cloud a nine out of ten because everything is pretty much perfect, including scalability, as we can scale to an unlimited number of things. Reliability is also something we can count on. Maintenance is solid, and the automation also performs very well. I deducted one point because of alert noise, which can be improved in the future. Overall, this application is available pretty much all the time, and it maintains operational efficiency very well. We are users of the PagerDuty Operations Cloud product.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    Last updated: May 22, 2026
    Aashish Bhandari - PeerSpot reviewer
    Dev Ops Engineer | Cloud Cost Optimization at HCLSoftware
    Real User
    Top 20
    Mar 6, 2026
    On-call automation has transformed alert handling and now creates a faster, competitive workflow
    Pros and Cons
    • "PagerDuty Operations Cloud changed the process and the flow in our team very smoothly."
    • "However, I hear from my manager that the pricing is very high for PagerDuty Operations Cloud, and only a few of us have the main business tier accounts."

    What is our primary use case?

    My use case for PagerDuty Operations Cloud comes from the SRE and DevOps team. We use PagerDuty Operations Cloud for specific alerting purposes and for our pipeline process. When we build a pipeline and it suddenly fails because of a job or another issue, we receive an error. We set up PagerDuty Operations Cloud with Datadog, the monitoring service we currently use. Datadog is connected to PagerDuty Operations Cloud, and whenever Datadog detects an alert, a spike, or anything critical, it triggers an alert in PagerDuty Operations Cloud, and we quickly get a notification. We also maintain our on-call shift rotation in it. For example, on Monday, Wednesday, and Friday, I work as the shift lead, and on Tuesday, Saturday, and Sunday, someone else is the shift lead. Regarding MTTR and similar statistics, we can see how many alerts we received and how many alerts we acknowledged this month, and we have a timeline as well. One of the valuable parts of PagerDuty Operations Cloud is that it lets our team have a competitive environment. For example, if I resolved the most alerts this month, someone else can do it next month, and whoever resolves the most critical alerts on time receives appreciation every month.
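
    As a rough illustration of the statistics mentioned above (this is not PagerDuty's own reporting), a minimal Python sketch can compute mean time to acknowledge, mean time to resolve, and the month's top resolver from a list of incidents. The records, field names, and responder names are hypothetical and do not reflect PagerDuty's schema.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical incident records; field names are illustrative only.
incidents = [
    {"resolver": "alice",
     "triggered": datetime(2026, 3, 1, 10, 0),
     "acknowledged": datetime(2026, 3, 1, 10, 4),
     "resolved": datetime(2026, 3, 1, 10, 30)},
    {"resolver": "bob",
     "triggered": datetime(2026, 3, 2, 12, 0),
     "acknowledged": datetime(2026, 3, 2, 12, 2),
     "resolved": datetime(2026, 3, 2, 12, 20)},
    {"resolver": "alice",
     "triggered": datetime(2026, 3, 3, 15, 0),
     "acknowledged": datetime(2026, 3, 3, 15, 6),
     "resolved": datetime(2026, 3, 3, 15, 40)},
]


def mean_delta(records, start_key, end_key):
    """Average elapsed time between two timestamps across all records."""
    deltas = [r[end_key] - r[start_key] for r in records]
    return sum(deltas, timedelta()) / len(deltas)


mtta = mean_delta(incidents, "triggered", "acknowledged")  # mean time to acknowledge
mttr = mean_delta(incidents, "triggered", "resolved")      # mean time to resolve

# Count resolved incidents per person to find the month's top resolver.
resolved_by = Counter(r["resolver"] for r in incidents)
top_resolver, top_count = resolved_by.most_common(1)[0]

print(f"MTTA: {mtta}, MTTR: {mttr}, top resolver: {top_resolver} ({top_count})")
```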

    What is most valuable?

    One feature of PagerDuty Operations Cloud that I find valuable is the on-call schedule. We can manage our on-call scheduling, and we have various alert and notification delivery methods available, including mobile. We can receive phone calls, emails, SMS, and push notifications. For example, if someone misses the notification, they get a phone call, which is very straightforward. We also have incident automation, which makes working with third-party monitoring services such as Datadog very straightforward. We can seamlessly automate things with PagerDuty Operations Cloud. The AI features are also beneficial; for example, noisy alerts that trigger regularly and false positive alerts get suppressed. It checks the past month's alerts and shows us, for example, that one alert triggered 60 percent of the time, another 20 percent of the time, and which alerts are rare. The escalation policy is excellent as well: if I do not pick up the call, my manager gets the call, and if my manager does not pick up, then his manager gets the call. These are some of the most valuable parts we use in PagerDuty Operations Cloud.
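
    The escalation chain described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how PagerDuty implements escalation policies; the `escalate` function, the level names, and the `answers` mapping are all hypothetical.

```python
def escalate(levels, answers):
    """Walk the escalation levels in order and return the first
    responder who picks up, or None if nobody answers.

    `levels` is an ordered list of responder names (hypothetical).
    `answers` maps a responder name to True if they pick up.
    """
    for responder in levels:
        if answers.get(responder, False):
            return responder
    return None


# Hypothetical three-level policy: engineer, then manager, then the manager's manager.
policy = ["on_call_engineer", "manager", "senior_manager"]

# The engineer misses the call, so the manager is reached next.
handled_by = escalate(policy, {"on_call_engineer": False, "manager": True})
print(handled_by)
```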

    In Datadog, we have multiple dashboards and monitors where we see our spikes and alerts. When we integrated with PagerDuty Operations Cloud, we got better signal and less noise. When we see a recurring spike, the AI feature in PagerDuty Operations Cloud has already marked that alert as noisy and suppresses it. This significantly improves our workflow with both Datadog and PagerDuty Operations Cloud. We have faster response and faster escalation. Previously, with Datadog alone, we did not get notifications, and people would refresh it and check for spikes every hour. Now that we have integrated PagerDuty Operations Cloud, whenever an alert triggers, we quickly get a notification or a phone call, so we do not sit in front of a computer and refresh repeatedly. Additionally, we have a centralized incident workflow: alerts from Datadog feed into the PagerDuty Operations Cloud incident timeline, so we see everything there. We do not need to open Datadog again and again, and if we need to dive deeper into an alert from Datadog, we can click the link inside PagerDuty Operations Cloud, which redirects us to the Datadog dashboard where everything is recorded and visible.
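
    For readers who want a picture of what a trigger with a dashboard link looks like, here is a hedged Python sketch that builds a trigger event in the general shape of the PagerDuty Events API v2. The Datadog integration sends such events itself, so this is not something we wrote; the routing key, summary, and URL are placeholders, and the field names should be checked against the official Events API documentation before use.

```python
import json


def build_trigger_event(routing_key, summary, source, severity, dashboard_url):
    """Build a trigger event dictionary (assumed Events API v2 shape).

    Nothing is sent over the network; this only constructs the payload.
    """
    return {
        "routing_key": routing_key,   # placeholder integration key
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,     # e.g. critical, error, warning, info
        },
        # Link back to the monitoring dashboard for deeper investigation.
        "links": [{"href": dashboard_url, "text": "Datadog dashboard"}],
    }


event = build_trigger_event(
    routing_key="EXAMPLE_ROUTING_KEY",
    summary="CPU spike on pipeline worker",
    source="datadog",
    severity="critical",
    dashboard_url="https://example.com/dashboard",
)
print(json.dumps(event, indent=2))
```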

    In PagerDuty Operations Cloud, AI-based alert suppression has helped streamline repetitive tasks. For example, very noisy alerts get suppressed automatically, which aids smarter routing. When new joiners come into our team, they see those alerts already suppressed, allowing them to focus on the critical ones instead of the lower-priority ones. Additionally, alert prioritization is present; we receive critical alerts, high alerts, and then low alerts. The faster prioritization facilitated by AI enhances our alert management processes. The historical root cause patterns also assist us; if we get an alert similar to one from last month, it tells us how we resolved that alert previously. Historical patterns using AI greatly aid us in alert management.

    What needs improvement?

    I have already used PagerDuty Operations Cloud, and my previous monitoring tools were very poor for alerting. I had a good impression of PagerDuty Operations Cloud, but I believe it can improve with deeper root cause insights. I know there is automation to detect recent deployments causing incidents, but a deeper root cause analysis could provide more details. If PagerDuty Operations Cloud offers more information, we will not need to jump into the main dashboards where the alert triggered. For instance, if we get more insights directly in PagerDuty Operations Cloud, we would not need to check the Datadog dashboard. Additionally, I think a sandbox mode would be helpful for new team members, allowing us to guide them in simulating alerts, performing escalation policies, and creating PagerDuty Operations Cloud channels.

    For how long have I used the solution?

    I have been working with PagerDuty Operations Cloud for five years. I worked on two different projects, and in both projects, we use PagerDuty Operations Cloud.

    What do I think about the stability of the solution?

    In my previous project, we utilized the flexible incident command system to coordinate large-scale incidents, but in my current project with only Datadog, we have not received many alerts or incidents in the last couple of days.

    How are customer service and support?

    I do not have direct contact with PagerDuty Operations Cloud tech support or customer service teams, but my senior team members have connected with them when we received an alert related to our team failing to set it up properly. The customer support team promptly gave us insight and helped us within 24 hours.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    I am currently working with PagerDuty Operations Cloud. On my previous project, we were on BigPanda, but we faced multiple issues with it. At that time, BigPanda had no on-call schedule feature and no alert trigger feature. We then moved to PagerDuty Operations Cloud, and suddenly everything was smooth. We got a phone app as well, and we set up PagerDuty Operations Cloud on our phones. Whenever any alert triggered for us, we would quickly check from our phone to see if it was a false positive, a true P1 or P2 alert, a major alert, or a critical alert. We would then quickly jump into the alert and work on it. PagerDuty Operations Cloud changed the process and the flow in our team very smoothly.

    How was the initial setup?

    I found the initial setup of PagerDuty Operations Cloud straightforward; I did not face any complexities during the setup for alerts or during the initial configuration.

    What's my experience with pricing, setup cost, and licensing?

    Regarding pricing for PagerDuty Operations Cloud, I am currently a software engineer and a senior software engineer, so I do not handle the pricing aspect. However, I hear from my manager that the pricing is very high for PagerDuty Operations Cloud, and only a few of us have the main business tier accounts. Many of us have low tier accounts that restrict us to acknowledging and viewing alerts, while a few have the ability to create and trigger alerts. Therefore, I do not think much about pricing, but I do believe it is somewhat high. However, I think this is valid because PagerDuty Operations Cloud provides a vast amount of benefits compared to other alerting systems.

    Which other solutions did I evaluate?

    Regarding the key differences, pros and cons of PagerDuty Operations Cloud compared to competitors, some pros include alert grouping, AI functionality, and the ability to easily integrate with Slack for quicker resolution. Additionally, we receive phone notifications and push notifications, which many of the other competitors do not provide. The pricing of PagerDuty Operations Cloud is also reasonable for the functionalities it offers compared to its competitors. These are some benefits I see in PagerDuty Operations Cloud, including helpful alert insights and direct links to dashboards we have integrated, such as Datadog and Grafana, which allow us to resolve issues quickly.

    What other advice do I have?

    The recommendation I share, based on my experience with PagerDuty Operations Cloud, is that it is one of the best platforms for synchronizing with your monitoring tools. It will improve your flow, and your team will definitely benefit from PagerDuty Operations Cloud compared to other competitors, as it offers numerous advantages. I give this review a rating of ten out of ten.

    Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
    Last updated: Mar 6, 2026
    Buyer's Guide
    Download our free PagerDuty Operations Cloud Report and get advice and tips from experienced pros sharing their opinions.
    Updated: June 2026