Binaya Moharana - PeerSpot reviewer
System Administrator And Application Support at ICE
MSP
Top 20
Apr 1, 2026
Automated incident workflows have reduced downtime and improved real-time on-call response
Pros and Cons
  • "PagerDuty Operations Cloud is the best tool available, and I can confidently say it is the best tool for all aspects, not just incident management or escalation, but for all analytical functions as well."
  • "I have observed that MTTR is very slow, and wrong escalation sometimes routes alerts to the wrong team rather than the proper team."

What is our primary use case?

PagerDuty Operations Cloud is used for production incident management, automating incident creation when alerts come in from our tools. When a critical issue triggers an alert and notification, we configure the handling accordingly. On-call scheduling, escalation policies, and tracking MTTR improvement are the areas where we interact and integrate with the tools. This approach reduces downtime, enables faster incident response, provides clear accountability, and improves reliability.

We support different clients including JPMorgan, Wells Fargo, PL, LPL, and Bank of America, though we are not a customer, partner, or reseller. More than 300 clients use the solution. In my company, we have 21 specialists working with PagerDuty Operations Cloud.

What is most valuable?

The best features of PagerDuty Operations Cloud are intelligent alerting, the code fixture, on-call scheduling, escalation capabilities, automation, runbooks, and integration. PagerDuty Operations Cloud has reduced downtime, in most cases by 30 to 50%.

What needs improvement?

We are not using the autonomous AI agent in PagerDuty Operations Cloud, and we have not integrated with AI Ops, which uses machine learning to group similar alerts automatically and suggest root cause analysis from past incidents. However, we are planning to implement this functionality.

Improvement in PagerDuty Operations Cloud should focus on areas where we need to reduce alert noise by filtering unnecessary alerts. PagerDuty should send actionable alerts, and grouping and suppression should be managed from PagerDuty's side.

Alert noise and grouping do not work together seamlessly in PagerDuty Operations Cloud, but they should be consolidated. Alerts should be directed to the right person with real-time notification to ensure no critical issue is missed. Currently, when we receive alerts through call, SMS, or email, some users do not receive them, and the end client providing support to their clients may miss something important. This is the most critical feature that PagerDuty should improve.

Defining on-call scheduling and escalation by groups is also necessary. The duty roster should clearly indicate which alerts go to L1, L2, or L3 level support. I have observed that automatic escalation does not happen when nobody has responded to an alert.

While I did not deploy PagerDuty Operations Cloud, I performed the migration and reconfigured PagerDuty from scratch, then migrated to ServiceNow where I handled redeployment. I have observed that MTTR is very slow, and wrong escalation sometimes routes alerts to the wrong team rather than the proper team. On-call schedules should map the different teams we have, such as the application team, infrastructure team, or database team, who will take ownership. They need to align to the same team only for that incident. Proper service configuration for each application is required.
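The ownership mapping described above can be sketched roughly as follows. This is only an illustration in Python, not PagerDuty configuration: the service names, the team names, and the `triage` fallback are all assumptions made for the example.

```python
# Hypothetical service-to-team ownership map; every name here is illustrative.
SERVICE_OWNERS = {
    "payments-api": "application",
    "core-network": "infrastructure",
    "orders-db": "database",
}

DEFAULT_TEAM = "triage"  # assumed catch-all for services with no configured owner


def route_alert(alert: dict) -> str:
    """Return the team that should own the alert, based on its service."""
    return SERVICE_OWNERS.get(alert.get("service"), DEFAULT_TEAM)


print(route_alert({"service": "orders-db", "summary": "replication lag"}))
print(route_alert({"service": "unknown-svc", "summary": "disk full"}))
```

An alert for a service with no entry falls through to the catch-all team, which is the case the review describes as alerts reaching the wrong team when per-application service configuration is missing.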

For how long have I used the solution?

I have three or more years of experience with PagerDuty Operations Cloud.

Buyer's Guide
PagerDuty Operations Cloud
June 2026
Learn what your peers think about PagerDuty Operations Cloud. Get advice and tips from experienced pros sharing their opinions. Updated: June 2026.
900,747 professionals have used our research since 2012.

What do I think about the stability of the solution?

The stability of PagerDuty Operations Cloud is good. I have worked with Opsgenie, and PagerDuty is better than both Opsgenie and ServiceNow. I would rate stability at least eight.

What do I think about the scalability of the solution?

The scalability of PagerDuty Operations Cloud is also good.

How are customer service and support?

I would rate PagerDuty Operations Cloud's technical support between 7.5 and eight.

Which solution did I use previously and why did I switch?

Previously, we were using PagerDuty Operations Cloud on-premises, and now we are planning to implement a hybrid approach with both cloud and on-premises solutions. I have worked with Opsgenie, and PagerDuty is better than Opsgenie and ServiceNow as well.

How was the initial setup?

Alert reduction in PagerDuty Operations Cloud does not help prevent costly incidents.

What other advice do I have?

PagerDuty Operations Cloud is the best tool available. I have never worked with other solutions, but whenever I have had the chance to work with PagerDuty Operations Cloud, I have had nothing negative to say. I can confidently say it is the best tool for all aspects, not just incident management or escalation, but for all analytical functions as well. This is the best tool in the market.

Regarding maintenance, I have not observed that part of PagerDuty Operations Cloud directly. The on-site USA team is located in Jacksonville, and some team members might be handling maintenance from their end, but I am not certain about their specific involvement.

I would recommend PagerDuty Operations Cloud to other users because it provides the best tool for instant alerting, ensuring the right person responds, automatic escalation, and reducing downtime and alert noise. It integrates with different monitoring tools including DataDog, New Relic, Prometheus, and Grafana. The main strengths are real-time alerting, proper escalation, and faster incident response, which help reduce downtime and improve MTTR.

I have experience with MTTR in PagerDuty Operations Cloud. Faster detection and alerting reduce MTTR significantly. When PagerDuty is integrated with other monitoring systems, alerts are real-time and actionable.

The end-to-end flow of PagerDuty Operations Cloud, including on-call schedules, escalation policies, and services, is excellent. It is highly reliable, offers easy escalation setup, and has strong interaction capabilities that reduce manual effort. However, the cost might be high.

Disclosure: My company has a business relationship with this vendor other than being a customer: partner
Last updated: Apr 1, 2026
Yarasi Harshavardhan Reddy - PeerSpot reviewer
Operations Lead at a tech vendor with 10,001+ employees
Real User
Top 20
Jun 13, 2026
Intelligent alerts have protected revenue and now drive faster incident triage with AI guidance
Pros and Cons
  • "Since PagerDuty Operations Cloud has all the data and provides forward-looking resolution steps and information about which team was involved, PagerDuty AI helps us tremendously."
  • "While PagerDuty has comment functionality, a chat option would be a potential addition."

What is our primary use case?

I have been using PagerDuty for the last nine years, but PagerDuty Operations Cloud for over one and a half years.

We work directly with merchants and need to trigger immediate alerts whenever there are 5xx errors, business errors such as 4xx issues, or payment failures. We have configured every alert in Datadog and other monitoring tools that are integrated with PagerDuty. We receive alerts immediately, and they trigger calls and Slack notifications. Because everything is integrated with PagerDuty, we are notified instantly and can then start our triage process.

One use case I can mention is when we have an auth rate dip. Whenever there is an auth rate dip, we run into revenue losses with the merchants or partners that PayPal currently works with. Since everything is integrated, PagerDuty Operations Cloud catches when there is an auth rate dip for particular merchants and immediately triggers a notification for us. We then immediately dive into what the problem is and figure out how to fix the issue with the help of engineering teams.

What is most valuable?

PagerDuty Operations Cloud is one of the best tools we have seen because it is already integrated with AI. We use it as a barrier tool, meaning it is the top tool that we consider and we get notified when there is an issue.

The best features include integrating with any tool and analyzing all previous alerts that have been stored. When an alert occurred on a particular day, we can immediately be notified on Slack with historical data and, since it is integrated with AI, we receive suggestions on how it can be resolved, how it was resolved earlier, and who resolved it. These are the very best features we have seen on PagerDuty Operations Cloud.

Since we have historical data showing when an alert has triggered on a particular day, we can turn it into a problem incident and work with the relevant teams to get it fixed completely so it does not reoccur. We are recording these kinds of repetitive issues using that feature.

It is very helpful that we can integrate with numerous monitoring tools such as Datadog, Splunk, and Kibana. Since we have integrated many other tools, I feel this is one of the features that PagerDuty Operations Cloud offers that makes it great.

What needs improvement?

Since PagerDuty Operations Cloud is already equipped with the latest technologies, including AI and content summarization, I do not feel that anything more needs to be added. I do not have a concrete answer right now, since the product already offers a number of features and is in a highly improved state.

While PagerDuty has comment functionality, a chat option would be a potential addition.

For how long have I used the solution?

I have been using PagerDuty for the last nine years, but PagerDuty Operations Cloud for over one and a half years.

What do I think about the stability of the solution?

PagerDuty Operations Cloud is highly accurate and there are no issues with the accuracy. It is highly reliable in terms of alert triggering and we do not get any false alarms, with only very minimal ones based on our internal signals. We do not have any complaints about PagerDuty Operations Cloud.

What do I think about the scalability of the solution?

PagerDuty Operations Cloud definitely increases efficiency for us. Since we do not have much manual work with workflows and everything is automated, it definitely helps.

Which solution did I use previously and why did I switch?

We are using only PagerDuty and do not have any other tool in use. There is no other tool that can match PagerDuty Operations Cloud.

What was our ROI?

We definitely see an ROI: earlier, this work required multiple employees. Since we are now using AI, we have reduced our staffing needs and save a lot of time and money as well.

Which other solutions did I evaluate?

There is no other tool that can match PagerDuty Operations Cloud.

What other advice do I have?

Earlier, PagerDuty Operations Cloud only notified us of incidents. Now it also shows historical data, so we can see how an issue was resolved before and quickly use those notes and suggestions to resolve it again.

Earlier, when there was an auth rate dip or different signals that we received through Datadog or different platforms, we used to have some false alarms. Now, everything we are using is AI-based with agents that were configured with those signals. We have very accurately configured the AI using factors such as holiday seasons that will have high traffic, and everything was configured with historical data. We are getting very solid results and signals.

Since PagerDuty Operations Cloud has all the data and provides forward-looking resolution steps and information about which team was involved, PagerDuty AI helps us tremendously.

We definitely do not have any revenue loss since we are getting accurate signals and alerts and have a solution for all configured alerts.

Since it has all advanced features integrated with AI, I am really impressed with the ability to integrate with numerous monitoring tools very easily and the ease of onboarding any member to PagerDuty Operations Cloud. Setting up the alerts and everything is very easy with a number of monitoring tools. That is why I rated this product a nine out of ten. There is no other tool that can match PagerDuty Operations Cloud right now.

We have a number of layers in terms of governance and security since we are a payment gateway. PagerDuty Operations Cloud has its own governance and security at a great level, so we do not need to think about any security concerns from PagerDuty Operations Cloud governance.

Since it already has AI features, I would recommend PagerDuty Operations Cloud to others. I rate this solution a nine out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Jun 13, 2026
SusangRamesh - PeerSpot reviewer
Product Marketing Manager at OJCommerce, LLC
Real User
Top 5 Leaderboard
May 4, 2026
Automated incident alerts have boosted checkout reliability and increased online sales
Pros and Cons
  • "With the help of PagerDuty Operations Cloud, we have been able to resolve a lot of issues much more quickly, and our overall sales have gone up 30% since the introduction of PagerDuty Operations Cloud, which is a major advantage for us."
  • "Everything is fine now with PagerDuty Operations Cloud, but one thing that they can do to improve is bring more integrations."

What is our primary use case?

We have been using PagerDuty Operations Cloud for about 1.5 to two years now.

Our main use case with PagerDuty Operations Cloud is to create automation workflows with the Gen AI capabilities present in PagerDuty Operations Cloud. We run it on our e-commerce platform to detect anomalies whenever they occur.

We want to firefight different situations. With the Gen AI capabilities of PagerDuty Operations Cloud, whenever an issue arises during a customer's checkout, it immediately informs us, and we create a Teams channel or a Slack channel to determine what the issue is and fix it immediately. It sends us alerts whenever necessary to keep the customer journey smooth and comfortable.

We particularly use this to avoid incidents that customers might face, such as payment database issues, checkout issues, or products not being added to the cart. With the Gen AI provided by PagerDuty Operations Cloud, we are able to sort out everything, get timely notifications, and make the customer's journey much smoother.

What is most valuable?

One of the best features is the notification categorization that we can do with PagerDuty Operations Cloud, which is the incident type categorization. We can select whether it is a major incident or a minor incident and based on what we select, a dedicated Slack channel or a dedicated Teams channel is created, which is much more helpful for us to diagnose the issue.

With the incident type categorization, we are able to prioritize which issues to sort out first and which issues to sort out later. This has helped us firefight the major issues on a first come first serve basis, so categorization helps us work more efficiently.

With the help of PagerDuty Operations Cloud, we have been able to resolve a lot of issues much more quickly. Our overall sales have gone up 30% since the introduction of PagerDuty Operations Cloud, which is a major advantage for us.

This was mainly due to reduced downtime and faster response, which made customers trust us more and made their entire user journey much smoother. With reduced downtime, the customer's journey within our e-commerce platform has been very quick, and this has directly helped us gain more sales.

What needs improvement?

Everything is fine now with PagerDuty Operations Cloud, but one thing that they can do to improve is bring more integrations. As of now, only Slack and Teams integrations are available for firefighting, and whenever an issue arises, a notification is provided only on these platforms. Other channels could also be considered for integration to make the work much smoother.

For how long have I used the solution?

We have been using PagerDuty Operations Cloud for about 1.5 to two years now.

What do I think about the stability of the solution?

PagerDuty Operations Cloud is very stable. It is a cloud platform, so there is no downtime, and with AWS hosting it, we find it very stable. The updates are also quite regular, which we appreciate very much.

What do I think about the scalability of the solution?

PagerDuty Operations Cloud is very easy to scale, given the Gen AI diagnostics and Gen AI metrics present, and automation is also there, which is quite helpful in scaling up the entire platform and the entire business journey for us as well.

How are customer service and support?

Customer support is excellent. We would rate it 10 out of 10 because the staff are very knowledgeable and we enjoy engaging with them.

Which solution did I use previously and why did I switch?

We did not use any prior solution; we were doing this manually, and PagerDuty Operations Cloud is one of the first software tools that we have used.

What was our ROI?

We have seen a positive return on investment with PagerDuty Operations Cloud. Our overall sales have increased by 30%, and our sales cycle time has also dropped considerably: we have seen a 50% improvement in our sales cycle.

Which other solutions did I evaluate?

We did not evaluate any other options. We just saw the PagerDuty Operations Cloud demo and we were impressed with it, and we went ahead with it as it was much more affordable and it solved our issues.

What other advice do I have?

The incident alert feature of PagerDuty Operations Cloud has helped us prevent many issues before they occur and has stopped us from repeating the same mistakes. As a result, we now make new mistakes instead, which is much better than making the same mistakes again and again. This has helped us grow our business more efficiently and more quickly than we expected.

We would definitely recommend that others at least take the trial version of PagerDuty Operations Cloud, because every e-commerce business, or any business, should deploy an AI as powerful as PagerDuty Operations Cloud to reduce the number of errors and improve overall business efficiency. We urge others to at least try the trial version. We give this solution an overall rating of 10 out of 10.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: May 4, 2026
Navnath Solanke - PeerSpot reviewer
Senior Executive at a consultancy with 11-50 employees
Real User
Top 5 Leaderboard
Apr 27, 2026
Proactive alerts and clear incident documentation have improved our outage response times
Pros and Cons
  • "PagerDuty Operations Cloud helps us monitor systems effectively, and we have not had any escalations to date."
  • "It needs to be integrated with some other applications. I expect it to be one platform for all operations; it should not depend upon Splunk, DataDog, or other applications or tools."

What is our primary use case?

I handle level two operations. Whenever a major incident occurs or there is an outage, I am informed first via Splunk, DataDog, and PagerDuty. PagerDuty Operations Cloud is used for alerting, and we have configured threshold values within it. I have the mobile application installed on my phone, so I receive information about any outage as soon as it occurs.

I work at Vodafone Intelligence Services, which is a subsidiary of Vodafone. We are a UK-based company that performs level one, level two, and level three operations for all European countries and some countries in India, including South Africa, Ghana, Spain, Egypt, Hungary, and the UK. These are our major customers. As part of operations, we have a team of about 15,000 people who manage the different markets and customers. PagerDuty Operations Cloud is used everywhere across our organization, along with Splunk and AppDynamics.

What is most valuable?

PagerDuty Operations Cloud is used for monitoring, and we upload detailed documentation for major incidents such as P2 or P1 severity. We prepare documentation about the incident, including what caused it, what the resolution time was, and what the impact was, which we then put on PagerDuty Operations Cloud. Apart from this, we do not use it for anything else; it is used exclusively for monitoring and setting up alerts.

We receive many benefits as part of L1 or L2 operations running 24 hours a day. As soon as there is an issue, if I am the first point of contact and I do not receive the call, it goes to the second person, my line manager. If my line manager does not pick up the phone, it goes to the third person, the skip-level manager. This is beneficial for us; even if it is a minor outage lasting 5 or 10 minutes, we receive an alert about it. If there is a major incident, we still receive the alert. Even if we are away from the system and not actively monitoring, we get the alert as soon as there is an outage.
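The three-step call chain described here can be illustrated with a small Python sketch. It is not how PagerDuty implements escalation; the responder names are placeholders, and the `acknowledges` callback is a hypothetical stand-in for "page this person and wait for the timeout".

```python
from typing import Callable, Optional

# Assumed three-level chain mirroring the review; the names are placeholders.
ESCALATION_CHAIN = ["first_contact", "line_manager", "skip_level_manager"]


def escalate(chain: list[str], acknowledges: Callable[[str], bool]) -> Optional[str]:
    """Notify each level in order and stop at the first acknowledgement.

    `acknowledges` returns True if that responder picked up the call.
    """
    for responder in chain:
        if acknowledges(responder):
            return responder
    return None  # nobody answered; the caller decides what happens next


# Example: the first contact misses the call and the line manager answers.
answered = {"line_manager"}
print(escalate(ESCALATION_CHAIN, lambda who: who in answered))
```

Returning `None` when the whole chain is exhausted keeps the "nobody responded" case explicit, so it cannot be mistaken for a successful handoff.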

We have the TIBCO integration layer, which is integrated with DataDog, and DataDog is integrated with PagerDuty Operations Cloud. When we ask PagerDuty Operations Cloud how many incidents are recurring with a specific service, it provides historical data showing how many times that service was down.

What needs improvement?

I do not see any improvements needed in how I use PagerDuty Operations Cloud; it is still good. We receive phone calls and emails, but the use case is limited. It needs to be integrated with some other applications. I expect it to be one platform for all operations; it should not depend upon Splunk, DataDog, or other applications or tools. Everything should be in one place to make things easier and reduce complexity. Otherwise, we have to manage different tools. I expect monitoring tools to be consolidated together for better results and less complexity.

For how long have I used the solution?

I have been using PagerDuty Operations Cloud for almost three and a half years now.

What do I think about the stability of the solution?

I have not seen any crashes in PagerDuty Operations Cloud; it is a good tool. The user interface is really what I appreciate most. It is not a tedious task to spend time on PagerDuty Operations Cloud. The smoothness, availability, and user interface are very friendly. I have used other tools like DataDog, which is a little more complex, but PagerDuty Operations Cloud is a good tool with a friendly UI.

What do I think about the scalability of the solution?

PagerDuty Operations Cloud will grow; we do not have any concerns for this product. We need to put the system together for alerts, and it is good that PagerDuty Operations Cloud has the availability and will definitely grow over time.

How are customer service and support?

For technical support, we raise tickets most of the time and do not get in touch with them directly. However, we receive resolutions in a timely manner. The technical support team has expertise and answers our questions on the first attempt, keeping interactions short and simple, which makes a huge impact. If they sent us back and forth, it would make for lengthy discussions. When we raise an issue with PagerDuty Operations Cloud technical team, they respond effectively and keep it concise. We do not have to raise multiple tickets for the same issue; it is the best experience we have had.

Which solution did I use previously and why did I switch?

Since I joined the organization, I have only three and a half years of experience, and from day one, I have used PagerDuty Operations Cloud. I am not sure how the team handled previous incidents before. I believe the organization has been associated with PagerDuty Operations Cloud for a longer period of time. I do not remember how the team managed incidents prior, but PagerDuty Operations Cloud helps us monitor systems effectively, and we have not had any escalations to date. We handle outages within 10 to 15 minutes. Customers may panic, but major escalations are managed effectively.

What other advice do I have?

PagerDuty Operations Cloud is used for monitoring, and detailed documentation is uploaded if there is a major incident such as P2 or P1 severity. Documentation about the incident is prepared, including what caused it, what the resolution time was, and what the impact was, and is then put on PagerDuty Operations Cloud. Apart from this, it is not used for anything else; it is used exclusively for monitoring and setting up alerts. I would rate this product 8 out of 10.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Apr 27, 2026
Patel Dhulva - PeerSpot reviewer
Software Firmware Engineer at Kohler Co.
Real User
Top 5
May 30, 2026
AI-driven incident management has reduced downtime and improves focus on strategic work
Pros and Cons
  • "PagerDuty Operations Cloud has positively impacted my organization by enabling faster issue response, which helped reduce downtime, saved revenue by avoiding long outages, improved team accountability during incidents, reduced manual effort in handling alerts, and helped maintain a better customer experience."
  • "PagerDuty Operations Cloud needs improvements because sometimes integrations are not very seamless and misbehave."

What is our primary use case?

PagerDuty Operations Cloud is a multifunctional digital operations platform that meets my organization's needs.

I am impressed by this digital operations solution because it is the most appropriate tool for incident detection and alerting.

PagerDuty Operations Cloud is a very user-friendly tool, highly accurate, and an easy-to-customize digital operations management system that suits my organization's needs.

It has intelligent noise reduction capabilities that play a significant role in minimizing alert floods.

What is most valuable?

PagerDuty Operations Cloud offers top-tier features that enable real-time alerting and accelerate incident response.

The solution is reliable and effective when it comes to automating routine diagnostic tasks.

The real-time alerting and automation features have helped my team: problem-solving became automatic, and incident management became less complex.

PagerDuty Operations Cloud has positively impacted my organization by enabling faster issue response, which helped reduce downtime, saved revenue by avoiding long outages, improved team accountability during incidents, reduced manual effort in handling alerts, and helped maintain a better customer experience.

The solution's alert reduction feature has had a major impact on preventing costly incidents in my organization. By grouping related alerts and de-duplicating noise, my team was able to spot real issues faster instead of getting buried in alerts, helping us prevent two to three potential outages because engineers responded to the root alert instead of missing it in noise.
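As a rough illustration of what grouping related alerts and de-duplicating noise means in practice, here is a minimal Python sketch. It is not PagerDuty's algorithm: the `group_key` and `summary` fields are hypothetical, and real grouping is likely far more sophisticated than an exact-match comparison.

```python
from collections import defaultdict


def group_alerts(alerts: list[dict]) -> dict[str, list[dict]]:
    """Group alerts by a shared key and drop exact repeats within each group."""
    groups: dict[str, list[dict]] = defaultdict(list)
    seen: set[tuple[str, str]] = set()
    for alert in alerts:
        key = alert["group_key"]  # hypothetical field naming the suspected root issue
        fingerprint = (key, alert["summary"])
        if fingerprint in seen:
            continue  # duplicate notification for the same symptom
        seen.add(fingerprint)
        groups[key].append(alert)
    return dict(groups)


alerts = [
    {"group_key": "db-primary", "summary": "connection refused"},
    {"group_key": "db-primary", "summary": "connection refused"},
    {"group_key": "db-primary", "summary": "replication lag"},
    {"group_key": "cdn", "summary": "5xx spike"},
]
for key, members in group_alerts(alerts).items():
    print(key, len(members))
```

With the sample data, four raw notifications collapse into two groups, so an engineer sees one entry per root issue rather than every repeat.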

What needs improvement?

The user interface should be easier to customize and use.

The pricing could be less expensive, especially for smaller organizations.

The user interface could be made easier to customize and navigate so that users who are new to this platform find the learning curve smoother.

PagerDuty Operations Cloud needs improvements because sometimes integrations are not very seamless and misbehave.

For how long have I used the solution?

I have been using PagerDuty Operations Cloud for about one year and a few months.

What other advice do I have?

PagerDuty Operations Cloud is a great operational efficiency tool, not just for paging.

It is very cost-effective, especially for organizations that are not limited by budgets.

PagerDuty Operations Cloud solves a lot of problems.

For example, if any issue arises during our online exam with our client, then PagerDuty Operations Cloud alerts the right team and the right people, and tasks are assigned so those problems can be resolved at the correct time and our real task does not get disrupted.

PagerDuty Operations Cloud's AI functionality has improved my team's ability to focus on core tasks rather than routine issues by removing routine alert triage.

The AI groups and de-duplicates alerts automatically, so our engineers are not manually sorting through twenty duplicate notifications for one root issue, allowing them to save a lot of time and focus on other strategic tasks, which improves productivity in my organization.

We are using PagerDuty Operations Cloud's autonomous AI agents for low-severity incidents, which automatically triage, correlate, and resolve known issues without human intervention, such as restarting services or acknowledging flapping alerts.

This has contributed to efficiency by cutting manual workload by thirty-five percent and also reducing MTTR for routine incidents.

PagerDuty Operations Cloud's generative AI is effective at providing insights for decision-making during incidents.

The AI provides clear insights through incident summaries and what-changed analysis, helping us decide where to start troubleshooting instead of guessing, enabling us to make data-driven decisions easily, and providing actionable insights that improve response decisions.

PagerDuty Operations Cloud's embedded AI has had a positive impact on revenue protection: by reducing alert fatigue, it lowers downtime risk and the operational cost per incident.

I would rate this solution nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: May 30, 2026
Navin Samuel - PeerSpot reviewer
Delivery Manager at Cognizant
Real User
Top 20
May 25, 2026
Modern alert automation has boosted NOC efficiency and now streamlines on-call workflows
Pros and Cons
  • "Overall, it was a very effective product and really helped the productivity of the teams."
  • "Initially it was a nightmare. There were a lot of bugs and we were getting a lot of alerts, and it was a very messy period."

What is our primary use case?

My network operations center (NOC) team monitors network alerts and digital alerts and pages on-call engineers. My team used this tool to support the customer Nike, and I led the NOC operations at Cognizant. We had various sources such as Splunk, New Relic, VeloCloud, and SolarWinds triggering alerts for device failures or high CPU utilization. PagerDuty Operations Cloud correlates events from these sources and triggers alerts to the respective teams based on the orchestration set for the particular event. My team handles alerts through automated processes or manual intervention and resolves incidents. They also use it to page on-call engineers through SMS or email.

When we receive alerts that do not need manual intervention for 15 to 20 minutes, we have automation in place to monitor the alert and resolve it if the issue remediates itself. We also had automated functions to check the reachability of devices, using on-demand ping commands in PagerDuty Operations Cloud. Event correlation helps to action bulk alerts in a single incident. For example, if one location is triggering alerts for multiple devices, all these alerts are correlated into a single incident, and the engineer actions and resolves them together.
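The location-based correlation described above can be sketched roughly as follows. This is only an illustration of the idea, not PagerDuty's actual correlation logic; the `Alert` type, its fields, and `correlate_by_location` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape; real alerts carry many more fields.
    source: str      # monitoring tool that raised the alert
    location: str    # site the device belongs to
    device: str
    summary: str


def correlate_by_location(alerts):
    """Group alerts so each location yields one incident holding all its alerts."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert.location].append(alert)
    return dict(incidents)


alerts = [
    Alert("SolarWinds", "site-a", "router-1", "Device down"),
    Alert("Splunk", "site-a", "switch-3", "High CPU utilization"),
    Alert("New Relic", "site-b", "fw-2", "Device down"),
]
incidents = correlate_by_location(alerts)
```

Here the two `site-a` alerts land in one incident, so an engineer would action them together instead of handling each device separately.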

We were getting a lot of alerts on a daily basis that do not require immediate intervention. We automated those alerts to be in a different queue where PagerDuty Operations Cloud itself monitors and resolves the alert. We also implemented agentic AI use cases for device interface alerts where the alerts are triggered and not immediately actioned by humans. PagerDuty Operations Cloud monitors and performs certain actions using agentic AI. If agentic AI is not able to clear the alerts, it creates an incident and ticket to the respective engineering team. If it is able to clear the issue, it automatically resolves the alert. This automation helped in reducing a lot of noisy alerts without manual intervention, and without it, a lot of manual intervention would have been required for engineers to perform basic triage.
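The hold-then-resolve-or-escalate flow described above could look something like the sketch below. The 20-minute grace period is an assumption taken from the "15 to 20 minutes" mentioned earlier, and `decide` and its return values are hypothetical names, not PagerDuty settings.

```python
from datetime import datetime, timedelta

# Assumed grace period; the review mentions 15 to 20 minutes.
GRACE_PERIOD = timedelta(minutes=20)


def decide(triggered_at, cleared_at, now, grace=GRACE_PERIOD):
    """Return the action for a low-priority alert held in the monitoring queue."""
    deadline = triggered_at + grace
    if cleared_at is not None and cleared_at <= deadline:
        return "auto_resolve"      # issue cleared on its own within the window
    if now < deadline:
        return "keep_monitoring"   # still inside the window, no human needed yet
    return "create_incident"       # window expired, hand over to engineering


t0 = datetime(2026, 1, 1, 9, 0)
cleared = decide(t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=12))
pending = decide(t0, None, t0 + timedelta(minutes=5))
expired = decide(t0, None, t0 + timedelta(minutes=25))
```

Only the third case would reach an engineer, which is how this kind of queue removes basic triage work.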

What is most valuable?

PagerDuty Operations Cloud definitely helped us. In terms of governance, it had very good reports regarding the alerts we were receiving and calculating the times and effort. Overall, it was a very effective product and really helped the productivity of the teams.

PagerDuty Operations Cloud has correlation features which help us when we get a lot of alerts from one particular location. Based on the correlation values set for the alerts, it automatically correlates all the incidents. It also attaches the KB articles with troubleshooting steps.

What needs improvement?

Our PagerDuty Operations Cloud deployment was at an early stage and we were not utilizing the GenAI features. I used it for nearly two years, after we migrated from a different tool to PagerDuty Operations Cloud. Many of these features were in the development stage, so I cannot comment on its capability when it comes to GenAI.

We were not fully utilizing PagerDuty Operations Cloud capabilities as our environment was different. We were just migrating to PagerDuty Operations Cloud and it had been nearly two years. The first year went fully into migrating the product to the environment and fixing bugs, and the next year went into development and stabilizing the product in the customer environment. By the time we were planning to use AI and all the advanced features, I left the project for a different engagement and we do not use PagerDuty Operations Cloud in this current environment.

Initially it was a nightmare. There were a lot of bugs and we were getting a lot of alerts, and it was a very messy period. We had to manually keep track of all the alerts and share the feedback to get it sorted out. The user interface that was provided in the initial stages was not friendly, and it was very difficult to manage the alerts. In later stages, based on feedback, they improved the customization options and user interface, so initially it was not good.

For how long have I used the solution?

I was using it for nearly two years.

What do I think about the stability of the solution?

PagerDuty Operations Cloud is reliable. We have faced only one or two outages in a span of two years and have not had a lot of outages that disrupt operations. I will not say it is 100% perfect, as we do run into bugs and disruptions. Overall, PagerDuty Operations Cloud is a reliable product.

What do I think about the scalability of the solution?

PagerDuty Operations Cloud is highly scalable as it can be integrated with multiple platforms and multiple ticketing tools, and with automation and AI features available, it is highly scalable.

How are customer service and support?

PagerDuty Operations Cloud support was good compared to BigPanda. We had a better experience with PagerDuty Operations Cloud. There were some features that we requested and which they delayed in delivering, but they had their reasoning. Overall, the support by PagerDuty Operations Cloud was good. They had a monthly cadence call and a bi-weekly call with the tools and support team to address concerns on a weekly basis. PagerDuty Operations Cloud is a good product for the organization and the support team is highly effective and responsive.

Which solution did I use previously and why did I switch?

We used a tool called BigPanda. BigPanda was very buggy and their support teams were not responding on time to some critical issues. It had issues integrating with other tools like ServiceNow, and we had a lot of issues with reporting. Reporting was not accurate and there was a lot of inaccurate data in the reports, so we had difficulties in governing the team's performance. The customer decided that it was better to migrate to a different tool. We had PagerDuty Operations Cloud even during that period, just to page on-call engineers, but not as an event correlation platform. They provided a testing period for us and we decided to switch over to PagerDuty Operations Cloud rather than holding onto BigPanda.

What other advice do I have?

PagerDuty Operations Cloud is already up to date with the requirements in terms of cloud automation and AI enhancement. It is a very modern solution for NOC operations and for paging on-call engineers. It has all the orchestration and automation functionalities required to perform triages for all types of incidents, and it can correlate using intelligence automatically. I do not see a lot of room for improvements, but there were some bugs that we were working on with the vendor. Overall as a product, PagerDuty Operations Cloud is a very modern solution for NOC environments.

The user interface was very easy to use in PagerDuty Operations Cloud. There were some operation-related bugs, but they were due to certain configurations and challenges with integrating the tool with the customer environment. Overall, as a product by default, it had a very user-friendly interface.

I would recommend PagerDuty Operations Cloud as a good product for the organization. I gave this product a rating of 8.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: May 25, 2026
Gulam Gauss - PeerSpot reviewer
Cloud Support Engineer at ATOS
MSP
Top 20
May 28, 2026
Real-time monitoring has ensured proactive issue resolution across critical cloud environments
Pros and Cons
  • "Using PagerDuty Operations Cloud is a very crucial part for us; if we do not use it, we don't know what is happening in the customer's environment."
  • "If there is an outage on PagerDuty's side, we sit idle waiting for it to be resolved."

What is our primary use case?

We use PagerDuty Operations Cloud for monitoring purposes, as we have our own agent installed in the customer's environment across AWS and Azure, virtual machines or instances. Our monitoring agent continuously monitors the customer's environment for high CPU utilization, high memory utilization, disk utilization, and if there is any problem within the customer's environment, then we get an alert on PagerDuty. After receiving the alert, we have to acknowledge it or resolve it. If the issue is still there, then we have to work on that alert and resolve the issue, so that is part of PagerDuty.

Using PagerDuty Operations Cloud is a very crucial part for us; if we do not use it, we don't know what is happening in the customer's environment. It plays a crucial role in our organization because we have no other option for receiving any kind of information from the customer's environment. If there is an outage on PagerDuty's side, we sit idle waiting for it to be resolved. We reach out to PagerDuty's team to sort out the issue quickly, as some of our customers are very important and we have to monitor their environments 24/7 because they do not want any problems.

What is most valuable?

I appreciate the notifications in PagerDuty Operations Cloud. We get real-time updates in PagerDuty, and that is the best part. The second feature is that we get each and every detail in PagerDuty as well, including the main problem, which account is affected, and in which account, which instance or which VM is affected. We also get graphs in PagerDuty Operations Cloud, making it so that 50% of our work is done by PagerDuty itself, with the rest of the 50% being the work we have to do.

What needs improvement?

The main thing I would like to see improved in PagerDuty Operations Cloud is the absence of a PagerDuty notifier in Chrome. Someone created a notifier as a Chrome extension, but it was removed after an update to the Chrome version. My request is for PagerDuty's team to create a notifier so we get pop-up alerts whenever we receive any kind of alert. We currently use the notifier in Microsoft Edge, which works fine, but we need it in Chrome as well, and I suggest it should have options for acknowledging, holding, and resolving alerts, as that is very important.

Enhancements are not needed for me except for having PagerDuty Operations Cloud's notifier from the official PagerDuty team.

For how long have I used the solution?

I have been working with PagerDuty Operations Cloud for more than five years.

What do I think about the stability of the solution?

Regarding the stability aspect of PagerDuty Operations Cloud, I am aware of problems we faced during the CrowdStrike incident last year. We encountered multiple issues, but after degrading our customer's systems, the problems were resolved once CrowdStrike released a new patch, normalizing the process.

What do I think about the scalability of the solution?

I think PagerDuty Operations Cloud is scalable, and I have not encountered any limitations. We can configure it as needed, and I receive detailed information through PagerDuty. As I mentioned earlier, 50% of our work is already handled, and the remaining part requires us to log in to the customer's environment to check what we can do.

How are customer service and support?

I have only interacted with PagerDuty's technical support team during the CrowdStrike incident. They worked very efficiently and responded on time. We also have their contact numbers, emails, and technical assistance email IDs, so whenever we need to modify anything, like increasing or reducing resources, we reach out to them without any problem.

On a scale of one to ten, I would rate PagerDuty's technical support as nine out of ten.

Which solution did I use previously and why did I switch?

We have been using PagerDuty Operations Cloud from the initial phase; there was no previous solution.

How was the initial setup?

The initial setup process for PagerDuty Operations Cloud involves creating a user first with our official email ID. After that, we select which team we are working in, and as we're operating 24/7, we add on-call availability, with three engineers assigned for different shifts, which means splitting 24 hours into three 8-hour shifts.

After adding the user, we have to specify our name and the shift timings during which we are working. We also select our team (L1, L2, or L3), and based on that, we receive alerts from the customer's environment. That is the basic process we use.
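The three-shift coverage described above can be sketched as a simple lookup. The shift start times and engineer names below are assumptions made for illustration; the review only states that 24 hours are split into three 8-hour shifts.

```python
from datetime import time

# Assumed shift boundaries and placeholder engineer names.
SHIFTS = [
    ("shift-1", time(6, 0), "engineer-a"),    # 06:00 to 14:00
    ("shift-2", time(14, 0), "engineer-b"),   # 14:00 to 22:00
    ("shift-3", time(22, 0), "engineer-c"),   # 22:00 to 06:00
]


def on_call_for(t):
    """Return (shift name, engineer) covering clock time t."""
    current = SHIFTS[-1]  # times before the first start belong to the overnight shift
    for shift in SHIFTS:
        if t >= shift[1]:
            current = shift
    return current[0], current[2]


morning = on_call_for(time(9, 15))
night = on_call_for(time(3, 0))
```

Every clock time maps to exactly one shift, so there is always one engineer responsible for incoming alerts.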

What about the implementation team?

We handle the deployment in-house for PagerDuty Operations Cloud.

What was our ROI?

In terms of measurable benefits from PagerDuty Operations Cloud, it is cost-effective and saves resources, as we receive alerts with real-time data. If the alert is resolved in the customer's environment, then PagerDuty Operations Cloud alerts also get resolved in real time. So far, I have not encountered any errors in the PagerDuty Operations Cloud system; this is a very good thing for us.

What's my experience with pricing, setup cost, and licensing?

I don't know anything about the pricing, setup costs, or licensing part as we have a different team for that.

Which other solutions did I evaluate?

I have no idea about evaluating other options available in the market as our company decided to use PagerDuty Operations Cloud based on recommendations from contacts who suggested we use it.

What other advice do I have?

My advice to other organizations considering PagerDuty Operations Cloud is to focus on real-time updates, as we have configurable options. Their support is very timely, so we receive prompt answers to any queries. My only concern is that PagerDuty needs to develop their notification system to provide similar notifications on Windows or macOS as we get with their app on Android or iOS. I would rate this product a ten out of ten overall.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Last updated: May 28, 2026
DeepakReddy - PeerSpot reviewer
Senior Dev Ops Engineer at Scaler Academy
Real User
Top 5 Leaderboard
Apr 8, 2026
Centralized incident response has reduced downtime and now needs more predictable costs
Pros and Cons
  • "PagerDuty Operations Cloud has positively impacted our organization by accelerating incident response and reducing MTTR by up to 27%, and we are also integrating AI and ML into PagerDuty Operations Cloud."
  • "One area for improvement in PagerDuty Operations Cloud is the unpredictable costs that can cause issues in our organization and project complexity, along with the occasional perception of an outdated user interface by non-tech personnel."

What is our primary use case?

My main use case for PagerDuty Operations Cloud is for cloud-based operations, including incident management and resolving incident responses to reduce downtime and improve reliability.

PagerDuty Operations Cloud provides a central command center that collects data signals from various IT systems, which helps us detect high-priority incidents and reduce the noise.

In addition to my main use case, we are able to perform on-call scheduling and routing with the help of PagerDuty Operations Cloud very easily.

What is most valuable?

The best features PagerDuty Operations Cloud offers include automated on-call scheduling, AI-driven alert grouping, and an impressive number of integrations, as it has more than 700 integrations, which really help us.

The user interface is user-friendly for non-technical persons, and it has been maintained well over the years, making it easier for our project managers and product managers to navigate PagerDuty Operations Cloud dashboards.

PagerDuty Operations Cloud's embedded AI has helped reduce alert fatigue and lower costs from incidents, which has contributed to retaining revenue by minimizing financial losses.

The AI-driven alert grouping, particularly for incident management, helps us significantly as it streamlines our processes.

PagerDuty Operations Cloud has positively impacted our organization by accelerating incident response and reducing MTTR by up to 27%, and we are also integrating AI and ML into PagerDuty Operations Cloud.

What needs improvement?

One area for improvement in PagerDuty Operations Cloud is the unpredictable costs that can cause issues in our organization and project complexity, along with the occasional perception of an outdated user interface by non-tech personnel.

For how long have I used the solution?

I have been using PagerDuty Operations Cloud for around three years.

What do I think about the stability of the solution?

PagerDuty Operations Cloud is stable and scalable, capable of handling enterprise environments and multi-cloud setups efficiently.

How are customer service and support?

I would rate customer support a seven out of 10.

Which solution did I use previously and why did I switch?

Previously, we were using incident.io, AlertOps, and Datadog, but we switched to PagerDuty Operations Cloud due to its all-in-one solution capabilities.

What about the implementation team?

I have implemented AI and automation through PagerDuty Operations Cloud for incident response, which has significantly changed our operational efficiency, allowing us to accomplish more with less manual input.

I have experimented with PagerDuty Operations Cloud's autonomous AI agents, striving to automate repetitive tasks, which improves operational efficiency.

What was our ROI?

While I cannot provide exact return on investment metrics, I estimate that we save around 20% of costs and approximately 10 to 15 hours a week with the efficient use of PagerDuty Operations Cloud.

The 27% reduction in MTTR has had a direct business impact as it enhances business growth and supports impactful business decisions.

The alert reduction feature has greatly impacted our ability to prevent costly incidents, as we can accurately respond to alerts with the help of autonomous AI agents, which reduces erroneous notifications.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing has been positive compared to other tools, as PagerDuty Operations Cloud simplifies many management tasks that would otherwise be burdensome.

Which other solutions did I evaluate?

Before choosing PagerDuty Operations Cloud, I evaluated tools such as incident.io and Datadog, but we selected PagerDuty Operations Cloud for its comprehensive features and strong alerting.

What other advice do I have?

My advice for others considering PagerDuty Operations Cloud is to first understand their organizational needs before selecting any tool to prevent mismatches.

I believe all aspects of my experience with PagerDuty Operations Cloud have been covered.

I would rate my overall experience with PagerDuty Operations Cloud a seven out of 10.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Apr 8, 2026
Punit Anand - PeerSpot reviewer
Senior Site Reliability & Observability Engineer at ParentPay Group
Real User
Jun 22, 2026
Alert de-duplication has reduced noise and now improves response time and root cause analysis
Pros and Cons
  • "PagerDuty Operations Cloud has improved our response time and mean time to resolution in my organization."
  • "We do not experience downtime. However, I have observed one issue: we integrated with LogicMonitor, which is a monitoring tool, and alerts come from there to PagerDuty Operations Cloud. When alerts are resolved in LogicMonitor, they should also resolve in PagerDuty Operations Cloud, but sometimes they do not resolve."

What is our primary use case?

I have integrated other monitoring tools like LogicMonitor, and alerts come to PagerDuty Operations Cloud where we acknowledge them and work on the issues. I have created multiple services that send alerts to out-of-hours groups for the on-call engineers. We detect issues faster and this helps in root cause analysis. Alert noise reduction is a major use case for us, as it groups duplicate alerts, which is very useful. The mobile application is also excellent.

What is most valuable?

I appreciate the event de-duplication feature in PagerDuty Operations Cloud because my company has many alerts for similar devices or servers, and it groups them together. This helps us see when a particular server's CPU and memory are both spiking, which aids significantly in root cause analysis.
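A rough sketch of this kind of de-duplication is shown below. It assumes each incoming event carries a `dedup_key` (here simply the server name) and a `check` field; the field names and the `deduplicate` helper are illustrative assumptions, not the product's internal implementation.

```python
def deduplicate(events):
    """Collapse events sharing a dedup key into one open alert with a count."""
    open_alerts = {}
    for event in events:
        alert = open_alerts.setdefault(
            event["dedup_key"], {"count": 0, "checks": set()}
        )
        alert["count"] += 1
        alert["checks"].add(event["check"])
    return open_alerts


events = [
    {"dedup_key": "srv-01", "check": "cpu"},
    {"dedup_key": "srv-01", "check": "memory"},
    {"dedup_key": "srv-01", "check": "cpu"},
    {"dedup_key": "srv-02", "check": "disk"},
]
grouped = deduplicate(events)
```

Four events become two alerts, and the `srv-01` alert shows both CPU and memory checks firing, which is the combined view that helps with root cause analysis.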

Another feature I value is push notifications. We receive calls, SMS messages, and emails for the same alert, so we do not miss any notifications.

My organization has reduced noise by approximately 20% because of the de-duplication feature in PagerDuty Operations Cloud and the report feature. The report feature sends us a weekly report showing how many similar alerts occurred that week, and we work on reducing those alerts. By following this policy for three months, we reduced noise by 20%, which is a huge achievement for us.

PagerDuty Operations Cloud has improved our response time and mean time to resolution in my organization. We have integrated many monitoring tools through PagerDuty Operations Cloud, and the integration feature is excellent. It integrates very well with other monitoring tools via API and through email. I recommend other organizations use this integration feature.

The platform generates weekly reports showing how many alerts we received and the response time for each service and alert. I now pull daily reports via API. Since my company operates from 7:30 AM to 4:30 PM, with on-calls after hours, I need to know how many alerts occur outside business hours. Using a report scheduled through PagerDuty Operations Cloud API, the system sends me the alerts. I then analyze how many alerts came that night and work with the application team to reduce noise and resolve incidents. I value the report feature completely.
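The out-of-hours analysis described above might be sketched like this, assuming the daily report has already been loaded as a list of dictionaries with a local-time `created_at` value. That structure is hypothetical, and weekends and holidays are ignored for simplicity.

```python
from datetime import datetime, time

BUSINESS_START = time(7, 30)
BUSINESS_END = time(16, 30)


def is_out_of_hours(created_at):
    """True when the timestamp falls outside 7:30 AM to 4:30 PM local time."""
    t = created_at.time()
    return not (BUSINESS_START <= t < BUSINESS_END)


def out_of_hours(incidents):
    """Keep only the incidents raised outside business hours."""
    return [i for i in incidents if is_out_of_hours(i["created_at"])]


report = [
    {"id": "A1", "created_at": datetime(2026, 6, 1, 9, 0)},
    {"id": "A2", "created_at": datetime(2026, 6, 1, 22, 45)},
    {"id": "A3", "created_at": datetime(2026, 6, 2, 3, 10)},
]
overnight = out_of_hours(report)
```

The resulting list is what would be reviewed with the application team the next morning.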

As a technical engineer, I observe that noise is being reduced and platform stability is increasing. My company is product-based with many products, and they are becoming more stable because we receive alert notifications faster. PagerDuty Operations Cloud is helping my organization tremendously.

What needs improvement?

Overall, I have positive feedback about PagerDuty Operations Cloud, but as an enhancement, I would suggest the reporting feature could be improved. I generate reports based on the service, but it has a limitation where it cannot send all alerts. The limitation is that it can only send 1,000 incidents using the API. If that capacity could be enhanced to send 2,000 alerts in one report, that would be beneficial.
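One possible workaround for a per-report cap like this is to split the reporting period into smaller windows and request each one separately. The sketch below only shows the window splitting; the `split_window` helper is hypothetical, and whether smaller windows actually stay under the cap depends on alert volume and on how the limit is enforced.

```python
from datetime import datetime, timedelta


def split_window(start, end, step=timedelta(days=1)):
    """Yield (since, until) pairs covering [start, end) in chunks of at most `step`."""
    since = start
    while since < end:
        until = min(since + step, end)
        yield since, until
        since = until


windows = list(split_window(datetime(2026, 6, 1), datetime(2026, 6, 3, 12)))
```

Each `(since, until)` pair would then be used as the date range for one report request, and the results concatenated afterwards.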

Currently, we have not applied any automation through PagerDuty Operations Cloud. However, it does help with automation in that when we receive more alerts for a similar issue or for only one server, we know that server's health is not good. We then find the root cause and apply automation directly on the server, not through PagerDuty Operations Cloud. The feature would be useful, but my company does not have the automation feature enabled. It shows as a request trial, so I think we need to try that.

For how long have I used the solution?

I have been using PagerDuty Operations Cloud for one year and two months.

What do I think about the stability of the solution?

We do not experience downtime. However, I have observed one issue: we integrated with LogicMonitor, which is a monitoring tool, and alerts come from there to PagerDuty Operations Cloud. When alerts are resolved in LogicMonitor, they should also resolve in PagerDuty Operations Cloud, but sometimes they do not resolve. This should happen, and I think this is an API issue that needs to be addressed. I am not certain whether other customers of PagerDuty Operations Cloud are experiencing the same issue.

How are customer service and support?

I have no complaints about customer service because PagerDuty Operations Cloud is an incident management tool and it performs that function very well.

Which solution did I use previously and why did I switch?

We preferred PagerDuty Operations Cloud over ServiceNow, which we used previously for the same purpose. When an alert came, we would call engineers, and ServiceNow has that feature as well. However, PagerDuty Operations Cloud is much more advanced in terms of notifying users and reducing the time to respond. We are satisfied with it and are not planning to move to other tools currently.

How was the initial setup?

I joined this organization one year and two months ago, and the initial setup was already done. I only enhanced that setup and created new integrations and new event orchestrations. I cannot comment on the initial setup itself, but I am confident it would have been easy.

Which other solutions did I evaluate?

Overall, I can say PagerDuty Operations Cloud is a critical part of our incident management process. Reliability and alert delivery are strong compared to other tools such as ServiceNow. The area where we see the biggest opportunity is AI-driven event correlation, richer alert context, and improved analytics. I do not think any other tool is near that level. We tried ServiceNow because we have it as well, but it does not match PagerDuty Operations Cloud. The overall feedback is positive.

What other advice do I have?

I would recommend that organizations with high alert noise, whether similar to my company or larger companies, should try PagerDuty Operations Cloud. They should use its event and alert de-duplication features and integration with other tools, which are excellent. The calling notification feature is also very good. Overall, it is a strong solution. I rate PagerDuty Operations Cloud as nine out of ten because I do not see any gaps in what I use on a daily basis.

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Last updated: Jun 22, 2026
Saurab Gnagurde - PeerSpot reviewer
IT Analyst | Aws Cloud Ops | Dev Ops | Fin Ops at Tata Consultancy
Real User
Top 5 Leaderboard
Jun 9, 2026
On-call automation has reduced critical incident impact and ensures faster production responses
Pros and Cons
  • "PagerDuty Operations Cloud helps us manage production incidents beyond service outages, mostly high CPU utilization where we set alerts, application failures, pod issues in Kubernetes, and infrastructure-related alerts."
  • "One area where I believe improvement can be made is reporting and dashboard customization to make it more user-friendly."

What is our primary use case?

As part of the cloud operations team, I was a user who set up the alerts. Important incidents or anomalies that needed immediate attention were routed from the APM tools we integrated with PagerDuty Operations Cloud. Our team supported the platform in rotational shifts, and my role involved assigning people to shifts according to the shift roster, so that whenever an incident triggered, the person on shift would get the call. The primary use was supporting production operations and cloud activities.

Our environment consisted of AWS infrastructure, Linux servers, Kubernetes clusters, and customer-facing applications. PagerDuty Operations Cloud was mainly used for incident management and alerting. We integrated it with AppDynamics, Instana, and CloudWatch, which monitored the platform and its patterns, and PagerDuty Operations Cloud then generated critical alerts so that the support team working the current shift was notified immediately. This platform really helped us manage production incidents beyond service outages: mostly high CPU utilization where we set alerts, application failures, pod issues in Kubernetes, and infrastructure-related alerts. We configured all kinds of alerts and ensured they were routed to the correct on-call person, which helped us reduce response time in critical situations.

What is most valuable?

One of the best features of PagerDuty Operations Cloud is its on-call rotation scheduling and escalation management. If an engineer did not acknowledge the alert within a defined time frame, the incident was automatically escalated to the next person, the support team, or the manager of that specific team. Another useful feature was its integration capability. We were able to integrate PagerDuty Operations Cloud with monitoring and observability tools so that alerts were generated automatically, almost immediately, whenever issues were detected in the environment. The mobile application was also very helpful because engineers could receive calls and notifications, acknowledge the incident, and track updates even when they were away from their laptops.
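The escalation behavior described above can be sketched as a timed chain. The levels and timeouts below are assumptions chosen for illustration; the review does not state the actual time frames, and `current_target` is a hypothetical helper rather than a PagerDuty API.

```python
from datetime import timedelta

# Hypothetical policy: each level is (target, time allowed to acknowledge).
POLICY = [
    ("on-call engineer", timedelta(minutes=15)),
    ("support team", timedelta(minutes=15)),
    ("team manager", timedelta(minutes=30)),
]


def current_target(elapsed, policy=POLICY):
    """Return who should be notified after `elapsed` time without acknowledgement."""
    deadline = timedelta(0)
    for target, timeout in policy:
        deadline += timeout
        if elapsed < deadline:
            return target
    return policy[-1][0]  # stay on the last level once every timeout has passed


first = current_target(timedelta(minutes=5))
second = current_target(timedelta(minutes=20))
third = current_target(timedelta(minutes=45))
```

An unacknowledged incident moves down the chain as time passes, so it never sits with one unavailable person.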

I also valued the centralized incident management dashboard that provides visibility into active incidents, response status, escalation history, and overall operational health. I used to get all the data accumulated there through the dashboard.

PagerDuty Operations Cloud helps us manage production incidents beyond service outages, mostly high CPU utilization where we set alerts, application failures, pod issues in Kubernetes, and infrastructure-related alerts.

What needs improvement?

My experience with PagerDuty Operations Cloud has been positive overall. One area where I believe improvement can be made is reporting and dashboard customization to make it more user-friendly. The operations team often requires different views compared to the management team. Having more flexibility in generating custom reports would be helpful. Another improvement could be providing more advanced AI-driven collaboration capabilities to reduce unnecessary noise alerts and help the team focus on the most critical issues. Apart from these areas, the platform is very reliable and effective for managing production incidents and on-call operations.

For how long have I used the solution?

I have been using PagerDuty Operations Cloud for almost five to six years.

What do I think about the stability of the solution?

PagerDuty Operations Cloud has been stable and performing well wherever our incident management or alerting was configured for production support. Timely notifications and incident responses were critical. PagerDuty Operations Cloud delivers alerts immediately through multiple channels which we configured, including mobile on-call notifications, email, SMS, and phone calls. Since PagerDuty Operations Cloud was integrated with our monitoring and observability tools, it helped ensure that critical incidents were captured and routed to the appropriate on-call team. During my usage, I did not encounter any significant outages or stability issues that impacted our operations due to PagerDuty Operations Cloud.

What do I think about the scalability of the solution?

PagerDuty Operations Cloud is highly scalable and works well with small and large environments. The project I worked on was integrated with multiple application servers and cloud resources for monitoring. PagerDuty Operations Cloud handles all the alerts from different resources and routes them to the appropriate teams. As the infrastructure grows, new services, escalation policies, schedules, and teams can be added easily without requiring major changes to the existing setup. This makes it suitable for an organization that manages large cloud infrastructure and supports multiple teams.

Which solution did I use previously and why did I switch?

When I joined this project, PagerDuty Operations Cloud was already implemented, and the SOPs and testing were already in progress. By the time I was fully onboarded a few days later, many of the alerts had been configured in it. I did not get the chance to work with other tools besides PagerDuty Operations Cloud.

How was the initial setup?

When I joined the project during the initial setup of PagerDuty Operations Cloud, I received a Jira ticket listing a few servers where I needed to install PagerDuty agents so that they could trigger alerts or integrate with the server. I was mostly involved in the configuration part.

The setup was straightforward. PagerDuty Operations Cloud also helped us in this process. It was not directly integrated on the individual servers; instead, we integrated our monitoring and observability tools with PagerDuty Operations Cloud. The servers and applications were monitored through application monitoring tools such as Instana, Zabbix, and Splunk. Whenever critical alerts were generated, they were automatically forwarded to PagerDuty Operations Cloud through the integrations we had configured. PagerDuty Operations Cloud would notify the on-call engineers and follow the escalation policies if the alerts were not acknowledged within a specific time. In our flow, we had EC2 instances, AWS servers, and CloudWatch alarms; when an alarm triggered, it was sent through Amazon SNS (Simple Notification Service) to PagerDuty Operations Cloud and then to the on-call engineer.
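To illustrate the CloudWatch to SNS to PagerDuty flow described above, here is a minimal Python sketch of how a CloudWatch alarm notification could be mapped to a PagerDuty Events API v2 request body. This is not the configuration we used: in practice the vendor's documented integrations handle this mapping, and the function name `alarm_to_event`, the routing key, and the choice to treat `INSUFFICIENT_DATA` as a warning are assumptions made only for illustration. The sketch builds the request body and does not send anything.

```python
import json

# Public Events API v2 endpoint; shown for reference, not called in this sketch.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"


def alarm_to_event(sns_message: str, routing_key: str) -> dict:
    """Map a CloudWatch alarm notification (JSON string from SNS) to an event body.

    Hypothetical helper for illustration; not part of any PagerDuty or AWS SDK.
    """
    alarm = json.loads(sns_message)
    state = alarm["NewStateValue"]  # "ALARM", "OK", or "INSUFFICIENT_DATA"

    # Reuse the alarm name as the dedup key so a later OK resolves the same incident.
    event = {
        "routing_key": routing_key,
        "dedup_key": alarm["AlarmName"],
        "event_action": "resolve" if state == "OK" else "trigger",
    }

    # Only trigger events carry a payload in this sketch.
    if event["event_action"] == "trigger":
        event["payload"] = {
            "summary": f'{alarm["AlarmName"]}: {alarm["NewStateReason"]}'[:1024],
            "source": alarm.get("Region", "unknown"),
            # Assumption: anything other than ALARM is downgraded to a warning.
            "severity": "critical" if state == "ALARM" else "warning",
        }
    return event
```

Using the alarm name as the `dedup_key` means repeated notifications for the same alarm update one incident instead of opening new ones, which is one simple way to keep alert noise down.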

What about the implementation team?

We followed the documentation provided by PagerDuty Operations Cloud for the configuration part.

The documentation is thorough, with proper details on how to configure each integration with an application monitoring tool and the steps to follow. For server integrations, it specifies the server type, Windows or Linux, and provides the corresponding documents. It is comprehensive and easy to understand, so even a layperson could do the configuration by following it.

What other advice do I have?

We are mostly not focused on using PagerDuty's autonomous AI agents because we work on cloud infrastructure where we do the deployments, and we have not implemented AI in our cloud to that extent. Going forward, if our infrastructure becomes AI-based, we will definitely explore where PagerDuty Operations Cloud can help.

As of now, we do not use the generative AI capabilities of PagerDuty Operations Cloud. Our infrastructure is huge, and there is a dedicated developer team working on AI-related things. They are still running two POCs, which are being evaluated. Only if those look good can we roll this out into production, because my application is customer-facing and we do not want anything to go wrong, such as an alert triggering unnecessarily or an AI-handled alert failing to notify us. That would ultimately cause us to miss our SLAs and SLOs, and all the other escalation matrices would come into the picture. That is why we are still in POCs: it is critical.

That part is taken care of by a different team or mostly the clients themselves. My main role is to keep the environment always up and running, and all alerts should be properly centralized and customized accordingly.

PagerDuty Operations Cloud is basically where we receive alerts, and it integrates with Slack and reaches the on-call rotation on their cell phones. Prior to this, we relied mostly on application monitoring tools, emails, and Slack notifications. If the on-call person is not at their desk when an alert triggers, and no one acknowledges it or takes the necessary action, there will ultimately be customer impact. That is why we implemented PagerDuty Operations Cloud. Even if the on-call person is not near their laptop, they get the call and can immediately acknowledge it and report to the team that we have received a P1 call for a specific environment or that the alert concerns a production issue. Another team member then takes action immediately, so nothing is missed.
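The acknowledge-or-escalate behaviour described in this review can be modelled roughly as follows. This is a simplified Python sketch, not PagerDuty's actual implementation: the `Level` class, the `who_is_notified` function, and the responder names and timeouts are all hypothetical and chosen only to show the idea that an alert moves to the next level when nobody acknowledges it in time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Level:
    responder: str        # who gets notified at this level
    timeout_minutes: int  # how long to wait for an acknowledgement


def who_is_notified(levels: List[Level], ack_after_minutes: Optional[int]) -> List[str]:
    """Return the responders notified before the alert is acknowledged.

    ack_after_minutes=None means nobody ever acknowledges the alert.
    """
    notified = []
    elapsed = 0
    for level in levels:
        notified.append(level.responder)
        elapsed += level.timeout_minutes
        if ack_after_minutes is not None and ack_after_minutes < elapsed:
            break  # acknowledged before this level timed out
    return notified


# Hypothetical three-level policy.
policy = [Level("primary on-call", 15), Level("secondary on-call", 15), Level("manager", 30)]
print(who_is_notified(policy, ack_after_minutes=5))
print(who_is_notified(policy, ack_after_minutes=20))
```

With this policy, an acknowledgement after 5 minutes stops at the primary on-call engineer, while one after 20 minutes means the secondary has been paged as well.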

I did not encounter any issues that required contacting support for PagerDuty Operations Cloud. This review represents an overall rating of 9 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Jun 9, 2026
Buyer's Guide
Download our free PagerDuty Operations Cloud Report and get advice and tips from experienced pros sharing their opinions.
Updated: June 2026