Introducing Enhancements to the PagerDuty Operations Cloud: Building Operational Resilience for the Modern Enterprise

by Madeline Zemer October 8, 2024 | 6 min read

Global outages and disruptions have become an inevitable reality for the modern enterprise. As digital dependencies deepen, organizations must effectively manage disruptions or risk damage to their customer experience, brand reputation, and bottom line.

Today, we’re thrilled to unveil the latest innovations for the PagerDuty Operations Cloud. These new enhancements to our end-to-end platform harness the power of artificial intelligence (AI) and automation at scale, empowering organizations to strengthen their operational resilience and future-proof their business.

Operational resilience: A critical business imperative

With disruptions no longer a matter of if but when, incident management has evolved from an IT concern to a CEO and board-level priority. In such a high-stakes environment, enterprises must be prepared to quickly restore services and safeguard customer experience. 

Operations teams face three critical challenges with each incident:

  • Orchestrating the right responses across people, processes, and technologies.
  • Cutting through system noise to take action before customers are impacted.
  • Gathering learnings from outages and transforming them into proactive improvements.

PagerDuty Advance: GenAI capabilities for incident management

To enhance incident response, PagerDuty Advance integrates GenAI capabilities across the PagerDuty Operations platform. Key features include:

  • Intelligent prompts for actionable insights: Teams can quickly get up to speed with prompts like “Has this incident happened before?” and “What’s the customer impact?”, minimizing context switching and accelerating resolution. AI-driven analysis generates valuable insights that support better decision-making, during and after incidents.
  • Automated documentation: PagerDuty Advance helps automate tasks like summarizing incidents and drafting post-incident reports, keeping stakeholders informed throughout the incident lifecycle.
  • Enhanced chat collaboration: PagerDuty Advance now integrates with both Microsoft Teams (early access) and Slack.

Accelerate strategic initiatives with PagerDuty

PagerDuty empowers enterprises to deliver exceptional customer experiences while minimizing the impact of service disruptions. Our comprehensive solutions focus on three key areas: incident management transformation, operations center modernization, and automation standardization and center of excellence (CoE). These strategic offerings foster operational resilience and align with our customers’ long-term goals.

Our latest product announcements are bundled to enhance these solutions, providing a robust framework for enterprises to advance their operational capabilities.

Incident management transformation

To manage the unexpected in an increasingly unpredictable world, modern enterprises must transform how they handle end-to-end processes for major and minor incidents in today’s complex IT environments. To support these goals, we’re introducing these new features:

  • New unified chat experience: PagerDuty is introducing a completely reimagined chat experience with a modernized look and feel. This update consolidates our chat apps into a single, seamless experience, enabling teams to manage incidents end-to-end within Microsoft Teams and Slack. Leveraging PagerDuty Advance’s GenAI capabilities, responders can query and engage with critical context directly in their preferred chat platforms. The unified chat experience will be generally available in Q4.

  • Incident Types: Customers can use this feature to define how specific incidents behave and align them with their unique operational processes. This feature supports enterprise customization, allowing customers to drive bespoke response processes for different scenarios, whether it’s a Security, FinOps, or Major Incident. The tailored approach reduces the risk of missing critical steps while streamlining incident management. Sign up for early access to Incident Types.
  • Service Reassignment: This feature allows customers to easily move incidents across services, streamlining triage and accelerating resolution, which saves valuable time during critical incidents. Service Reassignment also enables seamless integration with ITSM tools like ServiceNow for bidirectional synchronization of service changes and operational insights. Early access to Service Reassignment will be available in Q4. Sign up here.
  • Operational Maturity Model with recommendations and benchmarks: Customers can gain actionable recommendations and industry benchmarks to help teams mitigate risks, improve operational maturity, and speed up response and recovery times. The Operational Maturity Model is generally available in-product. Early access to recommendations and benchmarks will be available in Q4. Sign up for early access here.

Operations center modernization

Resilience starts with separating signal from noise, so organizations have a clear view of what matters and can deploy automation or people along the most efficient path to resolution. To help operations centers reduce triage time and focus on high-impact issues, we’re unveiling purpose-built innovations that speed resolution via automation. New features include:

  • Global Intelligent Alert Grouping: This advanced feature leverages machine learning to significantly reduce noise and enhance teams’ understanding of impact scope and potential blast radius across services. By detecting alerts with strong co-occurring patterns alongside textual similarities, it extends the existing content-based alert grouping across services already available in PagerDuty AIOps. This capability helps teams quickly assess potential blast radius and prioritize responses. Global Intelligent Alert Grouping is now available in early access for AIOps customers.
  • Automation on Alerts: This feature triggers automated remediation at the alert level, before alerts escalate into incidents, which avoids disrupting responders unless necessary. For example, teams can initiate automated fixes while pausing incident creation, allowing time for the remediation to take effect and creating an incident only if the automated solution fails. By reducing the number of escalated incidents, the feature will enable teams to focus on more critical tasks, ultimately lowering costs and improving service quality. Automation on Alerts will be in early access in Q4 for AIOps customers. Sign up here.
  • Operations Console enhancements: Upcoming enhancements to the recently released Operations Console will provide teams with comprehensive alert visibility from a single dashboard. New features include a side panel with alert information and timeline tabs, and enriched responder views with custom fields. By eliminating the need for tab switching, these enhancements can increase efficiency, reducing MTTA and MTTR. The Operations Console is generally available for AIOps customers, with these new enhancements entering early access for AIOps customers in Q4. 
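
The "remediate first, escalate only on failure" flow described under Automation on Alerts can be pictured with a short sketch. This is purely illustrative and is not PagerDuty's implementation or API: the names `Alert`, `Incident`, `handle_alert`, `remediate`, `is_resolved`, `max_checks`, and `wait` are all hypothetical, chosen only to show the decision logic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Alert:
    # Hypothetical alert record, not a PagerDuty data model.
    alert_id: str
    summary: str


@dataclass
class Incident:
    # Hypothetical incident record created only when automation fails.
    alert_id: str
    summary: str


def handle_alert(
    alert: Alert,
    remediate: Callable[[Alert], None],
    is_resolved: Callable[[Alert], bool],
    max_checks: int = 3,
    wait: Callable[[], None] = lambda: None,
) -> Optional[Incident]:
    """Try an automated fix first; return an Incident only if it fails.

    Incident creation is held back while the remediation is given up to
    `max_checks` chances to take effect.
    """
    try:
        remediate(alert)
    except Exception:
        # The automated fix itself errored, so escalate to responders.
        return Incident(alert.alert_id, f"Automated fix errored: {alert.summary}")

    for _ in range(max_checks):
        wait()  # Assumed pause between health checks (e.g. a sleep).
        if is_resolved(alert):
            return None  # Fixed automatically; no responder is paged.

    # The fix ran but the problem persists, so create an incident.
    return Incident(alert.alert_id, f"Automated fix did not resolve: {alert.summary}")


# Example usage with stand-in callables for the fix and the health check.
alert = Alert("A1", "Disk usage above 90%")
result = handle_alert(alert, remediate=lambda a: None, is_resolved=lambda a: True)
```

In this sketch the caller supplies the fix and the health check as plain functions, which keeps the escalation decision separate from any particular automation tool.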

Automation standardization and center of excellence (CoE)

Driving automation at scale is a key priority for enterprises investing in partners and solutions to find efficiencies. PagerDuty helps prepare organizations to handle future events by automating critical workflows. 

A key new feature of this solution is:

  • Automation use case library: Implement automation solutions faster with pre-built and recommended automations for common IT/Dev scenarios, including technical and business process automation. The library enables customers to transform unplanned work into automated planned tasks, reducing the risk of operational failures and accelerating resolution for future events. It provides automation tips for common use cases such as container management, diagnostics, and database management to prevent performance issues. The Automation Use Case Library is available now. 

Building a more resilient future

For enterprises aiming to deliver high-quality customer experiences and mitigate the risk of customer-impacting incidents, PagerDuty offers comprehensive solutions to build true operational resilience. By leveraging the PagerDuty Operations Cloud, companies can increase innovation velocity and reduce costs to achieve operational efficiency at scale. PagerDuty empowers customers to streamline the incident lifecycle, ensuring teams are always prepared to respond swiftly and effectively to any disruptions.

Ready to future-proof your operations? Sign up for our launch webinar to learn how innovations coming to the PagerDuty Operations Cloud can help you build operational resilience.