PagerDuty Blog

7 Habits of Successful Generative AI Adopters

Generative AI is forecast to have a massive impact on the economy, and the headlines are driving software teams to rapidly consider how they can incorporate generative AI into their software, or risk falling behind in a sea change of disruption. But in the froth of a disruptive technology, there’s also a high risk of wasted investment and lost customer trust.

How do software teams move quickly to build compelling features using generative AI without wasting resources, incurring painful tech debt, or losing customer trust? In conversations with data scientists, data engineers, and product managers working on generative AI features, a few patterns have emerged.

1. Have a clear use case that solves a problem

It’s easy to get caught up in the hype of a fast-moving technology. But nothing is more wasteful than building a solution in search of a problem. In a recent interview, Mitra Goswami, Senior Director of Data Science at PagerDuty, emphasized the need to focus on a use case: “At the end of the day, what is the problem you are trying to solve?”

This advice isn’t just philosophically sound. It has a direct impact on key decisions you have to make. “Some models are too big,” said Goswami about how the use case can introduce requirements. “For example, large language models like GPT-3.5 or GPT-4 are almost magical, but can consume vast amounts of processor cycles and be costly. Smaller, more industry- or business-focused language models can often provide better results tailored to business needs and have lower latency, especially for real-time applications.”

To identify a use case, it helps to start from the category of problem you want to solve: summarization, chatbots, code generation. To scope in even further, Goswami recommends asking questions about customer impact: “Is this use case going to give my customers some relief? Is this going to save my customer money?” Testing your use cases this way helps ensure you build something that adds value.

2. Have a solid data foundation

Organizations that have been building data-intensive features for a while aren’t starting from zero with generative AI. “We’ve been doing AI for a long time,” stated Goswami. She credits a strong data architecture foundation as critical to moving quickly with generative AI. According to Goswami, a robust infrastructure is essential to effectively harness the capabilities of large language models (LLMs) and LLM vendors.

“First and foremost is a data storage solution, often a data lake, which is vital for housing large volumes of text data for training and fine-tuning models. Scalability is also critical to accommodate variable workloads, while a well-designed API layer allows for seamless integration with LLM services. Comprehensive monitoring, logging, and cost management systems help maintain infrastructure health and optimize expenses.”
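To make the monitoring, logging, and cost-management piece concrete, here is a minimal sketch of an API-layer wrapper that records latency, token usage, and cost for every model call. The class and the response fields are hypothetical; the article doesn’t describe PagerDuty’s actual implementation.

```python
import logging
import time

logger = logging.getLogger("llm_gateway")

class LLMGateway:
    """Hypothetical API layer wrapping an LLM client with the logging
    and cost tracking described above."""

    def __init__(self, client, cost_per_1k_tokens: float):
        self.client = client  # any vendor SDK or open source model wrapper
        self.cost_per_1k_tokens = cost_per_1k_tokens

    def complete(self, prompt: str) -> str:
        start = time.monotonic()
        response = self.client.complete(prompt)  # assumed client method
        latency = time.monotonic() - start

        # Record the metrics that feed monitoring and cost dashboards.
        tokens = response.total_tokens  # assumed response field
        cost = tokens / 1000 * self.cost_per_1k_tokens
        logger.info("llm_call latency=%.2fs tokens=%d cost=$%.4f",
                    latency, tokens, cost)
        return response.text  # assumed response field
```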

What does it mean for a data environment to be healthy? According to Manu Raj, Senior Director of Analytics and Data Engineering at PagerDuty, there are some key requirements. “It is absolutely essential that [you have] the foundational elements of maintaining that data quality and data observability.”

3. Stay flexible in your approach

The pace of change in AI has been tremendous. ChatGPT was introduced less than a year ago and reached 100 million users in under two months. By July 2023, Meta had made waves by releasing Llama 2. Thanks to heavy open source involvement and a lot of investment, the underlying LLMs and the services around them keep evolving. And quickly.

For data scientists and engineers, that means a constantly shifting landscape of options. Waiting for “clear winners” to emerge, however, isn’t an option. The risk of taking too long to start building has to be weighed against the risk of building on a technology that becomes outdated. Balancing progress with the potential need to change requires a flexible approach.

“We were very flexible in choosing the model,” explained Goswami. Flexibility leaves room for the team to change course later, as needed. But that doesn’t mean changing all the time. Goswami stressed the need to build towards the use case, not to change for the sake of changing. “It’s a very evolving field, right, so you need to move in a direction,” she noted. “Don’t change every day because this field is evolving a lot. Build towards something.”

4. Start with design principles

As lauded as “move fast and break things” has been as a development mantra, that approach can add toxic tech debt. Yet slow and onerous architectural reviews can kill innovation. How did Goswami and team strike a balance, moving quickly without painting the architecture into a corner?

“We started writing a design document early,” she explained. The document captures architecture patterns and how the team intends to interact with vendors or open source models. This gives the team flexibility in the future and minimizes disruption. “Starting with design principles when building Language Model-as-a-Service architecture is paramount for several reasons. Design principles serve as a guiding framework, ensuring that the architecture is aligned with the intended goals, whether it’s natural language understanding, content generation, or data analysis.

“They facilitate consistency and clarity in decision-making, leading to an efficient and high-quality architecture. Design principles also help in focusing on user-centric design, ensuring that the LLM services meet user needs and expectations effectively. In essence, starting with design principles is essential for creating a robust, user-centered, and adaptable LLM architecture. The Data Science team at PagerDuty collaborated very closely with the Architecture Strategy Team and Chief Architect Philip Jacob to create LLM-as-a-Service architecture.”

By assuming that changes are inevitable, the team can design for change. Designing for change requires thinking about the interfaces and interactions between different components. That way, one component—such as an LLM—can change with well-understood implications to the rest of the architecture. As I’ve written about before, other useful elements to support change include tests and CI/CD pipelines.
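As an illustration of designing for change, one common pattern is to put an explicit interface between the application and any model provider, so that swapping an LLM touches exactly one implementation. This is a sketch of the general technique, not PagerDuty’s published design; all names are hypothetical.

```python
from abc import ABC, abstractmethod

class LanguageModel(ABC):
    """The interface the rest of the architecture depends on; swapping
    providers only means adding a new implementation."""

    @abstractmethod
    def summarize(self, text: str) -> str:
        ...

class VendorModel(LanguageModel):
    """Hypothetical wrapper around a hosted LLM vendor's API."""
    def summarize(self, text: str) -> str:
        raise NotImplementedError("call the vendor SDK here")

class OpenSourceModel(LanguageModel):
    """Hypothetical wrapper around a self-hosted open source model."""
    def summarize(self, text: str) -> str:
        raise NotImplementedError("call the local model here")

def build_incident_summary(model: LanguageModel, transcript: str) -> str:
    # Application code depends only on the interface, so the model
    # behind it can change with well-understood implications.
    return model.summarize(transcript)
```

Because `build_incident_summary` never names a vendor, replacing `VendorModel` with `OpenSourceModel` is a one-line change at the call site rather than a rewrite.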

5. Establish clear guidelines for data privacy and responsibility

Privacy is top of mind for users as they evaluate the new LLM-based features and products coming out. Establishing trust is a shared responsibility across many teams, from product to data science and more. To ensure teams are working responsibly, Goswami recommends having guidelines that teams can refer to.

“We need to be very intentional on how we are using our data,” explained Goswami. “We are letting our customers know that ‘hey, you have an opt-in option.’” Transparency and opt-in options are published in PagerDuty’s public guidelines for the safe and secure use of generative AI. And PagerDuty isn’t alone in this. “More and more vendors are declaring and moving towards an approach where the interaction with the AI is not used in the training data for the AI,” noted Jake Cohen, Senior Product Manager at PagerDuty.

“We’ve kept what we share with the AI very minimal,” explained Cohen. Even beyond privacy concerns, there can be functional reasons to limit what depends on AI. Cohen described how the team isolates the parts of AI-generated runbooks that actually rely on AI. “We’re thinking very critically about what we need to use the AI for and where we can use classical software.”
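Put together, the opt-in and minimal-sharing practices could look something like the sketch below. The field names, opt-in flag, and model client are all hypothetical; the point is that customer data only reaches the model if the customer opted in, and even then only through a small allowlist of fields.

```python
# Hypothetical sketch: respect the customer's opt-in setting and share
# only the fields the AI feature actually needs.

FIELDS_NEEDED_FOR_SUMMARY = ("title", "service_name", "status")

def summarize_incident(incident: dict, customer_settings: dict, model) -> str | None:
    # 1. Opt-in: never send customer data unless the feature is enabled.
    if not customer_settings.get("genai_opt_in", False):
        return None

    # 2. Minimal sharing: build the prompt from an allowlist of fields
    #    instead of the full incident record.
    payload = {k: incident[k] for k in FIELDS_NEEDED_FOR_SUMMARY if k in incident}
    prompt = f"Summarize this incident for a responder: {payload}"
    return model.complete(prompt)  # assumed client method
```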

6. Have a framework to compare options

Data scientists have long used confidence scores and other measures to understand and rank a predictive model’s output. From there, teams can find the fastest, most efficient approach that still meets the desired accuracy. Similarly, the accuracy of generative AI output is only one factor to evaluate.

“If you don’t measure, you don’t know what you are talking about,” asserted Goswami. “We wanted to be quantitative in our approach and hence we created a framework.” Goswami and team considered several factors, such as cost, latency, and accuracy. “We created a framework that made it easier for us to compare these elements across the portfolio of LLMs available to us.”

Such a framework also helps as new LLMs or other technology options emerge. The team can benchmark new options against everything that’s been tested before. Rather than chase hype, the team can make data-driven decisions about what new options to pursue. And existing choices can be routinely tested against those benchmarks to ensure performance hasn’t degraded below other options.
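The article doesn’t publish the framework itself, but the core idea can be sketched simply: run the same evaluation set through each candidate model and record cost, latency, and accuracy side by side. Everything here (the client method, the cost field, the scoring function) is an assumption for illustration.

```python
import time
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    model_name: str
    avg_latency_s: float
    total_cost_usd: float
    accuracy: float

def benchmark(model, eval_set, score_fn) -> BenchmarkResult:
    """Run every (prompt, expected) pair through the model and
    aggregate the three factors: cost, latency, and accuracy."""
    latencies, scores, cost = [], [], 0.0
    for prompt, expected in eval_set:
        start = time.monotonic()
        response = model.complete(prompt)    # assumed client method
        latencies.append(time.monotonic() - start)
        cost += response.cost_usd            # assumed response field
        scores.append(score_fn(response.text, expected))
    return BenchmarkResult(
        model_name=model.name,               # assumed attribute
        avg_latency_s=sum(latencies) / len(latencies),
        total_cost_usd=cost,
        accuracy=sum(scores) / len(scores),
    )

# Comparing the portfolio is then one call per candidate:
# results = [benchmark(m, eval_set, score_fn) for m in candidate_models]
```

Benchmarking a newly released model against the same evaluation set makes it directly comparable to everything tested before.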

7. Embed best practices

With any kind of automation, there’s an opportunity to encode the correct way of doing something. After all, computers are better suited than humans to doing repetitive tasks the same way every time. A similar opportunity exists when building with generative AI. Compared to an open prompt field, a more structured approach to AI-generated output lets users benefit from built-in expertise.

An example is how PagerDuty’s AI-generated runbooks are designed to use plug-ins where available. Instead of re-creating a connection to another system, like Ansible or an AWS service, the AI reuses the existing plug-in. Besides reuse, Cohen noted how this approach is also more manageable: “The benefit of breaking this workflow up into these steps that take advantage of these plugins is that it helps make the job more mutable and debuggable.”
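The article doesn’t show the mechanism, but the routing idea could look roughly like this sketch, where each AI-generated workflow step is matched to an existing, tested plug-in rather than to freshly generated integration code. The registry and step format are hypothetical.

```python
# Hypothetical sketch: route AI-generated workflow steps to existing,
# tested plug-ins instead of re-creating connections from scratch.

def run_ansible_plugin(step: dict) -> None:
    print(f"running Ansible playbook {step['playbook']}")

def run_aws_plugin(step: dict) -> None:
    print(f"calling AWS action {step['action']}")

PLUGIN_REGISTRY = {
    "ansible": run_ansible_plugin,
    "aws": run_aws_plugin,
}

def execute_step(step: dict) -> None:
    # Small, plug-in-backed steps keep the job mutable and debuggable.
    plugin = PLUGIN_REGISTRY.get(step["target"])
    if plugin is None:
        raise ValueError(f"no plug-in registered for {step['target']!r}")
    plugin(step)
```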

Some best practices can’t be embedded in the output directly, but they can be incorporated into the user experience. “What we chose to do is, for every job that is generated using the AI, there’s a bolded note in the job description that says: ‘Note: this was generated by AI. It’s best practice to review this and make your first invocation in a non-mission-critical environment,’” described Cohen. “The same would be said about new automation that’s created by a human, even the experienced ones.” Reminding humans of this best practice on AI output helps junior team members ramp up quickly and safely.
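Stamping that reminder onto every AI-generated job is easy to automate, as in this sketch (the job structure and the generated-by flag are hypothetical):

```python
# Hypothetical sketch: attach the review reminder to every AI-generated
# job before it is saved, so the best practice travels with the output.

AI_REVIEW_NOTE = (
    "Note: this was generated by AI. It's best practice to review this "
    "and make your first invocation in a non-mission-critical environment."
)  # rendered bold in the job description

def save_generated_job(job: dict) -> dict:
    if job.get("generated_by") == "ai":
        job["description"] = f"{AI_REVIEW_NOTE}\n\n{job.get('description', '')}"
    return job
```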

Following these practices helped the PagerDuty team quickly build useful features for customers using generative AI. These habits also reduce the risk of tech debt, wasted time and resources, and lost customer trust. Read more about PagerDuty’s learnings from building with LLMs for incident response.