Thanks for joining us for this exciting discussion about how AI adoption is accelerating across enterprises.
But what happens when things go wrong?
So we have two amazing speakers here today that I'm really excited to introduce.
Chris Bell is a principal solution architect at PagerDuty specializing in AIOps, automation, and AI-driven operations.
He helps some of the world's largest enterprises modernize their incident response and operational resilience strategies. With deep experience connecting observability, automation, and AI, Chris focuses on helping organizations deliver on their AI initiatives with secure, scalable, and actionable insights. So thank you, Chris.
And then we also have Hakan Tekgul, who leads solutions architecture at Arize AI, helping organizations integrate AI use cases into the Arize AI platform to observe and evaluate GenAI applications.
Previously, he led the ML solutions team at Tazi AI, and he is based in the San Francisco Bay Area. So without further ado, Chris and Hakan, thank you so much, and take it away.
Thank you, Melissa. I appreciate you having us this morning.
Alright. So we can go ahead and get started. I'm glad everyone could join us today.
And, you know, we're gonna talk a little bit about the mandate and the push to deliver on AI initiatives across the enterprise.
I don't think I have to tell you if you've joined your own company town halls or listened to your own executive speak, AI is no longer an option. They are telling you that you have to use it in your job every day, and you have to find ways to deliver on it in the applications that you're building.
However, we're seeing 95% of AI projects stall or fail to deliver value. There's a lot of reasons for that. We'll talk a little bit about that this morning, and Hakan and I will show you how we can help mitigate some of that to make sure that you're able to deliver on the outcomes the business is demanding from you.
So let's go ahead and go to the next slide.
Alright. So, you know, why do some of the AI initiatives fail? Or ninety five percent of them, as we saw on that previous slide? Well, fragmented data is a huge problem. It's been a problem for as long as I've been in operations. But as we've continued to progress our technology and we've seen agents and AI start to emerge, that fragmented data is becoming a bigger and bigger problem. So the ability to connect to those systems, monitor them, and tie the signals together is critical. And traditional DevOps tools don't really understand or manage AI risks today.
It's a whole different world from a synthetic observability stack that's running a user transaction on your application to monitoring for hallucinations and ensuring that we're getting good quality responses back. And then what I think is the real challenge for a lot of teams that I talked to today, they expect just incredibly fast development time.
Executives wanna be able to go talk to the trade magazines and brag about how they're using AI to reduce headcount or to make their organizations more efficient.
I saw one article recently where they bragged about saving two hundred and eighty thousand hours of developer time already this year, but they need to launch these in sixty to ninety days in many cases. The technology is evolving so quickly that we're running under sixty to ninety day development cycles even in large scale financial services institutions that typically have two to five year transformation road maps. So all of these different circumstances are conspiring to make these launches even more difficult. So let's go ahead and go to the next slide.
Alright. So we're gonna talk a little bit about what happens when a CX agent starts hallucinating today, since customer experience is one of the first places we saw a major push into agents and AI. But when you think about what it takes to deliver on that, you have your traditional infrastructure combined with the cloud infrastructure that you built.
Data that's scattered across different types of databases and flat files and unstructured data. Now if you wanna have a competent customer support agent, it needs to understand when the customer made their last payment or when a transaction fails. And then models are changing every day.
Which ones do you plug in? How do you bring in new models? How do you make sure that the frameworks that you built support those? And then when things start to go wrong in that agent downstream, how do you respond to it?
Right? Now we're dealing with this wild west of AI standards while trying to build these resilient AI systems.
It's really making life difficult for a lot of teams out there, and we're here to help. So let's go ahead and go to the next slide.
So this is really where PagerDuty fits in the DevOps life cycle and how we help teams do what I'm starting to see as an absolute requirement in this space, which is fail forward.
Right? We all know we're going to have problems. We're going to see hallucinations. We're going to have vector databases fragment. We're going to have changes that go wrong. But when you're operating under a sixty to ninety day timeline, every failure, every mistake needs to be treated as a learning opportunity and driven back into your developer tools and into your observability systems, such as Arize, so you can make better determinations, react more quickly, and reduce the number of times you need to engage operational teams.
Alright. On that note, I'm going to pass it over to Hakan here to talk a little bit about how Arize handles the observability part of this.
Thanks so much, Chris. Yeah. If you can go to the next slide. Yeah. I guess, like, Chris mentioned a lot of great pain points of building AI.
Like, as Chris said, models change every day. There's a significant expectation of very, very fast development times. Right? So with all these changes, like, how does Arize as a platform help?
Right? So, like, based on what we see, we think that there are three specific buckets of capabilities that are needed in order to succeed with an AI project. And these three buckets are observability, evaluation, and prompt optimization and experimentation. But what I mean by each of these is that, like, observability really gives you the ability to see what's happening behind the scenes.
Right? Let's say you have a GenAI agent. Right? That agent has access to multiple LLMs, multiple tool calls, maybe MCP servers. Right? So in order to really quantify the performance of such a system, you really need to understand what is actually happening behind the scenes so that you can drill down to the root cause of a problem. And in order to actually find the root cause and get signals from your application, even if you have observability, you actually need the concept of evaluation. Right?
So evaluation basically gives you the ability to quantify the signals. Like, am I hallucinating? Am I picking the right tool calls? Is my agent completing the task in the correct way?
So all of this will give you, like, a sense of, okay, is my MCP solving the problem? Are my tool calls the problem? And go from there.
And as you kind of detect these issues, you would need to find ways to, like, experiment with different models. Right? Like, as Chris said, models change every day. So how do you test these new models?
Right? Like, let's say GPT five came out recently, or maybe a new Gemini model will come out tomorrow. Like, how do you make sure, if you switch from GPT to Gemini or if you switch from prompt A to prompt B, which one is the best for your application? So these are the main three buckets where Arize really helps with the challenges of building AI.
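To make that comparison concrete, here's a minimal sketch of what an A/B experiment over a small labeled dataset might look like. Everything in it is illustrative: call_model is a hypothetical stand-in for whatever LLM SDK you actually use, and the naive substring scorer is just a placeholder for a real evaluator such as an LLM-as-a-judge check.

```python
# Illustrative sketch only: compare two prompt/model configurations over a
# small labeled dataset. Replace call_model with your real LLM client and the
# scorer with a proper evaluation before trusting the numbers.

LABELED_EXAMPLES = [
    {"question": "What is the refund window?", "expected": "30 days"},
    {"question": "Do you ship internationally?", "expected": "yes"},
]

CONFIGS = {
    "prompt_a": {"model": "model-a", "system": "Answer briefly and cite policy."},
    "prompt_b": {"model": "model-b", "system": "Answer in one sentence."},
}

def call_model(model: str, system: str, question: str) -> str:
    """Placeholder: swap in a real call to your LLM provider's SDK."""
    return f"[{model}] demo answer to: {question}"

def score(answer: str, expected: str) -> float:
    # Naive containment check; in practice use an LLM-as-a-judge eval.
    return 1.0 if expected.lower() in answer.lower() else 0.0

def run_experiment() -> dict:
    results = {}
    for name, cfg in CONFIGS.items():
        scores = [
            score(call_model(cfg["model"], cfg["system"], ex["question"]), ex["expected"])
            for ex in LABELED_EXAMPLES
        ]
        results[name] = sum(scores) / len(scores)
    return results

print(run_experiment())  # per-config average score, e.g. {'prompt_a': 0.0, 'prompt_b': 0.0}
```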
And if you go to the next slide, we basically provide a platform that addresses these three buckets. Right? So Arize, as an observability and evaluation platform, provides a prompt IDE that's really designed to run these experiments, like prompt A versus prompt B, model A, B, C. Right?
So you do all that testing, and then we provide a way to optimize prompts. Right? Like, we even provide a prompt learning experience whereby, leveraging human-labeled datasets, you can actually improve your prompts, and maybe store some of your production prompts in the prompt hub. So it really gives you the power to manage your prompts in the most efficient way possible.
And on top of the prompt IDE, evaluation is probably the most important piece in the AI space right now, where people just wanna really understand whether their application is working or not. And traditional machine learning metrics, like precision, recall, accuracy, or comparing with the ground truth, just don't apply anymore in production. So that's why the concept of LLM evaluation became a huge hit in the past two years, where you actually use an LLM as a judge to do these evaluations. Like, you basically ask another LLM to check, like, hey.
Is this customer support agent hallucinating based on the context, input, and output? So this really gives you a way to understand what your application is doing in terms of performance. And, again, observability is connected to this significantly, where you can view dashboards and aggregate all the metrics. If you wanna do some red teaming or maybe cost tracking, it'll generally help you with that.
But we believe that as long as you have these three pillars, which are really dependent and connected to each other, you will have a successful AI journey in your organization. So let's actually take a look at an example use case to kind of, like, explain what I exactly mean. So if we can go into the next slide.
So let's actually look at a fairly complicated agent. This is actually a very common agent structure that we see across our ecommerce customers, where you basically have a chatbot. Right? And this chatbot is designed to talk with you about travel, like hotel planning, any itinerary.
Right? Basically, do anything you want regarding travel. And then the idea is, like, you chat with this in order to purchase, like, a hotel or something else, like plane tickets. And in this case, like, even though it just looks like a chat interface, there's actually a lot going on behind the scenes.
Right? So, like, one of the main things in this chatbot is that, at the main entry point here, when a user asks a question, you have a router. Right? And this router actually decides which path to follow.
Right? Is the user asking about the product, or is this just a customer support question? Does the user want to track a package? Right?
And then depending on which path you follow, there's a bunch of LLM calls happening, internal API calls happening, some application code. So in order to really understand, debug, and troubleshoot these very complicated agents, you do need the observability piece to actually see this flow on a platform like Arize, along with all the subcomponents behind the agent. And, also, you do need a way to evaluate it. Right?
So, like, how do you understand if the routing function picked the right path? Right? Because if the router function fails, then the rest of it doesn't matter. Right?
If the router function picked track package instead of customer support, then it's not gonna be helpful. Right? And the users will get frustrated, and it will have a significant impact on the business. So evaluation is key here.
And if I give you a quick explanation of how evals work for this agent, if we can go to the next slide, we will basically look at how you measure performance and detect issues. Right? So I always talk about LLM evaluations. I talked about LLM as a judge, but how does it actually work?
Right? At the end, we basically have these evaluators that are predefined, which are really made up of an eval prompt that's designed to check for specific metrics, like hallucination, toxicity, code generation, correctness, and then an eval model. Right?
And we usually recommend that the eval model could be your own model. Like, whatever you use right now would actually work pretty well. But using this evaluator, you provide the data to evaluate. So let's say the router function's input was the user question.
Based on the user question, what was the routed path?
Right? Like, did the routing function pick track package or something else? And then based on that, the evaluator would evaluate this input and output and provide a final output, usually a label, like a binary label such as correct or incorrect. It can also provide a score and also an explanation.
Right? So this really gives you a lot of power as you iterate with different models and prompts and as you look at different issues in your production agents.
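Here's a hedged sketch of that evaluator pattern for the router example: an eval prompt plus an eval model that returns a label and an explanation. The prompt wording, the judge model name, and the JSON label format are assumptions for illustration, not Arize's built-in evaluator.

```python
import json
from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

client = OpenAI()

# Illustrative eval prompt; the valid routes mirror the travel-agent example above.
EVAL_PROMPT = """You are evaluating a travel chatbot's router.
User question: {question}
Route chosen by the router: {route}
Valid routes: product_search, customer_support, track_package

Did the router pick the right path? Respond with JSON only:
{{"label": "correct" or "incorrect", "explanation": "<one sentence>"}}"""

def evaluate_route(question: str, route: str) -> dict:
    # The eval model is an assumption; it could just as well be your own model.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,  # keep the judge deterministic
        messages=[{"role": "user", "content": EVAL_PROMPT.format(question=question, route=route)}],
    )
    raw = resp.choices[0].message.content
    try:
        return json.loads(raw)  # e.g. {"label": "incorrect", "explanation": "..."}
    except json.JSONDecodeError:
        return {"label": "unparseable", "explanation": raw}

print(evaluate_route("Where is my package? I ordered it last week.", "customer_support"))
```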
Let's go to the next slide. So we wanna do a quick demo on this. Like, how does an Arize plus PagerDuty partnership actually help you throughout this journey, where Arize helps you understand the different signals and detect different issues, and from there, how does PagerDuty help you troubleshoot, take some action, and remediate it as soon as possible?
So let me actually share my screen.
And get into the demo. Cool. So this is the Arize platform.
So, basically, what you see here is that, like, anytime you have an agent, you will have the ability to integrate with Arize using the concept of OpenTelemetry.
But the idea is that once you do the integration, anytime your agent gets used like, anytime someone asks a question to your agent, it will automatically show up here. Right? So for context, we actually have an Arize customer support agent here. So all of these questions are actually questions about Arize.
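As a rough illustration of that integration step, here's what generic OpenTelemetry instrumentation of an agent could look like. The collector endpoint and auth header are placeholders, not Arize's actual values; the real setup (endpoint, headers, and any semantic conventions) should come from their docs.

```python
# Minimal sketch: emit one trace per agent run, with child spans for each step.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://your-collector.example.com/v1/traces",  # placeholder
            headers={"authorization": "YOUR_API_KEY"},                # placeholder
        )
    )
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("support-agent")

def answer_question(question: str) -> str:
    # Each step becomes a span, so the trace shows retrieval, the LLM call, etc.
    with tracer.start_as_current_span("agent.run") as span:
        span.set_attribute("input.value", question)
        with tracer.start_as_current_span("retrieval"):
            context = "...retrieved docs..."   # your vector DB lookup goes here
        with tracer.start_as_current_span("llm.call"):
            answer = "...model output..."      # your LLM call goes here
        span.set_attribute("output.value", answer)
        return answer
```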
So you can actually see kind of the input, like, let's say, the user said dataset not showing up, the final output from the support agent, how long it took to answer, the total token usage. You can even see the cost if you want here, as well as the concept of evaluations. Right? So if you look at this example, let's say the user asked this question and we see that it's hallucinated.
If we actually click on this, I will be able to see exactly what happened behind the scenes in order to get to this final output. Right? So, like, the user asked about a dataset not showing up, but you can actually see step by step what happened. At first, it embedded the user question.
Right? And then it retrieves some context. Right? So it retrieves some specific documents from our documentation in order to answer this question.
And then by leveraging this context, an LLM call has been made in order to answer the question, and then you can see, like, the system prompt, the user prompt, and the final output. And this final output will exactly match the final output here. Right? And, again, this is actually a relatively simple customer support agent.
But if you think about the chat-to-purchase agent I showed, this will actually have, like, hundreds of building blocks, or spans as we call them, underneath, so you will have full visibility in terms of each step. Like, how long did the LLM call take, how long did the retrieval take, so that you can do this debugging exercise. And the cool thing is, like, as these come in, they will automatically get evaluated by Arize. Right? So you can set up the metrics I talked about so that, like, anytime someone asks a question, that data will come into Arize, and that data will automatically get evaluated for hallucination, let's say.
And in this case, what Arize will do is it's gonna assign a label, like, hallucinated or not hallucinated. And, also, it's gonna give you a long explanation of why it thinks it is hallucinated. Right? In this case, actually, it talks about the answer, and it says that while the reference text does mention the output, the rest of the troubleshooting steps suggested in the answer are not mentioned or implied anywhere.
Right? So it's basically saying that the output is not really grounded by the reference text that it retrieved from my documentation. So, like, some of this output is just made up. It's hallucinated.
Right? There is actually a hallucination in this output, so you need to come take a look and see what's going on here. And, again, like, this might be only one specific example, but this will run automatically on all your incoming questions, labeling them as hallucinated or factual, correct or incorrect.
And you can aggregate all these metrics in, like, a monitor in Arize. So if I go to, like, a hallucination monitor, you can actually see that this agent has actually been hallucinating for quite some time, and then you can actually get alerts on this. Right? So right now, I actually have a hallucination monitor.
I'm looking at, let's say, the last twenty four hours of data, and then I can play with any threshold I want. I can set up automated thresholds. I can even set, like, a range so that if there's a significant spike in the hallucination rate, I can get that alert. And this alert will automatically go into PagerDuty.
Right? So the goal is, like, as your data comes in and as your data gets evaluated, you're gonna need to detect any significant issues. Like, let's say, in the past twenty four hours, twenty percent of your responses are hallucinated. You need to have some sort of action on that.
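For illustration, here's a small sketch of that handoff: compute a hallucination rate from recent eval labels and, past a threshold, trigger a PagerDuty alert via the Events API v2. In the demo this wiring is handled by the Arize monitor and its PagerDuty integration; the routing key and the eight percent threshold below are placeholder assumptions.

```python
import requests

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ROUTING_KEY = "YOUR_PAGERDUTY_INTEGRATION_KEY"  # placeholder
THRESHOLD = 0.08  # warn early, e.g. at 8%, rather than waiting for 20%

def maybe_alert(eval_labels: list[str]) -> None:
    """Trigger a PagerDuty event if the hallucination rate crosses the threshold."""
    rate = eval_labels.count("hallucinated") / max(len(eval_labels), 1)
    if rate < THRESHOLD:
        return
    requests.post(
        PAGERDUTY_EVENTS_URL,
        json={
            "routing_key": ROUTING_KEY,
            "event_action": "trigger",
            "payload": {
                "summary": f"CX agent hallucination rate at {rate:.0%} over the last 24h",
                "source": "arize-hallucination-monitor",
                "severity": "warning",
            },
        },
        timeout=10,
    )

maybe_alert(["factual"] * 8 + ["hallucinated"] * 2)  # 20% rate -> triggers an alert
```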
And that's actually where PagerDuty will come in, where Arize will send notifications to PagerDuty, and then someone on the PagerDuty side, someone who's using PagerDuty, will use this information and take some actions to remediate and go from there. Right? So to show the next piece of the demo, let's assume that there's a hallucination spike going on, like I showed. Like, twenty percent of responses are hallucinated in the past day, an alert will be sent out to PagerDuty, and I'm gonna pass it to Chris to continue with the demo.
Thank you, Hakan. I appreciate that. That was a great setup. And, you know, I I think it's really key as you start to look across the modern stack. And, you know, what teams are dealing with when they're setting up these, you know, hallucination monitors and these evaluation prompts, trying to really align things across, you know, not just the new tools that they're building, but across their entire ecosystem and all the connections that it's dependent on. I love that slide that showed all the different chains that a prompt might go through.
And then what happens when you introduce something like Snowflake as another option for it to pull data from. Right? So how do you continue to scale that out?
Well, here, we're gonna take a look at a customer that's trying to launch a new, you know, CX chatbot experience, and this is their environment, and we can see here we've got a P5 active on it. There's also some P2s going on. And, really, when you think about your enterprise stack, there's gonna be dozens, hundreds of incidents going on across the environment. Now one of the key things that PagerDuty wants to help you with is connecting which ones are relevant to what you're focused on right now.
Now my team isn't the one dealing with this P2 for this instant transfer failure.
I'm actually on the CXQA agent team and the SRE team who's supporting that.
And so with our operations console, I'm able to quickly narrow down into the incident that we got from Arize saying, hey, we've got some hallucination uptick going on. Right now, it's just a P5. We caught it early because we've tuned the system so that we get those early observability warnings coming in.
And what we were able to do here very quickly is use our SRE agent, which is using Arize on the back end to track some of this stuff as we go. So if I were to rate my AI response here poorly, that would actually flow back through that system we just saw.
We're able to quickly summarize all of this information coming from these other systems and bring that right back. So I can see, like, we've got a hallucination rate, our retrieval issues, and then go, you know what? Let's look at past incidents related to this.
Right? And I can start to actually interact with this across that complex environment and surface up which of these issues are related and how we can quickly resolve this before it becomes a major incident.
I will say sometimes these run faster than others.
There we go. So now we got a quick past incident analysis. We can see it's happened before. We've seen vector DB fragmentation.
So when we think about all these systems interacting, could be something on a database that's not necessarily directly related to this. Right? So now and I can start to ask things like, are there related incidents or changes? Right?
Because we saw there's a lot of information, a lot of incidents going on.
And many times, most incidents are caused by a change that's gone wrong. So we can actually see we've got a CX knowledge store latency spike, thirty eight hundred milliseconds. That could definitely be contributing to it. And when we look at this based on what we've seen before, based on what we're offering up, we can actually go, this is most likely the issue. We need to go back and clean up this problem. And I can even go, you know what? What should I do next?
And we can start to drive this guided remediation right from PagerDuty so you can quickly identify where you should focus your time. And we can go, look, run the automated diagnostics first. So, hey, it looks like someone here has gone ahead and prepackaged this up, so, yep, I've already got agent diagnostics. Now if I had to run this manually, again, the runbooks are available. This could be coming from Confluence. Or if your teams are moving fast, maybe you just upload your MD files directly from your GitHub repository.
We kinda bootstrap and get things moving. So I'm gonna go ahead and pull my agent diagnostics here. And we're gonna have this come back. We can see the SRE agent is actually analyzing the output from these as it runs through, and it's gonna come back and actually tell me.
Hey. We've confirmed that root cause. So these are the things you need to do right now to resolve this issue.
And because of that, I can actually come right over here. And go, you know what? Let me go ahead and start my vector database optimization. Now this workflow, in addition to going off and calling the automated runbooks that we built using our agents and that continuous feedback loop.
It's sending out the status updates to any teams who care about the CX agent and to my customer support agents to let them know we've started this process and that we expect things to normalize in fifteen to twenty minutes. Now for demo purposes, I don't wanna wait fifteen minutes while we run an optimization, so this does go a little bit quicker here today. We can see it's all within this one chat. I was able to analyze what was happening, connect it to the other signals in my environment, and, you know, dismiss the noise, at least from my perspective, and bring this back to resolving issues and responding properly to my customers without declaring a major incident, without needing to scramble everybody. And that matters, because when you think about how teams see AI, any false positive, any bad experience with customers can land your project in that ninety five percent of AI projects that have the plug pulled.
And then one last thing I'll point out is everything that I did here, I could have interacted with this entire incident from Slack or Teams or my mobile phone. So while I showed you what we were doing here in the PagerDuty UI, if your teams like to operate in Slack like many of us do, we could have run the entire thing, including the agent interactions, directly from Slack. So on that note, I think that's going to conclude our demo for today. Now, you know, if I were running this in production, I would actually go through the next steps, which is to create a post incident review so that we can start to analyze these things at scale. I know what you're thinking right now. We don't create PIRs for P5s.
You know what? If you wanna actually improve, we do wanna create PIRs for P5s and P4s and P3s so we can identify those solutions and continue to fail forward.
So when we think about unifying and automating that toil and empowering those AI teams so they can drive that fail forward motion: you saw how we were able to quickly detect that issue coming from Arize, and how we were able to set up the evaluation criteria so we could bring in that hallucination signal early. Right? Maybe we start warning at eight percent so we can react right away instead of waiting till we're at twenty percent and we need to scramble everybody. We saw how we were able to auto fix common AI failures. In fact, if I were to ask the agent what I should do to prevent this in the future, it would tell me I should probably just have it run those automated responses without ever paging me in the first place, and that is something we certainly could have done.
If I needed to, I could have escalated to a major incident and mobilized additional teams. Fortunately, I was able to avoid that by being guided to my resolution quickly.
And even though I resolved it fast, I still kept the people who needed to know in the loop and opened myself up for that learning opportunity.
Alright. And on that note, if you're interested in seeing more, please, you know, reach out to your account exec or scan this, and I am going to pass it over to Catherine here for some q&a.
Thank you both, Chris and Hakan. Hi, everyone. Cat here from Arize. Wanted to help facilitate the q and a portion of this webinar. So we'll start with the first question.
What are the most common reasons that you both have seen AI projects stall, and how have you seen teams proactively address these issues?
So I can take a pass at that first. And in many cases, it's, you know, the security and architecture teams and this rapidly evolving landscape. I've seen hack day projects and winners get early releases into testing production, and then thirty days later have everything pulled because SecArch no longer wants to support it, or they couldn't figure out how to move it to an operational status. Right? It's great to be able to put it out, but if you can't proactively catch these failures, then the business doesn't trust it. And that is a good way to have the plug pulled.
Hakan, what do you think?
Yeah. I think that's a great call out, Chris. And on top of that, I would say it's super easy to build, like, a Twitter demo, as I call it. Like, you can get an AI project running in ten minutes.
But then, like, when you actually try to take it to production, it doesn't work. And where I actually see stuff getting stalled is, especially after a few incidents that happened in the past few years with some larger enterprises, right now people just have a hard time getting an approval to go into production. And that's mainly because they don't have a good way to quantify the performance and benchmark it against a dataset, or they can't show that, hey, I actually tested this across two thousand user questions across five different models.
I tested this for security risks. I tested this for, like, any prompt injection attacks. Right? So there's just really not a good way to report stuff to the leadership for production approval.
And then, again, this is also kind of because people are stuck in spreadsheets. Like, they don't have a centralized platform to do this, or maybe even something like PagerDuty where they can just hook it up to detect issues. So I think those are some of the things that a lot of organizations are starting to think about proactively in order to make these go into production faster.
I really love that comment, Hakan, about the Twitter demo and how you can very quickly and easily spin up a working prototype that wows everyone at your hack day or whatever you're showing off. I've run into this myself with some agents that I've built for our solution consulting team.
Real easy when it's just me and I'm operating under a controlled set of circumstances. But as you start to roll that out to a few more users, they ask the questions you don't think about. They hit security boundaries that you didn't consider in the first place.
And suddenly what you felt like was ninety percent of the work to get there was only ten percent, and you've got a lot of work ahead of you to get your pet project off the ground. And, you know, it gets very difficult very quickly once you start to move it into production.
Right. Thank you both. And on that note, another question is: how do you ensure that your models and applications remain reliable and accurate as data sources and the business requirements evolve?
So, Hakan, I'll let you take that one first if you don't mind.
Yeah. Sure. Absolutely. I mean yeah. So, like, I think that the concept of continuous evaluation is pretty important.
Right? Like, your user behavior will change. The questions that your user asks will change. Models will change.
Right? Like, behind the scenes, like, even if you use GPT four o, there's actually specific updates going on, like, every week to that model. Right? It's not one static model.
So everything will change over time. So that's why you need, like, a continuous way to monitor and troubleshoot your application. And the monitoring piece always starts with first collecting data and then doing the continuous evaluation piece. Right?
Like, it's one thing to get, like, a production approval with all your metrics, but you need to continue to keep doing that in production as well. Right? This is also what I saw in the machine learning space, like, before this whole GenAI thing became a huge hit, where people spent a lot of time on preproduction. But once a model or application goes into production, they kind of stop actually monitoring it for continuous metrics.
So the first thing is to really have a good tool stack that will help you continuously evaluate, monitor, and troubleshoot. Right? That's where Arize and PagerDuty actually come in. So it's really important to detect those anomalies as business requirements, models, or user behavior change.
That's such a great point. Configuration drift is such a backbreaker in this type of environment, and it always has been. You think about the biggest risk to a business in an operations environment, it's configuration drift. To your point, Hakan, you push an application to production, even a traditional application, you set up your observability tool, and then you forget all about it until six months later when you miss a major failure because a schema changed or your observability changed a little bit. And now we're dealing with something that iterates much, much faster than those traditional systems, with things that are kind of putting together answers on the fly, like you said, with GPT four o or Claude or Bedrock or all these other systems we're plugging into. Those are always being updated and tuned and being adjusted for things like toxicity, and, you know, that could have unexpected results on providing, you know, next steps for your automated runbook.
And so I think that is a good call out, and this is really where we can start to help teams.
Alright. Thank you both. And then our final question: could you both speak to what the learning curve looks like, or what you've seen it look like, for data science and AI/ML teams in adopting Arize and PagerDuty? What does that workflow look like in adopting the incident response workflows with PagerDuty and Arize?
So I can go first on that one. So for us, you know, one of the things we've always been focused on at PagerDuty is being able to very quickly get systems into operations and being able to support those.
You've seen a lot of work on our side and investing into internal developer portals and plugging more into operations as code. So that way, as you're deploying and iterating on your applications, all of that should be coming from your CICD pipeline to drive your service relationships, to drive your business services, your escalation policies.
And, eventually, you know, even things like your automated runbooks because that's what really gets you to the point where you're able to deploy a new AI application and support it on day one instead of releasing it to the wild and then trying to stitch together a way to support and operationalize that afterwards, which is what I think we've seen a lot of teams try to do.
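As a rough, hedged sketch of that operations-as-code idea, a deploy pipeline step might register a PagerDuty service for a newly shipped AI app through the REST API. The token, escalation policy ID, and field values below are placeholders, and many teams would do the same thing declaratively with Terraform instead.

```python
import requests

API_TOKEN = "YOUR_PAGERDUTY_API_TOKEN"   # placeholder
ESCALATION_POLICY_ID = "PXXXXXX"         # placeholder escalation policy reference

def register_service(name: str) -> dict:
    """Create a PagerDuty service from a CI/CD job so the app is supported on day one."""
    resp = requests.post(
        "https://api.pagerduty.com/services",
        headers={
            "Authorization": f"Token token={API_TOKEN}",
            "Content-Type": "application/json",
        },
        json={
            "service": {
                "name": name,
                "description": "CX agent service created from the deploy pipeline",
                "escalation_policy": {
                    "id": ESCALATION_POLICY_ID,
                    "type": "escalation_policy_reference",
                },
            }
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# Example: called at the end of a deploy job
# register_service("cx-qa-agent")
```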
And to add to what Chris said, I think, when you think about it from an AI perspective, really the learning curve is kind of two paths. Right?
First, like, you need to integrate your AI use case into Arize. Right? That's usually where most of the time is spent. And there, we try to make it as easy as possible with our integrations and our compliance with OpenTelemetry.
And after you actually get the integration done, then the next one is like, okay, how do I navigate the platform? How do I, like, set up maybe a monitor, set up PagerDuty, or set up dashboards or evaluations? And, like, that's where actually, like, our vision is to be really, like, AI native going forward.
Like, just like Chris showed an SRE agent, we also have our own agent, Alex. I couldn't show it today, but that's really designed, especially in our vision, as a way for you to interact with the platform itself. Right?
You can ask questions about your data, ask questions about your evaluations, try to detect anomalies from there, and then use it as a way to learn about the product as well. Right? Like, where do I navigate to do this and everything? So I think whether companies succeed in the next five years will depend on whether they're AI native or not.
And that's where they need agents, like either an SRE agent or Alex, to be able to decrease this learning curve as much as possible.
Amazing. Hakan, Chris, thank you both for sharing your insight today and walking us through how we can actually apply them.
And thanks everyone for taking the time to spend with us this last hour.
But, yeah, if you want more information about PagerDuty and Arize, check out these links, and we'll be sending the recording as well as some additional resources in a follow-up. But thanks, everyone, for spending time, and thank you again, Hakan and Chris.
"The PagerDuty Operations Cloud is critical for TUI. This is what is actually going to help us grow as a business when it comes to making sure that we provide quality services for our customers."
- Yasin Quareshy, Head of Technology at TUI