Availability lessons from shoe companies and ancient warlords
This is the second in a series of posts on increasing overall availability of your service or system. In the first post of this series,...
PagerDuty is thrilled to be a sponsor for PuppetConf 2011. PuppetConf is a DevOps and Operations conference presented by Puppet Labs in beautiful Portland, OR...
As you may already know, PagerDuty suffered an outage of 30 minutes yesterday, followed by a period of increased alert delivery times. We’re taking the downtime...
Updated on 9/21: We have replaced Twitter with our status page as a communication method. At PagerDuty we strive for 100% uptime, and it is a...
On August 8 – 10, we’ll be “staying classy” in San Diego, California as we attend HostingCon 2011. HostingCon is the premier conference and tradeshow...
PagerDuty is pleased to announce integration with Pingdom; it's now easier than ever to find out about and respond to website downtime.
Velocity 2011 was a blast! Thanks to everyone who came by our booth to find out more about PagerDuty, snag a t-shirt, and enter our contest.
1 min read
Have you ever said to yourself: “PagerDuty is great, but I wish I could better integrate it into the custom tools I already use.” Or...
PagerDuty is excited to be attending the O’Reilly Velocity Conference 2011 next week in Santa Clara, CA. Velocity is a great venue that focuses on...
PagerDuty is hosting the June meet-up of the San Francisco Perl Mongers. Gaëtan Voyer-Perrault from MongoDB will be presenting "Perl + MongoDB => Mongoers + Fun".
1 min read
We are very pleased to announce a new partnership with Red Gate Software. We are good friends with Simon and Neil, the co-CEOs of Red...
2 min read
We’re hiring! Interested in working with a team reinventing the stagnant world of IT operations software? Want a job hacking on a product with a...
1 min read
PagerDuty presented at Under The Radar yesterday and won both Best In Show's Audience Choice Award and Developer Tools' Audience Choice Award. If you missed the presentation, both the video and slides are embedded after the jump.
1 min read
Today, at around 1am Pacific Time, Amazon began having major problems with some of their cloud infrastructure: specifically with their EC2, EBS, and RDS offerings. We'd like to share some statistics on the alerts we sent out - via phone or SMS - during the outage.
This post is a quick introduction to some concepts of system availability, so that subsequent posts in this series make sense. I'll go over concepts like availability, SLA, mean time between failures, and mean time to recovery.
Introducing the Curated Aerial Non-Orbital Navigation System, or CANON.
4 min read
This is Part 1 in a multi-part series dealing with tips for being on-call.
We've added deep linking to the incidents table. The browser will now remember all your interactions with the table as you move throughout your account or recall your bookmarks.