May 2007 Archives

As promised last week, here's part two of the story about a rough week at Cogent. When last we left our intrepid optical network, it was depeering wee little British autonomous systems in an effort to gussy itself up for future suitors (or so we guessed, although there were several other interesting guesses as well; more on that shortly). Well, things went downhill from there.

On Wednesday, April 25, at about 19:25 UTC (15:25 EDT / 12:25 PDT), Cogent had a fairly serious backbone issue, which was reported on NANOG. It was a moderately large event at the time, with a total impact on most of Cogent's network for about 45 minutes, and at least some part of the network affected for almost three hours. The problem was attributed to a router software bug. Cogent had another problem later in the week, on Friday, that appears to have impacted only customers in Boston.

Part of my interest in these events is personal: Renesys (AS34135) is single-homed to Cogent at a development site in Boston. Both outages happened to hit in the middle of user testing for a new application we're working on (more on that in the coming weeks). So that was pretty embarrassing and frustrating. We're shopping around for other providers at 1 Summer now, but (as usual) providers are unclear on whether they can offer service in the building and what they might charge to do so. So we're waiting. Additionally, two of Renesys's three other service providers in New Hampshire, Worldpath (AS3770) and SEGNet (AS11524), use Cogent as one of their upstreams as well. So we were impacted by the problems. But being a customer of, or a provider to, someone who has a network problem isn't enough to raise my interest (we have a lot of customers who run networks, strangely enough).

My main interest in the Cogent outage is that it was large enough to be felt across the Internet, which gives me an opportunity to look at some of the ways to understand and analyze such events after the fact. So let's take a look at what happened, not just from the RFO (Reason For Outage) issued by Cogent, but from what the whole Internet thought of the event.

About the Renesys Blog

Our weblog is written by a variety of Renesys employees. They run the gamut from senior execs and engineers to sales guys. Anyone who has something to say that could be informative or of interest to our customers and visitors says it here.

About this Archive

This page is an archive of entries from May 2007 listed from newest to oldest.
