<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Renesys Blog</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/" />
    <link rel="self" type="application/atom+xml" href="http://www.renesys.com/blog/atom.xml" />
   <id>tag:www.renesys.com,2008:/blog//1</id>
    <link rel="service.post" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1" title="Renesys Blog" />
    <updated>2008-05-01T01:15:58Z</updated>
    
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 3.2</generator>
 
<entry>
    <title>The Day the YouTube Died: The Video</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/04/the_day_the_youtube_died_video.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=61" title="The Day the YouTube Died: The Video" />
    <id>tag:www.renesys.com,2008:/blog//1.61</id>
    
    <published>2008-04-15T18:17:47Z</published>
    <updated>2008-05-01T01:15:58Z</updated>
    
    <summary>Randy Epstein of Host.net and WVFiber graciously (or perhaps maliciously, given the quality of the performance) filmed and did the post-production on the recent performance at the Global Peering Forum. If I had a virtual tip jar, I would set...</summary>
    <author>
        <name>Todd Underwood</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Society" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>Randy Epstein of <a href="http://host.net/">Host.net</a> and <a href="http://wvfiber.com">WVFiber</a> graciously (or perhaps maliciously, given the quality of the performance) filmed and did the post-production on the recent performance at the <a href="http://peeringforum.com/">Global Peering Forum</a>.  If I had a virtual tip jar, I would set it out.  Enjoy:
</p>
<p>
<object width="425" height="355"><param name="movie" value="http://www.youtube.com/v/JJ-nSCl1UMc&hl=en"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/JJ-nSCl1UMc&hl=en" type="application/x-shockwave-flash" wmode="transparent" width="425" height="355"></embed></object></p>]]>
        
    </content>
</entry>
<entry>
    <title>The Day the YouTube Died</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/04/the_day_the_youtube_died_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=60" title="The Day the YouTube Died" />
    <id>tag:www.renesys.com,2008:/blog//1.60</id>
    
    <published>2008-04-14T20:34:00Z</published>
    <updated>2008-05-01T01:15:43Z</updated>
    
    <summary> At the recent Global Peering Forum I performed a spoof song based on the recent YouTube hijacking. (I&apos;m told that video will eventually be available, at which point I&apos;m sure I&apos;ll have to go into hiding at an undisclosed...</summary>
    <author>
        <name>Todd Underwood</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Society" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
At the recent <a href="http://peeringforum.com">Global Peering Forum</a> I performed a spoof song based on the recent YouTube hijacking.  (I'm told that video will eventually be available, at which point I'm sure I'll have to go into hiding at an undisclosed location.)
</p>
<p>
American Pie was previously <a href="http://www.youtube.com/watch?v=phSpBCdWq1U">parodied at a RIPE meeting</a> and now is practically a tradition, much to Mike Hughes's chagrin, as he thinks it's overdone already.  The great thing about the original song is that it's chock-full of <a href="http://www.fiftiesweb.com/amerpie-1.htm">references</a> to the music industry.  I tried to pepper several more into my version (and I have a few additional verses in progress that I just didn't finish).  
</p>
<p>
What links would you provide to these references?  What additional references do you think are important and missing (given the history of the Internet theme)?
</p>
<p><h3>The Day the YouTube Died</h3></p>
<p>
<br />A long long time ago
<br />I can still remember how the videos used to make me smile.
<br />And I knew if I had my chance, 
<br />I'd watch the <a href="http://www.youtube.com/watch?v=hMnk7lh9M3o">prison thriller dance</a>
<br />and maybe I'd be happy for a while.
<br />But February made me shiver with every packet I'd deliver
<br /><a href="http://www.renesys.com/blog/2008/02/pakistan_hijacks_youtube_1.shtml">bad routes in the tables</a>, the paths they were not stable.
<br />I can't remember if I cried when I saw my request was denied
<br />but boredom welled up deep inside
<br /><a href="http://www.news.com/8301-10784_3-9877614-7.html">the day the YouTube died</a>.
</p>
]]>
        <![CDATA[<p>
<br />I was singing...
</p>

<p><h3>Chorus:</h3></p>

<p>
<br />Bye bye ye old YouTube of mine
<br />Sent my packet to the prefix but then I was denied
<br />Them old mullahs were <a href="http://www.csmonitor.com/2008/0225/p04s03-wosc.html"> hiding filth from our eyes</a>
<br />Singing this'll be the day YouTube dies!
<br />This'll be <a href="http://www.renesys.com/blog/2008/02/pakistan_hijacks_youtube_1.shtml">the day YouTube dies</a>.
</p>

<p>
<br />Did you write the <a href="http://www.adobe.com/products/flashmediaserver/">streaming code</a>
<br />And do you think videos should load
<br />If the prefix takes you there?
<br />Now do you believe in full control
<br />When you end up at a <a href="http://www.youtube.com/watch?v=oHg5SJYRHA0">rickroll</a>
<br />Man that guy sure had <a href="http://static.flickr.com/31/56334472_62836b2094_o.jpg">great hair</a>
</p>

<p>
<br />Well I know you loved that silly site
<br />Cause you <a href="http://www.google.com/support/youtube/bin/answer.py?answer=57931">posted crap day and night</a>.
<br />Just <a href="http://www.youtube.com/watch?v=kHmvkRoEowc">leave Britney alone</a>
<br />Until she's <a href="http://www.msnbc.msn.com/id/18444336/">fully grown</a>.
</p>

<p>
<br />I was a lonely late-night <a href="http://www.youtube.com/watch?v=-6Dmg_4ZA2Y">emo</a> star
<br /><a href="http://uk.youtube.com/watch?v=sK6g0ai_RnU">Singing offkey, strumming guitar</a>
<br />But my demise came from <a href="http://en.wikipedia.org/wiki/Pakistan">afar</a>
<br />The day the YouTubes died.
</p>

<p>
<br />I started singing...
</p>

<p><i>Chorus</i></p>

<p>
<br />Now for <a href="http://www.merit.edu/networkresearch/projecthistory/nsfnet/nsfnet_article.php">20 years</a> we've been in these <a href="http://www.boingboing.net/2006/07/02/sen-stevens-hilariou.html">tubes</a>
<br />Clogged with <a href="http://icanhascheezburger.com/">lolcats</a> and <a href="http://www.albinoblacksheep.com/flash/internet4porn">mostly boobs</a>
<br />But that's not how it used to be.
<br />When <a href="http://gigaom.com/2005/02/13/verizon-buys-mci-for-68-billion/">Verizon</a> was still <a href="http://www.usenix.org/publications/login/1999-2/isp.html">ALTERnet</a>
<br />And their <a href="http://www.verizonbusiness.com/terms/peering/">peering was something you could get</a>
<br />On the network that we called <a href="http://www.nanog.org/2.95.NANOG.notes/mae-west.html">FDDI</a>.
</p>

<p>
<br />Oh and while <a href="http://www.genuityestate.com/shareholders-notice.html">Genuity's stock</a> was down
<br /><a href="http://www.level3.com/newsroom/pressreleases/2002/20021127.html">The Three stole</a> their dubious crown.
<br /><a href="http://www.level3.com/userimages/dotcom/pdf/Tech_leadership_AS1.pdf">AS1 was shuttered</a>
<br />The tables all got cluttered.
<br />And <a href="http://www.pantherexpress.net/">while</a> <a href="http://www.limelightnetworks.com/">CDNs</a> <a href="http://gigaom.com/2006/12/26/level3-buys-savvis-cdn-business/">copied</a> <a href="http://www.akamai.com/">Akamai</a>
<br />The <a href="http://www.boston.com/business/technology/articles/2006/07/15/akamai_mit_hit_limelight_with_patent_suit/">court</a> <a href="http://www.networkworld.com/news/2002/0214akaspeed.html">cases</a> <a href="http://www.ll.georgetown.edu/FEDERAL/judicial/fed/opinions/03opinions/03-1007.html">they</a> <a href="http://findarticles.com/p/articles/mi_m0EIN/is_2001_July_12/ai_76496705">flew</a> on by
<br />And <a href="http://finance.aol.com/earnings/akamai-technologies-inc/akam/nas">prices rose</a> up to the sky
<br />The day the YouTubes died.
</p>

<p><br />We were singing...</p>
<p><i>Chorus</i></p>
<p>
<br /><a href="http://en.wikipedia.org/wiki/Napster">Napster</a> drove <a href="http://www.news.com/2100-1023-244073.html">serious traffic</a>
<br /><a href="http://bigpicture.typepad.com/comments/2008/03/teenagers-shun.html">Tunes for the younger demographic</a>
<br />But then <a href="http://www.wired.com/gadgets/portablemusic/news/2002/05/52540">one day it went away</a>.
</p>

<p>
<br />I went down to the <a href="http://www.comcast.net/a/">Comcast</a> store
<br />Where I'd <a href="http://www.boston.com/business/technology/articles/2006/06/02/comcast_upgrade_speeds_up_downloads/">streamed the downloads</a> years before
<br />But the kids there, said <a href="http://torrentfreak.com/comcast-throttles-bittorrent-traffic-seeding-impossible/">BitTorrent wouldn't play</a>
</p>
<p>
<br />And on the net the <a href="http://blogs.guardian.co.uk/digitalcontent/2008/02/google_yahoo_and_msns_ad_reven.html">Google rules</a>
<br />We're just  <a href="http://gigaom.com/2006/01/15/gather-them-eyeballs/">eyeballs for AdSense tools</a>
<br />They know when we've <a href="http://www.mathgurusonline.com/2006/05/17/google-adsense-ctr-explained/">clicked it</a>
<br />My <a href="http://adblockplus.org/en/">Adblock Plus</a> just nixed it.
</p>
<p>
<br />I tried to squeeze bucks from <a href="http://nanog.org/">NANOG</a>
<br />But <a href="http://www.news.com/2100-1033-273689.html">y'all</a> <a href="http://dc.internet.com/news/article.php/2101_776871">are</a> <a href="http://www.isp-planet.com/news/2002/xo_020618.html">broke</a> so I must <a href="http://www.renesys.com/blog">blog</a>
<br />and now I'm stuck with <a href="http://babbledog.com">Babbledog</a>
<br />Ever since the YouTube died...
</p>
]]>
    </content>
</entry>
<entry>
    <title>Telia and Cogent Kiss and Make Up</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/03/telia_and_cogent_kiss_and_make_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=59" title="Telia and Cogent Kiss and Make Up" />
    <id>tag:www.renesys.com,2008:/blog//1.59</id>
    
    <published>2008-03-28T21:09:36Z</published>
    <updated>2008-04-05T22:51:19Z</updated>
    
    <summary> On March 28th at 17:52 UTC, we saw the peering link between Telia and Cogent come back up. Recently, peering disputes, especially with Cogent, tend to be all about traffic ratios: as long as both parties send roughly the...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
On March 28th at 17:52 UTC, 
we saw the peering link between Telia and Cogent come back up.
Recent peering disputes, especially those involving Cogent, tend to be all about traffic ratios:
as long as both parties send roughly the same amount of traffic to each other, 
life is good.
But when the ratios get out of whack, someone's feelings get hurt 
(more specifically someone's business model is threatened).
Before the de-peering,
we would typically see Cogent using Telia to reach around 2700 networks (prefixes). 
Now that count has dropped to just about 1450 networks.
On the other hand, 
Telia used to reach approximately 7000 networks via Cogent and that number has now increased to almost 8600.
So was Cogent sending too much traffic to Telia before?
Did Telia then do something to provoke Cogent to turn them off (like send a bill)?
We'll never know definitively, but someone blinked and the Internet is now whole again.
</p>

<p>
While this is good for the Internet,
Cogent claimed that this dispute was about 
<a href="http://blog.wired.com/27bstroke6/2008/03/isp-quarrel-par.html">capacity issues</a>, and no one orders and installs new high-capacity circuits in a week,
especially during a contract dispute.
So if there was a capacity issue, 
there is <em>still</em> a capacity issue.
As a result, the situation is bound to be very fluid for the next few weeks.
We'll update this blog as we analyze the resulting shifts in routing.
</p>]]>
        
    </content>
</entry>
<entry>
    <title>He said, she said: Cogent vs. Telia</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/03/he_said_she_said_cogent_vs_tel.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=58" title="He said, she said: Cogent vs. Telia" />
    <id>tag:www.renesys.com,2008:/blog//1.58</id>
    
    <published>2008-03-21T16:10:31Z</published>
    <updated>2008-03-28T22:24:27Z</updated>
    
    <summary> As in most lovers&apos; quarrels, it is difficult to objectively evaluate the claims of the combatants. Naturally, we tend to side with the person we know best, as it&apos;s their viewpoint we hear most often and are inclined to...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
As in most lovers' quarrels, 
it is difficult to objectively evaluate the claims of the combatants.
Naturally, we tend to side with the person we know best, 
as it's their viewpoint we hear most often and are inclined to be sympathetic towards.
Both Cogent and Telia are claiming to be the aggrieved party in their
<a href="http://www.renesys.com/blog/2008/03/you_cant_get_there_from_here_1.shtml">
peering dispute</a> and are now making their case in
<a href="http://www.washingtonpost.com/wp-dyn/content/article/2008/03/19/AR2008031901741.html"> the court of public opinion</a>.
We will almost certainly never know the details of their private business relationship, 
but we can make a few more inferences from the data.
Let me state up front that, like many major ISPs,
Telia and Cogent are customers of Renesys and we love them both equally.
Everything we report in our blogs is based on objective analysis of our global data, 
independent of our own business relationships.
</p>]]>
        <![CDATA[<p>
So let's recap what we do know at this point.
<ul>
   <li> At the end of the day on March 13th, the peering link between Cogent and Telia went away, cutting off
      some of their respective customers (those without other providers) from each other.</li>
   <li> Cogent took responsibility for <a href="http://blog.wired.com/27bstroke6/2008/03/isp-quarrel-par.html">pulling the plug</a>.</li>
   <li> Despite this, Telia traffic found its way to Cogent via Verizon for 12 hours after the initial event.</li>
   <li> <em>Something then happened to cut off this alternate route.</em></li>
</ul>
</p>

<p>
<b>Why did the Verizon routes disappear?</b>
</p>

<p>
So what exactly was that "something"?
It is difficult to say with certainty, 
but we have only three choices:
<ul>
  <li> Cogent stopped accepting routes to Telia via Verizon.</li>
  <li> Telia stopped accepting routes to Cogent via Verizon.</li>
  <li> Verizon stopped transiting traffic between the two estranged parties.</li>
</ul>
</p>

<p>
In our 
<a href="http://www.renesys.com/blog/2008/03/you_cant_get_there_from_here_1.shtml">last blog</a>,
we gave reasons why we thought the first option might be true.
After sifting through more data,
we will present possible reasons for the other two options,
either of which may be more likely than the first.
We'll let you decide, but first
we're going to have to digress a bit and provide some background for those less familiar
with the workings of Internet routing.
</p>

<p>
<b>Background</b>
</p>

<p>
While it's an overused term,
<a href="http://en.wikipedia.org/wiki/Tier_1_carrier">Tier-1</a> providers are thought of as those organizations like Sprint and AT&T who are at the top of the Internet food chain.
They pay no one for service &mdash; people pay them directly or pay their customers.
The only reason all customers of Sprint can reach all customers of AT&T, for example,
is because both companies agree to exchange traffic with one another without charging.
At a routing level, 
Sprint tells AT&T which networks it is ultimately responsible for so that AT&T will send traffic for those networks to Sprint.  AT&T does likewise for Sprint.
Such an arrangement is known as peering, and since no money changes hands it is more precisely known as <a href="http://en.wikipedia.org/wiki/Peering">settlement-free peering</a>.
There are only about nine Tier-1 providers, and for everything to work,
they all need to peer with one another.
They also have no economic interest in letting anyone else join the club.
But of course that doesn't stop others from trying to force their way in, and there
are certainly marketing advantages to claiming you are at the top of the heap.
</p>

<p>
Cogent and Telia have long been Tier-1 wannabes.
To attain this status, it helps if you have a global network
and lots of customers and peers yourself.
Then you can start to play the existing Tier-1s off one another for your traffic,
and with a lot of effort and good fortune,
you might get settlement-free peering with some of them.
At worst, you might have to pay for peering from those with whom you don't have enough leverage.
From a routing perspective, paid peering looks exactly like settlement-free peering.
At Renesys,
our 
<a href="http://www.renesys.com/products_services/market_intel">Market Intelligence</a>
offering denotes both relationships as simply "peering", since we have no way of knowing if money is changing hands.
</p>

<p>
<b>Analysis</b>
</p>

<p>
For a long time, we have watched Cogent and Telia get closer and closer to Tier-1 status, at least as far as looking like they have no providers, whether or not they pay for some of their peerings.
Cogent is almost there, but is still a customer of NTT (AS 2914) for the sole purpose of reaching
AOL (AS 1668).
(Interestingly enough, Cogent could become Tier-1 by default before long,
as AOL continues to shrink and sell off its network.)
As of a few weeks ago,
Telia also looked to have one provider (namely Verizon) for reaching certain networks.
But we stopped seeing evidence of that on February 27th.
That is, after this date, Telia had no known provider to reach any markets anywhere,
using only their own network or links routed as peering (paid or not) to reach everyone.
</p>

<p>
Then Cogent de-peered Telia and suddenly Verizon and others started providing a path 
between the two and their respective customers.  
At the time of my 
<a href="http://www.renesys.com/blog/2008/03/you_cant_get_there_from_here_1.shtml">last blog</a>, 
I was still thinking Verizon was a provider for Telia, so I was not surprised.
(The other providers we saw were the result of leaks (mistakes) or because of customer dual homing.)
But it would make no sense for Verizon to suddenly go away if they were still a transit provider for Telia, hence my conjecture about Cogent stopping this path.
As mentioned above,
there are two other possibilities, so let's now examine these.
</p>

<p>
Suppose Telia really did stop being a customer of Verizon a few weeks ago and got
settlement-free peering.
Verizon's routing announcements might not have fully reflected that change in their business relationship.
That is, perhaps Verizon was announcing more than their customer routes to Telia, 
routes that might only show up if Telia lost access to parts of the Internet via other means.
And so when the Telia-Cogent link was severed,
Verizon inadvertently started providing free transit to Telia to reach Cogent.
If true, it would have been in Verizon's interest to correct this mistake as quickly as possible.
They would have no reason to connect Telia to Cogent without getting paid by either party.
This scenario seems very unlikely to us as it would involve a substantial misconfiguration on the part of Verizon of a magnitude we have not seen in the past.
</p>
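<p>
To make the scenario above a bit more concrete, here is a minimal sketch of the conventional export rule that many networks follow: routes learned from customers are announced to everyone, while routes learned from peers or providers are announced only to customers. This is an illustration only, not Verizon's actual configuration. The function name and the assumption that Verizon learns Cogent's routes over a peering session are ours.
</p>

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

def should_export(learned_from, neighbor):
    """Conventional rule of thumb: customer routes go to every neighbor;
    routes learned from peers or providers go only to customers."""
    return learned_from == CUSTOMER or neighbor == CUSTOMER

# Assumption for illustration: Verizon learns Cogent's routes from a peer.
cogent_routes_learned_from = PEER

# If Telia is treated as a settlement-free peer, Cogent's routes are withheld.
print(should_export(cogent_routes_learned_from, PEER))      # False

# If Telia is still configured as a customer, Cogent's routes are announced,
# which is the kind of leftover transit path described above.
print(should_export(cogent_routes_learned_from, CUSTOMER))  # True
```

<p>
Under this rule, the only thing that changes between the two outcomes is how the neighbor is classified, which is why a stale classification could plausibly hand out transit that nobody is paying for.
</p>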

<p>
Finally, suppose that Telia has paid peering with Verizon and so Verizon still views them as a customer and perhaps is sending them more routes than they would a settlement-free peer.  
Again, such routes might have only become visible when the Telia-Cogent link went away and Telia started using them instead.  
If Telia has paid peering with Verizon,
then they have to pay according to the amount of traffic they send them.
Telia might not be very interested in paying for something (access to Cogent) that was formerly free.
Such a situation could have inspired Telia to block this path themselves.
</p>

<p>
<b>Conclusions</b>
</p>

<p>
But no matter what happened here at a technical level and what the corresponding motivations were, 
the Internet has been partitioned for more than one week after the start of this spat.
This seems quite odd to people and I keep getting asked "OK, but why isn't this a bigger deal?  Give me just one juicy example, e.g., Norway can't reach CNN or something like that."
Unfortunately and fortunately, 
I don't have any such juicy examples.
Unfortunately, since if there were such highly visible examples, 
there would probably be enough external pressure on the warring parties to bring about a quick peace agreement.
Fortunately, since this partitioning is not impacting most of us.
</p>

<p>
But for those of you who have asked for concrete examples,
here is one.
Martha Stewart Living is single-homed behind Cogent and announces one network,
namely, 38.96.143.0/24.
If you go to 
<a href="http://looking-glass.telia.net">Telia's looking glass</a>
as of the time of this posting, 
you cannot get to Martha's network.
As far as Telia is concerned, Martha doesn't exist.
Does this mean that the Swedes are deprived of the pleasure of buying Martha's wares and sending her email?
Not at all.
Her web site is hosted by Savvis and a customer service email address points to AOL.
But if Martha's parole officer allows her to visit Scandinavia any time soon,
she won't be able to reach her corporate network.
(By the way, I found this network by looking up all the single-homed customers of Cogent
in Renesys' 
<a href="http://www.renesys.com/products_services/market_intel">Market Intelligence</a>.)
This is not to make light of the situation, but to point out that the disruptions are largely
personal in nature.
And while this rift in the Internet is seriously impacting a wide range of individuals, 
it's unlikely to be resolved anytime soon without a lot more yelling and screaming.
</p>

<p>
It also points out the dangers of being a Tier-1 provider or a near Tier-1.
Sure you can claim you've made it to the big time and don't need to buy from anyone, 
but you do need cooperation from the rest of the cartel and not one of them is 
going to do you any favors if you are perceived to be in a position of weakness relative to them.
After all, this is all about money, and outside of the 
<a href="http://en.wikipedia.org/wiki/Federal_Reserve_System">US Federal Reserve</a>,
no one else seems to be giving that away.
</p> 
]]>
    </content>
</entry>
<entry>
    <title>You can&apos;t get there from here</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/03/you_cant_get_there_from_here_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=57" title="You can't get there from here" />
    <id>tag:www.renesys.com,2008:/blog//1.57</id>
    
    <published>2008-03-18T04:01:53Z</published>
    <updated>2008-03-28T22:24:06Z</updated>
    
    <summary> Cogent and Telia are having a lover&apos;s quarrel and, as a result, the Internet is partitioned. That means customers of Cogent and Telia cannot necessarily reach one another. This was not due to a configuration error or a physical...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Cogent and Telia are having a lover's quarrel and, as a result,
the Internet is partitioned.
That means customers of Cogent and Telia cannot necessarily reach one another.
This was <b>not</b> due to a configuration error or a physical cable break.
This is the 
<a href="http://www.renesys.com/blog/2005/12/peering_the_fundamental_archit.shtml">way the Internet works</a> and sometimes doesn't work.
If the businesses that run the show don't play nice with one another,
their customers can pay the price of being cut off from parts of the 'net.
At least when 
<a href="http://www.renesys.com/blog/2008/02/pakistan_hijacks_youtube_1.shtml">Pakistan mistakenly hijacked YouTube,</a>
the matter was sorted out in hours and did not require the cooperation of Pakistan.
The Cogent/Telia tiff has been going on for 4 days now and only they can resolve their differences.  
The rest of the world can only hope for full connectivity to be restored.
</p>]]>
        <![CDATA[<p>
When relationships end, it's often hard to figure out the one thing people really want to know: who dumped whom?  
The politically correct thing to say is that the two parties had irreconcilable differences or that they had simply grown in different directions.  
That all sounds nice, but in fact, it's typically the case that one party called it quits, 
leaving the other broken-hearted.
Although routing data don't show who pulled the plug,
<a href="http://gigaom.com/2008/03/14/the-telia-cogent-spat-could-ruin-web-for-many/">the word on the street</a> is that Cogent jilted Telia last Thursday, 
ending their long-term peering relationship.
The result is that the Internet is now partitioned.
Downstream customers of Telia and Cogent can only reach each other if they also have providers other than these two.
If not, they are out of luck as Cogent plays the Internet equivalent of chicken with Telia.
</p>
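<p>
As a rough sketch of what "out of luck" means here, the snippet below picks out networks whose only transit provider is one of the two feuding parties. The customer names and provider sets are made up for illustration, and the check only looks at direct providers; a real analysis would have to follow the whole chain of upstreams.
</p>

```python
COGENT, TELIA = 174, 1299

# Hypothetical customers mapped to the ASNs of their direct transit providers.
providers_of = {
    "net-a": {COGENT},
    "net-b": {COGENT, 3356},
    "net-c": {TELIA},
    "net-d": {TELIA, 701},
}

def single_homed(providers_of, provider):
    """Return the networks whose only direct provider is `provider`."""
    return sorted(n for n, ups in providers_of.items() if ups == {provider})

# Networks on one list cannot reach networks on the other list
# once the Cogent-Telia link is gone.
print(single_homed(providers_of, COGENT))  # ['net-a']
print(single_homed(providers_of, TELIA))   # ['net-c']
```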

<p>
So let's look at the facts.
On 13 March 2008 at 22:03 UTC,
we saw the link between Cogent (AS 174) and Telia (AS 1299) disappear.
The way this looks in the routing announcements is that all AS paths with "174 1299" or "1299 174" on them vanished, 
meaning these two providers no longer exchange routes directly with each other.
At this point,
Telia lost direct access to 4474 prefixes (networks) transited via Cogent,
whereas Cogent lost 1633 Telia networks.
Now, you might be thinking, "So what?  Won't the Internet just route around the problem, finding
alternate paths?"
Well, yes if there are alternate paths to be found and if the players actually <em>allow</em> traffic to flow via them.
For around 12 hours, 
most Telia customers did access Cogent via Verizon (56% of the 4474 networks), Level 3 (16%), AT&T (6%) and others, but then that abruptly stopped.
We're guessing it's because Cogent eventually slammed the door shut on these alternate paths to their network from Telia,
since none of Cogent's customers accessed Telia via alternate routes during this time.
Like divorce court, depeering is <em>supposed to be painful</em>,
otherwise you might not get what you want.
You only hurt the ones you love.
</p>
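<p>
For readers who want to see what "paths with 174 1299 on them" means in practice, here is a minimal sketch of the check. The AS paths below are invented examples, not our data, and the function names are ours; the idea is simply to look for the two AS numbers sitting next to each other, in either order, in each path.
</p>

```python
def has_adjacency(as_path, a, b):
    """Return True if ASNs a and b appear next to each other, in either order."""
    hops = as_path.split()
    return any({x, y} == {a, b} for x, y in zip(hops, hops[1:]))

def count_adjacent(paths, a="174", b="1299"):
    """Count the paths that cross the direct link between a and b."""
    return sum(has_adjacency(p, a, b) for p in paths)

# Hypothetical AS paths, written as space-separated ASNs.
before = ["3356 174 1299 3301", "701 1299 174 16631", "7018 174 16631"]
after = ["3356 701 1299 3301", "7018 174 16631"]

print(count_adjacent(before))  # 2
print(count_adjacent(after))   # 0
```

<p>
When the count across all observed paths drops to zero and stays there, the adjacency is gone as far as the routing data can tell.
</p>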

<p>
So now we have a bit of a problem.
Customers who ultimately rely solely on Cogent for transit cannot get to Telia, likewise for customers downstream of Telia.
So where are these customers exactly?
The following tables show the geo-location of those networks downstream of Cogent that cannot reach Telia and vice versa for Telia.
</p>

<p>
<table align="left" border="1">
<caption><em>Telia cannot reach Cogent</em></caption>
<colgroup align="left">
 <col width="100">
<colgroup align="right">
 <col width="100">
<thead>
<tr>
  <th>Country</th>
  <th># Prefixes</th>
</tr>
</thead>
<tbody>
<tr><td>US<td align="right">1868
<tr><td>Canada<td align="right">232
<tr><td>France<td align="right">98
<tr><td>Spain<td align="right">41
<tr><td>Germany<td align="right">31
<tr><td>UK<td align="right">27
<tr><td>Others<td align="right">86
</tbody>
</table>
<table  align="center" border="1">
<caption><em>Cogent cannot reach Telia</em></caption>
<colgroup align="left">
 <col width="100">
<colgroup align="right">
 <col width="100">
<thead>
<tr>
  <th>Country</th>
  <th># Prefixes</th>
</tr>
</thead>
<tbody>
<tr><td>Sweden<td align="right">444
<tr><td>Finland<td align="right">322
<tr><td>Russia<td align="right">153
<tr><td>Poland<td align="right">113
<tr><td>US<td align="right">73
<tr><td>Latvia<td align="right">62
<tr><td>Bulgaria<td align="right">52
<tr><td>Spain<td align="right">40
<tr><td>Denmark<td align="right">35
<tr><td>Norway<td align="right">30
<tr><td>Others<td align="right">249
</tbody>
</table>
</p>

<p>
<par>   </par>
</p>
<p>
Given the markets Cogent and Telia operate in,
these lists are not too surprising.
What is surprising is that networks in the US are actually cut off from one another, 
since a largely US provider is playing hardball with a largely European one.
For the Internet to be whole again, 
Cogent and Telia need to kiss and make up.
No one can force either one to carry traffic destined for the other.
But my guess is that Telia is hearing more grief from Scandinavian customers not being able
to reach US content than Cogent is hearing from US customers cut off from Northern Europe. 
</p>

<p>
Of course, the list of impacted networks is too long to be included here,
but they include a wide range of commercial, educational and government clients.
On the Telia side, the victims include the Swedish Defense Data Agency, the Finnish State Computer Center, and broadband customers in St. Petersburg.
With regard to Cogent, 
Blue Cross and Blue Shield of Delaware, Kansas State University and Reuters America were all collateral damage.
</p>

<p>
But people can and do de-peer all the time for business reasons without blowing holes in 
the Internet along the way.
Early on 14 March 2008,
Flag (AS 15412) and SingTel (AS 7473) parted ways,
but almost all of the few thousand networks carried between the two managed to find alternate paths via their peers or providers.
That's because neither tried to "stick it" to the other and allowed Internet routing to do what it does best: find another way to get from here to there.
</p>
]]>
    </content>
</entry>
<entry>
    <title>Pakistan hijacks YouTube</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/pakistan_hijacks_youtube_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=55" title="Pakistan hijacks YouTube" />
    <id>tag:www.renesys.com,2008:/blog//1.55</id>
    
    <published>2008-02-25T00:50:21Z</published>
    <updated>2008-03-07T02:08:10Z</updated>
    
    <summary> Late in the (UTC) day on 24 February 2008, Pakistan Telecom (AS 17557) began advertising a small part of YouTube&apos;s (AS 36561) assigned network. This story is almost as old as BGP. Old hands will recognize this as, fundamentally,...</summary>
    <author>
        <name>Martin A. Brown</name>
        <uri>renesys.com</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Late in the (UTC) day on 24 February 2008, Pakistan Telecom (AS 17557) began advertising a small part of YouTube's (AS 36561) assigned network.  This story is almost as old as BGP.  Old hands will recognize this as, fundamentally, the same problem as the <a href="http://merit.edu/mail.archives/nanog/1997-04/msg00380.html">infamous AS 7007 incident of 1997</a>, a <a href="http://www.renesys.com/blog/2006/01/coned_steals_the_net.shtml">more recent ConEd mistake of early 2006</a> and even <a href="http://www.renesys.com/blog/2005/12/internetwide_nearcatastrophela.shtml">TTNet's Christmas Eve gift of 2004</a>.
</p>

<p>
Just before 18:48 UTC, Pakistan Telecom, in response to a <a href="http://www.renesys.com/blog/pakistan_blocking_order.pdf">government order</a> to block access to YouTube (see <a href="http://ca.news.yahoo.com/s/afp/080224/world/denmark_media_islam_pakistan_internet_youtube">news item</a>), started advertising a route for 208.65.153.0/24 to its provider, PCCW (AS 3491).  For those unfamiliar with BGP, this is a more specific route than the ones used by YouTube (208.65.152.0/22), and therefore most routers would choose to send traffic to Pakistan Telecom for this slice of YouTube's network.
</p>]]>
        <![CDATA[<p>
I became interested in this immediately as I was concerned that I wouldn't be able to spend my evening watching imbecilic videos of cats doing foolish things (even for a cat).  Then, I started to examine our mountains of BGP data and quickly noticed that the correct AS path ("Will the real YouTube please stand up?") was getting restored to most of our peers.
</p>

<p>
The data points identified below are culled from over 250 peering sessions with 170 unique ASNs.  While it is hard to describe exactly how widely this hijacked prefix was seen, we estimate that it was seen by a bit more than two-thirds of the Internet.
</p>

<p>
This table shows the timing of the event and how quickly the route propagated (this is actually a fairly normal propagation pattern).  The ASNs counted below that saw the prefix were mostly transit ASNs, which means that these routes were distributed broadly across the Internet.  Almost all of the default free zone (DFZ) carried the hijacked route at least briefly.
<table>
  <tr>
    <td><b>18:47:00</b></td><td>uninterrupted videos of <a href="http://youtube.com/watch?v=voHTPHXy_Aw">exploding jello</a></td>
  </tr>
  <tr>
    <td><b>18:47:45</b></td><td>first evidence of hijacked route propagating in Asia, AS path 3491 17557</td>
  </tr>
  <tr>
    <td><b>18:48:00</b></td><td>several big trans-Pacific providers carrying hijacked route (9 ASNs)</td>
  </tr>
  <tr>
    <td><b>18:48:30</b></td><td>several DFZ providers now carrying the bad route (and 47 ASNs)</td>
  </tr>
  <tr>
    <td><b>18:49:00</b></td><td>most of the DFZ now carrying the bad route (and 93 ASNs)</td>
  </tr>
  <tr>
    <td><b>18:49:30</b></td><td>all providers who will carry the hijacked route have it (total 97 ASNs)</td>
  </tr>
  <tr>
    <td><b>20:07:25</b></td><td>YouTube, AS 36561 advertises the /24 that has been hijacked to its providers</td>
  </tr>
  <tr>
    <td><b>20:07:30</b></td><td>several DFZ providers stop carrying the erroneous route</td>
  </tr>
  <tr>
    <td><b>20:08:00</b></td><td>many downstream providers also drop the bad route</td>
  </tr>
  <tr>
    <td><b>20:08:30</b></td><td>and a total of 40 some-odd providers have stopped using the hijacked route</td>
  </tr>
  <tr>
    <td><b>20:18:43</b></td><td>and now, two more specific /25 routes are first seen from 36561</td>
  </tr>
  <tr>
    <td><b>20:19:37</b></td><td>25 more providers prefer the /25 routes from 36561</td>
  </tr>
  <tr>
    <td><b>20:28:12</b></td><td>peers of 36561 start seeing the routes that were advertised to transit at 20:07</td>
  </tr>
  <tr>
    <td><b>20:50:59</b></td><td>evidence of attempted prepending, AS path was 3491 17557 17557</td>
  </tr>
  <tr>
    <td><b>20:59:39</b></td><td>hijacked prefix is withdrawn by 3491, who disconnect 17557</td>
  </tr>
  <tr>
   <td><b>21:00:00</b></td><td>the world rejoices; <a href="http://youtube.com/watch?v=Zll_jAKvarw">Leeroy Jenkins online again.</a></td>
  </tr>
</table>
</p>
<p></p>
<p>
Since BGP relies on a transitive trust model, validation between customer and provider is important.  In this case, PCCW (3491) did not validate Pakistan Telecom's (17557) advertisement for 208.65.153.0/24.  By accepting this advertisement and readvertising it to its peers and providers, PCCW propagated the wrong route.  Those who saw this route from PCCW selected it because it was more specific: YouTube was advertising 208.65.152.0/22 before the event started, and the /24 was a smaller (and more specific) advertisement.  Under the usual BGP route selection process, the /24 was then chosen, effectively completing the hijack.
</p>
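<p>
For readers who prefer code to prose, here is a minimal sketch of the "more specific wins" rule using Python's standard <code>ipaddress</code> module.  The two prefixes come from the event itself; the origin labels, the sample addresses and the <code>best_route</code> helper are illustrative assumptions, not router code.  The last line assumes that the two /25 routes YouTube later announced were simply the two halves of the hijacked /24.
</p>

```python
import ipaddress

# Prefixes from the event; the origin labels are illustrative strings only.
ROUTES = {
    ipaddress.ip_network("208.65.152.0/22"): "AS 36561 (YouTube)",
    ipaddress.ip_network("208.65.153.0/24"): "AS 17557 (Pakistan Telecom)",
}


def best_route(address, routes):
    """Return (prefix, origin) for the longest prefix containing address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None
    longest = max(matches, key=lambda net: net.prefixlen)
    return longest, routes[longest]


# A sample address inside the hijacked /24 follows the more specific route.
hijacked = best_route("208.65.153.238", ROUTES)
# A sample address elsewhere in the /22 still reaches YouTube.
untouched = best_route("208.65.152.10", ROUTES)

# Assumption: the /25 announcements were the two halves of the /24.
# A /25 is longer than a /24, so it wins the same comparison back.
halves = list(ipaddress.ip_network("208.65.153.0/24").subnets(prefixlen_diff=1))

print(hijacked, untouched, halves)
```

<p>
The same comparison explains the recovery: once more specific routes from 36561 were visible, routers holding both preferred them over the hijacked /24.
</p>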

<p>
Because of the fast detection and reaction of the YouTube staff and cooperation with other providers, service for their (sub-) prefix was interrupted for about an hour and forty minutes for some lucky customers and, at most, a bit more than two hours.  The exact duration of the outage depends on your vantage point on the Internet.
</p>
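<p>
The durations can be checked against the timeline table.  This is a small sketch, assuming that the outage at a given vantage point starts with the first sighting of the hijacked route (18:47:45) and ends at one of the recovery milestones.  Which milestone applies to which network is an assumption on my part; the table only gives the timestamps.
</p>

```python
from datetime import datetime

FMT = "%H:%M:%S"
FIRST_SEEN = "18:47:45"  # first evidence of the hijacked route (from the table)

# Recovery milestones taken from the table; the labels are informal.
MILESTONES = {
    "YouTube announces the /24": "20:07:25",
    "YouTube announces the /25s": "20:18:43",
    "peers of 36561 see the routes": "20:28:12",
    "3491 withdraws the hijacked prefix": "20:59:39",
}


def elapsed(start, end):
    """Time between two same-day UTC timestamps, as a timedelta."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


durations = {label: elapsed(FIRST_SEEN, t) for label, t in MILESTONES.items()}

for label, delta in durations.items():
    print(f"{delta}  {label}")
```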

<p>
When these sorts of events occur, there is renewed interest in a variety of solutions to this problem.  BGP is fundamental to provider relationships and will not be going away anytime soon.  Cryptographic extensions to BGP have been suggested (<a href="http://www.nanog.org/mtg-0606/pdf/josh-karlin.pdf">Pretty Good BGP</a>, <a href="http://www.networkworld.com/details/6485.html?def">Secure Origin BGP</a> and <a href="http://www.ir.bbn.com/sbgp/">SBGP</a>).  These may be too taxing for router CPUs.  Of course, after any sort of hijacking event (whether inadvertent or malicious) prefix and AS monitoring is suggested (e.g., <a href="http://iar.cs.unm.edu/">the Internet Alert Registry</a>, <a href="http://phas.netsec.colostate.edu/">the Prefix Hijack Alert System</a>, <a href="http://www.ris.ripe.net/myasn.html">RIPE's MyASN</a> and 
<a href="http://www.renesys.com/products_services/routing_intelligence/">Renesys' Routing Intelligence</a>).
</p>

<p>
Ultimately, though, the problem remains one of transitive trust.  A provider can and should limit the advertisements it will accept from a customer.  The mechanics can be arranged manually or can be configured using Routing Policy Specification Language (RPSL) to communicate the policy and drive configuration.   In the case of Pakistan Telecom, they originate or transit fewer than 1000 prefixes.
</p>
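<p>
What such a customer filter amounts to can be sketched in a few lines.  This is not RPSL and not any vendor's configuration syntax; it is a hypothetical allow-list check in Python, with documentation prefixes (192.0.2.0/24 and 198.51.100.0/24) standing in for a customer's registered address space.  The <code>accept</code> function and the /24 length cap are assumptions made for illustration.
</p>

```python
import ipaddress

# Hypothetical allow-list: prefixes the provider has agreed to accept
# from one customer. Documentation ranges are used as placeholders.
CUSTOMER_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def accept(advertised, allowed, max_len=24):
    """Accept an advertisement only if it lies inside the allow-list.

    max_len is an assumed cap on prefix length, a common filtering
    convention rather than anything stated in this post.
    """
    net = ipaddress.ip_network(advertised)
    if net.prefixlen > max_len:
        return False
    return any(net.subnet_of(prefix) for prefix in allowed)


# The customer's own space passes; someone else's /24 does not.
own_space = accept("192.0.2.0/24", CUSTOMER_PREFIXES)
hijack = accept("208.65.153.0/24", CUSTOMER_PREFIXES)
too_long = accept("192.0.2.128/25", CUSTOMER_PREFIXES)

print(own_space, hijack, too_long)
```

<p>
With fewer than 1000 prefixes to list, a filter of this kind is small enough to maintain for a customer the size of Pakistan Telecom.
</p>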

<p>
So, it's heartwarming to know that two things are still true.  It is still trivially possible to hijack prefixes (whether maliciously or inadvertently).  I can go to sleep knowing that my neighbors are happily watching their <a href="http://icanhascheezburger.com/2007/01/11/i-can-has-cheezburger/">LOLCATS</a>.
</p>]]>
    </content>
</entry>
<entry>
    <title>On the road again:  Diary of an itinerant Internet transit sales guy</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/on_the_road_again_diary_of_an_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=50" title="On the road again:  Diary of an itinerant Internet transit sales guy" />
    <id>tag:www.renesys.com,2008:/blog//1.50</id>
    
    <published>2008-02-21T12:45:45Z</published>
    <updated>2008-02-29T00:34:26Z</updated>
    
    <summary> Bob, the sales guy. Ditched my #@!%$! cell in Stockholm. Verizon CDMA does not work in Europe! Upside: I now have a shiny, new World Edition Blackberry GSM/CDMA. I call it Trixie. Road Tip: Just say NO! to mouth-searing...</summary>
    <author>
        <name>Bob Fletcher</name>
        
    </author>
            <category term="Business" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<table align="left">
<tr>
<td>
<a href="http://www.renesys.com/about/management.shtml">
<img src="http://www.renesys.com/blog/bob/travel/Bob.jpg" width="150" height="194"></a>
</td>
</tr>
<tr>
<td align="center"><em>Bob, the sales guy.</em>
</td>
</tr>
</table>

<p>
Ditched my #@!%$! cell in Stockholm. Verizon CDMA does not work in Europe! Upside: I now have a shiny, new World Edition Blackberry GSM/CDMA. I call it Trixie.
</p>

<blockquote>
<table><tr><td bgcolor=#99FFFF>
<em>Road Tip:</em> Just say NO! to mouth-searing kimchi or Indian curry for breakfast. No matter how polite you're trying to be.
</td></tr></table>
</blockquote>

<p>
With barely enough time to recharge Trixie after calls in Denver, Albuquerque, Stockholm and Bonn, I hopped a jet for Tokyo, Hong Kong and Kuala Lumpur. (Trixie and I barely made it out of KL alive. Cab driver must have been conserving gas; tried to piggyback car in front.) Beginning to feel like Marco Polo on 'roids, but I gotta check out LA and DC before catching a shuttle back home to Boston (close enough).
</p>

<p>
Trixie is overloaded with commentary, observations, insider scoops, and . . . new NSP sales and marketing contacts! (Hey, I'm a sales guy.) Time to download and see what comes out . . .
</p>

]]>
        <![CDATA[<p>
<b>Eastern Europe</b> seems to be the new hot spot on the continent. 
Lots of Internet transit NSPs are staking claims. 
Mostly Europeans and the usual large-scale suspects, but a few US providers are breaking in.
</p>

<p>
<b>Action out of Africa.</b> 
Early NSP missionary work (infrastructure, key relationships, etc.) is starting to make converts, 
mostly along the tips of the continent. 
South Africa and the balmy Mediterranean shore are lapping up most of the attention.
</p>

<blockquote>
<table><tr><td bgcolor=#99FFFF>
<em>Road Tip:</em> Need to track a package from anywhere to anywhere? 
Don't use anyone but FedEx. (They didn't pay me to say this.)
</td></tr></table>
</blockquote>

<p>
<b>Middle East</b> incumbent NSPs (grandfathered from the good old government-run land line days) are eager
to establish an Internet transit hub for traffic between Europe and Asia. 
They're investing heavily in technology and talent. 
</p>

<table align="right">
<tr>
<td>
<img src="http://www.renesys.com/blog/bob/travel/dubai.jpg" width="350" height="231">
</td>
</tr>
<tr>
<td align="center"><em>Construction cranes dot the dunes of Dubai.</em>
</td>
</tr>
</table>

<p>
<b>Last year's Taiwan fiber outages</b> caused by the Boxing Day (I grew up in England) 
<a href="http://www.renesys.com/tech/presentations/pdf/nanog42.pdf">earthquake of 2006</a> shook up the 
IP transit market. 
Back-up routing out of Asia is now all the rage and new undersea cables are planned. 
Russian comrades pushing a land-based fiber path across Asia can't take orders fast enough. 
Margins are enviable.
</p>

<blockquote>
<table><tr><td bgcolor=#99FFFF>
<em>Road Tip:</em> Re Asian hotel Internet connections. VoIP operational only in Japan.
</td></tr></table>
</blockquote>

<p>
<b>Localization, localization, localization.</b>
Watch it grow &#8212; if you can watch that fast. 
Seems like just a year or two ago (a half-life in Internet years) most Internet content came from the US. 
Now, language-based contours are increasing exponentially.
It's like the Wild West (or East) out there!
</p>

<p>
My finely-tuned sales guy antenna is vibrating: Japan now generates 50% of its own content, China 24%, and 5% for Malaysia! Spain and Latin America have their own action. Then there's Poland. Who needs Google and MySpace? They've got their own. 
</p>

<table align="left">
<tr>
<td>
<img src="http://www.renesys.com/blog/bob/travel/roppongi.jpg" width="275" height="288">
</td>
</tr>
<tr>
<td align="center"><em>Nightlife in Tokyo's Roppongi district.</em>
</td>
</tr>
</table>

<p>
Now that Asia's generating so much of its own content, major home-grown NSPs expect to have very different (read: smaller) relationships with the current Tier-1 NSPs. 
If China Telecom's relationship with Sprint should, shall we say, dwindle, Sprint could be knocked down a peg or two in 
<a href="http://www.renesys.com/products_services/market_intel">Renesys' Global Customer Base index</a>.
</p>

<p><b>Scuttlebutt, afterthoughts, noise.</b></p>

<p>
Want to make a profit? Then forget about selling Internet transit to Wal-Mart . . . I mean, Google.
</p>

<p>
Take a walk on the wild side, otherwise known as Internet transit peering. 
Relationships are often sealed with the equivalent of a handshake in a back room. 
Not so with their not-so-distant relatives in the more orthodox voice market. 
They negotiate contracts! On paper! At the likes of Intelsat GTM in Washington, DC each spring. 
I'm not saying that people don't peer closely at their peering ratios; they do. 
And make important decisions based on them, too. 
It's just kind of like, oh, living together instead of getting married.
</p>

<p>
Internet transit pricing varies. Duh. 
Here's why: it depends on local competition of course, but also on just plain geography. 
It's priciest in Australia and the Middle East. 
One exec told me that they changed plans to locate their Asian office in Sydney, and decided to go with Hong Kong     
&#8212; based solely on projected cost of Internet transit in Sydney! 
Sure didn't seem cheap to me last time I was in Hong Kong. Sydney pricing must be surreal.
</p>

<p>
Can't help but notice this correlation: NSP growth equals (or surpasses) how much they've outsourced. 
Reminds me of the light and nimble British Navy taking pot shots at a lumbering Spanish Armada. 
Large NSPs lug around large organizations. 
They don't respond to change and opportunities as quickly as leaner organizations that outsourced everything but their crack sales staff after the dot com crash. 
Disclaimer: Exclude from preceding characterization all nimble, responsive NSPs who subscribe to Renesys 
Market Intelligence. They're doing great. (Hey, I'm a sales guy.)
</p>

<p>
Contact me at bobf@renesys.com to chat (or buy licenses :-)
</p>
]]>
    </content>
</entry>
<entry>
    <title>Cable Breaks: Lessons Learned </title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/cable_breaks_lessons_learned_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=48" title="Cable Breaks: Lessons Learned " />
    <id>tag:www.renesys.com,2008:/blog//1.48</id>
    
    <published>2008-02-11T18:30:36Z</published>
    <updated>2008-02-26T09:09:09Z</updated>
    
    <summary> In the past 14 months, the world has seen two catastrophic failures of its global telecommunications systems: the Taiwan quakes, which snapped 7 of 9 important cables in Asia in December 2006, and a series of mishaps in the...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Business" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
In the past 14 months, 
the world has seen two catastrophic failures of its global telecommunications systems:
the  
<a href="http://www.renesys.com/blog/2007/01/the_shape_of_disaster_on_the_n.shtml">Taiwan quakes</a>,
which snapped 7 of 9 important cables in Asia in December 2006,
and a series of mishaps in the 
<a href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break.shtml">Mediterranean</a>
and the 
<a href="http://www.telegeography.com/cu/article.php?article_id=21567">Gulf</a>,
damaging several others.
In a world increasingly dependent on global trade and communications,
what lessons can we learn from all of this and what measures should we take?
</p>

<p>
I'll discuss these questions in what follows, 
but let me warn you in advance.
There is <em>nothing</em> earth-shattering here.
In fact, I can save you time and sum up the entire discussion with three bullet points:
</p>
<ul>
<li>You get what you pay for.</li>
<li>Entropy happens.</li>
<li>Geography matters.</li>
</ul>
<p>
We've seen a lot of comments and discussion that fail to take into account one or more of these basic truths.
Let's look at each point in detail.
</p>]]>
        <![CDATA[<p>
<u>You get what you pay for.</u>
</p>
<p>
The segment of humanity that relies on the Internet can be roughly broken into two camps:
Internet Consumers and Internet Providers.
Internet Consumers are individuals, governments or businesses who use the Internet for their own purposes.
Internet Providers are in the business of providing Internet access to Internet Consumers.
While it is possible to be in both camps,
most groups are largely in one or the other.
Each has different lessons to learn.
</p>

<p>
To protect themselves, Internet Consumers need to do at most three things.
</p>
<p>
<ol>
<li> Determine the importance of their connectivity. </li>
<li> Make the business case for change, if it is warranted. </li>
<li> Become an informed consumer and buy appropriately.</li>
</ol>
</p>
<p>
Can your business survive a complete loss of connectivity or severely degraded connectivity for a day, a week, or a month?  
If your answer is "no" for anything less than a month, you need to seriously consider your disaster plans.
Undersea cables can easily take a month to repair.  
And in times of distress, there may be many more organizations (with deeper pockets) in line ahead of you.
You need more than multiple providers or backup lines. 
You need to concern yourself with their physical independence.
In other words, having 3 or 10 providers is of no help if they all use the same submarine cable.
Don't think of the expense of multiple providers as insurance against an unlikely event, such as your house catching on fire.  
Your house almost certainly is not going to burn to the ground.  
However, the Internet cables your business runs on are going to fail; 
it is only a matter of when and to what degree.
Figure out what reliable, fault-tolerant connectivity is worth to you and then buy what makes
sense.
No one else can do this for you.
</p>

<p>
To sell to informed Internet Consumers and stay competitive, Internet Providers need to do at least three things.
</p>
<p>
<ol>
<li> Plan for disasters and have a response plan in place.</li>
<li> Build redundancy. </li>
<li> Sell your great network to your customers.</li>
</ol>
</p>
<p>
When planning for disasters, consider your response to individual cable breaks.  
If you plan to just "wait it out", consider what losing all your customers in a region will mean to your bottom line.  They won't be waiting for the repair ships to arrive.  As we saw with the latest round of cable cuts, 
<a href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part.shtml">reaction time</a>
is everything in gaining new business.  
Your former loyal customers will be gone in a heartbeat if you can't deliver in an emergency.
Of course, in order to have any contingency plan at all, you need some redundancy in your
network, which then gives you a great sales and marketing tool, 
<em>especially</em> after events like these.
Remind your potential customers of the dangers of a focus on price alone.
Even if they don't buy from you today, they will still be around when the next cable breaks.
Of course, to be effective and stay one step ahead of your competitors, 
you need up-to-the-minute
<a href="http://www.renesys.com/products_services">Internet Intelligence</a>
which is where Renesys comes in.
It is no surprise to us to see our customers doing well both during and after such outages.
</p>

<p>
<u>Entropy happens.</u>
</p>
<p>
While I'll admit that multiple cable breaks in a week do seem somewhat unlikely, 
I learned long ago to never ascribe to malice that which can be explained by incompetence or simply chance.
Given the laws of nature and the fact that there are over 6 billion of us bumbling along, 
I can't say it is all that surprising either.  
Individual cable outages occur all the time and seldom get much attention -
the sea floor is not the most hospitable environment for fiber optics.
<a href="http://www.economist.com/world/international/displaystory.cfm?story_id=10653963">The Economist</a>
reports that there were over 50 repair operations in the Atlantic last year alone.
<a href="http://blog.wired.com/27bstroke6/2008/02/who-cut-the-cab.html">Wired</a>
claims an average of one cable cut every three days.
What these outages teach us is that chance events do not have to be uniformly distributed by geography or by time.
And the only reason we even noticed these is because of the extreme lack of redundancy in the region.
</p>

<p>
<u>Geography matters.</u>
</p>
<p>
As does 
<a href="http://www.renesys.com/blog/2008/01/15th_century_routing.shtml">weather</a>, economics, politics and other annoyances of a non-virtual world.
You can see these forces in action on
<a href="http://image.guardian.co.uk/sys-images/Technology/Pix/pictures/2008/02/01/SeaCableHi.jpg"> cable maps</a>.
If you look at the Middle East in particular, where would you lay the cables?
</p>
<p>
<ul>
<li>West?  That takes you through the Sahara.</li>
<li>South?  It's a long way around the horn of Africa.</li>
<li>East?  That's the really long way to reach your business partners in Europe.</li>
<li>North? Better, but avoid the war zones.</li>
</ul>
</p>
<p>
The point is simply that your options are going to be influenced by where you live.
</p>

<p>
I don't mean to imply that any of the problems we've seen are unsolvable. 
If Internet Consumers demand cheap access above all else, they will get it and, as a consequence, it will be unreliable.  If they demand reliable access above all else and are willing to pay for it, the Internet Providers will be more than happy to accommodate them.  
The ones who don't won't be around for long.
And in the overall scheme of things, 
we are not talking about a lot of money.
A mere $125M USD gets you
<a href="http://www.webwire.com/ViewPressRel.asp?aId=58061">a brand new cable</a>
from Egypt to France, but only about one third of
<a href="http://www.iht.com/articles/2007/03/28/business/web-0327sharkey.php">a plane.</a>
Internet consumers, which include governments,
need to decide which is more useful.
</p>]]>
    </content>
</entry>
<entry>
    <title>Mediterranean Cable Break - Part IV</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part_3.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=54" title="Mediterranean Cable Break - Part IV" />
    <id>tag:www.renesys.com,2008:/blog//1.54</id>
    
    <published>2008-02-07T19:58:00Z</published>
    <updated>2008-02-25T10:33:33Z</updated>
    
    <summary> We started this blog thread last week, when we only had two broken cables to consider, but since that time there have been reports of several more failures and they seem to keep coming in. As far as this...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
We started this blog thread last week, when we <em>only</em> had two broken cables to consider, but since that time there have been 
<a href="http://www.telegeography.com/cu/article.php?article_id=21567">reports</a>
of several more failures and they seem to keep 
<a href="http://www.khaleejtimes.com/DisplayArticle.asp?xfile=data/theuae/2008/February/theuae_February155.xml&section=theuae">coming in.</a>
As far as this thread is concerned, the first two parts
(<a href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break.shtml">here</a>
and 
<a href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break.shtml">here</a>)
focused on the countries and local providers most impacted on the day of the first two cable failures.
We then looked at the providers of some of the
<a href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part.shtml">harder-hit countries</a>
and how they were able to restore connectivity (or not) during the subsequent 48 hours.
And along the way, we felt obliged to counter some nonsense circulating on the Internet <a href="http://www.renesys.com/blog/2008/02/attention_iran_is_not_disconne_1.shtml">claiming that Iran had been cut off.</a>
It's been a busy week and we've barely scratched the surface.
But plowing ahead, 
we will take an extended look at two local providers, Bharti in India and DCI in Iran,
and how they weathered the storm.
One week later, how are these two local providers gaining access to the global Internet?
What has changed?  
We will use these examples to provide a glimpse into what can be discovered by collecting up enough public routing data from enough carefully selected places,
combining it with geo-location information and then doing an enormous amount of processing.
</p>
]]>
        <![CDATA[<p>
I'm going to start with a word of caution:
this will be the most technical of our discussions so far.
However, it will not be difficult to follow if we take things one step at a time.
The first simple observation is that an organization can be both a customer and a provider, depending on one's point of view.
For example, Bharti is both a provider to numerous companies in India and a customer of Sprint.  
So whether someone says they are a provider or a customer depends on the direction in which the money is flowing, toward them (provider) or away from them (customer).
To introduce some terminology,
let's consider customer C with providers P1 and P2.
From our data, Renesys will observe these two business relationships, as well as
the networks (IP prefixes for you routing experts)  that are routed across them.  
We can infer a lot by watching the routing along these links and how it changes over time. 
For example, if P1 is having a problem, we might see C suddenly shift some of their networks to P2.
We will observe this as a decrease in the number of networks on the C-to-P1 link and a corresponding increase on the C-to-P2 link.
On the Internet, traffic equates to money, so P1 just lost some cash flow, while P2 gained some.
</p>
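<p>
The bookkeeping described here can be sketched as follows.  The snapshot format, the placeholder prefixes and the <code>prefixes_per_link</code> helper are all hypothetical; this only illustrates counting networks per customer-to-provider link and comparing two points in time, not how Renesys actually processes its data.
</p>

```python
from collections import Counter


def prefixes_per_link(snapshot):
    """Count routed prefixes on each (customer, provider) link.

    snapshot is an iterable of (prefix, customer, provider) tuples.
    """
    return Counter((customer, provider) for _prefix, customer, provider in snapshot)


# Hypothetical snapshots: C moves two of its networks from P1 to P2.
before = [
    ("10.0.0.0/24", "C", "P1"),
    ("10.0.1.0/24", "C", "P1"),
    ("10.0.2.0/24", "C", "P1"),
    ("10.0.3.0/24", "C", "P2"),
]
after = [
    ("10.0.0.0/24", "C", "P1"),
    ("10.0.1.0/24", "C", "P2"),
    ("10.0.2.0/24", "C", "P2"),
    ("10.0.3.0/24", "C", "P2"),
]

before_counts = prefixes_per_link(before)
after_counts = prefixes_per_link(after)

# Change per link; a Counter returns 0 for links it has never seen.
delta = {
    link: after_counts[link] - before_counts[link]
    for link in before_counts.keys() | after_counts.keys()
}

print(delta)
```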

<p>
With this background, let's take a closer look at Bharti and the major carriers who connect them to the rest of the global Internet.
Before the cable cuts,
Bharti was receiving service from several carriers including British Telecom (BT), Deutsche Telekom (DTAG), Cable & Wireless (C&W) and Sprint.
We observed these four particular carriers until the cable breaks, and then each of these simply went away.
Only Sprint eventually recovered to some degree on 2 February, 
but ended up carrying far fewer networks.
It is not surprising that certain carriers went completely off-line, 
but why did Sprint come back after two days?
No cables were repaired during that time and no new ones were suddenly brought into service.
</p>

<a href="http://www.renesys.com/blog/egypt/9498.pfxs.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/9498.pfxs.png','popup','scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/9498.pfxs.500x150.png">
</a>

<p>
Sprint has a strong global network and has considerable capacity heading from Asia to the west coast of the US.
If their outage could have been corrected by a configuration change,
you would think that that would not have taken two days.
Are they selling service in India on routers without capacity in both directions?
Were they preferring their more expensive MPLS service over IP and had no available bandwidth for IP?
If so, what happened after two days to restore IP service?
Looking at Sprint's <a href="http://www.sprintworldwide.com/english/maps/global_ip_mpls_maps.pdf">network maps</a> (see page 9),
they claim to have capacity on SEA-ME-WE 3,
which was not impacted by the outages.
What exactly was Sprint doing for those two days in India?
</p>

<p>
As for the providers who gained new traffic,
AT&T, SingTel and Level 3 initially picked up new networks from Bharti.
However, all of them subsequently fell, perhaps due to another cable cut, 
with only Level 3 managing to preserve some of their gains.  
This answers one of our questions from an 
<a href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part.shtml">earlier blog</a> about exactly how Level 3 managed to gain business in India.  
It was due almost entirely to Bharti, a very large local provider.
</p>

<p>
Now, let's consider DCI in Iran.
DCI is the only provider in Iran with connections to the outside world.
Most of their traffic flowed via TTNet, SingTel or Flag before the breaks, and not surprisingly, Flag lost many of the networks it carried earlier.
</p>

<a href="http://www.renesys.com/blog/egypt/IR.pfxs.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/IR.pfxs.png','popup','scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/IR.pfxs.500x150.png">
</a>

<p>
But this graph might seem to contradict our 
<a href="http://www.renesys.com/blog/2008/02/attention_iran_is_not_disconne_1.shtml">previous blog,</a>
where we said that the networks in Iran that suffered outages had quickly recovered.
The graph shows a drop in networks carried by Flag,
but no corresponding rise in networks for TTNet and/or SingTel.
This is explained by the fact that networks can be carried by more than one provider.
For example, I might reach a network in Iran via Flag, but you might reach that very same network via TTNet.  
This is why Iran was able to recover so quickly.
DCI could use any one of their three primary providers and, in fact, were using more than one of them for many of their networks.
When Flag failed, traffic could easily move to one of the surviving providers.
So although total bandwidth into the country was reduced, 
there was little in the way of a long-term outage for many networks.
</p>
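<p>
A toy example makes the point concrete.  The network labels and the provider-to-network assignments below are invented; the sketch only shows how one provider's count can drop while the others' counts stay flat and every network remains reachable.
</p>

```python
# Hypothetical data: which networks each provider carries.
# Labels n1..n5 stand in for real IP prefixes.
before = {
    "Flag": {"n1", "n2", "n3", "n4"},
    "TTNet": {"n1", "n2", "n3", "n5"},
    "SingTel": {"n3", "n4", "n5"},
}
# After the break, Flag carries only one network; the others are unchanged.
after = dict(before, Flag={"n1"})


def counts(carried):
    """Number of networks seen via each provider."""
    return {provider: len(nets) for provider, nets in carried.items()}


def reachable(carried):
    """Networks reachable through at least one provider."""
    return set().union(*carried.values())


print(counts(before), counts(after))
print(reachable(before) == reachable(after))
```

<p>
In this toy case Flag's count falls from four to one, TTNet and SingTel show no rise at all, and yet nothing has become unreachable, because every network Flag dropped was already carried by a survivor.
</p>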

<p>
From this discussion, we can see that graphing the number of networks over time does not always tell the whole story.  
Here SingTel and TTNet both could have picked up a lot of new <em>traffic</em> because of the failure of Flag, but not necessarily any new <em>networks</em>.
How can we observe such situations in the routing data?
Well, when Iran had three main providers, the rest of the world would pick these three in some proportion based on various 
<a href="http://www.faqs.org/rfcs/rfc1771.html">routing attributes</a>, 
which are beyond the scope of this discussion.
However, when Flag went away, there were only two primary Iranian providers left standing.
Renesys' worldwide assortment of routing peers (i.e., data collection points)
would have been forced to pick one of the two survivors for traffic into Iran.
We can capture this with a metric we call PPT, peer-prefix-time.
Basically, for each of the Renesys peers worldwide,
we count up the total amount of time this peer routes a network (prefix) in a particular way.
Thus, for each network in Iran and each peer, we'll know how long it was routed via TTNet or Flag or SingTel.
Adding up these times for all networks in Iran tells us how popular these providers are for gaining access to Iran on any given day.  We show this in the graph below.
</p>
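<p>
As a rough illustration of the idea, here is a sketch of a peer-prefix-time tally.  The observation format, the sample numbers and both helper functions are assumptions made for this example; the prose above is the only definition of PPT given here, and the real computation surely involves more detail.
</p>

```python
from collections import defaultdict


def peer_prefix_time(observations):
    """Sum, per provider, the seconds each peer routed each prefix via it.

    observations is an iterable of (peer, prefix, provider, seconds) tuples.
    """
    totals = defaultdict(int)
    for _peer, _prefix, provider, seconds in observations:
        totals[provider] += seconds
    return dict(totals)


def shares(totals):
    """Each provider's fraction of the total peer-prefix-time."""
    grand_total = sum(totals.values())
    return {provider: t / grand_total for provider, t in totals.items()}


DAY = 86400  # seconds in one day

# Hypothetical day of data: two peers, two Iranian prefixes.
observations = [
    ("peer1", "p1", "TTNet", DAY),
    ("peer1", "p2", "TTNet", DAY // 2),
    ("peer1", "p2", "SingTel", DAY // 2),
    ("peer2", "p1", "SingTel", DAY),
    ("peer2", "p2", "Flag", 3600),
    ("peer2", "p2", "TTNet", DAY - 3600),
]

totals = peer_prefix_time(observations)
print(totals)
print(shares(totals))
```

<p>
Because every peer routes every prefix via some provider at every moment, the totals add up to peers times prefixes times one day, so the shares can be compared directly from day to day.
</p>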

<a href="http://www.renesys.com/blog/egypt/IR.ppt.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/IR.ppt.png','popup','scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/IR.ppt.500x150.png">
</a>
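
<p>
To make the PPT bookkeeping concrete, here is a minimal sketch in Python.
It is not our production code: the record layout, the peer names, the prefixes and the durations are all invented for illustration.
It simply adds up, per provider, the time each peer routed each prefix through that provider, and then converts the totals into shares.
</p>

```python
from collections import defaultdict

# Hypothetical observations: (peer, prefix, upstream provider, seconds routed that way).
# Peers, prefixes and durations are made up; only the provider names come from the text.
observations = [
    ("peer-a", "192.0.2.0/24", "TTNet", 60000),
    ("peer-a", "192.0.2.0/24", "SingTel", 26400),
    ("peer-b", "192.0.2.0/24", "SingTel", 86400),
    ("peer-a", "198.51.100.0/24", "TTNet", 86400),
    ("peer-b", "198.51.100.0/24", "Flag", 3600),
    ("peer-b", "198.51.100.0/24", "TTNet", 82800),
]


def peer_prefix_time(records):
    """Sum, per provider, the seconds each (peer, prefix) pair was routed through it."""
    totals = defaultdict(int)
    for _peer, _prefix, provider, seconds in records:
        totals[provider] += seconds
    return dict(totals)


def shares(totals):
    """Convert raw PPT totals into each provider's fraction of the whole."""
    grand_total = sum(totals.values())
    return {provider: value / grand_total for provider, value in totals.items()}


ppt = peer_prefix_time(observations)
ppt_shares = shares(ppt)
```

<p>
With two peers watching two prefixes for one day each, the totals add up to four peer-prefix-days, and each provider's share is its slice of that total.
Computing the same shares day by day is what produces a curve per provider.
</p>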

<p>
Immediately after the cable cuts, a significant majority of the world preferred TTNet over SingTel for access to Iran.  As time went on, however, the two providers reached rough parity.  This could have been because of routing decisions made by DCI to balance traffic between their remaining providers.
The point of this example is that simply counting up routed networks,
while useful, only gets you so far.
You also need to know how the rest of the world chooses among the available options and in
what proportion.
Flag, which still routed networks to Iran after the cuts, 
was selected by almost no one.
</p>

<p>
I want to thank you for getting to this point in my blogs and for all the thoughtful comments I have been receiving both publicly and privately.  I wish I had the time to answer all the questions,
but I guess if I did we wouldn't have much of a business.
Renesys makes its money by selling such 
<a href="http://www.renesys.com/products_services">Internet Intelligence</a> to its customers.
So with this blog, I am going to close out our discussion of cable breaks for now,
except that I'll soon follow up with some non-technical concluding remarks and lessons learned.
</p>]]>
    </content>
</entry>
<entry>
    <title>ATTENTION: Iran is not disconnected!</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/attention_iran_is_not_disconne_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=47" title="ATTENTION: Iran is not disconnected!" />
    <id>tag:www.renesys.com,2008:/blog//1.47</id>
    
    <published>2008-02-03T23:15:55Z</published>
    <updated>2008-02-15T12:42:01Z</updated>
    
    <summary> Let me repeat, Iran is not disconnected from the Internet! We have gotten a few queries about why we did not highlight Iran in our review of the network outages that resulted from the cable breaks. (See here, here...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Let me repeat, Iran is <em>not</em> disconnected from the Internet!
</p>
<p>
We have gotten a few queries about why we did not highlight Iran in our review of the network outages that resulted from the cable breaks.
(See
<a href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break.shtml">here</a>,
<a href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break_part_1.shtml">here</a> and
<a href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part.shtml">here</a>.)
As in most countries in the region,
the outages in Iran were very significant, but for the most part they did not exceed 20% of the country's total number of networks.
Now 20% is a significant loss, but in the context of an event where countries lost almost all of their connectivity, such a loss did not place Iran among the top 10 impacted countries.
So we focused most of our attention where the losses were the highest.
</p>
]]>
        <![CDATA[<p>
But then there was this Slashdot 
<a href="http://hardware.slashdot.org/article.pl?sid=08/02/01/1912220">posting</a>,
claiming Iran had zero connectivity.
This was news to us.
It's said that <a href="http://www.guardian.co.uk/notesandqueries/query/0,5753,-21510,00.html">
"the first casualty of war is truth."</a>
Something similar can probably be said with regard to catastrophic failures. 
Truth might not be first, but it is a very close second.
Journalists are pushed to meet deadlines for stories about topics with which they have little familiarity, 
and technical experts sometimes jump to conclusions on the basis of little evidence.
It's not hard to see why the truth gets distorted;  it's hard to think clearly when you believe the sky is falling.
</p>

<p>
The Slashdot claim was made because a web page at the 
<a href="http://www.internettrafficreport.com/asia.htm">Internet Traffic Report</a>
was reporting that the country was down.
This report seems to be based on pings to a single router in Iran from multiple places around the world, which at best only indicates that one router in Iran is unavailable, 
not that the entire Internet has ceased to function there.
Of course, once something ends up "in print",
it tends to gain credibility and then be referenced by others.
And before long, large numbers of people think it is actually true.
(For a detailed ping analysis to the region during the outage, see this 
<a href="https://confluence.slac.stanford.edu/display/IEPM/Effects+of+Fibre+Outage+through+Mediterranean">
article.</a>)
</p>

<p>
To understand what happened in Iran after the fiber cuts, 
we looked at actual routing data for the country, collected from around the globe.
You can say with absolute certainty that if a provider does not have a route to <em>any</em> network in Iran, 
then no traffic will flow from that provider or its customers to Iran.
But that is all you can say.  The problem could be with the provider.
That is why Renesys collects routes from a carefully selected set of peers around the world.
If none of them know how to get to Iran, then you can be assured that Iran is truly off the air.
Note that you have to be careful here with your selection of peers.  If all of them end up traversing the same cable to get to Iran, even when other options exist, then the problem could be only with that cable and nothing more.  To make a definitive statement about the worldwide reachability of any geography, 
you need to collect data from a diverse and at least somewhat independent set of peers so that you'll see all paths into the area.
When the overwhelming majority of them have the same view of a situation,
then you can conclude that the view is almost certainly correct for the entire world.
</p>
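
<p>
As a rough illustration of that last rule, the sketch below flags a network as unreachable only when an overwhelming majority of peers have no route to it.
The peer names, the prefixes and the 90% threshold are assumptions made up for this example; they are not a description of how Renesys actually sets its thresholds.
</p>

```python
def unreachable_prefixes(prefixes, peer_tables, threshold=0.9):
    """Return the prefixes that at least `threshold` of the peers have no route to.

    peer_tables maps a peer name to the set of prefixes that peer can route.
    The 0.9 default is an arbitrary illustrative choice.
    """
    if not peer_tables:
        raise ValueError("need at least one peer to say anything")
    down = set()
    for prefix in prefixes:
        missing = sum(1 for table in peer_tables.values() if prefix not in table)
        if missing / len(peer_tables) >= threshold:
            down.add(prefix)
    return down


# Hypothetical routing views from four peers.
peer_tables = {
    "peer-a": {"192.0.2.0/24", "198.51.100.0/24"},
    "peer-b": {"192.0.2.0/24"},
    "peer-c": {"192.0.2.0/24"},
    "peer-d": {"192.0.2.0/24"},
}
prefixes = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]

strict = unreachable_prefixes(prefixes, peer_tables)
looser = unreachable_prefixes(prefixes, peer_tables, threshold=0.75)
```

<p>
Note that a single peer with a route is enough to keep a network off the strict list.
The check is also only as good as the diversity of the peers feeding it: if they all share one cable, they will all agree and still be wrong about the rest of the world.
</p>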

<p>
So back to Iran.  In the following graph, we plotted the availability of Iranian networks for four entire days, 30 January 00:00 UTC until 3 February 00:00 UTC.
The first day is the day of the cable cuts.
Of the 695 networks that geo-locate to Iran,
at no time were more than 199 unavailable,
as observed by a large number of Renesys peers.
A few peers here and there might not have been able to reach Iran for local reasons, 
but the vast majority of the world could get to most of the networks in Iran for this entire
time period.
Note also that around 64 networks were unavailable before the event even started.
These networks could be simply unused at this time.  
In other words, at most 135 networks that were active before the cable cuts disappeared for at least a short while during the outages.
</p>

<p>
<table>
<tr>
<td>
<a href="http://www.renesys.com/blog/iran/iran.png" 
onclick="window.open('http://www.renesys.com/blog/iran/iran.png','popup','width=1500,height=450,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/iran/iran.png" width="750" height="225">
</a>
</td>
</tr>
<tr>
<td align="center"><em>Global Reachability of Iranian Networks</em>
</td>
</tr>
</table>
</p>
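
<p>
For readers who want to check the arithmetic, here is the calculation behind the counts quoted above, written out in Python.
The three input numbers come straight from the text; the variable names are just for illustration, and "around 64" is treated as exactly 64.
</p>

```python
total_networks = 695     # networks that geo-locate to Iran
peak_unavailable = 199   # most networks unavailable at any one time
already_down = 64        # approximate number unavailable before the cuts

# Networks that were up before the cuts and disappeared at some point.
newly_down = peak_unavailable - already_down

# Fraction of all Iranian networks newly lost at the worst moment (about 19.4%).
share_newly_down = newly_down / total_networks

# Fraction of all Iranian networks still reachable at the worst moment (about 71.4%).
share_reachable_at_worst = (total_networks - peak_unavailable) / total_networks
```

<p>
In other words, even at the low point, roughly seven out of ten Iranian networks were still reachable from most of the world.
</p>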

<p>
So much for Iran being off the Internet.
Again, this is not to imply that Iran was not impacted by this event.
A lot of networks were unavailable and some of them continue to be so.
The end users of those networks are certainly noticing the problem 
and everyone in the country might be experiencing a slowdown due to the decrease in bandwidth
to the region.  Still, Iran fared much better than most.
</p>]]>
    </content>
</entry>
<entry>
    <title>Mediterranean Cable Break - Part III</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/02/mediterranean_cable_break_part.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=46" title="Mediterranean Cable Break - Part III" />
    <id>tag:www.renesys.com,2008:/blog//1.46</id>
    
    <published>2008-02-02T11:17:24Z</published>
    <updated>2008-02-11T22:05:29Z</updated>
    
    <summary> Our first two blog entries on this topic focused on the events of 30 January 2008, when two submarine cables systems were damaged. These systems provided much of the capacity into the Middle East and the Indian subcontinent from...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Our first two blog entries on this topic focused on the events of 30 January 2008, when two submarine cable systems were damaged.  These systems provided much of the capacity into the Middle East and the Indian subcontinent from the west.  Although some countries were hurt more than others, the loss of connectivity was extensive and very widespread.  Some countries and a few providers were almost completely knocked off the Internet.  As Day 1 came to a close, it was clear that the damaged cables were not going to be repaired anytime soon and the impacted parties would have to look for alternatives to waiting it out.
</p>
<p>
Days 2 and 3 saw a frenzy of activity as local providers in the region tried to broker agreements with anyone who still had capacity.
They were under intense pressure to restore service to local governments and businesses.
In turn, global and regional providers with surviving capacity into the region were busy hunting for new customers.
We definitely had a seller's market.
At Renesys, we watched all of the activity with great interest and decided to wait until the end of Day 3 to report on the winners and losers, after the initial deals were made and things had settled down to some degree.
</p>
]]>
        <![CDATA[<p>
In this part,
we'll examine five selected countries and the providers into them both before and after the event.
The countries we'll consider are Egypt, Kuwait, Saudi Arabia, Pakistan and India.
We picked these because of their total number of unreachable networks and/or their prominence in the region.
For each country,
we consider only those networks which were unavailable after the cable cuts.
A small number of those were also unavailable before the cuts, presumably for other reasons.
After the cuts, all five countries lost global connectivity to a large number of their local networks.
We looked at who provided service to these now lost networks <em>before</em> the cuts and then looked to see where they ended up at the end of Day 3 (2 February, 00:00 UTC).  Were they still dead?  Or were they restored by some other provider?  Did any new providers suddenly appear in one of these countries?  Did any fail to act?  We'll answer these questions in what follows, limiting our charts to the larger players in each country.
Keep in mind that throughout this discussion, we will be examining <em>only</em> those networks that were unavailable after the cable cuts.
For example, in India, there were numerous networks that were not impacted by this event, namely, those getting capacity from the east, rather than the west.
We are assuming that any network still reachable after the cuts had no reason to switch providers at this time and so we do not consider them further.
</p>
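
<p>
The bookkeeping behind the charts that follow can be sketched in a few lines of Python.
This is a simplified illustration, not our actual tooling: it assumes each down network had exactly one provider before the cuts and at most one at the end of Day 3, and the network and provider names are placeholders.
</p>

```python
from collections import Counter


def provider_scorecard(before, after):
    """Tally, per provider, the down networks it held before and serves after.

    before: maps each network that went down to its provider before the cuts.
    after:  maps the same networks to their provider at the end of the window,
            or None if the network is still unreachable.
    """
    held = Counter(before.values())
    restored = Counter()
    gained = Counter()
    for network, provider in after.items():
        if provider is None:
            continue  # still dead at the end of the window
        if before.get(network) == provider:
            restored[provider] += 1  # same provider brought it back
        else:
            gained[provider] += 1    # a different provider picked it up
    providers = set(held) | set(restored) | set(gained)
    return {
        p: {
            "held": held[p],
            "restored": restored[p],
            "gained": gained[p],
            "net": restored[p] + gained[p] - held[p],
        }
        for p in providers
    }


# Placeholder data: four down networks and three providers.
before = {"n1": "ProviderA", "n2": "ProviderA", "n3": "ProviderB", "n4": "ProviderB"}
after = {"n1": "ProviderA", "n2": "ProviderC", "n3": None, "n4": "ProviderC"}

scorecard = provider_scorecard(before, after)
```

<p>
A provider with a positive "net" picked up more of the fallen networks than it originally lost, one with a negative "net" failed to restore some of its own, and one that held nothing before the cuts is a new entrant.
</p>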

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/EG.gif">
</td>
</tr>
</table>
</p>

<p>
We start off by looking at Egypt, one of the harder-hit countries.
At the end of Day 3, the big winner was Telecom Italia, which clearly picked up a lot of new
business, restoring service to almost 300 more networks than they had originally lost.
France Telecom was the big loser, with no down networks restored.  
Flag, who maintains one of the broken cables, managed to restore service to a few hundred networks, presumably by sending that traffic east, rather than west.  
Although not shown here, VSNL seized the opportunity and entered the Egyptian market for the first time, acquiring a local customer.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/KW.gif">
</td>
</tr>
</table>
</p>

<p>
The situation in Kuwait was even more interesting with Global Voicecom, Telecom Italia, Flag and Verizon unable to restore a single down network.
VSNL did what the others could not and restored service for all of their networks, 
as well as gained some new business.
They clearly exploited the fact that they have capacity to both the east and west from this region.
PCCW was the big winner, managing to pick up almost 70 new networks.
And when all else fails, there is always satellite.
Horizon Satellite entered the market, providing service to four of the fallen.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/SA.gif">
</td>
</tr>
</table>
</p>

<p>
Saudi Arabia saw AT&T, Flag and Sprint largely unable to act to restore service.
The big winners here were Deutsche Telekom, gaining 13 networks, and VSNL, gaining 11.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/PK.gif">
</td>
</tr>
</table>
</p>
<p>
We now turn our attention to the Indian subcontinent and the two big players, Pakistan and India.  
First up is Pakistan.
The big loser here was Verizon, with over 700 networks that they failed to restore.
BT was the big winner, picking up almost 500 networks.
In a country with under 1400 routed networks, this was a huge gain.
PCCW also did very well, gaining over 200 new networks, while
Stixlite started servicing an additional 166 networks.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/IN.gif">
</td>
</tr>
</table>
</p>

<p>
And finally, we look at the huge market of India.
Sprint, Cable & Wireless, Deutsche Telekom, BT and Verizon all took it hard here, largely failing to restore connectivity to their networks.
AT&T and Flag also lost a lot of networks.
But the big winner was SingTel, gaining over 200 networks, 
followed by Level 3, picking up over 100.
One wonders why AT&T, with lots of fiber assets in the continent, 
had a net loss among these networks whereas Level 3, with no fiber in the region, saw a net increase.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/dead.gif">
</td>
</tr>
</table>
</p>

<p>
Having reviewed these five countries in detail, 
we wondered how many networks were still unavailable.
That is, how much money was still on the table for those nimble providers with additional capacity?
For any country, there will always be a small number of unavailable networks at any given time.  These could be down for entirely local reasons: scheduled maintenance, power outages, etc.
So we graphed the number of networks unreachable both before the cable cuts and at the end of Day 3.
As you can see, there is still much to do.
But keep in mind that even when reachability is eventually restored to all down networks, <em>overall capacity</em> to the region will be severely curtailed until those cables are repaired.  And latencies could be very high if, for example, Europeans now need to reach the Middle East by way of the US and Asia.  Of course, some connectivity is always better than none at all.
</p>

<p>
So what will happen next, when service is fully restored?
Flag is claiming this work will be complete by 
<a href="http://in.news.yahoo.com/reuters_ids_new/20080202/r_t_rtrs_nl_general/tnl-third-undersea-cable-cut-in-middle-e-223dd93.html">mid-February</a>, 
but there was yet another cable cut yesterday.
Who will restore service to those networks still off the air?
Who were the large customers that shifted service to other providers?
What were the lessons learned from this event and what can be done to guard against future cable cuts?  These are questions we will leave to subsequent blogs.  Stay tuned.
</p>]]>
    </content>
</entry>
<entry>
    <title>Mediterranean Cable Break - Part II</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break_part_1.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=45" title="Mediterranean Cable Break - Part II" />
    <id>tag:www.renesys.com,2008:/blog//1.45</id>
    
    <published>2008-02-01T00:20:05Z</published>
    <updated>2008-02-11T22:05:07Z</updated>
    
    <summary> After looking at the countries most impacted by the cable cut in our first blog on this topic, we now turn our attention to the Internet service providers in the region and how they fared. Due to differences in...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
After looking at the countries most impacted by the cable cut in our first blog on this topic, we now turn our attention to the Internet service providers in the region and how they fared.  Due to differences in network architecture, cable ownership, and transit purchasing, carriers in the same country may not all experience the same degree of outage.  <em>For all of the following, we consider a network to be "outaged" when it is unreachable from the perspective of the broader Internet&mdash;as represented by Renesys's 250 peering sessions.</em>
</p>
<p>
The following two tables provide the top 15 providers with the largest number of outaged networks.  We list the provider's name, the country in which most of their unreachable networks are located and their autonomous system number (ASN), an assigned number that uniquely identifies their organization on the Internet.
</p>
<p>
In the first table, we list the providers in decreasing order by total number of outaged networks.  
In the second table, we list them by decreasing order of the percentage of their networks that are unreachable.
</p>
<p>
Not surprisingly, the hardest hit providers are located primarily in the hardest hit countries: Egypt, Kuwait, India and Pakistan.  One local provider in each of Egypt and Kuwait lost essentially all of their Internet connectivity.
</p>
]]>
        <![CDATA[<table>
 <tr>
  <th>Provider</th>
  <th>Country</th>
  <th>ASN</th>
  <th>Num</th>
 </tr>
 <tr>
  <td>LINKdotNET</td>
  <td>Egypt</td>
  <td align=right>24863</td>
  <td align=right>268</td>
 </tr>
 <tr>
  <td>Pakistan Telecom</td>
  <td>Pakistan</td>
  <td align=right>17557</td>
  <td align=right>234</td>
</tr>
 <tr>
  <td>Wataniya Telecom</td>
  <td>Kuwait</td>
  <td align=right >29357</td>
  <td align=right >199</td>
 </tr>
 <tr>
  <td>TEDATA</td>
  <td>Egypt</td>
  <td align=right >8452</td>
  <td align=right >191</td>
 </tr>
 <tr>
  <td>Sify</td>
  <td>India</td>
  <td align=right >9583</td>
  <td align=right >184</td>
 </tr>
 <tr>
  <td>EgyNet</td>
  <td>Egypt</td>
  <td align=right >20858</td>
  <td align=right >169</td>
 </tr>
 <tr>
  <td>Dancom</td>
  <td>Pakistan</td>
  <td align=right >23966</td>
  <td align=right >147</td>
 </tr>
 <tr>
  <td>Micronet</td>
  <td>Pakistan</td>
  <td align=right >23674</td>
  <td align=right >125</td>
 </tr>
 <tr>
  <td>Hathway</td>
  <td>India</td>
  <td align=right >17488</td>
  <td align=right >111</td>
 </tr>
 <tr>
  <td>Internet Egypt</td>
  <td>Egypt</td>
  <td align=right >5536</td>
  <td align=right >95</td>
 </tr>
 <tr>
  <td>QualityNet</td>
  <td>Kuwait</td>
  <td align=right >9155</td>
  <td align=right >90</td>
 </tr>
 <tr>
  <td>Nile Online</td>
  <td>Egypt</td>
  <td align=right >15475</td>
  <td align=right >90</td>
 </tr>
 <tr>
  <td>Data Network</td>
  <td>Pakistan</td>
  <td align=right >9260</td>
  <td align=right >87</td>
 </tr>
 <tr>
  <td>IDM</td>
  <td>Lebanon</td>
  <td align=right >9051</td>
  <td align=right >80</td>
 </tr>
</table>
<br>
<p>
<table>
 <tr>
  <th>Provider</th>
  <th>Country</th>
  <th>ASN</th>
  <th>%</th>
 </tr>
 <tr>
  <td>Wataniya Telecom</td>
  <td>Kuwait</td>
  <td align=right>29357</td>
  <td align=right>100</td>
 </tr>
 <tr>
  <td>Yalla Online</td>
  <td>Egypt</td>
  <td align=right>20484</td>
  <td align=right>99</td>

 </tr>
 <tr>
  <td>Dancom</td>
  <td>Pakistan</td>
  <td align=right>23966</td>
  <td align=right>80</td>
 </tr>

 <tr>
  <td>EgyNet</td>
  <td>Egypt</td>
  <td align=right>20858</td>
  <td align=right>77</td>
 </tr>
 <tr>

  <td>Internet Egypt</td>
  <td>Egypt</td>
  <td align=right>5536</td>
  <td align=right>77</td>
 </tr>
 <tr>
  <td>Micronet</td>

  <td>Pakistan</td>
  <td align=right>23674</td>
  <td align=right>73</td>
 </tr>
 <tr>
  <td>Data Network</td>
  <td>Pakistan</td>
  <td align=right>9260</td>
  <td align=right>71</td>
 </tr>
 <tr>
  <td>Pakistan Telecom</td>
  <td>Pakistan</td>
  <td align=right>17557</td>

  <td align=right>64</td>
 </tr>
 <tr>
  <td>TEDATA</td>
  <td>Egypt</td>
  <td align=right>8452</td>
  <td align=right>60</td>

 </tr>
 <tr>
  <td>IDM</td>
  <td>Lebanon</td>
  <td align=right>9051</td>
  <td align=right>56</td>
 </tr>

 <tr>
  <td>Exatt</td>
  <td>India</td>
  <td align=right>18231</td>
  <td align=right>53</td>
 </tr>
 <tr>

  <td>Nile Online</td>
  <td>Egypt</td>
  <td align=right>15475</td>
  <td align=right>49</td>
 </tr>
 <tr>
  <td>Emirates Internet</td>

  <td>UAE</td>
  <td align=right>5384</td>
  <td align=right>46</td>
 </tr>
 <tr>
  <td>Eepad</td>
  <td>Algeria</td>
  <td align=right>33783</td>
  <td align=right>43</td>
 </tr>
 <tr>
  <td>QualityNet</td>
  <td>Kuwait</td>
  <td align=right>9155</td>
  <td align=right>35</td>
 </tr>
</table>
</p>
<br>
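
<p>
The two tables are simply two sort orders over the same per-provider data.
The sketch below shows this for five providers that appear in both tables, using the counts and percentages printed above.
The "implied total" is our own back-of-the-envelope estimate from those two rounded figures, so treat it as approximate.
</p>

```python
# (provider, outaged networks, percent of the provider's networks outaged),
# copied from the two tables for providers that appear in both.
providers = [
    ("Pakistan Telecom", 234, 64),
    ("Wataniya Telecom", 199, 100),
    ("TEDATA", 191, 60),
    ("Dancom", 147, 80),
    ("QualityNet", 90, 35),
]

# First table: decreasing order by number of outaged networks.
by_count = sorted(providers, key=lambda row: row[1], reverse=True)

# Second table: decreasing order by percentage of networks outaged.
by_percent = sorted(providers, key=lambda row: row[2], reverse=True)


def implied_total(outaged, percent):
    """Roughly estimate a provider's total networks from the two rounded figures."""
    return round(outaged * 100 / percent)


totals = {name: implied_total(num, pct) for name, num, pct in providers}
```

<p>
The provider with the most outaged networks here had roughly a third of its networks still reachable, while the provider at 100% lost everything it had.
That difference is why we report both rankings.
</p>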
<p>
After totaling up the damage to the local providers, 
we wondered if any of the harder-hit ones managed to regain connectivity for some of their networks via alternate paths.  We often hear statements like "the Internet is good at routing around damage".
Well, that can be true, but only when there are available alternatives.
Looking at hard-hit Egypt and Kuwait, 
we plotted the number of outaged networks per provider over the past day in the following <em>stacked</em> graph, 
where the width of each color represents the number of unreachable networks for a given provider.
If any of the providers had choices,
we would expect to see the width of their <em>color</em> decrease over time as they shifted traffic to alternative paths.
Except in one case, that didn't happen. 
The exception was the Egyptian provider, LINKdotNET (ASN 24863),  which did regain connectivity for most of their unavailable networks for about one hour at 20:00 UTC, only to lose more than twice as many after that time.  
Whatever backup routes they used obviously didn't hold up.
The others had essentially the same number of networks out all day, 
the <a href="http://www.renesys.com/blog/2007/01/the_shape_of_disaster_on_the_n.shtml"> typical shape</a> of a prolonged catastrophic failure.
</p>

<a href="http://www.renesys.com/blog/egypt/layered.EG.KW.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/layered.EG.KW.png','popup','scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/layered.EG.KW.500x150.png">
</a>

<p>
To make this last point perhaps more forcefully,
we next plot the total number of outaged networks for the region as a whole for 24 hours, excluding the Indian subcontinent.
As shown, there was no immediate relief for the large swath of the Internet cut off by this disaster.
</p>

<a href="http://www.renesys.com/blog/egypt/all_outa.jan30.outa.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/all_outa.jan30.outa.png','popup','scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/all_outa.jan30.outa.500x150.png">
</a>

<p>
Next up, we'll look at the global Internet providers and who won and lost in the battle to retain their regional customers or acquire new ones.
</p>
]]>
    </content>
</entry>
<entry>
    <title>Mediterranean Cable Break</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/01/mediterranean_cable_break.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=44" title="Mediterranean Cable Break" />
    <id>tag:www.renesys.com,2008:/blog//1.44</id>
    
    <published>2008-01-30T23:53:22Z</published>
    <updated>2008-02-11T22:04:42Z</updated>
    
    <summary> Early this morning local time, two cable systems north of Alexandria, Egypt were severed, greatly impacting both Internet and voice traffic to the region. The broken cables are operated by Flag Telecom and SEA-ME-WE 4, and if past undersea...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Early this morning local time, 
<a href="http://www.bloomberg.com/apps/news?pid=20601085&sid=aWe706hsLNdY&refer=europe">two cable systems north of Alexandria, Egypt were severed</a>,
greatly impacting both Internet and voice traffic to the region.
The broken cables are operated by Flag Telecom and SEA-ME-WE 4, 
and if <a href="http://www.renesys.com/tech/presentations/pdf/Plenary2-Underwood.pdf">past undersea cable cuts</a> 
are any measure, 
repair time will be measured in weeks, not days.
This is a preliminary report on the countries most impacted by this failure, 
as seen from the perspective of Internet routing.
</p>]]>
        <![CDATA[<p>
<table>
<tr>
<td>
<a href="http://www.renesys.com/blog/egypt/cablemap5.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/cablemap5.png','popup','width=1097,height=546,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/cablemap5.png" width="515" height="250">
</a>
</td>
</tr>
<tr>
<td align="center"><em>Most Impacted Countries</em>
</td>
</tr>
</table>
</p>

<p>
As you can see from the above map, there are several cable systems that connect Europe, the Middle East and Asia, via
the Suez Canal.  
The countries highlighted in red are those whose Internet connectivity is being disrupted the most by this event.
At Renesys, we geo-locate all routed networks and observe their reachability from over 250 locations around the globe.
In the case of disasters like this, 
we will suddenly see a large percentage and/or a large number of country-specific networks disappear from the Internet.
As the following charts show,
Egypt and Pakistan lost the highest percentage of their networks, 
while India lost the least.
However, India had the third highest total number of networks disappear.
Looking at the cable map, it is not surprising that the Indian subcontinent was impacted by events off the 
coast of Egypt.
There are essentially two ways to get to this part of the world: via the Suez Canal or via Southeast Asia.
</p>

<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/image001.gif">
</td>
</tr>
</table>
</p>
<p>
<table>
<tr>
<td>
<img src="http://www.renesys.com/blog/egypt/image002.gif">
</td>
</tr>
<tr>
<td align="center"><em></em>
</td>
</tr>
</table>
</p>

<p>
The next graph shows the outages over time for the four countries that lost the largest number of networks, namely, 
Egypt, Pakistan, Kuwait and India.
You can observe a sharp loss of connectivity for these countries at 04:30 UTC, 
followed by another event at 08:00 UTC.
</p>

<p>
<table>
<tr>
<td>
<a href="http://www.renesys.com/blog/egypt/big.outa.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/big.outa.png','popup','width=1500,height=450,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/big.outa.png" width="750" height="225">
</a>
</td>
</tr>
<tr>
<td align="center"><em>Most Impacted Countries by Total Number of Networks</em>
</td>
</tr>
</table>
</p>

<p>
Our final graph shows the total number of networks lost for the region, excluding the Indian subcontinent,
in order to more clearly illustrate the timing of these events.
Notice that there are two long term events starting at 04:30 UTC and 08:00 UTC, presumably the two cable breaks.
Then there are shorter lived events at around 06:00 UTC and 13:00 UTC, 
which may reflect measures taken in an attempt to route around the problem.
</p>

<p>
<table>
<tr>
<td>
<a href="http://www.renesys.com/blog/egypt/all_outages1.outa.png" 
onclick="window.open('http://www.renesys.com/blog/egypt/all_outages1.outa.png','popup','width=1500,height=450,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false">
<img src="http://www.renesys.com/blog/egypt/all_outages1.outa.png" width="750" height="225">
</a>
</td>
</tr>
<tr>
<td align="center"><em>Total Number of Outaged Regional Networks</em>
</td>
</tr>
</table>
</p>

<p>
Stay tuned to this blog for more information as we continue our analysis.
</p>]]>
    </content>
</entry>
<entry>
    <title>15th Century Routing</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2008/01/15th_century_routing.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=43" title="15th Century Routing" />
    <id>tag:www.renesys.com,2008:/blog//1.43</id>
    
    <published>2008-01-23T19:14:05Z</published>
    <updated>2008-02-19T19:30:12Z</updated>
    
    <summary> Which way is up? Since I sometimes find myself hopelessly lost, I tend to wonder about global navigation in the days before GPSes or even accurate maps. I imagine you started off with just a general idea of where...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<table align=right>
<tr>
<td align=center>
<img src="http://www.renesys.com/blog/hike/sign.jpg" width="295" height="221" />
<em>Which way is up?</em>
</td>
</tr>
</table>

<p>
Since I sometimes find myself hopelessly lost, 
I tend to wonder about global navigation in the days before GPSes or even 
accurate maps.  
I imagine you started off with just a general idea of where you wanted to go 
(e.g., "The New World"), 
crude navigational aids (the stars, Sun and Moon when you could see them), 
and hearsay from your fellow travelers or the locals about your proposed course.
In addition, you only had a view of the world from your current location,
limited by the curvature of the earth.
</p>]]>
        <![CDATA[<table align=left>
<tr>
<td align=center>
<img src="http://www.renesys.com/blog/hike/earl.jpg" width="221" height="295" />
<em>Looking for trouble</em>
</td>
</tr>
</table>

<p>
Thinking about it, this sounds a lot like modern day routing.  
You have a general idea of where you want some packets to go (a network prefix)
and very crude navigational tools (BGP updates), 
most of which are hearsay from peers, providers and random strangers you have
no reason to trust.
Plus your world view (routing table) is limited to the place where you collect
your routes and from whom you collect them.
There is no comprehensive map of the world (global routing table) or any reason
to believe that your packets won't fall off the end of the earth (be blackholed)
when you send them on their way.
The lack of a global routing table is the reason Renesys has to collect routes
from all over the world (over 250 places at present) and the reason we
see around <a href="http://www.renesys.com/tech/presentations/pdf/renesys-routing_small_prefixes.pdf">330,000 routes</a>, 
compared to what is currently considered a "full table" at just under 240,000
routes.
Given how far navigation in the physical world has come, 
you would think we would have a better way to route Internet traffic by now.
</p>
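
<p>
To make the gap between one table and the combined view concrete, here is a minimal sketch.
The peer names and prefixes are made-up placeholders (documentation address space), 
and this is only an illustration of the idea, not a description of how Renesys actually processes routes.
</p>

```python
from typing import Dict, Set


def combined_view(tables: Dict[str, Set[str]]) -> Set[str]:
    """Return every prefix seen from at least one vantage point."""
    seen: Set[str] = set()
    for prefixes in tables.values():
        seen |= prefixes
    return seen


def largest_single_view(tables: Dict[str, Set[str]]) -> int:
    """Return the size of the biggest table held by any one vantage point."""
    return max((len(prefixes) for prefixes in tables.values()), default=0)


# Hypothetical routing tables collected at three different places.
tables = {
    "peer-east": {"192.0.2.0/24", "198.51.100.0/24"},
    "peer-west": {"192.0.2.0/24", "203.0.113.0/24"},
    "peer-south": {"198.51.100.0/24", "203.0.113.0/25"},
}

everything = combined_view(tables)
print(len(everything), "prefixes overall;",
      largest_single_view(tables), "in the largest single table")
```

<p>
No single vantage point in this toy example sees every prefix, 
which is the same effect, in miniature, as seeing more routes across many collection points than any one "full table" holds.
</p>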

<table align=right>
<tr>
<td align=center>
<img src="http://www.renesys.com/blog/hike/trail.jpg" width="295" height="221" />
<em>Single homed</em>
</td>
</tr>
</table>

<p>
Conveniently, Renesys is located in New Hampshire where real life navigation
can remind you more of the ancient world than the modern one. 
(Maybe that's why we're good at Internet routing?) 
Our physical maps are often wrong, showing roads that no longer exist or
landmarks in the wrong locations.
Cell service comes and goes and doesn't even work at my house, 
in a neighborhood less than 2km from Dartmouth College.
And it is often so cold that battery life is limited to minutes; so even if 
your GPS did work, it would be dead when you pulled it from your pack.
</p>

<table align=left>
<tr>
<td align=center>
<img src="http://www.renesys.com/blog/hike/rob.jpg" width="221" height="295" />
<em>Happy not to have been blackholed</em>
</td>
</tr>
</table>

<p>
With this in mind, we ventured out on a Renesys team-building exercise on 
Sunday, 20 January 2008.  
We picked this day to go above tree line since the weather forecast was 
particularly nasty.
A storm was coming and the <em>highs</em>
for the day were supposed to drop to -26C
on the peaks with wind speeds exceeding 100kph.
Luckily, we got to the top ahead of the storm and the day was absolutely 
brilliant and comfortable: -22C with a cooling breeze of 50kph.
Our routes were clearly visible and no one was blackholed - a very successful
day of navigating the mountain.
</p>]]>
    </content>
</entry>
<entry>
    <title>Cascading Failures: Believe It or Not!</title>
    <link rel="alternate" type="text/html" href="http://www.renesys.com/blog/2007/09/cascading_failures_believe_it.shtml" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.renesys.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=42" title="Cascading Failures: Believe It or Not!" />
    <id>tag:www.renesys.com,2007:/blog//1.42</id>
    
    <published>2007-09-25T22:00:00Z</published>
    <updated>2008-01-26T19:59:44Z</updated>
    
    <summary> Maybe it speaks to a risk-averse nature, but I&apos;ve always been interested in failure and in learning from the mistakes of others - obviously so I don&apos;t have to learn such lessons first hand. This is particularly important when...</summary>
    <author>
        <name>Earl Zmijewski</name>
        <uri>http://www.renesys.com/blog/</uri>
    </author>
            <category term="Engineering" />
    
    <content type="html" xml:lang="en" xml:base="http://www.renesys.com/blog/">
        <![CDATA[<p>
Maybe it speaks to a risk-averse nature, but I've always been interested in 
failure and in learning from the mistakes of others - obviously so I don't have
to learn such lessons first hand.  This is particularly important when you 
engage in activities where bad decisions can kill you.  But generally, as any 
<a href="http://www.forewordmagazine.com/reviews/viewreviews.aspx?reviewID=1845">
	book on mountaineering mishaps</a> demonstrates, it takes a series of 
errors in the "correct order" and at the wrong times to cause you serious harm.
</p>

<p>
In high-risk activities under adverse conditions, it's not hard to make poor
decisions that you would never contemplate from the comfort of your favorite
living room chair.  But while there is little risk to life and limb on the 
Internet, its very connectedness means that the blunders of pretty much anyone 
can impact you.  What is important in this environment is the half-life and the 
reach of the mistakes.  Those that are local and die out quickly have little 
chance of resulting in global mayhem.  Others compound with all the other endless 
screw-ups regularly going on and eventually become a giant avalanche careening 
downhill, collecting mass and bearing down on the sleeping village below.  
This is one of those stories.  It might be true or it might not.  Your opinion
depends on how much imagination you think we have!
</p>]]>
        <![CDATA[<p>
Our tale starts innocently enough with a global upscale hotel chain, call them Hotel-A.  
(No, not that <a href="http://en.wikipedia.org/wiki/Paris_Hilton">one</a>.)
Hotel-A decides it is time to get serious about updating the software on their desktops.  
For a large organization, these updates really should be downloaded once to a local 
server and pushed or pulled from there.  But that takes time and a little 
expertise; the easier option is to burn your Internet bandwidth by updating all machines 
directly from the Content Delivery Network (CDN) serving this up.
The CDN is hosted by Provider-C, a global Internet transit provider.
</p>

<p>
<b>Mistake #1:</b> Ignore scaling issues and take the easy way out. 
</p>

<p>
To compound this error, Hotel-A elects to take the defaults, updating all PCs at 
exactly the same time.
</p>

<p>
<b>Mistake #2:</b> Ignore scheduling issues.  Who cares what happens at 3am anyway?
</p>
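
<p>
One common way to avoid the everyone-at-once default is to give each machine its own start time inside a maintenance window.
The sketch below assumes a four-hour window and invented hostnames; 
it does not describe any particular update product, only the general idea of spreading the load.
</p>

```python
import hashlib


def update_offset_minutes(hostname: str, window_minutes: int = 240) -> int:
    """Map a hostname to a stable start offset inside the update window.

    Hashing the name gives every machine the same offset on every run,
    while spreading the fleet roughly evenly across the window.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_minutes


# Hypothetical desktop names; a real inventory would come from elsewhere.
hosts = [f"desk-{n:04d}.hotel-a.example" for n in range(1000)]
offsets = [update_offset_minutes(host) for host in hosts]

print("earliest start:", min(offsets), "minutes into the window")
print("latest start:", max(offsets), "minutes into the window")
```

<p>
Staggering only addresses the scheduling problem; 
every machine still pulls its own copy across the Internet link unless the updates are also served from a local server.
</p>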

<p>
Hotel-A staff now have a ticking time bomb in place, set to go off on a particular day
of the month.  Every one of their PCs (probably thousands) will start to suck down a
large and identical set of updates at that time.
This will go on for many hours and effectively saturate all of their Internet links.
(<a href="http://heartbeat.skype.com/2007/08/what_happened_on_august_16.html">
Like this.</a>)
Sure enough, Hotel-A's self-DoS proceeds on schedule and their network operations staff 
identifies it as such, failing only to prepend "self" to the problem report.  Network 
operations in turn calls their transit provider, Provider-A, another global carrier.
</p>

<p>
<b>Mistake #3:</b> Blame the Boogeymen on the Internet before looking in your own backyard.
</p>

<p>
We've all been in similar situations.
Provider-A greatly values and implicitly trusts Hotel-A.  They are a good customer and pay 
serious money for services.  The time to act is now!
</p>

<p>
<b>Mistake #4:</b> Accept the reported problem at face value, rather than investigate yourself.
</p>

<p>
Now, how hard would it have been to notice that the source of the "attack" was a major CDN?  
Provider-A decides to blackhole all traffic from the source network.  But they do more than that - 
they blackhole all traffic from this network to all of their customers, not just Hotel-A.
</p>

<p>
<b>Mistake #5:</b> Overreact in a time of crisis.
</p>

<p>
Next, Provider-A compounds the problem by announcing the blackholed network as their very own.  
That is, they start originating a network that in fact belongs to Provider-C.
</p>

<p>
<b>Mistake #6:</b> Carelessly inject your IGP routes into BGP.
</p>

<p>
The folks at Provider-C now start to get reports about the inaccessibility of the CDN they host.  
These folks are smart and run a tight ship.  They carefully check out the complaints 
from various points on their worldwide network, but to no avail.  Everything looks great from inside
their network and from various external points as well.
Provider-A ultimately figures out what they have done and withdraws the route, leaving Provider-C 
scratching their collective heads about exactly what happened, as their traffic suddenly returned to 
normal.
They give Renesys a call to see if we have noticed "anything odd" about the 
problem network.  One quick look shows Provider-A and Provider-C both originating the same
network for a short period of time one nice summer morning.  
Given the reach of both carriers, the Renesys global peering set was roughly evenly divided between 
the two.
So half the world had access to the CDN and half did not.
The CDN and Provider-C were both collateral damage from a series of mistakes made by others.
</p>
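
<p>
The "one quick look" boils down to asking whether different peers report different origins for the same network.
Here is a minimal sketch of that check. 
The peer names, prefixes and AS numbers are placeholders from the documentation ranges, 
and it assumes each peer reports a single origin per prefix; it is an illustration, not the actual Renesys tooling.
</p>

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# One observation: (peer name, prefix, origin AS seen by that peer).
Observation = Tuple[str, str, int]


def origins_by_prefix(observations: Iterable[Observation]) -> Dict[str, Counter]:
    """Count, for each prefix, how many peers see each origin AS."""
    result: Dict[str, Counter] = defaultdict(Counter)
    for _peer, prefix, origin in observations:
        result[prefix][origin] += 1
    return result


def conflicts(observations: Iterable[Observation]) -> Dict[str, Dict[int, int]]:
    """Return only the prefixes that are originated by more than one AS."""
    return {
        prefix: dict(counts)
        for prefix, counts in origins_by_prefix(observations).items()
        if len(counts) > 1
    }


# Hypothetical snapshot: AS 64496 stands in for the rightful origin,
# AS 64497 for the carrier that announced the network as its own.
observations = [
    ("peer-1", "192.0.2.0/24", 64496),
    ("peer-2", "192.0.2.0/24", 64497),
    ("peer-3", "192.0.2.0/24", 64496),
    ("peer-4", "192.0.2.0/24", 64497),
    ("peer-1", "198.51.100.0/24", 64496),
]

for prefix, split in conflicts(observations).items():
    print(prefix, "is originated by more than one AS:", split)
```

<p>
The per-origin counts are what show how the peers divide between the two announcements.
A network that only looks at its own routers sees just one side of that split, 
which is why the problem was invisible from inside.
</p>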

<p>
You might think that such a sequence of events is highly unlikely and so probably didn't happen.
Or you might think that we aren't clever enough to have actually made up such a story.
Regardless, I happen to believe that cascading failures are common, although underreported.
With <a href="http://despair.com/idiocy.html">billions of mistake-prone humans</a> 
connected to the global Internet,
how could this not be the case?
</p>

<p>
<b>Epilogue:</b>
In conclusion, there are really two important morals to take away from this story.  
First, it is fairly trivial to economically harm even the best-run networks.  
Second, you can't effectively monitor your network from within your network.  
While it's true all the protocols on the Internet are from a happier day and are now seriously broken
and in need of replacement, your only real alternative to waiting for nirvana might be to have 
someone else watch your (high-value) back and to keep a list of those NOC phone numbers handy.
</p>]]>
    </content>
</entry>

</feed> 

