Today is World IPv6 Day, a day when major content providers have agreed to furnish service over IPv6 for a 24-hour test period. Hopefully, you didn't notice anything different about your Internet experience today, but providers will have gained valuable experience with the technology and any technical hurdles that remain to be overcome. In this blog, we'll report on how far into the IPv4-to-IPv6 transition we actually are and, more importantly, just how far we have to go. There is no denying that there has been a tremendous amount of progress in the last decade or so, but much remains to be done and we are only at the very beginning of a long process.
Articles By Earl Zmijewski
About two weeks ago, Level 3 announced plans to acquire Global Crossing and we blogged on the enormous size and scope of the new entity, which we called Level Crossing. This week, CenturyLink, a regional US phone company, agreed to acquire Savvis. Since CenturyLink also owns Qwest, we are seeing another merger of two Tier-1 Internet providers, a pairing which we'll label Qwavvis. In what follows, we examine the possible business considerations behind the move, as well as the impact on Internet transit customers and Renesys Market Intelligence rankings.
On Monday, 11 April 2011, Level 3 announced they had entered a definitive agreement to acquire Global Crossing. According to the Renesys Market Intelligence rankings, this merger would bring together the world's #1 and #2 global providers, with over half the Internet market on earth dependent on the combined entity. If the deal gained regulatory approval in the US and elsewhere today, how would the Internet provider landscape change? We'll answer that question in this blog, giving the proposed union a fictional name of Level Crossing for the purposes of our discussion.
As of approximately 20:46 UTC, four hours after this blog was first published, Noor started disappearing from the Internet. They are completely unavailable at present, as shown below.
As we observed last week, Egypt took the unprecedented step of withdrawing from the Internet. The government didn't simply block Twitter and Facebook (an increasingly common tactic of regimes under fire), but rather they apparently ordered most major Egyptian providers to cease service via their international providers, effectively removing Egyptian IP space from the global Internet and cutting off essentially all access to the outside world via this medium. The only way out now would be via traditional phone calls, assuming they left that system up, or via satellite. We thought the Internet ban would be temporary, but much to our surprise, the situation has not changed. One of the few Egyptian providers reachable today, four days after the start of the crisis, is The Noor Group. In this blog, we'll take a quick look at them and some of the businesses they serve.
Last week, we looked at the problem of incorrect DNS answers emanating from China and the potential impact on Internet users outside the country. In this blog, we'll consider a proposed and partially implemented solution (DNSSEC) and the broader problem of hosting global services in any country known to tamper with Internet traffic. We'll even suggest a rating system from one to five stars for evaluating countries, and we'll note that while the US was once a 5 on this scale (highest rating), it is currently a 4 and might be headed to a 3 or 2. In general, the direction for the world seems to be for a less open and more censored Internet, and that is the truly unfortunate part of this story.
There's been sudden interest recently in a Chinese route hijacking incident that occurred way back in April, brought about by a new report to the US Congress that highlighted the event (see pages 236-247). A second Chinese event, also in the report, has received almost no attention despite being much more interesting (technically, anyway). A Chinese DNS censorship incident occurred just one month earlier, in March, and although we already presented an analysis of that event (here and here), today we'll provide an update on the incident and its scope. But first, let's step back and get some context on events such as these, and see if the hype is warranted.
We've all heard about the wonders of cloud computing. Take your corporate web server, your email servers, your calendar software and even your business plans and other important documents and throw them all into "the cloud". No more finicky hardware to maintain, buggy software to patch or data backups to worry about. Outsource all of those headaches and enjoy reading your email from the beach on your phone.
Of course, nothing is ever that simple. Like any outsourced solution, you will need to perform due diligence. Is your cloud service provider technically and financially sound? Have they acquired sufficient diversity with respect to their Internet connectivity? Do they comply with all applicable regulations for your jurisdiction? Are there potential physical problems at their hosting locations, such as exposure to the threat of earthquakes or hurricanes? You can probably figure all of this out. But there is another threat that your due diligence will certainly fail to expose: the threat of your cloud neighbors. If you end up with the wrong ones, you may suffer as a result of their bad behavior or simply because of the content they host. This blog examines a few examples of this potential problem.
Time flies. Although it was over 18 months ago, it seems just like yesterday that a small Czech provider, SuproNet, caused global Internet mayhem by making a perfectly valid (but extremely long) routing announcement. Since Internet routing is trust-based, within seconds every router in the world saw this announcement and tried to pass it on. Unfortunately, due to the size of this single message, quite a few routers choked, resulting in widespread Internet instability. Today, over a year later, we were treated to a somewhat different version of the exact same story.
As our regular readers know, Renesys computes daily rankings of all the service providers in the world: globally, by geography, and by market segment. The rankings are a rather crude measure of size, as they are based entirely on the quantity of IP space ultimately transited by each provider. However, it's the ranking trends that are more revealing than any absolute number. Who is adding customers? Who is losing them or just standing still? All of the rankings and the reasons for any changes are updated daily and available via our Market Intelligence offering. For the past couple of Decembers (2009, 2008), we've also provided a glimpse into some of this data via year-end blogs devoted to the top global providers. Halfway through 2010, we decided to revisit this topic and highlight some recent changes: the fall of Sprint and rise of Tinet being perhaps the most interesting.
Here we go again. In March we wrote a blog entitled Accidentally Importing Censorship which described how incorrect DNS answers were returned in response to certain queries to the I-root. The problem was tracked down to a single instance of the I-root located in China. Queries to this server for domains blocked in China, such as Facebook, would return seemingly arbitrary answers. As we noted, countries, and even companies, can impose their own standards on the Internet and block anything they want. This story was only noteworthy because those blocks (via bad DNS answers) became visible outside of China. Well, guess what? We are once again seeing the Beijing I-root from outside of China.
For an advanced technology that we all depend upon, it sure seems that the Internet has more than its fair share of problems: spam, viruses, malware, spyware, phishing, worms, trojans, DDoS attacks, hijacks, DNS cache poisoning, botnets, keystroke loggers, etc. We need an entirely new vocabulary just to talk about this stuff. Most of it appears to come out of the blue, forcing the rest of the world to react. But the good news is that there is at least one problem we can do something about in advance. Unfortunately, not everyone has been taking the problem seriously enough and we are about to hit the wall.
I'm talking about the impending exhaustion of IP addresses, IPv4 addresses to be exact. Every computer on the Internet needs access to at least one unique address in order to be connected. Around the dawn of the Internet, 32-bit IPv4 addresses, which allow for 4,294,967,296 different possibilities, seemed like more than enough. This was a simpler time when computers cost millions and no one imagined a phone you could put in your pocket. As the Internet grew, it soon became obvious that the seemingly inexhaustible supply of 4 billion addresses wasn't quite enough. And so, a 128-bit IPv6-based Internet was proposed, this one with 340,282,366,920,938,463,463,374,607,431,768,211,456 different addresses. (We're not going to make that mistake again!) The only problem was that the new Internet wasn't interoperable with the old one we already knew and loved. Without a Y2K-type hard deadline to focus on, we kept barreling along toward the edge of the IPv4 cliff. Now that the edge is clearly in sight, this blog looks at how far we have come in adopting the not-so-new-anymore IPv6 Internet and, perhaps more importantly, how much further we need to go.
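The two address counts quoted above follow directly from the address widths (2^32 and 2^128). As a quick check, here is a minimal sketch using Python's standard `ipaddress` module; the variable names are ours, chosen for illustration.

```python
import ipaddress

# Total number of addresses in each protocol's full address space.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses   # 2**32
ipv6_total = ipaddress.ip_network("::/0").num_addresses        # 2**128

# How many entire IPv4 Internets would fit inside the IPv6 space.
ratio = ipv6_total // ipv4_total                                # 2**96

print(f"IPv4: {ipv4_total:,}")
print(f"IPv6: {ipv6_total:,}")
print(f"IPv6 space holds {ratio:,} copies of the IPv4 space")
```

Put another way, the IPv6 space could hold 2^96 copies of the entire IPv4 Internet.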
With advancements in hardware and software, sophisticated filtering technologies are increasingly being applied to restrict access to the Internet. This happens at the level of both governments and corporations. Renesys is headquartered in the "Live Free or Die" US state of New Hampshire. In our small town of roughly 10,000 folks, we know of a local company that tries to keep its employees off non-work-related websites (e.g., shopping sites). Unfortunately, someone who works there can't read about Amazon's cloud computing as a result, a small bit of collateral damage. Entire countries act in much the same way. The OpenNet Initiative keeps track of such state-sponsored restrictions and publishes interesting maps based on the level of filtering by topic. But given the open nature of the trust-based Internet, one country's restrictions, if not handled very carefully, can easily foul the global Internet nest we all live in. This blog is about one such story of Internet restrictions in China becoming visible (seemingly at random) from other parts of the world and going undetected for 3 weeks. Given the increasing complexity of this technology, the difficulty in controlling a very open Internet, and the strong desire of some to do just that, this could be a harbinger of things to come.
As our faithful readers know, Renesys monitors routing on the global Internet in real time and uses that information in a variety of ways. For example, we can instantly let you know which networks a hurricane has disabled or even tell you when a war has left things pretty much as they were. In short, we keep an eye on the Internet, the entire Internet, but this is all done at the level of IP addresses and the paths they follow.
The recent attack on Twitter got us thinking. Maybe we should be keeping an eye on a few more things? While your IP addresses and routes to them might be completely stable, the average user doesn't know about those.
In other words, when was the last time you typed ...
http://216.239.59.104
instead of ...
http://www.google.com
into your browser?
What if someone manages to point your domain name to some other IP addresses? You would still be operational as far as the Internet routers were concerned, but probably no humans would be reaching you. And that's the problem we'll briefly consider in this blog.
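To make the distinction concrete, here is a toy sketch of the two lookups involved: a name resolves to an address, and an address is reached via a route. Everything in it is an assumption for illustration (the name, the documentation-range addresses, and the dictionaries standing in for DNS and routing); it is not real DNS or BGP data.

```python
# Toy model: names map to addresses, and addresses are reached via routes.
# All names and addresses below are illustrative placeholders.
dns = {"www.example.com": "192.0.2.10"}
routes = {
    "192.0.2.10": "your network",
    "198.51.100.7": "someone else's network",
}

def where_do_users_land(name):
    """Follow a name to an address, then the address to whoever announces it."""
    return routes[dns[name]]

before = where_do_users_land("www.example.com")

# Only the DNS record changes; no route is touched.
dns["www.example.com"] = "198.51.100.7"
after = where_do_users_land("www.example.com")

# The route to the original address is exactly as it was.
route_unchanged = routes["192.0.2.10"] == "your network"

print(before, "->", after, "| original route intact:", route_unchanged)
```

In this model a monitor that watches only routes sees no change at all, even though users typing the name now end up somewhere else.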
As our regular readers know, Renesys collects a lot of Internet routing data, using it to create reports and products based on hard facts and objective analysis. Perhaps the only controversial thing we do with our data is to rank all the service providers in the world: globally, by geography, and by market segment. The rankings are a rather crude measure of size, as they are based entirely on the quantity of IP space ultimately transited by each provider. However, it's the ranking trends that are more revealing than any absolute number. Who is adding customers? Who is losing them or just standing still? Changes in IP transit answer these questions and more. Although there are obvious shortcomings in this approach, it is certainly objective and the process is fully automated. All of our rankings are updated daily and available via our Market Intelligence offering. In this posting, we will take a look at the top 13 providers in the world for 2009 and how they have jockeyed for position throughout the year, similar in spirit to our December 2008 blog, which provides more details about our methodology. We will see what a difference a year has made and highlight some of the more interesting changes.
Internet connectivity is a good thing. Many of us depend on it for everything from our livelihoods to our entertainment. However, the Internet is very fragile and even The New York Times is worried about it. But they're primarily concerned with overloads that can occur when everyone on the planet does the same thing at roughly the same time, such as surfing for news about Michael Jackson. Unfortunately, we will never avoid all such scenarios. Physical systems are designed around average and typical peak loads, not around extremely high loads associated with very unlikely events. Who would pay for that?
And this applies to other complex systems besides the Internet. I was in India during 9/11 and, for two days, I could not make a traditional phone call to the US. Why? Everyone in India knows someone in NYC, and they all picked up the phone at the same time to check in on them. The circuits were so overloaded, I couldn't even get the friendly "Your call cannot be completed as dialed" message.
No system is ever going to be engineered for insanely high loads. If everyone in your town decides to take a shortcut through your neighborhood to avoid an accident on the highway, you are going to have trouble getting out of your driveway. But rather than give up and wait it out, there is something you can do in advance and at reasonable cost: build a second driveway to a different street on the other side of your house, one that isn't fed by the same access roads from the highway. This blog is about building such redundancy into your Internet connectivity, so you aren't disconnected by a single failure. And while it's good that the New York Times and various governments are watching the problem, if your business depends on the Internet, you're largely on your own to audit and verify that you are buying a sufficient level of redundancy for your budget. A lot of fragility problems could be solved by more informed consumers performing the necessary due diligence.
A couple of months ago, we discussed how a small Czech provider ended up causing global Internet mayhem by tickling a Cisco bug via a rather ridiculous routing announcement. While it's easy to fault the instigator of this meltdown, ultimate responsibility belongs with the vendors of poorly tested code. If we've learned anything in decades of software engineering, it is that you can't assume anything about user input. If you don't check that input for validity, you are not just being careless, you are creating a time bomb that will eventually go off. Another such bomb went off on Sunday, 3 May 2009, taking out a large swath of the Internet. We recount the sorry tale here.
In our last blog entry, we talked about measuring the state of routing anarchy that exists on the Internet on a per-country basis. We looked at every routed network (prefix) by country of origin and tried to answer the question: do folks do what they say and say what they do, as articulated via routing registries? Although many manage to administer their routes with care, the overall results are quite varied. And without some way of verifying routes via some authoritative source, we are left only with the current system of believing everything we're told and hoping for the best. The dangers of such a system are demonstrated dramatically from time to time.
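The "say what they do, do what they say" question can be sketched as a set comparison between registered and observed routes. This is only an illustration under our own assumptions: the prefixes and AS numbers come from ranges reserved for documentation, and `hygiene_score` is a hypothetical measure, not the scoring method used in the blog.

```python
# (prefix, origin AS) pairs as a routing registry might list them.
# Prefixes and AS numbers are from ranges reserved for documentation.
registered = {("192.0.2.0/24", 64500), ("198.51.100.0/24", 64501)}

# (prefix, origin AS) pairs actually seen in the routing table.
observed = {("192.0.2.0/24", 64500), ("203.0.113.0/24", 64502)}

consistent = registered & observed     # registered and announced
unregistered = observed - registered   # announced but never registered
unannounced = registered - observed    # registered but never announced

def hygiene_score(registered, observed):
    """Hypothetical score: fraction of observed routes backed by a registry entry."""
    if not observed:
        return 1.0
    return len(registered & observed) / len(observed)

score = hygiene_score(registered, observed)
print(f"consistent={len(consistent)} unregistered={len(unregistered)} "
      f"unannounced={len(unannounced)} score={score:.2f}")
```

Grouping the observed routes by country of origin before scoring would give a per-country figure in the same spirit.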
Although they certainly could, countries typically don't exercise any control over the routing hygiene of the companies operating within their borders. Countries might tax those companies, filter their traffic for objectionable content, mandate the types of software or equipment they can use and even spy on them, but if a company wants to screw up routing on the global Internet, well that's their business. As we've noted in the past, no driver's license is required on the Information Superhighway, as there are essentially no rules, regulations or enforcement. So in this blog entry, we'll apply our scoring idea to those who can easily effect change, namely, those organizations who are ultimately responsible for how traffic flows on the Internet.
Since Renesys maintains large quantities of data on the Internet going back many years, we sometimes get the question: If you guys are watching the entire 'net, why don't you just warn people when things break? My response is generally along the lines of: Sure we can do that. Simply tell us the correct state of the Internet at each moment in time and we'll alert you to any operational differences we observe. This is generally met with silence.
Renesys can tell you a lot about the current state of the Internet, but absolutely no one can tell you the correct state. And that is because no one is in charge, and so there is no central authoritative source of information. Think of the Internet as a highway system where anyone can buy a car and simply start driving: no need to register the car, attach a license plate, buy insurance or get a driver's license. You don't even have to show an ID or be sober. Just pay some fees, buy some equipment, hook up and go. The barrier to entry really is that low.
Obviously, this arrangement can cause some problems. When Pakistan hijacked YouTube last year by announcing YouTube IP space, out of the hundreds of thousands of routing announcements seen on the Internet, how was anyone to know this particular one was incorrect? Okay sure, you couldn't get your videos, but maybe YouTube had just opened a data center in Karachi and the problem was internal to them? Without some way of checking the authenticity of routes, the routers that direct traffic on the Internet simply believe what they are told. And if the best route to YouTube appears to be via Pakistan, then they are all going to use it, no questions asked. This is not a new problem, and this blog explores an old and largely failed attempt to address it. We then compare the differences between countries with respect to their routing hygiene.
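One common way an unauthorized announcement becomes the "best route" is by being more specific than the legitimate one, since routers prefer the longest matching prefix. The sketch below illustrates that selection rule only; the excerpt above does not say which mechanism was at work in the YouTube incident, and the prefixes and origin labels here are made up.

```python
import ipaddress

# Hypothetical routing table of (prefix, origin) entries using private address space.
table = [(ipaddress.ip_network("10.20.0.0/22"), "legitimate origin")]

def best_route(table, address):
    """Return the origin of the most specific prefix covering the address."""
    addr = ipaddress.ip_address(address)
    matches = [(net, origin) for net, origin in table if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the entry with the largest prefix length wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

before = best_route(table, "10.20.1.5")

# A second party announces a more specific slice of the same space.
table.append((ipaddress.ip_network("10.20.1.0/24"), "unauthorized origin"))

after = best_route(table, "10.20.1.5")        # inside the /24
unaffected = best_route(table, "10.20.2.5")   # outside the /24, still inside the /22

print(before, "->", after, "| rest of the /22:", unaffected)
```

Nothing in the selection logic asks whether the second origin is entitled to the space, which is the trust problem described above.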
This post is a follow-up to our blog last week about a small Czech provider briefly causing global Internet mayhem via a single errant routing announcement. In this incident, SuproNet (AS 47868) announced its one prefix, 94.125.216.0/21, to its backup provider, Sloane Park Property Trust (AS 29113), with an extremely long AS path. We've gotten more feedback about this entry than any other in recent memory, so we thought we'd try to answer some of the questions that were posed both here and elsewhere, as well as provide some clarification about exactly what went on. The questions we try to address include:
- How could anyone be this dumb?
- Why did this cascade throughout the planet?
- Can you provide more details about the impact and its spread?
- How do we prevent this from happening again?
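One safeguard often discussed for the last question is to refuse announcements whose AS path exceeds some sane length. The following is a minimal sketch of that idea, not a router configuration: the limit of 50 and the function name are assumptions for illustration, and the AS numbers are from the range reserved for documentation rather than taken from the incident.

```python
MAX_AS_PATH_LEN = 50  # hypothetical limit, chosen only for illustration

def accept_announcement(as_path, max_len=MAX_AS_PATH_LEN):
    """Accept a route only if its AS path is no longer than max_len hops."""
    return len(as_path) <= max_len

# A typical short path, and one inflated by repeating the origin AS many times.
normal_path = [64496, 64497, 64498]
inflated_path = [64496] + [64499] * 250

print(len(normal_path), accept_announcement(normal_path))
print(len(inflated_path), accept_announcement(inflated_path))
```

A filter like this stops the long path at the first network that applies it, rather than letting every router downstream discover on its own whether it can parse the message.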
Last August at DEFCON, Alex Pilosov and Tony Kapela presented a talk entitled Stealing the Internet: An Internet Scale Man-In-The-Middle Attack, which illustrated a technique for misdirecting specific Internet traffic via carefully constructed BGP routing messages. Using this approach, an attacker can redirect the incoming traffic of any victim through his own site for further inspection or alteration before ultimately passing it on to the victim. Furthermore, the attack can be carried out in a way that is largely transparent to the victim. Since this talk, Renesys staff have been repeatedly asked "So are people using this technique today?" That is, are people currently "stealing the Internet", and if so, who is attacking whom? Given the volume of routing data that Renesys has at our disposal and the number of tools we have to slice and dice it, we thought this would be a relatively straightforward question to answer. We were wrong.
Although we ultimately succeeded in answering the question and in developing a general Man-In-The-Middle (MITM) detection algorithm for the global Internet, we ended up writing a lot of code over the course of several months and burning through endless CPU cycles looking for attack evidence. Our results were presented this week at Black Hat and the complete presentation can be found here. In this blog, we'll hit on some of the highlights from the presentation.
This weekend, John Markoff wrote an interesting piece for the New York Times entitled Do We Need a New Internet? While his emphasis was largely on security, or rather the lack thereof, the central point Markoff makes is that the Internet may be so hopelessly broken that it could be better to start over, rather than continue to apply band-aids. As if to emphasize this point, SuproNet, a local Czech provider, single-handedly caused a global Internet meltdown for upwards of an hour today. SuproNet accomplished this feat by sending out a rather unusual routing update, one which a lot of routers did not handle very well. The result was Internet bedlam.
It's been an interesting year in many ways, not least for the Internet. This year, I started to contribute in earnest to the Renesys blog and back in January I was wondering "How am I going to find anything interesting to talk about on a regular basis? Nothing much happens on the Internet, right?" Well, it certainly did this year and now I've got many more ideas than I have time to research and write about. In hindsight, I guess it isn't too surprising. As the world becomes more interconnected and more Internet-dependent, we're bound to bump into each other more and expose the limitations of the current system. So let's review what 2008 brought us and take a guess at what is in store for the new year.
As readers of this blog will know, Renesys collects Internet routing data — a lot of it. We use this data in a variety of ways: in determining the impact of cable breaks, natural disasters and deliberate partitionings; in uncovering the source of hijacks or other questionable activity; in analyzing Internet business relationships; and in exploring "what-if" scenarios.
All of our reports and products are based on hard facts and objective analysis. Perhaps the only controversial thing we do with our data is to rank all the service providers in the world: globally, by geography, and by market segment. The rankings are a rather crude measure of size, as they are based entirely on the quantity of IP space ultimately transited by each provider. Although there are obvious shortcomings in this approach, it is certainly objective and the process is fully automated. It also happens to be derived from data that is readily available for all providers. Routing data, unlike most other metrics we could consider using, is inherently public.
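As a rough illustration of a size metric based on quantity of IP space, the sketch below totals the addresses covered by each provider's transited prefixes and sorts the providers by that total. The provider names and prefixes are invented, and this is our simplified reading of the metric, not Renesys' actual ranking procedure.

```python
import ipaddress

# Hypothetical data: prefixes ultimately transited by each provider.
transited = {
    "Provider A": ["10.0.0.0/16", "10.0.1.0/24", "10.1.0.0/16"],
    "Provider B": ["10.2.0.0/16"],
    "Provider C": ["10.4.0.0/24"],
}

def address_count(prefixes):
    """Count distinct addresses, merging overlapping or adjacent prefixes first."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))

# Rank providers from most to least transited address space.
ranking = sorted(transited, key=lambda p: address_count(transited[p]), reverse=True)

for rank, provider in enumerate(ranking, start=1):
    print(rank, provider, address_count(transited[provider]))
```

Collapsing the prefixes first keeps a /24 that sits inside an already counted /16 from being counted twice.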
While everyone wants to be #1 (hence the controversy around rankings), changes in rank can be far more revealing than the actual rank itself. In other words, while there are surely big differences between #1 and #50 in our rankings, the differences between #5 and #6 are much less clear given the nature of the metric. What we tend to look for are abrupt changes and long-term trends. Did a provider just jump in the rankings? Maybe they picked up a large customer or a nearby rival lost one? Who was it? Is another provider showing steady gains in the rankings? Maybe they are consistently taking market share with an aggressive, well-executed business plan in a particular part of the world? This is why changes in rankings matter: they capture some of the dynamics of the business of providing Internet service. With this in mind, we will take a look at the top 13 providers in the world for 2008 and how they have jockeyed for position throughout the year. We will also highlight some of the more interesting changes.
Atrivo (aka Intercage), a Concord, California-based Internet hosting service, disappeared from the Internet for around two days recently. They didn't go bankrupt or suffer a physical catastrophe. Their providers simply shut them down by refusing their traffic. This might very well be the first time in history that the Internet community, a cooperative association of networks with no governing body, has collectively put someone out of business, if only briefly. The alleged sins of Atrivo have been documented extensively, both in the popular media (e.g., the Washington Post) and in technical forums (e.g., Spamhaus and numerous postings to the NANOG mailing list). It is clear that emotions run high with respect to Atrivo, long accused of benefiting from cyber-crime by hosting purveyors of malware, adware, spam, viruses and other cyber-scourges. In this blog, we'll take a quick look at their brief demise and make a few observations.
All reports from Louisiana indicate that power outages as a result of Gustav are extensive and ongoing, with over a million customers still without service and with potentially very long waits ahead of them. The extent of the power outages can be seen in regularly updated maps provided by the state. (A comprehensive list of utilities by region does not seem to be available.) We've even heard from state officials that the power problems are worse now than they were after Katrina. So it would be natural to assume that the ISPs in the state were similarly impacted, but that is not the case. Internet connectivity is alive and well in Louisiana and the other Gulf states, with all major providers operational, via either conventional or backup power. End users should have connectivity once power is restored to their homes. We'll review the past three days from an Internet perspective in what follows.
As the world waited for Gustav to hit the US, at Renesys we wondered how the Internet would fare this time around. Would we see the large-scale, long-term outages we observed during Katrina? Or would the critical communications infrastructure of the region stand fast? As of 19:00 UTC on the day of Gustav's landfall, the score so far is Internet 1, Gustav 0. Connectivity in the region is very good and outages are sporadic. Either we got lucky or we've learned some valuable lessons.
We've been keeping an eye on Georgia all week. It's rather hard not to as the media keeps calling, looking for a juicy story. (It's amusing how the questions can seem designed solely to confirm a story that has already been partially written.) Not being schooled in this "art", we haven't been able to invent any interesting "facts", as the network infrastructure of Georgia has been relatively stable all week. But then today, we did see about one third of the country go away again for an extended period. Since it wasn't the entire country, we didn't rush out and buy oil futures. And since the affected networks did come back, we're assuming this event was due to a temporary (although perhaps extensive) power outage.
As the world watches events unfold in Georgia, all eyes are on the Baku-Tbilisi-Ceyhan (BTC) pipeline, a major source of European oil that is not under Russian control and is projected to carry 1 million barrels a day by 2009. (See this link for a map of oil pipelines in the area.) What many people don't realize is that the cyber world is often built alongside the physical one. That is, those fiber optic cables that carry Internet traffic tend to follow the world's pipelines, bridges, and railroad tracks. Loss of Internet connectivity can therefore imply the physical destruction of vital pathways for trade. And so it is with some interest that we monitored Georgian Internet connectivity over the weekend as hostilities with Russia escalated. This blog takes a quick look at how Georgia connects to the 'net and what has been happening over the last three days.
Sigh. I had been meaning to write this for a long time. Where did the summer go? Anyway, since you are reading this, you've probably heard something along these lines before: "Oh, you work on the Internet?! You must be rich. The Internet is paved with gold!" Right? The fact is that the Internet can be paved with gold for the content providers (e.g., Google, gambling and porn sites), but for the rest of us, it isn't. Not even close. The truth of this is no more evident than in the Internet transit business, namely, those folks who move all the bits around that ultimately build fortunes for the content providers. It's a commodity business with ferocious competition, whose quality of service is difficult for the average person to gauge. When was the last time you volunteered to pay more for electrical service to your house? Or sewer service for that matter? Or even gave it a second thought? In this environment, the transit providers are under tremendous pricing pressure and have only two options: grow or die. To grow, they can enter new markets and/or buy up the competition. Sometimes they purchase licenses to Renesys' Market Intelligence to help them explore the marketplace. To die, all they need to do is stand still.
This blog entry is motivated by France Telecom's recent failed bid to purchase TeliaSonera, and explores the characteristics of both companies and what they would have looked like as a combined entity. Mergers in the industry are never good for Renesys (fewer potential customers), but we do have the data to consider some of the implications. Let's get started.
When we wrote about the issues surrounding the management of the L root, four questions came to mind immediately, which we will review here by way of a concluding blog on this topic. We also presented this work and our questions at NANOG 43 and the OARC 2008 DNS-Operators Workshop. Unfortunately, we don't have many answers and welcome clarification from anyone in the know. The questions are:
- Why wasn't ICANN using their own IP space?
- Why the change after 10 years?
- Why wasn't the old space simply given to ICANN?
- Why all the bogus L root servers?
We will summarize what we know about these issues.
Our blog on the L root server received quite a few comments, both at our site and others (e.g., Slashdot, CircleID, and various DNS newsgroups). Negative responses tended to follow a "no harm, no foul" line of reasoning, which sadly misses the point completely. So we'll restate the central issue here and talk about safeguards you can take today if you operate a DNS server or BGP-enabled router.
There have been a number of attacks on the root name servers over the years, and much written on the topic. (A few references are here, here and here.) Even if you don't know exactly what these servers do, you can't help but figure they're important when the US government says it is prepared to launch a military counterattack in response to cyber-attacks on them.
This posting is about an attack on one such root name server. Actually, "attack" isn't really an appropriate term. It was not really an attack or a hijack or even identity theft. For one thing, these terms imply the existence of both a victim and a villain. In this story, the villains are not obvious and there might not have been any victims. And as we will see, you can't really steal something you own. All we can say for certain is that many of you, if not most, probably used an unauthorized root name server over the past few months and were blissfully unaware of it. These bogus servers may have acted just like a normal root server, providing the correct answers to your queries without logging your requests. But since these servers are now shut down, we can no longer investigate what they were doing. And we can only guess at the motivations of those who set them up.
Here at Renesys, we've almost come to expect that natural disasters will be immediately reflected in changes to Internet routing. We've certainly seen that in events such as Hurricane Katrina and the Taiwanese earthquakes. So it was with some surprise that neither the earthquakes in Sichuan province in central China nor the Myanmar cyclone registered so much as a blip on our Internet radar.
We currently geo-locate 3 networks (prefixes) to Myanmar and over 2000 to Sichuan province. Over the course of these unfortunate tragedies, we have seen only a normal level of network instability or outages. In the case of China, since the large providers into the country tend to do a good job aggregating prefixes, visibility into the behavior of smaller prefixes only comes from having in-country sources of data. But even our Chinese peers show nothing abnormal with respect to Sichuan networks. Hopefully the apparent lack of damage to the communications infrastructure in these areas will help speed relief efforts.
On March 28th at 17:52 UTC, we saw the peering link between Telia and Cogent come back up. Recently, peering disputes, especially with Cogent, tend to be all about traffic ratios: as long as both parties send roughly the same amount of traffic to each other, life is good. But when the ratios get out of whack, someone's feelings get hurt (more specifically someone's business model is threatened). Before the de-peering, we would typically see Cogent using Telia to reach around 2700 networks (prefixes). Now that count has dropped to just about 1450 networks. On the other hand, Telia used to reach approximately 7000 networks via Cogent and that number has now increased to almost 8600. So was Cogent sending too much traffic to Telia before? Did Telia then do something to provoke Cogent to turn them off (like send a bill)? We'll never know definitively, but someone blinked and the Internet is now whole again.
While this is good for the Internet, Cogent claimed that this dispute was about capacity issues and no one orders and installs new high capacity circuits in a week, especially during a contract dispute. So if there was a capacity issue, there is still a capacity issue. As a result, the situation is bound to be very fluid for the next few weeks. We'll update this blog as we analyze the resulting shifts in routing.
As in most lovers' quarrels, it is difficult to objectively evaluate the claims of the combatants. Naturally, we tend to side with the person we know best, as it's their viewpoint we hear most often and are inclined to be sympathetic towards. Both Cogent and Telia are claiming to be the aggrieved party in their peering dispute and are now making their case in the court of public opinion. We will almost certainly never know the details of their private business relationship, but we can make a few more inferences from the data. Let me state up front that, like many major ISPs, Telia and Cogent are customers of Renesys and we love them both equally. Everything we report in our blogs is based on objective analysis of our global data, independent of our own business relationships.
Cogent and Telia are having a lovers' quarrel and, as a result, the Internet is partitioned. That means customers of Cogent and Telia cannot necessarily reach one another. This was not due to a configuration error or a physical cable break. This is the way the Internet works and sometimes doesn't work. If the businesses that run the show don't play nice with one another, their customers can pay the price of being cut off from parts of the 'net. At least when Pakistan mistakenly hijacked YouTube, the matter was sorted out in hours and did not require the cooperation of Pakistan. The Cogent/Telia tiff has been going on for 4 days now and only they can resolve their differences. The rest of the world can only hope for full connectivity to be restored.
In the past 14 months, the world has seen two catastrophic failures of its global telecommunications systems: the Taiwan quakes, which snapped 7 of 9 important cables in Asia in December 2006, and a series of mishaps in the Mediterranean and the Gulf, damaging several others. In a world increasingly dependent on global trade and communications, what lessons can we learn from all of this and what measures should we take?
I'll discuss these questions in what follows, but let me warn you in advance. There is nothing earth-shattering here. In fact, I can save you time and sum up the entire discussion with three bullet points:
- You get what you pay for.
- Entropy happens.
- Geography matters.
We've seen a lot of comments and discussion that fail to take into account one or more of these basic truths. Let's look at each point in detail.
We started this blog thread last week, when we only had two broken cables to consider, but since that time there have been reports of several more failures and they seem to keep coming in. As far as this thread is concerned, the first two parts (here and here) focused on the countries and local providers most impacted on the day of the first two cable failures. We then looked at the providers of some of the harder-hit countries and how they were able to restore connectivity (or not) during the subsequent 48 hours. And along the way, we felt obliged to counter some nonsense circulating on the Internet claiming that Iran had been cut off. It's been a busy week and we've barely scratched the surface. But plowing ahead, we will take an extended look at two local providers, Bharti in India and DCI in Iran, and how they weathered the storm. One week later, how are these two local providers gaining access to the global Internet? What has changed? We will use these examples to provide a glimpse into what can be discovered by collecting up enough public routing data from enough carefully selected places, combining it with geo-location information and then doing an enormous amount of processing.
Let me repeat, Iran is not disconnected from the Internet!
We have gotten a few queries about why we did not highlight Iran in our review of the network outages that resulted from the cable breaks. (See here, here and here.) Like most countries in the region, the outages in Iran were very significant, but for the most part they did not exceed 20% of their total number of networks. Now 20% is a significant loss, but in the context of an event where countries lost almost all of their connectivity, such a loss did not place Iran into the top 10 of impacted countries. So we focused most of our attention where the losses were the highest.
Our first two blog entries on this topic focused on the events of 30 January 2008, when two submarine cable systems were damaged. These systems provided much of the capacity into the Middle East and the Indian subcontinent from the west. Although some countries were hurt more than others, the loss of connectivity was extensive and very widespread. Some countries and a few providers were almost completely knocked off the Internet. As Day 1 came to a close, it was clear that the damaged cables were not going to be repaired anytime soon and the impacted parties would have to look for alternatives to waiting it out.
Day 2 and 3 saw a frenzy of activity as local providers in the region tried to broker agreements with anyone who still had capacity. They were under intense pressure to restore service to local governments and businesses. In turn, global and regional providers with surviving capacity into the region were busy hunting for new customers. We definitely had a seller's market. At Renesys, we watched all of the activity with great interest and decided to wait until the end of Day 3 to report on the winners and losers, after the initial deals were made and things had settled down to some degree.
After looking at the countries most impacted by the cable cut in our first blog on this topic, we now turn our attention to the Internet service providers in the region and how they fared. Due to differences in network architecture, cable ownership, and transit purchasing, carriers in the same country may not all experience the same degree of outage. For all of the following, we consider a network to be "outaged" when it is unreachable from the perspective of the broader Internet, as represented by Renesys's 250 peering sessions.
The following two tables provide the top 15 providers with the largest number of outaged networks. We list the provider's name, the country in which most of their unreachable networks are located and their autonomous system number (ASN), an assigned number that uniquely identifies their organization on the Internet.
In the first table, we list the providers in decreasing order by total number of outaged networks. In the second table, we list them by decreasing order of the percentage of their networks that are unreachable. Not surprisingly, the hardest hit providers are located primarily in the hardest hit countries: Egypt, Kuwait, India and Pakistan. One local provider in each of Egypt and Kuwait lost essentially all of their Internet connectivity.
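For readers who want to try a similar ranking on their own routing data, here is a minimal Python sketch of the two orderings described above. It is not our production pipeline. The provider names, ASNs, prefixes and session counts are made up for illustration (drawn from the reserved documentation ranges), and treating a prefix as outaged when zero peering sessions still carry a route to it is our simplifying assumption about what "unreachable" means.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Provider:
    name: str      # hypothetical provider name
    country: str   # simplification: one country label per provider
    asn: int       # autonomous system number
    # prefix -> number of peering sessions that still see a route to it
    visibility: Dict[str, int]


def outaged_prefixes(provider: Provider) -> List[str]:
    """Assumption: a prefix is outaged when no session has a route to it."""
    return [p for p, seen in provider.visibility.items() if seen == 0]


Row = Tuple[str, str, int, int, float]  # name, country, ASN, count, percent


def rank(providers: List[Provider], top: int = 15) -> Tuple[List[Row], List[Row]]:
    """Return (rows sorted by outage count, rows sorted by outage percent)."""
    rows: List[Row] = []
    for p in providers:
        total = len(p.visibility)
        down = len(outaged_prefixes(p))
        pct = 100.0 * down / total if total else 0.0
        rows.append((p.name, p.country, p.asn, down, pct))
    by_count = sorted(rows, key=lambda r: r[3], reverse=True)[:top]
    by_percent = sorted(rows, key=lambda r: r[4], reverse=True)[:top]
    return by_count, by_percent


# Made-up sample data: documentation ASNs and documentation prefixes only.
SAMPLE = [
    Provider("ExampleNet A", "Egypt", 64496, {
        "192.0.2.0/25": 0,
        "192.0.2.128/25": 0,
        "198.51.100.0/24": 12,
        "203.0.113.0/25": 250,
    }),
    Provider("ExampleNet B", "Kuwait", 64497, {
        "203.0.113.128/25": 0,
    }),
]

by_count, by_percent = rank(SAMPLE)
```

The two lists can disagree on who comes first: in the sample, the larger provider leads on raw count while the small one, having lost its only prefix, leads on percentage. That is why we present both tables.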
Early this morning local time, two cable systems north of Alexandria, Egypt were severed, greatly impacting both Internet and voice traffic to the region. The broken cables are operated by Flag Telecom and SEA-ME-WE 4, and if past undersea cable cuts are any measure, repair time will be measured in weeks, not days. This is a preliminary report on the countries most impacted by this failure, as seen from the perspective of Internet routing.
Figure: Which way is up?
Since I sometimes find myself hopelessly lost, I tend to wonder about global navigation in the days before GPSes or even accurate maps. I imagine you started off with just a general idea of where you wanted to go (e.g., "The New World"), crude navigational aids (the stars, Sun and Moon when you could see them), and hearsay from your fellow travelers or the locals about your proposed course. In addition, you only had a view of the world from your current location, limited by the curvature of the earth.
Maybe it speaks to a risk-averse nature, but I've always been interested in failure and in learning from the mistakes of others - obviously so I don't have to learn such lessons first hand. This is particularly important when you engage in activities where bad decisions can kill you. But generally, as any book on mountaineering mishaps demonstrates, it takes a series of errors in the "correct order" and at the wrong times to cause you serious harm.
In high-risk activities under adverse conditions, it's not hard to make poor decisions that you would never contemplate from the comfort of your favorite living room chair. But while there is little risk to life and limb on the Internet, its very connectedness means that the blunders of pretty much anyone can impact you. What is important in this environment is the half-life and the reach of the mistakes. Those that are local and die out quickly have little chance of resulting in global mayhem. Others compound with all the other endless screw-ups regularly going on and eventually become a giant avalanche careening downhill, collecting mass and bearing down on the sleeping village below. This is one of those stories. It might be true or it might not. Your opinion depends on how much imagination you think we have!
