
Some of Virgin Media’s cable broadband customers have been left unable to access a number of popular websites (e.g. Stack Overflow) for approximately two weeks due to the ISP’s apparent lack of a Service Level Agreement (SLA) with “the network where we think the issue is” (i.e. Internap).
A Virgin Media spokeswoman told The Register yesterday that it was “aware some of our customers are having difficulty” and is now actively “liaising with the network provider to understand what may be causing this specific issue”.
Mark Wilkin, VM’s Support Forum Manager, added (VM Support Forum):
“Also for reference the majority of the internet (excluding dedicated peering connections) still operates on the basis of “you pass my traffic along and I’ll pass yours” agreements that operate on a “handshake” basis. So in this case we don’t have a SLA with the network where we think the issue is.
We’re talking to internap at the moment and we’re also currently attempting to contact stackoverflow.com as well to talk to them about getting this sorted out.”
Stack Overflow, part of the Stack Exchange Network, is a hugely popular question and answer site for professional and enthusiast programmers. The Stack Exchange itself claims to be a fast-growing network of 99 question and answer sites that are typically used by around 3 million people.
It should be said that the issue does not affect everybody, only those connected to certain parts of the operator’s network. In the meantime those wanting to visit Stack Overflow can either continue to view the “Request timed out” message until it’s resolved or simply avoid Virgin’s network with a free web proxy. We haven’t been able to test it ourselves but switching to OpenDNS might also work.
If the problem is DNS-based, then using a different server such as OpenDNS would resolve the issue, but if it is a routing-based problem (likely, given the statement) then changing DNS would not have any effect.
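One rough way to tell the two cases apart is to separate the name lookup from the connection attempt. The following is a minimal Python sketch, not a definitive test: the function name `diagnose` and its `resolve` and `connect` parameters are made up for illustration, and a failed connection cannot by itself distinguish a routing fault from a server that is simply down.

```python
import socket


def diagnose(host, port=80, timeout=5.0,
             resolve=socket.gethostbyname,
             connect=socket.create_connection):
    """Return "dns", "unreachable" or "ok" for a single connection attempt.

    resolve and connect are injectable (an assumption of this sketch) so the
    logic can be exercised without touching the network.
    """
    try:
        # Step 1: name lookup. A failure here points at DNS.
        address = resolve(host)
    except socket.gaierror:
        return "dns"
    try:
        # Step 2: TCP connection to the resolved address. A timeout or
        # refusal here means the name resolved but the path or server failed.
        conn = connect((address, port), timeout=timeout)
    except OSError:
        return "unreachable"
    conn.close()
    return "ok"
```

Calling `diagnose("stackoverflow.com")` from an affected line and getting "unreachable" would suggest that switching to OpenDNS is unlikely to help, since the lookup itself succeeded.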
Networks exchange IP packets between border routers. Every network (other than the very largest “tier 1” carriers) must have at least one transit provider, though typically several, and can have as many private peers as it can get. Transit costs the ISP money, whereas peering is generally free. As such, you push as much traffic over peering as you can, and it is typically a best-effort arrangement with no SLAs. Transit, however, is a paid-for service and would typically have an SLA attached.
Transit will get you access to 100% of the internet, whereas peering will only give you access to a small fraction of it (basically your peer’s own network and its downstream customers).
Border Gateway Protocol (BGP) is used to interconnect separate networks. ISPs strive to ensure that they have at least two routes to any given destination on the internet for redundancy purposes. This is why they have multiple transit providers. A destination may be reachable via numerous transit providers and peers. All else being equal, BGP will select the route with the shortest AS path, i.e. the fewest networks traversed rather than the fewest router hops, although local policy can override this. If Virgin Media are seeing issues in their routing between two networks then they should adjust their routing policy to send data over a different path. This is called traffic engineering.
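As a rough illustration of that selection and of how traffic engineering overrides it, here is a heavily simplified Python sketch. It models only two steps of the real BGP decision process (highest local preference first, then shortest AS path); the `Route` class, the `best_route` function, the provider names and the AS numbers (taken from the range reserved for documentation) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str              # hypothetical name of the transit provider or peer
    as_path: tuple        # AS numbers the route traverses, nearest first
    local_pref: int = 100  # operator-set preference; higher wins


def best_route(routes):
    """Pick a route: highest local_pref first, then shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Two ways to reach the same destination network (AS 64510).
short = Route("transit-a", (64500, 64510))
longer = Route("transit-b", (64501, 64502, 64510))

# By default the shorter AS path wins.
print(best_route([short, longer]).via)  # transit-a

# Traffic engineering: the operator prefers transit-b despite the longer path.
preferred = Route("transit-b", (64501, 64502, 64510), local_pref=200)
print(best_route([short, preferred]).via)  # transit-b
```

Raising the local preference on the alternative route is one common way an operator steers outbound traffic away from a path that is misbehaving.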
Traffic engineering is done for several reasons. This could be least cost, political (such as routing a lot of traffic over a link which you know shall cost the 3rd party network a lot of money in transit fees in order to increase the chances that they would peer with you directly, thus reducing your transit costs) or best path.
I guess VM are going with the least cost option.
You could improve your connection by pushing your traffic over a VPN where the “other end” is on a network outside of Virgin Media, provided the link between Virgin Media and the “other end” is OK. The “other end” must be chosen so that the route from there to the affected server (Stack Overflow, for example) takes a different path than the route directly from Virgin Media.
This would require knowledge of routing and looking glasses to see the BGP AS-Paths.
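Once the AS paths have been read off a looking glass, the comparison itself is simple. The Python sketch below assumes the paths have been copied in by hand; the function name `exits_avoiding`, the exit names and the AS numbers (from the documentation range) are hypothetical and do not describe the real Virgin Media or Internap topology.

```python
def exits_avoiding(suspect_asn, paths_by_exit):
    """Return the names of exits whose AS path does not cross suspect_asn."""
    return sorted(name for name, path in paths_by_exit.items()
                  if suspect_asn not in path)


# AS paths towards the affected server (AS 64510), as seen from each
# candidate starting point. AS 64500 stands in for the suspect network.
paths = {
    "direct": (64496, 64500, 64510),
    "vpn-a": (64497, 64500, 64510),
    "vpn-b": (64498, 64501, 64510),
}

print(exits_avoiding(64500, paths))  # ['vpn-b']
```

If the suspect network is the destination's only upstream, every path will cross it and the list comes back empty, in which case no choice of VPN exit would help.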
@Bob – Although they can control which provider they use for traffic leaving their network, unfortunately they have no control over the return path. That said, I very much doubt they would have made any changes; too much like hard work.