UPDATE7 Major BT Fault Affecting UK Internet and Broadband Services

Wednesday, July 20th, 2016 (10:06 am)

Multiple ISPs are reporting that a fault, which is most likely on BT’s national broadband network, is affecting Internet connectivity for a large number of people across the United Kingdom (i.e. disrupting access to online services, websites and slowing connection speeds).

The problem itself began after 8am this morning and does not appear to be affecting every ISP and area, although it is extremely widespread and right now none of the ISPs are able to provide a clear picture of the cause.

BT Service Status Update

We currently have a major broadband problem that is preventing webpages from loading for some customers. This is also stopping some users from logging into their email accounts via webmail. Our engineers are working to fix this as quickly as possible.

However, two ISPs have pointed the finger at a possible problem at a major London data centre, which may be suffering from a routing problem that is connected to a power failure (if it’s routing then a fix shouldn’t be far behind). Until we know more, here’s a selection of some ISP tweets.

UPDATE 10:21am

According to Entanet, “The problem has been confirmed on BT’s side and they are currently investigating. We haven’t been provided anything relating to the cause as of yet, but we will update upon receipt of this information. Unfortunately the BT systems remain down preventing us from looking into individual line issues.”

UPDATE 10:43am

The official line from BTOpenreach is as follows: “We’re sorry that some BT and Plusnet customers are experiencing problems accessing some internet services this morning. This is due to power issues at one of our internet peering partners’ sites in London. Engineers are working to fix things as fast as possible.” Of course it’s not just BT and Plusnet customers being affected.

On the upside some traffic now appears to have been rerouted and customers are reporting an improvement, but there are still a lot of complaints coming in. We suspect this will be resolved fairly soon.

UPDATE 10:52am

The power failure appears to have occurred at London’s TeleCity data centre.

UPDATE 11:07am

A private report from TeleCity, which was sent to certain providers earlier today and seen by ISPreview.co.uk, suggests that there was a power failure with one of their Uninterruptible Power Supply (UPS) systems at 8/9 Harbour Exchange (LD8) in London.

The fault hit a key router (edge4-tch), although this was technically resolved at 9:15am. However the knock-on impact for ISPs like BT and their clients (consumers, businesses and other ISPs etc.) can often take a little bit longer to resolve (computer networks are complicated animals).

Connections should be slowly getting back to normal.

UPDATE 11:57am

BT informs us that the problem is affecting around 10% of Internet usage on their network. So essentially, when users go to certain websites which are routed via the faulty part of the network, around 10% of their attempts to connect to the website they want to go to may fail. However, as posted above, normal services are slowly being restored.

UPDATE 1:17pm

The London Internet Exchange (LINX) has furnished us with the following statement to help clarify their role in events: “This morning between 07:55 BST and 08:17 BST, one of the Datacentres that houses equipment for The London Internet Exchange (LINX) experienced a partial power outage. This affected only one of a number of internet peering nodes that LINX operates at the facility, and service was fully restored on the LINX network at 09:15 BST. Several reports claim that this outage affected British Telecom and their services at LINX. While several networks connected to LINX were indeed affected, BT was not one of them. LINX provide two fully redundant platforms to offer better resilience to the UK’s Internet infrastructure, the second platform was not affected by this power outage.”

UPDATE 21st July 2016

A very similar fault has struck again this morning (read our coverage).

16 Responses
  1. Kits

    I noticed that BT speedtest is also down so looks like BT to me.

  2. DTMark

    That’s a timely update – just had one of our customers on the phone (Holborn, London) with this issue, and I was able to find this information here.

  3. Ignition

    Openreach don’t have a national broadband network, Mark. That network belongs to BT Wholesale.

  4. Alanc

    This outage was affecting me but everything is working fine now.

  5. Steve Jones

    The problem was finally resolved for me at about 11:15 and had been running for about 3 hours. It’s not the first time power outages at Telehouse have caused problems. In any event, in this era it’s a bit concerning that there is a dependency on any single physical location, no matter how internally resilient it is.

    I would have thought that from a national security perspective no critical network services ought to be dependent on single physical locations.

    • Olorin

      I understand your viewpoint Steve, but this is impossible to achieve in practice. There are over 500,000 independent networks around the world, who each need to be able to connect with each other. We have ‘transit’ providers to achieve long-distance connections, but to maximise speeds and network quality, networks need to ‘peer’ locally with one another. This involves operating heavily connected points of presence, where lots of networks come together for this specific purpose.

      London’s Docklands area is the key place in the UK for this to happen. In addition, international fibre (undersea cables) comes into these key data centres in Docklands, so the largest of networks MUST be present here to connect. There are other key points in the UK such as Manchester and Ireland, but London is actually the world’s largest internet exchange/connectivity point in terms of traffic when we combine all IX, DC and transit providers’ data usage.

    • DTMark

      So who was actually affected by this? For example EE 4G has been working here perfectly well; I’ve had calls from customers to say that they can’t access their stuff, which is loading fine from here with no hiccups this morning.

      Apart from the Halifax website which breaks quite often anyway and would seem to be unrelated, I haven’t seen any issues at all.

    • Steve Jones

      @Olorin

      I’m perfectly aware of the difficulty of all this. I worked on extremely highly available IT systems for decades. It is horrendously complicated, especially for “stateful” systems, which involve issues of data and state replication, transactional replication and so on. I’ve no doubt that London docklands is the landing point for a lot of core networks, and I’m also aware how expediency creeps in over time as it’s easier to bring everything to single points rather than design geographic resilience into networks and systems.

      There is no issue over the use of peering/network interchange systems. Indeed, defining major architectural components as common functional units is one of the key ways of making systems manageable. However, those common functions have to be inherently resilient, and it’s rather more practical with networks than with systems.

      There is also an issue of national security. I have visited Telehouse (but not for perhaps a decade), and any single location is a potential target for a localised disaster, whether natural or man-made. I recall many years ago an IRA truck bomb exploding in docklands, not too far from the original Telehouse site from what I recall.

      Yes, it’s difficult, and yes it’s expensive, but the scope for a major network outage which could cause huge national disruption is simply enormous.

      The issues are possibly made worse as Telehouse not only acts as a location for strategic network interconnects, but as a hosting centre. OK, Telehouse now has more than one dockland location, but the suspicion is that this is more often for capacity rather than true geographical resilience.

    • Olorin

      I’m not really sure I follow, Steve. I understand your point of a (potential) security breach, but you could say the same for Thames House, SIS Building, Houses of Parliament and so on.

      Equally, the consequences of losing connectivity would be felt around the world, but remember there are hundreds of key connectivity sites/locations around the world just like London’s Docklands. Los Angeles has One Wilshire, New York has 111 8th Avenue – taking either of these down would collapse a great portion of the world’s connectivity.

      There are over ten independent data centres spread around Docklands, many more very close by, and these are all interconnected. Sure, they could be spread 1+mile from each other, but at the hefty cost of less performance to their end users, where it really matters. The balance of physical diversity versus raw performance is a never-ending battle for network and hosting operators. What’s key is nobody has the ‘right’ answer – they all do things differently, and there are copious ISPs of every nature out there to suit the needs/requirements/worries of every customer.

    • Steve Jones

      @Olorin

      Yes, there are other important single sites outside the domain of network/computing, but some are false comparisons. Yes, the Houses of Parliament are (probably necessarily) single site, but if the Thames flooded and took the whole site out, it would not stop the UK running. Human networks are generally naturally resilient just because people are very flexible. Major infrastructure has to be designed with resilience in mind.

      The issue of common-mode failures due to co-location is often neglected, as it’s expedient to just ignore it. The problem is that telecommunications and systems are now so embedded in modern society that outages can cause enormous economic damage very quickly. In systems designed to work on near-instantaneous access, the consequences of failures can quickly escalate. Modern society is, in many ways, much more fragile than it used to be due to the lack of inherent resilience. It is said that the country is only three square meals from chaos.

    • Olorin

      Steve, what do you propose as the solution to this problem you describe?

  6. DTMark

    Only in the language of broadband double-speak could an uninterruptible power supply be interrupted 😉

    • Steve Jones

      Anybody who has ever worked in a major data centre (and I’ve worked in a few) can tell horror stories about UPSs (and auxiliary generators) failing. They are immensely complex (as are all aspects of highly available systems). Really it’s impossible to guarantee 100% of anything at a single physical location.

      nb. this is far from the first power outage at Telehouse.

    • dragoneast

      I just love the idea of a country “three square meals from chaos”! Sums up my dietary principles.

      But seriously, are some of us with the wrong ISP? My ISP’s fall-back kicked in and kept everything running seamlessly this morning.

      More seriously, of course there’s a risk of meltdown. The issue keeps coming up periodically. Although I doubt any of us are fully aware of the resilience that critical (and I mean critical to the country, not “just me”) networks have – why would that information be publicly available? That being said, I’m sure from what we do know that a number of “solid” institutions, e.g. banks and a lot of data processors, have less than they ought to. But people, rather than networks, have a lot more resilience than the pros give them credit for. In the real world we often respond to risks after, rather than before. For a very good reason: try to eliminate all risk and you’d never stop. We have other priorities, rightly or wrongly in the opinion of any of us, individually. Money doesn’t grow on trees, whatever we like to think.

  7. nice-try-at-pulling-the-wool-over

    Both of LINX’s London peering LANs are present in at least 10 datacenters of which more than half are located in Docklands so there is no excuse for any ISP, from small hosting provider to national network operators, to claim that they had no option to take a redundant connection to the UK’s largest Internet Exchange in another location.

    Furthermore the suggestions that automatic failover in the event of a network losing either an upstream provider or IX connection is “impossible to achieve in practice” as there “are over 500,000 independent networks around the world” are simply untrue as the primary purpose of routing protocols such as BGP is to facilitate the dynamic re-routing of traffic when connectivity to the previous best path is lost.

    I have very little sympathy for networks who fail to adequately plan their capacity to cope with a single POP or provider failure as, regardless of how many 9’s the provider puts in their SLA, in the real world beyond the whiteboard it’s a case of when something will fail rather than if.

    So there may be other factors in play, but if, as reported, the issue was caused by one of BT’s multiple public peering ports being down, which itself was a result of a power issue at the datacentre, the impact on a well designed and managed network should have been limited to at most the 180 seconds it takes for a BGP session with default timers to go down… which does not appear to be the case here 🙁

    • Evan Crissall

      Thank you for your cut-to-the-quick analysis. A refreshing change from the implausible bluff of BT sockpuppets who overwhelm these forums.
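
The failover behaviour described in comment 7 can be illustrated with a small, heavily simplified sketch. It assumes the 180 second hold timer cited in that comment (real defaults vary by implementation), reduces BGP best-path selection to just two tie-breakers (local preference, then AS path length), and uses made-up session names (`ix-port-a`, `ix-port-b`, `transit`). It is not a model of BT's or LINX's actual configuration, only a picture of why a single dead peering port should cost a well-provisioned network a few minutes at most.

```python
from dataclasses import dataclass

# Assumption: the 180 s hold timer quoted in the comment above.
HOLD_TIME_S = 180


@dataclass(frozen=True)
class Path:
    peer: str          # hypothetical session name, for illustration only
    local_pref: int    # higher is preferred
    as_path_len: int   # shorter is preferred


def live_paths(paths, last_keepalive, now):
    """Keep only paths whose session was heard from within the hold time."""
    return [p for p in paths if now - last_keepalive[p.peer] < HOLD_TIME_S]


def best_path(paths, last_keepalive, now):
    """Pick the preferred live path, or None if every session has expired."""
    candidates = live_paths(paths, last_keepalive, now)
    if not candidates:
        return None
    # Simplified decision process: highest local_pref, then shortest AS path.
    return min(candidates, key=lambda p: (-p.local_pref, p.as_path_len))


PATHS = [
    Path("ix-port-a", local_pref=200, as_path_len=1),
    Path("ix-port-b", local_pref=200, as_path_len=2),
    Path("transit", local_pref=100, as_path_len=3),
]


def keepalives_at(now):
    """ix-port-a goes silent right after t=0; the other sessions stay healthy."""
    return {"ix-port-a": 0, "ix-port-b": now, "transit": now}


def first_failover_second(limit=600):
    """Return the first second at which traffic leaves the dead port."""
    for t in range(limit):
        chosen = best_path(PATHS, keepalives_at(t), t)
        if chosen is None or chosen.peer != "ix-port-a":
            return t
    return None
```

Calling `first_failover_second()` with these inputs shows traffic staying on the dead port until the hold timer expires and then moving to the second exchange port. Whether the surviving paths have enough capacity to carry the shifted traffic is a separate planning question that this sketch does not model.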


Copyright © 1999 to Present - ISPreview.co.uk - All Rights Reserved