More than four thousand customers on CityFibre’s full fibre broadband network in the North England city of York (Yorkshire and the Humber), which is used by various ISPs (Vodafone, Zen Internet, iDNET, TalkTalk etc.), are this afternoon being affected by a major network outage.
The incident, which appears to have started at around 11am this morning, resulted in customers losing internet connectivity and appears as if it could be related to the failure of a core router. But at present it’s unclear why the router failed or when local services will return to normal. All of the operator’s resolver teams (FLM, ERS, and Magdalene) have been engaged to help restore the service.
We are expecting another update on the progress of this effort any minute.
UPDATE 2:38pm
One of CityFibre’s ISPs, iDNET, has just issued the following update: “Following the engineers visit a power issue was identified which has now been resolved. Any services affected should come back up shortly.” ISPreview understands from other sources that the power issue was caused by a tripped breaker at their Fibre Exchange (FEX).
UPDATE 1st Jan 2025 @ 10:08am
The service fell over for a second time last night, but was later restored. This morning we’ve seen the following status update from CityFibre’s network team.
CityFibre Network Status
Services have now been fully restored for all affected customers. The on-site specialist electrician has confirmed that the issue was caused by the FEX overheating, which led to a power failure within the rack.
CityFibre is currently arranging for the appropriate team to attend the York FEX site to repair the failing AC unit responsible for the power issues. In the meantime, Magdalene will remain on-site overnight to monitor the rack’s power supply and ensure no further disruptions occur.
At this time, the estimated time of arrival for AC support is yet to be determined. However, it has been confirmed that this will not take place until later today. Major Incident Management will remain engaged in overseeing the incident but will place it on hold until 10:00 AM to allow time for a confirmed ETA for the AC support team.
Don’t breakers usually trip to protect against a fault on the circuit, rather than being the root cause themselves?
Breakers themselves can be faulty, but if there were no other faults found, it’s most likely that there was some condition caused by transient over-voltage.
Also, equipment of that magnitude will have A and B power feeds, so it sounds like a poor design if one breaker tripping can take out a core router!
Most Openreach exchanges, save really huge ones with core nodes, are single-fed by standard low-voltage, 120+ amp residential-style supplies. Grid power is reliable enough that a single feed is fine, and UPS with gensets can cover any small issues. It’s far cheaper to operate this way than to install dual-feed, high-voltage, multi-substation supplies.
The problem today with CityFibre could have tripped two breakers if a PSU popped, for example. That’s very common, unfortunately, and breakers are very sensitive to any transient spikes.
Redundant PSUs fail in pairs far more often than you might imagine. In any piece of equipment they will both have come off the production line at basically the same time, so any systemic issues will be present in both of them. One gives up, the other decides there’s no chance it’s taking the load of both of them, and promptly fails too. I’ve been at the sharp end of it more times than I care to think about.
Lesson learned, it can’t be repeated again!
And it’s just gone down again!
For a critical telecoms data centre it should have separate A and B power supplies fed from different electricity substations, automatic transfer switches so that both A and B legs can be maintained at the racks if one of the incoming supplies is lost, and all of the data centre equipment in it should be dual power supply on both legs. My employer won’t allow any single power supply kit to be installed in the data centre regardless of how critical it is.
Looking at where CF locate their FEXs, I would doubt they’d have proper datacentre-grade, fully independent power supplies. Many are tucked away in converted industrial units or a glorified portacabin, with a small generator backup.
As with most things, you get what you pay for.
It’s not a data centre, it’s a transmission site. Apart from a few key transmission sites by the bigger players there won’t be many with a dual feed in from the grid.
Where they diversify is at the DC supplies. So probably one AC supply in; depending on how big the site is, it could even have its own 11kV supply.
The DC supplies should be separate, feeding A and B supplies on the equipment, with battery backup. The AC feed will also be generator-backed. Ultimately the batteries should only be there for the few seconds it takes the generator to start up, but they also have enough capacity to hold up the site in the event of a generator failure.
I’d struggle to characterise 4,000 customers as critical. We’re not talking LINX scale infrastructure.
As always, the question to ask their ISP customers is ‘what SLA are you paying for?’ It’s unlikely to be 5 or even 4 nines.
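To put rough numbers on what those ‘nines’ would actually buy you, here’s a quick back-of-envelope calculation (generic availability maths, not anything from CityFibre’s actual contracts):

```python
# Allowed downtime per year for common availability ("nines") targets.
MINUTES_PER_YEAR = 365 * 24 * 60

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    allowed = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label} ({availability}): ~{allowed:.1f} minutes of downtime per year")
```

Even four nines only permits roughly 53 minutes of downtime a year, which a multi-day outage blows through many times over; residential products are nowhere near that.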
Let’s put this in perspective. We’re talking about 4,000 residential users here. With a 1:64 split ratio that could easily be served by a single street cabinet. And no, street cabinets wouldn’t have 11kV dual feeds…
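For what it’s worth, the back-of-envelope maths on that (the figures here are illustrative assumptions, not CityFibre’s actual port counts):

```python
# Roughly how many PON ports would ~4,000 subscribers need at a 1:64 split?
# Purely illustrative figures.
import math

subscribers = 4000
split_ratio = 64
ports_needed = math.ceil(subscribers / split_ratio)
print(f"~{ports_needed} PON ports at a 1:{split_ratio} split")  # ~63 ports
```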
It’s all gone down again everywhere – very frustrating!
It’s just gone down again 🙁
And it’s off again
It’s gone again, so CityFibre’s very rigorous troubleshooting of “flip it back on and see if it trips again” looks to have come up short.
And it’s back down again since 18:30
Vodafone broadband still out in York this evening. They tell me it’s a mass outage in the area.
YO24 3AJ
Mine is still down at 22.13
The fault has returned for us here in York, unfortunately. Thank goodness for 4G/5G!
Down again as of 18:45. Still no resolution, but a specialist electrician is en route and the next update is due at 00:30.
Apparently the specialist electrician was Hooting with his Nanny at around 00:30. 🙂
Still not working in Acomb
Is there any update yet?
Anyone up and running in Acomb?
The second update from CityFibre further illustrates how poorly set up their infrastructure seems to be. It should not be possible for an air conditioning failure to go undetected until it causes an equipment failure. There are multiple ways an AC failure can be detected: the AC system itself should be able to trigger an alert to the maintenance team that there is a fault; environmental monitoring of the room and rack temperatures should detect that the temperature has exceeded a pre-set safe level and trigger an alert; and the FEX itself should be able to trigger an alert based on its internal temperature sensors. We have all three methods configured to automatically raise an incident ticket with a 30-minute SLA for an on-site technician to go and investigate.
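As a rough sketch of what that kind of threshold alerting can look like (the threshold, sensor read and ticketing call below are hypothetical placeholders, not our actual tooling or CityFibre’s):

```python
# Minimal sketch: poll a rack temperature sensor and raise an incident
# ticket when a pre-set safe level is exceeded. All names, thresholds and
# intervals are illustrative assumptions.
import time

SAFE_TEMP_C = 35.0        # assumed pre-set safe rack temperature
POLL_INTERVAL_S = 60      # check once a minute

def read_rack_temperature() -> float:
    """Stand-in for a real environmental sensor read (SNMP, IoT probe, etc.)."""
    return 24.0  # dummy value so the sketch runs

def raise_incident(summary: str) -> None:
    """Stand-in for the ticketing-system call (e.g. a ticket with a 30-minute SLA)."""
    print(f"INCIDENT: {summary}")

def monitor() -> None:
    while True:
        temp = read_rack_temperature()
        if temp > SAFE_TEMP_C:
            raise_incident(f"Rack temperature {temp:.1f}C exceeds safe level of {SAFE_TEMP_C}C")
        time.sleep(POLL_INTERVAL_S)
```

The same pattern applies to the AC unit’s own fault output and the FEX’s internal sensors; the point is simply that the alert fires before the kit fails.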
Sounds like a job for IoT sensors. If only they had a way to connect them to their wider monitoring system!
Still down in York for me the following day. My broadband provider says everything is operational, but it’s far from it. Nearly 48 hours without WiFi now; it’s an absolute joke. We never get any updates from CityFibre themselves, or an ETA as to when it could be working again.
Residential connections are just that.
You cannot say it’s an ‘absolute joke’ when you aren’t paying for an SLA.
If you want guaranteed uptime you need to invest in a backup connection and the hardware to support it.
Could this have happened to Openreach too?
What about the altnets like Grain, Brsk etc.?
Extremely poor quality engineering from CityFibre. How many site visits were required before the fault was correctly diagnosed and rectified? It should have been an FTF (First Time Fix).
Has this problem been resolved yet, please? I still have no internet and wondered if it was just me.
And if someone’s depending on VoIP over their data line to make an emergency call, to say save a life, how non-critical is this really, particularly from a patient’s perspective?
Dependable infrastructure?
But then my early background was in flight control systems, where a single failure was never an acceptable system outage, be that from power, sensors, control, or supporting systems, even e.g. hydraulics.
This just seems another example of shoddy national infrastructure across need, requirement, and V&V to ‘as built’, and of course the governance of it. Particularly given the ‘mandated’ [improvement] change on ‘old’/existing networks, it seems like there’s a clear lowering of dependability and resilience ‘standards’. Just more ‘modern’ management and engineering standards (d)evolution, I wonder.
@SicOf
Well said, I agree completely.
As you say, when people will be relying on VoIP for emergency calls then the internet connection suddenly becomes critical.
There must be legislation put in place to make the system resilient.
I agree it’s cheaper than ever to introduce resilience and we’re removing it. Crazy
Isn’t this similar to the outage in Glasgow from the other month that also took a while to fix due to the engineer traveling up from Plymouth?
Indeed it is. After there’d been reports of a fire in that exchange of theirs.
It’s crazy how they have so few engineers on the ground. But then again, they have made tons of people redundant recently.