All services are operating normally at this time.
Data Center | Current | Last Day | Last Month | Last Year |
---|---|---|---|---|
Dallas, TX | Up | 100% | 100% | 100% |
Seattle, WA | Up | 100% | 100% | 100% |
Piscataway, NJ | Up | 100% | 100% | 100% |
Los Angeles, CA | Up | 100% | 100% | 100% |
London, UK | Up | 100% | 100% | 100% |
Maidenhead, UK | Up | 100% | 100% | 100% |
The Netherlands | Up | 100% | 100% | 100% |
Seattle network issues
Rebooting dalvz7highram12
NJ network problem
seapure1 connectivity
dalvz7highram6
seavz7highram1 accessibility issues
seavz7highram1 connectivity issues
Network Issue in Dallas
Hello,
OFFICIAL RFO - 10/28/2019
Summary of Incident:
———————————————
Yesterday, Monday, October 28th 2019, at approximately 4:23pm, some customers in our TPA1, TPA2 and DAL1 data centers experienced a loss of network connectivity that lasted anywhere from a few minutes to a few hours, depending on server location. The cause of the issue has been identified and is as follows:
At roughly 4:23pm one of our Network Engineers applied a policy update to our DAL1 edge routers. This policy update was incomplete, which led to the full internet routing table being propagated throughout the aggregation layer of DAL1. This mistake was further exacerbated when that full routing table was automatically injected into the Hivelocity DDoS protection network, resulting in the full routing table being distributed to other Hivelocity facilities, i.e. TPA1 and TPA2. The full internet routing table injection exhausted the resources of multiple network devices, which ultimately led to the network disruption. Once our Network Engineers identified the cause of the issue, we began reloading each of the affected network devices to correct the problem. Ultimately, yesterday's network event was a result of human error.
Service Impact Times:
———————————————
October 28th, 4:23pm - 6:44pm EST
Remediation Plans:
———————————————
We have implemented new router policies that will prevent full route tables from being similarly propagated should human error ever occur again. Additionally, we have implemented new review protocols to minimize the likelihood of any human error occurring.
For years most of our customers have experienced 100% uptime due to our redundancies and nearly two decades of experience. We take our responsibility to you very seriously, and no one hates it more than we do when we fall short of our goals. We are deeply sorry for the inconvenience and any negative impact this disruption had on your operation.
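To illustrate the kind of safeguard described above, here is a minimal sketch of a prefix-limit check that refuses to pass a route set on when it is far larger than expected. It is not the actual router policy: the function name `check_export`, the `ExportCheck` type, and the `MAX_PREFIXES` threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: an internal session is assumed to carry far fewer
# routes than a full internet routing table.
MAX_PREFIXES = 5000


@dataclass
class ExportCheck:
    allowed: bool
    reason: str


def check_export(prefixes, max_prefixes=MAX_PREFIXES):
    """Refuse to export a route set that exceeds the configured prefix limit."""
    count = len(set(prefixes))  # count each prefix once
    if count > max_prefixes:
        return ExportCheck(False, f"{count} prefixes exceeds limit of {max_prefixes}")
    return ExportCheck(True, f"{count} prefixes within limit of {max_prefixes}")


if __name__ == "__main__":
    small = ["10.0.0.0/24", "10.0.1.0/24"]
    leaked = [f"10.{i // 256}.{i % 256}.0/24" for i in range(6000)]
    print(check_export(small))
    print(check_export(leaked))
```

Under these assumptions, an incomplete policy that lets a full table through would be rejected at the limit check instead of reaching downstream devices.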
seavz7highram1 connectivity errors
LA server connectivity issues
ukhighram2 is down
lastorage1 is offline
Tuesday, 01 October 2019, 15:21 - We're aware that this server is currently offline. We've reached out to the datacenter and are currently assessing the situation. Further updates will be tagged to this announcement as they become available. Thank you for your patience.
Wednesday, 02 October 2019, 23:12 - It appears that some of the hard disks in the RAID60 array were silently producing errors that the RAID card (HP P420) did not detect, and the filesystem crashed as a result. We have brought the node back online and are currently running fsck.
Friday, 04 October 2019, 18:10 - We're still working on this. We'll update you as soon as further details become available.
Sunday, 06 October 2019, 04:56 - fsck process is still ongoing.
Tuesday, 08 October 2019, 11:55 - fsck remains ongoing at this time, and is expected to take a while given the size of the disk array. We'll update this announcement as we have more information.
Thursday, 10 October 2019, 09:06 - fsck process continues.
Monday, 14 October 2019, 02:06 - The fsck process has finished. Some of the VPSes are online. Others are not booting up due to corrupt files. We are working on this.
Monday, 23 October 2019, 14:18 - We were only able to recover around half of the VPSes from this node. We have built a new node and we are syncing the data to it right now. A ticket will follow for your situation.
Dallas network connectivity issues
08/05/2019 06:57 - We are having connectivity issues in Dallas at the moment. Our engineers are working to resolve the issue.
08/05/2019 07:49 - Our engineers are still trying to resolve the issue. Your patience is appreciated.
08/05/2019 08:02 - We have found the root cause of the issue and 90% of the servers are back online now.
08/05/2019 09:53 - All the servers are back online. The issue was related to a bug in our networking gear. We have applied a fix for this, and in the following month we'll upgrade our core networking gear to have full redundancy to avoid issues like these. A full RFO will be posted here.
------------------------------
Reason for Outage:
All of a sudden, we noticed some of the IP addresses disappearing from the ARP table of our core router and started investigating the issue.
All the network configuration was correct, and it had been running with the same configuration for the past 4 years without issues.
We then identified that whenever one particular MAC ID came online, the ARP table filled up within a few seconds, which caused the instability.
We narrowed down which MAC ID was causing the issue and then replaced the device behind it.
We have notified the manufacturer of the routing gear about the issue and we are awaiting an explanation.
We are sorry for the inconvenience caused by this.
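As an illustration of how a single MAC ID filling the ARP table might be spotted, here is a minimal sketch that counts ARP entries per MAC address and flags any address above a threshold. It is not the procedure our engineers used: the function name `noisy_macs`, the `(ip, mac)` input format, and the threshold of 100 are assumptions, and it presumes the offending device shows up as one MAC address holding many entries.

```python
from collections import Counter


def noisy_macs(arp_entries, threshold=100):
    """Return the MAC addresses that own more than `threshold` ARP entries.

    `arp_entries` is an iterable of (ip, mac) pairs, for example parsed from
    a router's ARP table dump (the input format is an assumption).
    """
    # Normalize case so the same MAC written differently is counted once.
    counts = Counter(mac.lower() for _ip, mac in arp_entries)
    return {mac: n for mac, n in counts.items() if n > threshold}


if __name__ == "__main__":
    entries = [(f"10.0.0.{i}", "AA:BB:CC:00:00:01") for i in range(150)]
    entries.append(("10.0.1.1", "aa:bb:cc:00:00:02"))
    print(noisy_macs(entries))
```

Running a check like this periodically against the ARP table would, under these assumptions, point at the suspect MAC ID before the table is exhausted.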
Partial network connectivity issues in Dallas
Issue with two servers in UK
Storage server connectivity with IP range 63.142.248.xxx
Dalhighram16 - connectivity issues
Network Downtime in Dallas
We experienced a BGP flap on Thursday, 20 June 2019, 08:59. Our engineers investigated and found that our main router had rebooted itself for an unknown reason.
We monitored the router from then on and saw no irregularities until Friday, 21 June 2019, 15:19.
On Friday, 21 June 2019, 15:19, our router became inaccessible again.
We dispatched datacenter engineers to physically connect to the router, but unfortunately they weren't able to.
We then immediately began putting our spare core router online to replace the faulty one.
Finally, on Friday, 21 June 2019, 15:59, the spare router was in place and running, and our BGP session was reestablished.
To prevent this from happening again, we are moving forward with our core redundancy setup plans so that full network redundancy is in place ahead of the originally scheduled completion date.
We sincerely apologize for the inconvenience this situation has caused you and we appreciate your understanding very much.
Connectivity issues in Dallas Location
Varied Dallas network connectivity issues