9th July 2022

Atlanta Datacenter OUTAGE - ATL106KVM

We are aware that ATL106KVM is currently experiencing connectivity issues. We are investigating this node and will provide additional updates once we have more information.

UPDATE: We are still working with datacenter remote hands to identify the issue. Thank you for your patience during this time.

UPDATE: We are currently in the process of replacing the PSU and motherboard on this machine. We will update once we hear more from the hands on the ground at the ATL datacenter.

UPDATE: Remote hands in Atlanta are still working on this. We will update once they provide us with additional information.

UPDATE: We have been actively working on this node with remote hands in ATL, but have been unsuccessful so far. We had originally expected the PSU replacement earlier this morning to rectify the issue, but it did not: the server's BMC is up, but the server stays shut down. Because that was unsuccessful, we decided the next course of action would be to replace the motherboard, as the server was displaying symptoms of a faulty motherboard.

We had remote hands perform an emergency motherboard swap from inventory, but found that the heatsinks we use for this CPU are not compatible with the replacement motherboard that the datacenter had in inventory. At this time, we are preparing a shipment to overnight a new heatsink to ATL. In the meantime, we are working with on-site technicians to explore other options for bringing this server back online.

We sincerely apologize for this unexpected outage and greatly appreciate your patience regarding this node.

UPDATE: While we are still working on exploring other possibilities with our datacenter partner in ATL, we have a kit of preassembled RAM/CPUs/heatsink/motherboard that has been verified working and will be overnighted to arrive tomorrow morning. If we are unable to identify a solution before this, we will have remote hands then swap the entire kit into the server to eliminate any other potential faults.

UPDATE: Despite the replacement of major components (including a full motherboard replacement), this machine is still experiencing issues. This time the machine is at least powering on, but it is returning POST code errors. The symptoms include several POST issues, so our team is continuing its efforts to diagnose and restore service.

We fully understand the impact, and downtime is something that we all wish to avoid. Rest assured, our team is working closely with facility hands to troubleshoot and resolve this issue. We will share additional updates on this incident once they are available.

UPDATE: This incident has been resolved. The on-site technicians have corrected the POST code errors, and the machine is now online with the replacement kit. All VMs on this node are now responding again, and this host node is now considered to be fully operational. If you are still experiencing issues connecting to your VM on this node, please feel free to open a support ticket and we'd be happy to assist. We will be working closely with our datacenter partner in Atlanta on a preventative follow-up plan to keep such incidents from occurring again in the future.