I’ve recently run into a few hardware failures in my servers and network, so I figured I’d write up my debugging process and the resolution for each of them. The faults were a failing motherboard network interface, a dying NAS data disk, and a dead wall adapter for a switch.
The onboard network adapter in my NAS started failing. The symptom was that the NAS would occasionally stop responding on the network, and my firewall control panel showed that it no longer had an IP address. My first reaction was to reboot the NAS, and the NIC came back up and worked again afterward. But as we all know, that kind of fix doesn’t last, and it went down again. This time I reconfigured the NAS to use one of my PCIe NICs as the main NIC. Unraid exposes the network configuration in its web UI, which I used to get things going again. The main catch was bridging: it has to be enabled on the new primary NIC and turned off on the dead one. I forgot to turn it off on the dead NIC, and that prevented my Docker containers from reaching the internet even though I could still reach their web UIs myself.
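A quick way to see which NICs the kernel considers up is to read the link state from sysfs. The following is only a minimal sketch, assuming a Linux host that exposes `/sys/class/net` and has Python 3 available (on Unraid it may need to be installed separately). The function name `nic_states` is my own and not part of any Unraid tooling.

```python
from pathlib import Path


def nic_states(sysfs_root="/sys/class/net"):
    """Return {interface_name: operstate} for each entry under sysfs_root."""
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted at this path.
        return {}
    states = {}
    for iface in sorted(root.iterdir()):
        try:
            # operstate holds values such as "up", "down", or "unknown".
            states[iface.name] = (iface / "operstate").read_text().strip()
        except OSError:
            # Entry is not an interface directory, or it is unreadable.
            states[iface.name] = "unknown"
    return states


if __name__ == "__main__":
    for name, state in nic_states().items():
        print(f"{name}: {state}")
```

An interface that reports `down` while its cable is plugged into a live switch port is a reasonable hint that the NIC itself, rather than the configuration, is the problem.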
I also lost one of the older HDDs in my NAS recently. I logged in one day and saw that one drive had been disabled due to read errors. On top of this, my parity drive was also logging miles of errors. This didn’t look good at first and implied losing about 4TB of data. Looking at the SMART reports for both drives, I saw nothing wrong with the parity disk, but the data disk showed a few hundred reallocated sectors. So now I knew the data disk was bad, but the parity problem might be something simple. Since I had moved recently, I figured a cable could be loose. I powered down the system, unplugged and re-plugged all the HDD cables (to cover the case where others were loose too), and powered back up. The parity drive was fully functional again, so there was no 4TB of data loss. My next step was to use the unBALANCE plugin to move all the data off the parity-emulated drive and then remove that drive from my array. This all went smoothly, and my array is back to full operational status.
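The reallocated sector count I looked at is one row in the SMART attribute table. As a rough sketch of pulling it out programmatically, the function below parses text in the layout printed by `smartctl -A`. The helper name `reallocated_sectors` and the sample text are made up for illustration; the numbers are not taken from my drives.

```python
def reallocated_sectors(smartctl_output):
    """Return the raw Reallocated_Sector_Ct value from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have the columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            try:
                return int(fields[9])
            except ValueError:
                # Raw value was not a plain integer; treat it as not found.
                return None
    return None


# Invented sample in the smartctl attribute-table layout.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   098   098   010    Pre-fail  Always       -       312
  9 Power_On_Hours          0x0032   045   045   000    Old_age   Always       -       48211
"""

print(reallocated_sectors(sample))  # prints 312
```

On a real system the input text would come from running `smartctl -A` against the drive in question. A healthy drive typically reports zero here, so a count in the hundreds on the data disk and none on the parity disk points at the data disk.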
Network Switch Power Supply
My entire network layout changed recently, and my office gained a small Netgear switch as its main switch. This worked well for a few days before everything in the office went offline. This was strange, since my WiFi still had internet. Then I noticed the switch… no lights… nothing. I took out a voltmeter, checked the output of its power supply, and got nothing. Well, looks like I need a new wall wart for the old switch. Lucky for me, I had a spare UniFi switch I could install to avoid any more unnecessary downtime for the office while I looked for a new wart.