I’ve recently run into a few hardware failures in my servers and network so I figured I’d write up my debugging process and resolutions for all of them. The faults include a switch wall adapter, motherboard network interface, and a NAS data disk.Continue reading “Recent Hardware Failures”
I have been fighting failing parity checks for a few months now on my unraid server. I looked into each disk, checked smart stats and even thought I had found the culprit hard drive that was causing the issues. I still had it in my array but with no data on it just in case. This all happened just before another set of problems arose. The VMs on my server started acting up, crashing, and eventually when logging into one VM, everything crashed due to memory problems. I ran memtest and discovered that one of my RAM sticks was at issue, and from there determined that it simply wasn’t seated properly. After reseating the RAM, everything started working properly again. Parity checks come back clean, no more kernel panics, and the VMs are running stably. One partially unseated RAM stick caused all those issues.
I was originally excited when docker was going to be included in the next release of unraid, the concept behind it was solid and sounded like it would make management of my server easier. This was the case for months before docker started acting up. Now I’ve been working on a way to remove any need of docker on my NAS, moving it to a VM or another server due to its instabilities. Issues I’ve run into include it not being able to stop running containers, start stopped containers, create new containers, and preventing Linux from shutting down. I could live with all of the above except the shutdown bug. It doesn’t just prevent shutdown from running, but it prevents the kernel from shutting down at all, and well after the user shells are all offline, so there’s no way to manually kill docker to allow the system to shut down safely. This is exceptionally frustrating and has caused unclean shutdowns when I’ve lost power and even when I’m just doing maintenance, since the only way to restart when docker does this is to do a hard reset. I’m not giving up hope on containers, just going to be a bit more careful around docker, they seem to advertise quite well compared to issues people have had with their software.