Raspberry Pi Docker Cluster

I’ve always wanted to experiment with clustering technologies. I tried setting up a Kubernetes cluster, but that ended in failure. For this next experiment, I went with something simpler to deal with: Docker Swarm. Since Docker and Swarm are supported on Raspberry Pis, and since I had a number of Raspberry Pis not in use, I decided to use them for the cluster.

I had printed a 2U rack-mount kit for Raspberry Pis, and this felt like the perfect time to make use of it. I racked up two Raspberry Pi 3B+ units with POE HATs (more on that later) to use for the Docker cluster, and added Samsung 32GB micro SD cards for storage.

POE Power

With the Pis all in a rack, I wanted to minimize the cables going into the rack and make it easier to put the Pis in the rack unit and pull them out. The side power connector on the Pis is sub-optimal for this. I picked up two Raspberry Pi 3B+ boards and POE HATs to cover this use case. They BARELY fit in the rack-mount unit, but they did fit.

NFS Storage

Since Raspberry Pis aren’t well known for having a strong storage system, I searched for a better way to take care of storage for the cluster. I figured that serving NFS shares from my unRAID NAS would be a great way to handle centralized storage. This way all nodes could access the data for all containers as the containers move between nodes. This needed a Docker Compose file of version 3.2 or higher, but it worked well for getting the containers running.

---
version: "3.2"

volumes:
  ircd_nfs:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.123,nolock,soft,rw
      device: ":/mnt/user/Settings/Inspircd"
  anope_nfs:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.123,nolock,soft,rw
      device: ":/mnt/user/Settings/Anope"
  anope_nfs_data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.123,nolock,soft,rw
      device: ":/mnt/user/Settings/Anope/data"
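
As a rough sketch of how a service could consume one of these volumes, a fragment like the following would sit in the same Compose file next to the volumes section. The service name `ircd`, the image `example/inspircd`, and the container path are placeholders chosen for illustration, not the values from the real stack. It uses the long volume syntax, which is available from file format 3.2.

```yaml
services:
  ircd:
    # Placeholder image name; substitute an ARM-compatible image.
    image: example/inspircd:latest
    volumes:
      # Long volume syntax, available from Compose file format 3.2.
      - type: volume
        source: ircd_nfs          # named NFS volume defined under "volumes:"
        target: /inspircd/conf    # assumed path inside the container
    deploy:
      replicas: 1
```

Assuming the file is saved as docker-compose.yml, it could be deployed from a manager node with `docker stack deploy -c docker-compose.yml <stack-name>`. Because the volume uses the local driver with NFS options, each node should mount the share itself when a task using the volume is scheduled there.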

Problems

While the software setup was simple enough, it was not without its downfalls. The POE power that was supposed to simplify things had issues, the classic micro SD card issues reared their heads, and on top of those I ran into NFS issues and Docker issues.

POE Power

Multiple times, one of my POE-powered nodes dropped out of the cluster after a low-power situation on the POE supply and required a manual reboot. This became a problem: I learned not to rely on the cluster because it could crash without warning, and being unable to SSH in and reboot the nodes made it that much harder to restore service.

Micro SD Card Issues

Alas, the classic problem with Pis. I ended up having to reinstall one of the Pis a few times due to micro SD card issues. Once I had the best micro SD card in there, things began to work better. The fact remains, however, that micro SD cards are not the best form of primary storage, especially for a long-running machine that should need minimal maintenance. Thanks to my NFS setup, I didn’t have to worry about application data loss, but the NFS setup caused its own problems.

Docker Issues

I ran into issues on the manager node where the tasks.db file would get too large and lock up the entire cluster. There is a GitHub issue related to this and a simple enough fix, however

NFS Issues

Rebooting the NAS that provided the NFS share caused problems across the cluster, and every node had to be rebooted to recover. This proved that the storage solution I thought I had was not actually a solution; it just moved the problem elsewhere. While I knew the NFS server would be a single point of failure, I didn’t realize that none of the nodes would survive the NFS server being rebooted.

Successes

The Raspberry Pi Docker cluster was a success at proving the viability of the Docker Swarm stack and the simple OS setup that I had used on the Pis. The systems were easy enough to set up and add to the cluster. It also showed that I needed a more fault-tolerant system for storage. The NFS server requires restarts from time to time, and I didn’t want to restart the entire cluster each time that occurs. The Raspberry Pi also doesn’t have storage good enough to handle something like Gluster.

Conclusion

In the end, I used what I learned from the Raspberry Pi cluster to build an x86 virtualized Docker cluster. This used Docker Swarm in the same way as the Raspberry Pi cluster, but it used Gluster for the storage, sharing storage between all 3 nodes and synchronizing it. By using x86, I avoid building my own ARM-based containers. By using Gluster for storage, I avoid the NFS connection issues. And with the virtual machines, I avoid the problems I encountered with the Raspberry Pi hardware.