As a project, I decided I wanted to learn more about cluster computing. The cheapest way to do this, of course, is with a cluster of Raspberry Pis. There are several truly impressive Raspberry Pi clusters across the internet, but the cheapest and easiest option I could find was the ClusterHat, a shield that lets you attach four Raspberry Pi Zeros as nodes to a Raspberry Pi 3 or 4. It is extremely easy to install and set up, but the big question after following the website's setup instructions is: what comes next?

I sat on the ClusterHat for a few months before finally getting around to setting up the cluster. To my disappointment, there didn't seem to be much in the way of tutorials on what to do with my new Raspberry Pi cluster. I am extremely grateful for @glmdev's tutorial on how to set up a cluster. Their tutorial is extremely well put together, and you can follow the steps there to set up your own. In fact, you should stop right now, click the link, and give them a few claps for me, since this would not have been possible without their help. The majority of this tutorial is based on theirs; only a few things have been simplified to be more specific to a ClusterHat setup. If you have found this tutorial and are not using a ClusterHat, stop reading right now and refer to the tutorial linked above.

What You Will Need

To set up this cluster you will need the following items:

  • 1 x ClusterHat (the purchase link has it available a la carte or as a kit)
  • 1 x Raspberry Pi 3 or 4 — I am working with a Raspberry Pi 3, which works just fine.
  • 4 x Raspberry Pi Zero or Zero W — Wireless networking is not necessary, but if you already have a W lying around, you can pop it on
  • 5 x MicroSD cards
  • 1 x Thumb drive or external Hard Drive — Optional, but recommended
  • PSU - The main controller Raspberry Pi will power the others, so you need to make sure your power supply is strong enough to power the controller and all four Zeros. The minimum power requirement for the entire system is 1-1.2A (depending on whether you are using Zeros or Zero Ws), and I have had no problems running the setup with a 2.5A PSU

Getting Started

Put the Cluster Hardware Together

The hardware setup for the ClusterHat is rather straightforward. It comes with standoffs and screws to mount it on top of your Raspberry Pi 3 or 4. Place the ClusterHat onto the GPIO pins of the controller Pi and secure it in place with the standoffs and screws. When you place the Pi Zeros onto the ClusterHat, be sure that you use the correct USB port on each Pi Zero: you want the one labeled USB rather than the power USB. The ClusterHat uses this USB connection to provide the network to the Pi Zeros.

Install the ClusterHat Raspbian Images

You can find the Raspbian images for the ClusterHat here. There are several options to choose from for the controller Pi. If you are not familiar with running Linux headless, I would recommend using one of the STD or Full images for the controller. I opted to run the Lite image across the entire cluster. You can grab a zip file of all the Lite images, which will speed up the download process; this bundle comes with the CBRIDGE controller image.

There are two versions of the controller image — CBRIDGE and CNAT. I suggest using the CNAT image to follow along with this tutorial. The only difference between the two is how the network is handled. Choose CBRIDGE if you wish to have each of the Pi Zero nodes available directly on your home network. The CNAT image creates a subnet for the Pi Zeros behind the controller Pi, which makes setup much easier. It is also more secure, as the nodes will only be visible to the controller.

Burn the Raspbian images onto the SD cards using your tool of choice. I recommend using Etcher from Balena for this task. After an image is burned to its SD card, open up the boot partition and create an empty file named ssh there. This will allow you to SSH into the Raspberry Pis. If you don't want SSH capability on the controller Raspberry Pi, you can skip this step for that image, but enabling SSH on each Raspberry Pi Zero will make configuration and setup much easier later on. Once you have burned the images and enabled SSH, place the SD cards into their respective Raspberry Pis.
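
For example, if you are burning the cards from a Linux machine and the boot partition mounts under /media, creating the file is a one-liner. The mount point below is an assumption; adjust it to wherever the boot partition shows up on your system:

touch /media/$USER/boot/ssh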

First Boot

Before you do anything else, you will need to change the default password on the controller image. If you used the CBRIDGE image in the previous step, you will also need to do this on each of the Pi Zeros. Run the following command:

sudo raspi-config

and select the first option, Change User Password, to set a new password. While you are in the config, you should also set your localization options. By default, the ClusterHat images are set up for a UK locale and keyboard layout. Go to Localisation Options and set the appropriate options for your keyboard. It is especially important to set the timezone on all of your Raspberry Pis, as they will need to have the same system time later on.
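
If you would rather set the timezone from the command line than through the menus, timedatectl (present on Raspbian Buster) can do it directly; substitute your own zone for the example below:

sudo timedatectl set-timezone America/Chicago
timedatectl    # confirm the change took effect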

After changing these settings on the controller, you may need to reboot the Pi; go ahead and do so. After the Pi boots back up, set up an SSH key with the following commands:

ssh-keygen -t rsa -b 4096
cat ~/.ssh/id_rsa.pub

After you have generated an SSH key and displayed it, copy it; you will need it in the following steps.

Set up an SSH config file named ~/.ssh/config with the following contents (if you are using the CBRIDGE image, skip this step):

Host p1
    Hostname 172.19.181.1
    User pi

Host p2
    Hostname 172.19.181.2
    User pi

Host p3
    Hostname 172.19.181.3
    User pi

Host p4
    Hostname 172.19.181.4
    User pi

This previous step is a little superfluous, but you will now be able to quickly SSH into any of the nodes from the controller Pi with the command ssh hostname, where hostname is one of p1, p2, p3, or p4.

You can now boot up the nodes. After you run the following command, you should see each node's light turn orange one by one as each Pi Zero comes online:

sudo clusterhat on

Whenever you need to shut down the nodes, you can use the following command to turn them off. It is very helpful when you need to reboot them all at the same time:

sudo clusterhat off

NOTE: this command performs a hard shutdown, which cuts power to the nodes. If you run it while they are writing to their SD cards, you run the risk of corrupting an SD card, which is catastrophic.
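
Once key-based SSH to the nodes is working (set up below), a gentler pattern is to halt the nodes over SSH first and only then cut the power. This is a rough sketch; the sleep is just a guess at how long a Zero takes to halt:

for n in p1 p2 p3 p4; do ssh "$n" sudo poweroff; done
sleep 30
sudo clusterhat off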

You should now be able to SSH into each of the nodes to set them up. Repeat the password and localization setup that you performed on the controller for each node. You should also append the SSH key that you copied earlier to the ~/.ssh/authorized_keys file:

ssh p1
# the default password for the nodes is clusterctl

mkdir -p ~/.ssh
echo "[paste the ssh key here]" >> ~/.ssh/authorized_keys
sudo raspi-config
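
If you would rather not paste the key by hand, ssh-copy-id (shipped with OpenSSH on Raspbian) does the same append for you; run it from the controller and enter the node's password when prompted:

ssh-copy-id p1    # repeat for p2, p3, and p4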

NOTE: If you are using the CBRIDGE image rather than the CNAT image, you will want to set up a static IP address for each of the nodes and the controller. The steps to do this are outside the scope of this tutorial, but you can find them here.

After setting up localization and SSH on each of the nodes, install the ntpdate package. This will keep the system time synced across the nodes in the background:

sudo apt-get install -y ntpdate
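
If the clocks have drifted badly before the background sync kicks in, you can also force a one-off sync by hand; pool.ntp.org here is just an example server:

sudo ntpdate -u pool.ntp.org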

Hostnames

One of the benefits of using the ClusterHat and its images is that the hostnames and networking are already set up for you. If you wish to use a different hostname scheme, this would be the time to do it. For example, I changed the hostname on my controller Pi from controller to medusa. There is no benefit or reason for this other than personal aesthetics. If you decide to change the hostnames, use the following commands:

sudo hostname [new-hostname]    # where [new-hostname] is your preferred hostname
sudo vi /etc/hostname           # change the hostname in this file
sudo vi /etc/hosts              # change 'controller' to [new-hostname]
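
On Raspbian Buster you can also make the change in one step with hostnamectl, though you will still want to update /etc/hosts yourself:

sudo hostnamectl set-hostname medusa    # substitute your own [new-hostname]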

I highly recommend keeping the node names as they are, but if you do choose different names, be sure to keep the numbering scheme in place. For example, if you would prefer them all to be called node, make sure you name them node01, node02, node03, and node04.

A Note on Users

If you want to change the default username, you can do that now. Be warned that you will need the same UID on all of the nodes and the controller Pi, and things will be smoother if every machine uses the same username. I recommend sticking with the default pi user for this tutorial.
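
A quick way to confirm the UIDs line up is to ask each node from the controller (this assumes the SSH config from earlier); all five machines should print the same number, 1000 by default:

id -u pi
for n in p1 p2 p3 p4; do ssh "$n" id -u pi; done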

Set Up a Shared Drive

It is not strictly necessary to have a shared drive to run commands on the cluster, but it is extremely useful to have a partition that the entire cluster can use to hold data. You can set up and use a partition on the controller's SD card if that is simpler for you. For my cluster, I used a spare 128 GB thumb drive I had laying about as the extra drive. Originally, I was planning on using an older 320 GB external hard drive, but after plugging it in, the controller Pi started giving me undervoltage warnings.

We will be setting up an NFS share on the controller Pi, so log into it and follow these steps.

To find out where the device is loaded in /dev, run the lsblk command. You should see something like this:

➜ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    1 119.5G  0 disk
└─sda1        8:1    1 119.5G  0 part
mmcblk0     179:0    0  29.7G  0 disk
├─mmcblk0p1 179:1    0   256M  0 part /boot
└─mmcblk0p2 179:2    0  29.5G  0 part /

For me, and most likely for you, the device is located at /dev/sda1. Next, format the drive with an ext4 filesystem:

sudo mkfs.ext4 /dev/sda1

NOTE: Be very careful here that you type everything correctly, because you don’t want to end up formatting the wrong drive!

After formatting the drive, set up a directory to mount it to. I prefer using the /media folder to set up my external drives, but you can place the folder anywhere on the filesystem. Just be sure that it will be the same folder across all of your nodes!

sudo mkdir /media/Storage
sudo chown -R nobody:nogroup /media/Storage
sudo chmod -R 777 /media/Storage

NOTE: We have set up the loosest of permissions for this drive. Anyone who has access to the Pi will be able to read, modify, or remove the contents of this drive. We are doing this because the cluster we are building will most likely not hold sensitive data and is mostly for educational purposes. If you are using this machine in a production environment or to process sensitive data, you will want to use stricter permissions in the previous step.

After setting this up, run the blkid command to get the UUID of the drive so we can set up automatic mounting whenever the Pi boots. You will be looking for a line like this:

/dev/sda1: LABEL="sammy" UUID="a13c2fad-7d3d-44ca-b704-ebdc0369260e" TYPE="ext4" PARTLABEL="primary" PARTUUID="d310f698-d7ae-4141-bcdb-5b909b4eb147"

The most important part is the UUID, UUID="a13c2fad-7d3d-44ca-b704-ebdc0369260e". Edit your fstab to contain the following line, making sure you substitute your drive's UUID in the appropriate location:

sudo vi /etc/fstab

# Add the following line to the bottom of the fstab file:
UUID=a13c2fad-7d3d-44ca-b704-ebdc0369260e /media/Storage ext4 defaults 0 2
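
You can mount the drive immediately and confirm the fstab entry works without rebooting:

sudo mount -a
df -h /media/Storage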

Ensure that the NFS server is installed on your controller:

sudo apt-get install -y nfs-kernel-server

You will then need to update your /etc/exports file to contain the following line at the bottom:

/media/Storage 172.19.181.0/24(rw,sync,no_root_squash,no_subtree_check)

NOTE: If you are using the CBRIDGE image, you will need to use the IP address schema on your network. For example, if your network is using 192.168.1.X, you will need to change the 172.19.181.0 to 192.168.1.0 in the above example.

After editing the exports file, run the command sudo exportfs -a to update the NFS server.
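
To double-check what the server is actually exporting, you can list the active exports:

sudo exportfs -v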

Mount the Drive on the Nodes

You will now need to mount the NFS share that we just set up on each of the node Pis. Run the following command set on each node:

sudo apt-get install -y nfs-common

# Create the mount folder, using the same mount point as above.
# If you used different permissions, use the same permissions here.
sudo mkdir /media/Storage
sudo chown -R nobody:nogroup /media/Storage
sudo chmod -R 777 /media/Storage

# Set up automatic mounting by editing your /etc/fstab:
sudo vi /etc/fstab

# Add this line to the bottom:
172.19.181.254:/media/Storage  /media/Storage  nfs  defaults  0 0

You can now run sudo mount -a and the NFS share will mount on the node. If you receive any errors, double-check the /etc/fstab files on the nodes and the /etc/exports file on the controller. Create a test file inside the NFS mount directory to ensure that you can see it across all of the nodes:

echo "This is a test" >> /media/Storage/test

You should be able to see this file on all of the nodes and edit it as well.
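
For example, you can spot-check a node directly from the controller:

ssh p2 cat /media/Storage/test    # should print "This is a test"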

Set Up Munge

NOTE: This is the part I had the most problems with when setting up my nodes. Munge is an authentication service for creating and validating credentials in cluster environments. The Raspbian Buster munge package does not seem to install correctly, and the solution proposed here is a hack to get it working. If you are using this tutorial in a production environment, I cannot make any guarantees about the security of this solution, and I suggest you take some extra time to understand how munge works.

Edit the Hosts File

To set up munge, we will first need to edit the /etc/hosts file to contain the addresses and hostnames of the other machines in the cluster. This makes name resolution much easier and takes the guesswork out for the Pis. You will need to edit the /etc/hosts file on each of the nodes and the controller, adding the IP address and hostname of every machine except the one you are currently on.

For example, add these lines to your controller's /etc/hosts file:

172.19.181.1	p1
172.19.181.2	p2
172.19.181.3	p3
172.19.181.4	p4
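
If you would rather append these lines from the shell than open an editor on every machine, tee works well; on the controller, for instance:

sudo tee -a /etc/hosts <<'EOF'
172.19.181.1	p1
172.19.181.2	p2
172.19.181.3	p3
172.19.181.4	p4
EOF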

On the first node (p1), add these lines:

172.19.181.254	controller
172.19.181.2	p2
172.19.181.3	p3
172.19.181.4	p4

Repeat this process for the other three nodes. After editing the hosts files, you will want to install and configure munge on each of the Pis. We will start with the controller:

Install and Configure Munge

sudo apt-get install -y munge

Before we start munge, we will need to edit the munge service file:

systemctl edit --system --full munge

The following step forces munge to ignore a few of the security warnings present in the Raspbian package. It seems that munge is very picky about the permissions of the root directory on the Raspbian image. We will still be securing the actual files munge uses with the permissions recommended in the munge documentation, but I must say again that I cannot guarantee the overall security of this method! Find the following line and add --force to it:

ExecStart=/usr/sbin/munged --force

Your file should look like this when you are done:

[Unit]
Description=MUNGE authentication service
Documentation=man:munged(8)
After=network.target
After=time-sync.target

[Service]
Type=forking
ExecStart=/usr/sbin/munged --force
PIDFile=/run/munge/munged.pid
User=munge
Group=munge
Restart=on-abort

[Install]
WantedBy=multi-user.target

We will now need to change some permissions and set up a /run directory for munge to use. We will create a directory for the munge pidfile and give ownership of it to the munge user:

sudo mkdir /run/munge
sudo chown -R munge:munge /run/munge
sudo chmod -R 0755 /run/munge
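
Note that /run is a tmpfs and is recreated empty at every boot, so this directory will not survive a reboot on its own. One way to have it recreated automatically, assuming stock systemd-tmpfiles behaviour on Raspbian Buster, is a tmpfiles.d entry:

echo 'd /run/munge 0755 munge munge -' | sudo tee /etc/tmpfiles.d/munge.conf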

After this, run the following commands to enable munge at boot and start it right now:

sudo systemctl enable munge
sudo systemctl start munge

The second command should complete silently, but you should check the status to make sure that munge started properly without any problems:

sudo systemctl status munge

After getting this set up on the controller Pi, copy the munge key file to the NFS share that we set up earlier. You will need this same key on each of the nodes:

sudo cp /etc/munge/munge.key /media/Storage

Munge on the Nodes

Repeat the steps above on each of the nodes in the cluster, except instead of copying the key to the NFS share, copy the controller's key from the share into /etc/munge, overwriting the key installed by the munge package:

sudo cp /media/Storage/munge.key /etc/munge

Ensure that the munge user and group are the only ones with read permissions on the munge key on each of the nodes by running sudo ls -la /etc/munge. The output should look like this:

$ sudo ls -la /etc/munge

total 12
drwx------  2 munge munge 4096 Aug  3 20:52 .
drwxr-xr-x 93 dhuck dhuck 4096 Aug  3 21:37 ..
-r--------  1 munge munge 1024 Aug  3 20:52 munge.key
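
If the ownership or modes do not match, you can set them to what munge expects:

sudo chown -R munge:munge /etc/munge
sudo chmod 0700 /etc/munge
sudo chmod 0400 /etc/munge/munge.key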

In order to test that munge is set up correctly, run the following command:

ssh p1 munge -n | unmunge

As long as you are not currently on p1, you should see something akin to this output:

STATUS:           Success (0)
ENCODE_HOST:      medusa (127.0.1.1)
ENCODE_TIME:      2019-08-05 20:19:54 +0100 (1565032794)
DECODE_TIME:      2019-08-05 20:19:54 +0100 (1565032794)
TTL:              300
CIPHER:           aes128 (4)
MAC:              sha256 (5)
ZIP:              none (0)
UID:              dhuck (1000)
GID:              dhuck (1000)
LENGTH:           0

Set up Slurm

Now we are ready to install slurm and get our cluster clustering!

Install Slurm on Controller Pi

On the controller, run the following command to install Slurm:

sudo apt-get install -y slurm-wlm

This will take a moment. After it finishes, we will take the default Slurm configuration and modify it to meet our needs. Copy the example config file over from the Slurm documentation:

cd /etc/slurm-llnl
sudo cp /usr/share/doc/slurm-client/examples/slurm.conf.simple.gz .
sudo gzip -d slurm.conf.simple.gz
sudo mv slurm.conf.simple slurm.conf

Open the /etc/slurm-llnl/slurm.conf file (you will need sudo to edit it) and make the following edits:

Set the control machine information:

SlurmctldHost=controller(172.19.181.254)
# note: if you used the CBRIDGE image or changed the hostname, this will
# look different for you.

Ensure that the SelectType and SelectTypeParameters parameters are set to the following values:

SelectType=select/cons_res
SelectTypeParameters=CR_Core

If you wish to change or set the name of your cluster, you can set it with the ClusterName parameter. I set mine to merely be cluster:

ClusterName=cluster

At the end of the file, there should be an entry for a compute node. Delete it and put this in its place:

NodeName=controller NodeAddr=172.19.181.254 CPUs=4 State=UNKNOWN
NodeName=p1 NodeAddr=172.19.181.1 CPUs=1 State=UNKNOWN
NodeName=p2 NodeAddr=172.19.181.2 CPUs=1 State=UNKNOWN
NodeName=p3 NodeAddr=172.19.181.3 CPUs=1 State=UNKNOWN
NodeName=p4 NodeAddr=172.19.181.4 CPUs=1 State=UNKNOWN

# note: if you are using the CBRIDGE image for the controller, the IP
# addresses will be different for you. Same goes if you change any of
# the hostnames!

You will also need to remove the default PartitionName entry at the end of the file and replace it with our own custom partition:

PartitionName=mycluster Nodes=p[1-4] Default=YES MaxTime=INFINITE State=UP

We will now need to tell Slurm which resources it can access on the nodes. Create the file /etc/slurm-llnl/cgroup.conf with the following lines:

CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm-llnl/cgroup"
AllowedDevicesFile="/etc/slurm-llnl/cgroup_allowed_devices_file.conf"
ConstrainCores=no
TaskAffinity=no
ConstrainRAMSpace=yes
ConstrainSwapSpace=no
ConstrainDevices=no
AllowedRamSpace=100
AllowedSwapSpace=0
MaxRAMPercent=100
MaxSwapPercent=100
MinRAMSpace=30

Next, we will need to whitelist system devices in the file /etc/slurm-llnl/cgroup_allowed_devices_file.conf:

/dev/null
/dev/urandom
/dev/zero
/dev/sda*
/dev/cpu/*/*
/dev/pts/*
/media/Storage*

# note that the final line is the path of your NFS share and should
# be edited to reflect that.

These are values you can play around with to explore how cluster computing will assign resources. This is a very loose and permissive configuration that can be edited to suit your needs.

Copy these configuration files to the NFS drive that we set up earlier:

sudo cp slurm.conf cgroup.conf cgroup_allowed_devices_file.conf /media/Storage

Finally, enable and start the slurm daemon on the controller Pi:

sudo systemctl enable slurmd
sudo systemctl start slurmd

# and the control daemon!
sudo systemctl enable slurmctld
sudo systemctl start slurmctld
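
As with munge, it is worth confirming that both daemons came up cleanly before moving on:

sudo systemctl status slurmd
sudo systemctl status slurmctld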

Set up Slurm on the Nodes

Install the slurm client on the nodes:

sudo apt-get install -y slurmd slurm-client

Copy the configuration files that we made for Slurm over to each of the nodes:

sudo cp /media/Storage/slurm.conf /etc/slurm-llnl/slurm.conf
sudo cp /media/Storage/cgroup* /etc/slurm-llnl

Finally, enable and start the slurm daemon on each node:

sudo systemctl enable slurmd
sudo systemctl start slurmd

And that should be it to get things set up! Log in to the controller and test Slurm to make sure it works. Run the sinfo command and you should get the following output:

PARTITION  AVAIL  TIMELIMIT  NODES  STATE NODELIST
mycluster*    up   infinite      4   idle p[1-4]

Furthermore, you can run the uptime command on all of the nodes at once:

srun --nodes=4 uptime

Which should produce a similar result to the following:

14:46:52 up 2 days, 0 min,  0 users,  load average: 0.05, 0.03, 0.00
14:46:52 up 2 days, 0 min,  0 users,  load average: 0.00, 0.00, 0.00
20:46:52 up 2 days, 0 min,  0 users,  load average: 0.17, 0.14, 0.10
14:46:52 up 2 days, 0 min,  0 users,  load average: 0.00, 0.02, 0.00
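
As a small taste of what the scheduler is really for, you can also wrap work in a batch script and let Slurm queue it. This is a minimal sketch; the filename and script contents are just an example, and submitting from the shared drive keeps the output file visible everywhere:

cd /media/Storage
cat > hello.sbatch <<'EOF'
#!/bin/bash
#SBATCH --nodes=4
srun hostname
EOF
sbatch hello.sbatch
cat slurm-*.out    # once the job finishes, each node's hostname should appear once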

Further Reading

You should now have a functioning compute cluster with your ClusterHat and Raspberry Pis. You can use the srun command to run a command on however many nodes you wish, though this is admittedly tedious and not terribly useful outside of system maintenance. For further reading, I will let @glmdev take over for me and link to Part II and Part III of their Raspberry Pi cluster tutorial.