Setting up an OpenMPI cluster within a LAN

We purchased several Dell PowerEdge R640 servers a few months ago as computation nodes. We want to build a small computational cluster with them so that users’ programmes can run in parallel. Users prefer to work on Linux and most of the programmes will be written in C / C++ / Fortran, so we are going to build a Beowulf cluster with a head node (or server node) and several computation nodes. The purpose of the head node is to ease future management, since all the libraries and user data will be served from it, and to handle network connections with the external environment, such as user logins. The client nodes are then responsible only for computation jobs.

All the nodes in the environment run Ubuntu 18.04 LTS, and the necessary software includes OpenMPI, SSH and NFS. We also assume that a user account with the same name exists on every node. Below are the steps for setting up the cluster.

Software packages and libraries

  • Install the necessary compilers, e.g. gcc, g++, gfortran, etc.
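
    For example, on Ubuntu the build-essential metapackage pulls in gcc and g++ (adjust the package list to your needs):

    $ sudo apt install build-essential gfortran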

SSH

On both the server and the client nodes, install your favourite SSH server if it is not installed yet. OpenSSH is shown here as an example:

$ sudo apt install openssh-server

To let the server and client nodes communicate through SSH without typing a password, we set up SSH public key authentication on all the nodes.

  • Create the ~/.ssh directory and chmod it to 700 (see the example after this list)
  • Generate RSA key for all machines
    $ ssh-keygen -t rsa
    

    Don’t set a passphrase for the key, or you’ll be prompted to type it every time you ssh, unless you also set up key management with a keyring.

  • Copy the public key from the server to every client machine, and from every client machine to the server
    $ ssh-copy-id target_ip_address
    
  • Add or uncomment the below line in /etc/ssh/sshd_config (the old RSAAuthentication option only applied to SSH protocol 1 and is deprecated in the OpenSSH shipped with Ubuntu 18.04, so it can be omitted):
    PubkeyAuthentication yes
    
  • Restart the ssh service
    $ sudo service ssh restart
    
  • Log in to all the machines through ssh once to verify that the setup is correct.
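
For the first step above, a minimal sketch of creating the key directory with the right permissions (run as the cluster user on every node) is:

$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh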

NFS

Server

  • Install NFS packages
    $ sudo apt-get install nfs-kernel-server nfs-common
    
  • Create an NFS shared/export directory. E.g.
    $ sudo mkdir -p /srv/nfs_share
    
  • Change the ownership of the shared directory to nobody:nogroup and open up its permissions
    $ sudo chown nobody:nogroup /srv/nfs_share
    $ sudo chmod 777 /srv/nfs_share
    
  • Add an entry in /etc/exports for each client:
    /srv/nfs_share <Client1IP>(rw,sync,no_subtree_check)
    /srv/nfs_share <Client2IP>(rw,sync,no_subtree_check)
    ...
    

    or a single entry covering all clients in the subnet:

    /srv/nfs_share <subnetIP/24>(rw,sync,no_subtree_check)

  • Export and update the settings.
    $ sudo exportfs -varf
    
  • (Optional) Edit /etc/default/nfs-common to stop statd from sending reboot notifications
    STATDOPTS="--no-notify"
    
  • Restart nfs server
    $ sudo systemctl restart nfs-kernel-server.service
    
  • Set a firewall rule to allow clients to access the NFS port. For example:
    $ sudo ufw allow from <subnetIP/24> to any port nfs
    $ sudo ufw enable
    $ sudo ufw status
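
  • (Optional) Verify that the directory is exported as intended. One way, using the showmount tool that comes with nfs-common, is:
    $ showmount -e localhost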
    

Client

  • Install nfs-common package
    $ sudo apt-get install nfs-common
    
  • Create a mount point for mounting NFS shared directory.
    $ sudo mkdir -p /mnt/nfs_share_client
    
  • Mount shared directory
    $ sudo mount -t nfs <ServerIP>:/srv/nfs_share /mnt/nfs_share_client
    
  • We can also make the mount permanent by adding an entry to /etc/fstab
    # NFS for MPI cluster
    <ServerIP>:/srv/nfs_share /mnt/nfs_share_client nfs defaults 0 0
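
    After editing /etc/fstab, the entry can be applied without a reboot by mounting everything listed there:

    $ sudo mount -a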
    

Install OpenMPI on the server

  • Download from openmpi.org, e.g.
    $ wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.4.tar.bz2
    $ tar xjvf openmpi-4.0.4.tar.bz2
    
  • Install to the NFS shared folder
    $ cd openmpi-4.0.4
    $ ./configure --prefix=<NFS shared folder>
    $ make
    $ make install
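
    Once the installation finishes, it can be checked by running the installed binary directly from the shared folder, e.g.:

    $ <NFS shared folder>/bin/mpirun --version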
    

Client setting

  • Add the OpenMPI directories to PATH and LD_LIBRARY_PATH by editing .bashrc.
    # Setting OpenMPI related environment variables
    export OMPI=<OpenMPI directory>
    export PATH="$PATH:$OMPI/bin"
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$OMPI/lib"
    

    NOTE: You must add the above lines at the VERY BEGINNING of .bashrc, or at least BEFORE the lines below, because mpirun starts processes on the remote nodes through non-interactive ssh sessions, and the default Ubuntu .bashrc stops processing early for non-interactive shells:

    # If not running interactively, don't do anything
    case $- in
        *i*) ;;
          *) return;;
    esac
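
    A quick way to check that the variables are picked up by non-interactive shells is to run a command on a client over ssh, e.g.:

    $ ssh <Client1IP> 'which mpirun'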
    

Network configuration

  1. Set a firewall rule on the server to allow traffic from the cluster subnet (OpenMPI communicates over arbitrary TCP ports, not just ssh)

    $ sudo ufw allow from <subnetIP/24>
    
  2. Add the node information to the /etc/hosts file on each machine so the nodes can be referred to by name (a separate hostfile for mpirun is shown after this list).

    • Server
      127.0.0.1 localhost
      <Client1IP> Client1
      <Client2IP> Client2
      ...
      
    • Client
      127.0.0.1 localhost
      <MasterIP> Master
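
Besides /etc/hosts, mpirun itself needs to know which nodes to use. One common way is a hostfile (assumed here to be named mpi_hosts and kept in the NFS share) that lists each node together with the number of MPI slots it offers, for example:

Client1 slots=4
Client2 slots=4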
      

Test

  1. Test with your own program (a sketch is shown after this list), e.g.
    • Compile it with the MPI compiler wrapper
    • Run it with a specified host file
  2. Or simply run the below command, which launches the hostname utility on the remote host through MPI
    $ mpirun --host <remote host> hostname
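
As a minimal sketch of option 1 (mpi_hello.c and the hostfile mpi_hosts are assumed names; the compiled binary must be visible at the same path on every node, e.g. inside the NFS share):

$ mpicc mpi_hello.c -o mpi_hello
$ mpirun --hostfile mpi_hosts -np 8 ./mpi_hello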

Conclusion

We have talked about how to build a Beowulf OpenMPI cluster for parallel computing. However, there are still many things that can be done to enhance the environment, in at least three directions.

First, we are currently only using NFS to share OpenMPI among the nodes, without a shared /home, which makes user management harder and forces users to take extra steps before they can run a job. This leads to the second point: a proper resource manager and job scheduler, e.g. Slurm, is needed to handle resource allocation and job arrangement. Finally, the cluster is built manually from scratch, i.e. setting up individual ssh keys, creating NFS shares, editing host files, setting static IPs, and applying kernel patches manually [1]. This manual process makes scaling up the cluster difficult and error-prone, so a suitable cluster management software would be of great help.

References


  1. https://en.wikipedia.org/wiki/OpenMosix#ClusterKnoppix
