Integrating with Slurm

These steps describe how to integrate Posit Workbench with Launcher and Slurm.

In this configuration, the Workbench and Launcher services are installed on one node in the Slurm cluster, and the Workbench Session Components are installed on all other Slurm nodes.

Requirements

  • An existing Slurm cluster
  • Supported versions of Slurm are:
    • 21.X
    • 22.X
    • 23.X
  • Access to the rsession.conf file:
    • The rsession.conf contains configurations specific to the sessions Workbench launches. Therefore, it needs to exist on the Slurm nodes where the sessions are being run. See the Appendix H - rsession.conf section for more information.

Pre-Flight Configuration Checks

Verifying active Slurm compute nodes

  • On a machine with Slurm configured, ensure that you have one or more worker nodes that are ready to accept jobs as part of the Slurm cluster by running the following command:

    $ sinfo
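
The sinfo check can also be scripted. The following is a minimal sketch, not part of Workbench or Slurm: count_ready_nodes is a hypothetical helper that reads `sinfo -h -o "%T %D"` output (node state and node count per line) and sums the nodes whose state is exactly idle or mixed. Treating those two states as "ready" is an assumption; states carrying a suffix such as `idle*` are deliberately not counted.

```shell
#!/bin/sh
# Hypothetical helper: sum the node counts for states that can accept jobs.
# Input (stdin): one "<state> <node count>" pair per line, as printed by
#   sinfo -h -o "%T %D"
# Assumption: only the exact states "idle" and "mixed" count as ready.
count_ready_nodes() {
    awk '$1 == "idle" || $1 == "mixed" { n += $2 } END { print n + 0 }'
}

# Example on a machine with Slurm configured:
#   sinfo -h -o "%T %D" | count_ready_nodes
```

A result of 0 suggests that no worker node is currently ready to accept jobs.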

Verifying functionality with a test job

  • On a machine with Slurm configured, ensure that you are able to deploy a sample job to your Slurm cluster by running a test command:

    $ srun date
  • You can also create a sample script called submit.sh:

    #!/bin/bash
    #
    #SBATCH --job-name=test
    #SBATCH --output=res.txt
    #
    #SBATCH --ntasks=1
    #SBATCH --time=10:00
    
    srun hostname
    srun sleep 60
  • Then submit the job by running the following command:

    $ sbatch submit.sh
  • Verify that it runs successfully on the Slurm cluster.

Step 1. Install Workbench on one Slurm node

  • You can install Workbench on any Slurm node that has access to the Slurm tooling such as sbatch, srun, sinfo, etc. This can be a Slurm login/submission node, controller node, or compute node.

    Note

    Unless you are using load balancing, you only need to install Workbench on a single node. In later steps, you’ll install R and the Workbench Session Components on other Slurm nodes.

Step 2. Configure Workbench with Launcher

  • On the node where Workbench is installed, add the following lines to the Workbench configuration file:

    File: /etc/rstudio/rserver.conf
    # Launcher Config
    launcher-address=127.0.0.1
    launcher-port=5559
    launcher-sessions-enabled=1
    launcher-default-cluster=Slurm
    launcher-sessions-callback-address=http://<SERVER-ADDRESS>:8787

Adjust the callback address for your environment:

  • In the launcher-sessions-callback-address setting, replace <SERVER-ADDRESS> with the hostname or IP address of Workbench.

  • Change the protocol and port if you are using HTTPS or a different port.

    Note

    If HTTPS is being used, ensure you have launcher-use-ssl=1 configured in the rserver.conf file.

Ensure that the launcher-sessions-callback-address is reachable from the Slurm compute nodes.
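
One way to check reachability is to read the configured address back out of rserver.conf and request it from a compute node. The sketch below is illustrative only: callback_address is a hypothetical helper, and it assumes the setting appears as a plain key=value line. It fails if the key is missing or still contains the <SERVER-ADDRESS> placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the launcher-sessions-callback-address value
# from an rserver.conf-style file. Returns non-zero if the key is missing
# or the <SERVER-ADDRESS> placeholder was never replaced.
callback_address() {
    conf_file=$1
    value=$(sed -n 's/^launcher-sessions-callback-address=//p' "$conf_file" | tail -n 1)
    case $value in
        ''|*'<'*'>'*) return 1 ;;
    esac
    printf '%s\n' "$value"
}

# Example (assumes curl is installed on the compute nodes):
#   srun curl -s -o /dev/null -w '%{http_code}\n' \
#       "$(callback_address /etc/rstudio/rserver.conf)"
```

If the request run through srun times out or cannot connect, the compute nodes likely cannot reach Workbench at that address.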

Step 3. Configure Launcher settings and plugins

  • On the node where Workbench is installed, add the following lines to the Launcher configuration file:

    File: /etc/rstudio/launcher.conf
    [server]
    address=127.0.0.1
    port=5559
    server-user=rstudio-server
    admin-group=rstudio-server
    authorization-enabled=1
    thread-pool-size=4
    enable-debug-logging=1
    
    [cluster]
    name=Slurm
    type=Slurm

Step 4. Configure profile for Launcher Slurm plugin

  • On the node where Workbench is installed, add the following lines to the Launcher profiles configuration file:

    File: /etc/rstudio/launcher.slurm.profiles.conf
    [*]
    default-cpus=1
    default-mem-mb=512
    max-cpus=2
    max-mem-mb=1024

Step 5. Configure Launcher with Slurm

  • On the node where Workbench is installed, add the following lines to the Launcher Slurm configuration file:

    File: /etc/rstudio/launcher.slurm.conf
    slurm-service-user=slurm

    If the Slurm CLI is installed in a non-default location, then it must be specified in the Slurm configuration file. For example:

    File: /etc/rstudio/launcher.slurm.conf
    slurm-service-user=slurm
    slurm-bin-path=/usr/local/bin

    If the Slurm CLI is in a non-default location and slurm-bin-path is not set, an error similar to the following may appear in the Launcher log:

    Error
    03 Aug 2021 17:42:42 [rstudio-slurm-launcher] ERROR slurm error 7 (Slurm command exited due to an unknown error: /bin/sh: scontrol: command not found

The Job Launcher uses the slurm-service-user account for interactions with a Slurm cluster. This includes querying cluster information, interacting with jobs, and submitting jobs on behalf of the logged-in user’s account.
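
Before restarting the services, it can help to confirm that the directory you plan to use for slurm-bin-path actually contains the Slurm commands. The following is a minimal sketch: missing_slurm_tools is a hypothetical helper, and the list of commands it checks (sbatch, srun, sinfo, scontrol) is taken from the commands mentioned in this guide, not from an authoritative list of what the Launcher Slurm plugin invokes.

```shell
#!/bin/sh
# Hypothetical helper: print each Slurm command that is missing or not
# executable in the given directory; return non-zero if any are missing.
# Assumption: the command list below is illustrative, not exhaustive.
missing_slurm_tools() {
    bin_dir=$1
    missing=0
    for tool in sbatch srun sinfo scontrol; do
        if [ ! -x "$bin_dir/$tool" ]; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   missing_slurm_tools /usr/local/bin && echo "all Slurm commands found"
```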

Step 6. Verify Slurm configuration and cluster environment

Verify that the following requirements are satisfied for Workbench and Launcher to work with your Slurm cluster:

  • All Slurm nodes should have users’ home directories mounted via shared file storage with matching user and group IDs across all nodes.

  • The root user should have read access to all users’ home directories.

  • In your Slurm configuration file (slurm.conf), the MinJobAge setting should be equal to or greater than the job-expiry-time setting in /etc/rstudio/launcher.conf, which is 24 h by default. For both of them to be 24 h, you would need to set MinJobAge=86400 in your slurm.conf.

    Note

    The MinJobAge setting in slurm.conf is configured in seconds, rather than hours.
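
The hours-to-seconds comparison can be checked with a short script. This is a sketch under stated assumptions: min_job_age_ok is a hypothetical helper, it expects MinJobAge to be written in exactly that case on its own line in the file you pass, and it treats a missing setting as a failure.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MinJobAge in a slurm.conf-style file is
# at least <expiry_hours> hours, expressed in seconds (hours * 3600).
# Assumptions: the key is spelled "MinJobAge" and sits on its own line;
# a missing key is reported as a failure.
min_job_age_ok() {
    slurm_conf=$1
    expiry_hours=$2
    required=$((expiry_hours * 3600))
    actual=$(sed -n 's/^[[:space:]]*MinJobAge[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$slurm_conf" | tail -n 1)
    [ -n "$actual" ] && [ "$actual" -ge "$required" ]
}

# Example: 24 hours requires MinJobAge >= 86400.
# The slurm.conf path varies by installation.
#   min_job_age_ok /etc/slurm/slurm.conf 24 || echo "MinJobAge is too low or not set"
```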

Step 7. Ensure that R is available on Slurm compute nodes

  • On each Slurm compute node in the cluster (where you did not install Workbench), you will need to install one or more versions of R and associated R packages to be able to start R sessions via Slurm.

  • We recommend installing R to a shared file server or network drive location so that any installed packages are also available across all compute nodes. You can also make use of existing versions of R and environment modules that are available on the cluster.

  • When using multiple versions of R, the shared file /var/lib/rstudio-server/r-versions must be reachable by all Slurm nodes. Note that this file is generated by Workbench, and that its location may be changed by setting r-versions-path=<shared directory>/r-versions in rserver.conf.

    Note

    For more information on using Launcher and Slurm with multiple versions of R and module loading, refer to the Multiple Versions of R and Module Loading section of the Workbench Launcher Administration Guide and the R Versions section of the Workbench Administration Guide.

    • Alternatively, you can manually install R and R packages by following the steps to Install R on each Slurm compute node.

Step 8. Install Workbench session components on Slurm compute nodes

On each Slurm compute node in the cluster (where you did not install Workbench), you will need to install the Workbench session components and create an rstudio-server user to be able to start R sessions via Slurm.

Use the following commands to install the Workbench session components on each Slurm compute node:

RHEL 9

    sudo yum install libcurl-devel libuser-devel libpq openssl-devel rrdtool

    curl -O https://download1.rstudio.org/session/rhel9/x86_64/rsp-session-rhel9-2023.12.1-402.pro1-x86_64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-rhel9-2023.12.1-402.pro1-x86_64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-rhel9-2023.12.1-402.pro1-x86_64.tar.gz

RHEL 8

    sudo yum install libcurl-devel libpq openssl-devel rrdtool

    curl -O https://download1.rstudio.org/session/rhel8/x86_64/rsp-session-rhel8-2023.12.1-402.pro1-x86_64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-rhel8-2023.12.1-402.pro1-x86_64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-rhel8-2023.12.1-402.pro1-x86_64.tar.gz

CentOS 7

    sudo yum install libcurl-devel libpq openssl-devel rrdtool

    curl -O https://download1.rstudio.org/session/centos7/x86_64/rsp-session-centos7-2023.12.1-402.pro1-x86_64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-centos7-2023.12.1-402.pro1-x86_64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-centos7-2023.12.1-402.pro1-x86_64.tar.gz

Ubuntu 22.04 (jammy)

    sudo apt-get install curl libcurl4-gnutls-dev libssl-dev libuser1-dev libpq5 rrdtool

    curl -O https://download1.rstudio.org/session/jammy/amd64/rsp-session-jammy-2023.12.1-402.pro1-amd64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-jammy-2023.12.1-402.pro1-amd64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-jammy-2023.12.1-402.pro1-amd64.tar.gz

Ubuntu 20.04 (focal)

    sudo apt-get install curl libcurl4-gnutls-dev libssl-dev libuser libuser1-dev libpq5 rrdtool

    curl -O https://download1.rstudio.org/session/focal/amd64/rsp-session-focal-2023.12.1-402.pro1-amd64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-focal-2023.12.1-402.pro1-amd64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-focal-2023.12.1-402.pro1-amd64.tar.gz

openSUSE 15

    sudo zypper install curl libsqlite3-0 libpq5

    curl -O https://download1.rstudio.org/session/opensuse15/x86_64/rsp-session-opensuse15-2023.12.1-402.pro1-x86_64.tar.gz
    sudo mkdir -p /usr/lib/rstudio-server
    sudo tar -zxvf ./rsp-session-opensuse15-2023.12.1-402.pro1-x86_64.tar.gz -C /usr/lib/rstudio-server/
    sudo mv /usr/lib/rstudio-server/rsp-session*/* /usr/lib/rstudio-server/
    sudo rm -rf /usr/lib/rstudio-server/rsp-session*
    rm -f ./rsp-session-opensuse15-2023.12.1-402.pro1-x86_64.tar.gz
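
Each set of install commands repeats the same archive name several times, and a mismatch between the downloaded name and the name passed to tar makes extraction fail. The sketch below builds the URL once so the file name can be derived from it. session_url is a hypothetical helper, and the URL pattern is inferred from the download links in this step rather than from a published specification.

```shell
#!/bin/sh
# Hypothetical helper: build the session-components download URL.
# Assumption: the pattern is inferred from the URLs shown in this step.
#   $1 = OS identifier (for example rhel9, jammy, opensuse15)
#   $2 = architecture as used in the URL (x86_64 or amd64)
#   $3 = version string (for example 2023.12.1-402.pro1)
session_url() {
    os=$1 arch=$2 version=$3
    printf 'https://download1.rstudio.org/session/%s/%s/rsp-session-%s-%s-%s.tar.gz\n' \
        "$os" "$arch" "$os" "$version" "$arch"
}

# Example: download once, then reuse the derived file name.
#   url=$(session_url rhel9 x86_64 2023.12.1-402.pro1)
#   archive=${url##*/}
#   curl -O "$url"
#   sudo tar -zxvf "./$archive" -C /usr/lib/rstudio-server/
```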

To create the rstudio-server user on each Slurm compute node:

  • First check the uid and gid of the rstudio-server user by running id rstudio-server on the Workbench host machine.

  • Then, on each compute node run the following commands:

    $ sudo groupadd --system --gid <gid of rstudio-server> rstudio-server
    $ sudo useradd --system --gid rstudio-server --uid <uid of rstudio-server> rstudio-server
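
To avoid copying the IDs by hand, the two commands can be generated from the output of id rstudio-server. This is a minimal sketch: user_cmds_from_id is a hypothetical helper that assumes the usual `uid=N(name) gid=N(name) groups=...` output format and only prints the commands, so you can review them before running them on each compute node.

```shell
#!/bin/sh
# Hypothetical helper: turn the output of `id rstudio-server` (run on the
# Workbench host) into the groupadd/useradd commands for compute nodes.
# Assumption: input looks like "uid=998(rstudio-server) gid=998(rstudio-server) groups=...".
user_cmds_from_id() {
    id_output=$1
    uid=$(printf '%s\n' "$id_output" | sed -n 's/^uid=\([0-9][0-9]*\).*/\1/p')
    gid=$(printf '%s\n' "$id_output" | sed -n 's/.* gid=\([0-9][0-9]*\).*/\1/p')
    # Refuse to print commands if either ID could not be parsed.
    [ -n "$uid" ] && [ -n "$gid" ] || return 1
    printf 'sudo groupadd --system --gid %s rstudio-server\n' "$gid"
    printf 'sudo useradd --system --gid rstudio-server --uid %s rstudio-server\n' "$uid"
}

# Example on the Workbench host:
#   user_cmds_from_id "$(id rstudio-server)"
```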

Step 9. Restart Workbench and Launcher Services

  • Run the following to restart services:

    $ sudo rstudio-server restart
    $ sudo rstudio-launcher restart

Step 10. Test Workbench with Launcher and Slurm

  • In your browser, navigate to the Workbench interface and log in.
  • Select New Session, then click the Start Session button.
  • You can then use the RStudio Session, which is running as a Slurm job.

Additional Documentation

For more information on Workbench and Launcher, see the Launcher section of the reference documentation.

Troubleshooting Workbench and Slurm

For additional information on troubleshooting Workbench with Launcher and Slurm, see the Launcher Troubleshooting section of the Workbench Admin Guide.