Using the CUBIC Cluster

Table of contents

  1. Setting up your account
  2. Project Directory Access Request
  3. File permissions on CUBIC
    1. Keypress issues
  4. Configuring a CUBIC account
    1. Quick fixes for annoying behavior
  5. Installing miniconda in your project (The hard way)
  6. Installing the flywheel CLI tool
  7. Checking that your python SDK works
  8. Finalizing your setup
  9. Downloading data from flywheel to CUBIC
  10. Mounting CUBIC on your local machine
    1. Creating a sensible mount point
    2. Mounting CUBIC
  11. Moving data to and from CUBIC
    1. Moving files to CUBIC
    2. Moving files from CUBIC
  12. Using R/R-studio and Installation of R packages
  13. CPUs, Nodes, & Memory
    1. Specifying CPUs on a node
    2. Errors with Allocating Memory/Memory Overflow
  14. Additional information about CUBIC

The CUBIC cluster is a very powerful set of servers that we can use for computing. Although the servers run Linux, familiarity with Linux does not guarantee that you will be able to use CUBIC effectively. This section details how to get up and running on the CUBIC cluster. In general we now recommend using PMACS for specific analysis projects, and reserve CUBIC for use as a high-performance compute engine for large batches of containerized jobs launched from Flywheel. However, for some projects (especially collaborations with CBICA), it may make sense to have your project live on CUBIC.

Setting up your account

To get login credentials for CUBIC, you must already have a Penn Medicine account (i.e. an email address). Once you do, ask the lab’s PMACS/CUBIC manager to create a ticket requesting a new CUBIC account. You will receive an email with your login credentials and other instructions. Once you are granted login credentials for CUBIC, you will be able to connect from inside the Penn Medicine network using SSH. To access the network remotely, follow the instructions to install the VPN client. If you can successfully authenticate but are still blocked from access, you may need to contact someone to put you on an exceptions list.

Once inside the Penn network, the login to CUBIC looks like this:

$ ssh -Y username@cubic-login

You log in with your UPHS password. On success you will be greeted with a welcome message and any news:

                               Welcome to

                   #####   ######   ###   #####      #
                  #     #  #     #   #   #     #    # #
                  #        #     #   #   #         #   #
                  #        ######    #   #        #     #
                  #        #     #   #   #        #######
                  #     #  #     #   #   #     #  #     #
                   #####   ######   ###   #####   #     #

            Center for Biomedical Image Computing and Analytics

				**** Reminder ****

		The login nodes are shared by all users and are intended
		for interactive work only. Long-running tasks requiring

You can hit the space bar to read all of this or q to exit.

Project Directory Access Request

Once you have access to CUBIC, you may need to start a project in a new directory. Visit this wiki for more, or follow along below.

First you need to fill out the data management document available here. This document will ask for a number of details about your project, including the data’s source, estimates of how much disk space you will need over a 6 month, 12 month, and 24 month period, and the estimated lifespan of the data ( 🤷). You will also need to provide the CUBIC usernames for everyone you want to have read and/or write access to the project; getting this done ahead of time is strongly recommended because, as you can imagine, requesting changes after the fact can be a bother.

Additionally, you will need to be familiar with:

  • Whether or not the data has an IRB associated with it and who has approval
  • Whether or not the data is the definitive source
  • Whether or not you have a data use agreement
  • What will happen to the data at the end of its expected lifespan on the cluster

This document must be saved as a .txt file before being submitted with your request.

Finally, you will need approval from your PI. This involves sending an email to the PI with a written blurb to the effect of “Do you approve of this project folder request”, to which the PI only needs to respond “Yes, approved”. Once you’ve got this, screenshot the conversation (with the date in frame) and save it as an image.

With these two documents in hand, you can now submit the request via the Request Tracker (you’ll need your CBICA/CUBIC login credentials for this). Lastly, attach your supporting documents to the ticket.

The process for accessing an existing project is similar, but fortunately you will not have to fill out a new data management document; only the PI approval and the filing of the online ticket are required. You should receive an email from CBICA confirming your request, and you can always return to the Request Tracker to check the status of your ticket.

File permissions on CUBIC

Unlike many shared computing environments, read and write permissions on CUBIC are not configured using groups. Instead, individual users are granted access to data on a project-by-project basis. For example, even if you are a member of the project pnc_fixel_cs, you will not be able to read or write directly to that project’s directory (which will be something like /cbica/projects/pnc_fixel_cs) as your individual user.

To access a project’s files you have to log in as the project user. This is done using the sudo command after you have logged in as your individual user. In this example you would use sudo to log in as the pncfixelcs user and run a shell (note that underscores in the project directory name are removed when logging in as the project user) by running

$ sudo -u pncfixelcs sudosh

and entering the same UPHS password you used to log in to your individual user account. You can see that the project user has its own environment:

$ echo $HOME

This means that the user will have their own startup scripts like .bashrc and .bash_profile in their $HOME directory.

Keypress issues

Sometimes after logging in as a project user, you will find that you have to type each character twice for it to appear in your terminal. If this happens, you can start another shell within the current one by running bash or zsh. This usually creates a responsive shell.

Configuring a CUBIC account

Note that individual user accounts typically have very little hard drive space allotted to them. You will likely be doing all your heavy computing while logged in as a project user. This means that you will want to configure your project user account with any software you need. Here we will use the xcpdev account as an example. First, log in as the project user:

$ sudo -u xcpdev sudosh

Let’s see what is in this directory:

$ ls -al .
total 14
drwxrws---.   7 xcpdev xcpdev      4096 Feb 12 19:44 ./
drwxr-xr-x. 215 root   root        8192 Feb 10 16:06 ../
-rw-------.   1 xcpdev xcpdev        14 Oct  9 16:52 .bash_history
-r--r-x---.   1 xcpdev xcpdev       873 Jul  9  2018 .bash_profile*
-r--r-x---.   1 xcpdev xcpdev      1123 Jul  9  2018 .bashrc*
drwsrws---.   2 xcpdev xcpdev      4096 Aug 19 14:13 dropbox/
lrwxrwxrwx.   1 xcpdev xcpdev        17 Oct  9 16:52 .java -> /tmp/xcpdev/.java/
drwxr-s---.   3 xcpdev xcpdev      4096 Oct  9 16:52 .local/
drwxr-s---.   2 xcpdev xcpdev      4096 Oct  9 16:52 perl5/
drwxr-s---.   2 xnat   sbia_admins 4096 Jan  6 23:47 RAW/
drwxr-s---.   2 xcpdev xcpdev      4096 Jul  9  2018 .subversion/
-rw-r-----.   1 xcpdev xcpdev         0 Oct  9 16:52 .tmpcheck-cubic-login1
-rw-rw-r--.   1 xcpdev xcpdev         0 Feb 12 19:44 .tmpcheck-cubic-login4
-rw-r-x---.   1 root   root        2360 Jul  9  2018 xcpDev_Project_Data_use.txt*

Notice that .bashrc is not writable by anyone. We’ll need to change this temporarily so we can configure the environment. To do so, run

$ chmod +w .bashrc
$ ls -al .
-rw-rwx---.   1 xcpdev xcpdev      1123 Jul  9  2018 .bashrc*

and we can see that the file is now writable.

Quick fixes for annoying behavior

By default, CUBIC replaces some basic shell programs with aliases. In your .bashrc file you can remove these by deleting the following lines:

alias mv="mv -i"
alias rm="rm -i"
alias cp="cp -i"

Additionally, you will want to add the following line to the end of .bashrc:


In order to ensure that the compute nodes source your .bashrc, you can use the -V flag with qsub. We also recommend that when you launch a script requiring your conda environment and packages, you add source activate <env> to the top of your script. To change the default installation used for a given software package, prepend its path to your $PATH and re-source your .bashrc:

echo 'export PATH=/directory/where/your/installation/lives:$PATH' >> ~/.bashrc
source ~/.bashrc
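To see why prepending works, recall that the shell searches the directories in $PATH from left to right and runs the first match it finds. Here is a quick throwaway demonstration (the /tmp/demo_bin directory and mytool name are made up for illustration):

```shell
# create a throwaway directory containing a script called "mytool"
mkdir -p /tmp/demo_bin
printf '#!/bin/sh\necho custom\n' > /tmp/demo_bin/mytool
chmod +x /tmp/demo_bin/mytool

# prepend it to PATH; this copy of mytool is now found first
export PATH=/tmp/demo_bin:$PATH
mytool   # prints "custom"
```

The same mechanism is what lets your project-local installations shadow the system-wide ones.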

Installing miniconda in your project (The hard way)

You will want a python installation that you have full control over. After logging in as your project user and changing permission on your .bashrc file, you can install miniconda using

$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ chmod +x Miniconda3-latest-Linux-x86_64.sh
$ ./Miniconda3-latest-Linux-x86_64.sh

You will need to hit Enter to continue and type yes to accept the license terms. The default installation location is fine (it will be $HOME/miniconda3). Sometimes you will run into a memory error at this step. If this happens, just log out and log back in and the issue should be resolved. You can avoid it in the first place by logging in to cubic-login4 when SSHing into CUBIC.

When prompted if you want to initialize miniconda3, respond again with yes

Do you wish the installer to initialize Miniconda3
by running conda init? [yes|no]
[no] >>> yes

For the changes to take place, log out of your sudo bash session and your second bash session, then log back in:

$ exit
$ sudo -u xcpdev sudosh
(base) $ which conda

You will notice that your shell prompt now begins with (base), indicating that you are in conda’s base environment.

There will be a permission issue with your conda installation: you will need to take ownership of your miniconda installation. To fix this, run

$ chown -R `whoami` ~/miniconda3

When you launch jobs on CUBIC, they will automatically use CUBIC’s base conda environment instead of your project user’s miniconda installation. To fix this, initialize miniconda in any bash script submitted to qsub by running

$ source ~/miniconda3/etc/profile.d/conda.sh

Let’s create an environment we will use for interacting with flywheel.

$ conda create -n flywheel python=3.7
$ conda activate flywheel
$ pip install flywheel-sdk

Installing the flywheel CLI tool

To install the Flywheel CLI tool on CUBIC, you will again need to be logged in as your project user and have a writable .bashrc. Now create a place to put the fw executable.

$ cd
$ mkdir -p software/flywheel
$ cd software/flywheel

Flywheel will complain if your version is out of date, so it is best to find the latest version and download that. You can find the latest version by logging into Flywheel. Once you’ve logged in, select your account menu in the upper-right corner and choose Profile. Scroll down to the Download Flywheel CLI section, and you should see the latest version (e.g. 10.7.3). In the first line below, replace <version> with the version number you just found (e.g. 10.7.3).

$ wget https://storage.googleapis.com/flywheel-dist/cli/<version>/fw-linux_amd64.zip
$ unzip fw-linux_amd64.zip
$ echo "export PATH=\$PATH:~/software/flywheel/linux_amd64" >> ~/.bashrc
$ exit
$ exit
$ sudo -u xcpdev bash
$ bash
$ fw login $APIKEY

where $APIKEY is replaced with your Flywheel API key. You can find your personal API key in your account profile (the same place you found the version number) by scrolling all the way to the bottom.

Checking that your python SDK works

After running the fw login command from above you can activate your flywheel conda environment and check that you can connect:

$ conda activate flywheel
$ python

and in python

>>> import flywheel
>>> fw = flywheel.Client()

If there is no error message, you have a working Flywheel SDK!

Finalizing your setup

After all these steps, it makes sense to return your .bashrc to non-writable mode

$ chmod -w ~/.bashrc

Downloading data from flywheel to CUBIC

The following script is an example of downloading the output of a Flywheel analysis to CUBIC:

import flywheel
import os

fw = flywheel.Client()

project = fw.lookup('bbl/ALPRAZ_805556') # Insert your project name here
subjects = project.subjects() # This returns the subjects that are in your project

# This is a string that you will use to partial match the name of the analysis output you want.
analysis_str = 'acompcor'

for sub in subjects:
    """Loop over subjects and get each session"""
    sub_label = sub.label.lstrip('0') #Remove leading zeros

    for ses in sub.sessions():
        ses_label = ses.label.lstrip('0') #Remove leading zeros
        """Get the analyses for that session"""
        full_ses = fw.get(ses.id)
        these_analyses = [ana for ana in full_ses.analyses if analysis_str in ana.label]
        if len(these_analyses) < 1:
            print('No analyses {} {}'.format(sub_label, ses_label))
        for this_ana in these_analyses:
            """Looping over all analyses that match your string"""
            if not this_ana.files:
                # There are no output files; skip this analysis.
                continue

            outputs = [f for f in this_ana.files if f.name.endswith('.zip')
                and 'log' not in f.name] # Grabbing the zipped output file (adjust the filter for your case)
            if not outputs:
                continue
            output = outputs[0]

            # I am getting this ana_label to label my directory.
            ## You may want to label differently and/or
            ## change the string splitting for your specific case.
            ana_label = this_ana.label.split(' ')[0]

            dest = '/cbica/projects/alpraz_EI/data/{}/{}/{}/'.format(ana_label,sub_label,ses_label) #output location
            try:
                os.makedirs(dest) # make the output directory
            except OSError:
                print(dest+" exists")
            else:
                print("creating "+dest)
            dest_file = os.path.join(dest, output.name)
            if not os.path.exists(dest_file):
                """Download the output file if it does not already exist"""
                print("Downloading", dest_file)
                this_ana.download_file(output.name, dest_file)

We can run this script using qsub and a short bash wrapper script. Providing the full path to python is important! Your path may be different depending on install location, and the name of your python script may also be different.
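As a sketch, the wrapper might look like the script below; the job name, memory requests, environment name, and script path are all illustrative assumptions, not CUBIC conventions:

```shell
#!/bin/bash
#$ -N fw_download                 # job name
#$ -l h_vmem=8.5G,s_vmem=8G       # memory requests
#$ -j y                           # merge stdout and stderr logs

# use the project user's miniconda rather than the system python
source ~/miniconda3/etc/profile.d/conda.sh
conda activate flywheel

# full path to the interpreter and to the download script
~/miniconda3/envs/flywheel/bin/python /cbica/projects/my_project/scripts/download_script.py
```

Submitting with qsub -V additionally carries your current environment variables over to the compute node.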


Mounting CUBIC on your local machine

A guide for those who want to mount their CBICA project folder on their local machine. This guide uses SSHFS. The first part discusses creating a mount point on your machine that matches the directory structure of CUBIC. This is useful because all of your scripts that contain file paths will then work both locally and on the server (very convenient!). If you already have a mount point, or prefer to mount somewhere else, you can skip the first part and jump to the section on mounting using sshfs.

Creating a sensible mount point

  1. Create a mount point on your local machine that matches the file path to your project dir on CUBIC (Catalina users, see the note below). Since you are making a dir on root, you need to use sudo. You will need to enter your computer password after entering the command. Replace my_project below with your actual project folder name.
    $ sudo mkdir -p /cbica/projects/my_project
  2. Change the owner of your folder to your local user rather than root so that you don’t need to use sudo to do things with it.
    $ sudo chown -R my_username /cbica

Note: With the update to Catalina, you can no longer make directories in /. Instead, there is a strange technique that was introduced for making symbolic links. Here are the steps:

  1. Make a directory in your home dir (or elsewhere if you prefer) that will eventually be symbolically linked to /.
      $ cd
      $ mkdir -p cbica/projects/my_project
  2. Using a text editor, create a file called synthetic.conf and save it in /etc. You will need to use sudo to make a file in /etc; e.g. sudo vim /etc/synthetic.conf.
  3. Put the following text in the file, using a tab rather than a space between the two fields:
      cbica	/Users/my_home_folder/cbica
  4. Restart the computer.
  5. You should now see a dir in the root dir, /cbica.

Mounting CUBIC

  1. Mac users need to download FUSE and SSHFS.
  2. Mount using sshfs
    $ sshfs -o defer_permissions username@cubic-login:<my-folder-on-CUBIC> <my-local-folder>

    For example, if you have set up your mount point according to the above guide, your command will be:

    $ sshfs -o defer_permissions username@cubic-login:/cbica/projects/my_project /cbica/projects/my_project/

    I recommend putting this command into a script or alias if you need to mount often. E.g. in your .profile put: alias mc="sshfs -o defer_permissions username@cubic-login:/cbica/projects/my_project /cbica/projects/my_project/" Now you can simply type mc to mount CUBIC. Pro-tip: follow these instructions to set up SSH keys so that you no longer need to type your password.

  3. When you are done, unmount. This should ideally be done BEFORE you disconnect from the network to avoid confusing your computer for a few minutes and making the mountpoint temporarily unresponsive.
$ cd   # just to make sure we are not inside the mount dir
$ umount /cbica/projects/my_project


If you forget to do this and are on a Mac, you may encounter an issue where you cannot mount or unmount and are prompted with an Input/output error. In this case you will need to identify and kill the stuck sshfs process. Then you should be able to unmount and remount.

$ pgrep -lf sshfs
$ kill -9 <pid_of_sshfs_process>
$ sudo umount -f <mounted_dir>

Moving data to and from CUBIC

Because of CUBIC’s unique “project user” design, the protocol for moving files to CUBIC is a bit different than on a normal cluster. It is possible to move files to CUBIC by conventional means, or through your mount point, but this can cause annoying permissions issues and is not recommended.

Note that you will need to be within the UPenn infrastructure (i.e. on VPN or on campus) to move files to and from CUBIC.

Moving files to CUBIC

All project directories will include a folder called dropbox/ in the project home directory. Depositing files into this folder will automatically make the project user the owner of the file. Please note, however, that this ownership conversion is not always instantaneous and can take a few minutes, so be patient. Note also that anyone in the project group can move files into this folder. Finally, keep in mind that the dropbox can only contain 1GB or 1000 files at any given time.
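Because transfers that break these limits will fail, it can be worth checking a batch of files before copying. A minimal sketch in Python (the fits_in_dropbox helper is ours, not part of CUBIC):

```python
import os

MAX_BYTES = 1 * 1024**3   # dropbox/ holds at most 1 GB at a time
MAX_FILES = 1000          # ...and at most 1000 files

def fits_in_dropbox(paths):
    """Return True if this batch of local files respects the dropbox/ limits."""
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths) <= MAX_FILES and total <= MAX_BYTES
```

If a batch does not fit, split it into several transfers and wait for the ownership conversion to finish between them.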

scp is the recommended command-line transfer software for moving files onto and off of CUBIC. One need only specify the file(s) to move and the CUBIC destination. See the example below, where <...> indicates user input:

scp </path/to/files*.nii.gz> <username>@cubic-login:/cbica/projects/<project_dir>/dropbox/

This command would copy all nii.gz files from /path/to/ into the dropbox/ folder of your project directory. Note that you are entering your CUBIC username in the destination, not your project username (confusing, I know).

Moving files directly to a non-dropbox/ folder on CUBIC with scp or your mount point is possible for a user with project directory write permissions, though it is not recommended. Such files will retain the ownership of the CUBIC user who transferred them, and permissions can only be changed by that user or a user with sudo privileges.

Moving files from CUBIC

This is much simpler. One can simply use scp (or rsync, or whatever) to copy files from a source on cubic to their local destination. E.g.

scp <username>@cubic-login:/cbica/projects/<project_dir/path/files.csv> </local/path/to/put/files/>

It is also possible to copy files through the mount point, but this would be quite slow and is not really the purpose of the mount point.

Using R/R-studio and Installation of R packages

  1. Currently R-3.6 is installed on CUBIC. If you are satisfied with R-3.6, go to step 2 below. However, you can install another R version in any directory of your choice, usually your home directory /cbica/home/username. To install R in your desired directory, follow these steps.

    $ module load curl/7.56.0 # load the libcurl library
    $ wget https://cran.r-project.org/src/base/R-3/R-3.4.1.tar.gz # e.g. R-3.4.1
    $ tar xvf R-3.4.1.tar.gz
    $ cd R-3.4.1
    $ ./configure --prefix=$HOME/R  --enable-R-shlib #$HOME/R is where R will be installed
    $ make && make install

    Then, installation of R is complete. To run R, add $HOME/R/bin to your PATH; shell commands like R and Rscript will then work.

     $ echo 'export PATH=$HOME/R/bin:$PATH' >> ~/.bash_profile # or ~/.bashrc; adds R to your PATH

    You can load a higher version of the gcc compiler if required for some R versions.

     $ module load gcc/version-number
  2. You can install any R packages of your choice. This requires adding the library path to your .Rprofile.

    You can have more than one R package directory.
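For example, you could create a personal package library and point .Rprofile at it; the Rlibs location below is an illustrative choice, not a CUBIC convention:

```shell
mkdir -p /cbica/home/username/Rlibs   # replace username with your own
# prepend the personal library so install.packages() and library() use it first
echo '.libPaths(c("/cbica/home/username/Rlibs", .libPaths()))' >> ~/.Rprofile
```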

  3. You can also use RStudio on CUBIC by simply loading it as a module:

       $ module load R-studio/1.1.456
       $ rstudio & # enjoy the R and Rstudio, it works

Alternatively, you can use containers:

The neuroR container on Docker Hub has R and many neuroimaging packages installed, and is also available as an environment module on CUBIC:

module load neuroR/0.2.0 # will load R 4.1
RStudio (with the same neuroimaging packages as neuroR) is also available on Docker Hub, but not as an environment module, so you need to pull it yourself before running it:
    singularity pull docker://pennsive/rstudio:4.1
    # see for more on running services in singularity
    # command follows format:
    # [command]                                                     [image]           [name of instance]
    singularity instance start -e -B $TMPDIR:/var -B $HOME:/root    rstudio_4.1.sif   my-running-rstudio
    # $PORT must be the number you used to create the ssh tunnel, e.g. ssh -q -L ${PORT}:localhost:${PORT} user@cubic-login
    SINGULARITYENV_PORT=$PORT singularity run instance://my-running-rstudio
    # other singularity service commands:
    singularity instance list
    singularity instance stop --all

CPUs, Nodes, & Memory

CUBIC has:

  • 168 compute nodes

  • 4840 CPUs

  • 58 TB of RAM

It is suggested to use up to 20 CPUs per node, with the RAM depending on the size of the jobs. 20 CPUs is suggested as a safe estimate because there are approximately 20 CPUs per node.

Specifying CPUs on a node

In order to prevent your jobs from dying without the cluster giving errors or warnings, there are several steps that can be taken:

  1. Include -e in the code to make sure that the environment is clean. It is also important to check the .e log for warnings that indicate whether the environment is corrupted.
  2. Check for a core dump to identify jobs that did not complete: if there is a core.XXX file, the job definitely exited abnormally.
  3. Some jobs may be killed on CUBIC if the job is allocated to a node where the number of CPUs specified in the code is less than the total available CPUs on that node. While it is not possible to select a particular node on CUBIC, you can specify requirements at submission so that the job is matched to suitable nodes. The number of CPUs can be specified during submission as follows:

    a. qsub -pe threaded N -l h_vmem=XG,s_vmem=YG, where X and Y are numbers and N is the number of CPUs. h_vmem is the hard limit on the memory the job may consume, and s_vmem is the soft virtual memory limit, i.e. the minimum requested to run the job.

    b. qsub -pe threaded N-M, where N-M specifies a range of CPUs with M > N.
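Pulled together, the same requests can also live at the top of the job script itself as #$ directives; the numbers below are illustrative:

```shell
#!/bin/bash
#$ -pe threaded 4               # request 4 CPUs on a single node
#$ -l h_vmem=10.5G,s_vmem=10G   # hard and soft memory limits
#$ -e logs/                     # the .e log to check for environment warnings
#$ -o logs/

echo "running with $NSLOTS slots on $(hostname)"
```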

Errors with Allocating Memory/Memory Overflow

Here is an example of a memory allocation error message:

mmap cannot allocate memory failed (/gpfs/fs001/cbica/projects/RBC/Pipeline_Timing/cpac_1.7.1.simg), reading buffer sequentially…

If you see this:

  • Make sure in this case that everything is in the right directory.

  • Make sure that the allocation of memory is specified. Example: mem_gb 20

  • Make sure that the memory is being requested from the cluster itself and not just specified in the code: qsub -l h_vmem=22.5G,s_vmem=22G

Note that h_vmem adds 2.5 GB to the original mem_gb specification. This keeps a safe margin, because the cluster will kill any job that uses more than the requested hard memory (h_vmem). Hard requests help save space on the cluster so that several jobs can run simultaneously, but they are only advised when you are sure of the memory your job needs.

Note that s_vmem adds only 2 GB to the original mem_gb specification, since soft limits are more flexible than hard ones. Use soft requests when the exact memory required for each subject is not precisely known, to reduce the risk of jobs being killed by accident.
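The headroom arithmetic above is easy to mistype on the command line. As a hypothetical helper (not part of any CUBIC tooling), the mapping from a pipeline's mem_gb to qsub flags looks like this:

```python
def qsub_mem_flags(mem_gb):
    """Translate a pipeline's mem_gb into qsub memory flags using the
    headroom described above: +2.5 GB hard, +2 GB soft."""
    h_vmem = mem_gb + 2.5  # hard limit: the job is killed above this
    s_vmem = mem_gb + 2.0  # soft limit: the minimum requested to run
    return '-l h_vmem={:g}G,s_vmem={:g}G'.format(h_vmem, s_vmem)

print(qsub_mem_flags(20))  # -l h_vmem=22.5G,s_vmem=22G
```

For mem_gb 20 this reproduces the qsub request shown above.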

Additional information about CUBIC

This page has tons of other useful information about using CUBIC. Anyone who plans on using CUBIC regularly should probably browse it. Also, when troubleshooting, make sure the answer to your question isn’t on this page before asking others. Note that you will need to be within the UPenn infrastructure (i.e. on campus or using a VPN) to view this page.