Biowulf
- Biowulf User Guide. Note that Biowulf2 runs CentOS (Red Hat) 6.x. (Biowulf1 is at CentOS 5.x.)
Helix vs Biowulf
- https://helix.nih.gov/ Helix is an interactive system for short jobs and is now designated for interactive data transfers, so large transfers should be moved there. For instance, use Helix when transferring hundreds of gigabytes of data or more with any of these commands: cp, scp, rsync, sftp, ascp, etc. (see the example after this list).
- https://hpc.nih.gov/docs/rhel7.html#helix Helix transitioned to being the dedicated interactive data-transfer and file-management node [1], and its hardware was later upgraded to support this role [2]. Running processes such as scp, rsync, tar, and gzip on the Biowulf login node has been discouraged ever since.
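A minimal transfer sketch, assuming a hypothetical local directory mydata/ and a /data/username area on the HPC storage; the host name helix.nih.gov comes from the link above, everything else is illustrative:

scp -r mydata/ username@helix.nih.gov:/data/username/                          # one-off recursive copy via Helix
rsync -av --progress mydata/ username@helix.nih.gov:/data/username/mydata/     # restartable, often preferable for very large trees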
Linux distribution
$ ls /etc/*release            # on the login node
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.8 (Santiago)
$ sinteractive                # switch to a Biowulf2 compute node
$ cat /etc/redhat-release
CentOS release 6.6 (Final)
$ cat /etc/centos-release
CentOS release 6.6 (Final)
Training notes
- https://hpc.nih.gov/training/ Slides, Videos and Handouts from Previous HPC Classes
- https://hpc.nih.gov/docs/ Biowulf User Guides
- https://hpc.nih.gov/docs/B2training.pdf
- https://hpc.nih.gov/docs/biowulf2-beta-handout.pdf
- Online class: Introduction to Biowulf
Storage
/scratch and /lscratch
The /scratch area on Biowulf is a large, low-performance shared area meant for the storage of temporary files.
- Each user can store up to a maximum of 10 TB in /scratch. However, 10 TB of space is not guaranteed to be available at any particular time.
- If the /scratch area is more than 80% full, the HPC staff will delete files as needed, even if they are less than 10 days old.
- Files in /scratch are automatically deleted 10 days after last access.
- Touching files to update their access times is inappropriate and the HPC staff will monitor for any such activity.
- Use /lscratch (not /scratch) when data is to be accessed from large numbers of compute nodes or large swarms.
- The central /scratch area should NEVER be used as a temporary directory for applications -- use /lscratch instead.
- When running RStudio interactively, it is generally recommended to allocate at least a small amount of lscratch for R's temporary storage (see the sketch below).
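A minimal sketch of pairing lscratch with an interactive R session; the sizes are illustrative, and pointing TMPDIR at the job's lscratch directory is an assumption about how you want R to place its temporary files:

sinteractive --mem=8g --gres=lscratch:5    # request 5 GB of node-local scratch with the session
export TMPDIR=/lscratch/$SLURM_JOBID       # inside the session, use the local disk for temp files
module load R
R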
Transfer files
- https://hpc.nih.gov/docs/transfer.html
- Data Management: Best Practices for Groups
- Locally Mounting HPC System Directories
- Globus
User Dashboard
https://hpc.nih.gov/dashboard/
Dashboard
User dashboard: unlock account, check disk usage, view job info.
Quota
checkquota
Environment modules
# What modules are available
module avail
module -d avail                    # default
module avail STAR
module spider bed                  # search by a case-insensitive keyword

# Load a module
module list                        # loaded modules
module load STAR
module load STAR/2.4.1a
module load plinkseq macs bowtie   # load multiple modules
# if we try to load a module in a bash script, we can use the following
module load STAR || exit 1

# Unload a module
module unload STAR/2.4.1a

# Switch to a different version of an application
# If you load a module, then load another version of the same module, the first one will be unloaded.

# Examine a modulefile
$ module display STAR
-----------------------------------------------------------------------------
   /usr/local/lmod/modulefiles/STAR/2.5.1b.lua:
-----------------------------------------------------------------------------
help([[This module sets up the environment for using STAR.
Index files can be found in /fdb/STAR ]])
whatis("STAR: ultrafast universal RNA-seq aligner")
whatis("Version: 2.5.1b")
prepend_path("PATH","/usr/local/apps/STAR/2.5.1b/bin")

# Set up personal modulefiles
# Using modules in scripts
# Shared Modules
Single file - sbatch
- sbatch
- Note that the sbatch command does not support a --module option; with sbatch, the module command has to be put in the script file.
- The script file must start with the line #!/bin/bash.
Don't use the swarm command on a single script file since swarm will treat each line of the script file as an independent command.
sbatch --cpus-per-task=2 --mem=4g --time=24:00:00 MYSCRIPT # Use --time=24:00:00 to increase the wall time from the default 2 hours
An example script file (the Slurm environment variable $SLURM_CPUS_PER_TASK is used within the script to pass the number of threads to the program):
#!/bin/bash
module load novocraft
novoalign -c $SLURM_CPUS_PER_TASK -f s_1_sequence.txt -d celegans -o SAM > out.sam
Multiple files - swarm
- swarm. Remember the -f and --time options. The default walltime is 2 hours.
- Source code and other information on github
swarm -f run.sh --time=24:00:00

# 3 commands run in 3 subjobs, each command requiring 20 GB and 3 threads, allocating 6 cores and 12 CPUs
swarm -t 3 -g 20 -f run_seqtools_vc.sh --module samtools,picard,bwa --verbose 1 --devel
swarm -t 3 -g 20 -f run_seqtools_vc.sh --module samtools,picard,bwa --verbose 1

# To change the default walltime, use --time=24:00:00
swarm -t 8 -g 24 --module tophat,samtools,htseq -f run_master.sh
cat sw3n17156.o
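For reference, a swarm command file is just a plain text file with one independent command per line. A hypothetical run.sh (the file and directory names are illustrative):

$ cat run.sh
fastqc sample1.fastq.gz -o qc/
fastqc sample2.fastq.gz -o qc/
fastqc sample3.fastq.gz -o qc/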
Why the job is pending
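A quick way to investigate is to look at the reason Slurm reports for the pending job (a minimal sketch using standard Slurm/Biowulf commands; 12345678 is a placeholder jobid):

squeue -u $USER -t PENDING                    # the NODELIST(REASON) column shows why the job waits
scontrol show job 12345678 | grep -i reason   # detailed state of one job
sjobs                                         # Biowulf wrapper with similar information
freen                                         # does the requested partition actually have free nodes/CPUs?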
Partition and freen
- https://hpc.nih.gov/docs/b2-userguide.html#cluster_status
- In the script file, we can use $SLURM_CPUS_PER_TASK to represent the number of cpus
- In the swarm command, we can use -t to specify the threads and -g to specify the memory (GB).
Biowulf nodes are grouped into partitions. A partition can be specified when submitting a job. The default partition is 'norm'. The freen command can be used to see free nodes and CPUs, and available types of nodes on each partition.
We may need to run swarm commands on non-default partitions: for example, when few free CPUs are available in the 'norm' partition, when the total time for the bundled commands exceeds the partition walltime limit, or when the default partition norm (whose nodes have a maximum of 120 GB of memory) cannot provide enough memory.
We can run the swarm command on a different partition (the default is 'norm'). For example, to run on the b1 partition (the hardware in b1 looks inferior to norm):
swarm -f outputs/run_seqtools_dge_align -g 20 -t 16 --module tophat,samtools,htseq --time=6:00:00 --partition b1 --verbose 1
Below is sample output from the freen command.
                                   ........Per-Node Resources........
Partition   FreeNds    FreeCPUs     Cores  CPUs  Mem    Disk   Features
--------------------------------------------------------------------------------
norm*       0/301      1624/9488    16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g
unlimited   3/12       202/384      16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g
niddk       1/82       228/2624     16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,niddk
ibfdr       0/184      0/5888       16     32    60g    800g   cpu32,core16,g64,ssd800,x2650,ibfdr
ibqdr       1/95       32/3040      16     32    29g    400g   cpu32,core16,g32,sata400,x2600,ibqdr
ibqdr       51/89      1632/2848    16     32    60g    400g   cpu32,core16,g64,sata400,x2600,ibqdr
gpu         1/4        116/128      16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,gpuk20x,acemd
gpu         17/20      634/640      16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,gpuk20x
largemem    3/4        254/256      32     64    1007g  800g   cpu64,core32,g1024,ssd800,x4620,10g
nimh        58/60      1856/1920    16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,nimh
ccr         0/85       1540/2720    16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,ccr
ccr         0/63       598/2016     16     32    123g   400g   cpu32,core16,g128,sata400,x2600,1g,ccr
ccr         0/60       974/1920     16     32    60g    400g   cpu32,core16,g64,sata400,x2600,1g,ccr
ccrclin     4/4        128/128      16     32    60g    400g   cpu32,core16,g64,sata400,x2600,1g,ccr
quick       0/85       1540/2720    16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,ccr
quick       0/63       598/2016     16     32    123g   400g   cpu32,core16,g128,sata400,x2600,1g,ccr
quick       0/60       974/1920     16     32    60g    400g   cpu32,core16,g64,sata400,x2600,1g,ccr
quick       1/82       228/2624     16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,niddk
quick       58/60      1856/1920    16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g,nimh
b1          71/363     7584/8712    12     24    21g    100g   cpu24,core12,g24,sata100,x5660,1g
b1          3/287      3410/4592    8      16    21g    200g   cpu16,core8,g24,sata200,x5550,1g
b1          7/16       192/256      8      16    68g    200g   cpu16,core8,g72,sata200,x5550,1g
b1          5/20       300/640      16     32    250g   400g   cpu32,core16,g256,sata400,e2670,1g
b1          3/10       100/160      8      16    68g    100g   cpu16,core8,g72,sata100,x5550,1g
Running R scripts
https://hpc.nih.gov/apps/R.html
Running a swarm of R batch jobs on Biowulf
$ cat Rjobs
R --vanilla < /data/username/R/R1 > /data/username/R/R1.out
R --vanilla < /data/username/R/R2 > /data/username/R/R2.out
swarm -g 16 -f /home/username/Rjobs --module R
Parallelizing with 'parallel'
detectBatchCPUs <- function() {
    ncores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK"))
    if (is.na(ncores)) {
        ncores <- as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE"))
    }
    if (is.na(ncores)) {
        return(4) # for helix
    }
    return(ncores)
}

ncpus <- detectBatchCPUs()
options(mc.cores = ncpus)
mclapply(..., mc.cores = ncpus)
makeCluster(ncpus)
Some experiences
The freen command shows that these nodes have a maximum of 56 threads and 246 GB of memory.
When I run an R script (foreach is used to loop over the simulation runs), I find:
- Assigning 56 threads guarantees that 56 simulations run at the same time (checked with the jobload command).
- We need to worry about the RAM size: the more threads we use, the more memory we need, and if we don't allocate enough memory, cryptic error messages are produced.
- Even though assigning 56 threads lets 56 simulations run at the same time, the wall time is longer than when fewer simulations run concurrently (see the table below and the submission sketch after it).
allocated threads | allocated memory (GB) | number of runs | memory used (GB) | time (min) |
---|---|---|---|---|
56 | 64 | 10 | 30 | 10 |
56 | 64 | 20 | 36 | 13 |
56 | 64 | 56 | 58 | 27 |
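For context, an allocation matching the table above could be requested roughly like this (a hypothetical sketch; sim_job.sh is an illustrative wrapper script, and these were not necessarily the exact options used for the runs above):

sbatch --cpus-per-task=56 --mem=64g --time=24:00:00 sim_job.sh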
Monitor jobs/Delete jobs
https://hpc.nih.gov/docs/userguide.html#monitor
sjobs
watch -n 30 jobload
scancel -u XXXXX
scancel NNNNN
scancel --state=PENDING
scancel --state=RUNNING
squeue -u XXXX
jobhist 17500        # report the CPU and memory usage of completed jobs
Two other very useful commands are jobhist and swarmhist (temporary).
$ cat /usr/local/bin/swarmhist
#!/bin/bash
usage="usage: $0 jobid"
jobid=$1
[[ -n $jobid ]] || { echo $usage; exit 1; }
ret=$(grep "jobid=$jobid" /usr/local/logs/swarm_on_slurm.log)
[[ -n $ret ]] || { echo "no swarm found for jobid = $jobid"; exit; }
echo $ret | tr ';' '\n'

$ jobhist 22038972
$ swarmhist 22038972
date=(SKIP)
host=(SKIP)
jobid=22038972
user=(SKIP)
pwd=(SKIP)
ncmds=1
soptions=--array=0-0 --job-name=swarm --output(SKIP)
njobs=1
job-name=swarm
command=/usr/local/bin/swarm -t 16 -g 20 -f outputs/run_seqtools_vc --module samtools,picard --verbose 1
Show properties of a node
Use freen -n.
$ freen -n
                                      ........Per-Node Resources........
Partition   FreeNds    FreeCPUs       Cores  CPUs  Mem    Disk   Features                               Nodelist
----------------------------------------------------------------------------------------------------------------------------------
norm*       160/454    17562/25424    28     56    248g   400g   cpu56,core28,g256,ssd400,x2695,ibfdr   cn[1721-2203,2900-2955]
norm*       0/476      5900/26656     28     56    250g   800g   cpu56,core28,g256,ssd800,x2680,ibfdr   cn[3092-3631]
norm*       278/309    8928/9888      16     32    123g   800g   cpu32,core16,g128,ssd800,x2650,10g     cn[0001-0310]
norm*       281/281    4496/4496      8      16    21g    200g   cpu16,core8,g24,sata200,x5550,1g       cn[2589-2782,2799-2899]
norm*       10/10      160/160        8      16    68g    200g   cpu16,core8,g72,sata200,x5550,1g       cn[2783-2798]
...
Exit code
https://hpc.nih.gov/docs/b2-userguide.html#exitcodes
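A quick way to inspect the exit code of a finished job (a sketch using the standard Slurm sacct command; 12345678 is a placeholder jobid, and jobhist reports similar information):

sacct -j 12345678 --format=JobID,JobName,State,ExitCode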
Local disk and temporary files
See https://hpc.nih.gov/docs/b2-userguide.html#local and https://hpc.nih.gov/storage/
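A minimal sketch of using node-local scratch in a batch job, assuming lscratch is requested with --gres in the same way as in the sinteractive example further down; MYSCRIPT and the 50 GB size are illustrative:

sbatch --cpus-per-task=4 --mem=16g --gres=lscratch:50 MYSCRIPT
# inside MYSCRIPT, write temporary files under the job's local directory
cd /lscratch/$SLURM_JOBID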
Walltime limits
$ batchlim

Max jobs per user:  4000
Max array size:     1001

Partition       MaxCPUsPerUser     DefWalltime     MaxWalltime
---------------------------------------------------------------
norm                  7360         02:00:00        10-00:00:00
multinode             7560         08:00:00        10-00:00:00
    turbo qos        15064         08:00:00
interactive             64         08:00:00         1-12:00:00     (3 simultaneous jobs)
quick                 6144         02:00:00        04:00:00
largemem               512         02:00:00        10-00:00:00
gpu                    728         02:00:00        10-00:00:00     (56 GPUs per user)
unlimited              128         UNLIMITED       UNLIMITED
student                 32         02:00:00        08:00:00        (2 GPUs per user)
ccr                   3072         04:00:00        10-00:00:00
ccrgpu                 448         04:00:00        10-00:00:00     (32 GPUs per user)
forgo                 5760         1-00:00:00      3-00:00:00
Interactive debugging
The default is 2 CPUs, 4 GB of memory (too small for many tasks), and 8 hours of walltime.
Increase these, e.g. to 60 GB of memory and more cores, when running something like STAR for RNA-seq read alignment.
sinteractive --mem=32g -c 16 --gres=lscratch:100
The '--gres' option will allocate a local disk, 100GB in this case. The local disk directory will be /lscratch/$SLURM_JOBID.
Parallel jobs
Parallel (MPI) jobs that run on more than one node: use the environment variable $SLURM_NTASKS within the script to specify the number of MPI processes.
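A minimal multi-node MPI sketch, under the assumptions that an OpenMPI module is available and the multinode partition is used; mpi_job.sh, my_mpi_prog, and the resource numbers are illustrative:

sbatch --partition=multinode --ntasks=64 --ntasks-per-core=1 mpi_job.sh

where mpi_job.sh contains:

#!/bin/bash
module load openmpi
mpirun -np $SLURM_NTASKS ./my_mpi_prog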
Singularity
Snakemake and Singularity
Building a reproducible workflow with Snakemake and Singularity
R program
https://hpc.nih.gov/apps/R.html
Find available R versions:
module -r avail '^R$'
where -r means to use a regular-expression match. This will match "R/3.5.2" or "R/3.5" but not "Rstudio/1.1.447".
(Self-installed) R package directory
~/lib/R
The directory ~/R/x86_64-pc-linux-gnu-library/ is no longer used on Biowulf.
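To confirm which library directories R actually searches (a sketch; this assumes the ~/lib/R location above is picked up via R_LIBS or the site configuration):

module load R
Rscript -e '.libPaths()'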
SSH tunnel
https://hpc.nih.gov/docs/tunneling/
The use of interactive application servers (such as Jupyter notebooks) on Biowulf compute nodes requires establishing SSH tunnels so that the service becomes accessible from your local workstation.
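A minimal tunneling sketch, assuming a hypothetical Jupyter server listening on port 33333 on compute node cn1234; see the tunneling documentation above for the procedure recommended on Biowulf:

# from the local workstation: forward a local port through the login node to the compute node
ssh -t -L 33333:localhost:33333 username@biowulf.nih.gov ssh -L 33333:localhost:33333 cn1234
# then point a local browser at http://localhost:33333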