NGS-Pipeline Installation

Quick Installation

The NGS-Pipeline can be installed with all its dependencies by running the following command in a terminal/shell:

"${SHELL}" <(curl -L https://raw.githubusercontent.com/vivaxgen/ngs-pipeline/main/install.sh)

Warning

If you are inside an active Conda/Mamba environment, please deactivate it before running the above command, as the Conda/Mamba environment might interfere with the installation process.
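A quick way to check for an active environment is sketched below. It assumes that the activation scripts of Conda/Mamba/micromamba set the CONDA_PREFIX variable, which is the usual behaviour but is only a heuristic; the function name conda_env_active is hypothetical and not part of the NGS-Pipeline.

```shell
# Report whether a Conda/Mamba environment appears to be active.
# Assumption: activation sets CONDA_PREFIX (a heuristic, not a guarantee).
conda_env_active() {
    [ -n "${CONDA_PREFIX:-}" ]
}

if conda_env_active; then
    echo "Active environment detected at ${CONDA_PREFIX}; run 'conda deactivate' first."
else
    echo "No active Conda/Mamba environment detected."
fi
```

If environments have been activated on top of each other, the deactivate command may need to be repeated until the check reports no active environment.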

Warning

If installing under WSL/WSL2, please make sure that the target directory is on a Linux filesystem. The installation will fail if the target directory is on a Windows filesystem, such as /mnt/c or /mnt/d.
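The check can be sketched as a simple path test. This assumes the default WSL layout, in which Windows drives are mounted at /mnt/<drive-letter>. The function name and the target directory below are hypothetical, and custom mount points, relative paths and symbolic links are not handled.

```shell
# Heuristic: WSL mounts Windows drives at /mnt/<drive-letter> by default.
# Returns 0 if the given absolute path looks like a Windows drive mount.
is_windows_mount_path() {
    case "$1" in
        /mnt/[a-zA-Z]|/mnt/[a-zA-Z]/*) return 0 ;;
        *) return 1 ;;
    esac
}

target_dir="${HOME}/ngs-pipeline"   # hypothetical target directory
if is_windows_mount_path "${target_dir}"; then
    echo "${target_dir} looks like a Windows filesystem; choose a Linux path instead."
else
    echo "${target_dir} does not look like a Windows drive mount."
fi
```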

If the MAMBA_ROOT_PREFIX environment variable is set, the install script will use that directory to install and store the micromamba environment. This helps reduce storage use when several micromamba environments are installed on the system.
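As a minimal sketch, the variable can be exported before running the installation command. The directory used here is a hypothetical example, not a location required by the installer. The variable needs to be exported rather than merely assigned, because the installation command starts a child shell.

```shell
# Hypothetical shared location for micromamba environments; adjust as needed.
export MAMBA_ROOT_PREFIX="${HOME}/.local/share/micromamba"
echo "micromamba root prefix: ${MAMBA_ROOT_PREFIX}"
# Next, run the installation command from the Quick Installation section
# in this same shell so that the exported variable is inherited.
```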

Snakemake Profile Setting

The Snakemake profile setting allows the NGS-Pipeline to make the best use of the resources available on the system.

If installed on an HPC/cluster system and properly configured, the NGS-Pipeline will submit the Snakemake workflows through the workload manager available on the system. The currently supported workload managers are SLURM and PBSPro. During installation, the install script tries to detect the presence of SLURM or PBSPro and generates a symbolic link $VVG_BASEDIR/etc/bashrc.d/99-snakemake-profile that points to the correct profile. If the detection fails, the symbolic link can be generated with the following command for SLURM:

$VVGBIN/set-cluster-system.sh --profiledir $VVG_REPODIR/etc/snakemake-profiles/slurm/99-snakemake-profile

For PBSPro, replace slurm with pbspro. The symbolic link can also be created manually using ln as follows:

ln -srf $VVG_REPODIR/etc/snakemake-profiles/slurm/99-snakemake-profile $VVG_BASEDIR/etc/bashrc.d/
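Whichever method is used, the resulting link can be inspected afterwards. The helper below is a hypothetical sketch, not part of the NGS-Pipeline: it prints the link target, or reports an error if the path is not a symbolic link.

```shell
# Print the target of a profile link, or fail if the path is not a symbolic link.
show_profile_link() {
    link_path="$1"
    if [ -L "${link_path}" ]; then
        readlink "${link_path}"
    else
        echo "not a symbolic link: ${link_path}" >&2
        return 1
    fi
}

# Usage (assumes the NGS-Pipeline environment is active, so VVG_BASEDIR is set):
# show_profile_link "${VVG_BASEDIR}/etc/bashrc.d/99-snakemake-profile"
```

Because ln -r creates a relative link, the printed target may be a relative path; readlink -f resolves it to an absolute one.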

For other workload managers, the profile can be set up manually by following the instructions in the section Manual Setup for Snakemake Cluster Profile.
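Before choosing a profile, it can help to confirm which job submission commands are present on the system. The sketch below is not the detection logic used by the install script, which is not described here. It only assumes that SLURM provides sbatch and that PBS-style schedulers provide qsub; the function name is hypothetical.

```shell
# Guess the workload manager from the submission commands found on PATH.
# Assumption: sbatch indicates SLURM; qsub indicates a PBS-style scheduler.
detect_workload_manager() {
    if command -v sbatch >/dev/null 2>&1; then
        echo "slurm"
    elif command -v qsub >/dev/null 2>&1; then
        echo "pbs-like"   # qsub is shared by PBSPro and other schedulers
    else
        echo "none"
    fi
}

detect_workload_manager
```

A result of "pbs-like" does not prove that PBSPro is installed, since other schedulers also ship a qsub command.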

The NGS-Pipeline also provides several preset profiles that can be used on non-HPC/cluster systems such as laptops, desktops, or stand-alone workstations. The following table lists the location of the profiles.

Adding Additional Flags for Job Submission

Some workload managers require additional flags at job script submission for the job script to run properly. For example, PBSPro on some systems might require additional flags to access the shared file system, such as -l storage:shared_fs. To inject additional flags for job script submission, create a resource file at $VVG_BASEDIR/etc/bashrc.d/95-cluster-extra-flags with the following content:

SNAKEMAKE_CLUSTER_EXTRA_FLAGS="-l storage:shared_fs"

Reactivate the NGS-Pipeline environment, and check whether the variable has been set properly using the following command:

echo ${SNAKEMAKE_CLUSTER_EXTRA_FLAGS}

If the above command does not show the argument flags, double-check the location of the newly created resource file.
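The resource file can also be written from the command line. The helper below is a hypothetical sketch that writes the same single line shown above. It assumes the flags contain no double quotes, and it creates the bashrc.d directory if it is missing.

```shell
# Write the extra-flags resource file under <basedir>/etc/bashrc.d.
# Assumption: the flags string contains no double-quote characters.
write_cluster_extra_flags() {
    basedir="$1"
    flags="$2"
    mkdir -p "${basedir}/etc/bashrc.d" &&
    printf 'SNAKEMAKE_CLUSTER_EXTRA_FLAGS="%s"\n' "${flags}" \
        > "${basedir}/etc/bashrc.d/95-cluster-extra-flags"
}

# Usage (assumes the NGS-Pipeline environment is active, so VVG_BASEDIR is set):
# write_cluster_extra_flags "${VVG_BASEDIR}" "-l storage:shared_fs"
```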

Note

The actual argument flags to set in SNAKEMAKE_CLUSTER_EXTRA_FLAGS depend on the workload manager. Please consult the documentation of the respective workload manager or the administrators of the HPC/cluster system.

Manual Setup for Snakemake Cluster Profile

The pipeline relies on the vivaxGEN Base Utility to provide the necessary directory layout, environment settings, and cluster profile settings. If the workload manager installed on your system is not supported out of the box by vvg-base (currently only SLURM and PBSPro are supported), please consult this document to set up the profile manually.

Uninstalling the Pipeline

To uninstall the pipeline, remove the whole installation directory of the pipeline (i.e. $VVG_BASEDIR).
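Because the removal is recursive, it is worth guarding against an unset or empty variable. The helper below is a hypothetical sketch of such a guard, not a script shipped with the pipeline.

```shell
# Remove an installation directory, refusing empty, root, or missing paths.
remove_install_dir() {
    dir="$1"
    case "${dir}" in
        ""|"/") echo "refusing to remove '${dir}'" >&2; return 1 ;;
    esac
    [ -d "${dir}" ] || { echo "not a directory: ${dir}" >&2; return 1; }
    rm -rf -- "${dir}"
}

# Usage (passes an empty string, and therefore refuses, if VVG_BASEDIR is unset):
# remove_install_dir "${VVG_BASEDIR:-}"
```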