Setup BD Single Cell Genomics Rhapsody Analysis
This post provides instructions for installing the BD Genomics Rhapsody™ Analysis pipeline on a local Linux server.
1. Minimal System Requirements
- Operating system:
- macOS® X or Linux®.
- Microsoft® Windows® is not supported.
- 8-core processor (>16-core recommended)
- 32 GB RAM (128 GB recommended)
- 250 GB free disk space
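As a rough sketch, the minimums listed above can be checked from the shell before installing anything. The helper name `check_min` is made up for this example, and the RAM check reads `/proc/meminfo`, so it assumes Linux; disk space is measured for the current directory only.

```shell
#!/usr/bin/env bash
# Compare this machine against the minimums above:
# 8 cores, 32 GB RAM, 250 GB free disk space.

# check_min LABEL ACTUAL MINIMUM: print OK or LOW for one resource.
# (hypothetical helper, not part of the Rhapsody pipeline)
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "OK   $1: $2 (minimum $3)"
    else
        echo "LOW  $1: $2 (minimum $3)"
    fi
}

# Number of processors currently online.
cores=$(getconf _NPROCESSORS_ONLN)

# MemTotal is reported in kB; convert to whole GB (Linux only).
ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo 2>/dev/null)
ram_gb=${ram_gb:-0}

# Free space, in whole GB, on the file system holding the current directory.
disk_gb=$(df -Pk . | awk 'NR == 2 {print int($4 / 1024 / 1024)}')

check_min "CPU cores" "$cores" 8
check_min "RAM (GB)" "$ram_gb" 32
check_min "free disk (GB)" "$disk_gb" 250
```

The kernel reports slightly less memory than is physically installed, so a machine with exactly 32 GB may show up as LOW here; treat the result as a hint rather than a verdict.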
In this tutorial, we will work on Ubuntu 18.04.5 LTS Server amd64.
Note:
- This pipeline has been tested and works on Ubuntu 18.04 LTS, Ubuntu 20.04 LTS and CentOS 7.x.
- It might not work on other Linux distributions.
2. Install necessary software
$ sudo apt update
$ sudo apt install git cwltool
Note:
The cwltool package from the Ubuntu package repository works well, so it is not necessary to install cwltool/cwl-runner via pip.
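A quick way to confirm that both tools ended up on the PATH is sketched below. The helper name `have_tool` is made up for this example; `command -v` is the standard shell built-in for locating a command.

```shell
#!/usr/bin/env bash
# have_tool NAME: succeed if NAME resolves to a command on the PATH.
# (hypothetical helper, not part of the Rhapsody pipeline)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report where each required tool was found, or that it is missing.
for tool in git cwltool; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```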
3. Install Docker
3.1 Uninstall old versions of Docker if necessary
$ sudo apt remove docker docker-engine docker.io containerd runc
3.2 Install from repository
- First, install the necessary packages:
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release
- Add Docker's official GPG key:
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
- Set up the stable repository:
$ echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- Install Docker Engine
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io
- Verify installation
$ sudo docker run hello-world
This command downloads a test image and runs it in a container.
- Manage Docker as a normal user
$ sudo usermod -aG docker $USER
$ newgrp docker
Now you can run docker commands without sudo.
$ docker run hello-world
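If that command still asks for root, the group change may not be active in the current shell. A minimal check, assuming the group is named docker as in the usermod command above (`in_group` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# in_group GROUP LIST: succeed if GROUP appears in the space-separated LIST.
# (hypothetical helper for illustration)
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# `id -nG` prints the group names of the current process.
if in_group docker "$(id -nG)"; then
    echo "current shell is in the docker group"
else
    echo "docker group not active yet; run 'newgrp docker' or log out and back in"
fi
```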
4. Download BD Genomics Rhapsody image
All BD Genomics Rhapsody images are available from Docker Hub. Here we install the most recent one.
$ docker pull bdgenomics/rhapsody:1.9.1
This takes a while, depending on your network speed. After the download completes, verify it with:
$ docker images
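To script that verification instead of reading the table by eye, the sketch below filters the image list for the expected tag. `has_image` is a hypothetical helper name; the example relies on `docker images --format '{{.Repository}}:{{.Tag}}'`, which prints one repository:tag pair per line.

```shell
#!/usr/bin/env bash
# has_image NAME: read "repository:tag" lines on stdin and
# succeed if one of them is exactly NAME.
# (hypothetical helper for illustration)
has_image() {
    grep -qxF "$1"
}

image="bdgenomics/rhapsody:1.9.1"

# Only query Docker if the client is installed; errors from a
# stopped daemon are silenced and treated as "not found".
if command -v docker >/dev/null 2>&1 &&
   docker images --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | has_image "$image"; then
    echo "$image is available locally"
else
    echo "$image not found; run: docker pull $image"
fi
```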
5. Download CWL and YML files
These files are available from BD Genomics repository at Bitbucket.
$ git clone https://bitbucket.org/CRSwDev/cwl/ bd-cwl
The files in the sub-directory v1.9.1/ are what we need.
The BD Genomics Rhapsody Analysis Pipeline is now completely installed.
6. A pseudo-run
Here is a sample yml file, demo.yml, for the Rhapsody pipeline:
#!/usr/bin/env cwl-runner
cwl:tool: rhapsody
Reads:
 - class: File
   location: "demo_S1_L001_R1_001.fq.gz"
 - class: File
   location: "demo_S1_L001_R2_001.fq.gz"
Reference_Genome:
   class: File
   location: "GRCh38-gencodev29.tar.gz"
Transcriptome_Annotation:
   class: File
   location: "gencodev29.gtf"
Sample_Tags_Version: human
Subsample_Tags: 0.2
For more details and descriptions, see the template file bd-cwl/v1.9.1/template_wta_1.9.1.yml.
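Before launching, it can help to confirm that every file named in the yml actually exists. The sketch below is one way to do that; `list_locations` and `missing_inputs` are hypothetical helper names, the parsing only handles simple `location: "..."` lines like those in demo.yml, and it assumes relative paths are resolved against the directory of the yml file.

```shell
#!/usr/bin/env bash
# list_locations FILE: print the value of every `location:` entry in FILE.
# Lines commented out with '#' are ignored.
list_locations() {
    sed -n 's/^[[:space:]]*location:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1"
}

# missing_inputs FILE: print each referenced path that does not exist.
# Relative paths are checked relative to the directory of FILE.
missing_inputs() {
    local dir path full
    dir=$(dirname "$1")
    list_locations "$1" | while IFS= read -r path; do
        case "$path" in
            /*) full=$path ;;
            *)  full=$dir/$path ;;
        esac
        [ -e "$full" ] || echo "$path"
    done
}

job="demo.yml"
if [ ! -f "$job" ]; then
    echo "$job not found"
else
    missing=$(missing_inputs "$job")
    if [ -n "$missing" ]; then
        echo "$missing" | sed 's/^/missing input: /'
    else
        echo "all inputs referenced in $job are present"
    fi
fi
```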
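Next, launch the pipeline. The file rhapsody_wta_1.9.1.cwl is in bd-cwl/v1.9.1/, so either run the command from that directory or give the full path to the file: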
$ cwltool \
--parallel \
--tmpdir-prefix tmp_ \
--outdir result/ \
rhapsody_wta_1.9.1.cwl \
demo.yml > demo.log 2>&1
Note:
- This pipeline generates a lot of temporary files, so don’t forget to use the option --tmpdir-prefix. Otherwise, all temporary files and directories are stored in the /tmp directory.
- The output directory result/ needs to be created beforehand.
- The final message “Final process status is success” indicates that the pipeline completed successfully.
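Since the launch command redirects everything to demo.log, the success message can be checked after the run with a small script. This is only a sketch: `run_succeeded` is a hypothetical helper name, and the log file name matches the redirect used in the command above.

```shell
#!/usr/bin/env bash
# run_succeeded LOG: succeed if LOG contains the success message
# that cwltool prints at the end of a successful run.
# (hypothetical helper for illustration)
run_succeeded() {
    grep -qF "Final process status is success" "$1"
}

log="demo.log"
if [ ! -f "$log" ]; then
    echo "$log not found; run the cwltool command first"
elif run_succeeded "$log"; then
    echo "pipeline completed successfully"
else
    # Show the end of the log, where the failure is usually reported.
    echo "no success message in $log; last lines:"
    tail -n 5 "$log"
fi
```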