We are going to be using the following tools throughout the course:
fastqc
multiqc
trimmomatic
bwa
samtools
(version > 1.0)bcftools
(version > 1.0)singularity
nextflow
The easiest way to install all of the above is through conda
as follows:
- Download and install the latest version of miniconda:
curl -O -L https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
- Add the correct channels in the conda (so that it knows where to look for packages):
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
- Install the required tools:
conda install -c bioconda fastqc multiqc trimmomatic bwa samtools bcftools singularity nextflow
You might also want to download and install IGV
, as it may be useful in several visualizations.
Finally, the following tools are in all likelihood already installed in most major OSs, but you should also check for them:
git
curl
gunzip
java
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Windows Install R by downloading and running the correct installer file from CRAN. Also, please install the RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.
Linux You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R). Also, please install the RStudio IDE.
We will install some specific packages as well as all packages that might be needed / useful.
install.packages(c("tidyverse", "ggplot2", "rafalib", "caret", "ggpubr", "GGally", "ROCR", "Boruta", "party", "earth", "mlbench", "glmnet", "e1071", "randomForest", "neuralnet"), dependencies=c("Depends", "Suggests") );
Also install bioconductor, and some bioconductor-related libraries.
source("https://bioconductor.org/biocLite.R")
BiocManager::install()
BiocManager::install("factoextra")
BiocManager::install("fpc")
BiocManager::install("knitr")
BiocManager::install("kableExtra")
BiocManager::install("TxDb.Hsapiens.UCSC.hg38.knownGene")
BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")
BiocManager::install("DiffBind")
BiocManager::install('MotifDb')
BiocManager::install('methylKit')
BiocManager::install('genomation')
BiocManager::install('ggplot2')
BiocManager::install('TxDb.Mmusculus.UCSC.mm10.knownGene')
BiocManager::install("AnnotationHub")
BiocManager::install("annotatr")
BiocManager::install("bsseq")
BiocManager::install("DSS")
That's it, all done! You have now all the tools in place!