ME/CFS study from Swedish clinical cohort (INMEST): single-level analyses and omics integration.
This project used multiple omics data:
- Plasma protein expression (Olink - NPX values)
- Cell abundance (CyTOF - Grid cell abundance)
- mRNA sequencing (VST counts)
Single-level analyses and omics integration were performed to characterize this heterogeneous ME/CFS cohort and the effects of the INMEST treatment throughout the clinical trial.
Project is created with:
- R version: 3.6.0 (run in RStudio)
- Python version: 2.7
- Unix/Linux
- DESeq2/ contains the script to convert Kallisto estimates and run DESeq2
- MOFA/ contains the script to train the MOFA model and some functions to uncover sources of variation
- Figures/ contains the scripts used for the analyses and figures displayed in the paper
- MEM_clean.R is used for mixed-effect modelling
- gsea_clean.R is used for gene set enrichment analyses and the enrichment plot
- Grid is a supervised learning algorithm based on a t-SNE implementation. Cells in CyTOF samples are first classified manually; machine learning techniques trained on these manual classifications then classify cells in new samples automatically.
- To run Grid, install it using:
$ pip install cellgrid
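To make the idea of "manual classification first, automatic classification afterwards" concrete, here is a minimal Python 3 sketch. It does not use the cellgrid package and does not reproduce Grid's actual model: a plain k-nearest-neighbour vote is swapped in for illustration, and the marker intensities, population names and function names are all hypothetical.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two marker-intensity vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_classify(train_cells, train_labels, new_cell, k=3):
    """Label one cell by majority vote among its k nearest manually gated cells."""
    order = sorted(range(len(train_cells)),
                   key=lambda i: squared_distance(train_cells[i], new_cell))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cell_abundance(labels):
    """Fraction of cells assigned to each population in one sample."""
    counts = Counter(labels)
    total = len(labels)
    return {population: n / total for population, n in counts.items()}

# Hypothetical two-marker intensities for manually classified cells.
manual_cells = [(5.0, 0.2), (4.8, 0.1), (5.2, 0.3),
                (0.3, 4.9), (0.1, 5.1), (0.2, 4.7)]
manual_labels = ["T cell", "T cell", "T cell",
                 "B cell", "B cell", "B cell"]

# Cells from a new, unlabelled sample are classified automatically.
new_sample = [(4.9, 0.4), (0.2, 5.0), (5.1, 0.0), (4.7, 0.2)]
predicted = [knn_classify(manual_cells, manual_labels, cell) for cell in new_sample]
abundance = cell_abundance(predicted)
```

The per-population fractions in `abundance` are the kind of cell abundance values that the downstream scripts take as input.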
- The principles behind DESeq2 are described in Love et al. (2014)
- Kallisto outputs (estimates) were converted into read counts using tximport before running DESeq2 with deseq_run.R
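The conversion step can be pictured as summing transcript-level estimated counts into gene-level counts. The Python 3 sketch below only illustrates that summation; it is not the tximport code and not deseq_run.R. The transcript IDs, gene IDs and the function name are hypothetical, and whether the study used tximport's default settings is an assumption.

```python
from collections import defaultdict

def summarize_to_gene(est_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts."""
    gene_counts = defaultdict(float)
    for transcript, count in est_counts.items():
        gene = tx2gene.get(transcript)
        if gene is None:
            continue  # transcripts without a gene mapping are skipped
        gene_counts[gene] += count
    return dict(gene_counts)

# Hypothetical Kallisto estimated counts for one sample.
est_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0, "tx4": 2.0}
# Hypothetical transcript-to-gene mapping (tx4 is deliberately unmapped).
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_gene(est_counts, tx2gene)

# DESeq2 works on integer counts, so the summed estimates are rounded here.
rounded = {gene: int(round(value)) for gene, value in gene_counts.items()}
```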
- clusterProfiler::gseGO() is used in gsea_clean.R to run hierarchical GSEA and plot the results
- gsea_clean.R also extracts the GSEA results used in GO_plot.R
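For readers unfamiliar with GSEA, the statistic behind an enrichment plot is a running sum over a ranked gene list. The Python 3 sketch below shows the classical weighted running-sum enrichment score in simplified form. It is not the clusterProfiler implementation and it omits permutation testing and normalization; gene names, scores and the function name are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the running-sum enrichment score (largest deviation from zero).

    ranked_genes and scores must be ordered from highest to lowest score.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    hit_total = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if hit_total == 0:
        raise ValueError("all genes in the set have a score of zero")

    running, best = 0.0, 0.0
    for score, hit in zip(scores, in_set):
        if hit:
            running += abs(score) ** weight / hit_total  # step up on a set member
        else:
            running -= 1.0 / n_miss                      # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranked list (for example, ordered by a model estimate).
ranked_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(ranked_genes, scores, {"g1", "g2"})
es_bottom = enrichment_score(ranked_genes, scores, {"g5", "g6"})
```

A set concentrated at the top of the ranking gives a positive score, and a set concentrated at the bottom gives a negative one.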
- Partial-bayesian mixed-effect modelling to account for covariates
- MEM_clean.R extracts confounding variables after modelling for downstream use, for example in GSEA
- Multi-Omics Factor Analysis (MOFA) was used in this study to deconvolute the main sources of variation in the different data sets mentioned above. For more information, see the published methods paper, Argelaguet et al. (2018).
- MOFA is publicly accessible here: https://github.com/bioFAM/MOFA
- spearman_corrmatrix.R uses the cell abundance dataframe built from Grid, sub-setted by active treatments, for correlation
- Option to re-order one matrix in accordance with the other for comparison purposes
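The two steps above (a Spearman correlation matrix, then re-ordering one matrix to match another) can be sketched in plain Python 3 as follows. This is an illustration of the computation, not a port of spearman_corrmatrix.R; the population names, abundance values and function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_matrix(columns):
    """Spearman correlation (Pearson on ranks) for every pair of columns."""
    ranked = {name: average_ranks(values) for name, values in columns.items()}
    names = list(ranked)
    return {a: {b: pearson(ranked[a], ranked[b]) for b in names} for a in names}

def reorder_like(matrix, reference_order):
    """Re-order rows and columns of a matrix to follow another matrix's order."""
    return {a: {b: matrix[a][b] for b in reference_order} for a in reference_order}

# Hypothetical cell abundances across four samples of one treatment subset.
abundance = {
    "CD4 T": [0.30, 0.35, 0.40, 0.45],
    "CD8 T": [0.20, 0.22, 0.25, 0.30],
    "NK": [0.15, 0.12, 0.10, 0.05],
}
corr = spearman_matrix(abundance)
reordered = reorder_like(corr, ["NK", "CD4 T", "CD8 T"])
```

Re-ordering both matrices to one shared order makes side-by-side heatmaps directly comparable cell by cell.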
- Post GSEA, GO_plot.R extracts the set of genes associated with GO terms to plot
- Original counts are used to calculate median expression and log2 ratio for active treatments (covariate)
- Only significant features from the MEM are used
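As a rough illustration of the median and log2 ratio step, the Python 3 sketch below computes a log2 ratio of group medians for one gene. The comparison against a "baseline" group, the pseudocount of 1, the group names and the count values are all assumptions made for the example; GO_plot.R may define the ratio differently.

```python
import math
from statistics import median

def log2_median_ratio(counts_by_group, numerator, denominator, pseudocount=1.0):
    """log2 of (median count in numerator group / median in denominator group).

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    num = median(counts_by_group[numerator]) + pseudocount
    den = median(counts_by_group[denominator]) + pseudocount
    return math.log2(num / den)

# Hypothetical original counts for one gene in two groups of samples.
gene_counts = {"baseline": [7, 3, 5], "active": [15, 31, 23]}

ratio = log2_median_ratio(gene_counts, "active", "baseline")
```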
- Volcano_MEM_clean.R reads in the MEM table generated in MEM_clean.R to build volcano plots of features, with significant ones highlighted and a top sub-set of them labeled
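The bookkeeping behind such a volcano plot (which points are highlighted, which are labeled) can be sketched in Python 3 as below. The feature names, estimates, p-values, the 0.05 cutoff and the number of labels are hypothetical and are not taken from Volcano_MEM_clean.R.

```python
import math

def volcano_table(features, p_cutoff=0.05, n_labels=2):
    """Build plot rows: x = estimate, y = -log10(p), plus highlight/label flags.

    features is a list of (name, estimate, p_value); p-values must be > 0.
    """
    rows = []
    for name, estimate, p_value in features:
        rows.append({
            "feature": name,
            "x": estimate,
            "y": -math.log10(p_value),
            "significant": p_value < p_cutoff,  # highlighted on the plot
            "label": False,                     # set below for the top hits
        })
    significant = sorted((row for row in rows if row["significant"]),
                         key=lambda row: row["y"], reverse=True)
    for row in significant[:n_labels]:
        row["label"] = True
    return rows

# Hypothetical mixed-effect model output: (feature, estimate, p-value).
mem_results = [
    ("protein_a", 1.2, 0.001),
    ("protein_b", -0.8, 0.03),
    ("protein_c", 0.1, 0.6),
    ("protein_d", 0.9, 0.0004),
]
rows = volcano_table(mem_results)
highlighted = [row["feature"] for row in rows if row["significant"]]
labeled = sorted(row["feature"] for row in rows if row["label"])
```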
- visNetwork_clean.R reads in an edge file and a node file derived from TRRUST v2 output to build the disease tolerance network, with a color gradient based on log2 expression and node size based on the number of regulatory interactions
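The node attributes described above can be derived from an edge list as in the following Python 3 sketch. It is an illustration only, not the logic of visNetwork_clean.R: the regulator and target names, the expression values and the blue-to-red gradient are hypothetical. The keys `id`, `value` and `color` follow the column names visNetwork reads from a node table.

```python
from collections import Counter

def gradient_color(value, low, high):
    """Map a value in [low, high] to a hex color from blue (low) to red (high)."""
    fraction = 0.5 if high == low else (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))
    red = int(round(255 * fraction))
    blue = 255 - red
    return "#{:02x}00{:02x}".format(red, blue)

def build_nodes(edges, log2_expression):
    """One node per gene: size from interaction count, color from expression."""
    degree = Counter()
    for regulator, target in edges:
        degree[regulator] += 1
        degree[target] += 1
    low = min(log2_expression.values())
    high = max(log2_expression.values())
    return [
        {"id": gene,
         "value": degree[gene],  # number of regulatory interactions
         "color": gradient_color(log2_expression[gene], low, high)}
        for gene in sorted(degree)
    ]

# Hypothetical regulator -> target pairs and log2 expression values.
edges = [("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneB")]
log2_expression = {"TF1": 1.0, "TF2": 0.5, "geneA": -1.0, "geneB": 0.0}

nodes = build_nodes(edges, log2_expression)
by_id = {node["id"]: node for node in nodes}
```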
To run this project, install it locally using devtools:
$ install.packages('devtools')
$ library(devtools)
$ install_github('Brodinlab/ME-CFS_study')
$ library(ME-CFS_study)