This is the official implementation of CI-GWAS as described in our paper: Causal inference for multiple risk factors and diseases from genomics data
cmake >= 3.18
python >= 3.9.6, with numpy >= 1.22.1 and scipy >= 1.11.3
R >= 4.1.0, with dependencies:
install.packages("BiocManager")
BiocManager::install(pkgs=c("graph","Rgraphviz", "RBGL"))
install.packages(c("abind", "igraph", "ggm", "corpcor", "robustbase", "vcd", "Rcpp", "bdsmatrix", "sfsmisc", "fastICA", "clue", "MASS", "Matrix", "mvtnorm", "huge", "ggplot2", "dagitty", "pcalg"))
CI-GWAS is a collection of scripts and compiled programs bundled in a Python command-line interface.
First simply clone the repo:
git clone --recurse-submodules https://github.com/medical-genomics-group/ci-gwas.git
The cli should already be accessible via
./ci-gwas.py
The cusk part of the project has to be compiled:
cd cusk
cmake -S . -B build
cmake --build build
You can then run the tests to check that everything works:
cd build && ctest
Get help:
./ci-gwas.py -h
First of all, make sure that any marker data you want to plug in is LD pruned, or at least does not have markers with a correlation of 1.
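As a rough illustration of the requirement above, the following sketch flags marker pairs that are perfectly correlated. It is not part of CI-GWAS: the function name `perfectly_correlated_pairs`, the tolerance, and the assumption that genotypes are held in memory as an individuals-by-markers NumPy array are all hypothetical. For real data sets a dedicated LD pruning tool is likely the better choice.

```python
import numpy as np

def perfectly_correlated_pairs(genotypes, tol=1e-8):
    """Return index pairs (i, j), i < j, of markers with |correlation| >= 1 - tol.

    genotypes: array of shape (n_individuals, n_markers).
    Assumption: a correlation of -1 is flagged as well, since such markers
    are also perfectly collinear.
    """
    # Columns are markers, so rowvar=False yields a marker x marker matrix.
    corr = np.corrcoef(genotypes, rowvar=False)
    # Keep only the strict upper triangle to skip the diagonal and duplicates.
    upper = np.triu(np.abs(corr), k=1)
    rows, cols = np.nonzero(upper >= 1.0 - tol)
    return [(int(i), int(j)) for i, j in zip(rows, cols)]

# Toy data: marker 2 is an exact copy of marker 0.
genotypes = np.array([
    [0, 1, 0],
    [1, 2, 1],
    [2, 0, 2],
    [1, 1, 1],
    [0, 2, 0],
])
pairs = perfectly_correlated_pairs(genotypes)
print(pairs)
```

Markers without variation produce NaN correlations and are therefore not reported by this check; they need to be handled separately.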
A standard analysis, if you have individual-level data available, consists of subsequent calls to

1. ci-gwas.py prep-bed to compute means and variances of all markers (make sure to exclude any markers that have no recorded variation; the .stds file should not have any 0.0 entries)
2. ci-gwas.py block to block the LD matrix
3. ci-gwas.py cusk (once for each block) to compute skeletons (make sure that the trait values are standardized)
4. ci-gwas.py merge-block-outputs to merge all skeletons
5. ci-gwas.py cuskss-merged to run another pass of cuda-skeleton using only the selected markers. This is optional, but can help to reduce the marker-trait FDR.
6. ci-gwas.py sepselect to find separation sets
7. ci-gwas.py srfci to infer a PAG
8. ci-gwas.py mvivw to run mvivw with the IVs inferred in the skeleton construction
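Two of the steps above come with data requirements: no 0.0 entries among the marker standard deviations, and standardized trait values. The sketch below shows one way to check the first and achieve the second. It assumes the values are already loaded into NumPy arrays; reading the .stds file is not shown because its layout is not described here, and the helper names are hypothetical.

```python
import numpy as np

def zero_variance_markers(stds):
    """Return the indices of markers whose standard deviation is exactly 0.0."""
    return np.flatnonzero(np.asarray(stds, dtype=float) == 0.0).tolist()

def standardize_traits(traits):
    """Center each trait column to mean 0 and scale it to standard deviation 1.

    Assumption: population standard deviation (ddof=0) and no missing values.
    """
    traits = np.asarray(traits, dtype=float)
    return (traits - traits.mean(axis=0)) / traits.std(axis=0)

# Illustrative values only: the second marker has no recorded variation.
stds = [0.71, 0.0, 0.43]
bad = zero_variance_markers(stds)

# Rows are individuals, columns are traits.
traits = np.array([[170.0, 60.0], [180.0, 80.0], [160.0, 70.0]])
z = standardize_traits(traits)
```

Any marker index reported in `bad` should be excluded before running the analysis.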
Alternatively, if you have correlations from summarized data, you can start at step 3) with cuskss instead of cusk. In that case it is important that
- the traits have the same order in the mxp and pxp files
- the markers have the same order in the mxm and mxp files
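The ordering requirements above can be checked before running the analysis. The following is a minimal sketch that assumes the trait (or marker) identifiers belonging to each file are available as Python lists; how they are obtained depends on your own bookkeeping, and `first_order_mismatch` is a hypothetical helper, not part of CI-GWAS.

```python
def first_order_mismatch(ids_a, ids_b):
    """Return the first position where two ID sequences differ, or None if identical."""
    for pos, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return pos
    # All shared positions agree; a length difference is still a mismatch.
    if len(ids_a) != len(ids_b):
        return min(len(ids_a), len(ids_b))
    return None

# Illustrative trait names: the second list swaps two traits.
traits_mxp = ["bmi", "height", "t2d"]
traits_pxp = ["bmi", "t2d", "height"]
mismatch = first_order_mismatch(traits_mxp, traits_pxp)
```

A return value of `None` means the two orderings agree; any other value is the first index at which they diverge.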
In addition, when including binary or ordinal traits, cuskss-het should be used with heterologous correlations instead of cusk or cuskss.
invalid device function, no kernel image is available for execution on the device

These errors can be caused by the chosen device having a lower GPU Compute Capability than the one cusk was compiled for. The Compute Capability targeted by the build is specified in the top-level CMakeLists.txt. If there are multiple devices on the machine and only a subset of them have the appropriate Compute Capability, you can choose one by setting CUDA_VISIBLE_DEVICES=X, where X is the index of the device. The device list can be inspected with nvidia-smi -L.
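When launching CI-GWAS from a script, the device selection described above might be sketched as follows. The helpers `parse_device_list` and `env_for_device` are hypothetical, and the sample listing is made up to resemble the usual `nvidia-smi -L` line format; it does not come from a real machine.

```python
import os
import re

def parse_device_list(listing):
    """Map device index -> device name from `nvidia-smi -L` style output."""
    devices = {}
    for line in listing.splitlines():
        # Expected line shape: "GPU <index>: <name> (UUID: GPU-...)"
        match = re.match(r"GPU (\d+): (.+?) \(UUID: ", line.strip())
        if match:
            devices[int(match.group(1))] = match.group(2)
    return devices

def env_for_device(index, base_env=None):
    """Copy an environment and restrict CUDA to a single device index."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env

# Made-up listing for illustration only.
sample = (
    "GPU 0: NVIDIA GeForce GTX 1080 (UUID: GPU-aaaaaaaa-0000-0000-0000-000000000000)\n"
    "GPU 1: NVIDIA A100-SXM4-40GB (UUID: GPU-bbbbbbbb-1111-1111-1111-111111111111)\n"
)
devices = parse_device_list(sample)
env = env_for_device(1, base_env={})
```

In practice the listing would come from running `nvidia-smi -L`, and the dictionary returned by `env_for_device(1)` (built from the current environment) could be passed as the `env` argument of `subprocess.run` when invoking `./ci-gwas.py`.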