GRPM System 2.0

The GRPM system is an advanced tool designed for the integration and analysis of genetic polymorphism data corresponding to specific biomedical domains. It consists of five modular components that facilitate data retrieval, merging, analysis, and the incorporation of GWAS data.

Introduction

The GRPM system is a Python-based framework for building a comprehensive dataset of human genetic polymorphisms associated with nutrition. By integrating data from multiple sources and using the MeSH ontology as a semantic retrieval tool, the workflow lets researchers investigate genetic variants with significant associations to specified biomedical subjects. The resource was developed primarily to support nutritionists in exploring gene-diet interactions and implementing personalized nutrition strategies.

Installation

You can visualize and query the developed datasets by installing our package via:

pip install git+https://github.com/johndef64/GRPM_system.git

Example queries are available in the tests directory and test.ipynb.
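
After running the pip command, a quick way to confirm the installation is to check that the package can be found by the interpreter. The sketch below assumes the importable module is named pygrpm (after the pygrpm directory in the repository); if the package exposes a different import name, substitute it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# "pygrpm" is an assumed import name, not confirmed by this README.
print("pygrpm importable:", is_installed("pygrpm"))
```

This only checks that the module can be located; it does not import it or download any dataset.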

Workflow Description

The workflow is composed of five distinct modules, each executing a crucial function to assist in the integration and analysis of genetic polymorphism data associated with nutrition. The modules are outlined below:

No.	Module	Description
1.	Dataset Builder	Retrieves and integrates data from the LitVar and PubMed databases in a structured format.
2.	MeSH Term Selection	Extracts a coherent list of MeSH terms for querying the GRPM Dataset, starting from simple collections of biomedical terms (NLP-based).
3.	Dataset Querying	Executes a MeSH query against the GRPM Dataset, extracts the subset of matching entities, and generates a data report.
4.	Gene Prioritization	Analyzes the retrieved data and computes a gene interest index to filter for significant results.
5.	GWAS Data Integration	Merges GWAS data, associating phenotypes and potential risk/effect alleles with the GRPM data (BioBERT-based).
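
To make the querying step (module 3) more concrete, here is a minimal sketch of what "extract the matching subset and generate a report" could look like. It is not the package's implementation: the field names (gene, rsid, pmid, mesh), the toy rows, and the helper functions query_by_mesh and report are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict

# Toy rows standing in for the GRPM Dataset. Field names and values are
# hypothetical placeholders, not the real schema or real data.
ROWS = [
    {"gene": "GENE_A", "rsid": "rs0001", "pmid": 101, "mesh": "Diet"},
    {"gene": "GENE_A", "rsid": "rs0001", "pmid": 102, "mesh": "Obesity"},
    {"gene": "GENE_A", "rsid": "rs0002", "pmid": 103, "mesh": "Neoplasms"},
    {"gene": "GENE_B", "rsid": "rs0003", "pmid": 104, "mesh": "Diet"},
    {"gene": "GENE_C", "rsid": "rs0004", "pmid": 105, "mesh": "Neoplasms"},
]


def query_by_mesh(rows, mesh_terms):
    """Keep only rows whose MeSH term is in the query list (case-insensitive)."""
    wanted = {term.lower() for term in mesh_terms}
    return [row for row in rows if row["mesh"].lower() in wanted]


def report(rows):
    """Count distinct publications and variants per gene in the subset."""
    pmids = defaultdict(set)
    rsids = defaultdict(set)
    for row in rows:
        pmids[row["gene"]].add(row["pmid"])
        rsids[row["gene"]].add(row["rsid"])
    return {g: {"pmids": len(pmids[g]), "rsids": len(rsids[g])} for g in pmids}


subset = query_by_mesh(ROWS, ["Diet", "Obesity"])
summary = report(subset)
print(summary)
```

The per-gene counts here are only a simple summary; the gene interest index computed by module 4 is defined in the corresponding notebook and is not reproduced in this sketch.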

To reproduce our pipeline, execute each module individually by selecting the "Open in Colab" option. Ensure that all necessary dependencies and files are imported. Google Drive synchronization is available.

Each Jupyter notebook includes commands to download and install the necessary dependencies for execution.

Usage

Detailed usage instructions for each module are provided in its Jupyter notebook. Follow them closely and install the Python packages specified for that module.

Updates

The GRPM Dataset available on Zenodo was built from LitVar1, which has since been deprecated and replaced by LitVar2. Module 1 (Dataset Builder) has been updated for compatibility with LitVar2. The other modules in the pipeline remain operational with the original GRPM Dataset from Zenodo.

Requirements

All requirements are listed in requirements.txt and setup.py.
