SPOKEsig-Workshop

The objectives of this workshop are:

1. To demonstrate how to fetch clinical EHR data using Patient Explorer

2. To demonstrate how to use the REST API to fetch the SPOKEsig embedding vectors corresponding to the obtained EHR data

3. To demonstrate how to use SPOKEsig vectors to classify a patient population with a machine learning model

Follow the steps below to complete the workshop.

Step 0: Check that Python is installed on your local machine

Let us first make sure that Python 3 is installed on your local machine. To do so:

Open a terminal (Linux/MacOS) or Command Prompt (Windows)
Copy and paste one of the following commands

(Note: you can copy a command by clicking the icon that appears at the right end of its line):

python --version

or

python3 --version

If you see Python 3.x.x, you are good to go.

Note: We have tested and verified that all code in this repo works on Python 3.6.x, Python 3.7.x and Python 3.8.x

If Python 3 is not installed on your machine, please install one of the versions listed above before proceeding.
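The version check can also be done from inside Python. The following is a minimal sketch, not part of the repo: the function name `is_tested_version` and the constant `TESTED_MINORS` are made up for illustration, and the set of versions simply mirrors the note above.

```python
import sys

# (major, minor) pairs that the note above lists as tested.
TESTED_MINORS = {(3, 6), (3, 7), (3, 8)}


def is_tested_version(version_info=sys.version_info):
    """Return True if (major, minor) is one of the tested Python versions."""
    return (version_info[0], version_info[1]) in TESTED_MINORS


if __name__ == "__main__":
    found = "{}.{}.{}".format(*sys.version_info[:3])
    if is_tested_version():
        print("Python " + found + " is one of the tested versions.")
    else:
        print("Python " + found + " was not tested with this workshop.")
```

A version outside the tested set is not necessarily broken; it has just not been verified.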

Step 1: Clone this repo to your local machine

Open a terminal (Linux/MacOS) or Command Prompt (Windows)
On the command line, copy and paste the following

git clone https://github.com/BaranziniLab/SPOKEsig-Workshop.git

Enter your GitHub credentials if asked.
Check that a folder named "SPOKEsig-Workshop" has been created. If so, Step 1 is complete.

Step 2: Create a virtual environment, activate it and install python modules

First, check whether virtualenv is installed on your local machine. To do so, type the following:

virtualenv --version

If you see a version number on your screen, then you are good to go. Otherwise, type the following:

pip install virtualenv

For Linux/MacOS users

In the terminal, change to the "SPOKEsig-Workshop" folder that you just cloned. You can copy and paste the following:

cd SPOKEsig-Workshop

Once you are inside the folder, let us create a virtual environment. For that, copy-paste the following:

virtualenv -p $(which python3) venv

Activate the created virtual environment. For that, copy-paste the following:

source venv/bin/activate

To install the required python modules for this workshop, copy-paste the following:

pip install -r requirements.txt

For Windows users

In Command Prompt, change to the "SPOKEsig-Workshop" folder that you just cloned. You can copy and paste the following:

cd SPOKEsig-Workshop

Once you are inside the folder, let us create a virtual environment.

For that, use the following syntax (NB: do not copy and paste it as-is).

virtualenv --python "\path\to\python.exe" venv

Change "\path\to\python.exe" to the path of the Python executable on your local machine.

The path usually looks like this (assuming you are using Python 3.6.x):

C:\Users\<user_name>\AppData\Local\Programs\Python\Python36\python.exe

Activate the created virtual environment. For that, copy-paste the following:

.\venv\Scripts\activate

To install the required python modules for this workshop, copy-paste the following:

pip install -r requirements.txt
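Before moving on, it can help to confirm that the interpreter you are using really is the one inside venv. The sketch below is an illustration only (the helper name `in_virtual_environment` is made up); it relies on the standard `sys.prefix` and `sys.base_prefix` attributes, plus `sys.real_prefix`, which older virtualenv releases set.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix. Newer ones, like the
    # standard-library venv module, make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

If this prints False after activation, the terminal is probably still picking up the system-wide Python.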

Step 3: Spin up a Jupyter notebook instance in your local machine

In this workshop, we will run all code in Jupyter notebooks. Type the following in your terminal (Linux/MacOS) or Command Prompt (Windows) to start Jupyter notebook:

jupyter-notebook

This will spin up a Jupyter notebook instance on your local machine and open it in your browser.

To use this instance, you will be prompted to enter a token. The token is printed in your terminal (Linux/MacOS) or Command Prompt (Windows).

Once you enter the token, you will see the contents of the SPOKEsig-Workshop directory in your browser.

Note:

If the notebook doesn't open automatically in your browser, copy the link that appears in your terminal (Linux/MacOS) or Command Prompt (Windows) and paste it into your browser. The link will look like the following:

http://localhost:8888/?token=

Congratulations! You have successfully started Jupyter notebook on your local machine!

Now, let us start running the code.

Step 4: Add API credentials to config file

To run the code successfully, we first need to add API credentials to the workshop.conf file

Open the file workshop.conf in the SPOKEsig-Workshop directory
Change <API USERNAME> and <API PASSWORD> in the [API] section of the config file to the username and password provided to you
Save the config file
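A quick way to confirm that no placeholder was left behind is to parse the file and look for the placeholder strings. This is a minimal sketch under two assumptions: that workshop.conf is an INI-style file readable by Python's `configparser`, and that the placeholders appear as values in the [API] section. The key names in the example text (`username`, `password`) and the helper name `unfilled_api_keys` are hypothetical.

```python
import configparser

# Placeholder strings named in the step above.
PLACEHOLDERS = ("<API USERNAME>", "<API PASSWORD>")


def unfilled_api_keys(config_text):
    """Return the keys in the [API] section whose value still holds a placeholder."""
    # interpolation=None keeps characters such as '%' in a password literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("API"):
        raise ValueError("no [API] section found")
    return [key for key, value in parser.items("API")
            if any(placeholder in value for placeholder in PLACEHOLDERS)]


# Hypothetical file content; the real key names may differ.
example = """
[API]
username = workshop_user
password = <API PASSWORD>
"""
print(unfilled_api_keys(example))  # prints ['password']
```

To check the real file, pass its content instead, for example `unfilled_api_keys(open("workshop.conf").read())`; an empty list means both placeholders were replaced.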

Step 5: Create patient SPOKEsigs

From this step onwards, you need to be connected to the UCSF VPN.

In this step, we will create SPOKE signatures (a.k.a. SPOKEsigs) of patients using their clinical EHR data. For this, we use PatientExplorer to import the patients' clinical data.

Please follow the instructions in the presentation to learn how to import clinical data from PatientExplorer

Once the data is imported to your local machine:

You should have downloaded three CSV files from PatientExplorer to your local machine: a condition file, a medication file and a measurement file. As you might have noticed, these data correspond to three diseases: breast_cancer, colon_cancer and irritable_bowel_syndrome.

Rename these files as:

condition csv file --> breast_colon_ibd_conditions.csv

medication csv file --> breast_colon_ibd_drugs.csv

measurement csv file --> breast_colon_ibd_measurements_w_ab.csv
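The renaming can also be scripted. The sketch below copies each downloaded file into a data folder under the name listed above. It is an illustration only: the helper name `copy_with_workshop_names` is made up, and the paths you pass in depend on how PatientExplorer named your downloads, which this README does not specify.

```python
import shutil
from pathlib import Path

# Target names taken from the mapping above.
TARGET_NAMES = {
    "condition": "breast_colon_ibd_conditions.csv",
    "medication": "breast_colon_ibd_drugs.csv",
    "measurement": "breast_colon_ibd_measurements_w_ab.csv",
}


def copy_with_workshop_names(downloads, data_dir):
    """Copy each downloaded file into data_dir under its workshop name.

    downloads maps "condition", "medication" and "measurement" to the
    paths of the files as they were downloaded.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for kind, target_name in TARGET_NAMES.items():
        source = Path(downloads[kind])
        if not source.is_file():
            raise FileNotFoundError(f"{kind} file not found: {source}")
        destination = data_dir / target_name
        # Copy rather than move, so the original download stays untouched.
        shutil.copy(str(source), str(destination))
        copied.append(destination)
    return copied
```

Call it from the SPOKEsig-Workshop directory with a dictionary of your three download paths and "data" as the second argument.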

Copy these renamed files to the folder /data in the SPOKEsig-Workshop directory
Open the folder /code in the SPOKEsig-Workshop directory
Open the notebook named get_patient_spoke_sig_using_patient_explorer_and_API.ipynb
As the name indicates, this notebook creates SPOKEsigs for the patients whose data was imported from PatientExplorer. The notebook is well commented, so you can see that the SPOKEsigs are obtained by making API calls using the credentials saved in Step 4.
Once you run all sections of this notebook, it saves three files to the /data folder: a SPOKEsig numpy file (random_patient_spokesigs.npy) and two flat files that provide information on the patients (random_patient_info.tsv and example_cohort.tsv)

Congratulations! You have now successfully created and saved SPOKEsigs of patients from PatientExplorer.

Step 6: Let us use the created SPOKEsigs on a machine learning model

We have now created SPOKEsigs of patients from three different disease categories: breast_cancer, colon_cancer and irritable_bowel_syndrome.

Let us see if a machine learning model can make use of these SPOKEsigs and classify patients into these three different disease categories.

Open the notebook named patient_spoke_sig_analysis.ipynb
This notebook is also well commented. However, unlike the previous notebook, this one leaves some sections for you to fill in.

Don't worry! Useful links are provided in the relevant sections for you to refer to.
If you still cannot figure it out, you can always go to the wiki section of this repo and find the missing portions of the code there.

Once you run all sections of this notebook, you can see the performance of a machine learning model (random forest classifier) in classifying patient data into three disease categories using SPOKEsigs as its input.
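To give a feel for this kind of analysis, here is a small sketch of a three-class random forest using scikit-learn. It is not the notebook's code: the data are synthetic random vectors standing in for SPOKEsigs, and the feature count, sample sizes, helper names and hyperparameters are all arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

LABELS = ["breast_cancer", "colon_cancer", "irritable_bowel_syndrome"]


def make_synthetic_signatures(n_per_class=60, n_features=20, seed=0):
    """Build a stand-in matrix: one row per patient, one label per row."""
    rng = np.random.RandomState(seed)
    rows, labels = [], []
    for class_index, label in enumerate(LABELS):
        # Shift the mean of each class so the classes are easy to separate.
        rows.append(rng.normal(loc=2.0 * class_index, scale=1.0,
                               size=(n_per_class, n_features)))
        labels.extend([label] * n_per_class)
    return np.vstack(rows), np.array(labels)


def train_and_score(X, y, seed=0):
    """Fit a random forest on a stratified split and return held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


X, y = make_synthetic_signatures()
print("Held-out accuracy on synthetic data:", train_and_score(X, y))
```

The synthetic classes are deliberately well separated, so the score here says nothing about how well real SPOKEsigs separate the three disease categories.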

Congratulations! You have successfully completed the hands-on session of this workshop.
