-
Notifications
You must be signed in to change notification settings - Fork 11
/
20-kidney_differential_expression.Rmd
89 lines (72 loc) · 2.44 KB
/
20-kidney_differential_expression.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: "Kidney differential expression analysis"
output:
html_notebook:
toc: true
toc_float: true
---
**J. Taroni 2018**
In this notebook, we work with microdissected glomeruli data from patients with
nephrotic syndrome, patients with ANCA-associated vasculitis, and living donors.
First, we'll apply the recount2 PLIER model and then we will test those LVs for
differential expression between the three groups.
## Functions and directory set up
```{r}
`%>%` <- dplyr::`%>%`
source(file.path("util", "test_LV_differences.R"))
source(file.path("util", "plier_util.R"))
```
```{r}
# plot and result directory setup for this notebook
plot.dir <- file.path("plots", "20")
dir.create(plot.dir, recursive = TRUE, showWarnings = FALSE)
results.dir <- file.path("results", "20")
dir.create(results.dir, recursive = TRUE, showWarnings = FALSE)
```
## Read in data
### Expression data
```{r}
ercb.file <- file.path("data", "expression_data",
"ERCB_Glom_CustCDF19_forVCRC.txt")
exprs.df <- readr::read_tsv(ercb.file)
exprs.df <- dplyr::select(exprs.df, -EntrezGeneID)
colnames(exprs.df)[1] <- "Gene"
```
```{r}
agg.exprs.df <- PrepExpressionDF(exprs.df)
# any genes that don't have a gene symbol need to be dropped
agg.exprs.df <- dplyr::filter(agg.exprs.df, !(is.na(Gene)))
readr::write_tsv(agg.exprs.df, path = file.path("data", "expression_data",
"ERCB_Glom_mean_agg.pcl"))
# as a matrix for use with PLIER
exprs.mat <- as.matrix(dplyr::select(agg.exprs.df, -Gene))
rownames(exprs.mat) <- agg.exprs.df$Gene
```
### Clinical data
```{r}
clinical.file <- file.path("data", "sample_info", "ERCB_glom_diagnosis.tsv")
diagnosis.df <- readr::read_tsv(clinical.file)
```
## Apply recount2 model
```{r}
# load model itself
recount.plier <- readRDS(file.path("data", "recount2_PLIER_data",
"recount_PLIER_model.RDS"))
# project
recount.b <- GetNewDataB(exprs.mat = as.matrix(exprs.mat),
plier.model = recount.plier)
```
```{r}
# save B matrix to file
b.file <- file.path(results.dir, "ERCB_glom_recount2_B.RDS")
saveRDS(recount.b, file = b.file)
```
## Differential expression
```{r}
LVTestWrapper(b.matrix = recount.b,
sample.info.df = diagnosis.df,
phenotype.col = "Diagnosis",
file.lead = "ERCB_glom_recount2_model",
plot.dir = plot.dir,
results.dir = results.dir)
```