forked from Rita-Li527/494_final_project
-
Notifications
You must be signed in to change notification settings - Fork 0
/
population.Rmd
92 lines (65 loc) · 2.27 KB
/
population.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
# Control for population structure
```{r}
#Load
library(GENESIS)
library(GWASTools)
```
# Data loading and description
To demonstrate PC-AiR and PC-Relate analyses with the GENESIS package, we analyze SNP data from the Mexican Americans in Los Angeles, California (MXL) and African American individuals in the southwestern USA (ASW) population samples of HapMap 3. Mexican Americans and African Americans have a diverse ancestral background, and familial relatives are present in these data. Genotype data at a subset of 20K autosomal SNPs for 173 individuals are provided as a GDS file.
```{r load data}
gdsfile <- system.file("extdata", "HapMap_ASW_MXL_geno.gds", package="GENESIS")
geno <- GdsGenotypeReader(filename = gdsfile)
genoData <- GenotypeData(geno)
nsnp(genoData)
```
# PC-AiR
## LD pruning
```{r}
library(SNPRelate)
# read 12 GDS data
gds <- snpgdsOpen(gdsfile,allow.duplicate = TRUE)
snpset <- snpgdsLDpruning(gds, method="corr",
slide.max.bp=10e6,
ld.threshold=sqrt(0.1), verbose=FALSE)
pruned <- unlist(snpset, use.names=FALSE)
length(pruned)
```
## Simulate Phenotype
```{r}
sample.id <- read.gdsn(index.gdsn(gds, "sample.id"))
# simulation
pheno <- rnorm(173)
mydat <- data.frame(scanID = sample.id,
pheno = pheno)
scanAnnot <- ScanAnnotationDataFrame(mydat)
HapMap_genoData <- GenotypeData(geno, scanAnnot = scanAnnot)
HapMap_genoData
```
```{r}
Kingoutput <- snpgdsIBDKING(gds)
sample.id <- Kingoutput$sample.id
king.matrix <- Kingoutput$kinship
colnames(king.matrix) <- sample.id
partition <- pcairPartition(kinobj = king.matrix ,divobj = king.matrix)
# 91 individuals
related <- partition$rels
# 82 individuals
unrelated <- partition$unrels
getScanID(HapMap_genoData)
related.index <- c()
a <- 0
for(i in 1:173){
name <- getScanID(HapMap_genoData,i)
if (name%in%related){
related.index[a] <- i
a <- a+1
}
}
related.index
#snpID <- getSnpID(HapMap_genoData)
#genotype <- getGenotype(HapMap_genoData, snpID, scan = c(1,2,4,5))
#chromosome <- getChromosome(HapMap_genoData)
#position <- getPosition(HapMap_genoData)
#scanID <- getScanID(HapMap_genoData, related.index)
#MatrixGenotypeReader(genotype=genotype, snpID=snpID, chromosome=chromosome, position=position, scanID=related.index)
```