forked from YuLab-SMU/treedata-book
-
Notifications
You must be signed in to change notification settings - Fork 0
/
11_ggtree_exts-others.Rmd
100 lines (69 loc) · 5.85 KB
/
11_ggtree_exts-others.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
\newpage
# Other ggtree extensions {#chapter11}
```{r include=F}
library(knitr)
opts_chunk$set(message=FALSE, warning=FALSE, eval=TRUE, echo=TRUE, cache=TRUE)
```
The `r Biocpkg("ggtree")` package is a general package for visualizing tree structures and associated data. If you have some special requirements that are not directly provided by `r Biocpkg("ggtree")`, you may need to use one of the extension packages built on top of `r Biocpkg("ggtree")`. For example, the `r CRANpkg("RevGadgets")` package for visualizing the output of the `r pkg_revbayes`, the `r Biocpkg("sitePath")` package for visualizing fixation events on phylogenetic pathways, and the `r Biocpkg("enrichplot")` package for visualizing hierarchical structure of the enriched pathways.
```{r revdep}
rp <- BiocManager::repositories()
db <- utils::available.packages(repo=rp)
x <- tools::package_dependencies('ggtree', db=db,
which = c("Depends", "Imports"),
reverse=TRUE)
print(x)
```
There are `r length(unlist(x))` packages in CRAN or Bioconductor that depend or import `r Biocpkg("ggtree")` and several packages on GitHub that extend `r Biocpkg("ggtree")`. Here we briefly introudce some extension packages, including `r Biocpkg("MicrobiotaProcess")` and `r Biocpkg("tanggle")`.
## Taxonomy annotation using MicrobiotaProcess {#MicrobiotaProcess}
The `r Biocpkg("MicrobiotaProcess")` package provides a LEfSe-like algorithm [@segata_metagenomic_2011] to discover microbiome biomarkers by comparing taxon abundance between different classes. It provides several methods to visualize the analysis result. The `ggdiffcalde()` is developed based on `r Biocpkg("ggtree")` [@yu_ggtree:_2017]. In addition to the `diff_analysis()` result, it also supports a data frame that contains a hierarchical relationship (*e.g.* [taxonomy annotation](#MicrobiotaProcess-taxonomy) or KEGG annotation) with another data frame that contains taxa and factor information and/or pvalue. The following example demonstrates how to use data frames (*i.e.* analysis results) to visualize the differential taxonomy tree. More details can be found on the vignette of the `r Biocpkg("MicrobiotaProcess")` package.
(ref:CRCdiffscap) Visualize differential taxonomy clade.
(ref:CRCdiffcap) **Visualize differential taxonomy clade.**
```{r CRCdiffclade, fig.width=7, fig.height=7, message=FALSE, fig.cap="(ref:CRCdiffcap)", fig.scap="(ref:CRCdiffscap)", out.extra='', warning=FALSE, message=FALSE, results='hide'}
library(MicrobiotaProcess)
library(ggplot2)
library(TDbook)
# load `df_difftax` and `df_difftax_info` from TDbook
taxa <- df_alltax_info
dt <- df_difftax
ggdiffclade(obj=taxa,
nodedf=dt,
factorName="DIAGNOSIS",
layout="radial",
skpointsize=0.6,
cladetext=2,
linewd=0.2,
taxlevel=3,
# This argument is to remove the branch of unknown taxonomy.
reduce=TRUE) +
scale_fill_manual(values=c("#00AED7", "#009E73"))+
guides(color = guide_legend(keywidth = 0.1, keyheight = 0.6,
order = 3,ncol=1)) +
theme(panel.background=element_rect(fill=NA),
legend.position="right",
plot.margin=margin(0,0,0,0),
legend.spacing.y=unit(0.02, "cm"),
legend.title=element_text(size=7.5),
legend.text=element_text(size=5.5),
legend.box.spacing=unit(0.02,"cm")
)
```
The data frame of this example is from the analysis result of `diff_analysis()` using public datasets [@kostic2012genomic]. The `colors` represent the features enriched in the relevant class groups. The size of circle points represents the `-log10(pvalue)`, *i.e.* a larger point indicates a greater significance. In Figure \@ref(fig:CRCdiffclade), we can find that *Fusobacterium* sequences were enriched in carcinomas, while Firmicutes, Bacteroides, and Clostridiales were depleted in tumors. These results were consistent with the original article [@kostic2012genomic]. The species of *Campylobacter* has been proven to be associated with colorectal cancer [@He289; @wu2013dysbiosis; @amer2017microbiome]. We can find in Figure \@ref(fig:CRCdiffclade) that *Campylobacter* was enriched in tumors, while its relative abundance is lower than *Fusobacterium*.
## Visualizing phylogenetic network using tanggle
The `r Biocpkg("tanggle")` package provides functions to display split network. It extends the `r Biocpkg("ggtree")` package [@yu_ggtree:_2017] to allow the visualization of phylogenetic networks.
(ref:phylonetworxscap) Phylogenetic network.
(ref:phylonetworxcap) **Phylogenetic network**.
```{r phylonetworx, fig.width=7, fig.height=7, message=FALSE, fig.cap="(ref:phylonetworxcap)", fig.scap="(ref:phylonetworxscap)", out.extra='', warning=FALSE}
library(ggplot2)
library(ggtree)
library(tanggle)
file <- system.file("extdata/trees/woodmouse.nxs", package = "phangorn")
Nnet <- phangorn::read.nexus.networx(file)
ggsplitnet(Nnet) +
geom_tiplab2(aes(color=label), hjust=-.1)+
geom_nodepoint(color='firebrick', alpha=.4) +
scale_color_manual(values=rainbow(15)) +
theme(legend.position="none") +
ggexpand(.1) + ggexpand(.1, direction=-1)
```
## Summary {#summary11}
The `r Biocpkg("ggtree")` is designed to support the grammar of graphics, allowing users to quickly explore phylogenetic data through visualization. When users have special needs and `r Biocpkg("ggtree")` does not provide them, it is highly recommended to develop extension packages to implement these missing functions. This is a good mechanism, and we also hope that `r Biocpkg("ggtree")` users can become a `r Biocpkg("ggtree")` community. In this way, more functions for special needs can be developed and shared among users. Everyone will benefit from it, and I am very excited that this is happening.