Skip to content

Commit

Permalink
export self-adjoin + v0.9.0
Browse files Browse the repository at this point in the history
  • Loading branch information
corybrunson committed Jun 26, 2018
1 parent e3abc53 commit ca234a6
Show file tree
Hide file tree
Showing 13 changed files with 269 additions and 126 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: ggalluvial
Type: Package
Title: Alluvial Diagrams in 'ggplot2'
Version: 0.8.1
Version: 0.9.0
Date: 2018-05-30
Authors@R: person('Jason Cory', 'Brunson', email = 'cornelioid@gmail.com',
role = c('aut', 'cre'))
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ export(lode_leftward)
export(lode_rightleft)
export(lode_rightward)
export(lode_zigzag)
export(self_adjoin)
export(stat_alluvium)
export(stat_flow)
export(stat_stratum)
Expand Down
6 changes: 5 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# ggalluviala 0.8.1
# ggalluviala 0.9.0

## `geom_alluvium()` patch

Expand All @@ -16,6 +16,10 @@ The deprecated parameters `axis_width` (all geom layers) and `ribbon_bend` (`geo

A vignette illustrating two methods for labeling small strata, using other **ggplot2** extensions, is included.

## `self_adjoin()` export

The internal function `self_adjoin()`, invoked by `geom_flow()`, is revised, exported, documented, and exemplified.

# ggalluvial 0.8.0

## Stat layer functionality
Expand Down
105 changes: 53 additions & 52 deletions R/alluvial-data.r
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#' Check for alluvial structure and convert between alluvial formats
#'
#'
#' Alluvial diagrams consist of multiple horizontally-distributed columns (axes)
#' representing factor variables, vertical divisions (strata) of these axes
#' representing these variables' values; and splines (alluvial flows) connecting
Expand All @@ -19,58 +19,59 @@
#' set \code{axes} of fields encodes its values at each axis variable.
#' \code{is_alluvia_form} tests for this structure.
#' }
#'
#'

#' \code{to_lodes_form} takes a data frame with several designated variables to
#' be used as axes in an alluvial diagram, and reshapes the data frame so that
#' the axis variable names constitute a new factor variable and their values
#' comprise another. Other variables' values will be repeated, and a
#' row-grouping variable can be introduced. This function invokes
#' the axis variable names constitute a new factor variable and their values
#' comprise another. Other variables' values will be repeated, and a
#' row-grouping variable can be introduced. This function invokes
#' \code{\link[tidyr]{gather}}.
#'
#'
#' \code{to_alluvia_form} takes a data frame with axis and axis value variables
#' to be used in an alluvial diagram, and reshape the data frame so that the
#' axes constitute separate variables whose values are given by the value
#' variable. This function invokes \code{\link[tidyr]{spread}}.
#'
#'

#' @name alluvial-data
#' @import tidyselect
#' @family alluvial data manipulation
#' @param data A data frame.
#' @param logical Deprecated. Whether to return a logical value or a character
#' string indicating the type of alluvial structure ("none", "lodes", or
#' "alluvia").
#' @param silent Whether to print messages.
#' @param key,value,id In \code{to_lodes_form}, handled as in
#' \code{\link[tidyr]{gather}} and used to name the new axis (key), stratum
#' (value), and alluvium (identifying) variables. In \code{to_alluvia_form},
#' handled as in \code{\link[tidyr]{spread}} and used to identify the fields
#' of \code{data} to be used as the axis (key), stratum (value), and alluvium
#' @param key,value,id In \code{to_lodes_form}, handled as in
#' \code{\link[tidyr]{gather}} and used to name the new axis (key), stratum
#' (value), and alluvium (identifying) variables. In \code{to_alluvia_form},
#' handled as in \code{\link[tidyr]{spread}} and used to identify the fields
#' of \code{data} to be used as the axis (key), stratum (value), and alluvium
#' (identifying) variables.
#' @param axes In \code{*_alluvia_form}, handled as in
#' \code{\link[dplyr]{select}} and used to identify the field(s) of
#' @param axes In \code{*_alluvia_form}, handled as in
#' \code{\link[dplyr]{select}} and used to identify the field(s) of
#' \code{data} to be used as axes.
#' @param ... Used in \code{is_alluvia_form} and \code{to_lodes_form} as in
#' \code{\link[dplyr]{select}} to determine axis variables, as an alternative
#' @param ... Used in \code{is_alluvia_form} and \code{to_lodes_form} as in
#' \code{\link[dplyr]{select}} to determine axis variables, as an alternative
#' to \code{axes}. Ignored when \code{axes} is provided.
#' @param weight Optional field of \code{data}, handled using
#' \code{\link[rlang]{enquo}}, to be used as heights or depths of the alluvia
#' @param weight Optional field of \code{data}, handled using
#' \code{\link[rlang]{enquo}}, to be used as heights or depths of the alluvia
#' or lodes.
#' @param diffuse Fields of \code{data}, handleded using
#' \code{\link[tidyselect]{vars_select}}, to merge into the reshapen data by
#' \code{id}. They must be a subset of the axis variables. Alternatively, a
#' logical value indicating whether to merge all (\code{TRUE}) or none
#' @param diffuse Fields of \code{data}, handleded using
#' \code{\link[tidyselect]{vars_select}}, to merge into the reshapen data by
#' \code{id}. They must be a subset of the axis variables. Alternatively, a
#' logical value indicating whether to merge all (\code{TRUE}) or none
#' (\code{FALSE}) of the axis variables.
#' @param distill A logical value indicating whether to include variables, other
#' than those passed to \code{key} and \code{value}, that vary within values
#' than those passed to \code{key} and \code{value}, that vary within values
#' of \code{id}. Alternatively, a function (or its name) to be used to distill
#' each such variable to a single value. In addition to existing functions,
#' \code{distill} accepts the character values \code{"first"} (used if
#' \code{distill} is \code{TRUE}), \code{"last"}, and \code{"most"} (which
#' each such variable to a single value. In addition to existing functions,
#' \code{distill} accepts the character values \code{"first"} (used if
#' \code{distill} is \code{TRUE}), \code{"last"}, and \code{"most"} (which
#' returns the modal value).
#' @param discern Logical value indicating whether to suffix values of the
#' variables used as axes that appear at more than one variable in order to
#' distinguish their factor levels. This forces the levels of the combined
#' @param discern Logical value indicating whether to suffix values of the
#' variables used as axes that appear at more than one variable in order to
#' distinguish their factor levels. This forces the levels of the combined
#' factor variable \code{value} to be in the order of the axes.
#' @example inst/examples/ex-alluvial-data.r

Expand All @@ -81,21 +82,21 @@ is_lodes_form <- function(data,
weight = NULL,
logical = TRUE, silent = FALSE) {
if (!isTRUE(logical)) deprecate_parameter("logical")

key_var <- vars_pull(names(data), !!rlang::enquo(key))
value_var <- vars_pull(names(data), !!rlang::enquo(value))
id_var <- vars_pull(names(data), !!rlang::enquo(id))

if (any(duplicated(cbind(data[c(key_var, id_var)])))) {
if (!silent) warning("Duplicated id-axis pairings.")
}

n_pairs <-
dplyr::n_distinct(data[key_var]) * dplyr::n_distinct(data[id_var])
if (nrow(data) < n_pairs) {
if (!silent) warning("Missing id-axis pairings.")
}

# if `weight` is not `NULL`, use NSE to identify `weight_var`
if (!is.null(rlang::enexpr(weight))) {
weight_var <- vars_select(names(data), !!rlang::enquo(weight))
Expand All @@ -108,7 +109,7 @@ is_lodes_form <- function(data,
return(if (logical) TRUE else "lodes")
}
}

if (logical) TRUE else "lodes"
}

Expand All @@ -119,7 +120,7 @@ is_alluvia_form <- function(data,
weight = NULL,
logical = TRUE, silent = FALSE) {
if (!isTRUE(logical)) deprecate_parameter("logical")

if (is.null(rlang::enexpr(weight))) {
weight_var <- NULL
} else {
Expand All @@ -132,7 +133,7 @@ is_alluvia_form <- function(data,
return(if (logical) FALSE else "none")
}
}

if (!is.null(rlang::enexpr(axes))) {
axes <- data_at_vars(data, axes)
} else {
Expand All @@ -143,13 +144,13 @@ is_alluvia_form <- function(data,
axes <- unname(vars_select(names(data), !!!quos))
}
}

n_alluvia <- nrow(dplyr::distinct(data[axes]))
n_combns <- do.call(prod, lapply(data[axes], dplyr::n_distinct))
if (n_alluvia < n_combns) {
if (!silent) message("Missing alluvia for some stratum combinations.")
}

if (logical) TRUE else "alluvia"
}

Expand All @@ -159,11 +160,11 @@ to_lodes_form <- function(data,
..., axes = NULL,
key = "x", value = "stratum", id = "alluvium",
diffuse = FALSE, discern = FALSE) {

key_var <- rlang::quo_name(rlang::enexpr(key))
value_var <- rlang::quo_name(rlang::enexpr(value))
id_var <- rlang::quo_name(rlang::enexpr(id))

if (!is.null(rlang::enexpr(axes))) {
axes <- data_at_vars(data, axes)
} else {
Expand All @@ -174,11 +175,11 @@ to_lodes_form <- function(data,
axes <- unname(vars_select(names(data), !!!quos))
}
}

stopifnot(is_alluvia_form(data, axes, silent = TRUE))

if (!is.data.frame(data)) data <- as.data.frame(data)

if (is.logical(rlang::enexpr(diffuse))) {
diffuse <- if (diffuse) axes else NULL
} else {
Expand All @@ -187,7 +188,7 @@ to_lodes_form <- function(data,
stop("All `diffuse` variables must be `axes` variables.")
}
}

# combine factor levels
cat_levels <- unname(unlist(lapply(lapply(data[axes], as.factor), levels)))
if (any(duplicated(cat_levels)) & is.null(discern)) {
Expand All @@ -200,14 +201,14 @@ to_lodes_form <- function(data,
} else {
strata <- unique(unname(cat_levels))
}

# format data in preparation for `gather()`
data[[id_var]] <- 1:nrow(data)
if (!is.null(diffuse)) {
diffuse_data <- data[, c(id_var, diffuse), drop = FALSE]
}
for (i in axes) data[[i]] <- as.character(data[[i]])

# `gather()` by `axes`
res <- tidyr::gather(data,
key = !!key_var, value = !!value_var,
Expand All @@ -218,7 +219,7 @@ to_lodes_form <- function(data,
if (!is.null(diffuse)) {
res <- merge(diffuse_data, res, by = id_var, all.x = FALSE, all.y = TRUE)
}

res
}

Expand All @@ -227,13 +228,13 @@ to_lodes_form <- function(data,
to_alluvia_form <- function(data,
key, value, id,
distill = FALSE) {

key_var <- vars_pull(names(data), !!rlang::enquo(key))
value_var <- vars_pull(names(data), !!rlang::enquo(value))
id_var <- vars_pull(names(data), !!rlang::enquo(id))

stopifnot(is_lodes_form(data, key_var, value_var, id_var, silent = TRUE))

# handle any variables that vary within `id`s
uniq_id <- length(unique(data[[id_var]]))
uniq_data <- unique(data[setdiff(names(data), c(key_var, value_var))])
Expand Down Expand Up @@ -269,14 +270,14 @@ to_alluvia_form <- function(data,
} else {
distill <- NULL
}

# `spread()` by designated `key` and `value`
res <- tidyr::spread(data, key = !!key_var, value = !!value_var)
# recombine with `distill_data`
if (!is.null(distill)) {
res <- merge(distill_data, res, by = id_var, all.x = FALSE, all.y = TRUE)
}

res
}

Expand Down
16 changes: 10 additions & 6 deletions R/geom-flow.r
Original file line number Diff line number Diff line change
Expand Up @@ -91,24 +91,28 @@ GeomFlow <- ggproto(
"colour", "fill", "alpha"))
flow_fore <- if (aes.flow != "backward") flow_aes else NULL
flow_back <- if (aes.flow != "forward") flow_aes else NULL
data <- self_adjoin(data, "x", "alluvium", pair = flow_pos,
keep0 = flow_fore, keep1 = flow_back)
data <- self_adjoin(
data = data, key = "x", by = "alluvium",
link = flow_pos,
keep.x = flow_fore, keep.y = flow_back,
suffix = c(".0", ".1")
)

# aesthetics (in prescribed order)
aesthetics <- intersect(.color_diff_aesthetics, names(data))
# arrange data by aesthetics for consistent (reverse) z-ordering
data <- data[do.call(order, lapply(
data[, c("link", aesthetics)],
data[, c("pair", aesthetics)],
function(x) factor(x, levels = unique(x))
)), ]

# construct spline grobs
xspls <- plyr::alply(data, 1, function(row) {

# spline paths and aesthetics
xspl <- knots_to_xspl(row$xmax0, row$xmin1,
row$ymin0, row$ymax0, row$ymin1, row$ymax1,
row$knot.pos0, row$knot.pos1)
xspl <- knots_to_xspl(row$xmax.0, row$xmin.1,
row$ymin.0, row$ymax.0, row$ymin.1, row$ymax.1,
row$knot.pos.0, row$knot.pos.1)
aes <- as.data.frame(row[flow_aes],
stringsAsFactors = FALSE)[rep(1, 8), ]
f_data <- cbind(xspl, aes)
Expand Down
30 changes: 0 additions & 30 deletions R/geom-utils.r
Original file line number Diff line number Diff line change
Expand Up @@ -7,36 +7,6 @@ rect_to_poly <- function(xmin, xmax, ymin, ymax) {
)
}

# self-adjoin a dataset, pairing some fields and holding others from one end
self_adjoin <- function(data, key, also.by,
pair = NULL, keep0 = NULL, keep1 = NULL) {
# ensure that `key` is coercible to numeric
if (is.character(data[[key]])) data[[key]] <- as.factor(data[[key]])
# self-(inner )join position aesthetics by numeric-coerced `key` and `also.by`
adj <- dplyr::inner_join(
transform(data,
link = as.numeric(data[[key]]))[, c("link", also.by, pair)],
transform(data,
link = as.numeric(data[[key]]) - 1)[, c("link", also.by, pair)],
by = c("link", also.by),
suffix = c("0", "1")
)
# side-join non-position aesthetics
if (!is.null(keep0)) adj <- dplyr::left_join(
adj,
transform(data,
link = as.numeric(data[[key]]))[, c("link", also.by, keep0)],
by = c("link", also.by)
)
if (!is.null(keep1)) adj <- dplyr::left_join(
adj,
transform(data,
link = as.numeric(data[[key]]) - 1)[, c("link", also.by, keep1)],
by = c("link", also.by)
)
adj
}

# x-spline coordinates from 2 x bounds, 4 y bounds, and knot position
knots_to_xspl <- function(x0, x1, ymin0, ymax0, ymin1, ymax1, kp0, kp1) {
x_oneway <- c(x0, x0 + kp0, x1 - kp1, x1)
Expand Down
Loading

0 comments on commit ca234a6

Please sign in to comment.