From 4a815cb81adfa4585b4917ad804d35f6fbed7751 Mon Sep 17 00:00:00 2001 From: Fonti Kar Date: Thu, 15 Aug 2024 14:15:07 -0700 Subject: [PATCH 1/3] Very rough outline with Sierra #915 --- vignettes/purrr.Rmd | 43 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 vignettes/purrr.Rmd diff --git a/vignettes/purrr.Rmd b/vignettes/purrr.Rmd new file mode 100644 index 00000000..cc34e2c7 --- /dev/null +++ b/vignettes/purrr.Rmd @@ -0,0 +1,43 @@ +--- +title: "purrr" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{purrr} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) + +library(purrr) +``` + +[Big picture] + +The purrr package makes applying your functions to multiple elements of a list or data frame easy but you don't need a `for` loop. + + +### The purrr function families + +- `map` apply function mulitple times/ multplie outputs +- `reduce` 1 output +- `predicate` TRUE/FALSE logical output + +### `map()` + +Detailled map example + +### `reduce`()` + +Detailled map example + +### `predicate`()` + +Detailled map example + + + From aa34a27333335baaca4efb450da055b2a6f2f73b Mon Sep 17 00:00:00 2001 From: Sierra Johnson Date: Thu, 15 Aug 2024 14:57:33 -0700 Subject: [PATCH 2/3] map example started --- vignettes/purrr.Rmd | 41 +++++++++++++++++++++++++++++++++++------ 1 file changed, 35 insertions(+), 6 deletions(-) diff --git a/vignettes/purrr.Rmd b/vignettes/purrr.Rmd index cc34e2c7..34c5ed46 100644 --- a/vignettes/purrr.Rmd +++ b/vignettes/purrr.Rmd @@ -27,17 +27,46 @@ The purrr package makes applying your functions to multiple elements of a list o - `reduce` 1 output - `predicate` TRUE/FALSE logical output -### `map()` +### `map()` family -Detailled map example +Map is used to apply the same function multiple times. It can work on lists, data frames, and other things. 
The first argument, `.x` is the object, the second argument, `.f` is the function you want to apply. Here is a simple example of how map is used. -### `reduce`()` +```{r} +x <- list(1,2,3) -Detailled map example +map(.x = x, .f = sqrt) -### `predicate`()` +``` +However, the example above isn't that useful because the data could have easily been a vector. The `map` functionality becomes more important when you consider a more complex object like a data frame and a function that doesn't work with a regular mutate. We can create a custom function, then apply that to a column in mtcars. + +```{r} + + +``` + +Often, it's easier to describe the function inside the map call, this is when you can create an anonymous function using `~`. + +In this more useful example, the base R function `split` is used to create a list of data frames. `map` is then used to fit a regression model with the `lm` function for each group. Note that the first time `~` appears, it's creating the anonymous function, then it is used within `lm` as part of the formula. + +```{r} -Detailled map example +by_cyl <- split(mtcars, mtcars$cyl) +by_cyl |> + map(~ lm(mpg ~ wt, data = .x)) |> + map(coef) |> + map_dbl(2) + +``` + +`map` takes only one argument and always outputs a list. If you want to use multiple arguments, variants such as `map2` and `pmap` will work. If you want to output something other than a list, there are suffixs such as `_chr` and `_dbl`. 
+ + +### `reduce`()` + +Detailled reduce example + +### `predicate`()` +Example here From 509f124d3f327128895e0fe1ac22b70130f20011 Mon Sep 17 00:00:00 2001 From: Sierra Johnson Date: Fri, 23 Aug 2024 21:36:51 -0600 Subject: [PATCH 3/3] filling out the rest of the outline --- vignettes/purrr.Rmd | 79 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 72 insertions(+), 7 deletions(-) diff --git a/vignettes/purrr.Rmd b/vignettes/purrr.Rmd index 34c5ed46..0924d96f 100644 --- a/vignettes/purrr.Rmd +++ b/vignettes/purrr.Rmd @@ -23,7 +23,7 @@ The purrr package makes applying your functions to multiple elements of a list o ### The purrr function families -- `map` apply function mulitple times/ multplie outputs +- `map` apply function multiple times/ multiple outputs - `reduce` 1 output - `predicate` TRUE/FALSE logical output @@ -37,10 +37,12 @@ x <- list(1,2,3) map(.x = x, .f = sqrt) ``` + However, the example above isn't that useful because the data could have easily been a vector. The `map` functionality becomes more important when you consider a more complex object like a data frame and a function that doesn't work with a regular mutate. We can create a custom function, then apply that to a column in mtcars. ```{r} +# Simple example here. But haven't found one to copy from the books. ``` @@ -52,21 +54,84 @@ In this more useful example, the base R function `split` is used to create a lis by_cyl <- split(mtcars, mtcars$cyl) -by_cyl |> - map(~ lm(mpg ~ wt, data = .x)) |> - map(coef) |> +by_cyl %>% + map(~ lm(mpg ~ wt, data = .x)) %>% + map(coef) %>% map_dbl(2) ``` -`map` takes only one argument and always outputs a list. If you want to use multiple arguments, variants such as `map2` and `pmap` will work. If you want to output something other than a list, there are suffixs such as `_chr` and `_dbl`. +`map` takes only one argument and always outputs a list. If you want to use multiple arguments, variants such as `map2` and `pmap` will work. 
If you want to output something other than a list, there are suffixes such as `_chr` and `_dbl`. + +`map_vec` is a special use case ... +```{r} + +# map_vec example here + +``` + +Special Note: Progress bar ... seriously, how do we emphasize this, it's going to change my life. +When you start using `purrr` functions on large datasets or with complex functions, it can be hard to tell whether your code is still working because it takes a while to run. Use the `.progress` argument to show a progress bar in your mapping functions. Set `.progress = TRUE` for a basic bar; we recommend passing a short string instead, which gives the progress bar a name. +```{r} + +# simple progress bar example. + +``` + +Progress bars can have a lot more functionality, which you should read about here... ### `reduce`()` -Detailled reduce example +Reduce combines the elements of a vector, `.x`, into a single value using the `.f` function. Like `map`, the simplest use case doesn't really demonstrate why it's valuable. +```{r} + +reduce(1:4, `+`) + +reduce(1:4, union) + +``` +As we start looking at the more complex use cases, the `accumulate` variant can be helpful for understanding what is happening. `accumulate` works the same as `reduce`, but it also returns the intermediate results. If we call `accumulate` on the examples above, it's easier to see how the elements are being combined sequentially. +```{r} + +accumulate(1:4, `+`) + +accumulate(1:4, union) + +``` + +Similar to map, we can think about how reduce can save us from having to use a `for` loop .... + +```{r} + +# Use map to generate sample data +l <- map(1:4, ~ sample(1:10, 15, replace = TRUE)) + +# For loop to find values that occur in every element +out <- l[[1]] +for (i in seq(2, length(l))) { + out <- intersect(out, l[[i]]) +} +out + +# Same functionality with reduce +reduce(l, intersect) + +``` ### `predicate`()` -Example here +Is this all we want to show? Is there another example that would be good? 
+ +```{r} + +df <- data.frame(x = 1:3, y = factor(c("a", "b", "c"))) +detect(df, is.factor) +detect_index(df, is.factor) + +str(keep(df, is.factor)) +str(discard(df, is.factor)) + +``` +
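
One possible answer to the open question in the predicate section is an example on a plain list rather than a data frame. The block below is a rough sketch, not part of the patch: `scores`, `has_values`, and `all_positive` are hypothetical names made up for illustration. It shows `keep()`, `discard()`, `detect()`, and `detect_index()` alongside `every()` and `some()`, which reduce a whole list to a single `TRUE`/`FALSE`.

```r
library(purrr)

# Hypothetical sample data: a named list of numeric vectors
scores <- list(a = c(1, 2, 3), b = numeric(0), c = c(10, 20), d = c(-5, 5))

# Hypothetical predicates: each returns a single TRUE or FALSE per element
has_values <- function(x) length(x) > 0
all_positive <- function(x) length(x) > 0 && all(x > 0)

# keep() retains the elements for which the predicate is TRUE
non_empty <- keep(scores, has_values)

# discard() drops the elements for which the predicate is TRUE
not_all_positive <- discard(scores, all_positive)

# detect() returns the first matching element, detect_index() its position
first_positive <- detect(scores, all_positive)
first_positive_at <- detect_index(scores, all_positive)

# every() and some() summarise the whole list as one logical value
every(scores, is.numeric)
some(scores, all_positive)
```

If something like this were adopted, the predicates could also be written inline as anonymous functions, in the same way the `map()` section does with `~`.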