EVOLUTION-MANAGER
Edit File: nest.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Nest and unnest</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for nest {tidyr}"><tr><td>nest {tidyr}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Nest and unnest</h2> <h3>Description</h3> <p>Nesting creates a list-column of data frames; unnesting flattens it back out into regular columns. Nesting is implicitly a summarising operation: you get one row for each group defined by the non-nested columns. This is useful in conjunction with other summaries that work with whole datasets, most notably models. </p> <p>Learn more in <code>vignette("nest")</code>. </p> <h3>Usage</h3> <pre> nest(.data, ..., .names_sep = NULL, .key = deprecated()) unnest( data, cols, ..., keep_empty = FALSE, ptype = NULL, names_sep = NULL, names_repair = "check_unique", .drop = deprecated(), .id = deprecated(), .sep = deprecated(), .preserve = deprecated() ) </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>.data</code></td> <td> <p>A data frame.</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p><<code><a href="tidyr_tidy_select.html">tidy-select</a></code>> Columns to nest, specified using name-variable pairs of the form <code>new_col = c(col1, col2, col3)</code>. The right hand side can be any valid tidy select expression. </p> <p><img src="../help/figures/lifecycle-deprecated.svg" alt='Deprecated lifecycle' />: previously you could write <code>df %>% nest(x, y, z)</code> and <code>df %>% unnest(x, y, z)</code>. Convert to <code>df %>% nest(data = c(x, y, z))</code>. and <code>df %>% unnest(c(x, y, z))</code>. </p> <p>If you previously created new variable in <code>unnest()</code> you'll now need to do it explicitly with <code>mutate()</code>. Convert <code>df %>% unnest(y = fun(x, y, z))</code> to <code>df %>% mutate(y = fun(x, y, z)) %>% unnest(y)</code>.</p> </td></tr> <tr valign="top"><td><code>.key</code></td> <td> <p><img src="../help/figures/lifecycle-deprecated.svg" alt='Deprecated lifecycle' />: No longer needed because of the new <code>new_col = c(col1, col2, col3)</code> syntax.</p> </td></tr> <tr valign="top"><td><code>data</code></td> <td> <p>A data frame.</p> </td></tr> <tr valign="top"><td><code>cols</code></td> <td> <p><<code><a href="tidyr_tidy_select.html">tidy-select</a></code>> Columns to unnest. </p> <p>If you <code>unnest()</code> multiple columns, parallel entries must be of compatible sizes, i.e. they're either equal or length 1 (following the standard tidyverse recycling rules).</p> </td></tr> <tr valign="top"><td><code>keep_empty</code></td> <td> <p>By default, you get one row of output for each element of the list your unchopping/unnesting. This means that if there's a size-0 element (like <code>NULL</code> or an empty data frame), that entire row will be dropped from the output. If you want to preserve all rows, use <code>keep_empty = TRUE</code> to replace size-0 elements with a single row of missing values.</p> </td></tr> <tr valign="top"><td><code>ptype</code></td> <td> <p>Optionally, supply a data frame prototype for the output <code>cols</code>, overriding the default that will be guessed from the combination of individual values.</p> </td></tr> <tr valign="top"><td><code>names_sep, .names_sep</code></td> <td> <p>If <code>NULL</code>, the default, the names will be left as is. In <code>nest()</code>, inner names will come from the former outer names; in <code>unnest()</code>, the new outer names will come from the inner names. </p> <p>If a string, the inner and outer names will be used together. In <code>nest()</code>, the names of the new outer columns will be formed by pasting together the outer and the inner column names, separated by <code>names_sep</code>. In <code>unnest()</code>, the new inner names will have the outer names (+ <code>names_sep</code>) automatically stripped. This makes <code>names_sep</code> roughly symmetric between nesting and unnesting.</p> </td></tr> <tr valign="top"><td><code>names_repair</code></td> <td> <p>Used to check that output data frame has valid names. Must be one of the following options: </p> <ul> <li><p> "minimal": no name repair or checks, beyond basic existence, </p> </li> <li><p> "unique": make sure names are unique and not empty, </p> </li> <li><p> "check_unique": (the default), no name repair, but check they are unique, </p> </li> <li><p> "universal": make the names unique and syntactic </p> </li> <li><p> a function: apply custom name repair. </p> </li> <li> <p><a href="tidyr_legacy.html">tidyr_legacy</a>: use the name repair from tidyr 0.8. </p> </li> <li><p> a formula: a purrr-style anonymous function (see <code><a href="../../rlang/html/as_function.html">rlang::as_function()</a></code>) </p> </li></ul> <p>See <code><a href="../../vctrs/html/vec_as_names.html">vctrs::vec_as_names()</a></code> for more details on these terms and the strategies used to enforce them.</p> </td></tr> <tr valign="top"><td><code>.drop, .preserve</code></td> <td> <p><img src="../help/figures/lifecycle-deprecated.svg" alt='Deprecated lifecycle' />: all list-columns are now preserved; If there are any that you don't want in the output use <code>select()</code> to remove them prior to unnesting.</p> </td></tr> <tr valign="top"><td><code>.id</code></td> <td> <p><img src="../help/figures/lifecycle-deprecated.svg" alt='Deprecated lifecycle' />: convert <code>df %>% unnest(x, .id = "id")</code> to <code style="white-space: pre;">df %>% mutate(id = names(x)) %>% unnest(x))</code>.</p> </td></tr> <tr valign="top"><td><code>.sep</code></td> <td> <p><img src="../help/figures/lifecycle-deprecated.svg" alt='Deprecated lifecycle' />: use <code>names_sep</code> instead.</p> </td></tr> </table> <h3>New syntax</h3> <p>tidyr 1.0.0 introduced a new syntax for <code>nest()</code> and <code>unnest()</code> that's designed to be more similar to other functions. Converting to the new syntax should be straightforward (guided by the message you'll recieve) but if you just need to run an old analysis, you can easily revert to the previous behaviour using <code><a href="nest_legacy.html">nest_legacy()</a></code> and <code><a href="nest_legacy.html">unnest_legacy()</a></code> as follows:</p> <pre>library(tidyr) nest <- nest_legacy unnest <- unnest_legacy </pre> <h3>Grouped data frames</h3> <p><code>df %>% nest(data = c(x, y))</code> specifies the columns to be nested; i.e. the columns that will appear in the inner data frame. Alternatively, you can <code>nest()</code> a grouped data frame created by <code><a href="../../dplyr/html/group_by.html">dplyr::group_by()</a></code>. The grouping variables remain in the outer data frame and the others are nested. The result preserves the grouping of the input. </p> <p>Variables supplied to <code>nest()</code> will override grouping variables so that <code>df %>% group_by(x, y) %>% nest(data = !z)</code> will be equivalent to <code>df %>% nest(data = !z)</code>. </p> <h3>Examples</h3> <pre> df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1) # Note that we get one row of output for each unique combination of # non-nested variables df %>% nest(data = c(y, z)) # chop does something similar, but retains individual columns df %>% chop(c(y, z)) # use tidyselect syntax and helpers, just like in dplyr::select() df %>% nest(data = any_of(c("y", "z"))) iris %>% nest(data = !Species) nest_vars <- names(iris)[1:4] iris %>% nest(data = any_of(nest_vars)) iris %>% nest(petal = starts_with("Petal"), sepal = starts_with("Sepal")) iris %>% nest(width = contains("Width"), length = contains("Length")) # Nesting a grouped data frame nests all variables apart from the group vars library(dplyr) fish_encounters %>% group_by(fish) %>% nest() # Nesting is often useful for creating per group models mtcars %>% group_by(cyl) %>% nest() %>% mutate(models = lapply(data, function(df) lm(mpg ~ wt, data = df))) # unnest() is primarily designed to work with lists of data frames df <- tibble( x = 1:3, y = list( NULL, tibble(a = 1, b = 2), tibble(a = 1:3, b = 3:1) ) ) df %>% unnest(y) df %>% unnest(y, keep_empty = TRUE) # If you have lists of lists, or lists of atomic vectors, instead # see hoist(), unnest_wider(), and unnest_longer() #' # You can unnest multiple columns simultaneously df <- tibble( a = list(c("a", "b"), "c"), b = list(1:2, 3), c = c(11, 22) ) df %>% unnest(c(a, b)) # Compare with unnesting one column at a time, which generates # the Cartesian product df %>% unnest(a) %>% unnest(b) </pre> <hr /><div style="text-align: center;">[Package <em>tidyr</em> version 1.1.2 <a href="00Index.html">Index</a>]</div> </body></html>