EVOLUTION-MANAGER
Edit File: hoist.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Rectangle a nested list into a tidy tibble</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for hoist {tidyr}"><tr><td>hoist {tidyr}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Rectangle a nested list into a tidy tibble</h2> <h3>Description</h3> <a href='https://www.tidyverse.org/lifecycle/#maturing'><img src='figures/lifecycle-maturing.svg' alt='Maturing lifecycle'></a> <p><code>hoist()</code>, <code>unnest_longer()</code>, and <code>unnest_wider()</code> provide tools for rectangling, collapsing deeply nested lists into regular columns. <code>hoist()</code> allows you to selectively pull components of a list-column out in to their own top-level columns, using the same syntax as <code><a href="../../purrr/html/pluck.html">purrr::pluck()</a></code>. <code>unnest_wider()</code> turns each element of a list-column into a column, and <code>unnest_longer()</code> turns each element of a list-column into a row. <code>unnest_auto()</code> picks between <code>unnest_wider()</code> or <code>unnest_longer()</code> based heuristics described below. </p> <p>Learn more in <code>vignette("rectangle")</code>. </p> <h3>Usage</h3> <pre> hoist( .data, .col, ..., .remove = TRUE, .simplify = TRUE, .ptype = list(), .transform = list() ) unnest_longer( data, col, values_to = NULL, indices_to = NULL, indices_include = NULL, names_repair = "check_unique", simplify = TRUE, ptype = list(), transform = list() ) unnest_wider( data, col, names_sep = NULL, simplify = TRUE, names_repair = "check_unique", ptype = list(), transform = list() ) unnest_auto(data, col) </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>.data, data</code></td> <td> <p>A data frame.</p> </td></tr> <tr valign="top"><td><code>.col, col</code></td> <td> <p>List-column to extract components from.</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>Components of <code>.col</code> to turn into columns in the form <code>col_name = "pluck_specification"</code>. You can pluck by name with a character vector, by position with an integer vector, or with a combination of the two with a list. See <code><a href="../../purrr/html/pluck.html">purrr::pluck()</a></code> for details. </p> <p>The column names must be unique in a call to <code>hoist()</code>, although existing columns with the same name will be overwritten. When plucking with a single string you can choose to omit the name, i.e. <code>hoist(df, col, "x")</code> is short-hand for <code>hoist(df, col, x = "x")</code>.</p> </td></tr> <tr valign="top"><td><code>.remove</code></td> <td> <p>If <code>TRUE</code>, the default, will remove extracted components from <code>.col</code>. This ensures that each value lives only in one place.</p> </td></tr> <tr valign="top"><td><code>.simplify, simplify</code></td> <td> <p>If <code>TRUE</code>, will attempt to simplify lists of length-1 vectors to an atomic vector</p> </td></tr> <tr valign="top"><td><code>.ptype, ptype</code></td> <td> <p>Optionally, a named list of prototypes declaring the desired output type of each component. Use this argument if you want to check each element has the types you expect when simplifying.</p> </td></tr> <tr valign="top"><td><code>.transform, transform</code></td> <td> <p>Optionally, a named list of transformation functions applied to each component. Use this function if you want transform or parse individual elements as they are hoisted.</p> </td></tr> <tr valign="top"><td><code>values_to</code></td> <td> <p>Name of column to store vector values. Defaults to <code>col</code>.</p> </td></tr> <tr valign="top"><td><code>indices_to</code></td> <td> <p>A string giving the name of column which will contain the inner names or position (if not named) of the values. Defaults to <code>col</code> with <code style="white-space: pre;">_id</code> suffix</p> </td></tr> <tr valign="top"><td><code>indices_include</code></td> <td> <p>Add an index column? Defaults to <code>TRUE</code> when <code>col</code> has inner names.</p> </td></tr> <tr valign="top"><td><code>names_repair</code></td> <td> <p>Used to check that output data frame has valid names. Must be one of the following options: </p> <ul> <li><p> "minimal": no name repair or checks, beyond basic existence, </p> </li> <li><p> "unique": make sure names are unique and not empty, </p> </li> <li><p> "check_unique": (the default), no name repair, but check they are unique, </p> </li> <li><p> "universal": make the names unique and syntactic </p> </li> <li><p> a function: apply custom name repair. </p> </li> <li> <p><a href="tidyr_legacy.html">tidyr_legacy</a>: use the name repair from tidyr 0.8. </p> </li> <li><p> a formula: a purrr-style anonymous function (see <code><a href="../../rlang/html/as_function.html">rlang::as_function()</a></code>) </p> </li></ul> <p>See <code><a href="../../vctrs/html/vec_as_names.html">vctrs::vec_as_names()</a></code> for more details on these terms and the strategies used to enforce them.</p> </td></tr> <tr valign="top"><td><code>names_sep</code></td> <td> <p>If <code>NULL</code>, the default, the names will be left as is. If a string, the inner and outer names will be paste together using <code>names_sep</code> as a separator.</p> </td></tr> </table> <h3>Unnest variants</h3> <p>The three <code>unnest()</code> functions differ in how they change the shape of the output data frame: </p> <ul> <li> <p><code>unnest_wider()</code> preserves the rows, but changes the columns. </p> </li> <li> <p><code>unnest_longer()</code> preserves the columns, but changes the rows </p> </li> <li> <p><code>unnest()</code> can change both rows and columns. </p> </li></ul> <p>These principles guide their behaviour when they are called with a non-primary data type. For example, if you <code>unnest_wider()</code> a list of data frames, the number of rows must be preserved, so each column is turned into a list column of length one. Or if you <code>unnest_longer()</code> a list of data frame, the number of columns must be preserved so it creates a packed column. I'm not sure how if these behaviours are useful in practice, but they are theoretically pleasing. </p> <h3><code>unnest_auto()</code> heuristics</h3> <p><code>unnest_auto()</code> inspects the inner names of the list-col: </p> <ul> <li><p> If all elements are unnamed, it uses <code>unnest_longer()</code> </p> </li> <li><p> If all elements are named, and there's at least one name in common acros all components, it uses <code>unnest_wider()</code> </p> </li> <li><p> Otherwise, it falls back to <code>unnest_longer(indices_include = TRUE)</code>. </p> </li></ul> <h3>Examples</h3> <pre> df <- tibble( character = c("Toothless", "Dory"), metadata = list( list( species = "dragon", color = "black", films = c( "How to Train Your Dragon", "How to Train Your Dragon 2", "How to Train Your Dragon: The Hidden World" ) ), list( species = "blue tang", color = "blue", films = c("Finding Nemo", "Finding Dory") ) ) ) df # Turn all components of metadata into columns df %>% unnest_wider(metadata) # Extract only specified components df %>% hoist(metadata, "species", first_film = list("films", 1L), third_film = list("films", 3L) ) df %>% unnest_wider(metadata) %>% unnest_longer(films) # unnest_longer() is useful when each component of the list should # form a row df <- tibble( x = 1:3, y = list(NULL, 1:3, 4:5) ) df %>% unnest_longer(y) # Automatically creates names if widening df %>% unnest_wider(y) # But you'll usually want to provide names_sep: df %>% unnest_wider(y, names_sep = "_") # And similarly if the vectors are named df <- tibble( x = 1:2, y = list(c(a = 1, b = 2), c(a = 10, b = 11, c = 12)) ) df %>% unnest_wider(y) df %>% unnest_longer(y) </pre> <hr /><div style="text-align: center;">[Package <em>tidyr</em> version 1.1.2 <a href="00Index.html">Index</a>]</div> </body></html>