<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Separate a character column into multiple columns with a...</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body>

<table width="100%" summary="page for separate {tidyr}"><tr><td>separate {tidyr}</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>Separate a character column into multiple columns with a regular expression or numeric locations</h2>

<h3>Description</h3>

<p>Given either a regular expression or a vector of character positions,
<code>separate()</code> turns a single character column into multiple columns.
</p>

<h3>Usage</h3>

<pre>
separate(
  data,
  col,
  into,
  sep = "[^[:alnum:]]+",
  remove = TRUE,
  convert = FALSE,
  extra = "warn",
  fill = "warn",
  ...
)
</pre>

<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>data</code></td>
<td>
<p>A data frame.</p>
</td></tr>
<tr valign="top"><td><code>col</code></td>
<td>
<p>Column name or position. This is passed to
<code><a href="../../tidyselect/html/vars_pull.html">tidyselect::vars_pull()</a></code>.
</p>
<p>This argument is passed by expression and supports
<a href="../../rlang/html/nse-force.html">quasiquotation</a> (you can unquote column
names or column positions).</p>
</td></tr>
<tr valign="top"><td><code>into</code></td>
<td>
<p>Names of the new variables to create, as a character vector.
Use <code>NA</code> to omit a variable from the output.</p>
</td></tr>
<tr valign="top"><td><code>sep</code></td>
<td>
<p>Separator between columns.
</p>
<p>If character, <code>sep</code> is interpreted as a regular expression. The default
value is a regular expression that matches any sequence of
non-alphanumeric characters.
</p>
<p>If numeric, <code>sep</code> is interpreted as character positions to split at.
Positive values start at 1 at the far-left of the string; negative values start at -1 at
the far-right of the string. The length of <code>sep</code> should be one less than
the length of <code>into</code>.</p>
</td></tr>
<tr valign="top"><td><code>remove</code></td>
<td>
<p>If <code>TRUE</code>, remove the input column from the output data frame.</p>
</td></tr>
<tr valign="top"><td><code>convert</code></td>
<td>
<p>If <code>TRUE</code>, will run <code><a href="../../utils/html/type.convert.html">type.convert()</a></code> with
<code>as.is = TRUE</code> on the new columns. This is useful if the component
columns are integer, numeric or logical.
</p>
<p>NB: this will cause string <code>"NA"</code>s to be converted to <code>NA</code>s.</p>
</td></tr>
<tr valign="top"><td><code>extra</code></td>
<td>
<p>If <code>sep</code> is a character vector, this controls what
happens when there are too many pieces. There are three valid options:
</p>

<ul>
<li><p> "warn" (the default): emit a warning and drop extra values.
</p>
</li>
<li><p> "drop": drop any extra values without a warning.
</p>
</li>
<li><p> "merge": split into at most <code>length(into)</code> pieces, keeping any
remaining text in the last piece.
</p>
</li></ul>
</td></tr>
<tr valign="top"><td><code>fill</code></td>
<td>
<p>If <code>sep</code> is a character vector, this controls what
happens when there are not enough pieces. There are three valid options:
</p>

<ul>
<li><p> "warn" (the default): emit a warning and fill with missing values on the right.
</p>
</li>
<li><p> "right": fill with missing values on the right.
</p>
</li>
<li><p> "left": fill with missing values on the left.
</p>
</li></ul>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>Additional arguments passed on to methods.</p>
</td></tr>
</table>

<h3>See Also</h3>

<p><code><a href="unite.html">unite()</a></code>, the complement;
<code><a href="extract.html">extract()</a></code>, which uses regular
expression capturing groups.
</p>

<h3>Examples</h3>

<pre>
library(dplyr)
# If you want to split by any non-alphanumeric value (the default):
df &lt;- data.frame(x = c(NA, "a.b", "a.d", "b.c"))
df %&gt;% separate(x, c("A", "B"))

# If you just want the second variable:
df %&gt;% separate(x, c(NA, "B"))

# If every row doesn't split into the same number of pieces, use
# the extra and fill arguments to control what happens:
df &lt;- data.frame(x = c("a", "a b", "a b c", NA))
df %&gt;% separate(x, c("a", "b"))
# The same behaviour as previous, but drops the c without warnings:
df %&gt;% separate(x, c("a", "b"), extra = "drop", fill = "right")
# Opposite of previous, keeping the c and filling left:
df %&gt;% separate(x, c("a", "b"), extra = "merge", fill = "left")
# Or you can keep all three:
df %&gt;% separate(x, c("a", "b", "c"))

# To only split a specified number of times use extra = "merge":
df &lt;- data.frame(x = c("x: 123", "y: error: 7"))
df %&gt;% separate(x, c("key", "value"), ": ", extra = "merge")

# Use regular expressions to separate on multiple characters:
df &lt;- data.frame(x = c(NA, "a?b", "a.d", "b:c"))
df %&gt;% separate(x, c("A","B"), sep = "([.?:])")

# convert = TRUE detects column classes:
df &lt;- data.frame(x = c("a:1", "a:2", "c:4", "d", NA))
df %&gt;% separate(x, c("key","value"), ":") %&gt;% str
df %&gt;% separate(x, c("key","value"), ":", convert = TRUE) %&gt;% str
</pre>

<hr /><div style="text-align: center;">[Package <em>tidyr</em> version 1.1.2 <a href="00Index.html">Index</a>]</div>
</body></html>