<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>R: Split data frame, apply function, and return results in an...</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head>
<body>

<table width="100%" summary="page for daply {plyr}"><tr><td>daply {plyr}</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>Split data frame, apply function, and return results in an array.</h2>

<h3>Description</h3>

<p>For each subset of a data frame, apply a function, then combine the results into an array. <code>daply</code> with a function that operates column-wise is similar to <code><a href="../../stats/html/aggregate.html">aggregate</a></code>. To apply a function to each row, use <code><a href="aaply.html">aaply</a></code> with <code>.margins</code> set to <code>1</code>.</p>

<h3>Usage</h3>

<pre>
daply(
  .data,
  .variables,
  .fun = NULL,
  ...,
  .progress = "none",
  .inform = FALSE,
  .drop_i = TRUE,
  .drop_o = TRUE,
  .parallel = FALSE,
  .paropts = NULL
)
</pre>

<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>.data</code></td>
<td>
<p>data frame to be processed</p>
</td></tr>
<tr valign="top"><td><code>.variables</code></td>
<td>
<p>variables to split data frame by, as quoted variables, a formula or character vector</p>
</td></tr>
<tr valign="top"><td><code>.fun</code></td>
<td>
<p>function to apply to each piece</p>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>other arguments passed on to <code>.fun</code></p>
</td></tr>
<tr valign="top"><td><code>.progress</code></td>
<td>
<p>name of the progress bar to use, see <code><a href="create_progress_bar.html">create_progress_bar</a></code></p>
</td></tr>
<tr valign="top"><td><code>.inform</code></td>
<td>
<p>produce informative error messages?
This is turned off by default because it substantially slows processing speed, but is very useful for debugging</p>
</td></tr>
<tr valign="top"><td><code>.drop_i</code></td>
<td>
<p>should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default)?</p>
</td></tr>
<tr valign="top"><td><code>.drop_o</code></td>
<td>
<p>should extra dimensions of length 1 in the output be dropped, simplifying the output? Defaults to <code>TRUE</code></p>
</td></tr>
<tr valign="top"><td><code>.parallel</code></td>
<td>
<p>if <code>TRUE</code>, apply function in parallel, using the parallel backend provided by foreach</p>
</td></tr>
<tr valign="top"><td><code>.paropts</code></td>
<td>
<p>a list of additional options passed into the <code><a href="../../foreach/html/foreach.html">foreach</a></code> function when parallel computation is enabled. This is important if (for example) your code relies on external data or packages: use the <code>.export</code> and <code>.packages</code> arguments to supply them so that all cluster nodes have the correct environment set up for computing.</p>
</td></tr>
</table>

<h3>Value</h3>

<p>If results are atomic with the same type and dimensionality, a vector, matrix or array; otherwise, a list-array (a list with dimensions).</p>

<h3>Input</h3>

<p>This function splits data frames by variables.</p>

<h3>Output</h3>

<p>If there are no results, then this function will return a vector of length 0 (<code>vector()</code>).</p>

<h3>References</h3>

<p>Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. <a href="https://www.jstatsoft.org/v40/i01/">https://www.jstatsoft.org/v40/i01/</a>.
</p>

<h3>See Also</h3>

<p>Other array output: <code><a href="aaply.html">aaply</a>()</code>, <code><a href="laply.html">laply</a>()</code>, <code><a href="maply.html">maply</a>()</code>
</p>
<p>Other data frame input: <code><a href="d_ply.html">d_ply</a>()</code>, <code><a href="ddply.html">ddply</a>()</code>, <code><a href="dlply.html">dlply</a>()</code>
</p>

<h3>Examples</h3>

<pre>
daply(baseball, .(year), nrow)

# Several different ways of summarising by variables that should not be
# included in the summary

daply(baseball[, c(2, 6:9)], .(year), colwise(mean))
daply(baseball[, 6:9], .(baseball$year), colwise(mean))
daply(baseball, .(year), function(df) colwise(mean)(df[, 6:9]))
</pre>

<hr /><div style="text-align: center;">[Package <em>plyr</em> version 1.8.7 <a href="00Index.html">Index</a>]</div>
</body></html>