<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Tidy a Corpus object from the tm package</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body>

<table width="100%" summary="page for tidy.Corpus {tidytext}"><tr><td>tidy.Corpus {tidytext}</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>Tidy a Corpus object from the tm package</h2>

<h3>Description</h3>

<p>Tidy a Corpus object from the tm package. Returns a data frame with one row per document, with a <code>text</code> column containing the document's text, and one column for each local (per-document) metadata tag. For corpus objects from the quanteda package, see <code><a href="corpus_tidiers.html">tidy.corpus</a></code>.
</p>

<h3>Usage</h3>

<pre>
## S3 method for class 'Corpus'
tidy(x, collapse = "\n", ...)
</pre>

<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>x</code></td>
<td>
<p>A Corpus object, such as a VCorpus or PCorpus</p>
</td></tr>
<tr valign="top"><td><code>collapse</code></td>
<td>
<p>A string used to collapse the text within each document (if a document has multiple lines).
Give <code>NULL</code> to not collapse strings, in which case the <code>text</code> column will be a list column if there are multi-line documents.</p>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>Extra arguments, not used</p>
</td></tr>
</table>

<h3>Examples</h3>

<pre>
library(dplyr)   # displaying tbl_dfs

if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)

  # tm package examples
  txt &lt;- system.file("texts", "txt", package = "tm")
  ovid &lt;- VCorpus(DirSource(txt, encoding = "UTF-8"),
                  readerControl = list(language = "lat"))

  ovid
  tidy(ovid)

  # choose different options for collapsing text within each
  # document
  tidy(ovid, collapse = "")$text
  tidy(ovid, collapse = NULL)$text

  # another example from Reuters articles
  reut21578 &lt;- system.file("texts", "crude", package = "tm")
  reuters &lt;- VCorpus(DirSource(reut21578),
                     readerControl = list(reader = readReut21578XMLasPlain))
  reuters

  tidy(reuters)
}
</pre>

<hr /><div style="text-align: center;">[Package <em>tidytext</em> version 0.3.4 <a href="00Index.html">Index</a>]</div>
</body></html>