<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Get the stem of words</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body>

<table width="100%" summary="page for wordStem {SnowballC}"><tr><td>wordStem {SnowballC}</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>Get the stem of words</h2>

<h3>Description</h3>

<p>This function extracts the stem of each word in the given vector.
</p>


<h3>Usage</h3>

<pre>
wordStem(words, language = "porter")
</pre>


<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>words</code></td>
<td>
<p>a character vector of words whose stems are to be extracted.</p>
</td></tr>
<tr valign="top"><td><code>language</code></td>
<td>
<p>the name of a recognized language, as returned by
<code><a href="getStemLanguages.html">getStemLanguages</a></code>, or a two- or three-letter ISO-639
code corresponding to one of these languages (see references for
the list of codes).
</p>
</td></tr>
</table>


<h3>Details</h3>

<p>This function uses Dr. Martin Porter's stemming algorithm and the C libstemmer library
generated by Snowball.
</p>


<h3>Value</h3>

<p>A character vector with as many elements as the input vector, in which
each element is the stem of the corresponding input word. Elements are
converted to UTF-8 encoding before stemming is performed, and the returned
elements are marked as UTF-8 when they contain non-ASCII characters.
</p>


<h3>Author(s)</h3>

<p>Milan Bouchet-Valat</p>


<h3>References</h3>

<p><a href="http://snowball.tartarus.org/">http://snowball.tartarus.org/</a>
</p>
<p><a href="http://www.loc.gov/standards/iso639-2/php/code_list.php">http://www.loc.gov/standards/iso639-2/php/code_list.php</a> for a list of ISO-639 language codes.
</p>


<h3>Examples</h3>

<pre>
# Simple example
wordStem(c("win", "winning", "winner"))

# Test some of the vocabulary supplied at https://github.com/snowballstem/snowball-data
for(lang in getStemLanguages()) {
    load(system.file("words", paste0(lang, ".RData"), package="SnowballC"))
    stopifnot(all(wordStem(dat$words, lang) == dat$stem))
}

stopifnot(is.na(wordStem(NA)))
</pre>

<hr /><div style="text-align: center;">[Package <em>SnowballC</em> version 0.7.0 <a href="00Index.html">Index</a>]</div>
</body></html>