<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Generate a List with BreakIterator Settings</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for stri_opts_brkiter {stringi}"><tr><td>stri_opts_brkiter {stringi}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Generate a List with BreakIterator Settings</h2> <h3>Description</h3> <p>A convenience function to tune the <span class="pkg">ICU</span> <code>BreakIterator</code>'s behavior in some text boundary analysis functions, see <a href="stringi-search-boundaries.html">stringi-search-boundaries</a>. </p> <h3>Usage</h3> <pre> stri_opts_brkiter( type, locale, skip_word_none, skip_word_number, skip_word_letter, skip_word_kana, skip_word_ideo, skip_line_soft, skip_line_hard, skip_sentence_term, skip_sentence_sep, ... ) </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>type</code></td> <td> <p>single string; either the break iterator type, one of <code>character</code>, <code>line_break</code>, <code>sentence</code>, <code>word</code>; or a custom set of ICU break iteration rules. 
See <a href="stringi-search-boundaries.html">stringi-search-boundaries</a>.</p> </td></tr> <tr valign="top"><td><code>locale</code></td> <td> <p>single string; <code>NULL</code> or <code>""</code> for the default locale</p> </td></tr> <tr valign="top"><td><code>skip_word_none</code></td> <td> <p>logical; perform no action for "words" that do not fit into any other category</p> </td></tr> <tr valign="top"><td><code>skip_word_number</code></td> <td> <p>logical; perform no action for words that appear to be numbers</p> </td></tr> <tr valign="top"><td><code>skip_word_letter</code></td> <td> <p>logical; perform no action for words that contain letters, excluding hiragana, katakana, or ideographic characters</p> </td></tr> <tr valign="top"><td><code>skip_word_kana</code></td> <td> <p>logical; perform no action for words containing kana characters</p> </td></tr> <tr valign="top"><td><code>skip_word_ideo</code></td> <td> <p>logical; perform no action for words containing ideographic characters</p> </td></tr> <tr valign="top"><td><code>skip_line_soft</code></td> <td> <p>logical; perform no action for soft line breaks, i.e., positions where a line break is acceptable but not required</p> </td></tr> <tr valign="top"><td><code>skip_line_hard</code></td> <td> <p>logical; perform no action for hard, or mandatory, line breaks</p> </td></tr> <tr valign="top"><td><code>skip_sentence_term</code></td> <td> <p>logical; perform no action for sentences ending with a sentence terminator ("<code>.</code>", "<code>?</code>", "<code>!</code>"), possibly followed by a hard separator (<code>CR</code>, <code>LF</code>, <code>PS</code>, etc.)</p> </td></tr> <tr valign="top"><td><code>skip_sentence_sep</code></td> <td> <p>logical; perform no action for sentences that do not contain an ending sentence terminator, but are ended by a hard separator or end of input</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>any other arguments to this function are purposely 
ignored</p> </td></tr> </table> <h3>Details</h3> <p>The <code>skip_*</code> family of settings may be used to prevent any special action from being performed on particular types of text boundaries, e.g., in the <code><a href="stri_locate_boundaries.html">stri_locate_all_boundaries</a></code> and <code><a href="stri_split_boundaries.html">stri_split_boundaries</a></code> functions. </p> <p>Note that custom break iterator rules (advanced users only) should be specified as a single string. For a detailed description of the syntax of RBBI rules, please refer to the ICU User Guide on Boundary Analysis. </p> <h3>Value</h3> <p>Returns a named list object. Omitted <code>skip_*</code> values act as if they had been set to <code>FALSE</code>. </p> <h3>References</h3> <p><em><code>ubrk.h</code> File Reference</em> – ICU4C API Documentation, <a href="http://icu-project.org/apiref/icu4c/ubrk_8h.html">http://icu-project.org/apiref/icu4c/ubrk_8h.html</a> </p> <p><em>Boundary Analysis</em> – ICU User Guide, <a href="http://userguide.icu-project.org/boundaryanalysis">http://userguide.icu-project.org/boundaryanalysis</a> </p> <h3>See Also</h3> <p>Other text_boundaries: <code><a href="stri_count_boundaries.html">stri_count_boundaries</a>()</code>, <code><a href="stri_extract_boundaries.html">stri_extract_all_boundaries</a>()</code>, <code><a href="stri_locate_boundaries.html">stri_locate_all_boundaries</a>()</code>, <code><a href="stri_split_boundaries.html">stri_split_boundaries</a>()</code>, <code><a href="stri_split_lines.html">stri_split_lines</a>()</code>, <code><a href="stri_trans_casemap.html">stri_trans_tolower</a>()</code>, <code><a href="stri_wrap.html">stri_wrap</a>()</code>, <code><a href="stringi-search-boundaries.html">stringi-search-boundaries</a></code>, <code><a href="stringi-search.html">stringi-search</a></code> </p> <hr /><div style="text-align: center;">[Package <em>stringi</em> version 1.4.6 <a href="00Index.html">Index</a>]</div> </body></html>