<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Locate Text Boundaries</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for stri_locate_all_boundaries {stringi}"><tr><td>stri_locate_all_boundaries {stringi}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Locate Text Boundaries</h2> <h3>Description</h3> <p>These functions locate text boundaries (such as character, word, line, or sentence boundaries). Use <code>stri_locate_all_*</code> to locate all the matches. <code>stri_locate_first_*</code> and <code>stri_locate_last_*</code> give the first or the last matches, respectively. </p> <h3>Usage</h3> <pre>
stri_locate_all_boundaries(
  str,
  omit_no_match = FALSE,
  ...,
  opts_brkiter = NULL
)

stri_locate_last_boundaries(str, ..., opts_brkiter = NULL)

stri_locate_first_boundaries(str, ..., opts_brkiter = NULL)

stri_locate_all_words(str, omit_no_match = FALSE, locale = NULL)

stri_locate_last_words(str, locale = NULL)

stri_locate_first_words(str, locale = NULL)
</pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>str</code></td> <td> <p>character vector or an object coercible to one</p> </td></tr> <tr valign="top"><td><code>omit_no_match</code></td> <td> <p>single logical value; if <code>FALSE</code>, then two missing values indicate that there are no text boundaries</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>additional settings for <code>opts_brkiter</code></p> </td></tr> <tr valign="top"><td><code>opts_brkiter</code></td> <td> <p>a named list with <span class="pkg">ICU</span> BreakIterator's settings, see <code><a href="stri_opts_brkiter.html">stri_opts_brkiter</a></code>; <code>NULL</code> for the default break iterator, i.e., <code>line_break</code></p> 
</td></tr> <tr valign="top"><td><code>locale</code></td> <td> <p><code>NULL</code> or <code>""</code> for text boundary analysis following the conventions of the default locale, or a single string with locale identifier, see <a href="stringi-locale.html">stringi-locale</a></p> </td></tr> </table> <h3>Details</h3> <p>Vectorized over <code>str</code>. </p> <p>For more information on text boundary analysis performed by <span class="pkg">ICU</span>'s <code>BreakIterator</code>, see <a href="stringi-search-boundaries.html">stringi-search-boundaries</a>. </p> <p>In the case of <code>stri_locate_*_words</code>, just like in <code><a href="stri_extract_boundaries.html">stri_extract_all_words</a></code> and <code><a href="stri_count_boundaries.html">stri_count_words</a></code>, <span class="pkg">ICU</span>'s word <code>BreakIterator</code> is used to locate the word boundaries, and all non-word characters (<code>UBRK_WORD_NONE</code> rule status) are ignored. These functions are equivalent to a call to <code>stri_locate_*_boundaries(str, type="word", skip_word_none=TRUE, locale=locale)</code>. </p> <h3>Value</h3> <p>For <code>stri_locate_all_*</code>, a list of <code>length(str)</code> integer matrices is returned. The first column gives the start positions of substrings between located boundaries, and the second column gives the end positions. The indexes are code point-based, thus they may be passed, e.g., to <code><a href="stri_sub.html">stri_sub</a></code> or <code><a href="stri_sub_all.html">stri_sub_all</a></code>. Note that you get two <code>NA</code>s in one row if there is no match (and <code>omit_no_match</code> is <code>FALSE</code>) or there are missing data in the input vector. </p> <p><code>stri_locate_first_*</code> and <code>stri_locate_last_*</code> return an integer matrix with two columns, giving the start and end positions of the first or the last matches, respectively, and two <code>NA</code>s if there is no match. 
</p> <h3>See Also</h3> <p>Other search_locate: <code><a href="stri_locate.html">stri_locate_all</a>()</code>, <code><a href="stringi-search.html">stringi-search</a></code> </p> <p>Other indexing: <code><a href="stri_locate.html">stri_locate_all</a>()</code>, <code><a href="stri_sub_all.html">stri_sub_all</a>()</code>, <code><a href="stri_sub.html">stri_sub</a>()</code> </p> <p>Other locale_sensitive: <code><a href="oper_comparison.html">%s&lt;%</a>()</code>, <code><a href="stri_compare.html">stri_compare</a>()</code>, <code><a href="stri_count_boundaries.html">stri_count_boundaries</a>()</code>, <code><a href="stri_duplicated.html">stri_duplicated</a>()</code>, <code><a href="stri_enc_detect2.html">stri_enc_detect2</a>()</code>, <code><a href="stri_extract_boundaries.html">stri_extract_all_boundaries</a>()</code>, <code><a href="stri_opts_collator.html">stri_opts_collator</a>()</code>, <code><a href="stri_order.html">stri_order</a>()</code>, <code><a href="stri_sort.html">stri_sort</a>()</code>, <code><a href="stri_split_boundaries.html">stri_split_boundaries</a>()</code>, <code><a href="stri_trans_casemap.html">stri_trans_tolower</a>()</code>, <code><a href="stri_unique.html">stri_unique</a>()</code>, <code><a href="stri_wrap.html">stri_wrap</a>()</code>, <code><a href="stringi-locale.html">stringi-locale</a></code>, <code><a href="stringi-search-boundaries.html">stringi-search-boundaries</a></code>, <code><a href="stringi-search-coll.html">stringi-search-coll</a></code> </p> <p>Other text_boundaries: <code><a href="stri_count_boundaries.html">stri_count_boundaries</a>()</code>, <code><a href="stri_extract_boundaries.html">stri_extract_all_boundaries</a>()</code>, <code><a href="stri_opts_brkiter.html">stri_opts_brkiter</a>()</code>, <code><a href="stri_split_boundaries.html">stri_split_boundaries</a>()</code>, <code><a href="stri_split_lines.html">stri_split_lines</a>()</code>, <code><a href="stri_trans_casemap.html">stri_trans_tolower</a>()</code>, <code><a 
href="stri_wrap.html">stri_wrap</a>()</code>, <code><a href="stringi-search-boundaries.html">stringi-search-boundaries</a></code>, <code><a href="stringi-search.html">stringi-search</a></code> </p> <h3>Examples</h3> <pre>
# sample text; \u00a0 is a no-break space
test &lt;- "The\u00a0above-mentioned features are very useful. Kudos to their developers."

# start and end positions of the segments found by each break iterator type
stri_locate_all_boundaries(test, type="line")
stri_locate_all_boundaries(test, type="word")
stri_locate_all_boundaries(test, type="sentence")
stri_locate_all_boundaries(test, type="character")

# words only; non-word segments (UBRK_WORD_NONE rule status) are skipped
stri_locate_all_words(test)

stri_extract_all_boundaries("Mr. Jones and Mrs. Brown are very happy. So am I, Prof. Smith.", type="sentence", locale="en_US@ss=standard") # ICU >= 56 only
</pre> <hr /><div style="text-align: center;">[Package <em>stringi</em> version 1.4.6 <a href="00Index.html">Index</a>]</div> </body></html>