EVOLUTION-MANAGER
Edit File: stri_compare.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Compare Strings with or without Collation</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for stri_compare {stringi}"><tr><td>stri_compare {stringi}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Compare Strings with or without Collation</h2> <h3>Description</h3> <p>These functions may be used to determine if two strings are equal, canonically equivalent (this is performed in a much more clever fashion than when testing for equality), or to check whether they are in a specific lexicographic order. </p> <h3>Usage</h3> <pre> stri_compare(e1, e2, ..., opts_collator = NULL) stri_cmp(e1, e2, ..., opts_collator = NULL) stri_cmp_eq(e1, e2) stri_cmp_neq(e1, e2) stri_cmp_equiv(e1, e2, ..., opts_collator = NULL) stri_cmp_nequiv(e1, e2, ..., opts_collator = NULL) stri_cmp_lt(e1, e2, ..., opts_collator = NULL) stri_cmp_gt(e1, e2, ..., opts_collator = NULL) stri_cmp_le(e1, e2, ..., opts_collator = NULL) stri_cmp_ge(e1, e2, ..., opts_collator = NULL) </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>e1, e2</code></td> <td> <p>character vectors or objects coercible to character vectors</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>additional settings for <code>opts_collator</code></p> </td></tr> <tr valign="top"><td><code>opts_collator</code></td> <td> <p>a named list with <span class="pkg">ICU</span> Collator's options, see <code><a href="stri_opts_collator.html">stri_opts_collator</a></code>, <code>NULL</code> for the default collation options.</p> </td></tr> </table> <h3>Details</h3> <p>All the functions listed here are vectorized over <code>e1</code> and <code>e2</code>. </p> <p><code>stri_cmp_eq</code> tests whether two corresponding strings consist of exactly the same code points, while <code>stri_cmp_neq</code> allows to check whether there is any difference between them. These are locale-independent operations: for natural language processing, where the notion of canonical equivalence is more valid, this might not be exactly what you are looking for, see Examples. Please note that <span class="pkg">stringi</span> always silently removes UTF-8 BOMs from input strings, therefore, e.g., <code>stri_cmp_eq</code> does not take BOMs into account while comparing strings. </p> <p><code>stri_cmp_equiv</code> tests for canonical equivalence of two strings and is locale-dependent. Additionally, the <span class="pkg">ICU</span>'s Collator may be tuned up so that, e.g., the comparison is case-insensitive. To test whether two strings are not canonically equivalent, call <code>stri_cmp_nequiv</code>. </p> <p><code>stri_cmp_le</code> tests whether the elements in the first vector are less than or equal to the corresponding elements in the second vector, <code>stri_cmp_ge</code> tests whether they are greater or equal, <code>stri_cmp_lt</code> if less, and <code>stri_cmp_gt</code> if greater, see also, e.g., <code><a href="oper_comparison.html">%s<%</a></code>. </p> <p><code>stri_compare</code> is an alias to <code>stri_cmp</code>. They both perform exactly the same locale-dependent operation. Both functions provide a C library's <code>strcmp()</code> look-and-feel, see Value for details. </p> <p>For more information on <span class="pkg">ICU</span>'s Collator and how to tune its settings refer to <code><a href="stri_opts_collator.html">stri_opts_collator</a></code>. Note that different locale settings may lead to different results (see the examples below). </p> <h3>Value</h3> <p>The <code>stri_cmp</code> and <code>stri_compare</code> functions return an integer vector representing the comparison results: <code>-1</code> if <code>e1[...] < e2[...]</code>, <code>0</code> if they are canonically equivalent, and <code>1</code> if greater. </p> <p>All the other functions return a logical vector that indicates whether a given relation holds between two corresponding elements in <code>e1</code> and <code>e2</code>. </p> <h3>References</h3> <p><em>Collation</em> - ICU User Guide, <a href="http://userguide.icu-project.org/collation">http://userguide.icu-project.org/collation</a> </p> <h3>See Also</h3> <p>Other locale_sensitive: <code><a href="oper_comparison.html">%s<%</a>()</code>, <code><a href="stri_count_boundaries.html">stri_count_boundaries</a>()</code>, <code><a href="stri_duplicated.html">stri_duplicated</a>()</code>, <code><a href="stri_enc_detect2.html">stri_enc_detect2</a>()</code>, <code><a href="stri_extract_boundaries.html">stri_extract_all_boundaries</a>()</code>, <code><a href="stri_locate_boundaries.html">stri_locate_all_boundaries</a>()</code>, <code><a href="stri_opts_collator.html">stri_opts_collator</a>()</code>, <code><a href="stri_order.html">stri_order</a>()</code>, <code><a href="stri_sort.html">stri_sort</a>()</code>, <code><a href="stri_split_boundaries.html">stri_split_boundaries</a>()</code>, <code><a href="stri_trans_casemap.html">stri_trans_tolower</a>()</code>, <code><a href="stri_unique.html">stri_unique</a>()</code>, <code><a href="stri_wrap.html">stri_wrap</a>()</code>, <code><a href="stringi-locale.html">stringi-locale</a></code>, <code><a href="stringi-search-boundaries.html">stringi-search-boundaries</a></code>, <code><a href="stringi-search-coll.html">stringi-search-coll</a></code> </p> <h3>Examples</h3> <pre> # in Polish, ch < h: stri_cmp_lt("hladny", "chladny", locale="pl_PL") # in Slovak, ch > h: stri_cmp_lt("hladny", "chladny", locale="sk_SK") # < or > (depends on locale): stri_cmp("hladny", "chladny") # ignore case differences: stri_cmp_equiv("hladny", "HLADNY", strength=2) # also ignore diacritical differences: stri_cmp_equiv("hladn\u00FD", "hladny", strength=1, locale="sk_SK") # non-Unicode-normalized vs normalized string: stri_cmp_equiv(stri_trans_nfkd("\u0105"), "\u105") # note the difference: stri_cmp_eq(stri_trans_nfkd("\u0105"), "\u105") # ligatures: stri_cmp_equiv("\ufb00", "ff", strength=2) # phonebook collation stri_cmp_equiv("G\u00e4rtner", "Gaertner", locale="de_DE@collation=phonebook", strength=1L) stri_cmp_equiv("G\u00e4rtner", "Gaertner", locale="de_DE", strength=1L) </pre> <hr /><div style="text-align: center;">[Package <em>stringi</em> version 1.4.6 <a href="00Index.html">Index</a>]</div> </body></html>