EVOLUTION-MANAGER
Edit File: match.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Value Matching</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for match {base}"><tr><td>match {base}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Value Matching</h2> <h3>Description</h3> <p><code>match</code> returns a vector of the positions of (first) matches of its first argument in its second. </p> <p><code>%in%</code> is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand. </p> <h3>Usage</h3> <pre> match(x, table, nomatch = NA_integer_, incomparables = NULL) x %in% table </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>x</code></td> <td> <p>vector or <code>NULL</code>: the values to be matched. <a href="LongVectors.html">Long vectors</a> are supported.</p> </td></tr> <tr valign="top"><td><code>table</code></td> <td> <p>vector or <code>NULL</code>: the values to be matched against. <a href="LongVectors.html">Long vectors</a> are not supported.</p> </td></tr> <tr valign="top"><td><code>nomatch</code></td> <td> <p>the value to be returned in the case when no match is found. Note that it is coerced to <code>integer</code>.</p> </td></tr> <tr valign="top"><td><code>incomparables</code></td> <td> <p>a vector of values that cannot be matched. Any value in <code>x</code> matching a value in this vector is assigned the <code>nomatch</code> value. For historical reasons, <code>FALSE</code> is equivalent to <code>NULL</code>.</p> </td></tr> </table> <h3>Details</h3> <p><code>%in%</code> is currently defined as <br /> <code>"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0</code> </p> <p>Factors, raw vectors and lists are converted to character vectors, and then <code>x</code> and <code>table</code> are coerced to a common type (the later of the two types in <span style="font-family: Courier New, Courier; color: #666666;"><b>R</b></span>'s ordering, logical < integer < numeric < complex < character) before matching. If <code>incomparables</code> has positive length it is coerced to the common type. </p> <p>Matching for lists is potentially very slow and best avoided except in simple cases. </p> <p>Exactly what matches what is to some extent a matter of definition. For all types, <code>NA</code> matches <code>NA</code> and no other value. For real and complex values, <code>NaN</code> values are regarded as matching any other <code>NaN</code> value, but not matching <code>NA</code>, where for complex <code>x</code>, real and imaginary parts must match both (unless containing at least one <code>NA</code>). </p> <p>Character strings will be compared as byte sequences if any input is marked as <code>"bytes"</code>, and otherwise are regarded as equal if they are in different encodings but would agree when translated to UTF-8 (see <code><a href="Encoding.html">Encoding</a></code>). </p> <p>That <code>%in%</code> never returns <code>NA</code> makes it particularly useful in <code>if</code> conditions. </p> <h3>Value</h3> <p>A vector of the same length as <code>x</code>. </p> <p><code>match</code>: An integer vector giving the position in <code>table</code> of the first match if there is a match, otherwise <code>nomatch</code>. </p> <p>If <code>x[i]</code> is found to equal <code>table[j]</code> then the value returned in the <code>i</code>-th position of the return value is <code>j</code>, for the smallest possible <code>j</code>. If no match is found, the value is <code>nomatch</code>. </p> <p><code>%in%</code>: A logical vector, indicating if a match was located for each element of <code>x</code>: thus the values are <code>TRUE</code> or <code>FALSE</code> and never <code>NA</code>. </p> <h3>References</h3> <p>Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) <em>The New S Language</em>. Wadsworth & Brooks/Cole. </p> <h3>See Also</h3> <p><code><a href="pmatch.html">pmatch</a></code> and <code><a href="charmatch.html">charmatch</a></code> for (<em>partial</em>) string matching, <code><a href="match.arg.html">match.arg</a></code>, etc for function argument matching. <code><a href="findInterval.html">findInterval</a></code> similarly returns a vector of positions, but finds numbers within intervals, rather than exact matches. </p> <p><code><a href="sets.html">is.element</a></code> for an S-compatible equivalent of <code>%in%</code>. </p> <p><code><a href="unique.html">unique</a></code> (and <code><a href="duplicated.html">duplicated</a></code>) are using the same definitions of “match” or “equality” as <code>match()</code>, and these are less strict than <code><a href="Comparison.html">==</a></code>, e.g., for <code><a href="NA.html">NA</a></code> and <code><a href="is.finite.html">NaN</a></code> in numeric or complex vectors, or for strings with different encodings, see also above. </p> <h3>Examples</h3> <pre> ## The intersection of two sets can be defined via match(): ## Simple version: ## intersect <- function(x, y) y[match(x, y, nomatch = 0)] intersect # the R function in base is slightly more careful intersect(1:10, 7:20) 1:10 %in% c(1,3,5,9) sstr <- c("c","ab","B","bba","c",NA,"@","bla","a","Ba","%") sstr[sstr %in% c(letters, LETTERS)] "%w/o%" <- function(x, y) x[!x %in% y] #-- x without y (1:10) %w/o% c(3,7,12) ## Note that setdiff() is very similar and typically makes more sense: c(1:6,7:2) %w/o% c(3,7,12) # -> keeps duplicates setdiff(c(1:6,7:2), c(3,7,12)) # -> unique values ## Illuminating example about NA matching r <- c(1, NA, NaN) zN <- c(complex(real = NA , imaginary = r ), complex(real = r , imaginary = NA ), complex(real = r , imaginary = NaN), complex(real = NaN, imaginary = r )) zM <- cbind(Re=Re(zN), Im=Im(zN), match = match(zN, zN)) rownames(zM) <- format(zN) zM ##--> many "NA's" (= 1) and the four non-NA's (3 different ones, at 7,9,10) length(zN) # 12 unique(zN) # the "NA" and the 3 different non-NA NaN's stopifnot(identical(unique(zN), zN[c(1, 7,9,10)])) ## very strict equality would have 4 duplicates (of 12): symnum(outer(zN, zN, Vectorize(identical,c("x","y")), FALSE,FALSE,FALSE,FALSE)) ## removing "(very strictly) duplicates", i <- c(5,8,11,12) # we get 8 pairwise non-identicals : Ixy <- outer(zN[-i], zN[-i], Vectorize(identical,c("x","y")), FALSE,FALSE,FALSE,FALSE) stopifnot(identical(Ixy, diag(8) == 1)) </pre> <hr /><div style="text-align: center;">[Package <em>base</em> version 3.6.0 <a href="00Index.html">Index</a>]</div> </body></html>