EVOLUTION-MANAGER
Edit File: wilcox.test.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Wilcoxon Rank Sum and Signed Rank Tests</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for wilcox.test {stats}"><tr><td>wilcox.test {stats}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Wilcoxon Rank Sum and Signed Rank Tests</h2> <h3>Description</h3> <p>Performs one- and two-sample Wilcoxon tests on vectors of data; the latter is also known as ‘Mann-Whitney’ test. </p> <h3>Usage</h3> <pre> wilcox.test(x, ...) ## Default S3 method: wilcox.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, exact = NULL, correct = TRUE, conf.int = FALSE, conf.level = 0.95, ...) ## S3 method for class 'formula' wilcox.test(formula, data, subset, na.action, ...) </pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>x</code></td> <td> <p>numeric vector of data values. Non-finite (e.g., infinite or missing) values will be omitted.</p> </td></tr> <tr valign="top"><td><code>y</code></td> <td> <p>an optional numeric vector of data values: as with <code>x</code> non-finite values will be omitted.</p> </td></tr> <tr valign="top"><td><code>alternative</code></td> <td> <p>a character string specifying the alternative hypothesis, must be one of <code>"two.sided"</code> (default), <code>"greater"</code> or <code>"less"</code>. You can specify just the initial letter.</p> </td></tr> <tr valign="top"><td><code>mu</code></td> <td> <p>a number specifying an optional parameter used to form the null hypothesis. See ‘Details’.</p> </td></tr> <tr valign="top"><td><code>paired</code></td> <td> <p>a logical indicating whether you want a paired test.</p> </td></tr> <tr valign="top"><td><code>exact</code></td> <td> <p>a logical indicating whether an exact p-value should be computed.</p> </td></tr> <tr valign="top"><td><code>correct</code></td> <td> <p>a logical indicating whether to apply continuity correction in the normal approximation for the p-value.</p> </td></tr> <tr valign="top"><td><code>conf.int</code></td> <td> <p>a logical indicating whether a confidence interval should be computed.</p> </td></tr> <tr valign="top"><td><code>conf.level</code></td> <td> <p>confidence level of the interval.</p> </td></tr> <tr valign="top"><td><code>formula</code></td> <td> <p>a formula of the form <code>lhs ~ rhs</code> where <code>lhs</code> is a numeric variable giving the data values and <code>rhs</code> a factor with two levels giving the corresponding groups.</p> </td></tr> <tr valign="top"><td><code>data</code></td> <td> <p>an optional matrix or data frame (or similar: see <code><a href="model.frame.html">model.frame</a></code>) containing the variables in the formula <code>formula</code>. By default the variables are taken from <code>environment(formula)</code>.</p> </td></tr> <tr valign="top"><td><code>subset</code></td> <td> <p>an optional vector specifying a subset of observations to be used.</p> </td></tr> <tr valign="top"><td><code>na.action</code></td> <td> <p>a function which indicates what should happen when the data contain <code>NA</code>s. Defaults to <code>getOption("na.action")</code>.</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>further arguments to be passed to or from methods.</p> </td></tr> </table> <h3>Details</h3> <p>The formula interface is only applicable for the 2-sample tests. </p> <p>If only <code>x</code> is given, or if both <code>x</code> and <code>y</code> are given and <code>paired</code> is <code>TRUE</code>, a Wilcoxon signed rank test of the null that the distribution of <code>x</code> (in the one sample case) or of <code>x - y</code> (in the paired two sample case) is symmetric about <code>mu</code> is performed. </p> <p>Otherwise, if both <code>x</code> and <code>y</code> are given and <code>paired</code> is <code>FALSE</code>, a Wilcoxon rank sum test (equivalent to the Mann-Whitney test: see the Note) is carried out. In this case, the null hypothesis is that the distributions of <code>x</code> and <code>y</code> differ by a location shift of <code>mu</code> and the alternative is that they differ by some other location shift (and the one-sided alternative <code>"greater"</code> is that <code>x</code> is shifted to the right of <code>y</code>). </p> <p>By default (if <code>exact</code> is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used. </p> <p>Optionally (if argument <code>conf.int</code> is true), a nonparametric confidence interval and an estimator for the pseudomedian (one-sample case) or for the difference of the location parameters <code>x-y</code> is computed. (The pseudomedian of a distribution <i>F</i> is the median of the distribution of <i>(u+v)/2</i>, where <i>u</i> and <i>v</i> are independent, each with distribution <i>F</i>. If <i>F</i> is symmetric, then the pseudomedian and median coincide. See Hollander & Wolfe (1973), page 34.) Note that in the two-sample case the estimator for the difference in location parameters does <b>not</b> estimate the difference in medians (a common misconception) but rather the median of the difference between a sample from <code>x</code> and a sample from <code>y</code>. </p> <p>If exact p-values are available, an exact confidence interval is obtained by the algorithm described in Bauer (1972), and the Hodges-Lehmann estimator is employed. Otherwise, the returned confidence interval and point estimate are based on normal approximations. These are continuity-corrected for the interval but <em>not</em> the estimate (as the correction depends on the <code>alternative</code>). </p> <p>With small samples it may not be possible to achieve very high confidence interval coverages. If this happens a warning will be given and an interval with lower coverage will be substituted. </p> <p>When <code>x</code> (and <code>y</code> if applicable) are valid, the function now always returns, also in the <code>conf.int = TRUE</code> case when a confidence interval cannot be computed, in which case the interval boundaries and sometimes the <code>estimate</code> now contain <code><a href="../../base/html/is.finite.html">NaN</a></code>. </p> <h3>Value</h3> <p>A list with class <code>"htest"</code> containing the following components: </p> <table summary="R valueblock"> <tr valign="top"><td><code>statistic</code></td> <td> <p>the value of the test statistic with a name describing it.</p> </td></tr> <tr valign="top"><td><code>parameter</code></td> <td> <p>the parameter(s) for the exact distribution of the test statistic.</p> </td></tr> <tr valign="top"><td><code>p.value</code></td> <td> <p>the p-value for the test.</p> </td></tr> <tr valign="top"><td><code>null.value</code></td> <td> <p>the location parameter <code>mu</code>.</p> </td></tr> <tr valign="top"><td><code>alternative</code></td> <td> <p>a character string describing the alternative hypothesis.</p> </td></tr> <tr valign="top"><td><code>method</code></td> <td> <p>the type of test applied.</p> </td></tr> <tr valign="top"><td><code>data.name</code></td> <td> <p>a character string giving the names of the data.</p> </td></tr> <tr valign="top"><td><code>conf.int</code></td> <td> <p>a confidence interval for the location parameter. (Only present if argument <code>conf.int = TRUE</code>.)</p> </td></tr> <tr valign="top"><td><code>estimate</code></td> <td> <p>an estimate of the location parameter. (Only present if argument <code>conf.int = TRUE</code>.)</p> </td></tr> </table> <h3>Warning</h3> <p>This function can use large amounts of memory and stack (and even crash <span style="font-family: Courier New, Courier; color: #666666;"><b>R</b></span> if the stack limit is exceeded) if <code>exact = TRUE</code> and one sample is large (several thousands or more). </p> <h3>Note</h3> <p>The literature is not unanimous about the definitions of the Wilcoxon rank sum and Mann-Whitney tests. The two most common definitions correspond to the sum of the ranks of the first sample with the minimum value subtracted or not: <span style="font-family: Courier New, Courier; color: #666666;"><b>R</b></span> subtracts and S-PLUS does not, giving a value which is larger by <i>m(m+1)/2</i> for a first sample of size <i>m</i>. (It seems Wilcoxon's original paper used the unadjusted sum of the ranks but subsequent tables subtracted the minimum.) </p> <p><span style="font-family: Courier New, Courier; color: #666666;"><b>R</b></span>'s value can also be computed as the number of all pairs <code>(x[i], y[j])</code> for which <code>y[j]</code> is not greater than <code>x[i]</code>, the most common definition of the Mann-Whitney test. </p> <h3>References</h3> <p>David F. Bauer (1972). Constructing confidence sets using rank statistics. <em>Journal of the American Statistical Association</em> <b>67</b>, 687–690. doi: <a href="https://doi.org/10.1080/01621459.1972.10481279">10.1080/01621459.1972.10481279</a>. </p> <p>Myles Hollander and Douglas A. Wolfe (1973). <em>Nonparametric Statistical Methods</em>. New York: John Wiley & Sons. Pages 27–33 (one-sample), 68–75 (two-sample).<br /> Or second edition (1999). </p> <h3>See Also</h3> <p><code><a href="SignRank.html">psignrank</a></code>, <code><a href="Wilcoxon.html">pwilcox</a></code>. </p> <p><code><a href="../../coin/html/LocationTests.html">wilcox_test</a></code> in package <a href="https://CRAN.R-project.org/package=coin"><span class="pkg">coin</span></a> for exact, asymptotic and Monte Carlo <em>conditional</em> p-values, including in the presence of ties. </p> <p><code><a href="kruskal.test.html">kruskal.test</a></code> for testing homogeneity in location parameters in the case of two or more samples; <code><a href="t.test.html">t.test</a></code> for an alternative under normality assumptions [or large samples] </p> <h3>Examples</h3> <pre> require(graphics) ## One-sample test. ## Hollander & Wolfe (1973), 29f. ## Hamilton depression scale factor measurements in 9 patients with ## mixed anxiety and depression, taken at the first (x) and second ## (y) visit after initiation of a therapy (administration of a ## tranquilizer). x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30) y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29) wilcox.test(x, y, paired = TRUE, alternative = "greater") wilcox.test(y - x, alternative = "less") # The same. wilcox.test(y - x, alternative = "less", exact = FALSE, correct = FALSE) # H&W large sample # approximation ## Two-sample test. ## Hollander & Wolfe (1973), 69f. ## Permeability constants of the human chorioamnion (a placental ## membrane) at term (x) and between 12 to 26 weeks gestational ## age (y). The alternative of interest is greater permeability ## of the human chorioamnion for the term pregnancy. x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46) y <- c(1.15, 0.88, 0.90, 0.74, 1.21) wilcox.test(x, y, alternative = "g") # greater wilcox.test(x, y, alternative = "greater", exact = FALSE, correct = FALSE) # H&W large sample # approximation wilcox.test(rnorm(10), rnorm(10, 2), conf.int = TRUE) ## Formula interface. boxplot(Ozone ~ Month, data = airquality) wilcox.test(Ozone ~ Month, data = airquality, subset = Month %in% c(5, 8)) </pre> <hr /><div style="text-align: center;">[Package <em>stats</em> version 3.6.0 <a href="00Index.html">Index</a>]</div> </body></html>