<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Family Objects for Models</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <link rel="stylesheet" type="text/css" href="R.css" /> </head><body> <table width="100%" summary="page for family {stats}"><tr><td>family {stats}</td><td style="text-align: right;">R Documentation</td></tr></table> <h2>Family Objects for Models</h2> <h3>Description</h3> <p>Family objects provide a convenient way to specify the details of the models used by functions such as <code><a href="glm.html">glm</a></code>. See the documentation for <code><a href="glm.html">glm</a></code> for the details on how such model fitting takes place. </p> <h3>Usage</h3> <pre>
family(object, ...)

binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")
</pre> <h3>Arguments</h3> <table summary="R argblock"> <tr valign="top"><td><code>link</code></td> <td> <p>a specification for the model link function. This can be a name/expression, a literal character string, a length-one character vector, or an object of class <code>"<a href="make.link.html">link-glm</a>"</code> (such as generated by <code><a href="make.link.html">make.link</a></code>) provided it is not specified <em>via</em> one of the standard names given next. 
</p> <p>The <code>gaussian</code> family accepts the links (as names) <code>identity</code>, <code>log</code> and <code>inverse</code>; the <code>binomial</code> family the links <code>logit</code>, <code>probit</code>, <code>cauchit</code>, (corresponding to logistic, normal and Cauchy CDFs respectively) <code>log</code> and <code>cloglog</code> (complementary log-log); the <code>Gamma</code> family the links <code>inverse</code>, <code>identity</code> and <code>log</code>; the <code>poisson</code> family the links <code>log</code>, <code>identity</code>, and <code>sqrt</code>; and the <code>inverse.gaussian</code> family the links <code>1/mu^2</code>, <code>inverse</code>, <code>identity</code> and <code>log</code>. </p> <p>The <code>quasi</code> family accepts the links <code>logit</code>, <code>probit</code>, <code>cloglog</code>, <code>identity</code>, <code>inverse</code>, <code>log</code>, <code>1/mu^2</code> and <code>sqrt</code>, and the function <code><a href="power.html">power</a></code> can be used to create a power link function. </p> </td></tr> <tr valign="top"><td><code>variance</code></td> <td> <p>for all families other than <code>quasi</code>, the variance function is determined by the family. The <code>quasi</code> family will accept the literal character string (or unquoted as a name/expression) specifications <code>"constant"</code>, <code>"mu(1-mu)"</code>, <code>"mu"</code>, <code>"mu^2"</code> and <code>"mu^3"</code>, a length-one character vector taking one of those values, or a list containing components <code>varfun</code>, <code>validmu</code>, <code>dev.resids</code>, <code>initialize</code> and <code>name</code>. 
</p> </td></tr> <tr valign="top"><td><code>object</code></td> <td> <p>the function <code>family</code> accesses the <code>family</code> objects which are stored within objects created by modelling functions (e.g., <code>glm</code>).</p> </td></tr> <tr valign="top"><td><code>...</code></td> <td> <p>further arguments passed to methods.</p> </td></tr> </table> <h3>Details</h3> <p><code>family</code> is a generic function with methods for classes <code>"glm"</code> and <code>"lm"</code> (the latter returning <code>gaussian()</code>). </p> <p>For the <code>binomial</code> and <code>quasibinomial</code> families the response can be specified in one of three ways: </p> <ol> <li><p> As a factor: ‘success’ is interpreted as the factor not having the first level (and hence usually of having the second level). </p> </li> <li><p> As a numerical vector with values between <code>0</code> and <code>1</code>, interpreted as the proportion of successful cases (with the total number of cases given by the <code>weights</code>). </p> </li> <li><p> As a two-column integer matrix: the first column gives the number of successes and the second the number of failures. </p> </li></ol> <p>The <code>quasibinomial</code> and <code>quasipoisson</code> families differ from the <code>binomial</code> and <code>poisson</code> families only in that the dispersion parameter is not fixed at one, so they can model over-dispersion. For the binomial case see McCullagh and Nelder (1989, pp. 124–8). Although they show that there is (under some restrictions) a model with variance proportional to mean as in the quasi-binomial model, note that <code>glm</code> does not compute maximum-likelihood estimates in that model. The behaviour of S is closer to the quasi- variants. </p> <h3>Value</h3> <p>An object of class <code>"family"</code> (which has a concise print method). 
This is a list with elements </p> <table summary="R valueblock"> <tr valign="top"><td><code>family</code></td> <td> <p>character: the family name.</p> </td></tr> <tr valign="top"><td><code>link</code></td> <td> <p>character: the link name.</p> </td></tr> <tr valign="top"><td><code>linkfun</code></td> <td> <p>function: the link.</p> </td></tr> <tr valign="top"><td><code>linkinv</code></td> <td> <p>function: the inverse of the link function.</p> </td></tr> <tr valign="top"><td><code>variance</code></td> <td> <p>function: the variance as a function of the mean.</p> </td></tr> <tr valign="top"><td><code>dev.resids</code></td> <td> <p>function giving the deviance for each observation as a function of <code>(y, mu, wt)</code>, used by the <code><a href="glm.summaries.html">residuals</a></code> method when computing deviance residuals.</p> </td></tr> <tr valign="top"><td><code>aic</code></td> <td> <p>function giving the AIC value if appropriate (but <code>NA</code> for the quasi- families). More precisely, this function returns <i>-2 ll + 2 s</i>, where <i>ll</i> is the log-likelihood and <i>s</i> is the number of estimated scale parameters. Note that the penalty term for the location parameters (typically the “regression coefficients”) is added elsewhere, e.g., in <code><a href="glm.html">glm.fit</a>()</code>, or <code><a href="AIC.html">AIC</a>()</code>, see the AIC example in <code><a href="glm.html">glm</a></code>. See <code><a href="logLik.html">logLik</a></code> for the assumptions made about the dispersion parameter.</p> </td></tr> <tr valign="top"><td><code>mu.eta</code></td> <td> <p>function: derivative of the inverse-link function with respect to the linear predictor. If the inverse-link function is <i>mu = ginv(eta)</i> where <i>eta</i> is the value of the linear predictor, then this function returns <i>d(ginv(eta))/d(eta) = d(mu)/d(eta)</i>.</p> </td></tr> <tr valign="top"><td><code>initialize</code></td> <td> <p>expression. 
This needs to set up whatever data objects are needed for the family as well as <code>n</code> (needed for AIC in the binomial family) and <code>mustart</code> (see <code><a href="glm.html">glm</a></code>).</p> </td></tr> <tr valign="top"><td><code>validmu</code></td> <td> <p>logical function. Returns <code>TRUE</code> if a mean vector <code>mu</code> is within the domain of <code>variance</code>.</p> </td></tr> <tr valign="top"><td><code>valideta</code></td> <td> <p>logical function. Returns <code>TRUE</code> if a linear predictor <code>eta</code> is within the domain of <code>linkinv</code>.</p> </td></tr> <tr valign="top"><td><code>simulate</code></td> <td> <p>(optional) function <code>simulate(object, nsim)</code> to be called by the <code>"lm"</code> method of <code><a href="simulate.html">simulate</a></code>. It will normally return a matrix with <code>nsim</code> columns and one row for each fitted value, but it can also return a list of length <code>nsim</code>. Clearly this will be missing for ‘quasi-’ families.</p> </td></tr> </table> <h3>Note</h3> <p>The <code>link</code> and <code>variance</code> arguments have rather awkward semantics for back-compatibility. The recommended way is to supply them as quoted character strings, but they can also be supplied unquoted (as names or expressions). Additionally, they can be supplied as a length-one character vector giving the name of one of the options, or as a list (for <code>link</code>, of class <code>"link-glm"</code>). The restrictions apply only to links given as names: when given as a character string all the links known to <code><a href="make.link.html">make.link</a></code> are accepted. </p> <p>This is potentially ambiguous: supplying <code>link = logit</code> could mean the unquoted name of a link or the value of object <code>logit</code>. It is interpreted if possible as the name of an allowed link, then as an object. 
(You can force the interpretation to always be the value of an object via <code>logit[1]</code>.) </p> <h3>Author(s)</h3> <p>The design was inspired by S functions of the same names described in Hastie & Pregibon (1992) (except <code>quasibinomial</code> and <code>quasipoisson</code>). </p> <h3>References</h3> <p>McCullagh, P. and Nelder, J. A. (1989) <em>Generalized Linear Models.</em> London: Chapman and Hall. </p> <p>Dobson, A. J. (1983) <em>An Introduction to Statistical Modelling.</em> London: Chapman and Hall. </p> <p>Cox, D. R. and Snell, E. J. (1981) <em>Applied Statistics: Principles and Examples.</em> London: Chapman and Hall. </p> <p>Hastie, T. J. and Pregibon, D. (1992) <em>Generalized linear models.</em> Chapter 6 of <em>Statistical Models in S</em> eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole. </p> <h3>See Also</h3> <p><code><a href="glm.html">glm</a></code>, <code><a href="power.html">power</a></code>, <code><a href="make.link.html">make.link</a></code>. </p> <p>For binomial <em>coefficients</em>, <code><a href="../../base/html/Special.html">choose</a></code>; the binomial and negative binomial <em>distributions</em>, <code><a href="Binomial.html">Binomial</a></code>, and <code><a href="NegBinomial.html">NegBinomial</a></code>. </p> <h3>Examples</h3> <pre>
require(utils) # for str

nf <- gaussian()  # Normal family
nf
str(nf)

gf <- Gamma()
gf
str(gf)
gf$linkinv
gf$variance(-3:4) #- == (.)^2

## Binomial with default 'logit' link:  Check some properties visually:
bi <- binomial()
et <- seq(-10,10, by=1/8)
plot(et, bi$mu.eta(et), type="l")
## show that mu.eta() is derivative of linkinv() :
lines((et[-1]+et[-length(et)])/2, col=adjustcolor("red", 1/4),
      diff(bi$linkinv(et))/diff(et), type="l", lwd=4)
## which here is the logistic density:
lines(et, dlogis(et), lwd=3, col=adjustcolor("blue", 1/4))
stopifnot(exprs = {
  all.equal(bi$ mu.eta(et), dlogis(et))
  all.equal(bi$linkinv(et), plogis(et) -> m)
  all.equal(bi$linkfun(m ), qlogis(m)) #  logit(.) == qlogis(.) !
})

## Data from example(glm) :
d.AD <- data.frame(treatment = gl(3,3),
                   outcome = gl(3,1,9),
                   counts = c(18,17,15, 20,10,20, 25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, d.AD, family = poisson())
## Quasipoisson: compare with above / example(glm) :
glm.qD93 <- glm(counts ~ outcome + treatment, d.AD, family = quasipoisson())

glm.qD93
anova(glm.qD93, test = "F")
summary(glm.qD93)
## for Poisson results (same as from 'glm.D93' !)  use
anova(glm.qD93, dispersion = 1, test = "Chisq")
summary(glm.qD93, dispersion = 1)

## Example of user-specified link, a logit model for p^days
## See Shaffer, T.  2004. Auk 121(2): 526-540.
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    mu.eta <- function(eta) days * plogis(eta)^(days-1) *
      binomial()$mu.eta(eta)
    valideta <- function(eta) TRUE
    link <- paste0("logexp(", days, ")")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}
(bil3 <- binomial(logexp(3)))
## in practice this would be used with a vector of 'days', in
## which case use an offset of 0 in the corresponding formula
## to get the null deviance right.
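
## Illustrative sketch (an addition, not one of the original examples):
## check numerically, for a few standard families, two relationships
## described in the Value section.  The names 'fams', 'eta', 'h', 'mu'
## and 'num' are made up for this sketch.
fams <- list(binomial(), poisson(), Gamma(link = "log"), gaussian())
eta <- seq(-2, 2, by = 0.25)
h <- 1e-6
for (f in fams) {
    mu <- f$linkinv(eta)
    ## linkfun() and linkinv() should be inverses of each other
    stopifnot(all.equal(f$linkfun(mu), eta))
    ## mu.eta() should agree with a central-difference derivative of linkinv()
    num <- (f$linkinv(eta + h) - f$linkinv(eta - h)) / (2 * h)
    stopifnot(all.equal(f$mu.eta(eta), num, tolerance = 1e-6))
}
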

## Binomial with identity link: often not a good idea, as both
## computationally and conceptually difficult:
binomial(link = "identity")  ## is exactly the same as
binomial(link = make.link("identity"))

## tests of quasi
x <- rnorm(100)
y <- rpois(100, exp(1+x))
glm(y ~ x, family = quasi(variance = "mu", link = "log"))
# which is the same as
glm(y ~ x, family = poisson)
glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
## Not run: glm(y ~ x, family = quasi(variance = "mu^3", link = "log")) # fails
y <- rbinom(100, 1, plogis(x))
# need to set a starting value for the next fit
glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1))
</pre> <hr /><div style="text-align: center;">[Package <em>stats</em> version 3.6.0 <a href="00Index.html">Index</a>]</div> </body></html>