EVOLUTION-MANAGER
Edit File: new_summary_scanone.Rnw
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Karl W. Broman % The new summary.scanone % % This is an "Sweave" document; see the corresponding PDF. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \documentclass[12pt]{article} %\usepackage{times} \usepackage{amsmath} \usepackage{color} \usepackage{times} % revise margins \setlength{\headheight}{0.0in} \setlength{\topmargin}{-0.25in} \setlength{\headsep}{0.0in} \setlength{\textheight}{9.00in} \setlength{\footskip}{0.5in} \setlength{\oddsidemargin}{0in} \setlength{\evensidemargin}{0in} \setlength{\textwidth}{6.5in} \setlength{\parskip}{6pt} \setlength{\parindent}{0pt} \newcommand{\code}{\texttt} \newcommand{\lod}{\text{LOD}} \begin{document} \SweaveOpts{prefix.string=Figs/scanone} \setkeys{Gin}{width=\textwidth} %% <- change width of figures % Try to get the R code from running into the margin <<echo=FALSE>>= options(width=77) @ % Change S input/output font size \DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\footnotesize, baselinestretch=0.75, formatcom = {\color[rgb]{0, 0, 0.56}}} \DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\footnotesize, baselinestretch=0.75, formatcom = {\color[rgb]{0.56, 0, 0}}} \textbf{The new \code{summary.scanone}} \\ Karl W Broman, 27 Oct 2006 \\ (Added color 26 Oct 2010) \bigskip In R/qtl version 1.04, the function \code{summary.scanone} has been changed quite substantially. Also, the permutations with \code{scanone} have changed to allow the calculation of autosome- and X-chromosome-specific LOD thresholds, and to enable stratified permutation tests. In this document, I describe the revisions and how to use the new functions. We'll first look at the \code{fake.f2} data as an example. First we need to load the package and the data. <<loaddata>>= library(qtl) data(fake.f2) @ <<loadresults,echo=FALSE>>= load("fakef2_results.RData") @ I'm going to use \code{scanone} with \code{method="hk"}. First I run \code{calc.genoprob}, and then \code{scanone} as before. <<scanoneA,eval=FALSE>>= fake.f2 <- calc.genoprob(fake.f2, step=2.5) out.f2 <- scanone(fake.f2, method="hk") @ In \code{summary.scanone}, we can now get an indication of the number of degrees of freedom associated with the LOD scores. We use \code{df=TRUE}, as follows. <<summaryscanoneA>>= summary(out.f2, threshold=3, df=TRUE) @ There are a couple of improvements in the permutations performed by \code{scanone}. First, we can calculate autosome- and X-chromosome-specific LOD thresholds; this is important in this case, as the number of degrees of freedom is different for the X chromosome. Separate autosome and X chromosome permutations may be performed in \code{scanone} via \code{perm.Xsp=TRUE}. The X-chromosome-specific thresholds requires many more permutation replicates to get a threshold of equivalent precision. An increased number of permutations is chosen automatically. Permutations can take a very long time, and so one might want to use a multi-processor computer or cluster and do multiple shorter runs in parallel. And so we have added a function \code{c.scanoneperm} for combining such runs together. <<scanonepermA,eval=FALSE>>= operm1.f2 <- scanone(fake.f2, method="hk", n.perm=500, perm.Xsp=TRUE) operm2.f2 <- scanone(fake.f2, method="hk", n.perm=500, perm.Xsp=TRUE) operm.f2 <- c(operm1.f2, operm2.f2) @ Getting the autosome- and X-chromosome-specific thresholds is a bit tricky, and so another improvement is the addition of the function \code{summary.scanoneperm} for calculating such thresholds. The argument \code{alpha} indicates the significance levels. <<summaryscanonepermA>>= summary(operm.f2, alpha=c(0.05, 0.20)) @ Further, we may include the permutation results in the call to \code{summary.scanone} to automatically calculate thresholds and to have genome-scan-adjusted p-values displayed. <<summaryscaononeB>>= summary(out.f2, perms=operm.f2, alpha=0.05, pvalues=TRUE) @ Finally, one may prefer to do a stratified permutation test, permuting the phenotypes separately within each of the groups defined by sex and cross direction. This may be done in \code{scanone} with the argument \code{perm.strata}, which should be a numeric vector whose unique values define the separate strata. We set of the strata as follows. <<thestrata>>= sex <- fake.f2$pheno$sex pgm <- fake.f2$pheno$pgm strata <- sex + 2*pgm table(strata) @ We then perform the permutation test in four pieces, and combine the results together, as follows. <<scanonepermB,eval=FALSE>>= operm1.f2strat <- scanone(fake.f2, method="hk", n.perm=250, perm.Xsp=TRUE, perm.strata=strata) operm2.f2strat <- scanone(fake.f2, method="hk", n.perm=250, perm.Xsp=TRUE, perm.strata=strata) operm3.f2strat <- scanone(fake.f2, method="hk", n.perm=250, perm.Xsp=TRUE, perm.strata=strata) operm4.f2strat <- scanone(fake.f2, method="hk", n.perm=250, perm.Xsp=TRUE, perm.strata=strata) operm.f2strat <- c(operm1.f2strat, operm2.f2strat, operm3.f2strat, operm4.f2strat) @ The new thresholds are as follows. <<summaryscanonepermB>>= summary(operm.f2strat, alpha=c(0.05, 0.20)) @ The big changes to the \code{summary.scanone} function concern the case of results for multiple phenotypes. To illustrate this, we will look at the \code{fake.bc} data, which has two phenotypes. First we load the data. <<fakebc>>= data(fake.bc) @ <<loaddataB,echo=FALSE>>= load("fakebc_results.RData") @ Now let's run \code{calc.genoprob} and do a genome scan on the two phenotypes. Again, we use \code{method="hk"} for the sake of speed. <<scanoneB,eval=FALSE>>= fake.bc <- calc.genoprob(fake.bc, step=2.5) out.bc <- scanone(fake.bc, pheno.col=1:2, method="hk") @ The results contain LOD scores for each of the phenotypes. By default, \code{summary.scanone} looks at the first of these, though it also shows the LOD score for the second phenotype at the locations of the LOD peaks for the first phenotype. <<summaryscanoneC>>= summary(out.bc, threshold=3) @ If we use \code{lodcolumn=2}, we get the analogous results, looking at the second phenotype. <<summaryscanoneD>>= summary(out.bc, threshold=3, lodcolumn=2) @ If we use \code{format="allpheno"}, we get separate rows for the peaks of each of the phenotypes. <<summaryscanoneE>>= summary(out.bc, threshold=3, format="allpheno") @ Perhaps the most convenient output is obtained with \code{format="allpeaks"}, which gives a single row for each chromosome, with the maximum LOD score and its position for each of the phenotypes. A chromosome is displayed if the LOD score for at least one of the phenotypes exceeds its threshold. The \code{threshold} argument can be a single threshold, applied to all phenotypes, or we can give a vector with separate thresholds for each of the LOD score columns. <<summaryscanoneF>>= summary(out.bc, threshold=c(3,2.5), format="allpeaks") @ A permutation test may be performed as before. Since the \code{fake.bc} data has only autosomal data, use of \code{perm.Xsp=TRUE} would be ignored. <<scanonepermC,eval=FALSE>>= operm.bc <- scanone(out.bc, pheno.col=1:2, method="hk", n.perm=1000) @ We can again use \code{summary} to get LOD thresholds <<summaryscanonepermC>>= summary(operm.bc, alpha=0.05) @ And again these can be used in \code{summary.scanone} to calculate thresholds and get genome-scan-adjusted p-values. <<summaryscanoneG>>= summary(out.bc, perms=operm.bc, alpha=0.05, format="allpeaks", pvalues=TRUE) @ \end{document}