<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>UT MD Anderson Cancer Center Department of Biostatistics Working Paper Series</title>
<copyright>Copyright (c) 2013 University of Texas, MD Anderson Cancer Center All rights reserved.</copyright>
<link>http://biostats.bepress.com/mdandersonbiostat</link>
<description>Recent documents in UT MD Anderson Cancer Center Department of Biostatistics Working Paper Series</description>
<language>en-us</language>
<lastBuildDate>Fri, 15 Feb 2013 01:40:18 PST</lastBuildDate>
<ttl>3600</ttl>


	
		
	







<item>
<title>A Study of Mexican Free-Tailed Bat Chirp Syllables: Bayesian Functional Mixed Models for Nonstationary Acoustic Time Series</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper79</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper79</guid>
<pubDate>Wed, 13 Feb 2013 13:18:36 PST</pubDate>
<description>
	<![CDATA[
	<p>We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the region-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible.</p>

	]]>
</description>

<author>Josue G. Martinez et al.</author>


</item>






<item>
<title>Approximating random inequalities with Edgeworth expansions</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper78</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper78</guid>
<pubDate>Fri, 09 Nov 2012 05:14:00 PST</pubDate>
<description>
	<![CDATA[
	<p>Random inequalities of the form Prob(X > Y + δ) often appear as part of Bayesian clinical trial methods. Simulating trial designs could require calculating millions of random inequalities. When these inequalities require numerical integration, or worse, random sampling, the inequality calculations account for the large majority of the simulation time. In this paper we show how to approximate random inequalities using Edgeworth expansions. The calculations required to use these expansions can be done in closed form, as we will see below. Although the calculations are elementary, they are also somewhat tedious, and so we include Python code to illustrate how to use the approximations in practice. We make no distributional assumptions on the random variables X and Y other than requiring that the necessary moments exist. The accuracy of the approximation will depend on how well the densities of these random variables are approximated by the Edgeworth expansions.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Fast Approximation of Inverse Gamma Inequalities</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper77</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper77</guid>
<pubDate>Wed, 17 Oct 2012 07:40:01 PDT</pubDate>
<description>
	<![CDATA[
	<p>Bayesian clinical trial methods sometimes use a conjugate exponential-inverse gamma model for event times. Random inequalities between posterior inverse gamma distributions are used to determine stopping conditions, for example in [1]. Computing these inequality probabilities accounts for nearly all of the computation time used in simulating such trials. This report presents an approximation that could reduce this time by two orders of magnitude.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Fast approximation of Beta inequalities</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper76</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper76</guid>
<pubDate>Mon, 15 Oct 2012 05:40:09 PDT</pubDate>
<description>
	<![CDATA[
	<p>Many Bayesian clinical trial methods are based on random inequalities. For some distribution families, these inequalities can be computed in closed form. For example, [1] gives closed-form solutions to computing</p>
<p>P(X > Y)</p>
<p>when X and Y are either independent normal or independent gamma random variables. However, the case of beta random variables is very important, and no closed-form solution for this case is known. Such inequalities must be evaluated numerically. Simulation programs using these inequalities spend nearly all their time computing the inequalities. This report presents a closed-form approximation for beta inequalities that is two orders of magnitude faster to evaluate.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Uniformly Most Powerful Bayesian Tests</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper74</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper74</guid>
<pubDate>Wed, 18 Jul 2012 14:10:22 PDT</pubDate>
<description>
	<![CDATA[
	<p>Uniformly most powerful tests are statistical hypothesis tests that provide the greatest power against a fixed null hypothesis among all possible tests of a given size. In this article, I extend the notion of uniformly most powerful tests by defining a uniformly most powerful Bayesian test to be a test which maximizes the probability that the Bayes factor in favor of the alternative hypothesis exceeds a given threshold. Like their classical counterpart, uniformly most powerful Bayesian tests are most easily defined in one-parameter exponential family models, although I demonstrate that extensions outside of this class are possible. I also show how the connection between uniformly most powerful tests and uniformly most powerful Bayesian tests can be used to provide an approximate calibration between <em>p</em>-values and Bayes factors. Several examples of these new Bayesian tests are provided.</p>

	]]>
</description>

<author>Valen E. Johnson</author>


</item>






<item>
<title>CRM: Prior means and prior medians</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper73</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper73</guid>
<pubDate>Wed, 11 Jul 2012 06:12:31 PDT</pubDate>
<description>
	<![CDATA[
	
	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Robust Classification of Functional and Quantitative Image Data using Functional Mixed Models</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper72</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper72</guid>
<pubDate>Thu, 17 Nov 2011 12:01:24 PST</pubDate>
<description>
	<![CDATA[
	<p>In this paper, we introduce classification of complex high dimensional functional data in the functional mixed model (FMM) framework.  The FMM relates a functional response to a set of scalar predictors through functional fixed and random effects, and therefore is able to account for various factors that affect the functions and induce correlations.  Classification is performed by training on the data, treating the class as one of the fixed effects, and then predicting on the test data using posterior predictive probabilities.  Through a Bayesian scheme, we are able to incorporate not only all factors that influence the functions, but also factors that directly affect class designation. While this classification method is general for all FMM methods, we provide details for two specific Bayesian approaches, the Gaussian, wavelet-based functional mixed model (G-WFMM) and the robust, wavelet-based functional mixed model (R-WFMM).  Both methods perform modeling in the wavelet space, which yields parsimonious representations for the functions, can naturally adapt to local features, and accommodates various nonstationarities.  The R-WFMM has the additional advantage of allowing potentially heavier tails for features of the functions indexed by particular wavelet coefficients, leading to a down-weighting of outliers that makes the method robust to outlying functions or regions of functions.  The models are applied to a real mass spectroscopy dataset in pancreatic cancer research.  Our results show improved classification when comparing FMM with other typical functional data classification methods and with ad hoc methods that are based on detected spectral peaks.</p>

	]]>
</description>

<author>Hongxiao Zhu et al.</author>


</item>






<item>
<title>Random inequalities between survival and uniform distributions</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper71</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper71</guid>
<pubDate>Fri, 23 Sep 2011 12:02:39 PDT</pubDate>
<description>
	<![CDATA[
	<p>This note will look at ways of computing P(X > Y), where X has a distribution modeling survival (gamma, inverse gamma, Weibull, log-normal) and Y has a uniform distribution.  Each of these can be computed in closed form in terms of common statistical functions. We begin with analytical calculations and then include software implementations in R to make some of the details more explicit. Finally, we give a suggestion for using simulation to compute random inequalities that cannot be computed in closed form.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Basic properties of the soft maximum</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper70</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper70</guid>
<pubDate>Fri, 23 Sep 2011 07:35:13 PDT</pubDate>
<description>
	<![CDATA[
	<p>This note presents the basic properties of the soft maximum, a smooth approximation to the maximum of two real variables. It concludes by looking at potential numerical difficulties with the soft maximum and how to avoid these difficulties.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Goodness-of-fit Diagnostics for Bayesian Hierarchical Models</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper69</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper69</guid>
<pubDate>Tue, 17 May 2011 09:15:45 PDT</pubDate>
<description>
	<![CDATA[
	<p>This article proposes methodology for assessing goodness of fit in Bayesian hierarchical models. The methodology is based on comparisons of the posterior distributions of pivotal discrepancy measures to known reference distributions at various levels of model hierarchies. Because resulting diagnostics can be calculated from the standard output of Markov chain Monte Carlo algorithms, their computational costs are minimal. Several simulation studies are provided, each of which suggests that diagnostics based on pivotal discrepancy measures have higher statistical power than comparable posterior-predictive diagnostic checks in detecting model departures. The proposed methodology is illustrated in a clinical application.</p>

	]]>
</description>

<author>Valen Johnson</author>


</item>






<item>
<title>Bayesian Model Selection in High-dimensional Settings</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper67</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper67</guid>
<pubDate>Tue, 10 May 2011 11:36:36 PDT</pubDate>
<description>
	<![CDATA[
	<p>Model selection is among the most fundamental and commonly encountered statistical challenges in scientific research. Standard assumptions incorporated into Bayesian model selection procedures result in model selection procedures that are not competitive with commonly used penalized likelihood methods. We propose modifications of standard Bayesian methods by imposing non-local prior densities on model parameters. We show that the resulting model selection procedures are consistent in linear model settings when the number of possible covariates p is bounded by the number of observations n, a property that has not been extended to other model selection procedures. In addition to consistently identifying the true model, the proposed procedures provide accurate estimates of the posterior probability that each identified model is correct. Through simulation studies, we demonstrate that these model selection procedures perform as well as or better than commonly used penalized likelihood methods in a range of simulation settings. Proofs of the primary theorem and corollaries underlying the sampling properties of the proposed procedures are provided in supplemental material that is available online.</p>

	]]>
</description>

<author>Valen Johnson et al.</author>


</item>






<item>
<title>Robust, Adaptive Functional Regression in Functional Mixed Model Framework</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper66</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper66</guid>
<pubDate>Mon, 07 Mar 2011 11:55:23 PST</pubDate>
<description>
	<![CDATA[
	<p>Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effects, random effect and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to only provide the functional data and design matrices. 
It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g., images), and other invertible transformations as alternatives to wavelets.</p>

	]]>
</description>

<author>Hongxiao Zhu et al.</author>


</item>






<item>
<title>Skeptical and Optimistic Robust Priors for Clinical Trials</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper65</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper65</guid>
<pubDate>Fri, 18 Feb 2011 11:42:53 PST</pubDate>
<description>
	<![CDATA[
	<p>A useful technique from the subjective Bayesian viewpoint, suggested by Spiegelhalter et al. (1994), is to ask the subject matter researchers and other parties involved, such as pharmaceutical companies and regulatory bodies, for reasonable optimistic and pessimistic priors regarding the effectiveness of a new treatment. Up to now, the proposed skeptical and optimistic priors have been limited to conjugate priors, though there is no need for this limitation. The same reasonably adversarial points of view can be taken with robust priors. Robust priors permit a much faster and more efficient resolution of the disagreement between the conclusions based on skeptical and optimistic priors. As a consequence, robust Bayesian clinical trials tend to be shorter. A recent reference in which robust priors are usefully applied to clinical trials is Fuquene, Cook, and Pericchi (2009). Our proposal in this paper is to use Cauchy and intrinsic robust priors for both skeptical and optimistic priors, leading to results more closely related to the sampling data when prior and data are in conflict. In other words, the use of robust priors removes the dogmatism implicit in conjugate priors.</p>

	]]>
</description>

<author>John D. Cook et al.</author>


</item>






<item>
<title>Block Adaptive Randomization</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper63</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper63</guid>
<pubDate>Tue, 18 Jan 2011 11:27:09 PST</pubDate>
<description>
	<![CDATA[
	<p>This note proposes a block-adaptive randomization method to limit the length of runs in an outcome-adaptive randomized trial.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Upper bounds on non-central chi-squared tails and truncated normal moments</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper62</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper62</guid>
<pubDate>Tue, 05 Oct 2010 07:48:36 PDT</pubDate>
<description>
	<![CDATA[
	<p>We show that moments of the truncated normal distribution provide upper bounds on the tails of the non-central chi-squared distribution, then develop upper bounds for the former.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>Asymptotic results for Normal-Cauchy model</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper61</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper61</guid>
<pubDate>Wed, 01 Sep 2010 14:27:17 PDT</pubDate>
<description>
	<![CDATA[
	<p>This report proves asymptotic results for the posterior mean when sampling from a normal distribution with a Cauchy prior on the location parameter.</p>

	]]>
</description>

<author>John D. Cook</author>


</item>






<item>
<title>The King’s Foot of Patient Reported Outcomes: Current Practices and New Developments for the Measurement of Change</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper60</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper60</guid>
<pubDate>Thu, 15 Jul 2010 08:22:00 PDT</pubDate>
<description>
	<![CDATA[
	<p>In June 2009, a group of experts met in a Longitudinal Analysis of Patient Reported Outcomes Working group as part of the Statistical and Applied Mathematical Sciences Institute Summer Psychometric program to discuss the complex issues that arise when conceptualizing and operationalizing "change" in PRO measures and related constructs. This White Paper summarizes these issues and provides possible paths for dealing with the complexities of measuring change. It will discuss issues associated with: (1) conceptualizing and operationalizing change in PRO measures; (2) modeling change using state-of-the-art statistical methods; (3) impediments to detecting true change; (4) new developments to deal with these challenges; and (5) important gaps that are fertile ground for future research.</p>

	]]>
</description>

<author>Richard J. Swartz et al.</author>


</item>






<item>
<title>Consistent Bayesian model selection in p&lt;=n settings</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper59</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper59</guid>
<pubDate>Mon, 07 Jun 2010 05:53:56 PDT</pubDate>
<description>
	<![CDATA[
	<p>Let Y<sub>n</sub> = (y<sub>1</sub>, . . . , y<sub>n</sub>)′ denote a random vector, X<sub>n</sub> an n × p matrix of real numbers, and β a p × 1 regression vector. This article addresses the selection of non-zero components of β when it is assumed that Y<sub>n</sub> ~ N(X<sub>n</sub>β, σ<sup>2</sup>I<sub>n</sub>) and p ≤ n. Model selection is based on the calculation of posterior model probabilities using non-local prior densities on the regression coefficients for each possible model. The non-local prior densities used for model definition are obtained as products of normal moment priors and are called pMOM prior densities. Under mild conditions on the matrix (X<sub>n</sub>′X<sub>n</sub>)<sup>−1</sup>, we demonstrate that the use of these priors guarantees that the posterior probability of the true model converges to 1 as the sample size increases, and that the resulting model selection procedure exhibits an “oracle” property in the p ≤ n setting.</p>

	]]>
</description>

<author>Valen E. Johnson</author>


</item>






<item>
<title>An inequality for the product of positive random variables</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper57</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper57</guid>
<pubDate>Tue, 16 Mar 2010 09:39:44 PDT</pubDate>
<description>
	<![CDATA[
	<p>A simple inequality is provided for a countable product of non-negative random variables with arbitrary dependence structure.</p>

	]]>
</description>

<author></author>


</item>






<item>
<title>Automated Analysis of Quantitative Image Data Using Isomorphic Functional Mixed Models with Application to Proteomics Data</title>
<link>http://biostats.bepress.com/mdandersonbiostat/paper56</link>
<guid isPermaLink="true">http://biostats.bepress.com/mdandersonbiostat/paper56</guid>
<pubDate>Fri, 12 Feb 2010 08:02:24 PST</pubDate>
<description>
	<![CDATA[
	<p>Image data are increasingly encountered and are of growing importance in many areas of science. Much of these data are quantitative image data, which are characterized by intensities that represent some measurement of interest in the scanned images. The data typically consist of multiple images on the same domain and the goal of the research is to combine the quantitative information across images to make inference about populations or interventions. In this paper, we present a unified analysis framework for the analysis of quantitative image data using a Bayesian functional mixed model approach. This framework is flexible enough to handle complex, irregular images with many local features, and can model the simultaneous effects of multiple factors on the image intensities and account for the correlation between images induced by the design. We introduce a general isomorphic modeling approach to fitting the functional mixed model, of which the wavelet-based functional mixed model is one example. With suitable modeling choices, this approach leads to efficient calculations and can result in flexible modeling and adaptive smoothing of the salient features in the data. The proposed method has the following advantages: it can be run automatically, it produces inferential plots indicating which regions of the image are associated with each factor, it simultaneously considers the practical and statistical significance of findings, and it controls the false discovery rate. Although the method we present is general and can be applied to quantitative image data from any application, in this paper we focus on image-based proteomic data. We apply our method to an animal study investigating the effects of opiate addiction on the brain proteome. Our image-based functional mixed model approach finds results that are missed with conventional spot-based analysis approaches. 
In particular, we find that the significant regions of the image identified by the proposed method frequently correspond to subregions of visible spots that may represent post-translational modifications or co-migrating proteins that cannot be visually resolved from adjacent, more abundant proteins on the gel image. Thus, it is possible that this image-based approach may actually improve the realized resolution of the gel, revealing differentially expressed proteins that would not have even been detected as spots by modern spot-based analyses.</p>

	]]>
</description>

<author>Jeffrey S. Morris et al.</author>


</item>





</channel>
</rss>
