Controlling the Proportion of Falsely Rejected Hypotheses when Conducting Multiple Tests with Climatological Data

Valérie Ventura Department of Statistics, and Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, Pennsylvania

Christopher J. Paciorek Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts

James S. Risbey Centre for Dynamical Meteorology and Oceanography, Monash University, Clayton, Victoria, Australia


Abstract

The analysis of climatological data often involves statistical significance testing at many locations. While the field significance approach determines whether a field as a whole is significant, a multiple testing procedure determines which particular tests are significant. Many such procedures are available, most of which control, for every test, the probability of detecting significance that does not really exist. The aim of this paper is to introduce the novel “false discovery rate” approach, which controls the false rejections in a more meaningful way. Specifically, it controls a priori the expected proportion of falsely rejected tests out of all rejected tests; additionally, the test results are more easily interpretable. The paper also investigates the best way to apply a false discovery rate (FDR) approach to spatially correlated data, which are common in climatology. The most straightforward method for controlling the FDR assumes independence between tests, while other FDR-controlling methods make less stringent assumptions. In a simulation study involving data with a correlation structure similar to that of a real climatological dataset, the simple FDR method does control the proportion of falsely rejected hypotheses despite the violation of its assumption, while a more complicated method involves more computation with little gain in detecting alternative hypotheses. A very general method that makes no assumptions also controls the proportion of falsely rejected hypotheses, but at the cost of detecting few alternative hypotheses. Based on the simulation results, the authors suggest using the straightforward FDR-controlling method despite its unrealistic independence assumption, and they provide a simple modification that increases the power to detect alternative hypotheses.

Corresponding author address: Christopher Paciorek, Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115. Email: paciorek@alumni.cmu.edu
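
For readers unfamiliar with the procedure, the sketch below illustrates the straightforward FDR-controlling method referred to in the abstract: the Benjamini–Hochberg step-up rule, which sorts the m p values, finds the largest k such that the kth smallest p value is at most qk/m, and rejects the k hypotheses with the smallest p values. This is a minimal illustration only; the function name, the level q = 0.05, and the simulated p values are assumptions made here for demonstration and do not reproduce the authors' implementation or the modified procedure proposed in the paper.

```python
# Minimal sketch of the Benjamini-Hochberg step-up procedure for controlling
# the false discovery rate (FDR). Illustrative only: the simulated data and
# names below are assumptions, not the authors' code.
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of rejected hypotheses, controlling the FDR at level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices sorting the p values ascending
    thresholds = q * np.arange(1, m + 1) / m   # step-up comparison values q * i / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()         # largest i with p_(i) <= q * i / m
        reject[order[:k + 1]] = True           # reject the (k + 1) smallest p values
    return reject


# Hypothetical example: 900 tests with no signal, 100 with a signal.
rng = np.random.default_rng(0)
p_null = rng.uniform(size=900)            # p values uniform under the null
p_alt = rng.beta(0.05, 1.0, size=100)     # p values concentrated near 0 under the alternative
rejected = benjamini_hochberg(np.concatenate([p_null, p_alt]), q=0.05)
print("rejected:", rejected.sum(), "falsely rejected:", rejected[:900].sum())
```

On average over repeated simulations of this kind, the proportion of false rejections among all rejections stays near or below q, in contrast to per-test error control, which limits the chance of any single false rejection regardless of how many tests are performed.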

