Identifying Rogue Air Temperature Stations Using Cluster Analysis of Percentile Trends

Scott M. Robeson, Department of Geography, Indiana University, Bloomington, Indiana

and
Jeffrey A. Doty, Department of Geography, Indiana University, Bloomington, Indiana


Abstract

A new and efficient method for identifying “rogue” air temperature stations (locations with unusually large air temperature trends) is presented. Instrumentation problems and spatially unrepresentative local climates are sometimes more apparent in air temperature extremes, yet can have more subtle impacts on variations in mean air temperature. For this reason, the tails of daily air temperature frequency distributions were examined for unusual trends, using data from over 1300 stations in North America. In particular, linear trends in the 5th percentile of daily minimum air temperature during the winter months and the 95th percentile of daily maximum air temperature during the summer were analyzed. Cluster analysis was then used to identify stations that were distinct from other locations. Both single-linkage and average-linkage clustering were evaluated.
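The percentile-trend step described above can be illustrated with a minimal sketch. It assumes that a percentile is computed separately for each year's seasonal daily values and that an ordinary least-squares line is then fitted through those yearly percentiles; the abstract does not state how the percentiles or trends were actually estimated, so this is only one plausible reading. The function names and the synthetic data are hypothetical.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no data")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


def percentile_trend(seasonal_temps_by_year, q):
    """Trend (degrees per year) in the q-th percentile of one season's daily values.

    seasonal_temps_by_year maps year -> list of daily temperatures for that season.
    """
    years = sorted(seasonal_temps_by_year)
    yearly_percentiles = [percentile(seasonal_temps_by_year[y], q) for y in years]
    return ols_slope(years, yearly_percentiles)


# Synthetic (not observed) winter daily minima: a fixed within-season spread
# plus a warming of 0.05 degrees per year imposed on every day.
spread = [0.3 * d - 13.5 for d in range(90)]
winter_tmin = {
    year: [0.05 * (year - 1950) + s for s in spread]
    for year in range(1950, 2000)
}

trend = percentile_trend(winter_tmin, 5)
print(f"5th-percentile trend: {trend:.3f} degrees per year")
```

Repeating the calculation with the 95th percentile of summer daily maxima would give the second coordinate of each station in the percentile trend space.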

By identifying individual stations along the entire periphery of the percentile trend space, single-linkage clustering appears to produce better results than average-linkage clustering. Average-linkage clustering tends to group together several stations with large trends; however, only a handful of these stations appear distinctly different from the large body of trends toward the center of the percentile trend space. Maps of the rogue stations show that most are in close proximity to numerous other stations that were not grouped into the rogue cluster, making it unlikely that the unusually large temperature trends were due to regional climatic variations. As with all approaches for evaluating data quality, time series plots and station history information must also be inspected to understand inhomogeneous variations in historical climatic data more fully.
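The two linkage rules compared above differ only in how the distance between two clusters is defined: single linkage uses the closest pair of members, and average linkage uses the mean over all pairs. The following is a minimal sketch of a naive agglomerative procedure, not the authors' implementation. It assumes Euclidean distance in a two-dimensional trend space (winter 5th-percentile trend, summer 95th-percentile trend), and the station coordinates are invented for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters, linkage="single"):
    """Naive agglomerative clustering; returns clusters as lists of point indices."""
    if linkage not in ("single", "average"):
        raise ValueError("linkage must be 'single' or 'average'")
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        pair_dists = [dist(points[i], points[j]) for i in a for j in b]
        if linkage == "single":
            return min(pair_dists)                # closest pair of members
        return sum(pair_dists) / len(pair_dists)  # mean over all pairs

    while len(clusters) > n_clusters:
        # Merge the two clusters that are currently closest under the chosen rule.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical trend space: a compact body of 25 stations near the origin
# and one station (index 25) far out on the periphery.
stations = [(0.1 * i, 0.1 * j) for i in range(-2, 3) for j in range(-2, 3)]
stations.append((1.5, -1.2))

single_clusters = agglomerate(stations, n_clusters=2, linkage="single")
average_clusters = agglomerate(stations, n_clusters=2, linkage="average")

# Stations left alone in their own cluster are candidate rogue stations.
rogue_single = [c[0] for c in single_clusters if len(c) == 1]
rogue_average = [c[0] for c in average_clusters if len(c) == 1]
print("single linkage singletons:", rogue_single)
print("average linkage singletons:", rogue_average)
```

In this deliberately simple example both rules isolate the same station. The differences reported above arise with many peripheral stations, where average linkage may pool several large-trend stations into one group. The pairwise search here is slow for large inputs, so a library routine would be preferable for more than a thousand stations.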

Corresponding author address: Scott M. Robeson, Department of Geography, 701 E. Kirkwood Ave., Indiana University, Bloomington, IN 47405. Email: srobeson@indiana.edu
