Multiclass ROC Analysis

Matthew S. Wandishin, Department of Atmospheric Sciences, The University of Arizona, Tucson, Arizona

and
Steven J. Mullen, Department of Atmospheric Sciences, The University of Arizona, Tucson, Arizona


Abstract

Receiver operating characteristic (ROC) curves have become a common analysis tool for evaluating forecast discrimination: the ability of a forecast system to distinguish between events and nonevents. As is implicit in that statement, application of the ROC curve is limited to forecasts involving only two possible outcomes, such as rain and no rain. However, many forecast scenarios exist for which there are multiple possible outcomes, such as rain, snow, and freezing rain. An extension of the ROC curve to multiclass forecast problems is explored. The full extension involves high-dimensional hypersurfaces that cannot be visualized and that present other problems. Therefore, several different approximations to the full extension are introduced using both artificial and actual forecast datasets. These approximations range from sets of simple two-class ROC curves to sets of three-dimensional ROC surfaces. No single approximation is superior for all forecast problems; thus, the specific aims in evaluating the forecast must be considered.
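
The simplest family of approximations named above, a set of two-class ROC curves, can be illustrated with a minimal sketch. It assumes a one-versus-rest reduction (each category in turn is the event, all others the nonevent) and computes each ROC area from the Mann-Whitney pair-counting statistic. The function names, the category labels, and the toy probabilities are hypothetical and chosen only for illustration; this is not necessarily the procedure used in the paper.

```python
# Sketch: one-versus-rest two-class ROC areas for a three-category forecast.
# All names and the toy data below are illustrative assumptions.

def roc_area(scores, is_event):
    """Area under a two-class ROC curve via the Mann-Whitney statistic:
    the fraction of (event, nonevent) pairs in which the event case got
    the higher forecast probability, with ties counted as one half."""
    events = [s for s, e in zip(scores, is_event) if e]
    nonevents = [s for s, e in zip(scores, is_event) if not e]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")
    wins = 0.0
    for ev in events:
        for ne in nonevents:
            if ev > ne:
                wins += 1.0
            elif ev == ne:
                wins += 0.5
    return wins / (len(events) * len(nonevents))


def one_vs_rest_areas(prob_forecasts, observed, categories):
    """Return {category: ROC area}, treating each category in turn as the
    event and all remaining categories together as the nonevent."""
    areas = {}
    for k, cat in enumerate(categories):
        scores = [p[k] for p in prob_forecasts]
        is_event = [obs == cat for obs in observed]
        areas[cat] = roc_area(scores, is_event)
    return areas


categories = ["rain", "snow", "freezing rain"]
# Toy forecasts: each row holds probabilities for (rain, snow, freezing rain).
prob_forecasts = [
    (0.7, 0.2, 0.1),
    (0.6, 0.1, 0.3),
    (0.2, 0.7, 0.1),
    (0.1, 0.6, 0.3),
    (0.3, 0.3, 0.4),
    (0.5, 0.3, 0.2),
]
observed = ["rain", "rain", "snow", "snow", "freezing rain", "freezing rain"]

areas = one_vs_rest_areas(prob_forecasts, observed, categories)
for cat in categories:
    print(f"{cat}: ROC area = {areas[cat]:.3f}")
```

In this toy dataset the rain and snow forecasts separate events from nonevents perfectly, while the freezing rain forecasts do not, so the three areas differ. A set of per-category areas like this keeps each curve easy to visualize but discards information about which wrong category was forecast, which is one reason the choice of approximation depends on the aims of the evaluation.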

Corresponding author address: Matthew Wandishin, NSSL, National Weather Center, 120 David L. Boren Blvd., Norman, OK 73072. Email: matt.wandishin@noaa.gov
