• Aksoy, A., S. Lorsolo, T. Vukicevic, K. J. Sellwood, S. D. Aberson, and F. Zhang, 2012: The HWRF Hurricane Ensemble Data Assimilation System (HEDAS) for high-resolution data: The impact of airborne Doppler radar observations in an OSSE. Mon. Wea. Rev., 140, 1843–1862, https://doi.org/10.1175/MWR-D-11-00212.1.

• Alessandrini, S., L. Delle Monache, S. Sperati, and J. N. Nissen, 2015a: A novel application of an analog ensemble for short-term wind power forecasting. Renew. Energy, 76, 768–781, https://doi.org/10.1016/j.renene.2014.11.061.

• Alessandrini, S., L. Delle Monache, S. Sperati, and G. Cervone, 2015b: An analog ensemble for short-term probabilistic solar power forecast. Appl. Energy, 157, 95–110, https://doi.org/10.1016/j.apenergy.2015.08.011.

• Bao, J.-W., S. G. Gopalakrishnan, S. A. Michelson, F. D. Marks, and M. T. Montgomery, 2012: Impact of physics representations in the HWRFX on simulated hurricane structure and pressure–wind relationships. Mon. Wea. Rev., 140, 3278–3299, https://doi.org/10.1175/MWR-D-11-00332.1.

• Biswas, M. K., L. Bernardet, and J. Dudhia, 2014: Sensitivity of hurricane forecasts to cumulus parameterizations in the HWRF Model. Geophys. Res. Lett., 41, 9113–9119, https://doi.org/10.1002/2014GL062071.

• Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3, https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.

• Brown, T. A., 1974: Admissible scoring systems for continuous distributions. The Rand Corporation Doc. P-5235, 22 pp., https://www.rand.org/pubs/papers/P5235.html.

• Buizza, R., P. L. Houtekamer, G. Pellerin, Z. Toth, Y. Zhu, and M. Wei, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097, https://doi.org/10.1175/MWR2905.1.

• Carney, M., and P. Cunningham, 2006: Evaluating density forecasting models. Trinity College Dublin, Department of Computer Science Rep., 12 pp.

• Davò, F., S. Alessandrini, S. Sperati, L. Delle Monache, D. Airoldi, and M. T. Vespucci, 2016: Post-processing techniques and principal component analysis for regional wind power and solar irradiance forecasting. Sol. Energy, 134, 327–338, https://doi.org/10.1016/j.solener.2016.04.049.

• Delle Monache, L., T. Nipen, Y. Liu, G. Roux, and R. Stull, 2011: Kalman filter and analog schemes to postprocess numerical weather predictions. Mon. Wea. Rev., 139, 3554–3570, https://doi.org/10.1175/2011MWR3653.1.

• Delle Monache, L., F. A. Eckel, D. L. Rife, B. Nagarajan, and K. Searight, 2013: Probabilistic weather prediction with an analog ensemble. Mon. Wea. Rev., 141, 3498–3516, https://doi.org/10.1175/MWR-D-12-00281.1.

• DeMaria, M., 2009: A simplified dynamical system for tropical cyclone intensity prediction. Mon. Wea. Rev., 137, 68–82, https://doi.org/10.1175/2008MWR2513.1.

• DeMaria, M., M. Mainelli, L. K. Shay, J. A. Knaff, and J. Kaplan, 2005: Further improvements to the Statistical Hurricane Intensity Prediction Scheme (SHIPS). Wea. Forecasting, 20, 531–543, https://doi.org/10.1175/WAF862.1.

• DeMaria, M., J. A. Knaff, and J. Kaplan, 2006: On the decay of tropical cyclone winds crossing narrow landmasses. J. Appl. Meteor. Climatol., 45, 491–499, https://doi.org/10.1175/JAM2351.1.

• DeMaria, M., C. R. Sampson, J. A. Knaff, and K. D. Musgrave, 2014: Is tropical cyclone intensity guidance improving? Bull. Amer. Meteor. Soc., 95, 387–398, https://doi.org/10.1175/BAMS-D-12-00240.1.

• Déqué, M., 2007: Frequency of precipitation and temperature extremes over France in an anthropogenic scenario: Model results and statistical correction according to observed values. Global Planet. Change, 57, 16–26, https://doi.org/10.1016/j.gloplacha.2006.11.030.

• Djalalova, I., L. Delle Monache, and J. Wilczak, 2015: PM2.5 analog forecast and Kalman filtering post-processing for the Community Multiscale Air Quality (CMAQ) model. Atmos. Environ., 119, 431–442, https://doi.org/10.1016/j.atmosenv.2015.05.057.

• Doyle, J. D., and Coauthors, 2011: Real-time tropical cyclone prediction using COAMPS-TC. Adv. Geosci., 28, 15–28, https://doi.org/10.1142/9789814405683_0002.

• Efroymson, M. A., 1960: Multiple regression analysis. Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. 1, Wiley and Sons, 191–203.

• Emanuel, K. A., 1986: An air–sea interaction theory for tropical cyclones. Part I: Steady-state maintenance. J. Atmos. Sci., 43, 585–605, https://doi.org/10.1175/1520-0469(1986)043<0585:AASITF>2.0.CO;2.

• Ferrier, B. S., Y. Jin, Y. Lin, T. Black, E. Rogers, and G. DiMego, 2002: Implementation of a new grid-scale cloud and precipitation scheme in the NCEP Eta model. 19th Conf. on Weather Analysis and Forecasting/15th Conf. on Numerical Weather Prediction, San Antonio, TX, Amer. Meteor. Soc., 10.1, https://ams.confex.com/ams/SLS_WAF_NWP/techprogram/paper_47241.htm.

• Fortin, V., M. Abaza, F. Anctil, and R. Turcotte, 2014: Why should ensemble spread match the RMSE of the ensemble mean? J. Hydrometeor., 15, 1708–1713, https://doi.org/10.1175/JHM-D-14-0008.1.

• Gall, R., J. Franklin, F. Marks, E. N. Rappaport, and F. Toepfer, 2013: The Hurricane Forecast Improvement Project. Bull. Amer. Meteor. Soc., 94, 329–343, https://doi.org/10.1175/BAMS-D-12-00071.1.

• Gneiting, T., 2011: Making and evaluating point forecasts. J. Amer. Stat. Assoc., 106, 746–762, https://doi.org/10.1198/jasa.2011.r10138.

• Goerss, J. S., and C. R. Sampson, 2014: Prediction of consensus tropical cyclone intensity forecast error. Wea. Forecasting, 29, 750–762, https://doi.org/10.1175/WAF-D-13-00058.1.

• Gopalakrishnan, S. G., F. Marks, X. Zhang, J.-W. Bao, K.-S. Yeh, and R. Atlas, 2011: The Experimental HWRF System: A study on the influence of horizontal resolution on the structure and intensity changes in tropical cyclones using an idealized framework. Mon. Wea. Rev., 139, 1762–1784, https://doi.org/10.1175/2010MWR3535.1.

• Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on reforecast analogs: Theory and application. Mon. Wea. Rev., 134, 3209–3229, https://doi.org/10.1175/MWR3237.1.

• Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15, 559–570, https://doi.org/10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2.

• Hopson, T. M., 2014: Assessing the ensemble spread–error relationship. Mon. Wea. Rev., 142, 1125–1142, https://doi.org/10.1175/MWR-D-12-00111.1.

• Janjic, Z. I., R. Gall, and M. E. Pyle, 2010: Scientific documentation for the NMM solver. NCAR Tech. Note NCAR/TN-477+STR, 53 pp., https://doi.org/10.5065/D6MW2F3Z.

• Jarvinen, B. R., and C. J. Neumann, 1979: Statistical forecasts of tropical cyclone intensity for the North Atlantic basin. NOAA Tech. Memo. NWS NHC-10, 22 pp., https://www.nhc.noaa.gov/pdf/NWS-NHC-1979-10.pdf.

• Junk, C., L. Delle Monache, S. Alessandrini, G. Cervone, and L. von Bremen, 2015: Predictor-weighting strategies for probabilistic wind power forecasting with an analog ensemble. Meteor. Z., 24, 361–379, https://doi.org/10.1127/metz/2015/0659.

• Kaplan, J., and Coauthors, 2015: Evaluating environmental impacts on tropical cyclone rapid intensification predictability utilizing statistical models. Wea. Forecasting, 30, 1374–1396, https://doi.org/10.1175/WAF-D-15-0032.1.

• Kossin, J. P., and M. Sitkowski, 2009: An objective model for identifying secondary eyewall formation in hurricanes. Mon. Wea. Rev., 137, 876–892, https://doi.org/10.1175/2008MWR2701.1.

• Kossin, J. P., and M. DeMaria, 2016: Reducing operational hurricane intensity forecast errors during eyewall replacement cycles. Wea. Forecasting, 31, 601–608, https://doi.org/10.1175/WAF-D-15-0123.1.

• Krishnamurti, T. N., R. Correa-Torres, G. Rohaly, D. Oosterhof, and N. Surgi, 1997: Physical initialization and hurricane ensemble forecasts. Wea. Forecasting, 12, 503–514, https://doi.org/10.1175/1520-0434(1997)012<0503:PIAHEF>2.0.CO;2.

• Krishnamurti, T. N., C. M. Kishtawal, Z. Zhang, T. LaRow, D. Bachiochi, E. Williford, S. Gadgil, and S. Surendran, 2000: Multimodel ensemble forecasts for weather and seasonal climate. J. Climate, 13, 4196–4216, https://doi.org/10.1175/1520-0442(2000)013<4196:MEFFWA>2.0.CO;2.

• Landsea, C. W., and J. L. Franklin, 2013: Atlantic hurricane database uncertainty and presentation of a new database format. Mon. Wea. Rev., 141, 3576–3592, https://doi.org/10.1175/MWR-D-12-00254.1.

• Landsea, C. W., J. L. Franklin, and J. Beven, 2015: The revised Atlantic hurricane database (HURDAT2). NOAA/NHC Doc., 6 pp., http://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf.

• Liu, Q., N. Surgi, S. Lord, W.-S. Wu, D. Parrish, S. Gopalakrishnan, J. Waldrop, and J. Gamache, 2006: Hurricane initialization in HWRF Model. 27th Conf. on Hurricanes and Tropical Meteorology, Monterey, CA, Amer. Meteor. Soc., 8A.2, https://ams.confex.com/ams/pdfpapers/108496.pdf.

• Miyamoto, Y., and T. Takemi, 2013: A transition mechanism for the spontaneous axisymmetric intensification of tropical cyclones. J. Atmos. Sci., 70, 112–129, https://doi.org/10.1175/JAS-D-11-0285.1.

• Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600, https://doi.org/10.1175/1520-0450(1973)012<0595:ANVPOT>2.0.CO;2.

• Nagarajan, B., L. Delle Monache, J. Hacker, D. Rife, K. Searight, J. Knievel, and T. Nipen, 2015: An evaluation of analog-based postprocessing methods across several variables and forecast models. Wea. Forecasting, 30, 1623–1643, https://doi.org/10.1175/WAF-D-14-00081.1.

• NCAR, 2014: Verification: Weather forecast verification utilities. R package version 1.40, accessed 1 February 2018, http://CRAN.R-project.org/package=verification.

• Nolan, D. S., J. A. Zhang, and E. W. Uhlhorn, 2014: On the limits of estimating the maximum wind speeds in hurricanes. Mon. Wea. Rev., 142, 2814–2837, https://doi.org/10.1175/MWR-D-13-00337.1.

• Pu, Z., S. Zhang, M. Tong, and V. Tallapragada, 2016: Influence of the self-consistent regional ensemble background error covariance on hurricane inner-core data assimilation with the GSI-based hybrid system for HWRF. J. Atmos. Sci., 73, 4911–4925, https://doi.org/10.1175/JAS-D-16-0017.1.

• Rogers, R., P. Reasor, and S. Lorsolo, 2013: Airborne Doppler observations of the inner-core structural differences between intensifying and steady-state tropical cyclones. Mon. Wea. Rev., 141, 2970–2991, https://doi.org/10.1175/MWR-D-12-00357.1.

• Sampson, C. R., J. L. Franklin, J. A. Knaff, and M. DeMaria, 2008: Experiments with a simple tropical cyclone intensity consensus. Wea. Forecasting, 23, 304–312, https://doi.org/10.1175/2007WAF2007028.1.

• Stevenson, S. N., K. L. Corbosiero, M. DeMaria, and J. L. Vigh, 2018: A 10-year survey of tropical cyclone inner-core lightning bursts and their relationship to intensity change. Wea. Forecasting, 33, 23–36, https://doi.org/10.1175/WAF-D-17-0096.1.

• Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems. Proc. ECMWF Workshop on Predictability, Reading, United Kingdom, ECMWF, 25 pp., https://www.ecmwf.int/sites/default/files/elibrary/1997/12555-evaluation-probabilistic-prediction-systems.pdf.

• Tallapragada, V., and Coauthors, 2015: Hurricane Weather Research and Forecasting (HWRF) Model: 2015 scientific documentation. NCAR Development Testbed Center Rep., 113 pp., http://www.dtcenter.org/HurrWRF/users/docs/scientific_documents/HWRF_v3.7a_SD.pdf.

• Torn, R. D., and C. Snyder, 2012: Uncertainty of tropical cyclone best-track information. Wea. Forecasting, 27, 715–729, https://doi.org/10.1175/WAF-D-11-00085.1.

• Trahan, S., and L. Sparling, 2012: An analysis of NCEP tropical cyclone vitals and potential effects on forecasting models. Wea. Forecasting, 27, 744–756, https://doi.org/10.1175/WAF-D-11-00063.1.

• Tsai, H.-C., and R. L. Elsberry, 2014: Applications of situation-dependent intensity and intensity spread predictions based on a weighted analog technique. Asia-Pac. J. Atmos. Sci., 50, 507–518, https://doi.org/10.1007/s13143-014-0040-7.

• Tsai, H.-C., and R. L. Elsberry, 2015: Weighted analog technique for intensity and intensity spread predictions of Atlantic tropical cyclones. Wea. Forecasting, 30, 1321–1333, https://doi.org/10.1175/WAF-D-15-0030.1.

• Velden, C. S., and Coauthors, 2006: The Dvorak tropical cyclone intensity estimation technique: A satellite-based method that has endured for over 30 years. Bull. Amer. Meteor. Soc., 87, 1195–1210, https://doi.org/10.1175/BAMS-87-9-1195.

• Vigh, J., and W. H. Schubert, 2009: Rapid development of the tropical cyclone warm core. J. Atmos. Sci., 66, 3335–3350, https://doi.org/10.1175/2009JAS3092.1.

• Wang, X., and C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. J. Atmos. Sci., 60, 1140–1158, https://doi.org/10.1175/1520-0469(2003)060<1140:ACOBAE>2.0.CO;2.

• Weng, Y., and F. Zhang, 2016: Advances in convection-permitting tropical cyclone analysis and prediction through EnKF assimilation of reconnaissance aircraft observations. J. Meteor. Soc. Japan, 94, 345–358, https://doi.org/10.2151/jmsj.2016-018.

• Zagrodnik, J. P., and H. Jiang, 2014: Rainfall, convection, and latent heating distributions in rapidly intensifying tropical cyclones. J. Atmos. Sci., 71, 2789–2809, https://doi.org/10.1175/JAS-D-13-0314.1.

• Zhang, F., Y. Weng, J. F. Gamache, and F. D. Marks, 2011: Performance of convection-permitting hurricane initialization and prediction during 2008–2010 with ensemble data assimilation of inner-core airborne Doppler radar observations. Geophys. Res. Lett., 38, L15810, https://doi.org/10.1029/2011GL048469.

• Zhang, J., D. S. Nolan, R. F. Rogers, and V. Tallapragada, 2015: Evaluating the impact of improvements in the boundary layer parameterization on hurricane intensity and structure forecasts in HWRF. Mon. Wea. Rev., 143, 3136–3155, https://doi.org/10.1175/MWR-D-14-00339.1.

• Zhang, S. Q., M. Zupanski, A. Y. Hou, X. Lin, and S. H. Cheung, 2013: Assimilation of precipitation-affected radiances in a cloud-resolving WRF ensemble data assimilation system. Mon. Wea. Rev., 141, 754–772, https://doi.org/10.1175/MWR-D-12-00055.1.

• Zhang, Z., 2016: Introduction to the HWRF-based ensemble prediction system. 2016 Hurricane WRF Tutorial, College Park, MD, NCWCP, 31 pp., https://dtcenter.org/HurrWRF/users/tutorial/2016_NCWCP_tutorial/lectures/Wednesday-21-HWRFtutJan2016_Ensemble_Zhang.pdf.

• Zhang, Z., and T. N. Krishnamurti, 1999: A perturbation method for hurricane ensemble predictions. Mon. Wea. Rev., 127, 447–469, https://doi.org/10.1175/1520-0493(1999)127<0447:APMFHE>2.0.CO;2.

• Zhao, Q., and Y. Jin, 2008: High-resolution radar data assimilation for Hurricane Isabel (2003) at landfall. Bull. Amer. Meteor. Soc., 89, 1355–1372, https://doi.org/10.1175/2008BAMS2562.1.

Fig. 4. The MAE as a function of lead time for the median AnEn VMAX forecasts (black), the median AnEn IVCN VMAX forecasts (green), the HWRF (blue), IVCN VMAX (red), and the median of QuMap forecasts for the (a) EP and (b) ATL. The vertical bars to the left of the vertical line indicate the bootstrap confidence intervals obtained pooling all the lead times together. The vertical bars at the bottom indicate the 5%–95% bootstrap confidence intervals by lead time of the differences of MAE [AnEn − HWRF (black) and AnEn IVCN − HWRF (green)]. The legend at the top left of each panel indicates the MAE values computed using all lead times over the 1729 and 1329 forecasts for the EP and ATL, respectively, over the period 2011–15.

Fig. 5. As in Fig. 4, but for the CC. The vertical bars indicate the 5%–95% bootstrap confidence intervals by lead time plotted for AnEn IVCN and HWRF only to reduce clutter.

Fig. 6. As in Fig. 4, but for the CRMSE (dashed) and the BIAS (solid). The vertical bars indicate the 5%–95% bootstrap confidence intervals by lead time plotted for only the AnEn IVCN and HWRF to reduce clutter.

Fig. 7. The MAE as a function of lead time of the median AnEn VMAX forecasts (black), the median AnEn IVCN VMAX forecasts (green), and the VMAX forecasts from HWRF (blue), NHC (orange), SHF5 (magenta), LGEM (cyan), and IVCN (red) for both the (a) EP and (b) ATL. The sample size per lead time is included below the x axis.

Fig. 8. (top) Binned spread–skill plots and (bottom) dispersion diagrams computed in the (left) EP and (right) ATL for AnEn IVCN and QuMap. (a),(b) The correlation coefficient (R) between the absolute error and ensemble spread is indicated in the legends. They are computed as in Goerss and Sampson (2014) on the original values before the average over the bins. (c),(d) RMSE (solid) and ensemble spread (dashed) of AnEn IVCN as a function of the forecast lead time; the 5%–95% bootstrap confidence intervals are plotted for AnEn IVCN only to reduce clutter.

Fig. 9. CRPS as a function of forecast lead time for the (left) EP and (right) ATL datasets; 5%–95% bootstrap confidence intervals are also plotted. The vertical bars next to the left vertical axis indicate CRPS bootstrap intervals considering all the lead times together. The legend at the top indicates the overall values of CRPS, potential CRPS (CRPS POT), and reliability (REL).

Fig. 10. As in Fig. 4, but for 2015 only. Also, the number of samples per lead time is indicated below the x axis.

Fig. 11. Dispersion diagrams [(a) EP, (b) ATL] show RMSE (solid) and ensemble spread (dashed) of AnEn IVCN (black) and the 2015 HWRF ensemble (red) as a function of forecast lead time; the 5%–95% bootstrap confidence intervals associated with RMSE are plotted for both ensembles. The sample size per lead time is included below the x axis.


Probabilistic Prediction of Tropical Cyclone Intensity with an Analog Ensemble

  • 1 National Center for Atmospheric Research, Boulder, Colorado
  • 2 Cooperative Institute for Meteorological Satellite Studies, University of Wisconsin–Madison, Madison, Wisconsin

Abstract

An analog ensemble (AnEn) technique is applied to the prediction of tropical cyclone (TC) intensity (i.e., maximum 1-min averaged 10-m wind speed). The AnEn is an inexpensive, naturally calibrated ensemble prediction of TC intensity derived from a training dataset of deterministic Hurricane Weather Research and Forecasting (HWRF; 2015 version) Model forecasts. In this implementation of the AnEn, a set of analog forecasts is generated by searching an HWRF archive for forecasts sharing key features with the current HWRF forecast. The forecast training period spans 2011–15. The similarity of a current forecast with past forecasts is estimated using predictors derived from the HWRF reforecasts that capture thermodynamic and kinematic properties of a TC’s environment and its inner core. Additionally, the value of adding a multimodel intensity consensus forecast as an AnEn predictor is examined. Once analogs are identified, the verifying intensity observations corresponding to each analog HWRF forecast are used to produce the AnEn intensity prediction. In this work, the AnEn is developed for both the eastern Pacific and Atlantic Ocean basins. The AnEn’s performance with respect to mean absolute error (MAE) is compared with the raw HWRF output, the official National Hurricane Center (NHC) forecast, and other top-performing NHC models. Also, probabilistic intensity forecasts are compared with a quantile mapping model based on the HWRF’s intensity forecast. In terms of MAE, the AnEn outperforms HWRF in the eastern Pacific at all lead times examined and up to 24-h lead time in the Atlantic. Also, unlike traditional dynamical ensembles, the AnEn produces an excellent spread–skill relationship.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Stefano Alessandrini, alessand@ucar.edu


1. Introduction

The prediction of tropical cyclone (TC) intensity, defined by the National Hurricane Center (NHC) as the 1-min maximum 10-m wind speed, is an important weather forecast problem because of the destructive impacts of intense TCs on society. While TC intensity forecast guidance has been improving for the last two decades, progress has been relatively slow, compared to improvements in TC track forecasting (DeMaria et al. 2014). Intensity prediction requires representing multiscale aspects of the atmosphere and ocean, including a TC’s surrounding large-scale environment, its mesoscale vortex structure and dynamics, small-scale convection, the boundary layer, air–sea interactions, and oceanic processes both beneath the TC and in surrounding regions. Reducing the intensity forecast error is also complicated by uncertainty in the intensity observations themselves (e.g., Torn and Snyder 2012; Landsea and Franklin 2013; Nolan et al. 2014).

Advancements in TC intensity forecasting with numerical weather prediction (NWP) are, in fact, happening, but until recently, statistical forecast models have been the most reliable techniques for forecasting intensity. The Statistical Hurricane Intensity Forecast model (SHIFOR; Jarvinen and Neumann 1979; DeMaria et al. 2006), based on climatology and persistence variables, has been around for decades and today serves as a common metric of baseline intensity skill. However, the most successful statistical models have been the Statistical Hurricane Intensity Prediction Scheme (SHIPS; DeMaria et al. 2005) and the Logistic Growth Equation Model (LGEM; DeMaria 2009). SHIPS and LGEM are deterministic empirical–dynamical models employing predictors from climatology, persistence, global model forecast fields describing the storm’s environment, and satellite data, and they have continued to undergo refinement since the first operational implementation of SHIPS in 1991. Other statistical techniques using similar predictors that have been beneficial to forecasters are statistical models for rapid intensification (Kaplan et al. 2015) and eyewall replacement cycles (Kossin and Sitkowski 2009; Kossin and DeMaria 2016).

Statistical models continue to be vital tools for forecasters, but dedicated efforts in NWP (e.g., Gopalakrishnan et al. 2011; Tallapragada et al. 2015) have recently led to measurable progress in intensity prediction as state-of-the-art NWP approaches the cloud-resolving scale. Data assimilation (e.g., Zhao and Jin 2008; Zhang et al. 2011; Aksoy et al. 2012; Zhang et al. 2013; Weng and Zhang 2016), including vortex initialization and Gridpoint Statistical Interpolation analysis system (GSI)-based hybrid data assimilation (e.g., Liu et al. 2006; Tallapragada et al. 2015; Pu et al. 2016), microphysics and boundary layer physics improvements (e.g., Bao et al. 2012; Biswas et al. 2014; Zhang et al. 2015), and ensemble forecasting systems (e.g., Doyle et al. 2011; Aksoy et al. 2012) are areas where NWP has contributed to improved intensity forecasts.

Postprocessing of dynamical model output has become more common in recent years. This technique combines some of the strengths of both statistical and dynamical model forecasts and can be used to improve intensity prediction. In the current paper, we describe a novel postprocessing approach to TC intensity forecasting called the analog ensemble (AnEn) technique (Delle Monache et al. 2013). This technique uses a historical dataset of output from a deterministic NWP model and intensity observations to improve upon subsequent forecasts produced by the same NWP model by finding analog historical forecasts that share commonalities with these subsequent forecasts.

The AnEn will be used for both generating ensemble predictions and improving the currently available 1-min average maximum 10-m wind speed (hereafter VMAX) forecasts from the National Centers for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA)’s operational version of the Hurricane Weather Research and Forecasting (HWRF) Model. The AnEn was initially developed to improve the prediction of 10-m wind speed and 2-m temperature (Delle Monache et al. 2011, 2013) and also to calibrate existing probabilistic predictions of precipitation (Hamill and Whitaker 2006). It has since been refined and extensively tested for the probabilistic prediction of additional meteorological variables (Nagarajan et al. 2015), renewable energies such as wind and solar power (Alessandrini et al. 2015a,b; Junk et al. 2015; Davò et al. 2016), and air quality (Djalalova et al. 2015).

The AnEn needs a historical dataset of deterministic forecasts produced by an NWP model and past observations of a meteorological variable of interest (VMAX, in this case). A given number of past forecasts similar to the current one are identified using a set of predictors. For each analog forecast of the meteorological variable of interest, the verifying observations of that variable are used to build the ensemble prediction. The underlying assumption is that the past forecast errors are samples from the same probability density function and can be used to describe the uncertainty of the current prediction. Therefore, the AnEn is not based on the initial-condition, multimodel, or multiphysics perturbation strategies commonly used to generate dynamical ensembles. In dynamical ensemble systems, the intrinsic uncertainty of the evolution of meteorological variables is represented by multiple simulations, either starting from different initial conditions or performed with various models or physics configurations. Applications of such systems to TC intensity forecasts can be found in Zhang and Krishnamurti (1999) and Krishnamurti et al. (1997, 2000), among others. As the AnEn is based on just a single deterministic forecast, it has the advantage of being significantly less computationally expensive in real time than dynamical ensemble prediction systems and, for this reason, can often be run at higher resolution.

It should be noted that at first glance, the AnEn technique resembles the weighted analog intensity ensemble prediction of Tsai and Elsberry (2014, 2015) in that a deterministic intensity and intensity spread prediction is made based on a set of closest-matching analogs to the current forecast period. However, in Tsai and Elsberry's studies, analogs are sought by matching historical observed storm tracks to an operational forecast center's current track forecast, rather than seeking analogs of a dynamical model's forecast from its past forecasts.

In this work, we apply the AnEn technique to both the eastern Pacific (EP) and Atlantic Ocean (ATL) basins using a dataset of predictors derived from HWRF reforecast data from 2011 to 2014 and real-time HWRF forecast data from 2015. The AnEn’s performance with respect to mean absolute error (MAE) is compared to the performance of raw HWRF output, the official NHC VMAX prediction, and three other top-performing statistical models. In addition, quantile mapping is used as a baseline reference for the AnEn’s probabilistic predictions.

This paper is organized as follows. In section 2, the datasets are presented. Section 3 describes the AnEn and the quantile mapping method. The predictor selection procedure and the performance verification are discussed in section 4. Finally, a concluding discussion is provided in section 5.

2. Model description

NCEP’s 2015 operational configuration of the HWRF Model (version 3.7; hereafter H215) (Gopalakrishnan et al. 2011; Bao et al. 2012; Tallapragada et al. 2015) is used as the basis of the AnEn in this study. HWRF uses the same dynamical core as the NCEP’s WRF-Nonhydrostatic Mesoscale Model (NMM; Janjic et al. 2010). Three grids are used with two inner two-way interactive moving nests that follow the TC. The parent domain has 18-km horizontal grid spacing and covers a region of roughly 80° × 80°. The inner grids have 6- and 2-km horizontal grid spacing, covering 12° × 12° and 7.1° × 7.1°, respectively. The H215 physics packages include a modified tropical Ferrier (Ferrier–Aligo) microphysics scheme (Ferrier et al. 2002), a simplified Arakawa–Schubert (SAS) cumulus parameterization, a modified Global Forecast System (GFS) planetary boundary layer (PBL) scheme, the Geophysical Fluid Dynamics Laboratory (GFDL) surface layer parameterization, the Noah land surface model, and the Rapid Radiative Transfer Model for general circulation models (RRTMG) longwave and shortwave radiation scheme. H215 is coupled to the Message Passing Interface Princeton Ocean Model for TCs (MPIPOM-TC). A vortex initialization package and GSI hybrid data assimilation are also integral to H215.

3. Probabilistic prediction methods

a. The AnEn

The intensity-based AnEn is constructed for both the eastern Pacific and the Atlantic Ocean basins and is based on retrospective HWRF forecasts from 2011 to 2014 and archived real-time HWRF forecasts from 2015, along with observed VMAX from the hurricane database (HURDAT2; Landsea et al. 2015). For each forecast lead time (3–96 h at 3-h increments), a subset of historical HWRF forecasts sharing defined similarities to the current HWRF forecast is identified. These matches are the analog forecasts that are used for the AnEn. The meteorological variables used to identify past forecasts most similar to the current one are called analog predictors (described in more detail below). The verifying observations of VMAX associated with the HWRF analog forecasts are used to form the actual VMAX analog ensemble. A conceptual example of the H215-based AnEn using three ensemble members is illustrated in Fig. 1.

Fig. 1. A timeline indicating how a three-member AnEn works for a 48-h forecast of intensity (VMAX). The current 48-h VMAX forecast from HWRF (indicated by the purple box) is compared with a historical archive of past 48-h HWRF forecasts (i.e., the training period). A set of predictors derived from HWRF output is used to determine the three closest historical 48-h forecast analogs (indicated by the blue boxes) matching the current 48-h forecast. Each of the three 48-h HWRF forecast analogs is matched with the corresponding observed value of VMAX (denoted by black circles). The three verifying VMAX observations are then used as the ensemble members of the 48-h VMAX AnEn prediction (indicated by the dashed black circle). In this study, the above process is repeated independently for each forecast lead time from 3 to 96 h, at 3-h increments. Also, 20 analog ensemble members are actually used instead of three.

Both H215 reforecast data from 2011 to 2014 and H215 real-time forecasts from the 2015 EP and ATL hurricane seasons are combined in order to increase the number of quality historical analogs available. After combining both the reforecasts and real-time forecasts, there are 1729 and 1329 forecasts (covering 75 and 54 TCs) in the EP and ATL, respectively. These forecasts cover the entire spectrum of Saffir–Simpson hurricane wind scale intensity categories. It should be noted that the performance of an AnEn derived solely from reforecasts would be slightly better than that of the current AnEn incorporating the additional real-time forecasts from 2015, since the reforecasts use cyclone location, intensity, and structure information from the best track database rather than the less accurate Tropical Cyclone Vitals Database (TCVitals; e.g., Trahan and Sparling 2012) used in the real-time forecasts.

The procedure for finding analogs follows Delle Monache et al. (2013). In particular, we define the degree of similarity of the current forecast to a historical prediction using the following metric:
$$\left\| F_t, A_{t'} \right\| = \sum_{i=1}^{N_p} \frac{w_i}{\sigma_{f_i}} \sqrt{\sum_{j=-\tilde{t}}^{\tilde{t}} \left( F_{i,t+j} - A_{i,t'+j} \right)^2}, \qquad (1)$$

where $F_{i,t}$ is the current forecast of a predictor $i$ for the lead time $t$; $A_{i,t'}$ represents a forecast of a past run for the same lead time; $N_p$ and $w_i$ are the number of predictors and their weights, respectively; $\sigma_{f_i}$ is the standard deviation of the past forecasts of a given variable $i$ at the lead time $t$ computed over the past available runs; and $\tilde{t}$ is an integer equal to the half-width of the lead time window over which the metric is computed. In this application, $\tilde{t}$ has been set equal to 1. As the forecast dataset is made of predictions available with a time increment of 3 h, setting $\tilde{t} = 1$ means that the distance is computed over the three forecast lead time hours corresponding to $t - 3$ h, $t$, and $t + 3$ h (i.e., the lead-time window size is 6 h). What this means is that the historical dataset (consisting of more than 1000 simulations of different TCs in each basin) is searched for those events whose AnEn predictors exhibit similar values and temporal trends (i.e., which had a similar behavior within the selected lead time window), compared to the current forecast. The assumption is that if similar past forecast events are found, then the corresponding past observations of VMAX can be used to improve the current prediction.
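
As an illustration only, the analog search defined by Eq. (1) can be sketched in a few lines of Python. This is not the authors' implementation: the function names, the array layout (past runs × predictors × lead times), and the toy data are hypothetical, and lead times at the edges of the forecast window are not handled.

```python
import numpy as np

def analog_distance(current, past, sigma, weights, t, t_half=1):
    """Eq. (1): weighted distance between the current forecast and one past
    run, accumulated over the lead-time window [t - t_half, t + t_half].
    current, past: (n_predictors, n_lead_times); sigma, weights: (n_predictors,)
    Assumes t is far enough from the array edges for the window to fit."""
    w = slice(t - t_half, t + t_half + 1)
    diff2 = (current[:, w] - past[:, w]) ** 2            # squared differences
    return np.sum(weights / sigma * np.sqrt(diff2.sum(axis=1)))

def analog_ensemble(current, archive, obs, weights, t, n_members=20):
    """AnEn members at lead time t: the observed VMAX values verifying the
    n_members past forecasts closest to the current forecast."""
    sigma = archive[:, :, t].std(axis=0)                 # spread of past forecasts
    d = np.array([analog_distance(current, run, sigma, weights, t)
                  for run in archive])
    closest = np.argsort(d)[:n_members]                  # indices of best analogs
    return obs[closest, t]                               # verifying observations

# Toy usage: 100 past runs, 3 predictors, 33 lead times (3-96 h every 3 h)
rng = np.random.default_rng(0)
archive = rng.normal(size=(100, 3, 33))
obs = rng.normal(50.0, 10.0, size=(100, 33))             # past observed VMAX (kt)
current = rng.normal(size=(3, 33))
members = analog_ensemble(current, archive, obs, np.array([0.5, 0.3, 0.2]), t=16)
print(np.median(members))                                # deterministic AnEn value
```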

The predictors considered for the AnEn in this study are developed from the HWRF reforecasts to capture numerous thermodynamic and kinematic properties of a TC’s environment and its inner core. All of the predictors used in this study are storm-centric and azimuthally symmetric. Many used here are similar to those provided in the SHIPS developmental dataset (DeMaria et al. 2005) and therefore are named similarly, while other novel predictors describing the storm’s inner core are developed as well. In total, 64 predictors are tested (see Table A1). An objective selection procedure has been implemented to find those predictors (and their weights) that minimize the mean absolute error over a testing period.

The MAE is defined as follows:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - o_i \right|, \qquad (2)$$

where $o_i$ is the $i$th observed value, $f_i$ is the $i$th forecasted value, and $N$ is the total number of forecast/observation pairs. It estimates the average error magnitude of the forecasts. Unlike root-mean-square error (RMSE), MAE is an L1 norm that gives the same weight to all of the errors independent of their magnitude. A perfect forecast would provide an MAE equal to 0.

With a limited number of predictors (e.g., fewer than 10), a “brute force” method can be implemented to find predictors (Alessandrini et al. 2015b; Junk et al. 2015). Once a dataset is split into two independent datasets (for training and verification; see section 4a for details), all possible combinations of the weights $w_i$ in the interval [0, 1] with 0.1 increments, and with the constraint that $\sum_{i=1}^{N_p} w_i = 1$, are used to generate predictions over the training dataset. The weight combination leading to the lowest MAE is chosen to generate predictions over the independent verification dataset. However, this approach is not computationally feasible when applied to 64 predictors. As an alternative, we have designed an iterative approach on the training dataset to select both the optimal predictors and their weights. This technique is similar to the forward selection technique commonly adopted in statistics to choose predictive variables through an automatic procedure (Efroymson 1960). First, AnEn forecasts are generated by using only one predictor at a time. For each of these $N_p$ ensembles, the MAE is calculated using the median of the 20 member values. The predictor resulting in the lowest MAE is chosen as the first predictor, $p_1$. Then, each of the remaining $N_p - 1$ predictors, $p_i$, is tested one by one together with $p_1$. For each pair, AnEn predictions are generated with all the possible weight combinations, using a weight increment of 0.1 and the constraint $\sum_i w_i = 1$. The pair resulting in the lowest MAE determines the second predictor, $p_2$, which is selected only if the improvement (decrease) of MAE, compared to using $p_1$ alone, is more than 3%. If $p_2$ is chosen, the procedure is repeated to generate all possible triplets with the remaining $N_p - 2$ predictors, along with $p_1$ and $p_2$. The procedure is interrupted when the increase in performance (decrease of MAE), compared to the previous iteration, is lower than 3%. The selected set of predictors and their corresponding weights $w_i$ are used to generate the AnEn predictions over the verification dataset. The 3% threshold has been chosen after sensitivity tests (not shown); it was identified as an optimal choice to detect statistically significant improvements in MAE.
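
A schematic rendering of this forward-selection loop is given below. The helper `anen_mae` is a hypothetical stand-in for a routine that generates AnEn forecasts over the training set with the given predictors and weights and returns the MAE of the ensemble median; the toy scoring function at the end exists only to make the sketch runnable.

```python
from itertools import product

def weight_grid(k, step=0.1):
    """All weight vectors of length k with entries in {0, 0.1, ..., 1}
    summing to 1 (the constraint used in the brute-force search)."""
    steps = int(round(1 / step))
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) <= steps:
            yield tuple(c * step for c in combo) + ((steps - sum(combo)) * step,)

def forward_select(all_predictors, anen_mae, threshold=0.03):
    """Greedy forward selection: at each step, find the candidate
    predictor/weight combination with the lowest training MAE, and accept
    it only if it improves the current MAE by more than `threshold` (3%)."""
    # First predictor: the single predictor with the lowest MAE.
    best_mae, best_p = min((anen_mae([p], (1.0,)), p) for p in all_predictors)
    selected, weights = [best_p], (1.0,)
    while True:
        candidates = [p for p in all_predictors if p not in selected]
        if not candidates:
            break
        trial_mae, trial_p, trial_w = min(
            (anen_mae(selected + [p], w), p, w)
            for p in candidates
            for w in weight_grid(len(selected) + 1))
        if trial_mae >= best_mae * (1 - threshold):
            break                                   # improvement below 3%: stop
        best_mae, weights = trial_mae, trial_w
        selected.append(trial_p)
    return selected, weights, best_mae

# Toy usage with a made-up scoring function (stand-in for real AnEn training):
scores = {"VMAX": 10.0, "SHRD": 9.0, "RHMD": 9.5}
toy_mae = lambda preds, w: sum(scores[p] * wi for p, wi in zip(preds, w))
print(forward_select(list(scores), toy_mae))
```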

To determine the best ensemble size, the MAE was examined for analog ensembles ranging in size from 5 to 30. Based on the MAE, we found that the optimal ensemble size is 20 members.

b. HWRF quantile mapping

While the MAE and other standard forecast skill metrics are useful for examining the deterministic forecast skill of the median AnEn forecasts, other attributes need to be evaluated to assess the quality of AnEn’s probabilistic forecasts. As a baseline probabilistic prediction, we also generate a VMAX ensemble based on the quantile mapping (QuMap) postprocessing technique inspired by a “quantile matching” or “quantile–quantile mapping” approach (Déqué 2007). Quantile-matching relates modeled values from any given quantile in its forecast distribution with the same quantile in the distribution of observations. In this way, the quantile matching may reduce forecast bias.

Similar to the AnEn, QuMap is an ensemble system based on H215 forecasts of VMAX and corresponding observed VMAX values. Independently for each forecast lead time, the HWRF VMAX forecast values from the training dataset are rank ordered and partitioned into 20 equally populated bins whose edges are the quantiles (from 0 to 1 with 0.05 increments). For each HWRF VMAX forecast of the verification dataset, the corresponding bin from the historical dataset is found, and the ensemble members are the past verifying observations corresponding to 20 randomly selected forecasts from this bin. The dataset has been split into training and verification parts in the same way as for the AnEn (see section 4a). Sensitivity tests on the number of members have also been carried out (not shown): QuMap ensembles with 20 to 50 members did not exhibit significantly different performance in terms of the continuous ranked probability score. Also, as described in section 4a below, we found that using HWRF VMAX values that were smoothed in time was beneficial for the AnEn. A sensitivity test (not shown) showed a similar improvement for QuMap. Hence, in the results that follow, we also include the QuMap model with smoothed values of HWRF VMAX.
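
A minimal sketch of this binning-and-resampling step for a single lead time is given below; the function name and the synthetic training pairs are hypothetical, and ties at bin edges are handled in one simple way among several possible.

```python
import numpy as np

def qumap_ensemble(fcst, train_fcst, train_obs, n_bins=20, n_members=20, seed=0):
    """QuMap members for one lead time: bin the training HWRF VMAX forecasts
    into n_bins equally populated bins, find the bin containing the current
    forecast, and return the verifying observations of n_members randomly
    selected training forecasts from that bin."""
    rng = np.random.default_rng(seed)
    edges = np.quantile(train_fcst, np.linspace(0.0, 1.0, n_bins + 1))
    # Bin index of the current forecast, clipped to the training range.
    b = int(np.clip(np.searchsorted(edges, fcst, side="right") - 1, 0, n_bins - 1))
    in_bin = np.where((train_fcst >= edges[b]) & (train_fcst <= edges[b + 1]))[0]
    picks = rng.choice(in_bin, size=n_members, replace=len(in_bin) < n_members)
    return train_obs[picks]

# Toy usage with synthetic training pairs (past forecast, verifying observation)
rng = np.random.default_rng(1)
train_f = rng.gamma(4.0, 12.0, size=1000)            # past HWRF VMAX forecasts (kt)
train_o = train_f + rng.normal(0.0, 8.0, size=1000)  # matching observations
print(qumap_ensemble(55.0, train_f, train_o))
```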

The QuMap is generally a superior probabilistic prediction, and therefore a more stringent test of the AnEn's performance, than the climatologically defined probabilistic forecasts commonly used as references in ensemble validation. In fact, a climatological probabilistic prediction provides a unique probability value, reflecting the frequency with which an event has occurred in the training dataset (e.g., observed VMAX being higher than a given threshold). On the other hand, QuMap, like the AnEn, provides a different probability value for any forecast event, conditioned on the specific H215 VMAX prediction. In fact, QuMap is similar to an AnEn using only the HWRF VMAX as a predictor, since the QuMap searches for past analog predictions of VMAX. However, the similarity of past forecasts with the current forecast is determined by a quantile distance instead of the Euclidean distance defined in Eq. (1). The AnEn, when compared with the QuMap, allows for a more flexible way to search for analogs, combining information from multiple consecutive lead times and from multiple predictors.

Figure 2 shows an example of probabilistic predictions generated with the AnEn and QuMap. For eastern Pacific Hurricane Julio (2014), the AnEn spread increases with a trend similar to those of the AnEn mean and the error (here, the difference between the ensemble mean and the observations). In the case of eastern Pacific Hurricane Linda (2015), the AnEn spread is almost constant as the lead time increases, consistent with the small error in the AnEn mean. Looking at the QuMap predictions, the 5th–95th-quantile lines are noisier than those of the AnEn, likely because of the random sampling step; using more than 20 QuMap members would likely smooth its quantile predictions. In addition, for both QuMap and the AnEn, the deviations of the ensemble from HWRF are flow dependent (i.e., they change depending on the predicted values of VMAX).

Fig. 2. Example of (a),(c) analog ensemble using VMAX forecast from the consensus statistical and dynamical models known as IVCN as a predictor (AnEn IVCN; see section 4 for a more complete definition) and (b),(d) QuMap forecast PDFs with an initialization time of (top) 0600 UTC 4 Aug 2014 for Hurricane Julio and (bottom) 1200 UTC 9 Sep 2015 for Hurricane Linda. The gray shadings correspond to the 25%–75% (darker) and 5%–95% (lighter) quantiles. The black and dashed lines represent the VMAX observation and ensemble mean, respectively. The blue line represents the HWRF forecast.

c. HWRF dynamical ensemble

To compare the performance of the AnEn with a relevant operational dynamical ensemble prediction system, we use a set of archived HWRF ensemble forecasts that was run at NCEP for the 2015 hurricane season. This 20-member ensemble is based on the 2015 operational deterministic HWRF Model, except that the horizontal grid spacing of the outer, middle, and inner grids is 27, 9, and 3 km instead of the 18, 6, and 2 km used in the deterministic HWRF; the number of vertical levels is reduced from 61 to 43; and there is no GSI-based hybrid data assimilation (Zhang 2016). To create an ensemble, perturbations are applied by using the 20-member Global Ensemble Forecast System (GEFS) for initial and boundary conditions. Additional perturbations are made to the vortex scale of each ensemble member by using stochastic perturbations to the convective trigger, boundary layer height, drag coefficient, and initial wind speed and position.

4. Results

a. Predictor selection

As described earlier, an iterative procedure is performed on the training dataset to determine the set of predictors and weights that optimize the performance of the AnEn with respect to MAE. To build the training and verification datasets, both the EP and ATL datasets have been split into five parts, each including the forecast runs and observations from the same year. The groups of runs corresponding to each storm are chronologically ordered before splitting the dataset. The optimization procedure is repeated five times, where 1 year of the dataset is used for verification and the remaining 4 years for optimization (training). The predictors and weights obtained in each optimization cycle are kept constant and used to generate the AnEn forecasts over the year used for verification. In this way, the AnEn can be tested over the entire dataset, and over the widest range of situations, while the training data always remain independent of the verification data. The resulting performance assessment aims to be representative of what would be expected if the AnEn were applied to subsequent years on an independent dataset (e.g., by keeping the current 2011–15 dataset as a training sample and 2016–18 as the test period). Also, the optimization cycles have been performed independently over the forecast lead-time intervals 0–24 and 27–96 h. The 27–96-h interval is longer than the 0–24-h interval to allow a similar number of samples in both periods, since the number of missing values of both observed and predicted VMAX increases with lead time. The set of predictors and weights derived for each of the two intervals is then used for the predictions in the corresponding lead-time range. A sketch of this cross-validation protocol follows.
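
The leave-one-year-out loop can be written schematically as below; both helper functions are hypothetical placeholders standing in for the predictor selection of section 3a and the AnEn generation itself.

```python
# Leave-one-year-out cross-validation as described above. The two helpers
# are placeholders: in the real system they would run the predictor/weight
# selection and the AnEn generation.
def optimize_predictors(train_years, interval):
    return {"predictors": ["VMAX"], "weights": (1.0,)}       # placeholder result

def generate_anen(test_year, config):
    return f"AnEn forecasts for {test_year} using {config}"  # placeholder

years = [2011, 2012, 2013, 2014, 2015]
forecasts = {}
for test_year in years:
    train_years = [y for y in years if y != test_year]
    # Predictors and weights are fit once per lead-time interval on the
    # four training years, then held fixed for the verification year.
    config = {iv: optimize_predictors(train_years, iv)
              for iv in ("0-24 h", "27-96 h")}
    forecasts[test_year] = generate_anen(test_year, config)
```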

While 64 total predictors are available for the AnEn, only a handful are selected by the above optimization cycles. These predictors are described below and summarized in Table 1. Table 2 shows the weights and predictors chosen for each of the five optimization cycles described above. One predictor used by all configurations of the AnEn is the H215 VMAX itself. It should be noted that the VMAX obtained from the 3-hourly output data grids is noisy and can therefore deteriorate the quality of analogs. Fortunately, NCEP provides a file of limited variables that are saved at every model time step (25.7 s), and one of those variables is VMAX. This allows for a reliable smoothing of VMAX data in time. Here, a simple running mean with a span of 1001 data points is used. This window size was chosen through experimentation and allows for a smoothed VMAX curve that maintains the integrity of a storm’s intensity evolution. These smoothed VMAX data are then used at the 3-h lead times to generate AnEn instead of the raw VMAX estimated directly from the 3-h HWRF output grids. Indeed, the smoothed VMAX data provide a more reliable AnEn than the 3-h VMAX output. The impact of VMAX smoothing for a given storm is shown in Fig. 3. The H215 VMAX is used in the AnEn for both the EP and ATL and at all lead times.
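
The smoothing step can be illustrated with a centered running mean, as sketched below. The high-frequency VMAX series is assumed to be a plain 1D array, and since the paper does not state how window edges are treated, a window that shrinks near the edges is one plausible choice.

```python
import numpy as np

def smooth_vmax(vmax, span=1001):
    """Centered running mean with a span of 1001 time steps of ~25.7 s
    (a total window of roughly 7 h). Near the series edges the window
    simply shrinks (an assumption; the paper's edge treatment is unstated)."""
    half = span // 2
    csum = np.cumsum(np.insert(vmax, 0, 0.0))            # prefix sums
    idx = np.arange(vmax.size)
    lo = np.maximum(idx - half, 0)
    hi = np.minimum(idx + half + 1, vmax.size)
    return (csum[hi] - csum[lo]) / (hi - lo)             # window means

# Toy usage: a noisy 24-h VMAX trace at 25.7-s increments
t = np.arange(0.0, 24 * 3600, 25.7)
rng = np.random.default_rng(2)
vmax = 40 + 10 * np.sin(t / 3.0e4) + rng.normal(0, 3, t.size)
print(smooth_vmax(vmax)[::1000])                         # sample smoothed values
```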

Table 1. Summary of all the predictors used in the H215 AnEn for the ATL and EP Ocean basins.
Table 2. The predictors and their corresponding weights (in parentheses) resulting from the optimization procedure. Next to the weights, the relative improvement in terms of MAE determined by each added predictor is given as a percentage. The predictor combinations are specified for each forecast lead time interval and each of the five optimization cycles, excluding 1 year at a time. Also indicated here is whether IVCN VMAX is used as a predictor (yes or no). The symbols' descriptions can be found in Table 1.
Fig. 3. Different HWRF estimates of VMAX for EP Hurricane Adrian (2011) at 0000 UTC 8 Jun. Plotted here are the HWRF VMAX obtained from the 3-hourly reforecast data (blue), HWRF VMAX at 25.7-s time increments obtained from the high-resolution dataset (cyan), and the smoothed high-resolution VMAX data used to generate AnEn (red).

A number of other kinematic predictors are used in the various EP and ATL AnEn configurations. An inner-core kinematic parameter that improves the AnEn in both basins is related to the azimuthal-mean inertial stability parameter I, whose square is given by

$$I^2 = \left( f + \frac{2\bar{\upsilon}}{r} \right) \left( f + \frac{1}{r} \frac{\partial \left( r\bar{\upsilon} \right)}{\partial r} \right),$$

where f is the Coriolis parameter, υ is the tangential component of the wind (the overbar denotes an azimuthal mean), and r is the radius from the storm center. To create predictors depicting inertial stability, the squared inertial stability is spatially averaged over the 850–500-hPa layer and over the radial regions contained within r = 0–50 and 0–100 km (INRT1 and INRT2; reflective of the inner core) and r = 100–250 km (INRT3; reflective of the rainband region). These parameters provide a bulk measure of a storm's wind structure and relate to intensity change processes. For example, when latent heating occurs in a region of high inertial stability, intensification is more likely (e.g., Vigh and Schubert 2009; Rogers et al. 2013; Stevenson et al. 2018). Other kinematic predictors chosen for the AnEn in the EP include the 850–200-hPa vertical wind shear averaged over the radial region of r = 0–500 km (SHRD) and the 850-hPa tangential wind averaged over the radial region of r = 0–600 km (TANG850).
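
Assuming the standard axisymmetric definition given above, the bulk INRT predictors can be sketched as below. The radial grid, the plain mean used for the layer and radial averaging, and the toy vortex profile are all illustrative assumptions.

```python
import numpy as np

def inertial_stability_sq(v_bar, r, f):
    """I^2 = (f + 2*v/r) * (f + d(r*v)/dr / r) on a radial grid, with v_bar
    the azimuthal-mean tangential wind (m/s) at radii r (m)."""
    d_rv_dr = np.gradient(r * v_bar, r)              # radial derivative of r*v
    return (f + 2.0 * v_bar / r) * (f + d_rv_dr / r)

def inrt_predictor(v_levels, r, f, r_min, r_max):
    """Bulk predictor: I^2 averaged over the pressure levels in v_levels
    (rows; e.g., 850-500 hPa) and over the radial band [r_min, r_max]."""
    i2 = np.array([inertial_stability_sq(v, r, f) for v in v_levels])
    band = (r >= r_min) & (r <= r_max)
    return i2[:, band].mean()

# Toy usage: a Rankine-like vortex on three levels at 20 deg latitude
r = np.linspace(5e3, 300e3, 60)                      # radii, 5-300 km
f = 2 * 7.292e-5 * np.sin(np.deg2rad(20.0))          # Coriolis parameter (1/s)
rmw, vmax = 40e3, 45.0                               # radius of max wind, peak wind
v = np.where(r <= rmw, vmax * r / rmw, vmax * (rmw / r) ** 0.5)
print(inrt_predictor(np.stack([v, 0.9 * v, 0.8 * v]), r, f, 0.0, 100e3))  # ~INRT2
```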

A few thermodynamic predictors are useful as well. A couple of AnEn configurations in the EP and ATL use a predictor for the midlevel (700–500 hPa) relative humidity averaged over r = 200–800 km (RHMD), while another EP AnEn configuration uses a similar relative humidity predictor but averaged over the 850–700-hPa layer (RHLO). Yet another inner-core thermodynamic predictor that is useful in the ATL version of the AnEn is the degree of “axisymmetricity” of the vertically integrated total condensate in the rainband region (TCONDSYM), adapted from Miyamoto and Takemi (2013). This predictor is an average over the radial range of 100–250 km. A perfectly axisymmetric field would yield a maximum possible TCONDSYM value of 1. As storms become more intense, they tend to become more axisymmetric (e.g., Velden et al. 2006; Zagrodnik and Jiang 2014).

A couple of other types of predictors were also found useful. An EP AnEn configuration included the minimum sea level pressure (MINSLP). Also, the TC's latitude (LAT) was found to be a useful predictor in both the EP and ATL.

While the majority of predictors tested are derived from the HWRF Model, it is of interest to test whether other objective aids not entirely derived from the HWRF may further improve the AnEn. To this end, we tested an equally weighted consensus of top-performing statistical and dynamical models (Sampson et al. 2008; Goerss and Sampson 2014), referred to as IVCN at NHC. NHC updates IVCN every year (using the NHC verification reports for each year), and it includes early models such as those from a couple of configurations of the GFDL model, the SHIPS with an inland decay feature (DSHP), LGEM, and the HWRF (HWFI) over the period of study. The IVCN VMAX turns out, in fact, to be a useful AnEn predictor in both ocean basins.

It should be pointed out that each basin and AnEn configuration uses a different combination of predictors. The differences in which predictors are chosen for each AnEn configuration may have both physical and statistical causes. While these differences are of scientific interest, determining their exact reasons is outside the scope of this manuscript. Nonetheless, the predictors used in each AnEn configuration have been linked with TC intensity change in past studies.

b. Deterministic forecast verification

The AnEn is first evaluated as a method to improve the H215 VMAX deterministic forecasts, and it is compared with the operational IVCN VMAX prediction. A description of the metrics used to evaluate the deterministic predictions is provided below; they include the MAE, the Pearson correlation coefficient (CC), the BIAS, and the centered root-mean-square error (CRMSE).

The CC is defined as
$$\mathrm{CC} = \frac{\dfrac{1}{N}\sum_{i=1}^{N}\left(f_i - \overline{f}\,\right)\left(o_i - \overline{o}\right)}{\sigma_f\,\sigma_o}, \qquad (3)$$

where $f_i$ and $o_i$ are the ith forecast and observation, the overbar indicates the mean over the N forecast events, and $\sigma_f$ and $\sigma_o$ are the standard deviations of the forecasts and the observations, respectively. The CC measures the strength and the direction of the linear relationship between the forecasts and the observations. It ranges between −1 and 1, with 1 being the best achievable correlation. The BIAS, also known as the systematic error, is defined as
$$\mathrm{BIAS} = \frac{1}{N}\sum_{i=1}^{N}\left(f_i - o_i\right), \qquad (4)$$
and it is a measure of the average error of the forecasts. If the BIAS is removed from each forecast error, one can obtain the CRMSE, which includes random errors and residual conditional biases. The CRMSE is defined as follows:
$$\mathrm{CRMSE} = \left\{\frac{1}{N}\sum_{i=1}^{N}\left[\left(f_i - \overline{f}\,\right) - \left(o_i - \overline{o}\right)\right]^2\right\}^{1/2}. \qquad (5)$$
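For concreteness, Eqs. (3)–(5), together with the MAE, translate directly into a few lines of code. The following sketch is ours (the function name and interface are illustrative); note the identity MSE = BIAS² + CRMSE², which motivates the decomposition into systematic and random components.

```python
import numpy as np

def deterministic_scores(f, o):
    """MAE, CC [Eq. (3)], BIAS [Eq. (4)], and CRMSE [Eq. (5)] for paired
    forecast and observation vectors f and o."""
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    mae = np.abs(f - o).mean()
    bias = (f - o).mean()
    # Population (1/N) statistics throughout, consistent with Eqs. (3)-(5)
    cc = ((f - f.mean()) * (o - o.mean())).mean() / (f.std() * o.std())
    crmse = np.sqrt((((f - f.mean()) - (o - o.mean())) ** 2).mean())
    return mae, cc, bias, crmse
```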

A deterministic, single-valued forecast can be obtained from any ensemble prediction by taking the mean or the median of the ensemble members at each forecast lead time. For the analysis in this section, the median of the 20 AnEn member values is used. This choice is motivated by our goal of minimizing the MAE: Gneiting (2011) demonstrated that the median is the optimal point forecast under the absolute-error criterion, so it yields a lower MAE than the ensemble mean, as the toy simulation below illustrates.
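The simulation is self-contained, and the distributions are arbitrary choices for illustration only: when the verifying observation and the ensemble members are drawn from the same skewed distribution, the ensemble median produces a lower MAE than the ensemble mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_mem = 10_000, 20
# Each case has a skewed (exponential) predictive distribution; the
# observation and the 20 members are independent draws from it.
loc = 5.0 * rng.normal(size=(n_cases, 1))
obs = loc[:, 0] + rng.exponential(20.0, size=n_cases)
ens = loc + rng.exponential(20.0, size=(n_cases, n_mem))

mae_mean = np.abs(ens.mean(axis=1) - obs).mean()
mae_med = np.abs(np.median(ens, axis=1) - obs).mean()
print(mae_mean, mae_med)   # the median's MAE is consistently lower
```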

In Fig. 4, the EP and ATL AnEn forecasts are compared in terms of MAE with HWRF, IVCN, the AnEn VMAX forecast that includes the IVCN VMAX predictor (hereafter AnEn IVCN), and QuMap. In Fig. 4 and all subsequent figures, only the forecast events (cases) where each of the models and the observations are available are considered. In Table 3, the number of available cases is specified by lead time and basin. In addition, we provide the number of cases where HWRF VMAX data are available but the verifying best track data are not. There are two primary reasons for missing best track data. At earlier lead times, observations are most often missing because the storm is at such an early stage of development (i.e., an “invest”) that the data are not archived in the best track dataset. As lead time increases, missing observations are more often related to cases where the storm has decayed to the point that it is no longer tracked in the best track dataset, yet HWRF has continued to maintain the TC and produce an intensity prediction. As can be seen in Table 3, the number of missing cases at later lead times is fairly substantial, but those cases are used neither in the AnEn development nor in the HWRF and AnEn evaluation shown here. Also, since IVCN is an “early model,” generated 6 h before each HWRF initialization time, comparisons of deterministic forecast verification metrics with HWRF are not entirely fair, as IVCN would likely exhibit some improvement if it were produced at the same time as HWRF. Regardless of these timing differences, the IVCN can still be incorporated as an objective predictor. The vertical bars in Fig. 4 indicate the 95% bootstrap confidence intervals computed via a block bootstrap technique to account for serial correlation (a minimal sketch of this resampling is given after this paragraph). The confidence intervals to the left of the vertical line are obtained by pooling together all the lead times, whereas the ones plotted close to the x axis are computed for each lead time on the absolute error differences between HWRF and AnEn (black) and between HWRF and AnEn IVCN (green). For the latter, the MAE of the AnEn with or without IVCN can be considered statistically significantly lower than HWRF’s only if the segment lies entirely below the x axis. For the EP, the AnEn and AnEn IVCN significantly outperform HWRF at almost all lead times up to 24 and 81 h, respectively (except for lead times of 42 and 75 h), but only up to 12 and 21 h, respectively, over the ATL. QuMap improves over HWRF at all lead times, and its performance is similar to that of AnEn IVCN after 51 h for the EP. For the ATL, its performance is very similar to HWRF and IVCN, with worse performance than AnEn IVCN up to about 24 h and lower MAE values at longer lead times.
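A minimal sketch of a moving-block bootstrap for these MAE-difference confidence intervals follows. The block length, interface, and variable names are our assumptions, not the configuration used in this study.

```python
import numpy as np

def block_bootstrap_ci(diff, block_len=5, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean of `diff`, e.g., the
    absolute-error differences |AnEn error| - |HWRF error| at one lead time.
    Whole blocks of consecutive cases are resampled so that serial
    correlation between forecasts of the same storm is preserved."""
    diff = np.asarray(diff, dtype=float)
    rng = np.random.default_rng(seed)
    n = diff.size
    n_blocks = int(np.ceil(n / block_len))
    # Random block start indices; each row of `starts` builds one replicate
    starts = rng.integers(0, n - block_len + 1, size=(n_boot, n_blocks))
    idx = (starts[:, :, None] + np.arange(block_len)).reshape(n_boot, -1)[:, :n]
    means = diff[idx].mean(axis=1)
    return np.quantile(means, [alpha / 2.0, 1.0 - alpha / 2.0])
```

If the resulting interval for the mean difference lies entirely below zero, the AnEn MAE can be considered significantly lower than HWRF’s at that lead time, matching the interpretation of the segments below the x axis in Fig. 4.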

Fig. 4.

The MAE as a function of lead time for the median AnEn VMAX forecasts (black), the median AnEn IVCN VMAX forecasts (green), the HWRF (blue), the IVCN VMAX (red), and the median of the QuMap forecasts for the (a) EP and (b) ATL. The vertical bars to the left of the vertical line indicate the bootstrap confidence intervals obtained by pooling all the lead times together. The vertical bars at the bottom indicate the 5%–95% bootstrap confidence intervals by lead time of the MAE differences [AnEn − HWRF (black) and AnEn IVCN − HWRF (green)]. The legend at the top left of each panel indicates the MAE values computed using all lead times over the 1729 and 1329 forecasts for the EP and ATL, respectively, over the period 2011–15.


Table 3.

Number of valid forecast/observation pairs (left number) by lead time and basin used to compute the indices in Figs. 4, 5, and 6, and number of cases (right number) where HWRF VMAX forecasts are generated but the corresponding VMAX best track observations are missing. These missing cases are not included in the sample sizes at left, nor are they used in the evaluation.

The benefit of including the IVCN VMAX predictor is clear in both the EP and ATL. The AnEn IVCN (green) outperforms IVCN up to 21 h over the ATL and at all lead times for the EP. The bootstrap confidence intervals, computed by pooling all the lead times together (left of the vertical segment), indicate that the MAE improvements for the AnEn with IVCN VMAX, compared to all the other models, are statistically significant only in the EP.

In terms of correlation (Fig. 5), all of the models display higher values in the EP than in the ATL after 24 h. Relative improvements among the different models reflect what was already observed for the MAE. In Fig. 6, the CRMSE and BIAS are shown. HWRF exhibits a small negative BIAS up to 75 h for the ATL, while its negative BIAS is larger in magnitude in the EP (exceeding about −5 kt after 12 h; 1 kt = 0.5144 m s−1). Over the ATL, the AnEn does not improve upon the HWRF BIAS, except after 84 h (although that improvement is not statistically significant). In the first 24 h, the ATL AnEn IVCN outperforms HWRF in terms of the CRMSE with a similar BIAS; therefore, the MAE improvements in the ATL over the same range of lead times can be attributed to a reduction in random errors and conditional biases. Similar to HWRF, the IVCN forecast (red) exhibits a low but increasingly positive BIAS over the ATL after forecast hour 69 and an increasingly negative BIAS over the EP at all lead times. The AnEn IVCN (green) has a smaller absolute BIAS than the IVCN at all lead times over the EP. Given that, together with the CRMSE results, the overall MAE improvement of the AnEn with respect to IVCN over the EP can be attributed to a reduction of both random and systematic errors. Overall, the AnEn IVCN is quite competitive with IVCN and improves upon the HWRF prediction at the earlier lead times in the ATL and at nearly all lead times in the EP.

Fig. 5.

As in Fig. 4, but for the CC. The vertical bars indicate the 5%–95% bootstrap confidence intervals by lead time plotted for AnEn IVCN and HWRF only to reduce clutter.


Fig. 6.

As in Fig. 4, but for the CRMSE (dashed) and the BIAS (solid). The vertical bars indicate the 5%–95% bootstrap confidence intervals by lead time plotted for only the AnEn IVCN and HWRF to reduce clutter.


QuMap reduces the HWRF BIAS in both basins. However, in contrast with the AnEn, it never significantly improves the HWRF CRMSE; the CRMSE values of QuMap and HWRF follow each other closely across all lead times in both basins.

It is also insightful to compare simple metrics, such as the MAE of the AnEn, with those produced by standard NHC baseline tools and by some of NHC’s top-performing intensity models. We consider the Decay-Statistical Hurricane Intensity Forecast, version 5 (SHF5; DeMaria et al. 2006) in order to measure the performance of the AnEn with respect to a key Hurricane Forecast Improvement Program (HFIP) metric (Gall et al. 2013). We also compare the AnEn with a top-performing statistical model, the LGEM (DeMaria 2009), and with the IVCN. As a caveat, the archived NHC forecasts obtained online were produced by models that are typically updated each year, whereas the HWRF and AnEn in this study benefit from using the 2015 configuration of HWRF over the entire 2011–15 period. For each lead time, the MAE is computed only when forecasts are produced by all of the models, to achieve a fair comparison. The MAE results for both the EP and ATL are shown in Fig. 7, with the MAE of the IVCN and HWRF intensity forecasts plotted once again for comparison. The LGEM statistical model and the SHF5 baseline produce higher MAE than all other models after 24 and 18 h in the EP and ATL, respectively, with the SHF5 producing the highest error. Overall, the AnEn, HWRF, IVCN, and the official NHC VMAX forecasts are competitive with one another as top performers, and the AnEn IVCN even exceeds the skill of the official NHC forecast at numerous lead times.

Fig. 7.

The MAE as a function of lead time of the median AnEn VMAX forecasts (black), the median AnEn IVCN VMAX forecasts (green), and the VMAX forecasts from HWRF (blue), NHC (orange), SHF5 (magenta), LGEM (cyan), and IVCN (red) for both the (a) EP and (b) ATL. The sample size per lead time is included below the x axis.


c. Probabilistic forecast verification (spread–skill relationship, statistical consistency, reliability, and resolution)

We first assess whether the probabilistic prediction systems of AnEn and QuMap can quantify their own uncertainty by compiling binned spread–skill and dispersion diagrams. In spread–skill diagrams, the ensemble spread is compared with the RMSE of the ensemble mean over small class intervals (i.e., bins) of spread (Wang and Bishop 2003; Hopson 2014). In the dispersion diagrams, however, the same comparison is carried out by considering the overall average of both RMSE and spread at any lead time (Buizza et al. 2005; Fortin et al. 2014). A good correlation in the spread–skill diagram indicates that an ensemble system can predict its uncertainty (Hopson 2014). Ideally, RMSE should match the spread at all classes of values, resulting in a trend lying on the plot’s 1:1 diagonal. In the dispersion diagram, the ensemble spread should match the RMSE at any lead time.
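A compact way to build such a binned spread–skill diagram is sketched below: sort the cases by ensemble spread, split them into equal-population bins, and compare each bin’s mean spread with the RMSE of the ensemble mean in that bin. The interface and bin count are illustrative assumptions.

```python
import numpy as np

def binned_spread_skill(spread, err, n_bins=10):
    """Binned spread vs RMSE of the ensemble mean.

    spread : (N,) ensemble standard deviations
    err    : (N,) ensemble mean minus observation
    Equal-population bins are used, so bin widths differ.
    """
    spread = np.asarray(spread, dtype=float)
    err = np.asarray(err, dtype=float)
    order = np.argsort(spread)
    bins = np.array_split(order, n_bins)   # equal-count bins of case indices
    mean_spread = np.array([spread[b].mean() for b in bins])
    rmse = np.array([np.sqrt((err[b] ** 2).mean()) for b in bins])
    return mean_spread, rmse   # a perfect relationship lies on the 1:1 line
```

The correlation R quoted in Fig. 8 is computed on the unbinned values, e.g., np.corrcoef(spread, np.abs(err))[0, 1], consistent with the statement below that the coefficients are computed before averaging over the bins.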

Both the dispersion diagram and the spread–skill diagram quantify the statistical consistency of an ensemble system. An ensemble is statistically consistent if the observations are indistinguishable from the predictions, meaning that the observations and the ensemble members are samples from the same distribution. In that case, the ensemble standard deviation (i.e., the ensemble spread) and the RMSE of the ensemble mean should be equal. However, equality of spread and RMSE is only a necessary, not a sufficient, condition for statistical consistency when the distributions are not Gaussian, because the higher-order moments of the distributions are not evaluated in these diagrams.

Binned spread–skill diagrams for both the EP and ATL are depicted in Fig. 8. Each bin has the same number of spread–RMSE pairs, which results in bins of different widths. For both the EP and the ATL, the AnEn IVCN possesses a spread–skill relationship similar to QuMap’s. In both ocean basins, the spread of both ensembles slightly underestimates the errors, except for the class corresponding to the highest spread, where the opposite occurs: when predicting a large spread, both systems tend to overestimate the verifying error of the ensemble mean. Overall, however, the EP ensembles suffer less from such errors. Figure 8 also shows the correlation coefficient R between the absolute errors and the ensemble spreads. As in Goerss and Sampson (2014), these correlation coefficients are computed on the original values before averaging over the bins. Compared to the intensity-error prediction models in Goerss and Sampson (2014), the R coefficients of both AnEn IVCN and QuMap are slightly higher, although this comparison is only qualitative, as the datasets are different. The AnEn IVCN slightly outperforms QuMap, with a higher R value over both basins. The dispersion diagrams (Fig. 8, bottom panels) show a similar consistency for the QuMap and the AnEn, but both systems are underdispersive after about 24 h in both basins, even though (based on the bootstrap confidence intervals) the spread underestimation is still acceptable up to about 60 h for the EP and 39 h for the ATL. The AnEn’s underdispersive behavior can be attributed to the cost function chosen during the predictor and weight selection (see section 4a); that is, optimizing the performance in terms of the MAE of the ensemble median does not necessarily yield optimal statistical consistency.

Fig. 8.

(top) Binned spread–skill plots and (bottom) dispersion diagrams computed in the (left) EP and (right) ATL for AnEn IVCN and QuMap. (a),(b) The correlation coefficient (R) between the absolute error and ensemble spread is indicated in the legends. They are computed as in Goerss and Sampson (2014) on the original values before the average over the bins. (c),(d) RMSE (solid) and ensemble spread (dashed) of AnEn IVCN as a function of the forecast lead time; the 5%–95% bootstrap confidence intervals are plotted for AnEn IVCN only to reduce clutter.


The continuous ranked probability score (CRPS) is computed to assess the overall quality of the AnEn and QuMap probabilistic predictions (Brown 1974; Carney and Cunningham 2006). It can be expressed as
$$\mathrm{CRPS} = \frac{1}{N}\sum_{i=1}^{N}\int_{-\infty}^{+\infty}\left[F_i^{f}(x) - F_i^{o}(x)\right]^2\,dx, \qquad (6)$$
where $F_i^{f}(x)$ is the cumulative distribution function (CDF) of the probabilistic forecast and $F_i^{o}(x)$ is the CDF of the observation (a Heaviside step function centered on the observed value) for the ith ensemble prediction/observation pair, and N is the number of available pairs. Hersbach (2000) showed that the CRPS reduces to the MAE if a deterministic (single member) forecast is considered. Like the MAE, a lower value of CRPS indicates better skill, with 0 being a perfect score, and the CRPS has the same unit of measurement as the forecast variable. The CRPS can also be conceptualized as the Brier score (Brier 1950) integrated over all possible threshold values, since a full probabilistic distribution is compared with the observations as represented by their CDFs. In Fig. 9, the CRPS as a function of forecast lead time is shown for both AnEn and QuMap and the two datasets. With the CDFs of the forecasts and observations represented by discrete values, the CRPS has been computed by numerical integration using the function “crpsDecomposition” provided in the software library “verification” (NCAR 2014). For the EP, the AnEn outperforms QuMap by about 5% when all the lead times are pooled together, with the CRPS being statistically significantly lower up to 18 h. For the ATL, the overall performance of the AnEn is very similar to QuMap’s, even though the CRPS is statistically significantly lower for the AnEn up to 12 h. By looking at the different components of the CRPS, we aim to understand which attribute of a probabilistic prediction leads to the improvement of the AnEn over the QuMap. Similar to the Brier score (Murphy 1973), Hersbach (2000) demonstrated that the CRPS can be decomposed into three components: reliability (REL), resolution (RES), and uncertainty (UNC). It is worth noting that even though the CRPS can be seen as the Brier score integrated over all possible events, its components (here indicated as REL, RES, and UNC) do not numerically match the integrated components of the Brier score.
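For an M-member ensemble whose forecast CDF is the empirical step function of the members, the integral in Eq. (6) has an exact closed form, CRPS = E|X − y| − (1/2)E|X − X′|, which can serve as a cross-check on numerical integration routines such as the crpsDecomposition function mentioned above. A minimal sketch (the function name and interface are ours):

```python
import numpy as np

def crps_ensemble(ens, obs):
    """Mean CRPS over N cases for an (N, M) ensemble array `ens` and (N,)
    observations `obs`, using the identity
    CRPS = E|X - y| - 0.5 * E|X - X'| for the empirical member CDF."""
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    term1 = np.abs(ens - obs[:, None]).mean(axis=1)                      # E|X - y|
    term2 = np.abs(ens[:, :, None] - ens[:, None, :]).mean(axis=(1, 2))  # E|X - X'|
    return float((term1 - 0.5 * term2).mean())
```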
Fig. 9.

CRPS as a function of forecast lead time for the (left) EP and (right) ATL datasets; 5%–95% bootstrap confidence intervals are also plotted. The vertical bars next to the left vertical axis indicate CRPS bootstrap intervals considering all the lead times together. The legend at the top indicates the overall values of CRPS, potential CRPS (CRPS POT), and reliability (REL).


The REL component measures the statistical consistency of the ensemble system (the lower, the better): that is, how well the forecast probabilities match the observed frequencies. REL is closely connected to the rank histogram (Talagrand et al. 1997), which measures whether the frequency of the observations falling in each ensemble bin is equal across all the bins. The RES component measures how much better a system performs than a climatological forecast, where the climatological forecast is simply the probability of an event as observed in the historical dataset. In general, the resolution attribute of a system reflects how well the different forecast frequency classes separate the different observed frequencies from the climatological mean. The UNC component, which depends only on the observations, measures the variability of the observations and thus reflects the predictability associated with the available dataset. More specifically, the CRPS can be expressed as CRPS = REL + CRPSPOT, where CRPSPOT is the potential CRPS, given by CRPSPOT = UNC − RES. The potential CRPS is the CRPS that would be obtained if the forecasting system were perfectly reliable (REL = 0). For the details of the mathematical formulation of the three components, we direct the reader to Hersbach (2000).
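As a toy numerical illustration of this bookkeeping (the values below are invented, not results from this study):

```python
# Hypothetical decomposition values in knots, for illustration only
rel, res, unc = 0.4, 2.1, 7.5
crps_pot = unc - res     # potential CRPS = UNC - RES = 5.4 kt
crps = rel + crps_pot    # full CRPS = REL + potential CRPS = 5.8 kt
```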

The overall values of CRPSPOT and REL (legends at the top of Fig. 9) indicate that the AnEn IVCN performs quite similarly to QuMap in terms of reliability, consistent with the dispersion diagrams and binned spread–skill plots shown earlier. Given that UNC is identical for QuMap and AnEn, a lower CRPSPOT indicates a better (higher) resolution. This means that the improvements in the AnEn’s CRPS over the QuMap’s are achieved through higher resolution. This conclusion holds at all lead times in the EP, and in the ATL only at the earlier lead times, where the AnEn CRPS values are lower than those of QuMap.

d. AnEn real-time performance

Both the deterministic and probabilistic forecast verification results above show how the AnEn compares with observations and other intensity prediction methods. However, those results may exaggerate the AnEn’s real-time performance, since the HWRF reforecasts from 2011 to 2014 use best track location, intensity, and structure information; only the real-time HWRF data from 2015 used in deriving the AnEn are based on the less accurate operational estimates of these TC parameters. To obtain a better estimate of the AnEn’s true real-time performance, we now examine more closely the AnEn’s deterministic real-time performance in the 2015 hurricane seasons of the eastern Pacific and Atlantic. The 2015 AnEn VMAX predictions are equivalent to those that would be generated in a real-time, operational environment, because the archived (not rerun) 2015 HWRF operational forecasts have been used, and the training (predictor selection and weight optimization) has been carried out over past model reforecasts and observations (i.e., 2011–14). In addition, we also compare the real-time AnEn with the operational 2015 HWRF ensemble.

Figure 10 shows the MAE associated with the AnEn, AnEn IVCN, HWRF, IVCN, and QuMap in 2015 for both the eastern Pacific and Atlantic. With respect to lead time, the MAE values for each model in 2015 share some qualitative similarities with those for the entire 2011–15 sample, but there are some important differences. In the eastern Pacific, the AnEn and AnEn IVCN once again perform better than HWRF at most lead times in 2015, but the improvements are statistically significant at fewer lead times than in the full sample, according to the bootstrap confidence intervals depicted in Fig. 10. Also, the improvement of the AnEn IVCN (MAE = 9.48 kt) relative to the IVCN itself (MAE = 9.71 kt) is not as large in the eastern Pacific in 2015 as in the full 2011–15 sample (8.95 vs 9.72 kt, respectively). Some degradation in model performance is also seen for the Atlantic in 2015; notably, the QuMap MAE is worse at all lead times than over the full 2011–15 period. Also, similar to Fig. 4b for the full sample, the AnEn IVCN performs better than the IVCN and HWRF in the Atlantic only at the earlier lead times (Fig. 10b). Overall, for both ocean basins, the AnEn still produces competitive to superior deterministic forecast skill in the real-time setting when compared against its baseline models.

Fig. 10.

As in Fig. 4, but for 2015 only. Also, the number of samples per lead time is indicated below the x axis.


To gain an idea of how the AnEn’s real-time performance compares with that of operational dynamical ensemble systems, we compare the real-time AnEn IVCN with forecasts from NCEP’s 2015 operational configuration of the 20-member HWRF ensemble for both the EP and ATL 2015 hurricane seasons. We use the raw HWRF ensemble output in the comparison, which includes 20 ensemble members and a control run; the verification metrics for the HWRF ensemble shown here could likely be improved through calibration and other postprocessing methods. Also, the HWRF ensemble data are available only for a limited subset of all 2015 hurricanes and forecast times. Using only the forecast times at which all HWRF ensemble member and AnEn forecasts are simultaneously available, 130 and 158 forecasts exist for a homogeneous comparison in the EP and ATL, respectively.

Figure 11 provides dispersion diagrams comparing the ensemble spread and RMSE of the AnEn configuration that uses the IVCN predictor and the HWRF ensemble. The comparison is carried out only to a lead time of 99 h. The number of samples available decreases monotonically from lead times of 3 to 99 h in both ocean basins, from 100 to 50 in the EP and from 131 to 61 in the ATL. Because of the limited sample sizes, we employ bootstrap confidence intervals on the RMSE at the 5% and 95% levels to quantify the uncertainty inherent in the validation. Looking at the RMSE of both models in the EP, the AnEn IVCN performs better than the HWRF ensemble at all lead times, but given the large bootstrap uncertainty, the differences in performance are statistically significant only at the 3-, 6-, 66-, and 69-h lead times. Notably, the AnEn IVCN spread matches the RMSE relatively well at most lead times, whereas the HWRF ensemble suffers from a lack of spread as lead time increases. It should be noted, however, that while the AnEn naturally produces better dispersion, calibration may improve the HWRF ensemble’s dispersion. In contrast with the EP, the RMSE of the AnEn IVCN is slightly worse than that of the HWRF ensemble in the ATL after 12 h, although the RMSE differences between the two ensembles are not statistically significant at any lead time. Also, the AnEn IVCN is generally more underdispersive in the ATL than in the EP. Nonetheless, consistent with most dynamical ensemble prediction systems, the uncalibrated HWRF ensemble still suffers from worse underdispersion at most lead times in the ATL. Overall, these results highlight the naturally good dispersion of analog ensembles relative to uncalibrated operational dynamical ensemble prediction systems and show that the RMSE of the computationally inexpensive AnEn is competitive with that of the HWRF ensemble.

Fig. 11.

Dispersion diagrams [(a) EP, (b) ATL] show RMSE (solid) and ensemble spread (dashed) of AnEn IVCN (black) and the 2015 HWRF ensemble (red) as a function of forecast lead time; the 5%–95% bootstrap confidence intervals associated with RMSE are plotted for both ensembles. The sample size per lead time is included below the x axis.


5. Summary

The analog ensemble (AnEn) technique is applied to generate probabilistic predictions of the maximum intensity (VMAX) of tropical cyclones (TCs) in both the eastern Pacific (EP) and Atlantic (ATL) Ocean basins. Two datasets are used: retrospective forecasts over the 0–126-h lead-time range from the 2015 configuration of the operational Hurricane Weather Research and Forecasting (HWRF) Model (2011–15), and maximum 1-min average 10-m wind speed observations from the HURDAT2 dataset. A consensus of top-performing models (IVCN) was also tested as an additional predictor in the EP and ATL AnEn systems. The AnEn median has been used to generate single-valued deterministic predictions that can be compared with the HWRF and IVCN VMAX forecasts. Without IVCN VMAX as a predictor, the AnEn significantly outperforms HWRF in terms of mean absolute error (MAE) up to 24 h in the EP and 12 h in the ATL. With IVCN VMAX as a predictor, the AnEn outperforms both HWRF and IVCN at all lead times in the EP (the improvements are statistically significant only up to 81 h, with the exception of the 42- and 75-h lead times) and up to 24 h in the ATL. Also, in terms of MAE, the AnEn incorporating IVCN is competitive with NHC’s official VMAX forecast in both the EP and ATL.

A primary question is why the AnEn, with and without IVCN, performed significantly better relative to HWRF (in terms of MAE) in the EP than in the ATL. Some statistical insight was provided by the plots of bias, centered root-mean-square error, and correlation. These show that at lead times where the AnEn improved upon HWRF, both systematic and random errors were typically reduced, and that a larger systematic negative bias in the HWRF VMAX forecasts in the EP allowed greater relative improvements by the AnEn in that basin. Moreover, the correlations at longer lead times were higher for HWRF, IVCN, and the AnEn in the EP than in the ATL. This suggests that intensity predictability is likely higher in the EP than in the ATL, consistent with past studies (e.g., Kaplan et al. 2015). Indeed, Atlantic storms are subject to greater variability and complexity in terms of the latitudinal extent of storm tracks, interactions with extratropical weather systems, land interactions, and the Saharan air layer. The eastern Pacific exhibits similar types of variability, but storm tracks there are confined to a narrower meridional band, because a sharp meridional sea surface temperature gradient prevents intense TCs from reaching as far north as in the ATL; land interactions are also relatively rare in the EP. This may imply that an ATL-based AnEn would benefit from a larger training sample in order to have a sufficiently diverse set of historical analogs to draw from.

One of the strongest attributes of the AnEn is that, as an ensemble, its deterministic intensity forecast is accompanied by a measure of the forecast uncertainty. When evaluated as an ensemble prediction using the continuous ranked probability score (CRPS), the AnEn IVCN significantly outperforms a quantile mapping (QuMap) ensemble only up to 18 and 12 h over the EP and ATL, respectively. Like the QuMap, the AnEn exhibits a very good spread–skill relationship (i.e., the ability to reliably quantify its own uncertainty), which may be valuable in the decision-making process. The decomposition of the CRPS shows that the AnEn’s improvements over the QuMap can be attributed to better resolution, given that the reliability of the two systems is similar. A comparison of the HWRF-based AnEn IVCN with the operational HWRF ensemble for forecasts in the 2015 hurricane season shows that the AnEn skill is competitive with dynamical ensemble predictions but at a much lower real-time computational expense (the AnEn takes less than a second to generate a 0–96-h VMAX forecast on a standard personal computer, while the HWRF ensemble takes about 3 h running in parallel on a multinode cluster). Moreover, the spread–skill relationship of the AnEn IVCN remains good in this analysis, while the uncalibrated HWRF ensemble produces inadequate spread as the RMSE grows at later lead times.

Overall, the AnEn shows promise in the realm of TC intensity prediction, and it can be used with any dynamical model framework as long as that model has an adequately large training dataset of archived forecasts or reforecasts. Also, there are ongoing efforts to apply the AnEn in other TC forecasting problems, such as predicting the rate of change of intensity, which may be a more direct and effective way to address rapid intensification.

Acknowledgments

The National Center for Atmospheric Research is sponsored by the National Science Foundation. We appreciate the help of Vijay Tallapragada, Zhan Zhang, and Avichal Mehra in obtaining real-time and reforecast HWRF data and access to NOAA computing resources. A discussion with Chris Davis helped inspire the use of IVCN as a predictor in the AnEn. Ryan Torn and Zhan Zhang are also thanked for sharing 2015 configurations of WRF-based dynamical ensemble prediction systems for comparison with the AnEn. Two anonymous reviewers, Buck Sampson, and editors Carolyn Reynolds and Ron McTaggart-Cowan provided invaluable critical feedback that led to significant improvements to this manuscript. Finally, we thank the NOAA Hurricane Forecast Improvement Program for supporting this research under Grants NA14NWS4680024 and NA16NWS4680027.

APPENDIX

Predictors Tested in AnEn

Table A1 provides a list of all 64 predictors tested in the development of the AnEn.

Table A1.

A complete list of all predictors tested in the development of the AnEn.

REFERENCES

• Aksoy, A., S. Lorsolo, T. Vukicevic, K. J. Sellwood, S. D. Aberson, and F. Zhang, 2012: The HWRF Hurricane Ensemble Data Assimilation System (HEDAS) for high-resolution data: The impact of airborne Doppler radar observations in an OSSE. Mon. Wea. Rev., 140, 1843–1862, https://doi.org/10.1175/MWR-D-11-00212.1.

• Alessandrini, S., L. Delle Monache, S. Sperati, and J. N. Nissen, 2015a: A novel application of an analog ensemble for short-term wind power forecasting. Renew. Energy, 76, 768–781, https://doi.org/10.1016/j.renene.2014.11.061.

• Alessandrini, S., L. Delle Monache, S. Sperati, and G. Cervone, 2015b: An analog ensemble for short-term probabilistic solar power forecast. Appl. Energy, 157, 95–110, https://doi.org/10.1016/j.apenergy.2015.08.011.

• Bao, J.-W., S. G. Gopalakrishnan, S. A. Michelson, F. D. Marks, and M. T. Montgomery, 2012: Impact of physics representations in the HWRFX on simulated hurricane structure and pressure–wind relationships. Mon. Wea. Rev., 140, 3278–3299, https://doi.org/10.1175/MWR-D-11-00332.1.

• Biswas, M. K., L. Bernardet, and J. Dudhia, 2014: Sensitivity of hurricane forecasts to cumulus parameterizations in the HWRF Model. Geophys. Res. Lett., 41, 9113–9119, https://doi.org/10.1002/2014GL062071.

• Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3, https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.

• Brown, T. A., 1974: Admissible scoring systems for continuous distributions. The Rand Corporation Doc. P-5235, 22 pp., https://www.rand.org/pubs/papers/P5235.html.

• Buizza, R., P. L. Houtekamer, G. Pellerin, Z. Toth, Y. Zhu, and M. Wei, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097, https://doi.org/10.1175/MWR2905.1.

• Carney, M., and P. Cunningham, 2006: Evaluating density forecasting models. Trinity College Dublin, Department of Computer Science Rep., 12 pp.

• Davò, F., S. Alessandrini, S. Sperati, L. Delle Monache, D. Airoldi, and M. T. Vespucci, 2016: Post-processing techniques and principal component analysis for regional wind power and solar irradiance forecasting. Sol. Energy, 134, 327–338, https://doi.org/10.1016/j.solener.2016.04.049.

• Delle Monache, L., T. Nipen, Y. Liu, G. Roux, and R. Stull, 2011: Kalman filter and analog schemes to postprocess numerical weather predictions. Mon. Wea. Rev., 139, 3554–3570, https://doi.org/10.1175/2011MWR3653.1.

• Delle Monache, L., F. A. Eckel, D. L. Rife, B. Nagarajan, and K. Searight, 2013: Probabilistic weather prediction with an analog ensemble. Mon. Wea. Rev., 141, 3498–3516, https://doi.org/10.1175/MWR-D-12-00281.1.

• DeMaria, M., 2009: A simplified dynamical system for tropical cyclone intensity prediction. Mon. Wea. Rev., 137, 68–82, https://doi.org/10.1175/2008MWR2513.1.

• DeMaria, M., M. Mainelli, L. K. Shay, J. A. Knaff, and J. Kaplan, 2005: Further improvements to the Statistical Hurricane Intensity Prediction Scheme (SHIPS). Wea. Forecasting, 20, 531–543, https://doi.org/10.1175/WAF862.1.

• DeMaria, M., J. A. Knaff, and J. Kaplan, 2006: On the decay of tropical cyclone winds crossing narrow landmasses. J. Appl. Meteor. Climatol., 45, 491–499, https://doi.org/10.1175/JAM2351.1.

• DeMaria, M., C. R. Sampson, J. A. Knaff, and K. D. Musgrave, 2014: Is tropical cyclone intensity guidance improving? Bull. Amer. Meteor. Soc., 95, 387–398, https://doi.org/10.1175/BAMS-D-12-00240.1.

• Déqué, M., 2007: Frequency of precipitation and temperature extremes over France in an anthropogenic scenario: Model results and statistical correction according to observed values. Global Planet. Change, 57, 16–26, https://doi.org/10.1016/j.gloplacha.2006.11.030.

• Djalalova, I., L. Delle Monache, and J. Wilczak, 2015: PM2.5 analog forecast and Kalman filtering post-processing for the Community Multiscale Air Quality (CMAQ) model. Atmos. Environ., 119, 431–442, https://doi.org/10.1016/j.atmosenv.2015.05.057.

• Doyle, J. D., and Coauthors, 2011: Real-time tropical cyclone prediction using COAMPS-TC. Adv. Geosci., 28, 15–28, https://doi.org/10.1142/9789814405683_0002.

• Efroymson, M. A., 1960: Multiple regression analysis. Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., Vol. 1, Wiley and Sons, 191–203.

• Emanuel, K. A., 1986: An air–sea interaction theory for tropical cyclones. Part I: Steady-state maintenance. J. Atmos. Sci., 43, 585–605, https://doi.org/10.1175/1520-0469(1986)043<0585:AASITF>2.0.CO;2.

• Ferrier, B. S., Y. Jin, Y. Lin, T. Black, E. Rogers, and G. DiMego, 2002: Implementation of a new grid-scale cloud and precipitation scheme in the NCEP Eta model. 19th Conf. on Weather Analysis and Forecasting/15th Conf. on Numerical Weather Prediction, San Antonio, TX, Amer. Meteor. Soc., 10.1, https://ams.confex.com/ams/SLS_WAF_NWP/techprogram/paper_47241.htm.

• Fortin, V., M. Abaza, F. Anctil, and R. Turcotte, 2014: Why should ensemble spread match the RMSE of the ensemble mean? J. Hydrometeor., 15, 1708–1713, https://doi.org/10.1175/JHM-D-14-0008.1.

• Gall, R., J. Franklin, F. Marks, E. N. Rappaport, and F. Toepfer, 2013: The Hurricane Forecast Improvement Project. Bull. Amer. Meteor. Soc., 94, 329–343, https://doi.org/10.1175/BAMS-D-12-00071.1.

• Gneiting, T., 2011: Making and evaluating point forecasts. J. Amer. Stat. Assoc., 106, 746–762, https://doi.org/10.1198/jasa.2011.r10138.

• Goerss, J. S., and C. R. Sampson, 2014: Prediction of consensus tropical cyclone intensity forecast error. Wea. Forecasting, 29, 750–762, https://doi.org/10.1175/WAF-D-13-00058.1.

• Gopalakrishnan, S. G., F. Marks, X. Zhang, J.-W. Bao, K.-S. Yeh, and R. Atlas, 2011: The Experimental HWRF System: A study on the influence of horizontal resolution on the structure and intensity changes in tropical cyclones using an idealized framework. Mon. Wea. Rev., 139, 1762–1784, https://doi.org/10.1175/2010MWR3535.1.

• Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on reforecast analogs: Theory and application. Mon. Wea. Rev., 134, 3209–3229, https://doi.org/10.1175/MWR3237.1.

• Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15, 559–570, https://doi.org/10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2.

• Hopson, T. M., 2014: Assessing the ensemble spread–error relationship. Mon. Wea. Rev., 142, 1125–1142, https://doi.org/10.1175/MWR-D-12-00111.1.

• Janjic, Z. I., R. Gall, and M. E. Pyle, 2010: Scientific documentation for the NMM solver. NCAR Tech. Note NCAR/TN-477+STR, 53 pp., https://doi.org/10.5065/D6MW2F3Z.

• Jarvinen, B. R., and C. J. Neumann, 1979: Statistical forecasts of tropical cyclone intensity for the North Atlantic basin. NOAA Tech. Memo. NWS NHC-10, 22 pp., https://www.nhc.noaa.gov/pdf/NWS-NHC-1979-10.pdf.

• Junk, C., L. Delle Monache, S. Alessandrini, G. Cervone, and L. von Bremen, 2015: Predictor-weighting strategies for probabilistic wind power forecasting with an analog ensemble. Meteor. Z., 24, 361–379, https://doi.org/10.1127/metz/2015/0659.

• Kaplan, J., and Coauthors, 2015: Evaluating environmental impacts on tropical cyclone rapid intensification predictability utilizing statistical models. Wea. Forecasting, 30, 1374–1396, https://doi.org/10.1175/WAF-D-15-0032.1.

• Kossin, J. P., and M. Sitkowski, 2009: An objective model for identifying secondary eyewall formation in hurricanes. Mon. Wea. Rev., 137, 876–892, https://doi.org/10.1175/2008MWR2701.1.

• Kossin, J. P., and M. DeMaria, 2016: Reducing operational hurricane intensity forecast errors during eyewall replacement cycles. Wea. Forecasting, 31, 601–608, https://doi.org/10.1175/WAF-D-15-0123.1.

• Krishnamurti, T. N., R. Correa-Torres, G. Rohaly, D. Oosterhof, and N. Surgi, 1997: Physical initialization and hurricane ensemble forecasts. Wea. Forecasting, 12, 503–514, https://doi.org/10.1175/1520-0434(1997)012<0503:PIAHEF>2.0.CO;2.

• Krishnamurti, T. N., C. M. Kishtawal, Z. Zhang, T. LaRow, D. Bachiochi, E. Williford, S. Gadgil, and S. Surendran, 2000: Multimodel ensemble forecasts for weather and seasonal climate. J. Climate, 13, 4196–4216, https://doi.org/10.1175/1520-0442(2000)013<4196:MEFFWA>2.0.CO;2.

• Landsea, C. W., and J. L. Franklin, 2013: Atlantic hurricane database uncertainty and presentation of a new database format. Mon. Wea. Rev., 141, 3576–3592, https://doi.org/10.1175/MWR-D-12-00254.1.

• Landsea, C. W., J. L. Franklin, and J. Beven, 2015: The revised Atlantic hurricane database (HURDAT2). NOAA/NHC Doc., 6 pp., http://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf.

• Liu, Q., N. Surgi, S. Lord, W.-S. Wu, D. Parrish, S. Gopalakrishnan, J. Waldrop, and J. Gamache, 2006: Hurricane initialization in HWRF Model. 27th Conf. on Hurricanes and Tropical Meteorology, Monterey, CA, Amer. Meteor. Soc., 8A.2, https://ams.confex.com/ams/pdfpapers/108496.pdf.

• Miyamoto, Y., and T. Takemi, 2013: A transition mechanism for the spontaneous axisymmetric intensification of tropical cyclones. J. Atmos. Sci., 70, 112–129, https://doi.org/10.1175/JAS-D-11-0285.1.

• Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600, https://doi.org/10.1175/1520-0450(1973)012<0595:ANVPOT>2.0.CO;2.

• Nagarajan, B., L. Delle Monache, J. Hacker, D. Rife, K. Searight, J. Knievel, and T. Nipen, 2015: An evaluation of analog-based postprocessing methods across several variables and forecast models. Wea. Forecasting, 30, 1623–1643, https://doi.org/10.1175/WAF-D-14-00081.1.

• NCAR, 2014: Verification: Weather forecast verification utilities. R package version 1.40, accessed 1 February 2018, http://CRAN.R-project.org/package=verification.

• Nolan, D. S., J. A. Zhang, and E. W. Uhlhorn, 2014: On the limits of estimating the maximum wind speeds in hurricanes. Mon. Wea. Rev., 142, 2814–2837, https://doi.org/10.1175/MWR-D-13-00337.1.

• Pu, Z., S. Zhang, M. Tong, and V. Tallapragada, 2016: Influence of the self-consistent regional ensemble background error covariance on hurricane inner-core data assimilation with the GSI-based hybrid system for HWRF. J. Atmos. Sci., 73, 4911–4925, https://doi.org/10.1175/JAS-D-16-0017.1.

• Rogers, R., P. Reasor, and S. Lorsolo, 2013: Airborne Doppler observations of the inner-core structural differences between intensifying and steady-state tropical cyclones. Mon. Wea. Rev., 141, 2970–2991, https://doi.org/10.1175/MWR-D-12-00357.1.

• Sampson, C. R., J. L. Franklin, J. A. Knaff, and M. DeMaria, 2008: Experiments with a simple tropical cyclone intensity consensus. Wea. Forecasting, 23, 304–312, https://doi.org/10.1175/2007WAF2007028.1.

• Stevenson, S. N., K. L. Corbosiero, M. DeMaria, and J. L. Vigh, 2018: A 10-year survey of tropical cyclone inner-core lightning bursts and their relationship to intensity change. Wea. Forecasting, 33, 23–36, https://doi.org/10.1175/WAF-D-17-0096.1.

• Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems. Proc. ECMWF Workshop on Predictability, Reading, United Kingdom, ECMWF, 25 pp., https://www.ecmwf.int/sites/default/files/elibrary/1997/12555-evaluation-probabilistic-prediction-systems.pdf.

• Tallapragada, V., and Coauthors, 2015: Hurricane Weather Research and Forecasting (HWRF) Model: 2015 scientific documentation. NCAR Development Testbed Center Rep., 113 pp., http://www.dtcenter.org/HurrWRF/users/docs/scientific_documents/HWRF_v3.7a_SD.pdf.

• Torn, R. D., and C. Snyder, 2012: Uncertainty of tropical cyclone best-track information. Wea. Forecasting, 27, 715–729, https://doi.org/10.1175/WAF-D-11-00085.1.

• Trahan, S., and L. Sparling, 2012: An analysis of NCEP tropical cyclone vitals and potential effects on forecasting models. Wea. Forecasting, 27, 744–756, https://doi.org/10.1175/WAF-D-11-00063.1.

• Tsai, H.-C., and R. L. Elsberry, 2014: Applications of situation-dependent intensity and intensity spread predictions based on a weighted analog technique. Asia-Pac. J. Atmos. Sci., 50, 507–518, https://doi.org/10.1007/s13143-014-0040-7.

• Tsai, H.-C., and R. L. Elsberry, 2015: Weighted analog technique for intensity and intensity spread predictions of Atlantic tropical cyclones. Wea. Forecasting, 30, 1321–1333, https://doi.org/10.1175/WAF-D-15-0030.1.

• Velden, C. S., and Coauthors, 2006: The Dvorak tropical cyclone intensity estimation technique: A satellite-based method that has endured for over 30 years. Bull. Amer. Meteor. Soc., 87, 1195–1210, https://doi.org/10.1175/BAMS-87-9-1195.

• Vigh, J., and W. H. Schubert, 2009: Rapid development of the tropical cyclone warm core. J. Atmos. Sci., 66, 3335–3350, https://doi.org/10.1175/2009JAS3092.1.

• Wang, X., and C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. J. Atmos. Sci., 60, 1140–1158, https://doi.org/10.1175/1520-0469(2003)060<1140:ACOBAE>2.0.CO;2.

• Weng, Y., and F. Zhang, 2016: Advances in convection-permitting tropical cyclone analysis and prediction through EnKF assimilation of reconnaissance aircraft observations. J. Meteor. Soc. Japan, 94, 345–358, https://doi.org/10.2151/jmsj.2016-018.

• Zagrodnik, J. P., and H. Jiang, 2014: Rainfall, convection, and latent heating distributions in rapidly intensifying tropical cyclones. J. Atmos. Sci., 71, 2789–2809, https://doi.org/10.1175/JAS-D-13-0314.1.

• Zhang, F., Y. Weng, J. F. Gamache, and F. D. Marks, 2011: Performance of convection-permitting hurricane initialization and prediction during 2008–2010 with ensemble data assimilation of inner-core airborne Doppler radar observations. Geophys. Res. Lett., 38, L15810, https://doi.org/10.1029/2011GL048469.

• Zhang, J., D. S. Nolan, R. F. Rogers, and V. Tallapragada, 2015: Evaluating the impact of improvements in the boundary layer parameterization on hurricane intensity and structure forecasts in HWRF. Mon. Wea. Rev., 143, 3136–3155, https://doi.org/10.1175/MWR-D-14-00339.1.

• Zhao, Q., and Y. Jin, 2008: High-resolution radar data assimilation for Hurricane Isabel (2003) at landfall. Bull. Amer. Meteor. Soc., 89, 1355–1372, https://doi.org/10.1175/2008BAMS2562.1.

• Zhang, S. Q., M. Zupanski, A. Y. Hou, X. Lin, and S. H. Cheung, 2013: Assimilation of precipitation-affected radiances in a cloud-resolving WRF ensemble data assimilation system. Mon. Wea. Rev., 141, 754–772, https://doi.org/10.1175/MWR-D-12-00055.1.

• Zhang, Z., 2016: Introduction to the HWRF-based ensemble prediction system. 2016 Hurricane WRF Tutorial, College Park, MD, NCWCP, 31 pp., https://dtcenter.org/HurrWRF/users/tutorial/2016_NCWCP_tutorial/lectures/Wednesday-21-HWRFtutJan2016_Ensemble_Zhang.pdf.

• Zhang, Z., and T. N. Krishnamurti, 1999: A perturbation method for hurricane ensemble predictions. Mon. Wea. Rev., 127, 447–469, https://doi.org/10.1175/1520-0493(1999)127<0447:APMFHE>2.0.CO;2.
¹ HWRF produces forecasts out to 126 h, but the training sample size becomes smaller at each lead time, since many storms make landfall or weaken before the 126-h period is over. Therefore, AnEn forecasts beyond 96 h are not very reliable given the size of our training dataset.
