Search Results

Maxime Taillardat, Olivier Mestre, Michaël Zamo, and Philippe Naveau


Ensembles used for probabilistic weather forecasting tend to be biased and underdispersive. This paper proposes a statistical method for postprocessing ensembles based on quantile regression forests (QRF), a generalization of random forests for quantile regression. Unlike ensemble model output statistics (EMOS), this method does not fit a parametric probability density function (PDF); instead, it estimates the desired quantiles directly. This nonparametric approach makes no distributional assumption about the variable being calibrated. The method can estimate quantiles using not only the ensemble members but also any available predictor, including statistics on other variables.
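The quantile-estimation idea behind QRF can be illustrated with a minimal sketch: each tree partitions the predictor space, the training observations sharing a leaf with a new case receive weight, and quantiles are read from the resulting weighted empirical distribution. This is a simplification and not the authors' implementation. The trees below split on a random feature at its median instead of using the variance-minimizing splits of a real random forest, and the function names (`build_tree`, `fit_forest`, `predict_quantiles`) and the synthetic data are illustrative assumptions.

```python
import random

def build_tree(X, idx, min_leaf, rng):
    """Recursively split sample indices on a random feature at its median.
    Simplification: a real random forest chooses variance-minimizing splits."""
    if len(idx) < 2 * min_leaf:
        return {"leaf": idx}
    feat = rng.randrange(len(X[0]))
    vals = sorted(X[i][feat] for i in idx)
    thr = vals[len(vals) // 2]
    left = [i for i in idx if X[i][feat] < thr]
    right = [i for i in idx if X[i][feat] >= thr]
    if len(left) < min_leaf or len(right) < min_leaf:
        return {"leaf": idx}
    return {"feat": feat, "thr": thr,
            "left": build_tree(X, left, min_leaf, rng),
            "right": build_tree(X, right, min_leaf, rng)}

def fit_forest(X, n_trees=50, min_leaf=5, seed=0):
    """Grow each tree on a bootstrap resample of the training indices."""
    rng = random.Random(seed)
    n = len(X)
    return [build_tree(X, [rng.randrange(n) for _ in range(n)], min_leaf, rng)
            for _ in range(n_trees)]

def leaf_of(tree, x):
    """Drop a new case down the tree and return the indices in its leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feat"]] < tree["thr"] else tree["right"]
    return tree["leaf"]

def predict_quantiles(forest, y, x, probs):
    """Weighted empirical quantiles of y, weights from leaf co-membership."""
    weights = [0.0] * len(y)
    for tree in forest:
        leaf = leaf_of(tree, x)
        for i in leaf:
            weights[i] += 1.0 / (len(leaf) * len(forest))
    order = sorted(range(len(y)), key=lambda i: y[i])
    quantiles = []
    for p in probs:
        cum = 0.0
        for i in order:
            cum += weights[i]
            if cum >= p - 1e-12:
                quantiles.append(y[i])
                break
        else:
            quantiles.append(y[order[-1]])
    return quantiles

# Synthetic data (assumption): one predictor, observation = 2 * predictor + noise.
data_rng = random.Random(1)
X = [[data_rng.random()] for _ in range(300)]
y = [2.0 * row[0] + data_rng.gauss(0.0, 0.1) for row in X]
forest = fit_forest(X)
q_low = predict_quantiles(forest, y, [0.1], [0.1, 0.5, 0.9])
q_high = predict_quantiles(forest, y, [0.9], [0.1, 0.5, 0.9])
```

Because the quantiles come from a weighted sample of past observations, no distribution family has to be chosen, and extra predictors only add columns to `X`.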

The method is applied to the Météo-France 35-member ensemble forecast (PEARP) for surface temperature and wind speed for available lead times from 3 to 54 h and compared with EMOS. All postprocessed ensembles are much better calibrated than the raw PEARP ensemble. Experiments on real data also show that QRF performs better than EMOS and can bring a real gain for human forecasters. QRF provides sharp and reliable probabilistic forecasts. Finally, classical scoring rules used to verify predictive forecasts are complemented by the introduction of entropy as a general measure of reliability.
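One common way to turn entropy into a reliability measure is to compute the normalized entropy of the rank histogram, which equals 1 for a perfectly flat histogram and decreases as the histogram departs from uniformity. The sketch below assumes that definition; the abstract does not state the exact formula, and the function names and toy data are illustrative.

```python
import math

def rank_histogram(ensembles, observations):
    """Count, for each case, how many members lie below the observation."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts

def normalized_entropy(counts):
    """Entropy of the rank frequencies divided by its maximum, log(n_bins)."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    return -sum(f * math.log(f) for f in freqs) / math.log(len(counts))

# Toy cases (assumption): a flat histogram and a U-shaped, underdispersive one.
flat = normalized_entropy([10, 10, 10, 10])
u_shaped = normalized_entropy([18, 2, 2, 18])
```

A U-shaped histogram, typical of an underdispersive ensemble, yields a value below 1, so the score summarizes calibration in a single number.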

Full access
Michaël Zamo, Liliane Bel, Olivier Mestre, and Joël Stein


Numerical weather forecast errors are routinely corrected through statistical postprocessing by several national weather services. These statistical postprocessing methods build a regression function called model output statistics (MOS) between observations and forecasts, based on an archive of past forecasts and associated observations. Because of the limited spatial coverage of most near-surface parameter measurements, MOS has historically been produced only at meteorological station locations. Nevertheless, forecasters and forecast users increasingly ask for improved gridded forecasts. The present work aims at building improved hourly wind speed forecasts over the grid of a numerical weather prediction model. First, a new observational analysis, used as gridded pseudo-observations, is described; it performs better in terms of statistical scores than those operationally used at Météo-France. This analysis is obtained with an interpolation strategy that was selected from several alternatives after an intercomparison study conducted internally at Météo-France. It is very parsimonious, since it requires only two additive components and few computational resources. Then, several scalar regression methods are built and compared, using the new analysis as the observation. The most efficient MOS is based on random forests trained on blocks of nearby grid points. This method greatly improves forecasts compared with the raw output of numerical weather prediction models. Furthermore, building each random forest on blocks and limiting those forests to shallow trees does not impair performance compared with unpruned, pointwise random forests. This alleviates the storage burden of the objects and speeds up operations.
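The storage argument for block-wise training can be illustrated with a small bookkeeping sketch: grid points are grouped into square blocks, and one model would be trained per block on the pooled samples of its points instead of one model per grid point. The block shape, the 3 x 3 block size, and the 6 x 9 grid are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def block_id(row, col, block_size):
    """Map a grid point to the square block that contains it."""
    return (row // block_size, col // block_size)

def group_by_block(n_rows, n_cols, block_size):
    """Return {block id: list of (row, col) grid points in that block}."""
    blocks = defaultdict(list)
    for r in range(n_rows):
        for c in range(n_cols):
            blocks[block_id(r, c, block_size)].append((r, c))
    return dict(blocks)

# Hypothetical 6 x 9 grid split into 3 x 3 blocks.
blocks = group_by_block(6, 9, 3)
n_pointwise_models = 6 * 9      # one model per grid point
n_block_models = len(blocks)    # one model per block
```

With these assumed sizes, the number of models to store drops from 54 to 6, and each block model sees nine times as many training samples as a pointwise one.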

Open access
Florian Dupuy, Olivier Mestre, Mathieu Serrurier, Valentin Kivachuk Burdá, Michaël Zamo, Naty Citlali Cabrera-Gutiérrez, Mohamed Chafik Bakkay, Maud-Alix Mader, Guillaume Oller, and Jean-Christophe Jouhaud


Cloud cover provides crucial information for many applications, such as planning land observation missions from space. However, it remains a challenging variable to forecast, and numerical weather prediction (NWP) models suffer from significant biases, which justifies the use of statistical postprocessing techniques. In this study, ARPEGE (the Météo-France global NWP model) cloud cover is postprocessed using a convolutional neural network (CNN). The CNN is the most popular machine learning tool for dealing with images. In our case, the CNN allows the integration of spatial information contained in NWP outputs. We use a gridded cloud cover product derived from satellite observations over Europe as ground truth, and the predictors are spatial fields of various variables produced by ARPEGE at the corresponding lead time. We show that a simple U-Net architecture (a particular type of CNN) produces significant improvements over Europe. Moreover, the U-Net outperforms more traditional machine learning methods used operationally, such as a random forest and a logistic quantile regression. When using a large number of predictors, a first step toward interpretation is to produce a ranking of predictors by importance. Traditional ranking methods (permutation importance, sequential selection, etc.) require substantial computational resources. We introduce a predictor weighting layer placed before the traditional U-Net architecture in order to produce such a ranking. The small number of additional weights to train (equal to the number of predictors) does not affect the computational time, which is a major advantage over traditional methods.
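The predictor weighting idea can be sketched without a deep learning framework: one scalar weight per predictor multiplies that predictor's whole spatial field before it enters the network, and the trained weights are then sorted to rank the predictors. The sketch below shows only this forward scaling and the ranking step. Ranking by absolute weight, the predictor names, and the weight values are assumptions for illustration, and training is omitted.

```python
def apply_predictor_weights(fields, weights):
    """Scale each predictor's 2D field by its own scalar weight."""
    return [[[w * value for value in row] for row in field]
            for field, w in zip(fields, weights)]

def rank_predictors(names, weights):
    """Order predictor names by decreasing absolute weight (assumed criterion)."""
    pairs = sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)
    return [name for name, _ in pairs]

# Hypothetical predictors and weights standing in for trained values.
names = ["total_cloud_cover", "relative_humidity_700hPa", "mean_sea_level_pressure"]
weights = [0.9, -1.4, 0.1]
fields = [[[1.0, 2.0], [3.0, 4.0]] for _ in names]  # one 2 x 2 field per predictor

weighted = apply_predictor_weights(fields, weights)
ranking = rank_predictors(names, weights)
```

The layer adds exactly `len(names)` trainable parameters, which is why its cost is negligible next to the U-Net itself.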

Restricted access