MSWX: Global 3-Hourly 0.1° Bias-Corrected Meteorological Data Including Near-Real-Time Updates and Forecast Ensembles

Hylke E. Beck, GloH2O, Almere, Netherlands;

Albert I. J. M. van Dijk, Fenner School of Environment and Society, Australian National University, Canberra, Australian Capital Territory, Australia, and Environmental Sciences Group, Wageningen University, Wageningen, Netherlands;

Pablo R. Larraondo, Fenner School of Environment and Society, Australian National University, Canberra, Australian Capital Territory, Australia;

Tim R. McVicar, Land and Water, CSIRO, and Australian Research Council Centre of Excellence for Climate Extremes, Canberra, Australian Capital Territory, Australia;

Ming Pan, Center for Western Weather and Water Extremes, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California;

Emanuel Dutra, Instituto Português do Mar e da Atmosfera, and Instituto Dom Luiz, Faculty of Science, University of Lisbon, Lisbon, Portugal; and

Diego G. Miralles, Hydro-Climate Extremes Lab, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium


Abstract

We present Multi-Source Weather (MSWX), a seamless global gridded near-surface meteorological product featuring a high 3-hourly 0.1° resolution, near-real-time updates (∼3-h latency), and bias-corrected medium-range (up to 10 days) and long-range (up to 7 months) forecast ensembles. The product includes 10 meteorological variables: precipitation, air temperature, daily minimum and maximum air temperature, surface pressure, relative and specific humidity, wind speed, and downward shortwave and longwave radiation. The historical part of the record starts 1 January 1979 and is based on ERA5 data bias corrected and downscaled using high-resolution reference climatologies. The data extension to within ∼3 h of real time is based on analysis data from GDAS. The 30-member medium-range forecast ensemble is based on GEFS and updated daily. Finally, the 51-member long-range forecast ensemble is based on SEAS5 and updated monthly. The near-real-time and forecast data are statistically harmonized using running-mean and cumulative distribution function-matching approaches to obtain a seamless record covering 1 January 1979 to 7 months from now. MSWX presents new and unique opportunities for hydrological modeling, climate analysis, impact studies, and monitoring and forecasting of droughts, floods, and heatwaves (within the bounds of the caveats and limitations discussed herein). The product is available at www.gloh2o.org/mswx.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Hylke E. Beck, hylke.beck@gloh2o.org


Accurate and timely information on near-surface meteorological variables (e.g., precipitation, air temperature, humidity, wind speed, and radiation) is important for a multitude of scientific and operational purposes, including hydrological modeling (e.g., Nkiaka et al. 2017; Salas et al. 2018), flood and drought forecasting (e.g., Alfieri et al. 2013; Sheffield et al. 2014), agricultural monitoring (e.g., Brown et al. 2018; Schwalbert et al. 2020), and disease tracking (e.g., Myers et al. 2000; Davis et al. 2018). Several gridded near-surface meteorological products with global coverage and a long historical time span (>30 years) have been developed in recent decades (see Table 1 for a nonexhaustive overview). These products differ in terms of design objective, data sources, bias correction, downscaling approach, variables included, spatiotemporal resolution, temporal coverage, and near-real-time availability. While these products have been successfully used in numerous studies, the existing array of products has, in our view, four major drawbacks:

  1. None of the existing products are available in near–real time (here defined as <1-day latency), and therefore they cannot be used to operationally monitor weather. The European Centre for Medium-Range Weather Forecasts (ECMWF) fifth-generation atmospheric reanalysis (ERA5; Hersbach et al. 2020) and the Hydrological Global Forcing Data (HydroGFD) product (Berg et al. 2021) have a ∼5-day latency, which is not timely enough for most operational applications. Moreover, HydroGFD updates are only available commercially. The Japanese 55-year Reanalysis (JRA-55; Kobayashi et al. 2015) is updated to ∼2 days from real time, but has a relatively coarse 0.56° resolution. Although several operational analysis (as opposed to reanalysis) products are available in near–real time, these do not provide a long and consistent historical record, as they are based on evolving versions of the model and data assimilation system.
  2. As of yet, no product includes consistent and freely available forecasts, so none can by itself be used to provide advance warning of impending weather. As a workaround, operational flood forecasting systems tend to combine historical and near-real-time data and forecast simulations from inconsistent sources (Emerton et al. 2016; Hao et al. 2017). For example, the Global Flood Awareness System (GloFAS; Alfieri et al. 2020) uses ERA5 in combination with ECMWF’s Integrated Forecasting System (IFS) forecasts. However, such inconsistency affects the reliability of the warnings issued by these systems (Hirpa et al. 2016; Anghileri et al. 2019; Zsoter et al. 2020).
  3. All products except TerraClimate (Abatzoglou et al. 2018) have a relatively coarse spatial resolution (≥0.25°) and thus are unable to accurately represent mountainous regions (Luo et al. 2019; Rouholahnejad-Freund et al. 2020). This is concerning, as mountainous regions provide a large share of the world’s population with freshwater (Viviroli et al. 2007). TerraClimate has a 0.04° spatial resolution, but its record is not updated in near–real time, and its monthly resolution is insufficient for many applications.
  4. None of the products take advantage of microwave- or infrared-based precipitation retrievals to more accurately capture the location, timing, and intensity of convective storms (Ebert et al. 2007; Massari et al. 2017; Beck et al. 2019b). They therefore exhibit suboptimal precipitation performance, especially in the tropics (Arakawa 2004; Prein et al. 2015).

Table 1.

Nonexhaustive overview of gridded near-surface meteorological products with global coverage and a long historical time span (>30 years). MSWX is included for the sake of completeness. Variable definitions: P = precipitation; T = 2-m air temperature; Tmin = 2-m daily minimum air temperature; Tmax = 2-m daily maximum air temperature; p = surface pressure; RH = 2-m relative humidity; q = 2-m specific humidity; V = 10-m wind speed; Rsw = downward shortwave radiation; Rlw = downward longwave radiation; Cf = cloud cover fraction.


Here we present Multi-Source Weather (MSWX), a near-surface, past, present, and future weather record designed to overcome the abovementioned shortcomings. The record includes 10 key meteorological variables (Table 2) and has several unique features including (i) bias correction using high-resolution reference climatologies; (ii) high spatiotemporal resolution (3-hourly 0.1°); (iii) low latency (updated to ∼3 h from real time); (iv) consistent medium-range (up to 10 days) and long-range (up to 7 months) forecast ensembles; (v) harmonization of the different data sources to obtain a seamless record from the past, to the present, and into the future; and (vi) compatibility with the global Multi-Source Weighted-Ensemble Precipitation (MSWEP) product that blends gauge, satellite, and reanalysis data to improve the precipitation estimates (Beck et al. 2017, 2019a). MSWX consists of four subproducts (Fig. 1):

  1. MSWX-Past (covering 1 January 1979 to ∼5 days from real time) based on ECMWF ERA5 (Hersbach et al. 2020) bias corrected and downscaled using high-resolution monthly or annual reference climatologies;
  2. MSWX-NRT (extension to ∼3 h from real time) based on the National Centers for Environmental Prediction (NCEP) Global Data Assimilation System (GDAS; NCEP 2021a);
  3. MSWX-Mid (10-day forecast ensemble comprising 30 members) based on the NCEP Global Ensemble Forecasting System (GEFS; NCEP 2021b); and
  4. MSWX-Long (7-month forecast ensemble comprising 51 members) based on ECMWF’s fifth-generation seasonal forecast system (SEAS5; Johnson et al. 2019).

Table 2.

The 10 near-surface meteorological variables included in MSWX.

Fig. 1.

Details of the MSWX subproducts.

Citation: Bulletin of the American Meteorological Society 103, 3; 10.1175/BAMS-D-21-0145.1

In the remainder of this paper, we present (i) the data and methods underlying MSWX, (ii) the validation of MSWX using station observations, (iii) the assessment of the harmonization approaches, (iv) several caveats and limitations to be considered when using MSWX, and (v) the MSWX directory structure, file naming convention, and data format.

Data and methods

Meteorological variables.

MSWX includes 10 widely used near-surface meteorological variables: (i) precipitation, (ii) 2-m air temperature, (iii) 2-m daily minimum air temperature, (iv) 2-m daily maximum air temperature, (v) surface pressure, (vi) 2-m relative humidity, (vii) 2-m specific humidity, (viii) 10-m wind speed, (ix) downward shortwave radiation, and (x) downward longwave radiation (Table 2). Our selection of variables is similar to other gridded meteorological products, such as PGF (Sheffield et al. 2006) and WFDE5 (Cucchi et al. 2020; Table 1).

Reference climatologies.

MSWX-Past is produced by rescaling ERA5 to match monthly 0.1° “reference climatologies” representing 1979–2019. These reference climatologies are based on state-of-the-art, high-resolution data derived from station observations, satellite imagery, and/or model output as detailed below. The rescaling serves to increase the local accuracy and relevance of the relatively coarse 0.28° ERA5 estimates. The climatologies are computed for each month of the year, yielding, for each variable, a static three-dimensional matrix of size 1,800 × 3,600 × 12.

The MSWX-Past precipitation reference climatology was produced by resampling the station-based Climatologies at High resolution for the Earth’s Land Surface Areas (CHELSA) dataset (representing 1979–2013; Karger et al. 2017) from 0.0083° to 0.1° using spatial averaging. We did not adjust the climatology from 1979–2013 to 1979–2019—as we did for air temperature as explained hereafter—due to the greater uncertainty associated with precipitation estimates. Since CHELSA only covers the land surface, we simply used the ERA5 climatology (for 1979–2019) for ocean areas.
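
Resampling by spatial averaging can be illustrated with a minimal Python sketch that aggregates a fine grid by an integer factor (a factor of 12 would correspond to going from 0.0083° to 0.1°). The function name `block_mean` is hypothetical, and the sketch assumes a regular grid whose dimensions are exact multiples of the factor, without area weighting or missing-value handling; it is not the code used to produce MSWX.

```python
import numpy as np

def block_mean(fine_grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D grid."""
    a = np.asarray(fine_grid, dtype=float)
    ny, nx = a.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    # Split each axis into (coarse cell, position within cell) and average
    # over the two within-cell axes.
    return a.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))
```

For land-only datasets, a production implementation would additionally need a rule for coarse cells that are partly ocean, which this sketch leaves open.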

The MSWX-Past air temperature reference climatology was also produced by resampling the CHELSA climatology from 0.0083° to 0.1° using averaging. We used time-varying (nonclimatological) Climatic Research Unit Time Series (CRU TS) data (V4.04; monthly 0.5° resolution; 1901–2019; Harris et al. 2020) to additively (as opposed to multiplicatively) adjust the climatology from 1979–2013 to 1979–2019 on a monthly basis. To this end, the CRU TS data were resampled from 0.5° to 0.1° using bilinear interpolation. We used the unadjusted ERA5 climatology (for 1979–2019) for ocean areas. Note that for daily minimum and maximum air temperature we did not derive separate reference climatologies, as their climatologies were derived from the 3-hourly MSWX-Past air temperature data.

To derive the MSWX-Past surface pressure reference climatology, we used Eq. (6) from Cosgrove et al. (2003):
$$p_{\mathrm{MSWX}} = \frac{p_{\mathrm{ERA5}}}{\exp\left(\dfrac{g\,\Delta Z}{R\,T_{\mathrm{mean}}}\right)}, \tag{1}$$
where $p_{\mathrm{MSWX}}$ (Pa) is the MSWX-Past surface pressure climatology (0.1° resolution), $p_{\mathrm{ERA5}}$ (Pa) is the ERA5 surface pressure climatology (0.28° resolution), $g$ is gravity (9.81 m s$^{-2}$), $R$ is the specific gas constant of dry air (287.06 J kg$^{-1}$ K$^{-1}$), $\Delta Z$ (m) is Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010; Danielson and Gesch 2011; resampled from 1 km to 0.1° resolution using averaging) minus ERA5 elevation (0.28° resolution), and $T_{\mathrm{mean}}$ (K) is calculated as follows:
$$T_{\mathrm{mean}} = \frac{T_{\mathrm{ERA5}} + T_{\mathrm{MSWX}}}{2}, \tag{2}$$
where $T_{\mathrm{ERA5}}$ (K) is the ERA5 air temperature climatology (0.28° resolution) and $T_{\mathrm{MSWX}}$ (K) is the MSWX-Past air temperature climatology (0.1° resolution).
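
The elevation adjustment of surface pressure can be sketched in Python as follows. This is an illustrative sketch, not the MSWX production code: it assumes the hypsometric form in which pressure decreases as the target elevation increases relative to the ERA5 elevation, and the function and argument names are hypothetical. The constants are those quoted in the text.

```python
import numpy as np

G = 9.81        # gravity (m s-2), as quoted in the text
R_DRY = 287.06  # specific gas constant of dry air (J kg-1 K-1), as quoted

def adjust_surface_pressure(p_era5, t_era5, t_mswx, dz):
    """Adjust ERA5 surface pressure (Pa) for an elevation difference dz (m).

    dz is the target elevation minus the ERA5 elevation; temperatures in K.
    """
    t_mean = 0.5 * (np.asarray(t_era5, dtype=float) + np.asarray(t_mswx, dtype=float))
    return np.asarray(p_era5, dtype=float) / np.exp(G * np.asarray(dz, dtype=float) / (R_DRY * t_mean))
```

With a 1,000-m higher target elevation and a mean temperature of 288.15 K, the sketch lowers a pressure of 101,325 Pa by roughly 11%, which is in line with the expected decrease of pressure with height near the surface.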

The MSWX-Past relative humidity reference climatology was produced by bilinearly interpolating the ERA5 relative humidity climatology (appendix A) from 0.28° to 0.1°. Following previous studies (Cosgrove et al. 2003; Sheffield et al. 2006; Cucchi et al. 2020; Dutra et al. 2020; Muñoz Sabater et al. 2021), we used a simple interpolation-based approach to downscale relative humidity, due to the lack of an established downscaling approach or high-resolution climatology that is demonstrably superior.

The MSWX-Past specific humidity reference climatology was calculated from the MSWX-Past relative humidity, air temperature, and surface pressure climatologies. We first calculated the saturation vapor pressure according to Huang [2018, their Eq. (17)]:
$$e_S = \frac{\exp\left(34.494 - \dfrac{4924.99}{T + 237.1}\right)}{(T + 105)^{1.57}}, \tag{3}$$
where $e_S$ (Pa) is the saturation vapor pressure and $T$ (°C) is the MSWX-Past air temperature climatology. Next, we calculated the mixing ratio according to
$$r = \frac{\mathrm{RH} \times \varepsilon \times e_S}{p - \mathrm{RH} \times e_S}, \tag{4}$$
where $r$ (dimensionless) is the mixing ratio, RH (dimensionless) is the MSWX-Past relative humidity climatology (expressed as ratio), and $\varepsilon$ is the gas-constant ratio (0.622 g g$^{-1}$). Finally, the specific humidity climatology (g g$^{-1}$) was calculated following
$$q = \frac{r}{1 + r}. \tag{5}$$
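
The chain from relative humidity to specific humidity can be sketched in Python as below. The function names are hypothetical and the code is a minimal illustration rather than the authors' implementation; it takes air temperature in °C, pressure in Pa, and relative humidity as a ratio, as in the text, and uses the saturation vapor pressure formula of Huang (2018) as quoted above.

```python
import numpy as np

EPSILON = 0.622  # gas-constant ratio (g g-1), as quoted in the text

def saturation_vapor_pressure(t_celsius):
    """Saturation vapor pressure (Pa) from air temperature (deg C)."""
    t = np.asarray(t_celsius, dtype=float)
    return np.exp(34.494 - 4924.99 / (t + 237.1)) / (t + 105.0) ** 1.57

def specific_humidity(rh, t_celsius, p):
    """Specific humidity (g g-1) from RH (ratio), temperature (deg C), pressure (Pa)."""
    e = np.asarray(rh, dtype=float) * saturation_vapor_pressure(t_celsius)
    r = EPSILON * e / (np.asarray(p, dtype=float) - e)  # mixing ratio
    return r / (1.0 + r)
```

At 20°C the sketch gives a saturation vapor pressure of about 2.34 kPa, and at 50% relative humidity and 101,325 Pa a specific humidity of about 0.0072 g g⁻¹.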

The MSWX-Past wind speed reference climatology was produced by bilinearly interpolating the ERA5 climatology (1979–2019) from 0.28° to 0.1° on a monthly basis, and rescaling the long-term mean to match the Global Wind Atlas (GWA) 10-m wind speed climatology (V3.1; https://globalwindatlas.info; resampled from 0.0021° to 0.1° using averaging). The GWA provides mean wind speed at 10, 50, 100, 150, and 200 m above ground/sea level, and was produced using the state-of-the-art DTU Wind Energy microscale modeling system, which combines macroscale (30 km), mesoscale (3 km), and microscale (30 m) model results from ERA5, WRF, and WAsP, respectively (Hahmann et al. 2020; Dörenkämper et al. 2020). The GWA is only available for land areas and over water up to 200 km from shorelines. The bilinearly interpolated monthly ERA5 climatology without rescaling was used for ocean areas.

The MSWX-Past downward shortwave radiation reference climatology was produced by bilinearly interpolating the ERA5 climatology (1979–2019) from 0.28° to 0.1° on a monthly basis, and rescaling the long-term mean to match the Global Solar Atlas (GSA) global horizontal insolation climatology (V2.0; https://globalsolaratlas.info; resampled from 0.01° to 0.1° using averaging). The GSA was produced using the state-of-the-art semiempirical Solargis solar radiation model forced with numerical weather prediction (NWP) model-based estimates of aerosols, water vapor, and ozone and geostationary satellite-based estimates of cloud cover (Šúri et al. 2011a,b; Perez et al. 2013; Šúri and Cebecauer 2014; ESMAP 2019). The GSA only covers the land surface from 60°N to 55°S. For high latitudes and ocean areas, we used the monthly bilinearly interpolated ERA5 climatology without rescaling.

The MSWX-Past monthly downward longwave radiation reference climatology was produced by downscaling ERA5 downward longwave radiation using the previously derived MSWX-Past air temperature and specific humidity reference climatologies according to Cosgrove et al. [2003, their Eqs. (14)–(18)]:
$$L_{\mathrm{MSWX}} = \frac{\varepsilon_{\mathrm{MSWX}}}{\varepsilon_{\mathrm{ERA5}}} \left(\frac{T_{\mathrm{MSWX}}}{T_{\mathrm{ERA5}}}\right)^{4} L_{\mathrm{ERA5}}, \tag{6}$$
where $L_{\mathrm{MSWX}}$ (W m$^{-2}$) is the downward longwave radiation, $\varepsilon$ (dimensionless) is the emissivity, and $T$ (K) is the air temperature. The emissivity is calculated separately for MSWX-Past and ERA5 as
$$\varepsilon = 1.08\left[1 - \exp\left(-e^{T/2016}\right)\right], \tag{7}$$
where $e$ (hPa) is the vapor pressure and $T$ (K) is again the air temperature (both from either MSWX-Past or ERA5). The vapor pressure is calculated separately for MSWX-Past and ERA5 following
$$e = \frac{q\,p}{0.622}, \tag{8}$$
where $q$ (g g$^{-1}$) is the specific humidity and $p$ (Pa) is the surface pressure (both from either MSWX-Past or ERA5).
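
A minimal Python sketch of the longwave downscaling is given below. The function names are hypothetical, and one step is an assumption made here rather than something stated in the text: surface pressure is divided by 100 to convert Pa to hPa, so that the vapor pressure enters the emissivity formula in hPa.

```python
import numpy as np

def emissivity(t_kelvin, q, p_pa):
    """Clear-sky emissivity from temperature (K), specific humidity (g g-1), pressure (Pa)."""
    # Assumption: convert Pa to hPa so that vapor pressure e is in hPa.
    e_hpa = np.asarray(q, dtype=float) * (np.asarray(p_pa, dtype=float) / 100.0) / 0.622
    return 1.08 * (1.0 - np.exp(-(e_hpa ** (np.asarray(t_kelvin, dtype=float) / 2016.0))))

def downscale_longwave(l_era5, t_era5, q_era5, p_era5, t_mswx, q_mswx, p_mswx):
    """Scale ERA5 downward longwave radiation (W m-2) by emissivity and T^4 ratios."""
    eps_ratio = emissivity(t_mswx, q_mswx, p_mswx) / emissivity(t_era5, q_era5, p_era5)
    t_ratio = (np.asarray(t_mswx, dtype=float) / np.asarray(t_era5, dtype=float)) ** 4
    return eps_ratio * t_ratio * np.asarray(l_era5, dtype=float)
```

When the high-resolution and ERA5 inputs are identical, the radiation is returned unchanged; a warmer high-resolution temperature increases it through both the emissivity and the fourth-power temperature ratio.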

MSWX-Past.

MSWX-Past represents the historical portion of the record (from 1 January 1979 to ∼5 days from real time) and is derived by bias correcting the ERA5 reanalysis (hourly 0.28° resolution; Hersbach et al. 2020) using the monthly 0.1° reference climatologies derived in the preceding subsection (Fig. 1). Reanalyses assimilate vast amounts of in situ and satellite observations into NWP models to generate a temporally and spatially consistent continuous record of the state of the atmosphere, ocean, and land surface. ERA5 is ECMWF’s recently released fifth-generation reanalysis and is widely considered the most accurate reanalysis currently available (e.g., Urraca et al. 2018; Hoffmann et al. 2019; Ramon et al. 2019; Zhang et al. 2018; Beck et al. 2019b). Relative and specific humidity are not part of the ERA5 output and were therefore calculated as described in appendix A. The ERA5 data were resampled from 0.28° to 0.1° using nearest neighbor interpolation, and 3-hourly accumulations were calculated for precipitation, while 3-hourly averages were calculated for all other variables. ERA5 data are available via the Copernicus Climate Data Store (CDS; https://cds.climate.copernicus.eu). Provisional ERA5 data are released with a ∼5-day delay from real time, while final ERA5 data are released with a ∼2-month delay from real time. Provisional ERA5 data are generally identical to final ERA5 data unless a serious issue is found (https://confluence.ecmwf.int/display/CUSF/Release+of+ERA5T), in which case the MSWX-Past record for the affected period will be reprocessed once corrected data become available.

Reanalyses are affected by deficiencies in model structure and parameterization, and uncertainties in assimilated observations (Bosilovich et al. 2008; Stephens et al. 2010; Kang and Ahn 2015). As a result, reanalyses tend to exhibit biases that may be acceptable for global applications but not for regional or local applications (Berg et al. 2003; Reichler and Kim 2008; McVicar et al. 2008; Haddeland et al. 2012). To reduce systematic biases and enhance the local relevance of the outputs, we rescaled ERA5 to match the monthly reference climatologies (“Reference climatologies” section). To this end, we derived, for each variable, a static three-dimensional matrix of size 1,800 × 3,600 × 12 with rescaling parameters calculated by dividing the reference climatology by the ERA5 climatology for 1979–2019. These rescaling parameters were used retrospectively to generate the historic portion of MSWX-Past, and are used operationally to update the record to ∼5 days from real time. Prior to the rescaling, air temperature is converted to kelvins (K), as our approach is unsuitable for air temperature in degrees Celsius (°C). Similar rescaling approaches were used to produce the HydroGFD (Berg et al. 2021) and TerraClimate (Abatzoglou et al. 2018) meteorological products (Table 1).
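
The climatological rescaling can be sketched in Python as follows. This is a simplified illustration with hypothetical function names: the arrays are ordered (month, latitude, longitude) for convenience, and cells where the ERA5 climatology is zero receive a factor of one, which is an assumption made here because the text does not describe how such cells are handled.

```python
import numpy as np

def rescaling_factors(ref_clim, era5_clim):
    """Monthly factors = reference climatology / ERA5 climatology, shape (12, ny, nx)."""
    ref = np.asarray(ref_clim, dtype=float)
    era = np.asarray(era5_clim, dtype=float)
    safe = np.where(era > 0, era, 1.0)           # avoid division by zero
    return np.where(era > 0, ref / safe, 1.0)    # assumption: factor 1 where ERA5 is zero

def rescale(era5_field, month, factors):
    """Apply the factors for a calendar month (1-12) to one ERA5 field (ny, nx)."""
    return np.asarray(era5_field, dtype=float) * factors[month - 1]
```

Because the correction is a ratio, it preserves zeros, which is also why air temperature is rescaled in kelvins rather than degrees Celsius.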

MSWX-NRT.

MSWX-NRT extends MSWX-Past to ∼3 h from real time using outputs from GDAS (3-hourly 0.25° resolution; NCEP 2021a). GDAS ingests vast amounts of in situ and satellite observations from across the globe and initializes the Global Forecast System (GFS) atmospheric model four times per day (at 0000, 0600, 1200, and 1800 UTC) to generate forecasts at lead times of +3, +6, and +9 h. For MSWX-NRT, we use forecasts at lead times of +3 and +6 h. For precipitation and downward shortwave and longwave radiation, the GDAS outputs represent averages from the initialization time to each forecast lead time, corresponding to a 3-h period for the +3-h forecast, and a 6-h period for the +6-h forecast. To obtain 3-hourly data from the +6-h forecast, we multiply the +6-h forecast data by two and subtract the +3-h forecast data. For all other variables, the GDAS outputs represent instantaneous snapshots at each forecast lead time. Since ERA5 represents 3-hourly averages, slight (90 min) shifts can be present in the diurnal cycle between MSWX-NRT and MSWX-Past for these variables. The GDAS data are resampled from 0.25° to 0.1° using nearest neighbor interpolation.
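
The conversion of the +3-h and +6-h GDAS averages into two consecutive 3-hourly values can be written as a short Python sketch (the function name is hypothetical). Since the +6-h output is the mean over hours 0 to 6, it equals the mean of the two 3-hourly means, and the second 3-hourly mean follows by rearrangement.

```python
import numpy as np

def gdas_three_hourly(avg_0_to_3h, avg_0_to_6h):
    """Return the 3-hourly means for hours 0-3 and 3-6 of a GDAS run."""
    first = np.asarray(avg_0_to_3h, dtype=float)
    six = np.asarray(avg_0_to_6h, dtype=float)
    # avg_0_to_6h = (first + second) / 2, hence second = 2 * avg_0_to_6h - first
    second = 2.0 * six - first
    return first, second
```

For example, a +3-h average of 4 and a +6-h average of 3 imply a mean of 2 over hours 3 to 6.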

GDAS is an operational analysis based on evolving versions of the model and the data assimilation system, and therefore its error characteristics vary substantially throughout its record (2015 to current). To minimize systematic differences between GDAS and MSWX-Past, we implemented a simple running-mean harmonization approach (Cheng and Steenburgh 2007; Fan and van den Dool 2011; Durai and Bhradwaj 2014) that operationally adjusts GDAS data based on the mean deviation from MSWX-Past in the recent past. The harmonization is performed in a multiplicative manner for precipitation, surface pressure, wind speed, and downward shortwave and longwave radiation, to retain zeros and avoid negative values, whereas it is performed in an additive manner for air temperature, daily minimum and maximum air temperature, and relative and specific humidity. To determine the correction parameters, we use a 3-month window for precipitation and a 3-week window for the other variables. A drawback of the running-mean approach is that it does not correct distributional deviations. Distributional correction approaches require a long and consistent historical data record, which is not available for GDAS.
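
A running-mean harmonization of this kind can be sketched in Python as below. The sketch is illustrative and uses hypothetical function names: the caller is assumed to supply coincident GDAS and MSWX-Past data for the chosen window (3 months or 3 weeks) with time along the first axis, and a multiplicative factor of one is used where the GDAS window mean is zero, which is an assumption not taken from the text.

```python
import numpy as np

def running_mean_correction(gdas_window, past_window, multiplicative):
    """Correction from window means of coincident GDAS and MSWX-Past data."""
    g = np.mean(np.asarray(gdas_window, dtype=float), axis=0)
    m = np.mean(np.asarray(past_window, dtype=float), axis=0)
    if multiplicative:
        safe = np.where(g > 0, g, 1.0)
        return np.where(g > 0, m / safe, 1.0)  # assumption: factor 1 where mean is zero
    return m - g

def harmonize(gdas_new, correction, multiplicative):
    """Apply the correction to newly arrived GDAS data."""
    x = np.asarray(gdas_new, dtype=float)
    return x * correction if multiplicative else x + correction
```

The multiplicative branch leaves zero values at zero and cannot produce negative values from nonnegative inputs, whereas the additive branch shifts all values by the same offset.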

Recent GDAS data (less than ∼10 days old) are available from the NCEP web services FTP (ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/), while older data are available from the UCAR Research Data Archive (https://rda.ucar.edu/data/ds083.3). GDAS outputs are generally published approximately 7 h after the start of each run, and the corresponding MSWX-NRT data are generally published within approximately half an hour thereafter. As such, the effective latency is 3 h on average, ranging from approximately 1.5 to 4.5 h depending on the time of day.

MSWX-Mid.

MSWX-Mid consists of 3-hourly medium-range (10 days) forecast ensembles (comprising 30 members) based on NCEP’s GEFS (3-hourly 0.25° resolution; NCEP 2021b) harmonized with MSWX-Past using a CDF-matching approach (described in “CDF-matching approach” section). GEFS generates forecasts four times per day (at 0000, 0600, 1200, and 1800 UTC) using the GFS atmospheric model initialized with GDAS. The GEFS forecasts have a length of 16 days. The first 10 days have a 3-hourly 0.25° resolution, while the last 6 days have a 6-hourly 0.5° resolution. We use only the former due to the higher spatiotemporal resolution. We use the GEFS forecasts initialized at 0000 UTC, which are generally fully available at ∼0530 UTC. Currently, the first member of the first variable (precipitation) of the MSWX-Mid forecast ensemble is released at ∼0630 UTC, and the last member of the last variable (downward longwave radiation) is released at ∼1130 UTC.

The GEFS forecasts are downloaded from the NCEP web services FTP (ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/gens/prod/) and are resampled from 0.25° to 0.1° using nearest neighbor interpolation. The GEFS precipitation outputs represent 3-hourly accumulations at lead times of +3, +9, +15, …, +237 h, and 6-hourly accumulations at lead times of +6, +12, +18, …, +234 h. To obtain 3-hourly accumulations from the 6-hourly accumulations, we subtract the preceding 3-hourly accumulation from the 6-hourly accumulation. The GEFS downward shortwave and longwave radiation outputs represent 3-hourly averages at lead times of +3, +9, +15, …, +237 h, whereas they represent 6-hourly averages at lead times of +6, +12, +18, …, +234 h. To obtain 3-hourly averages from the 6-hourly averages, we multiply the 6-hourly average by two and subtract the preceding 3-hourly average. For the remaining seven variables, the GEFS outputs represent instantaneous snapshots at each forecast lead time. Specific humidity is not directly available as output and is therefore calculated from relative humidity, air temperature, and surface pressure [Eqs. (3)–(5)].
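
The handling of alternating 3-hourly and 6-hourly GEFS outputs can be illustrated with the Python sketch below. The function name and the `kind` argument are hypothetical. The input is assumed to be ordered by lead time (+3, +6, +9, +12 h, and so on) along the first axis, so that every second entry spans 6 h and is preceded by the 3-hourly entry it overlaps with.

```python
import numpy as np

def gefs_to_three_hourly(values, kind):
    """Convert alternating 3-h/6-h GEFS outputs to 3-hourly values.

    kind: "accumulation" (e.g., precipitation) or "average" (e.g., radiation).
    """
    v = np.asarray(values, dtype=float)
    out = v.copy()
    six_hourly = v[1::2]                     # entries at +6, +12, ... h
    preceding = v[0::2][: len(six_hourly)]   # entries at +3, +9, ... h
    if kind == "accumulation":
        out[1::2] = six_hourly - preceding
    elif kind == "average":
        out[1::2] = 2.0 * six_hourly - preceding
    else:
        raise ValueError("kind must be 'accumulation' or 'average'")
    return out
```

For instance, 3-hourly precipitation totals of 1, 2, 3, and 4 mm would appear in the raw output as 1, 3, 3, and 7 mm, and the sketch recovers the original sequence.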

The development of MSWX-Mid was made possible by the recent release of the GEFS V12 retrospective forecasts (hereafter “reforecasts”), which supersede the long-obsolete GEFS V2 reforecasts from 2012. The GEFS V12 reforecasts (available via https://noaa-gefs-retrospective.s3.amazonaws.com/index.html) consist of one forecast ensemble (comprising 4 members initialized at 0000 UTC) each day for 2000–19. To derive CDF-matching parameters for each month (“CDF-matching approach” section), we selected one forecast ensemble every 3 days, and concatenated the first 3 days of the first ensemble member, to obtain a single deterministic 3-hourly record spanning 2000–19. The GEFS precipitation reforecast data exhibit severe undocumented artifacts for most of 2016; we therefore excluded all precipitation reforecast data for 2016. To facilitate MSWX-Mid forecast skill assessments, we produced one MSWX-Mid reforecast ensemble (comprising four members initialized at 0000 UTC) each month for 2000–19.

Unlike the operational GEFS forecasts, the GEFS reforecasts do not include relative humidity. This precluded using the reforecasts to derive CDF-matching parameters for relative humidity as well as specific humidity, as the latter is calculated from relative humidity for the operational forecasts. As a workaround, we use the running-mean harmonization approach (described in “MSWX-NRT” section) to harmonize relative and specific humidity.

Areal statistics are often used to quantify changes at larger scales (e.g., the percentage of an area containing multiple grid cells in a particular drought class). Due to differences in spatiotemporal structure (e.g., storm spatial size and duration) between GEFS and ERA5, areal statistics from MSWX-Mid may not be fully consistent with those from MSWX-Past. We therefore recommend using the MSWX-Mid reforecasts to put areal statistics from MSWX-Mid forecasts into a historical context. This caveat only applies to areal statistics; for MSWX-Mid statistics based on a single grid cell, MSWX-Past can be used as a historical reference.

MSWX-Long.

MSWX-Long consists of daily long-range (7 months) forecast ensembles (comprising 51 members) based on ECMWF’s SEAS5 (Johnson et al. 2019; Fig. 1). The SEAS5 data are harmonized with MSWX-Past on a monthly basis for each lead month using a CDF-matching approach (described in “CDF-matching approach” section). SEAS5 forecasts have a 1° spatial resolution and are initialized at 0000 UTC on the first day of each month. They are released on the thirteenth day of the month, and we aim to release new MSWX-Long forecasts on that same day. MSWX-Long has a daily temporal resolution, as SEAS5 outputs have a 6-hourly or daily temporal resolution (depending on the variable), precluding the generation of 3-hourly data. Surface pressure and relative and specific humidity are not directly available and are therefore calculated as described in appendix B.

The SEAS5 reforecast archive consists of one forecast ensemble comprising 51 members every month from January 1993 through the present. To derive CDF-matching parameters for each month and lead time (see “CDF-matching approach” section), we produced seven daily deterministic records spanning 1 January 1993 to the present (one for each lead month) by concatenating data from the first member of the reforecast ensembles. To facilitate MSWX-Long forecast skill assessments, we produced MSWX-Long reforecast ensembles (comprising 5 members) four times per year (initialized at 0000 UTC on 1 January, 1 April, 1 July, and 1 October) from January 1993 through 2020.

Both the historical SEAS5 reforecasts and the operational SEAS5 forecasts are available via the Copernicus CDS (https://cds.climate.copernicus.eu). The data are resampled from 1° to 0.1° resolution using nearest neighbor interpolation. For precipitation and downward shortwave and longwave radiation, the SEAS5 forecast and reforecast data represent aggregations (in mm and J m−2, respectively) from the initialization time to each lead time, corresponding to a 1-day period for the +0-day lead time, a 2-day period for the +1-day lead time, a 3-day period for the +2-day lead time, and so on. To obtain daily data for lead times greater than +0 days, we subtract the aggregation from the preceding lead time. To subsequently convert the daily radiation data from J m−2 to W m−2, we divide the estimates by 86,400 s (the number of seconds in a day).
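
The de-accumulation of SEAS5 data and the unit conversion for radiation can be sketched in Python as follows. The function names are hypothetical, and the input is assumed to hold the aggregations from initialization time, ordered by lead day along the first axis.

```python
import numpy as np

SECONDS_PER_DAY = 86_400

def deaccumulate(cumulative):
    """Daily totals from totals accumulated since initialization (axis 0 = lead day)."""
    c = np.asarray(cumulative, dtype=float)
    # The first lead day is kept as is; later days are differences between
    # consecutive aggregations.
    return np.diff(c, axis=0, prepend=np.zeros_like(c[:1]))

def daily_radiation_to_w_m2(daily_j_m2):
    """Convert daily radiation totals (J m-2) to daily mean flux (W m-2)."""
    return np.asarray(daily_j_m2, dtype=float) / SECONDS_PER_DAY
```

As a check on the arithmetic, a daily total of 8.64 × 10⁶ J m⁻² corresponds to a mean flux of 100 W m⁻².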

Similar to MSWX-Mid, areal statistics from MSWX-Long may not be fully consistent with those from MSWX-Past, due mainly to the substantial difference in spatial resolution between SEAS5 and ERA5 (1° versus 0.28°). We therefore recommend using the MSWX-Long reforecasts to place areal statistics from MSWX-Long forecasts into a historical context. This caveat only applies to areal statistics; for MSWX-Long statistics based on a single grid cell, MSWX-Past can be used as a historical reference.

CDF-matching approach.

A multiplicative CDF-matching approach is used to make the forecasts from GEFS and SEAS5 (hereafter denoted “models” in this subsection) consistent with MSWX-Past, and obtain a homogeneous record from the past through the present and into the future that is updated in near–real time (Fig. 1). The approach is illustrated in Fig. 2 for a single week of GEFS air temperature data. In this example, GEFS exhibits a strong positive deviation, in particular during daytime, which is removed after the CDF matching. Our approach is similar to those in previous studies (e.g., Reichle and Koster 2004; Drusch et al. 2005; Beck et al. 2021; Katiraie-Boroujerdy et al. 2020), except that we do not consider all possible quantiles but a subsample of 10 points, to reduce the computational cost and make it sufficiently efficient for operational implementation.

Fig. 2.

Demonstration of the CDF-matching procedure used to make estimates from GEFS and SEAS5 consistent with MSWX-Past. The variable T represents air temperature converted to kelvins (K). (a) Three-hourly time series of MSWX-Past and uncorrected GEFS data for one week. (b) Quantile–quantile plot of GEFS T against MSWX-Past T including the 10 evenly spaced points selected for calculating the correction factors. (c) Correction factors calculated from the 10 combinations of GEFS and MSWX-Past T values. (d) Three-hourly time series of MSWX-Past and CDF-matched GEFS (i.e., MSWX-Mid) for one week.

Citation: Bulletin of the American Meteorological Society 103, 3; 10.1175/BAMS-D-21-0145.1

The first preparatory task, performed just once and in a nonoperational situation, consisted of deriving CDF-matching parameters for each model and variable. This was done for each month to account for seasonality in the deviations between the models and MSWX-Past. Additionally, this was done for each lead month for SEAS5 to account for model drift, which can be substantial due to the long 7-month forecast horizon (Toth and Peña 2007; Manzanas 2020). The parameters were calculated for each 0.1° grid cell by (i) sorting reference and coincident model values (selected using nearest neighbor sampling) independently in ascending order; (ii) for 10 evenly spaced points between the minimum and maximum of the model values, selecting the 10 corresponding sorted reference values with the same index; and (iii) calculating 10 correction factors by dividing the reference values by the corresponding model values. This procedure yielded, for each variable, two static four-dimensional matrices of size 1,800 × 3,600 × 12 × 10 for GEFS and two static five-dimensional matrices of size 1,800 × 3,600 × 12 × 10 × 7 for SEAS5. The first matrix contains the evenly spaced data points, while the second contains the corresponding correction factors.
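
For a single grid cell, month, and variable, steps (i)–(iii) might look like the sketch below. It reflects one reading of the procedure: the evenly spaced points are matched to the nearest sorted model value, and a zero model value is given a neutral factor of 1. Both choices, and all names, are assumptions made for illustration.

```python
import numpy as np

N_POINTS = 10  # number of evenly spaced points, as in the text


def derive_cdf_parameters(reference, model, n_points=N_POINTS):
    """Return (points, factors) for one grid cell, month, and variable."""
    ref_sorted = np.sort(np.asarray(reference, dtype=float))  # step (i)
    mod_sorted = np.sort(np.asarray(model, dtype=float))
    if ref_sorted.shape != mod_sorted.shape:
        raise ValueError("reference and model samples must be coincident")

    # Step (ii): evenly spaced points between the model minimum and maximum,
    # each mapped to the index of the nearest sorted model value (assumption).
    points = np.linspace(mod_sorted[0], mod_sorted[-1], n_points)
    idx = np.abs(mod_sorted[None, :] - points[:, None]).argmin(axis=1)
    ref_vals = ref_sorted[idx]
    mod_vals = mod_sorted[idx]

    # Step (iii): reference divided by model; a zero model value gets a
    # factor of 1 (assumption, to avoid division by zero).
    factors = np.divide(ref_vals, mod_vals,
                        out=np.ones_like(ref_vals), where=mod_vals != 0)
    return points, factors


# Synthetic air temperature (K): the model is 1% too warm everywhere.
rng = np.random.default_rng(0)
ref = 285.0 + 5.0 * rng.standard_normal(500)
model = 1.01 * ref
points, factors = derive_cdf_parameters(ref, model)
print(factors)
```

Repeating this for every 0.1° grid cell and month (and lead month for SEAS5) would fill the two static matrices of evenly spaced points and correction factors.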

The model data are made consistent with MSWX-Past operationally by (i) resampling the model data to 0.1° using nearest neighbor interpolation; (ii) converting the model data to indices ranging from 1 to 10 using the static matrix of evenly spaced data points, where index 1 represents data closest to the first evenly spaced point, index 2 represents data closest to the second evenly spaced point, and so on; (iii) selecting the appropriate correction factors from the static matrix of correction factors using the indices from the preceding step; and (iv) multiplying the model data by the selected correction factors.
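
Steps (ii)–(iv) can be sketched for one grid cell as follows, assuming the evenly spaced points and correction factors are already available. The parameter values are made up for illustration, and Python indices run from 0 to 9 rather than 1 to 10.

```python
import numpy as np


def apply_cdf_matching(values, points, factors):
    """Multiply each value by the factor of its nearest evenly spaced point."""
    values = np.asarray(values, dtype=float)
    # Step (ii): index of the closest evenly spaced point for every value.
    idx = np.abs(values[..., None] - points).argmin(axis=-1)
    # Steps (iii) and (iv): look up the correction factor and multiply.
    return values * factors[idx]


# Hypothetical parameters for one grid cell and month (air temperature in K).
points = np.linspace(270.0, 315.0, 10)    # 270, 275, ..., 315
factors = np.linspace(1.0, 0.991, 10)     # 1.000, 0.999, ..., 0.991

model_t = np.array([276.0, 314.0])
corrected = apply_cdf_matching(model_t, points, factors)
print(corrected)
```

Because each value needs only a nearest-point lookup and one multiplication, the correction is cheap enough to apply to every grid cell of every forecast.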

For GEFS, the CDF matching is performed at a daily resolution for daily minimum and maximum temperature, and at a 3-hourly resolution for the other variables. For SEAS5, the CDF matching is performed at a daily resolution for all variables. Air temperature is converted to kelvins (K) prior to the CDF-matching procedure, as a multiplicative approach is unsuitable for air temperature in degrees Celsius (°C). Precipitation and wind speed are square-root transformed prior to the CDF-matching procedure to increase the normality of the distribution (Widger 1977; Juras 1994). After the CDF matching, air temperature is converted back to degrees Celsius, and precipitation and wind speed are squared.
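
The transformations wrapped around the CDF matching can be illustrated with the sketch below. The `correct` argument stands in for any multiplicative correction, and a constant factor of 0.99 is used purely as a hypothetical example.

```python
import numpy as np

KELVIN_OFFSET = 273.15


def match_temperature_c(temp_c, correct):
    """Convert degrees Celsius to kelvins, correct, and convert back."""
    temp_k = np.asarray(temp_c, dtype=float) + KELVIN_OFFSET
    return correct(temp_k) - KELVIN_OFFSET


def match_sqrt(values, correct):
    """Square-root transform, correct, and square (precipitation, wind speed)."""
    return correct(np.sqrt(np.asarray(values, dtype=float))) ** 2


def scale(x):
    """Hypothetical stand-in for the CDF-matching correction."""
    return 0.99 * x


print(match_temperature_c(0.0, scale))  # 0 degC is lowered by about 2.73 degC
print(match_sqrt(4.0, scale))           # 4 mm becomes about 3.92 mm
```

The temperature example shows why kelvins are needed: multiplying 0°C by any factor leaves it unchanged, and the sign of the correction would flip for negative temperatures.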

Technical validation

Visual assessment.

Figures 3 and 4 show air temperature on 24 May 2021, according to the models (ERA5, GDAS, GEFS, and SEAS5) and according to the derived MSWX subproducts (Past, NRT, Mid, and Long) for the North American southwest and the European Alps, respectively, illustrating the enhanced spatial detail provided by MSWX. Due to their relatively coarse spatial resolutions, the models fail to depict numerous important topographic features, such as the Grand Canyon, Death Valley, and the Superstition Mountains in the North American southwest (Fig. 3), and the Aosta Valley and the Wildspitze in the European Alps (Fig. 4). This lack of spatial detail can have important consequences for hydrological simulations, especially in snow-dominated regions with complex terrain (Jin and Wen 2012; Singh et al. 2015). The low consistency among the models vis-à-vis the high consistency among the MSWX subproducts is also striking, and highlights the importance of model harmonization. The divergence in air temperature between MSWX-Long and the other MSWX subproducts over the eastern Mediterranean (Fig. 4h) reflects the long lead time of the presented MSWX-Long data (23 days), and reaffirms that forecasts do not provide useful information on (sub-)daily weather fluctuations beyond the first week (Li and Robertson 2015; Johnson et al. 2019).

Fig. 3.

Mean air temperature on 24 May 2021 over the North American southwest according to the models (ERA5, GDAS, GEFS, and SEAS5) and according to the derived MSWX subproducts (Past, NRT, Mid, and Long). The spatial resolution is provided in parentheses in each panel.

Fig. 4.

Mean air temperature on 24 May 2021, over the European Alps according to the models (ERA5, GDAS, GEFS, and SEAS5) and according to the derived MSWX subproducts (Past, NRT, Mid, and Long). The spatial resolution is provided in parentheses in each panel.

MSWX-Past.

Table 3 presents average mean absolute error (MAE) values for MSWX-Past and ERA5 calculated using daily station observations for flat terrain (surface slope < 5°) and high-relief terrain (surface slope ≥ 5°; see appendix C for details on the station data sources and postprocessing). Over flat terrain, the MAE of MSWX-Past is either the same or marginally better for most variables. Over high-relief terrain, the MAE of MSWX-Past is generally somewhat better than that of ERA5, especially for the air temperature-related variables, confirming the value of the high-resolution reference climatologies. The MAE of MSWX-Past is, however, slightly worse over both terrain types for specific humidity and over high-relief terrain for wind speed and downward shortwave radiation. For specific humidity, we suspect this is due to the indirect way of calculating the specific humidity reference climatology from independently downscaled and bias corrected relative humidity, air temperature, and surface pressure reference climatologies. For wind speed, this is probably attributable to the underestimation of wind speed by ERA5 in regions of complex terrain (Jourdier 2020; Minola et al. 2020), which is largely corrected in MSWX-Past, resulting in inflated MAE values due to the relatively poorly simulated day-to-day variability. For downward shortwave radiation, the lack of improvement is likely attributable to the small influence that the improved climatology has on the day-to-day variability.
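
The station-based scores can be sketched as below, assuming paired daily observed and estimated series per station and a surface slope value per station. The function names are ours; the thresholds of more than 200 daily observations and a 5° slope follow the Table 3 caption.

```python
import numpy as np

MIN_OBS = 200               # a station needs more than this many daily values
SLOPE_THRESHOLD_DEG = 5.0   # flat (< 5 deg) versus high-relief (>= 5 deg)


def station_mae(observed, estimated, min_obs=MIN_OBS):
    """MAE for one station, or NaN if too few paired daily values exist."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    valid = np.isfinite(observed) & np.isfinite(estimated)
    if valid.sum() <= min_obs:
        return np.nan
    return float(np.mean(np.abs(estimated[valid] - observed[valid])))


def average_mae_by_terrain(maes, slopes_deg):
    """Average the station MAEs separately for flat and high-relief terrain."""
    maes = np.asarray(maes, dtype=float)
    slopes_deg = np.asarray(slopes_deg, dtype=float)
    usable = np.isfinite(maes)
    flat = usable & (slopes_deg < SLOPE_THRESHOLD_DEG)
    steep = usable & (slopes_deg >= SLOPE_THRESHOLD_DEG)
    return float(maes[flat].mean()), float(maes[steep].mean())


flat_mae, steep_mae = average_mae_by_terrain(
    maes=[1.0, 3.0, np.nan, 2.0], slopes_deg=[1.0, 6.0, 2.0, 10.0])
print(flat_mae, steep_mae)
```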

Table 3.

Average mean absolute error (MAE) values for ERA5 and MSWX-Past calculated using daily station observations. For each variable, we only included a station if >200 daily observations were available. To determine whether a station is located in flat terrain (surface slope < 5°) or high-relief terrain (surface slope ≥ 5°), we used GMTED2010 surface slope data (resampled from 250 m to 0.1° resolution using averaging). Here, N denotes the number of stations.

Figure 5 presents MAE values for MSWX-Past precipitation and air temperature calculated using daily station observations (see Fig. S1 in the online supplemental material, https://doi.org/10.1175/BAMS-D-21-0145.2, for maps with MAE values for the other variables and see appendix C for details on the station data sources and postprocessing). For precipitation, MAE values are consistently high (>5 mm day−1) in low-latitude regions, such as the Amazon, central Africa, and Southeast Asia, which is attributable to two factors: first, the high average precipitation, which implicitly increases the MAE; and second, the prevalence of localized, high-intensity, short-duration convective storms, which tend to be poorly simulated by NWP models, such as the IFS underlying ERA5 on which MSWX-Past is based (Arakawa 2004; Prein et al. 2015). For air temperature, MAE values are high (>2°C) in regions of complex topography, such as Alaska, the western United States, and the Hindu Kush, suggesting that even the high 0.1° (∼11 km at the equator) resolution of MSWX is insufficient in these areas. The markedly lower air temperature MAE values in Europe (<1°C) likely reflect the high density of surface observations assimilated in ERA5 (Haiden et al. 2018).

Fig. 5.

Mean absolute error (MAE) values for MSWX-Past (a) precipitation (N = 92,869) and (b) air temperature (N = 17,247) using daily station observations as reference. For each variable, we only included a station if >200 daily observations were available. Flat terrain (surface slope < 5°) and high-relief terrain (surface slope ≥ 5°) are shown in light and dark gray, respectively. See supplemental Fig. S1 for maps with MAE values for the other variables and see appendix C for details on the station data sources and postprocessing.

Running-mean harmonization.

Table 4 presents mean absolute difference (MAD) values calculated from daily data for 15 May 2021, to test the efficiency of the running-mean harmonization approach (described in “MSWX-NRT” section). The MAD values were calculated between ERA5 and GDAS, which were not harmonized with each other, and between MSWX-Past and MSWX-NRT, which were harmonized. MAD values should be higher between ERA5 and GDAS than between MSWX-Past and MSWX-NRT if the approach is successful. This does indeed appear to be the case for nearly all variables, highlighting the effectiveness of the approach and the high consistency between MSWX-Past and MSWX-NRT. Although the MAD values between MSWX-Past and MSWX-NRT are considerably lower, they do not (and cannot) approach zero at (sub-)daily time scales due to (i) unavoidable nonsystematic differences in (sub-)daily weather simulations between ERA5 and GDAS; and (ii) the length of the running-mean window, which is likely not long enough to eliminate all systematic deviations.

Table 4.

Mean absolute difference (MAD) values between ERA5 and GDAS and between MSWX-Past and MSWX-NRT for 15 May 2021, demonstrating the efficiency of the running-mean harmonization approach.

The MAD values are slightly worse between MSWX-Past and MSWX-NRT than between ERA5 and GDAS for precipitation over ocean areas and for downward shortwave radiation over flat terrain. This is probably attributable to differences in the simulation of large-scale, persistent weather systems between the two models during the running-mean window, which cause under- or overestimation of the correction factors. This does not, however, mean that the harmonization is unsuccessful for these variables, as the MAD values are only slightly worse, and they are significantly better over high-relief terrain for both variables.

CDF-matching harmonization.

Table 5 presents MAD values calculated from daily data for 24 May 2021, to test the efficiency of the CDF-matching harmonization approach (described in “CDF-matching approach” section). The MAD values were calculated between ERA5 and GEFS, which were not harmonized with each other, and between MSWX-Past and MSWX-Mid, which were harmonized. We used the first ensemble member of the GEFS and MSWX-Mid forecasts initialized at 0000 UTC 24 May 2021. All but one of the MAD values are lower between MSWX-Past and MSWX-Mid than between ERA5 and GEFS, suggesting that the CDF-matching harmonization performs satisfactorily and that there is a high degree of consistency between MSWX-Past and MSWX-Mid. The exception is the surface pressure MAD value over ocean areas, which is lower between ERA5 and GEFS than between MSWX-Past and MSWX-Mid, although the difference is almost negligible (53 versus 55 Pa). As explained in the preceding subsection, the MAD values do not approach zero due to unavoidable nonsystematic differences in (sub-)daily weather simulations between the models.

Table 5.

Mean absolute difference (MAD) values between ERA5 and GEFS and between MSWX-Past and MSWX-Mid estimates for 24 May 2021, demonstrating the efficiency of the CDF-matching harmonization approach. For GEFS and MSWX-Mid, we used the first ensemble member of the forecast initialized at 0000 UTC 24 May 2021.

Caveats and limitations

MSWX offers significant advantages over previous near-surface meteorological products (Table 1). However, the following caveats and limitations pertaining to the CDF-matching harmonization approach should be noted:

  1. The approach may suppress or amplify trends in extremes, in particular for variables with skewed distributions, such as precipitation and wind speed. See Maraun (2013) and Cannon et al. (2015) for more discussion of this issue.
  2. Some variables exhibit a strong diurnal cycle, such as air temperature and wind speed. For these variables, the CDF-matching procedure may suppress the diurnal cycle of MSWX-Mid compared to MSWX-Past, if the relative contribution of day-to-day fluctuations relative to the diurnal cycle of GEFS is higher than that of MSWX-Past. Conversely, the CDF-matching procedure may amplify the diurnal cycle of MSWX-Mid compared to MSWX-Past, if the relative contribution of day-to-day fluctuations relative to the diurnal cycle of GEFS is lower than that of MSWX-Past.
  3. For the CDF-matching procedure to reach its full potential, the precipitation frequency produced by the model subject of the correction (GEFS or SEAS5) should equal or exceed that of the reference (MSWX-Past). If the precipitation frequency is lower in GEFS or SEAS5 relative to MSWX-Past, the precipitation frequency and total precipitation amount of MSWX-Mid or MSWX-Long may be slightly underestimated relative to MSWX-Past.
  4. The CDF-matching parameters for GEFS and SEAS5 are based on reforecasts produced using a particular model version. Since the models are continually updated, the CDF-matching parameters will inevitably become less valid over time. Without new reforecasts, it will be necessary to switch to a running-mean harmonization approach for GEFS in the future. A running-mean harmonization approach would not be applicable to SEAS5, due to the long forecast horizon.

The following caveats and limitations pertaining to other aspects of MSWX should also be considered:

  1. Biases in MSWX-Past can be reflected in the running-mean harmonization parameters used to derive MSWX-NRT. For example, if ERA5 (used to derive MSWX-Past) simulated anomalously high relative humidity in the recent past, whereas GDAS (used to derive MSWX-NRT) simulated average relative humidity, MSWX-NRT relative humidity estimates could become unrealistically high.
  2. Slight (90 min) shifts in the diurnal cycle of MSWX-NRT and MSWX-Mid relative to MSWX-Past can occur for air temperature, surface pressure, relative and specific humidity, and wind speed. This is because ERA5 (used to derive MSWX-Past) represents 3-hourly averages for these variables, whereas GDAS and GEFS (used to derive MSWX-NRT and MSWX-Mid, respectively) represent instantaneous snapshots.
  3. If at a particular moment in time both MSWX-Past and MSWX-NRT data are available, preference should be given to MSWX-Past, due to the aforementioned two caveats, and because ERA5 (used to produce MSWX-Past) generally tends to be more skillful than GDAS (used to produce MSWX-NRT), as demonstrated by Beck et al. (2019b) among others.
  4. MSWX-Long does not contain useful information on (sub-)daily weather fluctuations and should only be used to determine whether weather patterns over longer (i.e., weekly to monthly) time scales are likely to be average, above average, or below average.
  5. SEAS5 model outputs have a 6-hourly or daily resolution (depending on the variable), precluding the generation of 3-hourly data. MSWX-Long, therefore, has a daily resolution.

Compatibility with MSWEP

The global MSWEP product (available at www.gloh2o.org/mswep) blends gauge, satellite, and reanalysis data to obtain more accurate precipitation estimates (Beck et al. 2017, 2019a). From V2.8 onward, the non-gauge-corrected MSWEP estimates are CDF matched to MSWX-Past precipitation. As a result, MSWEP is largely compatible with MSWX-Past precipitation and, therefore, MSWX-Mid and MSWX-Long precipitation can be used to extend MSWEP into the future.

How do MSWX-Past precipitation and MSWEP compare in terms of performance? MSWX-Past precipitation and MSWEP are both based on bias-corrected and downscaled atmospheric model output (ERA5), but MSWEP additionally incorporates gauge observations (from various national and international databases) and satellite-based precipitation estimates [from Integrated Multisatellite Retrievals for GPM (IMERG); Huffman et al. 2018]. MSWEP is thus likely more accurate than MSWX-Past in (i) densely gauged regions (e.g., the conterminous United States, Europe, and Australia; Schneider et al. 2014; Kidd et al. 2017) and (ii) low-latitude, convection-dominated regions (due to the satellite data; Ebert et al. 2007; Beck et al. 2019b).

When should one use precipitation from MSWEP and when from MSWX-Past? The choice depends on the application and involves a trade-off between accuracy and consistency. In general, MSWX-Past is less accurate but more consistent with MSWX-Mid and MSWX-Long, whereas MSWEP is more accurate but potentially less consistent with MSWX-Mid and MSWX-Long (due to the inclusion of gauge and satellite data). For calibrating a hydrological model, for example, we would recommend using MSWEP, to obtain the most realistic model parameters possible (Post et al. 2008; Zeng et al. 2018). However, for calculating MSWX-Mid anomalies (departures from the long-term average), we would recommend using the MSWX-Past record, or if areal statistics need to be calculated, perhaps the MSWX-Mid reforecasts.

Data records

MSWX is provided in the self-describing netCDF-4 data format (www.unidata.ucar.edu/software/netcdf/) and thus can be viewed, edited, and analyzed in most programming languages (e.g., Julia, Python, and R) and software packages (e.g., ArcGIS, GRASS, and QGIS). The grids have a 0.1° spatial resolution and dimensions of 1,800 rows × 3,600 columns. Table 2 lists the directory names, netCDF field names, and units for each variable, while Fig. 6 shows the directory structure. The file naming convention is YYYYDOY for the daily files, YYYYDOY.HH for the 3-hourly files, and YYYYMM for the monthly files, where YYYY represents the year, MM represents the month, DOY represents the day of the year, and HH represents the start hour of the 3-h average (all expressed in UTC). The directory structure and file naming convention are further illustrated using the following four examples pertaining to MSWX-Past, MSWX-NRT, MSWX-Mid, and MSWX-Long, respectively:

  • MSWX_V100/Past/Tmin/Monthly/199803.nc: MSWX-Past average daily minimum air temperature (°C) for March 1998;

  • MSWX_V100/NRT/LWd/3hourly/2020301.21.nc: MSWX-NRT average downward longwave radiation (W m−2) between 2100 and 2359 UTC 27 October 2020;

  • MSWX_V100/Mid/Temp/20201102_00/25/3hourly/2020308.12.nc: MSWX-Mid average air temperature (°C) between 1200 and 1459 UTC 3 November 2020, for forecast member 25 of the ensemble initialized at 0000 UTC 2 November 2020; and

  • MSWX_V100/Long/RelHum/20201001_00/07/Daily/2021004.nc: MSWX-Long average relative humidity (%) on 4 January 2021, for forecast member 7 of the ensemble initialized at 0000 UTC 1 October 2020.
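
The naming convention can be captured in a small helper such as the sketch below, which reproduces the four example paths above. The function name and argument layout are ours and are not part of the MSWX distribution; the zero-padded two-digit member number is inferred from the examples.

```python
from datetime import datetime

ROOT = "MSWX_V100"

# strftime patterns for the time stamp at each temporal resolution
# (%j is the zero-padded day of the year).
_STAMPS = {"Monthly": "%Y%m", "Daily": "%Y%j", "3hourly": "%Y%j.%H"}


def mswx_path(subproduct, variable, resolution, valid_time,
              init_time=None, member=None):
    """Build the relative path of an MSWX file (all times in UTC)."""
    parts = [ROOT, subproduct, variable]
    if init_time is not None:
        # Forecast subproducts (Mid, Long) add initialization and member levels.
        if member is None:
            raise ValueError("forecast paths need an ensemble member number")
        parts += [init_time.strftime("%Y%m%d_%H"), f"{member:02d}"]
    parts += [resolution, valid_time.strftime(_STAMPS[resolution]) + ".nc"]
    return "/".join(parts)


print(mswx_path("NRT", "LWd", "3hourly", datetime(2020, 10, 27, 21)))
print(mswx_path("Mid", "Temp", "3hourly", datetime(2020, 11, 3, 12),
                init_time=datetime(2020, 11, 2, 0), member=25))
```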

Fig. 6.

Directory structure of the MSWX product. Each box represents one or more directories.

The full MSWX-Past record currently has a total size of ∼6.2 TB (including all variables). The yearly growth rate of MSWX-Past is ∼140 GB. One year of MSWX-NRT data has a total size of ∼140 GB as well. A single MSWX-Mid forecast ensemble has a total size of ∼120 GB, while a single MSWX-Long forecast ensemble has a total size of ∼400 GB. The total MSWX-Mid reforecast archive has a total size of ∼3.7 TB, while the MSWX-Long reforecast archive has a total size of ∼2.2 TB. The total size of the entire MSWX product, including one MSWX-Mid forecast ensemble and one MSWX-Long forecast ensemble, is thus currently ∼12.8 TB.

Conclusions

We introduced MSWX, an operational global gridded near-surface meteorological product featuring (i) near-real-time updates (∼3-h latency), to operationally monitor weather; (ii) consistent medium-range (up to 10 days) and long-range (up to 7 months) forecast ensembles, to provide advance warning of impending weather; (iii) a high spatial resolution (0.1°), to accurately represent mountainous regions; and (iv) compatibility with the MSWEP precipitation product that blends gauge, satellite, and reanalysis data. Our comprehensive validation of MSWX-Past using station observations suggests that the product tends to perform similarly to ERA5 over flat terrain and better than ERA5 over high-relief terrain. Our assessment of the harmonization approaches demonstrates that they perform satisfactorily and that there is a high degree of consistency between the MSWX subproducts (Past, NRT, Mid, and Long). Within the bounds of the caveats and limitations discussed herein, MSWX offers new and unique opportunities for a broad range of applications, including hydrological modeling, climate analysis, water resources management, impact assessment, disease tracking, agricultural monitoring, and forecasting and monitoring of weather extremes (e.g., droughts, floods, and heatwaves), at global to regional scales.

Acknowledgments.

We are grateful to the developers of the CHELSA, CRU TS, ERA5, FLUXNET, GDAS, GEFS, GHCN-D, GSOD, and ISD products. We are also grateful to the developers of the Python modules numpy (Oliphant 2006; van der Walt et al. 2011), scipy (Virtanen et al. 2020), matplotlib (Hunter 2007), pandas (McKinney 2010), and xarray (Hoyer and Hamman 2017). We thank the editor and two anonymous reviewers for helpful suggestions and advice. Diego Miralles acknowledges support from the European Research Council (ERC) under Grant Agreement 715254 (DRY–2–DRY).

Data availability statement.

MSWX is available at www.gloh2o.org/mswx.

Appendix A: ERA5 relative and specific humidity

Since relative humidity is not directly available from ERA5, we calculate it from ERA5 dewpoint and air temperatures according to Vaisala [2013, their Eq. (12)]:
$$\mathrm{RH} = 100 \times 10^{\,m\left(\frac{T_d}{T_d + T_n} - \frac{T}{T + T_n}\right)}, \tag{A1}$$
where RH (%) is the relative humidity, T (K) is ERA5 air temperature, Td (K) is ERA5 dewpoint temperature, and m and Tn are constants of 7.59138 and 240.7263 K, respectively.
Specific humidity is also not directly available from ERA5, and is therefore calculated from ERA5 dewpoint temperature, air temperature, and surface pressure according to Stull [2016, their Eq. (4.24)]:
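$$q = 0.1\left(\frac{\varepsilon e_0}{p}\right)\exp\left[\frac{L}{\upsilon}\left(\frac{1}{273.15} - \frac{1}{T_d}\right)\right], \tag{A2}$$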
where q (g g−1) is the specific humidity, ε is the gas-constant ratio (622 g kg−1), e0 is the reference saturation vapor pressure (6.113 hPa), p (Pa) is ERA5 surface pressure, L/υ is the Clausius–Clapeyron parameter for vaporization (5,423 K), and Td (K) is ERA5 dewpoint temperature.
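
As a minimal sketch, the specific humidity relation can be coded directly from the constants listed above, assuming the usual Clausius–Clapeyron form in which the bracketed term is the difference between 1/273.15 and 1/Td. The function and constant names are ours.

```python
import numpy as np

EPSILON = 622.0      # gas-constant ratio (g kg-1)
E0 = 6.113           # reference saturation vapor pressure (hPa)
L_OVER_RV = 5423.0   # Clausius-Clapeyron parameter for vaporization (K)
T0 = 273.15          # reference temperature (K)


def specific_humidity(dewpoint_k, pressure_pa):
    """Specific humidity (g g-1) from dewpoint (K) and surface pressure (Pa).

    The factor 0.1 combines the hPa-to-Pa (x100) and g kg-1-to-g g-1 (/1000)
    conversions needed for the units of the constants above.
    """
    dewpoint_k = np.asarray(dewpoint_k, dtype=float)
    pressure_pa = np.asarray(pressure_pa, dtype=float)
    vapor_term = np.exp(L_OVER_RV * (1.0 / T0 - 1.0 / dewpoint_k))
    return 0.1 * (EPSILON * E0 / pressure_pa) * vapor_term


# Dewpoint of 0 degC at standard sea level pressure: roughly 3.75 g kg-1.
q = specific_humidity(273.15, 101_325.0)
print(q)
```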

Appendix B: SEAS5 surface pressure and relative and specific humidity

SEAS5 provides sea level pressure instead of surface pressure. To calculate surface pressure, we use the barometric formula defined by
$$p = p_0\left(1 + \frac{Lh}{T + Lh}\right)^{-\frac{gM}{R_0 L}}, \tag{B1}$$
where p (Pa) is the surface pressure, p0 (Pa) is SEAS5 sea level pressure, L is the temperature lapse rate (0.006 K m−1), h (m) is the SEAS5 elevation, T (K) is SEAS5 air temperature, g is gravity (9.81 m s−2), M is the molar mass of dry air (0.02897 kg mol−1), and R0 is the universal gas constant [8.31446 J (mol K)−1].

Specific and relative humidity are also not part of the SEAS5 output. Specific humidity is calculated following Eq. (A2) from SEAS5 dewpoint temperature and surface pressure [calculated using Eq. (B1)], whereas relative humidity is calculated following Eq. (A1) from SEAS5 air and dewpoint temperatures.

Appendix C: Station observations

For the MSWX-Past validation, we considered observations from 156,757 meteorological stations worldwide from four sources: (i) the Global Historical Climatology Network Daily (GHCN-D; 117,355 stations; Menne et al. 2012), (ii) the Global Summary of the Day (GSOD; 25,457 stations; www.ncei.noaa.gov/access/search/data-search/global-summary-of-the-day), (iii) FLUXNET2015 (206 stations; Pastorello et al. 2020), and (iv) the Integrated Surface Database (ISD; 13,739 stations; www.ncdc.noaa.gov/isd).

Three of the data sources have a daily resolution (GHCN-D, GSOD, and FLUXNET2015), while one has an hourly resolution (ISD). Daily precipitation and wind speed were available from all three daily data sources. Daily air temperature was only available from GSOD and FLUXNET2015 (see the supplement for the list of FLUXNET stations from which we used data). Daily minimum and maximum air temperature were only available from GHCN-D and GSOD, while daily surface pressure was only available from GSOD. Daily downward shortwave and longwave radiation were only available from FLUXNET2015.

Daily relative and specific humidity were not directly available from any of the sources and, therefore, were calculated from hourly ISD dewpoint temperature, air temperature, and surface pressure data [using Eqs. (A1) and (A2), respectively]. Data gaps of ≤2 hours in the relative and specific humidity time series were filled using linear interpolation, after which we calculated daily averages if 24 hourly values were available.
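
The gap filling and daily averaging can be sketched with pandas as follows. We assume here that a gap of ≤2 hours means at most two consecutive missing hourly values, and that the series carries a regular hourly datetime index; the function names are ours.

```python
import numpy as np
import pandas as pd


def fill_short_gaps(hourly, max_gap=2):
    """Linearly interpolate runs of at most `max_gap` missing hourly values."""
    is_nan = hourly.isna()
    run_id = (~is_nan).cumsum()  # constant within each run of missing values
    gap_len = is_nan.astype(int).groupby(run_id).transform("sum")
    filled = hourly.interpolate(method="linear", limit_area="inside")
    # Put back the missing values that belong to gaps longer than max_gap.
    return filled.mask(is_nan & (gap_len > max_gap))


def daily_mean_complete(hourly, required=24):
    """Daily mean, kept only for days with all `required` hourly values."""
    grouped = hourly.resample("D")
    return grouped.mean().where(grouped.count() == required)


index = pd.date_range("2021-01-01", periods=48, freq=pd.Timedelta(hours=1))
series = pd.Series(np.arange(48.0), index=index)
series.iloc[[5, 6]] = np.nan         # 2-h gap on day 1: filled
series.iloc[[30, 31, 32]] = np.nan   # 3-h gap on day 2: left missing

daily = daily_mean_complete(fill_short_gaps(series))
print(daily)
```

In this example the first day yields a daily average because its two-hour gap is filled, whereas the second day is discarded because only 21 hourly values remain.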

References

  • Abatzoglou, J. T., S. Z. Dobrowski, S. A. Parks, and K. C. Hegewisch, 2018: TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958 to 2015. Sci. Data, 5, 170191, https://doi.org/10.1038/sdata.2017.191.
  • Alfieri, L., P. Burek, E. Dutra, B. Krzeminski, D. Muraro, J. Thielen, and F. Pappenberger, 2013: GloFAS—Global ensemble streamflow forecasting and flood early warning. Hydrol. Earth Syst. Sci., 17, 1161–1175, https://doi.org/10.5194/hess-17-1161-2013.
  • Alfieri, L., V. Lorini, F. A. Hirpa, S. Harrigan, E. Zsoter, C. Prudhomme, and P. Salamon, 2020: A global streamflow reanalysis for 1980–2018. J. Hydrol. X, 6, 100049, https://doi.org/10.1016/j.hydroa.2019.100049.
  • Anghileri, D., S. Monhart, C. Zhou, K. Bogner, A. Castelletti, P. Burlando, and M. Zappa, 2019: The value of subseasonal hydrometeorological forecasts to hydropower operations: How much does preprocessing matter? Water Resour. Res., 55, 10 159–10 178, https://doi.org/10.1029/2019WR025280.
  • Arakawa, A., 2004: The cumulus parameterization problem: Past, present, and future. J. Climate, 17, 2493–2525, https://doi.org/10.1175/1520-0442(2004)017<2493:RATCPP>2.0.CO;2.
  • Beck, H. E., A. I. J. M. van Dijk, V. Levizzani, J. Schellekens, D. G. Miralles, B. Martens, and A. de Roo, 2017: MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data. Hydrol. Earth Syst. Sci., 21, 589–615, https://doi.org/10.5194/hess-21-589-2017.
  • Beck, H. E., E. F. Wood, M. Pan, C. K. Fisher, D. G. Miralles, A. I. J. M. van Dijk, T. R. McVicar, and R. F. Adler, 2019a: MSWEP V2 global 3-hourly 0.1° precipitation: Methodology and quantitative assessment. Bull. Amer. Meteor. Soc., 100, 473–500, https://doi.org/10.1175/BAMS-D-17-0138.1.
  • Beck, H. E., and Coauthors, 2019b: Daily evaluation of 26 precipitation datasets using stage-IV gauge-radar data for the CONUS. Hydrol. Earth Syst. Sci., 23, 207–224, https://doi.org/10.5194/hess-23-207-2019.
  • Beck, H. E., and Coauthors, 2021: Evaluation of 18 satellite- and model-based soil moisture products using in situ measurements from 826 sensors. Hydrol. Earth Syst. Sci., 25, 17–40, https://doi.org/10.5194/hess-25-17-2021.
  • Berg, A. A., J. S. Famiglietti, J. P. Walker, and P. R. Houser, 2003: Impact of bias correction to reanalysis products on simulations of North American soil moisture and hydrological fluxes. J. Geophys. Res., 108, 4490, https://doi.org/10.1029/2002JD003334.
  • Berg, P., F. Almén, and D. Bozhinova, 2021: HydroGFD3.0 (Hydrological Global Forcing Data): A 25 km global precipitation and temperature data set updated in near-real time. Earth Syst. Sci. Data, 13, 1531–1545, https://doi.org/10.5194/essd-13-1531-2021.
  • Bosilovich, M. G., J. Chen, F. R. Robertson, and R. F. Adler, 2008: Evaluation of global precipitation in reanalyses. J. Appl. Meteor. Climatol., 47, 2279–2299, https://doi.org/10.1175/2008JAMC1921.1.
  • Brown, J. N., Z. Hochman, D. Holzworth, and H. Horan, 2018: Seasonal climate forecasts provide more definitive and accurate crop yield predictions. Agric. For. Meteor., 260–261, 247–254, https://doi.org/10.1016/j.agrformet.2018.06.001.
  • Cannon, A. J., S. R. Sobie, and T. Q. Murdock, 2015: Bias correction of GCM precipitation by quantile mapping: How well do methods preserve changes in quantiles and extremes? J. Climate, 28, 6938–6959, https://doi.org/10.1175/JCLI-D-14-00754.1.
  • Cheng, W. Y. Y., and W. J. Steenburgh, 2007: Strengths and weaknesses of MOS, running-mean bias removal, and Kalman filter techniques for improving model forecasts over the western United States. Wea. Forecasting, 22, 1304–1318, https://doi.org/10.1175/2007WAF2006084.1.
  • Cosgrove, B. A., and Coauthors, 2003: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8842, https://doi.org/10.1029/2002JD003118.
  • Cucchi, M., G. P. Weedon, A. Amici, N. Bellouin, S. Lange, H. Müller Schmied, H. Hersbach, and C. Buontempo, 2020: WFDE5: Bias-adjusted ERA5 reanalysis data for impact studies. Earth Syst. Sci. Data, 12, 2097–2120, https://doi.org/10.5194/essd-12-2097-2020.
  • Danielson, J. J., and D. B. Gesch, 2011: Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010). USGS Open-File Rep. 2011-1073, 26 pp.
  • Davis, J. K., G. P. Vincent, M. B. Hildreth, L. Kightlinger, C. Carlson, and M. C. Wimberly, 2018: Improving the prediction of arbovirus outbreaks: A comparison of climate-driven models for West Nile virus in an endemic region of the United States. Acta Trop., 185, 242–250, https://doi.org/10.1016/j.actatropica.2018.04.028.
  • Dörenkämper, M., and Coauthors, 2020: The making of the new European wind atlas—Part 2: Production and evaluation. Geosci. Model Dev., 13, 5079–5102, https://doi.org/10.5194/gmd-13-5079-2020.
  • Drusch, M. , E. F. Wood , and H. Gao , 2005: Observation operators for the direct assimilation of TRMM Microwave Imager retrieved soil moisture. Geophys. Res. Lett., 32, L15403, https://doi.org/10.1029/2005GL023623.

  • Durai, V. R., and R. Bhradwaj, 2014: Evaluation of statistical bias correction methods for numerical weather prediction model forecasts of maximum and minimum temperatures. Nat. Hazards, 73, 1229–1254, https://doi.org/10.1007/s11069-014-1136-1.

  • Dutra, E., J. Muñoz Sabater, S. Boussetta, T. Komori, S. Hirahara, and G. Balsamo, 2020: Environmental lapse rate for high-resolution land surface downscaling: An application to ERA5. Earth Space Sci., 7, e2019EA000984, https://doi.org/10.1029/2019EA000984.

  • Ebert, E. E., J. E. Janowiak, and C. Kidd, 2007: Comparison of near-real-time precipitation estimates from satellite observations and numerical models. Bull. Amer. Meteor. Soc., 88, 47–64, https://doi.org/10.1175/BAMS-88-1-47.

  • Emerton, R. E., and Coauthors, 2016: Continental and global scale flood forecasting systems. Wiley Interdiscip. Rev.: Water, 3, 391–418, https://doi.org/10.1002/wat2.1137.

  • ESMAP, 2019: Global Solar Atlas 2.0: Validation report for global solar radiation model. ESMAP Tech. Rep. 149878, 57 pp., https://documents1.worldbank.org/curated/en/507341592893487792/pdf/Global-Solar-Atlas-2-0-Validation-Report.pdf.

  • Fan, Y., and H. van den Dool, 2011: Bias correction and forecast skill of NCEP GFS ensemble week-1 and week-2 precipitation, 2-m surface air temperature, and soil moisture forecasts. Wea. Forecasting, 26, 355–370, https://doi.org/10.1175/WAF-D-10-05028.1.

  • Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.

  • Haddeland, I., J. Heinke, F. Voß, S. Eisner, C. Chen, S. Hagemann, and F. Ludwig, 2012: Effects of climate model radiation, humidity and wind estimates on hydrological simulations. Hydrol. Earth Syst. Sci., 16, 305–318, https://doi.org/10.5194/hess-16-305-2012.

  • Hahmann, A. N., and Coauthors, 2020: The making of the new European wind atlas – Part 1: Model sensitivity. Geosci. Model Dev., 13, 5053–5078, https://doi.org/10.5194/gmd-13-5053-2020.

  • Haiden, T., and Coauthors, 2018: Use of in situ surface observations at ECMWF. ECMWF Tech. Memo. 834, 28 pp., www.ecmwf.int/node/18748.

  • Hao, Z., X. Yuan, Y. Xia, F. Hao, and V. P. Singh, 2017: An overview of drought monitoring and prediction systems at regional and global scales. Bull. Amer. Meteor. Soc., 98, 1879–1896, https://doi.org/10.1175/BAMS-D-15-00149.1.

  • Harris, I., T. J. Osborn, P. Jones, and D. Lister, 2020: Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Sci. Data, 7, 109, https://doi.org/10.1038/s41597-020-0453-3.

  • Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.

  • Hirpa, F. A., P. Salamon, L. Alfieri, J. T. del Pozo, E. Zsoter, and F. Pappenberger, 2016: The effect of reference climatology on global flood forecasting. J. Hydrometeor., 17, 1131–1145, https://doi.org/10.1175/JHM-D-15-0044.1.

  • Hoffmann, L., and Coauthors, 2019: From ERA-Interim to ERA5: The considerable impact of ECMWF’s next-generation reanalysis on Lagrangian transport simulations. Atmos. Chem. Phys., 19, 3097–3124, https://doi.org/10.5194/acp-19-3097-2019.

  • Hoyer, S., and J. Hamman, 2017: xarray: N-D labeled arrays and datasets in Python. J. Open Res. Software, 5, 10, https://doi.org/10.5334/jors.148.

  • Huang, J., 2018: A simple accurate formula for calculating saturation vapor pressure of water and ice. J. Appl. Meteor. Climatol., 57, 1265–1272, https://doi.org/10.1175/JAMC-D-17-0334.1.

  • Huffman, G. J., D. T. Bolvin, and E. J. Nelkin, 2018: Integrated Multi-satellite Retrievals for GPM (IMERG) technical documentation. NASA GSFC Tech. Rep., 60 pp.

  • Hunter, J. D., 2007: Matplotlib: A 2D graphics environment. Comput. Sci. Eng., 9, 90–95, https://doi.org/10.1109/MCSE.2007.55.

  • Jin, J., and L. Wen, 2012: Evaluation of snowmelt simulation in the Weather Research and Forecasting Model. J. Geophys. Res., 117, D10110, https://doi.org/10.1029/2011JD016980.

  • Johnson, S. J., and Coauthors, 2019: SEAS5: The new ECMWF seasonal forecast system. Geosci. Model Dev., 12, 1087–1117, https://doi.org/10.5194/gmd-12-1087-2019.

  • Jourdier, B., 2020: Evaluation of ERA5, MERRA-2, COSMO-REA6, NEWA and AROME to simulate wind power production over France. Adv. Sci. Res., 17, 63–77, https://doi.org/10.5194/asr-17-63-2020.

  • Juras, J., 1994: Some common features of probability distributions for precipitation. Theor. Appl. Climatol., 49, 69–76, https://doi.org/10.1007/BF00868191.

  • Kang, S., and J.-B. Ahn, 2015: Global energy and water balances in the latest reanalyses. Asia-Pac. J. Atmos. Sci., 51, 293–302, https://doi.org/10.1007/s13143-015-0079-0.

  • Karger, D. N., and Coauthors, 2017: Climatologies at high resolution for the Earth’s land surface areas. Sci. Data, 5, 170122, https://doi.org/10.1038/sdata.2017.122.

  • Katiraie-Boroujerdy, P.-S., M. Rahnamay Naeini, A. Akbari Asanjan, A. Chavoshian, K.-L. Hsu, and S. Sorooshian, 2020: Bias correction of satellite-based precipitation estimations using quantile mapping approach in different climate regions of Iran. Remote Sens., 12, 2102, https://doi.org/10.3390/rs12132102.

  • Kidd, C., A. Becker, G. J. Huffman, C. L. Muller, P. Joe, G. Skofronick-Jackson, and D. B. Kirschbaum, 2017: So, how much of the Earth’s surface is covered by rain gauges? Bull. Amer. Meteor. Soc., 98, 69–78, https://doi.org/10.1175/BAMS-D-14-00283.1.

  • Kobayashi, S., and Coauthors, 2015: The JRA-55 reanalysis: General specifications and basic characteristics. J. Meteor. Soc. Japan, 93, 5–48, https://doi.org/10.2151/jmsj.2015-001.

  • Li, S., and A. W. Robertson, 2015: Evaluation of submonthly precipitation forecast skill from global ensemble prediction systems. Mon. Wea. Rev., 143, 2871–2889, https://doi.org/10.1175/MWR-D-14-00277.1.

  • Luo, H., F. Ge, K. Yang, S. Zhu, T. Peng, W. Cai, X. Liu, and W. Tang, 2019: Assessment of ECMWF reanalysis data in complex terrain: Can the CERA-20C and ERA-Interim data sets replicate the variation in surface air temperatures over Sichuan, China? Int. J. Climatol., 39, 5619–5634, https://doi.org/10.1002/joc.6175.

  • Manzanas, R., 2020: Assessment of model drifts in seasonal forecasting: Sensitivity to ensemble size and implications for bias correction. J. Adv. Model. Earth Syst., 12, https://doi.org/10.1029/2019MS001751.

  • Maraun, D., 2013: Bias correction, quantile mapping, and downscaling: Revisiting the inflation issue. J. Climate, 26, 2137–2143, https://doi.org/10.1175/JCLI-D-12-00821.1.

  • Massari, C., W. Crow, and L. Brocca, 2017: An assessment of the accuracy of global rainfall estimates without ground-based observations. Hydrol. Earth Syst. Sci., 21, 4347–4361, https://doi.org/10.5194/hess-21-4347-2017.

  • McKinney, W., 2010: Data structures for statistical computing in Python. Proc. Ninth Python in Science Conf., Austin, TX, SciPy, 56–61, https://doi.org/10.25080/Majora-92bf1922-00a.

  • McVicar, T. R., T. G. Van Niel, L. T. Li, M. L. Roderick, D. P. Rayner, L. Ricciardulli, and R. J. Donohue, 2008: Wind speed climatology and trends for Australia, 1975–2006: Capturing the stilling phenomenon and comparison with near-surface reanalysis output. Geophys. Res. Lett., 35, L20403, https://doi.org/10.1029/2008GL035627.

  • Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1.

  • Minola, L., F. Zhang, C. Azorin-Molina, A. A. Safaei Pirooz, R. G. J. Flay, H. Hersbach, and D. Chen, 2020: Near-surface mean and gust wind speeds in ERA5 across Sweden: Towards an improved gust parametrization. Climate Dyn., 55, 887–907, https://doi.org/10.1007/s00382-020-05302-6.

  • Muñoz Sabater, J., and Coauthors, 2021: ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data, 13, 4349–4383, https://doi.org/10.5194/essd-13-4349-2021.

  • Myers, M. F., D. J. Rogers, J. Cox, A. Flahault, and S. I. Hay, 2000: Forecasting disease risk for increased epidemic preparedness in public health. Remote Sensing and Geographical Information Systems in Epidemiology, Advances in Parasitology, Vol. 47, Academic Press, 309–330.

  • NCEP, 2021a: Global Data Assimilation System (GDAS). NCEI, accessed 7 February 2022, www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-data-assimilation-system-gdas.

  • NCEP, 2021b: Global Ensemble Forecast System (GEFS). NCEI, accessed 7 February 2022, www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-ensemble-forecast-system-gefs.

  • Nkiaka, E., N. Nawaz, and J. Lovett, 2017: Evaluating global reanalysis datasets as input for hydrological modelling in the Sudano-Sahel region. Hydrology, 4, 13, https://doi.org/10.3390/hydrology4010013.

  • Oliphant, T. E., 2006: NumPy: A Guide to NumPy. Trelgol Publishing, 363 pp., www.numpy.org.

  • Pastorello, G., and Coauthors, 2020: The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Sci. Data, 7, 225, https://doi.org/10.1038/s41597-020-0534-3.

  • Perez, R., T. Cebecauer, and M. Šúri, 2013: Semi-empirical satellite models. Solar Energy Forecasting and Resource Assessment, J. Kleissl, Ed., Academic Press, 103 pp.

  • Post, D. A., J. Vaze, F. H. S. Chiew, and J. M. Perraud, 2008: Impact of rainfall data quality on the parameter values of rainfall-runoff models: Implications for regionalisation. Proc. Water Down Under 2008, Adelaide, Australia, Engineers Australia, 2315–2326, http://hdl.handle.net/102.100.100/119078?index=1.

  • Prein, A. F., and Coauthors, 2015: A review on regional convection-permitting climate modeling: Demonstrations, prospects, and challenges. Rev. Geophys., 53, 323–361, https://doi.org/10.1002/2014RG000475.

  • Ramon, J., L. Lledó, V. Torralba, A. Soret, and F. J. Doblas-Reyes, 2019: What global reanalysis best represents near-surface winds? Quart. J. Roy. Meteor. Soc., 145, 3236–3251, https://doi.org/10.1002/qj.3616.

  • Reichle, R. H., and R. D. Koster, 2004: Bias reduction in short records of satellite soil moisture. Geophys. Res. Lett., 31, L19501, https://doi.org/10.1029/2004GL020938.

  • Reichle, R. H., Q. Liu, R. D. Koster, C. S. Draper, S. P. P. Mahanama, and G. S. Partyka, 2017: Land surface precipitation in MERRA-2. J. Climate, 30, 1643–1664, https://doi.org/10.1175/JCLI-D-16-0570.1.

  • Reichler, T., and J. Kim, 2008: Uncertainties in the climate mean state of global observations, reanalyses, and the GFDL climate model. J. Geophys. Res., 113, D05106, https://doi.org/10.1029/2007JD009278.

  • Rouholahnejad-Freund, E., Y. Fan, and J. W. Kirchner, 2020: Global assessment of how averaging over spatial heterogeneity in precipitation and potential evapotranspiration affects modeled evapotranspiration rates. Hydrol. Earth Syst. Sci., 24, 1927–1938, https://doi.org/10.5194/hess-24-1927-2020.

  • Saha, S., and Coauthors, 2010: The NCEP Climate Forecast System Reanalysis. Bull. Amer. Meteor. Soc., 91, 1015–1057, https://doi.org/10.1175/2010BAMS3001.1.

  • Salas, F. R., and Coauthors, 2018: Towards real-time continental scale streamflow simulation in continuous and discrete space. J. Amer. Water Resour. Assoc., 54, 7–27, https://doi.org/10.1111/1752-1688.12586.

  • Schneider, U., A. Becker, P. Finger, A. Meyer-Christoffer, M. Ziese, and B. Rudolf, 2014: GPCC’s new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle. Theor. Appl. Climatol., 115, 15–40, https://doi.org/10.1007/s00704-013-0860-x.

  • Schwalbert, R. A., T. Amado, G. Corassa, L. P. Pott, P. V. V. Prasad, and I. A. Ciampitti, 2020: Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric. For. Meteor., 284, 107886, https://doi.org/10.1016/j.agrformet.2019.107886.

  • Sheffield, J., G. Goteti, and E. F. Wood, 2006: Development of a 50-year high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19, 3088–3111, https://doi.org/10.1175/JCLI3790.1.

  • Sheffield, J., and Coauthors, 2014: A drought monitoring and forecasting system for sub-Sahara African water resources and food security. Bull. Amer. Meteor. Soc., 95, 861–882, https://doi.org/10.1175/BAMS-D-12-00124.1.

  • Singh, R. S., J. T. Reager, N. L. Miller, and J. S. Famiglietti, 2015: Toward hyper-resolution land-surface modeling: The effects of fine-scale topography and soil texture on CLM4.0 simulations over the southwestern U.S. Water Resour. Res., 51, 2648–2667, https://doi.org/10.1002/2014WR015686.

  • Stephens, G. L., and Coauthors, 2010: Dreary state of precipitation in global models. J. Geophys. Res., 115, D24211, https://doi.org/10.1029/2010JD014532.

  • Stull, R., 2016: Practical Meteorology: An Algebra-Based Survey of Atmospheric Science. The University of British Columbia, 940 pp.

  • Šúri, M., and T. Cebecauer, 2014: Satellite-based solar resource data: Model validation statistics versus user’s uncertainty. Proc. SOLAR 2014 Conf., San Francisco, CA, ASES.

  • Šúri, M., T. Cebecauer, and C. A. Gueymard, 2011a: Uncertainty sources in satellite-derived direct normal irradiance: How can prediction accuracy be improved globally? 17th SolarPACES Conf., Granada, Spain, SolarPACES.

  • Šúri, M., T. Cebecauer, and A. Skoczek, 2011b: SolarGIS: Solar data and online applications for PV planning and performance assessment. 26th European Photovoltaics Solar Energy Conf., Hamburg, Germany, EU PVSEC.

  • Toth, Z., and M. Peña, 2007: Data assimilation and numerical forecasting with imperfect models: The mapping paradigm. Physica D, 230, 146–158, https://doi.org/10.1016/j.physd.2006.08.016.

  • Urraca, R., T. Huld, A. Gracia-Amillo, F. J. M. de Pison, F. Kaspar, and A. Sanz-Garcia, 2018: Evaluation of global horizontal irradiance estimates from ERA5 and COSMO-REA6 reanalyses using ground and satellite-based data. Sol. Energy, 164, 339–354, https://doi.org/10.1016/j.solener.2018.02.059.

  • Vaisala, 2013: Humidity conversion formulas. Vaisala Tech. Rep., 17 pp., https://www.vaisala.com/en/lp/make-your-job-easier-humidity-conversion-formulas.

  • van der Walt, S., S. C. Colbert, and G. Varoquaux, 2011: The Numpy array: A structure for efficient numerical computation. Comput. Sci. Eng., 13, 22–30, https://doi.org/10.1109/MCSE.2011.37.

  • Virtanen, P., and Coauthors, 2020: SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods, 17, 261–272, https://doi.org/10.1038/s41592-019-0686-2.

  • Viviroli, D., H. H. Dürr, B. Messerli, M. Meybeck, and R. Weingartner, 2007: Mountains of the world, water towers for humanity: Typology, mapping, and global significance. Water Resour. Res., 43, W07447, https://doi.org/10.1029/2006WR005653.

  • Widger, W. K., 1977: Estimations of wind speed frequency distributions using only the monthly average and fastest mile data. J. Appl. Meteor. Climatol., 16, 244–247, https://doi.org/10.1175/1520-0450(1977)016<0244:EOWSFD>2.0.CO;2.

  • Zeng, Q., H. Chen, C.-Y. Xu, M.-X. Jie, J. Chen, S.-L. Guo, and J. Liu, 2018: The effect of rain gauge density and distribution on runoff simulation using a lumped hydrological modelling approach. J. Hydrol., 563, 106–122, https://doi.org/10.1016/j.jhydrol.2018.05.058.

  • Zhang, Q., J. Ye, S. Zhang, and F. Han, 2018: Precipitable water vapor retrieval and analysis by multiple data sources: Ground-based GNSS, radio occultation, radiosonde, microwave satellite, and NWP reanalysis data. J. Sens., 2018, 3428303, https://doi.org/10.1155/2018/3428303.

  • Zsoter, E., C. Prudhomme, E. Stephens, F. Pappenberger, and H. Cloke, 2020: Using ensemble reforecasts to generate flood thresholds for improved global flood forecasting. J. Flood Risk Manage., 13, e12658, https://doi.org/10.1111/jfr3.12658.


Supplementary Materials
