1. Introduction
The NASA Global Modeling and Assimilation Office recently released the Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2; Gelaro et al. 2017). This new global reanalysis product replaces and extends the original MERRA atmospheric reanalysis (Rienecker et al. 2011), as well as the MERRA-Land reanalysis (Reichle et al. 2011). In addition to several other major advances, MERRA-2 uses observed precipitation in place of model-generated precipitation at the land surface during the atmospheric model integration. The use of observed precipitation in MERRA-2 was refined from the approach used for MERRA-Land (Reichle et al. 2017b), which was an offline (land only) replay of MERRA forced by atmospheric fields from MERRA but with the precipitation forcing corrected using gauge-based observations.
The motivation for using observed precipitation in reanalyses is that precipitation is the main driver of soil moisture, which in turn controls the partitioning of incident surface radiation between latent heat (LH) and sensible heat (SH) fluxes back to the atmosphere. Reichle et al. (2017a) show that both MERRA-2 and MERRA-Land have improved upon the land surface hydrology of MERRA, agreeing better with independent observational time series of soil moisture, terrestrial water storage, streamflow, and snow amount. Here, we extend this work by evaluating the MERRA-2 surface energy budget and 2-m temperatures
We start by comparing the long-term annual global energy budget over land from MERRA-2, MERRA-Land, and MERRA to state-of-the-art estimates from the literature. These literature estimates, from Trenberth et al. (2009), Wild et al. (2015), and the NASA Energy and Water Cycle Studies (NEWS) program (NEWS Science Integration Team 2007; L’Ecuyer et al. 2015), were each produced by carefully combining multiple input datasets with global energy balance constraints. Taken together they represent our best understanding of the long-term annual mean energy budget over land.
Next, we consider global maps of the performance of the land surface turbulent heat fluxes from each reanalysis, as a step toward linking differences in performance to the dominant local physical processes and to the potential improvements obtained from the use of the observed precipitation in MERRA-2. We focus on the boreal summer [June–August (JJA)], since land–atmosphere coupling is strongest and surface turbulent heat fluxes are most active in the summer.
Unfortunately, there are no standard global gridded reference datasets against which the reanalysis LH and SH can be evaluated. Several recent efforts have compared global LH estimates from different combinations of reanalyses, offline land surface models, and diagnostic methods. Most estimates agree on the regional patterns and local seasonal cycle of LH, although there is considerable disagreement in the absolute values and temporal behavior across different flux estimates (Jiménez et al. 2011; Mueller et al. 2011; Miralles et al. 2011). Moreover, uncertainty in the basic model structure is the largest source of disagreement (Schlosser and Gao 2010; Mueller et al. 2013). While ground-based observations are available from tower-mounted eddy covariance sensors (e.g., Baldocchi et al. 2001), the number of towers (in the hundreds) is well below the sampling needed for global estimation, and their locations are not designed to sample globally representative land cover types. Additionally, the measurements themselves have considerable uncertainty and limited spatial representativeness (up to 1 km).
In the absence of a standard reference, we compare the JJA reanalysis turbulent heat flux estimates to two different gridded reference datasets: the Global Land Evaporation Amsterdam Model (GLEAM) (Miralles et al. 2011; Martens et al. 2017) for LH and FLUXNET-Model Tree Ensembles (MTE) (Jung et al. 2010) for LH and SH. These datasets were selected for several reasons: (i) they are among the state of the art, (ii) they are available globally for multidecadal time periods, (iii) they are independent of each other, and (iv) they rely on very different estimation methodologies (water balance modeling for GLEAM and upscaling of tower measurements for MTE). Since neither GLEAM nor MTE represents direct observations of the turbulent heat fluxes, we also compare each reanalysis to tower-based eddy covariance observations from the FLUXNET2015 dataset (FLUXNET 2015). To determine the potential contribution of radiation biases to regional LH and SH biases, we also compare the reanalyses’ surface radiation fields for JJA against gridded observations from the Clouds and the Earth’s Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) dataset (Kato et al. 2013).
Finally, to test whether the changes in the surface energy budget from MERRA to MERRA-2 have affected the atmospheric boundary layer, we also evaluate the JJA monthly mean daily minimum and maximum
This paper is organized as follows. Section 2 summarizes the reanalysis and reference datasets used, and section 3 presents the results, including evaluation of the (i) reanalyses’ annual global land energy budget averages, (ii) the spatially distributed mean JJA energy budget and
2. Methodology and data
a. The reanalyses
The coverage and resolution of each reanalysis are summarized in Table 1, with further details below. MERRA (Rienecker et al. 2011) and MERRA-2 (Gelaro et al. 2017) are atmospheric reanalyses produced with the NASA Goddard Earth Observing System Model, version 5 (GEOS-5), modeling and data assimilation system and were designed to provide historical analyses of the hydrological cycle across a broad range of climate time scales. To address shortcomings in the land surface hydrology from MERRA, MERRA-Land (Reichle et al. 2011) was released as an offline (land only) replay of MERRA, with the model-generated precipitation corrected using rain gauge observations and with minor, but important, model parameter changes. MERRA-2 features several major advances from MERRA, including an updated atmospheric general circulation model, an updated atmospheric assimilation system, an interactive aerosol scheme, and the use of observed precipitation at the land surface (and to compute wet aerosol deposition). In addition to the land model updates from MERRA-Land, MERRA-2 includes several more updates relevant to the land, as outlined in Reichle et al. (2017a). Most notably, the surface turbulence scheme was revised, generally resulting in enhanced SH over land (Molod et al. 2015).
Table 1. The reanalyses.

The method used to apply the observed precipitation at the land surface in MERRA-2 was refined from that used in MERRA-Land (Reichle and Liu 2014; Reichle et al. 2017b). In MERRA-Land the precipitation was corrected with daily Climate Prediction Center (CPC) Unified (CPCU; Chen et al. 2008) precipitation observations everywhere. For MERRA-2 the input precipitation differs in two ways: (i) in the high latitudes the MERRA-2 model-generated precipitation is retained, and (ii) over Africa the MERRA-2 precipitation is corrected with pentad-scale blended satellite and gauge-based observations from the CPC Merged Analysis of Precipitation (CMAP; Xie and Arkin 1997) and the Global Precipitation Climatology Project (GPCP; Huffman et al. 2009), version 2.1.
The land surface turbulent fluxes from the NASA reanalyses (MERRA-2, MERRA-Land, and MERRA) have not been explicitly evaluated globally. However, Jiménez et al. (2011) and Mueller et al. (2011) both included MERRA LH when merging multiple LH global land datasets into a single enhanced estimate (see section 2b), and in both studies MERRA was among the highest of the input LH estimates used. Additionally, Jiménez et al. (2011) noted a sharp gradient in the MERRA LH around 10°S in the tropics that was not present in other LH estimates. This bias gradient was traced to MERRA’s excessive rainfall canopy interception and precipitation errors (Reichle et al. 2011). Consequently, the interception reservoir parameters were revised for MERRA-Land (and MERRA-2) to eliminate this feature (the interception reservoir update was the most significant modeling change from MERRA to MERRA-Land).
An additional reanalysis, ERA-Interim, from the European Centre for Medium-Range Weather Forecasts (Dee et al. 2011), is included in the evaluation of the temporal behavior of the turbulent fluxes. In contrast to the NASA reanalyses, ERA-Interim includes a land surface updating scheme (de Rosnay et al. 2014). Specifically, the soil moisture, soil temperature, and snow temperatures are updated to minimize errors in the forecast screen-level relative humidity and temperature, while the snow depths are updated using satellite- and ground-based snow-cover and snow-depth observations.
b. Annual global land energy budget estimates
We compare the reanalyses’ annual global land energy budgets to three state-of-the-art estimates, from Trenberth et al. (2009), Wild et al. (2015), and the NEWS program estimates of L’Ecuyer et al. (2015). Each of these is based on a weighted merger of multiple modeled and observed datasets, and each applies to the energy budget at the start of the twenty-first century. For Trenberth et al. (2009) we have used their estimates for the CERES period of 2000–04; Wild et al. (2015) nominally refers to the same period, while L’Ecuyer et al. (2015) nominally refers to 2000–09. Note that the MERRA LH and SH over land were used as one of the inputs in NEWS.
These three global energy budget studies all provide continental and oceanic energy estimates, where “continental” is defined as nonocean and so includes land, land ice, and lakes but excludes inland seas. By contrast, the land estimates from MERRA-2, MERRA-Land, and MERRA apply to the area modeled by the land surface model, excluding land ice, lakes, and inland seas. The discrepancy due to the inclusion or exclusion of land ice is significant: land ice accounts for 10% of the continental area, with Antarctica making up 95% of this. NEWS provides energy budgets for each continent separately (L’Ecuyer et al. 2015), and we use their (balance constrained) energy budget estimates to approximate the land-only energy budget terms by subtracting the area-weighted Antarctica estimates from the global continental estimates. We then use our land-only NEWS estimates to approximate the continental to land ratio for each NEWS energy budget term. By assuming that the same ratios apply to Trenberth et al. (2009) and Wild et al. (2015) we then approximate land-only estimates for the latter two studies. L’Ecuyer et al. (2015) and Wild et al. (2015) both provide uncertainty ranges for their globally averaged continental estimates, which we have applied unchanged to our approximated land-only estimates.
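The land-only adjustment described above can be illustrated with a short sketch. It assumes only that a continental mean flux is the area-weighted average of its land and Antarctic components. The function names and every number below are hypothetical placeholders chosen for illustration; they are not values from NEWS, Trenberth et al. (2009), or Wild et al. (2015).

```python
def land_only_flux(cont_flux, cont_area, ant_flux, ant_area):
    """Remove the area-weighted Antarctic contribution from a continental mean flux."""
    land_area = cont_area - ant_area
    return (cont_flux * cont_area - ant_flux * ant_area) / land_area


def apply_reference_ratio(other_cont_flux, ref_cont_flux, ref_land_flux):
    """Convert another study's continental flux to land-only, assuming it
    shares the reference study's continental-to-land ratio."""
    cont_to_land_ratio = ref_cont_flux / ref_land_flux
    return other_cont_flux / cont_to_land_ratio


# Hypothetical inputs (areas in 10^6 km^2, fluxes in W m-2); illustration only.
cont_area, ant_area = 150.0, 14.0
ref_cont_lh, ref_ant_lh = 36.0, 1.0

# Step 1: land-only estimate for the study that reports each continent separately.
ref_land_lh = land_only_flux(ref_cont_lh, cont_area, ref_ant_lh, ant_area)

# Step 2: reuse the resulting ratio for a study that reports only a continental mean.
other_land_lh = apply_reference_ratio(38.0, ref_cont_lh, ref_land_lh)
```

Because Antarctica contributes very little LH in this toy setup, removing it raises the land-only mean above the continental mean, and the same proportional increase is then passed on to the second study.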
For LH, we have also used three additional global land annual average estimates from the hydrology community, from Jiménez et al. (2011), Mueller et al. (2011), and Mueller et al. (2013). These estimates are also based on merging modeled and observed estimates. Jiménez et al. (2011) applies to global land (using a similar land definition to the NASA reanalyses) for 1994, while Mueller et al. (2011) applies to the global land area, excluding the Sahara, from 1989 to 1995, and Mueller et al. (2013) applies to the global land plus Greenland for 1989–2005. As previously noted, MERRA LH was one of the inputs used in the multiproduct mergers of Jiménez et al. (2011) and Mueller et al. (2011).
c. Gridded reference datasets
The coverage and resolution of each gridded reference dataset, together with a brief summary of important interdependencies with other datasets or reanalyses used in the study and uncertainty estimates (where available), are summarized in Table 2, with further details provided below.
Table 2. The gridded reference datasets.

1) GLEAM
GLEAM (version 3.1a) provides daily estimates of terrestrial evapotranspiration, estimated from satellite and reanalysis forcing using a Priestley and Taylor–based model (Miralles et al. 2011; Martens et al. 2017). The precipitation is from the Multi-Source Weighted-Ensemble Precipitation, which is a multimodel merger of established precipitation datasets, including the same CPCU dataset used in MERRA-Land and MERRA-2, as well as ERA-Interim precipitation [the latter is used predominantly in the high latitudes, where observed precipitation datasets are more uncertain (Beck et al. 2017)]. The net surface radiation and
2) MTE
MTE provides global estimates of carbon dioxide, energy, and water fluxes at the land surface, calculated using a machine learning technique to upscale half-hourly energy-balance-corrected eddy covariance observations from 253 FLUXNET towers (Jung et al. 2011). The input FLUXNET observations are from the La Thuile data release, an earlier generation of the FLUXNET2015 dataset used here (to be introduced in section 2d). CPCU precipitation (again, used directly in MERRA-Land and MERRA-2) and a
3) CRU temperature data
CRU time series version 4.00 (TS v4.00) provides gridded monthly means of the daily mean, minimum, and maximum temperature over land (Harris et al. 2014a,b). The temperatures are calculated from quality-controlled climate station data, which are interpolated onto the grid according to an assumed correlation decay distance (set to 1200 km for temperature variables). In instances where no station data are available within the assumed decay distance, the published data value defaults to the climatology. Here, such climatological values have been screened out. Also, we require at least 10 data points to estimate each statistic for a given grid cell. Even with this screening, the gridded output will be much less certain when/where station coverage is less dense, which occurs over Africa, South America, central Australia, and the high latitudes.
4) CERES-EBAF radiation data
CERES-EBAF version 4.00 surface irradiances are produced with a radiative transfer model after adjusting modeled and observed input data for consistency with top-of-atmosphere (TOA) CERES-EBAF radiation (Kato et al. 2013). The input data (surface, cloud, and atmospheric properties) are adjusted according to their observation-based estimated uncertainties. The input temperature and humidity profiles and land surface skin temperature
The CERES output shortwave irradiances are primarily determined by (observation based) TOA radiation and clouds; hence, they are reasonably independent of the MERRA and MERRA-2 reanalyses (Kato et al. 2013). On the other hand, the CERES output longwave irradiances, and particularly the upwelling longwave
5) Gridded dataset processing
As noted in Tables 1 and 2, some of the reference datasets and reanalyses used here publish output that applies only to the land fraction within each grid cell, while others publish a single estimate that applies to all surface types (land, permanent land ice, lakes, and ocean) within each grid cell. All of the gridded datasets and reanalyses were screened by removing all grid cells where the MERRA-2 land fraction was less than 50% (after interpolation to the relevant resolution) and then aggregated up to monthly means and 1° spatial resolution. All maps of global statistics are based on the boreal summer months of JJA only, and each comparison is made over the maximum available coincident time period, with the time periods noted in the relevant figure captions. The anomaly correlations
d. FLUXNET2015 tower observations
The FLUXNET2015 (FLUXNET 2015) sites were selected by downloading all Tier 1 observations at nonirrigated sites within grid cells classified as land at 1° resolution [as derived previously in section 2c(5)] and for which at least a 10-yr data record is available. Eddy covariance sensors underestimate turbulent heat fluxes and do not generally close the energy balance (Wilson et al. 2002); hence, we used the FLUXNET2015 energy balance closure-corrected LH and SH [see FLUXNET (2015) for details of the correction method]. While these corrections are rather uncertain, the corrected LH and SH showed better agreement with all of the reanalyses in Table 1 in terms of the means across all sites and the correlation of the means between the sites (while having negligible impact on the mean time series anomaly correlations). The balance-corrected FLUXNET data were screened to retain only days with less than 10% gap-filled data and only sites with data for at least 2550 days (~70% of 10 years). The monthly means were then calculated for months with at least 15 days of observations after the above screening, and the corresponding reanalysis monthly means were estimated using the same days. The resulting FLUXNET monthly time series were visually inspected, and obviously unrealistic features were removed. Four sites with unrealistic time series were removed. Of the remaining 21 stations, just one was in the Southern Hemisphere. Since our evaluation focuses on the boreal summertime, this site was excluded. The remaining 20 sites that have been used in this study are listed in Table 1 of the supplemental material.
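The temporal screening described above can be sketched as follows. The thresholds (10% gap-filled data, 2550 days, 15 days per month) are taken from the text; the record layout, the function name, and the constant names are assumptions made for this illustration and do not correspond to the FLUXNET2015 file format or to the authors' processing code.

```python
from collections import defaultdict

MAX_GAP_FRACTION = 0.10   # keep days with less than 10% gap-filled data
MIN_SITE_DAYS = 2550      # ~70% of 10 years
MIN_DAYS_PER_MONTH = 15   # minimum retained days needed for a monthly mean


def screened_monthly_means(daily_records):
    """Screen one site's daily data and form matched monthly means.

    daily_records: iterable of (year, month, tower_flux, reanalysis_flux,
    gap_fraction) tuples (a hypothetical layout).
    Returns {(year, month): (tower_mean, reanalysis_mean)}, or None if the
    site has too few retained days.
    """
    kept = [r for r in daily_records if r[4] < MAX_GAP_FRACTION]
    if len(kept) < MIN_SITE_DAYS:
        return None

    by_month = defaultdict(list)
    for year, month, tower, reanalysis, _gap in kept:
        by_month[(year, month)].append((tower, reanalysis))

    means = {}
    for key, pairs in by_month.items():
        if len(pairs) >= MIN_DAYS_PER_MONTH:
            n = len(pairs)
            # Both means use the same retained days, so the reanalysis is
            # sampled exactly as the tower is.
            means[key] = (sum(p[0] for p in pairs) / n,
                          sum(p[1] for p in pairs) / n)
    return means
```

Averaging the reanalysis over the same retained days as the tower avoids attributing sampling differences to model error. The subsequent visual inspection and the removal of unrealistic time series are manual steps and are not represented here.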
3. Results
a. Annual global land energy budgets
The globally averaged annual land energy budget estimates for MERRA-2, MERRA-Land, and MERRA are illustrated in Fig. 1, with numerical values given in Table 3. For each term, the estimates for MERRA-2 and MERRA are similar (within 2–3 W m−2), while the partitioning of

The global annual mean energy budget over land from the reanalyses [MERRA-2 (M-2); MERRA-Land (M-L); MERRA (M)], the literature [NEWS (NEW), Trenberth et al. (2009) (Tre), Wild et al. (2015) (Wil), Jiménez et al. (2011) (Jim), Mueller et al. (2011) (Mu1), and Mueller et al. (2013) (Mu3)], and the gridded reference datasets [MTE, GLEAM (GLM), and CERES (CER)], for (a) LH, (b) SH, (c)
Citation: Journal of Climate 31, 2; 10.1175/JCLI-D-17-0121.1
Table 3. Global annual land average energy budget from the NASA reanalyses (W m−2), estimated over an area of 130.2 × 10⁶ km².

Figure 1 also includes the energy budget estimates from the literature (see section 2b), as well as the annual global land averages for each of the gridded reference datasets in Table 2. In Fig. 1a, the MERRA-2 and MERRA global land LH are higher than all of the other estimates (although MERRA-2 is within the Jiménez et al. (2011) and Wild et al. (2015) confidence intervals). The three (land adjusted) LH estimates from the global energy budget studies (Trenberth et al. 2009; Wild et al. 2015; NEWS) are very similar to each other and to MTE, GLEAM, Mueller et al. (2011), and MERRA-Land (all are within 1 W m−2). While the other two LH estimates from the hydrology community (Jiménez et al. 2011; Mueller et al. 2013) are higher, they are not as high as MERRA-2 and MERRA. Compared to the average of the three global land energy budget estimates, the MERRA-2 LH is biased high by 6 W m−2 (15%), while MERRA is biased high by 9 W m−2 (21%), and MERRA-Land is much closer, being biased high by just 1 W m−2 (2%).
For the global land SH in Fig. 1b, MERRA-2 and MERRA are both higher than Trenberth et al. (2009) and Wild et al. (2015), although lower than NEWS (but within the NEWS confidence interval) and very close (within 1 W m−2) to MTE. Compared to the average of the three global land energy budget estimates, MERRA-2 is biased high by 5 W m−2 (15%) and MERRA by 4 W m−2 (12%), while MERRA-Land is much higher, with a bias of 15 W m−2 (42%).
The positive biases in both LH and SH from the reanalyses indicate a positive bias in the incident energy at the land surface. Indeed, Fig. 1g shows that
The literature estimates in Fig. 1 are presented as long-term means, and each represents different temporal and spatial coverage. Likewise, the annual global land averages for the gridded reference datasets in Fig. 1 are based on the full available (spatial and temporal) coverage for each. However, the gridded reference datasets and reanalyses can be cross-screened to ensure that they are compared with consistent coverage. With this cross-screening, the MERRA-2 LH bias estimate is 7 W m−2 versus GLEAM, or 9 W m−2 versus MTE, while the SH bias is 1 W m−2 versus MTE, and the radiation biases versus CERES-EBAF are 10 W m−2 for
b. Land–atmosphere coupling and the MERRA-2 precipitation corrections
Here, we identify regions where, in MERRA-2, (i) LH is sensitive to precipitation (or soil moisture), and (ii) the daily maximum
1) Soil moisture and latent heating
To first order, LH (or evapotranspiration) from soil and vegetation surfaces can be conceptualized as either a moisture- or energy-limited process. In drier conditions (i.e., for soil moisture below some critical point), LH is moisture-limited in that it is restricted by the amount of soil moisture available for evapotranspiration. Temporal variations in LH will then be correlated with the plant available soil moisture (principally, the soil moisture in the root zone). In contrast, in more humid conditions LH is energy limited; there is sufficient soil moisture available for evapotranspiration, so LH proceeds at the maximum rate determined by atmospheric water demand, and temporal variations in LH are accordingly correlated with temporal variations in atmospheric demand (net radiation, atmospheric humidity deficit, and wind) rather than soil moisture.
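One way to diagnose these regimes from model output is to correlate monthly anomalies of LH with those of root-zone soil moisture: a high squared correlation points to moisture-limited conditions, a low one to energy-limited conditions. The sketch below is a minimal illustration under two assumptions that are ours rather than the paper's: anomalies are formed by subtracting each calendar month's multi-year mean, and the toy series are invented for the example.

```python
def monthly_anomalies(values, months):
    """Subtract each calendar month's multi-year mean from a monthly series."""
    sums, counts = {}, {}
    for v, m in zip(values, months):
        sums[m] = sums.get(m, 0.0) + v
        counts[m] = counts.get(m, 0) + 1
    return [v - sums[m] / counts[m] for v, m in zip(values, months)]


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical JJA series for one grid cell over four years (illustration only).
months = [6, 7, 8] * 4
soil_moisture = [0.20, 0.25, 0.22, 0.18, 0.27, 0.20,
                 0.25, 0.21, 0.24, 0.19, 0.23, 0.26]
seasonal_offset = {6: 10.0, 7: 20.0, 8: 15.0}
# Idealized moisture-limited cell: LH tracks soil moisture plus a fixed seasonal cycle.
latent_heat = [100.0 * sm + seasonal_offset[m]
               for sm, m in zip(soil_moisture, months)]

r2_anom = squared_correlation(monthly_anomalies(latent_heat, months),
                              monthly_anomalies(soil_moisture, months))
```

In this idealized case the seasonal cycle is removed entirely by the anomaly step, so the squared correlation approaches one. In an energy-limited cell, LH anomalies would instead follow radiation or humidity deficit and the value would be small.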
Figure 2 shows the squared correlation between the JJA monthly anomaly MERRA-2 LH and root-zone soil moisture (SM)

2) Precipitation feedback on air temperature
Figure 3 shows maps of the squared anomaly correlation

JJA sensitivity of the monthly mean
Figure 3c then shows the difference between
For the boreal summer, the strongest impact of the observed precipitation, which can explain more than 25% of the
Figure 3c is consistent with previous studies identifying hot spots of strong coupling between the land and
For reference, the corresponding maps for the austral summer (December–February) are shown in Fig. 1 of the supplemental material for R2anom (LH, SM) and Fig. 2 in the supplemental material for the sensitivity to the precipitation corrections. In Fig. 1 of the supplemental material, the
c. Biases over boreal summer
In section 3a, the biases in the reanalyses’ global land energy budgets were provided as annual means. The seasonal cycle of the monthly mean global land biases (not shown) reveals that the largest global land biases for all budget terms occur in the boreal summer (JJA). Below, maps of these JJA biases are presented and discussed, together with the corresponding biases in 2-m air temperatures.
1) Energy budget terms
Figure 4 shows maps of the reanalyses’ JJA biases in LH and SH compared to each of GLEAM and MTE. For LH, the regions of positive and negative biases relative to GLEAM or MTE are similar (cf. Figs. 4a,d,g,j and Figs. 4b,e,h,k). For both, the LH biases depend on the local LH regime, with energy-limited regions [low

The mean JJA turbulent fluxes, with (a) GLEAM LH, (b) MTE LH, and (c) MTE SH reference data, and the difference from the reference data for (d)–(f) MERRA-2, (g)–(i) MERRA-Land, and (j)–(l) MERRA. The statistics span 1980–2015 for GLEAM and 1982–2011 for MTE.
The MERRA LH biases (Figs. 4j,k) show some of the same features as for MERRA-2, again with a tendency for large positive biases in energy-limited LH regimes. The most prominent difference is the sharp bias gradient in MERRA around 10°S (most notable in South America). As discussed in section 2a, this is associated with the unrealistically large rainfall interception reservoir in MERRA, combined with the MERRA precipitation errors; these problems have been alleviated in MERRA-2 (and MERRA-Land). Additionally, there are some isolated regions of large positive biases in moisture-limited regimes in MERRA that are removed in MERRA-2 (and MERRA-Land), such as in Mexico and southern India.
Overall, in energy-limited regions [
Figures 4c,f,i,l show the reanalyses’ biases in SH compared to MTE. In general, the SH biases for each reanalysis have an inverse relationship with the LH biases in Figs. 4b,c,e,f,h,i,k,l (for MERRA-2, the spatial correlation between the SH biases and the LH biases is −0.68 for GLEAM LH and −0.78 for MTE LH). Consequently, the evaporative fraction [EF = LH/(LH + SH)] biases compared to MTE in Figs. 5a,d,g,j show a spatial pattern very similar to that of the LH biases (for MERRA-2, the spatial correlation between MTE LH and EF biases is 0.83).
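The relationship between the LH, SH, and EF biases can be made concrete with a small sketch. The EF definition follows the text; the five grid-cell values are hypothetical numbers invented for illustration, and the plain unweighted Pearson correlation is an assumption (the area weighting, if any, used for the published spatial correlations is not stated here).

```python
import math


def evaporative_fraction(lh, sh):
    """EF = LH / (LH + SH), with both fluxes in W m-2."""
    return lh / (lh + sh)


def pearson(x, y):
    """Unweighted Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical JJA mean fluxes (W m-2) at five grid cells; illustration only.
ref_lh = [90.0, 60.0, 30.0, 110.0, 45.0]
ref_sh = [40.0, 70.0, 95.0, 30.0, 80.0]
rean_lh = [105.0, 62.0, 28.0, 130.0, 50.0]
rean_sh = [30.0, 69.0, 98.0, 15.0, 78.0]

# Bias = reanalysis minus reference, cell by cell.
lh_bias = [m - r for m, r in zip(rean_lh, ref_lh)]
sh_bias = [m - r for m, r in zip(rean_sh, ref_sh)]
ef_bias = [evaporative_fraction(ml, ms) - evaporative_fraction(rl, rs)
           for ml, ms, rl, rs in zip(rean_lh, rean_sh, ref_lh, ref_sh)]

r_lh_sh = pearson(lh_bias, sh_bias)  # negative when excess LH comes at the expense of SH
r_lh_ef = pearson(lh_bias, ef_bias)  # positive when LH biases reflect partitioning errors
```

When the available energy is roughly unbiased, an LH excess must be offset by an SH deficit, which is why the two bias fields are anticorrelated and why the EF bias pattern resembles the LH bias pattern.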

Separation of mean JJA turbulent flux into EF and incoming radiation biases, with the (a) MTE EF, (b) MTE LH + SH, and (c) CERES-EBAF reference data and the difference from the reference data for (d)–(f) MERRA-2, (g)–(i) MERRA-Land, and (j)–(l) MERRA. The statistics span 1982–2011 for MTE and 2000–15 for CERES-EBAF.
The sum of LH and SH approximates the net incoming radiation (after neglecting the ground heat flux and temporal change in
There is no obvious correspondence between the regional biases in the LH (compared to GLEAM or MTE) and the regional biases in
While radiation biases do not appear to be the main predictor of LH biases, biased radiation will result in biased LH and/or SH. Hence, we have partitioned the JJA

The mean JJA radiation terms, from (a)–(c) the CERES-EBAF reference data, and (d)–(f) difference from the reference data for MERRA-2, for (left)
In summary, the patterns of regional LH biases in the reanalyses suggested by GLEAM and MTE are very similar. This result adds confidence to the use of GLEAM and MTE for estimating regional biases in the reanalyses. As with the annual global land averages in Fig. 1, the maps presented here suggest that MERRA-2 and MERRA (but not MERRA-Land) have a general tendency to overestimate LH. If the GLEAM, MTE, and CERES-EBAF regional means are assumed to be more accurate than the reanalyses, the above comparisons suggest that in energy-limited regions, MERRA-2 (and MERRA) overestimates LH as a result of an overestimated evaporative fraction (i.e., too much incoming radiation is converted to LH rather than SH). There is little change in the global average biases from MERRA to MERRA-2. However, there are some isolated regions in Mexico and South Asia that are typified by moisture-limited LH, where MERRA has positive LH biases associated with overestimated EF, while MERRA-2 and MERRA-Land have much smaller biases. The precipitation corrections in MERRA-2 (and MERRA-Land) removed a relatively large amount of precipitation across these locations (Reichle et al. 2017b, their Fig. 3b), strongly suggesting that the use of precipitation observations in these products reduced the LH biases.
2) Air temperature
The biases in the MERRA-2 and MERRA JJA monthly mean daily minimum, daily maximum, and diurnal range in

The mean JJA
The LH and SH biases in Fig. 4 and the DTR biases in Fig. 7 show some of the expected regional similarities. In particular, in the high latitudes and the Amazon MERRA-2 has relatively large positive LH biases (and negative SH biases) and relatively large negative DTR biases. MERRA also has overestimated LH and underestimated DTR in the same regions, as well as in Southeast Asia and Central America. This is consistent with an underestimated DTR caused by underestimated SH (and overestimated LH), particularly given that the
Recall that in section 3c(1) above, the CERES-EBAF comparison suggested that the MERRA-2 (and MERRA)
d. Turbulent heat flux anomaly correlations over boreal summer
Here the monthly mean turbulent heat flux time series are evaluated over boreal summer based on their temporal correlations

Moving on to SH, Figs. 8c,f,i,l show
Globally averaged, the rank order of the mean LH
e. Comparison to FLUXNET tower data
Since the reference datasets used above do not represent direct observations, we now compare the globally averaged LH and SH statistics from section 3a (for the annual mean turbulent heat fluxes over land) and section 3d (for the mean JJA

Bar plot of the mean annual (a) LH and (b) SH across the 20 FLUXNET site locations from M-2, M-L, M, FLUXNET (FlN), MTE, and GLM (LH only), calculated using each dataset at its native resolution (and screened temporally for FLUXNET availability). For the global datasets, circles are plotted for the global land annual mean (taken from Fig. 1).
For SH, the FLUXNET observations agree less well with the global land comparison. First, the annual mean of the FLUXNET data is about 10 W m−2 below the global mean estimates from the other reference datasets. For each of the global reference datasets and reanalyses, the annual average over the 20 FLUXNET sites is also 15–20 W m−2 lower than the global average, suggesting that the relatively low FLUXNET annual mean is associated with the spatial sampling of the FLUXNET sites. Second, averaged across the FLUXNET sites, the FLUXNET mean SH is close to that of MERRA-Land, and above that of MERRA-2 (by 6 W m−2; 18%). In contrast, for the global averages in section 3a the reference datasets were all close to MERRA-2 (and MERRA), with MERRA-Land standing out as being biased high.
Figure 10 shows the JJA

Bar plot of
It is notable that over the FLUXNET tower sites, both GLEAM and MTE have higher average
Note that for FLUXNET,
f. Precipitation corrections and air temperature performance
Finally, we seek to establish whether the precipitation corrections in MERRA-2 influenced the local

The (a) MERRA-2
Comparing Fig. 11c to Fig. 3c, the regions with the strongest sensitivity of
4. Summary and conclusions
The land surface energy budgets of three reanalyses from NASA (MERRA, MERRA-Land, and MERRA-2) are compared here to the best available estimates from the literature and to (largely) independent global reference datasets. In terms of the global land annual averages, the results suggest that the MERRA-2 LH and SH are biased high by 6 and 5 W m−2, respectively, while
Compared to reference flux estimates from GLEAM and MTE over the boreal summer (when both the fluxes themselves and their biases are greatest), the largest MERRA-2 LH biases (>20 W m−2, vs either GLEAM or MTE) occur in regions where LH is energy limited, such as in the high latitudes, the tropics, parts of South Asia, and the eastern United States. The MERRA-2 LH biases are typically smaller in regions where LH is moisture limited, which include the drier regions of the mid and low latitudes. In some of these moisture-limited regions (parts of South Asia and Mexico) the high bias in the MERRA LH was largely removed in MERRA-2 (and MERRA-Land), likely because the observed precipitation used in the latter was lower than that produced by the MERRA (or MERRA-2) modeling systems. Finally, comparison to the evaporative fraction from MTE and to
The temporal agreement between the reanalyses and the reference datasets over boreal summer was measured using the monthly anomaly correlation
The use of observed precipitation in MERRA-2 was motivated by the hope that the subsequent improvements in simulated soil moisture would lead to the improved partitioning of incoming radiation between latent and sensible heating, ultimately leading to improvements in the diurnal evolution of the boundary layer. It is difficult, however, to unequivocally attribute the improvements in MERRA-2 to the use of observed precipitation because MERRA-2 includes many other modeling and assimilation advances relative to MERRA. Nonetheless, many of the improvements in the MERRA-2 LH and
However, some of the largest biases and lowest
Finally, the SH results for MERRA-Land are troubling. While MERRA-Land did have the desired reduction in the LH biases compared to MERRA (to 1 W m−2 in the global land annual average), it also had a compensating, and much larger, increase in the SH bias (up to 15 W m−2 in the global land average). Additionally, the JJA
While this work focused on evaluating surface energy fluxes in MERRA-2, the findings have relevance to anyone interested in designing a methodology to evaluate global estimates of turbulent heat fluxes. The gridded LH reference datasets (GLEAM and MTE) had better agreement with the reanalyses’ time series (as measured by
The GLEAM and MTE reference datasets used here are independent of each other and are based on very different methodologies, thus providing complementary information for use in an evaluation. However, because GLEAM and MERRA-2 share common precipitation input data, and because the MTE data are not optimized to estimate interannual variability, LH estimates from a third reference dataset would be useful. Emerging global and multidecadal land surface flux datasets based on an energy balance approach (Anderson et al. 2011) or on alternative observational frameworks (Alemohammad et al. 2017) would provide useful complements to GLEAM and MTE for a more comprehensive analysis.
Acknowledgments. Funding for this work was provided by the NASA Modeling, Analysis, and Prediction program. Computational resources were provided by the NASA High-End Computing Program through the NASA Center for Climate Simulation. The authors acknowledge the teams that produce and publish the GLEAM, MTE, CERES-EBAF, ERA-Interim, MERRA, MERRA-Land, and MERRA-2 products. Additionally, we are grateful to Diego Miralles (VU University Amsterdam/Ghent University), Martin Jung (Max Planck Institute for Biogeochemistry), and Seiji Kato (NASA Langley Research Center) for their thoughtful feedback on this work and detailed advice on the use of GLEAM, MTE, and CERES-EBAF, respectively. The FLUXNET eddy covariance data processing and harmonization was carried out by the European Fluxes Database Cluster, AmeriFlux Management Project, and Fluxdata project of FLUXNET, with the support of CDIAC and ICOS Ecosystem Thematic Center, and the OzFlux, ChinaFlux, and AsiaFlux offices.
REFERENCES
Alemohammad, S. H., and Coauthors, 2017: Water, Energy, and Carbon with Artificial Neural Networks (WECANN): A statistically based estimate of global surface turbulent fluxes and gross primary productivity using solar-induced fluorescence. Biogeosciences, 14, 4101–4124, https://doi.org/10.5194/bg-14-4101-2017.
Anderson, M., and Coauthors, 2011: Mapping daily evapotranspiration at field to continental scales using geostationary and polar orbiting satellite imagery. Hydrol. Earth Syst. Sci., 15, 223–239, https://doi.org/10.5194/hess-15-223-2011.
Baldocchi, D., and Coauthors, 2001: FLUXNET: A new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull. Amer. Meteor. Soc., 82, 2415–2434, https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2.
Beck, H., A. van Dijk, V. Levizzani, J. Schellekens, D. Miralles, B. Martens, and A. de Roo, 2017: MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data. Hydrol. Earth Syst. Sci., 21, 589–615, https://doi.org/10.5194/hess-21-589-2017.
Betts, A., A. Tawfik, and R. Desjardins, 2017: Revisiting hydrometeorology using cloud and climate observations. J. Hydrometeor., 18, 939–955, https://doi.org/10.1175/JHM-D-16-0203.1.
Chen, M., W. Shi, P. Xie, V. B. S. Silva, V. E. Kousky, R. W. Higgins, and J. E. Janowiak, 2008: Assessing objective techniques for gauge-based analyses of global daily precipitation. J. Geophys. Res., 113, D04110, https://doi.org/10.1029/2007JD009132.
Decker, M., M. Brunke, Z. Wang, K. Sakaguchi, X. Zeng, and M. Bosilovich, 2012: Evaluation of the reanalysis products from GSFC, NCEP, and ECMWF using flux tower observations. J. Climate, 25, 1916–1944, https://doi.org/10.1175/JCLI-D-11-00004.1.
Dee, D., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, https://doi.org/10.1002/qj.828.
De Lannoy, G., and R. Reichle, 2016: Assimilation of SMOS brightness temperatures or soil moisture retrievals into a land surface model. Hydrol. Earth Syst. Sci., 20, 4895–4911, https://doi.org/10.5194/hess-20-4895-2016.
de Rosnay, P., G. Balsamo, C. Albergel, J. Muñoz-Sabater, and L. Isaksen, 2014: Initialisation of land surface variables for numerical weather prediction. Surv. Geophys., 35, 607–621, https://doi.org/10.1007/s10712-012-9207-x.
Dharssi, I., K. Bovis, B. Macpherson, and C. Jones, 2011: Operational assimilation of ASCAT surface soil wetness at the Met Office. Hydrol. Earth Syst. Sci., 15, 2729–2746, https://doi.org/10.5194/hess-15-2729-2011.
Draper, C., J.-F. Mahfouf, and J. Walker, 2011: Root-zone soil moisture from the assimilation of screen-level variables and remotely sensed soil moisture. J. Geophys. Res., 116, D02127, https://doi.org/10.1029/2010JD013829.
Draper, C., R. Reichle, G. De Lannoy, and B. Scarino, 2015: A dynamic approach to addressing observation-minus-forecast bias in a land surface skin temperature data assimilation system. J. Hydrometeor., 16, 449–464, https://doi.org/10.1175/JHM-D-14-0087.1.
FLUXNET, 2015: FLUXNET2015 dataset. Fluxdata, accessed 9 August 2016, http://fluxnet.fluxdata.org/data/fluxnet2015-dataset/.
Gelaro, R., and Coauthors, 2015: Evaluation of the 7-km GEOS-5 nature run. NASA Tech. Memo. NASA/TM-2014-104606, Vol. 36, 285 pp.
Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.
Global Modeling and Assimilation Office, 2008a: tavg1_2d_slv_Nx: MERRA 2D IAU diagnostic, single level meteorology, time average 1-hourly V5.2.0. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/B6DQZQLSFDLH.
Global Modeling and Assimilation Office, 2008b: tavgM_2d_lnd_Nx: MERRA 2D IAU diagnostic, land only states and diagnostics, monthly mean V5.2.0. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/XOHTIIK0W9RK.
Global Modeling and Assimilation Office, 2008c: tavgM_2d_mld_Nx: MERRA simulated 2D incremental analysis update (IAU) MERRA-Land reanalysis, GEOSldas-MERRALand, time average monthly mean V5.2.0. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/K9PCGOMQ1XP1.
Global Modeling and Assimilation Office, 2015a: MERRA-2 tavg1_2d_lfo_Nx: 2D, 1-hourly, time-averaged, single-level, assimilation, land surface forcings V5.12.4. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/L0T5GEG1NYFA.
Global Modeling and Assimilation Office, 2015b: MERRA-2 tavgM_2d_lnd_Nx: 2D, monthly mean, time-averaged, single-level, assimilation, land surface diagnostics V5.12.4. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/8S35XF81C28F.
Global Modeling and Assimilation Office, 2015c: MERRA-2 tavgM_2d_slv_Nx: 2D, monthly mean, time-averaged, single-level, assimilation, single-level diagnostics V5.12.4. GES DISC, accessed 1 October 2016, https://doi.org/10.5067/AP1B0BA5PD2K.
Harris, I., P. Jones, T. Osborn, and D. Lister, 2014a: Updated high-resolution grids of monthly climatic observations—the CRU TS3.10 dataset. Int. J. Climatol., 34, 623–642, https://doi.org/10.1002/joc.3711.
Harris, I., and Coauthors, 2014b: CRU TS3.22: Climatic Research Unit (CRU) time-series (TS) version 3.22 of high resolution gridded data of month-by-month variation in climate (Jan. 1901-Dec. 2013). NCAS British Atmospheric Data Centre, accessed 17 May 2017, https://doi.org/10.5285/18BE23F8-D252-482D-8AF9-5D6A2D40990C.
Huffman, G., R. Adler, D. Bolvin, and G. Gu, 2009: Improving the global precipitation record: GPCP version 2.1. Geophys. Res. Lett., 36, L17808, https://doi.org/10.1029/2009GL040000.
Jiménez, C., and Coauthors, 2011: Global intercomparison of 12 land surface heat flux estimates. J. Geophys. Res., 116, D02102, https://doi.org/10.1029/2010JD014545.
Jung, M., M. Reichstein, and A. Bondeau, 2009: Towards global empirical upscaling of FLUXNET eddy covariance observations: Validation of a model tree ensemble approach using a biosphere model. Biogeosciences, 6, 2001–2013, https://doi.org/10.5194/bg-6-2001-2009.
Jung, M., and Coauthors, 2010: Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature, 467, 951–954, https://doi.org/10.1038/nature09396.
Jung, M., and Coauthors, 2011: Global patterns of land-atmosphere fluxes of carbon dioxide, latent heat, and sensible heat derived from eddy covariance, satellite, and meteorological observations. J. Geophys. Res., 116, G00J07, https://doi.org/10.1029/2010JG001566.
Kato, S., N. Loeb, F. Rose, D. Doelling, D. Rutan, T. Caldwell, L. Yu, and R. Weller, 2013: Surface irradiances consistent with CERES-derived top-of-atmosphere shortwave and longwave irradiances. J. Climate, 26, 2719–2740, https://doi.org/10.1175/JCLI-D-12-00436.1.
Kato, S., and Coauthors, 2017: CERES_EBAF-Surface_Ed4.0: Data quality summary (May 26, 2017). NASA CERES Rep., 29 pp., https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_EBAF-Surface_Ed4.0_DQS.pdf.
Koster, R., and Coauthors, 2006: GLACE: The Global Land–Atmosphere Coupling Experiment. Part I: Overview. J. Hydrometeor., 7, 590–610, https://doi.org/10.1175/JHM510.1.
Koster, R., G. Salvucci, A. Rigden, M. Jung, G. Collatz, and S. Schubert, 2015: The pattern across the continental United States of evapotranspiration variability associated with water availability. Front. Earth Sci., 3, 35, https://doi.org/10.3389/feart.2015.00035.
L’Ecuyer, T., and Coauthors, 2015: The observed state of the energy budget in the early twenty-first century. J. Climate, 28, 8319–8346, https://doi.org/10.1175/JCLI-D-14-00556.1.
Martens, B., and Coauthors, 2017: GLEAM v3: Satellite-based land evaporation and root-zone soil moisture. Geosci. Model Dev., 10, 1903–1925, https://doi.org/10.5194/gmd-10-1903-2017.
Miralles, D., T. Holmes, R. de Jeu, J. Gash, A. Meesters, and A. Dolman, 2011: Global land-surface evaporation estimated from satellite-based observations. Hydrol. Earth Syst. Sci., 15, 453–469, https://doi.org/10.5194/hess-15-453-2011.
Miralles, D., M. van den Berg, A. Teuling, and R. de Jeu, 2012: Soil moisture-temperature coupling: A multiscale observational analysis. Geophys. Res. Lett., 39, L21707, https://doi.org/10.1029/2012GL053703.
Molod, A., L. Takacs, M. Suarez, J. Bacmeister, I.-S. Song, and A. Eichmann, 2012: The GEOS-5 atmospheric general circulation model: Mean climate and development from MERRA to Fortuna. NASA Tech. Memo. NASA/TM-2012-104606, Vol. 28, 117 pp.
Molod, A., L. Takacs, M. Suarez, and J. Bacmeister, 2015: Development of the GEOS-5 atmospheric general circulation model: Evolution from MERRA to MERRA-2. Geosci. Model Dev., 8, 1339–1356, https://doi.org/10.5194/gmd-8-1339-2015.
Mueller, B., and Coauthors, 2011: Evaluation of global observations-based evapotranspiration datasets and IPCC AR4 simulations. Geophys. Res. Lett., 38, L06402, https://doi.org/10.1029/2010GL046230.
Mueller, B., and Coauthors, 2013: Benchmark products for land evapotranspiration: LandFlux-EVAL multi-data set synthesis. Hydrol. Earth Syst. Sci., 17, 3707–3720, https://doi.org/10.5194/hess-17-3707-2013.
NEWS Science Integration Team, 2007: A NASA Earth science implementation plan for energy and water cycle research: Predicting energy and water cycle consequences of Earth system variability and change. NASA Energy and Water Cycle Study Rep., 89 pp., http://news.cisc.gmu.edu/doc/NEWS_implementation.pdf.
Reichle, R., and Q. Liu, 2014: Observation-corrected precipitation estimates in GEOS-5. NASA Tech. Memo. NASA/TM-2014-104606, Vol. 35, 18 pp.
Reichle, R., R. Koster, G. De Lannoy, B. Forman, Q. Liu, S. Mahanama, and A. Toure, 2011: Assessment and enhancement of MERRA land surface hydrology estimates. J. Climate, 24, 6322–6338, https://doi.org/10.1175/JCLI-D-10-05033.1.
Reichle, R., C. Draper, Q. Liu, M. Girotto, S. Mahanama, R. Koster, and G. De Lannoy, 2017a: Assessment of MERRA-2 land surface hydrology estimates. J. Climate, 30, 2937–2960, https://doi.org/10.1175/JCLI-D-16-0720.1.
Reichle, R., Q. Liu, R. Koster, C. Draper, S. Mahanama, and G. Partyka, 2017b: Land surface precipitation in MERRA-2. J. Climate, 30, 1643–1664, https://doi.org/10.1175/JCLI-D-16-0570.1.
Rienecker, M., and Coauthors, 2011: MERRA: NASA’s Modern-Era Retrospective Analysis for Research and Applications. J. Climate, 24, 3624–3648, https://doi.org/10.1175/JCLI-D-11-00015.1.
Schlosser, C., and X. Gao, 2010: Assessing evapotranspiration estimates from the Second Global Soil Wetness Project (GSWP-2) simulations. J. Hydrometeor., 11, 880–897, https://doi.org/10.1175/2010JHM1203.1.
Trenberth, K., J. Fasullo, and J. Kiehl, 2009: Earth’s global energy budget. Bull. Amer. Meteor. Soc., 90, 311–323, https://doi.org/10.1175/2008BAMS2634.1.
Wang, K., and R. Dickinson, 2013: Global atmospheric downward longwave radiation at the surface from ground-based observations, satellite retrievals, and reanalyses. Rev. Geophys., 51, 150–185, https://doi.org/10.1002/rog.20009.
Wild, M., D. Folini, and M. Hakuba, 2015: The energy balance over land and oceans: An assessment based on direct observations and CMIP5 climate models. Climate Dyn., 44, 3393–3429, https://doi.org/10.1007/s00382-014-2430-z.
Wilson, K., and Coauthors, 2002: Energy balance closure at FLUXNET sites. Agric. For. Meteor., 113, 223–243, https://doi.org/10.1016/S0168-1923(02)00109-0.
Xie, P., and P. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539–2558, https://doi.org/10.1175/1520-0477(1997)078<2539:GPAYMA>2.0.CO;2.