Soil moisture observations from seven observational networks (spanning portions of seven states) with different biome and climate conditions were used in this study to evaluate multimodel simulated soil moisture products. The four land surface models, including Noah, Mosaic, Sacramento soil moisture accounting (SAC), and the Variable Infiltration Capacity model (VIC), were run within phase 2 of the North American Land Data Assimilation System (NLDAS-2), with a ⅛° spatial resolution and hourly temporal resolution. Hundreds of sites in Alabama, Colorado, Michigan, Nebraska, Oklahoma, West Texas, and Utah were used to evaluate simulated soil moisture in the 0–10-, 10–40-, and 40–100-cm soil layers. Soil moisture was spatially averaged in each state to reduce noise. In general, the four models captured broad features (e.g., seasonal variation) of soil moisture variations in all three soil layers in seven states, except for the 10–40-cm soil layer in West Texas and the 40–100-cm soil layer in Alabama, where the anomaly correlations are weak. Overall, Mosaic, SAC, and the ensemble mean have the highest simulation skill and VIC has the lowest simulation skill. The results show that Noah and VIC are wetter than the observations while Mosaic and SAC are drier than the observations, most likely because of systematic errors in model evapotranspiration.
Soil moisture information is valuable for weather and climate prediction (de Goncalves et al. 2006; de Rosnay et al. 2013; Koster et al. 2009; Yang et al. 2011), flood control (Pal and Eltahir 2002; Martinis et al. 2009; Koster et al. 2014), slope failure control (Ray et al. 2010), reservoir management (Maurer and Lettenmaier 2004), geotechnical engineering, water quality monitoring, and drought monitoring (Atlas et al. 1993; Mo and Lettenmaier 2014; Xia et al. 2014a). Soil moisture largely controls evapotranspiration (ET) and total runoff. Soil moisture also controls the exchange of heat energy between the land surface and the atmosphere by affecting latent heat and ground heat flux. Ek and Holtslag (2004) have comprehensively described how soil moisture affects latent heat, soil temperature, land surface temperature (upward longwave radiation), and sensible and latent heat fluxes from a soil thermodynamics viewpoint and how soil moisture affects ET and total runoff from a soil hydrology viewpoint. Simulations with numerical weather prediction models have shown that improved characterization of surface soil moisture, vegetation, and temperature can lead to significant forecast improvements (Case et al. 2011; Drusch 2007; Kumar et al. 2014; Li et al. 2014). Soil moisture has the potential to impact convective precipitation by influencing energy and moisture fluxes (Alfieri et al. 2008) and thus modifying planetary boundary layer depth (Santanello et al. 2009) and atmospheric instability (Frye and Mote 2010).
Soil moisture data mainly come from 1) in situ observations, 2) remote sensing information (satellite and aircraft), and 3) climate and land surface models. The North American Soil Moisture Database (NASMD) contains data from more than 27 observational networks in the United States and Canada, comprising over 1800 stations observing in situ soil moisture. The database includes soil moisture observations at a variety of soil depths, with the majority of the data starting in 2000 (Ford and Quiring 2013). The International Soil Moisture Network (ISMN) contains data from 35 networks, which together include more than 1400 sites around the world (Dorigo et al. 2013) with an hourly-to-weekly temporal resolution. The ISMN data quality has been assessed by Dorigo et al. (2013), and measurement errors have been comprehensively evaluated by Gruber et al. (2013). Although in situ observations are limited and not spatially extensive, they are an indispensable source of information to calibrate and validate satellite- and model-based soil moisture estimates. Soil moisture retrievals from satellite-based remote sensing instruments such as the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E; Reichle et al. 2007), the Advanced Scatterometer (ASCAT; Wagner et al. 1999), and Soil Moisture and Ocean Salinity (SMOS; Kerr et al. 2001) offer observations at global and regional scales; however, they capture soil moisture only in the upper few centimeters over sparsely vegetated areas. Efforts are being made to extrapolate these measurements to the root zone (Gao et al. 2007; Reichle et al. 2007; Sabater et al. 2007; Drusch et al. 2009; Ford et al. 2014b). Soil moisture derived from microwave brightness temperature measured by an airborne L-band Push Broom Microwave Radiometer (PBMR) can have a very high spatial resolution at watershed scale (Peters-Lidard et al. 2008).
Thermal-based retrievals also offer potential to infer root-zone soil moisture (Hain et al. 2009) and can complement microwave retrievals (Li et al. 2010).
Model-based soil moisture products can be produced using real-time/retrospective weather and climate systems (e.g., reanalysis products) or offline land surface models. Real-time forecast systems may also be differentiated by what they assimilate, such as screen-level data in the European Centre for Medium-Range Weather Forecasts (ECMWF) operational system (Drusch and Viterbo 2007) or satellite soil moisture retrievals in the Met Office (UKMO) operational global model (Dharssi et al. 2011). Offline simulations can be used as an experimental tool because it is easier to compare models when they are all run with the same forcing data. However, model-based soil moisture products are largely dependent on accurate surface forcing data (e.g., precipitation, radiation, and air temperature) and reasonable land models and parameterization schemes.
Phase 2 of the North American Land Data Assimilation System (NLDAS-2) is an offline modeling system, running four land models [Noah, Mosaic, Sacramento soil moisture accounting (SAC), and the Variable Infiltration Capacity model (VIC)] within NLDAS-2 on a ⅛° grid over the continental United States. Daily gauge-based precipitation is quite reliable in the United States and is used in NLDAS-2 (Xia et al. 2012a). The other forcing data are obtained from North American Regional Reanalysis (NARR; Mesinger et al. 2006) with a bias-correction process for downward solar radiation. Temporal disaggregation is employed for daily gauge-based precipitation by using hourly satellite and radar observations (Xia et al. 2012a). Noah is the land model of the National Centers for Environmental Prediction (NCEP) operational regional and global weather and climate models (Chen et al. 1997; Betts et al. 1997; Ek et al. 2003). Mosaic is the land model for the NASA global climate model (Koster and Suarez 1994, 1996), but it has been replaced by the Catchment land surface model for the recent upgrade of NASA’s GEOS-5 (Reale et al. 2009). VIC was developed as a large-scale, grid-based, semidistributed hydrological model that solves the full water and energy balances (Liang et al. 1994; Wood et al. 1997). SAC is a grid-based semidistributed hydrological model that solves a full water balance (Koren et al. 1999) based on a lumped conceptual hydrology model (Burnash et al. 1973). SAC is calibrated for small catchments and used operationally in National Weather Service (NWS) River Forecast Centers (RFCs). Noah and Mosaic were developed within the surface–vegetation–atmosphere transfer scheme community for coupled land–atmospheric modeling. These schemes focus on the interaction between land and atmosphere through surface energy and water flux exchange. 
SAC and VIC were developed by the hydrological community as uncoupled hydrological models with a focus on prediction of variables such as streamflow. NLDAS-2 has generated long-term (>35 years) hourly soil moisture products at four soil layers down to 2 m over the continental United States to support operational drought monitoring (Mo and Lettenmaier 2014; Xia et al. 2014a). These soil moisture products have been evaluated using in situ observations in Illinois and Oklahoma (Xia et al. 2014b). However, the previous evaluation by Xia et al. (2014b) used only 6 years (1997–2002) rather than 13 years (1999–2012) of observations.
The recently released NASMD provides a new opportunity to reevaluate NLDAS-2 soil moisture products using a more comprehensive set of in situ observations. Furthermore, updated soil property information has been used to improve the accuracy of soil moisture measurements for the Oklahoma Mesonet (Scott et al. 2013; Ochsner et al. 2013). This provides a high-quality dataset with over 13 years of observations with which to evaluate NLDAS-2 products. This study evaluates NLDAS-2 soil moisture using in situ soil moisture observations from more than 385 sites in the continental United States from 1999 to 2012. This is the first part of a pair of papers. The companion paper (Xia et al. 2015a) investigates the impact of soil texture and vegetation type mismatches (site observed vs model default) on soil moisture simulation in Noah. After the description of data and methods used in this study, soil moisture from the four models and their ensemble means are evaluated using in situ data from 385 sites in seven states. This is followed by an evaluation of seasonal variations in anomaly correlation and error metrics, a comparison of the simulated and observed soil moisture climatology at different soil layers, and a discussion of the reasons for the differences in performance among the four models.
2. Data and methods
a. NASMD observed daily soil moisture
NASMD is the quality-controlled soil moisture database developed and maintained by Texas A&M University. Soil moisture data are collected from numerous networks across Canada, Mexico, and the United States, covering a vast array of soil textures, land cover, elevation, and climate regimes. Currently, NASMD includes data from more than 27 observational networks and 1800 sites. NASMD also provides information about the soil and vegetation characteristics at each site. NASMD uses the land-cover classification scheme provided by the Environmental Protection Agency’s National Land Cover Data 2001 classes (http://www.epa.gov/mrlc/classification.html). If soil information is not provided by the observation network, these parameters are estimated from the U.S. Department of Agriculture Soil Survey Geographic Database (SSURGO). Soil moisture data from some of the networks that are archived in NASMD have previously been used for evaluating model-simulated soil moisture, including the Oklahoma Mesonet (Robock et al. 2003; Ford and Quiring 2014a,b; Xia et al. 2014b), Illinois Climate Network (Schaake et al. 2004; Fan et al. 2006; Xia et al. 2014b), and Soil Climate Analysis Network (SCAN; Liu et al. 2011; Xia et al. 2014b). In this study, observation networks in seven states were selected on the basis of 1) a high site density for each network, 2) consistent use of sensors within each network (to ensure the same measurement errors), and 3) at least 5 years of data. The seven regions that have networks that met these criteria are Alabama (AL), Colorado (CO), Michigan (MI), Nebraska (NE), Oklahoma (OK), West Texas (WTX), and Utah (UT). This subset consists of 385 sites (Fig. 1). The soil moisture sensors, measurement error, number of depths where measurements are made, length of record, and website/contact information for all seven networks are listed in Table 1. Each of the networks has a single operator, with the exception of AL, which includes stations from both SCAN and U.S. Climate Reference Network (USCRN). Both SCAN and USCRN employ Stevens Hydra Probes at 5-, 10-, 20-, 50-, and 100-cm depth, and thus SCAN and USCRN data were evaluated together in AL. The measurement error is estimated to be ~0.03 m3 m−3 for most networks, except for SCAN and USCRN (0.03–0.05 m3 m−3) and SNOTEL (0.03–0.06 m3 m−3). The network operators confirmed that most sensors do not accurately detect soil moisture content when the water is frozen in the soil. Therefore, measurements in Colorado, Michigan, Nebraska, and Utah are excluded during the cold season (October–April).
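The cold-season exclusion can be expressed as a simple date filter. The following is a minimal sketch, not the screening code used in this study; the function name `keep_observation` and the two constants are hypothetical, and the region codes follow the abbreviations defined above.

```python
from datetime import date

# Cold season as defined in the text: October through April.
COLD_SEASON_MONTHS = frozenset({10, 11, 12, 1, 2, 3, 4})

# Regions whose cold-season measurements are excluded in the text.
FROZEN_PRONE_REGIONS = frozenset({"CO", "MI", "NE", "UT"})


def keep_observation(region: str, day: date) -> bool:
    """Return False for cold-season days in regions where frozen soil
    makes the sensor readings unreliable, True otherwise."""
    return not (region in FROZEN_PRONE_REGIONS and day.month in COLD_SEASON_MONTHS)


if __name__ == "__main__":
    print(keep_observation("CO", date(2005, 1, 15)))  # cold season, excluded region
    print(keep_observation("OK", date(2005, 1, 15)))  # region not subject to exclusion
```

Under this sketch, AL, OK, and WTX keep their full records, which matches the statement that only Colorado, Michigan, Nebraska, and Utah are screened.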
b. NLDAS-2 simulated daily soil moisture
1) NLDAS-2 background
Meteorological forcing data (i.e., downward shortwave radiation, downward longwave radiation, 2-m air temperature, 2-m air specific humidity, surface precipitation, surface pressure, and 10-m wind speed) are generated from the NARR product (Mesinger et al. 2006) through a series of processes unique to NLDAS-2 (Xia et al. 2012a,b). These processes include spatial and temporal downscaling, topographic adjustment for 2-m air temperature and 10-m specific humidity, replacement of the NARR precipitation with Climate Prediction Center (CPC) gauge precipitation (Chen et al. 2008), bias correction of the gauge precipitation with monthly Parameter–Elevation Regressions on Independent Slopes Model (PRISM) product (Daly et al. 1994), and bias correction of the NARR downward shortwave radiation with GOES-8 retrievals (Pinker et al. 2003). The details are described in Cosgrove et al. (2003) and at the NCEP/Environmental Modeling Center (EMC) NLDAS-2 website (http://www.emc.ncep.noaa.gov/mmb/nldas/LDAS8th/forcing/forcing_narr.shtml). The meteorological forcing data extend from 1 January 1979 to present with an hourly temporal resolution and ⅛° spatial resolution (same as the NLDAS-2 grid). NLDAS-2 uses these forcing data to drive four land surface models to produce energy fluxes, water fluxes, and state variables (e.g., soil moisture). Noah, Mosaic, SAC, and VIC simulate soil moisture in different ways.
Noah has four soil layers: 0–10, 10–40, 40–100, and 100–200 cm and simulates soil moisture at the middle of each soil layer (5, 25, 70, and 150 cm). The physics of water movement between the layers is governed by a discrete representation of Richards’ equation (Richards 1931), except that infiltration is governed by a conceptual parameterization that considers heterogeneity over the area of precipitation and the local potential for infiltration. The ET process largely affects soil moisture variation through both bare soil evaporation and vegetation transpiration (e.g., through vegetation roots in soils). All models except SAC use the Penman–Monteith equation (Monteith 1965) to calculate ET processes, as SAC is a hydrological model without energy budget calculation. Noah uses a dominant vegetation type with a varied root depth (e.g., 100 cm for grassland and 200 cm for forest and woodland) at a given grid cell, and Mosaic and VIC may have several vegetation types at a given grid cell with a tiling method (Mitchell et al. 2004). Mosaic uses a constant root depth of 40 cm for all vegetation types, and VIC uses a varied root depth from 135 to 300 cm (see Mitchell et al. 2004).
Mosaic has three soil layers: 0–10, 10–40, and 40–200, with soil moisture simulated in the middle of the three soil layers (5, 25, and 120 cm). Each grid box is further divided into a maximum of 10 tiles representing different vegetation. The soil water and energy balances in each tile are simulated independently. Water movement between the layers uses the one-dimensional Richards’ equation. Soil moisture at each soil depth is calculated as a weighted average of the soil moisture from all tiles.
VIC also has three soil layers, with a 10-cm top layer and spatially varying depths for the other two layers (upper zone and lower zone). The root zone can span all three layers, depending on vegetation types (forest, grass, etc.). The upper and lower zones are operated as the conceptual water storages. The top 10-cm soil layer is intended to capture the fast dynamics of water movement near the land surface. Water can only be extracted from this layer through ET. The upper zone soil water storage determines the partitioning of rainfall into surface runoff and infiltration. The lower zone soil water storage determines the amount of base flow. Like Mosaic, VIC uses subgrid vegetation tiles to represent the spatial heterogeneity of soil moisture distribution within the grid cell and to determine surface runoff, infiltration, and base flow. Because water storage in the upper and lower zones is conceptual and does not correspond explicitly to soil layers, their storage capacities must be determined through model calibration. VIC storage capacities and other parameters were calibrated using retrospective monthly historical hydrometeorological data (Nijssen et al. 1997).
SAC is also a storage-type model, which is conceptually different from the other three models as it does not include energy budget computation, although it is similar to VIC from a hydrologic viewpoint. It represents water storage using five conceptual water storage components divided into upper and lower zones, which are further separated into tension and free water storage components. The free water storage of the lower layer is further divided into two substorages that control supplemental (fast) and primary (slow) groundwater flows. Additionally, the model features a sixth variable water storage component that accounts for the effects of varying areas of saturation near streams. Together, these components represent the active part of water storage in each grid cell.
2) Translation of simulated soil moisture data to common soil layers
A direct comparison with observations and among models is not possible because the measurements are made at different soil depths for the seven networks (Table 1) and the simulated soil moisture from the four models is defined at different soil layers. For SAC, there is no specified soil layering scheme as discussed above. To overcome this, we used a simple linear interpolation technique or rescaling method to match the in situ measurements with the four Noah soil layers (depths to the middle of the soil layers are 5, 25, 70, and 150 cm). The simulated soil moisture data from the other three models were also interpolated to match Noah. Noah was chosen as the target because its soil layers are uniform across the domain and it has the greatest number of layers across the models (four), allowing for a more detailed vertical comparison with the measurements. Mosaic data were converted to Noah soil layers using a simple linear interpolation. VIC soil moisture data were transferred to Noah layers by calculating the weighted average of soil moisture in each VIC layer that intersected each Noah layer. For grid cells where the deepest VIC layer was shallower than the deepest Noah layer, VIC soil moisture was assumed to be uniform down to the bottom of the Noah layer.
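The weighted-average transfer described for VIC can be sketched as a thickness-weighted overlap calculation. The example below is an illustration under stated assumptions rather than the NLDAS-2 processing code: the function names are hypothetical, soil moisture is assumed to be volumetric and uniform within each source layer, the weights are taken to be the overlap thicknesses, and the source layer depths in the usage example are invented for illustration.

```python
from typing import List, Sequence, Tuple

Layer = Tuple[float, float]  # (top_cm, bottom_cm)

# Noah layer bounds as given in the text.
NOAH_LAYERS: Tuple[Layer, ...] = ((0, 10), (10, 40), (40, 100), (100, 200))


def overlap(a: Layer, b: Layer) -> float:
    """Thickness (cm) of the intersection of two layers; 0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def remap_to_layers(
    src_layers: Sequence[Layer],
    src_values: Sequence[float],
    dst_layers: Sequence[Layer] = NOAH_LAYERS,
) -> List[float]:
    """Overlap-weighted average of source-layer soil moisture on target layers."""
    if len(src_layers) != len(src_values):
        raise ValueError("one value is required per source layer")
    src = [(float(top), float(bottom)) for top, bottom in src_layers]
    # If the source profile is too shallow, treat its deepest layer as
    # uniform down to the bottom of the deepest target layer.
    deepest_target = float(dst_layers[-1][1])
    if src[-1][1] < deepest_target:
        src[-1] = (src[-1][0], deepest_target)
    result = []
    for dst in dst_layers:
        weights = [overlap(layer, dst) for layer in src]
        total = sum(weights)
        if total == 0:
            raise ValueError("target layer does not intersect any source layer")
        result.append(sum(w * v for w, v in zip(weights, src_values)) / total)
    return result


if __name__ == "__main__":
    # Hypothetical three-layer profile (depths in cm, values in m3 m-3).
    print(remap_to_layers([(0, 10), (10, 60), (60, 150)], [0.30, 0.20, 0.10]))
```

In this hypothetical profile, the 40–100-cm target layer draws 20 cm from the second source layer and 40 cm from the third, and the 100–200-cm layer takes the value of the deepest source layer because that layer is extended downward.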
Unlike the other NLDAS-2 models, the water storage components of SAC are not tied to any soil depth or thickness. This characteristic complicates intercomparison with other models and validation against soil moisture observations. This issue was addressed as part of a recent frozen ground physics upgrade, as SAC gained the ability to map the conceptual water storages to distinct soil layers [SAC–Heat Transfer (HT); Koren et al. 2010]. This upgrade has allowed for the accurate simulation of the vertical profile of soil moisture and soil temperature; however, it is not yet used in NLDAS-2. As an interim solution for NLDAS-2, soil moisture output was computed at distinct soil layers using a postprocessing technique. The model parameter–soil property relationships were used to convert the upper and lower soil moisture capacities into soil moisture contents at a number of soil layers. A physically based HT component was used to determine the distribution of liquid/frozen water in the layered soil column. Five layer depths (i.e., 0–40, 40–80, 80–120, 120–160, and 160–200 cm) are defined a priori to cover a 2-m soil profile with thinner layers closer to the soil surface. The layered soil moisture contents from the HT component were then interpolated by weighted average to the same layers as Noah (Xia et al. 2014b). See the NLDAS-2 website for more details (ftp://ldas.ncep.noaa.gov/nldas2/sac_sm/Readme_Post-Processed_SAC_Soil_Moisture.pdf). The postprocessing software can be obtained from the NCEP/EMC Land-Hydrology group.
c. Evaluation method
Comparisons of simulated and observed soil moisture can be done in two ways. The first is a direct comparison between the simulated soil moisture and observations at each individual site, but this kind of comparison is problematic because of the differences in spatial scale. Spatial variations in soil moisture are related to small-scale hydrological processes, soil characteristics, and vegetation cover (Crow and Wood 1999). These small-scale soil moisture heterogeneities can lead to bias when observations are directly compared with model simulations. An alternative is to use a simple spatial average for the observed and simulated soil moisture values, which is used in this study. This method can reduce the spatial noise and provide a more meaningful comparison, although at the expense of averaging out error (i.e., bias) characteristics. Since some networks (e.g., SCAN) have highly variable periods of record for different stations and depths, the spatial average is biased toward stations with long periods of records. Xia et al. (2014b) estimated this error to be less than 0.05 m3 m−3 based on comparisons in the U.S. Midwest. This simple spatial average method has been used in many similar validation studies (Entin et al. 1999; Robock et al. 2003; Fan et al. 2006; Xia et al. 2014b). The advanced soil moisture upscaling techniques suggested by Crow et al. (2011) are more robust methods for reducing sampling errors and comparing point observations with grid cells. These techniques could be used in the future when extra-high-resolution (e.g., 1 km) NLDAS soil moisture products are available.
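The simple spatial average can be sketched as a per-day mean over whichever stations report on that day. This is a minimal illustration, not the processing code used in this study; the function name `regional_mean`, the `min_stations` threshold, and the station values are hypothetical, and missing observations are assumed to be encoded as `None`.

```python
from typing import Dict, List, Optional


def regional_mean(
    station_series: Dict[str, List[Optional[float]]],
    min_stations: int = 1,
) -> List[Optional[float]]:
    """Average daily soil moisture across stations, skipping missing values.

    Each station supplies one value per day on a common calendar; None marks
    a missing observation. Days with fewer than `min_stations` reports
    (or with no reports at all) yield None.
    """
    lengths = {len(series) for series in station_series.values()}
    if len(lengths) != 1:
        raise ValueError("all stations must share the same non-empty calendar")
    n_days = lengths.pop()
    averaged: List[Optional[float]] = []
    for day in range(n_days):
        values = [s[day] for s in station_series.values() if s[day] is not None]
        if values and len(values) >= min_stations:
            averaged.append(sum(values) / len(values))
        else:
            averaged.append(None)
    return averaged


if __name__ == "__main__":
    stations = {"site_a": [0.20, 0.30, None], "site_b": [0.40, None, None]}
    print(regional_mean(stations))
```

Because the set of reporting stations changes from day to day, the regional value on sparsely observed days reflects only the stations with long records, which is the sampling bias noted above.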
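The evaluation metrics used in this study include root-mean-square error (RMSE), Bias, relative bias (RB), anomaly correlation (AC), and the Taylor skill score S. It should be noted that the NLDAS-2 simulated and observed soil moisture are spatially averaged from multiple stations for each region (Table 1, Fig. 1). The soil moisture anomaly is the temporal anomaly after the mean seasonal cycle is removed. NLDAS-2 soil moisture is compared to the measured soil moisture with the RMSE, Bias, RB, and AC:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(M_i - O_i\right)^2},$$
$$\mathrm{Bias} = \frac{1}{N}\sum_{i=1}^{N}\left(M_i - O_i\right),$$
$$\mathrm{RB} = \frac{\mathrm{Bias}}{\overline{O}},$$
$$\mathrm{AC} = \frac{\sum_{i=1}^{N}\left(M_i - \overline{M}\right)\left(O_i - \overline{O}\right)}{\sqrt{\sum_{i=1}^{N}\left(M_i - \overline{M}\right)^2 \sum_{i=1}^{N}\left(O_i - \overline{O}\right)^2}},$$
and with the S criterion (Taylor 2001)
$$S = \frac{4\,(1 + R)}{\left(\hat{\sigma} + 1/\hat{\sigma}\right)^2 \left(1 + R_0\right)},$$
where N is the total number of days of soil moisture observation, $M_i$ is the simulated soil moisture (anomaly), $O_i$ is the observed soil moisture (anomaly), $\overline{O}$ is the temporal average of observed soil moisture (anomaly), $\overline{M}$ is the temporal average of simulated soil moisture anomaly, R is the correlation between the simulated and observed soil moisture, $R_0$ is the theoretical maximum correlation (assumed to be 1), and $\hat{\sigma}$ is the standard deviation of simulated soil moisture normalized by the standard deviation of observed soil moisture.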
The RMSE is used to assess overall error when NLDAS-2 soil moisture is compared with the observations, and Bias and relative bias (Lohmann et al. 2004; Xia et al. 2012b) are used to detect the model systematic error. Anomaly correlation is used to evaluate model capacity to capture daily variability of the observed soil moisture, and S is used to evaluate model ability to capture both seasonal variability and variance of the observed soil moisture. When the model variance approaches the observed variance and R is close to its theoretical maximum, S approaches unity (a perfect score). When the correlation becomes more negative or the model variance approaches either zero or infinity, S decreases toward zero (no skill). The variable S increases linearly with the correlation when variance is fixed. For a given R, S is proportional to the variance when the model variance is small, and S is inversely proportional to the variance when the model variance is large.
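The metrics discussed above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the evaluation software used in this study: the function names are hypothetical, the series are assumed to be paired daily values with no missing data, the relative bias is assumed to be Bias divided by the observed temporal mean, and the skill score is assumed to take the form 4(1 + R) / [(σ + 1/σ)² (1 + R0)] with R0 = 1, where σ is the ratio of simulated to observed standard deviation. Applying `correlation` to anomaly series gives the anomaly correlation.

```python
import math
from typing import Sequence


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def _sum_sq_dev(x: Sequence[float]) -> float:
    """Sum of squared deviations from the mean."""
    m = _mean(x)
    return sum((v - m) ** 2 for v in x)


def rmse(sim: Sequence[float], obs: Sequence[float]) -> float:
    return math.sqrt(_mean([(m - o) ** 2 for m, o in zip(sim, obs)]))


def bias(sim: Sequence[float], obs: Sequence[float]) -> float:
    return _mean([m - o for m, o in zip(sim, obs)])


def relative_bias(sim: Sequence[float], obs: Sequence[float]) -> float:
    # Assumption: Bias normalized by the observed temporal mean.
    return bias(sim, obs) / _mean(obs)


def correlation(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Pearson correlation; pass anomaly series to obtain the AC."""
    ms, mo = _mean(sim), _mean(obs)
    cov = sum((m - ms) * (o - mo) for m, o in zip(sim, obs))
    denom = math.sqrt(_sum_sq_dev(sim) * _sum_sq_dev(obs))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / denom


def taylor_skill(sim: Sequence[float], obs: Sequence[float], r0: float = 1.0) -> float:
    r = correlation(sim, obs)  # also guarantees both variances are nonzero
    sigma = math.sqrt(_sum_sq_dev(sim) / _sum_sq_dev(obs))  # normalized std dev
    return 4.0 * (1.0 + r) / ((sigma + 1.0 / sigma) ** 2 * (1.0 + r0))


if __name__ == "__main__":
    obs = [0.20, 0.25, 0.30, 0.25]          # hypothetical values, m3 m-3
    wet = [o + 0.05 for o in obs]           # constant wet offset
    print(rmse(wet, obs), bias(wet, obs), relative_bias(wet, obs))
    print(correlation(wet, obs), taylor_skill(wet, obs))
```

The usage example illustrates why both kinds of metric are needed: a simulation with a constant wet offset keeps a correlation and skill score of essentially 1 while its RMSE and Bias both equal the offset.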
a. Overall evaluation and comparison
Taylor skill score for the four models and their ensemble mean for all seven states is calculated using observed and simulated soil moisture for a period of 5–14 years (Fig. 2), depending on the different networks (Table 1). AL and UT show considerably lower S values for 5-cm soil moisture than the other regions, although model-to-model variability within AL and UT is quite high. Generally speaking, Mosaic and SAC show the best performance at 5- and 25-cm soil moisture, with VIC consistently underperforming at these depths. All models tend to underperform in WTX for 25 cm. This is potentially due to the effect of frozen soils and the other data quality problems that were not removed in WTX. Xia et al. (2015b) demonstrated that skill scores between models and observations improved when the frozen soil and the other data quality problem effects were removed. The actual reason remains unclear and needs further investigation. VIC performs much better for 70-cm soil moisture; however, the multimodel ensemble performs best at this depth.
Daily anomaly correlations (Fig. 3) show variability between depths, regions, and models, similar to that in S values. Based on anomaly correlations, Mosaic and multimodel ensemble mean (MM) show the strongest correspondence at the 5-cm depth, with VIC again consistently underperforming. VIC performance is generally improved at the 25-cm soil depth, but it is quite variable between regions. As with S values, anomaly correlation performance varies the least among models at the 70-cm depth. In general, Mosaic, SAC, and MM showed consistently better performance than Noah and VIC, based on Taylor skill score and anomaly correlations (Table 2). The causes of the variations in performance of the four models remain unclear and will require further investigation.
Analysis of Taylor skill score and anomaly correlations provides an evaluation of a model’s ability to capture soil moisture variability but cannot be used to diagnose model error. For this purpose, RMSE, Bias, and relative bias are calculated for each depth, region, and model and are summarized in Tables 3 and 4 for 5- and 25-cm depths, respectively. For 5- and 25-cm soil moisture, intermodel variability of RMSE scores was lowest in AL, CO, NE, and WTX. SAC exhibited anomalously high RMSE values in MI (5 cm) and OK (5 and 25 cm), while VIC performed poorly in UT (5 cm). Values of Bias and relative bias show that SAC and Mosaic consistently underestimated 5- and 25-cm soil moisture, while VIC and Noah somewhat less consistently overestimated 5- and 25-cm soil moisture. Despite consistently drier soils at 25 cm, Mosaic noticeably outperformed the other models in CO and UT, and on average exhibited the smallest 25-cm Bias values. Noah outperformed the other models in AL, NE, OK, and WTX despite consistently wetter 5-cm soils. Interestingly, SAC dramatically underperformed with regard to the other models in MI and OK at 5 cm and AL, NE, OK, and WTX at 25 cm. For both depths, SAC exhibited on average the highest RMSE and Bias values.
Soil moisture at 70 cm was only available at AL, NE, OK, and WTX. At 70 cm, Noah, Mosaic, and SAC consistently underestimated soil moisture, while VIC did so only in AL and NE. Similar to 5- and 25-cm depths, SAC exhibited the highest RMSE values at 70 cm, although the difference was less in NE. When the RMSE values (Tables 3–5) are larger than the corresponding instrument measurement errors for a given state (Table 1), this indicates that the simulation errors come from model limitations such as inadequate model physical parameterizations, inappropriate model parameter values, and inaccurate forcing data. Another source of error is that associated with comparing point observations at a single depth to model layers.
The general pattern observed from RMSE and Bias values is that Noah and VIC overestimate soil moisture, while Mosaic and SAC underestimate soil moisture. This makes the MM estimate closest to the observed soil moisture. The soil moisture overestimation (underestimation) in Noah and VIC (Mosaic and SAC) is partially due to small (large) ET in Noah and VIC (Mosaic and SAC), which will be discussed in section 3e. According to the measurement error estimates in Table 1, most of the errors generated by all models and MM are greater than the measurement error range, suggesting that models and model forcings still have room to improve.
b. Seasonal variation of statistics
The seven regions are divided into two groups for the analysis of monthly variation of daily anomaly correlation with depth because AL, NE, OK, and WTX have in situ soil moisture observations at all depths while CO, MI, and UT have in situ soil moisture observations at only one or two depths. Anomaly correlations vary seasonally, by depth, and from state to state (Fig. 4). Correlations are larger in the warm season (from May to September) than in the cold season (from October to April) for all four models. The total soil water includes water in liquid and solid phases. Even though the models can simulate the total water, the sensors can usually only measure the liquid part. In the cold season, when soil is frozen, the measurements give the liquid soil moisture only, while the models give the total soil moisture. For this reason, the Nebraska network developer suggested that measured soil moisture in the cold season should not be used (K. Hubbard 2014, personal communication). Therefore, NE includes warm season soil moisture only. Even if the soil is seldom frozen, the inconsistency between measurements and model outputs in the cold season will affect the anomaly correlation (Xia et al. 2014b). The same issue could arise in WTX in the shallower soil depths, but all data were used in this study.
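The monthly variation of daily anomaly correlation can be sketched as follows. This is a minimal illustration rather than the procedure used to produce Fig. 4: the function names are hypothetical, the mean seasonal cycle is assumed to be the multiyear mean for each calendar day, and the correlation for a given month is assumed to pool the daily anomalies of that month from all years.

```python
import math
from collections import defaultdict
from datetime import date
from typing import Dict, List, Sequence


def remove_seasonal_cycle(dates: Sequence[date], values: Sequence[float]) -> List[float]:
    """Subtract the multiyear mean of each calendar day (month, day)."""
    sums: Dict[tuple, float] = defaultdict(float)
    counts: Dict[tuple, int] = defaultdict(int)
    for d, v in zip(dates, values):
        sums[(d.month, d.day)] += v
        counts[(d.month, d.day)] += 1
    return [v - sums[(d.month, d.day)] / counts[(d.month, d.day)]
            for d, v in zip(dates, values)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; NaN when either series has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom > 0 else float("nan")


def monthly_anomaly_correlation(
    dates: Sequence[date], sim: Sequence[float], obs: Sequence[float]
) -> Dict[int, float]:
    """Anomaly correlation per calendar month, pooling all years."""
    sim_anom = remove_seasonal_cycle(dates, sim)
    obs_anom = remove_seasonal_cycle(dates, obs)
    by_month: Dict[int, tuple] = defaultdict(lambda: ([], []))
    for d, s, o in zip(dates, sim_anom, obs_anom):
        by_month[d.month][0].append(s)
        by_month[d.month][1].append(o)
    return {month: pearson(s, o) for month, (s, o) in sorted(by_month.items())}


if __name__ == "__main__":
    days = [date(2001, 1, 1), date(2001, 1, 2), date(2002, 1, 1), date(2002, 1, 2)]
    observed = [0.30, 0.32, 0.34, 0.30]    # hypothetical values, m3 m-3
    simulated = [0.40, 0.41, 0.44, 0.39]   # wetter, but with the same anomalies
    print(monthly_anomaly_correlation(days, simulated, observed))
```

Because the seasonal cycle is removed separately from each series, a simulation that is systematically wetter than the observations can still achieve a high anomaly correlation, as in the usage example.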
Anomaly correlations at 5 and 25 cm in CO, MI, and UT (Fig. 5) show considerably different patterns of monthly variability. July anomaly correlations in CO decrease in all models at both the 5- and 25-cm depths. A similar pattern is seen for July and August anomaly correlations in UT. Correlations of 5-cm soil moisture in MI show the least variability between models and months. Monthly variability of RMSE with depth indicates considerable differences between depth and model (Fig. 6). Small RMSE appears in the upper soil depths and large RMSE appears in lower soil depths for all models in AL. Warm season RMSE values are smaller than those during the cold season for all models and depths, with the exception of SAC deep soil layers. This is particularly true in WTX. In OK, all models except for VIC show a fairly uniform RMSE, while VIC exhibits a large (small) RMSE in the warm (cold) season.
c. Comparison of simulated and observed daily soil moisture climatology
Simulated and observed soil moisture climatology in AL, NE, OK, and WTX for all soil depths are compared in Fig. 7. In general, the annual range of the simulations from the four models at three depths captures the observed soil moisture range, with the exception of the 25- and 70-cm depths at AL and the 70-cm depth at NE. In each of these three cases, model soil moisture is consistently underestimated with regard to the observations. For the majority of conditions, VIC represents the wettest soil simulation, while SAC represents the driest. Consistent with error metrics in Tables 2–4, Mosaic and SAC underestimate soil moisture while VIC overestimates observations. At the 5-cm soil depth in four states (AL, NE, OK, and WTX), seasonal variations in observed soil moisture (wet in winter and dry in summer) are successfully captured by all four models, except for VIC, which shows little seasonal variation. This is attributable to the small bare soil fraction used in VIC (Xia et al. 2014b). When VIC uses a relatively large bare soil fraction, the seasonal variation of the observed top 5-cm soil moisture in Illinois can be well captured (Xia et al. 2014b). At 25-cm soil depths, all models basically capture the seasonal variation of the observed soil moisture in all four states except for WTX. In this case, the models fail to capture the observed increases in soil moisture that occur during the wet season (February–May) in WTX. This is why all models have small Taylor skill scores and anomaly correlations (Figs. 2, 3) at the 25-cm soil depth in WTX. In addition, it should be noted that VIC shows little seasonal variability at 5- and 25-cm soil depth over AL, OK, and WTX. At the 70-cm soil depth, all models capture the seasonal variations in observed soil moisture for all four states.
The simulated and observed soil moisture climatologies in CO, MI, and UT are also compared (Fig. 8). The simulations from the four models are reasonably similar to the observed soil moisture. All models capture the variations in observed soil moisture during the warm season, although Noah and VIC have small seasonal variation at the 5-cm soil depth for all three states. VIC is at the upper bound of the simulations, and SAC and Mosaic are at the lower bound. This means that VIC tends to overestimate soil moisture whereas Mosaic and SAC tend to underestimate soil moisture as compared to the observations. Noah tends to fall in the middle of the model simulations, although it overestimates the soil moisture in CO and UT. This is consistent with the error metric analysis in Tables 2–4, in which Noah and VIC (Mosaic and SAC) exhibit positive (negative) Bias values. Overall, Mosaic and SAC are able to accurately simulate the drying phases in CO and UT.
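The daily climatologies compared above can be illustrated with a minimal sketch. It assumes that a daily climatology is the multiyear mean for each calendar day and that an anomaly is the departure from that mean; this section does not state the exact procedure (e.g., any smoothing or leap-day handling), so the function names and the synthetic two-year record below are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_climatology(dates, values):
    """Mean value for each (month, day) across all years; None marks a missing value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, v in zip(dates, values):
        if v is None:  # skip missing observations
            continue
        key = (d.month, d.day)
        sums[key] += v
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def anomalies(dates, values, climatology):
    """Departure of each value from the climatological mean of its calendar day."""
    out = []
    for d, v in zip(dates, values):
        if v is None:
            out.append(None)
        else:
            out.append(v - climatology[(d.month, d.day)])
    return out


# Hypothetical two-year record of volumetric soil moisture (m3/m3), illustration only:
# a uniformly drier first year followed by a uniformly wetter second year.
start = date(2001, 1, 1)
dates = [start + timedelta(days=i) for i in range(730)]
values = [0.25 if d.year == 2001 else 0.35 for d in dates]

clim = daily_climatology(dates, values)
anom = anomalies(dates, values, clim)
```

In this synthetic record every calendar day has one dry and one wet value, so the climatology is flat and the anomalies isolate the interannual difference rather than the seasonal cycle.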
Model validation with in situ soil moisture observations from seven regions shows that three of the land surface models (i.e., Noah, Mosaic, and SAC) can generally capture the variations in observed soil moisture, such as the seasonal cycle and interannual variability. The models are also able to accurately simulate the daily soil moisture variability at different soil depths, albeit with large intermodel variability dependent on location, model, and soil depth. In contrast to the variability and anomaly comparisons, large negative and positive Bias values exist in many instances between model and observation datasets. In particular, SAC largely underestimates soil moisture and VIC largely overestimates soil moisture at all depths for most locations. The differences among the four models may be due to ET, as it represents the primary control of moisture extraction from the soil. Daily ET climatologies for the four models are compared in each state except UT (Fig. 9); UT is omitted because its ET climatology is very similar to that of CO. Mosaic and SAC simulate consistently larger ET than Noah and VIC in all six regions. The increased (decreased) ET leads to decreased (increased) soil moisture and consistent underestimation (overestimation) of actual soil moisture content. The only exception to this pattern is the Mosaic 70-cm soil depth, where Mosaic has comparable or higher (AL) soil moisture than Noah. This is because Mosaic sets the first two soil layers (0–10 and 10–40 cm) as the root zone for all vegetation types (from grassland to forest). Below the root zone, soil moisture changes only through vertical diffusion between the 70- and 10–40-cm depths. Soil moisture underestimation is worse in SAC than in Mosaic, most likely because of soil texture–related parameters (e.g., total water storage capacity, wilting point, and hydraulic conductivity). Schaake et al. (2004) demonstrate that total water storage capacity controls the amount of soil water that can be simulated by each land surface model.
SAC has lower total water storage capacity values over the continental United States than Mosaic and simulates consistently drier soils than the other models. Because all of the NLDAS-2 models use the same forcing data, soil texture, and vegetation type, the differences in the soil moisture simulations are due to differences in model structure/philosophy (i.e., three or four soil layers, different root-zone depth, lumped conceptual hydrological model) and model parameters (e.g., parameters related to soil texture and vegetation type). It should be noted that there is a complex relationship between soil moisture and the simulated outputs such as ET, surface runoff, and base flow. A complete water budget comparison is a more rigorous method to conclusively determine the cause of the differences in the model-simulated soil moisture. This issue will be addressed in a future paper.
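As a rough guide to what a water budget comparison involves, the soil column balance can be written in a generic form. The symbols are illustrative and are not taken from the NLDAS-2 model formulations, and snow and canopy storage are neglected for simplicity:

```latex
\frac{dS}{dt} = P - ET - Q_s - Q_b
```

Here $S$ is the total soil water storage of the column, $P$ is precipitation, $ET$ is evapotranspiration, $Q_s$ is surface runoff, and $Q_b$ is base flow. Because the models share the same precipitation forcing, a model with larger $ET$ must have smaller runoff, lower storage, or both, which is why soil moisture differences cannot be attributed to ET alone without examining all terms.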
It should be noted that Noah ET is much smaller than that of the other models in CO. Noah heavily constrains aerodynamic conductance to reduce large sublimation during the wintertime when the atmospheric boundary layer is stable. The purpose is to overcome early snowmelt occurrences in the model and to improve snow water equivalent and streamflow simulation, in particular their seasonal cycles (Livneh et al. 2010). This constraint leads to unexpectedly small simulated ET in cold, mountainous regions; most of the SNOTEL stations in CO are located in mountains, where the atmospheric boundary layer is more frequently stable. This issue has been addressed by Xia et al. (2015c).
In summary, errors between the simulated and observed soil moisture may come from 1) instrument measurement errors (see Table 1); 2) model structure errors (e.g., different soil layers, different root zones, and model physics deficiencies); 3) model parameter errors; 4) forcing data errors (e.g., precipitation, air temperature, radiation, and wind); and 5) errors caused by using soil texture, vegetation type, or land-cover parameters that differ from those at the measurement sites. The interpolation and averaging techniques employed in this study inevitably introduce error as well. The error introduced by soil, vegetation, and land-cover parameterization is further examined in the second part of our companion paper using Noah (Xia et al. 2015a).
In this study, soil moisture observations from seven observational networks with different biome and climate conditions were used to evaluate multimodel simulated soil moisture products in NLDAS-2: 385 sites in AL, CO, MI, NE, OK, WTX, and UT were used to evaluate 0–10-cm soil moisture; 321 sites were used to evaluate 10–40-cm soil moisture; and 217 sites in AL, NE, OK, and WTX were used to evaluate 40–100-cm soil moisture. Soil moisture was spatially averaged in each region to reduce noise. Simulation skill was assessed using anomaly correlation and Taylor skill score to evaluate the variation and standard deviation of the simulated soil moisture from the four models and their ensemble mean. Error metrics, including RMSE and Bias, were used to determine total and systematic error in the simulated soil moisture from the four models and their ensemble mean. In general, the four NLDAS-2 models can capture broad features of soil moisture variations in all three soil layers in seven states, except for the 10–40-cm soil layer in WTX and the 40–100-cm soil layer in AL, where there are small anomaly correlations. Mosaic, the multimodel ensemble mean (MM), and SAC have the highest simulation skill, VIC has the lowest simulation skill, and Noah’s performance is in between. Generally speaking, there are stronger (weaker) anomaly correlations and smaller (larger) RMSE in summer (winter). In the top two soil layers, Noah and VIC overestimate the observed soil moisture, and Mosaic and SAC underestimate the observed soil moisture. In the 40–100-cm soil layer, all models, except for VIC, underestimate the observed soil moisture.
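The error and skill metrics summarized above can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that Bias is the mean of simulated minus observed values (consistent with positive Bias for the wetter models), that RMSE is the root-mean-square of the same differences, and that anomaly correlation is the Pearson correlation between simulated and observed anomaly series. The Taylor skill score is omitted because its exact form is not given in this section. All series below are hypothetical.

```python
import math


def bias(sim, obs):
    """Mean of (simulated - observed); positive means the model is wetter."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root-mean-square of the simulated-minus-observed differences."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def correlation(x, y):
    """Pearson correlation; applied to anomaly series it is an anomaly correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ensemble_mean(*series):
    """Pointwise mean across model series of equal length."""
    return [sum(vals) / len(vals) for vals in zip(*series)]


# Hypothetical regional-mean soil moisture (m3/m3): one wet-biased and one
# dry-biased model that both track the observed variations exactly.
obs = [0.20, 0.25, 0.30, 0.25]
wet_model = [0.25, 0.30, 0.35, 0.30]
dry_model = [0.15, 0.20, 0.25, 0.20]
mm = ensemble_mean(wet_model, dry_model)

wet_bias = bias(wet_model, obs)
dry_bias = bias(dry_model, obs)
mm_rmse = rmse(mm, obs)
wet_corr = correlation(wet_model, obs)
```

The example shows why correlation-based skill and Bias are reported separately: both hypothetical models correlate perfectly with the observations despite opposite systematic errors, and those errors cancel in the ensemble mean.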
Anomaly correlations and RMSEs show strong seasonal variations that vary by model and soil layer. Generally speaking, there are stronger anomaly correlations in summer and weaker anomaly correlations in winter. There are small RMSEs in lower soil layers in summer and large RMSEs in upper soil layers in winter. Overall results show that Noah and VIC have wetter soils and Mosaic and SAC have drier soils than the observed soil moisture. The major reason is that Noah and VIC have less ET and Mosaic and SAC have more ET. Besides ET, soil hydraulic parameters and model parameters also play an important role in soil moisture simulation. Although all models employ common fields of vegetation and soil class, model parameters such as soil layer thickness, number of soil layers, root depth, root density, and seasonal cycle of vegetation (Mitchell et al. 2004) may be different to avoid negating the legacy of calibration or tuning invested over the past decades.
In addition, soil moisture and ET are strongly affected by soil texture and vegetation type, and there are strong nonlinear interactions between soil moisture and ET. When the site-observed soil texture and vegetation type differ from the model gridded values, the accuracy and representativeness of the soil moisture simulations are affected. The second part of this companion paper will address this issue (Xia et al. 2015a).
The authors thank Prof. Ken Hubbard from University of Nebraska–Lincoln and Dr. Jeff Andersen from Michigan State University, who gave us measurement error estimates and suggested we not use measured soil moisture in the cold season. Y.X. was sponsored by the NOAA Climate Program Office’s Modeling, Analyses, Predictions, and Projections (MAPP) Program. S.Q. and T.F. were funded by the National Science Foundation (Award AGS-1056796). The authors also thank Roshan Shrestha and Binbin Zhou at EMC and three anonymous reviewers whose comments greatly improved the quality of this manuscript.