Remotely sensed soil moisture data are typically incorporated into numerical weather models under a framework of weakly coupled data assimilation (WCDA), with a land surface analysis scheme independent from the atmospheric analysis component. In contrast, strongly coupled data assimilation (SCDA) allows simultaneous correction of atmospheric and land surface states but has not been sufficiently explored with land surface soil moisture data assimilation. This study implemented a variational approach to assimilate the Soil Moisture Active Passive (SMAP) 9-km enhanced retrievals into the Noah land surface model coupled with the Weather Research and Forecasting (WRF) Model under a framework of both WCDA and SCDA. The goal of the study is to quantify the relative impact of assimilating SMAP data under different coupling frameworks on the atmospheric forecasts in the summer. The results of the numerical experiments during July 2016 show that SCDA can provide additional benefits on the forecasts of air temperature and humidity compared to WCDA. Over the U.S. Great Plains, assimilation of SMAP data under WCDA reduces a warm bias in temperature and a dry bias in humidity by 7.3% and 19.3%, respectively, while the SCDA case contributes an additional bias reduction of 2.2% (temperature) and 3.3% (humidity). While WCDA leads to a reduction of RMSE in temperature forecasts by 4.1%, SCDA results in additional reduction of RMSE by 0.8%. For the humidity, the reduction of RMSE is around 1% for both WCDA and SCDA.
Soil moisture is a key variable that modulates the partition of radiative energy into latent and sensible heat fluxes over the land surface and has a relatively long memory (days to weeks) (Dirmeyer et al. 2009). Through data analysis, many studies have revealed the feedback of soil moisture on precipitation variability on various space and time scales, and this feedback is particularly strong over midlatitudes during the summer (Dirmeyer 2011; Taylor et al. 2012; Spennemann and Saulo 2015; Liu et al. 2016, 2017; Tuttle and Salvucci 2016). From a modeling perspective, it has been demonstrated that numerical weather prediction (NWP), especially precipitation forecasting, is sensitive to the initialization of land surface soil moisture (Ek and Holtslag 2004; Sutton et al. 2006; Aligo et al. 2007; Case et al. 2008, 2011; Trier et al. 2008; Hohenegger et al. 2009; Van Weverberg et al. 2010; Quintanar and Mahmood 2012; Massey et al. 2016). Therefore, accurate representation of soil moisture is important in NWP.
Assimilation or nudging of satellite soil moisture measurements is one of the most common methods for improving initial soil moisture conditions in NWP. There are a growing number of satellite missions capable of measuring surface soil moisture, including the Advanced Scatterometer (ASCAT; Wagner et al. 2013; Brocca et al. 2017), Soil Moisture and Ocean Salinity (SMOS; Kerr et al. 2010), Soil Moisture Active Passive (SMAP; Entekhabi et al. 2010a), and Advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR-E; Njoku et al. 2003), which lead to increased soil moisture data coverage in time and space. To assimilate soil moisture data, several operational NWP centers have tested and implemented a coupled system that performs the land surface and atmospheric data analyses separately. This type of system is built upon a fully coupled land–atmosphere model or an offline version of a land surface model that communicates with its corresponding atmospheric component every analysis cycle (e.g., 6 h). For example, a soil moisture nudging method was first implemented into the Integrated Forecast System at the European Centre for Medium-Range Weather Forecasts (ECMWF; Drusch 2007). To incorporate observations of various types with more realistic error covariance, an extended Kalman filter has been implemented and studied in the land analysis system of the Aire Limitée Adaptation Dynamique Développement International (ALADIN) and ECMWF (Mahfouf et al. 2009; Mahfouf 2010; De Rosnay et al. 2013; Schneider et al. 2014; Duerinckx et al. 2017). In addition, a soil moisture nudging method was implemented into the NWP system at the Met Office (Dharssi et al. 2011) while an ensemble Kalman filter was tested with the Global Forecast System (GFS) at the U.S. National Centers for Environmental Prediction (NCEP) (Yin and Zhan 2018; Zheng et al. 2018).
On a regional scale, previous research also demonstrated the benefit of soil moisture data assimilation within the Weather Research and Forecasting (WRF) Model (Rasmy et al. 2011, 2012; Santanello et al. 2016; Lin et al. 2017a,b).
The aforementioned practices and implementations have demonstrated how assimilated soil moisture can improve weather forecasts by reducing errors propagated from the land surface during model integration. However, these past studies mainly operate under a framework of weakly coupled data assimilation, in which the land and atmospheric analyses are performed separately. Strongly coupled data assimilation requires the estimation of cross-model error covariance and the simultaneous correction of analysis variables in coupled models (Penny et al. 2017; Yoshida and Kalnay 2018; Suzuki and Zupanski 2018), but relevant research with land–atmosphere models remains limited. Lin and Pu (2018, hereafter LP18) have estimated the model background error covariance between land surface soil moisture and atmospheric states, including potential temperature, specific humidity, and winds. They found that the error correlation between surface soil moisture, temperature, and humidity is comparable, which suggests that 1) part of the errors in the surface soil moisture comes from the atmospheric forcing and 2) there is potential for correcting atmospheric initial conditions via soil moisture data assimilation.
In light of the challenges in coupled data assimilation, we have implemented the methodology in LP18 to study the relative effect of assimilating soil moisture data on weather forecasts under a framework of weakly and strongly coupled land–atmosphere data assimilation. Specifically, we aim to quantify the additional impact on lower-troposphere atmospheric forecasts via a direct analysis procedure (i.e., a strongly coupled case) relative to the impact on forecasts via the dynamics of land–atmosphere interactions (i.e., a weakly coupled case) when satellite soil moisture data are assimilated. This study uses the Noah land surface model coupled with the WRF Model and conducts the experiments in the summer. The SMAP 9-km level-2 enhanced soil moisture retrievals (O’Neill et al. 2016) are assimilated. LP18 revealed a weak correlation between the forecast errors in surface soil moisture and winds and suggested that the wind forecasts are less sensitive to the assimilation of surface soil moisture than the temperature and humidity forecasts. Thus, this study focuses on the effect of coupled data assimilation on only the states of air temperature and humidity over the Great Plains of the United States.
The rest of the paper is organized as follows: section 2 describes the coupled data assimilation system, the configuration of the model and domain, the SMAP soil moisture dataset and its bias correction, the experiment design, and the method and datasets for verification. Section 3 presents the evaluation of model skills in surface soil moisture, temperature, specific humidity, and precipitation over the Great Plains. Section 4 includes discussions and concluding remarks.
a. Coupled data assimilation system and study domain
This study uses the 1D-Var approach described in LP18 to perform weakly and strongly coupled data assimilation. Specifically, we use the following equation to obtain the state analysis:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{B}\mathbf{H}^{\mathrm{T}}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right)^{-1}\mathbf{d}, \qquad (1)$$

where $\mathbf{x}_b$ and $\mathbf{x}_a$ denote the vectors of the background and analysis, respectively; $\mathbf{H}$ denotes the linear form of the observational operator; $\mathbf{d}$ is the innovation vector; and $\mathbf{B}$ and $\mathbf{R}$ represent the background and observational error covariance matrices, respectively. The $\mathbf{B}$ matrix can be further decomposed as $\mathbf{B} = \boldsymbol{\Sigma}\mathbf{C}\boldsymbol{\Sigma}$, where $\boldsymbol{\Sigma}$ contains the error standard deviations of the analysis states on its diagonal and $\mathbf{C}$ describes the correlation of the forecast errors between different analysis states.
The cross-variable error correlation contained in the background error covariance matrix is particularly important in a strongly coupled data assimilation system, as it enables an observation over land to directly impact the atmospheric state analyses and vice versa (e.g., Sawada et al. 2018; Suzuki et al. 2017). To study the impact of satellite soil moisture data in coupled data assimilation, we adopt the background error covariance estimates from LP18. These estimates are available on a monthly scale and at a spatial resolution of 9 km for the control states of top 10-cm soil moisture and air potential temperature, specific humidity, and winds at 40 pressure levels throughout the atmosphere. The monthly estimates were obtained from the characteristics of multiple forecasts from 2015 to 2017 using the National Meteorological Center method (Parrish and Derber 1992). As LP18 found that the error correlation between land surface soil moisture and atmospheric winds is weak, this study considers only surface soil moisture and atmospheric temperature and humidity as the control analysis states in the coupled data assimilation experiments.
To maintain consistency, this study adopts the domain configuration used by LP18. This includes a single domain of 602 × 392 grid points at a spatial resolution of 9 km (Fig. 1). The top pressure level is set at 50 hPa with 40 vertical layers below. We further selected the Great Plains as a region of interest to study the effect of coupled soil moisture data assimilation. The Great Plains is commonly known as a transition zone between dry and wet climates, leading to strong land–atmosphere coupling (Koster et al. 2004, 2006; Dirmeyer et al. 2009). In section 3d, we also briefly discuss the coupled data assimilation performance over the Mountain West. LP18 indicates that the magnitude of the error covariance is on average larger over the Great Plains than over the rest of the contiguous United States. According to Fig. 5 in LP18, the area-averaged values of the error standard deviation in July are 0.0138 m3 m−3, 1.39 K, and 1.50 g kg−1 for the top 10-cm soil moisture, bottom-layer temperature, and bottom-layer specific humidity, respectively, over the region of interest (the Great Plains). In contrast, these averages are 0.0122 m3 m−3, 1.07 K, and 1.05 g kg−1 for the entire domain (the contiguous United States) and 0.0094 m3 m−3, 1.13 K, and 1.06 g kg−1 for the Mountain West. The error correlation between land surface soil moisture and bottom-layer temperature is also larger over the Great Plains than over the rest of the domain. Since a larger error covariance generally leads to a higher model sensitivity to the assimilated observations, this further justifies the selection of the Great Plains for model verification in this study. More details of the implemented coupled data assimilation system can be found in LP18.
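As a rough illustration of how the cross-variable error correlation separates the two coupling frameworks, the following minimal sketch applies the standard linear analysis update to a three-variable control vector. The error standard deviations are the July Great Plains averages quoted from LP18; the correlation values, the background state, and the retrieval are hypothetical placeholders chosen only for illustration, and the function name `analysis` is not from LP18.

```python
import numpy as np

# Control vector: [top 10-cm soil moisture (m3 m-3),
#                  bottom-layer temperature (K),
#                  bottom-layer specific humidity (g kg-1)]
# Error standard deviations: July Great Plains averages quoted from LP18.
Sigma = np.diag([0.0138, 1.39, 1.50])

# ASSUMPTION: illustrative error correlations, NOT the LP18 estimates.
C_scda = np.array([[ 1.0, -0.3,  0.3],
                   [-0.3,  1.0, -0.2],
                   [ 0.3, -0.2,  1.0]])

# Weak coupling: remove the land-atmosphere cross-correlations so that a
# soil moisture observation cannot directly change the atmospheric states.
C_wcda = C_scda.copy()
C_wcda[0, 1:] = 0.0
C_wcda[1:, 0] = 0.0

H = np.array([[1.0, 0.0, 0.0]])  # the retrieval observes soil moisture only
R = np.array([[0.04 ** 2]])      # variance for a 0.04 m3 m-3 observation error


def analysis(xb, d, C):
    """Linear analysis update x_a = x_b + B H^T (H B H^T + R)^-1 d."""
    B = Sigma @ C @ Sigma
    S = H @ B @ H.T + R
    return xb + B @ H.T @ np.linalg.solve(S, d)


xb = np.array([0.20, 300.0, 12.0])  # hypothetical background state
y = np.array([0.23])                # hypothetical rescaled SMAP retrieval
d = y - H @ xb                      # innovation

xa_wcda = analysis(xb, d, C_wcda)
xa_scda = analysis(xb, d, C_scda)
print("WCDA increment:", xa_wcda - xb)
print("SCDA increment:", xa_scda - xb)
```

With these placeholder correlations, both cases apply the same soil moisture increment, while only the strongly coupled case also cools and moistens the bottom atmospheric layer at analysis time. The sign and size of the atmospheric increments depend entirely on the assumed correlations.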
b. Configuration of WRF-Noah Model
This study uses WRF version 3.9.1 with the Advanced Research WRF (ARW) solver (Skamarock et al. 2008; Powers et al. 2017). The WRF Model is a mesoscale NWP system for both research and operations and is currently maintained by the National Center for Atmospheric Research (NCAR). As in LP18, we use the WRF contiguous United States (CONUS) physics suite, which includes the new Thompson microphysics scheme (Thompson et al. 2008), the Rapid Radiative Transfer Model for GCMs (RRTMG) longwave and shortwave schemes (Iacono et al. 2008), the Monin–Obukhov-based Eta similarity surface layer scheme (Janjić 2002), the Noah land surface model (Chen and Dudhia 2001), the Mellor–Yamada–Janjić planetary boundary layer scheme (Janjić 1994), and the Tiedtke cumulus parameterization scheme (Tiedtke 1989; Zhang et al. 2011). By default, the Noah land surface model has four soil layers with thicknesses of 10, 30, 60, and 100 cm from top to bottom. The Noah model simulates vertical water movement in a single soil column and provides latent and sensible heat fluxes to the bottom atmospheric layer of the coupled WRF Model.
c. SMAP soil moisture data and bias correction
This study uses version 1 SMAP level-2 enhanced soil moisture retrievals (O’Neill et al. 2016). This product uses 1) the Backus–Gilbert technique to optimally interpolate the native SMAP brightness temperature measurements from a spatial resolution of 36 km onto a 9-km grid and 2) the core algorithms to obtain soil moisture retrievals (Chan et al. 2018). The 9-km enhanced product, which utilizes the oversampling present in the native SMAP overpasses, is found to reveal fine spatial features that are not obvious in the 36-km SMAP retrievals. Among the core retrieval algorithms, the single-channel algorithm at vertical polarization (SCA-V) provides the best retrieval performance with an overall unbiased root-mean-square error (ubRMSE) of less than 0.04 m3 m−3 (Chan et al. 2018). We use the SCA-V data for both ascending and descending overpasses, which are valid around 1800 and 0600 local time, respectively. The SMAP retrievals include several types of surface flags that indicate uncertain quality; these flags cover surface conditions of water, urban areas, precipitation, snow cover and frozen ground, terrain aspects, and dense vegetation. We excluded data with these surface conditions, and the majority of the exclusions are due to dense vegetation. Figure 2 shows the sample size for the study period from 1 to 27 July 2016. Over the Great Plains, approximately 11% of the area has no SMAP coverage, and the observation sample size is on average 26 per grid cell for the rest of the area. Furthermore, examples of the effect of surface flags on data sample sizes over different seasons can be seen in Lin and Bras (2017).
Satellite and model soil moisture climatologies typically differ from each other, and therefore bias correction of soil moisture data is often necessary (Reichle and Koster 2004, 2005). The most commonly adopted bias correction method is the cumulative distribution function (CDF) technique, which rescales satellite soil moisture retrievals to match the climatology of the model. This study adopts the bias correction method introduced by Blankenship et al. (2016, 2018), as this method was tested with the Noah model, as in this study. This method requires the construction of the CDFs by grouping samples of the same soil type within the Noah land surface model and needs satellite data of only a short time period, such as a couple of months. To construct the CDFs, we performed multiple WRF-Noah open-loop forecasts with a lead time of one day initialized every 12 h at 0000 and 1200 UTC from May to September 2016 and paired the forecasts with the SMAP retrievals at a spatial resolution of 9 km. To ensure a large enough sample size for each soil type group, we merged the samples for soil types such as silt loam and organic material. We manually examined the merged soil types and ensured that their CDFs are similar.
Table 1 shows the percentage of available SMAP samples in each soil group over the study domain from 3 May to 26 September 2016. Figure 3 shows the correction curves of each soil group. This figure illustrates that most of the correction curves are similar except for sandy soils. This is mostly due to the physical constraint in the soil parameters, which in the WRF version used here have a field capacity (below this number soil water moves under tension) of 0.192 m3 m−3 and a water content at saturation of 0.339 m3 m−3 for sandy soils, while these two parameters for other soil types are at least 0.283 and 0.404 m3 m−3 for the field capacity and saturated water content, respectively. The curves are mostly smooth because of the large sample size, except that the SMAP retrievals contain many small and constant values of 0.02 m3 m−3, which is the minimum retrieved soil moisture value (Entekhabi et al. 2014). For those SMAP samples with a value of 0.02 m3 m−3, we assign the median model soil moisture value of the corresponding percentile range for each soil type group. For example, 8.6% of the SMAP data in the sand soil group have a soil moisture value of 0.02 m3 m−3, and the rescaled SMAP data are assigned a value of 0.076 m3 m−3, the value of model soil moisture at the 4.3% percentile. Figure 4 shows the CDFs of the SMAP soil moisture with and without bias correction as well as the WRF-Noah data. The rescaled SMAP soil moisture data agree closely with the Noah soil moisture estimates.
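The CDF rescaling for a single soil group, including the treatment of retrievals at the 0.02 m3 m−3 floor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Blankenship et al. (2016, 2018): the function name `cdf_match` is hypothetical, the climatologies are synthetic random samples, and an empirical CDF is used without any smoothing.

```python
import numpy as np


def cdf_match(sat_values, sat_clim, model_clim, floor=0.02):
    """Map satellite retrievals to the model value at the same percentile.

    sat_clim and model_clim are paired climatological samples for ONE soil
    group. Retrievals at the floor value share one percentile range, so they
    are assigned the model value at the median percentile of that range.
    """
    sat_sorted = np.sort(np.asarray(sat_clim, dtype=float))
    sat_values = np.asarray(sat_values, dtype=float)

    # Empirical CDF of the satellite climatology at each retrieval.
    p = np.searchsorted(sat_sorted, sat_values, side="right") / sat_sorted.size

    # Floor-valued retrievals: median percentile of the tied range,
    # e.g. 8.6% of samples at the floor -> 4.3% percentile.
    frac_floor = np.mean(sat_sorted <= floor)
    p = np.where(sat_values <= floor, 0.5 * frac_floor, p)

    return np.quantile(model_clim, p)


# Synthetic climatologies for illustration only (not SMAP or WRF-Noah data).
rng = np.random.default_rng(0)
model_clim = rng.uniform(0.05, 0.35, 5000)
sat_clim = np.clip(rng.normal(0.12, 0.08, 5000), 0.02, 0.50)

rescaled = cdf_match(np.array([0.02, 0.10, 0.25]), sat_clim, model_clim)
print(rescaled)
```

In practice one such mapping would be built per soil group from the paired SMAP and open-loop samples, and each retrieval would be rescaled with the mapping of the soil type at its grid cell.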
d. Experiment design
To explore the relative importance of assimilating satellite soil moisture retrievals via the schemes of weakly and strongly coupled data assimilation, three numerical experiments are conducted:
OPNL, which is the open-loop run without any data assimilation.
WCDA, which includes the assimilation of the rescaled SMAP 9-km soil moisture data and updates only the state of top 10-cm soil moisture every 12 h. This experiment is considered an experiment of weakly coupled land–atmosphere data assimilation.
SCDA, which is the same as WCDA but updates the states of not only land surface soil moisture but also atmospheric potential temperature and specific humidity every 12 h. This experiment is considered an experiment of strongly coupled land–atmosphere data assimilation.
In the data assimilation experiments, we update only the top 10-cm soil moisture because 1) the focus of this study is to explore the direct impact of assimilating satellite data on the analyses/forecasts of air temperature and humidity rather than root-zone soil moisture and 2) several previous studies show that assimilation of satellite soil moisture has only a small effect on lower-layer soil moisture (Reichle and Koster 2005; Yin et al. 2014; Blankenship et al. 2016; Lin et al. 2017a,b). Furthermore, because the SMAP data are available every 12 h, we compute the analysis every 12 h. SMAP soil moisture data from the ascending overpasses are assimilated at 0000 UTC, while those from the descending overpasses are assimilated at 1200 UTC. We chose a constant observation error [i.e., the observational error term R in Eq. (1)] of 0.04 m3 m−3, which is the overall SMAP mission accuracy (Entekhabi et al. 2010a).
This study performs cycling experiments from 1 to 27 July 2016. This summertime period is chosen because the forecast error and the cross-variable error correlation are relatively large in the summertime (LP18), indicating that the model is more sensitive to soil moisture data assimilation in the summer than in the winter. We use the 0.25° NCEP Final Analysis (FNL) to obtain the lateral boundary conditions for the OPNL, SCDA, and WCDA experiments. The NCEP FNL is produced via the Global Forecast System (GFS) with the Noah land surface model. Thus, the soil moisture used in our experiments is not only inherently consistent with the WRF-Noah model but also well spun up, as the FNL dataset is produced via cycling GFS-Noah runs. We first prepared multiple spinup runs based on 6-h WRF-Noah forecasts valid every 12 h at 0000 and 1200 UTC during the study period (see the black boxes in Fig. 5). Each 12-h cycle consists of three steps: replacement, data assimilation, and model integration. In the replacement step (i.e., the red boxes in Fig. 5), from the second cycle onward, we use the 6-h spinup forecasts as the initial conditions for all variables except the states of soil moisture, potential temperature, and specific humidity. The initial conditions for soil moisture, temperature, and humidity (i.e., the first guess) are obtained from the 12-h forecasts of the previous cycle. In the data assimilation step (i.e., the cyan boxes in Fig. 5), we update only the top 10-cm soil moisture state in WCDA, while we update the top 10-cm soil moisture and atmospheric temperature and specific humidity states in SCDA. Last, based on the initial conditions with/without data assimilation, we perform 1) a 12-h model integration for the next analysis cycle and 2) 72-h forecast runs for the evaluation of the experiments (see the green boxes in Fig. 5).
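The cycling design can be summarized in a short scheduling sketch. It only enumerates the analysis times, the SMAP overpass assimilated at each time, and the states updated in each experiment; it does not run WRF. The end time of 1200 UTC 27 July and all names (`UPDATED_STATES`, `cycle_schedule`) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# States updated in the data assimilation step of each experiment.
UPDATED_STATES = {
    "OPNL": [],
    "WCDA": ["soil_moisture_top10cm"],
    "SCDA": ["soil_moisture_top10cm", "potential_temperature", "specific_humidity"],
}

CYCLE_HOURS = 12      # integration length that feeds the next analysis
FORECAST_HOURS = 72   # forecast length used for evaluation


def cycle_schedule(start, end, step_hours=CYCLE_HOURS):
    """Yield (analysis time, overpass) pairs for the 12-h cycling."""
    t = start
    while t <= end:
        # Ascending overpasses are assimilated at 0000 UTC,
        # descending overpasses at 1200 UTC.
        overpass = "ascending" if t.hour == 0 else "descending"
        yield t, overpass
        t += timedelta(hours=step_hours)


# ASSUMPTION: the last cycle is taken to be 1200 UTC 27 July 2016.
cycles = list(cycle_schedule(datetime(2016, 7, 1, 0), datetime(2016, 7, 27, 12)))

for t, overpass in cycles[:4]:
    print(t.strftime("%Y-%m-%d %H%M UTC"), overpass, UPDATED_STATES["SCDA"])
```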
e. Evaluation method and reference datasets
To evaluate the model skill of land surface soil moisture and atmospheric temperature and humidity, we consider the International Soil Moisture Network (ISMN), the conventional sounding and surface measurements, the NCEP 0.25° FNL analysis dataset, and the NCEP Stage IV precipitation dataset. The ISMN consists of a collection of individually operated soil moisture networks and includes automated quality control (Dorigo et al. 2013). To evaluate the performance of modeled top 10-cm soil moisture, from the ISMN we choose data from the Soil Climate Analysis Network (SCAN) and the Climate Reference Network (CRN) because of their high quality over the United States (Schaefer et al. 2007; Diamond et al. 2013). The sounding data are measured twice daily at 0000 and 1200 UTC over the contiguous United States and are routinely assimilated into the NCEP Global Data Assimilation System. The surface weather observations often include pressure, temperature, humidity, and winds and are reported in the METAR data format by thousands of stations in the United States. Figure 6 shows the stations of SCAN/CRN and sounding measurements within the region of interest. The NCEP FNL dataset is used for evaluating the forecasts of temperature and humidity profiles at 0000, 0600, 1200, and 1800 UTC every day during the study period. The NCEP FNL dataset, available every 6 h in near–real time, is obtained from the NCEP Global Forecast System with the incorporation of many observations into the analyses. This dataset is commonly used as an atmospheric verification reference (e.g., Rakesh and Goswami 2011; Singh et al. 2011, 2012; Hsiao et al. 2012; Kumar et al. 2014; Chen et al. 2015; Wang and Cui 2018). The NCEP Stage IV dataset is a ground-based precipitation product with gauge correction (Lin and Mitchell 2005).
We use the metrics of bias, root-mean-square error (RMSE), unbiased RMSE (ubRMSE), and Pearson’s correlation coefficient (ρ) to quantify the model performance. The formulations of bias, RMSE, and ubRMSE are as follows:

$$\mathrm{bias} = E[M_i - O_i], \qquad (2)$$

$$\mathrm{RMSE} = \sqrt{E\left[(M_i - O_i)^2\right]}, \qquad (3)$$

$$\mathrm{ubRMSE} = \sqrt{E\left\{\left[(M_i - E[M_i]) - (O_i - E[O_i])\right]^2\right\}}, \qquad (4)$$

where $M_i$ and $O_i$ are the model outputs and the observational references, respectively, and $E[\cdot]$ is the expectation operator. The ubRMSE is commonly used for evaluating soil moisture retrievals and modeling because of the presence of biases (Entekhabi et al. 2010b). We also quantify the relative impact (RI) of data assimilation on each metric, expressed as a percentage relative to OPNL [Eqs. (5)–(8)].
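A minimal sketch of these verification metrics is given below. The bias, RMSE, ubRMSE, and correlation functions follow the standard definitions. The `relative_impact` function is an assumption: it takes RI to be the percentage reduction in the magnitude of an error metric relative to OPNL, which is one common convention and may differ in detail from the RI equations used in the paper. The sample arrays are made up for illustration.

```python
import numpy as np


def bias(m, o):
    """Mean difference between model output and reference."""
    return np.mean(m - o)


def rmse(m, o):
    """Root-mean-square error."""
    return np.sqrt(np.mean((m - o) ** 2))


def ubrmse(m, o):
    """Unbiased RMSE: RMSE after removing the mean of each series."""
    return np.sqrt(np.mean(((m - m.mean()) - (o - o.mean())) ** 2))


def correlation(m, o):
    """Pearson's correlation coefficient."""
    return np.corrcoef(m, o)[0, 1]


def relative_impact(err_opnl, err_da):
    """ASSUMED form: percentage reduction in |error| relative to OPNL."""
    return 100.0 * (abs(err_opnl) - abs(err_da)) / abs(err_opnl)


# Made-up soil moisture samples (m3 m-3), for illustration only.
obs = np.array([0.18, 0.22, 0.25, 0.21, 0.19])
opnl = np.array([0.24, 0.25, 0.31, 0.28, 0.22])
da = np.array([0.21, 0.23, 0.28, 0.24, 0.20])

print("RI bias (%):", relative_impact(bias(opnl, obs), bias(da, obs)))
print("RI RMSE (%):", relative_impact(rmse(opnl, obs), rmse(da, obs)))
```

The three error metrics are linked by RMSE² = bias² + ubRMSE², which is why ubRMSE isolates the part of the error that is not a constant offset.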
a. Comparison of land surface soil moisture and heat fluxes
This subsection evaluates the top 10-cm soil moisture model output against the ISMN over the Great Plains. The 6- and 12-h forecasts of surface soil moisture at a spatial resolution of 9 km, initialized at 0000 and 1200 UTC during 1–27 July 2016 and collocated with the ISMN stations, are evaluated. Table 2 shows the statistics averaged over the ISMN stations within the region of interest. Both data assimilation experiments show improvement in all the metrics. Although there is a difference in the performance of WCDA and SCDA, the difference is marginal compared to the improvement relative to OPNL. The bias statistics are not shown, as the averaged results are nearly unbiased (i.e., less than 0.01 m3 m−3) for all three experiments. The table also summarizes the relative impact of data assimilation according to Eqs. (6)–(8). Both the WCDA and SCDA experiments show a reduction in RMSE and ubRMSE of approximately 10% and an improvement of 24%–26% in the representation of temporal variability (correlation). On the whole, the results show that the implemented variational approach can effectively improve the forecasts of surface soil moisture, consistent with the previous research in Lin et al. (2017a,b).
The peak-time (i.e., 1800 UTC or around 1200 local time) surface latent and sensible heat flux forecasts during 1–27 July 2016 are compared. Because of limited heat flux observations, we mainly analyze the variations of the heat fluxes between the open-loop and data assimilation experiments. On average over space and time, the peak-time values of OPNL are 260 W m−2 (latent heat) and 273 W m−2 (sensible heat). The absolute differences in the latent heat fluxes between OPNL and the data assimilation experiments are 21.9 W m−2 (OPNL vs WCDA) and 23.7 W m−2 (OPNL vs SCDA), while these values for the sensible heat fluxes are 22.9 W m−2 (OPNL vs WCDA) and 24.6 W m−2 (OPNL vs SCDA). The variation of approximately 10% between OPNL and WCDA is directly attributed to the soil moisture data assimilation. In contrast, adding air temperature and humidity to the control states does not lead to a significant difference in the heat flux forecasts between WCDA and SCDA, which is expected because the difference in the soil moisture model skill between WCDA and SCDA is only marginal (see Table 2).
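The percentage variation quoted above can be checked directly from the reported averages. The short calculation below divides each mean absolute difference by the corresponding OPNL mean flux. Treating that ratio as the relative variation is an assumption about how the percentage was obtained.

```python
# Peak-time OPNL mean fluxes (W m-2), as reported in the text.
opnl_mean = {"latent": 260.0, "sensible": 273.0}

# Mean absolute differences from OPNL (W m-2), as reported in the text.
abs_diff = {
    "latent": {"WCDA": 21.9, "SCDA": 23.7},
    "sensible": {"WCDA": 22.9, "SCDA": 24.6},
}

# Relative variation (%) = absolute difference / OPNL mean flux.
relative = {
    flux: {exp: 100.0 * value / opnl_mean[flux] for exp, value in exps.items()}
    for flux, exps in abs_diff.items()
}

for flux, exps in relative.items():
    for exp, pct in exps.items():
        print(f"{flux:8s} OPNL vs {exp}: {pct:.1f}%")
```

Under this assumption the variations fall between roughly 8% and 9%, and the WCDA and SCDA values differ by less than one percentage point for either flux.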
b. Evaluation of air temperature and humidity against sounding data
This subsection evaluates the atmospheric temperature and specific humidity forecasts against the conventional sounding data. The forecasts with various lead times (i.e., 12, 24, 36, 48, 60, and 72 h) that are initialized during 10–27 July 2016 are verified against the NCEP sounding data at 925, 850, 700, 500, 400, 300, 200, and 100 mb. The forecasts for the first few days are considered the spinup of the cycling experiments and are not evaluated. The sample size of the sounding data is on average around 1700 at each pressure level except 925 mb, which has a size of around 1000 because three stations in the western part of the study region have a relatively high elevation. Figure 7 shows the forecast skills in terms of bias and RMSE. The results show that the impact of soil moisture data assimilation is negligible above 700 mb for both WCDA and SCDA, which is consistent with LP18. LP18 suggests that the impact of soil moisture data under a scheme of strongly coupled data assimilation is negligible above an eta level of 0.8 (approximately 760 hPa). At the pressure levels of 850 and 925 mb, WCDA always leads to a higher forecast skill than OPNL. SCDA results in an even better forecast skill than WCDA at 850 and 925 mb, except for the RMSE in humidity at 850 mb. In the next subsections, we further explore the forecast skills of temperature and humidity below approximately 760 hPa, or the bottom 10 model atmospheric layers, including the degradation of humidity forecasts at 850 hPa in SCDA (Fig. 7d).
c. Evaluation of air temperature and humidity analyses against NCEP FNL dataset
This subsection investigates the impact of soil moisture data assimilation on atmospheric conditions by comparing the analyses of temperature and specific humidity against the reference NCEP final analysis dataset. The analyses during 10–27 July 2016 are evaluated. In addition, we emphasize that this subsection focuses on the evaluation of the analyses over the bottom 10 layers, as the impact of data assimilation is marginal in the upper air (see section 3b). The pressure levels of the bottom 10 layers are below approximately 760 hPa on average over the Great Plains. Note that we do not compute the correlation coefficient here, as this metric reflects mainly the diurnal variation of temperature and humidity, and Lin et al. (2017b) show that the computed correlation coefficient for temperature and humidity is very similar between the cases with and without land surface data assimilation.
Figure 8 shows the domain-mean values of the difference in the absolute bias and RMSE between the WCDA/SCDA analyses and OPNL initial conditions every 12 h during 10–27 July 2016. For temperature, the WRF initial conditions in OPNL have an overall warm bias over the bottom 10 atmospheric layers with an average absolute bias of 2.1 K and an average RMSE of 3.1 K, averaged over the analysis cycles and vertical layers (not shown). When SMAP data are assimilated, WCDA leads to an overall reduction in the bias and RMSE. SCDA further improves the atmospheric analyses, with reductions in the bias and RMSE that are about 54% and 36%, respectively, larger than the reductions in WCDA (Figs. 8a–d). In terms of humidity, the OPNL initial conditions have a dry bias on average with an average absolute bias of 0.85 g kg−1 and an average RMSE of 2.33 g kg−1. SCDA also shows reductions in the bias and RMSE that are approximately 35% and 8%, respectively, larger than those in WCDA (Figs. 8e–h). As section 3a reports that WCDA and SCDA have only a marginal difference in the land surface moisture and heat flux estimates, the results further demonstrate that the additionally improved atmospheric analyses in SCDA, compared to WCDA, are mainly attributed to the implemented land–atmosphere model error covariance and its corresponding strongly coupled data assimilation.
d. Evaluation of air temperature and humidity forecasts against NCEP FNL dataset
We further explore the benefit of soil moisture data assimilation on the forecasts of temperature and humidity against the NCEP FNL analysis. Similar to section 3c, the forecasts over the bottom 10 atmospheric layers (below approximately 760 hPa) initialized every 12 h during 10–27 July 2016 are evaluated. Figure 9 shows the bias and RMSE in the OPNL forecasts over the Great Plains and the performance of WCDA and SCDA relative to OPNL. Table 3 summarizes the average values of the relative improvement reported in Figs. 9d, 9e, 9i, and 9j. Figure 9a shows that there is a warm bias in OPNL, with a relatively large magnitude at layers 3–7. At the lowest layer, the bias is the smallest, which leads to the largest relative improvement in terms of bias (Figs. 9a,d,e). However, the small bias is mainly a result of averaging over time; in fact, the bias has a strong diurnal variability. We observed that the warm bias at the lowest layer peaks at 0000 UTC and can be as large as that at the upper layers, while the temperature bias at the lower layers is small and even negative (cold bias) during the night (1200 UTC) (not shown). RMSE in OPNL is also larger on average at layers 3–7 than at the bottom layer (Fig. 9f). After data assimilation, the warm bias is reduced. Throughout the various lead times and vertical levels, both WCDA and SCDA lead to improved temperature forecasts in terms of the bias and RMSE. SCDA provides additional benefits compared to WCDA. Table 3 shows that WCDA reduces the temperature bias by 7.3%, while SCDA further improves the forecasts with an additional bias reduction of 2.2%. In addition, WCDA results in an RMSE reduction of 4.1%, while SCDA further reduces the RMSE by an additional 0.8%.
Figure 10 shows the statistics of specific humidity forecasts over the Great Plains. OPNL forecasts show an overall dry bias with an average over space and time of −0.59 g kg−1 (Fig. 10a and Table 3). After data assimilation, the bias is reduced. On average, WCDA results in a reduction of bias by 19.3%, and SCDA contributes an additional 3.3% of bias reduction (Table 3). In terms of RMSE, SCDA leads to a larger RMSE reduction below 850 hPa (around layer 7) within a forecast lead time of up to two days than WCDA, with an average reduction of 0.038 g kg−1 in WCDA versus 0.046 g kg−1 in SCDA. However, after 48 h, both WCDA and SCDA show a slight degradation on the humidity forecast between layers 7 and 10 (i.e., approximately 850 and 760 hPa, respectively). This phenomenon over different levels also provides a further explanation of the model performance in Fig. 7d. Overall from Figs. 10i and 10j and Table 3, we can see that the average reduction in RMSE is marginal (e.g., around 1%).
To understand the spatial effect due to data assimilation, we compute the bias and RMSE of each pixel over the Great Plains (Fig. 11). At each pixel, the samples are from the 24-h forecasts of temperature and humidity that are over the bottom 10 atmospheric layers and initialized every 12 h during 10–27 July 2016. The figure reaffirms that OPNL forecasts have a warm temperature bias and a dry humidity bias. The bias is particularly evident in the eastern part of the region. Assimilation of SMAP data leads to cooler and wetter conditions for most of the areas than the control run (Figs. 11e,g,i,k). The results also show that a positive impact of assimilating bias-corrected SMAP data is seen for most of the areas in terms of bias and RMSE reduction, except a marginal difference in humidity in terms of RMSE. The improvement of the forecasts in SCDA is even more substantial over the upper part of the region than that in WCDA. Overall, the results demonstrate that there is an additional gain in terms of weather forecasts with the implementation of strongly coupled data assimilation.
We further considered the Mountain West, which has relatively complete coverage of valid SMAP data (see Figs. 1 and 2), to briefly examine the effect of data assimilation over another region. As in the above results, the forecasts of bottom-10-layer temperature and humidity during 10–27 July 2016 over the Mountain West are evaluated against the NCEP FNL analysis. The average difference in statistics between OPNL, WCDA, and SCDA is only marginal. In terms of the bias and RMSE, the differences over the Mountain West between the experiments are less than 0.01 (K) for temperature and less than 0.02 (g kg−1) for specific humidity, which are much smaller than those over the Great Plains (Table 3). This indicates that coupled data assimilation is less effective over the Mountain West. As the land–atmosphere coupling strength and the modeling error of land–atmosphere states vary significantly in time and space (Koster et al. 2004, 2006; Dirmeyer et al. 2009; LP18), the relative importance of coupled soil moisture and atmosphere data assimilation certainly varies among regions. Future research that compares the effect of coupled data assimilation over more areas and seasons is recommended.
e. Evaluation of 2-m temperature and humidity forecasts against the surface METAR observations
In the WRF-Noah model, 2-m temperature and humidity are diagnostic variables computed mainly from the temperature and humidity of the land surface and the bottom atmospheric layer. To further demonstrate the effect of data assimilation, we evaluate 2-m temperature and humidity forecasts against observations from the surface METAR stations. Over the region of interest, there are around 240 surface weather stations in total. Forecasts initialized at 0000 and 1200 UTC during 10–27 July 2016 are examined at lead times from 6 to 72 h at 6-h intervals. Figure 12 shows the bias and RMSE in 2-m temperature averaged over all the cycles and stations as well as the relative improvement according to Eqs. (5) and (6). The results are consistent with Table 3 and Fig. 9, indicating that, across the various lead times, assimilating SMAP data under a strongly coupled data assimilation structure consistently benefits 2-m temperature forecasts beyond the benefit obtained under a weakly coupled structure. Similarly, Fig. 13 shows the statistics for 2-m humidity forecasts, and it further demonstrates that SCDA is superior to WCDA. The figure also reveals a reduced gap in the improvement of humidity forecasts between SCDA and WCDA beyond a lead time of 48 h, as observed in Fig. 10.
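The verification statistics discussed here can be sketched as follows. This is a minimal illustration only: it assumes that the relative improvement is the percent reduction in the magnitude of a score (bias or RMSE) relative to the control run, which may differ in detail from Eqs. (5) and (6). The function names and the sample values are hypothetical and are not taken from the experiments.

```python
import numpy as np

def bias(forecast, observed):
    """Mean error: forecast minus observation."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.mean(diff))

def rmse(forecast, observed):
    """Root-mean-square error."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def relative_improvement(score_ctrl, score_exp):
    """Percent reduction in score magnitude relative to the control run.

    Assumed definition (not confirmed by the text); positive values mean
    the experiment is closer to the observations than the control.
    """
    return 100.0 * (abs(score_ctrl) - abs(score_exp)) / abs(score_ctrl)

# Hypothetical 2-m temperature samples (K), pooled over stations and cycles.
obs = np.array([300.0, 301.0, 299.0, 302.0])
opnl = obs + np.array([1.0, 2.0, 1.0, 2.0])   # control run with a warm bias
scda = obs + np.array([0.5, 1.5, 0.5, 1.5])   # experiment with a smaller warm bias

bias_gain = relative_improvement(bias(opnl, obs), bias(scda, obs))
rmse_gain = relative_improvement(rmse(opnl, obs), rmse(scda, obs))
print(f"bias reduction: {bias_gain:.1f}%  RMSE reduction: {rmse_gain:.1f}%")
```

Taking the magnitude of the bias before comparing means that a reduction of either a warm (positive) or a dry (negative) bias counts as an improvement.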
f. Comparison of precipitation and soil moisture analysis increments
The 1- and 2-day precipitation forecasts at 9-km resolution initialized at 0000 and 1200 UTC during 10–27 July 2016 are verified against the NCEP Stage IV data. Figure 14 shows the overall bias and RMSE averaged over the Great Plains and over time. In general, the assimilation of SMAP data slightly improves the precipitation forecasts. The small effect is likely because WCDA/SCDA improves temperature and humidity forecasts mainly in the lower troposphere (e.g., below 760 hPa), while precipitation is a diagnostic variable that depends on the dynamics throughout the troposphere. Nonetheless, SCDA is shown to provide additional benefits in the forecasts compared to WCDA.
To further explore the effect of the precipitation underestimation, we compare the precipitation and soil moisture analysis increments. During 10–27 July 2016, the soil moisture analysis increments are on average 0.0032 (m3 m−3) with a standard deviation of 0.0044 (m3 m−3), which means that the majority of the surface soil moisture analyses become wetter. The underestimation of precipitation forecasts is likely the main cause of the overall small but positive soil moisture analysis increments. Here, we recall that we applied the CDF matching technique to remove a large-scale bias between the model and observation (see section 2c). In the CDF matching, we used multiple short-term WRF-Noah forecasts with a lead time of 24 h to avoid any forecast degradation under long-term model integration (e.g., Zhang et al. 2018). For the same period, 1-day precipitation forecasts used in the CDF matching have an average bias of −0.78 (mm day−1), while this number in the OPNL cycling experiment is −0.96 (mm day−1) (see Fig. 14). With such a difference, soil moisture in the three main cycling experiments (i.e., OPNL, WCDA, and SCDA) is expected to be drier than that used in the CDF matching.
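For readers unfamiliar with CDF matching, the idea can be sketched as follows: each observed value is replaced by the model value that has the same cumulative probability in the model climatology. This is a generic empirical-quantile sketch under assumed inputs, not the procedure of section 2c; the function name `cdf_match` and the synthetic climatologies are hypothetical.

```python
import numpy as np

def cdf_match(obs_values, obs_climatology, model_climatology):
    """Rescale observations so that their CDF matches the model CDF.

    Generic empirical quantile mapping; the climatology samples are
    assumed to be collocated in space and drawn from the same period.
    """
    obs_sorted = np.sort(np.asarray(obs_climatology, dtype=float))
    mod_sorted = np.sort(np.asarray(model_climatology, dtype=float))
    # Empirical cumulative probabilities (midpoint plotting positions).
    obs_probs = (np.arange(obs_sorted.size) + 0.5) / obs_sorted.size
    mod_probs = (np.arange(mod_sorted.size) + 0.5) / mod_sorted.size
    # Step 1: cumulative probability of each new observation.
    p = np.interp(obs_values, obs_sorted, obs_probs)
    # Step 2: model value at the same cumulative probability.
    return np.interp(p, mod_probs, mod_sorted)

# Synthetic example: the model climatology is uniformly 0.05 m3 m-3 wetter
# than the retrievals, so matched values are shifted by that amount.
obs_clim = np.linspace(0.10, 0.30, 101)
model_clim = obs_clim + 0.05
matched = cdf_match(np.array([0.15, 0.20, 0.25]), obs_clim, model_clim)
print(matched)
```

Because the mapping depends on the model climatology, a model climatology built from wetter forecasts than those of the cycling experiments leaves the rescaled observations wetter than the cycled background, which is consistent with the small positive increments described above.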
We compare the 36-h precipitation accumulation from the NCEP Stage IV dataset and the WCDA experiment as well as the surface soil moisture analysis increment valid at three times (13, 16, and 18 July 2016) in Fig. 15. These cases are selected because they have relatively large analysis increments. Wherever heavy precipitation appears in the NCEP Stage IV data but is missed or underestimated in WCDA, the analysis increments of surface soil moisture reflect the precipitation discrepancy (see the pink arrows in Fig. 15). The issue of incorrect precipitation forcing is quite common in land surface data assimilation (e.g., Blankenship et al. 2018) and is expected to be exaggerated in coupled land–atmosphere modeling. This further explains the small but positive average soil moisture analysis increments, which help in reducing the bias and RMSE in temperature and humidity forecasts.
4. Conclusions and discussion
This study presented promising results for the relative impact of assimilating SMAP 9-km enhanced soil moisture retrievals on the WRF-Noah forecasts in the summer under the frameworks of weakly and strongly coupled data assimilation (WCDA and SCDA) over the Great Plains. We have provided evidence that the assimilation of SMAP soil moisture data improves the forecasts of surface soil moisture. Compared to WCDA, the implementation of SCDA provides additional benefits from the assimilation of SMAP soil moisture for the analyses of temperature and humidity over the lower troposphere (e.g., below 760 hPa). The temperature and humidity forecasts at various lead times of up to 72 h were verified against the conventional sounding data, surface weather stations, and NCEP analysis. The verification showed that soil moisture data assimilation reduces not only a warm bias in the temperature forecasts and a dry bias in the humidity forecasts but also RMSE in both temperature and humidity forecasts over the lower troposphere. The improvement in temperature and humidity forecasts also leads to a slight improvement in precipitation forecasts. SCDA is demonstrated to be superior to WCDA.
Considering the results obtained from this study, future research can be devoted to the implementation of soil moisture assimilation into operational NWP systems. In this study, we deployed 1D-Var in the coupled data assimilation experiments, meaning that the assimilated data in our SCDA experiment have a confined impact over an atmospheric column above the analysis grid cell during the analysis procedure. It is possible that the impact of soil moisture data under an SCDA framework is broader and stronger when an approach such as three-dimensional variational data assimilation (3D-Var) or ensemble Kalman filter (3D EnKF) is used. In addition, this study used a model grid resolution of 9 km to match the resolution of the SMAP enhanced soil moisture retrieval and assimilated only land surface observations. Future implementation of soil moisture assimilation would allow not only the research of coupled data assimilation over various time and space scales but also the simultaneous assimilation of land and atmospheric observations.
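The way a soil moisture observation reaches the atmospheric column in a strongly coupled analysis can be illustrated with a toy linear update. The sketch below is not the 1D-Var implementation used in this study: it uses the closed-form minimizer of a variational cost function with a linear observation operator, a three-variable state, and hypothetical error standard deviations and correlations chosen only for illustration.

```python
import numpy as np

def analysis_increment(B, H, R, innovation):
    """Increment B H^T (H B H^T + R)^{-1} d for a linear observation operator."""
    S = H @ B @ H.T + R
    return B @ H.T @ np.linalg.solve(S, innovation)

# Toy state: [surface soil moisture (m3 m-3), lowest-level T (K), lowest-level q (g kg-1)].
# All numbers below are hypothetical.
sigma = np.array([0.04, 1.0, 1.0])            # background error standard deviations
corr_strong = np.array([[ 1.0, -0.5,  0.5],   # wetter soil: cooler and moister air
                        [-0.5,  1.0, -0.3],
                        [ 0.5, -0.3,  1.0]])
corr_weak = corr_strong.copy()
corr_weak[0, 1:] = 0.0                        # remove land-atmosphere cross terms
corr_weak[1:, 0] = 0.0

B_strong = np.outer(sigma, sigma) * corr_strong
B_weak = np.outer(sigma, sigma) * corr_weak

H = np.array([[1.0, 0.0, 0.0]])               # only soil moisture is observed
R = np.array([[0.04 ** 2]])                   # observation error variance
d = np.array([0.02])                          # observation minus background

inc_strong = analysis_increment(B_strong, H, R, d)
inc_weak = analysis_increment(B_weak, H, R, d)
print("strongly coupled:", inc_strong)
print("weakly coupled:  ", inc_weak)
```

With the cross terms retained, the single soil moisture observation also cools and moistens the lowest atmospheric level at analysis time. With the cross terms set to zero, only the soil moisture is adjusted, and the atmosphere responds later through model integration. In a 1D-Var setting this coupling is limited to the column above the analysis grid cell, whereas 3D approaches would additionally allow horizontal spreading of the increment.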
Future work may also be enhanced via the analysis of deeper soil layers, a broader understanding of local land–atmosphere coupling (LoCo) perspectives (e.g., Santanello et al. 2018), and the incorporation of error statistics based on observations. First, in this study, the moisture of deeper soil layers is affected by coupled data assimilation only via model integration; it can be included in the control analysis states in the future. Second, future research can study coupled data assimilation on a finer time scale and explore the detailed connection between the land surface and atmosphere via the diurnal evolution of heat fluxes and the planetary boundary layer. Third, this study uses mainly model-based error statistics to represent the B-matrix. It would be interesting to see whether the B-matrix can be verified by, or even combined with, observation-based error statistics between land–atmosphere variables (e.g., Liu and Pu 2019). We also note that this study used an analysis cycle of 12 h due to the availability of SMAP observations. Once atmospheric observations are incorporated, the analysis should be conducted more frequently, such as every 6 h or even more often.
This study is sponsored by NOAA NWS Award NA16NWS4680015. The support and resources from the Center for High Performance Computing at the University of Utah are gratefully acknowledged. We would also like to acknowledge high-performance computing at Cheyenne (Computational and Information Systems Laboratory 2017) provided by NCAR’s Computational and Information Systems Laboratory, sponsored by the National Science Foundation. The ISMN soil moisture data were obtained freely at https://ismn.geo.tuwien.ac.at/ismn/. National Centers for Environmental Prediction, National Weather Service, NOAA, U.S. Department of Commerce (2008, 2015) provides the NCEP FNL dataset (https://rda.ucar.edu/datasets/ds083.3/) and the conventional sounding and surface data (https://rda.ucar.edu/datasets/ds337.0/) at no cost. The NCEP Stage IV precipitation data were obtained from the Earth Observing Laboratory at NCAR (freely available at https://data.eol.ucar.edu/dataset/21.093). The SMAP enhanced soil moisture retrievals are obtained from the National Snow and Ice Data Center, freely available at https://nsidc.org/data/SPL2SMP_E/. The WRF Model was obtained from NCAR, freely available at http://www2.mmm.ucar.edu/wrf/users/. We appreciate these agencies for providing the model, data, and technical assistance. We also thank three anonymous reviewers for their helpful comments.