In the past decade, a variety of algorithms have been introduced to downscale passive microwave soil moisture observations. Some exploit the soil moisture information from optical/thermal sensing of land surface temperature (LST) and vegetation dynamics while others use active microwave (radar) observations. In this study, downscaled soil moisture data at 9- or 1-km resolution from several algorithms are intercompared against in situ soil moisture measurements to determine their reliability in an operational system. The finescale satellite data used here for downscaling the coarse-scale SMAP data are observations of LST from the Geostationary Operational Environmental Satellite (GOES) and vegetation index (VI) from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS) for the warm seasons in 2015 and 2016. Three recently developed downscaling algorithms are evaluated and compared: a simple regression algorithm based on 9-km thermal inertial data, a data mining approach called regression tree based on 9- and 1-km LST and VI, and the NASA SMAP enhanced 9-km soil moisture product algorithm. Seven sets of in situ soil moisture data from intensive networks were used for validation, including 1) the CREST-SMART network in Millbrook, New York; 2) Walnut Gulch Watershed in Arizona; 3) Little Washita Watershed in Oklahoma; 4) Fort Cobb Reservoir Experimental Watersheds in Oklahoma; 5) Little River Watershed in Georgia; 6) the Tibetan Plateau network in China, and 7) the OzNet in Australia. Soil moisture measurements of the in situ networks were upscaled to the corresponding SMAP reference pixels at 9 km and used to assess the accuracy of downscaled products at a 9-km scale. Results revealed that the downscaled 9-km soil moisture products generally outperform the 36-km product for most in situ datasets. 
The linear regression algorithm using the thermal sensing based evaporative stress index (ESI) had the best agreement with the in situ measurements from networks in the contiguous United States according to the site-by-site comparison. In addition, the thermal inertial linear regression method demonstrated the lowest unbiased RMSE when compared with the matched-up in situ datasets. In general, this method is promising for operational generation of a fine-resolution soil moisture data product.
Soil moisture (SM) information is required for the initialization of numerical weather, climate, hydrological, and ecological prediction models, because of its impact on the water, energy, and carbon exchanges between land surface and the atmosphere (Entekhabi et al. 1996). The Soil Moisture Active Passive (SMAP) satellite mission launched by NASA on 31 January 2015 was designed to be the best SM observatory in terms of data quality and spatial resolution (Chan et al. 2018; Piepmeier et al. 2016). The high-quality coarse-resolution L-band radiometer observations and SM retrievals have been generated and made available to the public since April 2015. However, the high-spatial-resolution SM retrievals were obtained for only 84 days before the loss of the L-band radar on 7 July 2015.
In addition to the L-band radar backscatter coefficient that was expected from SMAP for radiometer downscaling, some observations from other satellite sensors at fine spatial resolution have also been found sensitive to SM. Examples are the C-band radar backscatter from ESA Sentinel-1A and land surface temperature (LST), albedo (A), and vegetation indices (VI) from various optical/thermal sensors (such as GOES, VIIRS, MODIS, and AVHRR). Attempts to generate SM data products from these satellite observations have been well documented [e.g., Sabel et al. (2007) using scanning synthetic aperture radar (ScanSAR) sigma data on Envisat; Carlson (2007) using thermal and NDVI observations from Landsat]. However, these quantities are either too sensitive to nonsoil moisture factors (such as radar backscatter to surface roughness), or not directly related to the soil moisture content (e.g., LST, A, and VI). Consequently, SM estimates based on these finer-scale satellite observations are less reliable than the coarser-scale microwave radiometer observations, but with the trade-off of being higher spatial resolution.
While the coarse-resolution SMAP radiometer observations may be the state-of-the-art data product in terms of L-band radiometer antenna design and radio frequency interference mitigation strategy, downscaling to finer resolution will better meet SMAP data user requirements, and this need has triggered wide research interest within the SM remote sensing community (Das et al. 2011). Many downscaling algorithms have been proposed in the literature (details in section 2).
To select the best algorithm for fusing the coarse-scale SMAP radiometer and finescale radar or optical sensor observations for an operational finescale SMAP SM product, this study intercompares algorithms introduced in recent literature using in situ SM measurements. Three downscaling algorithms are introduced including 1) a linear regression algorithm using surface vegetation and temperature observations (Fang et al. 2013), 2) a data mining technique (regression tree), using visible and thermal data (Gao et al. 2012), and 3) enhancement of brightness temperature using oversampling of radiometer scans (Chan et al. 2018). Details of these data fusion algorithms are given in section 2. The in situ SM measurements from seven intensive networks are used here, including the Cooperative Center for Earth System Sciences and Remote Sensing Technologies Soil Moisture Advanced Radiometric Testbed (CREST-SMART) network in Millbrook, New York; four USDA Agricultural Research Service (ARS) watershed networks (Walnut Gulch, Little Washita, Fort Cobb, and Little River); the Tibetan Plateau network in China; and the OzNet in Australia.
The objective of this study was to evaluate the relative performance of the candidate downscaling algorithms and determine which one could be implemented in the operational environment of the National Environmental Satellite, Data, and Information Service (NESDIS) (NOAA 2011). Operationalization criteria for choosing candidate algorithms are that 1) the data sources of higher-resolution observations (e.g., MODIS/VIIRS VIs or LST) should be reliably available; 2) the algorithm should be simple and stable for implementation; 3) the algorithm must meet computational efficiency requirements with product latency less than 6 h; and 4) the downscaled SM product should have satisfactory accuracy.
2. Downscaling algorithms
There have been many downscaling or data fusion algorithms introduced in the literature in the past decades (Sabaghy et al. 2018). Table 1 presents a brief summary of the candidate algorithms, categorized in terms of finer-scale data sources and downscaling methods. The finer-scale data sources include radar backscatter from active microwave data (Entekhabi et al. 2014; Sabel et al. 2007; Wagner et al. 2007) and optical and/or thermal infrared satellite observations (Zhan et al. 2002; Fang et al. 2013, 2018; Carlson 2007; Petropoulos et al. 2009; Peng et al. 2016). Examples of downscaling algorithms include linear regression approaches such as simple regression in Fang et al. (2013), the multivariate regression method (Zhan et al. 2002), the “universal triangle” theory-based methods (Piles et al. 2011), the random forest method (Abbaszadeh et al. 2019; Zhao et al. 2018), and the SMAP L2-SM-A/P product algorithm to combine the SMAP radiometer and radar observations (Entekhabi et al. 2014); the change detection method (Narayan et al. 2006; Njoku et al. 2002; Das et al. 2011); the data mining approach such as regression tree introduced by Gao et al. (2012), the neural network (NN) by Alemohammad et al. (2018) and Soo et al. (2011), and the machine learning method by Chakrabarti et al. (2016), Park et al. (2015), Im et al. (2016), and Jiang et al. (2017); the Bayesian merging method (Zhan et al. 2006); the combined modeling and remote sensing method (Merlin et al. 2006, 2008; Fang et al. 2013); the data assimilation method (Lievens et al. 2015; Draper et al. 2009; Draper and Reichle 2015; Reichle et al. 2001, 2017; Sahoo et al. 2013; Parinussa et al. 2014); and the deterministic method (Merlin et al. 2008; Ines et al. 2013).
The primary objective of this study is to find a simple but reliable downscaling method that could be implemented for operationally generating a finescale soil moisture data product from SMAP radiometer observations. Thus, the linear regression approach (Fang et al. 2013, 2018) and the regression tree data mining technique (Gao et al. 2012) are first evaluated against in situ soil moisture measurements together with the NASA SMAP enhanced 9-km soil moisture product.
a. Downscaling algorithms to be evaluated
1) Regression tree algorithm
The regression tree data mining technique was developed by Gao et al. (2012) to sharpen coarse-resolution thermal satellite images using fine-resolution optical products (e.g., reflectance, or VI). The approach has the advantages of 1) automatically searching for patterns among data samples without a predefined relationship between variables and 2) using multiple independent variables (e.g., vegetation indices, land surface type, LST, etc.). In this study, the data mining technique was adopted to downscale the SMAP SM product to 9- and 1-km resolution using 1-km resolution visible and thermal channel observations from MODIS.
The data mining approach automatically searches for relationships between optical/thermal pixels at finescale and SM at coarse scale, among sample pairs constructed from homogeneous pixels, and then applies the pattern to fine-resolution optical/thermal maps so as to predict high-resolution SM. Therefore, the quality of “training samples” is crucial to successful implementation of the regression tree method.
The following factors need to be considered when collecting the training samples. First, samples should be well distributed throughout the experimental scene and large enough in number to guarantee that the training set is representative of different surface conditions. Second, the relationship should be built on homogeneous pixels only. A practical strategy to screen out outliers and preserve homogeneous pixels is to select coarse-resolution samples whose coefficient of variation (CV), computed from the fine-resolution subpixels, is less than a predefined threshold (Kustas et al. 2004). The coefficient of variation CV is defined as

$$\mathrm{CV} = \frac{\sigma}{\mu},$$
where μ and σ are the mean and standard deviation of the fine-resolution variables within the coarse-resolution pixel. Here, the variables are surface leaf area index (LAI) and LST (morning and afternoon). When multiple variables are used, the CV is calculated for each of the selected variables. The smaller the CV value is, the purer the sample is (Gao et al. 2012). In this study, the threshold for a valid sample was chosen as CV < 0.2 (or 20%), consistent with a previous study (Gao et al. 2012). Third, to minimize the artificial box-like patterns in SM at finescale based on the regression relationship, a moving window approach was applied, in which the central prediction area moves across the whole experiment region with some overlap from the previous area (Gao et al. 2012). Results showed that this method was effective in removing the boundary effects because adjacent windows share common samples from the overlap regions.
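The homogeneity screening described above can be illustrated with a minimal sketch. It assumes that the population standard deviation is used and that a coarse pixel is kept only when every variable passes the CV < 0.2 test; the function names and the sample values are hypothetical and are not taken from Gao et al. (2012). Note that the CV depends on the units of the variable (e.g., LST in kelvins yields very small CV values).

```python
from statistics import mean, pstdev

CV_THRESHOLD = 0.2  # threshold quoted in the text (CV < 0.2, i.e., 20%)


def coefficient_of_variation(values):
    """Return sigma / mu for the fine-resolution values inside one coarse pixel."""
    mu = mean(values)
    if mu == 0:
        return float("inf")  # CV undefined; treat the pixel as non-homogeneous
    # Assumption: population standard deviation (the text does not specify).
    return pstdev(values) / abs(mu)


def is_homogeneous(subpixels_by_variable, threshold=CV_THRESHOLD):
    """Keep a coarse pixel only if every variable has CV below the threshold."""
    return all(
        coefficient_of_variation(vals) < threshold
        for vals in subpixels_by_variable.values()
    )


# Hypothetical fine-resolution values inside two coarse pixels.
uniform_pixel = {"lai": [2.0, 2.1, 1.9, 2.0], "lst": [300.0, 301.0, 299.5, 300.5]}
mixed_pixel = {"lai": [0.5, 3.0, 1.0, 4.0], "lst": [300.0, 301.0, 299.5, 300.5]}

print(is_homogeneous(uniform_pixel))  # True: both variables are nearly uniform
print(is_homogeneous(mixed_pixel))    # False: LAI varies too much within the pixel
```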
In our implementation, the MODIS global LAI and Fraction of Photosynthetically Active Radiation (FPAR) product (MOD15A2, V005) and the LST product (MOD11A1.006) are collected as fine-resolution data sources. The original MODIS products at 1-km spatial resolution are resampled to 9 km for the process of downscaling SMAP SM from 36 to 9 km. Homogeneous pixels are first identified according to the CV of MODIS LAI and LST values within the moving window. Each SMAP SM pixel is then paired with the MODIS LAI and LST pixels over the homogeneous 36-km pixels. With the training samples, a regression tree composed of regression models and defined model coefficients can be obtained for SM derivation. Once the model is built, the tree is applied to the prediction areas to derive higher-resolution SM based on finer-resolution LAI and LST inputs. To generate 9- or 1-km downscaled SMAP soil moisture products, the regression tree models are implemented with the 9-km or original 1-km MODIS LAI and LST.
2) Thermal inertial linear regression algorithm
Based on the relationships between daily land surface temperature changes and surface SM, Fang et al. (2013, 2018) tested a simple linear regression model to downscale AMSR-E SM from 1/4° resolution to 1-km MODIS resolution. The algorithm was developed based on the thermal inertial theory that drier soil corresponds to larger diurnal temperature changes because the heat capacity of water is greater than that of dry soil. Fang et al. (2013) found that the disaggregated 1-km SM dataset presented better spatial heterogeneity and accuracy than the coarse AMSR-E product, according to the validation against in situ observations from the Oklahoma Mesonet and Little Washita networks.
This study has implemented the thermal inertial linear regression method of Fang et al. (2013) with two changes. First, MODIS LAI is used to replace NDVI in Fang et al.’s method to represent surface vegetation conditions. The use of LAI is expected to overcome the saturation issue of NDVI at moderate to high vegetation covers. Second, SM product derived from the Atmosphere–Land Exchange Inversion (ALEXI) model (Anderson et al. 1997, 2007; Hain et al. 2009) is used for regression analysis instead of NLDAS soil moisture in Fang et al.’s method (Fang et al. 2013). ALEXI SM retrievals have better spatial resolution of 8 km, compared to NLDAS SM of 12.5 km. Moreover, the ALEXI model has been rigorously evaluated over a wide range of climatic and vegetation conditions (Anderson et al. 2007, 2011; Fang et al. 2016).
To apply Fang et al.’s (2013) method for a SMAP 9-km SM product, linear regression equations are developed between the ALEXI SM and the diurnal range of MODIS LST for each of the MODIS LAI levels (6 levels with intervals of 2) on each of the 36-km grids of the contiguous U.S. (CONUS) domain. The equation is then applied to high-resolution LST (9 km) to obtain a finescale soil moisture product. The difference between the SMAP 36-km SM and the average of the 4 × 4 block of 9-km SM values within it is then used to correct the 9-km SM values from their corresponding linear regression calculation. The bias-corrected 9-km grid SM values are the final downscaled 9-km SMAP SM product.
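The regression and bias-correction steps can be sketched as follows. This is a simplified illustration rather than the operational implementation: it assumes a single LAI class for the whole 36-km pixel, an ordinary least squares fit, and an additive correction that shifts the 16 predicted 9-km values so that their mean equals the SMAP 36-km value. All function names and numbers are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def downscale_pixel(smap_36km, delta_lst_9km, intercept, slope):
    """Predict SM for each 9-km cell, then shift so the cell mean matches SMAP."""
    raw = [intercept + slope * d for d in delta_lst_9km]
    offset = smap_36km - mean(raw)  # assumed additive bias correction
    return [sm + offset for sm in raw]


# Hypothetical training pairs for one LAI class:
# diurnal LST range (K) versus reference SM (m3 m-3).
delta_lst_train = [8.0, 12.0, 16.0, 20.0]
sm_train = [0.32, 0.26, 0.20, 0.14]
intercept, slope = fit_line(delta_lst_train, sm_train)

# Hypothetical diurnal LST ranges for the 4 x 4 block of 9-km cells
# inside one 36-km SMAP pixel.
delta_lst_9km = [10.0, 11.0, 12.0, 13.0,
                 11.0, 12.0, 13.0, 14.0,
                 12.0, 13.0, 14.0, 15.0,
                 13.0, 14.0, 15.0, 16.0]
smap_36km = 0.22

downscaled = downscale_pixel(smap_36km, delta_lst_9km, intercept, slope)
print(round(mean(downscaled), 6))  # equals smap_36km after the correction
```

The negative slope reflects the thermal inertial reasoning above: a larger diurnal temperature range corresponds to drier soil.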
3) Resampling algorithm for the NASA SMAP enhanced SM data product
The NASA SMAP mission generated the Enhanced radiometer-based SM product (L2/3_SM_P_E) on a 9-km grid (O’Neill et al. 2016; Chan et al. 2018; Jackson et al. 2016; Piepmeier et al. 2016). The SMAP radiometer scans Earth’s surface at a constant incidence angle of 40° from nadir, producing a combination of fore and aft observations. Additionally, the reflector rate combined with integration time leads to oversampling in the along-scan direction. The sampling density therefore enables estimation of the minimum grid spacing needed to map brightness temperature observations on a global Earth-fixed grid at a spatial resolution of 9 km. An enhanced SM product can then be derived based on the higher-resolution brightness temperature. The final L2_SMP_E SM retrieval data are released on a 9-km Earth-fixed grid on the EASE Grid 2.0 global cylindrical projection as two-dimensional arrays of 1624 rows and 3856 columns (O’Neill et al. 2016; Jackson et al. 2016). This study directly used this product in the evaluation effort.
It should be noted that even though the Enhanced product is mapped on the 9-km grid, the native resolution of the SMAP enhanced TB and SM data products is still similar to the resolution of the SMAP standard TB and SM data products from the SMAP passive radiometer (33 vs 36 km; Chan et al. 2018). The enhanced products come from an optimal interpolation approach and more fully utilize the oversampled along-scan TB measurements in the original radiometer data. Consequently, they are able to reveal spatial features at finer spatial scale that are concealed or not immediately obvious in the standard SMAP products (Chan et al. 2018).
b. SMAP SM products to be evaluated
The following five SMAP SM data products were evaluated in this study: 1) SMAP standard radiometer SM product at 36-km resolution (SMAP 36 km); 2) SMAP enhanced radiometer SM product at 9 km (SMAP 9 km); 3) downscaled SMAP SM at 9 km based on the regression tree method, using 9-km MODIS LST/LAI (RT 9 km); 4) downscaled SMAP SM based on thermal inertial linear regression algorithm at 9 km (LR 9 km); and 5) downscaled SMAP SM at 1-km resolution based on the regression tree method, using 1-km MODIS LST/LAI (RT 1 km). Details are given in Table 2. The NASA SMAP radiometer 36-km soil moisture product and the enhanced radiometer 9-km soil moisture product are obtained from the NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC; Colliander et al. 2017b). The analysis in this study uses retrievals from these products when the retrieval quality flag indicated recommended quality, screening out high uncertainty pixels.
3. Soil moisture in situ measurements
Two types of in situ soil moisture measurement datasets were collected and used to evaluate the above SMAP soil moisture data products. One is the soil moisture measurements from the measurement stations of four groups of individual intensive networks across different continents. The other is the spatially matched-up soil moisture datasets provided by NSIDC DAAC (Colliander et al. 2017b). Details of the intensive networks and the matched-up datasets are summarized as follows.
a. CREST-SMART network
The Soil Moisture Advanced Radiometric Testbed (SMART) is a SM observation network in Millbrook, New York, developed and maintained by the NOAA Cooperative Remote Sensing Science and Technology (CREST) Center of the City University of New York (M. Temimi et al. 2011, meeting presentation). In addition to the standard NOAA Climate Reference Network (CRN) site located on the property of the Cary Institute of Ecosystem Studies, seven sites surrounding the CRN site have been equipped with SM and soil temperature probes for three soil depths (2.5, 5, and 10 cm), but only four stations have collected data with reasonable time coverage and are used in this study. The data have been automatically collected and delivered to the SMART website every 15 min since late 2015. The Millbrook region is a composite of open agricultural fields (~40%) and forested terrains (~60%) with a small urban fraction of Millbrook Township. However, the in situ soil moisture and temperature probes are installed within the agricultural fields. Detailed information about the CREST-SMART network can be found at https://www.noaacrest.org/smart/.
b. USDA ARS networks
USDA ARS has built and maintained SM monitoring networks covering several watersheds within the CONUS (Jackson et al. 2010). Data from four of the ARS networks are used in this study, including the networks in the Walnut Gulch Watershed in Arizona (Keefer et al. 2008), the Little Washita River and Fort Cobb Reservoir Experimental Watersheds in Oklahoma (Cosh et al. 2006, 2008, 2014), and the Little River Watershed in Georgia (Bosch et al. 2007).
A dense in situ SM network developed in the Walnut Gulch Watershed in Tombstone, Arizona, has been collecting SM measurements since 2002. The watershed contains 19 Stevens Water Hydra Probe surface (5 cm) SM sensors installed at the USDA sites within the watershed. This watershed is about 150 km2 in a semiarid climate region with land cover of mainly rangeland (83%), forest (12%), and miscellaneous (5%) (Renard et al. 1993). The Little Washita Watershed is located to the southwest of Oklahoma City covering approximately 610 km2 in subhumid to semiarid climate region with 20 SM stations deployed uniformly across the domain with a surface sensor at 5 cm horizontal depth. Land use types include grassland/rangeland (68%), cropland (20%), forests (8%), and miscellaneous uses (4%). The Fort Cobb Watershed is a neighboring watershed with 15 SM stations and is dominated by row cropping and some irrigation. The Little River Watershed is about 334 km2 in size and located in south-central Georgia, with land types consisting of woodland (40%), row crops (36%), pasture (18%), and 4% water. Little River has approximately 33 stations distributed across the entire domain.
c. OzNet hydrological monitoring network
The OzNet hydrological monitoring network is an Australian monitoring network for SM and micrometeorology developed by Monash University and the University of Melbourne (Smith et al. 2012). The network consists of the regional Murrumbidgee sites along with three focused experimental areas in Yanco, Kyeamba, and Adelong Catchments. The Murrumbidgee Catchment has high spatial variability in climate varying from semiarid in the west to humid in the east. It is predominantly agricultural with the exception of steeper parts of the catchment, which are a mixture of native eucalypt forests and exotic forest plantations. The Yanco network is a 60 km × 60 km square area to the southwest of the Yanco Research Station. It is a large flat area with land use of irrigation. Kyeamba is a medium to small catchment about 600 km2 where topography is dominated by gentle slopes with predominant land use of sheep and beef grazing. Adelong Catchment is a relatively small catchment (~145 km2) compared to Yanco and Kyeamba with steep slopes.
d. Tibetan Plateau SM networks
Several SM measurement networks have been developed in the Tibetan Plateau by the Chinese Academy of Sciences since 2010 (Yang et al. 2013). SM data from the in situ network over the Zoige grassland of the eastern Tibetan Plateau were used for this study. The network contains about 25 stations covering about 100 km × 60 km area with sensors at 5-, 10-, 20-, 40-, and 80-cm depths (Su et al. 2011). The land surface type is a uniform land cover of short grassland used for grazing by sheep and yaks.
e. Matched-up validation data pairs
The NASA SMAP core validation and calibration partners at each network conducted spatial upscaling of the in situ SM measurements within SMAP reference pixels to better capture the spatial heterogeneity at SMAP product grid size (Colliander et al. 2017a). A set of core validation sites with well-calibrated in situ SM measurements was selected. The matched-up pairs of upscaled in situ measurements and SMAP products are built for those core validation sites (O’Neill et al. 2016). The sampling requirements for the SMAP mission specified that 9-km core sites should have at least five stations that have 70% confidence for 0.03 m3 m−3 SM uncertainty with 0.05 m3 m−3 variability (Famiglietti et al. 2008; Colliander et al. 2017a). The Voronoi diagram technique was used as the default approach to defining weights of the qualified stations distributed within the pixel. The matched-up in situ measurements at SMAP 9-km grid sizes are used here for the above sites. The NASA SMAP matched-up in situ validation datasets were generated for four different grids: 36, 33, 9, and 3 km. The 36-km dataset was paired with the SMAP standard 36-km SM product while the 9-km dataset was used to evaluate the 9-km downscaled or the resampled SMAP enhanced SM products.
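The Voronoi weighting can be illustrated with a small sketch. It does not reproduce the SMAP validation software; it approximates each station's Voronoi cell by sampling the reference pixel on a regular grid and assigning every sample point to its nearest station, so that a station's weight is the fraction of the pixel area closest to it. The square pixel geometry, the station coordinates, and the SM values are hypothetical.

```python
def voronoi_weights(stations, pixel_size_km=9.0, n=90):
    """Approximate each station's Voronoi-cell share of a square pixel.

    stations: list of (x, y) positions in km, relative to the pixel corner.
    The pixel is sampled at the centers of an n x n grid of cells.
    """
    counts = [0] * len(stations)
    step = pixel_size_km / n
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * step, (j + 0.5) * step
            nearest = min(
                range(len(stations)),
                key=lambda k: (stations[k][0] - x) ** 2 + (stations[k][1] - y) ** 2,
            )
            counts[nearest] += 1
    total = n * n
    return [c / total for c in counts]


def upscale(stations, sm_values, **kwargs):
    """Area-weighted average of station SM using the approximate Voronoi weights."""
    weights = voronoi_weights(stations, **kwargs)
    return sum(w * v for w, v in zip(weights, sm_values))


# Hypothetical layout: five stations inside one 9-km reference pixel.
stations = [(1.0, 1.0), (8.0, 1.5), (4.5, 4.5), (2.0, 7.5), (7.0, 7.0)]
sm_values = [0.18, 0.22, 0.25, 0.20, 0.27]  # m3 m-3

weights = voronoi_weights(stations)
print([round(w, 3) for w in weights])
print(round(upscale(stations, sm_values), 4))
```

Compared with an arithmetic average, this weighting reduces the influence of clustered stations, because closely spaced stations share a small area of the pixel.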
4. Visual comparison of coarse- and finescale SMAP SM maps
Before quantitative evaluation of the downscaled SMAP SM data products, these products were mapped for some regions for preliminary visual examination. Figure 1 shows the comparison of SMAP SM datasets at coarse and fine scales over the Oklahoma region (100.15°–94.53°W, 34.2°–37.06°N) on 30 April 2015 (Fig. 1a). The maps include the standard SMAP passive SM product at 36 km, the enhanced 9-km product, the downscaled SM dataset based on the thermal inertial linear regression method at 9 km, and the downscaled SM retrieved from the data mining algorithm at 1 km. The downscaled map at 9 km using the data mining method is not shown here because its pattern looks similar to that of the 1-km map, only with less spatial detail. The 1-km map shown in this figure therefore represents the performance of the data mining (regression tree) approach using MODIS LST and LAI.
The comparison illustrates successful implementation of downscaling algorithms. The dry-to-wet transition pattern from western to eastern Oklahoma is well captured in all SM datasets. More spatial details can be detected as the resolution was increased. For instance, the boundary of the Arkansas River and the shape of the Canadian River became clearer as the resolutions were disaggregated from 36 to 9 km and to 1 km. Notably, the finest-scale map at 1 km was able to show the small lakes and reservoirs that spread across the state, such as Lake Altus and the Tom Steed Reservoir to the very southwest of Oklahoma, the Great Salt Plains Lake to the north, and Lake Thunderbird toward the southeast of Oklahoma City.
Figure 1b shows the SMAP SM maps over northeast Texas bordering Oklahoma, Arkansas, and Louisiana (98°–92.5°W, 31°–35°N) on 2 April 2016. Similar conclusions can be drawn: the candidate downscaling algorithms are capable of producing SM maps with significantly improved spatial heterogeneity. Much drier conditions can be detected extending from the northwestern corner of the region to the Dallas area, compared to southern Arkansas and northern Louisiana, where SM increases to around 0.40 to 0.45 m3 m−3. To better demonstrate the enhancement in spatial heterogeneity, two subregions over the river and reservoir area outlined by red and black boxes in Fig. 1b are magnified in Figs. 1c and 1d, respectively. Notably, the big reservoirs of Sam Rayburn and Toledo Bend at the boundary of Louisiana and Texas stand out in the downscaled 9-km map obtained from the thermal inertial method and the 1-km map from the regression tree approach (Fig. 1c). The other zoomed-in map in Fig. 1d shows that the outline of the Red River becomes visible in the 9-km map derived from the thermal inertial method using ESI and very clear in the 1-km map.
5. Quantitative evaluation of the coarse- and finescale SMAP SM products
The performance of the candidate downscaling algorithms is evaluated using in situ SM observations. A typical validation method, which directly compares satellite retrievals with ground observations point by point, is widely used for evaluation of satellite-based products. However, this traditional validation approach inevitably results in a mismatch between ground observations and SMAP pixels caused by the heterogeneity of the land surface. The scaling issue has been a challenging topic in validation of satellite SM retrievals. Spatial upscaling of individual ground sites to satellite reference pixels is expected to address the issue of spatial-to-pixel soil moisture variability. Therefore, the assessment of downscaled products was carried out from two perspectives. One is based on in situ observations for each individual network and the other is based on upscaled in situ observations. Upscaling techniques employed to upscale individual stations to the SMAP pixel scale include the Voronoi diagram, soil type and land cover based weights, arithmetic average, etc. (Famiglietti et al. 2008; Colliander et al. 2017a). Performance metrics of the two validation approaches provide complementary information on and assessment of the accuracy of the SMAP SM datasets at the coarse and fine scales.
Results from the site-by-site evaluation are given for each of the individual in situ networks in section 5a through section 5d, while the analysis using the matched-up in situ observations is given in section 5e. The comparison over some sample stations is presented first, followed by the statistics of unbiased RMSE (ubRMSE) and correlations averaged over all the ground stations within each network.
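For reference, the two statistics can be computed as in the following sketch. It uses the common definition of ubRMSE as the RMSE after the mean bias is removed (equivalently, ubRMSE² = RMSE² − bias²) and the Pearson correlation coefficient; the function names and the sample series are hypothetical.

```python
from math import sqrt
from statistics import mean


def bias(sat, insitu):
    """Mean difference between satellite retrievals and in situ values."""
    return mean([s - o for s, o in zip(sat, insitu)])


def rmse(sat, insitu):
    """Root-mean-square error of the satellite retrievals."""
    return sqrt(mean([(s - o) ** 2 for s, o in zip(sat, insitu)]))


def ubrmse(sat, insitu):
    """RMSE computed after removing the mean of each series (bias removed)."""
    ms, mo = mean(sat), mean(insitu)
    return sqrt(mean([((s - ms) - (o - mo)) ** 2 for s, o in zip(sat, insitu)]))


def pearson_r(sat, insitu):
    """Pearson correlation coefficient between the two series."""
    ms, mo = mean(sat), mean(insitu)
    sxy = sum((s - ms) * (o - mo) for s, o in zip(sat, insitu))
    sxx = sum((s - ms) ** 2 for s in sat)
    syy = sum((o - mo) ** 2 for o in insitu)
    return sxy / sqrt(sxx * syy)


# Hypothetical matched time series (m3 m-3): the satellite values track the
# in situ dynamics but carry a wet bias and some random error.
insitu = [0.10, 0.15, 0.22, 0.30, 0.26, 0.18]
sat = [0.16, 0.19, 0.29, 0.34, 0.33, 0.22]

print(round(bias(sat, insitu), 4))
print(round(rmse(sat, insitu), 4))
print(round(ubrmse(sat, insitu), 4))
print(round(pearson_r(sat, insitu), 4))
```

Because the bias is removed, a product with a constant wet or dry offset can still have a low ubRMSE, which is why the bias is discussed separately in the time series comparisons.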
Time series comparisons of the SMAP 36-km product, the downscaled datasets, and in situ SM measurements averaged over the ground stations of each network are shown in Fig. 2. In general, all satellite-based retrievals were able to capture the dry and wet surface SM signals as reflected in the in situ measurements, although biases can be found for certain sites or over certain time periods. The performance matrix of each in situ network with average error statistics is presented in Fig. 3. Validation over the OzNet and Tibetan Plateau networks excludes the linear regression algorithm based on the ESI dataset because the spatial coverage of the current ESI product extends only over the North America region. Detailed analysis on individual stations is given first, followed by the results based on the matched-up in situ measurements.
a. CREST-SMART NY network
Figure 2a presents the time series comparison between satellite retrievals and in situ observations averaged over four stations in the CREST-SMART network (the Green House, the Fountains, the Dutchess Day School, and the Gifford House). The wet and dry dynamics match well, but all satellite-based SM datasets present a significant wet bias. However, the biases are reduced for the downscaled data (colored dots vs black dots in Fig. 2a). For instance, the ubRMSE and correlation of the SMAP coarse-resolution product over the Fountains site (one of the four sites) were 0.069 m3 m−3 and 0.551, respectively. The enhanced 9-km product was very comparable to the coarse-resolution product with ubRMSE of 0.062 m3 m−3 and correlation of 0.662. The linear regression algorithm using ESI provided the highest performance with a significant reduction of ubRMSE by 0.037 m3 m−3 and an increase in correlation by 0.345.
Error statistics averaged from the four sites in the CREST-SMART network are shown in Fig. 3a with ubRMSE on the left of the bar chart and correlation on the right. The NASA enhanced 9-km product shows moderate improvement in accuracy. The thermal inertial linear regression algorithm showed the largest improvement. However, products using the data mining method did not show any improvement in accuracy, and had slightly higher RMSE by 0.01 m3 m−3 and lower correlation by 0.1 compared to the SMAP coarse-resolution product. As stated in section 3a, the site is highly heterogeneous with agricultural fields and forests. The agricultural fields where the in situ measurement probes were installed are small compared to the 9- and 36-km grids and might not be fairly representative of the soil moisture retrieved for those grids because of the damping effect of the forests. The ESI used in this algorithm was developed at the 8-km scale using LST and evapotranspiration retrievals that contain more soil moisture information even for the forested patches. Consequently, the ESI based linear regression algorithm showed better results than the coarse-scale product.
b. USDA ARS network
Time series of the SMAP coarse-resolution product and downscaled products compared with in situ observations averaged over 20 stations of the Little Washita Watershed are shown in Fig. 2b for the warm season in 2015 and in Fig. 2c for the warm season in 2016. The plots illustrate an overall satisfactory performance of all satellite-based retrievals, which are able to capture the wetting and drydown events. All satellite SM retrievals are very much alike. The Little Washita sites do not show an obvious bias in the downscaled SM retrievals throughout the validation period in 2015 (Fig. 2b). However, a slight dry bias can be found in satellite SM retrievals from late May until the end of July in 2016 (Fig. 2c).
Similarly, the averaged ubRMSE of each satellite SM product for the USDA ARS networks over the warm season (April–October) of 2015 and 2016 is shown in Fig. 3b. Downscaled SMAP products showed a decrease in ubRMSE and increase in correlation. The improvement in the enhanced 9-km dataset is very consistent over warm seasons. The data mining algorithms and the simple linear regression method showed comparable performance.
c. OzNet network
The OzNet network in Australia provides long-term, high-quality field measurements. In situ observations averaged over 34 sites from the Yanco network and 7 sites from the Kyeamba network are compared with satellite SM retrievals in Figs. 2d and 2e, respectively. Both the original and the downscaled SMAP products agree very well with the ground measurements over the validation period extending from September 2015 to May 2016.
At the Yanco sites, the 36-km and enhanced 9-km products underestimated the SM from November to December 2015. The downscaled 1-km product retrieved from the data mining method slightly corrected the dry biases over that period. However, SM retrievals at 9-km resolution from the data mining method drift toward even drier estimates. Overestimation can be found for many wetting instances associated with rainfall events throughout the validation period, such as the end of December 2015 and 20 March 2016 in the Yanco network and mid-March 2016 in the Kyeamba network. For those instances of overestimation, the downscaled 9-km product using the regression tree approach seemed able to correct the wet bias to a certain extent. The standard SMAP 36-km SM product and the enhanced product present similar dynamics over the Kyeamba sites, with an obvious dry bias over the period from November 2015 to April 2016, while downscaled products from either the regression tree algorithm or the thermal inertial algorithm showed lower bias than the standard SMAP products.
Overall, the downscaled products showed lower ubRMSE and higher correlation than the coarse-scale SMAP SM products according to the performance metrics in Fig. 3c. The enhanced 9-km data had results comparable to the coarse-resolution SMAP SM product, with slight improvement in both ubRMSE and correlation. Disaggregated 9- and 1-km products using the regression tree method showed lower ubRMSE than both the coarse-resolution SMAP SM product and the enhanced product.
d. Tibetan Plateau network
Validation using the Tibetan Plateau ground measurements was conducted over a short 3-month period from April to June 2015. According to the error statistics shown in Fig. 3d, downscaling did not show improvement in either ubRMSE or correlation for data from this network. In this instance, finer-resolution data showed a rise in ubRMSE compared to the SMAP 36-km product. An investigation into why the downscaling performance was poor over that region is ongoing. The short time series is a major concern, as a few bad measurements may introduce large uncertainties in the comparison results.
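The sensitivity of a short record to a few bad measurements can be illustrated with a purely synthetic sketch. All numbers below (a 0.02 m3 m−3 background error, three bad days with a 0.15 m3 m−3 error, and the record lengths) are hypothetical and chosen only for illustration; none are taken from the Tibetan Plateau data. The function name `ubrmse_with_bad_days` is likewise an assumption.

```python
import statistics

def ubrmse_with_bad_days(n_days, n_bad, noise=0.02, bad_error=0.15):
    """ubRMSE of a synthetic error series containing n_bad bad measurements.

    All values are hypothetical (m3 m-3) and serve only as an illustration.
    """
    if not 0 <= n_bad <= n_days:
        raise ValueError("n_bad must be between 0 and n_days")
    # Well-behaved days: errors alternate between +noise and -noise.
    errors = [noise if i % 2 == 0 else -noise for i in range(n_days)]
    # Replace the first n_bad days with a large error.
    for i in range(n_bad):
        errors[i] = bad_error
    # ubRMSE equals the population standard deviation of the errors.
    return statistics.pstdev(errors)

short_record = ubrmse_with_bad_days(n_days=90, n_bad=3)   # about 3 months
long_record = ubrmse_with_bad_days(n_days=210, n_bad=3)   # about 7 months
print(f"{short_record:.3f} {long_record:.3f}")
```

Under these assumptions, the same three bad days raise the ubRMSE from 0.02 to roughly 0.033 m3 m−3 in the 90-day record but only to roughly 0.027 m3 m−3 in the 210-day record, so the shorter record is more strongly affected.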
e. Evaluation using matched-up in situ measurements
The above results are based on site-by-site analysis. This section evaluates the SM retrievals from the candidate downscaling algorithms from another perspective. Instead of using observations from individual ground sites, the comparison is built on the average SM within the reference SMAP pixels after spatial upscaling. The objective of upscaling is to capture the spatial heterogeneity of the surface SM distribution. Evaluation based on matched-up in situ observations provides additional insight into the quality of the SM estimates and the performance of the downscaling algorithms.
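The idea of upscaling station data to a reference pixel can be sketched as follows. This is a minimal illustration that assumes a plain (optionally weighted) average of the stations reporting at each time step. The upscaling function actually used for the matched-up dataset is not described in this section and may differ. The name `upscale_to_pixel` and the `weights` argument are hypothetical.

```python
def upscale_to_pixel(station_series, weights=None):
    """Average station soil moisture into one pixel-scale time series.

    station_series: dict mapping station id -> list of values (m3 m-3),
                    all of equal length; None marks a missing measurement.
    weights: optional dict mapping station id -> weight (an assumption for
             illustration; equal weights are used when omitted).
    """
    if not station_series:
        raise ValueError("at least one station is required")
    ids = sorted(station_series)
    n_times = len(station_series[ids[0]])
    if any(len(station_series[sid]) != n_times for sid in ids):
        raise ValueError("all station series must have the same length")

    pixel = []
    for t in range(n_times):
        weighted_sum = 0.0
        weight_total = 0.0
        for sid in ids:
            value = station_series[sid][t]
            if value is None:
                continue  # skip stations with a gap at this time step
            w = 1.0 if weights is None else weights[sid]
            weighted_sum += w * value
            weight_total += w
        # No reporting station means no pixel-scale value for this step.
        pixel.append(weighted_sum / weight_total if weight_total > 0 else None)
    return pixel
```

The resulting pixel-scale series can then be paired with the 9-km satellite retrievals for the same reference pixel. Time steps with no reporting station are returned as `None` so that they can be excluded from the comparison.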
The matched-up in situ dataset was provided by the NASA Cal/Val team at 9-km pixel resolution for the period of April–July 2015 (O’Neill et al. 2016). The 9-km matched-up in situ dataset is used to evaluate the three disaggregated 9-km products (SMAP 9 km, LR 9 km, and RT 9 km). At this point, there are no matched-up in situ data to validate the 1-km SM dataset derived from the regression tree method.
The scatterplots of downscaled SM retrievals against the matched-up datasets for the sites in TxSON (48010902) and Walnut Gulch (16010921) are shown in Fig. 4. The TxSON network has been selected by the NASA Cal/Val team as a core calibration and validation site for the SMAP mission (Colliander et al. 2017b). The in situ measurements of TxSON are geographically distributed within the SMAP 3-, 9-, and 36-km EASE-2 grid with replication at each satellite footprint (Caldwell et al. 2018). At the TxSON site (Fig. 4a), the thermal inertial LR algorithm (LR 9 km) was found to be very comparable to the SMAP enhanced product (SMAP 9 km), matching the upscaled in situ measurements closely, although dry biases can be detected. However, the regression tree method (RT 9 km) presented relatively larger fluctuations than the other algorithms for the sites in the TxSON network. At Walnut Gulch (Fig. 4b), the SMAP enhanced product showed an obvious dry bias, while the remaining downscaled SM products corrected the underestimation to a certain degree.
The ubRMSEs of each SM product averaged over all matched-up pairs are listed in Table 3. The statistics indicate that all the downscaled SM products show reasonable agreement with the matched-up in situ measurements. The ubRMSEs of the 9-km SM dataset using the regression tree algorithm and of the SMAP enhanced product are close to each other, while the lowest ubRMSE comes from the thermal inertial LR method.
6. Discussion and conclusions
To find a reliable downscaling algorithm for operational downscaling of SMAP coarse-scale SM retrievals, this study evaluated three algorithms using site-by-site comparison from intensive in situ SM networks across different continents, as well as matched-up in situ observations within SMAP reference pixels.
The site-by-site validation results indicated that the downscaled 9-km SMAP SM data presented better overall agreement with the in situ measurements for most networks, compared to the SMAP 36-km product. For the CREST-SMART network in New York (Fig. 3a), the NASA enhanced 9-km product showed moderate improvement in accuracy, while the largest improvement came from the thermal inertial linear regression algorithm, which reduced ubRMSE by 0.037 m3 m−3 and strengthened correlation by 0.345. Retrievals from the regression tree method present slightly lower ubRMSE but also lower correlation coefficients. The validation results on the USDA ARS networks (Fig. 3b) indicate that the selected downscaling algorithms improve surface SM accuracy, with reduced ubRMSE over the validation period. The ubRMSE of the SMAP enhanced product is very close to that of the original SMAP SM retrievals at 36-km resolution. The accuracy of products derived from the regression tree method is considerably enhanced at both 9- and 1-km scales. Among the data fusion algorithms tested, the thermal inertial linear regression method shows the best agreement with the in situ observations, with the highest correlation of 0.729. SMAP SM retrievals at coarse and finer resolutions were validated with ground observations from the OzNet network over the period extending from September 2015 to May 2016 (Fig. 3c). It is encouraging to find that both downscaled products show lower ubRMSE and higher correlation than the coarse-scale SMAP SM product. However, downscaled SM datasets did not show a significant positive impact when compared to the Tibetan ground observations (Fig. 3d). The ubRMSE difference between the SMAP 36-km and disaggregated 9-km SM was found to be marginal, and the correlation became lower after downscaling. The root cause of this lower correlation needs further investigation.
The 9-km matched-up in situ dataset was adopted to assess the relative performance of the downscaled 9-km SM products. The matched-up in situ observations cover the period of April–July 2015 (O’Neill et al. 2016). All downscaled datasets outperformed the coarse-resolution SMAP product, with the lowest ubRMSE coming from the downscaled 9-km product derived from the thermal inertial linear regression method (LR 9 km).
Optical and thermal satellite observations at higher spatial resolution proved to be good sources for disaggregating coarse-resolution microwave products. However, downscaled products from algorithms that rely on optical/thermal observations cannot be obtained for cloudy areas. Future work will involve the introduction of microwave brightness temperature observations (AMSR2) under cloudy conditions to improve spatial coverage. In addition, the downscaling method that combines active and passive microwave observations will be included in future work. The NASA SMAP mission has been testing SAR data from ESA’s Sentinel-1A as a replacement for the SMAP radar data in the algorithm for the high-resolution SMAP SM product. This combined Sentinel-1A and SMAP radiometer product will be tested in future work.
Another limitation in the validation of downscaled SM products in this study is the lack of a comparison using spatially distributed high-resolution SM as a reference. It would be ideal to compare downscaled products with spatially distributed high-resolution soil moisture, such as airborne soil moisture datasets. However, airborne data were not available at the time of this assessment.
The main goal was to evaluate the characteristics of different downscaling algorithms against operationalization criteria and to transition the downscaled high-resolution SMAP SM product into NOAA numerical weather prediction (NWP) and national water model (NWM) operations. The Soil Moisture Operational Product System (SMOPS), which has been operational at NOAA/NESDIS since 2013, is well positioned to take advances attained during the research phase and work toward operational transition at NOAA (Zheng et al. 2018). The SMOPS provides global SM data products from individual sensors, such as the MetOp-A and MetOp-B ASCAT of EUMETSAT, GCOM-W1 AMSR2 of JAXA, and SMOS of ESA, as well as a blended product that combines them. The candidate downscaling algorithms discussed in this study are all simple and stable enough to be implemented, and the data sources of higher-resolution observations are reliable and can be accessed at NOAA NWP and NWM operations. Once the downscaling algorithm is finalized and high-resolution SMAP SM is generated, the downscaling system will be integrated into the SMOPS framework.
In summary, these results indicate that 1) downscaled 9-km SM products from all downscaling algorithms generally outperform the coarse-scale 36-km product for most in situ datasets; 2) the NASA enhanced SM product at 9-km grid resolution using SMAP overlapped radiometer observations is very consistent with the 36-km SMAP SM product, with moderately improved accuracy; 3) the data mining method using optical or thermal observations proved to have overall good performance in the generation of a fine-resolution product, with both disaggregated 9- and 1-km SM datasets showing improvements in accuracy compared to the coarse-resolution product globally; and 4) the downscaled 9-km SM product using the ESI in the thermal inertial linear regression method had the best agreement with the in situ datasets over the CONUS domain. The simple linear regression method is promising for the operational generation of a fine-resolution product.
This project was supported by the NASA SMAP Science Utilization Program (SUSMAP: NNH16AD19I). The authors declare that there is no conflict of interest regarding the publication of this paper. The manuscript contents are solely the opinions of the authors and do not constitute a statement of policy, decision, or position on behalf of NOAA or the U.S. Government. USDA is an equal opportunity employer and provider.