The Impact of Quadratic Nonlinear Relations between Soil Moisture Products on Uncertainty Estimates from Triple Collocation Analysis and Two Quadratic Extensions

Simon Zwieback Institute of Environmental Engineering, ETH Zurich, Zurich, Switzerland

Chun-Hsu Su Department of Infrastructure Engineering, University of Melbourne, Parkville, Victoria, Australia

Alexander Gruber Department of Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria

Wouter A. Dorigo Department of Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria, and Laboratory of Hydrology and Water Management, Ghent University, Ghent, Belgium

Wolfgang Wagner Department of Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria


Abstract

The error characterization of soil moisture products, for example, obtained from microwave remote sensing data, is a key requirement for using these products in applications like numerical weather prediction. The error variance and root-mean-square error are among the most popular metrics: they can be estimated consistently for three datasets using triple collocation (TC) without assuming any dataset to be free of errors. This technique can account for additive and multiplicative biases; that is, it assumes that the three products are linearly related. However, its susceptibility to nonlinear relations (e.g., due to sensor saturation and scale mismatch) has not been addressed. Here, a simulation study investigates the impact of quadratic relations on the TC error estimates [also when the products are first rescaled using the nonlinear cumulative distribution function (CDF) matching technique] and on those by two novel methods. These methods—based on error-in-variables regression and probabilistic factor analysis—extend standard TC by also accounting for nonlinear relations using quadratic polynomials. The relative differences between the error estimates of the ASCAT remotely sensed product by the quadratic and the linear methods are predominantly smaller than 10% in a case study based on remotely sensed, reanalysis, and in situ measured soil moisture over the contiguous United States. Exceptions with larger discrepancies indicate that nonlinear relations can pose a challenge to traditional TC analyses, as the simulations show they can introduce biases of either sign. In such cases, the use of nonlinear methods may complement traditional approaches for the error characterization of soil moisture products.

Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JHM-D-15-0213.s1.

Corresponding author address: Simon Zwieback, Institute of Environmental Engineering, ETH Zurich, Stefano-Franscini-Platz 3, 8093 Zurich, Switzerland. E-mail: zwieback@ifu.baug.ethz.ch

1. Introduction

Soil moisture has been identified as a crucial component of the earth system, in particular owing to its role as a link between the water, energy, and carbon cycles (Seneviratne et al. 2010). Observations of its temporal and spatial dynamics have great value and potential in applications as diverse as streamflow prediction (Brocca et al. 2012), crop monitoring (de Wit and van Diepen 2007), weather forecasting (Scipal et al. 2008a), and the mapping of disease vector abundance (Chuang et al. 2012). One approach to mapping soil moisture is based on land surface models that are driven by or coupled with atmospheric models (Koster and Milly 1997; Koster et al. 2009; Balsamo et al. 2015). Satellite-based remote sensing products provide independent estimates of surface soil moisture (Kerr et al. 2012; Wagner et al. 2013; Dorigo et al. 2015), as do in situ probes, which are often considered to be the most accurate method (Seneviratne et al. 2010; Crow et al. 2012). However, observational networks are sparse, and these measurements furthermore typically refer to much smaller spatial scales. The spatial resolution is also a common source of disparity between different models and remote sensing products (Wagner et al. 2007; Western et al. 2002). Additional inherent differences between soil moisture products have also been attributed to, for example, different parameterizations, variable sampling depths, the impact of vegetation, or instrument measurement noise (Beven 2001; Crow et al. 2012; Famiglietti et al. 2008; Mialon et al. 2015). The discrepancies with respect to the underlying “true” soil moisture are commonly partitioned into systematic and random deviations (Entekhabi et al. 2010; Yilmaz and Crow 2013). Both have to be taken into account when feeding observations into a model or when combining different datasets, for example, in data assimilation for streamflow prediction or weather forecasting (e.g., Reichle and Koster 2004; Crow and van den Berg 2010; Yilmaz et al. 2012).

The triple collocation (TC) technique can be applied to three such products to yield consistent estimates of the variance of the random errors (Stoffelen 1998; Zwieback et al. 2012c; Su et al. 2014). It can do so without declaring one of the products to be the truth, that is to say, free of errors (Caires and Sterl 2003; Dorigo et al. 2010). The three products do not have to be calibrated with respect to one another: it is possible to account for differences in the mean (additive biases between the products) and in the magnitude (multiplicative biases or different sensitivities to the underlying soil moisture; Yilmaz and Crow 2013; McColl et al. 2014; Su et al. 2014). By compensating for additive and multiplicative biases, TC essentially assumes that the systematic components of the products are linearly related. Besides linearity, the TC estimation of the error variance is based on additional assumptions, in particular that the random errors are uncorrelated with each other and with the true soil moisture. The validity of these two assumptions has recently been questioned by Yilmaz and Crow (2014). Furthermore, the characteristics of the soil moisture products, and hence the relationships between the products, should not change over time, for example, seasonally. However, soil moisture datasets often differ in their seasonal cycle (Drusch et al. 2005), so that temporal anomalies are frequently analyzed rather than the standard products themselves (Brocca et al. 2011; Dorigo et al. 2010). Several studies have suggested extensions to TC that can account for complex temporal changes at various time scales (e.g., Loew and Schlenz 2011; Zwieback et al. 2013; Su and Ryu 2015). By contrast, the assumption of a linear relation between the datasets, that is, that—noise notwithstanding—they only differ by an offset and a different scaling factor, has not been addressed explicitly. 
The cumulative distribution function (CDF) matching procedure can partially account for nonlinear relations between the datasets, but it is limited in that it does not explicitly consider errors in the soil moisture products (Yilmaz and Crow 2014).

Evidence for such nonlinear relations between different products has been found by, for example, Mittelbach et al. (2011), Drusch et al. (2005), De Lannoy et al. (2007a), and de Rosnay et al. (2009). For instance, Mittelbach et al. (2011) observed a declining sensitivity of an impedance probe as the soil became increasingly wet. De Rosnay et al. (2009) also identified a similar saturation effect when comparing areal soil moisture with point measurements. The associated discrepancy in the spatial scale may also have contributed. When upscaling point- to field-scale measurements, De Lannoy et al. (2007a) noted that a nonlinear method (CDF matching) was more accurate than a linear mapping. The gap in spatial scale was also a possible contributing factor to the nonlinear relations between in situ and remotely sensed soil moisture found by Drusch et al. (2005), in addition to inherent differences between these techniques. Many of the other previously identified origins of additive and multiplicative biases, such as differences in the sampling depth and the parameterization of the retrieval method or hydrologic model, may also contribute to nonlinear relations (De Ridder 2003; Koster et al. 2009).

Such nonlinear relations potentially impact common error metrics such as the correlation coefficient or the estimates obtained by the TC technique (Gruber et al. 2016). Their influence on TC error estimates of satellite soil moisture products has not been studied in detail before. We want to address this open question using both linear TC analysis and nonlinear techniques: the established nonparametric CDF matching technique as well as two new extensions of TC that can handle quadratic effects. The first one views TC as an instance of factor analysis (FA), a statistical technique that is widely applied in the social sciences and psychology (Wall and Amemiya 2007). The rationale of FA is to express observed quantities (the soil moisture products) as noisy linear measurements of a limited number of latent variables called factors (the underlying soil moisture). Nonlinear extensions of FA are usually based on additional assumptions: in our case we require that both the underlying soil moisture and the errors follow normal distributions. The second method that we propose extends the regression approach suggested by Scipal et al. (2008b) but is restricted to small quadratic nonlinearities. We analyze the accuracy and the robustness of these techniques in a simulation study. The accuracy in the presence of quadratic nonlinearities is assessed by comparing the simulation results with the known underlying error magnitudes as the degree of quadratic nonlinearity varies. In particular, this will allow us to see how a violation of the linearity assumption can impact the error estimates by the linear TC technique, for example, what magnitude and sign a possible bias can have. We furthermore analyze the robustness of the methods with respect to violations of their assumptions (e.g., normality and model misspecifications) using dedicated scenarios in which these assumptions are violated. 
Subsequently, we compare the methods in a small case study, where we assess a coarse-scale remote sensing product [Advanced Scatterometer (ASCAT)] by comparing it to a coarse-scale model (ERA-Interim/Land) and around 100 in situ probes of the U.S. Climate Reference Network (USCRN) over the contiguous United States. Such a combination of datasets is commonly employed in TC studies in order to characterize the uncertainty of the remotely sensed soil moisture product (e.g., Dorigo et al. 2015; Loew and Schlenz 2011): here, we focus on the sensitivity of the ASCAT error estimates to the choice of method, in particular whether it accounts for linear or quadratic relations. In contrast to the simulations, the underlying error magnitudes are not known, but the differences between the linear and the quadratic methods may give an indication of the impact of quadratic nonlinearities on TC error characterization. We apply these methods to both the standard products and the temporal anomalies. As soil moisture commonly exhibits a pronounced seasonal variation and as discrepancies in the climatology are frequently a major source of difference between different soil moisture datasets (Brocca et al. 2011), the removal of the seasonal signal when forming the temporal anomalies may potentially account for nonlinear relations induced by the mismatched climatological components. We thus hypothesize that the impact of nonlinearities on the error estimates of the temporal anomalies is smaller. Taken together, the simulations and the case study will provide a first insight into the practical relevance of nonlinear, especially quadratic, relations in TC error characterization of soil moisture datasets (Gruber et al. 2013; Su et al. 2014; Dorigo et al. 2015).

2. Triple collocation

Even though TC is commonly applied to time series of measurements, the temporal character of the data is only rarely accounted for explicitly (Zwieback et al. 2013; Su and Ryu 2015). We will also neglect the temporal character of the data, that is, the instances of the time series will essentially be treated as independent samples. In particular, this implies that the error terms are not autocorrelated and that the error variances and the offsets between the products (e.g., an additive bias) do not change.

In its simplest form, the TC approach assumes that such offsets do not exist, that is, that the products are matched. They all are direct measurements of the underlying soil moisture subject to additive errors (zero mean, variance ):
e1
To simplify the analysis of the FA method in section 3b, we propose an alternative parameterization in terms of a dimensionless soil moisture anomaly T, which has zero mean and unit standard deviation. The first equation of (1) is then expressed as , and in order for the products to be scaled, we require the linear scaling factors of all products i to be equal.

For such matched products, the TC technique can yield unbiased and consistent estimates of the error variances under the following assumptions (Zwieback et al. 2012c; Yilmaz and Crow 2014; Gruber et al. 2016):

  • errors have zero mean,

  • they are mutually uncorrelated, and

  • they are not correlated with the underlying soil moisture signal (i.e., T). If the latter is treated as a deterministic quantity, the error variances must not depend on T.

In practice, different systematic relations to the underlying soil moisture are commonly observed. In particular, many datasets require the compensation of an offset (additive bias) and of differences in the sensitivity (multiplicative bias), for example, for Y:
e2
Not only do different products commonly have different linear scaling factors, but they may also differ in the measure in which the water content is expressed. For instance, the ASCAT remotely sensed soil moisture product reports the degree of saturation (DoS; dimensionless) rather than the more common absolute volumetric soil moisture (m³ m⁻³). In such cases, the linear scaling factors and error variances also have different dimensions, making comparisons of the latter difficult. Normalized error quantities, such as the signal-to-noise ratio (SNR), have been found to be more amenable to such comparisons (McColl et al. 2014; Gruber et al. 2016). The SNR of product i measures the magnitude of the underlying signal Si relative to the noise level Ni:
$$\mathrm{SNR}_i = \frac{S_i}{N_i}. \tag{3}$$
When the error is independent of the anomaly T, the variance Vi of product i is the sum of the power due to the signal, Si, and that due to the noise, Ni. Thus, the signal power can be estimated from the observed product variance by Si = Vi − Ni. The noise power estimate is directly given by the estimated error variance.
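As a minimal illustration of this variance decomposition, the following Python sketch converts an observed product variance and an error variance estimate into an SNR expressed in decibels. The function name `snr_db` and the numerical values are hypothetical and are not taken from the study.

```python
import math


def snr_db(product_variance, error_variance):
    """SNR in decibels from the product variance V_i and the error variance N_i."""
    # The signal power is what remains of the observed variance once the
    # noise power has been subtracted.
    signal_power = product_variance - error_variance
    if signal_power <= 0 or error_variance <= 0:
        raise ValueError("signal and noise power must both be positive")
    return 10.0 * math.log10(signal_power / error_variance)


# Hypothetical values: product variance 0.0104 and error variance 0.0004,
# both in (m3 m-3)^2, i.e., a signal power 25 times the noise power.
example_snr = snr_db(0.0104, 0.0004)
```

Because the SNR is a ratio of powers, it stays comparable between products that report soil moisture in different units.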

One way to estimate this error variance using TC was introduced by Stoffelen (1998). The method also provides estimates of the scaling factors with respect to a chosen reference product i, for which the additive bias is zero and the scaling factor is one (Zwieback et al. 2012c; Yilmaz and Crow 2013). This approach, which has since been shown to be an instance of the instrumental variables (IV) method (Su et al. 2014), can be used to rescale the data to achieve matched products as in (1). Alternatively, these error estimates can also be obtained directly without explicit rescaling (Stoffelen 1998; Caires and Sterl 2003; McColl et al. 2014; Gruber et al. 2016).
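The direct estimation without explicit rescaling can be sketched in terms of the sample covariances of the three products. The snippet below is an illustrative sketch of this covariance-based form and is not the authors' implementation; the function name, offsets, scaling factors, and noise levels are hypothetical.

```python
import numpy as np


def tc_error_variances(x, y, z):
    """Covariance-based TC estimates of the error variances of x, y, and z."""
    c = np.cov(np.vstack([x, y, z]))
    # Each error variance is the product variance minus the signal power,
    # which is inferred from the covariances with the other two products.
    var_x = c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]
    var_y = c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2]
    var_z = c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1]
    return var_x, var_y, var_z


# Synthetic check with linearly related products and independent errors
# (all numbers are hypothetical).
rng = np.random.default_rng(0)
n = 200_000
truth = rng.standard_normal(n)
true_sd = (0.02, 0.03, 0.04)
x = 0.30 + 0.10 * truth + true_sd[0] * rng.standard_normal(n)
y = 0.20 + 0.05 * truth + true_sd[1] * rng.standard_normal(n)
z = 0.25 + 0.08 * truth + true_sd[2] * rng.standard_normal(n)
estimated_sd = np.sqrt(tc_error_variances(x, y, z))
```

The estimates refer to each product in its own units, so no reference product has to be chosen for the error variances themselves.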

Rather than relying on the IV rescaling, Scipal et al. (2008b) suggested a similar approach that is based on total least squares (TLS). This regression technique yields estimates of the factors with which the datasets can be matched when the error variances are known. As the error variances can be estimated by TC, Scipal et al. (2008b) iterated the TC analysis and TLS regression until convergence.

These two methods provide consistent estimates of the scaling factors, that is, the results can be expected to converge to the actual value as the number of samples increases (Zwieback et al. 2012c; Gruber et al. 2016). The property of consistency is also attractive in data assimilation studies, whereas inconsistent methods were deemed suboptimal by Yilmaz and Crow (2013). A common example of such an inconsistent method is CDF matching.

3. Potential methods for tackling nonlinear relations

a. CDF matching

The idea of the CDF matching approach is to transform a soil moisture dataset so that its marginal distribution matches or approximately matches that of a reference product during the study period (Reichle and Koster 2004). Thus, a bias or a discrepancy in the dynamic range can be removed, and also the higher-order moments become identical (Gao et al. 2007). There are two general approaches to matching soil moisture products based on their distribution. The first one applies a mapping (or scaling) to one dataset so that the lower-order moments (e.g., the first and second) of the results equal those of the reference (Brocca et al. 2013; Yilmaz and Crow 2013). The alternative approach employed here operates directly on the empirical cumulative distribution functions of the reference product X and the product to be matched Y. Following Drusch et al. (2005) and Brocca et al. (2011), the two datasets are sorted, and the differences between the corresponding elements of the two ranked datasets are computed. Subsequently, a polynomial in Y is fitted to the differences. This polynomial provides a correction term to Y so that its CDF approximately matches that of X. We use a polynomial of degree 5, which is considered to provide an adequate balance between the number of fitted parameters and model generality (Mahfouf 2010; Brocca et al. 2011, 2013).
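The ranking, differencing, and polynomial fitting steps can be sketched as follows. This is a minimal sketch that assumes two equally long, collocated series; the function name `cdf_match` and the synthetic data are hypothetical, and the fitting routine (`numpy.polynomial.Polynomial.fit`) is a choice made here for numerical stability, not necessarily the one used in the study.

```python
import numpy as np
from numpy.polynomial import Polynomial


def cdf_match(y, x_ref, degree=5):
    """Rescale y so that its empirical CDF approximately matches that of x_ref."""
    y = np.asarray(y, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    if y.shape != x_ref.shape:
        raise ValueError("this sketch assumes equally long, collocated series")
    y_sorted = np.sort(y)
    # Differences between corresponding elements of the two ranked datasets.
    rank_diff = np.sort(x_ref) - y_sorted
    # Polynomial in y fitted to the differences (the domain is rescaled
    # internally by Polynomial.fit).
    correction = Polynomial.fit(y_sorted, rank_diff, deg=degree)
    # The polynomial acts as a correction term that is added to y.
    return y + correction(y)


# Hypothetical synthetic data: y is mildly nonlinearly related to the reference.
rng = np.random.default_rng(1)
t = rng.standard_normal(20_000)
x_ref = 0.30 + 0.10 * t + 0.02 * rng.standard_normal(t.size)
y = 0.20 + 0.05 * t + 0.005 * t**2 + 0.03 * rng.standard_normal(t.size)
y_matched = cdf_match(y, x_ref)
```

Note that the noise in both series enters the fitted mapping, which is one way to see why the rescaling does not explicitly account for errors in the products.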

This general method does not yield consistent estimates of the rescaling relations, which Yilmaz and Crow (2013) deemed detrimental for data assimilation studies. Negative effects also occur when estimating the error variances in standard linear triple collocation analysis of section 2 (see the treatment in section S1 of the supplemental material).

b. Factor analysis: EM algorithm

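Nonlinear relations between the soil moisture products could alternatively be accounted for in a parametric approach, whereby they are represented by mean map functions $f_i$:
$$X = f_X(T) + \varepsilon_X, \qquad Y = f_Y(T) + \varepsilon_Y, \qquad Z = f_Z(T) + \varepsilon_Z, \tag{4}$$
where $\varepsilon_i$ denotes the additive error of product $i$, and where such a function may be a polynomial of degree $K$ parameterized by weights $w_{ik}$ as
$$f_i(T) = \sum_{k=0}^{K} w_{ik} T^k. \tag{5}$$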
According to statistical terminology, the dimensionless anomaly T is a latent or hidden variable, while the products X, Y, and Z are observed. In the social sciences, operations research, and many more disciplines, datasets that might be described by models similar to (4) are studied using factor analysis (Anderson 2003; Wall and Amemiya 2007). Linear mean maps are common, but several nonlinear parametric extensions and associated algorithms for estimating these mean maps and the error variances have been proposed in the literature (Yalcin and Amemiya 2001). The estimation by maximum likelihood is among the most popular methods and has been found applicable to a large variety (e.g., linear, different kinds of nonlinear functions, and different dimensionality) of models. A potential drawback is its requirement of specifying the probability distributions of T and the noise terms explicitly (Yalcin and Amemiya 2001). These are commonly assumed to be normal distributions even though the validity of this assumption has been questioned repeatedly (e.g., Mooijaart 1985; Anderson and Amemiya 1988). However, both simulations and observational studies have indicated that the estimates are commonly robust to deviations from normality (Browne 1987; Browne and Shapiro 1988; Yalcin and Amemiya 2001). Owing to the simplicity of the normal distribution, we will thus also assume it for both the underlying dimensionless anomaly T—despite the presence of physical bounds, skewness, nonnormal kurtosis, or bimodality (Rodriguez-Iturbe et al. 1999; Milly 2001; Western et al. 2002; Teuling et al. 2005)—and the noise terms (Crow et al. 2011). The robustness of this assumption will be tested by simulations and by comparison with other methods that do not rely on normality. In line with the restrictions outlined previously, we will further assume that the samples j of the observed products, which are typically given as time series, are independent and identically distributed. 
The impact of autocorrelation on standard TC analysis is limited as long as the length of the time series is much greater than the time scale of the autocorrelation (Zwieback et al. 2012c, 2013). It does, however, generally inflate the uncertainty of the parameter estimates (Zwieback et al. 2013). The increase in uncertainty also occurs in related techniques such as principal component analysis or least squares regression, where it has been shown to impact tests of significance (Neville et al. 2004; Erzini et al. 2005). Autocorrelations, seasonality, and other temporal aspects of the soil moisture products could in future be explicitly modeled (Crow and Yilmaz 2014). The additional temporal parameters of such models may (once estimated from the data) provide a more comprehensive characterization of the product uncertainties.
In addition, we will restrict ourselves to linear and quadratic polynomials for the mean maps [K = 1 or 2 in (5)]. The quadratic polynomial of product i can be characterized by its quadratic nonlinearity parameter μi:
e6
We will prescribe that one of these mean maps be linear, that is, μi = 0, in order for the estimation problem to be solvable (Yalcin and Amemiya 2001). The identification of this one product is part of the assumptions inherent in the parameterization, which also include the restriction to quadratic expansions of the mean maps. In contrast to standard TC, the weights of all three products are free parameters in this approach, owing to the anomaly parameterization of section 2, in which there is no reference dataset. This parameterization makes the problem solvable if , that is, the model is identifiable (cf. Yalcin and Amemiya 2001). Under these assumptions and given observations of the products, the likelihood function l is uniquely defined. However, the evaluation of the likelihood function is expensive as one has to integrate out (marginalize over) the unobserved anomaly T, thus rendering standard optimization approaches inefficient (MacKay 2003). In such cases, the expectation–maximization (EM) algorithm is an established method for maximizing l and thus estimating the parameters (Neal and Hinton 1998; Zwieback et al. 2012a). However, the standard EM algorithm is computationally intractable for this particular problem. An extension of the EM algorithm, variational EM, provides approximate solutions by maximizing a lower bound on l instead (Jaakkola et al. 1996; Frey and Hinton 1999). This bound is called the variational free energy F and is given by the sum over all samples j of Fj, that is, with
e7
using an index notation in which i denotes the product (of which there are N; i = 0 corresponds to the anomaly T) and k denotes the degree of the mean map polynomial. The observed value (sample j) of product i is denoted by xi. The idea of the variational approach is to introduce, for each bound Fj specific to a sample j, variational parameters and . These represent, respectively, the mean and variance of the approximation to the distribution of the dimensionless anomaly given the observed xi of all N products. The advantage of the variational approach is that this latter distribution, which is difficult to evaluate, does not have to be computed. Instead, it is approximated by minimizing the variational free energy of sample j with respect to the variational parameters. This minimization constitutes the expectation step (E-step) of the EM algorithm. As part of the E-step, the output means and output covariances , which depend on the variational parameters, have to be evaluated: they are nonphysical quantities that describe the distributions of hypothetical soil moisture products. A more detailed explanation and the relevant formulas are given in section S2 of the supplemental material. It also contains a derivation of the maximization step (M-step), which yields estimates of the error variances and the polynomial coefficients for each product. These estimates depend on the results of the E-step: the algorithm alternates these two steps until convergence, thus yielding final estimates of the parameters of the probability distributions ( and ). The probability distributions can be parameterized in terms of quadratic Q mean maps or exclusively in terms of linear L mean maps, and we refer to the associated EM algorithms for parameter estimation as EM-Q and EM-L, respectively.

c. TLS extension

An alternative approach to incorporating nonlinear parametric functions may be to extend the error-in-variables (or TLS) regression approach employed by Scipal et al. (2008b) in the linear case. Their approach consists of three steps. Step 1 is the TLS regression, which provides estimates of the scaling factors , whereby one of the products (say X) is taken as a reference with and . Step 2 rescales the data (e.g., from the observed Y) and employs TC on this rescaled dataset [see (1)], which yields error estimates of the scaled soil moisture products. These error estimates are subsequently scaled back in step 3 to the actually observed products, so that, for example, is converted to . These error variances are required in step 1 and this cycle is iterated until convergence.

The regression analysis of step 1 is also applicable to higher-order polynomials if the mean map of the reference product remains linear, for example, using the efficient algorithm by Boggs et al. (1987). However, the rescaling in step 2 of the original approach cannot be extended so easily. First, the inverse mapping from Y to its scaled version will not be unique: for instance, in the quadratic case there will in general be two values of that correspond to any value of Y [note that the sub-subscript s (scaled) indicates that the coefficient links Y′ to Y rather than T to Y]. We propose to address this by choosing the solution that is closer to the value obtained with , which we expect to be the “correct” solution when this quadratic term is small, that is, in the anomaly parameterization. Second, such rescaling is not compatible with the error model of (4) as the noise ceases to be additive upon application of a nonlinear mapping, as opposed to a linear transformation. We thus expect this approximation to be increasingly accurate as approaches 0. The mapping of the error variance of step 3 is also only valid in the linear case. We propose to use a linear approximation (first-order Taylor expansion around the mean ) to achieve this scaling of the error variance, a step that is also expected to be increasingly accurate as . In summary, we propose to map Y to using the quadratic equation and to use a linear approximation to scale the TC error variance estimate of to that of Y. The mean map parameters are subsequently converted from the parameterization based on a reference product to that based on the dimensionless anomaly T (section 2).
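The two approximations proposed above, namely the choice of root in the inverse quadratic mapping and the first-order scaling of the error variance, can be sketched as follows. The function names and the coefficient names `a0`, `a1`, and `a2` (linking the scaled Y′ to the observed Y) are hypothetical. Expanding around the mean of the scaled product and clipping a negative discriminant to zero are assumptions of this sketch, not steps described in the text.

```python
import numpy as np


def invert_quadratic_map(y, a0, a1, a2):
    """Map observed y to the scaled y' assuming y = a0 + a1*y' + a2*y'**2."""
    y = np.asarray(y, dtype=float)
    linear_solution = (y - a0) / a1  # solution obtained with a2 = 0
    if a2 == 0.0:
        return linear_solution
    # Assumption of this sketch: a negative discriminant is clipped to zero.
    discriminant = np.maximum(a1**2 - 4.0 * a2 * (a0 - y), 0.0)
    root_plus = (-a1 + np.sqrt(discriminant)) / (2.0 * a2)
    root_minus = (-a1 - np.sqrt(discriminant)) / (2.0 * a2)
    # Choose the root that is closer to the solution of the linear case.
    use_plus = (np.abs(root_plus - linear_solution)
                <= np.abs(root_minus - linear_solution))
    return np.where(use_plus, root_plus, root_minus)


def scale_error_variance(var_scaled, mean_scaled, a1, a2):
    """First-order Taylor mapping of the error variance of y' to that of y."""
    slope = a1 + 2.0 * a2 * mean_scaled  # derivative of the map at the mean
    return slope**2 * var_scaled


# Hypothetical coefficients with a small quadratic term.
a0, a1, a2 = 0.2, 0.5, 0.05
y_scaled = np.array([-1.0, 0.0, 0.1, 0.3, 1.0])
y_observed = a0 + a1 * y_scaled + a2 * y_scaled**2
recovered = invert_quadratic_map(y_observed, a0, a1, a2)
```

Both helpers reduce to the linear case when `a2` is zero, in line with the expectation that the approximations become more accurate as the quadratic term vanishes.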

d. Method comparisons

In the simulations and the case study, we will employ the methods outlined above and summarized in Table 1, along with the underlying assumptions (e.g., linear vs nonlinear and distributions).1 To separate the relaxed linearity assumption from the additional parametric assumption, the EM algorithm will be run in a linear (EM-L) and a quadratic mode (EM-Q), and similarly for TC based on TLS regression. When all these methods are applied to measured data, we compare two error estimation techniques m1 and m2 by analyzing the differences and the relative differences (RD) between the estimates of the soil moisture RMSE = of any product i:
e8
e9
where the relative difference will be commonly expressed in percent. The SNR of different methods will be compared using their ratio (RSNR):
e10
which we will express in decibels. In contrast to the case of real data, the true underlying values are known in the simulated case study, and the estimates can thus be compared with them directly.
Table 1. Overview of the methods mk employed for error characterization, including the section where they are described and a summary of their assumptions.

4. Synthetic case study

a. Scenarios

To assess the suitability and limitations of these methods, we perform a synthetic case study. The soil moisture products are simulated according to the error model (4). The model consists of the dimensionless soil moisture anomaly (i.e., T) and the noisy soil moisture products X, Y, and Z, each reported as volumetric water content (VWC; m³ m⁻³). Their expected values are related to the underlying dimensionless anomaly by polynomial functions, which can be linear or quadratic owing to the TLS and EM methods’ restriction to quadratic mean maps. We distinguish between different degrees of nonlinearity by varying the quadratic nonlinearity parameter, with μ = 0 corresponding to linear mean maps. To test the sensitivity of the different methods with respect to the distributional forms of the random variables, we furthermore add scenarios where the simulations are based on a range of probability distributions. There is also one scenario in which the linear mean map in the parameterization of the quadratic EM and TLS approaches is distinct from that used in the simulations. In addition, the impact of the SNR on the applicability of the methods is examined. The base scenario, from which the other scenarios are derived, is associated with the model of (11):
e11
with μ set to 0.1. All weight coefficients have dimensions of volumetric soil moisture as they link the dimensionless anomaly T to the simulated soil moisture products that are reported in terms of volumetric soil moisture. These products differ with respect to their mean (e.g., 0.3 vs 0.2 m3 m−3 for X and Y, respectively), their linear sensitivity with respect to the underlying soil moisture (e.g., X and Y by a factor of 2, that of X corresponding to a dynamic range of about 0.4 m3 m−3), and their quadratic sensitivity. Also, the noise levels, that is, the standard deviations of the noise terms, vary by almost a factor of 3, comparable to the spread found in previous studies (Dorigo et al. 2010; Leroux et al. 2013). The three noise terms and T are assumed independent from one another.

The additional scenarios are distinguished from this base case as follows:

  • scenario linear (L): the quadratic terms in products Y and Z are set to zero, that is, μ = 0;

  • scenario quadratic 2 (Q2): the quadratic terms are twice as large, that is, μ = 0.2;

  • scenario quadratic 4 (Q4): the quadratic terms are four times as large, that is, μ = 0.4;

  • scenario right skewed (RS): T is modeled by a skew normal distribution (Azzalini 2005), where the first- and second-order moments remain the same and the skewness is set to 0.75 (the maximum possible value being 1.0);

  • scenario heavy tails (HT): both T and the noise terms have heavy tails, that is, a larger kurtosis, and they follow a Student’s t distribution with the same first and second moments as in N but with 5 degrees of freedom (Anderson 2003);

  • scenario bimodal (BM): T follows a bimodal mixture of Gaussians distribution (MacKay 2003), consisting of two components with equal weight and mean ±0.5 (maximum possible value being 1.0);

  • scenario truncated Gaussian (TG): T follows a truncated Gaussian distribution whose extent is limited to plus or minus one standard deviation, which is set to 1.85;

  • scenario low noise (LN): all mean maps are multiplied by 4, thus increasing the SNR as the error variance is kept constant;

  • scenario high noise (HN): all mean maps are divided by 4, thus reducing the SNR as the error variance is kept constant; and

  • quadratic 2 with switched mean maps (Q2s): same as Q2, but EM-Q and TLS-Q use quadratic mean maps for X and a linear mean map for Y.
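The distribution parameters quoted in the list can be checked against the requirement that the second moment of T stays the same. The sketch below assumes that T has zero mean and unit variance in the base scenario, which the bounds quoted above suggest but which is not stated explicitly here; the function names are hypothetical.

```python
import math

def truncated_normal_variance(sigma, a):
    """Variance of a zero-mean Gaussian with scale sigma truncated to [-a*sigma, a*sigma]."""
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # standard normal pdf at a
    mass = math.erf(a / math.sqrt(2.0))                      # P(|Z| <= a) for standard normal Z
    return sigma ** 2 * (1.0 - 2.0 * a * pdf / mass)

def bimodal_component_variance(mean, total_variance=1.0):
    """Variance of each component of an equal-weight Gaussian mixture with means +/- mean."""
    return total_variance - mean ** 2

def student_t_unit_variance_scale(dof):
    """Factor that rescales a standard Student's t variate to unit variance (dof > 2)."""
    return math.sqrt((dof - 2.0) / dof)

tg_var = truncated_normal_variance(sigma=1.85, a=1.0)  # scenario TG
bm_var = bimodal_component_variance(mean=0.5)          # scenario BM
t_scale = student_t_unit_variance_scale(dof=5)         # scenario HT
print(tg_var, bm_var, t_scale)
```

Under the unit-variance assumption, the truncated Gaussian with a scale of 1.85 has a variance very close to one, and a component mean of ±1.0 would leave no variance for the mixture components, consistent with the stated maximum.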

For each scenario, 250 time series of 350 independent samples are drawn from the corresponding probability distribution. All the methods of Table 1 are applied to all 250 time series. For each method and scenario, this yields a distribution of estimated RMSEs that can be directly compared to the known underlying RMSE.
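The Monte Carlo procedure can be sketched as follows. The coefficients, the parameterization of the quadratic term, and the function names are hypothetical and do not reproduce the model of (11); the estimator is the widely used covariance form of TC, which may differ in detail from the implementations compared in this study.

```python
import random
import statistics

def simulate(mu, n, rng):
    """Draw n samples of (X, Y, Z); all coefficients are hypothetical, not those of (11)."""
    xs, ys, zs = [], [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)       # dimensionless anomaly T (standard normal)
        q = mu * (t * t - 1.0)        # centred quadratic term, zero for mu = 0
        xs.append(0.30 + 0.10 * t + rng.gauss(0.0, 0.05))         # linear mean map
        ys.append(0.20 + 0.05 * (t + q) + rng.gauss(0.0, 0.02))   # quadratic mean map
        zs.append(0.25 + 0.08 * (t + q) + rng.gauss(0.0, 0.035))  # quadratic mean map
    return xs, ys, zs

def cov(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

def tc_rmse_x(xs, ys, zs):
    """Covariance-based TC estimate of the RMSE of X; None if the variance is negative."""
    var = cov(xs, xs) - cov(xs, ys) * cov(xs, zs) / cov(ys, zs)
    return var ** 0.5 if var >= 0.0 else None

def median_rmse_x(mu, n_series=250, n_samples=350, seed=0):
    """Median RMSE estimate over all valid series and the fraction of invalid estimates."""
    rng = random.Random(seed)
    estimates = [tc_rmse_x(*simulate(mu, n_samples, rng)) for _ in range(n_series)]
    valid = [e for e in estimates if e is not None]
    return statistics.median(valid), 1.0 - len(valid) / n_series

median_linear, invalid_linear = median_rmse_x(mu=0.0)
median_quadratic, invalid_quadratic = median_rmse_x(mu=0.4)
print(median_linear, invalid_linear, median_quadratic, invalid_quadratic)
```

In this toy setup the true RMSE of X is 0.05 m3 m−3, so the two medians can be compared with it directly. The direction of the bias in the quadratic case depends on the hypothetical coefficients chosen here and should not be read as a reproduction of the scenarios above.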

b. Results and discussion

For these scenarios, the distributions of the RMSE estimates obtained using the methods of Table 1 are summarized in Figs. 1 and 2. For the linear scenario (i.e., L), all methods typically achieve comparable and satisfactory results, with median biases typically below 5%. In particular, the differences in both median and spread between the linear (TC and EM-L) and the quadratic (EM-Q and TLS-Q) methods are small compared to the variability of the estimates. Only for product Y are there noticeable (≈8%) median biases for methods TLS-Q and CDF.

Fig. 1.

Distribution of the RMSE (m3 m−3) estimates obtained with five different methods from Table 1 (distinguished by their color; TLS-L is not shown as the results are visually indistinguishable from TC) for four scenarios with increasing μ: L (μ = 0), nonlinear N (quadratic relation with μ = 0.1), and Q2 and Q4 (with μ = 0.2 and 0.4, respectively). The colored bars indicate the median of the 250 simulations, and the error bars span 95% of the distribution. The underlying true value is shown in red for reference. The annotation at the bottom of each bar gives the percentage of invalid estimates corresponding to negative variances.

Citation: Journal of Hydrometeorology 17, 6; 10.1175/JHM-D-15-0213.1

Fig. 2.

Distribution of the RMSE (m3 m−3) estimates obtained with five different methods from Table 1 (distinguished by their color; TLS-L is not shown as the results are visually indistinguishable from TC) for nine scenarios: N (quadratic relation with μ = 0.1), RS, HT, BM, TG, LN, HN, Q2s, and L (μ = 0). The colored bars indicate the median of the 250 simulations, and the error bars span 95% of the distribution. The underlying true value is shown in red for reference. The annotation at the bottom of each bar gives the percentage of invalid estimates corresponding to negative variances.


The differences between the methods are larger in the quadratic scenarios (Fig. 1). The median bias of the nonlinear EM method (i.e., EM-Q) barely (<5%) changes as the quadratic nonlinearity parameter (i.e., μ) increases. On the other hand, that of its linear counterpart (i.e., EM-L) increases with μ. Its sign and magnitude depend on the product. For product X, which is linearly related to the dimensionless anomaly T, it is negative and for Q4 it attains a value of more than 0.02 m3 m−3, corresponding to a relative magnitude of more than 40%. There is a positive bias for products Y and Z. Their magnitude increases nonlinearly with μ: for product Z it increases from 0.002 m3 m−3 at μ = 0.1 to 0.026 m3 m−3 at μ = 0.4. Comparable median biases also occur for the standard TC method. In addition, the underestimation for product X is concomitant with the occurrence of invalid negative variance estimates.

The TLS-Q method is also subject to biases in the nonlinear scenarios. These are not necessarily evident in the median bias, but they can be related to the upward-skewed distribution of the estimates. Product Y is most affected by such skewed distributions. While its median bias remains below 0.01 m3 m−3 in magnitude as μ increases to 0.4, the spread of the distribution more than doubles. Conversely, products X and Z are not characterized by a comparable increase in the spread of the estimates, but their median biases show an increasing trend with μ. However, they remain small (<40%) compared to those of the linear methods. The median biases of the CDF matching method are comparable to those of TLS-Q in magnitude, that is, they are also typically at least a factor of 2 smaller than those of the linear methods. The CDF method is also affected by large spreads with upward-skewed distributions of the estimates. This effect is most pronounced for scenario Q4 for product X and tends to increase with μ for all products.

The CDF matching method is particularly affected by deviations from normality (Fig. 2). In the HT scenario, its estimates are not robust (95% coverage interval exceeding twice the RMSE for all products), and invalid values occur with a frequency of 14%. The EM-Q method, which is explicitly based on the assumption of normality, is less affected by heavy tails and also robust to skewed, bimodal, or truncated Gaussian distributions. Its linear cousin (EM-L) and also the standard TC method, which makes no distributional assumptions, achieve comparable results, but in a few cases (e.g., product Z and HT) the median biases increase by more than 50% compared to the base scenario. Also the TLS-Q method is found to be affected by deviations from normality, even though it is not based on such an assumption. While it is robust to heavy-tailed, skewed, and bimodal distributions, the distribution of the estimates becomes wide and biased in the TG scenario, similar to scenarios Q2 and Q4.

The scenario with low noise (LN) points toward the limitations of the linear methods (EM-L and TC), which are subject to median biases exceeding 0.02 m3 m−3 and which produce invalid estimates in up to 71% of the cases. A possible reason for this phenomenon may be the nature of the TC estimator, in which a positive quantity (the error variance) is estimated using a difference. For a fixed signal magnitude, the number to be estimated approaches zero as the SNR increases, so that any violation of the assumptions (e.g., a nonlinear mean map) can have a relatively larger impact. For product Z, the linear methods overestimate the RMSE, and the magnitude of this positive median bias is comparable to the scenario Q4 in which the quadratic nonlinearity parameter (i.e., μ) is 4 times as big. While the performance of the EM-Q and TLS-Q methods remains comparable to the base scenario, that of CDF matching deteriorates appreciably as the 95% coverage interval for products X and Z exceeds twice the RMSE. By contrast, a high noise level (HN) affects the TLS-Q most strongly, which is characterized by wide (95% coverage more than twice the RMSE) and upward-skewed distributions. The susceptibility of TLS-based triple collocation to high noise levels is consistent with the findings by Boggs et al. (1988), who analyzed and described the impact of random noise on the TLS regression estimates.
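The role of the difference in the TC estimator can be illustrated with a simplified worked example. It is a sketch under assumptions that are not taken from the simulation setup: T is standard normal, the noise terms are independent of T and of each other with variances \sigma_i^2, and the mean maps are written as f_i(T) = \alpha_i + \beta_i T + \gamma_i (T^2 - 1) with \gamma_X = 0 (the symbols \gamma_i are introduced here for illustration only). The estimator is taken to be the common covariance form of TC, evaluated with population covariances C.

```latex
\begin{align*}
\operatorname{Cov}(T, T^2) &= 0, \qquad \operatorname{Var}(T^2) = 2, \\
C_{XY} &= \beta_X \beta_Y, \qquad C_{XZ} = \beta_X \beta_Z, \qquad
C_{YZ} = \beta_Y \beta_Z + 2 \gamma_Y \gamma_Z, \\
\hat{\sigma}_X^2 &= C_{XX} - \frac{C_{XY} C_{XZ}}{C_{YZ}}
  = \sigma_X^2 + \beta_X^2 \, \frac{2 \gamma_Y \gamma_Z}{\beta_Y \beta_Z + 2 \gamma_Y \gamma_Z}, \\
\hat{\sigma}_Y^2 &= C_{YY} - \frac{C_{XY} C_{YZ}}{C_{XZ}}
  = \sigma_Y^2 + 2 \gamma_Y \left( \gamma_Y - \gamma_Z \frac{\beta_Y}{\beta_Z} \right).
\end{align*}
```

Under these assumptions the bias terms can take either sign, depending on the signs and relative sizes of \gamma_Y and \gamma_Z. Multiplying all mean maps by a factor k scales the bias terms by k^2 while \sigma_i^2 is unchanged, so the bias relative to the error variance grows by a factor of 16 for k = 4, in line with the behavior described for the LN scenario.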

The TLS-Q method is more susceptible to misspecified mean maps than EM-Q in the Q2s scenario, in which the data are simulated with X and Y having linear and quadratic mean maps, respectively, whereas the mean maps are switched in the parameterization of TLS-Q and EM-Q. The results of TLS-Q show a larger median bias than the linear methods for product Z (0.14 m3 m−3). The 95% coverage is furthermore comparable to the RMSE of products X and Z. Conversely, EM-Q appears to be robust to such a model misspecification with, for example, a median bias of <0.01 m3 m−3 for product Z.

Overall, the EM-Q method consistently achieves the most reliable results, both in the presence of quadratic nonlinearities of varying magnitude and in cases where the normality assumption is violated. This indicates that deviations from normality may not be critical in many situations, consistent with previous related studies (Yalcin and Amemiya 2001). Its linear counterpart and also standard TC, on the other hand, are susceptible to deviations from linearity. Such nonlinearities are also seen to affect TLS-Q. While this method is designed to handle small μ ≪ 1, its bias and spread (especially for μ = 0.4) suggest that this limitation may already be problematic at the values of μ considered here. Large spreads and the biases that the upward-skewed distributions of the estimates induce have also been found for the CDF matching method, corresponding to, for example, HT or pronounced nonlinearities (Q4). While certain limitations of CDF matching in TC analysis in the linear case are known [see section 3a and Yilmaz and Crow (2013, 2014)], the impact of nonlinearities and the dependence on the distribution have not been described before and thus raise questions about the general applicability of CDF matching methods.

5. Uncertainty analysis of the ASCAT soil moisture product over the contiguous United States

a. Datasets and methods

The remotely sensed soil moisture product (TUW 2015) is derived from observations of the ASCAT instruments onboard the Meteorological Operational (MetOp) satellite series (Wagner et al. 2013). These active microwave instruments operate at C band (wavelength of 5.7 cm). The Vienna University of Technology (TU Wien) Water Retrieval Package, version 5.5 (WARP 5.5), algorithm is a change detection approach, with which time series of the DoS of the soil are derived from the multiangular backscatter observations at a resolution of 25 km (Naeimi et al. 2009; Wagner et al. 2013). The spatial posting of the product is 12.5 km. As the change detection model linking satellite observations and soil moisture does not apply during frozen conditions or snowmelt (Zwieback et al. 2015), such observations are screened using the approach by Naeimi et al. (2012).

The USCRN by the National Oceanic and Atmospheric Administration/National Climatic Data Center (NOAA/NCDC) consists of more than 100 climate monitoring stations within the contiguous United States. It is part of the International Soil Moisture Network (ISMN; Dorigo et al. 2011). All sites are equipped with dielectric soil moisture probes (Stevens Hydra Probe II SDI-12) at several depths, of which we use the hourly observations of VWC made at 5 cm depth (Bell et al. 2013; NCDC 2015). The quality controls by Dorigo et al. (2013) were applied to these data to filter out gross errors.

The ERA-Interim/ERA-Land dataset (ECMWF 2015) is based on the European Centre for Medium-Range Weather Forecasts (ECMWF) land surface model forced by ERA-Interim atmospheric data, with precipitation corrected toward observations (Balsamo et al. 2015). The dataset has a spatial resolution of around 80 km and a temporal sampling interval of 3 h (Albergel et al. 2013). The soil is represented by four layers, of which we use the topmost one (0–7 cm depth). The volumetric soil moisture content has been found to reflect that of in situ measurements accurately, thus making it suitable for comparisons with remotely sensed observations (Balsamo et al. 2015; Dorigo et al. 2015; Albergel et al. 2013).

For the remote sensing and reanalysis datasets, the grid cell closest to the location of each in situ probe is considered. Owing to the irregular temporal sampling of the remote sensing product, the datasets are collocated by choosing the in situ and reanalysis product nearest in time (within a window of 3 h; Dorigo et al. 2010). Stations with fewer than 100 collocated observations during the study period (from 1 January 2011 to 31 December 2014) are discarded, yielding a total of 112 stations. All methods of Table 1 were applied to these temporally matched soil moisture time series [referred to as standard products (std)] and also to short-term temporal anomalies (anom) calculated by subtracting the mean within a time window of 35 days (Brocca et al. 2011). Temporal anomalies are typically defined with respect to a long-term climatology, but the reference with respect to a moving average that is employed here is commonly used when the data are of limited temporal extent (Albergel et al. 2009; Dorigo et al. 2015). For both standard products and temporal anomalies, we prescribed linear mean maps for the ASCAT product and quadratic mean maps for the remaining ones. When assessing the sensitivity of the ASCAT error estimates to the choice of estimation method, we focus on the relative impact on the RMSE and SNR using the RD and RSNR metric, respectively (see section 3d).
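The temporal matching and the anomaly calculation can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: time stamps are assumed to be given in hours and sorted, the 35-day window is assumed to be centred on each sample, and the function names are hypothetical.

```python
from bisect import bisect_left

def nearest_within(times, values, t, max_gap_hours=3.0):
    """Value whose (sorted) time stamp is nearest to t, or None if the gap is too large."""
    i = bisect_left(times, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
    if not candidates:
        return None
    j = min(candidates, key=lambda k: abs(times[k] - t))
    return values[j] if abs(times[j] - t) <= max_gap_hours else None

def short_term_anomalies(times, values, window_days=35.0):
    """Subtract the mean of all samples within a centred window (centring is an assumption)."""
    half = 0.5 * window_days * 24.0
    anomalies = []
    for t, v in zip(times, values):
        window = [w for s, w in zip(times, values) if abs(s - t) <= half]
        anomalies.append(v - sum(window) / len(window))
    return anomalies

# Hypothetical example: three daily samples of a soil moisture product
demo_times = [0.0, 24.0, 48.0]
demo_values = [0.1, 0.2, 0.3]
demo_anomalies = short_term_anomalies(demo_times, demo_values)
demo_match = nearest_within([0.0, 3.0, 6.0], [1.0, 2.0, 3.0], 3.9)
print(demo_anomalies, demo_match)
```

A station would then be retained only if the number of matched triplets reaches the threshold of 100 collocated observations.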

b. Results

The error magnitude (RMSE) estimates for the ASCAT product based on the nonlinear quadratic EM method typically differ by less than 10% from those based on linear methods. The comparisons between all the methods are summarized in terms of the RD and D for the ASCAT product in Fig. 3 and Table S1 (in the supplemental material), and in terms of the estimated RMSE in Table S2 (in the supplemental material). On average, the error estimates are larger when estimated by EM-Q than when estimated by EM-L or TC (for both the standard products and the temporal anomalies). The larger errors of EM-Q correspond to lower SNRs (see Fig. S2 in the supplemental material), but the corresponding impact on the SNRs is typically less than 1.5 dB in terms of the RSNR. The difference between the nonlinear and linear methods is found to be related to the estimated mean nonlinearity parameter (i.e., μ; see Fig. 4a). Larger relative differences (RD > 10%) only occur when the estimated μ exceeds 0.1. A similar relation is also observed when comparing the RD with the estimated quadratic nonlinearity parameter of the ERA-Land product μERA or that of the in situ measurements μin_situ, rather than with their mean value μ (see Fig. S3 in the supplemental material).

Fig. 3.

Box plots of the RD (ASCAT) between different pairs of methods, summarizing all the 112 USCRN soil moisture stations. The method pairs m1, m2 are distinguished by their color according to the legend. The results for the standard products and the temporal anomalies are reported in the left and right half of the figure, respectively. Each box plot summarizes the respective median by a horizontal line, the interquartile range (IQR) by a colored box (the whiskers span 2.5 times the IQR), and the remaining values by plus signs. The vertical axis is linear from −1 to 1 and logarithmic in the absolute value for greater magnitudes.


Fig. 4.

Relation between estimates of μ (i.e., the mean of μERA and μin_situ) and method comparison error metrics observed in the case study. Observed relation between μ and RD between (a) EM-Q and EM-L and (b) EM-Q and TLS-Q. (c) The inconsistency of the estimates of μ (in situ product) obtained using EM-Q and TLS-Q and the RD between these two methods.


In absolute terms, the difference D between the RMSE estimates by EM-Q and the linear methods is typically (50%) less than 0.4% DoS in magnitude for the ASCAT product (Fig. 5a and Table S1 in the supplemental material). Around 15% of the stations have D(EM-Q, EM-L) exceeding 2% DoS in magnitude: in all these cases, the ASCAT error estimates by EM-Q are larger than those by its linear counterpart (i.e., EM-L). However, they rarely correspond to large RD > 25%, as the ASCAT RMSEs for these stations, which are mainly located in the Great Plains and the Western Cordillera, tend to be comparatively large, that is, they typically exceed 15% DoS (Figs. 6a,d). For the ERA-Land product, the D(EM-Q, EM-L) found at these stations has the opposite sign, that is, the error estimates by EM-Q are smaller than those by EM-L, with typical D between 2% and 3% VWC (Fig. 5b). Conversely, the D of the in situ product does not show such a clear correspondence at these sites (Fig. 5c).

Fig. 5.

Maps of the conterminous United States showing the in situ stations and different error metrics pertaining to all three products. The colors correspond to D(EM-Q, EM-L) for the (a) ASCAT and (b) ERA-Land product. The associated color bar is shown at top, and it also pertains to (c) D(EM-Q, EM-L) for the in situ measurements. (d) The estimate of the EM-L ASCAT RMSE (% DoS), along with its color bar.


Fig. 6.

Maps of the conterminous United States showing the in situ stations and different error metrics pertaining to the ASCAT product. The colors correspond to the relative differences (a) RD(EM-Q, EM-L) and (b) RD(EM-Q, TLS-Q), respectively. The associated color bar is shown at top, and it also pertains to (c) RD(EM-Q, CDF). (d) The estimate of the ASCAT RMSE (% DoS), along with its color bar.


The comparisons of the second nonlinear method TLS-Q with the linear methods reveal similar values with respect to D and RD. For both EM-Q and TLS-Q the disagreements with CDF matching are prominent (e.g., outliers with RD ≫ 10%). These disagreements are also larger when comparing the EM-Q and TLS-Q with each other than when comparing either of them with EM-L and TLS-L, respectively.

While the impact of the relaxed linearity assumption on the estimated error magnitudes is predominantly smaller than 10%, there are also stations with larger relative discrepancies that indicate more pronounced nonlinearities, such as those arising from the apparent saturation of one product with respect to another. The geographical distribution of these stations in Fig. 6 (and Fig. S4 in the supplemental material for the SNRs) does not show a clear correspondence to climate zones or topography. A particularly large RD between EM-Q and EM-L exceeding 20% is found for the Sebring station (Florida). The mean maps estimated by EM-Q and TLS-Q correspond to a diminished sensitivity or saturation of the in situ measurements for low ASCAT soil moisture values (Fig. 7a). The opposite effect, an apparent saturation for high ASCAT values, is inferred for both the standard product and the temporal anomalies at Spokane (Washington) in Figs. 7b and 7c, respectively. The RD between the nonlinear and the linear EM methods is more than a factor of 3 larger for the former (11%) than for the latter (3%). A comparable RD of 2% is also found for the temporal anomalies of Chillicothe (Missouri; Fig. 7d), and the disagreement between EM-Q and TLS-Q is similarly limited (3%–5%). The error estimates obtained using these two nonlinear methods differ by more than 20% in 5% of the sites, two of which are shown in Figs. 7e and 7f: Lafayette (Louisiana) and Chatham (Michigan). They correspond to situations where there is only a weak relation between the in situ and ASCAT soil moisture estimates, which may be related to the proximity of the Atchafalaya Swamp and the Gulf of Mexico for the Lafayette station or of Lake Superior (≈15 km) for the Chatham station. Such large RD between these two nonlinear methods are observed to occur when the quadratic nonlinearity is pronounced (μ > 0.1, see Fig. 4b) and when the two methods do not agree on the size of the quadratic nonlinearity parameter (see Fig. 4c). When comparing EM-Q to CDF instead, larger RD occur in southwestern Arizona, in Nevada, and across Nevada’s border in California (see Fig. 6c). These correspond to sites where the estimated ASCAT RMSE is particularly small (<4% DoS, Fig. 6d).

Fig. 7.

Scatterplots of observed ASCAT (standard product or anomaly) and in situ soil moisture for six different stations. Each panel also shows the relation between the expected value of these two products for a given anomaly T as inferred by EM-Q (orange), TLS-Q (blue), and TC (green). The RD(EM-Q, EM-L) and RD(EM-Q, TLS-Q) are annotated in the upper-left and upper-right corner of each panel, respectively, whereas the name and geographic coordinates are given underneath.


The discrepancies between the error estimates of the nonlinear and linear EM methods are more pronounced for the standard products than for the temporal anomalies. For the Spokane station in Figs. 7b and 7c, for instance, the RD is 8 percentage points (pp) smaller for the anomalies (11% vs 3%). When comparing all stations, we find more generally that large RD values greater than 15% are more common for the standard products (10%) than for the anomalies (4%; see also Table S1 in the supplemental material). Values of RD(std) this large are typically reduced to less than 15% when the errors of the temporal anomalies are estimated instead (Fig. 8). By contrast, smaller values of RD(std) do not exhibit such a tendency, and stations with RD(std) < RD(anom) also occur (30% in total). A similar relation is observed when the impact of the nonlinearities is determined by comparison of TLS-L and TLS-Q. The main difference is the increased frequency of large RD(anom) > 15%.

Fig. 8.

Comparison of standard products and temporal anomalies with respect to the RD observed in the case study. The relation between the RD(m1, m2) of the standard product and the temporal anomalies is shown, with m1 being a quadratic method and m2 its linear counterpart: (a) m1 = EM-Q and m2 = EM-L and (b) m1 = TLS-Q and m2 = TLS-L.


6. Discussion

a. Influence of quadratic nonlinearities on the error estimates

The impact of introducing nonlinear relations in the form of quadratic mean maps on the estimated ASCAT error magnitudes seems to be limited, as the RD between EM-Q and EM-L rarely exceeds 15%. Such large values are restricted to cases when the quadratic nonlinearity is pronounced (μ > 0.1, Fig. 4a). However, such a pronounced quadratic nonlinearity does not necessarily lead to large RD (see Fig. 4a). In other words, a large deviation from linearity as expressed by the quadratic measure (i.e., μ) seems to be a necessary but not a sufficient condition for EM-L to diverge appreciably from EM-Q. Also, the simulation study indicates that nonlinearities are not sufficient for EM-L to cease to provide reliable error estimates, as this estimator and also the other linear methods perform well for product Y despite its quadratic mean map with μ = 0.4 in scenario Q4. In this simulation scenario, the underestimation of the RMSE by EM-L (compared to both the true value and the estimate by EM-Q) for product X is accompanied by an overestimation for product Z. A similar seemingly compensatory deviation between EM-Q and EM-L for two products occurs in the case study, where about 15% of the stations have large negative D(EM-Q, EM-L) > 2% DoS for the ASCAT product and large positive D(EM-Q, EM-L) > 1% VWC for the ERA-Land product. Taken together, these findings indicate that the error induced by neglecting nonlinear quadratic relations can be of either sign for any given product. However, owing to its focus on the magnitude of the deviations, this study has not addressed ways of statistically assessing whether a nonlinear model provides a better fit. Possible ways of testing the evidence for nonlinear relations may include bootstrapping or the analysis of the likelihood within the factor analysis model, which, however, can be sensitive to the assumptions on which they are based, for example, regarding autocorrelation or normality (MacKay 2003; Yalcin and Amemiya 2001).

In the case study, the linear methods are more similar to each other than to the quadratic ones. The discrepancies of the nonlinear EM-Q with the standard TC method and also with TLS-L are comparable to those with EM-L. As the EM-L method relies on the normality assumption, its close agreement with standard TC may indicate that this assumption is not a limiting factor in the case study. A comparatively small impact of violations of this assumption was also observed in the simulation study of Fig. 2 for EM-L and EM-Q.

In contrast to the EM methods, the TLS-based algorithms make no assumption of normality. The quadratic version (TLS-Q), however, does require that the deviations from linearity be small. The simulation study confirmed this limitation by showing that the deviations (especially the spread) increase with the degree of quadratic nonlinearity. In the case study, such direct comparisons are impossible as the truth is not known. However, the disagreement between TLS-Q and EM-Q was found to increase with the estimated quadratic nonlinearity parameter (i.e., μ; Fig. 4). The observed dependence on μ suggests that the assumption of small nonlinearities can be a limiting factor in practice. The scatterplots of Figs. 7c–e show cases where the quadratic nonlinearity parameter estimated by TLS-Q exceeds that obtained with EM-Q. In both cases the TLS-Q estimates of μ may be interpreted as too large by a human observer, that is, the fitted relations seem questionable. However, in Fig. 7f the relation obtained with TLS-Q appears more natural than those estimated by EM-L or EM-Q. The latter two cases correspond to low SNRs, suggesting that the estimation of the mean maps is less stable when the soil moisture products are not closely related. As the simulation study has also indicated, such a lack of covariability limits the applicability and accuracy of TLS (Boggs et al. 1988) and also of standard TC (Su et al. 2014). While the simulations do not reveal associated limitations of the methods based on factor analysis, the limited scope of the synthetic case study does not preclude such limitations. Conversely, it does indicate a lack of robustness of TLS-Q to the specification of which product is assumed to have a linear mean map. This is in contrast to the EM-Q model based on factor analysis, which also requires that one of the products have a linear mean map: however, the simulations in Fig. 2 indicate that it is less affected by such a model misspecification. 
The simulation does not address model misspecifications with respect to the functional form of the mean maps, whereas, for example, Fig. 7a indicates that more flexible functions such as cubic polynomials may be more appropriate in practice. The appropriate model specification is hence an important issue that deserves further investigation.

An alternative way to match the products is the nonparametric technique CDF matching. As opposed to the other methods, it only compares the marginal distributions of the soil moisture products, that is, it does not explicitly account for errors. As shown in Fig. S1 (in the supplemental material; results based on normal distributions and linear mean maps), it can also be applied when the SNR is very small (10^−1), provided that the SNRs of the different products are equal (Yilmaz and Crow 2013). This requirement is likely problematic in practice, as the SNRs vary between the products, thus inducing biases even in the linear case. The simulations of section 4 suggest furthermore that deviations from normality (e.g., heavy tails) can exacerbate the biases induced by CDF matching. These biases are found to be mainly due to the wide and asymmetric distribution of the estimates, which additional simulations indicate is not due to the length of the time series. The origin and nature of this phenomenon thus deserve closer scrutiny, which may also affect potential mitigation strategies, for example, the use of nonparametric approximation approaches to matching the two CDFs. Despite these drawbacks, CDF matching has the advantage that it does not require a specific parametric form, as opposed to EM-Q and TLS-Q.
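The reliance of CDF matching on the marginal distributions alone can be made concrete with a rank-based sketch. It assumes a plain empirical quantile mapping with linear interpolation; the implementation referred to in section 3a is not reproduced here and may differ (e.g., by using piecewise-linear or polynomial fits). The function name is hypothetical.

```python
import math

def cdf_match(source, reference):
    """Rescale source so that its empirical CDF matches that of reference."""
    n, m = len(source), len(reference)
    ref_sorted = sorted(reference)
    order = sorted(range(n), key=lambda i: source[i])  # indices of source in ascending order
    matched = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n          # empirical non-exceedance probability of this sample
        pos = p * m - 0.5             # fractional position in the sorted reference
        lo = min(max(math.floor(pos), 0), m - 1)
        hi = min(lo + 1, m - 1)
        frac = min(max(pos - lo, 0.0), 1.0)
        matched[i] = ref_sorted[lo] + frac * (ref_sorted[hi] - ref_sorted[lo])
    return matched

# Hypothetical example: only the ranks of the source values matter
demo_matched = cdf_match([3.0, 1.0, 2.0], [10.0, 30.0, 20.0])
print(demo_matched)
```

Because the mapping uses only the ranks of the source values, the pairing of individual observations across products plays no role, which is why the errors are not accounted for explicitly.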

All the estimation methods may furthermore be affected when the error magnitude is related to the soil moisture content. Several examples in Fig. 7 (namely, Figs. 7a,d) indicate that this can indeed be the case. For instance, in Sebring in Fig. 7a, the disagreement between ASCAT and the in situ data increases as the soil becomes wetter. This trend appears to be related to the inferred mean maps, which indicate a saturation of the sensitivity. For this particular example, the RMSE varies by at least a factor of 3 between dry and wet conditions. Yilmaz and Crow (2014) and Mittelbach et al. (2011) also observed a connection between the random errors and the underlying soil moisture. As the average soil moisture often varies seasonally, the dependence of the error magnitude on the soil water content may also contribute to seasonal variations of the RMSE, as previously observed by, for example, Loew and Schlenz (2011) and Zwieback et al. (2012b). Uncertainty assessments that account for seasonal variations may therefore mitigate the impact of such a dependence even if they do not account for it explicitly. As these dependences can be large (e.g., a factor of 3 increase in the RMSE), they may also be relevant in data assimilation or product merging. If these applications are to account for variable error magnitudes, appropriate error characterization methods such as the multiplicative approach by Alemohammad et al. (2015) will be required.

b. Standard products and temporal anomalies

One of our central objectives was the comparison of the impact of nonlinearities on the error estimates of the anomalies compared to those of the standard products. We find that the RD are on average smaller for the anomalies. In particular, large RD > 15% occur almost exclusively for the standard product (Fig. 8). The Spokane station is one such example, as the RD of its standard product is more than 3 times as large as that of its anomaly. The scatterplots in Fig. 7 suggest that this difference is indeed related to the reduced nonlinearity of the anomaly datasets. The nonlinear relations between the standard products thus seem to be dominated by different representations of the climatology component (cf. Su and Ryu 2015). However, there appear to be other factors besides the different climatologies that affect the observed impact of nonlinearities. In particular, Fig. 8 shows that there are also cases where the impact of the nonlinearities is larger in the anomaly product.

c. Practical relevance

Overall, the results of the case study suggest that the impact of quadratic nonlinear relations on the error characterization of soil moisture products may be limited in practice. The relative differences between linear and nonlinear quadratic methods of typically <10% may be smaller than the effects of other sources of uncertainty. One such source is the limited sample size. Previous studies have limited the error estimation to time series exceeding 100 samples (see section 5). The relative uncertainty of the RMSE for this length was estimated by Zwieback et al. (2012c) to be around 20%; this value depends on the statistical distribution of the noise terms. Additional sources of uncertainty (e.g., spatial mismatch and temporal variability) have also been identified as critical (Gruber et al. 2013, 2016; Su et al. 2015). Whether they are more important than nonlinear relations will depend on the application and dataset. In our study we find that nonlinearities seem to be relevant for about 15% of the stations [as measured by relative differences >10%], for example, the Sebring station in Fig. 7a. In such cases, the degree of quadratic nonlinearity may be an indication that the soil moisture products do not represent the same “signal.” Alternatively, if the connection between the datasets is deemed meaningful, the explicit consideration of nonlinear relations may be beneficial in practice, for example, in data assimilation (DA) studies. The nonlinear mean maps correspond to nonlinear observation operators, which cannot be handled directly by certain techniques such as the Kalman filter or optimal interpolation (Montzka et al. 2012). They could, however, be incorporated into nonlinear methods, for example, the particle filter or the ensemble Kalman filter. An alternative to estimating these relations before the actual DA is the inclusion of a bias model within the DA method (Lievens et al. 2015). The bias models could be chosen to consist of quadratic polynomials or other parametric functions capable of reproducing the observed nonlinear relations (De Lannoy et al. 2007b).

Accounting for nonlinear relations may also improve the merging and downscaling of soil moisture products. Two methods that can do so are CDF matching and copulas (Liu et al. 2012; Leroux et al. 2014). Both are built on a probabilistic framework, but they do not explicitly consider the uncertainties of these products. However, probability theory can also be applied to incorporate these uncertainties, as, for instance, in the factor analysis model of section 3b. Within its probabilistic model of (4), the merging of two soil moisture products X and Y may consist of a conditioning operation. The conditional distribution of T given X and Y represents the knowledge and uncertainty about T given the observations of the two products. For linear mean maps, Yilmaz et al. (2012) analyzed such a probabilistic merging and provided analytic formulas. For nonlinear mean maps, such closed-form solutions do not exist. However, the variational EM algorithm of section 3b incorporates an approximate solution. The description of the procedure in section S2 (in the supplemental material) also includes references to alternative methods, for example, Markov chain Monte Carlo sampling. Future research may identify the most suitable of these methods for merging, for example, in terms of speed and accuracy. More generally, the impact of nonlinear relations on product merging and blending remains an open question.
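For the linear case, the conditioning operation has a closed form. The following minimal sketch assumes a normally distributed T and two products X = T + e_x and Y = a_y + b_y T + e_y with independent, zero-mean normal errors. The function name and all parameter names are hypothetical, and the parameterization is an illustrative assumption that need not coincide with that of (4) or of Yilmaz et al. (2012).

```python
def merge_linear_gaussian(x, y, mu_t, var_t, var_ex, a_y, b_y, var_ey):
    """Mean and variance of T given observations x and y (linear Gaussian model).

    Assumed model: T ~ N(mu_t, var_t), X = T + e_x, Y = a_y + b_y * T + e_y,
    with independent errors e_x ~ N(0, var_ex) and e_y ~ N(0, var_ey).
    """
    # Precisions (inverse variances) of the prior and both observations add up.
    precision = 1.0 / var_t + 1.0 / var_ex + b_y ** 2 / var_ey
    var_post = 1.0 / precision
    # Precision-weighted combination of the prior mean and both observations.
    mean_post = var_post * (
        mu_t / var_t + x / var_ex + b_y * (y - a_y) / var_ey
    )
    return mean_post, var_post


# Example with made-up numbers: equal weights for prior, X, and Y.
mean_t, var_post_t = merge_linear_gaussian(
    x=3.0, y=3.0, mu_t=0.0, var_t=1.0, var_ex=1.0, a_y=0.0, b_y=1.0, var_ey=1.0
)
```

The posterior variance is never larger than the prior variance or the error variance of X, which is the sense in which merging reduces uncertainty under these assumptions. For quadratic mean maps the posterior is no longer normal, and a formula of this kind does not apply.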

7. Conclusions

Nonlinear relations between the three soil moisture products whose error variances are to be estimated cannot be handled by standard triple collocation (TC) analyses. Our simulations show that quadratic relations between soil moisture products can induce biases of either sign and that they can also be associated with negative, nonphysical error variance estimates. Triple collocation estimation can also be applied after preprocessing by CDF matching, which can account for arbitrary nonlinear relations: however, it does not explicitly consider random errors. Our theoretical analyses and simulations indicate that the accuracy and robustness of CDF matching (bias and spread of the estimates) are sensitive to the noise level, the probability distributions (e.g., normal, skewed, or heavy tailed), and the degree of quadratic nonlinearity of the products.
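The susceptibility of standard TC to a quadratic relation can be illustrated with a small synthetic experiment. The sketch below uses the covariance-based form of the TC estimator. The simulation setup (a standard normal signal, the chosen coefficients, and an error variance of 0.25 for each product) is an illustrative assumption and not the configuration of the simulation study.

```python
import random


def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def tc_error_variances(x, y, z):
    """Covariance-based TC estimates of the error variances of x, y, and z."""
    cxy, cxz, cyz = cov(x, y), cov(x, z), cov(y, z)
    return (
        cov(x, x) - cxy * cxz / cyz,
        cov(y, y) - cxy * cyz / cxz,
        cov(z, z) - cxz * cyz / cxy,
    )


def simulate(n, quad, seed):
    """Three noisy products of a common signal t; y has a quadratic term."""
    rng = random.Random(seed)
    t = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [ti + rng.gauss(0.0, 0.5) for ti in t]
    y = [0.2 + 0.8 * ti + quad * ti ** 2 + rng.gauss(0.0, 0.5) for ti in t]
    z = [-0.1 + 1.2 * ti + rng.gauss(0.0, 0.5) for ti in t]
    return x, y, z


# Same random draws in both runs; only the quadratic coefficient differs.
linear = tc_error_variances(*simulate(20000, 0.0, 1))
quadratic = tc_error_variances(*simulate(20000, 0.3, 1))
```

In this symmetric configuration the quadratic term is absorbed into the error variance estimate of `y`, which then exceeds the simulated value of 0.25. With skewed signal distributions or other coefficients the deviation may differ in magnitude and sign, so this example shows one instance rather than the general behavior.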

We introduce two estimation methods that can account for quadratic nonlinearities: they are both parametric methods that assume one of the relations to be linear but allow the others to be quadratic. First, total least squares (TLS) as part of triple collocation is extended to quadratic relations; however, it can only deal with small nonlinearities. In the simulation study, this technique becomes increasingly prone to outliers as the degree of quadratic nonlinearity increases. The presence and prevalence of these outliers, which are also related to the noise level, induce a bias, as they correspond to an overestimation of the error variance. Second, we develop a probabilistic method based on the variational expectation–maximization (EM) algorithm, which can handle both quadratic and linear relations. It assumes both the errors and the underlying soil moisture to be normally distributed. Empirically, however, this method is found to be the most robust error estimation approach in the simulation study; that is, it is least affected by nonlinearities of varying degrees and by nonnormal distributions (e.g., skewed or heavy tailed).

Comparisons of the quadratic EM method with its linear counterpart and the standard TC approach in a test study (coarse-scale remotely sensed and reanalysis soil moisture, in situ measurements) reveal typical relative differences in the error estimates of less than 10%. The discrepancies with CDF matching and both the linear and nonlinear TLS methods are of similar magnitude. However, we also observe a number of cases where quadratic nonlinear relations had a larger impact on the error estimates, with the relative difference exceeding 15%. Such cases occur almost exclusively for the standard products rather than the temporal anomalies, indicating that error estimates of the latter are typically less affected by quadratic nonlinearities. The observed quadratic relations correspond to the saturation of the sensitivity of one of the products with respect to others. They have been attributed in previous studies to, for example, different spatial scales, the parameterization of the hydrologic or remote sensing retrieval methods, or physical limitations of the sensors. Such nonlinear relations are usually not considered explicitly in applications of the soil moisture datasets, including data assimilation and product merging. In certain cases, these may profit from the error characterization provided by the suggested methods or new developments that can account for error autocorrelation or different functional forms of the nonlinear relations (e.g., cubic or higher-order polynomials). The explicit consideration of nonlinear relations thus has the potential to not only provide a more comprehensive uncertainty characterization of soil moisture products but also to improve the use of these products in applications as diverse as weather and flood forecasting.

Acknowledgments

The authors are grateful to Alexandra Konings, Kaighin McColl, and an anonymous reviewer for their insightful comments and suggestions, and to Christa D. Peters-Lidard for her handling of the manuscript submission and reviews. They gratefully acknowledge support from the European Commission via the FP7 project Earth2Observe (http://www.earth2observe.eu/; Grant Agreement 603608) and from EUMETSAT via the H-SAF (http://hsaf.meteoam.it/) project. They further thank Emanuel Dutra, Clément Albergel, and Gianpaolo Balsamo from ECMWF and Matthias Langer from ZAMG for providing the ERA-Land data. The ASCAT data are provided within the H-SAF project (http://hsaf.meteoam.it/).

REFERENCES

  • Albergel, C., Rüdiger C., Carrer D., Calvet J.-C., Fritz N., Naeimi V., Bartalis Z., and Hasenauer S., 2009: An evaluation of ASCAT surface soil moisture products with in-situ observations in southwestern France. Hydrol. Earth Syst. Sci., 13, 115–124, doi:10.5194/hess-13-115-2009.
  • Albergel, C., and Coauthors, 2013: Monitoring multi-decadal satellite Earth observation of soil moisture products through land surface reanalyses. Remote Sens. Environ., 138, 77–89, doi:10.1016/j.rse.2013.07.009.
  • Alemohammad, S. H., McColl K. A., Konings A. G., Entekhabi D., and Stoffelen A., 2015: Characterization of precipitation product errors across the United States using multiplicative triple collocation. Hydrol. Earth Syst. Sci., 19, 3489–3503, doi:10.5194/hess-19-3489-2015.
  • Anderson, T. W., 2003: An Introduction to Multivariate Statistical Analysis. Wiley, 752 pp.
  • Anderson, T. W., and Amemiya Y., 1988: The asymptotic normal distribution of estimators in factor analysis under general conditions. Ann. Stat., 16, 759–771, doi:10.1214/aos/1176350834.
  • Azzalini, A., 2005: The skew-normal distribution and related multivariate families. Scand. J. Stat., 32, 159–188, doi:10.1111/j.1467-9469.2005.00426.x.
  • Balsamo, G., and Coauthors, 2015: ERA-Interim/Land: A global land surface reanalysis data set. Hydrol. Earth Syst. Sci., 19, 389–407, doi:10.5194/hess-19-389-2015.
  • Bell, J. E., and Coauthors, 2013: U.S. Climate Reference Network soil moisture and temperature observations. J. Hydrometeor., 14, 977–988, doi:10.1175/JHM-D-12-0146.1.
  • Beven, K., 2001: How far can we go in distributed hydrological modelling? Hydrol. Earth Syst. Sci., 5, 1–12, doi:10.5194/hess-5-1-2001.
  • Boggs, P. T., Byrd R. H., and Schnabel R. B., 1987: A stable and efficient algorithm for nonlinear orthogonal distance regression. SIAM J. Sci. Statist. Comput., 8, 1052–1078, doi:10.1137/0908085.
  • Boggs, P. T., Spiegelman C. H., Donaldson J. R., and Schnabel R. B., 1988: A computational examination of orthogonal distance regression. J. Econom., 38, 169–201, doi:10.1016/0304-4076(88)90032-2.
  • Brocca, L., and Coauthors, 2011: Soil moisture estimation through ASCAT and AMSR-E sensors: An intercomparison and validation study across Europe. Remote Sens. Environ., 115, 3390–3408, doi:10.1016/j.rse.2011.08.003.
  • Brocca, L., Moramarco T., Melone F., Wagner W., Hasenauer S., and Hahn S., 2012: Assimilation of surface- and root-zone ASCAT soil moisture products into rainfall–runoff modeling. IEEE Trans. Geosci. Remote Sens., 50, 2542–2555, doi:10.1109/TGRS.2011.2177468.
  • Brocca, L., Melone F., Moramarco T., Wagner W., and Albergel C., 2013: Scaling and filtering approaches for the use of satellite soil moisture observations. Remote Sensing of Energy Fluxes and Soil Moisture Content, G. P. Petropoulos, Ed., CRC Press, 411–426, doi:10.1201/b15610-21.
  • Browne, M. W., 1987: Robustness of statistical inference in factor analysis and related models. Biometrika, 74, 375–384, doi:10.1093/biomet/74.2.375.
  • Browne, M. W., and Shapiro A., 1988: Robustness of normal theory methods in the analysis of linear latent variate models. Br. J. Math. Stat. Psychol., 41, 193–208, doi:10.1111/j.2044-8317.1988.tb00896.x.
  • Caires, S., and Sterl A., 2003: Validation of ocean wind and wave data using triple collocation. J. Geophys. Res., 108, 3098–3114, doi:10.1029/2002JC001491.
  • Chuang, T.-W., Henebry G. M., Kimball J. S., VanRoekel-Patton D. L., Hildreth M. B., and Wimberly M. C., 2012: Satellite microwave remote sensing for environmental modeling of mosquito population dynamics. Remote Sens. Environ., 125, 147–156, doi:10.1016/j.rse.2012.07.018.
  • Crow, W. T., and van den Berg M. J., 2010: An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resour. Res., 46, W12519, doi:10.1029/2010WR009402.
  • Crow, W. T., and Yilmaz M. T., 2014: The Auto-Tuned Land Data Assimilation System (ATLAS). Water Resour. Res., 50, 371–385, doi:10.1002/2013WR014550.
  • Crow, W. T., van den Berg M. J., Huffman G. J., and Pellarin T., 2011: Correcting rainfall using satellite-based surface soil moisture retrievals: The Soil Moisture Analysis Rainfall Tool (SMART). Water Resour. Res., 47, W08521, doi:10.1029/2011WR010576.
  • Crow, W. T., and Coauthors, 2012: Upscaling sparse ground-based soil moisture observations for the validation of coarse-resolution satellite soil moisture products. Rev. Geophys., 50, RG2002, doi:10.1029/2011RG000372.
  • De Lannoy, G. J. M., Houser P. R., Verhoest N. E., Pauwels V. R., and Gish T. J., 2007a: Upscaling of point soil moisture measurements to field averages at the OPE3 test site. J. Hydrol., 343, 1–11, doi:10.1016/j.jhydrol.2007.06.004.
  • De Lannoy, G. J. M., Reichle R. H., Houser P. R., Pauwels V. R. N., and Verhoest N. E. C., 2007b: Correcting for forecast bias in soil moisture assimilation with the ensemble Kalman filter. Water Resour. Res., 43, W09410, doi:10.1029/2006WR005449.
  • De Ridder, K., 2003: Surface soil moisture monitoring over Europe using Special Sensor Microwave/Imager (SSM/I) imagery. J. Geophys. Res., 108, 4422, doi:10.1029/2002JD002796.
  • de Rosnay, P., Gruhier C., Timouk F., Baup F., Mougin E., Hiernaux P., Kergoat L., and LeDantec V., 2009: Multi-scale soil moisture measurements at the Gourma meso-scale site in Mali. J. Hydrol., 375, 241–252, doi:10.1016/j.jhydrol.2009.01.015.
  • de Wit, A., and van Diepen C., 2007: Crop model data assimilation with the ensemble Kalman filter for improving regional crop yield forecasts. Agric. For. Meteor., 146, 38–56, doi:10.1016/j.agrformet.2007.05.004.
  • Dorigo, W. A., Scipal K., Parinussa R., Liu Y., Wagner W., de Jeu R., and Naeimi V., 2010: Error characterisation of global active and passive microwave soil moisture datasets. Hydrol. Earth Syst. Sci., 14, 2605–2616, doi:10.5194/hess-14-2605-2010.
  • Dorigo, W. A., and Coauthors, 2011: The International Soil Moisture Network: A data hosting facility for global in situ soil moisture measurements. Hydrol. Earth Syst. Sci., 15, 1675–1698, doi:10.5194/hess-15-1675-2011.
  • Dorigo, W. A., and Coauthors, 2013: Global automated quality control of in situ soil moisture data from the International Soil Moisture Network. Vadose Zone J., 12 (3), doi:10.2136/vzj2012.0097.
  • Dorigo, W. A., and Coauthors, 2015: Evaluation of the ESA CCI soil moisture product using ground-based observations. Remote Sens. Environ., 162, 380–395, doi:10.1016/j.rse.2014.07.023.
  • Drusch, M., Wood E. F., and Gao H., 2005: Observation operators for the direct assimilation of TRMM Microwave Imager retrieved soil moisture. Geophys. Res. Lett., 32, L15403, doi:10.1029/2005GL023623.
  • ECMWF, 2015: ERA-Interim/Land. European Centre for Medium-Range Weather Forecasts, accessed 26 June 2015. [Available online at http://www.ecmwf.int/en/research/climate-reanalysis/era-interim/land.]
  • Entekhabi, D., Reichle R., Koster R., and Crow W., 2010: Performance metrics for soil moisture retrievals and application requirements. J. Hydrometeor., 11, 832–840, doi:10.1175/2010JHM1223.1.
  • Erzini, K., Inejih C. A. O., and Stobberup K. A., 2005: An application of two techniques for the analysis of short, multivariate non-stationary time-series of Mauritanian trawl survey data. ICES J. Mar. Sci., 62, 353–359, doi:10.1016/j.icesjms.2004.12.009.
  • Famiglietti, J., Ryu D., Berg A., Rodell M., and Jackson T., 2008: Field observations of soil moisture variability across scales. Water Resour. Res., 44, W01423, doi:10.1029/2006WR005804.
  • Frey, B., and Hinton G., 1999: Variational learning in nonlinear Gaussian belief networks. Neural Comput., 11, 193–214, doi:10.1162/089976699300016872.
  • Gao, H., Wood E. F., Drusch M., and McCabe M., 2007: Copula-derived observation operators for assimilating TMI and AMSR-E retrieved soil moisture into land surface models. J. Hydrometeor., 8, 413–429, doi:10.1175/JHM570.1.
  • Gruber, A., Dorigo W. A., Zwieback S., Xaver A., and Wagner W., 2013: Characterizing coarse-scale representativeness of in situ soil moisture measurements from the International Soil Moisture Network. Vadose Zone J., 12 (2), doi:10.2136/vzj2012.0170.
  • Gruber, A., Su C.-H., Zwieback S., Crow W., Dorigo W., and Wagner W., 2016: Recent advances in (soil moisture) triple collocation analysis. Int. J. Appl. Earth Obs. Geoinf., 45, 200–211, doi:10.1016/j.jag.2015.09.002.
  • Jaakkola, T., Saul L. K., and Jordan M. I., 1996: Fast learning by bounding likelihoods in sigmoid type belief networks. Advances in Neural Information Processing Systems, Vol. 8, D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds., MIT Press, 528–534.
  • Kerr, Y., and Coauthors, 2012: The SMOS soil moisture retrieval algorithm. IEEE Trans. Geosci. Remote Sens., 50, 1384–1403, doi:10.1109/TGRS.2012.2184548.
  • Koster, R. D., and Milly P., 1997: The interplay between transpiration and runoff formulations in land surface schemes used with atmospheric models. J. Climate, 10, 1578–1591, doi:10.1175/1520-0442(1997)010<1578:TIBTAR>2.0.CO;2.
  • Koster, R. D., Guo Z., Yang R., Dirmeyer P. A., Mitchell K., and Puma M. J., 2009: On the nature of soil moisture in land surface models. J. Climate, 22, 4322–4335, doi:10.1175/2009JCLI2832.1.
  • Leroux, D., Kerr Y., Richaume P., and Fieuzal R., 2013: Spatial distribution and possible sources of SMOS errors at the global scale. Remote Sens. Environ., 133, 240–250, doi:10.1016/j.rse.2013.02.017.
  • Leroux, D., Kerr Y., Wood E., Sahoo A., Bindlish R., and Jackson T., 2014: An approach to constructing a homogeneous time series of soil moisture using SMOS. IEEE Trans. Geosci. Remote Sens., 52, 393–405, doi:10.1109/TGRS.2013.2240691.
  • Lievens, H., and Coauthors, 2015: Optimization of a radiative transfer forward operator for simulating SMOS brightness temperatures over the upper Mississippi basin. J. Hydrometeor., 16, 1109–1134, doi:10.1175/JHM-D-14-0052.1.
  • Liu, Y. Y., Dorigo W. A., Parinussa R. M., de Jeu R. A. M., Wagner W., McCabe M. F., Evans J. P., and Van Dijk A. I. J. M., 2012: Trend-preserving blending of passive and active microwave soil moisture retrievals. Remote Sens. Environ., 123, 280–297, doi:10.1016/j.rse.2012.03.014.
  • Loew, A., and Schlenz F., 2011: A dynamic approach for evaluating coarse scale satellite soil moisture products. Hydrol. Earth Syst. Sci., 15, 75–90, doi:10.5194/hess-15-75-2011.
  • MacKay, D. J., 2003: Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 640 pp.
  • Mahfouf, J.-F., 2010: Assimilation of satellite-derived soil moisture from ASCAT in a limited-area NWP model. Quart. J. Roy. Meteor. Soc., 136, 784–798, doi:10.1002/qj.602.
  • McColl, K. A., Vogelzang J., Konings A. G., Entekhabi D., Piles M., and Stoffelen A., 2014: Extended triple collocation: Estimating errors and correlation coefficients with respect to an unknown target. Geophys. Res. Lett., 41, 6229–6236, doi:10.1002/2014GL061322.
  • Mialon, A., Richaume P., Leroux D., Bircher S., Al Bitar A., Pellarin T., Wigneron J.-P., and Kerr Y., 2015: Comparison of Dobson and Mironov dielectric models in the SMOS soil moisture retrieval algorithm. IEEE Trans. Geosci. Remote Sens., 53, 3084–3094, doi:10.1109/TGRS.2014.2368585.
  • Milly, P. C. D., 2001: A minimalist probabilistic description of root zone soil water. Water Resour. Res., 37, 457–463, doi:10.1029/2000WR900337.
  • Mittelbach, H., Casini F., Lehner I., Teuling A., and Seneviratne S., 2011: Soil moisture monitoring for climate research: Evaluation of a low cost sensor in the framework of the SwissSMEX campaign. J. Geophys. Res., 116, D05111, doi:10.1029/2010JD014907.
  • Montzka, C., Pauwels V. R. N., Hendricks Franssen H. J., Han X., and Vereecken H., 2012: Multivariate and multiscale data assimilation in terrestrial systems: A review. Sensors, 12, 16 291–16 333, doi:10.3390/s121216291.
  • Mooijaart, A., 1985: Factor analysis for non-normal variables. Psychometrika, 50, 323–342, doi:10.1007/BF02294108.
  • Naeimi, V., Scipal K., Bartalis Z., Hasenauer S., and Wagner W., 2009: An improved soil moisture retrieval algorithm for ERS and MetOp scatterometer observations. IEEE Trans. Geosci. Remote Sens., 47, 1999–2013, doi:10.1109/TGRS.2008.2011617.
  • Naeimi, V., Paulik C., Bartsch A., Wagner W., Kidd R., Park S.-E., Elger K., and Boike J., 2012: ASCAT Surface State Flag (SSF): Extracting information on surface freeze/thaw conditions from backscatter data using an empirical threshold-analysis algorithm. IEEE Trans. Geosci. Remote Sens., 50, 2566–2582, doi:10.1109/TGRS.2011.2177667.
  • NCDC, 2015: USCRN soil moisture data. International Soil Moisture Network, accessed 26 June 2015. [Available online at http://ismn.geo.tuwien.ac.at/networks/uscrn/.]
  • Neal, R., and Hinton G., 1998: A view of the EM algorithm that justifies incremental, sparse, and other variants. Learning in Graphical Models, NATO ASI Series, Vol. 89, Kluwer Academic, 355–368, doi:10.1007/978-94-011-5014-9_12.
  • Neville, J., Simsek O., and Jensen D., 2004: Autocorrelation and relational learning: Challenges and opportunities. Proc. Workshop on Statistical Relational Learning/21st Int. Conf. on Machine Learning, Banff, Alberta, Canada, International Machine Learning Society, 74–81.
  • Reichle, R. H., and Koster R. D., 2004: Bias reduction in short records of satellite soil moisture. Geophys. Res. Lett., 31, L19501, doi:10.1029/2004GL020938.
  • Rodriguez-Iturbe, I., Porporato A., Ridolfi L., Isham V., and Cox D. R., 1999: Probabilistic modelling of water balance at a point: The role of climate, soil and vegetation. Proc. Roy. Soc. London, 455A, 3789–3805, doi:10.1098/rspa.1999.0477.
  • Scipal, K., Drusch M., and Wagner W., 2008a: Assimilation of a ERS scatterometer derived soil moisture index in the ECMWF numerical weather prediction system. Adv. Water Resour., 31, 1101–1112, doi:10.1016/j.advwatres.2008.04.013.
  • Scipal, K., Holmes T., de Jeu R., Naeimi V., and Wagner W., 2008b: A possible solution for the problem of estimating the error structure of global soil moisture data sets. Geophys. Res. Lett., 35, L24403, doi:10.1029/2008GL035599.
  • Seneviratne, S., Corti T., Davin E., Hirschi M., Jaeger E., Lehner I., Orlowsky B., and Teuling A., 2010: Investigating soil moisture–climate interactions in a changing climate: A review. Earth-Sci. Rev., 99, 125–161, doi:10.1016/j.earscirev.2010.02.004.
  • Stoffelen, A., 1998: Toward the true near-surface wind speed: Error modeling and calibration using triple collocation. J. Geophys. Res., 103, 7755–7766, doi:10.1029/97JC03180.
  • Su, C.-H., and Ryu D., 2015: Multi-scale analysis of bias correction of soil moisture. Hydrol. Earth Syst. Sci., 19, 17–31, doi:10.5194/hess-19-17-2015.
  • Su, C.-H., Ryu D., Crow W. T., and Western A. W., 2014: Beyond triple collocation: Applications to soil moisture monitoring. J. Geophys. Res. Atmos., 119, 6419–6439, doi:10.1002/2013JD021043.
  • Su, C.-H., Narsey S. Y., Gruber A., Xaver A., Chung D., Ryu D., and Wagner W., 2015: Evaluation of post-retrieval de-noising of active and passive microwave satellite soil moisture. Remote Sens. Environ., 163, 127–139, doi:10.1016/j.rse.2015.03.010.
  • Teuling, A., Uijlenhoet R., and Troch P., 2005: On bimodality in warm season soil moisture observations. Geophys. Res. Lett., 32, L13402, doi:10.1029/2005GL023223.
  • TUW, 2015: MetOp ASCAT 25 km soil moisture images. Research Group Remote Sensing, Vienna University of Technology, accessed 26 June 2015. [Available online at http://rs.geo.tuwien.ac.at/products/461b7121-9c07-5022-8c40-23aba622c74c/320899/.]
  • Wagner, W., Blöschl G., Pampaloni P., Calvet J.-C., Bizzarri B., Wigneron J.-P., and Kerr Y., 2007: Operational readiness of microwave remote sensing of soil moisture for hydrologic applications. Nord. Hydrol., 38, 1–20, doi:10.2166/nh.2007.029.
  • Wagner, W., and Coauthors, 2013: The ASCAT soil moisture product: A review of its specifications, validation results, and emerging applications. Meteor. Z., 22, 5–33, doi:10.1127/0941-2948/2013/0399.
  • Wall, M., and Amemiya Y., 2007: A review of nonlinear factor analysis and nonlinear structural equation modeling. Factor Analysis at 100: Historical Developments and Future Directions, R. Cudeck, and R. C. MacCallum, Eds., Routledge, 337–362.
  • Western, A., Grayson A., and Blöschl G., 2002: Scaling of soil moisture: A hydrologic perspective. Annu. Rev. Earth Planet. Sci., 30, 149–180, doi:10.1146/annurev.earth.30.091201.140434.
  • Yalcin, I., and Amemiya Y., 2001: Nonlinear factor analysis as a statistical method. Stat. Sci., 16, 275–294, doi:10.1214/ss/1009213729.
  • Yilmaz, M. T., and Crow W. T., 2013: The optimality of potential rescaling approaches in land data assimilation. J. Hydrometeor., 14, 650–660, doi:10.1175/JHM-D-12-052.1.
  • Yilmaz, M. T., and Crow W. T., 2014: Evaluation of assumptions in soil moisture triple collocation analysis. J. Hydrometeor., 15, 1293–1302, doi:10.1175/JHM-D-13-0158.1.
  • Yilmaz, M. T., Crow W. T., Anderson M. C., and Hain C., 2012: An objective methodology for merging satellite- and model-based soil moisture products. Water Resour. Res., 48, W11502, doi:10.1029/2011WR011682.
  • Zwieback, S., Bartsch A., Melzer T., and Wagner W., 2012a: Probabilistic fusion of Ku and C band scatterometer data for determining the freeze/thaw state. IEEE Trans. Geosci. Remote Sens., 50, 2583–2594, doi:10.1109/TGRS.2011.2169076.
  • Zwieback, S., Dorigo W., and Wagner W., 2012b: Temporal error variability of coarse scale soil moisture products—Case study in central Spain. IEEE Geoscience and Remote Sensing Symposium 2012, IEEE, 722–725, doi:10.1109/IGARSS.2012.6351463.
  • Zwieback, S., Scipal K., Dorigo W., and Wagner W., 2012c: Structural and statistical properties of the collocation technique for error characterization. Nonlinear Processes Geophys., 19, 69–80, doi:10.5194/npg-19-69-2012.
  • Zwieback, S., Dorigo W., and Wagner W., 2013: Estimation of the temporal autocorrelation structure by the collocation technique with emphasis on soil moisture studies. Hydrol. Sci. J., 58, 1729–1747, doi:10.1080/02626667.2013.839876.
  • Zwieback, S., Paulik C., and Wagner W., 2015: Frozen soil detection based on advanced scatterometer observations and air temperature data as part of soil moisture retrieval. Remote Sens., 7, 3206–3231, doi:10.3390/rs70303206.
1 These methods are implemented in the software package nlscaling (http://dx.doi.org/10.5281/zenodo.44383).

  • Albergel, C., Rüdiger C. , Carrer D. , Calvet J.-C. , Fritz N. , Naeimi V. , Bartalis Z. , and Hasenauer S. , 2009: An evaluation of ASCAT surface soil moisture products with in-situ observations in southwestern France. Hydrol. Earth Syst. Sci., 13, 115124, doi:10.5194/hess-13-115-2009.

    • Search Google Scholar
    • Export Citation
  • Albergel, C., and Coauthors, 2013: Monitoring multi-decadal satellite Earth observation of soil moisture products through land surface reanalyses. Remote Sens. Environ., 138, 7789, doi:10.1016/j.rse.2013.07.009.

    • Search Google Scholar
    • Export Citation
  • Alemohammad, S. H., McColl K. A. , Konings A. G. , Entekhabi D. , and Stoffelen A. , 2015: Characterization of precipitation product errors across the United States using multiplicative triple collocation. Hydrol. Earth Syst. Sci., 19, 34893503, doi:10.5194/hess-19-3489-2015.

    • Search Google Scholar
    • Export Citation
  • Anderson, T. W., 2003: An Introduction to Multivariate Statistical Analysis. Wiley, 752 pp.

  • Anderson, T. W., and Amemiya Y. , 1988: The asymptotic normal distribution of estimators in factor analysis under general conditions. Ann. Stat., 16, 759771, doi:10.1214/aos/1176350834.

    • Search Google Scholar
    • Export Citation
  • Azzalini, A., 2005: The skew-normal distribution and related multivariate families. Scand. J. Stat., 32, 159188, doi:10.1111/j.1467-9469.2005.00426.x.

    • Search Google Scholar
    • Export Citation
  • Balsamo, G., and Coauthors, 2015: ERA-Interim/Land: A global land surface reanalysis data set. Hydrol. Earth Syst. Sci., 19, 389407, doi:10.5194/hess-19-389-2015.

    • Search Google Scholar