Abstract

Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually independent methods. Several assumptions regarding the statistical nature of errors (e.g., mutual independence and orthogonality with respect to the truth) are required for TCA estimates to be unbiased. Even though soil moisture studies commonly acknowledge that these assumptions are required for an unbiased TCA, no study has specifically investigated the degree to which errors in existing soil moisture datasets conform to these assumptions. Here these assumptions are evaluated both analytically and numerically over four extensively instrumented watershed sites using soil moisture products derived from active microwave remote sensing, passive microwave remote sensing, and a land surface model. Results demonstrate that nonorthogonal and error cross-covariance terms represent a significant fraction of the total variance of these products. However, the overall impact of error cross correlation on TCA is found to be significantly larger than the impact of nonorthogonal errors. Because of the impact of cross-correlated errors, TCA error estimates generally underestimate the true random error of soil moisture products.

1. Introduction

Introduced by Stoffelen (1998), triple collocation analysis (TCA) is an error magnitude estimation methodology that intercompares geophysical products obtained via three or more independent estimation/observation techniques. The approach was originally developed for ocean wind studies but has been increasingly applied in land surface hydrology (Scipal et al. 2008; Crow and van den Berg 2010; Dorigo et al. 2010; Miralles et al. 2010; Hain et al. 2011; Parinussa et al. 2011; Anderson et al. 2012; Yilmaz et al. 2012; Yilmaz and Crow 2013; Zwieback et al. 2013; Draper et al. 2013). The majority of TCA land surface applications have focused on quantifying errors in satellite-based surface soil moisture products retrieved from instruments like the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) sensor aboard the National Aeronautics and Space Administration’s (NASA) Aqua satellite. In addition, ongoing Soil Moisture Ocean Salinity (SMOS) validation efforts have included TCA to attribute the sources of error magnitudes to different ground conditions like forest cover, texture, and radio frequency interference (RFI; Leroux et al. 2013). Likewise, current Soil Moisture Active Passive (SMAP) calibration and validation plans call for the use of TCA along with other methodologies to estimate retrieval errors over sparse networks using point observations (T. J. Jackson 2013, personal communication). In addition to error magnitude comparison studies, the error variance estimates obtained from TCA can also be used to provide observation error covariance information required by data assimilation (Crow and van den Berg 2010; Crow and Yilmaz 2014) and least squares merging approaches (Yilmaz et al. 2012). Finally, TCA can also be used as a stand-alone rescaling methodology to remove systematic differences between the signal variance (true variability) component of observations in data assimilation studies (Yilmaz and Crow 2013).

Using three products that observe the same geophysical variable, TCA is able to find relative error estimates for these products under certain assumptions, namely, the independence of product errors with respect to the truth (i.e., error orthogonality) and the relative independence of errors in each product (i.e., zero error cross correlation; Stoffelen 1998). These assumptions are generally required to reduce the underdetermined TCA system of equations into a determined one. Using three datasets, for example, the TCA system has seven unknown parameters with only three available equations (Yilmaz et al. 2012). Taking one of the datasets as the reference for rescaling purposes and further assuming error orthogonality and zero error cross correlation, the number of unknown parameters drops to three and the system is solvable. These assumptions can be eased if the magnitude of error nonorthogonality is known (Stoffelen 1998; Portabella and Stoffelen 2009) or more than three products are available for cross comparison (Zwieback et al. 2012); however, the majority of soil moisture–based triple collocation studies use only three datasets and are therefore forced to rely on these assumptions to ensure a bias-free TCA.

Here we investigate the impact of the error orthogonality and zero cross-correlation assumptions on TCA-based soil moisture error estimates by 1) analytically deriving the specific terms in TCA that must vanish in order for TCA to produce an unbiased error variance estimate and 2) numerically evaluating these terms over a series of watershed sites where high-quality ground-based soil moisture observations are obtainable. Section 2 presents the analytical basis of the study, section 3 presents results, and section 4 summarizes our key findings.

2. Methodology

a. Triple collocation–based errors

Given three datasets ($X$, $Y$, and $Z$), TCA is typically based on selecting a single dataset (e.g., $X$) and rescaling the other two products ($Y$ and $Z$) to this reference via the derivation of specific rescaling factors. To illustrate, a general linear model is assumed to represent these three datasets:

$$X = \alpha_X T + \varepsilon_X, \tag{1}$$

$$Y = \alpha_Y T + \varepsilon_Y, \tag{2}$$

$$Z = \alpha_Z T + \varepsilon_Z, \tag{3}$$

where $T$ is the truth; $X$, $Y$, and $Z$ are the datasets of interest; $\varepsilon_X$, $\varepsilon_Y$, and $\varepsilon_Z$ are time series of true random errors with mean zero and variances $\sigma^2_{\varepsilon_X}$, $\sigma^2_{\varepsilon_Y}$, and $\sigma^2_{\varepsilon_Z}$, respectively; and the scaling parameters $\alpha_X$, $\alpha_Y$, and $\alpha_Z$ relate the sensitivity of $X$, $Y$, and $Z$ to $T$. Normality of these errors is not a requirement. However, given known problems with applying TCA to soil moisture datasets possessing different seasonal climatologies (Miralles et al. 2010; Draper et al. 2013), here $X$, $Y$, $Z$, and $T$ are all assumed to represent a time series of mean-zero anomalies derived by first subtracting off a long-term seasonal soil moisture climatology.

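Choosing $X$ as a reference, $Y$ and $Z$ can then be rescaled via the factors

$$c_Y = \frac{\overline{XZ}}{\overline{YZ}}, \tag{4}$$

$$c_Z = \frac{\overline{XY}}{\overline{ZY}}, \tag{5}$$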

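where the overbar indicates temporal averaging and $c_Y$, for example, refers to the rescaling factor of product $Y$ with respect to $X$. Lacking any assumptions about error orthogonality or error cross correlation, inserting (1)–(3) into (4) and (5) yields, for $c_Y$ (the expression for $c_Z$ follows by exchanging the roles of $Y$ and $Z$),

$$c_Y = \frac{\overline{(\alpha_X T + \varepsilon_X)(\alpha_Z T + \varepsilon_Z)}}{\overline{(\alpha_Y T + \varepsilon_Y)(\alpha_Z T + \varepsilon_Z)}} \tag{6}$$

$$c_Y = \frac{(\alpha_X \alpha_Z \sigma_T^2) + (\alpha_X \overline{T\varepsilon_Z} + \alpha_Z \overline{T\varepsilon_X}) + (\overline{\varepsilon_X \varepsilon_Z})}{(\alpha_Y \alpha_Z \sigma_T^2) + (\alpha_Y \overline{T\varepsilon_Z} + \alpha_Z \overline{T\varepsilon_Y}) + (\overline{\varepsilon_Y \varepsilon_Z})}, \tag{7}$$

where $\sigma_T^2 = \overline{T^2}$ is the signal variance of $T$. In (7), the terms inside the first set of parentheses (in both the numerator and the denominator) are related to the variance of the signal, the second parenthetical terms represent nonorthogonal terms describing the covariance between the truth and the product errors, and the third parenthetical terms capture the impact of error cross covariance. If errors are assumed to be orthogonal and mutually independent ($\overline{T\varepsilon_i} = 0$ and $\overline{\varepsilon_i \varepsilon_j} = 0$ for $i \neq j$), then $c_Y$ simplifies to

$$c_Y = \frac{\alpha_X}{\alpha_Y} \tag{8}$$

and describes an optimal rescaling factor that matches the signal component of $Y$ to that of $X$ (Stoffelen 1998).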
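The sensitivity of the rescaling factor to violated assumptions can be illustrated with a short numerical sketch. It assumes that the expanded rescaling factor is the ratio of the two temporal cross products obtained from the linear model (signal term, truth-error terms, and error cross-covariance term in both numerator and denominator); the function name and all parameter values are hypothetical and chosen only for illustration.

```python
def expected_rescaling_factor(a_x, a_y, a_z, var_t,
                              t_ex=0.0, t_ey=0.0, t_ez=0.0,
                              ex_ez=0.0, ey_ez=0.0):
    """Population value of c_Y = mean(X*Z) / mean(Y*Z) with X as reference.

    a_*   : scaling parameters of the (assumed) linear model
    var_t : signal variance of the truth
    t_e*  : covariances between the truth and each product error
    e*_e* : error cross covariances
    """
    numerator = a_x * a_z * var_t + (a_x * t_ez + a_z * t_ex) + ex_ez
    denominator = a_y * a_z * var_t + (a_y * t_ez + a_z * t_ey) + ey_ez
    return numerator / denominator


# Orthogonal, mutually independent errors: the factor reduces to a_x / a_y.
optimal = expected_rescaling_factor(1.0, 0.8, 1.2, var_t=1.0)

# A nonzero cross covariance between the errors of X and Z inflates the factor.
perturbed = expected_rescaling_factor(1.0, 0.8, 1.2, var_t=1.0, ex_ez=0.1)
```

With these illustrative values the first call returns the optimal ratio of the scaling parameters, while the second departs from it even though the scaling parameters are unchanged.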

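Once $c_Y$ and $c_Z$ estimates are available from (4) and (5), rescaled datasets can be derived as $Y^* = c_Y Y$ and $Z^* = c_Z Z$. TCA-based error variance estimates can then be calculated as

$$\hat{\sigma}^2_{\varepsilon_X} = \overline{(X - Y^*)(X - Z^*)}, \tag{9}$$

$$\hat{\sigma}^2_{\varepsilon_{Y^*}} = \overline{(Y^* - X)(Y^* - Z^*)}, \tag{10}$$

$$\hat{\sigma}^2_{\varepsilon_{Z^*}} = \overline{(Z^* - X)(Z^* - Y^*)}. \tag{11}$$

If we retain our earlier assumptions of orthogonal and mutually independent errors, then TCA-based error variance estimates will correspond to the true error variances of $X$, $Y^*$, and $Z^*$. However, the analysis below presents a more general case in which these assumptions are not made.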
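The estimation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the rescaling factors are ratios of temporal cross products (c_Y = mean(XZ)/mean(YZ), c_Z = mean(XY)/mean(ZY)) and that the error variances are averaged products of differences between the reference and the rescaled products. The synthetic triplet, its scaling parameters, and its error standard deviations are hypothetical.

```python
import random


def mean_product(a, b):
    """Temporal average of the element-wise product of two series."""
    return sum(p * q for p, q in zip(a, b)) / len(a)


def triple_collocation(x, y, z):
    """Rescaling factors and TCA error variances with x as the reference."""
    c_y = mean_product(x, z) / mean_product(y, z)
    c_z = mean_product(x, y) / mean_product(z, y)
    y_star = [c_y * v for v in y]
    z_star = [c_z * v for v in z]
    x_minus_y = [p - q for p, q in zip(x, y_star)]
    x_minus_z = [p - q for p, q in zip(x, z_star)]
    y_minus_z = [p - q for p, q in zip(y_star, z_star)]
    return {
        "c_y": c_y,
        "c_z": c_z,
        "var_x": mean_product(x_minus_y, x_minus_z),
        # (Y* - X)(Y* - Z*) equals -(X - Y*)(Y* - Z*)
        "var_y_star": -mean_product(x_minus_y, y_minus_z),
        # (Z* - X)(Z* - Y*) equals (X - Z*)(Y* - Z*)
        "var_z_star": mean_product(x_minus_z, y_minus_z),
    }


def simulate(n=50000, seed=0):
    """Synthetic triplet with independent, orthogonal errors (illustrative)."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [1.0 * t + rng.gauss(0.0, 0.5) for t in truth]
    y = [0.8 * t + rng.gauss(0.0, 0.6) for t in truth]
    z = [1.2 * t + rng.gauss(0.0, 0.4) for t in truth]
    return x, y, z


result = triple_collocation(*simulate())
```

Because the synthetic errors satisfy both TCA assumptions, the estimated variances should approach the error variances of the reference and of the rescaled products as the sample grows.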

b. Implications of triple collocation assumptions

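Hereafter, the rescaling factors in (4) and (5) are written as $c_Y$ and $c_Z$, respectively. Without making any simplifying assumptions about either error orthogonality or cross covariance, inserting (1)–(3) into (9) and applying the above definitions of $Y^*$ and $Z^*$ allows us to expand the TCA-based error variance of $X$ as

$$\hat{\sigma}^2_{\varepsilon_X} = \overline{(X - c_Y Y)(X - c_Z Z)} \tag{12}$$

$$\hat{\sigma}^2_{\varepsilon_X} = \overline{[(\alpha_X - c_Y \alpha_Y) T + \varepsilon_X - c_Y \varepsilon_Y][(\alpha_X - c_Z \alpha_Z) T + \varepsilon_X - c_Z \varepsilon_Z]} \tag{13}$$

$$\hat{\sigma}^2_{\varepsilon_X} = (\alpha_X - c_Y \alpha_Y)(\alpha_X - c_Z \alpha_Z)\sigma_T^2 + (\alpha_X - c_Y \alpha_Y)(\overline{T\varepsilon_X} - c_Z \overline{T\varepsilon_Z}) + (\alpha_X - c_Z \alpha_Z)(\overline{T\varepsilon_X} - c_Y \overline{T\varepsilon_Y}) - c_Y \overline{\varepsilon_X \varepsilon_Y} - c_Z \overline{\varepsilon_X \varepsilon_Z} + c_Y c_Z \overline{\varepsilon_Y \varepsilon_Z} + \sigma^2_{\varepsilon_X}. \tag{14}$$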

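Following (14), the TCA-based error variance can be decomposed into four separate components:

$$\hat{\sigma}^2_{\varepsilon_X} = \sigma^2_{\mathrm{LS}} + \sigma^2_{\mathrm{OE}} + \sigma^2_{\mathrm{CCE}} + \sigma^2_{\mathrm{TRE}}, \tag{15}$$

where

$$\sigma^2_{\mathrm{LS}} = (\alpha_X - c_Y \alpha_Y)(\alpha_X - c_Z \alpha_Z)\sigma_T^2, \tag{16}$$

$$\sigma^2_{\mathrm{OE}} = (\alpha_X - c_Y \alpha_Y)(\overline{T\varepsilon_X} - c_Z \overline{T\varepsilon_Z}) + (\alpha_X - c_Z \alpha_Z)(\overline{T\varepsilon_X} - c_Y \overline{T\varepsilon_Y}), \tag{17}$$

$$\sigma^2_{\mathrm{CCE}} = -c_Y \overline{\varepsilon_X \varepsilon_Y} - c_Z \overline{\varepsilon_X \varepsilon_Z} + c_Y c_Z \overline{\varepsilon_Y \varepsilon_Z}, \tag{18}$$

$$\sigma^2_{\mathrm{TRE}} = \sigma^2_{\varepsilon_X}. \tag{19}$$

The term $\sigma^2_{\mathrm{TRE}}$ in (19) represents the “true random error” variance of the product (i.e., the typical target of TCA) while (16)–(18) represent bias terms that must vanish for TCA to become an unbiased estimator of (19). The term $\sigma^2_{\mathrm{LS}}$ in (16) is referred to as the “leaked signal” term because it describes the fraction of the signal variance $\sigma_T^2$ that is “leaked” into the error variance via the inappropriate specification of the rescaling factors [i.e., the failure of $c_Y$ and $c_Z$ to conform to the optimal scaling parameter defined in (8)]. Likewise, the $\sigma^2_{\mathrm{OE}}$ and $\sigma^2_{\mathrm{CCE}}$ bias terms refer to “orthogonal error” and “cross-correlated error” variances and capture the eventual impact (on TCA) of violating either the error orthogonality or the zero error cross-covariance (e.g., $\overline{\varepsilon_X \varepsilon_Y} = 0$) assumptions described above.

In order for $\hat{\sigma}^2_{\varepsilon_X}$ [(15)] to converge to $\sigma^2_{\mathrm{TRE}}$ [(19)], the bias terms $\sigma^2_{\mathrm{LS}}$ [(16)], $\sigma^2_{\mathrm{OE}}$ [(17)], and $\sigma^2_{\mathrm{CCE}}$ [(18)] must all vanish. Specifically, this requires that 1) the $\sigma^2_{\mathrm{LS}}$ term must vanish via optimal rescaling ($c_Y = \alpha_X/\alpha_Y$ and/or $c_Z = \alpha_X/\alpha_Z$), 2) the $\sigma^2_{\mathrm{OE}}$ term must vanish via error orthogonality ($\overline{T\varepsilon_X} = \overline{T\varepsilon_Y} = \overline{T\varepsilon_Z} = 0$), perfect rescaling, or perfectly compensating error nonorthogonality terms ($\overline{T\varepsilon_X} = c_Y \overline{T\varepsilon_Y}$ and $\overline{T\varepsilon_X} = c_Z \overline{T\varepsilon_Z}$), and 3) the $\sigma^2_{\mathrm{CCE}}$ term must vanish via zero error cross covariance or via perfectly compensating error cross covariances ($c_Y \overline{\varepsilon_X \varepsilon_Y} + c_Z \overline{\varepsilon_X \varepsilon_Z} = c_Y c_Z \overline{\varepsilon_Y \varepsilon_Z}$). All three bias terms (16)–(18) are sensitive to the specification of the rescaling factors ($c_Y$ and $c_Z$). Likewise, error orthogonality and zero error cross correlation are also requirements for the derivation of optimal rescaling factors [Yilmaz and Crow 2013; also see (8) above]. As a result, any discussion of (16)–(18) must be linked with an examination of both rescaling and the statistical properties of product errors. However, it should be stressed that, unlike the typical TCA analysis presented in (9)–(11), neither (12)–(14) nor (6) and (7) require any simplifying assumptions regarding error orthogonality or cross correlation.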
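The four-way decomposition can be checked numerically when the truth and the errors are known, as in a synthetic experiment. The sketch below assumes the decomposition takes the form obtained by expanding the averaged product of (X − c_Y·Y) and (X − c_Z·Z) under the linear model: a leaked-signal term, an orthogonal-error term, a cross-correlated error term, and the true random error variance. All names and parameter values are hypothetical; the shared error component is simply one convenient way to induce positive error cross covariances.

```python
import random


def mean_product(a, b):
    """Temporal average of the element-wise product of two series."""
    return sum(p * q for p, q in zip(a, b)) / len(a)


def decompose_tca_error(truth, e_x, e_y, e_z, a_x=1.0, a_y=1.0, a_z=1.0):
    """Split the TCA error variance of X into its four components."""
    x = [a_x * t + e for t, e in zip(truth, e_x)]
    y = [a_y * t + e for t, e in zip(truth, e_y)]
    z = [a_z * t + e for t, e in zip(truth, e_z)]
    c_y = mean_product(x, z) / mean_product(y, z)
    c_z = mean_product(x, y) / mean_product(z, y)
    gap_y = a_x - c_y * a_y  # rescaling mismatch relative to Y
    gap_z = a_x - c_z * a_z  # rescaling mismatch relative to Z
    t_ex = mean_product(truth, e_x)
    t_ey = mean_product(truth, e_y)
    t_ez = mean_product(truth, e_z)
    tca = mean_product([p - c_y * q for p, q in zip(x, y)],
                       [p - c_z * q for p, q in zip(x, z)])
    return {
        "leaked_signal": gap_y * gap_z * mean_product(truth, truth),
        "orthogonal": gap_y * (t_ex - c_z * t_ez) + gap_z * (t_ex - c_y * t_ey),
        "cross_correlated": (-c_y * mean_product(e_x, e_y)
                             - c_z * mean_product(e_x, e_z)
                             + c_y * c_z * mean_product(e_y, e_z)),
        "true_random": mean_product(e_x, e_x),
        "tca": tca,
    }


def simulate_errors(n=20000, shared_sd=0.3, own_sd=0.4, seed=1):
    """Truth plus three error series sharing a common component (illustrative)."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    shared = [rng.gauss(0.0, shared_sd) for _ in range(n)]
    e_x = [s + rng.gauss(0.0, own_sd) for s in shared]
    e_y = [s + rng.gauss(0.0, own_sd) for s in shared]
    e_z = [s + rng.gauss(0.0, own_sd) for s in shared]
    return truth, e_x, e_y, e_z


terms = decompose_tca_error(*simulate_errors())
```

In this setup the four components sum to the TCA estimate up to floating-point error, and the positively cross-correlated errors make the cross-correlated term negative, so the TCA estimate falls below the true random error variance.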

c. Evaluation using ground-based data

Here, we attempt to quantify error nonorthogonality and cross-correlation statistics and the manner in which these statistics are combined (with each other and rescaling factors) to yield terms (16)–(18) contributing as bias to TCA error variance estimates. To this end, we use ground-based station data ($S$; obtained from heavily instrumented watersheds) to estimate error orthogonality and cross-covariance statistics. These sampled statistics are then inserted into (16)–(18) to explicitly calculate the magnitude of the leaked signal, orthogonal error, and cross-correlated error bias terms. The true random error term in (19) is then obtained as a residual by subtracting the sum of these bias terms (16)–(18) from the TCA-based error variance derived via (9)–(11). Details are given below.

TCA rescaling (7) and error (9)–(11) components are functions of multiple terms reflecting error nonorthogonality (e.g., $\overline{T\varepsilon_X}$) and error cross covariance (e.g., $\overline{\varepsilon_X \varepsilon_Y}$), which require time series of both the truth $T$ and the product errors. Assuming high-quality $S$ is available to serve as a proxy for truth (i.e., $T = S$), time series errors in $X$ (for example) can be explicitly estimated as

 
formula

where the station data $S$ are first rescaled to the reference $X$. Note that $S$ is rescaled in this way to eliminate the introduction of additional error nonorthogonality and cross correlations, which may arise from rescaling differences between $S$ and $X$ [see (A9) in appendix A, section b, for details].

Once time series errors for all three products are obtained as in (20), station-based product error nonorthogonality variance and error cross-covariance estimates can be directly calculated as

 
formula
 
formula

These quantities will be used to calculate (16)–(18) and their impact on TCA error variance estimates given in (9)–(11). In addition, using the error time series from (20), the station-based error variance estimate of $X$ can be directly estimated as

 
formula

Note that (23) estimates the same error variances as TCA (9)–(11). See appendix A, section b, for the analytical comparison of station- and TCA-based error estimates to determine the relative bias of both approaches.

Also note that (21) and (22) do not provide all variables necessary to replicate (16)–(18). In particular, knowledge of the true signal variance $\sigma_T^2$ is required by (16), and the scaling parameters $\alpha_X$, $\alpha_Y$, and $\alpha_Z$ are required for calculation of both (16) and (17). For the calculation of error nonorthogonality and cross-correlation statistics, the station data are assumed to represent absolute truth. However, to quantify the signal variance and scaling parameters, we apply an independent TCA (not shown) to estimate the error in $S$ by selecting $S$ as the reference dataset and constructing a triplet out of $S$ and two of the other products. This error variance estimate for $S$ is then subtracted from the total variance of $S$ to estimate the signal variance:

 
formula

See appendix A, section a, for a complete discussion of how random errors in $S$ impact (20)–(24) and subsequent estimates of (16)–(19).

Sampling errors in TCA-based error variance (9)–(11), TCA components (16)–(19), and error nonorthogonality variance and cross-covariance (21) and (22) estimates are calculated using a bootstrapping approach where 1000 separate time series replicates are randomly sampled (with replacement) from the original time series (Efron and Tibshirani 1993). Plotted 95% sampling confidence intervals are calculated as twice the sampled standard deviation of these replicates.
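A bootstrap of this kind can be sketched as follows. The sketch assumes simple resampling of individual time steps with replacement, applying the same resampled indices to every collocated series so that the products stay paired; the text does not state whether temporal autocorrelation was treated in any other way, so this choice is an assumption. The function name and the demonstration data are hypothetical.

```python
import random
import statistics


def bootstrap_half_width(series, statistic, n_replicates=1000, seed=0):
    """Return twice the std dev of `statistic` across bootstrap replicates.

    `series` is a list of equally long, collocated time series. The same
    resampled time indices are applied to every series to keep them paired.
    """
    rng = random.Random(seed)
    n = len(series[0])
    estimates = []
    for _ in range(n_replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        resampled = [[s[i] for i in idx] for s in series]
        estimates.append(statistic(*resampled))
    return 2.0 * statistics.stdev(estimates)


# Demonstration on a single synthetic series, using the sample mean.
demo_rng = random.Random(42)
sample = [demo_rng.gauss(0.0, 1.0) for _ in range(400)]
half_width = bootstrap_half_width([sample], lambda s: sum(s) / len(s))
```

Any estimator that takes the collocated series as arguments (for example, a function returning a TCA error variance) can be passed as `statistic` in place of the sample mean used in the demonstration.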

Because the basic premise behind rescaling in TCA is to eliminate the signal variance components of products by multiplying products proportional to their signal variances [see (8)], rescaling factors themselves may be an indicator of relative accuracy of products (i.e., higher rescaling factors are associated with products with lower signal variances) when products with equal total variances are compared. To illustrate the impact of reference dataset selection on rescaling factors, rescaling factors are estimated for the cases where both $X$ and $S$ are selected as reference.

d. Data and study locations

TCA is performed over the four U.S. Department of Agriculture (USDA) Agricultural Research Service (ARS) watersheds (Little Washita, Little River, Reynolds Creek, and Walnut Gulch) currently used for the validation of AMSR-E and SMOS surface soil moisture products (Jackson et al. 2010, 2012; Leroux et al. 2014). These watersheds have dominant grassland, forest/agriculture, mountainous, and semiarid land cover (respectively) with surface (0–5 cm) soil moisture sensors collecting data at 20–60-min intervals at 16–29 separate spatial locations (Jackson et al. 2010). They extend over areas ranging between 150 and 610 km² and have, on average, one soil moisture observation station per 16 km² area (Jackson et al. 2010). Watershed-scale spatial averages (corresponding to $S$ above) are obtained via weighted averaging of all stations. These averages, in turn, have been verified using gravimetric soil moisture observations acquired during intensive field campaigns (Cosh et al. 2006, 2008).

In addition to this ground-based data, remotely sensed surface soil moisture estimates (roughly corresponding to the top 1–3 cm of the soil column) are retrieved from the Advanced Scatterometer (ASCAT) and AMSR-E satellite sensors over all four watershed sites. ASCAT soil moisture values are obtained from the Vienna University of Technology using the algorithm described in Wagner et al. (1999) and Naeimi et al. (2009). AMSR-E retrievals are acquired from the VU University Amsterdam using the Land Parameter Retrieval Model (LPRM) described in Owe et al. (2001, 2008). Note that because of the active microwave basis of the ASCAT retrievals and the passive microwave nature of the AMSR-E/LPRM retrievals, these two products are commonly assumed to have independent errors.

Land surface model–based predictions of surface soil moisture are based on the top soil layer (0–10 cm) predictions acquired from version 2.7 of the Noah model. Forcing data for these Noah simulations are based on Global Land Data Assimilation System (GLDAS; Rodell et al. 2004) meteorological data distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC). Soil parameters are based on the dataset of Reynolds et al. (2000) and land cover maps/parameters produced from the University of Maryland global land cover product (Hansen et al. 2000). More information about the Noah model can be found in Ek et al. (2003).

All model- and satellite-based products are obtained at 0.25° spatial resolution while the study is performed between January 2007 and September 2011 at daily time steps. These soil moisture products have high mutual cross correlation (Brocca et al. 2011), which is consistent with the linear relationships assumed in (1)–(3). Datasets used in TCA ($X$, $Y$, $Z$, and $S$) are obtained after standardizing the anomaly datasets by dividing them by their long-term standard deviations. Note that this makes products, and their eventual error statistics, unitless in nature. This standardization is performed to ensure results from products with varying units can be meaningfully intercompared.
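The preprocessing described above (seasonal climatology removal followed by standardization) might look like the following sketch. It assumes the climatology is the multiyear mean for each calendar day; the text does not specify how the climatology was computed (for example, whether a moving window was used), so that choice, along with the function name and toy data, is hypothetical.

```python
import statistics
from collections import defaultdict


def standardized_anomalies(values, day_of_year):
    """Remove a per-day climatology, then divide by the anomaly std dev."""
    by_day = defaultdict(list)
    for v, d in zip(values, day_of_year):
        by_day[d].append(v)
    # Assumed climatology: the mean of all values sharing a calendar day.
    climatology = {d: statistics.fmean(vs) for d, vs in by_day.items()}
    anomalies = [v - climatology[d] for v, d in zip(values, day_of_year)]
    spread = statistics.pstdev(anomalies)
    if spread == 0.0:
        raise ValueError("anomalies have zero variance; cannot standardize")
    return [a / spread for a in anomalies]


# Toy example: two "years" of three "days" each (volumetric soil moisture).
days = [1, 2, 3, 1, 2, 3]
values = [0.10, 0.20, 0.30, 0.14, 0.18, 0.36]
scaled = standardized_anomalies(values, days)
```

The output series is unitless with zero mean and unit standard deviation, which is what allows products reported in different units to be compared directly.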

Error estimates (9)–(11), (16)–(19), and (23), error nonorthogonality (21), and error cross covariances (22) are obtained over each watershed (total of four) and for each dataset (total of three). To condense these results, error estimates for all four watersheds are averaged into a single value for each product.

3. Results

All results are from selecting the Noah-based soil moisture product as the reference dataset for TCA because it demonstrates the highest cross correlation with ground data (average cross correlations with station data are 0.60, 0.58, and 0.55 for Noah, LPRM, and ASCAT, respectively). When averaged over all four watersheds, all three soil moisture products (ASCAT, LPRM, and Noah) demonstrate significant levels of error nonorthogonality and cross covariance (Fig. 1). This implies that the neglect of error nonorthogonality and cross covariance is not justified in standard soil moisture TCA. Figure 2 examines this issue directly by plotting the TCA bias terms derived in (16)–(18). Results demonstrate that the cross-correlated error bias term (18) is strongly negative. This can be analytically explained: in general, $c_Y + c_Z$ is expected to be larger than $c_Y c_Z$ (well justified with rescaling factors given in Fig. 3), and error cross covariances are positive and have relatively similar magnitudes (well justified with covariance values given in Fig. 1). Hence, the summation of the first two negative terms in (18) is larger in magnitude than the third positive term, while the sampling errors in rescaling factors shown in Fig. 3 do not impact this result. Accordingly, the term in (18) becomes both nonnegligible and negative (Fig. 2).

Fig. 1.

Error nonorthogonality variance (21) and error cross covariances (22) averaged across all four watershed sites for all three soil moisture products. Error bars represent two std dev of sampling errors estimated using a bootstrapping approach.

Fig. 2.

TCA-based error variances (9)–(11) and station-based error variances (23), leaked signal (16), orthogonal error (17), cross-correlation error (18), and true random error (19). Error variances representing the normalized product errors are shown in red, and specific TCA bias terms defined in (16)–(18) are shown in blue. All plotted quantities are averaged over all four watershed sites. Error bars represent two std dev of sampling errors estimated using a bootstrapping approach.

Fig. 3.

Rescaling factors averaged over all four watershed sites. Error bars represent two std dev of sampling errors estimated using a bootstrapping approach.

In contrast to the impact of error cross correlation, Fig. 2 reveals that the impact of error nonorthogonality and nonoptimal rescaling is dampened significantly when aggregated to form the TCA bias terms (16) and (17). This dampening occurs because of similarities in the scaling parameters of rescaled products and in the magnitudes of error nonorthogonality contributions to (16) and (17). As a consequence, neither term contributes substantial bias to TCA error variance estimates. Further description of this dampening is given in appendix B and is discussed below.

The summation of the (nonnegligible) negative bias term (18) with the much smaller bias terms (16) and (17) results in a net negative bias. As a result, TCA error estimates derived via (9)–(11) underestimate the true random error in (19) for all three products in Fig. 2. Sampling error bars in Fig. 2 confirm that these biases cannot be attributed to sampling error alone. Thus, while both error nonorthogonality and error cross covariance are equally present in the soil moisture triplet (Fig. 1), the eventual impact of error cross covariance on TCA is much greater.

TCA-based error estimates can be validated using station-based error estimates; however, it is often not clear how the representativeness errors of station data and TCA assumptions impact these comparisons. Appendix A, section b, analytically investigates the difference between station- and TCA-based error variances (Fig. 2). This difference is positively biased because of representativeness errors in the ground station data. On the other hand, this bias is reduced by error cross-covariance terms (A11). Representativeness errors impact the estimation of station-based error variances (21)–(23) too. However, the sign of the bias of these estimates is difficult to predict (see concluding remarks in appendix A, section a). This is also supported by Fig. 2: the differences between the station-based error variances (23) and the true random error (19) can be explained solely by sampling errors, unlike the plotted differences between the TCA-based error variances (9)–(11) and the true random error (19).

Figure 3 shows how rescaling factors may change depending on the reference dataset selection. In general, the rescaling factor is expected to be higher than one when the reference dataset has higher signal variance than the rescaled dataset. For products with the same total variance (e.g., in this study the total variance of each product is one), higher signal variance implies a more skillful product (i.e., a smaller error variance and a higher signal-to-noise ratio). Rescaling factors are consistently higher when station data, rather than Noah model predictions, are used as the reference dataset. This implies that the signal variance of the station data is higher than that of other datasets, or alternatively, that the representativeness errors of watershed-averaged station data used in this study are less than estimation errors present in other datasets.

4. Conclusions

Here, we evaluate the appropriateness of the error orthogonality and zero error cross-covariance (e.g., $\overline{\varepsilon_X \varepsilon_Y} = 0$) assumptions typically required by TCA using three different surface soil moisture products (Noah, LPRM, and ASCAT) and (ground) station-based data as a source of independent validation. We show that error nonorthogonality (21) and cross-covariance (22) statistics do not directly describe the magnitude of bias in TCA predictions. Instead, we analytically decompose TCA into the four separate terms defined in (16)–(19). One of these terms captures the typical goal of TCA while the other three terms directly represent bias terms that arise from violations of the basic error orthogonality and zero error cross-covariance assumptions underlying TCA. Through the use of ground-based station data, we evaluate the degree to which these assumptions are respected in soil moisture data products and examine the impact of (potential) violations on the four identified TCA terms.

Results suggest that required TCA assumptions of error orthogonality and zero error cross covariance do not generally hold for typical surface soil moisture data products (Fig. 1). However, error nonorthogonality does not contribute significantly to TCA bias [via the (16) and/or (17) bias terms]. This is primarily due to two factors. First, the bias terms (16) and (17) are always reduced by the application of scaling parameters that approximate the optimal values defined in (8). In addition, while error nonorthogonality degrades our ability to accurately estimate such optimal values [see (7)], the terms (16) and (17) can still be dampened when nonorthogonality is equally distributed among all three products, which appears to be the general case for surface soil moisture products examined here (see appendix B and Fig. 1). In contrast, the bias term (18), which is a direct function of error cross correlation between members of the TCA triplet, is not dampened when error cross correlation is equally present in all three data products (appendix B) or via the application of an optimal rescaling factor [see (18)]. As a result, it represents the bias term of greatest concern in TCA. Confirming this analytical prediction, numerical results presented here illustrate the tendency for the negative magnitude of the term in (18) to bias TCA error estimates low relative to the actual magnitude of the true random error term in (19) (Fig. 2).

Here, we obtained error cross-covariance information using ground datasets as truth. In locations lacking such datasets, the detection of error cross covariance using the available model and satellite products alone (i.e., no dataset available that can be assumed as truth) will present a challenge. However, as demonstrated in Draper et al. (2013) and Zwieback et al. (2012), utilizing more than three (i.e., four) datasets in TCA provides an opportunity to detect the presence of error cross covariance in the absence of any corroborating ground data observations.

TCA results presented here are all verified using comparison against station-based soil moisture observations. However, such data are just another estimate with their own characteristic (representativeness) errors. Therefore, significant bias introduced by error cross covariance also results in TCA error variances being negatively biased when compared to station-based error variances. Here this is demonstrated both numerically (Fig. 2) and analytically (appendix A, section b). However, it is not possible to predict the sign of the bias between station-based errors, derived via (23), and the true random error (19). The sign of this bias will vary case by case, and its magnitude will tend to be small because of the counteracting impacts of error cross correlation and representativeness errors (appendix A, section a). Even though station-based data errors are often less than the errors of satellite- and model-based data, scaling parameter inconsistencies between datasets should be reduced via proper rescaling before using station data in validation studies in order to obtain unbiased error estimates.

Acknowledgments

We thank Michael Cosh of the U.S. Department of Agriculture for the USDA ARS watershed soil moisture datasets, Robert Parinussa of Vrije Universiteit Amsterdam for LPRM datasets, Wouter Dorigo of Technische Universität Wien for ASCAT datasets, and NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) for Noah datasets. Research was funded by Wade Crow’s membership in the NASA Soil Moisture Active Passive (SMAP) Science Definition Team. M. Tugrul Yilmaz's work is partially supported by Fund 2232 given by Turkish Scientific and Technical Research Council (TUBITAK).

APPENDIX A

Station- versus TCA-Based Error Estimates

In this section, the difference between station-based and TCA-based error variance estimates is examined by deriving general solutions that do not require commonly made assumptions (i.e., errors are orthogonal and mutually independent). Here, we also complement the study of Miralles et al. (2010) by performing a complete analysis of the representativeness errors that does not neglect error components that are assumed to vanish in TCA.

a. Station-based error variances

Assuming no bias between datasets, ground station-based error variance of product can be found as

 
formula

Here we are interested in the real world scenario where $S$ contains representativeness errors; hence, we define $S$ similar to $X$ in its most general form:

 
formula

where the scaling parameter of $S$ reflects the association of station data with the absolute truth. In fact, this representation of station data implies that station data are just another estimate of the truth with their own rescaling and error issues. Without assuming cross-correlated and orthogonal errors vanish, the station-based error variance of (A1) can be found as

 
formula
 
formula
 
formula

In addition to the true random error, the station-based error variance has additional components related to representativeness errors, leaked signal, error nonorthogonality, and error cross covariances. Overall, the leaked signal and error nonorthogonality terms are dampened [similar to (16) and (17) and Fig. 2]. Representativeness errors always positively bias the station-based estimates; however, this is counterbalanced by error cross covariances, which negatively bias the estimates. Even though representativeness error variances can be larger than error covariances, the net of these two contributions does not have a predictable sign. Accordingly, it is appropriate to treat the station-based estimates as unbiased estimates of the true random error variance. Given that ground station data are used to estimate error bias terms (16)–(18), the above also implies that using ground station data as truth gives unbiased estimates of these bias terms.

b. Station- and TCA-based error variance comparison

Station-based error variance estimates are not clearly biased, but both analytical expressions and numerical results show that TCA-based estimates are negatively biased [(9)–(11); also see the discussion in the results section]. Yet, it is of interest to analytically confirm whether or not the difference between station- and TCA-based estimates is clearly biased in the presence of nonorthogonal and cross-correlated errors. TCA-based estimates require availability of at least three products. This third product is not used in station-based calculations, implying that comparisons of the two estimates may not be straightforward. Here, instead of the usual form of TCA estimation (9), we assume a version that does not require a third product:

 
formula
 
formula
 
formula
 
formula

where $X$ is the dataset of interest and the second dataset can be any other independent dataset. Here, the second dataset is selected as the ground-based station data. This equation is directly comparable to standard TCA, while both double (A6) and triple (9)–(11) representations give identical error variance estimates under perfect conditions (rescaling issues are handled before this estimation step, and no orthogonal or cross-correlated errors are present). This shorter form also has error terms similar to those of the full TCA form: leaked signal (first square-bracket term above) and orthogonal error (second term) components are dampened by the rescaling factor differences while the error cross-covariance term (third term) is negatively biased. Hence, the underestimation of true error (19) with this shorter version (A6) is similar to the full TCA version. Moreover, this shorthand form is tested using real data presented in this study, and in a majority of cases it yields error estimates very close to those of the standard TCA form (results not shown).

The difference between station- and TCA-based error variances is found as

 
formula
 
formula

As it is the case with error variance estimates, station- and TCA-based error variance estimate difference () has its own components in (A11) related with leaked signal , orthogonal errors , and cross-correlated errors . Here all error cross covariances (, , and ) are expected to be positive, assuming these cross covariances are largely related with inconsistent representation of the absolute truth (hence positively correlated). Assuming representativeness errors of station are less than product errors (can be easily justified as cross correlations of station with other products is the highest), then it is reasonable to expect the additional positive cross-covariance and orthogonal terms to be less when compared to other datasets. This would increase the numerator of (7) less than the denominator. Accordingly, it is plausible that the rescaling factors would be underestimated when station data are selected as the reference dataset, which results in after rescaling (i.e., station signal variance is higher than product signal variance in Fig. 3). This makes the first square-bracket term (total four) of (A11) become positive. Similarly, it is very likely that the second term of is positive too (assuming and ~ ). Assuming all error cross covariances are positive (i.e., assuming products are largely independent, then errors could be correlated due to their correlations with the truth and become positive), then the third term of becomes negative. Finally, it is also plausible that the variance of the representation error is higher than its cross covariance with , resulting in the last term to become positive.

In summary, three of the four terms that appear in are generally positive, and the other is negative. However, because the leaked signal and orthogonal error-related terms are dampened in the same way as the TCA errors (i.e., they can be ignored), one negative term and one positive term are left. Assuming that the station representativeness error variance is higher than its cross covariances with the product errors , it is reasonable to conclude that is likely to be positive (i.e., station-based error variances are higher than TCA-based error variances). This is particularly true when errors are not correlated [if , , and , then ] and becomes even clearer when there are additionally no rescaling differences .

APPENDIX B

Impacts of Error Nonorthogonality and Cross Covariance on TCA Bias Terms

Numerical results in Fig. 2 point to a fundamental asymmetry in the impact of error nonorthogonality versus nonzero error cross covariance on TCA. Because (7) describes our rescaling approach, inserting (7) into (16)–(19) analytically explains this asymmetry. This combination yields a single (complex) expression that describes the combined impact of 1) error nonorthogonality, 2) error cross covariance, and 3) the impact of error nonorthogonality and error cross covariance on rescaling results.

To make these complex expressions more interpretable, we make the simplifying assumptions that 1) and 2) product scaling parameters are unknown but approximately equal. In addition, consider an initial case where product errors are perfectly orthogonal to the truth . Based on these assumptions, the combination of (7) with (16)–(18) yields

 
formula
 
formula
 
formula

Note that the bias term can be neglected if error cross covariances are nonzero but approximately equal (i.e., or ). However, an assumption of approximately equal error cross covariances will not yield a zero bias term. Therefore, if errors are orthogonal, approximately equal error cross correlation will dampen the term but not the term.

It is also worth considering a second case that assumes zero error cross correlation (but allows for error nonorthogonality). Inserting (7) into (16)–(18) under this assumption yields

 
formula
 
formula
 
formula

Note that and can both be neglected for the case of nonzero but approximately equal levels of error nonorthogonality (i.e., or ). Therefore, even if error nonorthogonality is present, its impact on the and bias terms can be safely neglected if it is approximately equal for at least two members of the TCA triplet.

Taken as a whole, this analysis demonstrates that the and TCA bias terms will generally be dampened by approximately equal error cross covariance or error nonorthogonality present in any two of the three products of interest. Such dampening, however, does not occur for the term.
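The asymmetry described above can be illustrated with a minimal numerical sketch. It is not the code used in this study: it assumes unit scaling factors for all three products and uses the widely used covariance form of the TCA estimator, which may differ in detail from the rescaling-based form of (7) and (9)–(11). The function names and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def population_covariance(var_truth, err_var, err_cross, err_truth):
    """Covariance matrix of three products x_i = T + e_i (unit scaling assumed).

    var_truth : variance of the truth T
    err_var   : three error variances Var(e_i)
    err_cross : 3x3 symmetric matrix of Cov(e_i, e_j), zero on the diagonal
    err_truth : three error-truth covariances Cov(e_i, T) (nonorthogonality)
    """
    err_var = np.asarray(err_var, dtype=float)
    err_truth = np.asarray(err_truth, dtype=float)
    # Cov(x_i, x_j) = Var(T) + Cov(e_i, T) + Cov(e_j, T) + Cov(e_i, e_j)
    cov = (var_truth + err_truth[:, None] + err_truth[None, :]
           + np.asarray(err_cross, dtype=float))
    # Var(x_i) = Var(T) + 2 Cov(e_i, T) + Var(e_i)
    cov[np.diag_indices(3)] = var_truth + 2.0 * err_truth + err_var
    return cov


def tca_error_variances(cov):
    """Covariance-form TCA estimate of the error variance of each product."""
    est = np.empty(3)
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        est[i] = cov[i, i] - cov[i, j] * cov[i, k] / cov[j, k]
    return est


# Hypothetical values, for illustration only.
var_truth = 1.0
err_var = [0.2, 0.3, 0.4]
no_cross = np.zeros((3, 3))
no_nonorth = [0.0, 0.0, 0.0]

# Case 0: all TCA assumptions hold.
ideal = tca_error_variances(
    population_covariance(var_truth, err_var, no_cross, no_nonorth))

# Case 1: orthogonal errors, equal positive cross covariance c for every pair.
c = 0.1
equal_cross = c * (np.ones((3, 3)) - np.eye(3))
cross = tca_error_variances(
    population_covariance(var_truth, err_var, equal_cross, no_nonorth))

# Case 2: no cross covariance, equal nonorthogonality d for every product.
d = 0.05
nonorth = tca_error_variances(
    population_covariance(var_truth, err_var, no_cross, [d, d, d]))

print("true error variances   :", err_var)
print("ideal case             :", ideal)
print("equal cross covariance :", cross)
print("equal nonorthogonality :", nonorth)
```

Under these assumptions, equal nonorthogonality cancels out of the estimate, whereas an equal positive error cross covariance lowers every estimated error variance by c, which is consistent with the underestimation of the true random error discussed above.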

REFERENCES

Anderson, W. B., B. F. Zaitchik, C. R. Hain, M. C. Anderson, M. T. Yilmaz, J. Mecikalski, and L. Schultz, 2012: Towards an integrated soil moisture drought monitor for East Africa. Hydrol. Earth Syst. Sci., 9, 4587–4631.
Brocca, L., and Coauthors, 2011: Soil moisture estimation through ASCAT and AMSR-E sensors: An intercomparison and validation study across Europe. Remote Sens. Environ., 115, 3390–3408.
Cosh, M. H., T. J. Jackson, P. Starks, and G. Heathman, 2006: Temporal stability of surface soil moisture in the Little Washita River watershed and its applications in satellite soil moisture product validation. J. Hydrol., 323, 168–177.
Cosh, M. H., T. J. Jackson, S. Moran, and R. Bindlish, 2008: Temporal persistence and stability of surface soil moisture in a semi-arid watershed. Remote Sens. Environ., 112, 304–313.
Crow, W. T., and M. J. van den Berg, 2010: An improved approach for estimating observation and model error parameters in soil moisture data assimilation. Water Resour. Res., 46, W12519.
Crow, W. T., and M. T. Yilmaz, 2014: The Auto-Tuned Land Data Assimilation System (ATLAS). Water Resour. Res., 50, 371–385.
Dorigo, W. A., K. Scipal, R. Parinussa, Y. Liu, W. Wagner, R. de Jeu, and V. Naeimi, 2010: Error characterisation of global active and passive microwave soil moisture datasets. Hydrol. Earth Syst. Sci., 14, 2605–2616.
Draper, C., R. Reichle, R. de Jeu, V. Naeimi, R. Parinussa, and W. Wagner, 2013: Estimating root mean square errors in remotely sensed soil moisture over continental scale domains. Remote Sens. Environ., 137, 288–298.
Efron, B., and R. J. Tibshirani, 1993: An Introduction to the Bootstrap. Chapman and Hall, 436 pp.
Ek, M. B., K. E. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. D. Tarpley, 2003: Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model. J. Geophys. Res., 108, 8851.
Hain, C. R., W. T. Crow, J. R. Mecikalski, M. C. Anderson, and T. Holmes, 2011: An intercomparison of available soil moisture estimates from thermal infrared and passive microwave remote sensing and land surface modeling. J. Geophys. Res., 116, D15107.
Hansen, M. C., R. S. DeFries, J. R. G. Townshend, and R. Sohlberg, 2000: Global land cover classification at 1 km spatial resolution using a classification tree approach. Int. J. Remote Sens., 21, 1331–1364.
Jackson, T. J., and Coauthors, 2010: Validation of advanced microwave scanning radiometer soil moisture products. IEEE Trans. Geosci. Remote Sens., 48, 4256–4272.
Jackson, T. J., and Coauthors, 2012: Validation of Soil Moisture and Ocean Salinity (SMOS) soil moisture over watershed networks in the U.S. IEEE Trans. Geosci. Remote Sens., 50, 1530–1543.
Leroux, D. J., Y. Kerr, P. Richaume, and R. Fieuzal, 2013: Spatial distribution and possible sources of SMOS errors at the global scale. Remote Sens. Environ., 133, 240–250.
Leroux, D. J., Y. Kerr, A. Bitar, R. Bindlish, T. Jackson, B. Berthelot, and G. Portet, 2014: Comparison between SMOS, VUA, ASCAT, and ECMWF soil moisture products over four watersheds in U.S. IEEE Trans. Geosci. Remote Sens., 52, 1562–1571.
Miralles, D. G., W. T. Crow, and M. H. Cosh, 2010: Estimating spatial sampling errors in coarse-scale soil moisture estimates derived from point-scale observations. J. Hydrometeor., 11, 1423–1429.
Naeimi, V., K. Scipal, Z. Bartalis, S. Hasenauer, and W. Wagner, 2009: An improved soil moisture retrieval algorithm for ERS and METOP scatterometer observations. IEEE Trans. Geosci. Remote Sens., 47, 1999–2013.
Owe, M., R. de Jeu, and J. P. Walker, 2001: A methodology for surface soil moisture and vegetation optical depth retrieval using the microwave polarization difference index. IEEE Trans. Geosci. Remote Sens., 39, 1643–1654.
Owe, M., R. de Jeu, and T. Holmes, 2008: Multisensor historical climatology of satellite-derived global land surface moisture. J. Geophys. Res., 113, F01002.
Parinussa, R. M., T. R. H. Holmes, M. T. Yilmaz, and W. T. Crow, 2011: The impact of land surface temperature on soil moisture anomaly detection from passive microwave observations. Hydrol. Earth Syst. Sci., 15, 3135–3151.
Portabella, M., and A. Stoffelen, 2009: On scatterometer ocean stress. J. Atmos. Oceanic Technol., 26, 368–382.
Reynolds, C. A., T. J. Jackson, and W. J. Rawls, 2000: Estimating soil water-holding capacities by linking the Food and Agriculture Organization soil map of the world with global pedon databases and continuous pedotransfer functions. Water Resour. Res., 36, 3653–3662.
Rodell, M., and Coauthors, 2004: The Global Land Data Assimilation System. Bull. Amer. Meteor. Soc., 85, 381–394.
Scipal, K., T. Holmes, R. de Jeu, V. Naeimi, and W. Wagner, 2008: A possible solution for the problem of estimating the error structure of global soil moisture data sets. Geophys. Res. Lett., 35, L24403.
Stoffelen, A., 1998: Toward the true near-surface wind speed: Error modeling and calibration using triple collocation. J. Geophys. Res., 103, 7755–7766.
Wagner, W., G. Lemoine, M. Borgeaud, and H. Rott, 1999: A study of vegetation cover effects on ERS scatterometer data. IEEE Trans. Geosci. Remote Sens., 37, 938–948.
Yilmaz, M. T., and W. T. Crow, 2013: The optimality of potential rescaling approaches in land data assimilation. J. Hydrometeor., 14, 650–660.
Yilmaz, M. T., W. T. Crow, M. C. Anderson, and C. Hain, 2012: An objective methodology for merging satellite- and model-based soil moisture products. Water Resour. Res., 48, W11502.
Zwieback, S., K. Scipal, W. Dorigo, and W. Wagner, 2012: Structural and statistical properties of the collocation technique for error characterization. Nonlinear Processes Geophys., 19, 69–80.
Zwieback, S., W. Dorigo, and W. Wagner, 2013: Estimation of the temporal autocorrelation structure by the collocation technique with emphasis on soil moisture studies. Hydrol. Sci. J., 58, 1729–1747.