## 1. Introduction

Given the availability of multiple approaches (i.e., models, in situ observations, and remote sensing) for estimating many geophysical variables, it is often desirable to merge them to obtain a more accurate product. In data assimilation, the goal is to optimally merge independent datasets with different error characteristics to obtain an analysis product with higher accuracy than all of the parent products.

However, the use of different modeling and/or observational approaches typically leads to predictions with different systematic relationships to the assumed truth. This is particularly true for soil moisture data assimilation given well-known climatological differences in both model-derived (Koster et al. 2009) and remotely sensed (Jackson et al. 2010) soil moisture products. Additionally, absolute values of models and observations differ from ground observations (Reichle and Koster 2004; Reichle et al. 2004). Hence, it is crucial to remove systematic differences between different datasets before using them in a hydrological data assimilation framework (Reichle and Koster 2004). This is commonly achieved by rescaling soil moisture observations to match model-predicted soil moisture (in some statistical sense) during a preprocessing step.

Several potential strategies for such rescaling have been proposed and applied in recent land data assimilation studies. Among them, cumulative distribution function (CDF) matching (Reichle and Koster 2004) and variance matching techniques are perhaps the most common. A handful of studies have applied rescaling based on least squares regression techniques (Crow et al. 2005; Crow and Zhan 2007) but failed to offer any clear rationale for this choice. Additionally, signal variance-based rescaling, typically applied as a preprocessing step in triple collocation analysis (Stoffelen 1998), also provides a means to rescale datasets using three independent estimates of the same variable. However, this approach has not yet been applied in soil moisture data assimilation.

Although there are many existing methods for rescaling hydrological variables, their optimality in terms of analysis errors in an assimilation framework has not yet been assessed. This paper investigates the relative performances of the above-mentioned rescaling methods both analytically and numerically.

The theoretical rationale for rescaling, and the degree to which rescaling techniques discussed above are consistent with this rationale, are discussed in the next section. Section 3 briefly presents the numerical experiment setup, section 4 presents the numerical results, section 5 discusses the implications of the results, and section 6 summarizes our conclusions.

## 2. Rescaling datasets

### a. Analytical solution for the rescaling factor

Consider the model predictions **x** and the observations **y** expressed in a linear form as

$$\mathbf{x} = \mu_x + \alpha_x \mathbf{t}' + \boldsymbol{\varepsilon}_x, \tag{1}$$

$$\mathbf{y} = \mu_y + \alpha_y \mathbf{t}' + \boldsymbol{\varepsilon}_y, \tag{2}$$

where *μ*_{x} and *μ*_{y} are the mean values of **x** and **y**, **t**′ is the true anomaly of the geophysical variable, *α*_{x} and *α*_{y} are scaling factors between the magnitude of the anomaly signals of **x** and **y** with **t**′, and *ε*_{x} and *ε*_{y} are zero-mean random errors in **x** and **y**. In hydrological data assimilation, observations **y** are derived from in situ measurements and/or satellite-based retrievals, *ε*_{y} is commonly assumed to lack autocorrelation, and *ε*_{x} is generally considered to contain autocorrelation owing to the temporal memory of the model. In this setup, *μ* and *α***t**′ represent the signal component while *ε* represents the noise component of **x** and **y**. In addition, note that, for the case in which the observations are assumed to capture a linear transformation of **t**′ (rather than **t**′ itself), the required transformation can simply be folded into the existing linear form of (2) through a trivial redefinition of *α*_{y}. As a result, the development below is equally valid for the case of a linear observation operator.

The purpose of data assimilation should be to reduce the magnitude of the noise component while preserving the information obtained from the signal components. Although **x** and **y** have similarities in the way they realize the truth, they often have characteristic differences as well (i.e., different *μ* and *α*). Therefore, without knowledge of the truth, arguably the best way to ensure that the merged product has minimized error variance (assuming the uncertainties of the products are characterized accurately) is to match datasets **x** and **y** so that the systematic differences between them are minimized prior to data assimilation. In practice, this matching can be done by selecting one of the datasets as the reference and linearly rescaling the other one.

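Assuming **x** is the reference dataset, **y** can be rescaled via the general linear transformation

$$\mathbf{y}^* = \mu_x + c_y(\mathbf{y} - \mu_y), \tag{3}$$

where *c*_{y} is a rescaling factor and **y*** the rescaled dataset. Combining (2) and (3), we obtain

$$\mathbf{y}^* = \mu_x + c_y\alpha_y\mathbf{t}' + c_y\boldsymbol{\varepsilon}_y. \tag{4}$$

The error of **y*** can be expressed as

$$\boldsymbol{\varepsilon}_{y^*} = \mathbf{y}^* - \mathbf{t} = (c_y\alpha_y - \alpha_x)\mathbf{t}' + c_y\boldsymbol{\varepsilon}_y,$$

where **t** = *μ*_{x} + *α*_{x}**t**′ since **x** is the reference dataset. Our goal here is identifying the functional form of *c*_{y} that leads to an optimal data assimilation analysis. A key condition for such optimality is that assimilated observations **y*** have orthogonal errors or

$$E[\boldsymbol{\varepsilon}_{y^*}\mathbf{t}'] = (c_y\alpha_y - \alpha_x)E[\mathbf{t}'\mathbf{t}'] + c_yE[\boldsymbol{\varepsilon}_y\mathbf{t}'] = 0,$$

where *E*[·] represents long-term temporal averaging. Because *ε*_{y} is a zero-mean error that is independent of **t**′, the second term vanishes and the condition holds only when

$$c_y = \frac{\alpha_x}{\alpha_y}. \tag{9}$$

This choice of *c*_{y} is in line with the requirement of an optimal filter that the innovations obtained with **y*** be uncorrelated in time and/or errors in the analysis must be orthogonal.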
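The orthogonality argument can be checked numerically. The sketch below is illustrative only: it assumes Gaussian signal and noise, and the function name, the values of `alpha_x` and `alpha_y`, the noise level, and the sample size are all hypothetical choices that do not come from the experiments in this paper. It estimates *E*[*ε*_{y*}**t**′] for a given *c*_{y}.

```python
import random


def rescaled_error_covariance(c_y, alpha_x=0.8, alpha_y=2.5, n=50_000, seed=0):
    """Sample estimate of E[eps_y* t'] under the linear model (1)-(4).

    All numerical settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        t_anom = rng.gauss(0.0, 1.0)   # true anomaly t'
        eps_y = rng.gauss(0.0, 0.5)    # zero-mean observation error
        # Error of the rescaled observation with respect to the reference signal:
        # eps_y* = (c_y * alpha_y - alpha_x) * t' + c_y * eps_y
        err = (c_y * alpha_y - alpha_x) * t_anom + c_y * eps_y
        total += err * t_anom
    return total / n


# With c_y = alpha_x / alpha_y the error is (nearly) orthogonal to t'.
optimal = rescaled_error_covariance(c_y=0.8 / 2.5)
# Without rescaling (c_y = 1) a large systematic term remains.
unscaled = rescaled_error_covariance(c_y=1.0)
print(f"optimal: {optimal:.4f}, unscaled: {unscaled:.4f}")
```

Under these assumptions the first estimate is close to zero (only sampling noise remains), whereas the second is close to (*α*_{y} − *α*_{x}) times the variance of **t**′.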

### b. Numerical solutions for the rescaling factor

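Because *α*_{x} and *α*_{y} are typically unknown, (9) cannot be calculated directly. Instead, most land data assimilation studies attempt to replicate (9) using data that are available (i.e., **x** and **y**). Therefore, it is useful to consider the relationship between functional forms of *c*_{y} derived from potential empirical rescaling strategies and the optimal form in (9). In appendix A we derive functional forms of *c*_{y} obtained 1) by using linear least squares techniques to regress **y** onto **x** (REG), 2) by rescaling **y** so that it has the same long-term temporal variance as **x** (VAR), and 3) by applying triple collocation analysis (TCA) to **x**, **y**, and a third independent product **z**:

$$c_y^{R} = \frac{\alpha_x\alpha_y\sigma_t^2}{\alpha_y^2\sigma_t^2 + \sigma_{\varepsilon_y}^2},$$

$$c_y^{V} = \sqrt{\frac{\alpha_x^2\sigma_t^2 + \sigma_{\varepsilon_x}^2}{\alpha_y^2\sigma_t^2 + \sigma_{\varepsilon_y}^2}},$$

$$c_y^{T} = \frac{E[\mathbf{x}'\mathbf{z}']}{E[\mathbf{y}'\mathbf{z}']} = \frac{\alpha_x}{\alpha_y}, \tag{12}$$

where $\sigma_{\varepsilon_x}^2$, $\sigma_{\varepsilon_y}^2$, and $\sigma_t^2$ are the variances of *ε*_{x}, *ε*_{y}, and **t**′ (also note that $\sigma_x^2 = \alpha_x^2\sigma_t^2 + \sigma_{\varepsilon_x}^2$ and $\sigma_y^2 = \alpha_y^2\sigma_t^2 + \sigma_{\varepsilon_y}^2$ are the total variances of **x** and **y**, respectively). Defining the signal variance to total variance ratio of **y** (str_{y}) and **x** (str_{x}) as

$$\mathrm{str}_y = \frac{\alpha_y^2\sigma_t^2}{\sigma_y^2}, \tag{13}$$

$$\mathrm{str}_x = \frac{\alpha_x^2\sigma_t^2}{\sigma_x^2}, \tag{14}$$

the REG- and VAR-based factors can be written relative to the optimal solution in (9) as

$$c_y^{R} = \mathrm{str}_y\,\frac{\alpha_x}{\alpha_y}, \tag{15}$$

$$c_y^{V} = \left(\frac{\mathrm{str}_y}{\mathrm{str}_x}\right)^{0.5}\frac{\alpha_x}{\alpha_y}.$$

The expressions for the REG- and VAR-based factors in terms of quantities that can be sampled from **x** and **y** are

$$c_y^{R} = \rho_{(x,y)}\frac{\sigma_x}{\sigma_y}, \tag{18}$$

$$c_y^{V} = \frac{\sigma_x}{\sigma_y}, \tag{19}$$

where *ρ*_{(x,y)} is the correlation between **x** and **y** (see appendix A). In these forms (18) and (19), both factors can be computed from **x** and **y** alone.

Note that, considering the definition in (4), variance matching ensures that the first two statistical moments (mean and variance) of **x** and **y*** match. This is sufficient for linear systems with Gaussian errors. However, a more general form of matching is also common in which the higher-order statistical moments are also matched. These so-called CDF matching approaches will be considered in numerical examples presented below.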
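The three empirical rescaling factors can be computed from sample statistics as sketched below. This is a minimal illustration, assuming the regression factor is the sample covariance of **x** and **y** divided by the variance of **y**, the variance-matching factor is the ratio of standard deviations, and the triple collocation factor is the ratio of the cross-covariances with a third product **z**. The helper names and the synthetic settings (`alpha_x`, `alpha_y`, `alpha_z`, and the noise levels) are hypothetical and are not the values used in this study.

```python
import random
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Population-style sample covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def rescaling_factors(x, y, z):
    """Return REG-, VAR-, and TCA-based estimates of c_y (x is the reference)."""
    sx, sy = sqrt(cov(x, x)), sqrt(cov(y, y))
    rho = cov(x, y) / (sx * sy)
    return {
        "REG": rho * sx / sy,            # least squares regression
        "VAR": sx / sy,                  # variance matching
        "TCA": cov(x, z) / cov(y, z),    # triple collocation
    }


# Illustrative synthetic data (assumed values).
rng = random.Random(1)
n = 20_000
alpha_x, alpha_y, alpha_z = 1.0, 2.5, 2.5
t = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [alpha_x * ti + rng.gauss(0.0, 0.5) for ti in t]
y = [alpha_y * ti + rng.gauss(0.0, 2.0) for ti in t]
z = [alpha_z * ti + rng.gauss(0.0, 2.0) for ti in t]

factors = rescaling_factors(x, y, z)
print(factors, "optimal:", alpha_x / alpha_y)
```

With these settings the population values are *α*_{x}/*α*_{y} = 0.4 for the optimal factor, about 0.24 for REG, and about 0.35 for VAR, so only the TCA estimate approaches the optimal value as the sample grows.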

### c. Optimal versus suboptimal solutions

Above we derive the optimal solution for *c*_{y} in a sequential filtering framework as *c*_{y} = *α*_{x}/*α*_{y} in (9), together with the functional forms of *c*_{y} for three different empirical rescaling strategies (REG, VAR, and TCA). Of these, only the TCA-based approach resulted in the optimal solution [i.e., *f* = 1], whereas REG- and VAR-based solutions resulted in approximations to this optimal solution of the form

$$c_y = f\,\frac{\alpha_x}{\alpha_y},$$

where the *f* factors are defined as *f*_{R} = str_{y} in (15) and *f*_{V} = (str_{y}/str_{x})^{0.5}. When *f*_{R} or *f*_{V} are not equal to one, these two approaches diverge from the optimal solution given by (9). Therefore, the suboptimal REG-based solution converges to the optimal solution as str_{y} converges to one, while the VAR-based solution converges to the optimal solution only when str_{x} = str_{y}. Given the ubiquity of VAR- and REG-based rescaling approaches in contemporary land data assimilation, this demonstrates that a widely applied element of existing assimilation systems is generally suboptimal. An optimal solution is available from the TCA-based rescaling approach; however, it requires three independent and mutually linear datasets of sufficient temporal length. If these requirements are not met, which is generally the case for most hydrological data assimilation systems, we are limited to the approximate REG- and VAR-based solutions.
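The convergence behavior described above can be made concrete with a small helper that evaluates the two multiplication factors. This is only a sketch: the function name and the example str values are hypothetical, and it assumes *f*_{R} = str_{y} and *f*_{V} = (str_{y}/str_{x})^{0.5} as stated in the text.

```python
from math import sqrt


def suboptimality_factors(str_x, str_y):
    """Return (f_R, f_V), the multipliers applied to alpha_x / alpha_y."""
    if not (0.0 < str_x <= 1.0 and 0.0 < str_y <= 1.0):
        raise ValueError("str values must lie in (0, 1]")
    f_reg = str_y                  # REG: optimal only as str_y -> 1
    f_var = sqrt(str_y / str_x)    # VAR: optimal only when str_x == str_y
    return f_reg, f_var


# Illustrative cases (assumed values).
for str_x, str_y in [(1.0, 1.0), (0.8, 0.8), (1.0, 0.25)]:
    f_reg, f_var = suboptimality_factors(str_x, str_y)
    print(f"str_x={str_x}, str_y={str_y}: f_R={f_reg}, f_V={f_var}")
```

For equal but imperfect str values the VAR factor equals one while the REG factor does not; when both str values equal one, both factors equal one.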

## 3. Synthetic-twin experiment setup

The synthetic experiments are based on an antecedent precipitation index (API) model of the form

$$x_d = a\,x_{d-1} + b\,P_d, \tag{20}$$

where *d* is day of the year, *x*_{d} is the API model value at *d*, *P*_{d} is the precipitation value at *d*, and the *a* and *b* values are selected as 0.85 and 0.10, respectively. The model is run over a single 0.25° pixel (35°N, 98°W) using daily Tropical Rainfall Measuring Mission (TRMM) 3B42 precipitation accumulations acquired between 1998 and 2010.
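A simple recursion of this kind can be sketched as follows. The exact form of the API model is an assumption here: the sketch takes the value at day *d* to be *a* times the previous value plus *b* times the precipitation at *d*, with *a* = 0.85 and *b* = 0.10 as given in the text. The function name, the initial value `x0`, and the short precipitation series are hypothetical.

```python
def run_api(precip, a=0.85, b=0.10, x0=0.0):
    """Run an assumed API recursion x_d = a * x_(d-1) + b * P_d."""
    x_prev = x0
    series = []
    for p in precip:
        x_prev = a * x_prev + b * p   # decay the previous state, add new rainfall
        series.append(x_prev)
    return series


# Hypothetical daily precipitation accumulations (not TRMM data).
precip = [0.0, 10.0, 0.0, 0.0, 5.0]
api = run_api(precip)
print(api)
```

The state decays geometrically on dry days and jumps on wet days, which gives the model the temporal memory referred to in section 2a.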

Using the above API model, we have created daily synthetic ground truth **t**. Control runs **x** are obtained from model runs that do not assimilate observations, while API values from *d* to *d* + 1 are additively perturbed with random numbers that have a mean of zero and standard deviations given in Table 1. Original observations **y** are created by multiplying the truth with a constant (true observation scaling factors *α*_{y}) and then adding mean-zero random noise with the same standard deviations as the control run (Table 1). We later rescale **y** to **x** by using four different rescaling methods: VAR observations are created using (19), CDF observations are created by using the CDF-matching technique described by Reichle and Koster (2004), REG observations are created using (18), and TCA observations are created using (12). For the TCA-based rescaling, **z** values are created in an identical way to **y** but using a different random number sequence.

Table 1. Standard deviation cases for random additive observation and model perturbations *σ*_{om} (20 cases total).

Three *α*_{y} values [*α*_{y} = (0.12, 1.00, 2.50)] are selected to result in increasing (or decreasing) *c*_{y} and/or *ρ*_{(x,y)}. The true rescaling factors *α*_{y} are given as input in the experiment design and therefore explicitly known. However, *α*_{x} is not known and is instead calculated from the control run anomaly **x**′. Rescaled observations are later assimilated into (20) using an ensemble Kalman filter (EnKF) of the form

$$x_d^{e+} = x_d^{e-} + K_d\left(y_d^{*e} - x_d^{e-}\right),$$

where *e* is the ensemble member number (total is 40); $x_d^{e-}$ and $x_d^{e+}$ are the forecast and the analysis of ensemble member *e* at *d*; $y_d^{*e}$ is the perturbed rescaled observation of ensemble member *e* at *d*; and *K*_{d} is the Kalman gain at *d*. Here we note that the methodology is general to any Kalman filter variant while our choice of EnKF is arbitrary. Ensembles of observations are created by perturbing the observations at any given time step with statistics consistent with the error variances used for the calculation of *K*_{d}. An ensemble of model replicates at any time step is created by adding mean-zero noise (standard deviations given in Table 1) to model forecasts of **x**. The forecast error variance entering *K*_{d} is sampled from the ensemble of **x** at *d*, while the observation error standard deviations are those of the rescaled observations **y***.
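A single scalar update of a perturbed-observation EnKF can be sketched as below. This is a generic textbook formulation under stated assumptions, not necessarily the implementation used in this study: the state is a scalar, the observation operator is the identity (the observations have already been rescaled to model space), and the gain is the ensemble forecast variance divided by the sum of that variance and the observation error variance. The function name and all numerical values are hypothetical.

```python
import random


def enkf_update(forecast_ens, obs, obs_err_sd, rng):
    """Scalar EnKF analysis step with perturbed observations (identity operator)."""
    n = len(forecast_ens)
    mean_f = sum(forecast_ens) / n
    # Forecast error variance sampled from the ensemble spread.
    var_f = sum((m - mean_f) ** 2 for m in forecast_ens) / (n - 1)
    gain = var_f / (var_f + obs_err_sd ** 2)
    analysis = []
    for member in forecast_ens:
        # Each member sees its own perturbed copy of the rescaled observation.
        perturbed_obs = obs + rng.gauss(0.0, obs_err_sd)
        analysis.append(member + gain * (perturbed_obs - member))
    return analysis, gain


rng = random.Random(42)
forecast = [1.0 + rng.gauss(0.0, 0.3) for _ in range(40)]   # 40 members, as in the text
analysis, gain = enkf_update(forecast, obs=1.5, obs_err_sd=0.3, rng=rng)
print(f"gain = {gain:.3f}, analysis mean = {sum(analysis) / len(analysis):.3f}")
```

Because the observation error standard deviation passed to the update must describe the rescaled observations, an underestimated rescaling factor also shrinks the assumed observation error, which is one reason the choice of *c*_{y} affects the analysis weights.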

Using this synthetic-twin framework, we investigate the impact of the REG, VAR, CDF, and TCA rescaling strategies on the accuracy of subsequent EnKF predictions by estimating the error standard deviation of the EnKF analysis and the correlation of the analysis with the truth, *ρ*_{(m,t)}. In particular, we investigate these estimates as a function of *ρ*_{(x,y)} since it differentiates the suboptimal REG- and VAR-based solutions [in (18) and (19)].

## 4. Results

Based on the synthetic-twin EnKF setup described above, we examined the performance of various rescaling strategies by selecting three *α*_{y} values. The resulting EnKF analysis error standard deviations and *ρ*_{(m,t)} are presented in Figs. 2 and 3 (similar to Fig. 1, different model and observation perturbation values are plotted for each rescaling method).

Fig. 2. EnKF analysis error standard deviations for three different *α*_{y} = (a) 0.12, (b) 1.00, and (c) 2.50. For clarity, actual str values (Fig. 1) are not drawn; instead, their max/min values are given. There are overlapping lines: green with brown in (b) and blue with green in (c).

Citation: Journal of Hydrometeorology 14, 2; 10.1175/JHM-D-12-052.1

Fig. 3. As in Fig. 2, except EnKF analysis correlations with truth are plotted on the left axis. Actual str values are shown in Fig. 1. There are overlapping lines: brown and green in (b) and blue with green in (c).

Confirming the earlier theoretical analysis, TCA-based rescaling results in the smallest analysis error standard deviations, except when *ρ*_{(x,y)} values are very low. VAR-based results approach the TCA-based results when (str_{y}/str_{x})^{0.5} values (subscripts *x* and *y* refer to the model and observations, respectively) are around one (Fig. 2b). The differences among the VAR-, CDF-, REG-, and TCA-based results are largest when *ρ*_{(x,y)} values are minimized, which emphasizes the importance of accurate rescaling for variables having moderate to low model/observation correlations (such as soil moisture).

Limited cases, where REG- and VAR-based rescaling produces smaller analysis error standard deviations than TCA-based rescaling, occur when *c*_{y} ≫ 1 and str_{x} and str_{y} are very low (appendix B). This problem is especially acute for REG-based rescaling when *c*_{y} ≫ 1 since it frequently results in rescaled datasets with very small standard deviations due to grossly underestimated rescaling factors when str_{y} ≪ 0.5. Hence, it is necessary to replot Fig. 2 using *ρ*_{(m,t)} as an alternative error metric.

Results in Fig. 3 demonstrate that TCA-based rescaling results in the highest *ρ*_{(m,t)} for all examined cases. In addition, confirming earlier theoretical results, REG- and TCA-based EnKF results have comparable *ρ*_{(m,t)} when str_{y} values are high (Fig. 3c), and VAR-based rescaling converges to TCA-based rescaling when str_{x} and str_{y} are approximately equal (Fig. 3b).

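When *α*_{y} values are very low, TCA-based rescaling can yield larger analysis error standard deviations than the suboptimal methods when str_{y} < 0.5 and *f* < 1 (e.g., *f*_{R} = str_{y} < 1). However, this does not imply anything wrong with TCA-based rescaling; on the contrary, it emphasizes the importance of correctly assigning rescaling factors and illustrates that the goal of rescaling is not necessarily to minimize the analysis error standard deviation.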

## 5. Discussion

Given that we present two suboptimal solutions (REG- and VAR-based rescaling) that are widely applied in hydrological sciences, it is of interest to generalize which one leads to a more accurate analysis under specific conditions. Theoretically, the relative accuracy of REG- and VAR-based rescaling depends on the relative magnitudes of str_{y} and (str_{y}/str_{x})^{0.5}. However, such information is seldom readily available to developers of land data assimilation systems. Hence, it is not straightforward to offer general advice about whether the REG- or VAR-based rescaling method is optimal.

Nevertheless, it is possible to perform a consistency check to see whether a particular rescaling approach is consistent with statistical assumptions made during the implementation of a data assimilation system. For example, in the implementation of an EnKF, specific assumptions must be made regarding the error covariance of observations and the forecast uncertainty of the model. Based on these assumptions, estimates of str_{x} and str_{y} can be readily obtained [i.e., str_{y} and str_{x} can be found as one minus the ratio of the assumed error variance to the total variance of **y** and **x**, respectively]. If the resulting str_{y} is high (≫0.4), then REG-based rescaling is preferable. In general, the choice of REG- or VAR-based rescaling methods is less critical (perhaps negligible) for very high str_{y} and str_{x} values (str > 0.9). However, note that particular str thresholds acquired from Fig. 2 (e.g., 0.4 and 0.9) might be system specific and not generalizable to other assimilation setups using different land models and/or observations. Nevertheless, at a minimum, this consistency check ensures that an applied rescaling approach is not grossly inconsistent with the error assumptions underlying the application of a particular data assimilation approach.
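The consistency check can be sketched as a one-line calculation. It assumes, following the definition of str as the ratio of signal variance to total variance, that str equals one minus the ratio of the assumed error variance to the total sampled variance of a dataset. The function name and the variance values below are hypothetical.

```python
def str_from_error_assumption(total_variance, error_variance):
    """Estimate str as 1 - (assumed error variance) / (total variance)."""
    if total_variance <= 0.0:
        raise ValueError("total variance must be positive")
    if not 0.0 <= error_variance <= total_variance:
        raise ValueError("error variance must lie between 0 and the total variance")
    return 1.0 - error_variance / total_variance


# Hypothetical assimilation settings: sampled total variances and assumed error variances.
str_y = str_from_error_assumption(total_variance=10.25, error_variance=4.0)
str_x = str_from_error_assumption(total_variance=1.25, error_variance=0.25)
print(f"str_y = {str_y:.3f}, str_x = {str_x:.3f}")
```

The resulting values can then be compared with the multiplication factors of section 2c to judge how far the REG- or VAR-based factor is likely to depart from the optimal one under the error assumptions already built into the filter.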

Another important issue is the relevance of this analysis for the case of utilizing an observation operator to directly assimilate satellite brightness temperature *T*_{b} observations rather than geophysical retrievals based on the inversion of *T*_{b}. One interesting implication of applying a forward model to assimilate *T*_{b} is that the errors due to the radiative transfer model are effectively moved from the observation side to the model forecast side of the data assimilation system. As a consequence, assimilating *T*_{b} rather than soil moisture leads to an effective decrease in model-based str (str_{x}) and increase in observation str (str_{y}). In many cases, str_{y} could be quite close to one, since the accuracy goal of low-frequency (<10 GHz) satellite *T*_{b} retrievals used for soil moisture retrieval (often on the order of 1–3 K) tends to be small relative to the observation dynamic range in true *T*_{b} (up to 100 K). This suggests that a REG-based rescaling approach is advantageous for rescaling *T*_{b} observations prior to their assimilation as it yields smaller analysis errors when str_{y} is high and str_{y} > str_{x} (Fig. 2). However, it should be stressed that, while results presented here can be trivially generalized for the application of a linear observation operator, it is currently unknown how significantly they are impacted by the presence of a strongly nonlinear observation operator. Therefore, additional analysis will be required to fully describe the implications of this analysis for *T*_{b} assimilation based on nonlinear forward radiative transfer calculations.

## 6. Conclusions

In hydrological assimilation studies, the primary goal is to combine different datasets to obtain a more accurate one via reducing the level of noise in the datasets. However, if datasets do not have a similar systematic relationship with the assumed truth, merging methodologies can result in increased errors even if the product uncertainties are specified correctly. As a result, it is critical to have correctly rescaled datasets before a merging methodology is applied.

This paper investigated existing methods that are widely applied in hydrological data assimilation studies to rescale observations prior to their assimilation into models. Specifically, we have evaluated the VAR-, CDF-, REG-, and TCA-based rescaling methods. Among these methods, the REG-based linear regression solution has been recognized by some studies (Gupta et al. 2009; Holmes et al. 2012) and applied by Crow et al. (2005) and Crow and Zhan (2007), whereas the vast majority of the hydrological assimilation studies have applied VAR- and CDF-based rescaling strategies. Although the triple collocation solution of Stoffelen (1998) has been widely applied, it was not particularly emphasized before that its intermediate rescaling step should be applied in hydrological data assimilation studies.

In a hydrological assimilation study, if the errors of the reference and the matched datasets (i.e., hydrological model and the observations) are assumed negligible when compared to the real signal (implying very high str values), then these suboptimal rescaling factor solutions give very close to optimal estimates. However, for many hydrological studies the noise of the datasets cannot be ignored; hence, the rescaling method should also take into account the magnitude of the noise components of both datasets. Among the methods, VAR- and CDF-based rescaling methods match the total variance of observations to the model while neglecting the noise contributions of the datasets (Gao et al. 2007), whereas the REG-based rescaling takes into account these error components via the additional multiplication factor of the correlation coefficient. Nevertheless, the VAR-, CDF-, and REG-based rescaling methods are only suboptimal solutions as they generally violate the orthogonality property of an optimal estimation procedure (section 2a). As a result, they provide only approximations to the optimal estimate with a multiplication factor *f* [*f*_{R} = str_{y} for the REG-based solution and *f*_{V} = (str_{y}/str_{x})^{0.5} for the VAR-based solution], and they approach the optimal solution only as *f*_{R} or *f*_{V} converge to one.

This analytical description of

## Acknowledgments

We thank two anonymous reviewers and Bart Forman for their constructive comments, which led to numerous clarifications in the final version of the manuscript. Research was partially supported by Wade Crow’s membership in the NASA Soil Moisture Active/Passive Science Definition Team. The United States Department of Agriculture is an equal opportunity provider and employer.

## APPENDIX A

### Numerical Solutions for the Rescaling Factor

#### a. Rescaling factor from linear least squares regression

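We first derive *c*_{y} by linearly regressing **y** onto **x** and obtaining the best linear expression for **x** in terms of **y**. This least squares sense solution can be found by minimizing the mean square difference (msd) between **x** and **y***:

$$\mathrm{msd} = E[(\mathbf{x} - \mathbf{y}^*)^2] = E[(\mathbf{x}' - c_y\mathbf{y}')^2] = E[\mathbf{x}'\mathbf{x}'] - 2c_yE[\mathbf{x}'\mathbf{y}'] + c_y^2E[\mathbf{y}'\mathbf{y}'],$$

where **x**′ = **x** − *μ*_{x} and **y**′ = **y** − *μ*_{y} are the anomalies and the second equality follows from (3). Taking the derivative of the msd with respect to *c*_{y} and setting it to zero, we find the regression-based rescaling factor solution

$$c_y = \frac{E[\mathbf{x}'\mathbf{y}']}{E[\mathbf{y}'\mathbf{y}']}.$$

Assuming the errors are independent of the truth and of each other, (1) and (2) give $E[\mathbf{x}'\mathbf{y}'] = \alpha_x\alpha_y\sigma_t^2$ and $E[\mathbf{y}'\mathbf{y}'] = \sigma_y^2$, where $\sigma_t^2$ is the variance of **t**′. Using the definitions of str_{y} in (13) and str_{x} in (14),

$$c_y = \frac{\alpha_x}{\alpha_y}\,\mathrm{str}_y. \tag{A8}$$

Estimating *α*_{x}, *α*_{y}, and str_{y} requires additional ground truth or ancillary datasets that are often not available. Consequently, we will rewrite (A8) in terms of readily available variables. To do this, we apply the definition of correlation between model and observation *ρ*_{(x,y)}:

$$\rho_{(x,y)} = \frac{E[\mathbf{x}'\mathbf{y}']}{\sigma_x\sigma_y} = \left(\mathrm{str}_x\,\mathrm{str}_y\right)^{0.5},$$

so that the regression-based rescaling factor becomes

$$c_y = \rho_{(x,y)}\frac{\sigma_x}{\sigma_y},$$

which is the form given in (18).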

#### b. Rescaling factor from variance matching

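A second approach is to rescale **y** so that its statistical moments match those of **x**. Since the form of (3) already ensures a match in means, the simplest viable case of this transformation is based solely on matching variances. Here, the rescaling factor from variance matching follows from requiring that the variance of **y*** equal the variance of **x**:

$$c_y^2\sigma_y^2 = \sigma_x^2 \quad\Rightarrow\quad c_y = \frac{\sigma_x}{\sigma_y} = \left(\frac{\mathrm{str}_y}{\mathrm{str}_x}\right)^{0.5}\frac{\alpha_x}{\alpha_y},$$

where the last equality uses the definitions of str_{y} in (13) and str_{x} in (14).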

#### c. Rescaling factor from triple collocation

Triple collocation analysis (TCA) is an error magnitude estimation method that uses three linearly related independent products to obtain the errors of each product separately. It was initially introduced for error magnitude estimation in oceanic studies (Stoffelen 1998; Caires and Sterl 2003), and has recently been applied to large-scale soil moisture error estimation-based studies (Scipal et al. 2008; Parinussa et al. 2011; Hain et al. 2011; Yilmaz et al. 2012; Anderson et al. 2012). These studies are typically based on one model-based soil moisture product and two remotely sensed products derived from contrasting remote sensing retrieval techniques (e.g., passive and active microwave).

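Here **z** is a third independent product that is similar to **x** and **y** (1)–(2), and defined as **z** = *μ*_{z} + *α*_{z}**t**′ + *ε*_{z} with time anomaly **z**′ = *α*_{z}**t**′ + *ε*_{z}. The TCA-based rescaling factor is obtained from the anomaly cross products as

$$c_y = \frac{E[\mathbf{x}'\mathbf{z}']}{E[\mathbf{y}'\mathbf{z}']}. \tag{A21}$$

Assuming all product errors are independent from both the truth and each other, (A21) can be rewritten as

$$c_y = \frac{\alpha_x\alpha_zE[\mathbf{t}'\mathbf{t}']}{\alpha_y\alpha_zE[\mathbf{t}'\mathbf{t}']} = \frac{\alpha_x}{\alpha_y},$$

which is identical to the optimal solution in (9).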

## APPENDIX B

### Optimal versus Suboptimal Rescaling Error Variances

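Here we compare the analysis error variances obtained with the optimal rescaling factor *c*_{o} (= *α*_{x}/*α*_{y}) and with a suboptimal rescaling factor *c*_{s} (= *f c*_{o}) in (B1), where *w*_{o} and *w*_{s} are the weights of the rescaled observations associated with the optimal and suboptimal rescaling factors, respectively. Given that the optimal analysis satisfies *α*_{y}*c*_{o} − *α*_{x} = 0, (B1) can be simplified accordingly.

The suboptimal solutions can yield lower analysis error variances than the optimal solution when str_{x}, str_{y}, and *α*_{y} are very low and str_{y} < str_{x} [setup in Fig. 2a for very low *ρ*_{(x,y)}]. Furthermore, for the scenario in Fig. 2a *α*_{y} ≪ 1, hence *c*_{o} ≫ 1.

For the VAR-based solution under these conditions (str_{x} and str_{y} are very low), *w*_{s} ~ 0.5. If str_{y} < str_{x} < 1, then *f* < 1, hence *c*_{s} < *c*_{o} (reminder: *c*_{s} = *f c*_{o}). Accordingly, the optimally rescaled observations receive a much smaller weight (*c*_{o} ≫ 1), and *w*_{s} ≫ *w*_{o}. Under this condition (*w*_{s} ≫ *w*_{o}), the first term in (B5) can be approximated using *w*_{s} ~ 0.5. Since *c*_{s} < *c*_{o}, the assumption of *c*_{s} ≪ *c*_{o} overestimates the third term in (B5). Thus, this assumption overall results in a higher number. Even so, with *c*_{s} < *c*_{o} and *w*_{o} ≪ *w*_{s}, the suboptimal analysis error variance can fall below the optimal one.

Similarly, for the REG-based solution, str_{y} ≪ 1, hence *c*_{s} ≪ *c*_{o}. It follows that *w*_{s} ≫ *w*_{o}, which also results in the suboptimal analysis error variance falling below the optimal one.

In summary, suboptimal rescaling can produce lower analysis error variances than optimal rescaling when *α*_{y} ≪ 1 and str_{x} and str_{y} are very low. For these conditions, the REG-based rescaling strategy is particularly prone to spuriously low error variances since it grossly underestimates *c*_{y} (given str_{y} < str_{x} ≪ 1, then *f*_{R} < *f*_{V}).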

## REFERENCES

Anderson, W. B., B. F. Zaitchik, C. R. Hain, M. C. Anderson, M. T. Yilmaz, J. Mecikalski, and L. Schultz, 2012: Towards an integrated soil moisture drought monitor for East Africa. *Hydrol. Earth Syst. Sci.*, **9**, 4587–4631.

Caires, S., and A. Sterl, 2003: Validation of ocean wind and wave data using triple collocation. *J. Geophys. Res.*, **108**, 3098, doi:10.1029/2002JC001491.

Chui, C. K., and G. Chen, 1998: *Kalman Filtering with Real-Time Applications.* Springer, 230 pp.

Crow, W. T., and X. Zhan, 2007: Continental-scale evaluation of remotely sensed soil moisture products. *IEEE Geosci. Remote Sens. Lett.*, **4**, 451–455.

Crow, W. T., R. Bindlish, and T. J. Jackson, 2005: The added value of spaceborne passive microwave soil moisture retrievals for forecasting rainfall-runoff partitioning. *Geophys. Res. Lett.*, **32**, L18401, doi:10.1029/2005GL023543.

Entekhabi, D., and Coauthors, 2010: The Soil Moisture Active Passive (SMAP) Mission. *Proc. IEEE*, **98**, 704–716.

Gao, H., E. F. Wood, M. Drusch, and M. F. McCabe, 2007: Copula-derived observation operators for assimilating TMI and AMSR-E retrieved soil moisture into land surface models. *J. Hydrometeor.*, **8**, 413–429.

Gupta, H. V., H. Kling, K. K. Yilmaz, and G. F. Martinez, 2009: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. *J. Hydrol.*, **377** (1–2), 80–91.

Hain, C. R., W. T. Crow, J. R. Mecikalski, M. C. Anderson, and T. Holmes, 2011: An intercomparison of available soil moisture estimates from thermal infrared and passive microwave remote sensing and land surface modeling. *J. Geophys. Res.*, **116**, D15107, doi:10.1029/2011JD015633.

Holmes, T. R. H., T. J. Jackson, R. H. Reichle, and J. B. Basara, 2012: An assessment of surface soil temperature products from numerical weather prediction models using ground-based measurements. *Water Resour. Res.*, **48**, W02531, doi:10.1029/2011WR010538.

Jackson, T. J., and Coauthors, 2010: Validation of Advanced Microwave Scanning Radiometer soil moisture products. *IEEE Trans. Geosci. Remote Sens.*, **48**, 4256–4272.

Koster, R. D., Z. Guo, R. Yang, P. A. Dirmeyer, K. Mitchell, and M. J. Puma, 2009: On the nature of soil moisture in land surface models. *J. Climate*, **22**, 4322–4335.

Parinussa, R. M., T. R. H. Holmes, M. T. Yilmaz, and W. T. Crow, 2011: The impact of land surface temperature on soil moisture anomaly detection from passive microwave observations. *Hydrol. Earth Syst. Sci.*, **15**, 3135–3151.

Reichle, R. H., and R. D. Koster, 2004: Bias reduction in short records of satellite soil moisture. *Geophys. Res. Lett.*, **31**, L19501, doi:10.1029/2004GL020938.

Reichle, R. H., R. D. Koster, J. Dong, and A. A. Berg, 2004: Global soil moisture from satellite observations, land surface models, and ground data: Implications for data assimilation. *J. Hydrometeor.*, **5**, 430–442.

Scipal, K., T. Holmes, R. de Jeu, V. Naeimi, and W. Wagner, 2008: A possible solution for the problem of estimating the error structure of global soil moisture data sets. *Geophys. Res. Lett.*, **35**, L24403, doi:10.1029/2008GL035599.

Stoffelen, A., 1998: Toward the true near-surface wind speed: Error modeling and calibration using triple collocation. *J. Geophys. Res.*, **103** (C4), 7755–7766.

Yilmaz, M. T., W. T. Crow, M. C. Anderson, and C. Hain, 2012: An objective methodology for merging satellite- and model-based soil moisture products. *Water Resour. Res.*, **48**, W11502, doi:10.1029/2011WR011682.