1. Introduction
Hydrological models are essential tools to simulate the fluxes of water and associated storage changes within continental surfaces and hence to understand the terrestrial water cycle (Döll et al. 2016). The first hydrological models emerged during the second half of the nineteenth century as empirical rainfall–runoff models and were initially conceived to predict peak flows (Todini 2007). Nowadays, a noticeable portion of hydrologic models, known as river routing models (RRMs; Perumal and Price 2013), still focuses on river discharge. These RRMs are primarily concerned with spatiotemporal mechanisms for the horizontal propagation of water within river systems and leave the vertical exchange of water between the land and the atmosphere to land surface models (LSMs).
A diversity of RRMs exists in the published literature and the complexity of their driving equations varies from the complete Saint-Venant equation (Barré de Saint-Venant 1871) to simplifications such as the kinematic wave equation (Weinmann 1979) or the Muskingum method (McCarthy 1938; Cunge 1969). Models such as Total Runoff Integrating Pathways (TRIP; Oki and Sud 1998), PCRaster Global Water Balance model (PCR-GLOBWB; van Beek and Bierkens 2008), or Hydrological Modeling and Analysis Platform (HyMAP; Getirana et al. 2012) use the kinematic wave approximation while others such as the Global Water Availability Assessment (GWAVA; Meigh et al. 1999), Hillslope River Routing (HRR; Beighley et al. 2009), or the Routing Applications for Parallel Computation of Discharge model (RAPID; David et al. 2011b) are based on the Muskingum/Muskingum–Cunge methods. Alternatives like LISFLOOD-FP (Bates and De Roo 2000), CaMa-Flood (Yamazaki et al. 2011), or MGB-IPH (Paiva et al. 2013) currently employ more advanced yet simplified versions of the Saint-Venant equations (Bates et al. 2010).
Despite significant advancements in river modeling, RRMs are still plagued by unavoidable inherent uncertainties that originate from an incomplete knowledge of the physics and by the simplifying assumptions that are necessary to the solutions of model equations. Other sources of uncertainties include the approximations resulting from numerical discretization and numerical resolution and limited knowledge of model parameters and model inputs. The combination of these sources of uncertainty results in nonnegligible uncertainties in the outputs of river models.
Available observations along the Earth surface water networks are key allies for river models. While in situ observations have been declining (Vörösmarty et al. 2001) globally, a considerable increase in the availability of spaceborne observations (Alsdorf et al. 2007; McCabe et al. 2017) is helping to fill this gap, and these observations together provide valuable information on surface water extent and surface water elevation from which discharge can be estimated (e.g., Durand et al. 2016). While the relative accuracy of observations and models can be debated, the spatiotemporal coverage of observations remains sparse, and therefore motivates a growing interest in techniques that can coherently merge observations with models. Such techniques are generally known as data assimilation (DA) and consist in combining information from model simulations with observations while accounting for their respective uncertainties in order to improve model estimates (Liu and Gupta 2007).
DA methods for Earth science were first developed and used for atmospheric science (Daley 1991; Kalnay 2003) and oceanography (Ghil and Malanotte-Rizzoli 1991; Bertino et al. 2003). The use of DA for hydrology is relatively more recent but has been rising during the past two decades in part due to an increase in available remotely sensed hydrologic data (Liu et al. 2012). Various components of the terrestrial water cycle have benefited from DA studies including snow cover (Rodell and Houser 2004; Andreadis and Lettenmaier 2006; Zaitchik and Rodell 2009; DeChant and Moradkhani 2011; De Lannoy et al. 2012; Oaida et al. 2019), soil moisture (Pauwels et al. 2001; Brocca et al. 2010; Montzka et al. 2011), land surface temperature (Reichle et al. 2010; Campo et al. 2013), evapotranspiration and vegetation characteristics (Schuurmans et al. 2003; Fang et al. 2011), and terrestrial water storage (Zaitchik et al. 2008; Forman et al. 2012; Eicker et al. 2014; van Dijk et al. 2014; Kumar et al. 2016; Girotto et al. 2017). River modeling has also seen the application of DA methods leveraging measurements of surface water levels (Romanovicz et al. 2006; Matgen et al. 2010; Biancamaria et al. 2011; Pereira-Cardenal et al. 2011; Michailovsky et al. 2013; Pedinotti et al. 2014), river discharge (Vrugt et al. 2006; Clark et al. 2008; Moradkhani et al. 2012; Rakovec et al. 2012; Coustau et al. 2015; McMillan et al. 2013; Abaza et al. 2014; Rafieeinasab et al. 2014; Bauer-Gottwein et al. 2015; Li et al. 2015; Ercolani and Castelli 2017; Emery et al. 2018), or both discharge and water level (Paiva et al. 2013), and even in combination with soil moisture data (Aubert et al. 2003; López López et al. 2016). The objective of these DA methods for river modeling—regardless of the type of observation used—has largely remained to directly or indirectly correct estimates of river discharge.
A variety of DA approaches exist but the most common techniques used in the aforementioned river DA studies appear to be ensemble-based methods such as the ensemble Kalman filter (EnKF; Evensen 1994) or the particle filter (PF; Del Moral 1996). These ensemble-based methods are advantageous because they can efficiently deal with nonlinearities in hydrological systems while remaining relatively easy to implement regardless of model characteristics (Liu et al. 2012). Perhaps most favorably for these ensemble-based methods, all essential components of DA such as the error covariance matrices or the observation operator—together relating the modeled, corrected, and observed variables along with their respective uncertainties—are stochastically estimated from an ensemble of model simulations during the assimilation procedure. Such approaches differ from variational DA methods (Le Dimet and Talagrand 1986; Courtier et al. 1994) or even from the traditional Kalman filter (Kalman 1960) that require an explicit development of such components prior to performing data assimilation. As a result, the EnKF is rather dominant in the published literature (Vrugt et al. 2006; Pereira-Cardenal et al. 2011; Rakovec et al. 2012; Paiva et al. 2013; Abaza et al. 2014; Rafieeinasab et al. 2014; López López et al. 2016; Emery et al. 2018), along with some extensions of the EnKF such as the ensemble square-root filter (Clark et al. 2008), the recursive EnKF (McMillan et al. 2013), the ensemble Kalman smoother (Li et al. 2015), and the local ensemble Kalman smoother (Biancamaria et al. 2011).
Despite the broad benefits of ensemble-based DA methods, such approaches can be—by definition—computationally demanding, and can become cost prohibitive as the resolution and size of study domains increase (Liu et al. 2012). This limitation, while perhaps negligible in the earliest studies focusing on local scales (Vrugt et al. 2006; Romanovicz et al. 2006; Clark et al. 2008) becomes more acute with studies of the world’s largest rivers (Biancamaria et al. 2011; Pedinotti et al. 2014; Paiva et al. 2013; Emery et al. 2018). One potential mitigation strategy for limiting computational costs is the use of simplified versions of the forward model equations as part of the DA method, though it is rather rarely employed (e.g., Margvelashvili et al. 2016). In addition, the stochastic estimation of DA components has motivated the development of correction approaches designed to alleviate some of their imperfections. These corrections include modifications to the error covariance matrices such as localization (Greybush et al. 2011; Sakov and Bertino 2011) to spatially focus the impact of data assimilation, and inflation (Anderson and Anderson 1999; Anderson 2007) to avoid filter divergence (due to a collapsed ensemble spread) and underestimation of the error covariance (caused by insufficient model error specification and/or a limited ensemble size). The application of such correction methods in river DA requires data assimilation expertise and, notwithstanding their effectiveness, may benefit from further justifications on the physical processes defining them.
The purpose of this study is therefore to investigate these parameters, namely the inflation factor and the localization radius, specifically when used in the context of river modeling. As they are not known a priori, our objective is also to reveal the underlying processes determining their optimal values. We use the RAPID model (David et al. 2011b) because its linear equations allow for a classical Kalman filter approach that circumvents both the need for ensembles during assimilation and the large problem sizes associated with ensemble methods. Published applications of RAPID range in domain size from 30 000 to 3 000 000 km² and in spatial resolution from 2 to 5 km (David et al. 2011a,b, 2013a,b, 2015). The underlying code for RAPID has benefited from dedicated efforts toward decreased computational costs (David et al. 2013a, 2015). Previous studies have demonstrated a general capability of RAPID to reproduce observed discharge (David et al. 2011b,a, 2013b). Yet, because the traditional Muskingum method at the core of RAPID remains simple, the model performance is limited in regions that are subject to floods, backwater effects, or home to active anthropogenic storage of surface water, as these processes are not currently accounted for. The existing limitations of RAPID therefore further motivate the inclusion of a DA capability.
The paper is organized as follows. The core routing equations of RAPID are first summarized and followed by a description of a Kalman filter implementation that includes an efficient approximation of its routing procedure. We then present an application to the combined San Antonio and Guadalupe River basins in Texas (see Fig. 1) using state-of-the-art hydrographic and meteorological inputs and discuss our evaluation strategy. Our results follow, along with their implications for the characteristic spatiotemporal scales of the physical processes involved. Core DA components including error covariances and their inflation or localization are then discussed, along with their characteristic spatiotemporal scales.

(a) The Guadalupe River and San Antonio River basins in the United States. (b) The NHDPlus river network with location of the main stems for the Guadalupe and San Antonio Rivers, the 23 assimilation gauges, and the 13 validation gauges (both from USGS). The downstream-most gauges used in Fig. 4 and their closest upstream gauges are also displayed.
Citation: Journal of Hydrometeorology 21, 3; 10.1175/JHM-D-19-0084.1

2. The RAPID model
a. The RAPID model
The matrix
Notably, the core RAPID formulation given in Eq. (3) is a linear system. In addition, if the reaches are sorted from upstream to downstream in all vectors and matrices, Eq. (3) turns into a lower triangular system which facilitates its solving. Further, the sparsity of the river network matrix
However, while RAPID can efficiently be run over large domains, the relative simplicity of its governing equations results in coarse approximations of some hydrological processes. Additionally, imperfect model inputs and model parameters together also cause further limitations in the modeling system. The fusion of discharge observations with discharge simulations from RAPID through data assimilation therefore offers one potential way to alleviate existing challenges in the accurate estimation of discharge in surface water networks. The choice of the specific assimilation approach shall also be made cognizant of past efforts toward computational efficiency.
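The lower triangular structure noted above can be illustrated with a minimal sketch. Eq. (3) is not reproduced in this section, so the code assumes the matrix form of the standard Muskingum method, (I − C1·N)·Q(t+Δt) = C1·Qe + C2·[N·Q(t) + Qe] + C3·Q(t), where N is the river network connectivity matrix and C1, C2, C3 hold the usual Muskingum coefficients. The function names, the dense-matrix storage, and the three-reach example are illustrative assumptions, not the RAPID code.

```python
import numpy as np

def muskingum_coefficients(k, x, dt):
    """Standard Muskingum coefficients for each reach (arrays of length n)."""
    denom = k * (1.0 - x) + dt / 2.0
    c1 = (dt / 2.0 - k * x) / denom
    c2 = (dt / 2.0 + k * x) / denom
    c3 = (k * (1.0 - x) - dt / 2.0) / denom
    return c1, c2, c3

def routing_step(N, c1, c2, c3, q, qe):
    """Advance discharge q by one time step given lateral inflow qe.

    Assumed form: (I - C1 N) q_new = C1 qe + C2 (N q + qe) + C3 q.
    Forward substitution is valid only if reaches are sorted from
    upstream to downstream, so that N is strictly lower triangular.
    """
    rhs = c1 * qe + c2 * (N @ q + qe) + c3 * q
    q_new = np.zeros_like(q)
    for i in range(len(q)):
        # Only already-computed upstream reaches (j < i) contribute.
        q_new[i] = rhs[i] + c1[i] * (N[i, :i] @ q_new[:i])
    return q_new

# Hypothetical three-reach chain: reach 0 -> reach 1 -> reach 2.
N = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
k = np.full(3, 9000.0)   # travel time (s), illustrative value
x = np.full(3, 0.3)      # dimensionless weighting factor, illustrative value
c1, c2, c3 = muskingum_coefficients(k, x, dt=900.0)
q = routing_step(N, c1, c2, c3, q=np.array([1.0, 2.0, 3.0]), qe=np.ones(3))
```

Because the three coefficients sum to one for every reach, a discharge field equal to the accumulated lateral inflow is left unchanged by a step, which is a convenient mass balance check. A production code would store N in a sparse format rather than as a dense array.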
RAPID, as a forward model, was initially developed, tested, and validated (David et al. 2011b) over the Guadalupe and San Antonio River basins (Fig. 1), two basins in Texas with drainage areas of 17 453 and 10 800 km², respectively. The same study domain is used here to validate the proposed data assimilation method.
b. Data
The enhanced National Hydrography Dataset (NHDPlus; McKay et al. 2019) provides a description of the river network for the combined San Antonio and Guadalupe River basins that is composed of 5175 reaches (average length of 3 km) and their contributing catchments (average area of 5.11 km²). This NHDPlus river network was initially used for RAPID in David et al. (2011b) and is also used in this study.
The lateral inflow from the land into the river network is computed based on runoff estimates from version 4.0.5 of the VIC land surface model (Liang et al. 1994; Wood et al. 1997) as provided by phase 2 of the North American Land Data Assimilation System (NLDAS2; Xia et al. 2012a,b). The VIC runoff, available at a 1/8° spatial resolution and at both hourly and monthly temporal resolutions, was retrieved here for a 4-yr study period ranging from 2010 to 2013. A 3-hourly lateral inflow is used as input to RAPID, similarly to previous studies (David et al. 2011b,a, 2013a,b, 2015), and is derived here from the temporal accumulation of hourly NLDAS2 VIC runoff that is spatially aggregated using the catchment centroid method of David et al. (2013b). Moreover, two additional LSMs are included in NLDAS2, namely Noah (Betts et al. 1997; Chen et al. 1997; Ek et al. 2003) and Mosaic (Koster and Suarez 1994), and provide runoff outputs at the same spatiotemporal resolution as the VIC outputs.
The set of Muskingum (k, x) parameters used herein was developed through calibration in a previous study (David et al. 2011a), where it is denoted using α superscripts (kα, xα). Note that this set of parameters was calibrated using a temporal period and a runoff dataset that are both different from those used here. Our approach therefore guarantees that the evaluation of the proposed data assimilation method is performed with parameters that were not specifically tailored for the runoff that is itself the subject of our assimilation procedure.
Observed daily averaged discharge estimates were obtained from the U.S. Geological Survey (USGS) National Water Information System (NWIS). A total of 36 USGS gauges in the Guadalupe and San Antonio basins that have a complete daily data record during our 4-yr study period were retrieved as part of this study.
3. Development of a data assimilation capability for RAPID
While a broad range of advanced DA methods has been used in the aforementioned review of available literature, the classical Kalman filter (KF; Kalman 1960) appears to be best suited for RAPID. This choice avoids the costly computations required for more complex ensemble DA approaches, and is consistent with the linearity of RAPID that is required by the assumptions of the KF. Perhaps more importantly, the use of a KF here permits an investigation of some of the fundamental aspects of various DA methods applied to river modeling in an attempt to reveal the underlying physical processes impacting their performance. Note that development of a Kalman filtering approach for RAPID is also an improvement over the existing direct insertion method that was developed in David et al. (2011a).
a. General aspects of the Kalman filter
The Kalman filter is a sequential DA algorithm, that is, one in which a new correction is performed each time a new observation is available. The assimilation cycle of the KF therefore corresponds to the time window between two successive observations. Two primary assumptions are made in the development of the KF. First, it is presumed that the model being corrected has linear dynamics. Second, all errors (i.e., control variables, model, observations), represented as random variables, are considered to follow a Gaussian distribution with a zero mean (i.e., unbiased errors), hence ensuring that error statistics are fully determined by their associated error covariance matrices.
The KF assimilation cycle k is divided into two steps. The first “background” step temporally propagates the model from the last update to the time when a new observation is available. This mechanism, sometimes referred to as “forecast” step or “prediction” step, provides an a priori estimate of the modeled state at the new observation time from a direct execution of the model based on the background control variables
Note that the most general definition of the KF also includes two equations allowing for the update of the control variable error covariance matrix at each analysis step. However, given our specific data assimilation setup (section 3), we opted here for a common approach in which the a priori error covariance matrix
Figure 2 summarizes all the main features of our implementation of the classical Kalman filter to RAPID. Each component of the DA scheme is further described below.
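As a generic illustration of the analysis step described above, the sketch below applies the textbook Kalman update to a control vector. It is not the RAPID implementation: the names `x_b`, `B`, `H`, `y`, and `R_obs` are placeholders for the background control variables, their error covariance, a linear observation operator, the observation vector, and the observation error covariance, and the covariance update equations are left out.

```python
import numpy as np

def kalman_analysis(x_b, B, H, y, R_obs):
    """Textbook Kalman analysis step (illustrative, names are assumptions).

    x_b   : background control vector, shape (n,)
    B     : background error covariance, shape (n, n), symmetric
    H     : linear observation operator, shape (m, n)
    y     : observation vector, shape (m,)
    R_obs : observation error covariance, shape (m, m), symmetric
    """
    d = y - H @ x_b                      # innovation
    S = H @ B @ H.T + R_obs              # innovation covariance
    # Gain K = B H^T S^{-1}, computed with a linear solve instead of an
    # explicit inverse (uses the symmetry of B and S).
    K = np.linalg.solve(S, H @ B).T
    x_a = x_b + K @ d                    # analysis (corrected) control vector
    return x_a, d

# Scalar example: equal background and observation error variances,
# so the analysis lands halfway between background and observation.
x_a, d = kalman_analysis(np.array([0.0]), np.array([[1.0]]),
                         np.array([[1.0]]), np.array([2.0]),
                         np.array([[1.0]]))
```

In the setup summarized in Fig. 2, the corrected quantity is the runoff forcing, and the model is then rerun with the updated runoff; that rerun is outside the scope of this sketch.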

The data assimilation approach used in this study over the assimilation cycle k includes 1) a background step using a priori runoff data where the associated variables are presented in blue, 2) an analysis step correcting the daily averaged runoff forcing through Kalman filtering using the observed variables displayed here in green, and 3) a rerun using the updated runoff data where the associated corrected variables are shown in red.
Citation: Journal of Hydrometeorology 21, 3; 10.1175/JHM-D-19-0084.1

b. Observations and assimilation window
The observed variables used herein consist of daily-averaged discharge measurements from in situ gauges. Only a subset ny of all nr reaches in the study domain are observed and all gauges retained have daily observations available every day. The length of any assimilation cycle is therefore set to one day. All available observations during the daily assimilation cycle k are gathered in the observation vector
c. Control variables xk
The variables involved in any execution of RAPID (see Eq. (3)) are the lateral inflow Qe into the river network, the Muskingum parameters (k, x), and the simulated discharge Q within the network. DA methods can be used to correct any one of these variables if selected as the control variable. The selection of the control variable then drives the derivation of the specific KF equation to be used. We chose not to focus on model parameters here, in light of our previous work on automatic calibration of RAPID (David et al. 2011a,b, 2013b). Because the data assimilation platform was developed concurrently with the runoff error propagation model published in David et al. (2019), and because the Kalman filter requires an explicit control error model to be defined, the lateral inflow Qe was selected as the control variable for this study, a DA practice first proposed and tested in Pan and Wood (2013) and later studied in Fisher et al. (2020) and Yang et al. (2019).
d. Observation operator and innovation dk
The choice of the observation variable yk and control variable xk over the assimilation window k together dictate the derivation of the observation operator
The extraction of the daily-averaged simulated discharges at the observed locations is then performed by applying a selection operator
Any potential uncertainty introduced in the system with the assumption of perfect initial conditions can be associated with representativeness error. In DA, this type of error, directly related to the observation operator, represents the imperfect mapping from the control space to the observation space (Janjić and Cohn 2006; Janjić et al. 2017). However, the study of the representativeness error is beyond the scope of this paper.
e. Observation error covariance matrix
Observation errors comprise measurement errors, which occur systematically whenever an instrument is used to make a measurement, and representativeness errors, which originate from the flawed representation of the real observed system when using a model and simplifying assumptions. As previously introduced, the representativeness errors are neglected in the present study, and therefore the observation errors reduce to the errors in the measured discharge.
f. Runoff error covariance matrix
The error in lateral inflow is estimated based on the comparison of the aforementioned VIC-based lateral inflow with the NLDAS2 ensemble average in Eq. (14). Such an estimate of the runoff error is computed at each time runoff data are available. This time series is then used to estimate
g. Adjustable implementation features
The investigation of the physical processes impacting the performance of data assimilation for river modeling motivates the inclusion of two adjustable features respectively related to localization and to inflation of the runoff error covariance matrix. A third adjustable feature is also included in an effort to retain the computational efficiency of RAPID while allowing for controllable approximations on one of the key matrices involved in the computation of Eq. (9).
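To make the first two adjustable features concrete, the sketch below applies a hard cutoff localization and a scalar inflation to a covariance matrix. The details are assumptions made for illustration only: the radius is taken as a count of river reaches, `hop_distance` is a hypothetical precomputed matrix of reach counts between pairs of reaches, the mask is a simple 0/1 cutoff, and the inflation factor multiplies the covariance directly. The exact definitions used in this study are given in the subsections that follow.

```python
import numpy as np

def localize_and_inflate(B, hop_distance, radius, inflation):
    """Hard-cutoff localization followed by scalar inflation (illustrative).

    B            : error covariance matrix, shape (n, n)
    hop_distance : number of reaches separating each pair, shape (n, n)
    radius       : localization radius R, in number of reaches (assumed unit)
    inflation    : scalar factor applied to the localized covariance
    """
    # Keep covariances only between reaches within the radius.
    mask = (hop_distance <= radius).astype(B.dtype)
    # Elementwise (Schur) product, then uniform scaling.
    return inflation * (mask * B)

# Hypothetical chain of four reaches: distance is the index difference.
idx = np.arange(4)
hop_distance = np.abs(idx[:, None] - idx[None, :])
B = np.ones((4, 4))
B_loc = localize_and_inflate(B, hop_distance, radius=1, inflation=2.0)
```

Note that a hard 0/1 mask does not in general preserve positive semidefiniteness, so a tapered correlation function is often preferred in practice; the cutoff is used here only because it is the simplest case to read.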
1) The localization radius R
Localization is traditionally used in ensemble-based data assimilation methods to limit spurious correlations resulting from small-sized ensembles. While a classical (nonensemble) Kalman filter is used in this study, the estimation of
A simple static B-localization of

Impact of the localization radius R: (a) river reaches within two different radii (R = 25 and R = 50) upstream and downstream of two gauges of interest, (b) nonzero pattern of the runoff error covariance matrix
Citation: Journal of Hydrometeorology 21, 3; 10.1175/JHM-D-19-0084.1

Note that, by enforcing the value of some element in the matrix
2) The inflation factor I
The artificial increase of the magnitude of
3) The Muskingum operator threshold ε
4. Results
a. Evaluation strategy
The evaluation of the proposed data assimilation methodology is here designed sequentially as a three-step strategy. The initial step evaluates an open-loop simulation—that is, a RAPID model execution without data assimilation—against observed discharge records in order to provide a basis for comparison, and also validates our runoff error estimates using the uncertainty propagation approach of David et al. (2019).
The second evaluation step focuses on sensitivity analysis for two fundamental aspects of data assimilation applied to river modeling, that is, error localization R and error inflation I, in an attempt to reveal the underlying physical processes impacting their performance. Note that each aspect is evaluated independently while keeping the other at a constant nominal value. Additionally, no approximation is made to the Muskingum operator (ε = 0) during this second step in order to fully conserve the physical properties of the system. The optimal values for error localization R and error inflation I are determined through the analysis of various discharge metrics and kept for subsequent experiments.
For the experiments using data assimilation, only 23 out of the 36 gauges were selected for assimilation into RAPID, while the 13 remaining gauges were kept for validation (Fig. 1). This selection is performed such that only one gauge is assimilated out of every two consecutive gauges along the same stem of the river network. Note that multiple gauges along the same river system are assimilated simultaneously. However, the update of the lateral runoff only modifies the amount of water entering the river system, and the updated runoff is then propagated downstream through RAPID over the same assimilation period starting from the same initial condition as the forecast run (see Fig. 2), hence ensuring no upstream–downstream discontinuities in mass balance along segments between a pair of stations.
The third and final step of our evaluation investigates the impact of simplifying the Muskingum operator through the threshold ε, in order to estimate the potential savings in storage requirements and execution time, as well as any associated degradation in the performance of the data assimilation methodology.
b. Results before data assimilation
1) Evaluation of the open-loop discharge estimates
We start here with an evaluation of our open-loop discharge simulations, that is, the simulated discharge before using DA. To assess model performance, the model is run freely and serves as a reference for subsequent DA experiments. Given that this experimental setup uses model parameters from a different study (David et al. 2011b) and off-the-shelf lateral inflow without specific calibration for our study domain, limited quality is to be expected from this experiment.
Table 1 (row 1) shows the mean and the median Nash–Sutcliffe efficiency (NSE; Nash and Sutcliffe 1970) values independently for the 23 assimilation and the 13 validation gauges. NSE ranges from −∞ to 1; 1 indicates a perfect match and 0 indicates that the model is as accurate as the mean of the observations. These NSE values are obtained from the comparison of daily averaged observations with daily averaged open-loop simulations. Note again that assimilation and validation gauges are separated here solely for the purpose of subsequent comparisons given that no assimilation is performed in this section. The negative mean and median NSE values obtained here confirm that the open-loop run has very limited overall ability to reproduce the observed discharge throughout the domain. Figure 4 shows an example of observed and open-loop simulated hydrographs for the downstream-most stations in the two subbasins of our study domain: the Guadalupe River at Victoria, Texas (NSE = −9.956), and the San Antonio River at Goliad, Texas (NSE = −14.050), and highlights significant overestimation of discharge although some temporal variability is accurately captured. Mass conservation is enforced in the Muskingum method; the large positive bias that is observed in the open-loop simulation (Fig. 4) is therefore related to large positive runoff bias that provides excessive amounts of water to the river system. This limitation further supports the choice of runoff as the control variable for our DA implementation.
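For reference, the NSE used above can be computed as in the following minimal sketch. The function name and the sample values are illustrative; the formula is the standard definition of Nash and Sutcliffe (1970).

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency of simulated vs observed discharge."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    # One minus the ratio of the squared model error to the variance
    # of the observations about their mean.
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([1.0, 2.0, 3.0])
perfect = nash_sutcliffe(obs, obs)            # 1: perfect match
climatology = nash_sutcliffe(obs, [2, 2, 2])  # 0: as good as the observed mean
doubled = nash_sutcliffe(obs, 2 * obs)        # negative: strong overestimation
```

The last case shows how a simulation that captures the timing of the flow but overestimates its magnitude can still produce a strongly negative NSE, which is the situation reported for the open-loop run.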
Mean and median values of Nash–Sutcliffe efficiency computed from daily averaged discharge statistics over the 4-yr simulation (2010–13) for experimental configurations with varying localization radius R, inflation factor I, and Muskingum operator threshold ε.



Daily hydrographs from the open-loop RAPID simulation and from in situ observations over the 4-yr study period for (a) the Guadalupe River at Victoria (NSE = −9.956), and (b) the San Antonio River at Goliad (NSE = −14.050). The geolocation of these gauges is shown in Fig. 1.
Citation: Journal of Hydrometeorology 21, 3; 10.1175/JHM-D-19-0084.1

2) Evaluation of the runoff error estimates
The control variable error statistics are an essential part of any DA method (e.g., section 3). In this study, the control variable is the combined surface and subsurface runoff from outside of the river network, and
First, using monthly time series, our estimated runoff error, obtained from applying Eq. (14) in Eq. (13), is mapped into its corresponding estimated discharge error using the error propagation model developed by David et al. (2019). Then, the measured discharge error is obtained from the comparison of discharge observations with open-loop simulations. Three error metrics are computed and compared: the error mean (i.e., the bias), the error standard deviation [i.e., the standard error (STDE)], and the root-mean-square error (RMSE). To validate the runoff error model, the estimated discharge errors must match the measured discharge errors. Figure 5a demonstrates that the monthly estimated discharge errors from propagation of monthly estimated runoff errors conserve the spatial variability (coefficients of determination ρ² > 0.95) of the monthly measured discharge errors. Yet, an underestimation of the magnitude of the errors is evidenced by linear trends with slopes smaller than unity.
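The three error metrics named above can be sketched as follows for a single gauge. The sign convention (simulation minus observation, so that a positive bias means overestimation) and the use of the population standard deviation are assumptions of this sketch, as are the function name and sample values.

```python
import numpy as np

def error_metrics(obs, sim):
    """Bias, standard error (STDE), and RMSE of simulated discharge."""
    err = np.asarray(sim, dtype=float) - np.asarray(obs, dtype=float)
    bias = err.mean()                    # error mean
    stde = err.std()                     # error standard deviation (ddof=0)
    rmse = np.sqrt(np.mean(err ** 2))    # root-mean-square error
    return bias, stde, rmse

bias, stde, rmse = error_metrics(obs=[1.0, 2.0, 3.0, 4.0],
                                 sim=[2.0, 2.0, 5.0, 5.0])
```

With this convention the three metrics are linked by RMSE² = bias² + STDE², so a large bias alone is enough to produce a large RMSE.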

Validation of the runoff errors: (a) estimated monthly discharge errors from the propagation of estimated runoff errors vs measured monthly discharge errors, (b) measured daily discharge errors vs measured monthly discharge errors, and (c) as in (a), but for daily errors.
Citation: Journal of Hydrometeorology 21, 3; 10.1175/JHM-D-19-0084.1

Given that the daily assimilation window used in this study requires daily runoff errors, we also extended the aforementioned methodology to runoff statistics estimated from daily time series. Before that, Fig. 5b shows that measured error STDE and RMSE calculated from daily time series are greater than the measured error STDE and RMSE calculated from monthly time series, as was the case in David et al. (2019). This is expected because monthly averaging suppresses daily variability, while the bias clearly remains constant at both time scales. More importantly, Fig. 5b confirms that measured daily errors are highly correlated with measured monthly errors (with coefficients of determination ρ² greater than 0.99).
These high correlations justify the application of the same error propagation methodology to daily errors as shown in Fig. 5c. Here again, the comparison of measured errors and estimated errors leads to high coefficients of determination (ρ² > 0.95). These results allow the indirect validation of the spatial variability of the daily runoff errors. Still, the slopes of the inferred linear relationships indicate that the discharge errors computed from the propagation of runoff errors underestimate the measured errors. However, the information on these slopes can be used to scale the runoff errors so that the corresponding estimated discharge error magnitudes match the measured discharge error magnitudes: here a factor of (1/0.3876)² has to be used for the variances and covariances (as shown in Fig. S1 in the online supplemental material). Such scaling would actually correspond to error inflation in data assimilation.
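The arithmetic behind this scaling can be sketched as follows. The slope is the value quoted in the text; the 2 × 2 covariance matrix `P` is hypothetical and only shows how a single factor applied to variances and covariances rescales the standard deviations.

```python
import numpy as np

slope = 0.3876                # slope quoted in the text (estimated vs measured errors)
std_factor = 1.0 / slope      # factor applying to error standard deviations
var_factor = std_factor ** 2  # factor applying to variances and covariances

# Hypothetical 2x2 runoff error covariance matrix, for illustration only.
P = np.array([[4.0, 1.0],
              [1.0, 9.0]])
P_scaled = var_factor * P

print(f"standard deviation factor: {std_factor:.2f}")
print(f"variance/covariance factor: {var_factor:.2f}")
```

The standard deviation factor evaluates to about 2.58 and the variance factor to about 6.66. Multiplying the whole matrix by one scalar leaves the error correlations unchanged.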
From now on, the runoff error variances and covariances in
c. Results with data assimilation
1) Impact of the localization radius R
The longest path from the most upstream to the most downstream reach in the NHDPlus description of the San Antonio and Guadalupe River network traverses 286 river reaches. This longest path provides an upper bound of R = 286 for the largest possible localization radius in our study. We therefore test several values of R from 286 (for which
Table 1 (rows 2–9) shows the mean and median daily discharge NSE values after assimilation for various values of R and indicates that the assimilation procedure diverges for R ≥ 30. In contrast, all assimilation experiments with a radius ranging from R = 0 to R = 20 are consistently improved compared to the open-loop simulation, and the improvement is evidenced at both assimilation and validation gauges. When compared to R = 20, the specific case of R = 25 shows a slight degradation over the assimilation gauges and a larger degradation over the validation gauges. More notably, R = 25 leads to a lower mean NSE over the validation gauges than the open-loop simulation, hence suggesting that assimilation gauges are overfitted at the expense of validation gauges. Figures 6a and 6b show examples of hydrographs obtained for two values of R and illustrate that increasing the value of the radius nudges data assimilation results closer to the observations. These benefits are expected because larger values of R lead to additional information content for enlarging Kalman filter corrections around available assimilation gauges. However, the results shown in Table 1 indicate that degradations occur beyond a radius on the order of 20 reaches. Yet, the physical meaning of this value remains to be determined.
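For readers who want to reproduce this kind of comparison, the sketch below computes NSE, assumed here to be the standard Nash–Sutcliffe efficiency. The function name `nse` and the discharge values are hypothetical.

```python
import numpy as np

def nse(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    residual = np.sum((simulated - observed) ** 2)
    variability = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variability

# Hypothetical daily discharge values (m^3/s), for illustration only.
observed = np.array([5.0, 8.0, 20.0, 14.0, 9.0, 6.0])
open_loop = np.array([7.0, 7.0, 12.0, 16.0, 12.0, 8.0])
assimilated = np.array([5.5, 8.5, 18.0, 14.5, 9.5, 6.5])

print(f"open loop NSE:   {nse(open_loop, observed):.3f}")
print(f"assimilated NSE: {nse(assimilated, observed):.3f}")
```

An assimilation experiment counts as an improvement at a gauge when its NSE exceeds the open-loop NSE at that gauge. A simulation that does worse than the observed mean yields a negative NSE.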

Discharge hydrographs at (a),(c) Victoria and (b),(d) Goliad over the first six months (January 2010–June 2010) comparing observations, open-loop simulations, and data assimilation for (top) varying localization radius R = 0 and R = 20 while inflation is kept constant at I = 2.58 and (bottom) varying inflation factor I = 1.00 and I = 2.58 while localization is kept constant at R = 20.
Two physical distances are particularly important in this investigation: the radius of the distance traveled by flow waves during the 1-day data assimilation window, represented hereafter by the variable Rp, and the radius of the distance separating consecutive data assimilation gauges, represented hereafter by the variable Ra. To better grasp all of these variables, Fig. S2 illustrates these two topological distances Rp and Ra in the context of the localization radius R for this study.
Figure 7 illustrates the spatial distribution of discharge simulation improvements in NSE compared to the open-loop execution and confirms that optimal data assimilation is obtained for a localization radius of 20. Interestingly, Fig. 7 also highlights a subbasin-specific degradation of data assimilation in that simulations within the Guadalupe River basin (located at the north of the domain) consistently break down at R = 30 whereas such a localization radius has much more limited impact on the San Antonio River basin. Further analysis of river reach lengths in the Guadalupe and San Antonio basins individually reveals that their respective radii of propagation are Rp = 27 and Rp = 32 (because of longer reach lengths in the Guadalupe), hence providing preliminary evidence that degradation occurs when the radius of localization exceeds the radius of propagation.

Overall assimilation results when assimilating 23 gauges with an inflation of I = 2.58 and various localization radii: (a) R = 30, (b) R = 20, (c) R = 10, and (d) R = 0. The maps show, for assimilation gauges (circles) and validation gauges (squares), whether the assimilation improved the simulated discharge (green) or degraded the simulated discharge (red). For a given R (equivalently, for a given map), a marker (circle or square) highlighted in yellow indicates that the best assimilation performance for this gauge was obtained for this value of R (among all maps, each gauge is highlighted only once).
To further investigate the physical drivers of the localization radius, a similar sensitivity analysis was performed over an additional set of assimilation gauges. The new set encompasses all 36 gauges and is characterized by a smaller radius Ra = 12. Despite the shorter distance separating the gauges used in assimilation, the same spatial patterns of improvement are observed (Fig. S3). Specifically, the assimilation methodology improves with increasing localization radius until it starts degrading when the localization radius R exceeds the propagation radius Rp, which further confirms the initial observation above even for this smaller radius of assimilation.
Therefore, our analysis shows that the localization radius should be chosen such that the longest distance for which error covariances are accounted for between connected river reaches does not exceed the distance traveled during one assimilation window. This research also suggests that the localization radius could be regionalized as a function of varying reach length and flow wave celerity, although this is beyond the scope of the current study. All subsequent experiments will use a fixed localization radius of R = 20 and focus on the reference case which assimilates the subset of 23 gauges.
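A topological localization of this kind can be sketched as below. This is only an illustration under stated assumptions, not the truncation implemented in this study: the network is encoded with a hypothetical `downstream` list (index of the next reach, or -1 at the outlet), two reaches are treated as connected when one lies downstream of the other, and covariances are cut off sharply beyond `radius` reaches.

```python
import numpy as np

def localization_mask(downstream, radius):
    """Boolean mask keeping covariances between reaches that lie on a common
    flow path and are at most `radius` reaches apart (radius 0 keeps variances only)."""
    n = len(downstream)
    mask = np.eye(n, dtype=bool)
    for start in range(n):
        reach, hops = downstream[start], 1
        # Walk downstream from `start`, one reach at a time, up to `radius` hops.
        while reach != -1 and hops <= radius:
            mask[start, reach] = mask[reach, start] = True
            reach, hops = downstream[reach], hops + 1
    return mask

# Hypothetical 5-reach network: 0 -> 2, 1 -> 2, 2 -> 3, 3 -> 4, and 4 is the outlet.
downstream = [2, 2, 3, 4, -1]

# Illustrative runoff error covariance: unit variances, 0.5 covariances everywhere.
P = np.full((5, 5), 0.5) + 0.5 * np.eye(5)
P_loc = np.where(localization_mask(downstream, radius=1), P, 0.0)
print(P_loc)
```

In this toy network, reaches 0 and 1 are two tributaries that never share a flow path, so their covariance is removed for any radius, while the covariance between reaches 0 and 3 is retained only once the radius reaches 2.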
2) Impact of the inflation factor I
The runoff error validation effort [section 4b(2)] influenced the initial value of the inflation factor I = 2.58 used in the aforementioned evaluation of the localization radius R. Here we instead set a constant value of R = 20 and investigate the effects of multiple values of I ranging from I = 1.00 (no inflation) to I = 5 where the inflation is about twice what was suggested by our error validation. The same 23 assimilation gauges (Fig. 1) are used over our 4-yr study period (2010–13) and no approximation is made on the Muskingum operator (ε = 0).
Table 1 (rows 10–13) shows the mean and median daily discharge NSE values for these experiments, and suggests that increasing the inflation consistently improves discharge estimates for assimilation gauges up to I = 5, with no indication of degradation for any potentially higher inflation values. This behavior is expected because the Kalman gain weighs the relative magnitude of runoff errors and discharge observation errors so that increasing trust is placed in observed discharge when runoff errors grow from inflation. Figures 6c and 6d show example hydrographs obtained with various levels of inflation at the same two downstream gauges used in section 4c(1) and illustrate the benefits of inflation. The inspection of mean and median NSE values for the other gauges, that is, the validation gauges, in Table 1 instead shows that best performance is obtained on or around I = 2.58. This is also expected from our runoff error validation. Overall, our analysis confirms that inflation is beneficial when exactly making up for underestimated runoff errors, but that excessive inflation degrades simulations away from assimilation sites. Inflation must therefore be used along with an appropriate validation of errors when at all possible.
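The effect of inflation on the Kalman gain can be seen in a scalar sketch. It assumes a single control variable observed directly, and assumes that the inflation factor multiplies the error standard deviation (so the variance grows with the square of the factor); the function name and the variances are hypothetical.

```python
def scalar_gain(background_var, observation_var, inflation=1.0):
    """Scalar Kalman gain for a directly observed variable with inflated background error."""
    inflated_var = inflation ** 2 * background_var
    return inflated_var / (inflated_var + observation_var)

# Hypothetical error variances, for illustration only.
background_var = 1.0
observation_var = 4.0

gains = {I: scalar_gain(background_var, observation_var, I) for I in (1.0, 2.58, 5.0)}
for I, gain in gains.items():
    print(f"I = {I:4.2f} -> gain = {gain:.3f}")
```

The gain grows monotonically toward 1 as the inflation increases, meaning that the analysis is pulled ever closer to the observation at the gauge. The sketch says nothing about unobserved locations, where the text reports that excessive inflation becomes detrimental.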
3) Assimilation results with optimal localization and inflation, and limitations
Analysis of the sensitivity of our data assimilation methodology to the adjustable localization and inflation parameters therefore suggests that a localization radius of R = 20 and an inflation factor of I = 2.58 are optimal for our case study.
Figure 7 shows further illustration of this specific optimal case. With the sole exception of one validation gauge, our proposed data assimilation approach consistently improves daily NSE values for all assimilation and validation gauges (Fig. 7). This confirms that our methodology is able to provide accurate results even for reaches that are not hosts to a gauge, that is, at unobserved locations. The hydrographs in Fig. 8 show the daily behavior of discharge simulations at both validation gauges (Figs. 8a,c) and assimilation gauges (Figs. 8b,d). In all cases, the assimilated discharge is visually closer to the observations than the open-loop discharge as was already evidenced by improved NSE values (Table 1). Note that we evaluated the relative performance of our DA methodology over a high flow period (the year 2010) and a regular flow period (2011)—periods chosen subjectively from the hydrographs in Fig. 4—and found similar performance (see Table S1).

Overall assimilation results for optimal localization (R = 20) and inflation (I = 2.58). Hydrographs are shown for the first two years (2010–11) for clarity at four locations: (a) Cuero and (c) Fall City, respectively upstream of the assimilation gauges of (b) Victoria and (d) Goliad. The location of all four gauges is identified on Fig. 1.
One must note, however, that the assimilated discharge simulations show some oscillations, especially at validation gauges (Fig. S8). A close examination of these oscillations reveals an apparent 2-day period around the gauge observations that appears to be a series of successive over- and undercompensations of the corrections. These oscillations are the source of the degraded NSE noted for one validation gauge and explain the relatively lower NSE values over validation gauges compared to assimilation gauges (Table 1).
The oscillations are likely a source for the instabilities noted with increasing localization radius [section 4c(1)] or inflation [section 4c(2)] and may have triggered the associated divergence of simulations. Multiple other factors could have caused the instabilities including the relatively simple routing equations (section 2), our Kalman filter simplifications of steady error covariances (section 3f), the presence of runoff bias despite not being accounted for by the Kalman filter [section 4b(2)] and the several assumptions made when designing the platform, namely, the perfect initial condition (section 3d), the omission of the representativeness error (section 3d), the assumed absence of observational error covariance (section 3e), the flawed runoff estimates (section 3f) and the static relatively simple truncation of the runoff error covariance matrix [section 3g(1)].
The strength of the assumption concerning perfect initial conditions is expected to vary based on the relative position of a given river reach within the river network. As expected, and as further illustrated in Fig. S9, the lateral runoff contribution to the total flow is relatively small for downstream reaches compared to upstream reaches. However, further investigation of this assumption would involve redefining the error model used in the data assimilation methodology, which is beyond the scope of the present study, although it could be the subject of future developments.
Other options to minimize the oscillations could be to use more regionalized or adaptive localization and/or inflation along with a more refined definition of the control error (by using a larger ensemble to surrogate the true runoff) and the observation error (by including representativeness error) although such approaches are beyond the scope of this paper. One could also adopt a smoothing rather than a filtering approach for the assimilation. Smoothers (e.g., Pan and Wood 2013) are designed to tailor upstream and downstream corrections over longer time windows than filters, though at increased size of the assimilation problem and associated computational costs. Despite these impediments, our proposed data assimilation methodology provides significant improvements in discharge estimates that are evidenced in Table 1 and Figs. 7 and 8.
d. Retaining the computational efficiency of RAPID
The computational implications of the proposed methodology are of importance for RAPID given past efforts ensuring software efficiency (David et al. 2013a, 2015) and ever-increasing domain sizes. Note here that, compared to simulations without data assimilation, our Kalman filtering approach that corrects the inputs inherently imposes a doubling of the computational burden because of the need for both a “background” and an “analysis” step that are each equally as expensive as the open-loop execution. This unavoidable doubling of simulation time was verified experimentally in this study (not shown). However, another aspect of computational efficiency that can benefit from analysis here is the duration of initialization and associated memory requirements.
One critical component of the model setup in this study is the computation and storage of the Muskingum operator
Figures 9a–c show the decreasing number of nonzero elements in the matrix

Influence of threshold ε on the fill pattern of the Muskingum operator M: (a) full Muskingum operator, (b) Muskingum operator with ε = 10⁻⁹, (c) Muskingum operator with ε = 10⁻³, and (d) percent fill for various threshold values. The impact of the threshold on model setup time is also shown in (d).
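The effect of such a threshold on matrix fill can be sketched with a stand-in operator. The matrix below is not the Muskingum operator M of this study: it is the inverse of (I − cN) for a hypothetical chain of 20 reaches with an illustrative coefficient c = 0.2, chosen only because its entries decay with the number of reaches separating two locations.

```python
import numpy as np

def chain_operator(n, c):
    """Inverse of (I - c N) for a chain of n reaches, built from the finite
    Neumann series sum_k (c N)^k, which is exact because N is nilpotent."""
    N = np.eye(n, k=-1)  # N[i + 1, i] = 1: reach i flows into reach i + 1
    M = np.eye(n)
    term = np.eye(n)
    for _ in range(n - 1):
        term = c * (N @ term)
        M += term
    return M

def percent_fill(matrix, eps):
    """Percentage of entries whose magnitude exceeds the threshold eps."""
    return 100.0 * np.count_nonzero(np.abs(matrix) > eps) / matrix.size

M = chain_operator(n=20, c=0.2)
for eps in (0.0, 1e-9, 1e-3):
    kept = np.count_nonzero(np.abs(M) > eps)
    print(f"eps = {eps:g}: {kept} entries kept, {percent_fill(M, eps):.1f}% fill")
```

In this toy case the fill drops as the threshold increases, because entries linking distant reaches are the smallest and are discarded first. Whether the discarded entries matter for the simulations has to be checked against the results, as is done in the text.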
5. Summary and conclusions
While the relative uncertainties of discharge observations from gauges and discharge simulations from river routing models can be debated, the spatiotemporal sparseness of observations has naturally driven their fusion with models in an effort to fill in the observational blanks and hence aid various hydrometeorological and water management endeavors. Our review of available literature suggests that such fusion has to date primarily been accomplished through the use of advanced ensemble-based data assimilation methods. These ensemble methods are beneficial because they readily estimate key components of data assimilation, namely the control error covariance and cross-covariance matrices, but they are also computationally demanding. All assimilation approaches have used corrections designed to alleviate some of their imperfections. These common corrections consist of error variance/covariance localization and inflation which, notwithstanding their efficiency, can benefit from further physically based justifications.
This study therefore evaluates the use of these corrections in the simplest relevant setting, the classical Kalman filter, in an attempt to reveal some of the underlying mechanisms controlling their optimal value. We use the RAPID model and hence take advantage of its linear routing equation. Our application assimilates daily-averaged in situ discharge measurements to correct daily-averaged runoff inputs and our methodology is evaluated in a 4-yr (2010–13) case study of the San Antonio and Guadalupe River basins in Texas. This study also constitutes the initial developments of a Kalman filter capability for the RAPID model for which the retention of its computational efficiency is a self-imposed criterion.
We find that inflation is indeed justified when it compensates for potential discrepancies in the magnitude of the control variable errors, as expected. However, we also show that excessive inflation, while apparently improving simulations at assimilation sites, actually degrades simulations away from observations. Inflation is therefore best applied in conjunction with a detailed validation of errors prior to the assimilation as done in this study. Additionally, our investigation of localization suggests that instabilities (in the form of 2-day-period oscillations) occur when nonzero control error covariances exist between far away reaches. The threshold distance corresponds to approximately 120 km [(2 × R × L) = 2 × 20 × 3 km], that is, the distance traveled by the flow wave during the 1-day assimilation window. Moreover, the experiments indicate that a regionalized localization radius may be beneficial to account for subcatchment-scale variability.
The instabilities evidenced in our results may originate from a variety of simplifying assumptions used in this study including the routing equations, steady error covariances, perfect initial conditions, absence of observational error covariance, omission of representativeness error, flawed runoff error estimates, and presence of bias. Nonetheless, these simplifying assumptions are commonly used and our results may hence be broadly applicable to more advanced data assimilation techniques for river modeling including ensemble methods. Despite these limitations, which primarily appear away from assimilation gauges, our data assimilation algorithm is able to consistently improve the discharge simulations at both observed and unobserved locations. Additionally, the limitations may be a direct result of filtering and could potentially be avoided by smoothing instead of filtering, or by keeping a filtering method but using a dual state-forcing estimation paradigm. Yet, the high computational demands of smoothers and augmented control vectors must also be weighed in such a decision.
Finally, and in light of past efforts focusing on software efficiency with RAPID, this study evaluates the use of a threshold that limits storage and computation requirements for one of the key matrices used in the analysis step of the Kalman filter. We show that minimal thresholds can lead to notable temporal savings during model setup while having very limited impact on the quality of simulations.
Despite the relatively small size of the river basins and short study period used in this paper, our methodology is expected to be applicable to larger geographical domains and longer temporal coverage. However, the current daily assimilation window used in our study is best suited for daily observations and its application to temporally sparser observations may result in issues of persistence. Still, while this study makes use of in situ data from USGS, it is an initial step toward assimilation of satellite observations (despite their different spatiotemporal coverage). Specifically, the much anticipated global discharge estimates from upcoming Earth-orbiting missions such as NASA’s Surface Water and Ocean Topography (SWOT) mission will likely motivate a number of new investigations that could reuse the methodology presented in this paper.
Acknowledgments
Thank you to the editor, to Dr. Claire Michailovsky, and to two anonymous reviewers whose comments helped improve our manuscript. C. M. Emery, C. H. David, K. M. Andreadis, M. J. Turmon, J. T. Reager, J. M. Hobbs, and J. S. Famiglietti were supported by the Jet Propulsion Laboratory, California Institute of Technology, under a contract with NASA, including grants from the SWOT Science Team and the Terrestrial Hydrology Program. We follow a community effort (David et al. 2016; Gil et al. 2016) for sharing software, data, and methods. The Reproducible Routing Rituals (RRR, https://github.com/c-h-david/rrr/tree/20181003) and the Routing Application for Parallel computatIon of Discharge (RAPID, https://github.com/c-h-david/rapid/tree/20180921) are freely available under a Berkeley Software Distribution 3-clause license. The data are shared (http://rapid-hub.org) under a Creative Commons Attribution 4.0 License. The steps linking software and data to produce the results are included with the software. All rights reserved.
REFERENCES
Abaza, M., F. Anctil, V. Fortin, and R. Turcotte, 2014: Sequential streamflow assimilation for short-term hydrological ensemble forecasting. J. Hydrol., 519, 2692–2706, https://doi.org/10.1016/j.jhydrol.2014.08.038.
Alsdorf, D. E., E. Rodríguez, and D. P. Lettenmaier, 2007: Measuring surface water from space. Rev. Geophys., 45, RG2002, https://doi.org/10.1029/2006RG000197.
Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224, https://doi.org/10.1111/j.1600-0870.2006.00216.x.
Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation. Mon. Wea. Rev., 140, 2359–2371, https://doi.org/10.1175/MWR-D-11-00013.1.
Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758, https://doi.org/10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.
Andreadis, K. M., and D. P. Lettenmaier, 2006: Assimilating remotely-sensed snow observations into a macroscale hydrology model. Adv. Water Resour., 29, 872–886, https://doi.org/10.1016/j.advwatres.2005.08.004.
Aubert, D., C. Loumagne, and L. Oudin, 2003: Sequential assimilation of soil moisture and streamflow data in a conceptual rainfall-runoff model. J. Hydrol., 280, 145–161, https://doi.org/10.1016/S0022-1694(03)00229-4.
Barré de Saint-Venant, A., 1871: Théorie du mouvement non permanent des eaux, avec application aux crues de rivières et à l’introduction des marées dans leur lit (in French). C. R. Acad. Sci., 73, 237–240.
Bates, P. D., and A. P. J. De Roo, 2000: A simple raster-based model for flood inundation simulation. J. Hydrol., 236, 54–77, https://doi.org/10.1016/S0022-1694(00)00278-X.
Bates, P. D., M. S. Horritt, and T. J. Fewtrell, 2010: A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. J. Hydrol., 387, 33–45, https://doi.org/10.1016/j.jhydrol.2010.03.027.
Bauer-Gottwein, P., I. H. Jensen, R. Guzinski, G. K. T. Bredtoft, S. Hansen, and C. I. Michailovsky, 2015: Operational river discharge forecasting in poorly gauged basins: The Kavango River basin case study. Hydrol. Earth Syst. Sci., 19, 1469–1485, https://doi.org/10.5194/hess-19-1469-2015.
Beighley, R. E., K. G. Eggert, T. Dunne, Y. He, V. Gummadi, and K. L. Verdin, 2009: Simulating hydrologic and hydraulic processes throughout the Amazon River basin. Hydrol. Processes, 23, 1221–1235, https://doi.org/10.1002/hyp.7252.
Bertino, L., G. Evensen, and H. Wackernagel, 2003: Sequential data assimilation techniques in oceanography. Int. Stat. Rev., 71, 223–241, https://doi.org/10.1111/j.1751-5823.2003.tb00194.x.
Betts, A. K., F. Chen, K. E. Mitchell, and Z. I. Janjić, 1997: Assessment of the land surface and boundary layer models in two operational versions of the NCEP Eta Model using FIFE data. Mon. Wea. Rev., 125, 2896–2916, https://doi.org/10.1175/1520-0493(1997)125<2896:AOTLSA>2.0.CO;2.
Biancamaria, S., and Coauthors, 2011: Assimilation of virtual wide swath altimetry to improve Arctic river modeling. Remote Sens. Environ., 115, 373–381, https://doi.org/10.1016/j.rse.2010.09.008.
Brocca, L., F. Melone, T. Moramarco, W. Wagner, V. Naeimi, Z. Bartalis, and S. Hasenauer, 2010: Improving runoff prediction through the assimilation of the ASCAT soil moisture product. Hydrol. Earth Syst. Sci., 14, 1881–1893, https://doi.org/10.5194/hess-14-1881-2010.
Campo, L., F. Castelli, D. Entekhabi, and F. Caparrini, 2013: Analysis of a two-year meteorological dataset produced on Italian territory with a coupling procedure between a limited area atmospheric model and a sequential MSG-SEVIRI LST assimilation scheme. Int. J. Remote Sens., 34, 3561–3586, https://doi.org/10.1080/01431161.2012.716535.
Chen, F., Z. Janjić, and K. Mitchell, 1997: Impact of atmospheric surface-layer parameterizations in the new land-surface scheme of the NCEP Mesoscale Eta Model. Bound.-Layer Meteor., 85, 391–421, https://doi.org/10.1023/A:1000531001463.
Clark, M. P., D. E. Rupp, R. A. Woods, X. Zheng, R. P. Ibbitt, A. G. Slater, J. Schmidt, and M. J. Uddstrom, 2008: Hydrological data assimilation with the ensemble Kalman filter: Use of streamflow observations to update states in a distributed hydrological model. Adv. Water Resour., 31, 1309–1324, https://doi.org/10.1016/j.advwatres.2008.06.005.
Courtier, P., J.-N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1387, https://doi.org/10.1002/QJ.49712051912.
Coustau, M., F. Rousset-Regimbeau, G. Thirel, F. Habets, B. Janet, E. Martin, C. de Saint-Aubin, and J.-M. Soubeyroux, 2015: Impact of improved meteorological forcing, profile of soil hydraulic conductivity and data assimilation on an operational hydrological ensemble forecast system over France. J. Hydrol., 525, 781–792, https://doi.org/10.1016/j.jhydrol.2015.04.022.
Cunge, J. A., 1969: On the subject of a flood propagation computation method (Muskingum method). J. Hydraul. Res., 7, 205–230, https://doi.org/10.1080/00221686909500264.
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 471 pp.
David, C. H., F. Habets, D. R. Maidment, and Z.-L. Yang, 2011a: RAPID applied to the SIM-France model. Hydrol. Processes, 25, 3412–3425, https://doi.org/10.1002/hyp.8070.
David, C. H., D. R. Maidment, G.-Y. Niu, Z.-L. Yang, F. Habets, and V. Eijkhout, 2011b: River network routing on the NHDPlus dataset. J. Hydrometeor., 12, 913–934, https://doi.org/10.1175/2011JHM1345.1.
David, C. H., Z.-L. Yang, and J. S. Famiglietti, 2013a: Quantification of the upstream-to-downstream influence in the Muskingum method and implications for speedup in parallel computations of river flow. Water Resour. Res., 49, 2783–2800, https://doi.org/10.1002/wrcr.20250.
David, C. H., Z.-L. Yang, and S. Hong, 2013b: Regional-scale river flow modeling using off-the-shelf runoff products, thousands of mapped rivers and hundreds of stream flow gauges. Environ. Modell. Software, 42, 116–132, https://doi.org/10.1016/j.envsoft.2012.12.011.
David, C. H., J. S. Famiglietti, Z.-L. Yang, and V. Eijkhout, 2015: Enhanced fixed-size parallel speedup with the Muskingum method using a trans-boundary approach and a large subbasins approximation. Water Resour. Res., 51, 7547–7571, https://doi.org/10.1002/2014WR016650.
David, C. H., Y. Gil, C. J. Duffy, S. D. Peckham, and S. K. Venayagamoorthy, 2016: An introduction to the special issue on geoscience papers of the future. Earth Space Sci., 3, 441–444, https://doi.org/10.1002/2016EA000201.
David, C. H., J. M. Hobbs, M. J. Turmon, C. M. Emery, J. T. Reager, and J. S. Famiglietti, 2019: Analytical propagation of runoff uncertainty into discharge uncertainty through a large river network. Geophys. Res. Lett., 46, 8102–8113, https://doi.org/10.1029/2019GL083342.
DeChant, C. M., and H. Moradkhani, 2011: Radiance data assimilation for operational snow and streamflow forecasting. Adv. Water Resour., 34, 351–364, https://doi.org/10.1016/j.advwatres.2010.12.009.
De Lannoy, G. J. M., R. H. Reichle, K. R. Arsenault, P. R. Houser, S. Kumar, N. E. C. Verhoest, and V. R. N. Pauwels, 2012: Multiscale assimilation of Advanced Microwave Scanning Radiometer-EOS snow water equivalent and Moderate Resolution Imaging Spectroradiometer snow cover fraction observations in northern Colorado. Water Resour. Res., 48, W01522, https://doi.org/10.1029/2011WR010588.
Del Moral, P., 1996: Non linear filtering: Interacting particle solution. Markov Processes Related Fields, 2, 555–580.
Döll, P., H. Douville, A. Güntner, H. M. Schmied, and Y. Wada, 2016: Modelling freshwater resources at the global scale: Challenges and prospects. Surv. Geophys., 37, 195–221, https://doi.org/10.1007/s10712-015-9343-1.
Durand, M., and Coauthors, 2016: An intercomparison of remote sensing river discharge estimation algorithms from measurements of river height, width, and slope. Water Resour. Res., 52, 4527–4549, https://doi.org/10.1002/2015WR018434.
Eicker, A., M. Schumacher, J. Kusche, P. Döll, and H. M. Schmied, 2014: Calibration/data assimilation approach for integrating GRACE data into the WaterGAP Global Hydrology Model (WGHM) using an ensemble Kalman filter: First results. Surv. Geophys., 35, 1285–1309, https://doi.org/10.1007/s10712-014-9309-8.
Ek, M. B., K. E. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. D. Tarpley, 2003: Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model. J. Geophys. Res., 108, 8851, https://doi.org/10.1029/2002JD003296.
Emery, C. M., A. Paris, S. Biancamaria, A. Boone, S. Calmant, P.-A. Garambois, and J. S. D. Silva, 2018: Large scale hydrological model river storage and discharge correction using satellite altimetry-based discharge product. Hydrol. Earth Syst. Sci., 22, 2135–2162, https://doi.org/10.5194/hess-22-2135-2018.
Ercolani, G., and F. Castelli, 2017: Variational assimilation of streamflow data in distributed flood forecasting. Water Resour. Res., 53, 158–183, https://doi.org/10.1002/2016WR019208.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, https://doi.org/10.1029/94JC00572.
Famiglietti, J. S., and Coauthors, 2011: Satellites measure recent rates of groundwater depletion in California’s Central Valley. Geophys. Res. Lett., 38, L03403, https://doi.org/10.1029/2010GL046442.
Fang, H., S. Liang, and G. Hoogenboom, 2011: Integration of MODIS LAI and vegetation index products with the CSM-CERES-Maize model for corn yield estimation. Int. J. Remote Sens., 32, 1039–1065, https://doi.org/10.1080/01431160903505310.
Fisher, C. K., M. Pan, and E. F. Wood, 2020: Spatiotemporal assimilation–interpolation of discharge records through inverse streamflow routing. Hydrol. Earth Syst. Sci., 24, 293–305, https://doi.org/10.5194/hess-24-293-2020.
Forman, B., R. Reichle, and M. Rodell, 2012: Assimilation of terrestrial water storage from GRACE in a snow-dominated basin. Water Resour. Res., 48, W01507, https://doi.org/10.1029/2011WR011239.
Getirana, A. C. V., A. Boone, D. Yamazaki, B. Decharme, F. Papa, and N. Mognard, 2012: The Hydrological Modeling and Analysis Platform (HyMAP): Evaluation over the Amazon basin. J. Hydrometeor., 13, 1641–1665, https://doi.org/10.1175/JHM-D-12-021.1.
Gil, Y., and Coauthors, 2016: Toward the geoscience paper of the future: Best practices for documenting and sharing research from data to software to provenance. Earth Space Sci., 3, 388–415, https://doi.org/10.1002/2015EA000136.
Girotto, M.,