1. Introduction
a. Motivation
Dynamical processes in the atmosphere evolve on a range of spatiotemporal scales, most comprehensively expressed by the full compressible flow equations. Limit regimes, derived from the full compressible flow equations by scale analysis and asymptotics, describe reduced dynamics, examples being the soundproof anelastic and pseudoincompressible models traditionally used at small to mesoscale, and the hydrostatic primitive equations at large to planetary scales (Pedlosky 2013; Vallis 2017; Klein 2010).
To access the dynamics of the full compressible flow equations and of their limit regimes, separate numerical schemes can be developed for each of the limiting models. From a computational perspective, however, the discrepancies between numerical solutions of different equation sets obtained by essentially the same numerical scheme can be substantially smaller than the discrepancies associated with the solution of one and the same equation set by different numerical schemes (Smolarkiewicz and Dörnbrack 2008; Klein 2009).
Benacchio et al. (2014), Klein et al. (2014), and, separately, Smolarkiewicz et al. (2014) developed discretization schemes for the compressible equations that allow access to the pseudoincompressible model within a single numerical framework, showing equivalent results of both configurations in small to mesoscale tests involving acoustically balanced flows. The blended analytical and numerical framework in Benacchio et al. (2014) and Klein et al. (2014), within which the compressible to pseudoincompressible transition is realized as a continuum of models controlled by an appropriate blending parameter, was conceptually extended in Klein and Benacchio (2016) to include access to hydrostatic models. Benacchio and Klein (2019) then proposed a numerical implementation and achieved equivalence of hydrostatic and nonhydrostatic model solutions on large scales in the absence of vertically propagating acoustic modes.
Balanced data assimilation provides a key motivation for blended numerical models. A problem with local data assimilation is the imbalance that it may induce (Lorenc 2003). As the assimilation procedure takes no heed of specific characteristics of a flow, such as conservation of mass, momentum, and energy, or of particular smoothness properties, the initial balance of a flow state may be destroyed; see Neef et al. (2006), and more specifically Greybush et al. (2011) and Bannister (2015), on the effects of localization on balanced analysis fields.
Physically, local data assimilation in a compressible framework can introduce imbalances through fast acoustic modes with velocity amplitudes that may be of the same order of magnitude as the velocities found in the slowly evolving balanced dynamics of interest, with potentially destructive effects on overall solution quality (Hohenegger and Schär 2007). Judicious use of a blended soundproof-compressible model can counteract this effect. Imbalances inherent in the initial pressure fields can be effectively reduced by solving the initial time steps of a simulation in the pseudoincompressible regime so that, upon the subsequent transition to the compressible regime over several further time steps, the pressure field is balanced with respect to the initial velocity and potential temperature fields (Benacchio et al. 2014; Klein et al. 2014). More specifically, the algorithm leverages a discrete projection of the velocity field onto the space of pseudoincompressible solutions that is orthogonal, up to numerical truncation errors, in an L^{2} inner product weighted by the mass-weighted potential temperature. Additional measures guarantee that the pressure field, too, corresponds to the physically correct pseudoincompressible pressure and not to the Lagrange multiplier calculated as part of the projection. Therefore, the scheme provides the ensemble of balanced solutions closest to the analysis ensemble with respect to the norm induced by the mentioned inner product.
By extension of this insight, when mounting data assimilation on the numerics, a projection of the solution onto the soundproof pseudoincompressible model can suppress the fast acoustic modes arising from the assimilation procedure. After suppression of the fast modes, the remaining time steps until the next assimilation procedure are solved with the compressible model. As this method makes use of the different dynamics modeled by the compressible and soundproof equation sets, it fundamentally deviates from existing methods to handle initialization problems such as the post-analysis digital filter (DFI; e.g., Lynch and Huang 1992) and the incremental analysis update (IAU; Bloom et al. 1996). These techniques act as low-pass filters, and repeated application of the filter may have undesirable effects on long-term dynamics (Houtekamer and Zhang 2016; Polavarapu et al. 2004).
Balance was also shown to improve with the choice of localization space (Kepert 2009) and by allowing observations outside of a localization radius to relax to a climatological mean (Flowerdew 2015). Hastermann et al. (2021) compared the effects of the blending approach with those of the post-analysis penalty method in achieving balanced analysis fields for highly oscillatory systems and found comparable improvements for both methods in the case of nonlinear balance relations. See also Zupanski (2009) and Houtekamer and Zhang (2016) for reviews of balanced atmospheric data assimilation.
b. Contributions
This paper proposes a dynamics-driven method to achieve balanced data assimilation using a blended numerical framework with the following advances:

One-step blending of the pseudoincompressible and compressible models by instantaneous switching. This is achieved by (i) accounting for the fact that Exner pressure fields computed at comparable stages within a time step correspond to different time levels in the compressible and soundproof model; (ii) judiciously converting the thermodynamic variables between the compressible and soundproof models motivated by low Mach number asymptotic arguments; and (iii) carefully selecting, based on steps (i) and (ii), the pressure variables used in converting numerical model states at the blending time interfaces. One-step blending is a sizeable improvement over Benacchio et al. (2014), who needed several intermediate time steps for the blending procedure.

Exploitation of the blended framework for balanced ensemble data assimilation. We employ an untuned data assimilation scheme that is known to introduce imbalances. After each assimilation of data, a single time step in the pseudoincompressible model configuration is used to suppress the fast acoustic imbalances. The model configuration is then switched back to the compressible model. In the reported idealized experiments, balanced analysis fields are obtained by combining data assimilation and blending, thus verifying the ability of the blended model to handle imbalances consistently with the underlying compressible and soundproof dynamics.
The effects of data assimilation and blending on balanced solutions are investigated in the two-dimensional numerical experiments of a traveling vortex and of a rising thermal in a vertical slice (see Kadioglu et al. 2008; Mendez-Nunez and Carroll 1994; Klein 2009). For these tests, unbalanced and untuned data assimilation is shown here to destroy solution quality, while the use of blending effectively recovers the structure of the solution as evaluated by comparison with runs without data assimilation. Moreover, with the balanced data assimilation procedure, the solution quality of the observed quantities is maintained or improved independently of the size of the localization region, which is an important tunable parameter of many sequential data assimilation procedures. The order of magnitude of the imbalances introduced by data assimilation in these idealized test cases is quantified by scale analysis.
The paper is structured as follows. Section 2 contains a brief introduction to data assimilation and the Kalman filters considered here. Section 3 reviews the blended numerical framework. Section 4 proposes the new blending scheme and section 5 details the results of numerical experiments. The effectiveness of the one-step blended soundproof-compressible scheme is investigated for balanced data initialization in section 5a, and its application toward balanced data assimilation in section 5c. Section 6 contains a discussion and the conclusions.
2. Data assimilation: A quick primer
Data assimilation is used in numerical weather prediction to improve forecasting. Existing approaches include 4DVar, which optimizes model states over a finite time horizon in the past before launching a new prediction, and sequential assimilation procedures, which assimilate the available observations at specific points in time. Here we focus on the latter which, owing to their local nature, are more susceptible to the imbalance problem addressed in this paper, especially when the localization is severe (Cohn et al. 1998; Mitchell et al. 2002).
Modern weather forecasting techniques aim to represent the uncertainty of a forecast by generating an ensemble of likely candidates of model states. Such an ensemble can be understood as an approximate representation of a probability distribution over model states. The task of sequential data assimilation is then as follows. Suppose we are given the probabilistic weight of each ensemble member at a previous instant in time, i.e., at the beginning of the current simulation window, together with the forward simulation states of all ensemble members at the current time, i.e., at the end of the simulation window. Then the prior probability distribution pdf_{prior} is represented by the model states at the new time level together with their probabilistic weights inherited from the beginning of the simulation window. Now we are to readjust the current states or the probabilistic weights of the ensemble members, at fixed time, such that the resulting posterior probability distribution pdf_{post} best reflects the observations that have arrived during the simulation window.
The Kalman filters
A class of Monte Carlo–based Kalman filters, the ensemble Kalman filters, avoids the problem of high dimensionality by approximating the underlying probability density functions through the empirical distributions given by an ensemble of individual simulation states (Reich and Cotter 2015). As a consequence, ensemble-based methods are often computationally more efficient than any scheme that aims to explicitly describe entire probability density functions.
A drawback to the ensemble Kalman filter is that the covariance is determined by the spread of the ensemble and is therefore typically underestimated. However, ensemble inflation can be applied by multiplying the ensemble covariance by a constant factor larger than 1. This increases the covariance in the direction of the ensemble spread (Anderson 2007; Van Leeuwen et al. 2015).
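As a generic illustration (not specific to the scheme used in this paper), inflating the ensemble covariance by a factor b corresponds to rescaling the anomalies about the ensemble mean by the square root of b, which leaves the ensemble mean unchanged:

```python
import numpy as np

def inflate(ensemble, b):
    """Multiplicative ensemble inflation sketch.

    Scaling the anomalies about the ensemble mean by sqrt(b)
    multiplies the sample covariance by b while leaving the
    ensemble mean unchanged.

    ensemble : (K, n) array, K members of an n-entry state.
    b        : covariance inflation factor, b > 1.
    """
    mean = ensemble.mean(axis=0)
    return mean + np.sqrt(b) * (ensemble - mean)
```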
This paper uses the local ensemble transform Kalman filter (LETKF) data assimilation method (Hunt et al. 2007) based on the ensemble square root filter (ESRF). The LETKF localizes the observation covariance in such a way that observations farther away from the grid point under analysis have less influence, tapering off to zero influence for observations outside of a prescribed observation radius. The algorithm for the LETKF is provided in appendix A.
Localization prevents spurious correlations of faraway observations while potentially reducing the complexity of the problem by making the observation covariance matrix closer to diagonal (Hamill et al. 2001; Houtekamer and Mitchell 1998). After localization, the analysis is only performed on a smaller local region, and the global analysis ensemble comprises different linear combinations of the ensemble members in each of these local regions. This allows the ensemble to represent a higherdimensional space than one constrained by the ensemble size (Fukumori 2002; Mitchell et al. 2002). A smaller ensemble size may necessitate more severe localization.
When applying the LETKF, there are two potential sources of imbalances. In the case of a nonlinear balance relation, the LETKF fails to recover the desired balance due to its local linear construction. Even without localization and for a given observation, the analysis ensemble of the ESRF is obtained as a linear combination of the forecast ensemble. In the case of linear balances, the situation is more subtle. On the one hand, the ESRF without localization is capable of resolving linear balances due to its linear construction. On the other hand, the LETKF, utilizing localization, does not act as a linear map on the global fields and therefore does not necessarily preserve the balance relation. Numerical experiments in this paper investigate imbalances arising from both these sources.
A smooth localization function, such as the truncated Gaussian function or the Gaspari and Cohn (1999) function, may be used to keep the resulting fields sufficiently smooth.
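As an illustration of the second option, the compactly supported Gaspari and Cohn (1999) fifth-order function can be implemented as follows (a sketch; the localization half-width c is a free parameter, and the weight vanishes beyond twice that distance):

```python
import numpy as np

def gaspari_cohn(d, c):
    """Gaspari and Cohn (1999) fifth-order localization function.

    d : distance(s) between observation and analysis grid point.
    c : localization half-width; the weight reaches zero at d = 2c.
    Returns smooth weights in [0, 1] with compact support.
    """
    r = np.abs(np.atleast_1d(np.asarray(d, dtype=float))) / c
    w = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    # -r^5/4 + r^4/2 + 5r^3/8 - 5r^2/3 + 1 for 0 <= r <= 1
    w[inner] = (((-0.25 * ri + 0.5) * ri + 0.625) * ri - 5.0 / 3.0) * ri**2 + 1.0
    # r^5/12 - r^4/2 + 5r^3/8 + 5r^2/3 - 5r + 4 - 2/(3r) for 1 < r < 2
    w[outer] = ((((ro / 12.0 - 0.5) * ro + 0.625) * ro + 5.0 / 3.0) * ro - 5.0) * ro \
        + 4.0 - 2.0 / (3.0 * ro)
    return w
```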
3. The blended numerical model
a. Governing equations
b. Summary of the numerical scheme
A first-order Runge–Kutta method is used for the advection operator
c. Pseudoincompressible regime
4. Single time-step soundproof-compressible transition
In the following, a conversion of pressure-related quantities, motivated by low Mach number asymptotics and applied prior to the model transitions, is proposed that allows model switching within a single time step.
a. Time level of the pressure-related variables
1) The compressible equations
From (24), π is at time level n + 1/2 after the half-time stepping in (16), while (25) starts with π at time level n for the full-time stepping in (17). Therefore, the time level of π has to be reset from n + 1/2 to n after the half time step in (16) and before the full time step in (17). Furthermore, the time level of π after the full time step in (17) is n + 1 as intended.
2) The pseudoincompressible equations
In contrast to the compressible case, expressions (31) and (36) imply that the Exner pressure π after the half step (15) and (16) is at the time level n, and could be used as the input to (17) as an alternative to using the Exner pressure obtained at the end of time step n − 1. Therefore, π may not have to be reset to time level n after the half-time predictor for the pseudoincompressible solve. Figure 1 summarizes the time-level analysis of π.
b. Conversion of the pressure-related variables
Therefore, at the blending time interfaces between the compressible and the pseudoincompressible configurations, one of the two expressions in (39) is applied depending on the direction of the transition.
c. Association of perturbation variables between the compressible and soundproof models
The time-level analysis of π in section 4a demonstrated that, in a pseudoincompressible solve, both the Exner pressure solution after the full time step from t^{n} to t^{n+1} and that obtained after the subsequent half time step are associated with the same time level t^{n+1}.
Consider then the compressible to pseudoincompressible transition at time n + 1. The term
The
In addition, choice 2 offers a conceptual advantage. The Exner pressure field in the pseudoincompressible model is not controlled by an evolution equation but rather acts as a Lagrange multiplier ensuring compliance of the velocity field with the divergence constraint at some fixed time. Thus, a direct dependence of the pressure on its previous time level data, as occurs under choice 1, is a numerical artifact that should be avoided.
d. Data assimilation and blending
A similar ensemble averaging is applied to obtain
A Kalman gain K^{n} similar to (4c) is obtained from the observation operator
Once the assimilation procedure is completed, the model switches to the pseudoincompressible limit regime and then back again to fully compressible until the next assimilation time. The process of switching back and forth between the model configurations exploits the blended numerical model to achieve balanced data assimilation and is termed blended data assimilation.
In particular, if data are assimilated into the compressible flow equations at time n, then compressible to pseudoincompressible blending entails setting the switch α_{P} to 0 and converting the quantity P_{comp} with (39b). The solution is then propagated in the pseudoincompressible regime for a time step, after which α_{P} is set back to 1, switching to the compressible flow equations. The quantity P_{psinc} is reconverted by (39a) using either
As our principal strategy is to split measures of balancing the flow state from those of assimilating the data, we have not tuned the data assimilation procedures themselves in any way. Tuning the data assimilation parameters may further improve balance, but as our balancing strategy is rather successful without tuning, the degrees of freedom of parameter tuning might be used more efficiently to achieve additional goals aside from the elimination of unphysical acoustic noise.
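The switching schedule described above can be sketched as follows. This is a minimal illustration of the control flow only: the model stepping itself, the assimilation, and the conversions of (39) applied at the regime interfaces are not shown.

```python
def blended_da_schedule(n_steps, obs_interval):
    """Return the model switch alpha_P for each time step (sketch).

    alpha_P = 1 selects the compressible equations, alpha_P = 0 the
    pseudoincompressible limit regime.  Data are assimilated every
    obs_interval steps; the single step immediately following each
    assimilation is run in the limit regime to suppress fast acoustic
    imbalances, after which the model switches back to compressible.
    """
    schedule = []
    for n in range(n_steps):
        just_assimilated = n > 0 and n % obs_interval == 0
        schedule.append(0 if just_assimilated else 1)
    return schedule
```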
5. Numerical results
The idealized test cases of a traveling vortex and a rising warm air bubble are used to validate model performance in this section. To evaluate the effectiveness of the single time-step blended soundproof-compressible scheme, unbalanced states are initialized in the compressible flow equations for both test cases and the blended scheme is applied. The balance of the compressible solution with unbalanced initial states is evaluated by “probe measurements,” i.e., by time series of the flow variables at selected points in the domain, and compared against analogous data extracted from the soundproof solution (Benacchio et al. 2014).
For blended ensemble data assimilation, an ensemble is generated by perturbing the initial conditions. Then, the blended scheme is applied after the assimilation of observations into the compressible flow equations and repeated after each assimilation procedure. The quality of balanced data assimilation is evaluated by root-mean-square errors with respect to a reference solution.
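For reference, a generic form of the root-mean-square error of the ensemble mean against a reference solution is sketched below; the precise definition used in the experiments may differ in detail:

```python
import numpy as np

def ensemble_rmse(ensemble, reference):
    """Root-mean-square error of the ensemble mean (generic sketch).

    ensemble  : (K, ...) array, K members of a gridded field.
    reference : array with the shape of one member, e.g., the truth run.
    """
    mean = ensemble.mean(axis=0)
    return float(np.sqrt(np.mean((mean - reference) ** 2)))
```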
a. Effectiveness of the improved blending strategy
1) The traveling vortex experiment
A stable configuration of the traveling vortex test case of Kadioglu et al. (2008) with f = 0.0 s^{−1} and g = 0.0 m s^{−2} is considered in the domain x = [−5.0, 5.0] km, z = [−5.0, 5.0] km with doubly periodic boundary conditions and a background wind with velocity 100 m s^{−1} in both directions (Fig. 4). Changes made to the initial setup in Kadioglu et al. (2008) are given in appendix B. The time-step size is constrained by advective
An imbalanced initial state is created by setting P = 347.95 kg m^{−2} K and π = 1.0 over the whole domain for the full compressible flow equations (5) with α_{P} = 1. For runs with blending, this imbalanced initial state is propagated for one time step in the limit pseudoincompressible regime followed by the rest of the time steps in the fully compressible model. The blending scheme in section 4 is used to transition between the model regimes.
For this imbalanced initial state, a compressible run with blending is compared with a compressible run without blending and with a pseudoincompressible run (left panel in Fig. 5). Fast acoustic modes are filtered from the blended solution and the result is indistinguishable from the limit pseudoincompressible reference solution, save for an initial adjustment in the first time step. Blending is able to recover the dynamics of the balanced state.
A close-up (right panel of Fig. 5) compares the blended runs with choices of
2) The rising bubble experiment
The choice of reference units yields Ma ≈ 0.0341. All rising bubble experiments presented in this paper are run on a grid with (160 × 80) cells to a final simulation time of 1000.0 s.
The initial pressure fields are set to reflect a horizontally homogeneous hydrostatic pressure field
The initial stages of the bubble evolution are compared for the compressible, pseudoincompressible and one-step blended runs in Fig. 7. As the initial state is not hydrostatically balanced, pressure waves propagate in the compressible configuration (top-left panel) as seen in a time series of pressure perturbation increment probe measurements δp′ at (x, z) = (−7.5, 5) km (orange cross in the top-left panel and blue line in the top-right panel). Here,
Next, the blended run and the pseudoincompressible run are compared in more detail (Fig. 7, middle and bottom panels) with δp′. The probes are located at (x, z) = (−7.5, 5) km (middle panels) and at (x, z) = (0, 5) km (red cross in the top-left panel and bottom panels in Fig. 7), both with a constant small time step Δt = 1.9 s (top, middle-left, and bottom-left panels) and for larger, advective CFL-constrained time steps (middle-right and bottom-right panels, CFL = 0.5 and Δt = 21.69 s for the first two time steps). Away from the bubble trajectory (middle panels), the pressure perturbation increment due to the rising bubble and the remnants of the background acoustics from blending are comparable in amplitude. Larger amplitudes are observed with the blended model and the larger time step (middle-right panel), but they are still very small compared to the fully compressible run (note the different range on the vertical axes between the top-right and middle-right panels). On the bubble trajectory (bottom panels), the pressure perturbation increment due to the rising bubble dominates and the solutions are almost identical.
Throughout the runs, a single time step spent in the soundproof pseudoincompressible regime largely filters out the fast acoustic imbalances of the compressible run (not shown in the middle and bottom panels of Fig. 7). This is quantified by comparing the relative errors with respect to the reference pseudoincompressible run for the compressible run E_{c} and for the blended run E_{b}, defined in (45) and shown in Table 1. The blended-run error E_{b} is more than 25 times smaller than E_{c} for the large time-step case, and more than two orders of magnitude smaller for the small time-step case.
Errors E_{c} and E_{b} (see text for definitions) of the time series of δp′ in [0, 1000] s relative to the reference pseudoincompressible run (middle and bottom panels of Fig. 7). The acoustic time-step size is Δt_{AC} = 1.9 s, while Δt_{ADV} is determined by advective CFL = 0.5 and Δt_{ADV} = 21.69 s for the first two time steps. Probe location (−7.5, 5) km corresponds to the orange marker and orange lines in Fig. 7, and probe location (0, 5) km corresponds to the red marker and red lines.
We also remark that a probe measurement of the full pressure time increment δp differs slightly between the reference pseudoincompressible run and the one-step blended run (not shown). The difference is due to the time dependence of the hydrostatically balanced background pressure
In view of these results, blending can be employed as an effective means to achieve the balanced initialization of data within a fully compressible model. The single time-step balancing capability in the model presented here substantially improves on the performance of Klein et al. (2014) and Benacchio et al. (2014), whose blended models achieved smaller reductions in amplitude compared to the fully compressible case and needed several time steps in the limit regime.
b. Ensemble data assimilation and blending: Setup
1) Traveling vortex setup
To combine blending with data assimilation as described in section 4d, an ensemble is generated by perturbing the initial vortex center position (x_{c}, z_{c}) within the half-open interval [−1.0, 1.0) km for both x_{c} and z_{c}. The vortex is then generated around this center position such that the full vortex structure is translated. A total of 10 such samples are drawn, and they constitute the ensemble members. An additional sample is drawn and solved with the full model for the balanced initial condition. This run, denoted obs, is used to generate the artificial observations. A further run with a setup identical to that of the obs sample is made, this time with blending applied in the first time step; this run is considered the truth in the sequel. This corrects for any errors in the initialization of π, as discussed in section 4c.
The choice of generating the truth and obs through a perturbation of the initial condition is such that the ensemble mean does not coincide with the truth. Otherwise, ensemble deflation alone would be sufficient to make the ensemble converge toward the truth, see also Lang et al. (2017).
The observations are taken from the obs run every 25 s; only a tenth of the grid points are observed, and these are drawn randomly as follows. A Boolean mask selecting a tenth of the grid points is generated, where, if necessary, a ceiling function is applied to obtain an integer number of selected grid points. The entries of the mask are then shuffled using the algorithm of Fisher and Yates (1953), and the Boolean mask is applied to the obs array to obtain the sparse observations. This framework deviates from the more realistic situation in which observations and grid points do not coincide. To simulate measurement noise, Gaussian noise with zero mean is added independently to each of the observed grid points. (The variances used in the experiments are listed in Table D1, and details on how the variances are computed are given in appendix D.) A similar method of generating artificial observations by adding independent Gaussian noise was used in, for example, Bocquet (2011) and Harlim and Hunt (2005) for the Lorenz-63 and Lorenz-96 models.
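The selection of sparse, noisy observations described above can be sketched as follows; numpy's in-place shuffle implements the Fisher and Yates (1953) algorithm, and the noise standard deviation is a stand-in for the values derived from Table D1:

```python
import math
import numpy as np

def sparse_noisy_obs(field, fraction=0.1, noise_std=1.0, seed=0):
    """Observe a random fraction of grid points with additive noise (sketch).

    Builds a Boolean mask with ceil(fraction * N) True entries, shuffles
    it in place (numpy's shuffle is a Fisher-Yates shuffle), and returns
    the noisy observed values together with the mask.
    """
    rng = np.random.default_rng(seed)
    n = field.size
    n_obs = math.ceil(fraction * n)        # ceiling to an integer count
    mask = np.zeros(n, dtype=bool)
    mask[:n_obs] = True
    rng.shuffle(mask)                      # Fisher-Yates, in place
    mask = mask.reshape(field.shape)
    obs = field[mask] + rng.normal(0.0, noise_std, size=n_obs)
    return obs, mask
```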
The regions for localized data assimilation are of size (11 × 11) grid points and only observations within such a patch are considered for analysis operations at the respective central grid point. A localization function corresponding to a truncated Gaussian function is applied such that observations farther from the grid point under analysis have less influence, and that the influence decays smoothly toward the edges of the localization subdomain, where it is abruptly truncated to zero. No ensemble inflation is applied in this case.
Examples of the observations and truths used in the generation and evaluation of the experiments with data assimilation are displayed in Fig. 8. Notice that we run one test with observations of the momentum fields only, and another test with observations of the full set of variables.
The 10 ensemble members in each of these tests are initialized with balanced states, and blending is applied for the first time step when the model runs in the pseudoincompressible configuration. The ensemble is then solved forward in time with the fully compressible model. Data from the generated observations are assimilated every 25 s. The immediate time step after the assimilation procedure is solved in the pseudoincompressible limit regime while the rest of the time steps in the assimilation window are solved using the full compressible model. Conversions according to the blending scheme in section 4 are employed when switching back and forth between the full and limit models. Furthermore, the choice of
The setup is repeated for two additional ensembles and each observation scenario, one where data are still assimilated but no blending is performed (EnDA), and another where neither data assimilation nor blending are performed (EnNoDA). EnNoDA and EnDA constitute an identical twin experiment (Reich and Cotter 2015; Lang et al. 2017), through which the effects of data assimilation can be evaluated. EnDA along with EnDAB constitute yet another identical twin experiment, which evaluates the performance of blending.
2) Rising bubble setup
The rising bubble ensemble spread is generated by randomly modifying the maximum of the potential temperature perturbation δΘ in the half-open interval [2.0, 12.0) K. The ensemble comprises 10 members. While the relative spread of the temperature perturbation is large with this setup, the ensemble spread of the bubble position at the final time of the simulation, t_{fin} = 1000.0 s, is only moderate.
An additional sample is drawn for the obs and the truth, which are identical in this setup. Blending is applied to the first time step of the obs and the truth, obtaining a balanced solution. As the rising bubble flow fields evolve rather slowly in the beginning, data are only assimilated from t = 500.0 s onward. Observations of the momentum field are then assimilated every 50.0 s. As with the vortex experiments, only a tenth of the grid points are observed, independent Gaussian noise is added, and localization within an (11 × 11) grid points region is applied. (The variances used to generate the Gaussian noise are given in Table D1.) A localization function corresponding to the truncated Gaussian function is applied and the ensemble is not inflated. Examples of the observation and truth are given in Fig. 9. Three ensembles corresponding to the EnNoDA, EnDA, and EnDAB settings, with 10 members each, are generated, but only one set of experiments involving assimilation of the momentum field only is pursued.
Note that as the ensembles and the observations are generated with balanced initial conditions, any noise present in the simulation results is introduced by the data assimilation procedure. Table 2 summarizes the details of the data assimilationrelated experimental setup for both test cases.
Assimilation-related experimental parameters. Here, K is the ensemble size, b is the ensemble inflation factor, t_{first} is the first assimilation time, Δt_{obs} is the observation interval, ψ_{assimilated} is the set of quantities assimilated, (N × N)_{local} is the size of the local region, f_{local} is the type of localization function, η_{obs} is the observation noise, obs_{sparse} is the sparsity of the observations, and N_{blending} is the number of initial time steps spent in the limit model regime. The π′ choice is used in the initialization of N_{blending}; more details are given in section 4c.
3) Evaluation of data assimilation
c. Ensemble data assimilation and blending: Results
1) Traveling vortex
Figure 10 depicts the ensemble snapshots for the vortex case with all quantities observed and assimilated. EnNoDA acts as the control ensemble, and the top row depicts its solutions for the traveling vortex without data assimilation and blending. While the center position of the vortex for each ensemble member is perturbed, the ensemble mean vortex position (right column) is centered around the origin. This is in line with the conditions used to generate the initial ensemble. With data assimilation, EnDA (middle row), the balance is lost and the vortex structure is not preserved at the final time. Data assimilation and blending, EnDAB (bottom row), recovers the balanced solution and the vortex structure is preserved after three periods of revolution. Moreover, comparing with Fig. 8, the effect of data assimilation becomes obvious. The center position of the EnDAB ensemble mean is in the lower right quadrant, closer to that of the observation and the truth.
Referring to Fig. 11, data assimilation without blending (EnDA, orange lines in Fig. 11) leads to a jump in the RMSE in the thermodynamic P variable upon the first assimilation at t = 25 s. After that, the error stays relatively constant. The scale analysis in appendix C corroborates that the magnitude of this error jump is compatible with a spontaneous acoustic imbalance introduced by the data assimilation procedure.
Assimilating the momentum fields alone is insufficient and the RMSE in the solution (solid lines in Fig. 11) is larger than in the reference EnNoDA run. As expected, EnDAB provides a smoother solution over time as the error does not oscillate. This test includes a strong axisymmetric potential temperature variation (Fig. 4), and the potential temperature is an advected quantity not corrected by momentum data assimilation. Therefore, the initially tight correlation of the velocity and potential temperature variations gets destroyed in the course of data assimilation. Since the potential temperature is fluid dynamically active through the generation of baroclinic torque, the flow fields of the ensemble members increasingly deviate from their reference as a consequence.
Assimilating all the quantities yields an improvement (dashed lines in Fig. 11). While the initial assimilation reduces the error substantially for ρ, ρu and ρw of the EnDA run, the error increases over time until approximately t = 150 s. The increase in the error is due to the imbalances introduced by the chosen (11 × 11) grid point size of the localization regions [more details are provided in section 5c(3) and appendix C]. For the EnDAB run, the imbalances are suppressed and the RMSEs are lower than those of the control EnNoDA run for all quantities over the entire simulation period. Ensemble spread and RMSE are comparable in these traveling vortex runs (not shown).
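For concreteness, the RMSE and ensemble-spread diagnostics compared in these runs can be computed from an ensemble and a reference field as in the following minimal NumPy sketch. This is our own illustration; the function and variable names are not taken from the paper's code.

```python
import numpy as np

def ensemble_scores(ensemble, truth):
    """RMSE of the ensemble mean against a reference field, and ensemble spread.

    ensemble : (K, ...) array holding the K member fields
    truth    : (...)    reference field of the same shape as one member
    """
    mean = ensemble.mean(axis=0)
    rmse = np.sqrt(np.mean((mean - truth) ** 2))
    # spread: root of the spatially averaged ensemble variance about its own mean
    spread = np.sqrt(np.mean(ensemble.var(axis=0, ddof=1)))
    return rmse, spread
```

Comparable RMSE and spread values, as reported for the traveling vortex runs, indicate that the ensemble dispersion is a reasonable proxy for the actual error.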
2) Rising bubble
Figure 12 displays snapshots of pressure perturbation for the bubble case. In the EnNoDA run (first row) the bubbles in the ensemble attain different heights at the end of the simulation time and the ensemble mean is diffused, in line with the spread in the initial conditions used in generating the ensemble. Ensemble members with larger initial potential temperature perturbation rise faster. In the EnDA ensemble (second row), large-amplitude fast-mode imbalances are present, while the ensemble mean of the bubble rotor positions at the end time better approximates the true positions of the rotors. For EnDAB (third row), the individual ensemble members are close to one another, as reflected in the ensemble mean. The ensemble better approximates the truth and the fast-mode imbalances are suppressed. Moreover, the pressure footprints of the bubble rotors are not visible in plots of the pressure differences between the EnDA and EnDAB ensembles (fourth row), showing that the difference between the EnDA and EnDAB results is predominantly due to the imbalances, and suggesting (right column) that data assimilation is comparably effective in nudging the bubble toward the truth in both cases. Blending suppresses the imbalances while leaving the dynamics of the rising bubble largely unaffected.
RMSE plots of data assimilation of the momentum fields in the rising bubble experiment are shown in Fig. 13. The momentum fields are assimilated every 50.0 s after 500.0 s. This is visible in the momentum RMSE plots, where each downward step corresponds to one application of the assimilation procedure. For EnDA, an error is introduced in the density ρ and mass-weighted potential temperature P. Blending negates this error and the EnDAB curves show a smooth profile, with RMSE lower than the control EnNoDA. As in the traveling vortex case, a jump is visible in the RMSE of P at the first assimilation time for EnDA, and this corresponds to the imbalances introduced. See appendix C on the scale analysis for more details. The ensemble spread and RMSE are again comparable in these runs (not shown).
3) Localization region and imbalances
In this section, results of the EnDA and EnDAB ensembles are investigated for varying localization radii. Here the aim is not to obtain the optimal choice of the localization radius but to illustrate its effect on the imbalances. All the quantities are assimilated for the traveling vortex test case, and localization regions of (5 × 5), (11 × 11), (21 × 21), and (41 × 41) grid points are used in addition to a run without localization (EnNoLoc). Otherwise, the setup follows the parameters laid out in section 5a(1) and Table 2.
For the quantities ρ, ρu, and ρw in the EnDA case, whether the balanced structure of the vortex is preserved depends strongly on the localization length scale.
If the localization region is too small, fewer observations are involved in the update of the analysis grid point, and the effect of data assimilation becomes less severe. As a result, the nudging of the vortices in the ensemble toward the truth is more gradual (Fig. 14). The drop in the RMSE after the first assimilation time at t = 25 s for the run with a (5 × 5) localization region (magenta solid with square markers) is the least drastic, and the RMSEs continue to drop for the subsequent assimilation step. However, the small localization region also introduces severe imbalances that degrade the compact vortex structure.
If the localization region is moderately small, e.g., (11 × 11) (solid orange line with triangle markers in Fig. 14), sufficient observation points are assimilated and the effect of data assimilation becomes significant. At the same time, the localization region is small enough that the imbalances introduced are sufficient to degrade the compact vortex structure. In such a case, we see a relatively significant increase in the RMSE over time, as the combined detrimental effects from a severe initial nudging of the vortices in the ensemble toward the truth and from the imbalances introduced by the localization are the most pronounced.
On the other hand, if the localization region is sufficiently small, e.g., (5 × 5), the imbalances introduced by the localization remain severe, but the gentler nudging of the vortices in the ensemble toward the truth better preserves the compact vortex structure, and we do not see a drastic increase in the RMSEs as with the (11 × 11) run.
For larger localization regions, the imbalance introduced by local data assimilation is mitigated. In Fig. 14, the EnDA runs with larger localization regions, e.g., (21 × 21) (yellow solid line with diamond markers) and (41 × 41) (cyan solid line with star markers), generally perform better than the other runs. We also note that a larger localization region corresponds to a smaller error jump in the variable P (top-right panel of Fig. 14). For the case without localization (EnNoLoc, solid brown line with cross markers), the error jump is almost nonexistent, but an imbalance is nevertheless introduced; see the fluctuation of the errors around those of the EnNoDA run (black solid line with dot markers).
A localization region that is too large leads to an erroneous oversampling of the dynamics. For example, the analysis update of grid points inside of the vortex structure is influenced by observations of the background dynamics and vice versa. This mutual influence results in a vortex structure that becomes increasingly spread out as the number of assimilation steps increases. The effect of oversampling can be seen in, e.g., the EnNoLoc run in Fig. 14 and in the (41 × 41) and the EnNoLoc runs in Fig. 15, where the error scores are higher than in a run with a moderate localization region.
Application of blended data assimilation as a balancing mechanism eliminates the imbalances from a local data assimilation procedure. As a result, smaller localization regions may be used with fewer adverse effects, and the best error scores are achieved by an EnDAB run with a localization region of (11 × 11) grid points (dashed orange line with triangle markers in Fig. 15). The higher RMSE in the (5 × 5) run (dashed magenta line with square markers) may be due to the undersampling of the vortex dynamics. This is the opposite of the oversampling effect described above.
For the rising bubble experiments (results not shown), runs with assimilation of only the momentum fields and localization region sizes up to (71 × 71) grid points were investigated. For the quantities ρ, ρu and ρw, the RMSEs generally decrease with larger localization regions, although the decrease in the error is only marginal for localization region sizes larger than (41 × 41) grid points. As in the traveling vortex tests, for smaller localization regions, substantial error jumps in the P and π′ variables are observed in the EnDA runs but not in the EnDAB runs.
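The localization weighting discussed in this section is commonly implemented with a distance-dependent taper such as the fifth-order piecewise rational function of Gaspari and Cohn (1999), cited in the references. The sketch below is our own illustration of that taper, not the implementation used in the paper's experiments.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn (1999) fifth-order compactly supported correlation taper.

    dist : array of distances (e.g., in grid points)
    c    : localization half-width; weights vanish for dist >= 2c
    """
    r = np.abs(np.asarray(dist, dtype=float)) / c
    w = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - 5.0 / 3.0 * ri**2 + 1.0)
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + 5.0 / 3.0 * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w
```

In an LETKF-style scheme, such weights can be used to down-weight the influence of distant observations, e.g., by scaling the columns of the matrix that carries the inverse observation error covariance; the choice of half-width plays the role of the localization radius studied above.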
6. Discussion and conclusions
This paper has presented a new conceptual framework for balanced data assimilation based on blended numerical models. Using a discrete time-level numerical analysis for the Exner pressure field and a careful choice of pressure perturbation variables, the blended soundproof-compressible modeling framework of Benacchio et al. (2014) has been substantially upgraded with the functionality to switch between equation sets in a single time step.
In idealized numerical experiments with a traveling vortex and a gravity-driven warm air bubble, a single time step in the pseudoincompressible limit regime was sufficient to recover a balanced state starting from imbalanced initial data. Moreover, the blended model yielded leftover acoustics with amplitude more than one order of magnitude smaller than the ones generated at the onset with the fully compressible model. The amplitude reduction is a sizeable improvement over the scores of Benacchio et al. (2014), who, in addition, needed several time steps in a hybrid soundproof-compressible configuration with noninteger values of the blending parameter α_{P} to achieve their best level of noise reduction.
The upgraded blended model has then been combined with a data assimilation engine and deployed as a tool to reduce imbalances introduced by regular assimilation of data within model runs. Numerical results on ensemble data assimilation with and without blending showed that while data assimilation alone produced imbalances that effectively destroyed important qualitative features of the solution in one of the test cases, data assimilation together with blending strongly reduced those imbalances and led to recovery of accurate results. Moreover, blended data assimilation was effective despite the untuned data assimilation parameters used in the investigations. Throughout our study, a single time step spent in the pseudoincompressible limit regime after the assimilation of data was sufficient to restore a nearly balanced state, as documented by strongly reduced RMSEs with the blended model. The RMSEs of the blended data assimilation run are almost as low as the error scores obtained from assimilating data into a pseudoincompressible ensemble run (results not shown).
For ensemble data assimilation experiments with the traveling vortex, assimilation of the momentum fields alone was found to be insufficient in the case of large variation of the potential temperature in the vortex core. In the course of longer simulations, the ensemble with balanced data assimilation carried larger errors than the control ensemble without data assimilation with such a setup (green solid curves in Fig. 11). We associate this behavior with an issue of controllability (Jazwinski 2007): The potential temperature variations in this case are dynamically relevant owing to the generation of vorticity by baroclinic torque. Thus, if these variations are not assimilated, then the data assimilation steps will destroy the alignment of the pressure and density gradients in the vortex, and forecast quality will soon deteriorate. In fact, a test based on an analogous vortex with initially constant entropy yields results (not shown) close in quality to those of the rising thermal test when only momentum is assimilated. The issue was solved by assimilation of all variables. Further investigation is warranted on how the effectiveness of data assimilation can be improved under such circumstances without the need to observe all state variables.
A scale analysis (appendix C) corroborates the insight that the RMSE increase introduced by the assimilation of data corresponds to the fast-mode imbalances seen in the plots of the individual ensemble members and the ensemble mean. In this sense, our experiments make a case for investigations involving relatively simple idealized test cases, as we were able to gain some analytical understanding of the sources and consequences of errors and imbalances. Nevertheless, further studies based on more realistic scenarios will be required to demonstrate that the presented approach and its extensions will actually enable quantifiable improvements of numerical weather prediction skill scores.
In the experiments involving ensemble data assimilation with different localization radii, blended data assimilation yielded, for all localization sizes, substantial improvements to the RMSE relative to the plain data assimilation without a balancing procedure. In fact, the best-performing data-assimilation-only run still produced worse results than the worst-performing run with blending. Furthermore, the recovery of a balanced vortex structure turned out to be sensitive to the choice of localization radius, with best results obtained at some intermediate size of the localization domains. This study hints at a subtle interplay between the data assimilation setup and the idealized tests investigated in this paper, and further investigations into the effects of data assimilation on idealized and realistic dynamics are warranted.
In numerical weather prediction, methods to damp or remove acoustic imbalances have long been employed (e.g., Daley 1988; Skamarock and Klemp 1992; Dudhia 1995; Klemp et al. 2018). Moreover, practical application of sequential data assimilation procedures will generally excite all rapidly oscillatory modes of the compressible system, and filtering techniques are used to negate these unphysical imbalances (Ha et al. 2017). In this context, the results presented in this paper are encouraging in that blended data assimilation was able to suppress acoustic noise and recover balanced analysis fields, albeit for idealized test cases. To the best of the authors' knowledge, this is the first study of a dynamics-driven method to suppress acoustic noise arising from the sequential assimilation of data.
In addition, the results presented in this paper prepare the ground for future work in a number of areas. In general, the performance of a data assimilation method can be improved by tuning its adjustable parameters. Here, however, we consciously employed an untuned data assimilation scheme known to produce unphysical imbalances to test the efficacy of our dynamics-driven method in removing them. Consequently, a comprehensive study similar to Popov and Sandu (2019) on multivariate tuning of the LETKF and localization parameters for the blended numerical model will be an avenue for future improvements of our approach. The study could also compare our method with existing balancing strategies, e.g., the IAU and the DFI, following Polavarapu et al. (2004). To ensure a fair comparison, optimizations of the IAU along the lines of Lei and Whitaker (2016) and He et al. (2020) may have to be carried out. A comparison of the effects of our dynamics-driven method on the slower dynamics against those of the DFI and IAU, which act as low-pass filters (Houtekamer and Zhang 2016; Polavarapu et al. 2004), will be particularly insightful.
Despite the untuned data assimilation scheme used, the blended model has given promising results, although thus far only for idealized test cases. Another natural evolution will hence involve model performance on more realistic three-dimensional moist dynamics scenarios with bottom topography (O'Neill and Klein 2014; Duarte et al. 2015) and on benchmarks at larger scales (Skamarock and Klemp 1994; Benacchio and Klein 2019).
Although presented and refined here for the blending between the compressible Euler equations and the pseudoincompressible model only, the methodology translates to other scenarios as long as one can formulate the corresponding projection onto appropriate reduced dynamics via implicit substeps of a semi-implicit or fully implicit scheme. Models imposing a divergence constraint on the weighted velocity field as well as frameworks blending between nonhydrostatic and hydrostatic dynamics will naturally fit into the present approach.
Specifically, the numerical scheme proposed by Benacchio and Klein (2019) enables solution of the hydrostatic system in the large-scale limit in addition to the small-scale low Mach number limit considered in this paper. Therefore, a blended data assimilation framework such as the one presented here could be enhanced with hydrostatic blending and used in a two-way blended pseudoincompressible/hydrostatic/compressible model (Klein and Benacchio 2016) exploiting the different dynamics in the equation sets.
Moreover, the theoretical framework developed in that paper also included the unified model by Arakawa and Konor (2009) as one of the reduced models. Thus, after an appropriate extension of the present numerical scheme, yet another framework for blended data assimilation can be developed. In fact, a variant of the fully compressible/Arakawa–Konor model pair has recently been presented by Qaddouri et al. (2021), and a related blending approach will allow for the filtering of smaller-scale acoustic noise while leaving the Lamb wave components dynamically unaffected. Investigations similar to the ones in this paper can then be made on balancing initial states and data assimilation for small- to planetary-scale dynamics using the resulting doubly blended model framework. Internal waves play an important role for atmospheric dynamics and they should not be removed indiscriminately after a data assimilation step. Therefore, the identification and removal of unwanted internal wave noise while keeping the physically meaningful wave spectrum is an additional challenge that will require further theoretical developments beyond the scope of this paper.
More generally, semi-implicit compressible models feature in several dynamical cores used by weather centers worldwide. Notable examples include the currently operational hydrostatic IFS spectral transform model in use at the European Centre for Medium-Range Weather Forecasts (ECMWF; Wedi et al. 2013), and the Met Office's Unified Model (Davies et al. 2005; Wood et al. 2014), which has a hydrostatic-nonhydrostatic switch. ECMWF's next-generation nonhydrostatic compressible dynamical core, IFS-FVM (Kühnlein et al. 2019), actually uses a numerical discretization akin to the one considered in this paper and would therefore be an ideal candidate for a first implementation of the blended tools in a semi-operational model. In addition, our approach will bear particular relevance to fully compressible operational models featuring the option of selectively employing the dynamics of a limit model (Wood et al. 2014; Melvin et al. 2019; Voitus et al. 2019; Qaddouri et al. 2021).
In this context, multimodel numerics with seamless switching could contribute to creating a level playing field for evaluating accuracy and performance with different equation sets in the same dynamical core. The positive evidence provided here on balancing data assimilation shows, in the authors' view, the considerable potential of deploying the blended model framework across the whole forecast model chain.
Acknowledgments.
R.C., G.H., and R.K. thank the Deutsche Forschungsgemeinschaft for the funding through the Collaborative Research Center (CRC) 1114 “Scaling cascades in complex systems,” Project 235221301, Project A02: “Multiscale data and asymptotic model assimilation for atmospheric flows.” T.B. was supported by the ESCAPE2 project, European Union’s Horizon 2020 research and innovation program (Grant 800897). We thank Sebastian Reich (U Potsdam) for the meaningful discussions on modeling the observation error covariance.
Data availability statement.
The results reported in the paper can be generated using Python scripts linked to the Python source code hosted on the Freie Universität Berlin (FUB) GitLab page: https://git.imp.fu-berlin.de/raychew/RKLM_Reference. Currently, access to the repository is limited to users with an FUB account due to privacy concerns and can be granted on a case-by-case basis by contacting the corresponding author at ray.chew@fu-berlin.de.
APPENDIX A
LETKF Algorithm
The local ensemble transform Kalman filter (LETKF) algorithm presented here summarizes the algorithm published by Hunt et al. (2007), adapted to the blended numerical framework.
1. Start with an ensemble of $K$ state vectors $\{x_{k,[g]}^{f}\}$, $k = 1, \dots, K$, with $x_{k,[g]}^{f} \in \mathbb{R}^{m_{[g]}}$, where the subscript $[g]$ denotes global quantities.

2. Apply the forward operator $H$ to obtain the state vectors in observation space,
$$H x_{k,[g]}^{f} = y_{k,[g]}^{f} \in \mathbb{R}^{l_{[g]}}.$$

3. Stack the anomalies of the state and observation vectors to form the matrices
$$\mathsf{X}_{[g]}^{f} = \left[x_{1,[g]}^{f} - \overline{x}_{[g]} \;\; \cdots \;\; x_{K,[g]}^{f} - \overline{x}_{[g]}\right] \in \mathbb{R}^{m_{[g]} \times K},$$
$$\mathsf{Y}_{[g]}^{f} = \left[y_{1,[g]}^{f} - \overline{y}_{[g]} \;\; \cdots \;\; y_{K,[g]}^{f} - \overline{y}_{[g]}\right] \in \mathbb{R}^{l_{[g]} \times K},$$
where $\overline{x}_{[g]}$ ($\overline{y}_{[g]}$) is the mean of the state vectors (in observation space) over the ensemble,
$$\overline{x}_{[g]} = \frac{1}{K} \sum_{k=1}^{K} x_{k,[g]}^{f} \in \mathbb{R}^{m_{[g]}}.$$

4. From $\mathsf{X}_{[g]}^{f}$ and $\mathsf{Y}_{[g]}^{f}$, select the local $\mathsf{X}^{f}$ and $\mathsf{Y}^{f}$.

5. From the global observations $y_{\mathrm{obs},[g]}$ and the observation error covariance $\mathsf{R}_{[g]}$, select the corresponding local counterparts $y_{\mathrm{obs}}$ and $\mathsf{R}$. Notice that the subscript $[g]$ is dropped for the local counterparts.

6. Solve the linear system $\mathsf{R}\,\mathsf{C}^{T} = \mathsf{Y}^{f}$ for $\mathsf{C} \in \mathbb{R}^{K \times l}$.

7. Optionally, apply a localization function to $\mathsf{C}$ to modify the influence of the surrounding observations.

8. Compute the $K \times K$ gain matrix
$$\mathsf{K} = \left[(K-1)\frac{\mathsf{I}}{b} + \mathsf{C}\,\mathsf{Y}^{f}\right]^{-1},$$
where $b \geq 1$ is the ensemble inflation factor.

9. Compute the $K \times K$ analysis weight matrix
$$\mathsf{W}^{a} = \left[(K-1)\,\mathsf{K}\right]^{1/2}.$$

10. Compute the $K$-dimensional vector encoding the distance of the observations from the forecast ensemble,
$$\overline{w}^{a} = \mathsf{K}\,\mathsf{C}\left(y_{\mathrm{obs}} - \overline{y}^{f}\right),$$
and add $\overline{w}^{a}$ to each column of $\mathsf{W}^{a}$ to obtain a set of $K$ weight vectors $\{w_{k}^{a}\}$, $k = 1, \dots, K$.

11. From the set of weight vectors, compute the analysis for each ensemble member,
$$x_{k}^{a} = \mathsf{X}^{f} w_{k}^{a} + \overline{x}^{f}, \qquad k = 1, \dots, K.$$

12. Finally, recover the global analysis ensemble $\{x_{k,[g]}^{a}\}$, $k = 1, \dots, K$. This recovery depends on how the local regions were selected in steps 4 and 5. For local regions surrounding the grid point under analysis, the global analysis ensemble is reassembled by placing the analyzed grid points back into the global grid.
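The local analysis step described above can be sketched in a few lines of NumPy. This is our own illustration under simplifying assumptions (a linear forward operator given as a matrix, no localization taper, and the symmetric square root computed by eigendecomposition); it is not the code used in the paper.

```python
import numpy as np

def letkf_analysis(Xf, y_obs, H, R, b=1.0):
    """One local LETKF analysis step (sketch of the algorithm above).

    Xf    : (m, K) local forecast ensemble, one member per column
    y_obs : (l,)   local observations
    H     : (l, m) forward operator (assumed linear here)
    R     : (l, l) local observation error covariance
    b     : ensemble inflation factor, b >= 1
    """
    m, K = Xf.shape
    x_mean = Xf.mean(axis=1)
    Xp = Xf - x_mean[:, None]              # state anomalies X^f
    Yf = H @ Xf                            # members in observation space
    y_mean = Yf.mean(axis=1)
    Yp = Yf - y_mean[:, None]              # observation anomalies Y^f

    C = np.linalg.solve(R, Yp).T           # solve R C^T = Y^f; C is (K, l)
    Kmat = np.linalg.inv((K - 1) * np.eye(K) / b + C @ Yp)  # gain matrix

    # symmetric square root for W^a = [(K-1) Kmat]^{1/2}
    vals, vecs = np.linalg.eigh((K - 1) * Kmat)
    Wa = (vecs * np.sqrt(vals)) @ vecs.T

    w_mean = Kmat @ C @ (y_obs - y_mean)   # mean-update weight vector
    W = Wa + w_mean[:, None]               # per-member weight vectors

    return x_mean[:, None] + Xp @ W        # analysis ensemble (m, K)
```

The symmetric square root preserves the ensemble mean because the anomaly matrices have zero row sums, so the update of the mean is carried entirely by the vector $\overline{w}^{a}$.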
APPENDIX B
Initial Stable Vortex Configuration
APPENDIX C
Scale Analysis for the Data Assimilation Error in the Pressure-Related Fields
Figures 11 and 13 show that the assimilation of only the momentum fields leads to a jump in the RMSE of the nonmomentum fields, and the assimilation of all quantities in Fig. 11 leads to a jump in the RMSE of the pressure-related P field. This increase in the error occurs after the first assimilation time and remains of the same order of magnitude for the duration of the simulation, quantifying the imbalance introduced by data assimilation. The imbalance can be characterized by a scale analysis (Klein et al. 2001).
Figure C1 shows the results of scale analysis for the two test cases. Results at assimilation time are omitted. Scale analysis yields EnDA results for
APPENDIX D
Modeling the Observational Noise
The observational noise used in the data assimilation experiments is drawn from a Gaussian distribution. This Gaussian distribution has zero mean and a variance that is approximately 5% of the variance of the sparsely observed field averaged over all observation times. Specifically, the variances given in Table D1 are computed as follows.
Values of the error variance in the observations of the traveling vortex and rising bubble test cases, computed as 5% of the variance of the sparsely observed fields averaged over all observation times.
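The noise-generation recipe described above can be sketched as follows. This is our own illustration: `fields` stands for the sparsely observed field sampled at the observation times, and the function name is hypothetical.

```python
import numpy as np

def observation_noise(fields, rng, fraction=0.05):
    """Zero-mean Gaussian noise whose variance is `fraction` (here 5%) of
    the variance of the sparsely observed field, averaged over all
    observation times (sketch of the recipe in appendix D).

    fields : list of arrays, the observed field at each observation time
    rng    : numpy.random.Generator
    """
    var = np.mean([f.var() for f in fields])  # time-averaged field variance
    return rng.normal(0.0, np.sqrt(fraction * var), size=fields[0].shape)
```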
REFERENCES
Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59, 210–224, https://doi.org/10.1111/j.16000870.2006.00216.x.
Arakawa, A., and C. S. Konor, 2009: Unification of the anelastic and quasihydrostatic systems of equations. Mon. Wea. Rev., 137, 710–726, https://doi.org/10.1175/2008MWR2520.1.
Bannister, R. N., 2015: How is the balance of a forecast ensemble affected by adaptive and nonadaptive localization schemes? Mon. Wea. Rev., 143, 3680–3699, https://doi.org/10.1175/MWRD1400379.1.
Benacchio, T., and R. Klein, 2019: A semiimplicit compressible model for atmospheric flows with seamless access to soundproof and hydrostatic dynamics. Mon. Wea. Rev., 147, 4221–4240, https://doi.org/10.1175/MWRD190073.1.
Benacchio, T., W. P. O’Neill, and R. Klein, 2014: A blended soundprooftocompressible numerical model for smallto mesoscale atmospheric dynamics. Mon. Wea. Rev., 142, 4416–4438, https://doi.org/10.1175/MWRD1300384.1.
Bloom, S., L. Takacs, A. Da Silva, and D. Ledvina, 1996: Data assimilation using incremental analysis updates. Mon. Wea. Rev., 124, 1256–1271, https://doi.org/10.1175/15200493(1996)124<1256:DAUIAU>2.0.CO;2.
Bocquet, M., 2011: Ensemble Kalman filtering without the intrinsic need for inflation. Nonlinear Processes Geophys., 18, 735–750, https://doi.org/10.5194/npg187352011.
Cohn, S. E., A. Da Silva, J. Guo, M. Sienkiewicz, and D. Lamich, 1998: Assessing the effects of data selection with the DAO physicalspace statistical analysis system. Mon. Wea. Rev., 126, 2913–2926, https://doi.org/10.1175/15200493(1998)126<2913:ATEODS>2.0.CO;2.
Daley, R., 1988: The normal modes of the spherical nonhydrostatic equations with applications to the filtering of acoustic modes. Tellus, 40, 96–106, https://doi.org/10.3402/tellusa.v40i2.11785.
Davies, T., M. J. P. Cullen, A. J. Malcolm, M. Mawson, A. Staniforth, A. A. White, and N. Wood, 2005: A new dynamical core for the Met Office’s global and regional modelling of the atmosphere. Quart. J. Roy. Meteor. Soc., 131, 1759–1782, https://doi.org/10.1256/qj.04.101.
Duarte, M., A. S. Almgren, and J. B. Bell, 2015: A low Mach number model for moist atmospheric flows. J. Atmos. Sci., 72, 1605–1620, https://doi.org/10.1175/JASD140248.1.
Dudhia, J., 1995: Reply to comments on “A nonhydrostatic version of the Penn State–NCAR mesoscale model: Validation tests and simulation of an Atlantic cyclone and cold front.” Mon. Wea. Rev., 123, 2573–2575, https://doi.org/10.1175/15200493(1995)123<2573:R>2.0.CO;2.
Durran, D. R., 1989: Improving the anelastic approximation. J. Atmos. Sci., 46, 1453–1461, https://doi.org/10.1175/15200469(1989)046<1453:ITAA>2.0.CO;2.
Fisher, R. A., and F. Yates, 1953: Statistical Tables for Biological, Agricultural and Medical Research. 3rd ed. Hafner Publishing Company, 112 pp.
Flowerdew, J., 2015: Towards a theory of optimal localisation. Tellus, 67A, 25257, https://doi.org/10.3402/tellusa.v67.25257.
Fukumori, I., 2002: A partitioned Kalman filter and smoother. Mon. Wea. Rev., 130, 1370–1383, https://doi.org/10.1175/15200493(2002)130<1370:APKFAS>2.0.CO;2.
Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, https://doi.org/10.1002/qj.49712555417.
Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. Mon. Wea. Rev., 139, 511–522, https://doi.org/10.1175/2010MWR3328.1.
Ha, S., C. Snyder, W. C. Skamarock, J. Anderson, and N. Collins, 2017: Ensemble Kalman filter data assimilation for the Model for Prediction Across Scales (MPAS). Mon. Wea. Rev., 145, 4673–4692, https://doi.org/10.1175/MWRD170145.1.
Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distancedependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, https://doi.org/10.1175/15200493(2001)129<2776:DDFOBE>2.0.CO;2.
Harlim, J., and B. R. Hunt, 2005: Local ensemble transform Kalman filter: An efficient scheme for assimilating atmospheric data. University of Maryland, College Park, 18 pp., https://www2.atmos.umd.edu/∼ekalnay/pubs/harlim_hunt05.pdf.
Hastermann, G., M. Reinhardt, R. Klein, and S. Reich, 2021: Balanced data assimilation for highly oscillatory mechanical systems. Commun. Appl. Math. Comput. Sci., 16, 119–154, https://doi.org/10.2140/camcos.2021.16.119.
He, H., L. Lei, J. S. Whitaker, and Z.M. Tan, 2020: Impacts of assimilation frequency on ensemble Kalman filter data assimilation and imbalances. J. Adv. Model. Earth Syst., 12, e2020MS002187, https://doi.org/10.1029/2020MS002187.
Hohenegger, C., and C. Schär, 2007: Predictability and error growth dynamics in cloudresolving models. J. Atmos. Sci., 64, 4467–4478, https://doi.org/10.1175/2007JAS2143.1.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811, https://doi.org/10.1175/15200493(1998)126<0796:DAUAEK>2.0.CO;2.
Houtekamer, P. L., and F. Zhang, 2016: Review of the ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 144, 4489–4532, https://doi.org/10.1175/MWRD150440.1.
Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D, 230, 112–126, https://doi.org/10.1016/j.physd.2006.11.008.
Jazwinski, A. H., 2007: Stochastic Processes and Filtering Theory. Dover Publications, 376 pp.
Kadioglu, S. Y., R. Klein, and M. L. Minion, 2008: A fourthorder auxiliary variable projection method for zeroMach number gas dynamics. J. Comput. Phys., 227, 2012–2043, https://doi.org/10.1016/j.jcp.2007.10.008.
Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. J. Basic Eng., 82, 35–45, https://doi.org/10.1115/1.3662552.
Kepert, J. D., 2009: Covariance localisation and balance in an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 1157–1176, https://doi.org/10.1002/qj.443.
Klein, R., 2009: Asymptotics, structure, and integration of soundproof atmospheric flow equations. Theor. Comput. Fluid Dyn., 23, 161–195, https://doi.org/10.1007/s001620090104y.
Klein, R., 2010: Scaledependent models for atmospheric flows. Annu. Rev. Fluid Mech., 42, 249–274, https://doi.org/10.1146/annurevfluid121108145537.
Klein, R., and O. Pauluis, 2012: Thermodynamic consistency of a pseudoincompressible approximation for general equations of state. J. Atmos. Sci., 69, 961–968, https://doi.org/10.1175/JASD110110.1.
Klein, R., and T. Benacchio, 2016: A doubly blended model for multiscale atmospheric dynamics. J. Atmos. Sci., 73, 1179–1186, https://doi.org/10.1175/JASD150323.1.
Klein, R., N. Botta, T. Schneider, C.D. Munz, S. Roller, A. Meister, L. Hoffmann, and T. Sonar, 2001: Asymptotic adaptive methods for multiscale problems in fluid mechanics. J. Eng. Math., 39, 261–343, https://doi.org/10.1023/A:1004844002437.
Klein, R., U. Achatz, D. Bresch, O. M. Knio, and P. K. Smolarkiewicz, 2010: Regime of validity of soundproof atmospheric flow models. J. Atmos. Sci., 67, 3226–3237, https://doi.org/10.1175/2010JAS3490.1.
Klein, R., T. Benacchio, and W. O’Neill, 2014: Using the soundproof limit for balanced data initialization. Proc. ECMWF Seminar on Numerical Methods, ECMWF, Reading, United Kingdom, 227–236, https://www.ecmwf.int/sites/default/files/elibrary/2014/10483usingsoundprooflimitbalanceddatainitialization.pdf.
Klemp, J. B., W. C. Skamarock, and S. Ha, 2018: Damping acoustic modes in compressible horizontally explicit vertically implicit (HEVI) and splitexplicit time integration schemes. Mon. Wea. Rev., 146, 1911–1923, https://doi.org/10.1175/MWRD170384.1.
Kühnlein, C., W. Deconinck, R. Klein, S. Malardel, Z. P. Piotrowski, P. K. Smolarkiewicz, J. Szmelter, and N. P. Wedi, 2019: FVM 1.0: A nonhydrostatic finitevolume dynamical core for the IFS. Geosci. Model Dev., 12, 651–676, https://doi.org/10.5194/gmd126512019.
Lang, M., P. Browne, P. J. Van Leeuwen, and M. Owens, 2017: Data assimilation in the solar wind: Challenges and first results. Space Wea., 15, 1490–1510, https://doi.org/10.1002/2017SW001681.
Lei, L., and J. S. Whitaker, 2016: A four-dimensional incremental analysis update for the ensemble Kalman filter. Mon. Wea. Rev., 144, 2605–2621, https://doi.org/10.1175/MWR-D-15-0246.1.
Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203, https://doi.org/10.1256/qj.02.132.
Lynch, P., and X.-Y. Huang, 1992: Initialization of the HIRLAM model using a digital filter. Mon. Wea. Rev., 120, 1019–1034, https://doi.org/10.1175/1520-0493(1992)120<1019:IOTHMU>2.0.CO;2.
Melvin, T., T. Benacchio, B. Shipway, N. Wood, J. Thuburn, and C. Cotter, 2019: A mixed finite-element, finite-volume, semi-implicit discretization for atmospheric dynamics: Cartesian geometry. Quart. J. Roy. Meteor. Soc., 145, 2835–2853, https://doi.org/10.1002/qj.3501.
Mendez-Nunez, L. R., and J. J. Carroll, 1994: Application of the MacCormack scheme to atmospheric nonhydrostatic models. Mon. Wea. Rev., 122, 984–1000, https://doi.org/10.1175/1520-0493(1994)122<0984:AOTMST>2.0.CO;2.
Mitchell, H. L., P. L. Houtekamer, and G. Pellerin, 2002: Ensemble size, balance, and model-error representation in an ensemble Kalman filter. Mon. Wea. Rev., 130, 2791–2808, https://doi.org/10.1175/1520-0493(2002)130<2791:ESBAME>2.0.CO;2.
Neef, L. J., S. M. Polavarapu, and T. G. Shepherd, 2006: Four-dimensional data assimilation and balanced dynamics. J. Atmos. Sci., 63, 1840–1858, https://doi.org/10.1175/JAS3714.1.
O’Neill, W., and R. Klein, 2014: A moist pseudoincompressible model. Atmos. Res., 142, 133–141, https://doi.org/10.1016/j.atmosres.2013.08.004.
Pedlosky, J., 2013: Geophysical Fluid Dynamics. 2nd ed. Springer, 710 pp.
Polavarapu, S., S. Ren, A. M. Clayton, D. Sankey, and Y. Rochon, 2004: On the relationship between incremental analysis updating and incremental digital filtering. Mon. Wea. Rev., 132, 2495–2502, https://doi.org/10.1175/1520-0493(2004)132<2495:OTRBIA>2.0.CO;2.
Popov, A. A., and A. Sandu, 2019: A Bayesian approach to multivariate adaptive localization in ensemble-based data assimilation with time-dependent extensions. Nonlinear Processes Geophys., 26, 109–122, https://doi.org/10.5194/npg-26-109-2019.
Qaddouri, A., C. Girard, S. Z. Husain, and R. Aider, 2021: Implementation of a semi-Lagrangian fully implicit time integration of the unified soundproof system of equations for numerical weather prediction. Mon. Wea. Rev., 149, 2011–2029, https://doi.org/10.1175/MWR-D-20-0291.1.
Reich, S., and C. Cotter, 2013: Ensemble filter techniques for intermittent data assimilation. Large Scale Inverse Problems: Computational Methods and Applications in the Earth Sciences, M. Cullen, M. A. Freitag, S. Kindermann, and R. Scheichl, Eds., Radon Series on Computational and Applied Mathematics, Vol. 13, De Gruyter, 91–134.
Reich, S., and C. Cotter, 2015: Probabilistic Forecasting and Bayesian Data Assimilation. Cambridge University Press, 308 pp.
Skamarock, W. C., and J. B. Klemp, 1992: The stability of time-split numerical methods for the hydrostatic and the nonhydrostatic elastic equations. Mon. Wea. Rev., 120, 2109–2127, https://doi.org/10.1175/1520-0493(1992)120<2109:TSOTSN>2.0.CO;2.
Skamarock, W. C., and J. B. Klemp, 1994: Efficiency and accuracy of the Klemp–Wilhelmson time-splitting technique. Mon. Wea. Rev., 122, 2623–2630, https://doi.org/10.1175/1520-0493(1994)122<2623:EAAOTK>2.0.CO;2.
Smolarkiewicz, P. K., 1991: On forward-in-time differencing for fluids. Mon. Wea. Rev., 119, 2505–2510, https://doi.org/10.1175/1520-0493(1991)119<2505:OFITDF>2.0.CO;2.
Smolarkiewicz, P. K., and L. O. Margolin, 1993: On forward-in-time differencing for fluids: Extension to a curvilinear framework. Mon. Wea. Rev., 121, 1847–1859, https://doi.org/10.1175/1520-0493(1993)121<1847:OFITDF>2.0.CO;2.
Smolarkiewicz, P. K., and A. Dörnbrack, 2008: Conservative integrals of adiabatic Durran’s equations. Int. J. Numer. Methods Fluids, 56, 1513–1519, https://doi.org/10.1002/fld.1601.
Smolarkiewicz, P. K., C. Kühnlein, and N. P. Wedi, 2014: A consistent framework for discrete integrations of soundproof and compressible PDEs of atmospheric dynamics. J. Comput. Phys., 263, 185–205, https://doi.org/10.1016/j.jcp.2014.01.031.
Vallis, G. K., 2017: Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Circulation. 2nd ed. Cambridge University Press, 964 pp.
Van Leeuwen, P. J., Y. Cheng, and S. Reich, 2015: Nonlinear Data Assimilation. Springer, 118 pp.
Voitus, F., P. Bénard, C. Kühnlein, and N. P. Wedi, 2019: Semi-implicit integration of the unified equations in a mass-based coordinate: Model formulation and numerical testing. Quart. J. Roy. Meteor. Soc., 145, 3387–3408, https://doi.org/10.1002/qj.3626.
Wedi, N. P., M. Hamrud, and G. Mozdzynski, 2013: A fast spherical harmonics transform for global NWP and climate models. Mon. Wea. Rev., 141, 3450–3461, https://doi.org/10.1175/MWR-D-13-00016.1.
Wikle, C. K., and L. M. Berliner, 2007: A Bayesian tutorial for data assimilation. Physica D, 230, 1–16, https://doi.org/10.1016/j.physd.2006.09.017.
Wood, N., and Coauthors, 2014: An inherently mass-conserving semi-implicit semi-Lagrangian discretization of the deep-atmosphere global nonhydrostatic equations. Quart. J. Roy. Meteor. Soc., 140, 1505–1520, https://doi.org/10.1002/qj.2235.
Zupanski, M., 2009: Theoretical and practical issues of ensemble data assimilation in weather and climate. Data Assimilation for Atmospheric, Oceanic and Hydrologic Applications, S. K. Park and L. Xu, Eds., Vol. 1, Springer, 67–84.