Using Satellite Data Assimilation Techniques to Combine Infrasound Observations and a Full Ray-Tracing Model to Constrain Stratospheric Variables


ABSTRACT: Infrasound waves generated at the Earth's surface can reach high altitudes before returning to the surface to be recorded by microbarometer array stations. These waves carry information about the propagation medium, in particular temperature and winds in the atmosphere. It is only recently that studies on the assimilation of such data into atmospheric models have been published. Intending to advance this line of research, we here use the Modulated Ensemble Transform Kalman Filter (METKF), commonly used in satellite data assimilation, to assimilate infrasound-related observations in order to update a column of three vertically varying variables: temperature and horizontal wind speeds. This includes stratospheric and mesospheric heights, which are otherwise poorly observed. The numerical experiments on synthetic data but with realistic reanalysis-product atmospheric specifications (following the Observing System Simulation Experiment paradigm) reveal that a large ensemble is capable of reducing errors, especially for the wind speeds at stratospheric heights close to 30–60 km. While using a small ensemble leads to incorrect analysis increments and large estimation errors, the METKF ameliorates this problem and even achieves error reduction from the prior to the posterior mean estimator.

Introduction
The skill of numerical weather prediction (NWP) forecasts has increased dramatically over the last 50 years. Bauer et al. (2015) labelled this improvement "a quiet revolution", and it is due to three main factors. First, the dynamical models which describe and predict the evolution of the atmospheric flow have improved considerably (Williamson 2007; Lynch 2008). These have evolved in tandem with increasing computational power, which is needed as the spatio-temporal resolution increases and more processes are represented explicitly. Furthermore, the modeling and forecasting paradigm has shifted from single-trajectory to ensemble systems, which allows us to equip the forecast with a probabilistic measure (Palmer 2019). The second factor is the availability of observations, both in the temporal and spatial domains. The adequate use of remote-sensing observations has proved vital to initialise and constrain NWP forecasts (Bluestein et al. 2022), such as the correct usage of the ever-expanding constellation of Earth-observation satellites (Eyre et al. 2020, 2022). Finally, we have the ever-evolving data assimilation (DA) techniques, whose purpose is to optimally combine information from these two (imperfect) sources of information (e.g., Asch et al. 2016).
Today, NWP faces novel challenges and new frontiers in forecasting. Research on sub-seasonal to seasonal forecast models has grown over the last decade (Vitart and Robertson 2018). This is a challenging endeavor, since these processes exist somewhere between weather and climate phenomena, and they have components associated both with the initial-value problem aspects of NWP and the boundary-condition aspects of climate prediction. Predictability on these timescales is closely related to the memory of the oceans. Furthermore, the stratosphere and its connection to the troposphere also play a fundamental role in this regime (Baldwin and Dunkerton 2001).

The first contribution of this work is the incorporation of a full ray-tracing model into the DA process instead of geometric models. This complicates the localization step in the ensemble DA process, since a straightforward formulation requires an expensive computation of Jacobians of the ray-tracing model. In the case of radiative transfer models needed for satellite DA (Campbell et al. 2010), this issue has been largely overcome by a handful of methods, one being the Modulated Ensemble Transform Kalman Filter (METKF) introduced by Bishop and Hodyss (2009) and refined by Bishop et al. (2017). METKF is a specialized implementation of the Ensemble Transform Kalman Filter (ETKF) (Bishop et al. 2001) that increases the size of an ensemble of forecasts into a modulated ensemble which, by design, yields a desired localized covariance matrix. The second contribution of this work is the application of the METKF for the assimilation of infrasound-related observations. This paper follows the paradigm of Observing-System Simulation Experiments (OSSEs), described in Arnold and Dey (1986). OSSEs are commonly used in the atmospheric science community to assess the potential of assimilating new types of observations. OSSEs are constructed by generating a known synthetic truth (by running an atmospheric model, for instance), simulating imperfect observations for which the observation operator and error are known and characterized, and attempting to reconstruct the original truth from these observations using DA techniques. In the current work, we are interested in testing: (i) the ability of the METKF to estimate a vertical atmospheric profile from infrasound-related experiments with a relatively large ensemble (the idealized setting); and (ii) the capacity of the METKF to overcome the sampling problems that are due to limited-size ensembles (the realistic setting).
The rest of this paper is organized as follows. In Section 2 we discuss sources of infrasound data, in particular the Hukkakero-ARCES geographical setting used in our OSSEs. Section 3 includes a detailed description of the problem in terms of DA elements, as well as the justification for using the METKF to localize vertically integrated quantities. Section 4 describes the setup for our OSSEs, and Section 5 presents and analyzes the results. We finish with conclusions and discussion in Section 6.

Infrasound-related measurements and their potential in DA

a. Infrasound propagation as a means to probe the atmosphere
There are different sources at the Earth's surface (and aloft) that generate infrasound waves, that is, inaudible acoustic waves in the 0.1–10 Hz range. These sources of infrasound can be natural (e.g., volcanoes, earthquakes, ocean-generated microbaroms) or the result of human activities (e.g., controlled detonations of ammunition or quarry explosions). At such low frequencies, these waves can travel for long distances, suffering little dissipation, and having a wide range of propagation characteristics and various sensitivities to atmospheric conditions (Le Pichon et al. 2010, 2019).
Depending on the atmospheric structure, some of these waves travel up to middle atmospheric or lower thermospheric altitudes, from where they are reflected or refracted back to the surface, to be detected by sensor arrays. While traveling through an atmospheric slab, they are affected by the atmospheric conditions, in particular winds and temperature. Therefore, the infrasound-related measurements estimated at the instrument site contain useful integrated information about the conditions along the propagation path. For example, an objective of the EU-funded ARISE (Atmospheric dynamics Research InfraStructure in Europe) projects was to establish and ensure the operation of this type of infrastructure for long-term observation and modeling of the middle atmosphere (Blanc et al. 2018, 2019).
In the past, infrasound measurements have been used to evaluate, validate, and invert for atmospheric model profiles. For instance, Assink et al. (2014) used volcanic infrasound to evaluate winds and temperatures of the ECMWF forecast system. Smets et al. (2016) used these measurements to evaluate the forecast skill in predicting sudden stratospheric warming events. In these studies, values coming from an atmospheric profile (either forecast or analysis) are put through a ray-tracing model to generate the predicted infrasound-related observations. These results are then compared to the actual observations. This is the so-called forward modeling problem. A more challenging problem is to take actual observations and try to infer the atmospheric values to which they correspond, i.e., the so-called inverse problem. In general, this is an underdetermined problem, i.e., it does not have a unique solution. Several recent works have explored the inversion of infrasound-related observations to derive atmospheric profiles or to provide corrections to model profiles (e.g., Le Pichon et al. 2005; Drob et al. 2008; Lalande et al. 2012; Johnson et al. 2012; Arrowsmith et al. 2013; Assink et al. 2013; Park et al. 2016). Recently, Amezcua et al. (2020) and Amezcua and Barton (2021) conducted DA experiments using the infrasound dataset described next.

b. Using infrasound observations from controlled detonation of ammunition in data assimilation
We start by noting that, from an atmospheric probing perspective, we are describing a dataset of opportunity. There are several additional repeated explosion datasets available in this region, stemming, e.g., from open-pit mining blasts (Gibbons et al. 2015).
Every August and September since 1988, excess explosives have been disposed of by controlled detonations in Hukkakero, Finland. Infrasound waves generated by these detonations are excited in all directions, and their path through the atmosphere is mainly a function of the atmospheric wind and temperature fields. Some of these waves are detected about 10 minutes later by the array station ARCES (Norway), which is ∼180 km almost due North from the explosion site. It has been concluded that these waves have been partially reflected or back-scattered by smaller-scale atmospheric structures at about 30–60 km altitude (Blixt et al. 2019; Vorobeva et al. 2023).
The exact altitude of the strongest reflection is unknown and varies between events, but it can be estimated by indirect methods like matching observed travel times with those of simulated infrasonic wave propagation paths. This kind of transient explosion infrasound dataset is less complex than many other infrasound datasets, since the waves can be approximated to radiate from a single point source in both space and time, with a known location and origin time. These data are sensitive to wind and temperature in the stratosphere and lower mesosphere. A variety of infrasound wave propagation and atmospheric probing studies have exploited infrasound data recorded from these explosions (Gibbons et al. 2015; Green and Nippress 2019; Gibbons et al. 2019; Blixt et al. 2019; Cugnet et al. 2019; Vera Rodriguez et al. 2020; Vorobeva et al. 2023). Blixt et al. (2019) performed near-inversion experiments by associating observed quantities with wind averages over different altitude "channels" (layers) computed from the ERA-Interim product (Dee et al. 2011). They simplified the acoustic transmission problem using a straightforward geometric model. Vera Rodriguez et al. (2020) performed an inversion of these observations applying a heuristic optimization algorithm, using smoothness criteria to regularise the problem, and using the ERA-5 ensemble product (Hersbach et al. 2019) to initialise the process and constrain its output.
Two recently published scientific papers exploited DA for the observations from this dataset (Amezcua et al. 2020; Amezcua and Barton 2021). These works applied different flavors of the Ensemble Kalman filter (EnKF; Evensen 2006), a technique widely utilized in atmospheric science, which can be considered a sample-based implementation of the Kalman filter (Kalman 1960).
Both cases used the ERA5 10-member ensemble as background (Hersbach et al. 2019) and the simple geometric model developed by Blixt et al. (2019), using the reflection heights inferred in that work. The EnKF relies on sample covariance matrices to spread information from observations to model variables. For the number of variables in the atmospheric profiles at the heart of this problem, the quality of a covariance matrix computed from a 10-member ensemble is low, often plagued with spurious sampling errors. Localization is a way to ameliorate these problems (Hamill et al. 2001; Anderson 2012). This is not trivial when using observations that contain integrated quantities (Campbell et al. 2010). Amezcua et al. (2020) evaded this problem by taking averages and reducing the number of vertical levels to update from 137 to only 6. Amezcua and Barton (2021) used 131 levels and applied a type of localization (Shlyaeva et al. 2019) that required the linearization of the observation operator, which was relatively straightforward for the geometric model (Blixt et al. 2019). However, this is complicated and computationally expensive with a full ray-tracing model, which is needed for a more realistic treatment of infrasound observations.
In the last decade, some methods were developed to implement localization for vertically integrated satellite observations. For instance, Lei et al. (2016) implemented a global group filter to derive adaptive localization. This filter was similar to the hierarchical filter proposed in Anderson (2007) and later used in Lei and Anderson (2014) to construct vertical localization functions. Farchi and Bocquet (2019) recommend the creation of augmented ensembles by performing, e.g., a randomized singular value decomposition. Bishop et al. (2017) developed the Modulated Ensemble Transform Kalman Filter (METKF), a variation of the ETKF (Bishop et al. 2001; Wang et al. 2004) which avoids linearization by computing an extended (modulated) ensemble which has, by construction, a desired localization structure. In the rest of the current paper, we apply this method to assimilate infrasound-related data into realistic atmospheric vertical profiles.

Methodology
This section introduces basic aspects of the DA method used and presents the nomenclature used in the rest of the work. In this work we focus on a single assimilation step of the DA process, i.e., there is no cycling.

a. Problem setup
Let us consider an atmospheric range-independent volume that is discretized into a profile of $N_z$ vertical levels. Each level has $N_{\mathrm{var}}$ altitude-dependent physical variables (we do not consider any horizontal variation). In our case, these are the cross-wind $u(z)$, the tail-wind $v(z)$, and the temperature $T(z)$, rendering $N_{\mathrm{var}} = 3$. At a given time, the state vector $\mathbf{x} \in \mathbb{R}^{N_x}$ is the concatenation of the $N_x = N_z N_{\mathrm{var}}$ elements. Namely,
$$\mathbf{x} = \left[\mathbf{u}^{\top}, \mathbf{v}^{\top}, \mathbf{T}^{\top}\right]^{\top}, \tag{1}$$
where
$$\mathbf{u} = \left[u(z_1), \ldots, u(z_{N_z})\right]^{\top}, \quad \mathbf{v} = \left[v(z_1), \ldots, v(z_{N_z})\right]^{\top}, \quad \mathbf{T} = \left[T(z_1), \ldots, T(z_{N_z})\right]^{\top}. \tag{2}$$
We assume that $\mathbf{x}$ follows a multivariate Gaussian (MVG) distribution, i.e., $\mathbf{x} \sim \mathcal{N}\left(\boldsymbol{\mu}^b, \mathbf{B}\right)$, where $\boldsymbol{\mu}^b \in \mathbb{R}^{N_x}$ and $\mathbf{B} \in \mathbb{R}^{N_x \times N_x}$ are the background mean and covariance matrix, respectively. If there are unknown system parameters, these can be appended to the state vector and estimated by the DA, using a process called state augmentation (e.g., Asch et al. 2016). In this work, however, we only perform state estimation.
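As a minimal illustration, the concatenated state vector described above can be sketched in a few lines of Python; the profile values below are arbitrary placeholders, not real reanalysis data:

```python
import numpy as np

rng = np.random.default_rng(0)

n_z, n_var = 60, 3          # vertical levels; cross-wind u, tail-wind v, temperature T
n_x = n_z * n_var           # length of the state vector

# Placeholder profiles standing in for real data (NOT actual MERRA2 values).
u = rng.normal(0.0, 5.0, n_z)        # cross-wind [m/s]
v = rng.normal(0.0, 5.0, n_z)        # tail-wind [m/s]
T = np.linspace(280.0, 230.0, n_z)   # temperature [K]

# The state vector is the concatenation of the three fields.
x = np.concatenate([u, v, T])

# A background draw x ~ N(mu_b, B); here B is a toy diagonal covariance.
mu_b = x.copy()
B = 4.0 * np.eye(n_x)
x_sample = rng.multivariate_normal(mu_b, B)
```

Any realistic $\mathbf{B}$ would of course contain cross-correlations between levels and variables; the diagonal matrix here only demonstrates the sampling mechanics.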
The observations $\mathbf{y} \in \mathbb{R}^{N_y}$ are functions of $\mathbf{x}$ which can be measured. Mathematically,
$$\mathbf{y} = h\left(\mathbf{x}^{\mathrm{true}}\right) + \boldsymbol{\eta}, \tag{3}$$
where $\mathbf{x}^{\mathrm{true}} \in \mathbb{R}^{N_x}$ is the true value of the state variable, which is indeed the value we are trying to estimate. The term $\boldsymbol{\eta} \in \mathbb{R}^{N_y}$ corresponds to additive observational error and is assumed to follow an MVG distribution with zero mean (unbiased error). Namely, $\boldsymbol{\eta} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$. The observational error covariance $\mathbf{R} \in \mathbb{R}^{N_y \times N_y}$ depends on the measuring system and is often assumed to be known.
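The measurement equation can be mimicked numerically with a toy observation operator; the weighted-average operator `h` below is a hypothetical stand-in for the ray-tracing model, chosen only because it reduces a column to a few integrated scalars:

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_y = 180, 3                    # state and observation dimensions

x_true = rng.normal(size=n_x)        # stand-in for the true profile

def h(x):
    """Toy stand-in for the observation operator: three integrated
    (weighted-average) quantities computed from the whole column."""
    W = np.vstack([np.ones(n_x),
                   np.linspace(0.0, 1.0, n_x),
                   np.linspace(1.0, 0.0, n_x)])
    return W @ x / n_x

R = 0.1 * np.eye(n_y)                            # assumed-known error covariance
eta = rng.multivariate_normal(np.zeros(n_y), R)  # unbiased Gaussian error
y = h(x_true) + eta                              # synthetic observation
```

This is exactly the recipe used later to generate OSSE observations: apply $h$ to a known truth and add a random draw of $\boldsymbol{\eta}$.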
Finally, the function $h: \mathbb{R}^{N_x} \to \mathbb{R}^{N_y}$ is the so-called observation operator. In remote sensing of atmospheric temperatures using satellite observations of radiance (Campbell et al. 2010), $h$ is a radiative transfer model. The infrasound measurement problem also involves a wave (mechanical instead of electromagnetic) traveling through an atmospheric slab and being modified by the atmospheric conditions it encounters. In both cases, $h$ integrates (or sums) information from the state variables, reducing fields into scalars and yielding $N_y \ll N_x$. Next, we describe how $h$ was previously modeled, and how the modeling has been improved in this work. Amezcua et al. (2020) and Amezcua and Barton (2021) used a simplified observation operator, derived from the geometric model of Blixt et al. (2019). Although useful, this is a rough approximation of the real process. It relates the cross-wind $\mathbf{u}$ to the shift in the backazimuth angle $\beta$ detected at reception, but excludes the tail-wind $\mathbf{v}$ and temperature field $\mathbf{T}$ components included in Eq (1). It considers the transmission path to be a perfect triangle, which is a crude simplification (see, e.g., Ostashev and Norris (2022) to appreciate the real complexity of the ray paths). It also considers a unique reflection height $z_{\mathrm{max}}$, corresponding to a single reading of the infrasound array processing output time series. In reality, there are typically multiple arrivals corresponding to multiple reflection heights and infrasound processing output readings (see Blixt et al. 2019, Figs. 4 and 5 for illustrations).
Following Vera Rodriguez et al. (2020), in this work we use a full ray-tracing model, InfraGA, developed by Philip Blom at Los Alamos National Laboratory (Blom et al. 2015; Blom and Waxler 2017, 2021). InfraGA simulates the infrasonic wave propagation from source to receiver, and its numerical modeling links the three atmospheric fields in Eq (2) to three observables estimated using the infrasound data recorded at the array station, which are related to the direction-of-arrival of the infrasound wavefront impinging on the station. They are $\beta$ (backazimuth), $t_{\mathrm{travel}}$ (travel time), and $c_{\mathrm{trace}}$ (trace velocity). In DA terms, the observation vector is:
$$\mathbf{y} = \left[\beta, t_{\mathrm{travel}}, c_{\mathrm{trace}}\right]^{\top}. \tag{4}$$
When applying InfraGA, we still consider a single and known $z_{\mathrm{max}}$. As in Blixt et al. (2019), we introduce the reflection at the horizontal middle point of the path. We acknowledge that there is an inherent time-evolution component in $h$, since the infrasound takes a finite time to travel between the source and the receiver, so the wave encounters a time-evolving atmosphere. In the satellite case, the electromagnetic wave travels at the speed of light, so the atmospheric variability is negligible, but the sound speed is much lower. Nonetheless, and in accordance with previous works, we consider the travel time of the infrasound wave (about 10 minutes in our setting) to be very small compared to the variability time scales of the atmospheric flow.

b. The ensemble transform Kalman filter
Having the background information and observations just described, the goal of DA is to obtain a posterior distribution for the state variable. If all errors involved in the problem are MVG and $h$ is linear, the posterior distribution (also known as analysis) is also MVG, with mean $\boldsymbol{\mu}^a \in \mathbb{R}^{N_x}$ and covariance $\mathbf{A} \in \mathbb{R}^{N_x \times N_x}$. These values can be obtained using the analysis equations of the Kalman filter (Kalman 1960). The extended Kalman filter (EKF) is an adaptation of the Kalman filter for nonlinear setups, yielding a suboptimal yet useful solution to the problem as long as $h$ is not too far from linear (e.g., Jazwinski 1970). Ensemble Kalman filters (EnKFs) (e.g., Nerger et al. 2012; Houtekamer and Zhang 2016) are sample-based implementations of the EKF, which avoid the computation and storage of the Jacobian matrix of $h$ (required by the EKF).
EnKFs rely on an ensemble to estimate means and covariances. Fortunately, modern ensemble prediction systems (e.g., Palmer 2019) generate $N_e$ forecasts from different initial conditions (selected to represent the uncertainty in the system), evolved using a forecast model. We denote the background ensemble as $\mathbf{X}^b \in \mathbb{R}^{N_x \times N_e}$, where each column is a valid forecast (we do not need a time index since we do not perform cycling). The ensemble mean $\bar{\mathbf{x}}^b \in \mathbb{R}^{N_x}$ and a matrix of normalized perturbations $\hat{\mathbf{X}}^b \in \mathbb{R}^{N_x \times N_e}$ can be written as:
$$\bar{\mathbf{x}}^b = \frac{1}{N_e}\,\mathbf{X}^b \mathbf{1}, \qquad \hat{\mathbf{X}}^b = \frac{1}{\sqrt{N_e - 1}}\left(\mathbf{X}^b - \bar{\mathbf{x}}^b \mathbf{1}^{\top}\right),$$
where $\mathbf{1} \in \mathbb{R}^{N_e}$ is a vector of ones. Note that one can obtain the sample estimator of $\mathbf{B}$, denoted $\mathbf{P}^b$, as:
$$\mathbf{P}^b = \hat{\mathbf{X}}^b \hat{\mathbf{X}}^{b\top}.$$
One can define an ensemble of predicted observations $\mathbf{Y}^b \in \mathbb{R}^{N_y \times N_e}$ by applying $h$ to each ensemble member, and compute the corresponding mean $\bar{\mathbf{y}}^b \in \mathbb{R}^{N_y}$ and matrix of perturbations $\hat{\mathbf{Y}}^b \in \mathbb{R}^{N_y \times N_e}$ (e.g., Evensen 1994). One can compute the KF equations with these sample elements to transform $\bar{\mathbf{x}}^b$ into $\bar{\mathbf{x}}^a$ and $\mathbf{P}^b$ into $\mathbf{P}^a$, respectively. However, this does not provide a way to transform individual ensemble members.
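The ensemble statistics above translate directly into NumPy; this sketch uses small, arbitrary dimensions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n_x, n_e = 12, 40                    # small sizes for illustration

Xb = rng.normal(size=(n_x, n_e))     # background ensemble, one member per column

# Ensemble mean and normalized perturbation matrix.
xb_mean = Xb.mean(axis=1)
Xb_pert = (Xb - xb_mean[:, None]) / np.sqrt(n_e - 1)

# Sample estimator of the background covariance.
Pb = Xb_pert @ Xb_pert.T
```

With the $1/\sqrt{N_e - 1}$ normalization, `Pb` coincides with NumPy's unbiased sample covariance `np.cov(Xb)`.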
There is no unique way to do this, leading to different EnKF formulations. The ETKF is a deterministic square-root filter that uses a right matrix multiplication to transform the perturbations:
$$\hat{\mathbf{X}}^a = \hat{\mathbf{X}}^b \mathbf{W}^a,$$
where $\mathbf{W}^a \in \mathbb{R}^{N_e \times N_e}$ is referred to as the transform matrix, chosen such that $\mathbf{P}^a = \hat{\mathbf{X}}^a \hat{\mathbf{X}}^{a\top}$ fulfils the KF analysis equation for covariance. Using the Potter method and the Woodbury matrix inversion lemma, it can be shown (Tippett et al. 2003) that this matrix is:
$$\mathbf{W}^a = \left(\mathbf{I} + \hat{\mathbf{Y}}^{b\top} \mathbf{R}^{-1} \hat{\mathbf{Y}}^b\right)^{-\frac{1}{2}}.$$
There is an infinite number of square-root matrices, but Wang et al. (2004) propose using the symmetric matrix square root:
$$\mathbf{W}^a = \mathbf{G}\,\boldsymbol{\Gamma}^{-\frac{1}{2}}\,\mathbf{G}^{\top},$$
using the eigendecomposition
$$\mathbf{G}\,\boldsymbol{\Gamma}\,\mathbf{G}^{\top} = \mathbf{I} + \hat{\mathbf{Y}}^{b\top} \mathbf{R}^{-1} \hat{\mathbf{Y}}^b, \tag{12}$$
where $\mathbf{G} \in \mathbb{R}^{N_e \times N_e}$ is the matrix of eigenvectors and $\boldsymbol{\Gamma} \in \mathbb{R}^{N_e \times N_e}$ is the (diagonal) matrix of eigenvalues of the $N_e \times N_e$ positive definite and symmetric matrix on the right side of Eq (12). Hunt et al. (2007) note that the mean $\bar{\mathbf{x}}^a$ can be found as the sum of $\bar{\mathbf{x}}^b$ and a linear combination of the columns of $\hat{\mathbf{X}}^b$, i.e.:
$$\bar{\mathbf{x}}^a = \bar{\mathbf{x}}^b + \hat{\mathbf{X}}^b \mathbf{w}^a.$$
The vector $\mathbf{w}^a \in \mathbb{R}^{N_e}$ contains the coefficients for the linear combination. Explicitly:
$$\mathbf{w}^a = \mathbf{G}\,\boldsymbol{\Gamma}^{-1}\,\mathbf{G}^{\top}\,\hat{\mathbf{Y}}^{b\top} \mathbf{R}^{-1}\left(\mathbf{y} - \bar{\mathbf{y}}^b\right).$$
Finally, $\mathbf{X}^a$ can easily be constructed by adding $\bar{\mathbf{x}}^a$ to each column of $\sqrt{N_e - 1}\,\hat{\mathbf{X}}^a$.
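The steps above can be condensed into a single function. This is a generic ETKF sketch (not the authors' code); `h` can be any observation operator, and the perturbations use the $1/\sqrt{N_e - 1}$ normalization:

```python
import numpy as np

def etkf_analysis(Xb, y, R, h):
    """One ETKF analysis step using the symmetric square-root transform."""
    n_x, n_e = Xb.shape
    xb = Xb.mean(axis=1)
    Xp = (Xb - xb[:, None]) / np.sqrt(n_e - 1)          # normalized perturbations

    # Predicted observations: apply h to each ensemble member.
    Yb = np.column_stack([h(Xb[:, k]) for k in range(n_e)])
    yb = Yb.mean(axis=1)
    Yp = (Yb - yb[:, None]) / np.sqrt(n_e - 1)

    Rinv = np.linalg.inv(R)
    # Eigendecomposition of the positive definite matrix I + Yp^T R^-1 Yp.
    vals, G = np.linalg.eigh(np.eye(n_e) + Yp.T @ Rinv @ Yp)

    Wa = G @ np.diag(vals ** -0.5) @ G.T                # symmetric square root
    wa = G @ np.diag(1.0 / vals) @ G.T @ Yp.T @ Rinv @ (y - yb)

    xa = xb + Xp @ wa                                   # analysis mean
    Xa = xa[:, None] + np.sqrt(n_e - 1) * (Xp @ Wa)     # analysis ensemble
    return xa, Xa
```

With a linear `h`, the analysis mean lands between the background mean and the observation, weighted by the respective error covariances; the symmetric square root also guarantees that the analysis perturbations remain centered on the analysis mean.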

c. Localization for integrated observation quantities: the modulated ETKF
The sample estimator quality is crucial for EnKF performance. When $N_e \ll N_x$, sampling errors lead to spurious off-diagonal elements in $\mathbf{P}^b$. Recognizing this, both Anderson (2001) and Hamill et al. (2001) proposed to artificially modify $\mathbf{P}^b$ to reduce sampling errors; this modification is called localization.
Localization can be performed either in state space or in observation space; see, e.g., Farchi and Bocquet (2019) for an ample discussion on the topic. The localized covariance $\mathbf{P}^b_{\mathrm{loc}} \in \mathbb{R}^{N_x \times N_x}$ in state space is obtained from the Schur (element-wise) product of $\mathbf{P}^b$ and a localization matrix $\mathbf{L} \in \mathbb{R}^{N_x \times N_x}$. Explicitly, we have
$$\mathbf{P}^b_{\mathrm{loc}} = \mathbf{L} \circ \mathbf{P}^b. \tag{16}$$
The entries of $\mathbf{L}$ typically decay with the distance between the corresponding grid points. A typical choice for this decay is the so-called Gaspari-Cohn function, which is a compact-support approximation to a Gaussian (Gaspari and Cohn 1999). In the case of multiple variables per level, one has to modify the different blocks of the matrix separately. When using model-space localization, the ETKF formulae discussed previously cannot be applied; a matrix square-root formulation is required instead. This implies that, if the observation operator is nonlinear, one needs to compute its Jacobian, as indicated in Shlyaeva et al. (2019). In fact, Amezcua and Barton (2021) applied localization in this way to infrasound observations when using a simple geometric propagation model, since in that case the calculation of the Jacobian was trivial.
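A minimal sketch of state-space localization with the Gaspari-Cohn taper follows; the piecewise polynomial is the standard fifth-order function of Gaspari and Cohn (1999), and the grid and length scale are arbitrary illustrative values:

```python
import numpy as np

def gaspari_cohn(r, c):
    """Gaspari-Cohn 5th-order compactly supported correlation function."""
    z = np.abs(r) / c
    f = np.zeros_like(z, dtype=float)
    m1 = z <= 1.0
    m2 = (z > 1.0) & (z <= 2.0)
    f[m1] = (-0.25 * z[m1]**5 + 0.5 * z[m1]**4 + 0.625 * z[m1]**3
             - 5.0 / 3.0 * z[m1]**2 + 1.0)
    f[m2] = (z[m2]**5 / 12.0 - 0.5 * z[m2]**4 + 0.625 * z[m2]**3
             + 5.0 / 3.0 * z[m2]**2 - 5.0 * z[m2] + 4.0 - 2.0 / (3.0 * z[m2]))
    return f                              # zero beyond twice the length scale

# Localization matrix for a single-variable column of n_z levels.
n_z, dz, c = 60, 1.0, 10.0                # levels, spacing [km], length scale [km]
heights = np.arange(n_z) * dz
L = gaspari_cohn(heights[:, None] - heights[None, :], c)

# The Schur (element-wise) product tapers a noisy sample covariance.
rng = np.random.default_rng(4)
Pb = np.cov(rng.normal(size=(n_z, 20)))   # small-ensemble sample covariance
Pb_loc = L * Pb
```

Because the taper equals one on the diagonal, localization leaves the sample variances untouched and only damps the (noisiest) long-range covariances.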
Observation-space localization follows a different approach; see, e.g., Hunt et al. (2007) for a detailed description. This paradigm focuses on the variables of a given gridpoint of the model and looks at the observations that affect that gridpoint within a defined radius of interest. The local observation precision (inverse of the covariance) matrix is modified to account for the distance of the observations from the gridpoint. This is
$$\tilde{\mathbf{R}}^{-1}_i = \mathbf{L}_i \circ \mathbf{R}^{-1}_i,$$
where $\mathbf{R}_i$ is a submatrix of $\mathbf{R}$ specific to the $i$-th gridpoint, and $\mathbf{L}_i$ is a localization matrix with the corresponding dimensions. The full analysis field is a collection of patches of local analyses, minimizing jumps amongst the patches (Ott et al. 2004). This formulation is completely compatible with the ETKF formulae discussed earlier (Hunt et al. 2007).
Applying localization to integrated quantities with complex observation operators is a challenge. Campbell et al. (2010) analysed the case involving satellite observations. The difficulty arises because the model space is a column with temperatures at each level, whereas the observation space is a set of channels measuring radiance at different wavelengths. The observational data are hence integrated quantities. Therefore, defining a 'distance' between two channels, or between a channel and a vertical level, as required for observation-space localization, is not conceivable.
Since in our problem we have spatially integrated infrasound wave propagation data, similar difficulties arise. Our model space is an atmospheric slab with temperature and wind variables at vertical levels, and our observation space has three integrated quantities related to the infrasound propagation through the atmospheric slab. A naïve implementation of model-space localization in our case is not feasible, since the Jacobian of the ray-tracing model is not available. Bishop and Hodyss (2009) and Bishop et al. (2017) developed a clever method to apply localization for integrated observations in the ETKF without requiring a Jacobian. It takes $\mathbf{X}^b \in \mathbb{R}^{N_x \times N_e}$ and converts it into a modulated ensemble $\mathbf{U}^b \in \mathbb{R}^{N_x \times N_m}$, such that both have the same mean:
$$\bar{\mathbf{u}}^b = \bar{\mathbf{x}}^b,$$
but where $\mathbf{U}^b$ produces a localized covariance. This requires a factorization of Eq (16) in the following form:
$$\mathbf{L} \circ \left(\hat{\mathbf{X}}^b \hat{\mathbf{X}}^{b\top}\right) = \hat{\mathbf{U}}^b \hat{\mathbf{U}}^{b\top}, \tag{19}$$
where $\hat{\mathbf{U}}^b \in \mathbb{R}^{N_x \times N_m}$ is an ensemble of modulated perturbations (hence the name Modulated ETKF). Obtaining $\hat{\mathbf{U}}^b$ from Eq (19) is possible using the Khatri-Rao product (Khatri and Rao 1968). We follow Bishop et al. (2017) for the next steps, albeit with some variations in notation. $\mathbf{L}$ is a positive definite symmetric matrix, allowing for the eigendecomposition:
$$\mathbf{L} = \mathbf{S}\,\boldsymbol{\Lambda}\,\mathbf{S}^{\top},$$
where $\mathbf{S} \in \mathbb{R}^{N_x \times N_x}$ is the matrix of eigenvectors, and $\boldsymbol{\Lambda} \in \mathbb{R}^{N_x \times N_x}$ is a diagonal matrix of eigenvalues.
The non-symmetric matrix square root of $\mathbf{L}$ is:
$$\mathbf{L}^{\frac{1}{2}} = \mathbf{S}\,\boldsymbol{\Lambda}^{\frac{1}{2}},$$
where $\boldsymbol{\Lambda}^{\frac{1}{2}}$ is trivial to compute since it is diagonal. $\mathbf{L}^{\frac{1}{2}}$ has the same shape as $\mathbf{L}$. In its construction, it is not necessary to keep all the eigenvectors of the localization matrix. Often, the leading $N_L$ eigenvalues in $\boldsymbol{\Lambda}$ add to 90% (or any threshold) of the total sum of eigenvalues, and many of the smallest eigenvalues are close to zero. Bishop et al. (2017) recommend keeping the leading $N_L$ eigenvalues to construct a rank-$N_L$ approximate matrix square root $\tilde{\mathbf{L}}^{\frac{1}{2}} \in \mathbb{R}^{N_x \times N_L}$ as:
$$\tilde{\mathbf{L}}^{\frac{1}{2}} = \mathbf{S}_{[N_x, N_L]}\,\boldsymbol{\Lambda}^{\frac{1}{2}}_{[N_L, N_L]}. \tag{22}$$
In Eq (22), the subindices indicate the number of [rows, columns] kept for each matrix. Finally, $\hat{\mathbf{U}}^b \in \mathbb{R}^{N_x \times N_m}$ is computed as the Khatri-Rao product (Khatri and Rao 1968):
$$\hat{\mathbf{U}}^b = \tilde{\mathbf{L}}^{\frac{1}{2}} \diamond \hat{\mathbf{X}}^b.$$
The Khatri-Rao product (denoted $\diamond$) is a column-wise Kronecker product. Explicitly, this is:
$$\hat{\mathbf{U}}^b = \left[\left(\boldsymbol{\ell}^{\frac{1}{2}}_1 \circ \hat{\mathbf{x}}^b_1, \ldots, \boldsymbol{\ell}^{\frac{1}{2}}_1 \circ \hat{\mathbf{x}}^b_{N_e}\right), \ldots, \left(\boldsymbol{\ell}^{\frac{1}{2}}_{N_L} \circ \hat{\mathbf{x}}^b_1, \ldots, \boldsymbol{\ell}^{\frac{1}{2}}_{N_L} \circ \hat{\mathbf{x}}^b_{N_e}\right)\right].$$
We added parentheses to help identify the elements in $\hat{\mathbf{U}}^b$. Each parenthesis contains the original $N_e$-member ensemble modulated by a corresponding eigenvector $\boldsymbol{\ell}^{\frac{1}{2}}_j$ of $\tilde{\mathbf{L}}^{\frac{1}{2}}$. This illustrates that $N_m = N_L N_e$ depends on the number of eigenvalues kept.
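The modulation construction can be verified numerically in a few lines. This sketch keeps all eigenpairs (no rank truncation) and uses a hypothetical Gaussian-taper localization matrix purely for illustration; the key property is that the modulated perturbations reproduce the Schur-product covariance exactly:

```python
import numpy as np

rng = np.random.default_rng(5)
n_x, n_e = 10, 5

# Normalized background perturbations (columns sum to zero).
X = rng.normal(size=(n_x, n_e))
Xp = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)
Pb = Xp @ Xp.T

# A simple positive semi-definite localization matrix (Gaussian taper).
d = np.abs(np.arange(n_x)[:, None] - np.arange(n_x)[None, :])
L = np.exp(-(d / 3.0) ** 2)

# Non-symmetric square root of L via eigendecomposition (all modes kept here;
# in the METKF only the leading eigenpairs would be retained).
lam, S = np.linalg.eigh(L)
L_half = S @ np.diag(np.sqrt(np.clip(lam, 0.0, None)))   # L = L_half @ L_half.T

# Modulated perturbations: each column of L_half modulates the whole ensemble,
# giving n_x * n_e columns in total.
U = np.column_stack([L_half[:, j:j + 1] * Xp for j in range(n_x)])
```

By construction, `U @ U.T` equals `L * Pb`, which is exactly the factorization the METKF exploits: the modulated ensemble can be fed to the plain ETKF machinery while implicitly carrying the localized covariance.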
The ETKF is applied to $\mathbf{U}^b$, yielding a modulated analysis ensemble $\mathbf{U}^a \in \mathbb{R}^{N_x \times N_m}$. If one needs to recover an ensemble of the original size (e.g., to continue the prediction process), there are ways to select an ensemble $\mathbf{X}^a$ from $\mathbf{U}^a$ (Bishop et al. 2017), but this is not necessary for us.

Applying METKF to synthetic infrasound data: Experimental setup
We now implement the METKF in a synthetic yet realistic case involving infrasound-related observations. We simulate the geographic and atmospheric settings of the Hukkakero explosion site, with infrasound waves propagating to the ARCES array (Blixt et al. 2019). The source and reception sites are separated by an approximate North-South distance of $d \sim 180$ km along the great circle, with infrasound waves reflected roughly between 30 and 50 km altitude.
We start the section by describing the construction of a realistic background ensemble $\mathbf{X}^b$. Then, we look into the design of a localization matrix and the application of $h$. As is conventional for OSSEs, we produce a reference truth $\mathbf{x}^{\mathrm{true}}$ to compare against, and from which synthetic observations are generated. We then apply the ETKF using a large ensemble to get a solution without sampling error. Then we reduce the ensemble size, which introduces sampling error, and finally we ameliorate this by applying localization. The performance of the modulated ensemble is compared with a raw ensemble of the same size.
To assess the DA results, we perform two complementary processes. The first is a single-observation experiment. In this case, we take a single truth and generate a 'perfect' observation, i.e., we choose the realization of the observation error to be equal to $\mathbb{E}[\boldsymbol{\eta}] = \mathbf{0}$. We then perform DA with this observation and assess the quality of the result. The second is a statistical evaluation of the DA performance. In this case, we consider $N_{\mathrm{true}}$ truths and produce their respective observations using different realizations of $\boldsymbol{\eta}$. We perform the DA process, evaluate the performance in each and every case, and collect statistics.

a. Background-state generation
Ideally, we would have a large ensemble forecast for a given lead time in the region of interest. In its absence, we artificially generate a large ensemble that represents realistic atmospheric structures. We use the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA2), produced by NASA (Gelaro et al. 2017). This is a single-trajectory (i.e., not an ensemble) reanalysis product with data beginning in 1980, which exploits several observation types (Koster et al. 2016). It uses the GEOS-5 model with a spatial resolution of about 50 km in the horizontal and with 72 vertical levels, with the uppermost level at 0.01 hPa. The temporal resolution is 3 hours. With these reanalysis data, we create a realistic $\boldsymbol{\mu}^b$ and $\mathbf{B}$ to characterise the MVG distribution to sample from. The elements in this process are detailed with the help of Figure 2 (in which the black lines represent the true profile used in the single-observation experiment).
1. We focus on the Northern Hemisphere autumn/winter period, which is when the stratospheric winds are strong and have the greatest influence on the propagation of infrasound waves. This is also the period when stratosphere-troposphere coupling is most prominent, and therefore a better representation of stratospheric dynamics can have the greatest impact on (tropospheric) numerical weather prediction (e.g., Charlton-Perez et al. 2013). We select 207 MERRA2 land grid points over Finland and northeast Norway (9 longitude by 23 latitude points). We consider five days in November 2010 (17th to 21st, inclusive). To represent common stratospheric conditions (e.g., a strong westerly polar vortex), we make sure to select days without sudden stratospheric warming events. We extract data for five times a day (every three hours, 9h-21h UTC).
2. The previous step provides a total of 5175 vertical profile samples of the $\{u(z), v(z), T(z)\}$ variables. We extract the lowermost $N_z = 60$ model levels, spanning from the surface to about 62 km altitude. We use these profiles to compute a correlation matrix $\mathbf{C}$, shown in Figure 1. This is a block matrix where each block corresponds to a different physical variable, at all levels. The blocks are highlighted with black lines.
3. We use the statistics coming from the profiles in (1) to prescribe a MVG distribution from which to sample.We do not use these profiles directly since they do not follow an MVG distribution -there is no particular reason for them to do this, since these are columns coming from different locations at different times of a single-trajectory reanalysis.This is different from having, for example, a 12-hour ensemble forecast for a given location.Hence, we construct the distribution as follows.We choose one of the profiles from (a) as µ  .Mimicking a realistic scenario, we prescribe background standard deviations that increase with altitude.
For T, these standard deviations are set to vary linearly from 2 K to 8 K, while for both u and v they are set to vary linearly from 1.5 m/s to 6 m/s. These over-simplified values were partially inspired by results from Amezcua and Barton (2021). With these elements, we construct the background covariance matrix B by scaling the correlation matrix C with the prescribed standard deviations. The localization matrix L follows a Gaspari-Cohn decay from the main diagonal, with a fixed length scale l_c = 0.5 km for all altitudes.
Even though we did not perform a proper sensitivity analysis for the localization length scale, some brief experiments suggested that this value yielded the best DA performance among the candidates l_c = {0.5, 1.0, 1.5, 2.0, 2.5} km.
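As an illustration, a Gaspari-Cohn taper matrix of the kind used for L can be built as follows. This is a minimal sketch, not the paper's code: the uniform grid spacing, the mapping of the length scale onto the taper support (here the taper reaches exactly zero at twice the length scale), and the replication of the same taper across the 3 x 3 variable blocks are all our assumptions.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn fifth-order piecewise taper; zero beyond 2*c."""
    z = np.abs(dist) / c
    taper = np.zeros_like(z)
    m1 = z <= 1.0
    m2 = (z > 1.0) & (z <= 2.0)
    z1 = z[m1]
    taper[m1] = (-0.25 * z1**5 + 0.5 * z1**4 + 0.625 * z1**3
                 - (5.0 / 3.0) * z1**2 + 1.0)
    z2 = z[m2]
    taper[m2] = ((1.0 / 12.0) * z2**5 - 0.5 * z2**4 + 0.625 * z2**3
                 + (5.0 / 3.0) * z2**2 - 5.0 * z2 + 4.0 - (2.0 / 3.0) / z2)
    return taper

# hypothetical vertical grid: 60 levels between 0 and 62 km
z_levels = np.linspace(0.0, 62.0, 60)                  # km
dz = np.abs(z_levels[:, None] - z_levels[None, :])
L_block = gaspari_cohn(dz, 0.5)                        # length scale 0.5 km

# replicate over the 3x3 variable blocks (an assumption of this sketch)
L = np.kron(np.ones((3, 3)), L_block)                  # 180 x 180
```

With ~1 km level spacing this particular length scale yields a nearly diagonal taper; larger candidate scales from the list above spread the weight over more neighbouring levels.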

c. Application of the observation operator
Mapping state variables into observation space is required for two purposes. The first is to generate the synthetic observations (constructed from x^true) to be assimilated. The second is to get Y^b from X^b, as part of the analysis step of the ETKF algorithm. For 31 of the 3200 available profiles (0.96%), the eigenray search did not return the predicted observations. We examined the vertical profiles for which the search for an eigenray failed, and found that they share an average head wind of 10 m/s or more. These conditions do not allow for the physical existence of the desired eigenray. We remove these profiles from the background ensemble. This removal does not change the background statistics beyond the second decimal, with the largest changes in v.
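The ETKF analysis step mentioned above can be sketched compactly in ensemble space. The function below follows the standard symmetric square-root formulation; it is a generic sketch in our own notation, not the exact implementation used in the paper.

```python
import numpy as np

def etkf_update(Xb, Yb, y, R):
    """One ETKF analysis step in ensemble space (symmetric square root).

    Xb: (n, Ne) background ensemble, state space
    Yb: (m, Ne) background ensemble mapped to observation space
    y:  (m,)    observation vector
    R:  (m, m)  observation-error covariance
    """
    Ne = Xb.shape[1]
    xb_mean = Xb.mean(axis=1)
    yb_mean = Yb.mean(axis=1)
    Xp = Xb - xb_mean[:, None]          # state perturbations
    Yp = Yb - yb_mean[:, None]          # predicted-observation perturbations

    Rinv = np.linalg.inv(R)
    # ensemble-space analysis covariance: [(Ne-1) I + Yp^T R^-1 Yp]^-1
    A = (Ne - 1) * np.eye(Ne) + Yp.T @ Rinv @ Yp
    lam, U = np.linalg.eigh(A)
    w_mean = (U / lam) @ U.T @ Yp.T @ Rinv @ (y - yb_mean)
    W = np.sqrt(Ne - 1) * (U / np.sqrt(lam)) @ U.T   # symmetric sqrt transform

    xa_mean = xb_mean + Xp @ w_mean      # analysis mean
    Xa = xa_mean[:, None] + Xp @ W       # analysis ensemble
    return xa_mean, Xa
```

In the METKF the same machinery is applied, with the modulated ensemble supplying the background perturbations.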
Figure 5 shows a subset of the 3169 successful eigenrays found. In this example, we show 5 background trajectories (blue lines) and the one corresponding to the synthetic truth (red line) in the single-observation experiment. The axes correspond to the 3-D spatial dimensions; the propagation occurs from Hukkakero to ARCES, which is almost due north. We can appreciate the complexity of the trajectories, which was lost in the past when we approximated them as simple triangles in the geometric model.
For our experiments, we randomly separate our 3169 profiles into two groups: N_e = 2500 members for X^b, and N_true = 669 truths to assess the DA performance. Figure 6 shows the resulting observation-space values for the 2500 predicted observations. The first row shows the marginal distributions of the observed variables (blue histograms, with a vertical blue line representing the sample mean). We use the Shapiro-Wilk test (Shapiro and Wilk 1965) with significance α = 0.05 to assess the Gaussianity of these distributions. Two of the observed quantities can be considered Gaussian, while the third fails the test (p ~ 10^-6), mostly due to its asymmetry (left skewness). For comparison, we draw the closest Gaussian pdf (the one with the same mean and standard deviation) in each panel.
The second row shows two-dimensional cross-sections of the three-dimensional observation space.
The aim of DA is hence to transform the cloud of predicted observations (blue dots) into a cloud closer to the real observation (dark orange star), and then to map this change into the original physical atmospheric variables. For reference, the means of the observed variables are shown as blue lines. Finally, we need to prescribe observation error parameters. We consider the observation error η to be uncorrelated.
Hence, R is diagonal: R = diag{(0.1 π/180)^2, 1^2, 0.5^2}. Namely, the standard deviations of the observational errors are 0.1 degrees for the azimuth, 1 second for t_travel, and 0.5 m/s for v_trace, which agree with previous works. The observational error is quite low with respect to the spread of predicted observations, so we can expect large changes from Y^b to Y^a in the assimilation.
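A sketch of how such an R and the perturbed synthetic observations can be produced is given below. The numerical values of the noise-free prediction are made up for illustration, and we assume the azimuth error is stored in radians (hence the π/180 conversion).

```python
import numpy as np

rng = np.random.default_rng(0)

# observation-error standard deviations: 0.1 deg (in radians), 1 s, 0.5 m/s
sigma = np.array([0.1 * np.pi / 180.0, 1.0, 0.5])
R = np.diag(sigma ** 2)              # uncorrelated errors -> diagonal R

def synthetic_obs(y_from_truth):
    """y = h(x_true) + eta, with eta ~ N(0, R)."""
    return y_from_truth + rng.normal(0.0, sigma)

# hypothetical noise-free prediction: azimuth (rad), travel time (s),
# trace velocity (m/s); these numbers are placeholders only
y_clean = np.array([5.93, 540.0, 340.0])
y_obs = synthetic_obs(y_clean)
```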

a. Evaluation metrics
For convenience, we present key quantities commonly used to evaluate the performance of a DA system. The analysis increment d^a ∈ R^{N_x} is defined as d^a = x̄^a − x̄^b, i.e., the difference between the analysis and background means, which quantifies the effect observations have in the DA process. Hence, when there is no increment, there is no observational impact.
The OSSE allows for a direct comparison of our results to a synthetic reference truth. A common metric is the root-mean-square error of both x̄^b and x̄^a with respect to the synthetic truth: RMSE^b = sqrt(mean[(x̄^b − x^true)^2]) and RMSE^a = sqrt(mean[(x̄^a − x^true)^2]), where the square is evaluated element-wise, and the average (denoted by the overbar) can be taken in space, over time (when several DA cycles are performed sequentially), or over different experiment realizations. In cases with state variables of different physical units, it is common to compute this metric separately per variable or to use an energy norm; see, e.g., Kalnay (2003). In our numerical experiments, we compute these metrics for each physical variable at each vertical level. Hence, both expressions in Eq (28) are vectors with N_z elements.
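In code, the increment and the realization-averaged RMSE of Eq (28) look as follows. The toy numbers and axis conventions are ours, and only a single variable is shown.

```python
import numpy as np

def analysis_increment(xa_mean, xb_mean):
    """d^a = x̄^a - x̄^b: quantifies the observational impact."""
    return xa_mean - xb_mean

def rmse(est, truth):
    """Per-level RMSE, averaged over realizations (axis 0).

    est:   (n_real, n_levels) mean estimates (background or analysis)
    truth: (n_real, n_levels) synthetic truths
    """
    return np.sqrt(np.mean((est - truth) ** 2, axis=0))

# toy example: 100 realizations of a 60-level profile (hypothetical numbers)
rng = np.random.default_rng(4)
truth = np.zeros((100, 60))
xb = truth + rng.normal(0.0, 2.0, size=truth.shape)   # background errors, sd 2
xa = truth + rng.normal(0.0, 1.0, size=truth.shape)   # analysis errors, sd 1
```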

b. Choices for background ensemble
We apply the (M)ETKF in four cases:

1. A large X^b with N_e = 2500. This produces a matrix P^b indistinguishable from the one prescribed in Subsection a.

2. A small X^b with N_e = 5, which is useful to illustrate errors coming from small samples.

3. A modulated X^b with N_mod = 60. This comes from modulating the small ensemble in point (2) with the N_eig = 12 leading eigenvectors of L.

4. A raw X^b with N_e = 60, to compare against the modulated ensemble.
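The modulation step in case (3) is compact. The sketch below is generic (a made-up smooth L stands in for the paper's Gaspari-Cohn matrix, and the raw perturbations are random draws): each of the K retained scaled eigenvectors of L multiplies, element-wise, each of the N_e raw perturbations, yielding N_mod = K·N_e modulated perturbations whose sample covariance is exactly the localized covariance L_K ∘ P^b.

```python
import numpy as np

rng = np.random.default_rng(1)
n, Ne, K = 180, 5, 12                  # state size, raw members, modes kept

# toy localization matrix: smooth Gaussian decay (stand-in for Gaspari-Cohn)
idx = np.arange(n)
L = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 15.0) ** 2)

# keep the K leading eigenpairs of L (rank-K approximation)
lam, E = np.linalg.eigh(L)             # eigh returns ascending order
lam, E = lam[::-1][:K], E[:, ::-1][:, :K]
L_K = (E * lam) @ E.T

# raw ensemble perturbations (toy draws; the real ones come from X^b)
Xp = rng.standard_normal((n, Ne))
Xp -= Xp.mean(axis=1, keepdims=True)

# modulation: element-wise product of each scaled eigenvector with each member
Nmod = K * Ne
cols = [np.sqrt(lam[k]) * E[:, k] * Xp[:, j]
        for k in range(K) for j in range(Ne)]
Z = np.sqrt((Nmod - 1) / (Ne - 1)) * np.column_stack(cols)

# sample covariance of Z equals the localized background covariance
P_loc = L_K * (Xp @ Xp.T) / (Ne - 1)
```

With K = 12 and N_e = 5 this yields the N_mod = 60 ensemble used above.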
Figures 7 and 8 illustrate aspects of the modulated ensemble. Figure 7 shows the profiles for T(z) (left), u(z) (middle), and v(z) (right). The gray shaded area represents the full background ensemble (N_e = 2500). To avoid cluttering we do not show the individual ensemble members, but instead a region corresponding to the mean plus/minus 3 standard deviations for each vertical level. The dark blue lines correspond to the randomly chosen small ensemble with N_e = 5. The gray dashed lines correspond to the modulated ensemble with N_mod = 60. By construction, the mean and standard deviation of the small and the modulated ensemble coincide. However, the range of values covered by the modulated ensemble is larger. We generated additional modulated ensembles in which we kept fewer eigenvalues of L, in particular N_eig = {3, 6, 9}. However, they resulted in profiles with pronounced wave-like vertical variations related to the spurious features discussed in Figure 4. These were therefore discarded and are not shown, for brevity.
Before moving on, it is interesting to ponder whether the modulation process changes the statistical characteristics of the ensemble. To answer this question, we compare the modulated ensemble (N_mod = 60) with a raw ensemble of the same size. For each variable at each vertical level, we perform a Shapiro-Wilk test with significance α = 0.05 for marginal Gaussianity. The p-values for both ensembles are shown in Figure 8 (note the logarithmic scale of the x-axis).
Let us start with the right panel. If all the variables were independent, the expected fraction of failed local tests with α = 0.05 would be 0.05. Due to the covariance structure, this increases to 0.18. We can take this number as our baseline to pass or fail a field significance test (see, e.g., Livezey and Chen (1983) for the distinction between local and field significance). In the left panel we have the results for the modulated ensemble. In this case, the fraction of failed local tests increases to 0.46. Furthermore, the p-values reach lower values than in the case of the raw ensemble. This suggests that the modulated ensemble has lost Gaussianity with respect to the raw ensemble. Our empirical result agrees with the theoretical loss of Gaussianity in modulated ensembles demonstrated in Chan (2023).
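The local-test counting works as sketched below: one Shapiro-Wilk test per variable, then the fraction of p-values below α. Here we use independent Gaussian draws, so the expected fraction is simply α; the paper's 0.18 baseline arises from the real covariance structure, which this toy ignores.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(2)
n_vars, n_members, alpha = 180, 60, 0.05

# toy "raw ensemble": 180 independent Gaussian variables, 60 members each
sample = rng.standard_normal((n_vars, n_members))

# one local Shapiro-Wilk test per variable; index [1] is the p-value
pvals = np.array([shapiro(sample[i])[1] for i in range(n_vars)])
frac_failed = np.mean(pvals < alpha)   # expected ~ alpha under independence
```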

c. Results
First we show the results for the single-observation experiment. Figure 9 shows the assimilation in observation space, and Figure 10 the background and analysis correlation matrices, C^b and C^a, for the different ensemble sizes. For N_e = 5, C^a is contaminated not only by direct sampling noise, but also by spurious changes coming from the calculation of an erroneous gain. In the modulated case, C^b is reduced to a matrix with thin strips around the main diagonals of the 9 blocks, as expected. This structure is changed when computing C^a, with information spreading to the previously localised areas. For N_mod = 60 members, C^a is considerably similar to the one coming from N_e = 2500, although some sampling noise is still visible in the off-diagonal elements of each block. The assimilation is effective for N_e = 2500 members, with x̄^a being much closer to x^true than x̄^b.
Furthermore, there is a reduction in the spread from X^b to X^a. The case N_e = 5 shows different results. Here, x̄^a is in general further from x^true than x̄^b. We see pronounced errors at levels above z = 35 km and below z = 10 km. This happens despite the fact that the assimilation seems successful in observation space for the same ensemble size (Figure 9). This is because the spurious correlation structures render the projection into model space defective. Moreover, the analysis ensemble spread is too small (as expected from the small sample size), rendering an overconfident and inaccurate ensemble. The third column shows the modulated ensemble. The estimation is not as accurate as with N_e = 2500, but it avoids the incorrect update that occurred for N_e = 5. In general, x̄^a is closer to x^true than x̄^b. The incorrect updates in the uppermost levels (which happened with N_e = 5) are no longer a problem when using the modulated ensemble. Using a raw ensemble with N_e = 60 yields results very similar to the case with N_e = 2500, which is not unexpected since this is a relatively small problem (N_z = 60).
After the visual examination of the results in Figure 11, we perform a more systematic quantification of the DA performance. For this purpose, we repeat the DA experiment for the N_true = 669 truths, each with a respective set of observations. These are generated with independent realizations of η.
In Figure 12 we compute both RMSE^b (blue line) and RMSE^a (purple line), as prescribed in Eq (28), for each variable and vertical level separately. The top row shows the case N_e = 2500. Here, the assimilation always improves the estimation, which is clear since the purple line is always to the left of the blue line. The largest error reduction occurs for the wind components u(z) and v(z), with an error reduction close to 2 m/s around z = 30 km for both variables.
The second row corresponds to N_e = 5 and the third row to N_mod = 60. Both ensembles share the same x̄^b, and hence they have the same RMSE^b. As expected, the small ensemble (N_e = 5) yields RMSE^a > RMSE^b for most vertical levels. This implies that assimilating the observation considerably damages the estimation, due to the erroneous increments coming from sampling errors. These spurious errors can reach 3 K for T(z) and 4 m/s for the winds. The use of the modulated ensemble is effective in eliminating these errors for the three variables. For T(z), RMSE^a ≈ RMSE^b, but at least there are no negative consequences from the assimilation. For both u(z) and v(z), there is a noticeable error reduction due to the assimilation in this modulated case. The last row shows the case with N_e = 60 members, which has RMSE values very similar to those coming from N_e = 2500 members. The performance of the modulated ensemble with N_mod = 60 members is considerably better than that with N_e = 5 members, but slightly worse than the raw ensemble with N_e = 60 members.

d. The impact of assimilating different observations
When designing new observation systems and instruments, or devising new uses for existing ones, one needs to determine the impact that the assimilation of these observations has on the state variables. This is often referred to as the forecast sensitivity to observations (FSOI). Figure 13 collects the analysis increments obtained when assimilating each observation individually and all three together. One of the observed quantities mainly produces increments towards the top of the column, while assimilating v_trace leads to moderate increments throughout the column, except for a region around 50 km.

Summary and discussion
The objective of this paper was to implement an ensemble Kalman filter to assimilate infrasound-related observations in order to update an atmospheric vertical profile. In this method, the quality of the sampling estimators for the mean and covariance is crucial, and these estimators often come from limited-size ensembles. Localization is a method often used in the EnKF community to reduce sampling errors. Infrasound-related observations contain spatially-integrated information, however, and can be difficult to localize. To achieve localization nonetheless, we borrowed a technique from the satellite DA community: the modulated ensemble Kalman filter.
We performed state estimation experiments to update three height-dependent fields: cross wind, tail wind, and temperature. We used four ensemble configurations: a large ensemble with N_e = 2500, a small ensemble with N_e = 5, a modulated ensemble with N_mod = 60, and a raw ensemble with N_e = 60 (to compare against the modulated ensemble). The assimilation process in the three-dimensional observation space was successful in all cases. The mapping back to model space proved to be the main challenge. With the 2500-member ensemble, we were able to reduce the RMSE of the analysis mean in the middle atmosphere (around 30 km) with respect to that of the background mean by about 2 m/s for the horizontal winds and about 1 K for the temperature. Using a 5-member ensemble led to poor results (as expected), with the RMSE of the analysis mean being larger than that of the background. The use of a 60-member modulated ensemble (5 original members modulated by the 12 leading eigenvectors of the designed localization matrix) helped eliminate spurious increments.
The RMSE of the analysis mean was seldom larger than that of the background, and we even obtained some reductions (although smaller than for the large ensemble). A 60-member raw ensemble showed results similar to those of the 2500-member ensemble, and slightly better than those of the modulated ensemble. With the large-ensemble setting, we also performed sensitivity experiments in which we related the assimilation of each of the three observations by itself to the impacts it has on the vertical profiles at different levels.
This study demonstrates the possibility of assimilating infrasound observations using ensemble DA with a full ray-propagation model as the observation operator. Furthermore, we have shown how satellite DA techniques can be readily applied to ensure appropriate localization and therefore make small ensembles useful. While we have shown some quantitative evidence of the improvement in the estimation of quantities in vertical columns, it is important to remember that the ultimate aim is to assimilate novel observations (in our case, the infrasound-related quantities) in tandem with traditionally available observations such as radiosondes (see, e.g., Francis et al. (2023) for a discussion of the role of radiosondes as anchor observations in remote sensing).
Natural next steps include moving to a range-dependent propagation regime and assessing whether extra steps (e.g., Gaussian anamorphosis) are needed.
This study has underlined the possibility of exploiting under-used datasets. Middle- and upper-atmospheric observations are considerably sparser (both spatially and temporally) than those covering the lower atmosphere. Therefore, any existing observations, albeit indirect, are potentially useful in NWP.
There is a call for assimilating more wind-related observations in the middle atmosphere, and it is notable that 3DVar assimilation experiments have also been performed for winds in the MLT using meteor radar multi-station datasets (Stober et al. 2021). Our long-term goal is to assimilate infrasound-related observations from natural sources such as ocean-generated microbaroms. These processes are continuous in time and space. This provides a constant observational record, but also poses challenges on the modelling and assimilation fronts.

Fig. 1 .
Fig. 1. Background correlation matrix obtained from a regional sample of MERRA2 data, used in our experiments. The diagonal blocks correspond to correlations between vertical levels of the same physical variable. The off-diagonal blocks correspond to correlations between vertical levels of different physical variables.

Figures 3
Figures 3 and 4 illustrate the construction of the low-rank approximation to L. Figure 3 is a scree plot, showing the cumulative sum of the 180 normalized eigenvalues of L. We keep the first 12 (shown in red), which account for approximately 86% of the variability. In Figure 4 we show reconstructions with an increasing number of eigenvalues kept, N_eig = {3, 6, 9, 12, 15}. In the bottom right panel we show the exact L. The case with N_eig = 12 eigenvalues kept renders a very good approximation, with no visible spurious features like the ones observed in the panels of the first row.

Fig. 6 .
Fig. 6. Resulting predicted values in observation space for each background ensemble member (blue), and the true observation (dark orange). The first row assesses marginal Gaussianity for the observed quantities.

Fig. 7 .
Fig. 7. Illustration of two background ensembles. Each panel corresponds to vertical profiles of a different physical variable. The dark blue lines correspond to the small ensemble (N_e = 5), and the dashed grey lines to the modulated ensemble (N_mod = 60). The gray shaded area corresponds to the mean plus/minus three standard deviations coming from the full background ensemble (N_e = 2500).
Fig. 8. Marginal Gaussianity for the state variables in two ensembles with 60 members. The left panel corresponds to a modulated ensemble, and the right panel to a raw ensemble.

Fig. 9 .
Fig. 9. Results in observation space of the application of the (M)ETKF in the single-observation experiment. Each row corresponds to a different ensemble size, and each column to a different observed quantity. The violin plots represent the background (cyan/blue) and analysis (pink/purple); the ticks correspond to the minimum, mean, and maximum values. The black star corresponds to the given observation.
Fig. 10. Background (top) and analysis (bottom) correlation matrices involved in the DA experiments with infrasound observations.

Figure 10
Figure 10 depicts the sample background (top) and analysis (bottom) correlation matrices, which we label C^b and C^a respectively. We display these instead of the covariance matrices to avoid masking of the structures by the different magnitudes of the variables. Once more, we add thick black lines to denote the 9 blocks they are made of. In the upper left, we have the C^b generated with N_e = 2500. This is indistinguishable from the prescribed C. The corresponding C^a has structures similar to those found in C^b, but some changes are noticeable. For N_e = 5, C^b is visibly contaminated by sampling noise, as expected, and this noise carries over to the corresponding C^a.

Fig. 11 .
Fig. 11. Model space results for the single-observation experiments. Vertical profiles of the tail wind are shown for different ensemble sizes. Dark lines correspond to the reference truth (black), background mean (dark blue), and analysis mean (dark purple). Light shaded areas correspond to plus/minus 1 standard deviation (cyan for background, pink for analysis).

Figure 11
Figure 11 displays the DA elements in model space. For brevity, we only display the results for the tail wind, which was the variable that benefited most from the DA process (the results for the other two variables are qualitatively similar, but less pronounced). Each column corresponds to a different ensemble size. Each panel contains several lines: the true profile is shown with a thick black line, the background is shown in blue (dark blue for the mean, shaded cyan region for plus/minus 1 standard deviation), and the analysis is shown in purple (dark purple for the mean, shaded pink region for plus/minus 1 standard deviation).

Fig. 12 .
Fig. 12. Evaluation of the performance of the (M)ETKF in terms of the RMSE of x̄^b (blue lines) and x̄^a (purple lines) with respect to x^true, for different ensemble sizes (rows) and variables (columns).

Fig. 13 .
Fig. 13. Analysis impact from assimilating different observations (columns): only the azimuth, only t_travel, only v_trace, and all three at the same time. The analysis increments are collected over 669 experiments. Each row shows one of the three variables (T, u, v). For each panel, we show percentiles of the increment.