## 1. Introduction

An appropriate representation of the covariance structure in spatial models of meteorological variables is essential when analyzing meteorological data (Gandin 1965, 21–121; Kalnay 2003) using data assimilation (Hollingsworth and Lönnberg 1986; Evensen 1994; Bonavita et al. 2012; Pu et al. 2016). This generally requires an appropriate representation of the background error covariance matrix. Further, spatial stochastic models for meteorological variables should respect physical relationships.

One of the first approaches to include physical consistency via differential relations between variables can be found in Kolmogorov (1941). Thiébaux (1977) introduced a covariance model for wind fields assuming geostrophic balance, thereby incorporating anisotropy in the geopotential height. Daley (1985) derived a covariance model for the horizontal wind components by assuming a Gaussian covariance model for the velocity potential and the streamfunction and applying the differential relations between the potentials and the wind field. The covariance model proposed by Daley (1985) is rather flexible as it allows for geostrophic coupling, nonzero correlation of the streamfunction and velocity potential, and differing scales for the two potentials. Daley (1985) also considered geopotential height as an additional model variable. However, the resulting covariance function for the wind fields is not positive definite for many parameter combinations. Hollingsworth and Lönnberg (1986) adapted Daley’s method and formulated a covariance function for the potentials using cylindrical harmonics. They showed that on the synoptic scale the correlation between the potentials is small, and Daley (1991) accordingly reformulated his model for zero correlations. These approaches (Thiébaux 1977; Hollingsworth and Lönnberg 1986; Daley 1985) as well as our model differ from current data assimilation methods, as they provide an explicit, parametric, and analytic covariance model for the background error. So-called control variable transform methods (Bannister 2008) describe the background error covariance matrix in an implicit nonparametric way via its square root^{1} using latent variables that model the physical variables. Sample-based methods like the ensemble Kalman filter (Evensen 1994) describe the error statistics based on estimates obtained from an ensemble.

The data assimilation literature (e.g., Thiébaux 1977; Hollingsworth and Lönnberg 1986; Daley 1985) typically uses stochastic models to describe the covariance matrix of the background error, which is the difference between a forecast and the true field. Similar methods have also been used to describe the full turbulent field (Frehlich et al. 2001). There has also been considerable interest in describing the statistics of the velocity field directly or via its spectrum (Bühler et al. 2014; Lindborg 2015; Bierdel et al. 2016).

While Thiébaux (1977), Hollingsworth and Lönnberg (1986), and Daley (1985) include physical relations via differentiation of the covariance function, finite difference operators are used in Bayesian hierarchical models. For example, Royle et al. (1999) modeled the geostrophic relation of the pressure and wind fields.

In this paper, we propose a multivariate Gaussian random field (GRF) formulation for six atmospheric variables in a horizontal two-dimensional Cartesian space. Assuming a bivariate Matérn covariance for a streamfunction *ψ* and velocity potential *χ*, we derive the covariance structure of the horizontal wind components, vorticity, and divergence, all of which are obtained from the potentials by differentiation.
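For orientation, the differential relations that link the six variables can be written out under the sign convention commonly used in meteorology for the Helmholtz decomposition. The symbols $u$, $v$, $\zeta$, and $D$ (zonal wind, meridional wind, vorticity, and divergence) are assumed here for illustration, and the signs may differ from the convention adopted by the authors:

```latex
\begin{aligned}
u &= -\frac{\partial \psi}{\partial y} + \frac{\partial \chi}{\partial x}, &
v &= \frac{\partial \psi}{\partial x} + \frac{\partial \chi}{\partial y}, \\
\zeta &= \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = \Delta \psi, &
D &= \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = \Delta \chi .
\end{aligned}
```

Substituting the first line into the definitions of $\zeta$ and $D$ shows that the mixed derivatives cancel, so vorticity depends only on the streamfunction and divergence only on the velocity potential.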

Our multivariate GRF formulation is novel for several reasons. First, while Daley (1985), for example, only used the potentials to derive the covariance function of the wind fields, our model is formulated for all related variables, including a formulation for the potential functions and the wind field, as well as vorticity and divergence. Second, our model provides a formulation for anisotropy in the wind field and the related potentials. Further, we allow for nonzero correlations between the rotational and divergent wind components, which might be particularly relevant for atmospheric fields on subgeostrophic scales. We show that the scale parameters considered by Daley (1985) are inconsistent with nonzero correlations between the streamfunction and velocity potential, as they do not lead to a positive definite model. An exact derivation of the condition under which the covariance function of Daley’s model is positive definite is given in appendix A. Further, our model is a counterexample to a theorem of Obukhov (1954), which claims that there is no isotropic wind field with nonzero correlation of the rotational and nonrotational components of the wind field. More details on Obukhov’s claim are given in appendix B.

The covariance function of our multivariate GRF will be incorporated into an upcoming version of the spatial statistics R package RandomFields (Schlather et al. 2016). This opens the possibility for a wealth of applications in spatial statistics, including the conditional simulation of the streamfunction and velocity potential given an observed wind field, a consistent formulation of the covariance structure for both the potentials and the horizontal wind components to be used in data assimilation, or stochastic interpolation (kriging) of each of the involved variables given the others. Kriging is the process of computing the conditional expectation of a certain variable given others. It is typically used to interpolate fields.
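To make the notion of kriging as a conditional expectation concrete, the following is a minimal sketch of simple kriging for a univariate, zero-mean, unit-variance Gaussian field on a line. It is not the multivariate model of this paper: the function names, the one-dimensional setting, and the Matérn correlation with smoothness 3/2 are assumptions chosen only because they keep the example short.

```python
import numpy as np

def matern32(r, scale=1.0):
    """Matern correlation with smoothness 3/2 (closed form); r is a distance."""
    a = np.sqrt(3.0) * np.abs(r) / scale
    return (1.0 + a) * np.exp(-a)

def simple_krige(x_obs, z_obs, x_new, scale=1.0):
    """Conditional mean and variance of a zero-mean, unit-variance Gaussian
    field at x_new, given noise-free observations z_obs at the sites x_obs."""
    x_obs = np.asarray(x_obs, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    K = matern32(x_obs[:, None] - x_obs[None, :], scale)  # obs-obs covariance
    k = matern32(x_new[:, None] - x_obs[None, :], scale)  # new-obs covariance
    weights = np.linalg.solve(K, k.T).T                   # kriging weights
    mean = weights @ np.asarray(z_obs, dtype=float)       # conditional mean
    var = 1.0 - np.sum(weights * k, axis=1)               # conditional variance
    return mean, var

# Hypothetical observation sites and values, for illustration only.
x_obs = [0.0, 1.0, 2.5]
z_obs = [0.3, -0.1, 0.8]
mean, var = simple_krige(x_obs, z_obs, [1.0, 1.7], scale=1.5)
```

At an observed site the conditional mean reproduces the observation and the conditional variance vanishes; between sites the variance quantifies the interpolation uncertainty.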

To exemplify the multivariate GRF, we estimated its parameters for atmospheric fields of the numerical ensemble weather prediction system, COSMO-DE-EPS (Gebhardt et al. 2011), provided by the German Meteorological Service (DWD). COSMO-DE is a high-resolution forecast system that provides forecasts on the atmospheric mesoscale (Baldauf et al. 2011). Estimation is realized using the maximum likelihood method, while uncertainty in the parameter estimation is assessed by parametric bootstrap (Efron and Tibshirani 1994). We also discuss the meteorological relevance of the parameters.

The remainder of the paper is organized as follows. In section 2 we introduce the multivariate GRF and demonstrate how the physical relations and anisotropy are included in the model formulation. Section 3 introduces the COSMO-DE-EPS data. Section 4 is devoted to the parameter estimation and the assessment of the uncertainties, while section 5 presents and interprets the results of the estimation. We conclude in section 6 and discuss potential applications, limits, and extensions of our multivariate GRF.

## 2. Theory

An important aspect of our multivariate GRF is the inclusion of the differential relations between the atmospheric variables. Under weak regularity assumptions the derivative of a Gaussian process is again a Gaussian process (Adler and Taylor 2007). Hence, the assumption of Gaussianity of the streamfunction and the velocity potential implies Gaussianity of all the considered variables. Because a zero-mean Gaussian process is uniquely characterized by its covariance function, we only need to study the joint covariance of a random field and its derivatives. A Gaussian process

*i*th coordinate direction. In this case, we use the following notation:

*ψ*, velocity potential

*χ*, and the Laplacian of the potentials (i.e., vorticity

*C* fulfills for all rotation matrices

*d*-dimensional identity matrix and

**h** and the random vector are rotated simultaneously.
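Returning to the differential relations introduced at the start of this section, the joint covariance of a field and its derivatives follows from differentiating the covariance function. As a sketch in assumed notation (a stationary, sufficiently smooth random field $Z$ with $C(\mathbf{h}) = \mathrm{Cov}[Z(\mathbf{x}+\mathbf{h}), Z(\mathbf{x})]$ and $\partial_i$ denoting the partial derivative in the $i$th coordinate direction; this is not necessarily the notation of the authors), the standard identities read:

```latex
\begin{aligned}
\mathrm{Cov}\!\left[\partial_i Z(\mathbf{x}+\mathbf{h}),\, Z(\mathbf{x})\right] &= \partial_i C(\mathbf{h}), \\
\mathrm{Cov}\!\left[Z(\mathbf{x}+\mathbf{h}),\, \partial_j Z(\mathbf{x})\right] &= -\partial_j C(\mathbf{h}), \\
\mathrm{Cov}\!\left[\partial_i Z(\mathbf{x}+\mathbf{h}),\, \partial_j Z(\mathbf{x})\right] &= -\partial_i \partial_j C(\mathbf{h}).
\end{aligned}
```

The minus signs arise because differentiating with respect to the second argument of $C(\mathbf{s}-\mathbf{t})$ flips the sign of the lag.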

Our GRF is a counterexample to a theorem of Obukhov (1954), which claims that the rotational and divergent components of isotropic vector fields are necessarily uncorrelated, which is equivalent to the streamfunction and velocity potential being uncorrelated. Obukhov considers an invalid expression for the covariance of a rotational field and deduces from this expression that it is necessarily uncorrelated with a gradient field. We present the detailed argument in appendix B.

*ν* and

Figure 1 represents a realization of the full stochastic process, with parameters chosen in order to illustrate the flexibility of the model. The rotational wind component is larger than the divergent wind component with a ratio of

## 3. Data

The horizontal wind fields are taken from the numerical weather prediction (NWP) model COSMO-DE, namely the wind fields at model level 20 (i.e., at approximately 7-km height). COSMO-DE is the operational version of the nonhydrostatic limited-area NWP model Consortium for Small-Scale Modeling (COSMO) operated by DWD (Baldauf et al. 2011). It provides forecasts over Germany and surrounding countries on a 2.8-km horizontal grid and 50 vertical levels. At this grid size, deep convection is permitted by the dynamics, and COSMO-DE is able to generate deep convection without an explicit parameterization thereof. Thus, COSMO-DE particularly aims at the prediction of mesoscale convective precipitation with a forecast horizon of up to 1 day. The ensemble prediction system (COSMO-DE-EPS) uses COSMO-DE with different lateral boundary conditions (LBC), perturbed initial conditions, and slightly modified parameterizations. The four LBCs are generated by the Global Forecast System of NCEP, the Global Model of DWD, the Integrated Forecast System of ECMWF, and the Global Spectral Model of the Japan Meteorological Agency. For details on the setup of COSMO-DE-EPS, the reader is referred to Gebhardt et al. (2011), Peralta et al. (2012), and references therein.

In our application we concentrate on a COSMO-DE forecast for 1200 UTC 5 June 2011 initialized at 0000 UTC. COSMO-DE-EPS provides 20 forecasts of horizontal wind fields on a grid with 461 × 421 grid points. Each of the four LBCs forces five ensemble members. Members that share an LBC differ only as a result of perturbed initial conditions and four different parameterizations. Thus differences between the members with identical LBCs are mainly due to small-scale internal dynamics. We consider the differences obtained by subtracting two fields that have been generated using the same lateral boundary conditions. All combinations of fields with different model physics and identical lateral boundary conditions generate a set of 40 different fields of differences (4 LBCs with 10 member pairs each). The differences are referred to as inner-LBC anomalies.
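The count of 40 inner-LBC anomaly fields can be checked with a short enumeration. The sketch below is illustrative only: the LBC labels and member indices are hypothetical placeholders, and fields are represented as nested lists rather than in the format of the actual COSMO-DE-EPS output.

```python
from itertools import combinations

# Hypothetical labels for the four driving models and five members per LBC.
LBCS = ["NCEP", "DWD", "ECMWF", "JMA"]
MEMBERS = [1, 2, 3, 4, 5]

def inner_lbc_pairs(lbcs, members):
    """All unordered member pairs that share an LBC but differ otherwise."""
    return [((lbc, a), (lbc, b))
            for lbc in lbcs
            for a, b in combinations(members, 2)]

def anomaly(field_a, field_b):
    """Pointwise difference of two fields given as nested lists (rows)."""
    return [[x - y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(field_a, field_b)]

pairs = inner_lbc_pairs(LBCS, MEMBERS)
print(len(pairs))  # 4 LBCs x C(5, 2) = 4 x 10 = 40
```

Each pair would then be passed through `anomaly` to obtain one field of differences.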

*c* close to zero. We chose

## 4. Parameter estimation

Here, ***θ*** denotes the parameter vector and *N* controls for which separations **h** the likelihood is computed. The set *N* has to be determined relative to the given problem. If feasible, it should include all lags **h** for which there is nonnegligible dependence and some for which there is negligible dependence, in order to estimate the range. One way of determining this is to inspect the empirical covariance estimate. We chose *N* to be a regular 41 × 41 grid with step size one, which is centered at the origin. The choice is justified by the low uncertainties observed in the parametric bootstrap samples presented below.
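The role of the lag set *N* can be illustrated with a generic pairwise Gaussian composite log-likelihood. The sketch below assumes that CL is of this pairwise type; the precise form used by the authors is not reproduced here. For brevity it works on a one-dimensional, zero-mean, unit-variance series with an exponential correlation, and all function names are hypothetical.

```python
import math
import random

def exp_corr(lag, scale):
    """Exponential correlation, used here as a simple stand-in model."""
    return math.exp(-abs(lag) / scale)

def pairwise_cl(z, lags, scale):
    """Pairwise Gaussian composite log-likelihood of a zero-mean,
    unit-variance series z, summed over all pairs separated by a lag in lags."""
    total = 0.0
    for h in lags:
        rho = exp_corr(h, scale)
        q = 1.0 - rho * rho
        const = -math.log(2.0 * math.pi) - 0.5 * math.log(q)
        for s in range(len(z) - h):
            x, y = z[s], z[s + h]
            total += const - (x * x - 2.0 * rho * x * y + y * y) / (2.0 * q)
    return total

def simulate_ar1(n, scale, seed=0):
    """Unit-variance AR(1) series; its correlation at lag h is exp(-h / scale)."""
    rng = random.Random(seed)
    phi = math.exp(-1.0 / scale)
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        z.append(phi * z[-1] + math.sqrt(1.0 - phi * phi) * rng.gauss(0.0, 1.0))
    return z

z = simulate_ar1(2000, scale=3.0)
lags = range(1, 6)  # the lag set N (positive lags only, in this 1-D toy case)
candidates = [0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 30.0]
best = max(candidates, key=lambda s: pairwise_cl(z, lags, s))
```

Restricting the sum to a bounded lag set keeps the cost linear in the number of grid points, which is the practical motivation for choosing *N* no larger than the range of nonnegligible dependence requires.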

The unknown parameters are the variances of the potentials, their correlation *ρ*, the smoothness parameter *ν*, and the scale parameters *θ* of the anisotropy.

To reduce the number of parameters, we use the correlation function instead of the covariance function, which only depends on the ratio and not on the magnitude of the variances of streamfunction and velocity potential (Daley 1991). This is possible as we can estimate the variance of the zonal and meridional wind with very low uncertainty owing to the large size of the considered grid.

The CL was maximized using the built-in function “optim” of R (R Core Team 2015). To show that the result of the optimization does not depend on the initial values, it was started 50 times with varying initial parameters. This indicates that the likelihood function has a single global maximum.

An analytic measure of parameter uncertainty such as the Fisher information is not available for our problem. We thus resort to a parametric bootstrap (Efron and Tibshirani 1994) to assess the uncertainty of the parameter estimates. We simulated the multivariate GRF using circulant embedding (Wood and Chan 1994) to obtain independent realizations of the fitted process. Reestimating the parameters for a sample of 100 independent realizations provides the uncertainty of the parameter estimates given that the estimated model is true. The simulation of the data was made possible by the implementation of the considered covariance model in an upcoming version of the spatial statistics package RandomFields (Schlather et al. 2016). The parametric bootstrap describes the estimation uncertainty based on the assumption that the model is sufficiently close to the data. It cannot assess the uncertainty related to the modeling error. As the considered data deviate from a Gaussian distribution and are only approximately stationary, this error is presumably not negligible.
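The logic of the parametric bootstrap (fit, simulate from the fitted model, refit, and inspect the spread of the refitted parameters) can be sketched on a deliberately simple toy model. The example below replaces the multivariate GRF by an i.i.d. Gaussian sample and the likelihood maximization by moment estimates; these substitutions and all function names are assumptions made only to keep the sketch self-contained.

```python
import math
import random
import statistics

def fit(sample):
    """Estimation step of the toy model: mean and standard deviation."""
    return statistics.fmean(sample), statistics.stdev(sample)

def simulate(params, n, rng):
    """Simulation step: draw a new data set from the fitted model."""
    mu, sigma = params
    return [rng.gauss(mu, sigma) for _ in range(n)]

def parametric_bootstrap(data, n_boot=100, seed=0):
    """Refit the model on n_boot data sets simulated from the fitted model."""
    rng = random.Random(seed)
    fitted = fit(data)
    replicates = [fit(simulate(fitted, len(data), rng)) for _ in range(n_boot)]
    return fitted, replicates

# Synthetic "observations", for illustration only.
data_rng = random.Random(42)
data = [data_rng.gauss(2.0, 1.5) for _ in range(400)]

fitted, reps = parametric_bootstrap(data)
se_mean = statistics.stdev([m for m, _ in reps])  # bootstrap standard error
```

As in the text, the resulting spread describes the estimation uncertainty under the assumption that the fitted model is true; it says nothing about model misspecification.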

## 5. Results

Figure 4 shows the estimates of the parameters of the multivariate GRF and the respective distribution of the parametric bootstrap estimates as a boxplot. The ratio of divergent and rotational wind is estimated to about

Figure 4b compares the statistical estimate for

The correlation *ρ* between the streamfunction and velocity potential is almost zero. The smoothness parameter *ν* is close to 1.24. This corresponds to noncontinuous fields of vorticity and divergence. This relatively low value of *ν* is not due to noise in the data. We tentatively included a noise parameter in the estimation, but the estimation set it to zero (not shown). As a measure for the anisotropy we consider the ratio of the scale parameters

Figure 5 shows the empirical estimate of the correlation structure of the data and the correlation obtained for the maximum likelihood estimation. Again the scale and the orientation of the correlation are very well matched. The

The implementation of our covariance model in an upcoming version of the R package RandomFields (Schlather et al. 2016) allows for the simulation of large fields with a size of the order of 800 × 800 grid points. This is made feasible by using circulant embedding, introduced by Wood and Chan (1994). Circulant embedding is a powerful simulation technique that, to the best of our knowledge, has not yet been used for the simulation of wind fields.
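The idea behind circulant embedding can be sketched in one dimension for a scalar field. The example below is not the RandomFields implementation and does not handle the multivariate two-dimensional case of this paper; the function name and the exponential covariance are assumptions chosen for brevity. The covariance on the grid is embedded into a circulant matrix, whose eigenvalues are obtained by a fast Fourier transform, so that a realization costs O(m log m) operations.

```python
import numpy as np

def circulant_embedding_1d(cov, n, rng):
    """Two independent zero-mean Gaussian realizations on the grid 0..n-1
    with stationary covariance cov(lag), via a minimal circulant embedding."""
    c = cov(np.arange(n, dtype=float))
    row = np.concatenate([c, c[-2:0:-1]])   # first row of the circulant matrix
    m = row.size                            # m = 2 * (n - 1)
    lam = np.fft.fft(row).real              # eigenvalues of the circulant matrix
    if lam.min() < -1e-10 * lam.max():
        raise ValueError("embedding is not nonnegative definite; enlarge it")
    lam = np.clip(lam, 0.0, None)           # remove tiny negative round-off
    xi = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    y = np.fft.fft(np.sqrt(lam / m) * xi)
    # Real and imaginary parts are independent, each with covariance cov.
    return y.real[:n], y.imag[:n]

rng = np.random.default_rng(0)
u1, u2 = circulant_embedding_1d(lambda lag: np.exp(-lag / 5.0), 64, rng)
```

If some eigenvalues are negative, the embedding has to be enlarged or approximated, which is the main practical limitation of the method.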

Figure 6 shows the zonal wind anomalies from Fig. 2 together with a realization of the fitted multivariate GRF, which has been scaled with the spatial variance that has not been resolved by the transformation [(10)]. It shows that the orientation as well as the spatial scale of the zonal wind fields match very well. The multivariate GRF shows less extreme values and fewer values very close to zero, owing to the assumption of Gaussianity. However, the visual agreement is quite good, so we conclude that the multivariate GRF formulation represents a useful stationary, multivariate Gaussian random field approximation of mesoscale wind anomalies.

## 6. Conclusions

In this paper we introduce a multivariate GRF that jointly models the streamfunction, velocity potential, the two-dimensional wind field, vorticity, and divergence. Its flexibility allows for different variances of the potential functions, anisotropy, and a flexible smoothness parameter. Further, the model is able to represent nonzero correlation of the divergent and nondivergent wind components. All parameters of the proposed covariance model have direct meteorological interpretation, such that they provide meteorological insight into the dynamics of the atmosphere. Further, the model allows us to easily implement meteorological balances such as nondivergence or geostrophy.

We have reviewed the theory that guarantees the existence of derivatives of stochastic processes, developed a complex covariance model for various atmospheric variables, and studied its transformation subject to anisotropy. Our multivariate GRF is a counterexample to a theorem of Obukhov (1954), which claims that the rotational and divergent components of an isotropic vector field are necessarily uncorrelated.

We have developed an estimation technique and shown its performance for wind anomalies of a mesoscale ensemble prediction system (COSMO-DE-EPS). A parametric bootstrap method provides estimates of the uncertainty implicit in our estimation technique. We thus provide estimates for the ratio of variances of the rotational and divergent wind components without numerical approximations. Numeric estimates suffer from a truncation error, which arises owing to the numerical scheme that computes the derivatives of the wind field.
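The truncation error mentioned above can be made tangible with a one-dimensional toy example. The sketch below applies a second-order central difference to a function with a known derivative; the choice of `math.sin`, the evaluation point, and the step sizes are arbitrary and stand in for the differentiation of a gridded wind field.

```python
import math

def central_difference(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 1.0
errors = {}
for h in (0.4, 0.2, 0.1):
    # Truncation error: difference between the scheme and the exact derivative.
    errors[h] = abs(central_difference(math.sin, x0, h) - math.cos(x0))
    print(f"h = {h:.1f}  truncation error = {errors[h]:.2e}")
```

Halving the step size reduces the error by roughly a factor of four, so the error never vanishes on a fixed grid, whereas a covariance-based estimate works with the analytic derivatives of the covariance function.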

The multivariate GRF formulation may be particularly useful for global atmospheric models with a spectral representation of the horizontal fields, such as the ECHAM climate model (Roeckner et al. 2003). Spectral models solve the prognostic equations for the potentials instead of the horizontal wind components, whereas the observations are given as horizontal wind vectors. Our multivariate GRF formulation provides a consistent formulation of the covariance structure for both the potential and the horizontal wind components. A stochastic formulation of the potentials may also be relevant for the assimilation of measurements of the vertical velocity (Bühl et al. 2015), which provide proxies for the horizontal divergence of the field. Our covariance function represents the divergence within a stochastic model, which is needed to assimilate the observations.

The proposed covariance model can be used to interpolate observed wind fields and to compute the associated derivative fields. This is feasible either by conditional simulation or kriging. Numerical methods have been used for interpolation (Schaefer and Doswell 1979) and the computation of derivatives of vector fields (Caracena 1987; Doswell and Caracena 1988). While numerical methods become significantly more complex for scattered observations, the multivariate GRF formulation provides an accessible approach to both problems and additionally provides information about the uncertainty.

The fields obtained by computing the expectation of the velocity potential and the streamfunction given a certain wind field can be shown to solve the differential equations of the Helmholtz equation. In this sense, our covariance model can be used via kriging to solve the Helmholtz equation. As stochastic models describe the uncertainty of all of the variables, these methods even allow stochastic error bands to be computed for the solution of the partial differential equations.

Another potential application is the stochastic simulation of the transport of tracer variables such as aerosols or humidity in the atmosphere. Stochastic models that describe gradient fields and their divergence have been considered in the literature (Scheuerer and Schlather 2012). However, to the best of our knowledge no stochastic model has been formulated to jointly model spatial wind fields and their divergence. Both variables are needed to describe the transport adequately.

Our methods show that both physical coherence and geostrophic constraints can be easily implemented into a covariance model. Further, we have illustrated that the model parameters can be estimated with very small uncertainty on data simulated by our model.

## Acknowledgments

Rüdiger Hewer was funded by VolkswagenStiftung within the project “Mesoscale Weather Extremes: Theory, Spatial Modeling and Prediction (WEX-MOP).” Data used in this study are kindly provided by the German Meteorological Service (DWD). We thank Chris Snyder and an anonymous reviewer for the thoughtful comments, which improved our paper substantially. We are especially grateful to the reviewer for the idea to transform the data such that our model assumptions are more appropriate. We thank Sebastian Buschow for help in preparing the data.

## APPENDIX A

### Positive Definiteness of Daley’s Model

*φ*. This is equivalent to a condition equivalent to If

## APPENDIX B

### Obukhov’s Independence Claims

*P*. Using the nondivergence of a rotational field, Obukhov deduces from his assumption the following: This differential equation is solved by the function If

*P*, as the curl operator derives the first component in direction

*P*.

## APPENDIX C

### Formulas of the Isotropic Covariance Model

## REFERENCES

Adler, R. J., and J. E. Taylor, 2007: *Random Fields and Geometry*. Springer Monographs in Mathematics, Vol. 17, Springer, 448 pp., doi:10.1007/978-0-387-48116-6.

Baldauf, M., A. Seifert, J. Förstner, D. Majewski, M. Raschendorfer, and T. Reinhardt, 2011: Operational convective-scale numerical weather prediction with the COSMO model: Description and sensitivities. *Mon. Wea. Rev.*, **139**, 3887–3905, doi:10.1175/MWR-D-10-05013.1.

Bannister, R. N., 2008: A review of forecast error covariance statistics in atmospheric variational data assimilation. II: Modelling the forecast error covariance statistics. *Quart. J. Roy. Meteor. Soc.*, **134**, 1971–1996, doi:10.1002/qj.340.

Bierdel, L., C. Snyder, S.-H. Park, and W. C. Skamarock, 2016: Accuracy of rotational and divergent kinetic energy spectra diagnosed from flight-track winds. *J. Atmos. Sci.*, **73**, 3273–3286, doi:10.1175/JAS-D-16-0040.1.

Bonavita, M., L. Isaksen, and E. Hólm, 2012: On the use of EDA background error variances in the ECMWF 4D-Var. *Quart. J. Roy. Meteor. Soc.*, **138**, 1540–1559, doi:10.1002/qj.1899.

Bühl, J., R. Leinweber, U. Görsdorf, M. Radenz, A. Ansmann, and V. Lehmann, 2015: Combined vertical-velocity observations with Doppler lidar, cloud radar and wind profiler. *Atmos. Meas. Tech.*, **8**, 3527–3536, doi:10.5194/amt-8-3527-2015.

Bühler, O., J. Callies, and R. Ferrari, 2014: Wave–vortex decomposition of one-dimensional ship-track data. *J. Fluid Mech.*, **756**, 1007–1026, doi:10.1017/jfm.2014.488.

Caracena, F., 1987: Analytic approximation of discrete field samples with weighted sums and the gridless computation of field derivatives. *J. Atmos. Sci.*, **44**, 3753–3768, doi:10.1175/1520-0469(1987)044<3753:AAODFS>2.0.CO;2.

Chiles, J.-P., and P. Delfiner, 2009: *Geostatistics: Modeling Spatial Uncertainty*. Wiley Series in Probability and Statistics, Vol. 497, Wiley, 699 pp.

Cox, D. R., and N. Reid, 2004: A note on pseudolikelihood constructed from marginal densities. *Biometrika*, **91**, 729–737, doi:10.1093/biomet/91.3.729.

Daley, R., 1985: The analysis of synoptic scale divergence by a statistical interpolation procedure. *Mon. Wea. Rev.*, **113**, 1066–1080, doi:10.1175/1520-0493(1985)113<1066:TAOSSD>2.0.CO;2.

Daley, R., 1991: *Atmospheric Data Analysis*. 2nd ed. Cambridge University Press, 457 pp.

Doswell, C. A., III, and F. Caracena, 1988: Derivative estimation from marginally sampled vector point functions. *J. Atmos. Sci.*, **45**, 242–253, doi:10.1175/1520-0469(1988)045<0242:DEFMSV>2.0.CO;2.

Efron, B., and R. J. Tibshirani, 1994: *An Introduction to the Bootstrap*. Chapman & Hall, 456 pp.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10 143–10 162, doi:10.1029/94JC00572.

Frehlich, R., L. Cornman, and R. Sharman, 2001: Simulation of three-dimensional turbulent velocity fields. *J. Appl. Meteor.*, **40**, 246–258, doi:10.1175/1520-0450(2001)040<0246:SOTDTV>2.0.CO;2.

Gandin, L. S., 1965: *Objective Analysis of Meteorological Fields*. Israel Program for Scientific Translations, 242 pp.

Gebhardt, C., S. Theis, M. Paulat, and Z. B. Bouallégue, 2011: Uncertainties in COSMO-DE precipitation forecasts introduced by model perturbations and variation of lateral boundaries. *Atmos. Res.*, **100**, 168–177, doi:10.1016/j.atmosres.2010.12.008.

Gneiting, T., W. Kleiber, and M. Schlather, 2010: Matérn cross-covariance functions for multivariate random fields. *J. Amer. Stat. Assoc.*, **105**, 1167–1177, doi:10.1198/jasa.2010.tm09420.

Goulard, M., and M. Voltz, 1992: Linear coregionalization model: Tools for estimation and choice of cross-variogram matrix. *Math. Geol.*, **24**, 269–286, doi:10.1007/BF00893750.

Hollingsworth, A., and P. Lönnberg, 1986: The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. *Tellus*, **38A**, 111–136, doi:10.1111/j.1600-0870.1986.tb00460.x.

Jackson, J. D., 1962: *Electrodynamics*. Wiley, 641 pp.

Kalnay, E., 2003: *Atmospheric Modeling, Data Assimilation and Predictability*. Cambridge University Press, 341 pp.

Kolmogorov, A. N., 1941: The local structure of turbulence in incompressible viscous fluid for very large Reynolds numbers. *Dokl. Akad. Nauk SSSR*, **31**, 538–540.

Lindborg, E., 2015: A Helmholtz decomposition of structure functions and spectra calculated from aircraft data. *J. Fluid Mech.*, **762**, R4-1–R4-11, doi:10.1017/jfm.2014.685.

Moreva, O., and M. Schlather, 2016: Modeling and simulation of bivariate Gaussian random fields. arXiv.org, 16 pp., https://arxiv.org/abs/1609.06561.

Obukhov, A., 1954: Statistical description of continuous fields. *Tr. Geofiz. Inst., Akad. Nauk. SSSR*, **24**, 3–42.

Peralta, C., Z. Ben Bouallègue, S. Theis, C. Gebhardt, and M. Buchhold, 2012: Accounting for initial condition uncertainties in COSMO-DE-EPS. *J. Geophys. Res.*, **117**, D07108, doi:10.1029/2011JD016581.

Pu, Z., S. Zhang, M. Tong, and V. Tallapragada, 2016: Influence of the self-consistent regional ensemble background error covariance on hurricane inner-core data assimilation with the GSI-based hybrid system for HWRF. *J. Atmos. Sci.*, **73**, 4911–4925, doi:10.1175/JAS-D-16-0017.1.

R Core Team, 2015: *R: A Language and Environment for Statistical Computing*. R Foundation for Statistical Computing, https://www.R-project.org/.

Ritter, K., 2000: *Average-Case Analysis of Numerical Problems*. Lecture Notes in Mathematics, Vol. 1733, Springer, 225 pp.

Roeckner, E., and Coauthors, 2003: The atmospheric general circulation model ECHAM5. Part I: Model description. Max Planck Institute for Meteorology Tech. Rep. 349, 127 pp., http://www.mpimet.mpg.de/fileadmin/models/echam/mpi_report_349.pdf.

Royle, J. A., L. M. Berliner, C. K. Wikle, and R. Milliff, 1999: A hierarchical spatial model for constructing wind fields from scatterometer data in the Labrador Sea. *Case Studies in Bayesian Statistics*, Vol. IV, C. Gatsonis et al., Eds., Springer, 367–382.

Schaefer, J. T., and C. A. Doswell III, 1979: On the interpolation of a vector field. *Mon. Wea. Rev.*, **107**, 458–476, doi:10.1175/1520-0493(1979)107<0458:OTIOAV>2.0.CO;2.

Scheuerer, M., and M. Schlather, 2012: Covariance models for divergence-free and curl-free random vector fields. *Stochastic Models*, **28**, 433–451, doi:10.1080/15326349.2012.699756.

Schlather, M., A. Malinowski, P. J. Menck, M. Oesting, and K. Strokorb, 2015: Analysis, simulation and prediction of multivariate random fields with package randomfields. *J. Stat. Software*, **63**, 1–25, doi:10.18637/jss.v063.i08.

Schlather, M., and Coauthors, 2016: Randomfields: Simulation and analysis of random fields, version 3.1.16. R package, http://ms.math.uni-mannheim.de/de/publications/software.

Thiébaux, H. J., 1977: Extending estimation accuracy with anisotropic interpolation. *Mon. Wea. Rev.*, **105**, 691–699, doi:10.1175/1520-0493(1977)105<0691:EEAWAI>2.0.CO;2.

Varin, C., N. Reid, and D. Firth, 2011: An overview of composite likelihood methods. *Stat. Sin.*, **21**, 4–42.

Wood, A. T. A., and G. Chan, 1994: Simulation of stationary Gaussian processes in [0, 1]^{*d*}. *J. Comput. Graphical Stat.*, **3**, 409–432, doi:10.1080/10618600.1994.10474655.

^{1} For example, Cholesky decomposition.