  • Abhik, S., M. Halder, P. Mukhopadhyay, X. Jiang, and B. N. Goswami, 2013: A possible new mechanism for northward propagation of boreal summer intraseasonal oscillations based on TRMM and MERRA reanalysis. Climate Dyn., 40, 1611–1624, https://doi.org/10.1007/s00382-012-1425-x.
  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, https://doi.org/10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.
  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642, https://doi.org/10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.
  • Anderson, J. L., 2007a: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224, https://doi.org/10.1111/j.1600-0870.2006.00216.x.
  • Anderson, J. L., 2007b: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. Physica D, 230, 99–111, https://doi.org/10.1016/j.physd.2006.02.011.
  • Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation. Mon. Wea. Rev., 140, 2359–2371, https://doi.org/10.1175/MWR-D-11-00013.1.
  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758, https://doi.org/10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.
  • Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436, https://doi.org/10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.
  • Chen, Y., and D. S. Oliver, 2010: Cross-covariances and localization for EnKF in multiphase flow data assimilation. Comput. Geosci., 14, 579–601, https://doi.org/10.1007/s10596-009-9174-6.
  • De La Chevrotière, M., 2015: Stochastic and numerical models for tropical convection and Hadley-monsoon dynamics. Ph.D. thesis, University of Victoria, 233 pp.
  • De La Chevrotière, M., and J. Harlim, 2017: A data-driven method for improving the correlation estimation in serial ensemble Kalman filters. Mon. Wea. Rev., 145, 985–1001, https://doi.org/10.1175/MWR-D-16-0109.1.
  • De La Chevrotière, M., and B. Khouider, 2017: A zonally symmetric model for the monsoon-Hadley circulation with stochastic convective forcing. Theor. Comput. Fluid Dyn., 31, 89–110, https://doi.org/10.1007/s00162-016-0407-8.
  • De La Chevrotière, M., B. Khouider, and A. J. Majda, 2016: Stochasticity of convection in Giga-LES data. Climate Dyn., 47, 1845–1861, https://doi.org/10.1007/s00382-015-2936-z.
  • Demmel, J. W., 1997: Applied Numerical Linear Algebra. Society for Industrial and Applied Mathematics, xi + 416 pp., https://doi.org/10.1137/1.9781611971446.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, https://doi.org/10.1029/94JC00572.
  • Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivar. Anal., 98, 227–255, https://doi.org/10.1016/j.jmva.2006.08.003.
  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, https://doi.org/10.1002/qj.49712555417.
  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, https://doi.org/10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.
  • Harlim, J., 2017: Model error in data assimilation. Nonlinear and Stochastic Climate Dynamics, C. L. E. Franzke and T. J. O’Kane, Eds., Cambridge University Press, 276–317, https://doi.org/10.1017/9781316339251.011.
  • Hastie, T., R. Tibshirani, and J. Friedman, 2009: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Vol. 1, Springer Series in Statistics, Springer, 745 pp., https://doi.org/10.1007/978-0-387-84858-7.
  • Hotelling, H., 1953: New light on the correlation coefficient and its transforms. J. Roy. Stat. Soc., 15B (2), 193–232.
  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811, https://doi.org/10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.
  • Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 131, 3269–3289, https://doi.org/10.1256/qj.05.135.
  • Houtekamer, P. L., and F. Zhang, 2016: Review of the ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 144, 4489–4532, https://doi.org/10.1175/MWR-D-15-0440.1.
  • Janjić, T., D. McLaughlin, S. E. Cohn, and M. Verlaan, 2014: Conservation of mass and preservation of positivity with ensemble-type Kalman filter algorithms. Mon. Wea. Rev., 142, 755–773, https://doi.org/10.1175/MWR-D-13-00056.1.
  • Johnson, R. H., T. M. Rickenbach, S. A. Rutledge, P. E. Ciesielski, and W. H. Schubert, 1999: Trimodal characteristics of tropical convection. J. Climate, 12, 2397–2418, https://doi.org/10.1175/1520-0442(1999)012<2397:TCOTC>2.0.CO;2.
  • Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. J. Basic Eng., 82, 35–45, https://doi.org/10.1115/1.3662552.
  • Katsoulakis, M. A., A. J. Majda, and D. G. Vlachos, 2003: Coarse-grained stochastic processes and Monte Carlo simulations in lattice systems. J. Comput. Phys., 186, 250–278, https://doi.org/10.1016/S0021-9991(03)00051-2.
  • Khouider, B., and A. J. Majda, 2006a: Model multi-cloud parameterizations for convectively coupled waves: Detailed nonlinear wave evolution. Dyn. Atmos. Oceans, 42, 59–80, https://doi.org/10.1016/j.dynatmoce.2005.12.001.
  • Khouider, B., and A. J. Majda, 2006b: Multicloud convective parametrizations with crude vertical structure. Theor. Comput. Fluid Dyn., 20, 351–375, https://doi.org/10.1007/s00162-006-0013-2.
  • Khouider, B., and A. J. Majda, 2006c: A simple multicloud parameterization for convectively coupled tropical waves. Part I: Linear analysis. J. Atmos. Sci., 63, 1308–1323, https://doi.org/10.1175/JAS3677.1.
  • Khouider, B., and A. J. Majda, 2007: A simple multicloud parameterization for convectively coupled tropical waves. Part II: Nonlinear simulations. J. Atmos. Sci., 64, 381–400, https://doi.org/10.1175/JAS3833.1.
  • Khouider, B., and A. Majda, 2008: Multicloud models for organized tropical convection: Enhanced congestus heating. J. Atmos. Sci., 65, 895–914, https://doi.org/10.1175/2007JAS2408.1.
  • Khouider, B., A. Majda, and M. Katsoulakis, 2003: Coarse-grained stochastic models for tropical convection and climate. Proc. Natl. Acad. Sci. USA, 100, 11 941–11 946, https://doi.org/10.1073/pnas.1634951100.
  • Khouider, B., J. Biello, and A. J. Majda, 2010: A stochastic multicloud model for tropical convection. Commun. Math. Sci., 8, 187–216, https://doi.org/10.4310/CMS.2010.v8.n1.a10.
  • Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203, https://doi.org/10.1256/qj.02.132.
  • Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Vol. 1, Shinfield Park, Reading, United Kingdom, 18 pp., https://www.ecmwf.int/sites/default/files/elibrary/1995/10829-predictability-problem-partly-solved.pdf.
  • Mapes, B., S. Tulich, J. Lin, and P. Zuidema, 2006: The mesoscale convection life cycle: Building block or prototype for large-scale tropical waves? Dyn. Atmos. Oceans, 42, 3–29, https://doi.org/10.1016/j.dynatmoce.2006.03.003.
  • Miyoshi, T., K. Kondo, and T. Imamura, 2014: The 10,240-member ensemble Kalman filtering with an intermediate AGCM. Geophys. Res. Lett., 41, 5264–5271, https://doi.org/10.1002/2014GL060863.
  • Oke, P. R., P. Sakov, and S. P. Corney, 2007: Impacts of localisation in the EnKF and EnOI: Experiments with a small model. Ocean Dyn., 57, 32–45, https://doi.org/10.1007/s10236-006-0088-8.
  • Strang, G., 1968: On the construction and comparison of difference schemes. SIAM J. Numer. Anal., 5, 506–517, https://doi.org/10.1137/0705041.
  • Tardif, R., G. J. Hakim, and C. Snyder, 2014: Coupled atmosphere–ocean data assimilation experiments with a low-order climate model. Climate Dyn., 43, 1631–1643, https://doi.org/10.1007/s00382-013-1989-0.
  • Waite, M. L., and B. Khouider, 2009: Boundary layer dynamics in a simple model for convectively coupled gravity waves. J. Atmos. Sci., 66, 2780–2795, https://doi.org/10.1175/2009JAS2871.1.
  • Waller, J. A., S. L. Dance, A. S. Lawless, and N. K. Nichols, 2014: Estimating correlated observation error statistics using an ensemble transform Kalman filter. Tellus, 66A, 23294, https://doi.org/10.3402/tellusa.v66.23294.
  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, https://doi.org/10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.

  • Fig. 3. Mean meridional circulation averaged over the training interval for regime A. The top of the ABL (solid black line) is located at height 0 km. The contours represent the indicated fields, and the arrows are the velocity vector field (υ, w). (clockwise) Zonal and meridional winds, potential temperature, total heating, vertical velocity, and pressure.

  • Fig. 4. The 25-day Hovmöller plots for regime A. (clockwise) ABL equivalent potential temperature, free-tropospheric moisture, meridional barotropic wind; and stratiform, deep, and congestus cloud area fractions.

  • Fig. 5. As in Fig. 3, but for regime B.

  • Fig. 6. As in Fig. 4, but for regime B.

  • Fig. 7. (left) Optical thickness, (middle) transmittance, and (right) weighting function as a function of height z for the six channels, shown here (in an array of colors; quantities are nondimensional) for different climatological moisture values q.

  • Fig. 8. Nondimensional brightness temperature associated with the first channel, calculated from climatology over a period of 65 days.

  • Fig. 9. Contour plot of the vector localization map for the correlations between the model coordinates i of the field and a channel-1 brightness temperature observation j located at the green point (near the equator). The contour value at the coordinates corresponds to the component of the vector map . For a fixed i and j, the vector components of (along the y axis) are zero outside the radius of the observation location.

  • Fig. 10. The GC function and localization map for the correlation between the observed brightness temperature of channel 1 observed at three different locations (roughly 14°S, 0°, and 12°N) and the model variables (a) and (b) .

  • Fig. 11. Average relative degradation as a function of the parameter ρ for the perfect-model and model error experiments.

  • Fig. 12. Perfect-model experiments. Log-linear plot of the time mean of analysis RMSE as a function of ensemble size is shown for the 14 analyzed fields, for the verification experiments using the localization maps PM with and (dashed black). The training experiment TD_PM using 1000 members is reported as a baseline (solid blue).

  • Fig. 13. Model error experiments. Log-linear plot of the time mean of analysis RMSE as a function of ensemble size is shown for the 14 analyzed fields, for the verification experiments using the localization maps ME1 and ME2. The training experiments TD_ME and TD_PM using 1000 members are reported as baselines (solid blue).

  • Fig. 14. Model error experiments. Time mean of analysis RMSE as a function of training and verification ensemble sizes for experiment ME2 (). White pixels indicate filter divergence.

  • Fig. 15. Snapshot of the meridional circulation of (a) the truth and (b) the analysis for the verification experiment ME2 ( and ). The snapshot is taken at an arbitrary assimilation cycle. The contours represent the indicated fields, and the arrows are the velocity vector field (υ, w).

  • Fig. 16. The 25-day Hovmöller plots of (a) the truth and (b) the analysis for the verification experiment ME2 ( and ). (left) Free-tropospheric moisture and (right) deep cloud area fraction.


Data-Driven Localization Mappings in Filtering the Monsoon–Hadley Multicloud Convective Flows

Michèle De La Chevrotière, Department of Mathematics, The Pennsylvania State University, University Park, Pennsylvania

John Harlim, Department of Mathematics, and Department of Meteorology, The Pennsylvania State University, University Park, Pennsylvania

Abstract

This paper demonstrates the efficacy of data-driven localization mappings for assimilating satellite-like observations in a dynamical system of intermediate complexity. In particular, a sparse network of synthetic brightness temperature measurements is simulated using an idealized radiative transfer model and assimilated into the monsoon–Hadley multicloud model, a nonlinear stochastic model containing several thousand model coordinates. A serial ensemble Kalman filter is implemented in which the empirical correlation statistics are improved using localization maps obtained from a supervised learning algorithm. The impact of the localization mappings is assessed in perfect-model observing system simulation experiments (OSSEs) as well as in the presence of model errors resulting from the misspecification of key convective closure parameters. In perfect-model OSSEs, the localization mappings that use adjacent correlations to improve the correlation estimated from small ensemble sizes produce robust and accurate analysis estimates. In the presence of model error, the filter skills of the localization maps trained on perfect- and imperfect-model data are comparable.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: John Harlim, jharlim@psu.edu

1. Introduction

In high-dimensional applications, ensemble Kalman filters (EnKFs) are usually implemented using a small number of ensemble members because of the high cost of integrating the forecast model. This occurs, for instance, in operational numerical weather prediction, where the number of state variables in the forecast model far exceeds the number of ensemble members that computational resources allow (Houtekamer and Zhang 2016). With such small ensemble sizes relative to the state space dimension, the EnKF can suffer from inaccurate state estimation, which manifests as underestimated forecast error covariances, spurious long-range correlations, and ultimately filter divergence. The underestimation of forecast error covariances is often addressed by the method of covariance inflation (Anderson and Anderson 1999; Anderson 2007a), while spurious correlations at long distances are usually mitigated using the technique of covariance localization (Houtekamer and Mitchell 1998; Hamill et al. 2001). Localization schemes use tapering or kernel functions to reduce or zero out unphysical correlations, most often using a Schur product. The distance-based Gaspari–Cohn (GC) function (Gaspari and Cohn 1999) is a standard example of a parametric tapering achieved with a fifth-order polynomial function on a compact support. The half-width of the compact support, which determines the geometric distance at which correlations are cut off, must be tuned for good performance.
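To make the tapering step concrete, the following is a minimal sketch of the GC taper and its Schur (elementwise) product with a sample covariance. It uses the standard fifth-order piecewise rational form of Gaspari and Cohn (1999); the function names `gaspari_cohn` and `localize`, the one-dimensional positions, and the plain nested-list covariance are illustrative assumptions and not part of the filter used in this paper.

```python
def gaspari_cohn(distance, half_width):
    """GC taper: 1 at zero distance, exactly 0 beyond twice the half-width."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r <= 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def localize(sample_cov, positions, half_width):
    """Schur product of a sample covariance with the GC taper matrix."""
    n = len(positions)
    return [
        [sample_cov[i][j] * gaspari_cohn(positions[i] - positions[j], half_width)
         for j in range(n)]
        for i in range(n)
    ]
```

Entries for pairs of points separated by more than twice the half-width are set to zero, which is why the half-width has to be tuned: too small a value discards physical correlations, and too large a value keeps spurious ones.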

While parametric localization functions are useful in practice, they can be expensive to tune in large-scale applications. For instance, the optimal half-width of the GC localization function tends to vary with observation type (Houtekamer and Mitchell 2005), with model variable (Anderson 2007b, 2012), and even with time (Chen and Oliver 2010; Anderson 2012). Recently, De La Chevrotière and Harlim (2017) proposed a data-driven localization technique that can capture nonuniform localization bandwidths using a single parameter. The technique uses archived ensemble products from which time series of well-sampled and undersampled correlations are computed. A supervised learning algorithm analyzes the two training correlation datasets to infer a localization function, named the localization map. The localization map is used in verification mode to transform the poorly estimated sample correlation into an improved correlation. In a series of observing system simulation experiments (OSSEs) using the 40-variable Lorenz-96 model (Lorenz 1996) and a range of linear and nonlinear observation models, the localization maps were found to improve the filter estimates, most notably in the case of nonlinear indirect observations (De La Chevrotière and Harlim 2017).
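The idea of learning a map from undersampled to better-sampled correlations can be illustrated with a deliberately simplified sketch. Here the map is reduced to a single scalar coefficient fitted by least squares on training pairs, whereas the localization maps of De La Chevrotière and Harlim (2017) also use adjacent correlations; the function names and the toy training values below are hypothetical.

```python
def fit_scalar_map(undersampled, well_sampled):
    """Least-squares coefficient c minimizing sum((w - c * u)**2)."""
    numerator = sum(u * w for u, w in zip(undersampled, well_sampled))
    denominator = sum(u * u for u in undersampled)
    return numerator / denominator


def apply_map(coefficient, sample_correlation):
    """Verification mode: damp a small-ensemble correlation estimate."""
    return coefficient * sample_correlation


# Toy training pairs (hypothetical values): small-ensemble estimates are
# twice as large as the corresponding large-ensemble correlations.
coefficient = fit_scalar_map([0.5, -0.4, 0.2], [0.25, -0.2, 0.1])
improved = apply_map(coefficient, 0.8)
```

A coefficient near zero plays the role of a taper that removes a spurious correlation, while a coefficient near one leaves a reliable correlation essentially unchanged.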

In light of these promising results obtained using a low-order model, the performance of the localization maps is further explored in a data assimilation system of intermediate complexity. Here, the serial least squares ensemble Kalman filter (LS-EnKF) of Anderson (2003) is implemented in the monsoon–Hadley multicloud model (De La Chevrotière 2015; De La Chevrotière and Khouider 2017), a zonally symmetric model for the meridional Hadley circulation and monsoonal flow. The model’s free-tropospheric synoptic-scale wave dynamics are given by nonlinear equations for the barotropic and first two baroclinic modes of vertical structure, while the physical processes of convection and precipitation are represented by a stochastic model for clouds. Although the monsoon–Hadley multicloud model is an idealized atmospheric circulation model, it features nonlinear multiscale wave dynamics with several thousand model coordinates, which makes it an ideal test bed for the localization maps. The vertical basis function representation of the model is exploited to recreate satellite-like observations using an idealized radiative transfer model. Brightness temperature-like measurements of six satellite channels are assimilated into the model using a sparse observational network. The filter skill of the localization maps is tested with this nonlinear indirect observation model in a series of perfect-model OSSEs as well as in the presence of model error.

The structure of this paper is as follows. In section 2, we review the general framework of the monsoon–Hadley multicloud model and look at numerical simulations of the model in two different regimes. In section 3, the technique of the localization mapping is explained in the context of the LS-EnKF, followed by a description of the idealized radiative transfer model and general experimental design. In section 4, we present the results of OSSEs carried out in the perfect- and imperfect-model scenarios. We close the paper with a brief summary and conclusions in section 5.

2. The monsoon–Hadley multicloud model

The monsoon–Hadley multicloud model is a zonally symmetric model for the large-scale Hadley circulation, ambient winds, and precipitation associated with the summer monsoon season (De La Chevrotière 2015; De La Chevrotière and Khouider 2017). The model is based on the Galerkin projection of the primitive equations of atmospheric synoptic dynamics onto the first few modes of vertical structure in the free troposphere, and is coupled to a bulk atmospheric boundary layer (ABL) model. The prognostic variables of this vertical projection are the barotropic and first two baroclinic horizontal velocities, each with zonal and meridional components u and υ, respectively, and the first two baroclinic potential temperatures. The corresponding vertical basis functions, plotted as functions of height between sea level and the tropopause, are shown in Fig. 1a. The free-tropospheric pressure p and vertical velocity w are given by Galerkin expansions consistent with the basis functions, with baroclinic wave mode amplitudes calculated diagnostically via the hydrostatic equation and incompressibility condition, respectively. Below the free-troposphere model we place a mixed representation of the ABL with prognostic variables that include the horizontal velocity, fluctuation potential temperature, and equivalent potential temperature. See De La Chevrotière (2015), De La Chevrotière and Khouider (2017), and Waite and Khouider (2009) for the detailed formulation. The moist dynamics of the model is represented by bulk equations for the vertically averaged free-tropospheric moisture fluctuation q.

Fig. 1.

(a) Vertical profiles of the leading modes of horizontal velocity and potential temperature θ. (first three profiles) Barotropic mode and first two baroclinic modes of velocity. (last two profiles) First two baroclinic modes of temperature. (b) Baroclinic profile of the heating (and cooling) rates associated with the three cloud types of the multicloud model. (left) Mode-2 congestus heating. (middle) Mode-1 deep heating. (right) Mode-2 stratiform heating. The heating curves intersect the vertical straight lines at zero heating points.

Citation: Monthly Weather Review 146, 4; 10.1175/MWR-D-17-0381.1

Altogether, the governing equations for the large-scale variables form the dynamical core of the monsoon–Hadley multicloud model. The dynamical core is a nonlinear system of 13 partial differential equations (PDEs) that is not conservative and not necessarily hyperbolic. The PDE system is solved iteratively with the Poisson equation for the ABL pressure using an operator time-splitting strategy (Strang 1968). The details of the numerical scheme are presented in De La Chevrotière and Khouider (2017). The unresolved features of convection and precipitation, which are described in section 2a below, are represented by the stochastic multicloud cumulus parameterization scheme of Khouider et al. (Khouider and Majda 2006a,b,c, 2007, 2008; Khouider et al. 2010).
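As a generic illustration of operator time splitting, the sketch below implements one Strang step: a half step of one sub-operator, a full step of the other, and a second half step of the first. The toy problem (two linear decay terms with exact sub-flows) and the names `strang_step` and `decay_flow` are assumptions chosen for clarity; this is not the model's actual numerical scheme, which is documented in De La Chevrotière and Khouider (2017).

```python
import math


def strang_step(state, dt, flow_a, flow_b):
    """Advance one step of du/dt = A(u) + B(u) by Strang splitting."""
    state = flow_a(state, 0.5 * dt)   # half step with operator A
    state = flow_b(state, dt)         # full step with operator B
    return flow_a(state, 0.5 * dt)    # second half step with operator A


def decay_flow(rate):
    """Exact flow of du/dt = -rate * u over a time increment dt."""
    return lambda u, dt: u * math.exp(-rate * dt)


# Toy problem: du/dt = -2u - 3u, advanced over dt = 0.1 from u = 1.
u_next = strang_step(1.0, 0.1, decay_flow(2.0), decay_flow(3.0))
```

In this toy case the two operators commute, so the split step reproduces the exact solution exp(-0.5). For noncommuting operators, the symmetric half-full-half ordering is what gives the method its second-order accuracy in time.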

The computational domain is reduced to a meridional slice of the troposphere between 40°S and 40°N (roughly 9000 km) with a mesoscale grid resolution of about 37 km. The system is integrated as an initial value problem with the initial condition set to a radiative convective equilibrium (RCE; De La Chevrotière 2015). This is a spatially homogeneous steady-state solution where the convective heating is balanced by the radiative cooling. Details on how to construct an RCE solution for the coupled system can be found in De La Chevrotière (2015). In section 2b we present numerical simulations in an idealized boreal summer setting. A detailed description of the model can be found in the original works (De La Chevrotière 2015; De La Chevrotière and Khouider 2017).

a. The stochastic multicloud parameterization

The multicloud model highlights the role of three heating rates, $H_c$, $H_d$, and $H_s$, corresponding to the three cloud types that are observed in organized tropical convection: cumulus congestus cloud decks (subscript c) that heat the lower troposphere and cool the upper troposphere with cloud top near the freezing level, deep convective towers (subscript d) that heat the entire troposphere, and stratiform anvils (subscript s) that warm and dry the upper troposphere and cool and moisten the lower troposphere. These three cloud types are believed to be responsible for the bulk of the tropical rainfall and constitute a major source of heat for the free-tropospheric circulation (Johnson et al. 1999; Mapes et al. 2006; Abhik et al. 2013). The convective heating rates $H_c$, $H_d$, and $H_s$ directly force the first two baroclinic modes of vertical structure, as illustrated in Fig. 1b.

The parameterization scheme overlays on each grid box a square Markov chain lattice, where each lattice site is either cloud free or occupied by a congestus, deep, or stratiform cloud, denoted as state 0, 1, 2, and 3, respectively. A continuous-time Markov process is then defined for each lattice element, allowing transitions from one state to another according to probability transition rates, which are Arrhenius-type functions of the convective available potential energy integrated over the whole troposphere (CAPE), low-level CAPE ($\mathrm{CAPE}_l$), and midtroposphere dryness D. The probability rates are constrained by a set of intuitive rules that are based on observations of cloud dynamics in the tropics (Johnson et al. 1999; Mapes et al. 2006). For example, a clear-sky site turns into a congestus site with high probability if $\mathrm{CAPE}_l$ is positive and the midtroposphere is dry, while a congestus site (or clear site) turns into a deep convective site with high probability if CAPE is positive and the midtroposphere is moist. The three cloud types are assumed to decay naturally into a cloud-free site at some fixed rate. These rules are formalized in Table 1. Because the cloud transitions arguably occur on different time scales, each probability transition rate is divided by its characteristic time scale. We use time-scale estimates resulting from a statistical Bayesian inference study of large-eddy simulation data (De La Chevrotière et al. 2016, see their Table 3).

Table 1.

Cloud transition probability rates and time scales in the stochastic multicloud parameterization. The rates depend on measures of the environment CAPE (C), low-level CAPE, and midtroposphere dryness (D). The time scales’ mean and standard deviation (SD) Bayes’s estimates are obtained from De La Chevrotière et al. (2016).
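The lattice dynamics described above can be sketched as a continuous-time Markov chain simulated with exponential waiting times. The sketch makes several assumptions that are not taken from this paper: the rates are hypothetical constants in arbitrary time units (in the model they depend on CAPE, low-level CAPE, and dryness), the deep-to-stratiform transition is included following the multicloud literature, and the names `RATES`, `simulate_site`, and `area_fractions` are illustrative.

```python
import random

# States: 0 clear, 1 congestus, 2 deep, 3 stratiform.
# Hypothetical constant rates; NOT the values of Table 1.
RATES = {
    (0, 1): 0.5,  # clear -> congestus
    (0, 2): 0.2,  # clear -> deep
    (1, 2): 0.4,  # congestus -> deep
    (2, 3): 0.6,  # deep -> stratiform (assumed transition)
    (1, 0): 0.3,  # congestus decay
    (2, 0): 0.3,  # deep decay
    (3, 0): 0.2,  # stratiform decay
}


def simulate_site(state, t_end, rng):
    """Evolve one lattice site up to time t_end and return its final state."""
    t = 0.0
    while True:
        moves = [(dst, rate) for (src, dst), rate in RATES.items() if src == state]
        total = sum(rate for _, rate in moves)
        t += rng.expovariate(total)          # exponential waiting time
        if t > t_end:
            return state
        pick = rng.random() * total          # choose a move proportionally to its rate
        for dst, rate in moves:
            pick -= rate
            if pick <= 0.0:
                state = dst
                break
        else:
            state = moves[-1][0]             # guard against round-off


def area_fractions(n_sites, t_end, seed=0):
    """Fractions of sites occupied by each cloud type at time t_end."""
    rng = random.Random(seed)
    final = [simulate_site(0, t_end, rng) for _ in range(n_sites)]
    return {cloud: final.count(cloud) / n_sites for cloud in (1, 2, 3)}
```

Because the sites are treated as independent, the cloud area fractions are simply the proportions of sites in each cloud state, which is the quantity the parameterization passes back to the large-scale heating.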

Assuming all Markov chains are independent, one can formally derive (Katsoulakis et al. 2003; Khouider et al. 2003, 2010) the stochastic dynamics for the gridbox cloud area fractions alone. Effectively, given the prescribed time scales and the large-scale thermodynamic state (e.g., CAPE, D) at a grid box, the scheme outputs the time-dependent stochastic area fractions of congestus, deep, and stratiform clouds for that column. These in turn influence the large-scale heating rates according to the following convective closure equations (Khouider and Majda 2006c, 2008; Khouider et al. 2010):
[Equations (1a)–(1c)]

where the equations involve model constants (including Q), the average midtroposphere height, a reference convective time scale, and coefficients that determine the contributions of CAPE and $\mathrm{CAPE}_l$ to stratiform and congestus heatings, respectively. The constant and parameter values in (1) can be found in De La Chevrotière and Khouider (2017). The closure equations are expressed in terms of RCE quantities, which are denoted by overbars, and deviations from the RCE, denoted by primes (e.g., $\bar{Q}$ denotes the heating potential at RCE).

b. Idealized boreal summer monsoon simulations

The monsoon–Hadley multicloud model is tested in an idealized summer monsoon setting on an aquaplanet with constant but nonuniform sea surface temperature (SST) mimicking the Indian and Pacific Oceans’ warm pool (WP). The prescribed SST follows a Gaussian meridional profile centered at 15°N, as shown in Fig. 2. This is meant to replicate the warm SSTs observed in the intertropical convergence zone (ITCZ) during the boreal summer. We use a multicloud stochastic lattice of size embedded in our 37-km resolution meridional grid, which results in convective cells with a horizontal extent of O(1) km.

Fig. 2.

Imposed SST meridional profile. The surface temperature gradient at RCE follows a normal distribution centered at 15°N with a standard deviation of 7.2°. Here and are the surface and ABL equivalent potential temperatures at RCE, respectively.

Citation: Monthly Weather Review 146, 4; 10.1175/MWR-D-17-0381.1

The model is integrated for roughly 1350 days with a time step of 3 min. The solutions are in a statistical steady state after a short transient period of 100–200 days (De La Chevrotière 2015). The first 1000 days are discarded as burn-in and the last 350 days are used for training purposes. Here we present the simulation results for two parameter regimes, A and B, which differ only in the cloud time scales of their convective parameterization: regime A uses the Bayes’s mean time-scale estimates of Table 1, while the time scales of regime B are obtained by adding one Bayes’s standard deviation to the mean. We should point out that most of the model error is due to misspecification of the decay rate of deep clouds, whose time-scale estimate has a large standard deviation. We will use these two regimes to set up numerical experiments with model error.

The mean meridional circulation resulting from taking the time average of the solution over the training interval is plotted in Fig. 3 for regime A. The height–latitude contour plots are shown for the horizontal wind components u and υ, vertical velocity w, potential temperature θ, pressure p, and total heating H, each obtained from its respective Galerkin expansion as detailed in De La Chevrotière and Khouider (2017). The cross sections show the dominant deep tropospheric overturning of the Hadley circulation, with an ascending branch over the WP at 15°N resulting from low-level convergence, and subsidence near 10°S. The upward motion branch of the Hadley cell is associated with a strong deep barotropic heating mode and a stratiform second baroclinic potential temperature mode. The sea level pressure drops significantly moving northward through the ITCZ, a characteristic of the monsoon trough. The low-level wind displays the turning of the equatorial easterlies to westerlies south of the pressure trough and then back to easterlies, similar to the mean monsoonal flow of the boreal summer season.

Fig. 3.

Mean meridional circulation averaged over the training interval for regime A. The top of the ABL (solid black line) is located at height 0 km. The contours represent the indicated fields, and the arrows are the velocity vector field (υ, w). (clockwise) Zonal and meridional winds, potential temperature, total heating, vertical velocity, and pressure.

The Hovmöller diagrams of the wave fluctuations about the mean solution are shown for regime A in Fig. 4. The latitude–time contours of , q, and are plotted for a 25-day period starting at day 1200. The contour plots of the three cloud area fractions are also shown for the same period. Small-scale intermittent events can be observed throughout the domain, with a larger concentration over the WP, roughly the 5°–25°N band. Mesoscale cloud clusters are seen propagating southward and northward from the WP with suppressed and active phases of convection alternating every 1–2 days. We also observe that the strong events in the precipitation-related fields and q correlate well with the cloud coverage peaks.

Fig. 4.

The 25-day Hovmöller plots for regime A. (clockwise) ABL equivalent potential temperature, free-tropospheric moisture, meridional barotropic wind; and stratiform, deep, and congestus cloud area fractions.

The mean meridional circulation and Hovmöller plots for regime B are shown in Figs. 5 and 6, respectively. As mentioned before, the cloud transition time scales of regime B are larger than those of regime A by one standard deviation. This positive bias in the transition time scales has an impact on the wave disturbances: cloud systems are now larger and persist over several days, while their period of oscillation is on the order of 10 days. The mean meridional circulation shows a reduced low-level convergence and dampened upward motion over the WP region. The convective total heating also appears diminished throughout the domain.

Fig. 5.

As in Fig. 3, but for regime B.

Fig. 6.

As in Fig. 4, but for regime B.

3. Data assimilation methodology and experimental design

The nonlinear discrete-time filtering problem that we consider in this paper can be written in a compact form as follows:
e2a
e2b
where F is a nonlinear forecast model operator (representing here the Hadley–monsoon stochastic multicloud model) and G is a nonlinear observation model with a Gaussian measurement error with zero mean and covariance . The ensemble Kalman filter (EnKF) is an approximate filtering method introduced by Evensen (1994) to estimate the first- and second-order statistics of the nonlinear filtering problem in (2a) and (2b). The key idea of the EnKF is to model the background (or forecast) density as a Gaussian distribution with mean and covariance estimated empirically from an ensemble of K forecast solutions. Subsequently, the analysis mean and covariance matrix are computed using the Kalman filter formula (Kalman 1960), which incorporates the information from the current observations into the mean update. Finally, the analysis (or posterior) ensemble members are drawn from the Gaussian analysis distribution. We should mention that there are many variants of the EnKF (e.g., Houtekamer and Mitchell 1998; Bishop et al. 2001; Anderson 2001; Whitaker and Hamill 2002), since the Gaussian analysis density can be sampled in more than one way.

In theory, for linear models with Gaussian errors, the EnKF converges to the (exact) KF solution in the limit of large ensemble size, . However, the use of large ensembles is often cost prohibitive in practice, and typically we resort to small ensembles of size , where N is the dimension of the state space. For instance, in our application , while . While a small ensemble size is computationally desirable, it introduces sampling errors that lead to underestimated covariances (Furrer and Bengtsson 2007). This issue is routinely addressed using covariance inflation methods (Anderson 2007a) that essentially “blow up” the prior covariance by some inflation factor. Furthermore, small ensemble sizes also induce spurious ensemble correlations between observations and state variables at distant grid points (Lorenc 2003). This problem is usually mitigated using a technique known as localization (Houtekamer and Mitchell 1998).
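
As a minimal illustration of what “blowing up” the prior covariance means, the sketch below applies a fixed multiplicative inflation factor to an ensemble. It is an assumption-laden simplification: the function name `inflate` and the fixed `factor` are hypothetical, whereas the adaptive scheme of Anderson (2007a) estimates the factor from the data.

```python
import numpy as np

def inflate(ensemble, factor):
    """Multiplicative covariance inflation with a fixed factor (illustrative only).

    ensemble: array of shape (K, N), K members of an N-dimensional state.
    Deviations from the ensemble mean are scaled by sqrt(factor), so the
    sample covariance is multiplied by `factor` and the mean is unchanged.
    """
    mean = ensemble.mean(axis=0)
    return mean + np.sqrt(factor) * (ensemble - mean)

rng = np.random.default_rng(0)
prior = rng.normal(size=(20, 5))          # K = 20 members, N = 5 variables (made up)
inflated = inflate(prior, factor=1.2)
```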

In this paper, we address the spurious correlation issue with the data-driven technique proposed by De La Chevrotière and Harlim (2017). Instead of tuning the parameters of a specified parametric localization function (such as the half-width parameter of the Gaspari–Cohn or exponentially decaying functions), this method uses a pair of labeled training datasets to estimate a linear map, called the localization map, that transforms the poorly estimated sample correlation into an improved correlation. This training methodology is effectively an example of what the machine learning community calls supervised learning (see, e.g., Hastie et al. 2009). Here, the localization map is implemented within the sequential least squares framework of Anderson (LS-EnKF; Anderson 2003), a serial variant of the EnKF that allows for observations with independent measurement errors to be assimilated sequentially. Anderson’s scheme breaks down the filtering problem into a sequence of linear regressions of a scalar observation onto the state vector. In this scalar context, the covariances, or correlations, between a single observation and the model state variables appear explicitly and can be easily localized.

In the remainder of this section, we provide a brief review on the LS-EnKF in section 3a and the localization map technique in section 3b. In section 3c, we introduce the idealized radiative transfer model for synthetic satellite observations and conclude with the experimental design in section 3d.

a. Least squares EnKF algorithm

In the LS-EnKF, an ensemble of K model state samples is integrated forward using the forecast model F, producing a set of prior (or forecast) solutions (note that the time index is suppressed for clarity of notation). Each ensemble member is then projected to the observation space by applying the observation operator [i.e., ]. The mean and variance of the jth component of the observation vector are approximated by its ensemble mean and variance,
e3
respectively. The LS-EnKF algorithm works under the assumption that observation errors are independent (the matrix is diagonal) and uses the unperturbed observation component to sequentially update the ensemble solutions. We should point out that our choice of using this algorithm is just for convenience; one can consider more advanced data assimilation techniques that can handle correlated observation errors, especially when dealing with satellite measurements (Waller et al. 2014). For each observation component j, , the LS-EnKF executes the following update:
e4a
e4b
where is the jth diagonal element of . Note that the update step (4) can be interpreted as a scalar ensemble filter in the observation space. The second step of the LS-EnKF is to regress the increment of the observation variable onto the state variables as follows:
e4c
where the cross-covariance vector is also approximated by its ensemble statistics, that is,
e5
As mentioned before, the use of small ensemble sizes has two adverse effects: 1) underestimating the covariance and 2) producing unphysical, spurious long-range correlations. In our numerical implementation, the first issue is dealt with using the adaptive covariance inflation of Anderson (2007a) while the second is addressed using the localization mapping technique described in the next section.
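
The two-step update described in this section can be sketched as follows for a single scalar observation. This is a sketch of the standard ensemble adjustment form of Anderson (2003) as we understand it, not a transcription of (4a)–(4c). The function name, argument names, and the optional weight vector `loc` (a stand-in for any localization applied to the regression) are assumptions of this example.

```python
import numpy as np

def ls_enkf_update(X, y_ens, y_obs, r, loc=None):
    """Assimilate one scalar observation (illustrative sketch).

    X:     (K, N) prior state ensemble.
    y_ens: (K,) prior ensemble projected to this observation.
    y_obs: unperturbed scalar observation.
    r:     observation error variance for this component.
    loc:   optional (N,) weights applied to the regression coefficients.
    """
    K = X.shape[0]
    y_mean = y_ens.mean()
    y_var = y_ens.var(ddof=1)

    # Step 1: scalar filter in observation space (product of two Gaussians).
    post_var = 1.0 / (1.0 / y_var + 1.0 / r)
    post_mean = post_var * (y_mean / y_var + y_obs / r)
    y_post = post_mean + np.sqrt(post_var / y_var) * (y_ens - y_mean)
    dy = y_post - y_ens                                   # observation increments

    # Step 2: regress the observation increments onto each state variable.
    cov_xy = ((X - X.mean(axis=0)) * (y_ens - y_mean)[:, None]).sum(axis=0) / (K - 1)
    weights = np.ones(X.shape[1]) if loc is None else loc
    X_post = X + np.outer(dy, weights * cov_xy / y_var)
    return X_post, y_post

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
y_ens = X[:, 0].copy()                    # toy case: the first state variable is observed directly
X_post, y_post = ls_enkf_update(X, y_ens, y_obs=0.3, r=0.5)
```

In this toy case the observed state variable receives exactly the observation-space update, which is a convenient sanity check. Looping this function over all observation components gives the serial filter.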

b. Localization mappings and modified LS-EnKF

The localization mappings introduced in De La Chevrotière and Harlim (2017) use information from the sample correlation matrix between the model variable and the observation variable , defined in the usual way as
e6
Here, and are diagonal square matrices having as their main diagonal the diagonal elements of and , respectively. For example, the diagonal elements of are given as in (3). The elements of the cross-covariance matrix , on the other hand, are given by (5). All are estimated from an ensemble of size K.
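
For concreteness, the sample correlation between state and observation ensembles can be computed as in the sketch below. The array layout (members along the first axis) and the function name are assumptions of this example.

```python
import numpy as np

def sample_correlation(X, Y):
    """Sample correlation between state and observation ensembles.

    X: (K, N) state ensemble, Y: (K, M) observation ensemble.
    Returns an (N, M) matrix: the cross covariance scaled by the inverse
    standard deviations of the state and observation variables.
    """
    K = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    cov_xy = Xc.T @ Yc / (K - 1)                  # cross covariance, as in (5)
    sx = X.std(axis=0, ddof=1)                    # square roots of the state variances
    sy = Y.std(axis=0, ddof=1)                    # square roots of the observation variances, as in (3)
    return cov_xy / np.outer(sx, sy)

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 4))
Y = X[:, :2] + 0.5 * rng.normal(size=(15, 2))     # toy observations correlated with the state
corr = sample_correlation(X, Y)
```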

The idea behind the localization mappings is to obtain a map that transforms, at each time step m, a poorly estimated correlation matrix obtained using a small ensemble of size into an improved correlation matrix, that is “closer” to some target correlation estimated using a large ensemble size . In practice, L is taken as large as possible, in such a way that is a good approximation to the asymptotic correlation , at time m.

The map is assumed to be of linear form, and we seek an estimate of that minimizes the expected squared error between the transformed correlation and the target correlation . Specifically, for every pair of observation j and model state variable i, we find , , that minimizes the following error cost function:
e7
where , is the component of , and . Here, we emphasize J as a function of although it also depends on (or ρ). The distributions of the densities and are assumed to be stationary and will be sampled from a training dataset. Effectively, the solution of the minimization problem (7) produces an estimate of the correlation through a linear map of size that combines the information of the local subsampled correlations that are spatially located within a radius ρ of the model variable ’s grid location. A special form of (7) arises when the radius and is a scalar learned from the point correlation . In this case, the data-driven map corrects the correlation matrix pointwise, in the same manner as a Schur product.

One way to approximate the solution of the minimization problem in (7) is to discretize the cost function using samples from and . We next describe a sampling method based on a historical data assimilation product using a large ensemble size . In high-dimensional applications, one may not have this training dataset since it is not computationally feasible to obtain a reasonably accurate state estimation by employing EnKF with a very large L without any form of localization. In this situation, one can simulate the training data using an EnKF with an ensemble size L that is only slightly larger than the verification size K [e.g., for ], with a very broad localization range to obtain a reasonably accurate state estimation. An example of such an experiment was reported by Miyoshi et al. (2014) with 10 240 ensemble members.

Our sampling method first consists of generating training data using a global EnKF scheme with an ensemble of size L. Suppose that for each ensemble member , the assimilation experiment generates a time series of forecasts , where T is the number of assimilation cycles. Then, at each cycle m, , we calculate the sample correlation using all of the ensemble members as well as a subsampled correlation , selecting only K members out of L. By this method, we effectively obtain samples and . Given these samples, the Monte Carlo approximation to the minimization of the integral equation in (7) is given by
e8
which is essentially a linear least squares problem. For every , the explicit solution of this linear least squares problem is given by , for , , and . Solving (8) for all i and j, we obtain the linear map estimator . Finally, the straightforward modification on the LS-EnKF is to replace in (4c) by
e9
Here, the components of are given by . The new and improved correlation is closer, in the least squares sense, to the target correlation . Note that the map is trained offline on the data and the computational complexity of each regression problem is (De La Chevrotière and Harlim 2017). The training of with using a serial Matlab application takes approximately 13 h of wall-clock time on a single CPU. The relative residual norm of the linear estimator , calculated as , is 20% on average and decreases weakly monotonically as a function of ρ or K.
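
The regression for a single pair (i, j) can be sketched as below. This is an illustrative reading of the least squares problem in (8), under several assumptions: the correlations with one fixed observation j are stored as time series of shape (T, N), the neighborhood is truncated at the domain boundaries, and the names `train_local_map`, `corr_K`, and `corr_L` are hypothetical.

```python
import numpy as np

def train_local_map(corr_K, corr_L, i, rho):
    """Fit the linear map for state variable i and one fixed observation.

    corr_K: (T, N) subsampled (size-K) correlations at T training cycles.
    corr_L: (T, N) target (size-L) correlations at the same cycles.
    Returns the neighbor indices within radius rho of i and the weights that
    minimize the summed squared error between the mapped and target correlations.
    """
    T, N = corr_K.shape
    idx = np.arange(max(0, i - rho), min(N, i + rho + 1))
    A = corr_K[:, idx]                     # predictors: local subsampled correlations
    b = corr_L[:, i]                       # target: large-ensemble correlation at i
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return idx, w

# Synthetic check: build a target that is an exact linear function of the predictors.
rng = np.random.default_rng(2)
T, N, rho, i = 500, 40, 6, 20
corr_K = rng.uniform(-1.0, 1.0, size=(T, N))
w_true = rng.normal(size=2 * rho + 1)
corr_L = np.zeros((T, N))
corr_L[:, i] = corr_K[:, i - rho:i + rho + 1] @ w_true
idx, w = train_local_map(corr_K, corr_L, i, rho)
```

At filtering time, the improved correlation for the pair would be the inner product of `w` with the current subsampled correlations at `idx`. With a radius of zero the map reduces to a single scalar weight per pair.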

c. An idealized radiative transfer model

In our numerical experiments below, we consider assimilating satellite-like observations based on an idealized radiative transfer model that assumes an absorption coefficient of the following form:
e10
where is a reference value, is the height of the troposphere, and is a rescaled measure of the free-troposphere vertically averaged moisture anomaly q [in the experiments reported here, , where and and the extrema are taken over the training dataset]. As expected, α decreases exponentially with height, and is sensitive to the atmospheric moisture content q. The optical thickness of the atmosphere between heights z and is defined as
e11
where the last equality is obtained via the integration of the absorption coefficient in (10). The transmittance at the wavelength λ is given by a semblance of Beer’s law:
e12
The vertical derivative of the transmittance is known as the weighting function :
e13

We desire to simulate channels whose weighting function peaks at a specific height in the troposphere. This occurs at . Using this critical height we determine , which in turn specifies . We select 6 distinct wavelengths , that we also call channels, associated with the heights km, respectively. Figure 7 shows the top-of-the-atmosphere optical thickness, transmittance, and weighting function for these six channels for various atmospheric states drawn from climatology.
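
The chain from absorption coefficient to weighting function, and the way a prescribed peak height fixes the channel amplitude, can be illustrated with a simplified stand-in for (10). Everything specific here is an assumption of the sketch: the purely exponential profile `a * exp(-z / H)` with no moisture dependence, the values of `Z_TOP` and `H`, and the three peak heights. For this profile, setting the derivative of the weighting function to zero gives `a * exp(-z / H) = 1 / H`, so a peak at height `z_star` requires `a = exp(z_star / H) / H`.

```python
import numpy as np

Z_TOP = 16.0   # height of the troposphere in km (hypothetical value)
H = 4.0        # e-folding depth of the absorption coefficient in km (hypothetical value)

def absorption(z, a):
    """Assumed absorption coefficient: exponential decay with height."""
    return a * np.exp(-z / H)

def optical_thickness(z, a):
    """Integral of the absorption coefficient from z to the top of the troposphere."""
    return a * H * (np.exp(-z / H) - np.exp(-Z_TOP / H))

def transmittance(z, a):
    """Beer's-law-like transmittance between z and the top of the troposphere."""
    return np.exp(-optical_thickness(z, a))

def weighting(z, a):
    """Vertical derivative of the transmittance."""
    return absorption(z, a) * transmittance(z, a)

def amplitude_for_peak(z_star):
    """Amplitude a for which the weighting function peaks at z_star."""
    return np.exp(z_star / H) / H

z = np.linspace(0.0, Z_TOP, 1601)                 # 0.01-km vertical grid
target_peaks = [2.0, 4.0, 8.0]                    # hypothetical channel heights in km
numerical_peaks = [z[np.argmax(weighting(z, amplitude_for_peak(zs)))]
                   for zs in target_peaks]
```

The numerically located maxima in `numerical_peaks` can be compared with `target_peaks` to confirm the amplitude rule for this assumed profile.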

Fig. 7.

(left) Optical thickness , (middle) transmittance , and (right) weighting function as a function of height z for the six channels , , , shown here (in an array of colors; quantities are nondimensional) for different climatological moisture values q.

The integrated brightness temperature at height associated with the wavelength λ is modeled by
e14
where is the ABL potential temperature and is the potential temperature at height z reconstructed by linear superposition of its first two baroclinic modes as described in Fig. 1. For our data assimilation experiments, we consider the top-of-the-atmosphere satellite brightness temperature–like measurements defined as . Figure 8 shows the top-of-the-atmosphere brightness temperature for the channel calculated from the climatology over a period of 65 days.
Fig. 8.

Nondimensional brightness temperature associated with the first channel, calculated from climatology over a period of 65 days.

d. Experimental design

The monsoon–Hadley multicloud model described in section 2, coupled with the modified LS-EnKF given by (4) and (9), forms our assimilation scheme. We will conduct OSSEs in both the perfect- and imperfect-model scenarios. In the perfect-model scenario, both the nature and forecast states are generated using the same model configuration (regime A; identical twin experiments). In the imperfect-model scenario, the nature run is generated using the reference parameters of regime A while the forecast model uses the parameters of regime B, as discussed in section 2b.

The analysis is performed over the 256 internal grid points of the meridional numerical domain, on the 14 large-scale fields . Thus the model state is of dimension . The stochastic cloud area fractions , , and of the convective parameterization are not filtered given the added algorithmic complexity of constraining the updated cloud states to the nonnegative range. Interested readers should consult the alternative method proposed in Janjić et al. (2014), which was designed to preserve the positivity and conserve mass. After analysis, the cloud heating rates , , and are calculated using the background cloud area fractions and the analyzed large-scale fields to enforce the convective closure balance of the multicloud parameterization.

The observations are generated from the nature run according to (2b), where the observation model G is the idealized radiative transfer model described in the previous section, and the observation error covariance matrix is diagonal with components equal to of the climatological variances. More precisely, G maps the mass field at an observation location to a brightness temperature–like measurement for the channel , . All 6 channels are observed on 64 uniformly distributed meridional locations (every 148 km or so) for a total of observations. Observations are assimilated at every 30 model integration steps (1.5 h).
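
The sizes implied by this design can be checked with a few lines of arithmetic. The numbers below are computed only from values stated in the text (256 grid points, 14 fields, 6 channels, 64 locations, 37-km grid spacing, 3-min time step, 30 steps between analyses); the variable names are ours.

```python
n_grid, n_fields = 256, 14
state_dim = n_grid * n_fields                      # 3584 analyzed state variables

n_channels, n_locations = 6, 64
n_obs = n_channels * n_locations                   # 384 observations per analysis

grid_spacing_km = 37
obs_spacing_km = (n_grid // n_locations) * grid_spacing_km   # every 4th grid point: 148 km

time_step_min, steps_per_cycle = 3, 30
cycle_hours = time_step_min * steps_per_cycle / 60           # 1.5 h between analyses
cycles_per_day = 24 / cycle_hours                            # 16 analyses per day
```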

The correlation data used to train the localization maps are obtained by running an OSSE using members for about 1350 days, using the last 90 days [or cycles, accounting for an analysis every 1.5 h] for training. At each cycle , we obtain correlation matrices and for values of K ranging from 10 to 35. The localization maps are obtained from solving the least squares problem in (8) using (results using other values of ρ will be reported in some cases). This means that we correct each correlation between a model state (say at the equator) and an observation (radiance at some remote station) using a linear combination of the correlations between that observation and like-field model states () located within 6 grid points from the model state location (the equator). In Fig. 9, we show the vector localization maps for the correlations between the model coordinates i of the field and a channel-1 brightness temperature observation j located on the green point (near the equator). Each panel of this figure corresponds to the resulting map optimized for different K. Each vertical slice of the contour plot in each panel consists of 13 values () obtained from solving the regression problem in (8). For the observation at location j (green dot), we solve the regression problems corresponding to the model grid points i that satisfy , as shown in the horizontal axis in each panel of Fig. 9. This choice is to avoid excessive computational storage and to ensure that we can capture the nontrivial nonlocal structure of the correlations.

Fig. 9.

Contour plot of the vector localization map for the correlations between the model coordinates i of the field and a channel-1 brightness temperature observation j located at the green point (near the equator). The contour value at the coordinates corresponds to the component of the vector map . For a fixed i and j, the vector components of (along the y axis) are zero outside the radius of the observation location.

Two features of Fig. 9 are worth noting. First, the structure of the map is not symmetric with respect to the observation location. Second, the function value of the map for larger (or the support) increases as a function of K. This is consistent with the results in a simpler context (De La Chevrotière and Harlim 2017). In fact, if in (8), the resulting map is one for and zero otherwise for all . This means that for the case of , the map does not provide any localization. If this map is used for filtering with , then the filter will diverge. We will support this argument with numerical results below.

Although the focus will be on the vector maps obtained with , we will also show results of the scalar map () as reference. In Fig. 10, we show examples of the scalar localization maps optimized for compared to the Gaspari–Cohn function with half-width parameter equal to 6. Notice that the amplitudes and supports of the data-driven maps vary as functions of the observation location and variable.

Fig. 10.

The GC function and localization map for the correlation between the observed brightness temperature of channel 1 observed at three different locations (roughly 14°S, 0°, and 12°N) and the model variables (a) and (b) .

The accuracy of the assimilation is measured out of sample (on a verification interval that is independent of the training data) by the time mean of the RMS error between the analysis ensemble mean, , and the truth run, :
eq1
where , the length of the verification interval, is set to 1 year or 365 days ( analysis cycles).
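
One plausible way to compute this metric for a single field is sketched below. The exact normalization of the displayed formula is not reproduced here, so the choice of averaging squared errors over grid points before taking the square root, and the array layout, are assumptions of this example.

```python
import numpy as np

def time_mean_rmse(analysis_mean, truth):
    """Time mean of the spatial RMS error for one field.

    analysis_mean, truth: arrays of shape (T, n_grid), one row per analysis cycle.
    """
    rmse_per_cycle = np.sqrt(np.mean((analysis_mean - truth) ** 2, axis=1))
    return rmse_per_cycle.mean()

rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 256))
analysis_mean = truth + 0.5                 # toy analysis with a constant 0.5 offset
score = time_mean_rmse(analysis_mean, truth)
```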

4. Numerical results

We now present the numerical results for OSSEs realized in the perfect- and imperfect-model scenarios. In the perfect-model experiments, both the truth and forecast states are simulated with regime A. In the imperfect-model experiments, the forecast model and the truth are simulated using regimes B and A, respectively (see Table 2 for details). Recall that regime B differs from regime A by its slightly larger convective parameterization time scales. The goal of the imperfect-model experiments is to test the robustness of the localization maps in the presence of model error.

Table 2.

Description of the experiments in section 4.

In the first numerical experiment, our goal is to check the sensitivity of the filter estimates on parameter ρ. To do this, we train the maps for for the case of . In Fig. 11, we compute the average relative degradation as a function of ρ. Here, the average relative degradation is defined as follows:
eq2
where denotes the RMSE for the jth variable (of the 14 components) and obtained using localization map with parameter ρ. The minimum is taken over the range of . Based on this metric, one can see that the relative degradations for are on average 10% less than those of . Based on this empirical result, we will focus on the case of scalar map and vector map in the remainder of this paper.
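
A possible reading of this metric, given here only as a sketch, is the mean over the 14 fields of the RMSE excess relative to the best RMSE attained for that field over all tested values of ρ. The displayed definition is not reproduced in this text, so both the normalization by the minimum and the table layout of `rmse_table` are assumptions, and the numbers are made up.

```python
import numpy as np

def average_relative_degradation(rmse_table):
    """rmse_table: (n_rho, n_fields) array of time-mean RMSEs.

    Returns one value per tested rho: the field-averaged relative excess of
    the RMSE over the smallest RMSE obtained for that field.
    """
    best = rmse_table.min(axis=0)                       # best RMSE per field
    return ((rmse_table - best) / best).mean(axis=1)

rmse_table = np.array([[1.0, 2.0],                       # made-up RMSEs: 3 values of rho, 2 fields
                       [1.1, 2.0],
                       [1.0, 3.0]])
degradation = average_relative_degradation(rmse_table)
```

Under this reading, a value of zero means that the given ρ is the best choice for every field simultaneously.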
Fig. 11.

Average relative degradation as a function of the parameter ρ for the perfect-model and model error experiments.

a. Perfect-model experiments

We first realize a perfect-model experiment using 1000 members (TD_PM), calculating the background correlation at each analysis cycle. Using this dataset we obtain two different maps: 1) a scalar map with and 2) a vector map with . Examples of these two maps are shown in Figs. 9 and 10. We summarize these (and the model error) experiment configurations in Table 2. The results of the perfect-model verification experiments using these two maps (labeled PM and ) are shown in Fig. 12. We report the time mean analysis RMSE over a 1-yr verification interval as a function of ensemble size for the 14 analysis fields. For reference, a verification experiment using a GC localization with a half-width equal to 6 is included. Figure 12 also reports the RMSE of the experiment TD_PM and the climatological standard deviation, which quantifies the error without data assimilation.

Fig. 12.

Perfect-model experiments. Log-linear plot of the time mean of analysis RMSE as a function of ensemble size is shown for the 14 analyzed fields, for the verification experiments using the localization maps PM with and (dashed black). The training experiment TD_PM using 1000 members is reported as a baseline (solid blue).

Comparing the scalar map PM with the vector map PM and the GC localization, we find that the performance of the vector map PM is the closest to the training experiment TD_PM realized with 1000 members. The improvement of the vector map over the scalar map is most marked for the fields and . In fact, we observe that, of all fields, and have the highest RMSEs relative to climatology. Going back to the training data TD_PM, an inspection of the autocorrelation function for and reveals that their signal is still strongly correlated over the 90-day training interval, which is a violation of the stationarity assumption of the cost function formulation in (7). We should also point out that the dynamics of these variables are almost constant (since they are slow) within the observation time scales (1.5 h), which means that the linearized dynamics are close to identity (or marginally unstable with eigenvalues close to 1). The basic Kalman filter theory suggests that the observability and controllability conditions are necessary for accurate estimation (Kalman 1960). In our case, the observability condition is most likely violated since we do not observe and directly. The fact that the vector map PM improves the estimates relative to the scalar map PM can be due to an improved controllability condition of the filter. We should point out that an analogous finding (inaccurate estimate) was also reported by Tardif et al. (2014) in the context of assimilating purely atmospheric data at frequent times in a coupled atmosphere–ocean data assimilation. To resolve this issue, they proposed to change the observation function and reduce the observation frequency, which effectively improves the observability condition (as an alternative to improving the controllability condition). Although both filtering problems considered here and in Tardif et al. (2014) are nonlinear, the classical linear filtering theory seems to give a plausible explanation for the results. We also note that GC performs the worst among these numerical experiments. In fact, GC numerically blows up when the ensemble size is too small (10 ensemble members in our experiment).

b. Model error experiments

We next investigate the performance of the localization maps in the presence of model error. We first run a model error experiment using 1000 members, producing the training dataset TD_ME. We train two different maps on TD_ME’s background correlations to be used in verification model error experiments (labeled ME2): 1) A scalar map () and 2) a vector map using .

As a reference, we show the model error experiment ME1 using a “perfectly tuned” vector map (), that is, a map trained on the perfect-model training data TD_PM. While this configuration is not practical in real applications since one does not have the knowledge of the true dynamical parameters (regime A in this example), this experiment provides, as we will discuss below, an intuition of how sensitive the proposed method is to the model error in the training dataset.

The results of the model error verification experiments using the maps ME1 and ME2 as well as a GC localization (with a half-width equal to 6) are shown in Fig. 13, along with the RMSE of the two reference training experiments TD_PM and TD_ME. The perfect-model experiment PM is added for the sake of comparison. Overall, GC performs the worst, with numerical blowup at . The scalar map ME2 produces improved filter estimates over GC except on but it converges for the case of . The vector map ME2 beats these two cases on all counts. Also, the filtering skills of the vector maps ME1 and ME2 are visually indistinguishable. For some of the fields, most notably for , , , and , their skills are close to that of the perfect-model vector map experiment PM . Interestingly, in some cases (e.g., , and ) the vector maps ME2 outperform their own training experiment TD_ME, especially when the ensemble size is large. This result is reminiscent of the finding of Oke et al. (2007) in a simpler context with a perfect and linear model scenario. In particular, they found that localization in the EnKF can improve the effective rank of the ensemble and outperform an EnKF with a large ensemble size and no localization.

Fig. 13.

Model error experiments. Log-linear plot of the time mean of analysis RMSE as a function of ensemble size is shown for the 14 analyzed fields, for the verification experiments using the localization maps ME1 and ME2. The training experiments TD_ME and TD_PM using 1000 members are reported as baselines (solid blue).

While the results from the vector maps ME2 are almost indistinguishable from those of ME1 in almost all cases, we should note that the results of ME2 can be sensitive to the training data. In particular, in one of the cases with larger K, we found that the map obtained from the specific training dataset produces divergent filter estimates. In this case, we found that the issue can be overcome by using a map trained on a longer dataset (6 months rather than 3 months; see the red markers in Fig. 13). This sensitivity issue, which occurs in the experiments with larger K, can be understood as follows. First, the support of the resulting map grows as K increases, as shown in Fig. 9. Second, it is generally difficult to estimate correlations between two random variables that have small true correlations. Even in the simplest case, it is well known that the empirical correlation estimate of independent and identically distributed Gaussian variables with true correlation ρ has an error variance whose leading-order term is given by Hotelling (1953). This means that the estimates for which the true correlations are small have large error variances. On the other hand, accurate estimates of the correlations, especially those corresponding to indices i and j that are far apart (which matter for maps with larger support), are crucial in the regression in (8) for large K. Since the error of the Monte Carlo approximation (8) of (7) is proportional to the square root of the ratio between the variance of the integrand and the size of the data T, a larger T is required to offset the larger variances. This is why the simulations with a longer training dataset can overcome the issue. Third, the condition numbers of the least squares problems in (8) are large [see, e.g., section 3.3 of Demmel (1997) for the definition of the condition number for linear least squares problems]. This means that the proposed least squares method is ill-conditioned, and thus a small perturbation to the data (caused by model error) can yield an order-one relative error in the resulting vector map. Indeed, when we compare the vector maps for this case that are trained on different training intervals (one that gives an accurate analysis and another for which the filter does not converge), the relative error of the resulting vector maps (in the uniform norm) is on the order of 1.
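The second point can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the standard large-sample result that the variance of the sample correlation of n i.i.d. bivariate Gaussian pairs is approximately (1 − ρ²)²/n, and compares it with a Monte Carlo estimate. The function names, the sample size n = 40, and the number of trials are illustrative choices.

```python
import numpy as np

def correlation_error_variance(rho, n, trials=20000, seed=0):
    """Monte Carlo variance of the sample correlation of n i.i.d. Gaussian pairs."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((trials, n))
    z = rng.standard_normal((trials, n))
    # y has unit variance and true correlation rho with x.
    y = rho * x + np.sqrt(1.0 - rho**2) * z
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))
    return r.var()

def leading_order_variance(rho, n):
    """Assumed leading-order term of the error variance: (1 - rho^2)^2 / n."""
    return (1.0 - rho**2) ** 2 / n

if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        mc = correlation_error_variance(rho, 40)
        lo = leading_order_variance(rho, 40)
        print(f"rho={rho:.1f}  Monte Carlo={mc:.5f}  leading order={lo:.5f}")
```

Under these assumptions, the error variance is largest at ρ = 0 and shrinks as |ρ| approaches 1, which is the behavior the argument above relies on.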

If a longer dataset is not available, we numerically found that one can also overcome this issue by choosing a different ρ (results are not shown). We should also mention that, for a given regression target, one can generate more samples of the K-member empirical correlation in (7). Specifically, one can construct these samples using different choices of K ensemble members among the available L training ensemble solutions. In this manuscript, we regress each target to only one such sample. The point we want to make is that one can increase the training data by regressing each target to multiple samples constructed from different subsets of the training ensemble members.
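The subsampling idea can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the paper: the function name `subset_correlations`, the synthetic L-member training ensemble, and the choice of random subsets drawn without replacement are all hypothetical.

```python
import numpy as np

def subset_correlations(x_train, y_train, k, n_subsets, seed=0):
    """Return n_subsets sample correlations, each from a random K-member subset.

    x_train, y_train: length-L arrays holding the training ensemble values of
    two scalar quantities (e.g., a model variable and an observed variable).
    """
    rng = np.random.default_rng(seed)
    n_members = x_train.shape[0]
    samples = np.empty(n_subsets)
    for s in range(n_subsets):
        # Pick K distinct members among the L available training members.
        idx = rng.choice(n_members, size=k, replace=False)
        samples[s] = np.corrcoef(x_train[idx], y_train[idx])[0, 1]
    return samples

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    L = 1000                                  # training ensemble size (illustrative)
    x = rng.standard_normal(L)
    y = 0.3 * x + np.sqrt(1 - 0.3**2) * rng.standard_normal(L)
    full = np.corrcoef(x, y)[0, 1]            # large-ensemble correlation
    small = subset_correlations(x, y, k=20, n_subsets=50)
    print("L-member correlation:", full)
    print("mean and spread of K-member samples:", small.mean(), small.std())
```

Each call yields many small-ensemble samples for the same large-ensemble value, so the regression would see more training pairs without requiring a longer time series.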

Alternatively, one can use the maps optimized for smaller ensemble sizes. As a supporting argument, we test the robustness of the localization vector maps by running model error verification experiments with an ensemble size K, using a map ME2 optimized for a possibly different training ensemble size. The resulting RMSEs are reported in Fig. 14. The case in which the training and verification ensemble sizes coincide is identical to the experiment ME2 in Fig. 13. The results in Fig. 14 reveal that the filter fails systematically when the training ensemble size is greater than the verification ensemble size. This result is consistent with our explanation above. That is, since the map’s support increases as a function of the training ensemble size, the failure in this case is partially due to spurious long-range correlations that are not damped out by the map with larger support. On the other hand, except for the free-troposphere barotropic zonal wind and the ABL zonal wind, the filtered estimates degrade monotonically if one uses maps trained on smaller ensemble sizes to filter with an ensemble of size K. In this case, the maps trained with smaller ensemble sizes overly damp out the correlations since they have smaller supports. For the two zonal wind fields, the sensitivity is more difficult to predict since the filtering problem is in a difficult regime for these two variables, as explained in section 4a, in addition to the highly correlated training data.

Fig. 14.

Model error experiments. Time mean of analysis RMSE as a function of training and verification ensemble sizes for experiment ME2. White pixels indicate filter divergence.


The analysis and truth for the experiment ME2 are compared in a series of plots in Figs. 15 and 16. A snapshot of the meridional profiles of the zonal wind u, potential temperature θ, and total heating H is plotted in Fig. 15, with the velocity field components υ and w superimposed. Although the analyzed velocity field is not well recovered, the analysis fields u and θ appear to be closer to the true state. We should note here that the data assimilation experiments are performed in the presence of model error and that, among these variables, only θ is observed indirectly through (14). The total heating H, calculated from the three nonassimilated heating rate modes, is expected to be harder to recover than the fully assimilated fields θ and u. Figure 16 contains the Hovmöller plots of the indirectly observed free-tropospheric moisture q and the nonassimilated deep cloud area fraction. The analyzed q compares well with the true state, but the filter estimate of the deep cloud area fraction contains errors relative to the true field.

Fig. 15.

Snapshot of the meridional circulation of (a) the truth and (b) the analysis for the verification experiment ME2. The snapshot is taken at an arbitrary assimilation cycle. The contours represent the indicated fields, and the arrows are the velocity vector field (υ, w).


Fig. 16.

25-day Hovmöller plots of (a) the truth and (b) the analysis for the verification experiment ME2. (left) Free-tropospheric moisture and (right) deep cloud area fraction.


5. Summary and conclusions

In this paper, we demonstrated the efficacy of the localization maps introduced in De La Chevrotière and Harlim (2017) in a series of OSSEs realized with the monsoon–Hadley multicloud model, an idealized model with roughly 3600 model coordinates for the synoptic-scale Hadley circulation and monsoonal flow. The model features a stochastic parameterization for clouds to represent the subgrid-scale processes of convection and precipitation and a bulk boundary layer dynamical model. We implemented the localization maps in a serial EnKF to assimilate satellite-like nonlinear indirect observations using an idealized radiative transfer model. We took vertically integrated brightness temperature measurements on 6 different channels over a sparse observational network (the total number of observations is close to 400).

From the perfect-model configuration, we learn that the data-driven localization map with small ensemble sizes produced analysis estimates closer to those obtained from the EnKF with much larger ensemble sizes than the other methods in our numerical experiments did, provided that the training data are close to stationary. Of all 14 analyzed fields, the filter has the most difficulty recovering the free-troposphere barotropic zonal wind and the ABL zonal wind, since these two slow variables, which are not directly observed, are in a difficult filtering regime, as explained in section 4a. As a consequence, the training data of these two fields are highly correlated compared to the other fields. This means that for these two variables, the assumption underlying the training strategy, that is, stationarity of the correlation distribution, is violated. Nevertheless, when the ensemble size is extremely small, the numerical results suggest that the vector localization map, which uses information from adjacent spatial correlations to improve the correlation estimates, is more robust than the scalar localization map, which is analogous to the usual Schur product-based localization function. In fact, our numerical results showed consistent improvement over the usual Gaspari–Cohn localization, especially when small ensemble sizes are used.

We also checked the proposed localization mapping in the presence of model error arising from misspecification of the convective time scales, which impacts the stochastic dynamics of the cloud area fractions and in turn affects the large scales through the convective closure of the model. In this scenario, we found that the filter performance using the localization maps obtained from imperfect-model training data is almost identical to that using the localization maps obtained from perfect-model training data. For some variables, most of which are not observed, we found that the filter with the localization mapping outperforms the imperfect-model data assimilation that generated its own training data. This result is possibly due to an improved effective ensemble size with the proposed localization mapping, which is analogous to the finding by Oke et al. (2007) in a simpler context.

Closer inspection reveals that the proposed least squares fitting in (8) is an ill-conditioned problem with large condition numbers. This suggests that the quality of the dataset is important for accurate estimation of the maps. While this is not a desirable feature, we found that this issue (which we encountered in one of the cases with larger ensemble size) can be overcome by using a longer training dataset, which offsets the larger Monte Carlo error variance in the correlation estimates between observations and model variables separated by large distances. Several strategies were proposed to overcome this issue when a longer dataset is not available. We numerically verified one of these strategies, namely, using the maps optimized for smaller ensemble sizes.
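The effect of ill-conditioning on a least squares fit can be illustrated with a small synthetic problem. This sketch is not the regression in (8): the matrix is built with prescribed singular values, the condition number of 10^8 is an arbitrary illustrative value, and the data perturbation is deliberately aligned with the least well-determined direction (a worst case).

```python
import numpy as np

def worst_case_sensitivity(cond=1e8, m=50, n=5, seed=0):
    """Relative change in a least squares solution under a tiny data perturbation."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))  # m x n, orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))  # n x n, orthogonal
    s = np.logspace(0.0, -np.log10(cond), n)          # singular values from 1 to 1/cond
    A = (U * s) @ V.T                                 # A = U diag(s) V^T

    x = V[:, 0]                                       # reference solution, unit norm
    b = A @ x
    # Perturb the data along the left singular vector of the smallest singular value.
    db = (1.0 / cond) * np.linalg.norm(b) * U[:, -1]

    x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
    x_pert = np.linalg.lstsq(A, b + db, rcond=None)[0]

    rel_data = np.linalg.norm(db) / np.linalg.norm(b)
    rel_sol = np.linalg.norm(x_pert - x_ref) / np.linalg.norm(x_ref)
    return rel_data, rel_sol, np.linalg.cond(A)

if __name__ == "__main__":
    rel_data, rel_sol, kappa = worst_case_sensitivity()
    print(f"condition number       : {kappa:.2e}")
    print(f"relative data change   : {rel_data:.2e}")
    print(f"relative solution change: {rel_sol:.2e}")
```

In this construction, a relative data perturbation on the order of the inverse condition number produces an order-one relative change in the fitted coefficients, which mirrors the sensitivity described above.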

The numerical results from the scalar map are better than those from the standard GC localization. In particular, the filter with scalar maps does not blow up in the case of small ensemble sizes, where the filter with GC does. The major difference between these two localization techniques is that the scalar maps have nonuniform bandwidths while the GC localization uses a uniform bandwidth function. From a practical standpoint, these results are encouraging since specifying nonuniform bandwidths for the GC localization function is not trivial. On the other hand, the scalar maps are trained by setting a single parameter. Furthermore, the cost of training the scalar map is less than that of the vector map, and the scalar maps are less sensitive to the training data than the vector maps (at least, we did not encounter the sensitivity issue seen with the vector maps in our numerical experiments). For the vector map, besides the sensitivity issue, an adequate choice of the parameter ρ is needed to see the improvement, as shown in our numerical example.
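For reference, the uniform-bandwidth GC localization mentioned above can be sketched as follows. This uses the commonly cited fifth-order piecewise rational function of Gaspari and Cohn (1999) applied as a Schur (elementwise) product to a sample covariance. The one-dimensional grid, the half-width parameter `c`, and the function names are assumptions made for illustration and are not taken from the experiments in this paper.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Fifth-order piecewise rational GC taper; it vanishes for |dist| >= 2c."""
    z = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    w = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi = z[inner]
    w[inner] = -0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3 - (5.0 / 3.0) * zi**2 + 1.0
    zo = z[outer]
    w[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return w

def localize_covariance(sample_cov, grid, c):
    """Schur product of a sample covariance with a uniform-bandwidth GC taper."""
    dist = np.abs(grid[:, None] - grid[None, :])
    return gaspari_cohn(dist, c) * sample_cov

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.arange(30, dtype=float)
    ensemble = rng.standard_normal((10, grid.size))   # 10 members, 30 grid points
    sample_cov = np.cov(ensemble, rowvar=False)
    localized = localize_covariance(sample_cov, grid, c=3.0)
    # Entries separated by more than 2c are set exactly to zero.
    print(localized[0, :8])
```

The same half-width `c` is applied to every pair of grid points here, which is the uniform-bandwidth property contrasted with the nonuniform bandwidths of the scalar maps.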

The encouraging results in this paper suggest that this data-driven localization mapping can scale to high-dimensional applications, replacing the usual distance-based parametric-type localization function that is designed for spatial correlations that are local. The nonparametric nature of this approach allows the data to flexibly determine the appropriate nontrivial shape of the localization maps, including various nonlocal correlation dependences, which are usually ignored with the standard localization. One potential issue is the availability of a high-quality training dataset, since generating training data without any localization (as done in this paper) is not possible for atmospheric global circulation models at this point. However, one can train the localization maps using empirical correlations obtained from large-ensemble data assimilation simulations with a very broad localization range, such as those demonstrated in Miyoshi et al. (2014). Another issue is the availability of a high-quality training dataset in the presence of more severe modeling error, beyond the parameter misspecification considered in this paper. In this situation, one may need more advanced model error estimation techniques (Harlim 2017) to generate a reliable training dataset. Another challenge in the operational setting is that the atmospheric dynamics are intermittent and seasonal. In addition, various types of observations are usually assimilated. It remains to be seen whether the idea in this paper can be used to train the localization maps for observations that have nontrivial nonlocal correlation structures and whether the improvement is substantial enough to offset the cost of the training procedure.

Acknowledgments

The authors thank Dr. Peter Houtekamer for his careful reading of the manuscript and insightful comments. The research of J. H. is partially supported by the ONR Grant N00014-16-1-2888 and the NSF Grants DMS-1317919 and DMS-1619661.

REFERENCES

  • Abhik, S., M. Halder, P. Mukhopadhyay, X. Jiang, and B. N. Goswami, 2013: A possible new mechanism for northward propagation of boreal summer intraseasonal oscillations based on TRMM and MERRA reanalysis. Climate Dyn., 40, 1611–1624, https://doi.org/10.1007/s00382-012-1425-x.
  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, https://doi.org/10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.
  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642, https://doi.org/10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.
  • Anderson, J. L., 2007a: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224, https://doi.org/10.1111/j.1600-0870.2006.00216.x.
  • Anderson, J. L., 2007b: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. Physica D, 230, 99–111, https://doi.org/10.1016/j.physd.2006.02.011.
  • Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation. Mon. Wea. Rev., 140, 2359–2371, https://doi.org/10.1175/MWR-D-11-00013.1.
  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758, https://doi.org/10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.
  • Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436, https://doi.org/10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.
  • Chen, Y., and D. S. Oliver, 2010: Cross-covariances and localization for EnKF in multiphase flow data assimilation. Comput. Geosci., 14, 579–601, https://doi.org/10.1007/s10596-009-9174-6.
  • De La Chevrotière, M., 2015: Stochastic and numerical models for tropical convection and Hadley-monsoon dynamics. Ph.D. thesis, University of Victoria, 233 pp.
  • De La Chevrotière, M., and J. Harlim, 2017: A data-driven method for improving the correlation estimation in serial ensemble Kalman filters. Mon. Wea. Rev., 145, 985–1001, https://doi.org/10.1175/MWR-D-16-0109.1.
  • De La Chevrotière, M., and B. Khouider, 2017: A zonally symmetric model for the monsoon-Hadley circulation with stochastic convective forcing. Theor. Comput. Fluid Dyn., 31, 89–110, https://doi.org/10.1007/s00162-016-0407-8.
  • De La Chevrotière, M., B. Khouider, and A. J. Majda, 2016: Stochasticity of convection in Giga-LES data. Climate Dyn., 47, 1845–1861, https://doi.org/10.1007/s00382-015-2936-z.
  • Demmel, J. W., 1997: Applied Numerical Linear Algebra. Society for Industrial and Applied Mathematics, xi + 416 pp., https://doi.org/10.1137/1.9781611971446.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, https://doi.org/10.1029/94JC00572.
  • Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivar. Anal., 98, 227–255, https://doi.org/10.1016/j.jmva.2006.08.003.
  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, https://doi.org/10.1002/qj.49712555417.
  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, https://doi.org/10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.
  • Harlim, J., 2017: Model error in data assimilation. Nonlinear and Stochastic Climate Dynamics, C. L. E. Franzke and T. J. O’Kane, Eds., Cambridge University Press, 276–317, https://doi.org/10.1017/9781316339251.011.
  • Hastie, T., R. Tibshirani, and J. Friedman, 2009: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Vol. 1, Springer Series in Statistics, Springer, 745 pp., https://doi.org/10.1007/978-0-387-84858-7.
  • Hotelling, H., 1953: New light on the correlation coefficient and its transforms. J. Roy. Stat. Soc., 15B (2), 193–232.
  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811, https://doi.org/10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.
  • Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 131, 3269–3289, https://doi.org/10.1256/qj.05.135.
  • Houtekamer, P. L., and F. Zhang, 2016: Review of the ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 144, 4489–4532, https://doi.org/10.1175/MWR-D-15-0440.1.
  • Janjić, T., D. McLaughlin, S. E. Cohn, and M. Verlaan, 2014: Conservation of mass and preservation of positivity with ensemble-type Kalman filter algorithms. Mon. Wea. Rev., 142, 755–773, https://doi.org/10.1175/MWR-D-13-00056.1.
  • Johnson, R. H., T. M. Rickenbach, S. A. Rutledge, P. E. Ciesielski, and W. H. Schubert, 1999: Trimodal characteristics of tropical convection. J. Climate, 12, 2397–2418, https://doi.org/10.1175/1520-0442(1999)012<2397:TCOTC>2.0.CO;2.
  • Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. J. Basic Eng., 82, 35–45, https://doi.org/10.1115/1.3662552.
  • Katsoulakis, M. A., A. J. Majda, and D. G. Vlachos, 2003: Coarse-grained stochastic processes and Monte Carlo simulations in lattice systems. J. Comput. Phys., 186, 250–278, https://doi.org/10.1016/S0021-9991(03)00051-2.
  • Khouider, B., and A. J. Majda, 2006a: Model multi-cloud parameterizations for convectively coupled waves: Detailed nonlinear wave evolution. Dyn. Atmos. Oceans, 42, 59–80, https://doi.org/10.1016/j.dynatmoce.2005.12.001.
  • Khouider, B., and A. J. Majda, 2006b: Multicloud convective parametrizations with crude vertical structure. Theor. Comput. Fluid Dyn., 20, 351–375, https://doi.org/10.1007/s00162-006-0013-2.
  • Khouider, B., and A. J. Majda, 2006c: A simple multicloud parameterization for convectively coupled tropical waves. Part I: Linear analysis. J. Atmos. Sci., 63, 1308–1323, https://doi.org/10.1175/JAS3677.1.
  • Khouider, B., and A. J. Majda, 2007: A simple multicloud parameterization for convectively coupled tropical waves. Part II: Nonlinear simulations. J. Atmos. Sci., 64, 381–400, https://doi.org/10.1175/JAS3833.1.
  • Khouider, B., and A. Majda, 2008: Multicloud models for organized tropical convection: Enhanced congestus heating. J. Atmos. Sci., 65, 895–914, https://doi.org/10.1175/2007JAS2408.1.
  • Khouider, B., A. Majda, and M. Katsoulakis, 2003: Coarse-grained stochastic models for tropical convection and climate. Proc. Natl. Acad. Sci. USA, 100, 11 941–11 946, https://doi.org/10.1073/pnas.1634951100.
  • Khouider, B., J. Biello, and A. J. Majda, 2010: A stochastic multicloud model for tropical convection. Commun. Math. Sci., 8, 187–216, https://doi.org/10.4310/CMS.2010.v8.n1.a10.
  • Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203, https://doi.org/10.1256/qj.02.132.
  • Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Vol. 1, Shinfield Park, Reading, United Kingdom, 18 pp., https://www.ecmwf.int/sites/default/files/elibrary/1995/10829-predictability-problem-partly-solved.pdf.
  • Mapes, B., S. Tulich, J. Lin, and P. Zuidema, 2006: The mesoscale convection life cycle: Building block or prototype for large-scale tropical waves? Dyn. Atmos. Oceans, 42, 3–29, https://doi.org/10.1016/j.dynatmoce.2006.03.003.
  • Miyoshi, T., K. Kondo, and T. Imamura, 2014: The 10,240-member ensemble Kalman filtering with an intermediate AGCM. Geophys. Res. Lett., 41, 5264–5271, https://doi.org/10.1002/2014GL060863.
  • Oke, P. R., P. Sakov, and S. P. Corney, 2007: Impacts of localisation in the EnKF and EnOI: Experiments with a small model. Ocean Dyn., 57, 32–45, https://doi.org/10.1007/s10236-006-0088-8.
  • Strang, G., 1968: On the construction and comparison of difference schemes. SIAM J. Numer. Anal., 5, 506–517, https://doi.org/10.1137/0705041.
  • Tardif, R., G. J. Hakim, and C. Snyder, 2014: Coupled atmosphere–ocean data assimilation experiments with a low-order climate model. Climate Dyn., 43, 1631–1643, https://doi.org/10.1007/s00382-013-1989-0.
  • Waite, M. L., and B. Khouider, 2009: Boundary layer dynamics in a simple model for convectively coupled gravity waves. J. Atmos. Sci., 66, 2780–2795, https://doi.org/10.1175/2009JAS2871.1.
  • Waller, J. A., S. L. Dance, A. S. Lawless, and N. K. Nichols, 2014: Estimating correlated observation error statistics using an ensemble transform Kalman filter. Tellus, 66A, 23294, https://doi.org/10.3402/tellusa.v66.23294.
  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, https://doi.org/10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.