Optimal Spectral Decomposition (OSD) for Ocean Data Assimilation

Peter C. Chu Naval Ocean Analysis and Prediction Laboratory, Department of Oceanography, Naval Postgraduate School, Monterey, California

Robin T. Tokmakian Naval Ocean Analysis and Prediction Laboratory, Department of Oceanography, Naval Postgraduate School, Monterey, California

Chenwu Fan Naval Ocean Analysis and Prediction Laboratory, Department of Oceanography, Naval Postgraduate School, Monterey, California

L. Charles Sun National Oceanographic Data Center, Silver Spring, Maryland


Abstract

Optimal spectral decomposition (OSD) is applied to ocean data assimilation with variable (temperature, salinity, or velocity) anomalies (relative to background or modeled values) decomposed into generalized Fourier series, such that any anomaly is represented by a linear combination of products of basis functions and corresponding spectral coefficients. The method has three steps: 1) determination of the basis functions, 2) optimal mode truncation, and 3) update of the spectral coefficients from the innovation (observational increment). The basis functions, which depend only on the topography of the ocean basin, are the eigenvectors of the Laplacian operator with the same lateral boundary conditions as the assimilated variable anomalies. The Vapnik–Chervonenkis dimension is used to determine the optimal mode truncation. The model field is then updated from the innovation by solving a set of linear algebraic equations for the spectral coefficients. The strengths and weaknesses of the OSD method are demonstrated through a twin experiment using the Parallel Ocean Program (POP) model.

Corresponding author address: Peter Chu, Department of Oceanography, Naval Postgraduate School, 833 Dyer Road, RM SP-328, Monterey, CA 93943-5122. E-mail: pcchu@nps.edu

1. Introduction

Data assimilation is required for operational ocean studies and maneuvers (Sun 1999), and has contributed significantly to the success of ocean modeling and prediction. In a numerical ocean model, a single variable or all the model variables c (whether two- or three-dimensional) can be ordered by grid point and by variable, forming a single vector of length NP, with N as the total number of grid points and P as the number of variables. For multiple model variables, nondimensionalization is conducted before forming the single vector c. Existing data assimilation blends modeled (or background) fields (cb) (usually at the grid points) with observational data (co) (usually not at the grid points) of any ocean variable (Cohn 1997; Tang and Kleeman 2004; Chu et al. 2004b; Galanis et al. 2006; Lozano et al. 1996),
$$\mathbf{c}_a = \mathbf{c}_b + \mathbf{W}\left[\mathbf{c}_o - H(\mathbf{c}_b)\right], \tag{1}$$
to represent the (unknown) “truth” ct with an analysis error,
$$\boldsymbol{\varepsilon}_a = \mathbf{c}_a - \mathbf{c}_t. \tag{2}$$
Here, ca is the assimilated field (analysis field); H is an operator that provides the model’s estimate at the observational points; W is the weight matrix; and d = [co − H(cb)] is the innovation (observational increment) (Fig. 1). Various data assimilation schemes, such as optimal interpolation (OI), the Kalman filter, and variational methods [three- and four-dimensional variational data assimilation (3DVAR and 4DVAR)], have been developed and were given a unified notation by Ide et al. (1997). They differ in how the weight matrix W is determined. For example, minimization of the cost function in the OI gives the weight matrix (e.g., Bretherton et al. 1976; Lozano et al. 1996),
$$\mathbf{W} = \mathbf{B}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}. \tag{3}$$
The minimization of the analysis error covariance in the Kalman filter (Galanis et al. 2006) leads to
$$\mathbf{W}_i = \mathbf{P}^{f}(t_i)\,\mathbf{H}_i^{T}\left[\mathbf{H}_i\mathbf{P}^{f}(t_i)\,\mathbf{H}_i^{T} + \mathbf{R}_i\right]^{-1}. \tag{4}$$
Here, B and P^f are the background error covariance matrices, where P^f is also called the forecast projection matrix by some authors; R (Ri at time ti) is the observational error covariance matrix; and t is time. Despite some differences in form, (3) and (4) are identical. The most significant challenge for the existing data assimilation methods is the determination of the background error covariance matrix B (or forecast projection matrix P^f) for the OI and 3DVAR (or Kalman filter), since B and P^f are enormous matrices that are difficult to estimate because of uncertain tunable parameters, inhomogeneous and anisotropic structures, and complex boundaries in oceans.
Fig. 1.

Illustration of ocean data assimilation with cb located at the grid points and co located at the points “*.” Ocean data assimilation converts the innovation, d = co − H(cb), from the observational points to the grid points.

Citation: Journal of Atmospheric and Oceanic Technology 32, 4; 10.1175/JTECH-D-14-00079.1
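
The blending of background and observations described in the Introduction can be sketched numerically. The snippet below is a minimal illustration, assuming the standard OI gain W = B Hᵀ (H B Hᵀ + R)⁻¹ and a linear interpolation matrix H; the function name `analysis_update` and the toy matrices `B` and `R` are hypothetical and are not taken from the paper.

```python
import numpy as np

def analysis_update(c_b, c_o, H, B, R):
    """Return c_a = c_b + W (c_o - H c_b) with W = B H^T (H B H^T + R)^{-1}."""
    d = c_o - H @ c_b                      # innovation at the observation points
    S = H @ B @ H.T + R                    # innovation covariance (M x M)
    # Solve S y = d instead of forming the inverse explicitly.
    return c_b + B @ H.T @ np.linalg.solve(S, d)

# Toy setup: 3 grid points, 1 observation midway between grid points 0 and 1.
c_b = np.array([10.0, 12.0, 14.0])         # background field on the grid
H = np.array([[0.5, 0.5, 0.0]])            # linear interpolation to the observation
c_o = np.array([12.0])                     # observed value
B = np.eye(3)                              # hypothetical background error covariance
R = 0.5 * np.eye(1)                        # hypothetical observational error covariance
c_a = analysis_update(c_b, c_o, H, B, R)
```

In this toy case the innovation is 1, and with an identity B the correction stays on the two grid points that bracket the observation. How far a correction spreads is controlled entirely by B, which is the matrix that is hard to specify in practice.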

In standard OI, the covariance of a field between two events (space, time), or between a field at a grid location and an observation, is prescribed from general considerations on the nature of the covariances. These covariances can be converted to their equivalent representations in spectral space. Oceanographers have constructed B to include inhomogeneities and anisotropies associated with the presence of topography, and to reflect in some way the adaptation of the ocean fields to the topography. Utilization of ocean topography may change the weighting operation, W in (1), into a mathematical operator, ℱ, that maps the innovation (at the observational points) directly onto the grid points,
$$\mathbf{c}_a = \mathbf{c}_b + \mathcal{F}\left[\mathbf{c}_o - H(\mathbf{c}_b)\right], \tag{5}$$
where H could be different in (1) and (5) when vertical interpolation is involved. The difference, Δc = c − cb, is called the anomaly (relative to the background field) of c. Very early in the application of OI to ocean fields, Bretherton et al. (1976) explored the use of a spectral representation of functions defined on a grid instead of field values defined on a grid. Along this path, the optimal spectral decomposition (OSD) was developed to apply the spectral method to field values, that is, to act as such an operator, with the eigenvectors of the Laplacian operator as the basis functions. These basis functions depend only on the topography, satisfy the same boundary conditions as the assimilated ocean variable anomalies (e.g., temperature, salinity, velocity), and are predetermined before the data assimilation.

Although the relative simplicity of an atmospheric spherical shell in comparison to the complexity of oceanic basins might explain the limited use of spectral models for the ocean, the OSD has proven to be an effective ocean data analysis method. With it, several new ocean phenomena have been identified from observational data, such as a bimodal structure of chlorophyll-a with winter/spring (February–March) and fall (September–October) blooms in the Black Sea (Chu et al. 2005b); a fall–winter recurrence of current reversal from westward to eastward on the Texas–Louisiana continental shelf from current meter and near-surface drifting buoy data (Chu et al. 2005a); propagation of long Rossby waves at middepths (around 1000 m) in the tropical North Atlantic from the Argo float data (Chu et al. 2007); and temporal and spatial variability of global upper-ocean heat content (Chu 2011) from the data of the Global Temperature and Salinity Profile Program (GTSPP; Sun et al. 2009). However, the OSD method has not yet been used for ocean data assimilation.

The purpose of this paper is to extend the use of OSD from ocean data analysis to ocean data assimilation. The OSD can be either three- or two-dimensional; in this study, it is conducted in a horizontal plane (i.e., two-dimensional OSD). The rest of the paper is organized as follows. Section 2 discusses the lateral boundary conditions. Section 3 describes the generation of basis functions. Section 4 presents variables at grid versus observational points. Section 5 shows the determination of spectral coefficients from minimization of combined observational and analysis errors. Section 6 illustrates the mode truncation as a statistical learning process using the Vapnik–Chervonenkis (VC) dimension. Section 7 shows the ocean model with the OSD data assimilation procedure. Sections 8 and 9 describe a twin experiment and error statistics of the OSD data assimilation. Section 10 presents the conclusions.

2. Lateral boundary condition

Let (x, z) be the horizontal and vertical coordinates, respectively, and let R(z) be the area bounded by the lateral boundary Γ(z). The anomaly Δc satisfies the generalized homogeneous lateral boundary (Γ) condition (see the appendix for a detailed explanation),
$$\left[\, b_1(\tau)\,\mathbf{n}\cdot\nabla(\Delta c) + b_2(\tau)\,\Delta c \,\right]_{\Gamma} = 0, \tag{6}$$
where ∇ = i ∂/∂x + j ∂/∂y is the horizontal gradient operator with (i, j) as the unit vectors in the horizontal plane; n is the unit vector normal to the boundary; τ denotes a moving point along the boundary; and [b1(τ), b2(τ)] are parameters varying with τ. The boundary condition (6) becomes the Dirichlet boundary condition when b1 = 0 and the Neumann boundary condition when b2 = 0. It is noted that different variable anomalies have different [b1(τ), b2(τ)]. For example, the temperature, salinity, and velocity potential anomalies have b2 = 0 for the rigid boundary and b1 = 0 for the open boundary. However, the streamfunction anomaly has b1 = 0 for the rigid boundary and b2 = 0 for the open boundary.

3. Basis functions

a. Three necessary conditions

Selection of the basis functions {φk} needs to satisfy three necessary conditions: (i) satisfaction of the same homogeneous boundary condition (6) as the assimilated variable anomaly, (ii) orthonormality, and (iii) independence of the assimilated variable. The second necessary condition is given by
$$\iint_{R} \phi_k\,\phi_{k'}\,dx\,dy = \delta_{kk'}, \tag{7}$$
where δkk′ is the Kronecker delta, defined as
$$\delta_{kk'} = \begin{cases} 1, & k = k', \\ 0, & k \neq k'. \end{cases} \tag{8}$$
Because of the independence of the assimilated variable (the third necessary condition), the basis functions are available prior to the data assimilation.
Using the eigenvectors of the horizontal Laplacian operator as the basis functions is an effective and easy way to satisfy the three necessary conditions. The eigenvectors {φk} of the horizontal Laplacian operator are the solutions of the eigenvalue problem,
$$\nabla^2 \phi_k = -\lambda_k \phi_k, \qquad \left[\, b_1(\tau)\,\mathbf{n}\cdot\nabla\phi_k + b_2(\tau)\,\phi_k \,\right]_{\Gamma} = 0. \tag{9}$$
Here, {λk} are the eigenvalues, and n is the unit vector normal to the lateral boundary. It is noted that these eigenvectors {φk} satisfy the three necessary conditions: (i) they satisfy the same homogeneous boundary condition (9) as the assimilated variable anomaly, (ii) they are orthonormal, and (iii) they are independent of the assimilated variables. Features (i) and (iii) distinguish the eigenvectors {φk} from the empirical orthogonal functions (EOFs) commonly used in ocean data assimilation (e.g., Pham et al. 1998). The EOFs depend on the assimilated variables and do not satisfy the same homogeneous boundary condition (9) as the assimilated variable anomalies.
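Because of irregular lateral boundaries, the basis functions {φk(x)} are usually numerical solutions of (9), {φk(xn)}. Here, xn = (xi, yj), n = 1, 2, …, N, represents the horizontal grid points. From now on, the vertical coordinate z is omitted for simplicity. The first K discrete basis functions for all grid points are represented by the following matrix:
$$\boldsymbol{\Phi} = \begin{bmatrix} \phi_1(\mathbf{x}_1) & \phi_2(\mathbf{x}_1) & \cdots & \phi_K(\mathbf{x}_1) \\ \phi_1(\mathbf{x}_2) & \phi_2(\mathbf{x}_2) & \cdots & \phi_K(\mathbf{x}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_1(\mathbf{x}_N) & \phi_2(\mathbf{x}_N) & \cdots & \phi_K(\mathbf{x}_N) \end{bmatrix}. \tag{10}$$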
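
As a rough illustration of how discrete basis functions of this kind can be obtained, the sketch below builds a five-point Laplacian on the ocean cells of a land/sea mask and takes its leading eigenvectors. It assumes unit grid spacing, a no-flux (Neumann) condition at every coast, and a dense eigen-solver that is only practical for small grids; the function `laplacian_modes` and the toy mask are hypothetical and do not reproduce the paper's numerical scheme or its mixed boundary conditions.

```python
import numpy as np

def laplacian_modes(mask, K):
    """First K eigenpairs of the discrete -Laplacian on the ocean cells of mask."""
    ny, nx = mask.shape
    N = int(mask.sum())
    index = -np.ones((ny, nx), dtype=int)
    index[mask] = np.arange(N)                 # number the ocean cells 0..N-1
    L = np.zeros((N, N))
    for j in range(ny):
        for i in range(nx):
            if not mask[j, i]:
                continue
            n = index[j, i]
            for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                jj, ii = j + dj, i + di
                if 0 <= jj < ny and 0 <= ii < nx and mask[jj, ii]:
                    # Only ocean neighbours contribute, so no flux crosses the coast.
                    L[n, n] += 1.0
                    L[n, index[jj, ii]] -= 1.0
    lam, Phi = np.linalg.eigh(L)               # ascending eigenvalues, orthonormal columns
    return lam[:K], Phi[:, :K]

mask = np.ones((6, 8), dtype=bool)             # True marks ocean cells
mask[0, :3] = False                            # a small hypothetical "land" patch
lam, Phi = laplacian_modes(mask, K=5)
```

Because the matrix is symmetric, the returned columns are orthonormal in the discrete sense (Φᵀ Φ = I), which mirrors the orthonormality condition for a uniform grid. The modes depend only on the mask, not on any assimilated field, so they can be computed once and stored.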

b. Example

With the NOAA National Geophysical Data Center’s Digital Bathymetry Data Base with 5′ × 5′ resolution (ETOPO5), the basis functions φk (k = 1, …, K) at a certain depth z are computed for the Pacific Ocean. In assimilating temperature observations, the temperature anomaly Δc satisfies the Dirichlet boundary condition (b1 = 0) at the southern boundary (Antarctica) and the Neumann boundary condition (b2 = 0) elsewhere (rigid boundary). Figure 2 shows the first 12 basis functions {φk} for the Pacific Ocean at the surface for illustration. The first basis function shows the latitudinal variability. The second basis function shows a dipole pattern of zonal variability with opposite signs in the eastern Pacific (negative) and the western Pacific (positive). The third basis function shows a slanted dipole pattern with opposite signs in the northeastern Pacific (positive) and the southwestern Pacific (negative). The fourth basis function shows a tripole pattern with negative values in the western and eastern Pacific and positive values in between. The higher-order basis functions have more complicated variability structures. Some features are quite similar to the recently described global thermal structure (e.g., Chu 2011). This may imply an (at least partial) topographic effect on the horizontal variability of fields such as temperature, salinity, density, and velocity potential (Song et al. 2001).

Fig. 2.

First 12 s-type basis functions {φk, k = 1, …, 12} for the Pacific Ocean at the surface.


4. Grid versus observational points

Let variable c have M observations co(x(m), t) at locations x(m) (m = 1, 2, …, M), indicated by a superscript (m), and have background field cb(xn, t) at grid points xn. The operator H in (1) interpolates the background (modeled) field (cb) from the grid to the observational point x(m),
$$c_b(\mathbf{x}^{(m)}, t) = \sum_{n=1}^{N} h_{mn}\, c_b(\mathbf{x}_n, t), \tag{11}$$
with
e12
The innovation d(m) is then given by
$$d^{(m)} = c_o(\mathbf{x}^{(m)}, t) - \sum_{n=1}^{N} h_{mn}\, c_b(\mathbf{x}_n, t). \tag{13}$$
Let H = [hmn] be an M × N matrix. All innovations d(m) (m = 1, 2, …, M) are distributed from the observational points onto the grid points with the same proportionality coefficients hmn. The mean adjustment at grid point xn due to all the observations is given by
$$D_n = \frac{1}{f_n}\sum_{m=1}^{M} h_{mn}\, d^{(m)}, \qquad f_n = \sum_{m=1}^{M} h_{mn}, \tag{14}$$
where fn denotes the observational data influence at grid point xn. The larger the value of fn, the larger the observational influence on that grid point. Let d be the M-dimensional innovation vector and let D be its distributed N-dimensional vector on the grid points,
$$\mathbf{d} = \left(d^{(1)}, d^{(2)}, \ldots, d^{(M)}\right)^{T}, \qquad \mathbf{D} = \left(D_1, D_2, \ldots, D_N\right)^{T}, \tag{15}$$
where the superscript T indicates the transpose. Note that (14) can be written in matrix form,
$$\mathbf{F}\mathbf{D} = \mathbf{H}^{T}\mathbf{d}, \tag{16}$$
where F is an N × N diagonal matrix,
$$\mathbf{F} = \mathrm{diag}\left(f_1, f_2, \ldots, f_N\right), \tag{17}$$
which is called the data influence matrix. It is noted that both matrices H and F depend solely on the location of the observational points [x(m)]. The algebraic equation (16) is usually ill-posed. The rotation method (Chu et al. 2004a) is used to convert (16) into a well-posed algebraic equation.
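
The distribution of innovations onto the grid can be sketched as follows. This is a minimal illustration that assumes the data influence is the column sum fn = Σm hmn and that D satisfies F D = Hᵀ d; the function name `distribute_innovation` and the toy numbers are hypothetical. Grid points with no observational influence (fn = 0) are simply left at zero here, which is a placeholder choice and not the rotation method cited in the text.

```python
import numpy as np

def distribute_innovation(H, d, eps=1e-12):
    """Return (f, D): data influence f_n and distributed innovation D_n on the grid."""
    f = H.sum(axis=0)                  # f_n = sum over observations of h_mn
    rhs = H.T @ d                      # sum over observations of h_mn * d^(m)
    D = np.zeros_like(rhs)
    seen = f > eps                     # grid points touched by at least one observation
    D[seen] = rhs[seen] / f[seen]      # weighted mean adjustment at those points
    return f, D

# Two observations on a 4-point grid; the last grid point sees no data.
H = np.array([[0.5, 0.50, 0.00, 0.0],
              [0.0, 0.25, 0.75, 0.0]])
d = np.array([1.0, -2.0])
f, D = distribute_innovation(H, d)
```

The last entry of `f` is zero, which shows why the system is singular wherever the observing network leaves gaps: the equation places no constraint on D at that grid point.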

5. OSD data assimilation equation

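The observational error (εo) at grid point xn is given by
$$\varepsilon_o(\mathbf{x}_n) = c_o(\mathbf{x}_n) - c_t(\mathbf{x}_n). \tag{18}$$
The components of the vector D represent the difference (observation minus background values) at the grid points,
$$D_n = c_o(\mathbf{x}_n) - c_b(\mathbf{x}_n). \tag{19}$$
Its spectral form, D̂n, is represented by
$$\hat{D}_n = \sum_{k=1}^{K} a_k\,\phi_k(\mathbf{x}_n), \tag{20}$$
where K is the mode number of the optimal truncation (see section 6). The assimilated field with the given K is represented by
$$c_a(\mathbf{x}_n) = c_b(\mathbf{x}_n) + \sum_{k=1}^{K} a_k\,\phi_k(\mathbf{x}_n). \tag{21}$$
The difference between Dn and D̂n is given by
$$D_n - \hat{D}_n = c_o(\mathbf{x}_n) - c_a(\mathbf{x}_n). \tag{22}$$
Substitution of (2) and (18) into (22) leads to
$$D_n - \hat{D}_n = \varepsilon_o(\mathbf{x}_n) - \varepsilon_a^{K}(\mathbf{x}_n), \tag{23a}$$
which contains the observational error (εo) and the analysis error for a given K [εa^K] at grid point xn. It is noted that εo contains the instrumentation error, especially that associated with remote sensing observations, and the representativeness error, for example, the mismatch of a point observation with the ocean model resolution. The combined observation–analysis error variance over the whole domain is defined by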
e23b
Minimization of Jtr after substituting (19) and (20) into (23b),
eq1
leads to a set of algebraic equations for the spectral coefficients {ak},
e24
which is rewritten into a matrix form after using (10),
e25
where A is the spectral coefficient vector, AT = (a1, a2, …, aK). The solution of (25) is given by
e26
Then (21) is written into the matrix form after using (16),
e27
which is called the OSD data assimilation equation. The vector E denotes the observational innovation projected into the spectral space. The matrix form of the OSD data assimilation is quite similar to the existing ocean data assimilation schemes {OI and the Kalman filter with the matrix replaced by and by [see (3) and (4)]}. The two matrices and play a similar role that make the analysis field compact in the observational data-rich area. It is also noted that the OSD data assimilation [see (27)] is applied to one vertical level z. As such, the data assimilation may distort the vertical stratification. Recently developed fully conserved minimal adjust schemes (Chu and Fan 2010; Wang et al. 2012) can be used to stabilize the vertical stratification.
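
One way to read the update of the spectral coefficients is as a weighted least-squares fit, sketched below. The sketch assumes that the error variance being minimized weights each grid point by its data influence fn, so that the coefficients solve Φᵀ F Φ A = Φᵀ Hᵀ d and the analysis is ca = cb + Φ A. That weighting, the function name `osd_update`, and the toy cosine basis are assumptions made for illustration and may differ in detail from the paper's (23b)–(27).

```python
import numpy as np

def osd_update(c_b, Phi, H, d):
    """Weighted least-squares spectral update (assumed form, see the text above)."""
    f = H.sum(axis=0)                       # data influence f_n (diagonal of F)
    G = Phi.T @ (f[:, None] * Phi)          # Phi^T F Phi, a K x K matrix
    E = Phi.T @ (H.T @ d)                   # innovation projected onto the modes
    a = np.linalg.solve(G, E)               # spectral coefficients a_1..a_K
    return c_b + Phi @ a, a

# Toy 1D basin with N = 8 grid points and K = 2 cosine (no-flux) modes.
N = 8
x = (np.arange(N) + 0.5) / N
Phi = np.column_stack([np.full(N, 1.0 / np.sqrt(N)),
                       np.sqrt(2.0 / N) * np.cos(np.pi * x)])
obs_points = [0, 2, 5, 7]                   # observations sit on grid points here
H = np.zeros((len(obs_points), N))
H[np.arange(len(obs_points)), obs_points] = 1.0
a_true = np.array([2.0, -1.0])              # hypothetical "true" anomaly coefficients
c_b = np.zeros(N)
d = H @ (Phi @ a_true)                      # noise-free innovation
c_a, a = osd_update(c_b, Phi, H, d)
```

In this noise-free toy case the two coefficients are recovered exactly from four observations, and the analysis is defined at all eight grid points, including those without data. That basin-wide extension through the basis functions is what stands in for a prescribed background error covariance.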

6. Mode truncation using the Vapnik–Chervonenkis dimension

The assimilation results depend on the mode truncation (K), since the spectral coefficients (a1, …, aK) are determined on the basis of minimization of the combined observation–analysis error variance Jtr [see (23b)] for the given K. With the spectral coefficients (a1, …, aK) calculated from observational data, the assimilated field, ca(xn, t), can be calculated at any grid point (xn) using (27), and the analysis error is estimated by [see (2)]
e28
where εb is the model (or background) error. It is noted from (28) that reduction of the model (or background) error (i.e., a smaller analysis error) is achieved by the observational innovation using OSD (second term on the right-hand side).
Since the “true” field, ct(xn, t), is still uncertain, the analysis error should be estimated probabilistically. In the spectral decomposition method, the observation space and the model space are projected into the spectral space. There is a need to ensure that the size of the spectral space is adequate for these two purposes. The spectral representation acts as a spatial low-pass filter for the fields, where the highest allowed wavenumbers relate to the highest spectral eigenvalues. The observational network is required to provide information without aliasing. For example, in an eddy field in the deep ocean, one expects that the basis functions can resolve well features of the size of the Rossby radius of deformation. Thus, the ratio of the number of observational points (M) to the spectral truncation (K) is a key to determining the optimal mode truncation Kopt. It is noted that cb(xn, t), (a1, …, aK), and {φk, k = 1, …, K} are given. Let J be the ensemble average of analysis error variance. The probability for the upper bound of J is given by (Vapnik 2000; Chu et al. 2003a,b)
e29
where the mode truncation K is treated as the VC dimension and η (≪ 1) is the significance level. Term J* is the upper bound of Jtr. The minimization of the VC cost function (JK),
e30
leads to another set of spectral coefficients. It is noted that, for a given M, Jtr decreases monotonically with K and μ increases with K if η is given (η = 0.1 in this study). Thus, JK has a minimum value for a certain mode number Kopt,
e31
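
The trade-off that produces an optimal truncation can be illustrated with a small sketch. It assumes a simplified form of Vapnik's bound for regression, JK = Jtr / (1 − √μ) with μ = [K(ln(M/K) + 1) − ln(η/4)] / M and all constants set to 1; this specific expression, the function names, and the synthetic Jtr curve are assumptions for illustration and are not taken from the paper's (29)–(31).

```python
import math

def vc_penalty(K, M, eta=0.1):
    """Complexity term mu for VC dimension K, M observations, significance level eta."""
    return (K * (math.log(M / K) + 1.0) - math.log(eta / 4.0)) / M

def vc_cost(j_tr, K, M, eta=0.1):
    """Penalized error J_K; infinite when the bound is no longer informative."""
    denom = 1.0 - math.sqrt(vc_penalty(K, M, eta))
    return j_tr / denom if denom > 0.0 else math.inf

def optimal_truncation(j_tr_by_K, M, eta=0.1):
    """Return (K_opt, costs) where K_opt minimizes the penalized error."""
    costs = {K: vc_cost(j, K, M, eta) for K, j in j_tr_by_K.items()}
    return min(costs, key=costs.get), costs

# Synthetic training error that decreases monotonically with K (hypothetical).
M = 200
j_tr_by_K = {K: 0.05 + 1.0 / K for K in range(1, 41)}
K_opt, costs = optimal_truncation(j_tr_by_K, M)
```

Since the training error falls with K while the penalty grows, the penalized cost has an interior minimum. Adding modes past that point fits the observations more closely but with a looser bound on the true error.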

7. Ocean modeling

a. Model description

The Parallel Ocean Program (POP) model (Smith and Gent 2002) is used to show the feasibility of the OSD data assimilation. Within the framework of the Community Earth System Model (CESM), the POP is a time-dependent, level-coordinate primitive equation ocean general circulation model rendered on a three-dimensional grid that includes a free surface and realistic topography. The B grid is used for the spatial discretization. Derived from the Bryan–Cox–Semtner class of models, the POP was officially adopted as the ocean component of the CESM based at NCAR in 2001. It has an implicit free surface and general orthogonal coordinates. It is a global model, with the grid defined so that the pole is located in Greenland. Since the purpose of this study is to show the feasibility of the OSD data assimilation rather than to simulate/predict the real ocean processes, a low-horizontal-resolution (3°), 60-vertical-level (Table 1) version of the model with a time step of 2 h is used in this study. In the top 175 m, the model has 30 levels with 10 m between each of the consecutive levels. The discretized model variable at the grid points is represented by c(xn, zl, t), n = 1, 2, …, Nl, l = 1, 2, …, L. Here, Nl is the total number of the horizontal grid points at the vertical level l and L = 60 is the total number of the vertical levels.

Table 1.

Depths (m) of vertical levels in the POP model.


The atmospheric forcing at the surface is provided by an annually varying climatology derived from the surface Co-ordinated Ocean–Ice Reference Experiments (CORE), version 2 (Large and Yeager 2009). The air–sea fluxes of momentum, heat, freshwater, and their components have been computed globally from 1948 at frequencies ranging from 6 hourly to monthly. All fluxes are computed over the 23 years from 1984 to 2006, but radiation prior to 1984 and precipitation before 1979 are given only as climatological mean annual cycles. The input data are based on NCEP–NCAR reanalysis for the surface vector wind, temperature, specific humidity, and density, and on a variety of satellite-based radiation, sea surface temperature, sea ice concentration, and precipitation products (from https://climatedataguide.ucar.edu/climate-data/large-yeager-air-sea-surface-flux-corev2-1949-2006). The model simulations for this experiment used climatological forcing (daily 23-yr average), interpolated to the time step of the model.

The POP model has been spun up from rest and from the climatological annual mean (temperature and salinity) with the daily climatological surface forcing from the CORE, version 2 (Large and Yeager 2009), and integrated for a period of over 300 simulation years. The model output for year 300 (c300) is treated as the “truth field,” ct.

b. Initial error

Although we are using a global model, temperature “observations” are only incorporated for the Pacific basin. It is noted that use of single-variable (i.e., temperature) data is not ideal: because (T, S, V) are the dependent variables in ocean models, observational temperature (T), salinity (S), and velocity (V) data should be assimilated together to keep dynamic balance. Chu (2006) shows that assimilation with (T, S) data only introduces dynamic imbalance and suggests that geostrophic velocity corresponding to the (T, S) data should also be assimilated. The results of this study are therefore used only for a preliminary evaluation.

The model is integrated from 1 March of year 210 and uses observations sampled from the fields from 1 March of year 300. The initial error (the variable c denoting temperature) is
e32
The temperature at the surface initially has maximum errors (i.e., the mismatch between years 210 and 300), such as +2°C in the Southern Ocean near the Antarctic and −2°C north of the Kuroshio Extension; medium errors, such as +1°C in the central equatorial Pacific; and low errors in subtropical areas in both hemispheres (Fig. 3a). The temperature initially has smaller errors at 1106-m depth (level 41), with maximum errors in the circumpolar currents near +1°C in the west and −1°C in the east, and low errors elsewhere (Fig. 3b). The model without data assimilation is integrated from 1 March with the initial condition,
e33
using daily surface forcing for 20 days, represented by cnon(xn, zl, t).
Fig. 3.

Initial errors in temperature (°C) at (a) the sea surface and (b) the depth of 1106 m (at level 41).


8. OSD data assimilation

a. Bilinear interpolation

Let the observational point x(m) be located in a grid cell bounded by four neighboring grid points. In this study, bilinear interpolation is chosen for the operator H (Fig. 4),
e34
where
e35
It is noted that the proportionality coefficients {hmn} depend solely on the location of the observational points [x(m)], and
e36
Each row of the M × N matrix H = [hmn] in (16) has only four nonzero values,
eq2
Other simple interpolations such as inverse distance weighting, splines, and trigonometric polynomials can also be used for the matrix H.
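
The four nonzero weights in a row of the interpolation matrix can be illustrated with standard bilinear interpolation on a rectangular cell. The function name `bilinear_weights`, the corner ordering, and the sample coordinates below are hypothetical choices for this sketch.

```python
def bilinear_weights(x, y, x0, x1, y0, y1):
    """Weights for the corners (x0,y0), (x1,y0), (x0,y1), (x1,y1) of the enclosing cell."""
    s = (x - x0) / (x1 - x0)          # fractional position in x, between 0 and 1
    t = (y - y0) / (y1 - y0)          # fractional position in y, between 0 and 1
    return ((1 - s) * (1 - t), s * (1 - t), (1 - s) * t, s * t)

# An observation inside the cell [1, 2] x [10, 11].
w = bilinear_weights(x=1.25, y=10.5, x0=1.0, x1=2.0, y0=10.0, y1=11.0)
corner_values = (20.0, 21.0, 22.0, 23.0)   # background values at the four corners
value = sum(wi * ci for wi, ci in zip(w, corner_values))
```

The weights depend only on where the observation sits inside its cell, and they sum to one, so a spatially constant background is reproduced exactly at the observation point.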
Fig. 4.

Bilinear interpolation for calculating the basis functions at the observational point xm from their values at the four neighboring grid points.


b. Twin experiment

A sampling pattern consisting of a data-rich area and a data-poor area offers a challenge to the ability of the basis function to represent the intended fields well, since the projection of the data onto the spectral fields is likely to generate a field defined in the entire domain. To test the ability of the OSD to represent the intended fields, the “observational” data are sampled from c300 beginning with 1 March for 20 days at locations (unchanged during the data assimilation process) given by the horizontal distribution of the Argo floats in March 2003 (Fig. 5). This produces the observational dataset with a data-rich area north of 20°S and a data-poor area south of 20°S. If the spatial decorrelation scale is much less than the domain size, then the analysis fields using OI will be compact in the data-rich area (i.e., north of 20°S).

Fig. 5.

Daily sampling taken from the horizontal distribution of the Argo floats in March 2003. It is noted that the observational data-rich area is north of 20°S and that the observational data-poor area is south of 20°S.


The OSD data assimilation process at day t follows (27) with the following procedure: (i) determine the optimal mode decomposition (Kopt); (ii) compute the difference between the observational and background values at the observational points following (13); (iii) substitute the difference into (26) to obtain the spectral coefficients [a1(z, t), a2(z, t), …, aKopt(z, t)]; and (iv) substitute these spectral coefficients into (27) to get the assimilated initial condition ca(xn, z, t). The dependence of the VC cost function (JK) on the VC dimension (Fig. 6) shows that the optimal mode truncation is Kopt = 12 at 125-m depth (level 13; Table 1) and day 0 (for illustration). The assimilation model is then run forward in time for 24 h, and the model field saved at the end of the 24 h is the background field for day t + 1. At each assimilation time, the optimal mode truncation (Kopt) is recalculated. This process repeats for 20 days and leads to the assimilated output, ca(xn, z, t).

Fig. 6.

Optimal mode decomposition (Kopt) at 125-m depth and day 0 is determined by the minimization of the VC cost function (denoted by red square).


9. Error statistics

The three datasets ca(xn, zl, t), cnon(xn, zl, t), and ct(xn, zl, t) (l = 1, …, L) for the period of 20 days (t0 ≤ t ≤ t1 = t0 + 20 days) are used to show the root-mean-square error (RMSE) with and without the OSD data assimilation. Let Nl be the number of the horizontal grid points at the vertical level l and L be the total number of vertical levels (L = 60). The basinwide RMSE and BIAS for the assimilation run (Eassim, Bassim) and nonassimilation run (Enon, Bnon) are given by
e37a
e37b
e38a
e38b
Figure 7 compares the basinwide RMSE and BIAS of the model without data assimilation (dashed curve) and with the OSD data assimilation (solid curve). Without data assimilation, RMSE increases from 0.50°C at day 1 to 0.52°C at day 20 (a 0.02°C, or 4%, increase); with the OSD data assimilation, it decreases from 0.50°C at day 1 to 0.43°C at day 20 (a 14% decrease). Without data assimilation, BIAS increases from 0.080°C at day 1 to 0.082°C at day 20 (a 0.002°C, or 2.5%, increase); with the OSD data assimilation, it decreases from 0.08°C at day 1 to 0.04°C at day 20 (a 0.04°C, or 50%, decrease).
Fig. 7.

Comparison between the assimilation and nonassimilation runs of the temporally varying basinwide (a) RMSE and (b) BIAS.
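
The basinwide statistics and the quoted percentage changes can be reproduced with a few lines. The sketch assumes the usual definitions, RMSE as the square root of the mean squared difference from the truth over all ocean grid points and levels, and BIAS as the mean difference; the function names and sample values are hypothetical, and the paper's exact normalization may differ.

```python
import numpy as np

def rmse_and_bias(c, c_t):
    """Return (RMSE, BIAS) of field c against the truth c_t over all points."""
    err = np.asarray(c, dtype=float) - np.asarray(c_t, dtype=float)
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(err))

def percent_change(start, end):
    """Relative change from start to end, in percent (negative means a decrease)."""
    return 100.0 * (end - start) / start

rmse, bias = rmse_and_bias([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])

# Percentage changes between day 1 and day 20, using the values quoted in the text.
rmse_non = percent_change(0.50, 0.52)      # without assimilation
rmse_osd = percent_change(0.50, 0.43)      # with OSD assimilation
bias_non = percent_change(0.080, 0.082)    # without assimilation
bias_osd = percent_change(0.08, 0.04)      # with OSD assimilation
```

Flattening all levels into one array before calling `rmse_and_bias` handles the fact that the number of ocean grid points differs from level to level.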


The local RMSE and BIAS for the assimilation and nonassimilation runs are given by
e39a
e39b
e40a
e40b
A comparison of the local RMSE (Fig. 8) and BIAS (Fig. 9) without data assimilation (right panels) and with the OSD data assimilation (left panels) for day 1 (top panels), day 10 (middle panels), and day 20 (bottom panels) shows the strength and weakness of the OSD scheme. At day 1, the local RMSE and BIAS are quite comparable between the assimilated run (Figs. 8a and 9a) and the nonassimilated run (Figs. 8b and 9b). The local RMSE has large values of about 2°C in the central equatorial Pacific (10°S–10°N, 160°–120°W), in the eastern tropical North Pacific (10°–18°N, 120°–90°W), and in a very narrow strip in the Antarctic Circumpolar Current region near the ice shelf (south of 68°S, 160°–90°W), and relatively low values elsewhere. The local BIAS has large values of about 0.5°–1°C in most areas of the low latitudes (20°S–20°N) and high latitudes (north of 40°N), except in the eastern Pacific near coastal regions and in the Antarctic Circumpolar Current region, and relatively low values elsewhere.
Fig. 8.

Comparison of temporally varying local RMSE for the (a) assimilation run at day 1, (b) nonassimilation run at day 1, (c) assimilation run at day 10, (d) nonassimilation run at day 10, (e) assimilation run at day 20, and (f) nonassimilation run at day 20.


Fig. 9.

Comparison of temporally varying local BIAS for the (a) assimilation run at day 1, (b) nonassimilation run at day 1, (c) assimilation run at day 10, (d) nonassimilation run at day 10, (e) assimilation run at day 20, and (f) nonassimilation run at day 20.


As time progresses, the local (RMSE, BIAS) for the nonassimilated run remains almost the same at day 10 (Figs. 8d and 9d) and day 20 (Figs. 8f and 9f) as at day 1 (Figs. 8b and 9b), but it changes evidently for the assimilated run at day 10 (Figs. 8c and 9c) and day 20 (Figs. 8e and 9e) as compared to day 1 (Figs. 8a and 9a).

The local RMSE is reduced drastically north of 20°S, with the disappearance of the high local RMSE originally (day 1) in the central equatorial Pacific and the eastern tropical North Pacific. South of 20°S, it is not reduced and may even be increased slightly, with the high local RMSE originally (day 1) in the Antarctic Circumpolar Current region near the ice shelf (south of 68°S, 160°–90°W) persisting. It is noted that the areas with a large reduction in error and bias are the observational data-rich areas, and that the areas with smaller decreases (or even increases) in error and bias are the observational data-poor areas (cf. Figs. 8 and 9 with Fig. 5).

10. Conclusions

The OSD method has been developed for ocean data assimilation on the basis of the classic theory of the generalized Fourier series expansion, such that any ocean field is represented by a linear combination of the products of basis functions (modes) and corresponding spectral coefficients. The basis functions are the eigenvectors of the Laplacian operator, determined only by the topography with the same lateral boundary conditions for the assimilated variables.

Unlike existing ocean data assimilation methods such as optimal interpolation, Kalman filters, and variational methods (originally developed for atmospheric data assimilation), the OSD method has four specific features: (i) effective utilization of the ocean topographic data; (ii) orthonormal and predetermined basis functions that are independent of the assimilated variable anomalies and satisfy the same lateral boundary condition; (iii) no requirement of a priori information on a background error covariance matrix; and (iv) optimal mode truncation through minimization of the Vapnik–Chervonenkis dimension as a statistical learning process. After the mode truncation, the model field is updated from the innovation by solving a set of linear algebraic equations for the spectral coefficients.
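The last step, obtaining spectral coefficients from the innovation through a linear system, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's equations (17) and (27): an ordinary least squares fit stands in for the OSD update, observations are taken exactly at grid points instead of being bilinearly interpolated, the orthonormal basis is synthetic, and all names (`update_coefficients`, `phi_grid`, `obs_idx`) are hypothetical.

```python
import numpy as np

def update_coefficients(phi_obs, innovation):
    """Least squares spectral coefficients from the innovation.

    phi_obs    : (n_obs, n_modes) basis functions evaluated at observation points
    innovation : (n_obs,) observation minus background at those points
    Assumption: plain least squares replaces the paper's update equations.
    """
    coeffs, *_ = np.linalg.lstsq(phi_obs, innovation, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
n_grid, n_modes, n_obs = 200, 5, 30

# Synthetic orthonormal basis on the grid (columns are the truncated modes).
phi_grid, _ = np.linalg.qr(rng.standard_normal((n_grid, n_modes)))

# Synthetic "true" anomaly built from known coefficients.
true_coeffs = np.array([1.0, -0.5, 0.25, 0.0, 2.0])

# Observations sampled at a random subset of grid points.
obs_idx = rng.choice(n_grid, size=n_obs, replace=False)
innovation = phi_grid[obs_idx] @ true_coeffs

coeffs = update_coefficients(phi_grid[obs_idx], innovation)
increment = phi_grid @ coeffs  # analysis increment on the full grid
```

With more observations than retained modes and noise-free data, the fit recovers the coefficients; the mode truncation limits how many coefficients must be constrained by the available observations.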

The capability of the OSD method is demonstrated through a twin experiment using the Parallel Ocean Program (POP) model for the Pacific Ocean. For an objective evaluation, the “observational” data are distributed nonuniformly, with a data-rich area north of 20°S and a data-poor area south of 20°S. Within 20 days, the basinwide RMSE (BIAS) increases by 4% (2.5%) without the OSD data assimilation and decreases by 14% (50%) with the OSD data assimilation. However, the improvement from the OSD data assimilation depends on the observational data distribution. The local RMSE is reduced drastically in data-rich areas (i.e., north of 20°S) but not in data-poor areas (i.e., south of 20°S).
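For readers who want to reproduce this kind of percentage comparison, the sketch below shows one way to compute a basinwide RMSE, a basinwide BIAS, and their relative change between two times. The definitions used (unweighted root-mean-square and mean of the model-minus-truth difference) are assumptions, since the paper's own formulas are given in an earlier section, and the numbers are arbitrary test values, not results from the twin experiment.

```python
import numpy as np

def basinwide_rmse_bias(model, truth):
    """Unweighted RMSE and mean bias of model minus truth (assumed definitions)."""
    diff = np.asarray(model) - np.asarray(truth)
    return float(np.sqrt(np.mean(diff**2))), float(np.mean(diff))

def percent_change(initial, final):
    """Relative change from initial to final, in percent."""
    return 100.0 * (final - initial) / initial

# Arbitrary illustrative fields: a uniform 1.0 error that halves over time.
truth = np.zeros(100)
model_start = truth + 1.0
model_end = truth + 0.5

rmse_start, bias_start = basinwide_rmse_bias(model_start, truth)
rmse_end, bias_end = basinwide_rmse_bias(model_end, truth)
rmse_change = percent_change(rmse_start, rmse_end)
bias_change = percent_change(bias_start, bias_end)
```

A negative value of `rmse_change` or `bias_change` corresponds to a decrease in the error measure, as reported for the assimilation run.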

Because no a priori background error covariance matrix is used, the observations are purely extrapolated to the data-poor area under the control of the observational influence matrix [see (17) and (27)]. Since the extrapolation causes unpredictable analysis errors and the twin experiment does not show improvement by OSD assimilation in the data-poor area, further studies on constructing such a matrix are needed. Moreover, verification using a twin experiment is just a first step. Feasibility studies should be conducted for real ocean data such as conductivity–temperature–depth (CTD), expendable bathythermograph (XBT), Argo profiling, and glider data.

The OSD method proposed here is two-dimensional and is applied at each vertical level, with the basis functions given by the eigenvectors of the horizontal Laplacian operator. It can be extended to a three-dimensional OSD method with the basis functions given by the eigenvectors of the three-dimensional Laplacian operator, which will involve much larger matrix operations. In addition, for the three-dimensional OSD, the surface boundary conditions of the assimilated variable anomalies may vary with local climatology. Their impact on the three-dimensional basis functions will be investigated in future studies.

Acknowledgments

The Office of Naval Research, the Naval Oceanographic Office, and the Naval Postgraduate School supported this study.

APPENDIX

Derivation of Lateral Boundary Condition [(6)]

Generally, the assimilated ocean variable c (temperature, salinity, density, velocity, …) has the lateral boundary condition
(A1)
where D(τ) is the forcing term varying with τ. With the inhomogeneous boundary condition (A1), the assimilated variable c(x, z, t) consists of two parts,
(A2)
where S(x, z, t) is the solution of the Laplace equation with the inhomogeneous boundary condition
(A3)
and the other part satisfies the homogeneous boundary condition
(A4)
Since
(A5)
subtraction of (A5) from (A2) leads to
(A6)
Both homogeneous parts satisfy the boundary condition (A4), which leads to the boundary condition (6) for Δc,
(6)

REFERENCES

  • Bretherton, F. P., Davis R. E. , and Fandry C. B. , 1976: A technique for objective analysis and design of oceanographic experiments applied to MODE-73. Deep-Sea Res. Oceanogr. Abstr., 23, 559582, doi:10.1016/0011-7471(76)90001-2.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., 2006: Applications to data assimilation. P-Vector Inverse Method, Springer, 407–414.

  • Chu, P. C., 2011: Global upper ocean heat content and climate variability. Ocean Dyn., 61, 11891204, doi:10.1007/s10236-011-0411-x.

  • Chu, P. C., and Fan C. W. , 2010: A conserved minimal adjustment scheme for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 27, 10721083, doi:10.1175/2010JTECHO742.1.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Korzhova T. P. , Margolina T. M. , and Melnichenko O. M. , 2003a: Analysis of sparse and noisy ocean current data using flow decomposition. Part I: Theory. J. Atmos. Oceanic Technol.,20, 478–49, doi:10.1175/1520-0426(2003)20<478:AOSANO>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Korzhova T. P. , Margolina T. M. , and Melnichenko O. M. , 2003b: Analysis of sparse and noisy ocean current data using flow decomposition. Part II: Applications to Eulerian and Lagrangian data. J. Atmos. Oceanic Technol.,20, 492–512, doi:10.1175/1520-0426(2003)20<492:AOSANO>2.0.CO;2.

  • Chu, P. C., Ivanov L. M. , and Margolina T. M. , 2004a: Rotation method for reconstructing process and field from imperfect data. Int. J. Bifurcation Chaos, 14, 29912997, doi:10.1142/S0218127404010941.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Wang G. H. , and Fan C. W. , 2004b: Evaluation of the U.S. Navy’s Modular Ocean Data Assimilation System (MODAS) using the South China Sea Monsoon Experiment (SCSMEX) data. J. Oceanogr., 60, 10071021, doi:10.1007/s10872-005-0009-3.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , and Margolina T. M. , 2005a: Seasonal variability of the Black Sea chlorophyll-a concentration. J. Mar. Syst., 56, 243261, doi:10.1016/j.jmarsys.2005.01.001.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , and Melnichenko O. M. , 2005b: Fall–winter current reversals on the Texas–Louisiana continental shelf. J. Phys. Oceanogr., 35, 902910, doi:10.1175/JPO2703.1.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Melnichenko O. V. , and Wells N. C. , 2007: Long baroclinic Rossby waves in the tropical North Atlantic observed from profiling floats. J. Geophys. Res., 112, C05032, doi:10.1029/2006JC003698.

    • Search Google Scholar
    • Export Citation
  • Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. J. Meteor. Soc. Japan, 75, 257288.

    • Search Google Scholar
    • Export Citation
  • Galanis, G. N., Louka P. , Katsafados P. , Kallos G. , and Pytharoulis I. , 2006: Applications of Kalman filters based on non-linear functions to numerical weather predictions. Ann. Geophys., 24, 24512460, doi:10.5194/angeo-24-2451-2006.

    • Search Google Scholar
    • Export Citation
  • Ide, K., Courtier P. , and Ghil M. , 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181189.

    • Search Google Scholar
    • Export Citation
  • Large, W., and Yeager S. , 2009: The global climatology of an interannually varying air–sea flux data set. Climate Dyn., 33, 341364, doi:10.1007/s00382-008-0441-3.

    • Search Google Scholar
    • Export Citation
  • Lozano, C. J., Robinson A. R. , Arrango H. G. , Gangopadhyay A. , Sloan Q. , Haley P. J. , Anderson L. , and Leslie W. , 1996: An interdisciplinary ocean prediction system: Assimilation strategies and structured data models. Modern Approaches to Data Assimilation in Ocean Modeling, P. Malanotte-Rizzoli, Ed., Elsevier Oceanography Series, Vol. 61, Elsevier, 413–452.

  • Pham, D. T., Verron J. , and Roubaud M. C. , 1998: A singular evolutive extended Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323340, doi:10.1016/S0924-7963(97)00109-7.

    • Search Google Scholar
    • Export Citation
  • Smith, R. D., and Gent P. R. , Eds., 2002: Reference manual for the Parallel Ocean Program (POP): Ocean component of the Community Climate System Model (CCSM2.0 and 3.0). Los Alamos National Laboratory Tech. Rep. LA-UR-02-2484, 75 pp. [Available online at http://www.cesm.ucar.edu/models/ccsm3.0/pop/doc/manual.pdf.]

  • Song, Y. T., Haidvodgel D. B. , and Glenn S. M. , 2001: Effects of topographic variability on the formation of upwelling centers off New Jersey: A theoretical model. J. Geophys. Res., 106, 92239240, doi:10.1029/2000JC000244.

    • Search Google Scholar
    • Export Citation
  • Sun, L. C., 1999: Data inter-operability driven by oceanic data assimilation needs. Mar. Technol. Soc. J., 33, 5566, doi:10.4031/MTSJ.33.3.7.

    • Search Google Scholar
    • Export Citation
  • Sun, L. C., and Coauthors, 2009: The data management system for the Global Temperature and Salinity Profile Programme (GTSPP). Proceedings of the OceanObs’09: Sustained Ocean Observations and Information for Society, J. Hall, D. E. Harrison, and D. Stammer, D., Eds., Vol. 2, ESA Publ. WPP-306, doi:10.5270/OceanObs09.cwp.86.

  • Tang, Y., and Kleeman R. , 2004: SST assimilation experiments in a tropical Pacific Ocean model. J. Phys. Oceanogr., 34, 623642, doi:10.1175/3518.1.

    • Search Google Scholar
    • Export Citation
  • Vapnik, V., 2000: The Nature of Statistical Learning Theory. Springer, 315 pp.

  • Wang, X., Chu P. C. , Han G. , Li W. , Zhang X. , and Li D. , 2012: A fully conserved minimal adjustment scheme with (T, S) coherency for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 29, 18541865, doi:10.1175/JTECH-D-12-00025.1.

    • Search Google Scholar
    • Export Citation
Save
  • Bretherton, F. P., Davis R. E. , and Fandry C. B. , 1976: A technique for objective analysis and design of oceanographic experiments applied to MODE-73. Deep-Sea Res. Oceanogr. Abstr., 23, 559582, doi:10.1016/0011-7471(76)90001-2.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., 2006: Applications to data assimilation. P-Vector Inverse Method, Springer, 407–414.

  • Chu, P. C., 2011: Global upper ocean heat content and climate variability. Ocean Dyn., 61, 11891204, doi:10.1007/s10236-011-0411-x.

  • Chu, P. C., and Fan C. W. , 2010: A conserved minimal adjustment scheme for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 27, 10721083, doi:10.1175/2010JTECHO742.1.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Korzhova T. P. , Margolina T. M. , and Melnichenko O. M. , 2003a: Analysis of sparse and noisy ocean current data using flow decomposition. Part I: Theory. J. Atmos. Oceanic Technol.,20, 478–49, doi:10.1175/1520-0426(2003)20<478:AOSANO>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Korzhova T. P. , Margolina T. M. , and Melnichenko O. M. , 2003b: Analysis of sparse and noisy ocean current data using flow decomposition. Part II: Applications to Eulerian and Lagrangian data. J. Atmos. Oceanic Technol.,20, 492–512, doi:10.1175/1520-0426(2003)20<492:AOSANO>2.0.CO;2.

  • Chu, P. C., Ivanov L. M. , and Margolina T. M. , 2004a: Rotation method for reconstructing process and field from imperfect data. Int. J. Bifurcation Chaos, 14, 29912997, doi:10.1142/S0218127404010941.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Wang G. H. , and Fan C. W. , 2004b: Evaluation of the U.S. Navy’s Modular Ocean Data Assimilation System (MODAS) using the South China Sea Monsoon Experiment (SCSMEX) data. J. Oceanogr., 60, 10071021, doi:10.1007/s10872-005-0009-3.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , and Margolina T. M. , 2005a: Seasonal variability of the Black Sea chlorophyll-a concentration. J. Mar. Syst., 56, 243261, doi:10.1016/j.jmarsys.2005.01.001.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , and Melnichenko O. M. , 2005b: Fall–winter current reversals on the Texas–Louisiana continental shelf. J. Phys. Oceanogr., 35, 902910, doi:10.1175/JPO2703.1.

    • Search Google Scholar
    • Export Citation
  • Chu, P. C., Ivanov L. M. , Melnichenko O. V. , and Wells N. C. , 2007: Long baroclinic Rossby waves in the tropical North Atlantic observed from profiling floats. J. Geophys. Res., 112, C05032, doi:10.1029/2006JC003698.

    • Search Google Scholar
    • Export Citation
  • Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. J. Meteor. Soc. Japan, 75, 257288.

    • Search Google Scholar
    • Export Citation
  • Galanis, G. N., Louka P. , Katsafados P. , Kallos G. , and Pytharoulis I. , 2006: Applications of Kalman filters based on non-linear functions to numerical weather predictions. Ann. Geophys., 24, 24512460, doi:10.5194/angeo-24-2451-2006.

    • Search Google Scholar
    • Export Citation
  • Ide, K., Courtier P. , and Ghil M. , 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181189.

    • Search Google Scholar
    • Export Citation
  • Large, W., and Yeager S. , 2009: The global climatology of an interannually varying air–sea flux data set. Climate Dyn., 33, 341364, doi:10.1007/s00382-008-0441-3.

    • Search Google Scholar
    • Export Citation
  • Lozano, C. J., Robinson A. R. , Arrango H. G. , Gangopadhyay A. , Sloan Q. , Haley P. J. , Anderson L. , and Leslie W. , 1996: An interdisciplinary ocean prediction system: Assimilation strategies and structured data models. Modern Approaches to Data Assimilation in Ocean Modeling, P. Malanotte-Rizzoli, Ed., Elsevier Oceanography Series, Vol. 61, Elsevier, 413–452.

  • Pham, D. T., Verron J. , and Roubaud M. C. , 1998: A singular evolutive extended Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323340, doi:10.1016/S0924-7963(97)00109-7.

    • Search Google Scholar
    • Export Citation
  • Smith, R. D., and Gent P. R. , Eds., 2002: Reference manual for the Parallel Ocean Program (POP): Ocean component of the Community Climate System Model (CCSM2.0 and 3.0). Los Alamos National Laboratory Tech. Rep. LA-UR-02-2484, 75 pp. [Available online at http://www.cesm.ucar.edu/models/ccsm3.0/pop/doc/manual.pdf.]

  • Song, Y. T., Haidvodgel D. B. , and Glenn S. M. , 2001: Effects of topographic variability on the formation of upwelling centers off New Jersey: A theoretical model. J. Geophys. Res., 106, 92239240, doi:10.1029/2000JC000244.

    • Search Google Scholar
    • Export Citation
  • Sun, L. C., 1999: Data inter-operability driven by oceanic data assimilation needs. Mar. Technol. Soc. J., 33, 5566, doi:10.4031/MTSJ.33.3.7.

    • Search Google Scholar
    • Export Citation
  • Sun, L. C., and Coauthors, 2009: The data management system for the Global Temperature and Salinity Profile Programme (GTSPP). Proceedings of the OceanObs’09: Sustained Ocean Observations and Information for Society, J. Hall, D. E. Harrison, and D. Stammer, D., Eds., Vol. 2, ESA Publ. WPP-306, doi:10.5270/OceanObs09.cwp.86.

  • Tang, Y., and Kleeman R. , 2004: SST assimilation experiments in a tropical Pacific Ocean model. J. Phys. Oceanogr., 34, 623642, doi:10.1175/3518.1.

    • Search Google Scholar
    • Export Citation
  • Vapnik, V., 2000: The Nature of Statistical Learning Theory. Springer, 315 pp.

  • Wang, X., Chu P. C. , Han G. , Li W. , Zhang X. , and Li D. , 2012: A fully conserved minimal adjustment scheme with (T, S) coherency for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 29, 18541865, doi:10.1175/JTECH-D-12-00025.1.

    • Search Google Scholar
    • Export Citation