1. Introduction













Fig. 1. Illustration of ocean data assimilation with cb located at the grid points and co located at the points “*.” The ocean data assimilation is to convert the innovation, d = co −
Citation: Journal of Atmospheric and Oceanic Technology 32, 4; 10.1175/JTECH-D-14-00079.1





Although the relative simplicity of an atmospheric spherical shell in comparison to the complexity of oceanic basins might explain the limited use of spectral models for the ocean, the OSD has proven to be an effective ocean data analysis method. With it, several new ocean phenomena have been identified from observational data, such as a bimodal structure of chlorophyll-a with winter/spring (February–March) and fall (September–October) blooms in the Black Sea (Chu et al. 2005a), fall–winter recurrence of current reversal from westward to eastward on the Texas–Louisiana continental shelf from current meter and near-surface drifting buoy data (Chu et al. 2005b), propagation of long Rossby waves at middepths (around 1000 m) in the tropical North Atlantic from the Argo float data (Chu et al. 2007), and temporal and spatial variability of global upper-ocean heat content (Chu 2011) from the data of the Global Temperature and Salinity Profile Program (GTSPP; Sun et al. 2009). However, the OSD method has not yet been used for ocean data assimilation.
The purpose of this paper is to extend the use of OSD from ocean data analysis to ocean data assimilation. The OSD can be either three- or two-dimensional; in this study it is conducted in a horizontal plane (i.e., two-dimensional OSD). The rest of the paper is organized as follows. Section 2 discusses the lateral boundary conditions. Section 3 describes the generation of basis functions. Section 4 presents variables at grid versus observational points. Section 5 shows the determination of spectral coefficients from minimization of combined observational and analysis errors. Section 6 illustrates the mode truncation as a statistical learning process using the Vapnik–Chervonenkis (VC) dimension. Section 7 shows the ocean model with the OSD data assimilation procedure. Sections 8 and 9 describe a twin experiment and error statistics of the OSD data assimilation. Section 10 presents the conclusions.
2. Lateral boundary condition





3. Basis functions
a. Three necessary conditions






b. Example
With the NOAA National Geophysical Data Center’s Digital Bathymetry Data Base with 5′ × 5′ resolution (ETOPO5), the basis functions
Fig. 2. First 12 s-type basis functions {
4. Grid versus observational points






5. OSD data assimilation equation











6. Mode truncation using the Vapnik–Chervonenkis dimension








7. Ocean modeling
a. Model description
The Parallel Ocean Program (POP) model (Smith and Gent 2002) is used to show the feasibility of the OSD data assimilation. Within the framework of the Community Earth System Model (CESM), the POP is a time-dependent, level-coordinate primitive equation ocean general circulation model rendered on a three-dimensional grid that includes a free surface and realistic topography. The B grid is used for the spatial discretization. Derived from the Bryan–Cox–Semtner class of models, the POP was officially adopted as the ocean component of the CESM based at NCAR in 2001. It has an implicit free surface and general orthogonal coordinates. It is a global model, with the grid defined so that the pole is located in Greenland. Since the purpose of this study is to show the feasibility of the OSD data assimilation rather than to simulate/predict the real ocean processes, a low-horizontal-resolution (3°), 60-vertical-level (Table 1) version of the model with a time step of 2 h is used in this study. In the top 175 m, the model has 30 levels with 10 m between each of the consecutive levels. The discretized model variable at the grid points is represented by c(xn, zl, t), n = 1, 2, …, Nl, l = 1, 2, …, L. Here, Nl is the total number of the horizontal grid points at the vertical level l and L = 60 is the total number of the vertical levels.
Table 1. Depths (m) of vertical levels in the POP model.
The atmospheric forcing at the surface is provided by an annually varying climatology derived from the surface Co-ordinated Ocean–Ice Reference Experiments (CORE), version 2 (Large and Yeager 2009). The air–sea fluxes of momentum, heat, freshwater, and their components have been computed globally from 1948 at frequencies ranging from 6 hourly to monthly. All fluxes are computed over the 23 years from 1984 to 2006, but radiation prior to 1984 and precipitation before 1979 are given only as climatological mean annual cycles. The input data are based on NCEP–NCAR reanalysis for the surface vector wind, temperature, specific humidity, and density, and on a variety of satellite-based radiation, sea surface temperature, sea ice concentration, and precipitation products (from https://climatedataguide.ucar.edu/climate-data/large-yeager-air-sea-surface-flux-corev2-1949-2006). The model simulations for this experiment used climatological forcing (daily 23-yr average). The forcing is interpolated to the time step of the model.
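The interpolation of the daily climatological forcing to the 2-h model time step can be illustrated with a minimal sketch. The function name `interp_daily_forcing`, the use of linear interpolation, and the placement of each daily mean at midday are assumptions made for illustration; the text does not state which scheme the model uses.

```python
import numpy as np

def interp_daily_forcing(daily_values, step_hours=2.0):
    """Linearly interpolate a periodic daily climatology to the model time step."""
    daily_values = np.asarray(daily_values, dtype=float)
    n_days = daily_values.size
    # Assumption: each daily mean is valid at the middle of its day (hour 12).
    t_daily = 24.0 * np.arange(n_days) + 12.0
    # Model times at the start of every step over the whole climatological cycle.
    t_model = np.arange(0.0, 24.0 * n_days, step_hours)
    # period= wraps the cycle, so the last day blends into the first one.
    values = np.interp(t_model, t_daily, daily_values, period=24.0 * n_days)
    return t_model, values

# Toy 3-day "climatology"; a real annual cycle would have 365 daily values.
t_model, forcing = interp_daily_forcing([0.0, 24.0, 48.0])
```

With a 365-day climatology and a 2-h step, the same call would return 4380 forcing values per year.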
The POP model has been spun up from rest and climatological annual mean (temperature and salinity) with the daily climatological surface forcing from the CORE, version 2 (Large and Yeager 2009), and integrated for a period of over 300 simulation years. The model output for the year 300 (c300) is treated as the “truth field,”
b. Initial error
Although we are using a global model, temperature “observations” are only incorporated for the Pacific basin. Using single-variable (i.e., temperature) data is not ideal: because temperature (T), salinity (S), and velocity (V) are the dependent variables in ocean models, observations of all three should be assimilated to keep dynamic balance. Chu (2006) shows that assimilation with (T, S) data only introduces dynamic imbalance and suggests that geostrophic velocity corresponding to the (T, S) data should also be assimilated. The results of this study are therefore used only for a preliminary evaluation.


Fig. 3. Initial errors in temperature (°C) at (a) the sea surface and (b) the depth of 1106 m (at level 41).
8. OSD data assimilation
a. Bilinear interpolation





Fig. 4. Bilinear interpolation for calculating the basis functions at the observational point xm from their values at the four neighboring grid points.
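The bilinear interpolation of basis functions from the four neighboring grid points to an observational point can be sketched as follows. The function names and the corner ordering (southwest, southeast, northwest, northeast) are hypothetical choices for illustration, and a locally rectangular grid cell is assumed.

```python
import numpy as np

def bilinear_weights(x, y, x0, x1, y0, y1):
    """Weights of the four cell corners (SW, SE, NW, NE) for the point (x, y)."""
    tx = (x - x0) / (x1 - x0)   # fractional position across the cell in x
    ty = (y - y0) / (y1 - y0)   # fractional position across the cell in y
    return np.array([(1 - tx) * (1 - ty), tx * (1 - ty),
                     (1 - tx) * ty, tx * ty])

def interp_basis(phi_corners, weights):
    """phi_corners: (4, K) basis values at the corners -> (K,) values at the point."""
    return weights @ np.asarray(phi_corners, dtype=float)

# One mode with values phi = x + 2y at the corners of the cell [1, 2] x [0, 1].
w = bilinear_weights(1.5, 0.25, x0=1.0, x1=2.0, y0=0.0, y1=1.0)
phi_at_obs = interp_basis([[1.0], [2.0], [3.0], [4.0]], w)
```

The weights sum to one, so a field that varies linearly within the cell is reproduced exactly at the observational point.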
b. Twin experiment
A sampling pattern consisting of a data-rich area and a data-poor area offers a challenge to the ability of the basis function to represent the intended fields well, since the projection of the data onto the spectral fields is likely to generate a field defined in the entire domain. To test the ability of the OSD to represent the intended fields, the “observational” data are sampled from c300 beginning with 1 March for 20 days at locations (unchanged during the data assimilation process) given by the horizontal distribution of the Argo floats in March 2003 (Fig. 5). This produces the observational dataset
Fig. 5. Daily sampling taken from the horizontal distribution of the Argo floats in March 2003. It is noted that the observational data-rich area is north of 20°S and that the observational data-poor area is south of 20°S.
The OSD data assimilation process at day = t follows (27) with the following procedure: (i) determine the optimal mode decomposition (Kopt); (ii) compute the difference between the observational and background values at the observational points following (13); (iii) substitute the difference into (26) to obtain the spectral coefficients [a1(z, t), a2(z, t), …, aKopt(z, t)]; and (iv) substitute the spectral coefficients [a1(z, t), a2(z, t), …, aKopt(z, t)] into (27) to get the assimilated initial condition ca(xn, z, t). The dependence of the VC cost function (JK) on the VC dimension (Fig. 6) shows that the optimal mode truncation is Kopt = 12 at 125-m depth (level 13; Table 1) and day 0 (for illustration). The assimilation model is then run forward in time for 24 h with the model field saved at the end of 24 h, which is the background field for the day = (t + 1),
Fig. 6. Optimal mode decomposition (Kopt) at 125-m depth and day 0 is determined by the minimization of the VC cost function (denoted by red square).
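Steps (ii)–(iv) of the assimilation procedure can be sketched for a single level and a single day, given a mode truncation K. This is a simplified illustration under stated assumptions: ordinary least squares stands in for (26), which is not reproduced here, and the names `osd_analysis_step`, `H` (a grid-to-observation interpolation matrix), and `Phi` (basis functions sampled on the grid) are hypothetical.

```python
import numpy as np

def osd_analysis_step(c_b, c_o, H, Phi, K):
    """Return (analysis field on the grid, spectral coefficients).

    c_b : (N,) background at grid points      c_o : (M,) observations
    H   : (M, N) grid-to-observation interpolation (hypothetical name)
    Phi : (N, Kmax) basis functions on the grid; K : number of modes kept
    """
    d = c_o - H @ c_b                      # step (ii): innovation at obs points
    Phi_obs = H @ Phi[:, :K]               # basis functions at obs points
    # Step (iii), simplified: ordinary least squares instead of (26).
    a, *_ = np.linalg.lstsq(Phi_obs, d, rcond=None)
    c_a = c_b + Phi[:, :K] @ a             # step (iv): add the spectral increment
    return c_a, a

# Synthetic check: 40 grid points, 15 observations, 3 retained modes.
rng = np.random.default_rng(0)
Phi, _ = np.linalg.qr(rng.standard_normal((40, 6)))     # orthonormal columns
H = np.eye(40)[rng.choice(40, size=15, replace=False)]  # obs sit on grid points
a_true = np.array([2.0, -1.0, 0.5])
c_b = rng.standard_normal(40)
c_truth = c_b + Phi[:, :3] @ a_true
c_a, a = osd_analysis_step(c_b, H @ c_truth, H, Phi, K=3)
```

In this synthetic case the background error lies entirely within the first three modes, so the coefficients and the truth field are recovered from 15 observations. In the twin experiment, the analysis field would then serve as the initial condition for the next 24-h model run.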
9. Error statistics


Fig. 7. Comparison between the assimilation and nonassimilation runs of the temporally varying basinwide (a) RMSE and (b) BIAS.
Fig. 8. Comparison of temporally varying local RMSE for the (a) assimilation run at day 1, (b) nonassimilation run at day 1, (c) assimilation run at day 10, (d) nonassimilation run at day 10, (e) assimilation run at day 20, and (f) nonassimilation run at day 20.
Fig. 9. Comparison of temporally varying local BIAS for the (a) assimilation run at day 1, (b) nonassimilation run at day 1, (c) assimilation run at day 10, (d) nonassimilation run at day 10, (e) assimilation run at day 20, and (f) nonassimilation run at day 20.
As time progresses, the local (RMSE, BIAS) for the nonassimilation run remains almost the same at day 10 (Figs. 8d and 9d) and day 20 (Figs. 8f and 9f) as at day 1 (Figs. 8b and 9b), but it changes markedly for the assimilation run at day 10 (Figs. 8c and 9c) and day 20 (Figs. 8e and 9e) as compared to day 1 (Figs. 8a and 9a).
The local RMSE is reduced drastically north of 20°S, with a disappearance of the high local RMSE originally (day 1) in the central equatorial Pacific and the eastern tropical North Pacific, and is not reduced or may even be increased slightly south of 20°S, with the appearance of high local RMSE originally (day 1) in the Antarctic Circumpolar Current region near the ice shelf (south of 68°S, 160°–90°W). It is noted that the areas with large reductions in error and bias are the observational data-rich areas, and that the areas with smaller decreases (or even increases) in error and bias are the observational data-poor areas (cf. Figs. 8 and 9 with Fig. 5).
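For reference, basinwide error statistics of this kind can be computed as in the following sketch. It assumes the conventional definitions (root-mean-square and mean of the model-minus-truth difference over the basin); the definitions used in this paper are not reproduced here and may differ, for example in area weighting or sign convention. The function name and the optional `mask` argument are hypothetical.

```python
import numpy as np

def basin_rmse_bias(c_model, c_truth, mask=None):
    """Root-mean-square error and mean bias of a model field against the truth."""
    diff = np.asarray(c_model, dtype=float) - np.asarray(c_truth, dtype=float)
    if mask is not None:
        # Keep only the points inside the basin of interest.
        diff = diff[np.asarray(mask, dtype=bool)]
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    bias = float(np.mean(diff))
    return rmse, bias

# Four grid points with errors (1, 0, 0, -2).
rmse, bias = basin_rmse_bias([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 3.0, 6.0])
```

Applying the same function to the assimilation and nonassimilation runs at each day would give time series comparable in form to the basinwide curves discussed above.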
10. Conclusions
The OSD method has been developed for ocean data assimilation on the basis of the classic theory of the generalized Fourier series expansion, such that any ocean field is represented by a linear combination of the products of basis functions (modes) and corresponding spectral coefficients. The basis functions are the eigenvectors of the Laplacian operator, determined only by the topography with the same lateral boundary conditions for the assimilated variables.
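As a rough illustration of basis functions that are Laplacian eigenvectors fixed by the domain geometry alone, the sketch below builds the five-point discrete Laplacian on the ocean cells of a land–sea mask and extracts its leading eigenvectors. The function name, the unit grid spacing, and the zero-value (Dirichlet) condition on land and at the domain edge are assumptions chosen for brevity; they are not the lateral boundary conditions derived in this paper.

```python
import numpy as np

def laplacian_modes(ocean_mask, n_modes):
    """Leading eigenpairs of the negative 5-point Laplacian on ocean cells."""
    ocean_mask = np.asarray(ocean_mask, dtype=bool)
    n = int(ocean_mask.sum())
    index = -np.ones(ocean_mask.shape, dtype=int)
    index[ocean_mask] = np.arange(n)        # number the ocean cells 0..n-1
    ny, nx = ocean_mask.shape
    A = np.zeros((n, n))
    for j, i in zip(*np.nonzero(ocean_mask)):
        p = index[j, i]
        A[p, p] = 4.0
        for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            jj, ii = j + dj, i + di
            # Land and out-of-domain neighbors are held at zero (assumption).
            if 0 <= jj < ny and 0 <= ii < nx and ocean_mask[jj, ii]:
                A[p, index[jj, ii]] = -1.0
    eigvals, eigvecs = np.linalg.eigh(A)    # symmetric matrix, ascending order
    return eigvals[:n_modes], eigvecs[:, :n_modes]

# All-ocean 5 x 7 rectangle; a real mask would come from bathymetry data.
mask = np.ones((5, 7), dtype=bool)
eigvals, modes = laplacian_modes(mask, n_modes=12)
```

Because the matrix is symmetric, the returned modes are orthonormal, and they depend only on the mask, not on the field being assimilated.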
Unlike the existing ocean data assimilation methods such as optimal interpolation, Kalman filters, and variational methods (originally developed for atmospheric data assimilation), the OSD method has four specific features: (i) effective utilization of the ocean topographic data, (ii) orthonormal and predetermined basis functions that are independent of, and satisfy the same lateral boundary condition as, the assimilated variable anomalies, (iii) no requirement of a priori information on a background error covariance matrix (
The capability of the OSD method is demonstrated through a twin experiment using the Parallel Ocean Program (POP) model for the Pacific Ocean. For an objective evaluation, the “observational” data are not uniformly distributed: there is a data-rich area north of 20°S and a data-poor area south of 20°S. Within 20 days, the basinwide RMSE (BIAS) increases by 4% (2.5%) without the OSD data assimilation and decreases by 14% (50%) with the OSD data assimilation. However, the improvement from the OSD data assimilation depends on the observational data distribution. The local RMSE is reduced drastically in data-rich areas (i.e., north of 20°S) but not in data-poor areas (i.e., south of 20°S).
No use of the a priori
The OSD method proposed here is two-dimensional and conducted at each vertical level with the basis functions given by the eigenvectors of the horizontal Laplacian operator. This can be extended to a three-dimensional OSD method with the basis functions given by the eigenvectors of the three-dimensional Laplacian operator, where much larger matrix operations will be involved. Besides, for the three-dimensional OSD, the surface boundary conditions of the assimilated variable anomalies may vary due to local climatology. Its impact on the three-dimensional basis functions will be investigated in future studies.
Acknowledgments
The Office of Naval Research, the Naval Oceanographic Office, and the Naval Postgraduate School supported this study.
APPENDIX
Derivation of Lateral Boundary Condition [(6)]




REFERENCES
Bretherton, F. P., R. E. Davis, and C. B. Fandry, 1976: A technique for objective analysis and design of oceanographic experiments applied to MODE-73. Deep-Sea Res. Oceanogr. Abstr., 23, 559–582, doi:10.1016/0011-7471(76)90001-2.
Chu, P. C., 2006: Applications to data assimilation. P-Vector Inverse Method, Springer, 407–414.
Chu, P. C., 2011: Global upper ocean heat content and climate variability. Ocean Dyn., 61, 1189–1204, doi:10.1007/s10236-011-0411-x.
Chu, P. C., and C. W. Fan, 2010: A conserved minimal adjustment scheme for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 27, 1072–1083, doi:10.1175/2010JTECHO742.1.
Chu, P. C., L. M. Ivanov, T. P. Korzhova, T. M. Margolina, and O. M. Melnichenko, 2003a: Analysis of sparse and noisy ocean current data using flow decomposition. Part I: Theory. J. Atmos. Oceanic Technol., 20, 478–49, doi:10.1175/1520-0426(2003)20<478:AOSANO>2.0.CO;2.
Chu, P. C., L. M. Ivanov, T. P. Korzhova, T. M. Margolina, and O. M. Melnichenko, 2003b: Analysis of sparse and noisy ocean current data using flow decomposition. Part II: Applications to Eulerian and Lagrangian data. J. Atmos. Oceanic Technol., 20, 492–512, doi:10.1175/1520-0426(2003)20<492:AOSANO>2.0.CO;2.
Chu, P. C., L. M. Ivanov, and T. M. Margolina, 2004a: Rotation method for reconstructing process and field from imperfect data. Int. J. Bifurcation Chaos, 14, 2991–2997, doi:10.1142/S0218127404010941.
Chu, P. C., G. H. Wang, and C. W. Fan, 2004b: Evaluation of the U.S. Navy’s Modular Ocean Data Assimilation System (MODAS) using the South China Sea Monsoon Experiment (SCSMEX) data. J. Oceanogr., 60, 1007–1021, doi:10.1007/s10872-005-0009-3.
Chu, P. C., L. M. Ivanov, and T. M. Margolina, 2005a: Seasonal variability of the Black Sea chlorophyll-a concentration. J. Mar. Syst., 56, 243–261, doi:10.1016/j.jmarsys.2005.01.001.
Chu, P. C., L. M. Ivanov, and O. M. Melnichenko, 2005b: Fall–winter current reversals on the Texas–Louisiana continental shelf. J. Phys. Oceanogr., 35, 902–910, doi:10.1175/JPO2703.1.
Chu, P. C., L. M. Ivanov, O. V. Melnichenko, and N. C. Wells, 2007: Long baroclinic Rossby waves in the tropical North Atlantic observed from profiling floats. J. Geophys. Res., 112, C05032, doi:10.1029/2006JC003698.
Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. J. Meteor. Soc. Japan, 75, 257–288.
Galanis, G. N., P. Louka, P. Katsafados, G. Kallos, and I. Pytharoulis, 2006: Applications of Kalman filters based on non-linear functions to numerical weather predictions. Ann. Geophys., 24, 2451–2460, doi:10.5194/angeo-24-2451-2006.
Ide, K., P. Courtier, and M. Ghil, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181–189.
Large, W., and S. Yeager, 2009: The global climatology of an interannually varying air–sea flux data set. Climate Dyn., 33, 341–364, doi:10.1007/s00382-008-0441-3.
Lozano, C. J., A. R. Robinson, H. G. Arango, A. Gangopadhyay, Q. Sloan, P. J. Haley, L. Anderson, and W. Leslie, 1996: An interdisciplinary ocean prediction system: Assimilation strategies and structured data models. Modern Approaches to Data Assimilation in Ocean Modeling, P. Malanotte-Rizzoli, Ed., Elsevier Oceanography Series, Vol. 61, Elsevier, 413–452.
Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323–340, doi:10.1016/S0924-7963(97)00109-7.
Smith, R. D., and P. R. Gent, Eds., 2002: Reference manual for the Parallel Ocean Program (POP): Ocean component of the Community Climate System Model (CCSM2.0 and 3.0). Los Alamos National Laboratory Tech. Rep. LA-UR-02-2484, 75 pp. [Available online at http://www.cesm.ucar.edu/models/ccsm3.0/pop/doc/manual.pdf.]
Song, Y. T., D. B. Haidvogel, and S. M. Glenn, 2001: Effects of topographic variability on the formation of upwelling centers off New Jersey: A theoretical model. J. Geophys. Res., 106, 9223–9240, doi:10.1029/2000JC000244.
Sun, L. C., 1999: Data inter-operability driven by oceanic data assimilation needs. Mar. Technol. Soc. J., 33, 55–66, doi:10.4031/MTSJ.33.3.7.
Sun, L. C., and Coauthors, 2009: The data management system for the Global Temperature and Salinity Profile Programme (GTSPP). Proceedings of the OceanObs’09: Sustained Ocean Observations and Information for Society, J. Hall, D. E. Harrison, and D. Stammer, Eds., Vol. 2, ESA Publ. WPP-306, doi:10.5270/OceanObs09.cwp.86.
Tang, Y., and R. Kleeman, 2004: SST assimilation experiments in a tropical Pacific Ocean model. J. Phys. Oceanogr., 34, 623–642, doi:10.1175/3518.1.
Vapnik, V., 2000: The Nature of Statistical Learning Theory. Springer, 315 pp.
Wang, X., P. C. Chu, G. Han, W. Li, X. Zhang, and D. Li, 2012: A fully conserved minimal adjustment scheme with (T, S) coherency for stabilization of hydrographic profiles. J. Atmos. Oceanic Technol., 29, 1854–1865, doi:10.1175/JTECH-D-12-00025.1.