A three-dimensional variational data assimilation (3DVAR) method is implemented in a coupled physical–biogeochemical (CPB) model in the Baltic Sea. This study carries out a 10-yr assimilation experiment with satellite sea surface temperature (SST) and observed in situ temperature (T) and salinity (S) profiles. The impact of the assimilation is assessed with the focus on how the biogeochemical model responds to the improved hydrodynamics. The assimilation of temperature and salinity data yields considerable improvements in the physical model. On a basin scale, the mean bias of SST, T, S, and mixed layer depth (MLD) is decreased by 0.18°C (57%), 0.31°C (49%), 0.34 psu (43%), and 1.8 m (43%), respectively. More importantly, the biogeochemical simulation is improved in response to the physical data assimilation. Compared with in situ observations, the mean biases of chlorophyll a (Chl), dissolved inorganic nitrogen (DIN) and phosphorus (DIP) are decreased by 0.09 mg m−3 (15.5%), 0.19 mmol m−3 (9%), and 0.15 mmol m−3 (23%). Physical data assimilation also improves the simulated variability of Chl, DIN, and DIP and their correlations with observation. Compared with satellite observations, the mean bias of surface chlorophyll is reduced by 0.10–0.32 mg m−3 especially in the Skagerrak–Kattegat area and Bornholm basin. The decrease of total Chl change is caused by different mechanisms for winter and summer. While the deepened mixed layer acts as a dilution factor in winter, strengthened stratification agrees well with the decrease of chlorophyll in summer. In the vertical, relatively large changes of DIN and DIP occur below 60 m, which corresponds to the mean permanent halocline depth (~60–80 m) of the Baltic Sea.
Coupled physical–biogeochemical (CPB) models are important tools to understand the role of external forcing such as climate change and nutrient loads in controlling biogeochemical processes in the Baltic Sea (Neumann 2010; Maar et al. 2011). The simulation and prediction of marine ecosystems with CPB models have also triggered growing needs to develop data assimilation capability (George et al. 2005; Nerger and Gregg 2007; Siddorn et al. 2007; Brasseur et al. 2009). Within CPB models, biogeochemical tracers are coupled to physical circulation models through advection and diffusion. As a consequence, the hydrodynamics imposes strong constraints on the simulation of the ecosystems. The quantification of such constraints is thus of particular interest for the development and application of ocean ecosystem models.
CPB models have been increasingly run in an operational way as well as for research purposes. For instance, in European regional seas, as of January 2012, there were six CPB models regularly providing hindcast and nowcast products to the MyOcean project (http://www.myocean.eu). The attraction of data assimilation in CPB models is that it provides a way to extend the observations in a dynamically consistent way by incorporating both physical and biological observations into the coupled system. However, compared with the assimilation in physical models, the assimilation in biogeochemical models is far less mature despite the fact that a variety of biogeochemical models have been developed and applied in the Baltic Sea (Edelvang et al. 2005; Savchuk 2005; Eilola et al. 2009; Maar et al. 2011; Wan et al. 2013). One reason is that many biogeochemical processes are not fully understood and their uncertainties are difficult to accurately quantify especially in a three-dimensional (3D) space. The availability of biogeochemical observations is another limiting factor for data assimilation studies with 3D coupled models. Scarcity of observations also limits the capacity of fully validating CBP models in terms of temporal–spatial variability, which in turn hinders relevant improvements to model parameterization for certain processes. As the data assimilation in physical models has achieved substantial progress for the past decade in the Baltic Sea (Larsen et al. 2007; Liu et al. 2009; Fu et al. 2011b; Zhuang et al. 2011; Fu et al. 2012; Sørensen et al. 2004; Losa et al. 2014), quantifying the impact on biogeochemical variables also becomes increasingly important for future studies with CPB models.
In CPB models, data assimilation can be implemented in three ways: in the physical model, in the biogeochemical model, or in both of them. In the physical model, data assimilation helps to improve the circulation by altering initial and boundary conditions. Meanwhile, the biology is responding to the improved circulation during the CPB model run. For instance, the resultant changes in vertical density structure, mixing, and upper-ocean vertical fluxes may critically affect biological simulations (Brasseur et al. 2009; Raghukumar et al. 2015). Biogeochemical data assimilation often separates into two types: parameter estimation and state estimation. With parameter estimation, the model parameters are modified to fit the constraint of observations (Schartau and Oschlies 2003; Hemmings et al. 2003; Losa et al. 2003, 2004; Friedrichs 2001; Friedrichs et al. 2007). State estimation focuses on modifying state variables to fit the observations (Natvik et al. 2001; Natvik and Evensen 2003; Carmillet et al. 2001; Fontana et al. 2010; While et al. 2012). With CPB models, some studies also attempted the assimilation of physical and biogeochemical data at the same time (Ourmières et al. 2009; Shulman et al. 2013).
The main objective of this study is to evaluate the impact of hydrodynamic changes from physical data assimilation on 3D biogeochemical variables. We seek to implement the assimilation only into the physical model based on several considerations. First, biogeochemical models are strongly subject to the boundary conditions provided by the physical model (Doney et al. 2001; Oschlies and Garçon 1999; Lima and Doney 2004). Data assimilation produced significant improvements for the mixed layer dynamics, vertical stratification, and transport estimation in the Baltic Sea (Fu et al. 2011a, 2012; Fu 2013), which were also shown to be important in controlling the marine biogeochemistry (Oschlies and Garçon 1999; Doney et al. 2004). Second, data assimilation in physical models serves as an important tool to identify some problems in biogeochemical processes. For example, misrepresented processes may degrade biogeochemical model performance even with improved hydrodynamics (Ourmières et al. 2009). This issue is also dependent on the study region because the evolution of a biogeochemical regime at basin scale may vary significantly (Lévy et al. 2005; Oschlies and Garçon 1999; Lima and Doney 2004; Maar et al. 2011). In this sense, quantification with a Baltic CPB model will provide some insights for the regional modeling community regarding specific parameterization. Third, assimilations aiming at 3D biogeochemical state estimation usually require the estimation of background error covariance, whose uncertainties are much more difficult to quantify than in the physical model. This is because the biogeochemical model represents only a small set of the processes controlling the ecosystem and the model often involves a suite of empirical approximations and assumptions. Finally, temperature (T) and salinity (S) changes could be an important factor modulating the Baltic Sea ecosystem (Neumann 2010; Gustafsson et al. 2012) during long-term simulations.
The paper is organized as follows: A brief description of the CPB model is presented in section 2. The assimilation method and datasets are described in section 3 together with the experimental setup. The experiment results are compared with different observations for both physical and biogeochemical variables in section 4. Discussion and conclusions are presented in section 5.
2. The model
a. Physical model
The physical model in this study is a High-Resolution Baltic Sea Model (HBM), which covers the domain 53°31′30″–65°52′30″N, 14°37′30″–30°17′30″E. This model also forms the basis of a common Baltic Sea model for providing marine core service under the Global Monitoring for Environment and Security (GMES) program since 2009. The 3D hydrostatic circulation model features free-surface, different stability functions in the vertical, two-way nesting in the Danish Strait and a k–ϖ turbulence mixing scheme accounting for buoyancy-affected geophysical flows (Umlauf et al. 2003). It also takes into account breaking surface waves through parameterization. More detailed description of the model specification and numerical features can be found in the Danish Meteorological Institute (DMI) scientific reports (Berg and Poulsen 2011; http://www.dmi.dk/fileadmin/Rapporter/SR/sr12-03.pdf).
In this study, the model was set up with a horizontal resolution of about 10 km for the whole Baltic Sea. In Danish waters (53°35′15″–57°35′45″N, 9°20′25″–14°49′35″E), the horizontal resolution is increased to about 1.8 km and the finer grids are two-way nested with the coarse grids for the Baltic Sea (Fig. 1). The two-way nesting technique is similar to Barth et al. (2006) and modifications are made to the HBM setup to ensure the balance of mass and volume exchanges across the nesting boundaries. There are 50 vertical layers for the coarse grid (10 km) model with a top-layer thickness of 8 m. The rest of the layers in the upper 80 m have 2-m vertical resolution. Below 80 m, the layer thickness increases gradually from 4 to 50 m. In the nested domain, there are 52 layers with a top layer of 2 m and then with a 1- or 2-m layer thickness for the rest of the 51 layers. The main purpose is to resolve the complex bathymetry and water exchange in the shallow inner Danish waters (She et al. 2007).
b. Biogeochemical model
The biogeochemical model (Fig. 2) used in this study was similar to the original version developed by Neumann (2000) and Neumann et al. (2002). There are 10 state variables in the model. The nutrient variables are dissolved ammonium, nitrate, and phosphate. Primary production is calculated based on three functional phytoplankton groups: diatoms, flagellates, and cyanobacteria. Since cyanobacteria are able to fix and utilize dinitrogen, the model assumes that phosphate is the only limiting nutrient for this functional group. Cyanobacteria are also considered to be a nitrogen source for the system. A dynamically developing bulk zooplankton variable provides grazing pressure on the phytoplankton. Accumulated dead particles are represented in a detritus pool. Oxygen in the model is coupled to biogeochemical processes via stoichiometric ratios while oxygen levels control the process of denitrification and nitrification. More details are described in some previous studies (Wan et al. 2012; Maar et al. 2011). This model is nitrogen based, and phosphorus is coupled to nitrogen via the Redfield ratio. The state variable oxygen is also taken into account, which includes hydrogen sulfate in the model as the negative oxygen equivalent (Neumann et al. 2002). Temperature plays a limiting role in flagellate growth and detritus mineralization.
In this study, we follow the three-dimensional variational data assimilation (3DVAR) scheme used in Fu et al. (2012) to assimilate temperature and salinity profiles (Fig. 3). In general, the following cost function is minimized to seek the optimal solution of the model state x:
where x is the model state to be estimated. The variable xb is the background state vector, and the observation state vector is denoted as maps the model state to observation space and generates the analysis equivalent of observation, which is able to be compared with the measurements. The term H, by definition, is the nonlinear observation operator, and the superscript T denotes matrix transpose. In (1), background error covariance and observational error covariance weigh the misfit between the analysis and background and the misfit between analysis and observation, respectively. The incremental method (Courtier et al. 1994) is used to minimize the cost function J(x) with respect to x:
where is the innovation vector, H is the linearized observation operator evaluated at x = xb, and δx = x − xb is the analysis incremental vector.
An isotropic recursive filter (Liu et al. 2009; Zhuang et al. 2011) is used to approximate the horizontal part of background error covariance . An empirical function is used to represent the vertical correlation, which helps to reduce the computational cost. For the same reason, the background error covariance is represented with dominant EOF modes. Other parameters and empirical functions used in the recursive filter are described in detail in Zhuang et al. (2011). The observation errors at different sites are assumed to be uncorrelated in space; thus, the observation error covariance matrix is diagonal and the specification of is then reduced to a list of observation error standard deviations for each data type, variable, level, and so on. In our current scheme, the standard deviation for T and S profiles is set to 0.3°C and 0.4 psu, respectively. For satellite SST, the standard deviation is 0.4°C.
The temperature and salinity profiles are obtained from the database of International Council for the Exploration of the Sea (ICES). The ICES Data Center (http://ecosystemdata.ices.dk) collects and compiles a wide variety of marine data and metadata types from coastal countries bordering the North Atlantic and Baltic Sea. The T/S profiles have been used in some previous studies such as long-term assimilation and operational forecasts (Fu et al. 2012, 2011b; Zhuang et al. 2011). Satellite-derived chlorophyll a (Chl) data used for validation are generated from daily Medium Resolution Imaging Spectrometer (MERIS) data into the monthly mean and have a resolution of 1.2 km in the Baltic Sea. More information can be found on the website (http://marcoast.dmi.dk/chlorophyll.php). In addition, field observations for Chl, dissolved inorganic nitrogen (DIN), and dissolved inorganic phosphorous (DIP) are also obtained from the ICES. The satellite sea surface temperature (SST) data have a resolution of about 5 km. The product is a synthesis of multiple satellite data sources by an optimal interpolation (Høyer and She 2007) and is now available as the operational product in the DMI (http://ocean.dmi.dk/satellite/index.uk.php). The satellite SST data are well comparable to in situ SST data (root-mean-square difference <0.15°C) in open areas. In areas near coastlines, the differences are generally within 0.3°C.
c. Experimental setup
We carried out two experiments from 2000 to 2009 with the coupled HBM model. The one without assimilation is referred to as the control run, while the assimilative run incorporates available SST and T/S profiles with the 3DVAR scheme as that in Fu et al. (2012). Both runs use the same meteorological forcing and boundary conditions. The T/S profiles are quality-controlled dynamically during the assimilation, which helps to avoid strong shocks to the circulation model (Fu et al. 2012). In addition to the T/S profiles, satellite SST is also assimilated. As the spatial resolution of the satellite SST (about 5 km) is higher than the model grid, some subgrid-scale features may be introduced into the model, which is often referred to as “representative errors” in data assimilation. To deal with this, we add spatial smoothing to the SST product and sample the data at 10-km resolution, which was denoted as “super-obs” (Oke and Sakov 2008).
The initial conditions of the physical model are taken from a 20-yr reanalysis run conducted by Fu et al. (2012) with only the physical HBM model. The meteorological forcing is based on a reanalysis using the regional climate model HIRHAM through a dynamic downscaling (including a daily reinitialization) from ERA-Interim global reanalysis (Christensen et al. 2007). The surface momentum and heat fluxes are calculated by bulk formulations with hourly HIRHAM data of 10-m wind, 2-m air temperature, mean sea level pressure, surface humidity, and cloud cover. The daily river runoffs are provided by the operational hydrological model of the Swedish Meteorological Hydrological Institute (Bergström 1992) in combination with observations from the German Bundesamt für Seeschifffahrt und Hydrographie and climatology.
The ecosystem model is initialized with the data from World Ocean Atlas 2001 (WOA01) for ammonia, nitrate, phosphate, and dissolved oxygen (Conkright et al. 2002). The rest of the state variables are initialized with zero. WOA01 data are also used to provide lateral boundary conditions. Instead of WOA01 data, an internal phosphorus source from anoxic sediments (Savchuk 2005) is taken into account, which is released into the Baltic Proper at about 18 000 tons annually.
In this section, we evaluate the impact of T/S data assimilation on some physical and biogeochemical processes. The emphasis is placed on the upper-layer dynamics, changes in Chl and macronutrients such as DIN and DIP. For the physical model, we focus on the changes of two important indicators: SST and mixed layer depth (MLD). SST is important for biological growth and respiration rates as well as air–sea gas exchange, and MLD influences nutrient flux into the photic zone and the average light observed by the phytoplankton. Surface salinity is also shown to illustrate the impact of physical data assimilation. The validation of salinity and temperature at depth is not elaborated here. One reason is that the changes of the assimilative run at depth are found to be similar to those in the long-term reanalysis experiments (Fu et al. 2012). In addition, the subsurface changes of T/S could be partially manifested in the MLD changes. For the biogeochemical model, the comparisons with satellite Chl data and in situ nutrients can be considered as independent because the assimilation is only conducted in the physical model.
a. Physical model
1) SST and SSS
Fu et al. (2012) found that the assimilation of T/S profiles contributed greatly to improving the stratification in the Baltic Sea and baroclinic velocity, which led to more realistic simulation of upper-layer salt and volume transport (Fu 2013). In addition to T/S profiles, satellite SST is also assimilated in this study. In Fig. 4, the differences between the assimilative and control runs are shown for the mean SST and SSS (over 2000–09) in the Baltic Sea. The increments indicate the changes of the mean bias. The SST is more strongly constrained by the assimilation because satellite SST is directly assimilated. Spatial variations are pronounced in the mean SST differences, ranging from −0.5° to 0.5°C. In the northern Kattegat and Bothnian Bay, the SST is increased by about 0.3°C in the assimilative run. At the same time, the SST is markedly reduced (~−0.5°C) in the central Baltic Sea where strong negative changes extend eastward from the Bornholm area to the Gulf of Gdansk and then to the eastern Baltic Proper. There are also negative SST changes in the Gulf of Riga. The spatial extent is about hundreds of kilometers and outstrips the typical diameter of mesoscale eddies in this region. This large-scale pattern suggests a systematic error of the physical model in this area. Surface salinity is increased by 0.1–0.7 psu in most of the Baltic Sea, especially in the central Baltic, Bothnian Sea, and Bothnian Bay. As the salinity in the control run is lower than the in situ observations (Table 1), the increases suggest the improvements due to the T/S assimilation. In addition, the HBM model was known to have a negative bias for salinity in the Baltic Sea (Liu et al. 2009; Maar et al. 2011).
On the basin scale, the difference of averaged SST and SSS between the assimilative and control run is calculated and the time series from 2000 to 2009 are presented in Figs. 4c and 4d. The SST in the assimilative run is generally decreased, which is consistent with Fig. 4a. From Table 1, the climatology SST of the control run is 0.26°C higher than the observations. The SST decrease suggests some improvements on the control run. The salinity in the control run shows a strong negative bias in the Baltic Sea (Fu et al. 2012; Fig. 9a). Therefore, the increase (0.25–0.41 psu) of the mean SSS in the assimilative run implies a substantial improvement for salinity in the surface layer. In addition, the difference of SSS between the assimilative and control run exhibits an increasing trend from 2000 to 2009. This reflects a cumulative effect of the assimilation because the amount of observations assimilated is continuously increasing from 2000 to 2009, and the model state is gradually drawn close to the observations.
2) Mixed layer depth
MLD refers to a depth above which a given physical parameter of the ocean state (e.g., temperature and density) is assumed to be mixed and homogeneous to a certain level (e.g., regarding some space/time scales). The MLD here is defined as the depth at which the surface temperature is decreased by 0.5°C (November to April)/0.3°C (other months). Variations in the MLDs have a strong influence on ocean biogeochemistry through their control of nutrient fluxes to surface waters, the irradiance received by phytoplankton, and the ventilation of subsurface waters. Figure 5a shows the spatial distribution of the MLD differences between the assimilative and control run over the period 2000–09. Large differences are found in the Baltic Proper where the MLD is increased by around 3.5 m in the assimilative run. Similarly, positive differences also occur in the Bothnian Sea and Skagerrak. Comparatively, the differences are small in the shallow Danish waters and Arkona basin. The spatial pattern is consistent with Fu et al. (2012), which revealed that the assimilation of T/S profiles improved the MLD simulation more strongly in the deep waters of wintertime than in summer and shallow waters. Apart from these regions, there are also negative MLD differences in the Bothnian Bay and Gulf of Finland. On the basin scale, we find that the averaged MLD is increased by ~1.8 m from 2000 to 2009 in the assimilative run as compared to the control run (Fig. 5b). Seasonal cycles are also clear in Fig. 5b with much deeper MLD in winter than in summer. The MLD changes are also relatively weaker in summertime compared to that in winter (the MLD increase is about 9 m in March 2008). This can be expected because the effect of heating plays a dominant role in summer for the mixed layer formation in the Baltic.
3) Overall impact of the assimilation
We summarize the overall impact of the T/S assimilation in Table 1. The SST, temperature, salinity, and MLD are compared with observations. Compared with satellite SST, the control run has a positive bias of 0.26°C and a RMSE of 0.92°C. In the assimilative run, the SST bias is decreased by 0.11°C (57%) and the RMSE is decreased by 0.34°C (37%). Compared with in situ observations, the mean bias of temperature, salinity, and MLD is reduced by 0.31°C (49%), 0.34 psu (43%), and 1.79 m (43%), respectively. Similarly, the total RMSE is reduced by 0.41°C (26%), 0.40 psu (17%), and 2.95 m (38%). The overall impact of the assimilation is also quantified in a normalized Taylor diagram (Fig. 6). The correlation, standard deviation (SD), and centered pattern root-mean-square error (RMSE_cp) (Taylor 2001) are calculated with all available in situ observations of temperature, salinity, and MLD during the assimilation period. The RMSE_cp, the distance from the reference point to the model symbol, does not include bias information. RMSE_cp and SD values are normalized by the respective standard deviation of the observations (3.0°C for SST, 2.5°C for temperature, 2.7 psu for salinity, and 7.75 m for MLD). It is clear that SST, temperature, and MLD are considerably improved in the assimilative run. Variability of SST, temperature, and MLD in the assimilative run is stronger (increased by ~5%) than the control run and closer to the observed variability. However, the variability of salinity in the assimilative run is weaker than that of the control run. Correlations with observations are increased to varying degrees in the assimilative run especially for MLD (~21%) and temperature (~9%). In addition, model–data misfits are generally reduced in terms of RMSE_cp. The RMSE_cp is decreased by 36%, 23%, 14%, and 39% for SST, temperature, salinity, and MLD, respectively.
b. Biogeochemical model
1) Comparison with satellite chlorophyll data
In this study, surface chlorophyll concentration is derived from the N biomass of three groups of phytoplankton using a constant conversion factor [2.1 mg Chl m−3 (mmol m−3)−1] similar to Maar et al. (2011). Figure 7a presents the satellite chlorophyll observations, while Figs. 7b and 7c are the differences with observations for the control run and assimilative run. The satellite observations show large chlorophyll concentrations in the coastal areas (above 5.5 mg m−3), whereas the concentrations are relatively small in the pelagic zone (below 3 mg m−3). This is reasonable because coastal zones are more likely influenced by nutrient discharges from river runoffs than the pelagic zones in the Baltic Sea. Higher nutrient concentration will enhance phytoplankton growth under similar light and temperature conditions. In the control run (Fig. 7b), the chlorophyll concentrations are generally lower in areas near the coastlines where the magnitude is only half of the observations. In the Kattegat, the control run yields higher Chl concentrations than the satellite data (about 2 mg m−3). Similarly, higher Chl concentrations are found in the Arkona and Bornholm basin. It is also a little higher in a stripe area in the eastern Baltic Proper to the Gulf of Riga. In the assimilative run, the correspondence with the observation is noticeably improved in many areas. In particular, the Chl concentrations are visibly decreased in the Skagerrak and Bornholm basin. In the Gulf of Finland and Bothnian Sea, there are also small improvements as compared with the satellite data. Near the lateral boundary, however, the changes in the assimilative run are very minor. The places with large improvements do not occur where the SST and MLD changes are large. For instance, in the central Baltic Sea, the large SST and MLD changes only produce small Chl changes. Relatively, in the Bornholm basin, there are large Chl changes. This could be understood from the perspective of assimilation, which places an integral constraint on the biogeochemical model by altering the mixing, advection, and circulation. The local changes in Chl concentration are also affected by the interaction of physical and biological fields, which may be amplifying or damping. Nevertheless, the result is pretty encouraging since the assimilation is only conducted in the physical model.
On the basin scale, time series of averaged surface Chl concentration are presented in Fig. 8a. Available satellite observations from 2004 are also added in the figure. At the surface, chlorophyll concentration is characterized by strong seasonal cycles from 2000 to 2009. Lower concentration occurs from October to February rather than the rest of the year in the Baltic Sea. In January, the concentration is lowest because light availability is the dominant limiting factor at high latitudes. However, during this time, strong wind-driven turbulence and low water temperatures break down the stratified water column formed during the summer. The peak of chlorophyll concentration appears in March because the available nutrients together with increased light availability permit a rapid growth rate of the phytoplankton. Compared with satellite observation, the assimilative run fits better with the observation than the control run, which overestimates the surface Chl concentration particularly for the spring bloom period. The RMSE is about 1.14 mg m−3 compared to 1.23 mg m−3 in the control run. Improvements are more significant (about 0.3 mg m−3) from April to September. We also calculate the rank correlation coefficient, which measures the extent to which one variable tends to increase as the other variable increases. It is about 0.32 between the changes of surface Chl and MLD and 0.58 between surface Chl and SST.
Figure 8b shows the time series of basin-averaged Chl concentration above 60 m. Here, the depth-averaged Chl is used to represent the characteristics of primary production in the upper layers (Nerger and Gregg 2007; Behrenfeld and Falkowski 1997). By comparison with the control run, the depth-averaged Chl concentration is generally decreased in the assimilative run, ranging from 0.05 to 0.27 mg m−3. The low Chl concentration is further reduced in winter. This is conceivable as an increased MLD in winter (Fig. 5b) acted as a dilution factor. The same amount of Chl is mixed farther down to a larger water volume in the assimilative run than the control run. In summer, the Chl is decreased in the assimilative run more remarkably than in winter. This is in good agreement with the stratification changes (simply defined as the density difference between the surface and a depth of 45 m), as shown in Fig. 8c. The stratification in the assimilative run is generally increased in summer particularly from May to August (3% to 15%). The enhanced stratification will limit the upward flux of nutrients into the mixed layer and reduce the nutrient availability. As a result, the phytoplankton growth is limited. Our result is consistent with studies in other regions that demonstrated the crucial role of upper-layer mixing and stratification in biogeochemical processes (Oschlies and Garçon 1999; Doney et al. 2004). It is also interesting to note that the differences of Chl concentration between the assimilative and control runs agree very well with that of MLD. The rank correlation coefficient is 0.81 between the MLD changes and that of the depth-averaged Chl, while it is only 0.38 between SST and depth-averaged Chl changes.
2) Comparison with field data
In situ observations (locations shown in Fig. 1) are used to further compare the control and assimilative runs. The original ICES data are processed into monthly means, and vertical interpolation is used to obtain the model counterpart above 150 m, as most data occur above that depth for Chl, DIN, and DIP (Wan et al. 2012). Such assessment is of particular interest for operational forecasts, one object of which is to reduce the prediction error of biogeochemical variables.
Mean bias, RMSE_cp, and RMSE are calculated over the period 2000–09 (Table 1) to measure the model–data misfit. These statistics present a quantitative way to measure the overall impact of physical data assimilation on the biogeochemical variables. Compared with the in situ observations, DIN and DIP are underestimated at these locations (bias is −2.16 and −0.65 mmol m−3), while Chl is overestimated by 0.58 mg m−3. In the assimilative run, mean bias, RMSE_cp, and RMSE for DIN, DIP, and Chl are generally decreased. For example, the mean bias of Chl, DIN, and DIP is decreased by 0.09 mg m−3 (15.5%), 0.19 mmol m−3 (9%), and 0.15 mmol m−3 (23%). Similarly, the RMSE is reduced by 0.09 mg m−3 (8%), 0.22 mmol m−3 (5.6%), and 0.25 mmol m−3 (21%) for Chl, DIN, and DIP, respectively.
In Fig. 9, the time series of mean DIN and DIP concentrations are compared with the in situ observations between 0 and 150 m. Since most of the in situ Chl data below 80 m are missing, the time series of Chl is compared for the first 80 m. The model data are interpolated horizontally and vertically to match the observation. Similar to the basin-averaged Chl, the Chl concentrations on these observation locations are overestimated in the control run (Fig. 9a). The Chl concentration is decreased visibly in the assimilative run especially from 2006 to 2009. The control run also underestimates the DIN and DIP concentrations on the observation locations. In the assimilative run, the underestimated nutrients are elevated to varying degrees (see also Table 1). It should be noted that the mean concentrations here are calculated above 150 m. For the DIN (Fig. 9b), the model–data differences are slightly reduced and large discrepancies still exist in April to August. Relatively, the DIP is more significantly (Fig. 9c) improved in the assimilative run particularly in summertime (May to September). This comparison is complementary to that with satellite data and helps to further validate the efficacy of physical assimilation in a three-dimensional way. One disadvantage is that it is difficult to ascribe the changes to certain biogeochemical or physical processes in this figure because these observations are distributed over a large portion of the Baltic Sea.
Vertical profiles of the DIN, DIP, and Chl are compared with the in situ observation (Fig. 10) for the first 100 m. The mean density profile is also calculated at the observation locations (Fig. 10d). It is clear that mean stratification is enhanced as the vertical density gradient is considerably increased in the assimilative run. Accordingly, mean Chl concentration is decreased above 60 m in the assimilative run and the profile fits better with the observations (Fig. 10a). It should be noted that the difference in chlorophyll is very small given the large differences in density. The reduced temperature in surface layers may also contribute to the Chl decrease. Compared with the control run, the DIN and DIP are more improved below 60 m than near the surface. This is consistent with the averaged halocline depth that occurs at the depth of around 60–80 m (Backer and Leppänen 2008) in the central Baltic Sea and Gulf of Finland. In the Baltic Sea, the halocline was shown to respond most strongly to the assimilation of T/S profiles (Fu et al. 2012). Therefore, it is reasonable that large improvements occur below 60 m.
Chl, DIN, and DIP changes in response to the physical data assimilation are also quantified in the Taylor diagram (Fig. 6). The ICES in situ observations from the surface to 150 m are used. RMSE_cp and SD values are normalized by the respective standard deviation of the observations (1.15 mg m−3 for Chl, 3.8 mmol m−3 for DIN, and 0.8 mmol m−3 for DIP, respectively). We see that the correlation with observations is improved for Chl, DIN, and DIP. For DIP, the correlation is slightly increased from 0.81 to 0.87. The correlation with observations is also improved by ~3% for DIN and Chl. The RMSE_cp for Chl, DIN, and DIP is reduced by 15%, 4.4%, and 21%, respectively. Compared with the control run, standard deviation of Chl, DIN, and DIP is lower than in the assimilative run and closer to the observations.
5. Discussion and conclusions
In the Baltic Sea, a 3DVAR method has been implemented into a CPB model to assimilate satellite SST and in situ T/S profile data. The main objective of this study is to examine the response of a biogeochemical model to improved hydrodynamics induced by the assimilation in the physical model. The results are evaluated by comparison with various satellite/in situ observations across the Baltic Sea.
The assimilation of T/S profiles is shown to be able to substantially improve the simulation of the physical model. Spatially, mean SST is reduced for most of the Baltic, while the difference of MLD between the assimilative and control runs exhibits more spatial variations. Compared with the in situ observations, the mean bias of SST, T, S, and MLD is decreased by 0.18°C (57%), 0.31°C (49%), 0.34 psu (43%), and 1.8 m (43%), respectively. Similarly, the RMSE is reduced by 0.34°C (37%), 0.41°C (26%), 0.40 psu (17%), and 2.95 m (38%). In response to the changes in the physical model, small improvements are found in the biogeochemical model. The mean bias of Chl, DIN, and DIP is decreased by 0.09 mg m−3 (15.5%), 0.19 mmol m−3 (9%), and 0.15 mmol m−3 (23%). The total RMSE is reduced by 0.09 mg m−3 (8%), 0.22 mmol m−3 (5.6%), and 0.25 mmol m−3 (21%). A Taylor diagram also shows improved correlation with observations for Chl, DIN, and DIP. The RMSE_cp is reduced by 0.06 mg m−3 (15%), 0.15 mmol m−3 (4.4%), and 0.19 mmol m−3 (21%). The variability of Chl and DIP is also improved in the assimilative run.
Comparison with satellite data shows that surface chlorophyll is improved especially in the Skagerrak–Kattegat area and Bornholm basin. The chlorophyll concentration is also slightly increased in the lateral boundary areas in the assimilative run but still lower than the observations. The chlorophyll concentration is decreased in the assimilative run both in summer and winter, but the mechanism is different. In winter, the increased MLD acts as a dilution factor that tends to reduce the Chl concentration. In contrast, strengthened stratification in summertime limits the entrainment of nutrients from below into the euphotic zone and thus decreases the phytoplankton growth. It is worth pointing out that the interaction between the euphotic zone and the mixed layer may also affect net photosynthesis in the water column. This interaction is important in the Baltic Sea where the mixed layers shows strong spatial variations from being shallower than the euphotic zone to being much deeper than the euphotic zone (Kratzer et al. 2003). As a result, the above conclusion should be used with caution.
The improvements of chlorophyll in the assimilative run are encouraging compared with some previous studies that directly assimilated chlorophyll data. We find that the decreases of mean biases (0.09 mg m−3) and RMSE (0.09 mg m−3) are comparable with some of the studies. For instance, the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) chlorophyll data were directly assimilated into a global 3D circulation model in Gregg (2008), and the differences between the assimilative and control runs were found to be about −0.05 to 0.05 mg m−3 in most of the global ocean in March 2001. In another study with SeaWiFS data, Fontana et al. (2010) assimilated chlorophyll data in the northwestern Mediterranean Sea using a singular evolutive extended Kalman filter (SEEK). They found that the data assimilation system reduced the mean absolute error of 1.51 to 0.77 mg m−3 (49% gain) in the year 2001. In this study, the mean absolute error is reduced from 1.23 to 1.14 mg m−3 (8% gain). In Hu et al. (2012), a local ensemble Kalman filter was applied to assimilate SeaWiFS chlorophyll data into a three-dimensional biological model in the Middle Atlantic Bight. RMSE of surface chlorophyll was reduced by 0.12–0.2 mg m−3. More recently, Teruzzi et al. (2014) used a 3DVAR method to assimilate satellite chlorophyll data in a 3D operational biogeochemical forecast system in the Mediterranean Sea. They found that the largest improvement was observed for the Levantine subregion with a 36% and 31% reduction in absolute bias and RMS, respectively (it is 15.5% and 8% reduction in our study).
Comparison with in situ observations shows that large improvements for DIN and DIP occur below 60 m, which corresponds to the halocline depth. The mean absolute error is decreased by 29% and 41% for DIN and DIP. The T/S data assimilation modifies vertical density structure and the changed stratification affects nutrient concentrations. As the halocline responds most strongly to the T/S assimilation, it is reasonable that large DIN and DIP changes occur at that depth. It is also important to note that the enhanced stratification in the assimilative run could further hinder vertical water exchange, leading to prolonged duration of anoxic periods in the bottom layer. As a result, the phosphate released from anoxic bottoms could be changed. However, the biogeochemical model in this study misses the feedback mechanism for this process as phosphate release is a constant value and not parameterized based on the Redox state at/in the sediment.
While physical data assimilation can improve estimates of modeled biogeochemical variables, some issues should be addressed with caution, such as initialization shocks and changes in vertical nutrient fluxes induced by the physical assimilation. In a recent paper by Raghukumar et al. (2015), they found that physical data assimilation with four-dimensional variational data assimilation (4DVAR) improves the correlation of chlorophyll with observations but leads to a large bias of phytoplankton standing stock particularly in regions of low mean concentration. Two causes were identified for this increase: biological rectification of fluctuating vertical nutrient transport due to gravity wave generation at assimilation cycle initialization and increased nutrient variance on density surfaces.
The results shown in this paper are promising for the operational forecasts with CPB models because the assimilation of physical data only yields clear contributions to biogeochemical modeling. Some studies (Berline et al. 2007; Ourmières et al. 2009) showed that only assimilating surface data into the circulation model had a weak impact on the biogeochemical processes. We find similar results in the Baltic in an experiment (2000–09) with only satellite SST data assimilated. The mean difference of surface chlorophyll with the control run (Fig. 11) ranges from −0.16 to 0.1 mg m−3. The basin average is about −0.02 mg m−3. By assimilating the in situ T/S data, we demonstrate that the impact of physical data assimilation is appreciable for biogeochemical variables. For the purpose of the operational forecast, it is preferable to simultaneously assimilate satellite chlorophyll data and other in situ observations into the biogeochemical model, which will be the next step in the DMI. However, such implementations into the biogeochemical model require specific a priori information about the uncertainties of some biogeochemical parameters, which might be distinct from the circulation model and difficult to approximate (Brasseur et al. 2009; Doron et al. 2011, 2013).
We thank Dr. Jun She and Zhenwen Wan from the DMI for their support in preparing this manuscript. We are grateful to the anonymous reviewers for the constructive suggestions and valuable comments. The river runoff data are provided by the Swedish Meteorological and Hydrological Institute and the Bundesamt für Seeschifffahrt und Hydrographie in Hamburg, Germany. This work was supported by European Commission OPEC (Contract 283291), the National Basic Research Program of China (Grant No. 2012CB955202), and the National Natural Science Foundation of China (Grant No. 41576019). In situ observations can be obtained from ICES data portal (ices.dk/marine-data/data-portals/Pages/default.aspx).
Current affiliation: Department of Earth System Science, University of California, Irvine, Irvine, California.
Publisher’s Note: This article was revised on 21 March 2016 to include more accurate funder information in the acknowledgements section.