## Abstract

In this study, temperature and salinity profiles from Argo floats are assimilated into a coupled ice–ocean model over the North Atlantic Ocean and Arctic using an ensemble optimal interpolation (EnOI) scheme, with the aim of improving the thermohaline structure of the Labrador Sea estimated by the model. Data assimilation experiments are carried out from September 2014 to April 2015 both with and without a one-step bias correction method from the literature. It is found that assimilation of the Argo profiles reduces the errors in the model temperature and salinity when verification is done against both withheld Argo profiles and sea surface temperature from satellite data. The assimilation also leads to deeper mixed layer depth in the Labrador Sea, closer to observations shown in other studies, in particular when bias correction is used. We hypothesize that this is because the bias field leads to vertical density profiles that are less stratified, and hence requiring less energy for mixing, than when bias correction is not used.

## 1. Introduction

The Labrador Sea is one of the few locations in the World Ocean where deep convection occurs. This is the process by which surface waters are brought to depth through vigorous vertical mixing (Lilly et al. 1999). Deep mixing events are initiated through surface buoyancy loss during the winter months, when cold air from the North American landmass blows over the Labrador Sea. These events typically mix a patch covering a surface area of approximately 100–200 km to a depth of 2000 m (Lilly et al. 1999) After a deep mixing event, the region is restratified by convective eddies, which are eddies generated through baroclinic instability resulting from the strong density gradients that surround a patch that has undergone convective mixing (Jones and Marshall 1997; Chanut et al. 2008; Frajka-Williams et al. 2014).

Predicting the depth of convective mixing and the duration of the convective season in the Labrador Sea is important for several reasons. Deep mixing events can lead to the formation of Labrador Sea Water (LSW), which impacts the southward return branch of the Atlantic meridional overturning circulation, an important component of the global climate system (Marshall and Schott 1999). There is recent evidence that deep convection may play an important role in the global carbon cycle, for example, because of the role of convection in atmosphere–ocean gas exchange (Koelling et al. 2017; Fröb et al. 2016), and because of the link between eddies generated as a result of convection and the spring phytoplankton bloom (Mahadevan et al. 2012).

In spite of recent advances in the sophistication of ocean models and increases in spatial resolution, it is still very difficult to capture the details of convection in the Labrador Sea. For example, the location of the convection region is sensitive to the strength and path of the Irminger eddies, which can be diagnosed through the position and magnitude of the eddy kinetic energy in their formation region off the west coast of Greenland (Chanut et al. 2008). Both the depth and the extent of convection have also been shown to be sensitive to the frequencies available in the atmospheric forcing provided to the ocean model (Holdsworth and Myers 2015), in addition to possible biases in the boundary conditions or forcing (Fenty and Heimbach 2013; Treguier et al. 2005). One method to potentially improve the representation of convection provided by a numerical ocean model is through data assimilation. This is a method in which observations are combined with a model state to obtain an improved state estimate (Lahoz et al. 2010). Previous studies have demonstrated the ability of data assimilation to produce ocean state estimates with reduced errors (e.g., Oke and Schiller 2007; Oke et al. 2008; Deng et al. 2010; Deng et al. 2011; Fu et al. 2011; Jones et al. 2012; Pan et al. 2014; Sakov and Sandery 2015). The data assimilation methodology defines these state estimates to be optimal in the sense of minimal variance, but this is true only if a set of assumptions are met. One of the key assumptions is that of unbiased errors for both the background state (which is from the prognostic ocean model) and observations. This assumption can be checked by looking at the bias in the difference between the background state and the observations. If this vector exhibits bias, then either the background state and/or the observations are biased. Typically, based on knowledge of the system, the user chooses to correct the bias in either the background state or the observations. If these biases are not corrected, then the optimal state estimate produced by the data assimilation system will also exhibit bias (Dee 2005).

Systematic model biases, such as would arise in a background state, are common in numerical realizations of the ocean (e.g., Thompson et al. 2006; Zhao and Hendon 2009; Brown et al. 2013; Weisheimer et al. 2014). Previous studies have shown that the temperature and salinity model states exhibit bias in the North Atlantic (Rattan et al. 2010; Treguier et al. 2005; Marzocchi et al. 2015). To address model bias in data assimilation, bias correction schemes can be used (Dee and da Silva 1998; Dee 2005; Chepurin et al. 2005; Lorente-Plazas and Hacker 2017). The two-stage scheme (Dee and da Silva 1998; Dee and Todling 2000) is particularly appealing, since it produces a minimum variance estimate for both the state and the bias. However, the two-stage scheme involves solving an additional equation for the bias vector that is as computationally intensive as that solved for the analysis (optimal state). A reduced version of this scheme, referred to as “one-step bias correction scheme” is widely used (Dee 2005; Deng et al. 2010, 2011). This reduced scheme updates the bias vector by using strictly the bias vector from the previous assimilation cycle and error covariance matrices (i.e., not through solving a minimization problem). In this case, assuming the error covariance matrix for the bias is proportional to that of the background state, with a small constant of proportionality, allows the bias vector to be updated in a computationally inexpensive manner (Dee 2005).

In this study, the one-step bias correction scheme (Dee 2005) is used to assimilate temperature *T* and salinity *S* profiles from Argo floats in the Labrador Sea into a coupled ice–ocean model of the North Atlantic and Arctic Oceans. The ensemble optimal interpolation (EnOI) scheme is use for the data assimilation. EnOI is widely represented in ocean data assimilation schemes because of its low computational cost (e.g., Oke et al. 2008; Oke and Schiller 2007). EnOI is a reduced version of the ensemble Kalman filter (EnKF) in the sense that it uses a static ensemble to estimate the background error covariance matrix. The background error covariance matrix is computed using anomalies of the model state from a mean state, typically from a long model run. In this way the background error covariance matrix allows for anisotropic and inhomogeneous state-dependent errors. However, the matrix is static (it is computed in advance of the assimilation) and does not evolve with the model state, as is done in the ensemble Kalman filter. Thus, there is a trade-off between the ability of the ensemble to capture “errors of the day” and the efficiency of the method.

The purpose of the present study is to investigate if the simple one-step bias correction (Dee 2005) in an EnOI system scheme is able to improve the ocean state, with emphasis on the thermohaline structure and convection processes in the Labrador Sea. The application domain, convection in the Labrador Sea, is not widely studied using data assimilation. While other bias correction schemes have been evaluated in the literature, (e.g., Reichle and Koster 2004; Lea et al. 2008; Kumar et al. 2012; Chepurin et al. 2005) and more sophisticated data assimilation methods have been used for ocean state estimation and forecasting (Pan et al. 2014; Fenty and Heimbach 2013; Sakov and Sandery 2015), the present combination was chosen for its simplicity and computational efficiency.

In the next section the ocean model configuration, observational data, and the EnOI scheme are introduced. This is followed by the design of experiments, results, discussion, and concluding remarks.

## 2. The data assimilation system

### a. Model configuration

The model used in this study is the Nucleus for European Modelling of the Ocean (NEMO; Madec 2008), version 3.4, coupled to the Louvain-la-Neuve Sea Ice Model (LIM2; Fichefet and Morales Maqueda 1997; Bouillon et al. 2009). The Arctic and Northern Hemisphere Atlantic (ANHA) configuration used in the present study covers a subdomain of the global ORCA25 model. It consists of a parent grid with a resolution of ¼° (ANHA4), with a nested domain with a resolution of 1/12° centered on the North Atlantic (ANHA4-SPG12, where SPG stands for subpolar gyre; black dashed box in Fig. 1). The nesting and two-way communication between the two grids is carried out using the Adaptive Grid Refinement in FORTRAN (AGRIF) toolbox (Debreu et al. 2008). The parent grid covers all of the North Atlantic Ocean north of 27°N in addition to the entire Arctic Ocean, with open boundaries at 27°N and at the Bering Strait. Boundary conditions are provided by the Global Ocean Reanalysis and Simulations (GLORYS2v3). The nested domain is centered on the Labrador Sea, with boundaries along 36°N to the south, 24°W to the east, the Hudson Strait outflow to the west, and Davis Strait to the north. There are 50 unevenly spaced levels in the vertical. The thickness of the first level is 1 m at the surface, with thickness of subsequent levels increasing with depth to a thickness of 450 m at the lowest level. Lateral mixing is carried out using a biharmonic operator with a horizontal eddy viscosity of 1 × 10^{10} (1.5 × 10^{11}) m^{4} s^{−1} in the nested (parent) domain. For tracer lateral diffusion, an isopycnal Laplacian operator is used with a horizontal eddy diffusivity of 50 (300) m^{2} s^{−1} in the nested (parent) domain. In the nested domain, the resolution is sufficiently fine to be eddy permitting over most of the region of interest, with the first Rossby radius of deformation resolved by 1–2 grid points (Dupont et al. 2015). The model configuration and model spinup is the same as that described in Müller et al. (2017).

Atmospheric forcing is provided by fields from the Global Deterministic Prediction System (GDPS) developed at the Canadian Meteorological Centre (Smith et al. 2014). The atmospheric forcing is refreshed every hour and has a spatial resolution of 33 km. The experiments were run from a spunup state and were run for two years from September 2012 to September 2014 before any data were assimilated.

### b. Observational data

#### 1) Argo data

In this study, temperature *T* and salinity *S* profiles from Argo observations are assimilated for the period from 19 September 2014 to 30 April 2015. The script used to download the data was generated using the North Atlantic Argo Regional Center information and data mining tool. As the focus of the study is on deep convection, a geographic region centered at 45°W and 56.75°N with a width of 23.7° and a height of 14° was specified to obtain the Argo data. This region is shown in Fig. 2. Both real-time and delayed mode Argo profiles were downloaded from the North Atlantic Argo Regional Center and are subject to automated quality control procedures. To select the profiles for assimilation, each day the availability of Argo observations across the top layer of the model is checked. A maximum number of two profiles are selected for assimilation each day. The rest of the *T*/*S* profiles are used for evaluation of the analyses from data assimilation and are referred to as withheld data. These withheld data can be at a geographically distant location from the observations assimilated. For example, an observation can be assimilated in the central Labrador Sea, and a withheld observation on the same day may be in the Irminger Sea. In this way the withheld data are an assessment of the ability of the model to interpolate the assimilated information in space and time. In total, 321 *T*/*S* profiles are assimilated and 141 profiles are withheld. Geophysical positions of the Argo profiles are shown in Fig. 2. Data that are within 100 km of any coastline are not assimilated because of large differences between the observed profiles and the background state from the model in these regions.

The spatial distribution of the assimilated dates was evenly spread throughout the region. There were two profiles available for assimilation in the month of September, while the remaining months each had between 40 and 60 profiles for assimilation. The withheld data were also evenly spread throughout the domain, with slightly more data in the Irminger Sea during October and November. Before assimilation, the Argo data are mapped to the model grid. In the horizontal a nearest neighbor interpolation is used, while in the vertical the data are binned according to the vertical grid levels of the model (Jones et al. 2012). The mean of each set of *T* and *S* observations in each bin is the *T* and *S* observation assimilated at the corresponding model grid point. The same approach was used to compare the analyses from the data assimilation with the withheld data.

#### 2) SST

Sea surface temperature (SST) from NOAA’s 0.25° daily Optimum Interpolation Sea Surface Temperature, version 2 (OISSTv2; Reynolds et al. 2007), are used to verify SST from the analyses. The NOAA product combines SST from the Advanced Very High Resolution Radiometer (AVHRR), buoys, and ship data. Because of the difference in spatial resolution between the OISSTv2 and the model grid, the model output was interpolated to the observational grid by averaging over boxes of approximately ¼° spatial resolution.

In addition to SST, each grid point is assigned a sea ice concentration that corresponds to the median value for the week centered on the given date. Sea ice concentration data used in the SST analysis are from the NCEP operational dataset and are based on retrievals from passive microwave brightness temperatures. All points for which the sea ice concentration was greater than zero were removed before comparison with the analyses. This removes the SSTs that are close to land, particularly along the Labrador coast and around Newfoundland, with a narrower strip around Greenland. Argo data are not used in the OISSTv2 product and are hence independent data.

### c. The EnOI scheme

The EnOI scheme computes the analysis state vector **x**^{a} using the following equation (Oke et al. 2008):

where **x**^{a} and **x**^{b} denote the analysis and background state vectors, respectively; **y**_{o} is the observation vector; is the observation operator that maps the state space to the observation space; and is the gain matrix, which is defined according to

In the gain matrix, is the background error covariance matrix and is the observation error covariance matrix.

For the background error covariance matrix, the ocean model is used to provide information. In the EnOI method, is represented by a stationary ensemble of model perturbations ,

where *N* is the number of ensemble members, is a matrix for which each column is an anomaly from a mean model state (described below), and *α* is a tunable parameter that controls the magnitude of the background error covariance matrix entries. In this study, is used, which provides a conservative update to the model state (Oke and Schiller 2007). The state vector consists of temperature and salinity at each point in the assimilation domain. The other prognostic model variables, notably *u* and *υ*, were not included because no data were assimilated for these variables and because computational expense of the data assimilation was an issue. A similar approach has been used in previous studies assimilating *T*/*S* data using EnOI (Jones et al. 2012). In the interest of computational efficiency, the data assimilation is carried out in a smaller region within the nested domain (the black box in Figs. 1 and 2) that extends from approximately 43° to 70.5°N and from 21° to 69°W).

#### 1) Background error covariance matrix

In the EnOI approach, the background error covariance matrix is calculated using the anomaly matrix . This matrix is generated by temporal filtering and sampling of a control run, which is a model run in which no data are assimilated. While there is no consensus as how to formulate , in general the variability in the anomaly fields used to construct should be representative of the model errors of interest for the problem being studied. In this study Argo data are assimilated once each day. Hence, the errors should represent those that occur on a daily time scale. To represent these errors, a 2-yr control run of the same NEMO model configuration was used, spanning the period from 27 September 2012 to 28 September 2014. From this control run, 121 daily mean model states were selected by sampling every 6 days. From these, 121 anomaly fields were generated by subtracting each daily mean state from its mean. This yields an ensemble of size 121, which is consistent with the ensemble size chosen in other studies (Mignac et al. 2015; Xie and Zhu 2010; Counillon and Bertino 2009; Sakov and Sandery 2015).

#### 2) Localization

Because of the limited number of ensemble members, in comparison to the size of the state vector, , spatial localization was used to reduce spurious long-range covariances that arise when computing the background error covariance matrix (Oke et al. 2007; Houtekamer and Mitchell 2001; Buehner 2005). This was carried out by element-wise multiplication of the background error covariance matrix by a localization matrix . With localization, the gain matrix, which can be obtained by substitution of Eq. (3) into Eq. (2), can be expressed as

where the open circle denotes an element-by-element matrix multiplication. The *i*, *j* element of , *L*_{ij}, is the correlation coefficient between the *i*th and *j*th grid points of the model and depends only on the physical distance between the two. A Gaussian function with *e*-folding scale is used to calculate *L*_{ij}:

where *L*_{h} is the horizontal localization length scale, *L*_{υ} is the vertical localization length scale, *d*_{h} is the horizontal distance, and *d*_{υ} is the vertical distance between the *i*th and *j*th grid points. In our experiment, *L*_{h} = 100 km and *L*_{υ} = 750 m were used. Initial tests with no localization in the vertical dimension showed that in this case an observation could have a significant impact across the entire ocean depth, even though the observations themselves reach a maximum depth of only 2000 m. These large increments deep in the ocean can be attributed to spurious correlations resulting from the limited sample size and are not physical. A small vertical localization length scale (50 m) was then tested. Because the model tends to be too warm and salty, in particular at the surface, this led to a large increment close to the surface and the formation of a cool, fresh layer at the surface, which shut down convection. This outcome is similar to what has been found when eddies with a cool fresh core in the upper layer enter the convective region. Such eddies have been found to inhibit convection (Schmidt and Send 2007). A vertical localization scale of 750 m was then tested. This led to increments that appeared physically reasonable with no shut down of convection.

In the implementation of the EnOI method, Eq. (4) was modified following Zhu et al. (2011) to a form that is significantly less expensive to calculate. By shifting observation profiles to the closet model grid point during processing, the observation points are coincident with the model grid points, and the gain matrix can be written as

#### 3) Observation error covariance matrix

The observation error covariance matrix is assumed to be diagonal, which means spatial correlations and intervariable correlations are neglected in the observation error structure. Given that is assumed diagonal, only the diagonal components, or error variances, need to be calculated. Here the observation error variance specification follows that in Oke et al. (2008) and Deng et al. (2010),

where is the instrument error and is the representation error. In Oke et al. (2008), there is an additional term to account for the error resulting from the difference between the analysis time and the observation time. This was not taken into account because the assimilation window is relatively short [1 day, as compared to 7–10 days in Oke et al. (2008)]. The instrument errors were set to for temperature and for salinity at all depths. The representation error is dependent on the model state following Deng et al. (2010),

where *κ* is a tunable coefficient and is the model standard deviation calculated from the 2-yr control run. For the results shown, for both temperature and salinity. This means that where is much greater than the instrument error, the observation error variance will be less than the background error variance and the observations will be given more weight in the analysis than the background state. However, where is much smaller than the instrument error, the observation error variance will be larger than the background error variance, and the observations will be given less weight in the analysis than the background state. The former is more likely to occur in regions of high mesoscale variability as well as in the thermocline region, while the latter is more likely to be the case for the deeper, more quiescent, levels of the model. Note that while the instrument error used for temperature is higher than that used in previous studies (Mignac et al. 2015; Turpin et al. 2016), it is the ratio between the observation and background error variances (as well as the spatial structure of the covariance fields) that determines the impact of the observation on the state estimate. For the formulation of the observation error used here, a different value of instrument error could be used with a different value of *κ* to achieve similar results.

#### 4) Bias correction

The one-step bias correction scheme used here solves the following equations for the state and bias estimates at the *k*th data assimilation cycle,

where , , , and denote the analysis and background state vectors, the bias vector, and the observation vector, respectively. In this study ; thus, the bias estimates evolve slowly with time. For the first assimilation cycle, is used (Deng et al. 2010, 2011).

### d. Experiment design

In this study three experiments are carried out for the period from 19 September 2014 to 30 April 2015. The first experiment is a control run (CTL), and no observational data are assimilated. The second experiment (ASSIM) assimilates *T* and *S* profiles from Argo observations. The third experiment (ASSIM-BC) assimilates the same *T* and *S* profiles as in ASSIM but implements the one-step bias correction scheme. Note that both ASSIM and ASSIM-BC can be described by Eq. (9) with for ASSIM. An assimilation window of 24 h (once per day) was used. No special treatment was found necessary to ease the addition of the increment to the background state; that is, the method did not suffer from initialization shock (Jones et al. 2012; Oke et al. 2008). This could be due to the choice of the assimilation window, which is shorter than that used in typical oceanographic data assimilation studies. A shorter assimilation window means the model is less likely to drift as far from the optimal trajectory as compared to when a longer assimilation window is used. However, given the sparse distribution of the observations in space and time, it is more likely that the lack of initialization shock is due to the small *α* value used [see Eqs. (3) and (9), and the discussion in Mignac et al. (2015)].

To quantify the skill of the data assimilation schemes, the difference between the analyses and both withheld Argo profiles and independent SST observations is calculated, as described in section 3b.

## 3. Results

### a. Bias reduction

To assess the model bias, the innovations of the data assimilation, which are defined as , are examined (Dee 2005). The innovations for CTL, ASSIM, and ASSIM-BC averaged over the study period show negative values for temperature (Fig. 3a) and salinity (Fig. 3b) at all depths, indicating the model is too warm and salty in comparison with the observations. The innovation for temperature ranges from −1°C at around 2000 m to −3°C in the top layers, and that for salinity ranges from −0.3 at around 2000 m to −0.5 in the top layers. In general, both temperature and salinity are less biased as depth increases below the thermocline. Note that the largest biases are seen for CTL, which makes sense because this is the experiment without data assimilation. A greater magnitude of bias can be seen for ASSIM as compared with ASSIM-BC, because of the effectiveness of the bias correction scheme.

The reduction in both the bias and standard deviation of the state in comparison to the observations resulting from data assimilation can be seen in comparing with (lines with and without symbols in Fig. 3). For the temperature the biases are reduced for both ASSIM and ASSIM-BC in the upper 800 m, with a slight reduction in the standard deviation. Note that from the analysis from ASSIM-BC the bias is approximately 1.2°C below 300 m, whereas for CTL the bias is greater than this over all depths until 1200 m. For salinity the reduction in bias in the upper 200 m is less significant, and a small increase can be seen in ASSIM and ASSIM-BC when is compared with below the thermocline. The standard deviation is reduced for salinity at all model depths with data assimilation as compared to CTL.

The evolution of the horizontally averaged value of the observation minus background (for CTL) and the observations minus analyses (for ASSIM-BC) at a depth of 110 m with respect to time are shown in Figs. 4a and 4b for salinity and temperature, respectively. The depth of 110 m was chosen because the bias is significant at that depth. For temperature (Fig. 4a) it can be seen that the bias is greatest at the beginning of the assimilation period for both CTL and ASSIM-BC, with a decreasing trend with respect to time. For both temperature and salinity, the difference between CTL and ASSIM-BC becomes significant at the end of January. Note that the outliers that can be seen in these plots are from Argo profiles in the southern portion of the domain, where there is significant eddy activity from the North Atlantic Current.

### b. Evaluation of analyses from data assimilation

#### 1) Comparison against withheld Argo data

To quantify the skill of the data assimilation schemes, the mean difference (MD) between the model analysis and withheld observations is calculated according to Deng et al. (2011),

where is the model analysis interpolated to the location of the *i*th observation, is the withheld observation, *i* is the index of the observation, and *n* is the total number of observations. The MD between the analyses and the withheld data is shown in Fig. 5. It can be seen that the data assimilation is able to reduce the model biases, as shown in Figs. 5a and 5b for temperature and salinity, respectively, for all points in the Labrador Sea interior (blue symbols in Fig. 2). Both ASSIM and ASSIM-BC show smaller values of MD with respect to withheld data than CTL at all model levels. For temperature, ASSIM-BC shows greater reduction in MD than ASSIM as a result of the effectiveness of the bias correction scheme. For temperature it was found that the MD between the different experiments was statistically significant for all depths above 300 m. For salinity the MDs between CTL and both ASSIM and ASSIM-BC are statistically significant above 380 m, while for ASSIM and ASSIM-BC MDs are statistically significant for depths between 50 and 220 m. Mean differences for profiles closer to the coast (red symbols in Fig. 2) for temperature and salinity observations are shown in Figs. 5c and 5d, respectively. It can be seen that MDs are reduced for temperature with data assimilation, but there is no impact of data assimilation on salinity MDs for these locations.

#### 2) Comparison against independent SST observations

The bias of the difference between the SST from the observations and the SST from the analyses is shown in Fig. 6. It can be seen that there is a negative bias in the central Labrador Sea for CTL that is ≈−1.5°C. A similar bias has been found in other studies (Dupont et al. 2015; Marzocchi et al. 2015). This bias is slightly reduced for ASSIM and is significantly reduced for ASSIM-BC (note the red tones indicate less bias). A time series of the SST bias averaged over the region north of 52°N and west of 36°W is shown in Fig. 7. The region was chosen to reduce the contribution from the southern part of the assimilation domain on the statistics, since the objective of the study was to examine the convective region. It can be seen that ASSIM-BC has the lowest bias over the duration of the experiment, and the difference between the bias from ASSIM and ASSIM-BC is sustained at approximately half a degree through March and April. The standard deviation of the differences between the SST from the analyses and that from the observations was also checked. It was found there were no significant differences between the experiments.

### c. MLD changes

In this section, we examine how assimilation of Argo temperature and salinity profiles changes the mixed layer depth (MLD). The MLD is defined as the depth at which the potential density difference relative to the value at a reference vertical level near the surface exceeds a threshold value, which is chosen to be 0.01 kg m^{−3}. The MLD is calculated using the code provided by Holte and Talley (2009). We note that this definition has been found to lead to deeper MLDs than other definitions in regions with deep convection as a result of temperature–salinity compensation. We found the conclusions of our study were the same when other definitions of the MLD were examined.

The average MLD over the period from 1 January 2015 to 31 March 2015 for each experiment is shown in Fig. 8. It can be seen in comparing Fig. 8a with Figs. 8b and 8c that assimilation of the Argo profiles of temperature and salinity leads to deeper convection in the central Labrador Sea. The average MLD from the CTL run is around 1100 m, while ASSIM-BC exhibits deeper convection, with depths in excess of 1300 m, over a larger area. The difference between the MLDs from ASSIM-BC and CTL is shown in Fig. 8d. Fröb et al. (2016) calculated the MLD using a semisubjective method for Argo profiles from February 2015 to April 2015 and showed that the deepest MLD is around 1800 m. Yashayaev and Loder (2016) showed that the deepest MLD in the winter of 2014/15 reaches 1700 m, defined as the 75th percentile of the pycnostad’s deepest extent of in situ observations and Argo profiles. Note that ASSIM and ASSIM-BC also show deeper convection along the path of the East Greenland Current and in the Irminger Sea, which is in better agreement with observations (Fröb et al. 2016) than CTL.

The time evolution of the MLD averaged over the rectangular box in Fig. 8a is shown in Fig. 9. For clarity, the MLDs shown are a rolling average over the week centered on the date indicated. It can be seen that data assimilation leads to deeper convection depth throughout the convection season. While the onset of convection occurs at approximately the same time of the year for each experiment, ASSIM and ASSIM-BC show the rate of MLD deepening is faster with data assimilation. The bias correction scheme of ASSIM-BC leads to deeper convection all through the convection season. Plots of the potential density, temperature, and salinity averaged over the box in Fig. 8a are shown in Fig. 10. It can be seen that changes in the potential density, which lead to the increased rate of deepening of the MLD with data assimilation, are due to changes in both the potential temperature and salinity.

## 4. Discussion

Without data assimilation, the model is biased toward predicting a warmer and saltier Labrador Sea than is seen in the Argo data [see Figs. 5 and 6, and the discussion in Rattan et al. (2010) and Marzocchi et al. (2015)]. Based on this observation, two data assimilation experiments were carried out, both with and without a one-step bias correction scheme from the literature (Dee 2005). This scheme was effective in reducing the model bias and improving the analyses when they are compared to both withheld and independent data, and led to a deeper mixed layer depth during the convective season as compared to the experiment without bias correction. To understand this further, we note that for the one-step bias correction method, the bias does not evolve with a prognostic equation between assimilation cycles as does the state, which is updated by the ocean model physics. The bias field is updated when a new Argo observation is assimilated through a comparison of the model state with the observations. Because of the use of a spatially correlated weight matrix, the bias update is then spread horizontally and vertically, and is able to influence the model over a larger spatial domain than the set of points at which observations are assimilated. This behavior is analogous to the spreading of the innovation when a spatially correlated background error covariance matrix is used (Buehner 2005). For example, the bias fields at a depth of 110 m for temperature and salinity are shown in Fig. 11 for 30 March 2015. The locations of the Argo profiles assimilated up to and including the specified date are shown with the small black dots. Note that the assimilation of a profile does not necessarily make a significant contribution to the bias field; for example, if there is not a systematic difference between the model state and the observation, then over time the impact of an individual observation on the bias field will be reduced.

The bias has the effect of locally reducing the temperature and salinity of the model state, which may make it more susceptible to deep mixing. To investigate this, we looked for days when the mixed layer depth exhibited evidence of local deep mixing events for which the depth of mixing was different between the experiments with and without bias correction. As an example, the MLD is shown in Fig. 12 for 15 February 2015 for the region outlined with the black box in Fig. 11a. This box was chosen because it is the largest area within the convection region where the magnitude of both the temperature and salinity bias fields are significant from the end of January until the end of the experiment. It can be seen there is a region for ASSIM-BC (Fig. 12b) where the MLD is deeper than that for ASSIM (Fig. 12a). We also note that the density contours show evidence of strong lateral mixing. An investigation of the potential density profiles before the mixing event indicated that ASSIM-BC exhibited profiles with a less steep slope than ASSIM, indicating they may be more favorable to deep mixing. To quantify this further, we calculated the convective energy that must be removed to allow convection to a given depth, which is defined as (Holdsworth and Myers 2015)

and is closely related to the concept of convection resistance (Frajka-Williams et al. 2014). The calculation of the convective energy was carried out at each latitude and longitude location in the black box in Fig. 11a for which there was an increase in MLD from 14 to 15 February of at least 200 m. The probability density functions of the convective energy for both ASSIM and ASSIM-BC are shown in Figs. 13a and 13b, respectively. This calculation used the potential density on 14 February, the day prior to the deep mixing event. Note there are no negative values for convective energy because the mixing parameterizations used in NEMO support only stably stratified density profiles (unstable profiles are immediately mixed). It can be seen in Fig. 13 that the probability of low convective energy is higher for ASSIM-BC than for ASSIM. Similar results were found on other days when deep mixing events were noted to be different between ASSIM and ASSIM-BC. In addition, the results were not sensitive to the exact threshold used for the difference in MLD before and after mixing (thresholds larger than 200 m were also tested and yielded similar results).

## 5. Conclusions

This study assimilates Argo observations of temperature and salinity into the Labrador Sea in a coupled ice–ocean model of the Arctic and northern Atlantic. A static background error covariance matrix is computed from a 2-yr free run for the EnOI scheme. The findings were as follows: 1) The EnOI schemes with and without bias correction were successful in improving the model state estimate in comparison with independent data. It should be noted that a vertical localization scale of 750 m was key to this improvement. When a much smaller vertical localization scale (e.g., 50 m) was used, this led to a strongly stratified top layer, which prohibited deep convection. 2) The differences between model predictions and withheld observations of temperature and salinity are smaller with data assimilation than without. 3) The SST bias in the central part of the Labrador Sea is reduced with data assimilation. With bias correction this reduction is statistically significant. 4) MLDs are deeper and closer to other studies in the model runs with data assimilation (1300 m for ASSIM vs 1100 m for CTL). Other studies (e.g., Fröb et al. 2016; Yashayaev and Loder 2016) show the deepest mixed layer depth in the Labrador Sea is around 1700–1800 m for the winter of 2014/15. 5) The one-step bias correction scheme leads to deeper MLD (1400 m) than when bias correction is not used.

Considering the sparsity of the Argo data and the relatively short horizontal localization length scales (100 km), it may seem surprising that the data assimilation is able to have an impact. Over the time scales considered in the present study, it is not likely that this impact is related to the advective nature of the bias (Rattan et al. 2010). It is more likely that the impact is related to the changes in temperature and salinity profiles because of the bias field. In particular, when bias correction is used, this reduces the convective energy required for mixing. Finally, note that other dates were found that exhibited deeper mixing with bias correction, regardless of whether an observation was assimilated on that date. Therefore, the exact timing and location of the observations assimilated may be less critical than having a sufficient number of floats to sample the convection region and to generate a bias field that is able to reduce the convective energy required for deep mixing.

## Acknowledgments

This study was supported by the National Science and Engineering Research Council of Canada through the Climate Change and Atmospheric Research Program. The study was carried out as part of the VITALS project, studying ventilation, interactions, transports across the Labrador Sea. We gratefully acknowledge the computing resources provided by Compute Canada. We would also like to thank G. Smith for the CGRF forcing fields, made available from Environment and Climate Change Canada; and G. Garric for the GLORYS2v3 output. Argo float data are managed by the Global Ocean Data Assimilation Experiment (http://www.usgodae.org/argo/argo.html) with documentation available (http://doi.org/10.17882/42182). We sincerely thank the reviewers for their constructive comments, which significantly improved the quality of the manuscript.

## REFERENCES

*Data Assimilation: Making Sense of Observations*, W. Lahoz, B. Khattatov, and R. Ménard, Eds., Springer, 3–12, https://doi.org/10.1007/978-3-540-74703-1_1.

## Footnotes

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).