## Abstract

The multifractal fusion method has proved to be an effective algorithm to mitigate the noise of the sea surface salinity (SSS) of Soil Moisture Ocean Salinity (SMOS) mission. However, the traditional nonparametric weight function used in this method is unable to fully capture the dynamic evolution of the oceanic environment. Considering the multiscale, nonuniform, anisotropic, and flow-dependent nature of the ocean, a prototype with the so-called flexible circle (FLC) weight function or flexible ellipse (FLE) weight function with a set of predefined parameters is proposed in this paper. The improved weight functions could draw dynamic information from the sea surface temperature, Rossby radius of deformation, and surface geostrophic flow to improve the quality of the remotely sensed SSS. The validation against the in situ data indicates that the improved weight functions perform better than the traditional one with a reduced root-mean-square (RMS) and standard deviation (STD) of the differences with respect to EN 4.2.0 profiles (from 0.50 and 0.46 to 0.42 and 0.38 for FLC and 0.39 and 0.36 for FLE in the global ocean). In particular, the FLE scheme could highlight the variation of the strong currents without affecting the computational efficiency. Furthermore, this paper discusses the influences of the error distribution on the fusion results and underlines the importance of error-based adaptions for further improvements.

## 1. Introduction

Satellite observations have become essential tools to depict phenomena and features in the ocean. Almost all kinds of sea surface fields, including sea surface temperature (SST), sea surface height (SSH), sea surface ice, and chlorophyll concentration, have been mapped by the satellite platforms for a long time. Nevertheless, remote sensing of sea surface salinity (SSS) was not realized until the launch of Soil Moisture Ocean Salinity (SMOS) mission in 2009 (Font et al. 2010; Kerr et al. 2010). Subsequently, the Aquarius mission was launched in 2011 with the only purpose of measuring remotely sensed SSS (Drucker and Riser 2014). Regrettably, this mission was abruptly ended in 2015. The Soil Moisture Active Passive (SMAP) mission launched in 2015 is currently providing the SSS data as an ancillary product. The SSS data sounded by SMAP was retrieved by the Aquarius salinity retrieval algorithm while the accuracy still fell behind of Aquarius product (Scott et al. 2016). Considering the temporal continuity and the technical maturity, this paper mainly discusses the product and algorithm of SMOS SSS data.

The initial versions of SSS data retrieved from SMOS are demonstrated to be very noisy. The root-mean-square error of the SMOS salinity level 3 (L3) binned maps derived from the level 2 (L2), v622, could reach 0.6 in the open ocean and it was even larger than 1.7 in the coastal areas, which was unsatisfactory compared to the designed accuracy of 0.2 (Boutin et al. 2012). The evaluation of Bao et al. (2016) suggested that the quality of SMOS Barcelona Expert Center (BEC), v2.0, data was not as good as Aquarius Combined Active Passive (CAP) data and the bias of SMOS SSS at low latitudes was lower than the bias at high latitudes. In addition, since the revisit time of SMOS is 3-day, daily acquisitions display gaps between the swaths.

The process from satellite observation to a reliable gridded product is arduous because of the presence of both systematic and random errors. As analyzed by Martín-Neira et al. (2016), the systematic errors are mainly caused by three factors: man-made radio frequency interferences (RFI), land–sea contamination, and spatial–temporal drifts. These errors must be removed by the improvements of the retrieval method and reprocessing algorithm. Empirical land–sea contamination corrections were proposed in Olmedo et al. (2016) to be applied over the L2 Ocean Salinity (OS). More recently, the new version of the L2 OS, v662, includes an empirical land–sea contamination correction, which is applied at brightness temperature level, that is, before the retrieval of the salinity (ARGANS 2017). Alternatively, the debiased non-Bayesian method proposed by Olmedo et al. (2017) has been proven to reduce the land–sea contamination and the RFI effects, both of which have led to an improvement in the accuracy and an increase of the spatial coverage of the SMOS salinity maps. Thanks to the endeavor of the scientists, the quality of the latest versions of SSS is far better than the previous ones. For example, Garcia-Eidell et al.(2017) has shown that the accuracy of the v3.0 SMOS SSS maps released by BEC at high latitudes is better than the accuracy of Aquarius, v5.0.

Efforts to merge the SSS acquired by different missions would mitigate the random errors and lead to a more robust SSS time series. Generally, there are two types of method to blend the SSS data: that is, parametric method and nonparametric method. The former one requires parameter estimation. Nardelli (2012) used a novel optimal interpolation (OI) algorithm that included SST in the covariance estimation based on the hypothesis of a high correlation between SST and SSS at small scales. Hoareau et al. (2014) made the SSS level 4 product by means of data assimilation (DA) in a numerical model. The parameters estimated by historical data and the physics contained in numerical models promise a reasonable SSS product. Nonetheless, parameter estimation is limited by the lack of SSS observations because of its strong reliance on historical data. And the complexity of numerical integration and assimilation scheme makes DA a demanding method for researchers.

Multifractal fusion algorithm is a typical nonparametric method (Umbert et al. 2014, 2015; Olmedo et al. 2016). It is purely a geometrical approach with the idea that the fractal or geometrical structure of different scalar fields resembles each other. The theory is based on the microcanonical multifractal formalism (Turiel et al. 2008a). A multifractal scalar can be characterized by a dimensionless variable named singularity exponent (SE), which is a useful tool to evaluate the fronts and flows (Turiel et al. 2005). The SEs from different scalar maps, including SST, chlorophyll concentration (Nieves et al. 2007), and brightness temperature (Isern-Fontanet et al. 2007), turn out to be very similar. In fact, the singularity lines derived from any ocean scalar should align with the streamlines (Turiel et al. 2008b). The hypothesis was further verified using the model output from the Ocean General Circulation Model for the Earth Simulator (OFES; Umbert et al. 2014). Making use of the SST map as the template and the scale-invariant weight function, the signal-to-noise ratio of L4 SSS was then effectively improved (Umbert et al. 2014, 2015; Olmedo et al. 2016). The accuracy and coverage of L4 SSS were further improved using the swath data retrieved by the debiased non-Bayesian method (Olmedo et al. 2017). Furthermore, a new product over the North Atlantic Ocean and the Mediterranean Sea using a combination of debiased non-Bayesian retrieval, data-interpolating empirical orthogonal functions (DINEOF), and multifractal fusion has been newly available (Olmedo et al. 2018). Nevertheless, the weight function of multifractal fusion method is fixed regardless of the temporal and spatial variation of oceanic dynamic environment.

All of the fusion methods share a common point that, because of the limited spatial resolution of the satellite SSS observations, the accuracy and the spatial and temporal resolutions of the gridded remotely sensed SSS could be improved by using fusion schemes. Thus we sparked an idea of employing climatological and physical information into the multifractal method in order to combine the advantages of both methods.

This paper takes the physical constraint into consideration and proposes a semiparametric weight function to improve the multifractal fusion algorithm. Taking reference of the autocorrelation function in data assimilation (Meyers et al. 1991), the weight function is taken as a nonuniform, anisotropic, and time-dependent Gaussian-type function containing three given parameters. Considering the actual distribution of the SSS field, the first baroclinic Rossby radius of deformation *R*_{d} and sea surface geostrophic velocity are introduced to determine the parameters of the weight function. This improvement enables the fusion algorithm to evolve with the oceanic environment without increasing the computation burden.

The paper is organized as follows. In section 2, the data used are presented. Section 3 is a brief introduction of the basic theory of multifractal fusion method. In section 4, in order to circumvent the shortcomings of the nonparametric weight function, the semiparametric weight functions are proposed. Section 5 validates the improved weight function schemes using different methods and in situ data. Considering the difference between remotely sensed salinity and in situ near-surface salinity, the uncertainty is discussed in section 6. Section 7 discusses the correspondence between the results and the error distribution of the input data. Finally, conclusions are drawn in section 8. The description of the scalar approach and the calculation of Rossby radius of deformation are included in appendixes A and B, respectively.

## 2. Data

The experiment is carried out using the data during the year 2015. Several data products are utilized to introduce dynamic information into the SSS fusion. It must be noted that different data must match each other to ensure their temporal and spatial consistency.

### a. Satellite data

#### 1) SMOS SSS data

This paper uses the SMOS BEC operational v3.0 L3 uncorrected binned SSS with a temporal resolution of 9 days and a spatial resolution of 0.25°. Taking the middle day of every 9 days as the date of the data, 41 files for year 2015 are chosen as the raw materials. This binned product is weighted averaged onto regular grids from the BEC non-Bayesian retrievals. Note that this product is not time corrected, thus it is still biased (Barcelona Expert Centre 2017). This noisy and biased product could act as a touchstone to evaluate the new algorithm from its performance in diffusing biases and reducing noises.

#### 2) OISST

The Optimum Interpolation Sea Surface temperature (OISST) of the National Oceanic and Atmospheric Administration (NOAA), also known as the Reynolds SST, is taken as the template (Reynolds 2010). Although there is a product using SST from both Advanced Very High Resolution Radiometer (AVHRR) and Advanced Microwave Scanning Radiometer (AMSR), this paper uses AVHRR-only product because of the short temporal coverage of the former (2002–11). The resolution is 0.25° daily and its high accuracy ensures its competency of the role of template.

#### 3) CMEMS geostrophic current

This paper uses the Copernicus Marine and Environment Monitoring Service (CMEMS) geostrophic current data, which used to be distributed by the Archiving, Validation, and Interpretation of Satellite Oceanographic Data (AVISO). This daily gridded delayed time data are merged from the data of multiple altimeter missions by the Developing Use of Altimetry for Climate Studies (DUACS) multimission altimeter data-processing system (Ducet et al. 2000). The spatial resolution of the product is 0.25°.

### b. Climatology dataset

The decorrelation scale of the weight function is related to the deformation radius calculated from the *World Ocean Atlas 2013* (*WOA13*) climatology dataset (Locarnini et al. 2012). This paper employs the monthly climatological temperature and salinity from *WOA13* dataset with 57 fixed layers from 0 to 1500 m and the horizontal resolution of 0.25°.

### c. In situ data

#### EN 4.2.0 profiles

The observed salinity data from the EN4 dataset released by the Met Office Hadley Centre is employed here to validate the salinity product. The version is 4.2.0. This dataset includes the profiles from Array for Real-Time Geostrophic Oceanography (ARGO), Arctic Synoptic basin Wide Oceanography (ASBO), Global Temperature and Salinity Profile Program (GTSPP), and World Ocean Database 2009 (WOD09), providing a series of quality-controlled temperature and salinity profiles and gridded fields (Good et al. 2013). Because the strict 0-m data of profiles suffer from data loss and bad quality, this paper uses the valid data above 10 m to approximately represent the surface salinity data. It must be admitted that the skin salinity measured by the satellite is different from the so-called surface salinity of profiles. This approximation is relatively feasible because of the existence of the mixed layer. Still, we have considered the uncertainty and make a discussion in section 6.

## 3. Basic theory of multifractal fusion algorithm

### a. Singularity exponents and singularity resemblance

The singularity exponent (SE) is a nondimensional exponent, depicting the geometrical arrangement and fractal components of the surface. The mathematic expression of SE is that any 2D signal *s* should satisfy the relation (Mallat 1999)

where $T\psi |\u2207s|\u2061(x,r)$ is the modulus of gradient *s* [i.e., $|\u2207s|\u2061(x,r)$] projected on wavelet function $\psi $, *r* is the nondimensional scale parameter, $x\u2032$ represents the points around **x**, $\alpha \u2061(x)$ denotes the amplitude factor, $h\u2061(x)$ denotes the SE of signal *s* on the grid **x**, and $o\u2061[rh\u2061(x)]$ is negligible when $rh\u2061(x)$ is close to 0.

The calculation of SE can be referred to in the work of Pont et al. (2011). And a graphical user interface (GUI) is freely available for researchers at the BEC website (https://cp34-bec.cmima.csic.es/data/singularity-analysis-service/).

Previous works have proved the resemblance of the SE of SST, the SE of SSS, and streamlines of the flow regardless of the data version and resolution (Turiel et al. 2008a,b; Umbert et al. 2014). The fractal resemblance implies that SSS can be merged with the SST as a template, which has the better resolution and higher quality. Based on this assumption, the multifractal fusion method was proposed.

### b. Multifractal fusion algorithm

According to the linear correspondence between the SE of SST and the SE of SSS, a basic relation can be drawn as [readers who are interested can get the detailed derivation in the paper of Umbert et al. (2014) and Olmedo et al. (2016)]:

This function is the core of the fusion algorithm, showing a relation $\Phi \u2061(x)$ between the gradient of SSS $[\u2207s\u2061(x)]$ and the gradient of SST $[\u2207\theta \u2061(x)]$ on a given location **x**. Taking the high-quality SST as the template, the error of SSS can be reduced and the resolution of SSS can also be improved given a SST product with high resolution. The coefficients of $\Phi \u2061(x)$ can be estimated by minimizing the cost function:

where $\u2329\u22c5\u232ax$ is the local average operator. The local average of any scalar *f* at grid **x** is defined as

It is a weighted average with the weight function $N\u2061(x)$. The weight function is an indicator of how the points around the center point influence **x**. As the variation of $\Phi \u2061(x)$ is very slow, the weight function should decay faster to gain a more accurate result (Olmedo et al. 2016). In turn, the fast decaying of weight function enables the local average to be carried out in a moving window.

The algorithm can be divided into matrix approach, vector approach, and scalar approach according to the form of $\Phi \u2061(x)$. Theoretically, the more complex $\Phi \u2061(x)$ is, the more accurate the fusion is. However, the matrix approach and vector approach fail in operational application because of the limitations in the numerical integration of the gradients. The scalar method circumvents this problem by directly integrating Eq. (2). Then Eq. (2) degrades into the local linear relation between SSS $[s\u2061(x)]$ and SST $\theta \u2061(x)$ with the scalar $a\u2061(x)$ and $b\u2061(x)$ [i.e., Eq. (5)]:

Further information of this method is introduced in appendix A.

## 4. The improvements of weight function

The weight function in Eq. (4) seems a minor detail. However, it is a big problem to determine the decaying of the weight function. If the decaying were too fast, the noise would not be mitigated by enough compromise of the surrounding values. If the decaying were too slow, the salinity map would be oversmoothed because of the false interference of the surrounding data.

### a. The traditional weight function

The previous work took the weight function as the power function form:

Umbert et al. (2014) took *n* = 2, while Olmedo et al. (2016, 2018) took *n* = 4 for a faster decaying. It can be witnessed that researchers are trying to improve the weight function, but they seem to emphasize little on this aspect.

Although the employment of the above weight function has received a good feedback, there remain potentials of improvement. The biquadratic weight function is smoothing in a fixed circle; thus, it fails to be adaptive to the oceanic environment. On one hand, the true ocean is dominated by various dynamic processes, resulting in a nonuniform, anisotropic, and multiscaled distribution of ocean fields. For example, the salinity is rather uniform in the large warm pool in the west Pacific while its spatial gradient is very large in the frontal zones in the high latitude (Talley et al. 2011). An ideal weight function should take the spatial distribution into consideration. On the other hand, the weight function should evolve with the temporal variation of oceanic fields. This concept is similar to the “flow dependent” considered in data assimilation. To conclude, a dynamics-based weight function is welcomed to improve the fusion results.

### b. Enlightenment of the autocorrelation function

A typical diffusion function in data assimilation is usually in a Gaussian form. Meyers et al. (1991) raised a classical form, so-called autocorrelation function (ACF):

or

Here a scale *L*, namely the decorrelation scale, is introduced. Once a flexible scale *L* in the Eq. (7) is chosen, a nonuniform and multiscaled ACF could be constructed. Furthermore, Eq. (8) has realized a simple version of an anisotropic ACF. Nevertheless, it is not a best choice to directly employ the ACF as the weight function. The determination of the scale *L* is the core of the weight function. And the influence radius of Eq. (8) does not depend on the flow. Adaptions must be made.

### c. Introduction of Rossby radius of deformation

In physical oceanography, the Rossby radius of deformation *R*_{d} is closely related to the Kelvin wave and internal wave. It is an important scale in the research on such phenomenon as mesoscale eddies, coastal currents, and equator currents (Talley et al. 2011). Strictly speaking, the deformation radius is divided into barotropic and baroclinic radius, the latter of which is applied to the layered fluid. The deformation radius referred to in this paper is specifically the baroclinic radius.

Owing to its dynamic meaning, *R*_{d} is usually a reference scale when setting the parameters of the numerical model. Killworth (1980) and Pedlosky (1982) proved the importance of matching the scale of the grid with *R*_{d}. Qiu (2003) found that *R*_{d} determined the depth of 1.5-layer model. Wang et al. (2014) made use of *R*_{d} to determine the error correlation scale in OI and improved the traditional homogenization correlation scale scheme. These works enlighten us to employ *R*_{d} to adjust the decorrelation scale *L*.

We calculate *R*_{d} by the Wentzel–Kramers–Brillouin (WKB) method using the *WOA13* dataset (basic theory and calculation of *R*_{d} are shown in appendix B). The WKB method, also known as WKB approximation, is a method proposed for finding approximate solutions to linear differential equations with spatially varying coefficients (Bender and Orszag 2013). It must be admitted that WKB approximation would sacrifice some accuracy. The results of Chelton et al. (1998) and Cai et al. (2008) implied that the deformation radius calculated by WKB was generally larger than that calculated by the eigenequation. Nevertheless, WKB approximation is calculation economical and easy to discuss. And the pixel of SSS map in this paper (0.25°) is larger than the bias of the calculation method. As the first baroclinic Rossby radius of deformation explains most of the information contained in the equation, *R*_{d} in this paper is just referred to the first baroclinic Rossby radius of deformation.

Figure 1 depicts the spatial distribution of the climatological *R*_{d} (the area between 5°S and 5°N is absent because the contours are too crowded). This result (shading) coincides with that of Chelton et al. (1998) and Cai et al. (2008). The contours of *R*_{d} are generally of zonal distribution with the value and intensity increasing from the high latitudes to the equator. In the offshore area, the distribution of *R*_{d} resembles the topography. From the perspective of the resolution of the SSS map (1/4°, about 27.8 km around the equator), *R*_{d} is less than one grid size in the coastal area and at high latitudes, while *R*_{d} grows rapidly to over 6 times of the grid size at lower latitudes. As is analyzed above, the introduction of *R*_{d} could reflect the variation of latitude, topography, and stratification, endowing the SSS fusion with the nonuniform and multiscale property.

### d. Improved semiparametric weight functions

As is described above in section 4a, there are some innate deficits in the traditional weight function. According to its characteristic, we would like to call the biquadratic weight function as the “fixed circle” (FIC) weight function. Several improvements could be achieved by the introduction of a set of predefined parameters.

#### 1) “Flexible circle” weight function

Following the form of ACF in Eq. (7), the weight function could be adapted as

where *δ* denotes the grid size. Equation (10) is necessary because *R*_{d} is not completely compatible with the resolution of the SSS map. The decorrelation scale *L* must strike a balance between undersmoothing and oversmoothing, thus a limit must be set based on *δ*. The upper limit is chosen as 6*δ* (about 168 km around the equator) mainly because the *R*_{d} over 6*δ* varies as fast as the latitude (Fig. 1). For example, if no limit was set, the decorrelation scale of two neighbor points might be 6*δ* and 10*δ*, respectively. In addition, 6*δ* is also recommended by several comparison tests (not shown). Note that *L* is almost fixed around equator (as the upper limit of 6*δ*) and at high latitude (as the lower limit of *δ*).

Only the points near the center point could have a considerable influence because of the decaying of exponent function. Here we define the points satisfying $N\u2061(x)\u2264e\u22121$ as the influence zone, which can be the proxy of the pattern of the weight function.

Figure 2 depicts the climatological pattern of this weight function. Because only one parameter *L* derives from the monthly climatological *R*_{d}, this weight function is just monthly flexible. Owing to the dynamic meaning of *R*_{d}, we can easily distinguish the “large circles” at low latitudes and open oceans from the “small circles” at high latitudes and offshore areas in Fig. 2a. We could further see the coastal and latitudinal effect in Fig. 2b and Fig. 2c, and the effect is more obvious at latitudes lower than 30°. We are going to give this weight function a name as “flexible circle” (FLC) to declare its capability of nonuniform and multiscale fusion. It must be noted that the influence zone of FLC is very similar to that of the traditional biquadratic weight function at high latitudes and the circles around the equator are almost fixed as 6*δ* except some points with special stratification.

Unlike the traditional weight function, the center point of the FLC is taken into the calculation of local average. Considering that the largest errors and uncertainties cumulate in the coastal areas (which will be discussed in sections 5 and 6), the FLC scheme makes a further adaption by assigning climatological SSS to the center point of those moving windows overlapped with the land. The radius of the moving window is set to be 7 times the grid interval combined to the upper limit of *L*. This design is also applied to the improved weight function of section 4d(2).

There remains a concern whether the new weight function would bring in biases because of the larger influencing zone and the artificial assignment of the center point, which will be validated in section 5.

#### 2) “Flexible ellipse” weight function

The flexible circle weight function could preliminarily highlight the distribution of SSS. However, the SSS is not just related to the stratification, latitude, and topography. Advection is also an important factor. In other words, the weight function must be anisotropic to take the flow into consideration. The form of the ACF in Eq. (8) could highlight the anisotropic dynamic environment, but it only contains the zonal and meridional extension. An adapted form of weight function is then introduced as

where $L\upsilon =max\u2061[\u2061(\upsilon /\upsilon 0)Rd,Rd]$ denotes the flow-stretched major axis length; *υ* denotes the speed of the sea surface current; *υ*_{0} denotes the reference velocity (here, *υ*_{0} is taken as 0.1 m s^{−1}), which is a scaling factor characterizing how the speed interacts with *R*_{d}; and *α* denotes the angle of the surface current, which can be calculated from the geostrophic velocity. The weight function constructed by Eqs. (11)–(13) makes the weight function an ellipse pattern, whose major axis goes along the direction of the flow and extends according to the speed of the flow. Similar to Eqs. (9) and (10), a limit must be set to avoid undersmoothing and oversmoothing. The daily distributed geostrophic velocity product enables the weight function to have a temporal resolution of 1 day. A coarse realization of flow dependence is achieved.

Figure 3 depicts the climatological pattern of this weight function. The improved weight function inherits the merits of the flexible circle weight function. Furthermore, it highlights the influence of strong currents. Comparing Fig. 3a with Fig. 2a, it can be seen that the influence zones are quite similar in most areas except those areas dominated by strong currents (e.g., the western boundary currents and the equatorial currents). Visually, this weight function stresses on the difference between “circle” and “ellipse.” The “flexible ellipse” (FLE) is used as the substitute of the name of this weight function hereafter.

In Figs. 3b and 3c, we could find that the ellipses coincide with the pattern of SSS in the eastern fresh tongue, Gulf Stream, Amazon plume, and Agulhas Current, indicating the anisotropic and flow-dependent ability of FLE. Additionally, the stretching of influence zone would hardly get into the trouble of oversmoothing. Taking the SSS front near Cape Hatteras in Fig. 3b as an example, although the influence zones of FLE are stretched compared to FLC, the ellipses would not result in cross-front mixing because of the constraint of velocity. Note that FLE is almost the same as FLC around the equator, with the radius mostly being the upper limit of 6*δ*. This simplification circumvents the failure of the quasigeostrophic balance around the equator.

## 5. Application and validation

### a. Experiment set

In section 4, we have described three weight function schemes: that is, FIC, FLC, and FLE. For a better discussion, we would like to refer to them as the traditional scheme, scheme 1 and scheme 2, respectively. We aim to investigate the effects of the improved weight functions on the fusion method.

### b. Validation

We choose the SSS maps in the Gulf Stream to show the difference among different schemes. Compared with the noisy original SSS map shown in Fig. 4a, the fused maps show a more smooth and continuous pattern of SSS (Figs. 4b–d). As has been validated by former researchers, the multifractal fusion algorithm has a good effect in mitigating the noise. The SSS maps of the new schemes appear to be smoother than the traditional one without affecting the sharp gradient of the salinity front.

Considering both systematic and random errors, several questions would be raised as follows: 1) whether the adaption of weight function would introduce artificial systematic errors, 2) whether the improved schemes are effective and robust compared to the traditional scheme, and 3) how the coastal salinity values affect the performance of fusion and validation against in situ observations. First, we try to explain questions 1 and 2 in section 5b(1) from the theoretical basis of multifractal fusion, that is, the fractal correspondence; then these questions are practically answered by the validation against in situ observation in section5b(2). Question 3 is further analyzed in section 6 with a discussion on uncertainties.

#### 1) Singularity analysis

The multifractal fusion method is based on the fractal resemblance. And the true state of the SSS map is demonstrated to have a clear fractal structure. As is seen in Fig. 5, it is hard to detect the dynamic structure in the SE map of the original SSS (Fig. 5a), because the signals have been strongly interfered by the noises. However, the SE structure is clearer in the fused maps (Figs. 5b–d). The SE structure presented by the new schemes appears to be clearer than the traditional scheme, especially in the main axis of Gulf Stream. It suggests that the new schemes could effectively mitigate the noises and highlight the geometric structure of the SSS map.

To make comparison on the fractal structures, we employ a reduced singularity spectrum (RSS). The RSS is a dimensionless, scale-invariant quantity representing the multifractal signature. Pont et al. (2009) demonstrated that any real signal, regardless of dimension or resolution, should have a very similar RSS. Readers who are interested should refer to their work for further details.

Figure 6 shows the RSS of various products. Generally speaking, the RSS curves of all the products coincide with each other, indicating that the new schemes did not bring in a significant large-scale artificial geometry structure. The RSS curve of the traditional scheme departs from others when the SE is lower than −0.3. After the removal of coastal values, the RSS of the traditional scheme becomes lower and resembles the others. It suggests that the traditional scheme is risky when used in the coastal areas. This could be partly explained by the theory of local regression. Although the regression itself does not response to the bias [i.e., $a\u2061(x)$ would not change if a constant was added to Eq. (5)], the local average diffuses the bias. The bias could have an even more significant influence if there are not enough points taken into the calculation. Thus, the result of the traditional scheme can be unreliable when the fast-decaying local average window overlaps with the land.

Note that none of the RSS curves should be taken as the true state. As the template, the SST has a shorter range of SE (−0.4–0.3) than SSS (−0.4–0.6), and its RSS is higher than that of high-quality OA SSS. The RSS curves of the fused products are more similar to the curve of SST than the curve of the original SSS, of which scheme 2 presents the best resemblance with SST, although the template seems invalid when SE is over 0.3. It appears that RSS provides a reference for the fusion results, while further quantitative validation against in situ observations is necessary.

#### 2) By EN 4.2.0 profiles

The quality-controlled profiles of EN 4.2.0 could act as a reliable monitor of SSS. To make comparisons, the SSS products fused by different schemes are interpolated onto the scattered locations of the profiles (viz., the collocated observations) by bilinear interpolation method. We validate the improved weight functions by several statistics, including root-mean-square difference (RMSD), bias, correlation coefficient, and standard deviation (STD). Here, RMSD (STD) means RMS (STD) of the differences between the co-observations of SSS products and EN4 profiles. Note that we have discarded the collocated observations with a larger difference than 3 (166, 380, 6, and 7 collocations of 24 464 for original, traditional scheme, scheme 1, and scheme 2, respectively).

Figure 7 shows the RMSD and bias of different schemes and their evolution with time. As can be seen in Fig. 7a, the multifractal fusion algorithm can effectively mitigate the noise of the original SSS map regardless of the weight function scheme. Nonetheless, it appears that the results of the traditional scheme are not as robust as those of the new schemes. Combined to the analysis of SE, the bad performance is possibly related to the coastal values. The RMSD curves of scheme 1 and scheme 2 are similar, with the largest differences appearing from October to February. In terms of the pattern of these two schemes in Figs. 2 and 3, it could be deduced that scheme 2 has a better correspondence to the strong currents in the boreal winter, indicating that the largest improvements probably originate from the strong currents in the Northern Hemisphere such as the Gulf Stream.

After removing the coastal values (defined as the SSS values that are less than five grid intervals away from the land), the RMSDs of all products are significantly reduced (Fig. 7b). The new schemes are still better than the traditional one throughout the year, and the superiority of scheme 2 coincides with that in Fig. 7a.

The influence of coastal values on the RMSD is further presented in Fig. 8a. The largest RMSD cumulates in the region less than 500 km away from the coast, especially in the region less than 200 km from the coast. In areas are over 1500 km away from the coast, the performances of three schemes are almost equally good, reaching the asymptotic line (possibly the limitation of SMOS retrievals) of the lowest RMSD of 0.2. It turns out that the SSS retrievals in the open ocean have such a low level of noise that any form of the weight function would be enough to mitigate the SSS map. The largest improvements of the new schemes are right in the areas that are close to the coast. In these areas, the traditional scheme even increases the RMSD. The advantage of scheme 2 compared to scheme 1 is not significant based on the statistics derived from global SSS, which is acceptable because the difference between scheme 2 and scheme 1 is mainly in the strong boundary currents.

Before the further validation in specific regions, a discussion on the biases is necessary to justify whether the predefined forms and parameters of the new schemes bring in artifacts. In Fig. 7c, the biases of four products are of the same level, with a slight reduction of annual global mean biases (−0.18, −0.18, and −0.15 for the traditional scheme, scheme 1, and scheme 2, respectively) from the original data (−0.19). Similar to RMSD, the biases are also reduced after the removal of coastal values (Fig. 7d), while the reduction is not as significant as that of RMSD. The bias distribution with the distance to coast in Fig. 8c shows that the bias also cumulates in coastal areas, but it is widely distributed in the global ocean. On one hand, although the adaption of the new schemes with climatological data helps to reduce the biases in the coastal areas, they do not have a significant influence on the global biases. On the other hand, the new schemes are free of the concern of putting on artificial biases even if they are applied to other debiased inputs.

To highlight the advantages of the new schemes, we have mapped the differences between the new schemes and the traditional scheme, as are presented in Figs. 9a and 9c. These figures coincide with the pattern of the influence zone in Figs. 2 and 3. The largest differences between scheme 1 and the traditional scheme lie at low latitudes (Fig. 9a). The difference is mainly trapped in the latitudinal band of 20°S–20°N, with the most noticeable difference in 10°S–10°N. There are also differences distributed around the coastal areas. As for the scheme 2, the differences spread from low to midlatitudes (Fig. 9c), and it is not regularly confined to a zonal band. There are also noticeable differences in the strong currents (e.g., Gulf Stream and Kuroshio), highlighting the mesoscale dynamic signals.

In view of the quasi-latitudinal distribution of differences in Fig. 9c, we make a comparison on the RMSD as a function of latitude. In Fig. 9b, the RMSD of the original SSS (blue) and the traditional scheme (red) generally decrease with the latitude. The traditional scheme takes effects in mitigating the original data except in the range of 20°–40°, which corresponds to Mediterranean Sea and the currents near western Australia in our calculation (not given). The new schemes present a robust performance at all latitudes, especially at those latitudes with a larger coverage of land or with strong currents. The RMSD–latitude distributions of the new schemes are improved, with the RMSD in the Southern Hemisphere efficiently reduced to about 0.3. The latitudinal performances are highlighted by the definition of a simple improvement percentage (IP):

where RMSD_{fused} denotes the RMSD of the fused SSS product and RMSD_{original} denotes the RMSD of the original SSS product. The IP represents the relative improvement of the fused SSS compared to the magnitude of the original RMSD. The higher the IP is, the better the SSS product is.

The IPs of the SSS products fused by different schemes are illustrated in Fig. 9d. The IP of the traditional scheme is of 10% at low latitudes and then subject to a sharp drop in the subtropical areas. Although the IPs of the new schemes also decrease in the subtropics, the performances of the new schemes are generally in an increasing trend with latitude, with the maximum IP reaching 50%. The low IP around the equator appears to be contradictory to the distribution of difference in Figs. 9a and 9c. A possible explanation of this problem is that the strength of noise around the equator is relative low to be mitigated by any decaying. The factors that influence the fusion results will be further discussed in section 7. The latitudinal range where scheme 2 is superior to scheme 1 is about 30°–50°, which coincides with the latitudinal range with strong boundary currents.

Considering that the differences are possibly artificial signals, we calculate the spatial correlation coefficient *R* with all of the collocated observations in 2015. The values of *R* for the original SSS, the traditional scheme, scheme 1, and scheme 2 are 0.92, 0.92, 0.95, and 0.96 with respect to the EN4 profiles. This result demonstrates that the new schemes provide a more reliable pattern of SSS distribution.

Furthermore, we select four regions to evaluate the performance of the new schemes in those areas with noticeable differences, as shown in Tables 1 and 2. The annual mean RMSD of global ocean can be reduced by the traditional scheme by 0.05–0.06. As for the new schemes, the global improvement is over 0.1 (from 0.55 to 0.42 for scheme 1 and 0.39 for scheme 2 in Table 1). In the tropical ocean, the RMSD is only reduced by 0.02 by the traditional scheme, while the reduction of RMSD could reach 0.08 as for the new schemes (Table 1). After the removal of coastal values, the traditional scheme presents a significant improvement (from 0.37 to 0.32), while the improvement of the new schemes with respect to the traditional scheme is as slight as 0.01 (Table 2). It suggests that the new schemes mainly take effect in the coastal areas in the tropical ocean. The Gulf Stream, as shown in both tables, presents the largest RMSD of four regions. And the traditional scheme loses its efficacy regardless of the removal of coastal values. The new schemes perform equally well in Table 2, but scheme 2 stands out with a reduction of RMSD of 0.36 (from 0.94 to 0.58) in the Gulf Stream (GS) in Table 1, declaring the advantage of the flow-dependent design.

As for the Agulhas Current, the RMSD is less sensitive to the coastal values (0.34 in Table 1 and 0.30 in Table 2). Regardless of the coastal effect, the traditional scheme performs a robust work, and scheme 2 improves the traditional scheme by 0.03. It can be inferred that the performance of the fusion schemes is closely related to the location, or error distribution, which will be discussed in section 7.

## 6. Uncertainty analysis

The comparison against collocated in situ profiles is one of the relatively reliable validation methods available for researchers. However, the remotely sensed SSS and the in situ SSS observations are innately measured differently. Boutin et al. (2016) has pointed out that the validation must take subfootprint variability and vertical stratification into consideration. Nevertheless, the in situ observations are sparse and nonuniform in both temporal and spatial distribution. And rarely could an in situ profile take a vertically high-resolution record of accurate near-surface data.

We have calculated the difference between SSS products and EN4 profiles. Twelve collocations with the largest differences are chosen for each product. As shown in Fig. 10a, the collocations with largest differences are all located in the coastal areas, most of which are in the Northern Hemisphere. And most of the locations (except two stations in Red and Okhotsk Seas) are observed by enough EN4 profiles. In Fig. 10b, all of the positions are negatively biased, implying that these remotely sensed observations are fresher than the in situ data. The largest differences of the original SSS and the traditional scheme are mostly in the boreal winter and spring, when the vigorous dynamic processes (such as mixing) are strong in the Northern Hemisphere. In Fig. 10c, most profiles are characterized by the loss of strict surface observation and the existence of deep mixed layer. Many of the mixed layers can even reach the depth of over 200 m. It is possible that the near-surface salinity of these profiles and the remotely sensed salinity are significantly different from each other even if they are measured with accuracy. Nevertheless, the temporal, horizontal, and vertical sparseness of in situ observations make it a difficult problem to extract the components of uncertainties contained in the validation. Even though, we have calculated the STD of the differences with respect to EN4 profiles to discuss the uncertainties from an overall perspective.

Figure 11 depicts the spatial distribution of the STD of the differences with respect to EN4 profiles. The STD in every 3° × 3° box provides a reference of uncertainty though it also contains the temporal and subbox variation. As is shown in Fig. 10a, the largest STD cumulates in several areas, including the Mediterranean Sea, north Indian Ocean, Kuroshio, Gulf Stream, Mexico Gulf, and Antarctic Circumpolar Current (ACC). These areas are right in the areas with vigorous currents or large freshwater flux. It will require more work to separate the strong variation of SSS from the STD in these areas. Nevertheless, these areas are also labeled as the areas with large RMSD. The new schemes perform better than the traditional one in reducing the STD in such areas as the Arabian Sea, Kuroshio, and ACC (Figs. 10b–d).

Figures 12a and 12b show the STD of differences as a function of month and of distance to coast, respectively. Similar to the analysis of RMSD, the new schemes present a robust result in reducing STD and the coastal areas are cumulated with large STD. We have also calculated the STD in four chosen regions, as presented in Tables 1 and 2. The results are generally very similar to those of RMSD. It should be noted that in Table 2, the STD of the new schemes is only slightly reduced if no coastal values are considered. It suggests that the SSS data fused by different schemes in the open oceans present similar uncertainties.

## 7. Discussion

The innate differences between the measurement of remotely sensed salinity and the in situ observations make the “true” validation a challenging task. The analysis in section 6 drives us to make a retrospect on the validation of section 5, albeit the uncertainty is only discussed from an overall perspective.

Combined to the RMSD and STD distribution with the distance to coast (Figs. 8 and 12), the performance of the new schemes must be evaluated according to the location. In the coastal areas or the areas with large STD (as shown in Fig. 10a), the new schemes take much better effects in reducing both RMSD and STD than the traditional one (e.g., from 0.82 and 0.78 to 0.58 and 0.54 in GS for scheme 2 in Table 1). However, it could only be affirmed that the new schemes are more robust than the traditional one. The reliability of the SSS products in these areas cannot be guaranteed unless they have been validated by intensive observations.

As for the areas that are distant from the coast, the new schemes could improve the traditional one by a global reduction of ~0.05 on RMSD (from 0.36 to 0.32 and 0.31 for scheme 1 and scheme 2 in Table 2). This improvement is robust in the whole year (as is shown in Fig. 7b), although it is not as significant as the results in the coastal areas. Note that the STD of differences in the open oceans is generally lower (Figs. 11b–d). In this case, the improvements of the new schemes present a higher confidence.

Now that we have demonstrated the effectiveness of the new schemes compared to the original one, there remains a question about whether the new schemes are applicable to other input data and whether the results are similar to those of BEC, v3.0, binned input. In fact, we also carried out a fusion experiment based on an old version of SMOS SSS (SMOS BEC, v1.0, L3 binned SSS). The v1.0 input is noisier and the values in the areas with large STD in Fig. 11 are almost all discarded [quality of this product could be found in Barcelona Expert Centre (2014)]. In this case, the results are similar to (although not as significant as) those shown in Table 2 in GS and AC. But the results in TO are better than those in Table 2, with a ~0.03 reduction of RMSD compared to the traditional scheme (detailed figures are not given). It turns out that the results of the new schemes are determined by the properties of the input data (e.g., the error distribution).

The distance to the coast is no doubt the vital factor influencing the fusion results (Figs. 8 and 12) and the SSS in the coast areas is not in a “white” pattern but contaminated by large bias. The adaption of the new schemes according to the topography thus “resonates” well with the error structure of the input data, resulting in a significant improvement in Table 1.

The influence of velocity on the performance in the strong currents is robust, especially when the coastal values are retained. And the improvement from scheme 1 to scheme 2 corresponds to the strong variability (e.g., improvements in boreal winter in Fig. 7a) of the strong boundary currents (e.g., improvement of GS in Table 1).

In the open oceans, the main factor is possibly the strength of noise. Here we use “strength of noise” to represent the potential noises that could be reduced by the smoothing or fusion methods, albeit a lack of an exact quantitative definition. It could be inferred that the v1.0 input has a higher strength of noise than the v3.0 input in the tropical area from the quality reports [e.g., the RMSD and STD in Barcelona Expert Center (2014, 2017)], thus leading to a more significant improvement in the tropical ocean.

It could be concluded that the main reason why the new schemes perform a better work than the traditional scheme is because the dynamics-based adaptions coincide better with the error distribution. The new schemes improve the traditional scheme mainly by the consideration of topography and velocity, which corresponds with the largest RMSD/STD cumulating in the coastal areas or strong boundary currents. In the open oceans, the areas with a higher strength of noise have the potential to be improved by, for example, enlarging the decaying scales in the fusion method.

The results suggest that the fusion scheme must take the error distribution of the input data into consideration and make adaptions accordingly. Although it seems common sense in data fusion, it is usually ignored in the practical application, especially in the multifractal fusion. Considering that most of the SSS retrievals have a similar error structure with large biases and RMSD cumulating in the coastal areas (Garcia-Eidell et al. 2017), the new schemes are assumed to be applicable to most SSS data. However, there remains a great potential for improvement because of the different distribution of noise strength in different products. A perfect scheme should have a “perfect resonance” with the error distribution of input data.

Most of the fusion methods cannot effectively tackle the biased errors in the coastal areas, although our adaption of incorporating climatological data into the coastal fusion helps. In fact, most data assimilation methods such as OI are based on the unbiased estimation, and the multifractal fusion is based on local regression, which is also innately unbiased. Calibration would help to solve this problem; however, it is still challenging because of the uncertainty existing in the measurement of remotely sensed SSS and in situ SSS (Boutin et al. 2016). Besides, the failure of the traditional scheme in the coastal areas is mainly due to the lack of points taken into the local regression. This problem would be improved by using a high-resolution template (e.g., 0.05°).

It is hypothesized that the product could be further improved if the matchup between the fusion scheme and strength of the noise is fully considered. This problem seems well solved by, for example, the estimation of signal-to-noise ratio in OI. However, the decorrelation scale is usually set artificially to be a fixed value [e.g., 321 km in Olmedo et al. (2017)] or a *R*_{d}-based radius (e.g., Wang et al). Although the databased decorrelation scales (e.g., those estimated by Tzortzi et al. 2016) are expected to improve the results, these scales are based on SSS data instead of the strength of noise. Further works are needed to quantify the strength of noise and justify the application of the decaying scale estimated by the strength of noise.

## 8. Conclusions

This paper rephrases the work of multifractal SSS fusion of Umbert et al. (2014) and Olmedo et al. (2016) and draws the idea from the weight function. The traditional weight function, the so-called fixed circle (FIC), lacks the ability to evolve with the dynamic environment of the ocean. And its fast-decaying pattern makes the coastal results less reliable because of the lack of points calculated in local regression. Considering the multiscale, nonuniform, anisotropic, and flow-dependent nature of the ocean, this paper comes up with “flexible circle” (FLC) and “flexible ellipse” (FLE) weight functions.

Taking advantage of monthly climatological data and daily surface geostrophic velocity, the new schemes improve the traditional one with a clearer singularity structure and the reduced RMSD without putting on the biases, especially in those coastal areas or the strong boundary currents. Although the mathematic forms of the new schemes seem complicated compared to the traditional one, the calculation efficiency of the new schemes would not decrease and a global RMSD and STD reduction of ~0.1 (from 0.50 and 0.46 to 0.42 and 0.38 and to 0.39and 0.36 in Table 1) makes the new schemes worthy of application.

The validation against EN4 profiles indicates that the coastal biases and noises are the vital factors influencing the quality of the products. Great effort must be taken to tackle the coastal contamination and calibrate the retrieved SSS before implementing any fusion schemes. The topography and velocity considered in the new schemes coincide with the error distribution of the input data, thus taking effect in mitigating the noises. The adaption of the decaying scale is assumed to be constrained by the strength of noise, although further work is needed to quantify its influence.

In summary, the error structure of the input data must be fully considered in the fusion methods, even for the so-called nonparametric methods. And the dynamics-based weight functions must resonate with the noise strength to fulfill their potential. However, our schemes have provided an available prototype. To achieve further improvements, the parameters in the new schemes can be either predefined according to the error distribution of the input data or preestimated using historical data (e.g., estimation of the decorrelation scale). In future work, we are going to apply a set of error-based parameters to seek a better fusion scheme.

## Acknowledgments

This work is supported by the National Natural Science Foundation of China (41706021, 4197060570). All of the authors must express our best gratitude to three anonymous reviewers for their valuable suggestions, which greatly improve the quality of our paper. SMOS BEC data can be downloaded in BEC website (http://bec.icm.csic.es/thredds/catalog/ADVAOA009D025B/2015/catalog.html). The SST data are provided by NOAA’s National Climatic Data Center (NCDC; ftp://eclipse.ncdc.noaa.gov/pub/OI-daily-v2/NetCDF/). Various SSH-derived data are packed into one data file (provided by CMEMS: ftp://my.cmems-du.eu/../Core/SEALEVEL_GLO_PHY_L4_REP_OBSERVATIONS_008_047/dataset-duacs-rep-global-merged-allsat-phy-l4/). The *WOA13* dataset is also provided by NCDC (https://www.nodc.noaa.gov/OC5/woa13/woa13data.html). The EN 4.2.0 dataset provided by the Met Office Hadley Centre can be accessed online (http://hadobs.metoffice.com/en4/download.html). The color map in Fig. 9 is from the cmocean toolbox provided by Thyng et al. (2016). We are grateful to Dr. Oriol Pont for his selfless sharing and explanation of the singularity analysis code. We appreciate the generous help and professional instructions of Dr. Estrella Olmedo. The author also appreciates Chengzu Bai and Lizhi Yang for their enthusiastic encouragement and supervision.

### APPENDIX A

#### Scalar Approach and the Reason for Its Operational Use

##### a. The scalar approach

As for the scalar approach, if $\Phi \u2061(x)$ acts as a scalar $a\u2061(x)$, then

Because of the simple form of the function, the relation can be transformed into the relation between local SSS and SST via integration; namely,

This is a typical linear regression problem, whose solution is

##### b. Failure of the numerical integration

It is no doubt that the simplification of $\Phi \u2061(x)$ as a scalar is slightly contradictory to the SE theory and the relation degrades into the correspondence between SST and SSS. However, the integration of the gradient into the SSS map remains a big headache. The previous work of Olmedo et al. (2016) used the integration method of Turiel and Del Pozo (2002), whose idea is convolutional transformation in the Fourier space. The main problem lies in that the gradient on the coastal boundary is not continuous. Thus, another systematic error is introduced into the algorithm. Although Olmedo et al. (2016) used the moving window method to make amends for the systematic error, the matrix approach and vector approach still perform worse than the scalar approach. Therefore, the method employed by the BEC is still the simplified scalar approach.

### APPENDIX B

#### Calculation of *R*_{d}

The Rossby radius of deformation *R*_{d} can be calculated from the quasigeostrophic potential vorticity function (Gill 1982). Assuming that the vertical velocity can be divided into the multiplication of vertical and horizontal function; that is, $w\u2061(x,y,z,t)=\varphi \u2061(z)W\u2061(x,y,t)$, the equation can be split as

where $\u2207h2$ denotes the horizontal Laplace operator, $f=2\Omega \u2009sin\u2061\theta $ the Coriolis parameter, *θ* the latitude, *c* the phase speed of gravity wave, $\beta =2\Omega Re\u22121\u2009cos\u2061\theta $, $\Omega =7.29\xd710\u22125\u2009s\u22121$ Earth’s rotation angular velocity, and *R*_{e} = 6317 km the radius of Earth; $N2\u2061(z)$ denotes the square of Brunt–Väisälä frequency calculated by $N2\u2061(z)=\u2212(g/\rho )\u2061(\u2202\rho /\u2202z)$, where *ρ* is the potential density of the seawater.

Equation (B2) is a typical eigenvalue problem with the boundary condition of $\varphi =0$ when $z=0$; $\varphi =0$ when $z=\u2212H$, where *H* denotes the depth of the water. Employing the central difference to discrete the equation, Eq. (B2) can be transformed into a Sturm–Liouville problem with the eigenvalue of $\mu =\u22121/c2$. The deformation radius of *m*th mode is then calculated by Eqs. (B3) and (B4), where $cm$ is the phase speed of the *m*th-mode baroclinic gravity wave (Chelton et al. 1998):

The calculation is rather complex by numerically solving the eigenequation. WKB approximation can be employed to get a simplified result:

## REFERENCES

*Advanced Mathematical Methods for Scientists and Engineers: Asymptotic Methods and Perturbation Theory.*Springer, 593 pp.

*Proc. IEEE*,

**98**, 649–665, https://doi.org/10.1109/JPROC.2009.2033096.

*Atmosphere-Ocean Dynamics*. Academic Press, 662 pp.

*Proc. IEEE*,

**98**, 666–687, https://doi.org/10.1109/JPROC.2010.2043032.

*Dyn. Atmos. Oceans*,

**4**, 141–142, https://doi.org/10.1016/0377-0265(80)90007-X.

*2012 Fall Meeting*, San Francisco, CA, Amer. Geophys. Union, Abstract OS41C-1732.

*A Wavelet Tour of Signal Processing*. 2nd ed. Elsevier, 620 pp.

*Geophysical Fluid Dynamics*. Springer, 710 pp., http://link.springer.com/10.1007/978-3-662-25730-2.

*International Workshop on Combinatorial Image Analysis*, J. K. Aggarwal et al., Eds., Lecture Notes in Computer Science, Vol. 6636, 346–357.

*Ocean Sciences Meeting 2016*, New Orleans, LA, Amer. Geophys. Union, Abstract PO44E.

*Descriptive Physical Oceanography: An Introduction.*6th ed. Elsevier, 555 pp.

*Oceanography*,

**29**, (3), 9–13, https://doi.org/10.5670/oceanog.2016.66.

## Footnotes

© 2019 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).