Ocean surface turbulent fluxes play an important role in the energy and water cycles of the atmosphere–ocean coupled system, and several flux products have become available in recent years. Here, turbulent fluxes from 6 widely used reanalyses, 4 satellite-derived flux products, and 1 combined product are evaluated by comparison with direct covariance latent heat (LH) and sensible heat (SH) fluxes and inertial-dissipation wind stresses measured from 12 cruises over the tropics and mid- and high latitudes. The biases range from −3.0 to 20.2 W m−2 for LH flux, from −1.4 to 6.0 W m−2 for SH flux, and from −7.6 to 7.9 × 10−3 N m−2 for wind stress. These biases are small for moderate wind speeds but diverge for strong wind speeds (>10 m s−1). The total flux biases are then further evaluated by dividing them into uncertainties due to errors in the bulk variables and the residual uncertainty. The bulk-variable-caused uncertainty dominates many products’ SH flux and wind stress biases. The biases in the bulk variables that contribute to this uncertainty can be quite high depending on the cruise and the variable. On the basis of a ranking of each product’s flux, it is found that the Modern-Era Retrospective Analysis for Research and Applications (MERRA) is among the “best performing” for all three fluxes. Also, the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim) and the National Centers for Environmental Prediction–Department of Energy (NCEP–DOE) reanalysis are among the best performing for two of the three fluxes. Of the satellite-derived products, version 2b of the Goddard Satellite-Based Surface Turbulent Fluxes (GSSTF2b) is among the best performing for two of the three fluxes. The 40-yr ERA (ERA-40) and the combined product, objectively analyzed air–sea fluxes (OAFlux), are among the best performing for only one of the fluxes. Directions for the future development of ocean surface flux datasets are also suggested.
1. Introduction
The atmosphere and ocean interact at their interface through surface turbulent fluxes of temperature [sensible heat (SH)], moisture [latent heat (LH)], and momentum (wind stress τ). Knowledge of these fluxes is important to understand the ocean heat and freshwater budget and the partitioning of the global pole-to-equator heat transport between the atmosphere and the ocean. The fluxes are also needed to provide a boundary condition for both atmospheric and ocean models and are instrumental in assessing numerical weather prediction and global coupled models (Brunke et al. 2002, 2003).
Direct observation of these fluxes can be made using high-speed instruments, such as a sonic anemometer or high-speed hygrometer, but such measurements are limited to at most a few cruises per year. Another option is to derive fluxes from measurements of bulk quantities of air temperature, sea surface temperature, air humidity, and wind speed from moored buoys and ships of opportunity, as was done for da Silva et al. (1994) and the National Oceanography Centre, Southampton (NOC), flux product (Berry and Kent 2009), which used observations from the International Comprehensive Ocean–Atmosphere Data Set (ICOADS; Worley et al. 2005). Such products utilize these bulk quantities as input into bulk aerodynamic algorithms to calculate the fluxes by way of the following equations:

$$\tau = \rho_a C_D U^2,$$

$$\mathrm{SH} = \rho_a c_p C_H U (\theta_s - \theta_a),$$

$$\mathrm{LH} = \rho_a L_\upsilon C_E U (q_s - q_a),$$
where ρa is the density of air, cp is the specific heat of air at constant pressure, and Lυ is the latent heat of vaporization. The bulk quantities are the wind speed, U; the surface and near-surface atmospheric potential temperatures, θs and θa, respectively; and the surface and near-surface atmospheric specific humidities, qs and qa, respectively. The values for the exchange coefficients—CH for heat, CE for moisture, and CD for momentum (also known as the drag coefficient)—are determined empirically based upon the bulk quantities and differ slightly from algorithm to algorithm (e.g., see Zeng et al. 1998; Brunke et al. 2002, 2003). However, such observations of bulk quantities from moored buoys and ships of opportunity are limited temporally and spatially, being particularly absent from the Southern Hemisphere, and fluxes from these products are not very accurate (e.g., da Silva et al. 1994; Josey et al. 1999; Wang and McPhaden 2001).
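As a rough illustration of the bulk aerodynamic formulas, the sketch below evaluates wind stress, SH flux, and LH flux from a single set of bulk variables. The constant exchange coefficients, the air properties, and all names (`BulkState`, `bulk_fluxes`) are placeholder assumptions chosen here for illustration; the products evaluated in this paper use algorithms in which the coefficients depend on the bulk quantities themselves.

```python
from dataclasses import dataclass

# Assumed, representative air properties (not taken from any product).
RHO_A = 1.2    # air density (kg m-3)
C_P = 1004.0   # specific heat of air at constant pressure (J kg-1 K-1)
L_V = 2.5e6    # latent heat of vaporization (J kg-1)


@dataclass
class BulkState:
    wind_speed: float  # U (m s-1)
    theta_s: float     # surface potential temperature (K)
    theta_a: float     # near-surface air potential temperature (K)
    q_s: float         # surface specific humidity (kg kg-1)
    q_a: float         # near-surface air specific humidity (kg kg-1)


def bulk_fluxes(state, c_d=1.2e-3, c_h=1.1e-3, c_e=1.2e-3):
    """Return (tau, SH, LH) for fixed, assumed exchange coefficients."""
    # tau = rho_a * C_D * U^2
    tau = RHO_A * c_d * state.wind_speed ** 2
    # SH = rho_a * c_p * C_H * U * (theta_s - theta_a)
    sh = RHO_A * C_P * c_h * state.wind_speed * (state.theta_s - state.theta_a)
    # LH = rho_a * L_v * C_E * U * (q_s - q_a)
    lh = RHO_A * L_V * c_e * state.wind_speed * (state.q_s - state.q_a)
    return tau, sh, lh


# Hypothetical tropical conditions, for illustration only.
example = BulkState(wind_speed=7.0, theta_s=301.0, theta_a=300.0,
                    q_s=0.022, q_a=0.017)
tau, sh, lh = bulk_fluxes(example)
print(f"tau = {tau:.4f} N m-2, SH = {sh:.1f} W m-2, LH = {lh:.1f} W m-2")
```

With fixed coefficients the fluxes scale linearly (heat) or quadratically (stress) with wind speed; differences among real algorithms enter through how the coefficients vary with wind speed and stability.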
A third option is to derive the bulk quantities from satellite observations, which provide truly global coverage, and the surface fluxes are generally calculated using a bulk algorithm. Several such datasets have been developed over recent years, most notably the Goddard Satellite-Based Surface Turbulent Fluxes (GSSTF; Chou et al. 2003), the Japanese Ocean Flux Data Sets with Use of Remote Sensing Observations (J-OFURO; Kubota et al. 2002), the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data (HOAPS, http://hoaps.org; Andersson et al. 2010), and those documented in Bentamy et al. (2003) and Bourras et al. (2002). For these datasets, wind speed and sea surface temperature can be easily derived from satellite measurements. In contrast, near-surface air humidity derived from satellite measurements is not particularly accurate, while in the past, near-surface air temperature was almost impossible to retrieve accurately (Bourras 2006). Recent studies (Jackson et al. 2006; Roberts et al. 2010; Jackson and Wick 2010), however, have produced accurate derivations of near-surface air temperature from satellite measurements. Still, some datasets use the air temperatures from reanalysis products instead of using a satellite-derived air temperature, while some others employ a constant surface–air temperature difference as done in HOAPS (Andersson et al. 2010). Another method would be to use a neural network to find a direct relationship between Special Sensor Microwave Imager (SSM/I) brightness temperatures and sensible heat flux similar to what was done for latent heat flux by Bourras et al. (2002).
Finally, sea surface turbulent fluxes can be derived from global model results that have been constrained by surface and rawinsonde observations and satellite measurements. Such products are called reanalyses and are produced by some of the major modeling centers, such as the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP), the European Centre for Medium-Range Weather Forecasts (ECMWF), the Japan Meteorological Agency (JMA), and the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC) Global Modeling and Assimilation Office (GMAO). Again, bulk algorithms are utilized to calculate the fluxes generally using the values from the lowest model layer as the near-surface quantities. Because of the use of models to produce these reanalyses, model errors can be introduced into their surface fluxes (e.g., Weller and Anderson 1996; Wang and McPhaden 2001; Smith et al. 2001; Renfrew et al. 2002).
Reanalyses are widely used, while satellite-derived fluxes are becoming more widely used as their accuracy is approaching that of the reanalyses. The climatological fluxes in these products have been intercompared extensively, particularly when a new product or version has been introduced. For example, Chou et al. (2003) compared the climatological characteristics of their version 2 of GSSTF (GSSTF2) with HOAPS, the older NCEP–National Center for Atmospheric Research (NCAR) reanalysis, and da Silva et al. (1994). Similarly, Kubota et al. (2002) compared their original version of J-OFURO to da Silva et al. (1994), the NCEP–NCAR reanalysis, and ECMWF’s reanalysis, and Kubota et al. (2003) added HOAPS and GSSTF to their comparison of J-OFURO2.
The goal, as established by the U.S. Climate Variability and Predictability (CLIVAR) Program and the Global Energy and Water Cycle Experiment (GEWEX), is to attain an accuracy of 5 W m−2 for each component of the surface heat budget (Curry et al. 2004). To get there, one must know what contributes most to the product bias: the algorithm used or the bulk variables input to it. Brunke et al. (2002) did this by separating out the algorithm and bulk variable contributions in the NASA GMAO’s Goddard Earth Observing System (GEOS) reanalysis and HOAPS, and comparing them to each other. Bourras (2006) went one step further and assigned contributions to the individual bulk variables for several satellite-derived products, but this was done only for latent heat flux using buoy data in the North Pacific and Atlantic.
Furthermore, with the recent availability of a new generation of several ocean surface turbulent flux products, we should now ask the question, Is the new generation of reanalyses [e.g., the Modern-Era Retrospective Analysis for Research and Applications (MERRA), the Climate Forecast System Reanalysis (CFSR), and the ECMWF interim reanalysis (ERA-Interim)] better than the previous generation [e.g., NCEP–NCAR, NCEP–Department of Energy (DOE), and 40-yr ERA (ERA-40)]? And how well do the satellite-derived and combined products perform relative to the reanalyses? Finally, what is the main contributor to the flux errors in all of these products? Here, we address these questions with six reanalyses, four satellite-derived products, and one product based on both satellite-derived measurements and reanalysis. These products are briefly described in section 2. Instead of using buoy measurements, as in Bourras (2006), which needed a bulk algorithm to derive the “observed” surface turbulent fluxes, we compare the product fluxes to direct observations taken from 12 experimental ship cruises in the tropics and Northern Hemisphere subtropics and mid- and high latitudes. These cruises are described briefly in section 3. To better understand the product flux biases, we split the total bias into two components, a bulk variable uncertainty and a residual one, and rank the products according to their biases and standard deviations of the errors (SDEs) as described in section 4. The results are presented in section 5, followed by some further discussion of the results and some concluding remarks in section 6.
2. Data products
The ocean surface turbulent flux data products compared here include commonly used reanalyses and satellite-derived products: NASA GMAO’s MERRA, ERA-40 and ERA-Interim, NCEP’s original reanalysis (versions 1 and 2) and the latest (CFSR), NASA GSFC’s GSSTF versions 2 and 2b, J-OFURO version 2, HOAPS version 3, and Woods Hole Oceanographic Institution (WHOI)’s objectively analyzed air–sea fluxes (OAFlux). The bulk flux algorithms used by these products are explained in the appendix.
a. MERRA
MERRA is the most recent reanalysis produced by the NASA GMAO using the GEOS Data Assimilation System (DAS), which has at its core the GEOS version 5 (GEOS-5) atmospheric general circulation model (AGCM) (Rienecker et al. 2011; Suarez et al. 2008; Rienecker et al. 2007). The model has a finite-volume dynamical core that is run at a resolution of ½° latitude × ⅔° longitude (Suarez et al. 2008) with 72 vertical layers. Assimilation is done by gridpoint statistical interpolation (GSI), a new three-dimensional variational (3DVar) analysis (Rienecker et al. 2011). To reduce shocks from the mass-wind analysis increments, the incremental analysis update (IAU) procedure (Bloom et al. 1996) is implemented (Rienecker et al. 2007). In this procedure, assimilation at the 6-h synoptic times is based upon 6 h of model predictions centered on the synoptic time. The analyzed correction, then, is applied at the previous 6 h, and the model is run for 12 h (Suarez et al. 2008).
Conventional observational inputs to MERRA include data from surface land, ship, and buoy observations; rawinsondes; dropsondes; pilot balloons (PIBALs); wind profilers; and aircraft. Satellite inputs include upper-air winds derived from geostationary satellites and the Moderate Resolution Imaging Spectroradiometer (MODIS); surface winds from SSM/I, the Quick Scatterometer (QuikSCAT), and the European Remote Sensing Satellite (ERS) scatterometers 1 and 2 (ERS-1 and ERS-2, respectively); surface rain rate from SSM/I and the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI); radiances from the Geostationary Operational Environmental Satellite (GOES) sounder, Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) and Advanced TOVS (ATOVS) instruments, the Atmospheric Infrared Sounder (AIRS), the Microwave Sounding Unit (MSU), the Advanced Microwave Sounding Unit-A (AMSU-A), and SSM/I; and ozone retrievals from the Solar Backscatter Ultraviolet instrument (SBUV) (Rienecker et al. 2011). SST was taken from the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST; Rayner et al. 2003) and the Reynolds et al. (2007) products.
Used here are the surface turbulent flux (tavg1_2d_flx_Nx) and single-level atmospheric bulk variable (tavg1_2d_slv_Nx) data collections, which are provided in the model horizontal resolution (½° latitude × ⅔° longitude) at 1-h temporal resolution. These data were downloaded through the MERRA Web site (http://gmao.gsfc.nasa.gov).
b. ECMWF reanalyses
Included here are two generations of ECMWF reanalyses: ERA-40 and ERA-Interim. ERA-40 utilized the ECMWF atmospheric model with a spectral resolution of T159 and 60 vertical layers (Uppala et al. 2005). The model included improvements to the parameterizations of deep convection and radiation and a new representation of sea ice (Uppala et al. 2005). Data assimilation was done by a 3DVar system. Inputs included conventional sources (surface land and ship observations, rawinsondes, dropsondes, PIBALs, wind profilers, and aircraft) and satellite measurements [upper-level winds from geostationary satellites, radiances from the Vertical Temperature Profile Radiometer (VTPR), the High Resolution Infrared Sounder (HIRS), the Stratospheric Sounding Unit (SSU), MSU, and AMSU-A; total column water vapor and surface wind speeds from SSM/I; ocean wave height and surface wind from ERS-1 and ERS-2, total column ozone from the Total Ozone Mapping Spectroradiometer (TOMS), and ozone profiles from SBUV] (Uppala et al. 2005).
ERA-Interim is the newest generation of ECMWF reanalyses (Simmons et al. 2006). Unlike ERA-40, which was limited to a 45-yr period (September 1957–August 2002; Uppala et al. 2005), ERA-Interim has near-real-time analyses. ERA-Interim’s model has a higher horizontal resolution (T255) with improved model physics. For ERA-Interim, ECMWF implemented a 12-h 4DVar assimilation system with improvements to the handling of data biases and the background error constraint. The same data as in ERA-40 were input into ERA-Interim with the addition of clear-sky radiances from Meteosat-2; Global Ozone Monitoring Experiment (GOME) ozone profiles; and radio occultation (RO) measurements from the Challenging Minisatellite Payload (CHAMP), the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC), and the Gravity Recovery and Climate Experiment (GRACE). Also included were reprocessed ocean wave height data from ERS-1 and ERS-2 and upper-level winds from Meteosat-2 (Simmons et al. 2006).
Utilized here are the 6-hourly surface and single-level analysis on a 2.5° latitude–longitude fixed grid for ERA-40 and the model-resolution 3-hourly surface fluxes and surface bulk variables from ERA-Interim. The ERA-40 data were downloaded from the NCAR Research Data Archive (RDA) Web site (http://dss.ucar.edu), and the ERA-Interim data were downloaded from the mass storage on NCAR’s supercomputers.
c. NCEP reanalyses
The NCEP reanalyses compared here include NCEP–NCAR, NCEP–DOE, and the CFSR. NCEP–NCAR (also referred to as NCEP-1) was originally a 40-yr product (Kalnay et al. 1996) but has been extended to near–real time. Its model is NCEP’s operational global forecasting model of the mid-1990s with a horizontal resolution of T62 and 28 vertical layers. Assimilation is done by spectral statistical interpolation (SSI), an older 3DVar technique, using data from the conventional sources mentioned above; satellite radiances from Satellite Infrared Spectrometer (SIRS), HIRS, VTPR, and TOVS; and upper-level winds from geostationary satellites. SST was taken from the Met Office (UKMO)’s Global Sea Ice and Sea Surface Temperature dataset (GISST; Rayner et al. 2006) product before 1982 and from the Reynolds and Smith (1994) analysis afterward (Kalnay et al. 1996).
NCEP–DOE (also referred to as NCEP-2) should be considered to be a revised version of NCEP–NCAR, not a new generation of reanalysis, developed to correct errors discovered in the processing of the earlier version. As in the previous version, NCEP–DOE uses a T62 spectral resolution model with 28 layers in the vertical. Some improvements were made to the model physics, including changes to boundary layer turbulence and radiation. The ocean albedo was also reduced from 0.15 in NCEP–NCAR to 0.06–0.07 in NCEP–DOE, while desert albedo was increased in NCEP–DOE. Similar data were ingested into NCEP–DOE, except that sea ice and SST were prescribed from the Atmospheric Model Intercomparison Project (AMIP-II) analysis and a new ozone climatology (Rosenfield et al. 1987) was used. Also, a geolocation error was fixed in the Southern Hemisphere surface pressure analysis (Kanamitsu et al. 2002).
CFSR is the latest generation of NCEP reanalyses (Saha et al. 2010). It uses the Climate Forecast System (CFS), a fully coupled atmosphere–ocean–sea ice–land model. The atmospheric component is run at a spectral resolution of T382 with 64 vertical layers with the addition of a cloud microphysics scheme to determine cloud condensate prognostically (Zhao and Carr 1997; Sundqvist et al. 1989; Moorthi et al. 2001), the simplified Arakawa–Schubert cumulus convection scheme (Pan and Wu 1995; Hong and Pan 1998), and orographic gravity wave drag (Kim and Arakawa 1995; Alpert et al. 1988, 1996). The ocean component is run at a variable horizontal resolution of ¼°–½° latitude × ½° longitude and 40 layers in the vertical with the uppermost layer at 10-m thickness. Data similar to that of the other reanalyses were also ingested into the CFSR with the addition of SSM/I, ERS, QuikSCAT, and WindSat ocean surface winds; GOES, AIRS, Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E), Infrared Atmospheric Sounding Interferometer (IASI), and Microwave Humidity Sounder (MHS) radiances; and RO from CHAMP and COSMIC. SST was taken from NCEP’s optimum interpolation (OI) product (Reynolds et al. 2007).
The 6-hourly surface fluxes and surface bulk variables taken from the model-resolution two-dimensional fields of NCEP–NCAR and NCEP–DOE were downloaded from the NCAR RDA Web site, and those from CFSR’s low-horizontal-resolution (~1.9° × 1.9°) hourly time series products from the mass storage on NCAR’s supercomputers.
d. GSSTF products
Compared here are two of the latest versions of the NASA GSFC satellite-derived products: GSSTF2 and GSSTF2b. GSSTF2 (Chou et al. 2003) was produced for a 13.5-yr period (July 1987–December 2000) with a 1° × 1° latitude–longitude horizontal resolution and a daily temporal resolution. Wind speed is retrieved from SSM/I measurements using the version 4 (Wentz 1997) scheme. Specific humidity at 10 m is derived from SSM/I version 4 total column water vapor and from the lowest 500 m, as described in Chou et al. (1995, 1997). Near-surface air temperature and SST are taken from NCEP–NCAR. The bulk algorithm used to calculate turbulent fluxes is slightly adapted from the Chou (1993) algorithm (Chou et al. 2003; see appendix).
Recently, the GSSTF products were revived and extended with a slightly improved version, GSSTF2b (Shie 2010; Shie et al. 2009), which uses an updated version 6 of the SSM/I total precipitable water, brightness temperature, and wind speed retrievals (Wentz et al. 2007 and online supplemental material; also see online at http://www.ssmi.com) as well as NCEP–DOE near-surface air temperatures and SSTs. The horizontal and temporal resolutions are unchanged, but the period has been extended to December 2008.
Both versions of GSSTF used here were downloaded via anonymous FTP through the Goddard Earth Sciences Data and Information Services Center (http://disc.sci.gsfc.nasa.gov). GSSTF2b can also be downloaded in Hierarchical Data Format for the Earth Observing System (HDF-EOS5) format (from ftp://measures.gsfc.nasa.gov/data/s4pa/GSSTF/ or http://disc.sci.gsfc.nasa.gov/daac-bin/DataHoldingsMEASURES.pl?PROGRAM=ChungLinShie). There are two sets of data produced for GSSTF2b: sets 1 and 2 (Shie 2010). Set 1, which is used here, was found to possess a slightly increased global latent heat flux, especially after 2000. Therefore, set 2 was produced later by removing certain available satellite products that seemed to have a relatively larger trend in latent heat flux. As such, set 1 may contain a larger global temporal trend in latent heat flux with less missing data, while there might be a smaller trend in set 2 with more missing data. Detailed information about sets 1 and 2 can be found in Shie (2010). The results here are unaffected by the choice of set 1, but there might be more of a difference if data after 1999 (when more data were disqualified) were used. The reanalysis near-surface air temperature was not provided in the GSSTF products, so it was taken from the NCEP–NCAR and NCEP–DOE reanalyses for GSSTF2 and GSSTF2b, respectively.
Shie (2010) pointed out that the GSSTF2b set-1 latent heat flux trend was mainly caused by a trend in SSM/I brightness temperatures used to retrieve the bottom 500-m column water vapor. This was found to be due to temporal variations of the Earth incidence angle of the individual SSM/I satellites (Shie and Hilburn 2011). Thus, adjusted brightness temperatures are being used in an updated version (GSSTF2c) in which the temporal trends in specific humidity and latent heat flux are considerably reduced. This version is scheduled to be released in autumn of 2011. Soon, the GSSTF team will start production on a newer version (GSSTF3) that will include the adjusted brightness temperatures used in GSSTF2c plus increased horizontal (0.25° × 0.25°) and temporal (12-hourly) resolution as well as SSTs based on AMSR-E and TRMM measurements. GSSTF3 is expected to be released no later than spring of 2012.
e. J-OFURO
J-OFURO (http://dtsv.scc.u-tokai.ac.jp/j-ofuro/index.html) is produced by the School of Marine Science and Technology at Tokai University in Japan. In version 2 used here, wind speed for the sensible and latent heat fluxes is taken from SSM/I, ERS-1 and ERS-2, the active microwave instrument (AMI), QuikSCAT, AMSR-E, and TMI. SST is taken from the JMA Merged Satellite and In situ Data Global Daily SST (MGDSST). Specific humidity at 10 m is retrieved from SSM/I measurements following Schlüssel et al. (1995), excluding certain data as in Schulz et al. (1993). Near-surface air temperature was taken from NCEP–DOE. Latent and sensible heat fluxes were calculated using the Coupled Ocean–Atmosphere Response Experiment (COARE), version 3.0, algorithm (Fairall et al. 2003; see appendix). Wind stresses were also available beginning in August 1999, but most of the cruise data (see section 3) occur before this time. So, we have not included the J-OFURO wind stresses here.
J-OFURO’s daily 1° × 1° surface fluxes and specific humidities were provided through the dataset’s Web site (http://dtsv.scc.u-tokai.ac.jp/j-ofuro/index.html). SST was calculated from the surface specific humidity. Air temperature was taken from NCEP–DOE, since it is not included in J-OFURO.
f. HOAPS
HOAPS is produced by the University of Hamburg and the Max Planck Institute for Meteorology in Germany. Version 3 (Andersson et al. 2010) compared here uses a neural network to obtain wind speed (Krasnopolsky et al. 1995). Specific humidity at 10 m is derived from SSM/I measurements using the updated coefficients of Bentamy et al. (2003), and surface specific humidity is derived using the Magnus formula, as described in Murray (1967), with a salinity correction. SST is taken from the Advanced Very High Resolution Radiometer (AVHRR) Oceans Pathfinder SST product (Casey et al. 2010). Near-surface air temperature is taken as the average of two methods: from the air specific humidity assuming that the relative humidity is a constant 80% (Liu et al. 1994) and from the SST assuming a constant surface–air temperature difference of 1 K (Wells and King-Hele 1990). Latent and sensible heat fluxes are computed using the COARE 3.0 algorithm (Fairall et al. 2003; see appendix).
The twice-daily (12 hourly) surface fluxes and bulk variables on a 1° × 1° grid (referred to as the HOAPS-G dataset) were downloaded from the Climate and Environmental Data Retrieval and Archive Web site (http://cera-www.dkrz.de/CERA/) at the Max Planck Institute for Meteorology.
g. OAFlux
OAFlux is produced by the WHOI. OAFlux is unique, in that it combines bulk variables derived from satellites with those from reanalyses (Yu and Weller 2007). Satellite-derived wind speeds come from SSM/I measurements using the Wentz (1997) algorithm and from AMSR-E and QuikSCAT. Specific humidity at 10 m is derived from SSM/I measurements using Chou et al. (1995, 1997), which is brought down to 2 m using the COARE 3.0 algorithm (Fairall et al. 2003). Satellite SST comes from NCEP’s OI product (Reynolds et al. 2007). Also ingested are the corresponding bulk variables from NCEP–NCAR, NCEP–DOE, and ERA-40 (Yu et al. 2008). The latent and sensible heat fluxes are calculated using the COARE 3.0 algorithm (Fairall et al. 2003; see appendix).
The daily 1° × 1° fluxes and bulk variables were downloaded from the OAFlux project Web site (http://oaflux.whoi.edu/).
3. Ship cruises
The fluxes and bulk variables are compared with observational data during 12 cruises shown in Fig. 1. Ten of these [the Atlantic Stratocumulus Transition Experiment (ASTEX), the COARE, the Fronts and Atlantic Storm Track Experiment (FASTEX), the Joint Air–Sea Monsoon Experiment (JASMINE), the Kwajalein Experiment (KWAJEX), a cruise to service buoys in the North Pacific (Moorings), Nauru ’99, the Pan-American Climate Study Flux 1999 cruise (PACS Flux ’99), the San Clemente Ocean Probing Experiment (SCOPE), and the Tropical Instability Wave Experiment (TIWE)] were carried out by the NOAA Environmental Technology Laboratory (ETL) [now part of the Earth System Research Laboratory (ESRL)] between 1991 and 1999. The other two [Couplage avec l’Atmosphère en Conditions Hivernales (CATCH) in January–February 1997 and Flux, État de la Mer et Télédétection en Condition de Fetch Variable (FETCH) in March–April 1998] were operated by the Centre d’Etude des Environments Terrestre et Planétaires (CETP) [now part of the Laboratoire Atmosphères, Milieux, Observations Spatiales (LATMOS)] at L’Institut Pierre-Simon Laplace in France. For more information on these experimental cruises, please refer to Brunke et al. (2003) and the references therein.
The observed fluxes used herein are those derived from the covariance and inertial-dissipation methods. Both are provided for sensible and latent heat fluxes as well as wind stress for all of the ETL cruises. Both were also provided for sensible and latent heat fluxes from FETCH, while only inertial-dissipation fluxes were provided for all fluxes during CATCH and for wind stress during FETCH. The covariance fluxes have been shown to be more reliable for latent and sensible heat fluxes (Fairall et al. 1996; Pedreros et al. 2003), so we use those whenever possible in this study. Inertial-dissipation wind stresses are always used, and the inertial-dissipation latent and sensible heat fluxes are used whenever covariance fluxes are unavailable. Additionally, flow distortion, ship motions, and environmental conditions have all been accounted for as described in Brunke et al. (2003).
Since the temporal resolution of these flux products ranges from hourly to daily, mainly the daily average fluxes and bulk variables are compared to each other here. This also allows the bulk variables from some of the reanalyses (e.g., NCEP–NCAR and NCEP–DOE), which are only provided instantaneously, to be compared to the average quantities in the other products. It is important to point out that the daily quantities provided by the satellite-derived products may not be true averages but rather composites of all of the available passes during the day. Also, the bulk variables in GSSTF2, GSSTF2b, J-OFURO, and HOAPS are derived at 10 m, so these have been brought down to 2 m using their respective algorithms.
The product algorithms are also used offline to calculate the turbulent fluxes using the observed bulk variables as input (referred to as algorithm fluxes). This provides an assessment of the contribution of the algorithm to the flux product error. To mimic the model timestepping used to produce the reanalyses, the observed temporal resolution was used to produce the algorithm fluxes from the model algorithms, whereas (either daily or 12 hourly) mean quantities of the observations were input into the algorithms used to produce the other products. Also, since wave information is not provided for these cruises, the wave age dependence in the additional roughness due to waves, as was done to produce the ECMWF reanalyses using a wave model (e.g., ECMWF 2007), was ignored here, and a constant 0.018 was used instead (see appendix). There would be a small impact on the algorithm fluxes from the ECMWF algorithm, especially for conditions in which young waves are prevalent.
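The constant 0.018 mentioned above appears to play the role of a Charnock-type coefficient in the wave-induced roughness. Under that assumption, the sketch below shows how a fixed coefficient translates into a neutral 10-m drag coefficient. The Charnock form z0 = alpha u*^2 / g, the neglect of the smooth-flow (viscous) roughness term and of stability effects, and the function name are all assumptions made here for illustration; this is not the ECMWF implementation.

```python
import math

VON_KARMAN = 0.4   # von Karman constant
GRAVITY = 9.81     # gravitational acceleration (m s-2)


def neutral_drag_coefficient(wind_speed, height=10.0, alpha=0.018, n_iter=50):
    """Neutral drag coefficient for a constant Charnock-type coefficient.

    Solves u* = k U / ln(z / z0) with z0 = alpha * u*^2 / g by fixed-point
    iteration, then returns C_D = (u* / U)^2.
    """
    if wind_speed <= 0.0:
        raise ValueError("wind_speed must be positive")
    u_star = 0.04 * wind_speed  # first guess for the friction velocity
    for _ in range(n_iter):
        z0 = alpha * u_star ** 2 / GRAVITY
        u_star = VON_KARMAN * wind_speed / math.log(height / z0)
    return (u_star / wind_speed) ** 2


for u in (5.0, 10.0, 20.0):
    print(f"U = {u:4.1f} m s-1  ->  C_D = {neutral_drag_coefficient(u):.2e}")
```

In this simplified form the drag coefficient grows with wind speed, so replacing a wave-age-dependent coefficient with a constant would matter most where the sea state departs from the conditions the constant represents.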
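The crux of this study is to analyze the uncertainties associated with the total error of a product’s flux. Thus, the total flux error can be divided into two uncertainties as follows:

$$F_{\mathrm{prod}} - F_{\mathrm{obs}} = (F_{\mathrm{prod}} - F_{\mathrm{algor}}) + (F_{\mathrm{algor}} - F_{\mathrm{obs}}),$$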
where Fprod is the product flux (the flux directly from each product), Fobs is the observed flux, and Falgor is the algorithm flux (computed using the respective offline algorithm inputted with the observed bulk variables). The first term on the right-hand side, called the “bulk-variable-caused uncertainty,” is predominately due to the difference in bulk variables, as the same algorithm was used to produce both Fprod and Falgor. There is also some uncertainty included in this term because of the possible mismatch between a point measurement and a gridbox average. However, this should be lessened by the temporal averaging that is applied to the data from a moving ship. The last term is called the “residual uncertainty” and is partly due to the uncertainty caused by problems in the algorithm, as discussed in Zeng et al. (1998) and Brunke et al. (2002, 2003). There is also some measurement uncertainty in this term, as mean covariance latent heat flux, covariance sensible heat flux, and inertial-dissipation wind stress typically have an uncertainty of 4 W m−2, 2 W m−2, and 5%, respectively (Fairall et al. 1996).
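A minimal sketch of this decomposition is given below, assuming matched series of product, observed, and algorithm fluxes. The function name and the numbers are hypothetical and serve only to show that the two terms sum to the total error, both sample by sample and in the mean.

```python
def decompose_flux_error(f_prod, f_obs, f_algor):
    """Split mean (F_prod - F_obs) into two mean contributions.

    Returns (total, bulk_variable_term, residual_term), where
      bulk_variable_term = mean(F_prod - F_algor)
      residual_term      = mean(F_algor - F_obs)
    """
    if not (len(f_prod) == len(f_obs) == len(f_algor)) or not f_prod:
        raise ValueError("need three non-empty series of equal length")
    n = len(f_prod)
    total = sum(p - o for p, o in zip(f_prod, f_obs)) / n
    bulk_variable_term = sum(p - a for p, a in zip(f_prod, f_algor)) / n
    residual_term = sum(a - o for a, o in zip(f_algor, f_obs)) / n
    return total, bulk_variable_term, residual_term


# Hypothetical daily mean LH fluxes (W m-2), for illustration only.
product_lh = [120.0, 95.0, 140.0]
observed_lh = [100.0, 90.0, 125.0]
algorithm_lh = [108.0, 92.0, 130.0]

total, bulk_term, residual = decompose_flux_error(
    product_lh, observed_lh, algorithm_lh)
print(total, bulk_term, residual)
```

Because averaging is linear, the mean bulk-variable-caused term and the mean residual term always add up to the mean total error, although the two can have opposite signs and partly cancel.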
In Brunke et al. (2003), we found it helpful to rank bulk flux algorithms to assess which ones were least problematic. Thus, here we employ a similar strategy to assess the overall quality of the ocean surface fluxes in the products presented here by ranking them for each flux based upon their total bias (i.e., average error) and their SDEs.
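For concreteness, the two statistics used in the ranking can be computed as sketched below. Whether the SDE is normalized by N or N − 1 is not stated in this section, so the population form used here is an assumption, as are the function name and the sample values.

```python
import statistics


def bias_and_sde(product, observed):
    """Return (bias, SDE) of product minus observed.

    bias = mean error; SDE = standard deviation of the errors
    (population form, an assumption made for this sketch).
    """
    if len(product) != len(observed) or not product:
        raise ValueError("need two non-empty series of equal length")
    errors = [p - o for p, o in zip(product, observed)]
    return statistics.fmean(errors), statistics.pstdev(errors)


# Hypothetical daily mean fluxes (W m-2), for illustration only.
bias, sde = bias_and_sde([110.0, 95.0, 130.0], [100.0, 100.0, 120.0])
print(f"bias = {bias:.2f} W m-2, SDE = {sde:.2f} W m-2")
```

The bias measures systematic offset, while the SDE measures scatter about that offset, which is why a product can score well on one and poorly on the other.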
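Thus, for each cruise, a score from 1 for the lowest bias magnitude to 11 (8 for wind stress) for the highest bias magnitude is assigned to each product’s flux. A similar score is also assigned to each product’s flux for each cruise based upon its SDE. This method assigns equal weighting to each cruise despite the variable number of valid points (see Table 3 in Brunke et al. 2003). So, scores are also assigned to each product’s flux based upon the all-cruise biases and SDEs, as presented in Table 1. An overall ranking of each product’s flux is, then, the average of these four values as follows:

$$R = \frac{1}{4}\left(\overline{S}_{\mathrm{bias}} + \overline{S}_{\mathrm{SDE}} + S_{\mathrm{bias}}^{\mathrm{all}} + S_{\mathrm{SDE}}^{\mathrm{all}}\right),$$

where the overbars denote the per-cruise scores averaged over all cruises and the superscript “all” denotes the scores based on the all-cruise statistics.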
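The scoring scheme can be sketched as follows. The reading that the per-cruise scores are averaged over cruises before being combined with the all-cruise scores, the tie handling, and the product and cruise names are assumptions made here for illustration; the numbers are not from Table 1.

```python
def scores(values):
    """Score products from 1 (smallest magnitude) upward.

    `values` maps product name -> bias or SDE. Ties are broken by
    input order, which is an arbitrary choice for this sketch.
    """
    ordered = sorted(values, key=lambda name: abs(values[name]))
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def overall_ranking(cruise_bias, cruise_sde, all_bias, all_sde):
    """Average of four scores per product (lower is better).

    cruise_bias, cruise_sde: {cruise: {product: value}}
    all_bias, all_sde:       {product: value}
    """
    products = list(all_bias)

    def mean_cruise_score(per_cruise):
        per_cruise_scores = [scores(v) for v in per_cruise.values()]
        return {p: sum(s[p] for s in per_cruise_scores) / len(per_cruise_scores)
                for p in products}

    mean_bias = mean_cruise_score(cruise_bias)
    mean_sde = mean_cruise_score(cruise_sde)
    all_bias_score = scores(all_bias)
    all_sde_score = scores(all_sde)
    return {p: (mean_bias[p] + mean_sde[p]
                + all_bias_score[p] + all_sde_score[p]) / 4.0
            for p in products}


# Hypothetical products A, B, C and two hypothetical cruises.
ranking = overall_ranking(
    cruise_bias={"c1": {"A": 2.0, "B": -5.0, "C": 9.0},
                 "c2": {"A": -4.0, "B": 1.0, "C": 6.0}},
    cruise_sde={"c1": {"A": 10.0, "B": 12.0, "C": 11.0},
                "c2": {"A": 8.0, "B": 15.0, "C": 9.0}},
    all_bias={"A": -1.0, "B": -2.0, "C": 7.5},
    all_sde={"A": 9.0, "B": 14.0, "C": 10.0},
)
print(ranking)
```

Combining per-cruise and all-cruise scores balances two weightings: the former treats every cruise equally, while the latter lets cruises with more valid points count for more.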
a. Evaluation of the total biases
Most of the bulk algorithms are not well tested at high wind speeds, since few observations are made in this regime. Zeng et al. (1998) found that the algorithm fluxes (that is, the fluxes calculated by bulk algorithms using observed bulk variables as input) diverge for high wind speeds. Figure 2a shows that the suite of cruise data in this study covers a wide range of wind speeds, from very low (<1 m s−1) to ≥10 m s−1. Ten of the 12 cruises, most of them in the tropics and subtropics, have the largest number of data points in the moderate range (from 3 to 6 m s−1). The other two (CATCH and FASTEX) have the largest number of points in the high wind speed range, and FETCH has many high wind speed points as well as many in the moderate range.
Thus, to examine how well the data products produce fluxes at high wind speeds, the quality of the daily mean fluxes from CATCH and FASTEX are compared with that of all other cruises by comparing the product fluxes with the observed fluxes in Figs. 3–5. For LH and SH fluxes (Figs. 3, 4), the reanalysis regression slopes (except for CFSR LH flux) are generally higher for CATCH/FASTEX than for the other cruises, whereas some of the slopes of the regressions for the satellite-derived products are lower during CATCH/FASTEX. Except for NCEP–DOE’s LH flux, NCEP–NCAR’s and NCEP–DOE’s regression slopes are usually very close to 1, while the scatter in these two products as well as CFSR is higher than the other products. The wind stresses from the reanalyses are better than those from the GSSTF products, especially GSSTF2 (Fig. 5). As for LH and SH fluxes, the wind stress regression slopes for the reanalyses, except CFSR, are all higher for CATCH/FASTEX. CFSR and both versions of GSSTF2 overestimate low wind stresses and underestimate high wind stresses, particularly during the cruises other than CATCH and FASTEX (Fig. 5).
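The comparison of regression slopes between the two groups of cruises can be sketched as follows. This is a minimal illustration that assumes an ordinary least-squares fit of product flux against observed flux; the text does not state the regression method, and the function names and sample values are hypothetical.

```python
import numpy as np

def regression_slope(f_obs, f_prod):
    """Least-squares slope of product flux regressed on observed flux."""
    slope, _intercept = np.polyfit(np.asarray(f_obs, dtype=float),
                                   np.asarray(f_prod, dtype=float), 1)
    return float(slope)

def slopes_by_group(f_obs, f_prod, cruise, high_wind=("CATCH", "FASTEX")):
    """Return (slope for the high-wind cruises, slope for all other cruises)."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_prod = np.asarray(f_prod, dtype=float)
    is_high = np.isin(np.asarray(cruise), high_wind)
    return (regression_slope(f_obs[is_high], f_prod[is_high]),
            regression_slope(f_obs[~is_high], f_prod[~is_high]))

# Hypothetical daily-mean LH fluxes (W m-2); the cruise labels only tag each point.
obs = [50.0, 100.0, 150.0, 60.0, 120.0, 180.0]
prod = [60.0, 120.0, 180.0, 59.0, 113.0, 167.0]
cruise = ["CATCH", "FASTEX", "CATCH", "TIWE", "TIWE", "SCOPE"]
high_slope, other_slope = slopes_by_group(obs, prod, cruise)
```

A slope above 1 means the product increasingly overestimates larger observed fluxes, and a slope below 1 means it damps them.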
The product total biases based on the daily means for all cruises are presented in Table 1. For LH flux, total biases range from −3.0 W m−2 for J-OFURO to 20.2 W m−2 for NCEP–DOE. The highest total biases come from the reanalyses other than MERRA. SH fluxes over ocean are generally much smaller in magnitude, so the total biases are generally smaller as well (from −1.4 W m−2 for HOAPS to 6.0 W m−2 for NCEP–NCAR). For wind stress, the lowest total bias comes from NCEP–NCAR (−7.6 × 10−3 N m−2), while GSSTF2b has the highest bias (7.9 × 10−3 N m−2).
Also shown in Table 1 are the SDEs based on the daily means. HOAPS has the highest SDE for LH and SH fluxes (50.3 and 26.8 W m−2, respectively). The lowest LH flux SDEs come from MERRA, ERA-40, and ERA-Interim, while GSSTF2, J-OFURO, and HOAPS have higher SDEs than the other products.
The all-cruise biases and SDEs are dominated by lower wind speed data points, so the biases and SDEs for just CATCH and FASTEX are also shown in parentheses in Table 1. The SDEs from these two cruises alone are higher than for all the cruises combined, consistent with the higher scatter seen in Figs. 3–5. Most products have biases that are higher in magnitude for CATCH/FASTEX.
The CATCH/FASTEX biases in Table 1 provide a hint as to the effect of wind speed on the product fluxes. To get a better understanding of this, the total biases for each product averaged for 1 m s−1 bins are shown in Figs. 6a–c. The product biases are generally close to 0 in the moderate wind speed range but diverge at very low wind speeds (<1 m s−1) and high wind speeds (≥9 m s−1). These regimes also have a low number of total observations (Fig. 2a), and there are higher standard deviations in the observed fluxes in the high wind speed bins (not shown).
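The bin averaging used for these figures can be sketched as below. This is a minimal illustration, assuming each daily-mean point is assigned to a bin by the floor of its wind speed divided by the bin width; the function name and sample values are hypothetical.

```python
import numpy as np

def binned_mean_bias(bin_variable, f_prod, f_obs, bin_width=1.0):
    """Mean product-minus-observed error per bin.

    Returns {bin lower edge: (mean error, number of points)}.
    """
    bin_variable = np.asarray(bin_variable, dtype=float)
    error = np.asarray(f_prod, dtype=float) - np.asarray(f_obs, dtype=float)
    bin_index = np.floor(bin_variable / bin_width).astype(int)
    result = {}
    for b in np.unique(bin_index):
        in_bin = bin_index == b
        result[float(b * bin_width)] = (float(error[in_bin].mean()),
                                        int(in_bin.sum()))
    return result

# Hypothetical wind speeds (m s-1) and LH fluxes (W m-2).
binned = binned_mean_bias(
    bin_variable=[0.5, 4.2, 4.8, 10.3],
    f_prod=[12.0, 105.0, 98.0, 260.0],
    f_obs=[10.0, 100.0, 100.0, 230.0],
)
```

Returning the point count with each mean makes it easy to see which bins are sparsely sampled. The same helper covers SST bins by passing SST as `bin_variable` with `bin_width=2.0`.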
Another way to differentiate between data from the various cruises is by SST. Tropical SSTs are warm with most temperatures > 26°C, while the midlatitude cruises have cold SSTs with most SSTs lying between 12° and 22°C (Fig. 2b). Figures 7a–c present the total product biases averaged over 2°C bins. The total biases in SH flux and wind stress from the products are typically around 0 except for SSTs < 18°C (Figs. 7b,c). These low SSTs correspond to the midlatitude cruises of CATCH and FASTEX as well as some other midlatitude and subtropical cruises (Fig. 2b). In contrast, the total biases in product LH flux do not show a strong dependence on SST (Fig. 7a). As was seen for high wind speeds, the scatter in the observed fluxes is higher for SSTs of <18°C (not shown).
b. Assessment of the general performance of the data products
Based upon the scores obtained from Eq. (6), we put the products into categories of performance for each of the fluxes. For the fluxes that all products have (LH and SH flux), there are three categories: A for the four “best performing” or lowest overall scores (SF), C for the three “worst performing” or highest SF, and B for the other four. Because wind stress is only provided by eight of the products, there are only two categories: A and B. This is presented in Table 2, in which the products are listed in alphabetical order per category.
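The category assignment can be sketched as follows. This is a minimal illustration, assuming a mapping from product name to overall score (lower is better); the function name, the placeholder product names, and the scores are hypothetical.

```python
def assign_categories(scores, n_best=4, n_worst=3):
    """Map product name -> 'A', 'B', or 'C' from overall scores (lower is better)."""
    ranked = sorted(scores, key=scores.get)   # product names, best first
    categories = {}
    for position, name in enumerate(ranked):
        if position < n_best:
            categories[name] = "A"
        elif position >= len(ranked) - n_worst:
            categories[name] = "C"
        else:
            categories[name] = "B"
    return categories

# Eleven placeholder products with hypothetical scores 1.0 to 11.0.
example_scores = {f"P{i}": float(i) for i in range(1, 12)}
categories = assign_categories(example_scores)
```

Setting `n_worst=0` gives the two-category case (A and B only) described for wind stress; the number of products placed in category A there is left as a parameter because the text does not state it.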
Of the reanalyses, MERRA is in category A for all three fluxes. ERA-Interim falls into category A for two fluxes (LH flux and wind stress), while ERA-40 is also in category A for LH flux. NCEP–DOE is in category A for two fluxes (SH flux and wind stress).
The satellite-derived products generally do fairly well, falling into either category A or B for LH and SH fluxes. For instance, GSSTF2b is in category A for both LH and SH fluxes. Conversely, GSSTF2 LH and SH fluxes are in category C, and HOAPS and J-OFURO have category C LH and SH fluxes, respectively.
c. Evaluation of the bulk variables
To understand the total biases in Figs. 3–7 and Table 1 and the general levels of performance in Table 2, the contributions to the biases are examined by first looking at the accuracy of the bulk variables (wind speed, near-surface air temperature, SST, and near-surface air humidity) that are used as input into the bulk algorithms.
First, Fig. 8a compares the biases in 2-m specific humidity from all of the products for each cruise. The all-cruise biases in this quantity are quite low (from −0.5 to 0.9 g kg−1) because this quantity is overestimated in some cruises and underestimated in others (Fig. 8a). During some cruises, the difference between the product and observed specific humidity can be quite substantial. For instance, six products (NCEP–NCAR, NCEP–DOE, GSSTF2, GSSTF2b, J-OFURO, and OAFlux) have biases > +1 g kg−1 during TIWE (Fig. 8a). OAFlux’s overestimation at this location arises because it draws on NCEP–NCAR and NCEP–DOE and on only one of the ECMWF reanalyses (ERA-40), which has a better representation of specific humidity here. SCOPE is problematic for the products that derive the near-surface specific humidity only from satellite data (J-OFURO and HOAPS); both of these products underestimate this quantity in excess of ~1.8 g kg−1. The location of this cruise is within ~70 km of the Southern California coast, so the proximity to the coast may have an effect here.
All-cruise biases are higher (−2.1 to 1.0 m s−1) for 10-m wind speed. Seven products (NCEP–DOE, CFSR, GSSTF2, GSSTF2b, J-OFURO, HOAPS, and OAFlux) generally overestimate wind speed for most cruises, with the highest overestimations by the satellite-derived products (Fig. 8b). For FETCH, the reanalyses other than NCEP–DOE plus OAFlux underestimate wind speed in excess of 1 m s−1.
The biases in 2-m air temperature can be quite high as well (Fig. 8c). The reanalyses all overestimate this quantity in excess of 1°C for SCOPE. These products have an inexplicable warming during the latter half of the cruise that is not seen in the observations until the very last day (not shown). J-OFURO and OAFlux also use the air temperatures from reanalyses, so they also overestimate air temperature in excess of 1°C during SCOPE (Fig. 8c). Another problematic cruise for 2-m air temperature in some products is FETCH. Particularly, NCEP–NCAR and NCEP–DOE underestimate this quantity by on average ~2.4° and ~1.8°C, respectively. Since GSSTF2, GSSTF2b, and J-OFURO also utilize these temperatures, they also underestimate this quantity substantially. The underestimation is not as great in OAFlux because it also ingests ERA-40 temperatures, which have a small overestimation during this cruise.
SST biases are generally within 1°C except for a few instances (Fig. 8d). During SCOPE, ERA-40, NCEP–NCAR, and NCEP–DOE have lower SSTs in excess of 1°C. During FETCH, NCEP–NCAR and NCEP–DOE also underestimate SSTs by more than 1°C. Since GSSTF2 and GSSTF2b use the SSTs from NCEP–NCAR and NCEP–DOE, respectively, their SSTs are also much lower than observed during this cruise.
Figure 9 shows the SDEs of the product bulk variables. The satellite-derived 2-m specific humidities in GSSTF2, GSSTF2b, J-OFURO, and HOAPS have slightly higher SDEs than those from the reanalyses and OAFlux (Fig. 9a). Wind speed SDEs, especially from the products that use satellite-derived quantities, are usually higher during CATCH, FASTEX, FETCH, and Moorings, which all experienced some higher wind speeds at higher latitudes (Fig. 9b). SDEs for 2-m air temperature are generally small (<1°C) except for a few instances (Fig. 9c). SST SDEs are generally even smaller (<0.5°C) except during PACS Flux 99; CATCH; FASTEX; and Moorings for all but GSSTF2, J-OFURO, and OAFlux (Fig. 9d).
d. Evaluation of the uncertainties
The combined effect of the bulk variable biases seen in the last subsection is expressed in the bulk-variable-caused uncertainties given in the third column in Table 1. For LH and SH fluxes, the all-cruise mean bulk-variable-caused uncertainties range from very high negative values to more modest positive values (from −13.4 W m−2 from NCEP–NCAR to 8.2 W m−2 from CFSR for LH flux and from −14.3 W m−2 from NCEP–DOE to 5.2 W m−2 from GSSTF2 for SH flux). The lowest mean wind stress bulk-variable-caused uncertainty is from NCEP–NCAR (−3.4 × 10−3 N m−2), while the highest values are from GSSTF2 and GSSTF2b (1.5 × 10−2 and 1.7 × 10−2 N m−2, respectively).
How much these contribute to the total biases can be assessed by comparing them with the residual uncertainties in the fourth column of Table 1. The larger contributor to the total bias depends on the product. In some, the residual uncertainty contributes more to the total bias, while others have the bulk-variable-caused uncertainty contributing more. For LH flux, all products except MERRA, J-OFURO, and HOAPS have the most uncertainty coming from the residual uncertainty. In contrast, the total SH flux biases of nine products (MERRA, ERA-40, ERA-Interim, CFSR, GSSTF2, GSSTF2b, J-OFURO, HOAPS, and OAFlux) are predominantly composed of the bulk-variable-caused uncertainties. For wind stress, five products (ERA-40, ERA-Interim, CFSR, GSSTF2, and GSSTF2b) have the bulk-variable-caused uncertainties contributing the most, while the residual uncertainties predominate in the total biases from MERRA, NCEP–NCAR, and NCEP–DOE.
Even though the residual uncertainties are composed of contributions from the algorithm and measurement uncertainties, they can still be used to get some understanding of the algorithm uncertainties. For instance, we can see a lessening of the residual uncertainty in CFSR compared to NCEP–NCAR and NCEP–DOE because of the use of the Zeng et al. (1998) roughness lengths for heat and moisture in CFSR. Also, the LH flux residual uncertainties from the satellite-derived and combined products are all lower than those from the ECMWF and NCEP reanalyses, and the SH flux residual uncertainties from these products are lower than those from the NCEP reanalyses. J-OFURO, HOAPS, and OAFlux use the COARE 3.0 algorithm, which Brunke et al. (2003) found to be one of the least problematic overall.
The values for just CATCH and FASTEX are shown in parentheses in the third and fourth columns of Table 1. Different from the tropical- and subtropical-dominated all-cruise means, the reanalysis means during CATCH/FASTEX are usually dominated by the residual means, whereas the satellite-derived means (including OAFlux) are usually dominated by the bulk-variable-caused uncertainties.
An idea of whether these uncertainties are systematic or cruise dependent can be gleaned from the middle and bottom rows of Figs. 6, 7. Similar to the total biases, the spread in the mean bulk-variable-caused and residual uncertainties is greatest at high wind speeds (>10 m s−1), but the spread is higher in the bulk-variable-caused uncertainties in this regime (Figs. 6d–i). Also, the NCEP products have the highest residual uncertainties overall for wind speeds > 2 m s−1 (Figs. 6g–i), but the use of the Zeng et al. (1998) roughness lengths in CFSR does drastically improve these uncertainties for LH flux, making them much closer to the others at ~0 (Fig. 6g).
The middle and bottom rows of Fig. 7 present the mean uncertainties for the 2°C SST bins. The spread in the total biases of Figs. 7a–c is predominantly due to that of the bulk variable uncertainties in Figs. 7d–f. Also, the NCEP products’ residual uncertainties are systematically larger than those of the other products for all SSTs, so this is not a regional or regime-specific bias.
e. The effects of temporal resolution
So far, we have not looked at the effects of temporal resolution on the data products. The most recent reanalyses (MERRA, CFSR, and ERA-Interim) have increased temporal resolution (less than 6 hourly). Table 3 presents the total biases and SDEs of various temporal means of the reanalysis LH fluxes using data from all of the cruises. The biases vary for each temporal resolution within at most ~4 W m−2, so there is no systematic lessening or amplifying of the product biases with temporal resolution in LH flux nor in SH flux and wind stress (not shown). In contrast, the SDE generally decreases as the averaging period increases. This is also the case for SH flux and wind stress (not shown). Thus, any “diurnal cycle” in the reanalysis errors is smoothed out in the daily means that have been compared so far.
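The different behavior of bias and SDE under temporal averaging can be illustrated with a synthetic error series. This is a sketch under stated assumptions: SDE is taken to be the (population) standard deviation of the errors, and the hourly error series is a made-up constant offset plus a diurnal cycle, not data from any reanalysis.

```python
import numpy as np

def bias_and_sde(error):
    """Mean error and standard deviation of the errors."""
    error = np.asarray(error, dtype=float)
    return float(error.mean()), float(error.std())

def block_average(series, block):
    """Average consecutive groups of `block` samples, dropping any incomplete tail."""
    series = np.asarray(series, dtype=float)
    n = (len(series) // block) * block
    return series[:n].reshape(-1, block).mean(axis=1)

# Hypothetical hourly flux errors (W m-2): constant offset plus a diurnal cycle.
hours = np.arange(48)
hourly_error = 5.0 + 10.0 * np.sin(2.0 * np.pi * hours / 24.0)

# Bias and SDE for hourly, 6-hourly, and daily means.
results = {block: bias_and_sde(block_average(hourly_error, block))
           for block in (1, 6, 24)}
```

In this synthetic case the bias stays at the constant offset for every averaging period, while the SDE shrinks as the averaging period lengthens and vanishes for daily means, because the diurnal error averages out over a full day.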
This smoothing may have an effect on the rankings in Table 1. The choice of the number of lowest scores in category A and the highest scores in category B is somewhat arbitrary. Any small change in the score might bump a product from one category to another. Thus, the relative positions of the reanalysis rankings are tested by rescoring the six reanalyses based upon their 6-hourly means instead of their daily means. The relative positions of the reanalysis rankings are only affected for wind stress, where ERA-Interim is improved over ERA-40.
Figure 10 compares the 3-hourly mean bulk variables and fluxes from the reanalyses with the highest temporal resolution (MERRA, CFSR, and ERA-Interim) with observations over the course of KWAJEX, during which the ship was fairly stationary at ~8°22′N, 167°44′E. This cruise was chosen because the diurnal cycle would be resolved by the ship measurements without it being due to the movement of the ship. Because of the stationarity, more of the differences between the reanalysis values and the observed values would be due to the comparison between point measurements and grid average values than in most of the other cruises in which the ship was underway. However, some bias could be due to either a mean overall bias or a temporal one. For instance, observed specific humidity is fairly uniform across the day at ~19 g kg−1. The diurnal cycles in the reanalyses are also small, with MERRA having humidities slightly above observed but within one standard deviation of the observed mean humidity; CFSR and ERA-Interim have humidities that are ~1 and ~1.25 g kg−1 lower, respectively, generally falling more than one standard deviation below the observed mean (Fig. 10a). Also, the diurnal cycle of 2-m air temperature in MERRA and CFSR is fairly consistent with observations, while temperatures in ERA-Interim are ~1°C lower, outside one standard deviation from the observed mean (Fig. 10b).
Conversely, the SST diurnal cycle in MERRA and ERA-Interim is virtually nonexistent compared to the slight one of ~0.6°C observed (Fig. 10c). CFSR with its coupling to an ocean model has a slight diurnal cycle in SST; however, since the uppermost layer is 10 m thick, it is unable to resolve the surface skin temperature diurnal cycle (Brunke et al. 2008). Near-surface wind speeds in MERRA are fairly consistent with the observed means (Fig. 10d), with the root-mean-square error (RMSE) of the eight 3-hourly averages being 0.2 m s−1. In contrast, the RMSEs of CFSR and ERA-Interim are 0.6 and 0.3 m s−1, respectively.
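The RMSE of the eight 3-hourly averages can be sketched as follows. This is a minimal illustration, assuming hourly series that are composited by hour of day into 3-hour bins before the product and observed composites are differenced; the function names and the synthetic wind speeds are hypothetical.

```python
import numpy as np

def diurnal_composite(hour_utc, values, bin_hours=3):
    """Mean value in each time-of-day bin (eight bins for 3-hourly averages)."""
    hour_utc = np.asarray(hour_utc)
    values = np.asarray(values, dtype=float)
    bins = (hour_utc % 24) // bin_hours
    return np.array([values[bins == b].mean() for b in range(24 // bin_hours)])

def composite_rmse(hour_utc, product, observed, bin_hours=3):
    """RMSE between the product and observed diurnal composites."""
    diff = (diurnal_composite(hour_utc, product, bin_hours)
            - diurnal_composite(hour_utc, observed, bin_hours))
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical hourly 10-m wind speeds (m s-1) over two days.
hour_utc = np.arange(48)
observed = 6.0 + np.sin(2.0 * np.pi * hour_utc / 24.0)
product = observed + 0.2   # a constant offset, for illustration
rmse = composite_rmse(hour_utc, product, observed)
```

With a constant offset the RMSE equals the offset; a phase error in the diurnal cycle would instead produce differences that vary from bin to bin.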
The combined effect of these bulk variable errors on the diurnal cycle of fluxes can be seen in the bottom two rows of Fig. 10. The underestimated specific humidities in CFSR and ERA-Interim produce overestimated LH fluxes. The poor diurnal cycle in wind speed in CFSR is also reflected in the poor diurnal cycle in its LH flux, with a maximum at 1930 UTC as opposed to the 0430 UTC maximum observed (Fig. 10e). SH flux in ERA-Interim is much too high because of its lower 2-m temperatures. The reanalysis SH fluxes steadily increase over the course of the day because of increasing air–sea temperature differences due to the small diurnal cycle in SST (Fig. 10f). The diurnal cycle of wind stress closely follows that of the 10-m wind speed, and the three reanalyses overestimate wind stress most of the time (Fig. 10g).
6. Discussion and conclusions
Intercomparison of the ocean surface turbulent fluxes in six reanalyses (MERRA, ERA-40, ERA-Interim, NCEP–NCAR, NCEP–DOE, and CFSR), four satellite-derived products (GSSTF2, GSSTF2b, J-OFURO, and HOAPS), and one combined product (OAFlux) reveals that the product flux biases are a combination of two uncertainties: bulk variable caused and residual [Eq. (4)]. The residual uncertainties generally dominate for all-cruise LH flux biases, while the bulk-variable-caused uncertainties tend to dominate in most of the all-cruise SH flux and wind stress biases (Table 1).
The bulk-variable-caused uncertainties in fluxes are a combined effect of the errors in the bulk variables used in the products. While 2-m specific humidity errors over all the cruises are small because of equal and opposite regional errors (Fig. 8a), wind speed errors are quite strong from NCEP–DOE, CFSR, and the satellite-derived products (Fig. 8b). SST and 2-m air temperature biases are generally low except for a few cruises, particularly in 2-m temperature during SCOPE for all of the reanalyses that have a warming during the latter half of the cruise, which is not observed (not shown).
For the satellite-derived products, these bulk variable errors are simply produced by inaccuracies in the retrieval. In the reanalyses, there are also measurement errors and model errors that include uncertainties from the physical parameterizations other than the surface flux algorithm and errors in the assimilation of data. These can result in an overall bias, as in the specific humidity underestimations by CFSR and ERA-Interim or air temperature underestimation by ERA-Interim during KWAJEX, or in temporally dependent biases, as in the wind speed biases in CFSR and ERA-Interim during KWAJEX (Fig. 10). The latter can be partially caused by shocks added to the model when the assimilation is performed. A comparison of the hourly bulk variables from MERRA and CFSR during KWAJEX shows that there are periodically unrealistic jumps in the 10-m wind speed at some of the assimilation times in CFSR, whereas there are none in MERRA (not shown). This suggests the need to minimize shocks to the model at assimilation times, as is done in MERRA, through the use of the incremental analysis update (IAU) technique.
The residual uncertainties include contributions from the algorithm uncertainty (i.e., the uncertainty as to the accuracy of the bulk algorithm used to calculate the flux) and measurement uncertainty. Still, a comparison of the residual uncertainties between products can reveal some valuable information about the algorithm uncertainties. For instance, the reduction in the residual uncertainty in all the fluxes from NCEP–NCAR and NCEP–DOE to CFSR shows that the inclusion of the Zeng et al. (1998) heat and moisture roughness lengths has reduced the algorithm uncertainties in the NCEP flux algorithm (Table 1). The residual uncertainties from the satellite-derived and combined products are also lower than that of the ECMWF and NCEP reanalyses for LH flux and that of the NCEP reanalyses for SH flux. Several of these products (J-OFURO, HOAPS, and OAFlux) utilize the COARE 3.0 algorithm, which was found to be one of the least problematic overall (Brunke et al. 2003).
Finally, the ranking of these widely used product fluxes according to Eq. (5) provides an assessment of the best-performing products (Table 2). MERRA is in category A for all three fluxes. Also in category A are ERA-Interim for LH flux and wind stress, GSSTF2b for LH and SH fluxes, ERA-40 for LH flux, OAFlux for SH flux, and NCEP–DOE for wind stress.
These products qualify because they have some of the lowest biases and/or SDEs. For some of these products, their good performance is due to having very low biases that have nearly equal contributions from bulk variable and residual uncertainties. For instance, this is true for all-cruise LH and SH fluxes in GSSTF2b as well as for all-cruise SH fluxes in MERRA (Table 1). In addition, there are still large bulk-variable-caused uncertainties in some of the products that generally increase in high wind speed conditions (Table 1; Fig. 6). Thus, significant improvement is still needed in both algorithms and the retrieval of bulk variables to reduce the total uncertainty in product LH and SH fluxes to within 5 W m−2.
In general, the new generation of flux datasets is as good as or better than the older generations of products. For instance, the new version of GSSTF (i.e., GSSTF2b) represents a significant improvement over GSSTF2 (e.g., Table 1). In contrast, ERA-40 already performed reasonably well, so there is not much improvement in ERA-Interim (e.g., Table 1). The results from CFSR, which is the only reanalysis involving a coupled atmosphere–ocean data assimilation, are reasonable considering that it falls into category B for all three fluxes (Table 1). However, further work may still be needed to realize the full potential of ocean–atmosphere coupled reanalysis.
While the current generation of satellite-derived and combined products (e.g., GSSTF2b and OAFlux) is generally better than the earlier generation of reanalyses (NCEP–NCAR and NCEP–DOE), these products are comparable to or slightly worse than the new generation of reanalyses (particularly MERRA). This is because 1) the data assimilation method has been improved from the first generation to the current generation of reanalyses and 2) more satellite data (including all of those used in the satellite-derived products plus others that were not used) are assimilated in the new generation of reanalyses. Furthermore, the new reanalyses provide higher temporal resolution (3 hourly in ERA-Interim and hourly in MERRA and CFSR) than the previous generation of reanalyses (6 hourly) or the satellite-derived and combined products (daily or 12 hourly).
Another shortcoming of the satellite-derived products is that they generally do not provide radiative fluxes, which are also needed to study the surface energy budget or to force an OGCM [HOAPS does provide the net longwave (LW) flux but not the net shortwave (SW) flux]. Table 4 presents the mean difference in surface downward LW and SW radiative fluxes between the reanalysis values and ship observations from all of the cruises. LW radiation is very well constrained in the reanalyses with CFSR having the lowest bias, but a few reanalyses have high biases (>20 W m−2) in SW radiation. In fact, CFSR has the highest bias of all of the reanalyses for SW radiation. Alternatively, if the satellite-derived fluxes are desired, then one could use them with another satellite radiative flux dataset, such as the International Satellite Cloud Climatology Project (ISCCP; Zhang et al. 2004), as is also included with the OAFlux data.
Another problem facing reanalyses is the inconsistencies that result from changes in the satellite data ingested as newer and improved satellites come online. Such “jumps” can be seen in MERRA, for instance, in Robertson et al. (2011). For example, the moisture corrections change suddenly from drying to moistening over the course of the MERRA assimilation, with most of that change happening in late 1998, when the assimilation of NOAA-15 AMSU-A data began, and to a lesser extent in late 1987 with the incorporation of the SSM/I data. Bosilovich et al. (2011) report that the changes in evaporation due to the change in the analysis moisture increments after the ingestion of AMSU-A depend upon the region. Such changes result in LH flux biases that are more positive for all of the 1999 cruises except for PACS Flux ’99, which is negative (not shown). PACS Flux ’99 corresponds to an area of decreased evaporation after late 1998, and the other cruises correspond to areas of weak or positive change in evaporation afterward (see Bosilovich et al. 2011, their Fig. 12b). Another possible concern is the increasing amount of data ingested from multiple SSM/I satellites, which would affect near-surface wind speed. There is no effective difference in the wind speed biases and SDEs between the cruises except by region (Figs. 8b, 9b), but there were at least two SSM/I satellites ingested into MERRA during all of these cruises (http://gmao.gsfc.nasa.gov/research/merra/IMAGES/MERRA_Satellite_data_streams.jpg). Such biases may be more substantial if any cruises in 1987–90, when there was only one SSM/I satellite, or even pre-SSM/I (before 1987) cruises were used.
Despite these concerns, the current generation of reanalyses (MERRA, ERA-Interim, and CFSR) still holds promise for the future direction in the development of ocean surface turbulent flux datasets because of their high frequency and all of the satellite data that they ingest. Such datasets would be derived by combining the high-frequency bulk variables from the new generation of reanalysis products (MERRA, ERA-Interim, and CFSR) with one of the least problematic bulk algorithms discussed in Brunke et al. (2003) to compute high-frequency fluxes.
This work was supported by NASA under Grant NNX09A021G. Dr. C. Fairall is thanked for providing the ship cruise data and the COARE 3.0 algorithm used here. We also thank Drs. A. Beljaars, H.-L. Pan, and S.-H. Chou for providing their algorithms. Dr. A. Beljaars and two anonymous reviewers are also thanked for their helpful comments. Finally, NCAR is thanked for providing the computing resources to download the reanalysis data used here.
The Product Bulk Algorithms
where k is the von Kármán constant of 0.4; z is the height of the lowest model level above the ground; zom, zot, and zoq are the roughness lengths for momentum, temperature, and humidity, respectively; L is the Obukhov length; and ψm and ψh are stability functions. It is the parameterization of these coefficients that causes most of the uncertainty between algorithms, although there are differences in how other parameters are set that could be substantial in certain circumstances (Zeng et al. 1998; Brunke et al. 2002, 2003). Here, we focus on the differences in the formulation of the exchange coefficients. Other details of the algorithms are explained in the three studies mentioned above.
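The role of the roughness lengths and stability functions in the exchange coefficients can be illustrated with a short sketch. It assumes the standard Monin–Obukhov similarity form, in which each coefficient is k² divided by the product of two stability-corrected logarithmic profiles; this form is an assumption here and may differ in detail from the expressions used by individual products. The function name and the sample roughness lengths are hypothetical, and the stability functions are passed in as already-evaluated numbers (zero for neutral conditions).

```python
import math

VON_KARMAN = 0.4  # von Karman constant k

def exchange_coefficients(z, z_om, z_ot, z_oq, psi_m=0.0, psi_h=0.0):
    """Return (C_D, C_H, C_E) for momentum, heat, and moisture.

    z                : height of the lowest model level (m)
    z_om, z_ot, z_oq : roughness lengths for momentum, temperature, humidity (m)
    psi_m, psi_h     : stability functions evaluated at z/L (0 for neutral)
    """
    log_m = math.log(z / z_om) - psi_m
    log_t = math.log(z / z_ot) - psi_h
    log_q = math.log(z / z_oq) - psi_h
    c_d = VON_KARMAN ** 2 / log_m ** 2
    c_h = VON_KARMAN ** 2 / (log_m * log_t)
    c_e = VON_KARMAN ** 2 / (log_m * log_q)
    return c_d, c_h, c_e

# Hypothetical neutral case at z = 10 m with made-up roughness lengths.
c_d, c_h, c_e = exchange_coefficients(z=10.0, z_om=1e-4, z_ot=1e-5, z_oq=1e-5)
```

Under this form, a smaller roughness length lengthens the logarithmic profile and lowers the corresponding coefficient, which is why differences in the roughness length parameterizations translate directly into differences in the computed fluxes.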
The algorithm in the GEOS-5 AGCM, which is used to produce MERRA, as described in Helfand and Schubert (1995), uses stability functions from Clarke (1970) for stable conditions and from Panofsky and Dutton (1984) for unstable conditions. The roughness length for momentum is a function of the friction velocity u* as follows:
where the coefficients a1, a2, a3, a4, and a5 are taken from Large and Pond (1981) for moderate to large wind speeds, Kondo (1975) for weak wind speeds, and interpolated between the two for wind speeds in between. The roughness lengths for temperature and humidity are
where Re* is the roughness Reynolds number calculated from the friction velocity, roughness length for momentum, and kinematic viscosity of air ν as u*zom/ν.
In the algorithm in the ECMWF operational model (Beljaars 1995) used to produce both ERA-40 and ERA-Interim, the stability functions are taken from Holtslag and de Bruin (1988) for stable conditions and as used in Dyer (1974) for unstable conditions. The roughness lengths are
The algorithm in the NCEP operational model used to produce the NCEP–NCAR and NCEP–DOE reanalyses extends Monin–Obukhov similarity to include empirical profile equations for various stability regimes. The roughness lengths are parameterized as follows:
The algorithm in the CFS used to produce the CFSR is the same as the previous one used for NCEP–NCAR and NCEP–DOE, except that the Zeng et al. (1998) roughness lengths for heat and moisture are used instead as shown:
The bulk algorithm used to produce both versions of GSSTF here was based on Chou (1993) with the addition of the salinity effect to surface saturated humidity and a change to the parameters used for the von Kármán constant (Chou et al. 2003). The stability functions are taken from Large and Pond (1981) for stable conditions and Businger et al. (1971) for unstable conditions. The roughness length for momentum is
while the following roughness lengths for heat and moisture are from Liu et al. (1979):
The coefficients a1, a2, b1, and b2 are taken from Table 1 in Liu et al. (1979).
J-OFURO, HOAPS, and OAFlux utilize the COARE 3.0 algorithm (Fairall et al. 1996, 2003), which was found to be one of the least problematic in Brunke et al. (2003). For unstable conditions, the Grachev et al. (2000) stability functions are used, whereas under stable conditions, the stability functions are taken from Beljaars and Holtslag (1991). The roughness length for momentum is parameterized as
where a varies with wind speed, such that
The roughness lengths for heat and moisture are equal to the smaller of or 1.1 × 10−5 m.
This article is included in the MERRA: Modern Era Retrospective-Analysis for Research and Applications special collection.