1. Introduction
Earth’s atmosphere is an essential component of the climate system. Improving knowledge of the natural variability and trends in atmospheric temperature is of vital importance for a better understanding of climate change and its causes. Therefore, consistent long-term observational records of essential climate variables (ECVs), such as upper-air temperature, are required for the detection and attribution of climate change and for verifying climate model simulations.
This topic is a research focus of the international climate science community acting through the World Climate Research Programme (WCRP) and of relevance for the Intergovernmental Panel on Climate Change (IPCC). Sustaining global climate data is the dedicated goal of the World Meteorological Organization (WMO) via the implementation of the Global Climate Observing System (GCOS) based on principles for climate monitoring systems and for the generation of climate data records (CDRs) (GCOS 2011, 2016).
In this context, the activity on Atmospheric Temperature Changes and Their Drivers (ATC) is a long-standing activity within the WCRP/Stratospheric–Tropospheric Processes and Their Role in Climate (SPARC) program. The activity has made substantial contributions to assessments of stratospheric temperature trends, based on analyses of observations and model simulations, with regular contributions to the WMO/United Nations Environment Programme (UNEP) Scientific Assessments of Ozone Depletion (Ramaswamy et al. 2001; Shine et al. 2003; Randel et al. 2009; Thompson et al. 2012; Seidel et al. 2016; Maycock et al. 2018).
Observations from meteorological satellites have been an important source of upper-air data for more than 40 years, while radiosonde measurements from weather balloons have been available since the 1950s and earlier. Evaluating long-term temperature changes from these data is challenging, as the instruments were primarily intended for weather observation, whereas climate monitoring requires higher accuracy (e.g., Karl et al. 2006; Trenberth et al. 2013). Uncertainties due to such factors as instrument changes over time, intersatellite offsets, changes in diurnal sampling, and other modifications in the observational network require homogenization and intercalibration procedures for the construction of CDRs.
Substantial efforts have been put into the reconciliation of atmospheric temperature trends from different observational platforms (e.g., Karl et al. 2006; Randel et al. 2009). Homogenized radiosonde data (e.g., Titchner et al. 2009; Haimberger et al. 2008, 2012) and calibrated records from microwave soundings (e.g., Christy et al. 2007; Mears and Wentz 2009a,b; Zou et al. 2009) confirmed tropospheric warming and stratospheric cooling since the mid-twentieth century. In the low-to-midtroposphere, temperature trends from independent observations and models were found to be consistent, but differences remained in the upper troposphere and stratosphere (e.g., Fu et al. 2011; Mitchell et al. 2013; Lott et al. 2013). Independent observational temperature estimates from the Stratospheric Sounding Unit (SSU) showed large discrepancies and also differed from model trends (Thompson et al. 2012).
There has been substantial interest in comparisons of modeled and observed tropospheric temperature trends. Basic theory of moist adiabatic processes predicts larger warming in the tropical free troposphere compared to near the surface, referred to as tropical tropospheric amplification (Stone and Carlson 1979). A number of previous studies have found that most observational datasets appear to show weaker tropical tropospheric amplification for decadal-scale trends, while this remains a robust feature across several generations of model simulations (Santer et al. 2005, 2008; Po-Chedley and Fu 2012a; Santer et al. 2017b). While various homogenization efforts and reprocessing activities have generally reduced the observation–model differences, the trend amplification estimates from radiosondes and most reprocessed microwave sounding satellite products generally remain smaller than those from climate models and moist adiabatic lapse rate considerations (Thorne et al. 2011; Mitchell et al. 2013). Factors to consider when interpreting model–observation differences include possible errors in climate model forcings (Solomon et al. 2011, 2012; Mitchell et al. 2013; Sherwood and Nishant 2015; Santer et al. 2017a), differences between simulated and observed sea surface temperature (SST) trend patterns (Mitchell et al. 2013; Kamae et al. 2015; Tuel 2019), internal variability (Suárez-Gutiérrez et al. 2017; Kamae et al. 2015; Santer et al. 2019), and, for satellite retrievals, effects from broad vertical weighting functions (Santer et al. 2017b).
Despite these uncertainties, it remains difficult to explain the observed pattern of stratospheric and tropospheric temperature change without anthropogenic forcing (Ramaswamy et al. 2006; Santer et al. 2013; Lott et al. 2013). There is only medium to low confidence in the rate of change of tropospheric warming and its vertical structure. Estimates of tropospheric warming rates encompass surface temperature warming rate estimates. There is low confidence in the rate and vertical structure of the stratospheric cooling.
In recent years, substantial efforts have resulted in further improvements to layer-averaged temperatures from microwave sounding unit observations (Po-Chedley and Fu 2012b; Po-Chedley et al. 2015; Mears and Wentz 2016, 2017; Spencer et al. 2017; Zou et al. 2018). Several merged satellite-based datasets have been constructed for providing continuous climate records in the stratosphere from 1979 to the present (McLandress et al. 2015; Zou and Qian 2016; Randel et al. 2016). Revisiting and reprocessing of stratospheric observations (Zou et al. 2014; Nash and Saunders 2015) led to improved consistency of the revised data versions; however, some differences remain (Seidel et al. 2016). Stratospheric temperature trends from the reprocessed observations and from new models of the SPARC Chemistry Climate Model Initiative (CCMI) showed substantial improvement in the agreement between modeled and observed trends, mainly due to updates of the satellite observations. The range of simulated trends was similar to that in the previous generation of models (Maycock et al. 2018; Karpechko et al. 2019).
The work of Maycock et al. (2018) also contributed to the recent Ozone Assessment Report (WMO 2018; Karpechko et al. 2019). Results confirmed a cooling of the stratosphere and an increase in stratospheric cooling with height; this effect is mainly due to increasing greenhouse gases and is modulated by evolving ozone changes. In the upper stratosphere, both greenhouse gases and ozone were found to contribute to the cooling, whereas in the midstratosphere, greenhouse gases are found to be dominant. In the lower stratosphere, ozone depletion was found to be the dominant factor for the cooling until the mid-1990s. Observed stratospheric cooling trends are weaker since around 1998 (Randel et al. 2016; Seidel et al. 2016; Zou and Qian 2016; Randel et al. 2017), reflecting a decline of ozone-depleting substances and the onset of recovery of the ozone layer (e.g., Harris et al. 2015; Solomon et al. 2016, 2017).
Vertical profiles of atmospheric temperature are available from limb-viewing satellite sounders and from ground-based observations, specifically radiosonde and lidar measurements. Reference radiosonde stations have been established over the past decade within the GCOS Reference Upper Air Network (GRUAN), adhering to the GCOS climate monitoring principles (e.g., Seidel et al. 2009; Bodeker et al. 2016). However, such series are still too short for trend retrievals. Gridded radiosonde records (Haimberger et al. 2012) have been updated recently, as well as observations from light detection and ranging (lidar) instruments (e.g., Keckhut et al. 2004; Wing et al. 2018a).
Since 2001, emerging novel satellite-based observations from Global Positioning System (GPS) radio occultation (RO), generically termed Global Navigation Satellite System (GNSS) RO, have become available for atmospheric and climate studies (e.g., Anthes 2011; Steiner et al. 2011, 2020; Ho et al. 2017, 2020) and have been identified as a key component for the GCOS (GCOS 2011). These long-term stable observations provide profile information with high vertical resolution in the upper troposphere and lower stratosphere and are well suited for climate studies (Lackner et al. 2011; Steiner et al. 2009, 2011).
We have deliberately chosen not to include reanalysis datasets in this study. While we acknowledge the high value of these products, we chose to compare observational records that are as independent of each other as possible. Since reanalyses strive to assimilate all available data sources, reanalysis products depend on all those datasets and details of the assimilation systems determine whether a reanalysis draws to one dataset more than to another. Several state-of-the-art reanalyses are currently available or in production (Fujiwara et al. 2017; Hersbach et al. 2020; Simmons et al. 2020).
In this study, we present the latest observational estimates of tropospheric and stratospheric temperature trends based on updated climate records, including novel GNSS RO satellite observations. These estimates include information from gridded radiosonde records and from lidar instruments. We provide an overview of presently available atmospheric observations and recent advances in their development, as well as some of the limitations of the datasets. We discuss variability and trends in both layer-averaged temperatures and vertically resolved data, as well as the associated uncertainties in these results. We also examine the representation of tropical tropospheric amplification in the observations. See the appendix for a list of acronyms used throughout this paper.
2. Observational datasets
We begin with a brief description of the observational data that are used for temperature trend analyses and discuss advantages and limitations of the data records.
a. Satellite-based observations
Instruments flown on polar-orbiting satellites of the National Oceanic and Atmospheric Administration (NOAA) provide the longest-running records of remotely sensed temperatures. These instruments include the Microwave Sounding Unit (MSU), the Advanced Microwave Sounding Unit (AMSU), and the SSU. SSU measurements are available from late 1978 to 2006 and are the only long-term temperature record in the mid–upper stratosphere with global coverage. The MSU instrument provided data from late 1978 until 1998. The follow-up AMSU instrument provides measurements from 1998 to the present. The sensors measure the radiance of Earth in a cross-track geometry and provide information on broad layer-averages of temperature. Information with higher vertical resolution is given by sensors in limb-viewing geometry, which scan the atmosphere in the vertical. Novel data for climate monitoring with long-term stability have been available since 2001 from GNSS radio occultation, which exploits atmospheric refraction.
1) Microwave sounding observations
MSU and AMSU sounders are available from a suite of satellites that partially overlap in time. These passive microwave radiometers measure the radiance of Earth at microwave frequencies. The thermal emission line of oxygen near 50–60 GHz is used for retrieving atmospheric temperature information since oxygen is well mixed in the atmosphere. Measuring at different frequencies near the oxygen absorption line corresponds to weighting functions peaking at different heights, which provide information on bulk temperatures over a typical vertical width of about 10 km.
The MSU instrument had four channels delivering temperature information on four thick atmospheric layers until the NOAA-14 satellite ceased operation in 2005. The AMSU-A instrument began operation in 1998 with 15 channels, sampling more atmospheric layers with better resolution. The MSU data record has been extended to the present by using the AMSU-A channels that most closely match the MSU channels. The combined record, from 1979 to the present, is based on satellites from NOAA TIROS-N through NOAA-19, the National Aeronautics and Space Administration (NASA) Aqua satellite, and the European Meteorological Operational (MetOp) satellite series.
For this study we use MSU-AMSU-A records from three groups: Remote Sensing Systems (RSS, Santa Rosa, California), the Center for Satellite Applications and Research (STAR) of NOAA/National Environmental Satellite, Data, and Information Service (NESDIS, College Park, Maryland), and the University of Alabama (UAH, Huntsville, Alabama). We use the latest versions of MSU-AMSU-A climate data products, which include RSS, version 4.0 (RSS 2019; Mears and Wentz 2009a,b, 2016, 2017); STAR, version 4.1 (NOAA STAR 2019; Zou and Wang 2011); and UAH, version 6.0 (UAH 2019; Spencer et al. 2017). Monthly averaged time series and anomaly time series are available at a resolution of 2.5° × 2.5° in longitude and latitude (Table 1).
Table 1. Overview of observational datasets: version, time period, horizontal format, and references. Datasets in italics are discussed but not used in the analysis.
The records contain layer-average temperatures computed from single channels from near-nadir views for the midtroposphere (TMT; MSU channel 2/AMSU-A channel 5), the upper troposphere (TUT or TTS or TTP; MSU channel 3/AMSU-A channel 7), and the lower stratosphere (TLS; MSU channel 4/AMSU-A channel 9). The contributions for the temperature averages originate from broad layers peaking near 5 km for TMT, near 10 km for TUT, and near 17 km for TLS (Fig. 1).
The TMT and TTS weighting functions extend into the stratosphere and contaminate tropospheric information. To accentuate tropospheric information, a TMT corrected temperature (TMTcorr) can be constructed by subtracting the stratospheric contribution from the TMT channel (Fu et al. 2004; Fu and Johanson 2005; Johanson and Fu 2006; Po-Chedley et al. 2015). We computed TMTcorr by a linear combination of TMT and TLS after Johanson and Fu (2006). Similarly, we computed a TTS corrected temperature (TTScorr) by a linear combination of TTS and TLS: 1.18 × TTS − 0.18 × TLS.
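The stratospheric correction described above is a simple linear combination of two layer-average temperatures. The following minimal sketch illustrates it for TTScorr using the coefficients stated in the text (1.18 and −0.18). The function names and the sample anomaly values are hypothetical and serve only as illustration; the TMTcorr coefficients are not given here and are therefore left as parameters of the generic function.

```python
import numpy as np

def corrected_layer_temperature(t_layer, t_ls, weight_layer, weight_ls):
    """Linear combination of a tropospheric channel and the TLS channel.

    Generic helper (hypothetical name); the weights must be supplied by the caller.
    """
    t_layer = np.asarray(t_layer, dtype=float)
    t_ls = np.asarray(t_ls, dtype=float)
    return weight_layer * t_layer + weight_ls * t_ls

def tts_corrected(tts, tls):
    """TTScorr = 1.18 x TTS - 0.18 x TLS, as stated in the text."""
    return corrected_layer_temperature(tts, tls, 1.18, -0.18)

# Made-up anomaly values in K, for illustration only.
tts = np.array([0.10, 0.20, 0.30])
tls = np.array([-0.50, -0.40, -0.30])
tts_corr = tts_corrected(tts, tls)
```

Because the TLS weight is negative, a cooling lower stratosphere raises the corrected value relative to the uncorrected TTS, which is the intended removal of the stratospheric contribution.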
Additionally, RSS and UAH provide a product for the lower troposphere (TLT) from a weighted average of measurements made at different incidence angles (RSS) or a weighted combination of TMT, TTS, and TLS observations (UAH) to extrapolate MSU channel 2 and AMSU-A channel 5 lower into the lower troposphere, with a peak contribution near 2 km (Mears and Wentz 2017; Spencer et al. 2017). AMSU-only stratospheric temperature datasets are available from mid-1998 to present from single channels (Wang and Zou 2014).
The merging of MSU and AMSU measurements from many different instruments requires a number of adjustments, since inhomogeneities from different sources can result in spurious trends in retrieved temperatures. Instrument changes over time using different channel frequencies introduce differences due to sampling of slightly different atmospheric layers. Sampling errors occur also from sampling at different local times when satellites are in different orbits. Orbital decay over time can cause brightness temperatures from the near-limb views to warm faster relative to those from the near-nadir views (Wentz and Schabel 1998). The calibration of the electric signal conversion to radiances is also a potential error source when it drifts over time. The absolute calibration uncertainty is estimated to be 0.5–1 K. An overview of known errors is given by Zou et al. (2018).
Over time, the three processing groups developed improved algorithms to account for calibration issues and time-varying biases before the measurements are compiled into a long-term temperature record (Christy et al. 2000, 2003; Mears and Wentz 2009a,b; Zou and Wang 2010, 2011). Mears et al. (2011) performed a detailed uncertainty assessment for RSS data. They discussed uncertainty estimates that arise from the methodological choices in accounting for sampling error, diurnal adjustment, and merging procedures. The different methodological approaches by the processing groups lead to differences in climate data records. However, the latest product versions including improved diurnal drift correction based on observations (Po-Chedley et al. 2015) show much better agreement than the earlier versions (e.g., Seidel et al. 2016; Santer et al. 2017b). Our analyses include each of the datasets to provide a measure of uncertainty due to the differing methodologies.
2) Stratospheric Sounding Unit and merged datasets
The SSU is a nadir-sounding instrument that flew on NOAA operational satellites from November 1978 to April 2006. The sensor measured the thermal emission of atmospheric carbon dioxide (CO2) in the infrared absorption line near 15 μm. The measurement made use of the pressure modulation technique by putting a cell of CO2 gas in the instrument’s optical path. By modulating the gas pressure of the CO2 cell, the single CO2 absorption line was split into three channels with their weighting functions peaking at 30 km for channel 1 (SSU1), 35 km for channel 2 (SSU2), and 45 km for channel 3 (SSU3), respectively. Accordingly, the main contributions for layer-average temperatures stem from heights between 20 and 40, between 25 and 45, and between 35 and 55 km, respectively, spanning the whole stratosphere as illustrated in Fig. 1.
The creation of a consistent homogeneous climate data record from SSU is a challenging task due to several issues and limitations inherent in the SSU measurements that require corrections for radiometric, spectroscopic, and tidal differences (Nash and Saunders 2015). Gas leakage from the onboard CO2 cell caused the cell pressure to decrease, which caused weighting functions to peak at different layers over time. Moreover, the weighting functions were sensitive to CO2 changes in the atmosphere (Shine et al. 2008). Orbital drift also caused biases through sampling of the diurnal cycle at different observation times. Detailed descriptions of these issues are provided by Wang et al. (2012), Nash and Saunders (2013, 2015), and Zou et al. (2014).
In recent work, two groups reprocessed all SSU measurements and generated improved CDRs by correcting for CO2 cell pressure changes, satellite orbit drift, changes in atmospheric CO2, and viewing angle differences, and by accounting for the effects of solar diurnal tides in local time sampling. NOAA/STAR provides version STAR SSU v2.0 (NOAA STAR 2019) as monthly means on a 2.5° × 2.5° latitude and longitude grid (Wang et al. 2012; Zou et al. 2014). The Met Office (Exeter, United Kingdom) provides version UKMO SSU v2.0 only as 6-month-average global means (Nash and Saunders 2015), so that an analysis of latitudinal and seasonal variability is not possible for the UKMO dataset. Comparison of these independently derived versions of SSU shows global-mean temperature differences of about 0.5 K, especially from 1979 to 1990 for the upper channels SSU2 and SSU3. A consistency check of SSU data is to take the average of the lowermost channel SSU1 and the uppermost channel SSU3 and subtract the middle channel SSU2; the result should be close to zero. This difference was found to be within 0.2 K for NOAA/STAR data but much larger for UKMO data (see Fig. 7 in Seidel et al. 2016). We therefore use the STAR SSU v2.0 dataset in the current study.
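The SSU consistency check described above can be written in a few lines. This is only a sketch under stated assumptions: the function name and the sample anomaly values are hypothetical, and the 0.2 K value is the range reported in the text for NOAA/STAR data, used here as an illustrative threshold rather than a formal acceptance criterion.

```python
import numpy as np

def ssu_consistency_residual(ssu1, ssu2, ssu3):
    """Average of SSU1 and SSU3 minus SSU2; expected to be close to zero."""
    ssu1, ssu2, ssu3 = (np.asarray(x, dtype=float) for x in (ssu1, ssu2, ssu3))
    return 0.5 * (ssu1 + ssu3) - ssu2

# Made-up global-mean anomalies in K, for illustration only.
ssu1 = np.array([0.0, -0.2, -0.4])
ssu2 = np.array([0.0, -0.3, -0.5])
ssu3 = np.array([0.0, -0.5, -0.8])

residual = ssu_consistency_residual(ssu1, ssu2, ssu3)
within_range = bool(np.all(np.abs(residual) <= 0.2))  # illustrative threshold
```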
Several merged data records have been constructed by extending the SSU data, which ended in April 2006, with satellite-based datasets from nadir or limb sounders. These data have higher vertical resolution and are integrated vertically with SSU weighting functions to provide SSU-equivalent data that are combined with SSU. McLandress et al. (2015) merged SSU and AMSU data by bridging them with Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) measurements (Fischer et al. 2008). That record is only available until 2012. Randel et al. (2016) have provided a merged SSU record by combining SSU with Microwave Limb Sounder (MLS) data on the Aura satellite. MLS measures microwave emission from O2 and delivers temperatures at 10–90-km altitude with a vertical resolution of about 4–7 km over 20–50 km (Schwartz et al. 2008). Randel et al. (2016) also combined SSU with data from the Sounding of the Atmosphere Using Broadband Emission Radiometry (SABER) instrument based on CO2 emissions, with temperatures retrieved over 16–100 km and a vertical resolution of 2 km (Remsberg et al. 2008). Zou and Qian (2016) have derived a merged dataset, STAR SSU-AMSU v3.0, by combining SSU and AMSU measurements with a variational approach for optimally merging the data. In this work, we use those merged SSU records that have been updated to present, that is, STAR SSU-AMSU v3.0 (NOAA STAR 2019; Zou and Qian 2016) and SSU-MLS (Randel et al. 2016).
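The vertical integration with SSU weighting functions mentioned above amounts to a weighted average of a temperature profile. The sketch below illustrates the idea with a hypothetical Gaussian weighting function peaking at 30 km; the actual SSU weighting functions differ in shape and are not reproduced here, and the merging procedures of the cited studies involve further steps.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integration of y over x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def layer_equivalent_temperature(altitude_km, temperature_k, weights):
    """Weighted vertical mean of a temperature profile (normalized by the weights)."""
    temperature_k = np.asarray(temperature_k, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return trapezoid(weights * temperature_k, altitude_km) / trapezoid(weights, altitude_km)

altitude_km = np.linspace(0.0, 60.0, 121)
# Hypothetical weighting function: Gaussian peaking at 30 km, 7 km width.
weights = np.exp(-0.5 * ((altitude_km - 30.0) / 7.0) ** 2)

isothermal = np.full_like(altitude_km, 250.0)
linear = 200.0 + altitude_km  # K, increasing by 1 K per km (illustrative)

t_iso = layer_equivalent_temperature(altitude_km, isothermal, weights)
t_lin = layer_equivalent_temperature(altitude_km, linear, weights)
```

For a weighting function that is symmetric about its peak on this grid, the layer-equivalent temperature of a linear profile equals the profile value at the peak altitude, which offers a quick sanity check of an implementation.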
In addition to MLS, SABER, and MIPAS, there are other limb-viewing satellite instruments that provide temperature observations with relatively high vertical resolution (2–4 km). These include the Atmospheric Chemistry Experiment–Fourier Transform Spectrometer (ACE-FTS) from 2004 to the present (Bernath 2017) and the Global Ozone Monitoring by Occultation of Stars (GOMOS) for 2002–12. For GOMOS, two temperature datasets have been released recently: one for the stratosphere (Sofieva et al. 2019), and another dataset for the upper stratosphere and the mesosphere (Hauchecorne et al. 2019). Exploration of these data for potential use in trend analyses is the subject of future research.
3) GNSS radio occultation observations
Since 2001, a new type of temperature observation from GNSS RO has been available (Anthes 2011). RO is based on the refraction of GNSS radio signals by the atmospheric refractivity field during their propagation to a receiver on a low-Earth orbit satellite. Scanning the atmosphere in limb sounding geometry provides profiles of high vertical resolution of about 100 m in the troposphere and tropopause, and about 1 km in the stratosphere (Kursinski et al. 1997; Gorbunov et al. 2004; Zeng et al. 2019). Horizontally, the resolution is about 1.5 km across ray and ranges from about 60 to 300 km along ray in the lower troposphere to the stratosphere (Melbourne et al. 1994; Kursinski et al. 1997). The uncertainty of individual RO temperature profiles is about 0.7 K near the tropopause, gradually increasing into the stratosphere (Scherllin-Pirscher et al. 2011a,b, 2017). For monthly zonal means, the total uncertainty estimate is smaller than 0.15 K in the upper troposphere and lower stratosphere, and up to 0.6 K at higher latitudes in wintertime (Scherllin-Pirscher et al. 2011a).
As the time delay measurement of the refracted signals is based on precise atomic clocks, this enables long-term stability and traceability to the Système International (SI) unit of the second (Leroy et al. 2006). Therefore, data from different RO missions can be merged to a seamless record without intercalibration or requiring substantial temporal overlap (Foelsche et al. 2011; Angerer et al. 2017). Continuous observations are available from several RO satellite missions. So far, most missions have used GPS signals at wavelengths of 0.19 and 0.24 m in the microwave. At these wavelengths, the signals are not affected by clouds and observations are available in nearly all weather conditions. Bending angles are computed from the refracted signals. At high altitudes, the signal-to-noise ratio of the bending angle decreases (above about 50 km depending on the thermal noise of the receiver) and an initialization of bending angle profiles with background information is performed.
Refractivity is computed from bending angle and is related directly to temperature under dry atmospheric conditions. This is the case in the upper troposphere–lower stratosphere (UTLS) where water vapor is negligible (Kursinski et al. 1997; Scherllin-Pirscher et al. 2011a). In the moist lower-to-middle troposphere, the retrieval of (physical) atmospheric temperature or humidity requires a priori information in order to resolve the wet–dry ambiguity inherent in refractivity (e.g., Kursinski et al. 1995; Kursinski and Gebhardt 2014). In this study, we use RO dry temperatures (without a priori information) above 9 km to avoid the wet–dry ambiguity. For this altitude range, we found that the difference in trends from dry air temperature to trends from a moist retrieval is negligible (not shown).
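The relation between refractivity and dry temperature can be illustrated with the widely used two-term Smith–Weintraub form of microwave refractivity, N = k1 p/T + k2 e/T², with k1 = 77.6 K/hPa and k2 = 3.73 × 10⁵ K²/hPa. This formula and its constants are an assumption of this sketch and are not quoted from the text above. The sketch also takes pressure as given, whereas operational retrievals obtain it by hydrostatic integration; function names and sample values are hypothetical.

```python
K1 = 77.6    # K/hPa, dry term constant (Smith-Weintraub form; assumed here)
K2 = 3.73e5  # K^2/hPa, wet term constant (assumed here)

def refractivity(pressure_hpa, temperature_k, vapor_pressure_hpa=0.0):
    """Two-term microwave refractivity N (in N-units)."""
    dry = K1 * pressure_hpa / temperature_k
    wet = K2 * vapor_pressure_hpa / temperature_k ** 2
    return dry + wet

def dry_temperature(refractivity_n, pressure_hpa):
    """Temperature obtained when the wet term is neglected."""
    return K1 * pressure_hpa / refractivity_n

# Dry air: the physical temperature is recovered exactly.
n_dry = refractivity(200.0, 220.0)
t_roundtrip = dry_temperature(n_dry, 200.0)

# Small residual humidity (illustrative value): dry temperature falls
# slightly below the physical temperature of 220 K.
n_moist = refractivity(200.0, 220.0, vapor_pressure_hpa=0.01)
t_dry_moist = dry_temperature(n_moist, 200.0)
```

Under these assumptions, neglecting moisture biases the dry temperature low, and the bias shrinks as water vapor becomes negligible at higher altitudes.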
The data processing adds structural uncertainty to the data as different processing centers use different background information and methods. For the early RO period, based on the single-satellite CHAMP mission only, this uncertainty increases above 25 km (Ho et al. 2012; Steiner et al. 2013), due to receiver noise and therefore larger impact of the high-altitude bending angle initialization (Leroy et al. 2018). Thus, for climate trend studies, CHAMP is regarded as a limiting factor. In addition, only 150 occultation profiles per day are available from CHAMP. However, uncertainty due to the changing number of observations is reduced by correcting for the sampling error in RO climatological fields (Foelsche et al. 2008). For later missions, based on advanced receivers, data are usable to higher altitudes (Steiner et al. 2020). Overall, structural uncertainty in trends is lowest at 8–25-km altitude globally for all inspected RO variables and different missions (Steiner et al. 2020). Data products comprise individual profiles and gridded fields of bending angle, refractivity, pressure, geopotential height, temperature, and specific humidity. These products have been used in a number of different atmosphere and climate studies (Ho et al. 2010; Anthes 2011; Steiner et al. 2011; Ho et al. 2020).
Comparison of Wegener Center (WEGC) RO data against MSU-AMSU records showed slight differences in TLS trends (Steiner et al. 2007; Ladstädter et al. 2011) while good agreement of Radio Occultation Meteorology Satellite Application Facility (ROM SAF) RO stratospheric trends to Aqua AMSU records was found (Khaykin et al. 2017). Comparisons with collocated radiosondes (detailed in the following section), Vaisala RS90/92 and GRUAN, showed very good agreement with global annual-mean temperature differences of less than 0.2 K. Radiosonde daytime radiation biases were identified at higher altitudes (Ladstädter et al. 2015; Ho et al. 2017). The stability of RO makes it a useful calibration reference for AMSU (Chen and Zou 2014) and radiosondes (Ho et al. 2017; Tradowsky et al. 2018).
In this work, we use RO data over the period 2002–18 from three processing centers: WEGC RO OPS v5.6 of the Wegener Center (Graz, Austria) (EOPAC Team 2019; Angerer et al. 2017), the ROM SAF CDR v1.0 version of ROM SAF [Danish Meteorological Institute (DMI) Copenhagen, Denmark] (ROM SAF 2019; Gleisner et al. 2020), and UCAR/NOAA data [UCAR COSMIC Data Analysis and Archive Center (CDAAC); Boulder, Colorado, and NOAA] (UCAR CDAAC 2019; S.-P. Ho and X. Zhou 2020, unpublished manuscript). An overview of RO data processing and a description of the retrieval steps for each dataset are given in Steiner et al. (2020, their Table 1).
b. Ground-based observations
Ground-based temperature observations are available from radiosonde measurements made with weather balloons and from lidar instruments. Radiosonde measurements extend into the lower stratosphere only, while lidar measurements extend to the mid- and upper stratosphere. Observations are limited to ground stations and have limited coverage in space and time.
1) Radiosonde observations
Reliable radiosonde temperature records commenced in 1958. Observations are made once or twice per day at stations that are mainly located on Northern Hemisphere continents. The weather balloons reach up to about 25 km until they burst. Depending on wind conditions, typical drift distances are a few kilometers in the lower troposphere to about 50 km in the lower stratosphere (Seidel et al. 2011). All radiosonde datasets have limited coverage in the tropics. Different countries use different instrument types, and instrumentation has changed over time (e.g., Thorne et al. 2011). A further problem is that radiosondes are affected by radiation biases during daytime measurements (Sherwood et al. 2005; Ladstädter et al. 2015). To reduce data discontinuities and residual cooling biases in radiosonde-derived CDRs, a number of different adjustment techniques have been developed.
Several centers have produced homogenized radiosonde products using different methods. NOAA’s Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC) (Free et al. 2004) is based on spatial averages of adjusted temperature data (Lanzante et al. 2003) from 1958 to 1995. From 1996 onward, it is based on the Integrated Global Radiosonde Archive (IGRA) station data using a first difference method (Free et al. 2004), and the record is not fully homogenized. RATPAC-A, version 2, data (NOAA NCEI 2019) are provided as zonal yearly anomalies.
The Hadley Centre Atmospheric Temperature dataset (HadAT) from the Met Office (Thorne 2005; McCarthy et al. 2008) uses a larger number of stations; however, it is only available for 1958–2012 and not updated to the present (www.metoffice.gov.uk/hadobs/hadat/). Sherwood et al. (2008) constructed a radiosonde record based on an iterative universal kriging (IUK) method. The record is available only until 2015 and not updated to present (Sherwood and Nishant 2015).
Haimberger et al. (2008) introduced the homogenized Radiosonde Observation using Reanalysis (RAOBCORE) and Radiosonde Innovation Composite Homogenization (RICH) datasets. Break points are determined either by using composites of neighboring observations as reference (RICH) or by comparing to departures from a reanalysis background (RAOBCORE). RICH is independent of the background, but interpolation errors may be large where sampling is sparse such as in the tropics and the Southern Hemisphere. RAOBCORE reduces interpolation errors at the cost of slight background dependence on ERA-Interim (Haimberger et al. 2012). Gridded RAOBCORE and RICH data have been updated to the end of 2018 (Haimberger 2019). The data are provided as monthly means at 10° × 10° resolution.
Radiosonde temperature data are also available from GRUAN radiosonde stations (https://www.gruan.org/; Bodeker et al. 2016). GRUAN is a reference observing network of quality measurements of ECVs to reduce uncertainty in climate monitoring (Seidel et al. 2009; Thorne et al. 2013). As of 2019, GRUAN comprises 26 sites, 12 of which have been certified. We do not use GRUAN data because the records start in 2009 and are too short for reliable trend estimation.
In this study, we use the RICH and RAOBCORE radiosonde records of the University of Vienna. In addition, we also use radiosonde data from the ERA-Interim archive (denoted RS-VAIS), restricting our attention to data from the Vaisala RS80, RS90, RS92, and RS41 radiosondes from 1995 onward. These measurements are known to be of high quality (Steinbrecht et al. 2008; Nash et al. 2011; Ladstädter et al. 2015).
2) Lidar observations
Stratospheric and lower-mesospheric temperature lidar measurements are available at several locations from the Network for the Detection of Atmospheric Composition Change (NDACC). The Rayleigh lidar technique uses molecular backscattering of a pulsed laser beam to derive the vertical profile of atmospheric density. The collected signal is sampled as a function of time, that is, geometric altitude. The intensity of scattered light is directly related to the air density at the backscatter altitude considered. Using a priori temperature information at the top of the profile, temperature can be retrieved with high spatiotemporal resolution from the measured relative density profile (Hauchecorne and Chanin 1980). Accuracy and precision both depend on altitude, and typically range from less than 0.1 K in the stratosphere to 10 K or more at the very top of the profile (80 km or higher). Descriptions of the Rayleigh lidar temperature retrieval and its uncertainty can be found in Hauchecorne and Chanin (1980), Keckhut et al. (2011), Leblanc et al. (2016), and Wing et al. (2018a). Validation studies showed that the accuracy of individual lidar profiles is better than 1 K in the altitude range of 35–65 km (Keckhut et al. 2004). A variety of studies have assessed stratospheric temperature variability and trends from lidar (e.g., Randel et al. 2009; Steinbrecht et al. 2009; Li et al. 2011; Funatsu et al. 2011, 2016). Lidar temperatures for the stratosphere and mesosphere were used as reference data for detecting biases in satellite-based observations from limb sounders (Wing et al. 2018b).
The Observatoire de Haute Provence (OHP) lidar in southern France (43.94°N, 5.71°E) is one of the longest-running lidar stations, commencing measurements in 1979 (Hauchecorne and Keckhut 2019; Keckhut et al. 1993). Further long-term lidar records (NDACC 2019) are available from the Hohenpeissenberg station in Germany (HOH; 47.80°N, 11.02°E) since 1987 (Werner et al. 1983) and from the Jet Propulsion Laboratory (JPL) Table Mountain Facility in California (TMF; 34.4°N, 117.7°W) since 1988 (Leblanc et al. 1998). In addition, we show data available since 1993 from the JPL lidar at the tropical station of Mauna Loa Observatory (MLO) in Hawaii (19.54°N, 155.58°W; Leblanc and McDermid 2001). Under clear-sky conditions, lidar temperature measurements are usually made on 5–20 nights per month at each station. These measurements were averaged to monthly means for each station. The time series for OHP, HOH, TMF, and MLO were analyzed in this study.
3. Trend analysis
For estimation of atmospheric trends from observations, we used global-mean and zonal-mean monthly mean temperature time series of layer-average brightness temperatures and of vertically resolved temperature observations. RO and RS-VAIS zonal-mean fields were corrected to account for their incomplete sampling of the full spatial and temporal variability of the atmosphere (Scherllin-Pirscher et al. 2011a; Ladstädter et al. 2015). The sampling error is estimated from the difference between a field of averaged collocated profiles and a full atmospheric field (Foelsche et al. 2008). The sampling error is subtracted from the gridded climatologies, leaving a small residual sampling error (Scherllin-Pirscher et al. 2011a). The atmospheric fields used in this study for estimating the sampling error were reanalysis fields from ERA5.1 (Simmons et al. 2020) for ROM SAF RO, WEGC RO, and RS-VAIS, and ERA-Interim for UCAR RO. RICH and RAOBCORE gridded fields were not corrected for sampling error.
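As a rough illustration of the sampling-error correction described above, the following Python sketch estimates the sampling error from a reference field and subtracts it from an observed bin mean. The function names and the toy numbers are assumptions made for illustration only; area weighting and the gridding details of the cited studies are omitted.

```python
import numpy as np

def sampling_error(ref_at_profiles, ref_full_field):
    """Mean of the reference field sampled at the profile locations and times
    minus the mean of the complete reference field (same bin, same month)."""
    return float(np.mean(ref_at_profiles) - np.mean(ref_full_field))

def correct_for_sampling(obs_mean, ref_at_profiles, ref_full_field):
    """Subtract the estimated sampling error from an observed bin mean."""
    return obs_mean - sampling_error(ref_at_profiles, ref_full_field)

# Toy example with hypothetical values (K): the profiles happen to sample
# only the warmer half of the bin, so the observed mean is biased warm.
ref_full = [210.0, 212.0, 214.0, 216.0]   # complete reference field
ref_sampled = [214.0, 216.0]              # reference field at profile locations
obs_mean = 215.5                          # mean of the observed profiles
corrected = correct_for_sampling(obs_mean, ref_sampled, ref_full)
```

In this sketch the residual sampling error mentioned in the text corresponds to whatever part of the true sampling bias the reference field fails to reproduce.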
We computed anomaly time series by subtracting the monthly climatology of the common reference period 2002–18 from the absolute time series. Trend estimates were computed for a number of different periods: 1979–2018, 1979–98, 1999–2018, and 2002–18. Trends were computed by applying a linear ordinary least squares fit as well as by multiple regression analysis. Trend uncertainties are expressed as 95% confidence intervals, accounting for lag-1 autocorrelation of the regression residuals. Trends are deemed to be “significantly different from zero” if the confidence interval does not contain the null hypothesis value (zero trend).
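The anomaly and trend computation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the lag-1 autocorrelation is handled with a common effective-sample-size adjustment, the 95% interval uses the normal critical value 1.96 instead of a Student's t value, and all function names are hypothetical.

```python
import numpy as np

def monthly_anomalies(values, years, months, ref_start=2002, ref_end=2018):
    """Subtract the monthly climatology of the reference period."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    months = np.asarray(months)
    in_ref = (years >= ref_start) & (years <= ref_end)
    anomalies = np.empty_like(values)
    for m in range(1, 13):
        sel = months == m
        anomalies[sel] = values[sel] - values[sel & in_ref].mean()
    return anomalies

def ols_trend_ar1(t_decades, y, z_crit=1.96):
    """OLS slope (per decade) and half-width of an approximate 95% interval.

    Assumption: the residual lag-1 autocorrelation r1 reduces the sample
    size to n_eff = n (1 - r1) / (1 + r1); negative r1 is clipped to zero.
    """
    t = np.asarray(t_decades, dtype=float)
    y = np.asarray(y, dtype=float)
    tc = t - t.mean()
    slope = np.sum(tc * (y - y.mean())) / np.sum(tc ** 2)
    resid = (y - y.mean()) - slope * tc
    r1 = np.sum(resid[:-1] * resid[1:]) / np.sum(resid ** 2)
    r1 = min(max(r1, 0.0), 0.99)
    n_eff = t.size * (1.0 - r1) / (1.0 + r1)
    s2 = np.sum(resid ** 2) / (n_eff - 2.0)
    half_width = z_crit * np.sqrt(s2 / np.sum(tc ** 2))
    return float(slope), float(half_width)

def is_significant(slope, half_width):
    """True if the confidence interval excludes a zero trend."""
    return abs(slope) > half_width
```

With this criterion, a trend of −0.59 ± 0.38 K decade−1 counts as significant, whereas −0.39 ± 0.59 K decade−1 does not.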
The multivariate regression model includes a linear trend term and natural variability terms accounting for the solar cycle, El Niño–Southern Oscillation (ENSO), stratospheric volcanic eruptions, and the quasi-biennial oscillation (QBO) (Fig. 2). Commonly used indices describe these terms. Solar variability is represented by the radio emission flux from the sun at a wavelength of 10.7 cm. The period 1979–2018 covers almost four solar cycles. Daily observed solar flux values (Natural Resources Canada 2019) were averaged to monthly means.
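A regression of this form can be sketched in Python with an ordinary least squares solve. The sketch assumes that all proxy series are already monthly, gap-free, and aligned with the temperature anomalies; the function name and the proxy labels in the usage note are hypothetical, and no autocorrelation treatment is included here.

```python
import numpy as np

def fit_multiple_regression(t_decades, y, proxies):
    """Fit y = a + b*t + sum_k c_k * proxy_k by least squares.

    t_decades : time in decades, so that b is a trend per decade
    y         : temperature anomalies
    proxies   : dict mapping a label to a proxy time series
    Returns a dict of coefficients and the regression residuals.
    """
    t = np.asarray(t_decades, dtype=float)
    y = np.asarray(y, dtype=float)
    names = ["intercept", "trend"] + list(proxies)
    columns = [np.ones_like(t), t]
    columns += [np.asarray(proxies[k], dtype=float) for k in proxies]
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    return dict(zip(names, coeffs)), residuals
```

A call might pass `proxies={"solar": f107, "enso": nino34_lagged, "qbo_pc1": pc1, "qbo_pc2": pc2, "aerosol": saod}`, where each array is a placeholder for the corresponding index described in this section. The residuals can then be used to estimate the lag-1 autocorrelation for the trend uncertainty.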
ENSO originates in the tropical Pacific Ocean with warm SSTs during El Niño phases and cold anomalies during La Niña phases. It dominates interannual variability in the troposphere up to the lowermost stratosphere. During an El Niño event, the tropical troposphere warms and the lowermost tropical stratosphere cools (Free and Seidel 2009; Randel et al. 2009). Deviations from the zonal mean are seen as eddy signals in the subtropics (Scherllin-Pirscher et al. 2012). We use the Niño-3.4 SST index as an ENSO proxy. This is the spatially averaged SST in the Niño-3.4 region (5°S–5°N and 170°–120°W). By definition, El Niño or La Niña periods occur if 5-month running means of SST anomalies in this region exceed +0.4 K or −0.4 K, respectively, for at least six months (Trenberth 1997). Our multiple regression relied on version 5 of the Extended Reconstructed Sea Surface Temperature dataset (ERSSTv5; Huang et al. 2017). To account for lags between this measure of ENSO variability and the response of tropospheric temperature, we used a lag of 3 months for the monthly ERSSTv5 (1981–2010 base period) Niño-3.4 index (CPC 2019).
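The event definition and the 3-month lag can be illustrated with a short Python sketch. It assumes a gap-free monthly series of Niño-3.4 SST anomalies; the function names are hypothetical, and the handling of the series edges (left undefined where the 5-month window is incomplete) is a choice made here for simplicity.

```python
import numpy as np

def running_mean(x, window=5):
    """Centered running mean; NaN where the window is incomplete."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    half = window // 2
    for i in range(half, x.size - half):
        out[i] = x[i - half:i + half + 1].mean()
    return out

def enso_phase(sst_anom, threshold=0.4, min_months=6):
    """Return +1 (El Nino), -1 (La Nina), or 0 for each month."""
    smooth = running_mean(sst_anom, window=5)
    phase = np.zeros(smooth.size, dtype=int)
    for sign in (1, -1):
        exceed = sign * smooth > threshold      # NaN compares as False
        start = None
        for i, flag in enumerate(np.append(exceed, False)):
            if flag and start is None:
                start = i
            elif not flag and start is not None:
                if i - start >= min_months:     # run must last long enough
                    phase[start:i] = sign
                start = None
    return phase

def lag_index(index, lag_months=3):
    """Shift an index so that month t carries the value from t - lag."""
    index = np.asarray(index, dtype=float)
    lagged = np.full(index.shape, np.nan)
    lagged[lag_months:] = index[:index.size - lag_months]
    return lagged
```

In a regression, the first `lag_months` entries of the lagged index are undefined and the corresponding months would need to be dropped.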
Tropical stratospheric variability is dominated by the QBO, which has a period of about 28 months. The QBO is characterized by alternating easterly and westerly wind regimes propagating downward to the tropopause at about 1 km per month. This is also seen in the stratospheric temperature structure as positive and negative temperature anomalies of several degrees; anomalies are proportional to the vertical gradient of the zonal winds (Randel et al. 1999; Baldwin et al. 2001). This distinctive thermal structure makes it possible to investigate the QBO with RO temperature anomalies (Wilhelmsen et al. 2018). Here, we use the QBO index of monthly mean zonal winds of the Freie Universität of Berlin (FU Berlin 2019) produced by combining observations of three radiosonde stations: Canton Island, Gan/Maldives, and Singapore (Naujokat 1986). Applying a principal component analysis to the wind profiles over 70–10 hPa, we use the first two orthogonal basis functions, PC1 and PC2, as proxies for the QBO (Wallace et al. 1993).
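The principal component step can be sketched in Python using a singular value decomposition. The sketch assumes a gap-free array of monthly mean zonal winds with one column per pressure level; the function name is hypothetical, and no level weighting or normalization is applied, which may differ from the processing used for the actual index.

```python
import numpy as np

def qbo_principal_components(winds, n_components=2):
    """Leading principal components of a (n_months, n_levels) wind array.

    Returns the PC time series, the corresponding vertical patterns, and
    the fraction of variance explained by each retained component.
    """
    winds = np.asarray(winds, dtype=float)
    centered = winds - winds.mean(axis=0)       # remove the time mean per level
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    pcs = u[:, :n_components] * s[:n_components]
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return pcs, vt[:n_components], explained
```

Because a downward-propagating oscillation can be written as the sum of two patterns in quadrature, the first two components typically capture most of the QBO variance, and the two resulting time series are orthogonal by construction.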
Explosive volcanic eruptions such as El Chichón in 1982, Mount Pinatubo in 1991 (Robock 2000) and also minor volcanic eruptions after 2000 affect short-term temperature trends in the troposphere and stratosphere (Solomon et al. 2011; Stocker et al. 2019). As a proxy for the effects of volcanic eruptions we compute the stratospheric aerosol optical depth over 15–25 km from the monthly mean Global Space-Based Stratospheric Aerosol Climatology (GloSSAC), version 1.0, averaging over the tropics and subtropics (Thomason 2017; Thomason et al. 2018).
In addition, we used observed surface temperature trends to compare with trends in the free atmosphere. We employed the HadCRUT4 dataset for this purpose (Met Office and the Climatic Research Unit, University of East Anglia, United Kingdom; HadCRUT4 2020; Morice et al. 2012).
4. Results
a. Long-term time series and linear trends
Here we present multidecadal time series and linear trends over the 40-yr period 1979–2018 for the stratosphere and the troposphere. Results are from SSU and MSU layer-average temperatures as well as from lidar temperatures. Figure 3 shows near-global-average (85°S–85°N) anomaly time series of stratospheric temperatures for the lower-stratospheric TLS channel from three MSU-AMSU records and for SSU channels in the mid–upper stratosphere from two merged records, SSU-AMSU and SSU-MLS. Stratospheric temperatures show the impact of the major eruptions of El Chichón in 1982 and Mount Pinatubo in 1991. These large warming signals have peak amplitude in the lower stratosphere and last for roughly two years after the eruptions; only minor TLS changes occurred between the two eruptions.
The linear trend over the last four decades shows cooling of the stratosphere. This is also the case if the anomalous years after the major volcanic eruptions are disregarded in the trend computation, which has minimal impact on trend values but substantially reduces the trend uncertainty (not shown). Accounting for the main modes of natural variability by applying multiple regression analysis also reduces the trend uncertainty with only small impact on trend values. The magnitude of the stratospheric cooling trends increases from the lower stratosphere to the upper stratosphere. For results from the STAR group, for example, the least squares linear trends for the period 1979–2018 are −0.25 ± 0.16 K decade−1 for TLS, and −0.56 ± 0.13, −0.62 ± 0.13, and −0.70 ± 0.14 K decade−1 for channels SSU1, SSU2, and SSU3 (respectively; see Fig. 3). The corresponding trends from the multiple regression model, also obtained with STAR data, are −0.17 ± 0.08 K decade−1 for TLS, and −0.50 ± 0.09, −0.58 ± 0.09, and −0.67 ± 0.10 K decade−1 for SSU1, SSU2, and SSU3 (respectively). These results indicate that the long-term trends are robust to the statistical methodology used for fitting trends. The implied trend uncertainty is smaller in the multiple regression analysis, although some degree of collinearity between several of the predictor variables (see Fig. 2) can hamper assessment of trend uncertainty (Santer et al. 2001). All trends are significant at the 95% level, and results from different research groups are reasonably consistent.
The overall decrease in stratospheric temperature is about 1–3 K over the last four decades, but the characteristics change over time and as a function of atmospheric layer. Cooling is larger in the first half of the record. The amplitude of the trends has decreased since the late 1990s, particularly in the lowermost stratosphere. This nonlinear behavior in TLS is due to the decline of stratospheric ozone in the early period (Ramaswamy et al. 2001) and ozone recovery after roughly 1998 resulting from the effectiveness of the Montreal Protocol (WMO 2018). These results are consistent with a number of previous studies (e.g., Randel et al. 2009, 2016; Seidel et al. 2016; Zou and Qian 2016; Randel et al. 2017; Polvani et al. 2017; Solomon et al. 2017; Maycock et al. 2018). Interestingly, since about 2015 the cooling seems to be enhanced, which may be related to the onset of solar minimum conditions.
Lidar time series (Fig. 4) are the only long-term temperature series in the stratosphere suitable for comparison to SSU temperatures. Monthly mean temperature time series from the four selected lidar stations are presented. Equivalent temperatures have been computed from lidar profiles using the SSU3 weighting function from STAR, by sampling and weighting the lidar temperature at the respective height levels (
In general, the variability in lidar temperatures is higher at the more northerly HOH and OHP stations (47.8° and 44°N, respectively) and smaller for the tropical MLO station (19.5°N). Lidar temperature anomalies are well correlated with the SSU3 time series: correlations are 0.76, 0.70, 0.73, and 0.72 for SSU-AMSU versus HOH, OHP, TMF, and MLO (respectively). All linear trends in Fig. 4, both for the lidar data and SSU, were computed for the time periods dictated by the length of individual lidar records. At HOH, the linear trend of −0.39 ± 0.59 K decade−1 for the lidar is smaller than the trend of −0.59 ± 0.38 K decade−1 for SSU-AMSU (Fig. 4a). Although the latter is statistically significant, the trend in HOH lidar data is not significant because of the larger variability of lidar data. At TMF, the lidar trends are larger than SSU trends, specifically due to the differences in the first half of the time series. These differences are likely associated with a warm temperature bias in the first few years of the lidar record. The warm bias was caused by the presence of signal-induced noise in the raw lidar data complicating the extraction of background noise. The pre-1996 temperature data at TMF should therefore be considered with caution. A full reanalysis of these data is currently being undertaken, with the expectation of a more accurate TMF record during these early years. At OHP, lidar trends are of a similar magnitude as SSU-AMSU trends over the time period considered, but these are not statistically significant due to the large variability. Note that data at the end of the OHP time series are not included in this work as they are currently being investigated for biases. For the MLO station, there is very close agreement between the lidar trend of −0.37 ± 0.20 K decade−1 and the SSU-AMSU trend of −0.31 ± 0.12 K decade−1.
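The lidar-equivalent SSU3 temperatures used in this comparison are weighted vertical averages of the lidar profiles. A minimal Python sketch of such an average is given below. The Gaussian weighting function, with its peak altitude and width, is a hypothetical stand-in chosen for illustration and is not the STAR SSU3 weighting function; the function names are likewise assumptions.

```python
import numpy as np

def gaussian_weights(z_km, peak_km=45.0, width_km=8.0):
    """Hypothetical bell-shaped weighting function (illustration only)."""
    z = np.asarray(z_km, dtype=float)
    return np.exp(-0.5 * ((z - peak_km) / width_km) ** 2)

def equivalent_temperature(z_km, temp_k, weights):
    """Weighted vertical mean of a temperature profile.

    Both integrals use the trapezoidal rule on the given altitude grid,
    so the weights are normalized over the altitude range that is covered.
    """
    z = np.asarray(z_km, dtype=float)
    temp = np.asarray(temp_k, dtype=float)
    w = np.asarray(weights, dtype=float)
    dz = np.diff(z)

    def trapezoid(f):
        return np.sum(0.5 * (f[1:] + f[:-1]) * dz)

    return float(trapezoid(w * temp) / trapezoid(w))

# Toy profile (hypothetical): temperature increasing linearly with height
z = np.linspace(20.0, 70.0, 101)
profile = 220.0 + 1.5 * (z - 20.0)
t_equiv = equivalent_temperature(z, profile, gaussian_weights(z))
```

Normalizing by the integral of the weights over the covered altitudes is one simple way to handle profiles that do not span the full weighting function; other treatments are possible.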
Tropospheric temperature anomalies are shown in Fig. 5 based on time series of MSU-AMSU layer-average temperature anomalies. As expected, the interannual variability is strongly correlated with ENSO behavior; for example, positive tropospheric temperature anomalies coincide with El Niño events in 1983, 1997, 2010, and 2016. Statistically significant warming trends are detected over the last four decades in the lower troposphere (TLT), over the total troposphere (TMTcorr), and, to a lesser extent, for TTScorr. The trend is weaker for the unadjusted midtroposphere channel (TMT) because it contains information from cooling of the stratosphere. The upper-troposphere channel TTS (see Fig. 1) also reaches into the lower stratosphere (to about 20 km above Earth’s surface) and therefore integrates over tropospheric warming and stratospheric cooling; this results in a near-zero trend (not shown). Note that the TTS time series of RSS starts in 1987, and is therefore shorter than the TLT, TMT, TMTcorr, and TTS records.
The warming of the troposphere is about 0.6–0.8 K over the last four decades (Fig. 5). The RSS least squares linear trend for TMTcorr (0.19 ± 0.04 K decade−1) is similar to the trend obtained from multiple linear regression (0.16 ± 0.02 K decade−1). Trend values are lowest for the UAH record (see, e.g., Santer et al. 2017a,b).
b. Latitude structure of trends
We used multiple regression to calculate trends as a function of latitude for 10° zonal bands. The latitudinal structure of stratospheric trends (Fig. 6a) shows a consistent picture of cooling over all latitudes that increases with height. Cooling is statistically significant in all four stratospheric layers and at all latitudes, except poleward of ~50°S and 50°N for TLS and at very high latitudes in the Southern Hemisphere for the SSU channels. Trends range from approximately −0.25 K decade−1 in the lower stratosphere (TLS) to −0.5 to −0.7 K decade−1 in the mid–upper stratosphere (SSU1 to SSU3). At northern high latitudes, cooling is up to −1 K decade−1 in the uppermost SSU channel while at southern high latitudes it is weaker. In the lower stratosphere, the latitudinal trend structure is different from the upper stratosphere, with largest cooling at high latitudes in the Southern Hemisphere and smallest cooling over the northern polar region. This structure in TLS trends arises because the strengthening of the Brewer–Dobson circulation (BDC) over 1979–2018 leads to cooling at low latitudes and high-latitude warming. The BDC partly compensates for the radiative cooling associated with high-latitude ozone depletion, especially in the Southern Hemisphere (Fu et al. 2015, 2019). Both ozone and atmospheric circulation changes are important factors in determining the latitudinal pattern of the lower-stratospheric cooling trend (e.g., Solomon et al. 2017; Maycock et al. 2018). In addition, an enhanced lower-stratospheric cooling is seen in the midlatitudes (Fu et al. 2006), which is caused by the poleward shift of subtropical jets associated with tropical expansion (Fu and Lin 2011; Polvani et al. 2017; Maycock et al. 2018).
Tropospheric trends (Fig. 6b) show significant warming over all latitudes from the lower troposphere to the midtroposphere (channels TLT, TMTcorr, TMT), except at southern high latitudes where the trend is near zero. At northern high latitudes, warming trends are largest and reach about 0.3–0.5 K decade−1. TTScorr shows significant trends only in the tropics. Although RSS and UAH have large differences in TLT trends in this region, both products clearly show significant tropical warming. Only channel TTS shows near-zero trends that can be explained by the broad weighting function, which receives contributions from both tropospheric warming and lower-stratospheric cooling (see Fig. 1).
We also calculated zonal-mean trends for the period 2002–18 (Fig. 7), thus facilitating direct comparison with GNSS RO observations. Stratospheric trends (Fig. 7a) show larger uncertainties than were evident for the 1979–2018 period, particularly at high latitudes. This is due to combined effects of the large dynamical variability and the shorter analysis period. In the lowermost stratosphere, the trend over all latitudes is near zero except at southern high latitudes, where it reaches −1 K decade−1. This result is highly dependent on the end points of the short data record, as Antarctic TLS trends beginning in 1998 are positive (Randel et al. 2017), while trends beginning in 2000 or 2002 are negative (Fu et al. 2019; Fig. 7); the results are strongly influenced by the Antarctic stratospheric warming in 2002 (e.g., Newman and Nash 2005). This sensitivity highlights the uncertainty of polar stratospheric trends derived from short data records with arbitrary end points. Note that Antarctic ozone has been recovering since the late 1990s (Solomon et al. 2017), leading to radiative heating within the background of large dynamic variability. In the mid and upper stratosphere, cooling trends are found to be significant over 50°S–50°N. Almost all TLS trends over 2002–18 fail to achieve statistical significance.
Tropospheric trends for the period 2002–18 (Fig. 7b) show a latitudinal structure similar to that found over the full 1979–2018 period, with significant trends throughout most of the tropics and subtropics. At high latitudes, however, trends have large uncertainties and are not significant.
c. Vertically resolved trends
Vertically resolved trends from radiosonde data in Fig. 8 are presented for the period 1979–2018 together with trends from layer-average temperatures from MSU-AMSU and merged SSU records. This provides an overview of upper-air trends from the lower troposphere to the stratopause for near-global averages (70°S–70°N) (Fig. 8a) and for the tropics (20°S–20°N) (Fig. 8b). Overall, the different records show remarkably good agreement. Surface temperature trends are also indicated and are similar to TLT trends.