## Abstract

Statistical relationships between higher-order moments of probability density functions (PDFs) are used to analyze top-of-atmosphere radiance measurements made by the Atmospheric Infrared Sounder (AIRS) and radiance calculations from the ECMWF Re-Analysis (ERA) and the Modern-Era Retrospective Analysis for Research and Applications (MERRA) over a 10-yr period. The statistical analysis used in this paper has previously been applied to sea surface temperature, and here the authors show that direct satellite radiance observations of atmospheric variability also exhibit stochastic forcing characteristics. The authors have chosen six different AIRS channels based on the sensitivity of their measured radiances to a variety of geophysical properties. In each of these channels, the authors have found evidence of correlated additive and multiplicative (CAM) stochastic forcing. In general, channels sensitive to tropospheric humidity and surface temperature show the strongest evidence of CAM forcing, while those sensitive to stratospheric temperature and ozone exhibit the weakest forcing. Radiance calculations from ERA and MERRA agree well with AIRS measurements in the Gaussian part of the PDFs but show some differences in the tails, indicating that the reanalyses may be missing some extrema there. The CAM forcing is investigated through numerical simulation of simple stochastic differential equations (SDEs). The authors show how measurements agree better with weaker CAM forcing, achieved by reducing the multiplicative forcing or by increasing the spatial correlation of the added noise in the case of an SDE with one spatial dimension. This indicates that atmospheric models could be improved by adjusting nonlinear terms that couple long and short time scales.

## 1. Introduction

The Atmospheric Infrared Sounder (AIRS) is the first of a new generation of low-noise hyperspectral sounders that are now regularly making global twice-daily top-of-atmosphere (TOA) infrared radiance measurements of Earth’s atmosphere for weather prediction (Aumann et al. 2003) and physical retrievals (Susskind et al. 2003). AIRS has been operational since September 2002 and has 2378 spectral channels spanning 3.7–15.4 μm. These channels are sensitive to a wide range of geophysical properties of the atmosphere, including temperature, moisture, amounts of a number of trace gases, aerosols, and water and ice clouds (Aumann et al. 2003). While some channels have failed over time, most of them have been shown to be highly stable with very little uncorrected spectral or radiometric drift (Pan et al. 2015).

Traditionally, radiances measured by infrared instruments have either been directly assimilated into numerical weather prediction (NWP) models or been used as inputs to retrieval schemes (McNally et al. 2006; Susskind et al. 2003). In either case, geophysical fields are determined, but uncertainty estimates are a complicated combination of retrieval errors and measurement errors. While older infrared sounders had a handful of broad channels spanning about 10 cm^{−1}, the new instruments have thousands of high-resolution [≃(0.5–1) cm^{−1}] low-noise channels. This means that one can carefully select channels with narrower weighting functions and that are primarily impacted by atmospheric temperature and one trace gas (Aumann et al. 2003). In this paper, instead of studying the geophysical retrievals, we carefully select channels that are only impacted by one trace gas and study the first 10 yr of AIRS radiances under clear-sky conditions. The particular value of these radiometrically stable datasets in radiance space is their independence from errors in retrievals or model data, which separates them from both retrieved geophysical fields and reanalyses. It is generally agreed that climate signals require longer datasets (at least 25 yr; Leroy et al. 2008), but the existing record is enough to study sources of short-time-scale climate variability. Stochastic forcing (which is the fast change unresolved by a model and approximated as noise, e.g., surface wind fluctuations in an NWP model) and short-period variability by definition take place on time scales shorter than changes of the mean atmospheric state, so the shorter AIRS data record should be sufficient to observe them. This has been demonstrated using ocean models with stochastic forcing over periods of several years in which variations in seasonal cycles are recreated (Frankignoul and Müller 1979).

One recently employed approach for studying atmospheric and ocean variability has been to analyze events in terms of the non-Gaussian shape of the probability density function (PDF) of a dataset (Sura 2011). These events can be modeled stochastically because of the large range of temporal and spatial scales involved in the physical processes occurring in Earth’s atmosphere–ocean system. This may be an appropriate approach to analyzing long-term AIRS radiance observations because they are sensitive to atmospheric properties that evolve on a wide range of scales. For example, ocean surface temperatures are expected to vary relatively gradually, whereas cloud and moisture vary over shorter spatial and temporal scales. Sura and Sardeshmukh (2008) showed that sea surface temperature (SST) PDFs exhibited non-Gaussian behavior and tail extrema because of stochastic forcing from surface winds. Channels sensitive to stratospheric temperature, on the other hand, are unlikely to show as much non-Gaussian behavior because stratospheric forcing occurs on longer time scales (on the order of years; Waugh and Hall 2002), and stochastic forcing generally occurs on shorter time scales. Analyzing atmospheric satellite observations for stochastic forcing helps point to what kinds of measurements can be used to track extrema in the atmosphere, and comparisons with reanalysis data help us to analyze how well models are capturing these events.

In this paper, we choose to include only clear AIRS fields of view because clouds affect radiances in channels sensitive to lower atmospheric temperature, gas constituents, and clouds by significantly reducing the brightness temperature. Thus, PDFs that include clouds will have very long negative tails and are guaranteed to always be highly non-Gaussian. The existence of clouds will result in significant negative skewness and may not represent what we would define as an extremum (e.g., an infrequent atmospheric state that is at least two standard deviations from the mean).

When we directly analyze the clear-sky radiance measurements for carefully chosen channels (i.e., each channel is mainly related to a single physical quantity) rather than first carrying out physical retrievals, the information content is not affected by assumed background fields or retrieval errors. Thus, we are looking at an uncontaminated climate signal. A second goal is to compare the statistical characteristics of AIRS measurements with those derived using a forward model in conjunction with reanalyses from the European Center for Medium Range Forecasts (ECMWF), ERA-Interim (Dee et al. 2011), and also from NASA Goddard’s Modern Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011). This is done by interpolating the analyses to the latitude and longitude for each AIRS observation and using the nearest-in-time reanalysis. We then apply the Stand-Alone AIRS Radiative Transfer Algorithm (SARTA; Strow et al. 2003), which maps meteorological profiles to clear-sky top of the atmosphere radiances, which we convert to the equivalent brightness temperature (BT). We use the NASA Goddard AIRS/Aqua level-1B calibration subset (AIRXBCAL) to discriminate for clear scenes (Li 2008), and we apply the clear-sky forward model to ERA and MERRA profiles (without using reanalysis cloud fields) for these AIRS cloud-free scenes to map the ERA and MERRA profiles to BT. Comparisons between the calculated and observed radiances allow us to diagnose how well ERA and MERRA are capturing some of the extrema. Note that Schreier et al. (2014) have also used AIRS data and ERA reanalyses to analyze PDFs of cloud properties, but these were done using retrieved products rather than the radiance observations.

## 2. Data

The data analyzed in this work are the AIRS radiances from September 2003 through August 2013 along with radiances calculated from ERA and MERRA reanalyses at the same times and locations. We discuss the characteristics of the AIRS channels that we have chosen to analyze as well as the reanalysis fields.

### a. AIRS observations

The AIRS channels used in this paper (along with their spectral wavenumbers and which physical variables they are sensitive to) are channel ID 54 (662 cm^{−1}, upper stratospheric temperature), 359 (754 cm^{−1}, midtropospheric temperature), 1055 (1024 cm^{−1}, stratospheric ozone), 1291 (1231 cm^{−1}, surface temperature), 1475 (1344 cm^{−1}, lower tropospheric humidity), and 1614 [1420 cm^{−1}, upper tropospheric–lower stratospheric (UT–LS) humidity]. Channels 54 and 359 are also sensitive to carbon dioxide amounts, but for this paper, we assume the CO_{2} amounts do not change enough on the 10-yr scale to impact the relationships between moments of the PDFs. We have tested this by increasing CO_{2} in the ERA analyses by 2.2 ppm yr^{−1} and found that there are minimal changes to the skewness–kurtosis and PDF plots (not shown here). We will further describe the spectral characteristics of these channels in the next section.

Radiometric accuracy of the AIRS instrument BT is estimated at about 0.2 K (Aumann et al. 2003). Forward model errors for the AIRS channels used in this paper are determined by comparison with the kCompressed Atmospheric Radiative Transfer Algorithm (kCARTA; DeSouza-Machado et al. 1997), which gives estimated RMS errors that range from 0.024 K for 662 cm^{−1} to 0.101 K for 1024 cm^{−1}.

### b. Spectral sensitivity

In this section, we describe how the BT varies with temperatures and gas constituents for each channel analyzed in detail in this paper. Figure 1 shows an AIRS BT spectrum simulated for an ocean scene using the US Standard Atmosphere with CO_{2} adjusted to 385 ppmv (see McClatchey et al. 1972). The black curve shows the simulated TOA radiance, converted to equivalent BT in kelvins. For convenience, we offset that curve by subtracting 250 K. The 800–1200-cm^{−1} region measures BTs of about 288 K, which would be close to the surface temperature (reduced by emissivity being slightly below unity and some column water absorption). The strong absorption bands are evident: 650–800 cm^{−1} (CO_{2} absorption), 1000 cm^{−1} (ozone), and 1300–1650 cm^{−1} (water vapor). Not shown in the plot are AIRS channels measuring radiances in the 2100–2800 cm^{−1} range.

Jacobians are calculated analytically for each layer, multiplied by the layer amounts, so that the units are simply in kelvins. Temperature Jacobians are simply the analytic Jacobians, *d*(BT)/*dT*. The blue curve shows how the simulated BT spectrum (black curve) would change if column ozone were increased. Note that it peaks in the 1000–1100-cm^{−1} region, and we chose AIRS channel 1055 (1024 cm^{−1}) as a proxy for ozone. The red curve shows how the simulations would change if water vapor (H_{2}O) were increased. Note there is a small lowering effect in the window region, but the dominant effect is in the water vapor band (1300–1600 cm^{−1}), where we have chosen AIRS channels 1475 (1344 cm^{−1}, lower tropospheric humidity) and 1614 (1420 cm^{−1}, UT–LS humidity).

The green and magenta curves show how the simulations would change if the stratospheric and tropospheric temperatures were to increase by 1 K (we have multiplied the effect by 10 for clarity). One sees the effect is a net increase—the stratospheric temperature channels are in the 650–700-cm^{−1} region, the tropospheric temperature sounding channels are roughly in the 700–800-cm^{−1} region, while the boundary layer channels are in the window region.

Finally, the cyan curve (multiplied by 10) shows how the simulations would increase if the surface temperature were to increase by 1 K. Though not clearly seen in the figure, AIRS channel 1291 (1231 cm^{−1}) is mostly sensitive to the surface and only weakly to the water vapor column.

What is evident from examining these curves, constructed using only a small subset of trace gas constituents that AIRS is sensitive to, is that for any one AIRS channel, there can be much cross contamination from other constituents and/or temperatures. This clarifies why we need to carefully choose the channels we wish to use for our studies. An additional example (though not plotted here) is the change if CO_{2} were to vary, which would also affect the 650–800-cm^{−1} region.

### c. Height sensitivity

Here we describe how varying the constituents changes the simulated BT as a function of height. Figure 2 shows three such panels, with the Jacobians shown with the height on the vertical axis [in units of pressure (mb)] and the BT sensitivity in K on the horizontal axis.

Figure 2a shows the sensitivity to a 1-K change in temperature. The 662-cm^{−1} channel shows a strong dependence on stratospheric temperature and no sensitivity to the lower atmosphere, while the 754-cm^{−1} channel shows much reduced sensitivity in the upper atmosphere but more sensitivity in the lower troposphere. Figure 2b shows sensitivity to ozone amounts using the 1024-cm^{−1} channel. As in the other plots, it is quite wide but has a peak above 100 mb in the stratosphere. The change in each layer is proportional to the ozone amount in that layer. Finally, Fig. 2c shows sensitivity to water vapor amounts for the 1344-cm^{−1} (sensitive to lower tropospheric humidity) and the 1420-cm^{−1} (UT–LS humidity) channels. The change in each layer is proportional to the water vapor amount in that layer.

The Jacobians in Fig. 2 span many pressure layers, and the numbers on the horizontal axis are quite small; Typical AIRS channel noise is on the same order, about 0.1–0.2 K. For any one spectrum, these combine to produce uncertainties on the order of 1 K–10% humidity for any one layer during a geophysical retrieval. However, since we are using an ensemble of thousands of radiance observations (unaffected by geophysical retrievals) over 10 yr per (latitude–longitude) grid box, we can make definitive statements about the moments of the PDFs, especially as the AIRS channels used here were carefully selected to minimize contamination from multiple gas constituents. The specific numbers shown in these spectral and height Jacobian plots would change if other geophysical profiles are used, but the underlying principles would remain the same.

### d. Reanalysis geophysical fields

ERA uses a spectral model with a 256 × 512 horizontal grid and 60 model levels, and a four-dimensional variational (4D-Var) assimilation algorithm with 12-h assimilation windows. MERRA employs a finite-volume atmospheric model (Lin and Rood 1997) and the Gridpoint Statistical Interpolation analysis system (Wu et al. 2002) with a 180 × 540 horizontal grid and 72 model levels. Both of these systems assimilate radiances from a wide range of satellites, including AIRS and other hyperspectral instruments. They also assimilate a variety of ground- and aircraft-based in situ observations, and error statistics are generated from observation minus forecast comparisons with radiosonde data. A detailed discussion of these comparisons can be found in both Dee et al. (2011) and Rienecker et al. (2011). Our purpose here is to present an alternative way to compare the geophysical fields from reanalyses with AIRS directly in radiance space using non-Gaussian statistical analysis.

At any location, the state of the atmosphere depends on external forcing variables such as season (solar input); internal variables such as CO_{2} concentration, clouds, and aerosols; and boundary conditions like proximity to land and ocean masses and local orography. In general, atmospheric temperature is seen to be relatively uniform over larger areas when compared with moisture, so we expect less variability in channels that are primarily sensitive to stratospheric temperature (Laing and Evans 2011); see for example the plots shown in Tian et al. (2013). Ozone variability is larger in the troposphere though there is much more in the stratosphere. Sea surface temperature (since we are using observations over oceans only) has already been shown by Sura (2011) to have significant stochastic forcing, and clouds are not important here because of our restriction to clear-sky scenes. Finally, tropospheric humidity is highly variable (particularly in the lower troposphere; Laing and Evans 2011), so there is also a good possibility of seeing stronger stochastic forcing in channels sensitive to moisture. This forcing is the result of the complex hydrological cycle that involves precipitation, evaporation, and transport of moisture and clouds.

Given observational and modeled data, an initial state can be assimilated into an NWP model and the future state of the atmosphere forecast by stepping the conservation equations forward in time. Even if one is interested only in hourly forecasts over a limited spatial domain (vs global forecasts over a week or so), not all the microphysics, such as cloud formation and turbulence, can be captured on all spatial and temporal scales. The subgrid processes that are often included using parameterization implies that approximations made in the numerical models lead to reduced accuracy or variability of the forecasts (Gottwald et al. 2016). This could be improved via stochastic modeling of the observed variations (Sura and Hannachi 2015). Direct observations of geophysical variables are impossible at all locations and times, and one avenue of progress could be through the understanding of the statistics and variability of millions of daily TOA radiance observations, which are directly related to geophysical fields, as explained in the previous section.

## 3. Methodology

In this section, we discuss the SARTA forward model used to calculate radiances from ERA and MERRA along with the stochastic analysis techniques used here.

### a. SARTA

SARTA (Strow et al. 2003) used in this paper numerically solves the equation of radiative transfer (Liou 2002; Goody and Yung 1989) by dividing the atmosphere into 97 slabs between the surface (1013 hPa) and top of the atmosphere (TOA; 0.005 mb, roughly 80 km). The slabs are of 0.25-km thickness at the surface, about 0.35 km thick at 15 km and about 3 km thick at TOA. The version of SARTA used in this paper only performs simulated radiative transfer for the 2378 AIRS finite width (rather than monochromatic) channels using specified temperature and trace gas profiles, as it has been optimized for that instrument.

As shown in section 2, the AIRS instrument channels are sensitive to ozone, water vapor, and carbon dioxide in the atmosphere. The optical depths computed by SARTA use line parameters from the high-resolution transmission molecular absorption (HITRAN) 2008 database (Rothman et al. 2009) with an appropriate line shape [Voigt for most gases, line mixing for CO_{2}, and specialized line shapes together with a slight modification of the MT_CKD1.0 (Mlawer et al. 2012) continuum for WV]. SARTA has been extensively validated (Strow et al. 2006). Since AIRS spans 650–2780 cm^{−1} (or 3.6 to 15.5 μm), the fundamental and hot vibrational–rotational bands that the AIRS channels are sensitive to are the bending CO_{2} ν_{2} 15-μm band (roughly spanning 640–800 cm^{−1}), the ozone ν_{1} 10-um band (1000–1100 cm^{−1}), the WV ν_{2} 6.7 μm band (1300–1700 cm^{−1}), and the stretching CO_{2} ν_{3} 4-μm band (2200–2400 cm^{−1}; Strow and Reuter 1988; Mohan 1979; Toth 1991).

The peak of a channel’s weighting function depends on the total optical depth . When it is low, the surface term dominates over the emission term; that channel is then mostly sensitive to surface temperature, emissivity, and to a far lesser extent, column water optical depth. The broad region spanning 800–1300 cm^{−1} (excluding 1000–1100 cm^{−1}) referred to as the thermal infrared window is mostly sensitive only to the surface.

Conversely, if we choose a channel where the total optical depth is much larger than 1, then the transmission from surface to TOA becomes negligible, as does the contribution from the lower atmosphere—the dominant contribution comes from the middle and upper atmosphere. Spectral regions exhibiting this behavior are the bands mentioned above. Weighting functions are further discussed in Liou (2002) and Goody and Yung (1989).

The total optical depth for a particular layer and channel *τ*_{n}(ν) depends on the layer temperature, as well as on the optical depth of the constituent gases. While varying the layer temperature will, in general, vary the optical depth, typically only one or a handful of gases impact any one AIRS channel. For example, in the 640–800-cm^{−1} region, carbon dioxide is the main absorber, and changing its concentration will noticeably raise or lower the outgoing radiances. Changing ozone and water vapor amounts would also show a change in radiances in this spectral region but on a much smaller scale. Conversely, varying methane concentrations would not have any impact in this region but instead would impact other regions such as the 1305-cm^{−1} region.

### b. Stochastic forcing and non-Gaussian PDFs

Stochastic analysis involves the division of the physical process into fast (i.e., rapidly decorrelating) and slow (i.e., slowly decorrelating) modes. The slow modes are governed by deterministic damping or forcing, whereas the fast modes are represented as noise (e.g., Sura 2011). Note that without a pronounced spectral gap, this decomposition cannot be made unambiguously, so stochastic models of atmospheric dynamics and phenomena have to be seen as approximations of the real system (Sura and Hannachi 2015). Generally, using the notation of Sura (2011, 2013), we consider the dynamics of an *n*-dimensional system whose state vector **x** is governed by the stochastic differential equation (SDE)

where the slow deterministic processes are described by the vector **A**(**x**) and the product of the matrix (**x**) and the noise vector ** η**. Thus, (

**x**)

**represents the stochastic approximation to the fast processes. The stochastic components**

*η**η*

_{i}are assumed to be independent Gaussian white-noise processes: 〈

*η*

_{i}(

*t*)〉 and 〈

*η*

_{i}(

*t*)

*η*

_{i}(

*t*′)〉 = δ(

*t*−

*t*′), where 〈⋅〉 denotes the time average (or, assuming ergodicity, an ensemble average), and

*δ*is Dirac’s delta function. In general, the time-evolved

**x**in Eq. (1) will have non-Gaussian statistics and is, therefore, well suited to study extrema. In particular, the state-dependent (multiplicative) stochastic forcing (

**x**)

**has been shown to be one possible scenario that leads to non-Gaussian PDFs even for a linear deterministic term**

*η***A**(

**x**); Sura and Sardeshmukh (2008) and Sardeshmukh and Sura (2009) demonstrated that certain types of observed atmospheric and oceanic non-Gaussian statistics (500-hPa geopotential height and 300-hPa relative vorticity are examples) are consistent with linear stochastically forced dynamics with correlated additive and multiplicative (CAM) noise forcing. CAM forcing is defined for a stochastic system where the random forcing has two terms, one that is multiplied by the state vector and the other that is added to the state vector.

In general, the evolution of the PDF *p*(**x**, *t*) of the stochastic process in Eq. (1) is governed by the corresponding Fokker–Planck Eq. (2) (Sura and Sardeshmukh 2008). For many practical time series analysis applications (like the present analysis of AIRS data), the multivariate SDE Eq. (1) is simplified to a univariate system *dx*/*dt* = *A*(*x*) + *B*(*x*)*η*(*t*), resulting in the Fokker–Planck equation (Sura 2011)

where *A*_{eff} is an effective drift (change in the average value of a random process) and is the sum of deterministic and noise-induced drifts. Sura and Barsugli (2002) show how the parameters in Eq. (2) can be estimated from a dataset.

In the univariate linear case with CAM noise [*A*(*x*) = *Ax* and *B*(*x*) *Bx* + *g*, where *A* is now a positive damping rate and *B* and *g* are CAM noise constants] the Fokker–Planck equation [Eq. (2)] has been shown to result in several statistical properties that can be used to test whether a particular dataset contains this kind of forcing. These include a relationship between the excess kurtosis *K* = 〈*x*′〉/σ^{4} − 3 and skewness S = 〈*x*′〉^{3}/σ^{3} for a variable *x*′ (with zero mean and standard deviation *σ*; Sura 2011):

and the existence of what are called power-law tails of the PDFs,

where 2 ≤ *r* ≤ 0 is a small adjustment that results from errors incurred from the reduction of the multivariate SDE Eq. (1) to a scalar equation. Nonzero *r* is the result of nonlocal effects. Equation (4) holds for large values of |*x*′| relative to *σ* (i.e., the tail). These two relationships provide the means to determine whether the variability in clear-sky AIRS radiance measurements for a particular channel could be modeled by assuming CAM stochastic forcing. Both of them are connected to clear-sky extrema in the system. Kurtosis is a measure of the weight of the tails (or extremes) of the PDF, and skewness is the asymmetry of the PDF. Thus, we expect a power-law tail to occur where there is high excess kurtosis (beyond Gaussian) and where a large skewness gives significantly more weight to one side. The power-law tails will therefore more likely be found in channels (and regions) where the excess kurtosis is large and on the side of the PDF that has a relatively fatter and longer tail (due to skewness). Thus, larger kurtosis and skewness means that there should be regions with longer tails and, therefore, more extrema.

As explained by Sura (2011) and Sura and Sardeshmukh (2008), CAM forcing is a necessary condition for these non-Gaussian statistics when the noise itself is Gaussian, and for Gaussian PDFs, both excess kurtosis and skewness are zero. However, in general, we cannot say that nonzero skewness and excess kurtosis are proof that there is multiplicative noise. We can only say that the observed relationship between skewness and kurtosis plus the power-law behavior are both consistent with the CAM model. Note that for the remainder of the paper, we will use the terms kurtosis and excess kurtosis interchangeably, always to mean excess kurtosis.

The focus of this paper is to test whether these relationships apply to the AIRS radiance measurements and what it tells us about the physical processes that have been measured during the first 10 yr of the instrument’s operation (September 2002–August 2012). We will also build two stochastic models to compare with the AIRS data to help demonstrate what forcing characteristics could achieve similar statistical results. Note that Brindley et al. (2015) have shown that radiance measurements from the Infrared Atmospheric Sounding Interferometer (IASI; Clerbaux et al. 2007) are useful for studying spatial scales of interannual variability, while our approach is to employ the PDF tools developed by Sura.

We end this section with a brief explanation of how restricting to clear-sky radiances would impact measured PDFs. Skewness has a special significance to hyperspectral radiance measurements, because the left side of the PDF contains all of the low temperatures in BT space. These indicate cloudy conditions (because cloud-top temperatures usually are much colder than surface temperatures), which cause the PDFs for AIRS channels whose weighting functions peak in the troposphere and at the surface to be skewed toward the left side. If we include cloudy conditions in the dataset, all of the channels will have a large negative skewness everywhere, and the analysis will be corrupted by the multitude of complex cloud conditions. The inclusion of cloudy scenes would therefore guarantee highly non-Gaussian statistics, independent of whether climate variability is really non-Gaussian. Thus, we choose to restrict the observations to clear conditions (as defined by the AIRS cloud filter, which comes from the AIRXBCAL Distributed Active Archive Center dataset), although the cloud filtering is not perfect, and it is possible some colder cloudy scenes are contained in a set of clear measurements. However, if we restrict the analysis further to positive anomalies, then only very nearly cloud-free measurements are included. We also limit the observations used here to those over oceans because of the difficulty in determining clear conditions over land.

### c. Stochastic models

We have constructed two simple stochastic models to help interpret the analysis of AIRS data in terms of stochastic forcing. The first model is a scalar SDE with CAM forcing (Sura 2011). It is a reduction of Eq. (1) to a scalar form, as described in section 3b:

where *x* is a scalar function of time only and *A* is a constant so that the deterministic term is just a damped linear function of the spatial dimension, *y*. The stochastic forcing term from Eq. (1) is here represented by *Exη* + *gη*, where the first term represents the multiplicative forcing, and the second is the additive forcing. Because we are only concerned with overall statistics rather than details of the evolution, we solve the system using the Euler–Maruyama time-stepping scheme (Kloeden and Platen 1992; Higham 2001). The second model is a one-dimensional (1D; in space) system with the same forcing as in Eq. (5), but now we allow the forcing to be spatially correlated (Sura 2011):

The multiplicative noise is now represented by a diagonal matrix, = ** η**, where

**is a vector and is now spatially correlated with the exponential correlation**

*η**ρ*:

where *L*_{c} is the correlation length scale. Spatially correlated noise terms have long been used in data assimilation systems (e.g., Tangborn 2004) and allow for a rapid decay in correlation with distance. For example, we would expect temperature, humidity, or ozone values to have a smaller correlation at greater distances. For simplicity, we keep the constants *A*, *E*, and *g* as scalars, although they could also be defined as matrices. This system is also solved using Euler–Maruyama time stepping. We will present the statistics for some solutions to these two systems in order to show possible explanations for the structure found in AIRS observations.

## 4. Results

### a. AIRS observations compared with reanalysis calculations

In this section, we present the results of analyzing PDFs constructed from 10 yr of AIRS level-1B data in terms of their non-Gaussian behavior. In particular, we plot the skewness versus kurtosis for each of the channels described earlier, along with the PDF for a grid box having relatively large skewness and kurtosis. The results are discussed in terms of the sensitivity of each channel to atmospheric properties. We also demonstrate how some of the results can be compared with the simple stochastic equations of section 3c as a way to further understand the nature of the atmospheric forcing.

The 10 yr of clear-sky, nighttime AIRS data are divided into 4° bins, resulting in a latitude–longitude grid of 45 × 90. The daily mean, skewness, and kurtosis of all the observations for each grid box are calculated. Only grid boxes with more than 500 observations (see the observation counts per grid box in Fig. 3) are included so as to eliminate grid boxes that are not statistically significant (these occurred near the poles, where there were relatively few clear-sky observations). The effect of the annual cycle is reduced by focusing on Northern Hemisphere winter (DJF) measurements over the 10 yr of data. The mean has also been removed from the dataset. The long-term trends for this period have not been removed as they were found to be insignificant relative to the short-term variability, so that removing them had no noticeable impact on the results presented here.

Figures 4–6 show excess kurtosis versus skewness using AIRS observations or model calculations interpolated to the AIRS observation locations. All of the channels show some degree of non-Gaussian behavior, as indicated by the fact that they have nonzero skewness and kurtosis. Further, the skewness-versus-kurtosis plots exhibit the general relationship *K* ≥ (3/2)*S*^{2} − *r* expected from the stochastic CAM model. A positive value for *r* is a result of the reduction of Eq. (1), with spatially correlated random forcing, to a scalar SDE (Sura 2011). That this *K* ≥ (3/2)*S*^{2} relationship is not exact is also an indication that the AIRS channels studied here have only weak CAM forcing. Note that we indicate the fraction of grid boxes with kurtosis values above this (green) line in the caption of each of the figures. Another feature that is common to these channels is that most of the latitude–longitude bins result in skewness and excess kurtosis values that are relatively small (≤1.0). For example, observations sensitive to stratospheric temperature (662 cm^{−1}; Fig. 4) have values of skewness mostly between −1 and 1 and excess kurtosis between −1.0 and 1.0. But a smaller number of latitude–longitude bins show excess kurtosis that exceeds (3/2)*S*^{2} with significantly larger skewness values. Thus, there are some regions where strong CAM noise forcing exists. This result is not surprising given that the stratosphere is fairly quiescent, and the temperature field does not contain significant small-scale structures. These figures also show that the skewness–kurtosis (*S*–*K*) relationships for AIRS observations and ERA and MERRA calculations are quite similar, implying that the analyses capture the higher-order variability of stratospheric temperature up to 1 mb reasonably well.

Overall, the AIRS channels analyzed here exhibit a variety of statistics, which depend on which part of the atmosphere they are sensitive to. Some are weighted more toward negative skewness [754 cm^{−1} (tropospheric temperature), 1024 cm^{−1} (ozone), and 1231 cm^{−1} (surface)], seen in Figs. 4 and 5. Other channels are weighted toward positive skewness [662 cm^{−1} (stratospheric temperature) and 1420 cm^{−1} (UT–LS H_{2}O)], seen in Figs. 5 and 6. BT from 1344 cm^{−1} (lower tropospheric H_{2}O), Fig. 6, is fairly evenly balanced between positive and negative.

The skewness here is due to the combined variability of ozone and temperature within the stratosphere. The kurtosis values in this channel are mostly below the (3/2)*S*^{2} curve, meaning we have not found strongly non-Gaussian hot side tails. In addition, the calculations made by a clear-sky RTA produce PDFs containing the same negative skewness as the AIRS observations, so we can conclude that it does not originate from cloud contamination.

The two channels whose skewness/kurtosis differ the most from each other are stratospheric ozone (1024 cm^{−1}) and surface temperature and water vapor (1231 cm^{−1}), both shown in Fig. 5. The excess kurtosis for the former is mostly below the (3/2)*S*^{2} curve, indicating the weakest CAM forcing of the six channels analyzed. This is unsurprising given that stratospheric ozone has relatively low variability, except in the polar regions, which are mostly not included with these data. In the latter (1231 cm^{−1}), much more of the grid boxes have excess kurtosis above the (3/2)*S*^{2} curve (though still less than half), suggesting the strongest CAM forcing of the six channels. This is in line with the Sura (2011) analysis of sea surface temperature, though this channel is also weakly sensitive to column water.

We can also use these results to point to where changes in the frequency of extrema will be the greatest. When CAM forcing is large, then the nonlinear term in Eq. (5) will be larger, leading to larger changes in the stochastic variables. Thus, for example, we would expect more changes in the occurrence of extrema in surface temperature than in stratospheric ozone. In the next section, we will investigate the structure of these skewness–kurtosis plots further by simulating both scalar and multidimensional SDEs to determine when Eq. (3) holds and what we can learn about climate forcing from them.

It can be helpful to understand the relationship between skewness and kurtosis by plotting maps of the spatial structures of both, for both AIRS observations as well as ERA and MERRA calculations, shown in Fig. 7 (skewness) and Fig. 8 (kurtosis) for the UT–LS band (1420 cm^{−1}). For all the channels analyzed here, skewness for observations and calculations show similar spatial structures. The kurtosis plots indicate that locations for positive skewness match well with high-kurtosis regions, but some of the negative skewness regions do not correlate well with high-kurtosis regions (e.g., north of Australia). This can also be seen in the scatterplot for 1420 cm^{−1} (Figs. 6b,d,f), which shows that the negative skewness side is primarily below the *S*^{2}/2 curve. These maps also point to which grid boxes (e.g., large skewness and/or kurtosis) are most likely to contain non-Gaussian PDFs, extrema, and the resulting power-law tails. Physically, this just means that cold extrema are less likely than hot extrema. And for clear-sky observations, this can have implications for SST, lower tropospheric temperature, and humidity. For example, in this dataset, high-SST extrema are more likely than low-temperature extrema.

We have also plotted the quantity *K* = 1.5*S*^{2} for 1420 cm^{−1} (Fig. 9, observations only), which shows where kurtosis is above the 1.5*S*^{2} curve. It can be seen that this relationship is relatively random globally, and it is difficult to pick out any region where positive values are more likely.

We have searched through the PDFs for each channel and each grid box for regions with power-law tails by plotting every grid box where skewness is greater than 0.2 (so that only positive skewness is considered) and excess kurtosis is greater than (3/2)*S*^{2} (to increase the probability of CAM forcing). In this way, we consider only positive tails so that any possibility of cloud contamination is removed. With these restrictions, we found a number of grid boxes where non-Gaussian tails are seen in the AIRS observations. For example, Fig. 10a shows the PDFs for AIRS in one grid box for 1231 cm^{−1} (sensitive to surface temperature). But these tails do not closely follow a power-law form [Eq. (4)]. This is possibly due to the relatively small number of observations in each grid box (see Fig. 3), so that the numbers are too small to obtain a statistically significant slope in the tails. The calculated clear-sky BT from ERA and MERRA generally resulted in extreme regions with even less clearly defined tails (Fig. 10b,c). This may indicate some differences in extrema in the reanalyses, or it could be an artifact of the relatively small size of the datasets. We have tested this by carrying out a bootstrap (Efron 1979) uncertainty analysis with 1000 iterations applied to the AIRS PDF in Fig. 10a for 1231 cm^{−1}. The result in Fig. 11 shows the mean PDF along with the mean ± the standard deviation from the 1000 bootstrap calculations. This shows that the uncertainty is small for most of the PDF but larger (100%) for the cold tail and (20%) for the hot tail. Since we are focusing on the latter, we can reasonably expect that the hot side non-Gaussian form is robust.

It is also helpful to plot the PDFs in terms of total observation count, so we have also done this for the AIRS observations and ERA and MERRA calculations for the same grid box as shown in Fig. 10. These are shown in Fig. 12 on a linear scale, from which we have found that the total number of observations in the hot tail (beyond 2σ from the PDF maximum) is 64 for the AIRS observations, 83 for ERA calculations, and about 13 for MERRA calculations. Comparing the exact times of these observations, we find that the event times for AIRS and ERA are the same for 59 observations in the tail, while the overlap between AIRS and MERRA is 12. This means there is significant overlap for extrema between AIRS and ERA and between AIRS and MERRA. It is interesting to note that while ERA overestimates (relative to AIRS) the number of hot tail extrema, MERRA significantly underestimates them.

The solution to the problem of relatively large uncertainties requires a larger number of observations. We have found that combining grid boxes together to get larger datasets generally results in more complicated PDFs with multiple peaks. Reducing the uncertainties is therefore best achieved by a larger dataset, namely, the continued operation of AIRS (now more than 14 years) and combining them with observations from IASI and CrIS.

### b. Numerical simulations of SDEs

In this section, we describe the statistical results from numerical experiments of the stochastic systems in Eqs. (5) and (6). Our goal here is to understand what possible types of forcing can result in the plots found for both AIRS observations and calculations from reanalyses. We have carried out a variety of experiments to determine cases where non-Gaussian statistics occur for these systems. We present here a subset of those experiments that used damping parameter *A* = 2.6 and an additive stochastic forcing parameter *g* = 0.85. A nondimensional time step of Δ*t* = 0.1 is used in both systems, while the multidimensional system uses a domain of length *L* = 1 with 551 grid points. The PDFs are computed from an ensemble of 3000 runs from the scalar system, whereas they are computed at each grid point in the vector system. Figure 13 shows the excess kurtosis versus skewness for an ensemble of simulations of the scalar system in Eq. (5), with multiplicative forcing parameter *E* = 1.05 (Fig. 13a) and *E* = 0.1 (Fig. 13b). The former show that when the multiplicative forcing is on the same order as the additive forcing, then we get the expected *S*–*K* relationship in Eq. (3). The latter shows that when, on the other hand, the multiplicative forcing is only about 8% of the additive forcing, then some of the ensemble solutions drop below the (3/2)*S*^{2} curve. So, one possible mechanism for the *S*–*K* relationships found for the AIRS channels is that these observations (and the reanalyses mapped onto the same observation space) have only weak multiplicative stochastic forcing.

Solutions to the stochastic system in Eq. (6) with one spatial dimension are shown in Fig. 14. Both solutions use *E* = 1.1 so that there is significant multiplicative stochastic forcing (relative to additive forcing), while the spatial correlation of the forcing varies from *L*_{c} = 0.0001 (Fig. 14a) to *L*_{c} = 0.05 (Fig. 14b). With essentially no spatial correlation (in the former), the kurtosis–skewness relationship is similar to the scalar case with *E* = 1.1 in that the excess kurtosis is above the (3/2)*S*^{2} curve. When the correlation is increased to include several grid points, excess kurtosis at some of the grid points with large skewness drops below the (3/2)*S*^{2} curve. This behavior, which differs from the scalar case when multiplicative forcing is much weaker, is predicted analytically (Sura 2011).

We can connect these simulations with the skewness-versus-kurtosis plots of Figs. 4–6 considering what kinds of physical processes might lead to these results and how they might cause the kinds of forcing used in the simulations. For example, when the additive and multiplicative forcing in Eqs. (5) and (6) are roughly equal with sufficient magnitude, then the skewness-versus-kurtosis relationship stays entirely above the *K* = (3/2)*S*^{2} curve. When the forcing is not in balance, the kurtosis not only drops below the curve, but the values of both skewness and kurtosis are much smaller. We see from Figs. 4–6 that all of the channels, for both observations and calculations, have points both above and below the curve. And we also see that there is some variation in the maximum values of skewness and kurtosis, with the largest values occurring for 1231 cm^{−1} (surface temperature; Figs. 5b,d,f). Ozone observations and calculations are mostly below the *S*–*K* curve, possibly indicating less balance in the forcing.

The spatial correlation of the forcing is another significant outcome of these experiments. This can occur during large-scale storms or along a front (though these would occur for mostly cloudy scenes). The fact that we have restricted to clear-sky scenes means that there is probably greater horizontal correlation in the observations than would be found in the complete dataset. And as noted earlier, stratospheric processes are more likely to be correlated over longer distances. And because we are using satellite-based measurements that are sensitive to processes over a range of atmospheric levels, spatial correlation of forcing can be expected to be significant. While these similarities cannot be treated as proof that the different types of stochastic forcing occur for the different geophysical fields, the similarities are consistent with the results from the simulated stochastic systems.

## 5. Conclusions

After filtering for clear-sky scenes, we have computed PDFs of BT for 6 channels sensitive to different atmospheric trace gas constituents from 10 yr of AIRS instrument observations, along with those calculated from ERA and MERRA reanalyses using a fast clear-sky radiative transfer model. We have tested for possible CAM forcing by determining whether scatterplots of kurtosis versus skewness follow the relationship *K* ≥ 1.5*S*^{2} − *r*. Scatterplots for both the observations and calculations show evidence that CAM forcing of the physical processes observed by these channels is occurring. Generally, channels sensitive to tropospheric properties show smaller values of *r* in Eq. (3) than do channels sensitive to the stratosphere. Simulations of simple scalar and one-dimensional SDEs show some possible origins of these statistical results, including weak CAM forcing or spatially correlated forcing. Both of these seem plausible, because each channel has simultaneous contributions from a range of atmospheric levels and trace gas profiles, not all of which are likely to be stochastically forced. And because the forcing at different levels of the atmosphere may well be linked through convective motion, we could expect that the forcing would have some degree of vertical spatial correlation. Horizontal spatial correlation could also be enhanced in the data used here because we have limited the observations to clear scenes only, thereby limiting the variety of types of forcing. But the fact that the AIRS 1231 cm^{−1} (surface temperature) channel gives the most similar results to the many previous stochastic forcing studies cited here indicates that much of the differences are because we are simply looking at different physical variables, which have different forcing characteristics. These can all lead to a shift downward in the kurtosis, as represented by the parameter *r* in Eq. (3) because higher-order moments are dominated by the diagonal (or self-correlation) part of covariance. So, if the spatial correlations are longer, then the diagonal term becomes less important (Sardeshmukh and Sura 2009).

The PDF plots show the existence of extreme tails, which is an indication that AIRS is observing the non-Gaussian nature of atmospheric variability over the 10-yr period studied. We have focused on a search for hot side tails (temperature greater than the mean) to avoid any chance of cloud contamination and found that hot side tails occur in channels sensitive to tropospheric humidity and temperature. Ozone and temperature in the stratosphere had shorter non-Gaussian tails (not shown), likely because of the longer length scales which can be more easily resolved. BT calculations from ERA and MERRA reanalyses reproduced some of the tails but not as consistently. This result is particularly interesting as it shows this approach can be used as a way to assess overall how well reanalyses are capturing extrema in the atmosphere. But because of the relatively small size of the current 10-yr dataset from AIRS, a more complete analysis of this will require us to wait for combined satellite datasets. The inclusion of cloudy scenes will also greatly increase the size of the dataset (Wong and Teixeira 2016).

In this work, we have focused on the PDFs over a 10-yr period and have not addressed the issue of changes to them over time. The techniques applied here can help lead to better understanding of extrema and how long-term satellite observation records can be used to this end. Analyzing different AIRS channels shows that the strength of stochastic forcing depends on which part of the atmosphere and what trace gases we are looking at. So, while the fact that AIRS observations are essentially convolutions of different physical properties from the surface to TOA makes the analysis more complicated, it can still lead us to be able to make some conclusions about where and how extrema are occurring.

The AIRS radiance record is the best candidate for climate studies, especially when combined with successor instruments such as IASI and the Cross-Track Infrared Sounder (CrIS; Bloom 2001). Combined, the observational record from these instruments could extend well beyond 20 yr. As there is a demonstrated need by the numerical weather prediction (NWP) community for these instruments, follow-on missions are likely to continue for many years, thereby extending the climate record.

These data can eventually lead us to a better understanding of climate change in a couple of ways. The PDFs calculated here are, in fact, changing in time and are not really stationary. The Fokker–Planck equation describes PDF evolution, and the relevant statistics can be calculated from the AIRS dataset. We can also calculate changes to the PDF tails over the course of the existing data, which will be extended by IASI and CrIS. Extreme events are generally expected to change more rapidly than mean values of atmospheric properties, so focusing on the tail regions should lead to further insight into what parts of the atmosphere are evolving most rapidly. The results here point to which channels (and, therefore, which geophysical variables) are most likely to change as the result of stochastic forcing.

## Acknowledgments

This work was funded by NASA ROSES Grant NNX14AK58G: Climate studies using AIRS, radiative transfer and spectroscopy. We also acknowledge the High Performance Computing Facility (HPCF) at UMBC, where the computations were carried out. Comments from the three reviewers greatly helped to improve the quality of this paper.

## REFERENCES

*Proc. IEEE Int. Geoscience and Remote Sensing Symp.*, Sydney, Australia, Institute of Electrical and Electronics Engineers, 1341–1343, doi:.

*Satellite Remote Sensing of Clouds and the Atmosphere II*, J. D. Haigh, Ed., International Society for Optical Engineering (SPIE Proceedings, Vol. 3220), 156, doi:.

*Atmospheric Radiation: Theoretical Basis.*2nd ed. Oxford University Press, 519 pp.

*Numerical Solution of Stochastic Differential Equations.*Springer, 636 pp., doi:.

*Introduction to Tropical Meteorology*. 2nd ed. University Corporation for Atmospheric Research. [Available online at www.goes-r.gov/users/comet/tropical/textbook_2nd_edition/index.htm.]

*An Introduction to Atmospheric Radiation*. Academic Press, 583 pp.

*Extremes in a Changing Climate: Detection, Analysis and Uncertainty*, A. AghaKouchak et al., Eds., Water Science and Technology Library, Vol. 65, Springer, 181–222.

_{2}band of H

_{2}

^{16}O: Line strengths and transition frequencies

## Footnotes

© 2017 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).