## 1. Introduction

Long atmospheric reanalyses (Kalnay et al. 1996; Gibson et al. 1997; Kanamitsu et al. 2002) have been widely used for numerous weather and climate variability studies. However, it has been generally considered that it is difficult to use them for estimating climate trends for at least two major reasons (e.g., Kistler et al. 2001). The first and most important is that the observing system is not constant. The models used in the reanalyses are not perfect, and their climatology is different from the real climatology. As a result, the introduction of data from additional observing systems (in the 1950s during the establishment of the rawinsonde-based upper-air observing system, and especially the major addition of satellite observations in 1979) is associated with jumps in the reanalyses. One way to minimize the problem associated with the 1979 major change in observing systems is to average the trends in two separate relatively homogeneous periods in the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis: one for the presatellite era (1948–78) and one for the postsatellite era (1980–present), as done in Kalnay and Cai (2003).

A second issue with the use of reanalysis trends is that some of the models used for long reanalyses have not been modified to reflect the changes in greenhouse gases, such as carbon dioxide, and do not reflect other changes in the atmosphere, such as aerosols of volcanic origin. This is essentially a technical issue. One can include known changes in the external forcings in the model and regenerate the reanalysis [such as the 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40; see http://www.ecmwf.int/research/era), which includes evolving greenhouse gases]. Nevertheless, not all of the changes in the external forcings, such as the change in aerosols (due to both natural and anthropogenic sources), are known. Therefore, it is important to ask, given a perfect observational dataset, whether a reanalysis made with a frozen model can capture the trend in the observational dataset due to a steady increase of external forcing that is present in nature but absent in the model. Intuitively, one may argue that because the reanalysis is a weighted average of the model short-term forecast and the observations, it can only reflect a “watered down” version of the trends present in the observations but absent from the model. In this paper, we show that this “intuitive conventional wisdom” about the watering down of trends in reanalysis is incorrect: a data assimilation system, such as the NCEP–NCAR reanalysis, can essentially capture the full strength of a climate trend caused by an external forcing even if this forcing is absent from the model used in the data assimilation, as long as observations are available frequently enough for twice-daily assimilation. We also show that model errors do not introduce spurious trends as long as the model is kept unchanged.

## 2. A simple analytic model

Suppose that the observed global mean temperature (*T*_{O}) has a constant linear trend *W* caused by a steady increase of greenhouse gas concentration since time *t* = 0. We can express the trend of *T*_{O} between *t* and *t* + *δt* as

$$
\frac{T_O(t+\delta t) - T_O(t)}{\delta t} = W, \tag{1}
$$

where *δt* is the time interval between two adjacent analysis cycles. After *Nδt*, the observed global mean temperature is

$$
T_O(N\delta t) = T_O(0) + WN\delta t. \tag{2}
$$

In other words, the observation time series *T*_{O} is an arithmetic series and (*WNδt*) is the amount of total warming during the period from *t* = 0 to *t* = *Nδt*.

Next, let us use a model that has a fixed greenhouse concentration level taken at *t* = 0 as the model component of the data assimilation system for generating the first-guess temperature, *T*_{F}. For the sake of simplicity, let us further assume that the model is a perfect model in the sense that *T*_{O}(0) is an equilibrium solution of the model with the greenhouse concentration fixed at its *t* = 0 level. In other words, without data assimilation, the model would behave like

$$
T_F(t) = T_O(0). \tag{3}
$$

Let us denote the assimilated temperature by *T*_{A}(*t*). Since the assimilated temperature *T*_{A}(*t*) is different from the model equilibrium temperature *T*_{O}(0), the model integration starting from *T*_{A}(*t*) would be subject to a negative tendency that acts to bring *T*_{A}(*t*) toward the model’s equilibrium state *T*_{O}(0). In other words, because the model integration starts from an initial state that is not in balance with the model physics, the frozen model physics acts to damp out the difference between the analysis and the model equilibrium state. Let us use “*τ*” to denote the model adjustment time scale from an initial state *T*_{A}(*t*) to the equilibrium state *T*_{O}(0). Then the model increment from the previous analysis cycle can be approximately expressed as

$$
T_F(t+\delta t) - T_A(t) = -\frac{\delta t}{\tau}\left[T_A(t) - T_O(0)\right]. \tag{4}
$$

As illustrated in the appendix, (*δt*/*τ*) is on the order of 10^{−2} for an energy balance model. For this simple scalar equation, the analysis is obtained as a weighted average of the model forecast (first guess) and the observations (e.g., Kalnay 2003, 145–148):

$$
T_A(t) = a\,T_F(t) + (1-a)\,T_O(t), \tag{5}
$$

where *a* is the weight assigned to the first-guess field by the data assimilation procedure and (1 − *a*) is the weight assigned to the observation. Next, without losing generality, we can assume that at *t* = 0,

$$
T_A(0) = T_O(0). \tag{6}
$$

It follows that, at *t* = *δt*, 2*δt*, and 3*δt*, we have

$$
\begin{aligned}
T_A(\delta t) &= T_O(0) + (1-a)W\delta t,\\
T_A(2\delta t) &= T_O(0) + (1-a)W\delta t\left\{2 + a\left[1-(\delta t/\tau)\right]\right\},\\
T_A(3\delta t) &= T_O(0) + (1-a)W\delta t\left\{3 + 2a\left[1-(\delta t/\tau)\right] + a^{2}\left[1-(\delta t/\tau)\right]^{2}\right\}.
\end{aligned}
\tag{7}
$$

In deriving (7), we have made use of (2), (4), (5), and (6). Repeating the above procedure for *N* successive “analysis cycles,” we obtain that at *t* = *Nδt*,

$$
T_A(N\delta t) = T_O(0) + (1-a)W\delta t\sum_{j=1}^{N} j\,a^{N-j}\left[1-(\delta t/\tau)\right]^{N-j}. \tag{8}
$$

The time series *T*_{A} in (8) is an arithmetic–geometric series. We note that the time interval between two adjacent analysis cycles (*δt*) of a typical data assimilation system is no longer than 12 h, and the time scale of model adjustment (*τ*) from the assimilated temperature to the model equilibrium temperature is expected to be longer than several days. Therefore, it is reasonable to assume

$$
0 < (\delta t/\tau) < 1. \tag{9}
$$

After carrying out the summation from *j* = 1 to *j* = *N*, (8) becomes

$$
T_A(N\delta t) = T_O(0) + (1-a)W\delta t\left\{\frac{N}{1-a\left[1-(\delta t/\tau)\right]} - \frac{a\left[1-(\delta t/\tau)\right]\left\{1-a^{N}\left[1-(\delta t/\tau)\right]^{N}\right\}}{\left\{1-a\left[1-(\delta t/\tau)\right]\right\}^{2}}\right\}. \tag{10}
$$

Taking the difference between the analyses at step *N* and at step *N* − 1 and dividing by *δt* yields the reanalysis trend at time *t* = *Nδt*:

$$
\frac{T_A(N\delta t) - T_A[(N-1)\delta t]}{\delta t} = \frac{(1-a)\left\{1-a^{N}\left[1-(\delta t/\tau)\right]^{N}\right\}}{1-a\left[1-(\delta t/\tau)\right]}\,W. \tag{11}
$$

After making use of (9) and the fact that 0 < *a* < 1 in a data assimilation system, we obtain the asymptotic solution of (11) for a large value of *N*,

$$
\frac{T_A(N\delta t) - T_A[(N-1)\delta t]}{\delta t} \rightarrow \frac{1-a}{1-a\left[1-(\delta t/\tau)\right]}\,W. \tag{12}
$$

The coefficient in front of *W* in (12) is the ratio between the warming trend in the reanalysis and the observed one. The departure of this coefficient from unity is a measure of the watering down or “reduction” in the trend, equal to

$$
\frac{a\,(\delta t/\tau)}{1-a\left[1-(\delta t/\tau)\right]}.
$$

Figure 1 shows the ratio of the reanalysis trend to the observed trend between two consecutive analysis cycles as a function of the analysis step *N* and the weight assigned to the observation (1 − *a*) for (*δt*/*τ*) = 0.01. For *a* = 0.5, after only 20 steps, the trend in the reanalysis reaches 99% of the observed trend. This is achieved despite the fact that the model component of the data assimilation system does not have the physical processes that produce the trend in the observations. Even when the observations are given low weights, such as (1 − *a*) as low as 0.2, the reanalysis trend exceeds 95% of the observed trend after only 20 analysis cycles. In other words, based on our simple analytical estimate, the trend inferred from reanalysis is virtually identical to the one observed in nature after fewer than 100 analysis cycles. This explains why ERA-15 can capture the Mt. Pinatubo eruption within a few days after the eruption even though the model used in ERA-15 has constant aerosols (Andersen et al. 2001).
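The transient described above can be reproduced numerically. The following is a minimal sketch, not the authors' code: it assumes the first guess relaxes the previous analysis toward the frozen-model equilibrium by a fraction (*δt*/*τ*) per cycle and that the analysis is the weighted average with weight *a* on the first guess. The function names `simulate_analysis` and `trend_ratio`, and the unit values chosen for *W* and *δt*, are illustrative assumptions.

```python
def simulate_analysis(n_steps, a, dt_over_tau, w=1.0, dt=1.0, t_eq=0.0):
    """Iterate the scalar forecast/analysis cycle and return the analysis series.

    Illustrative helper (not from the paper): `a` is the first-guess weight,
    `dt_over_tau` is (dt/tau), `w` is the observed trend, and `t_eq` is the
    frozen-model equilibrium temperature T_O(0).
    """
    analysis = [t_eq]  # the analysis starts at the model equilibrium
    for n in range(1, n_steps + 1):
        obs = t_eq + w * n * dt  # observation with a constant linear trend
        # Frozen model: relax the previous analysis toward the equilibrium.
        first_guess = analysis[-1] - dt_over_tau * (analysis[-1] - t_eq)
        # Analysis: weighted average of first guess and observation.
        analysis.append(a * first_guess + (1.0 - a) * obs)
    return analysis


def trend_ratio(analysis, n, w=1.0, dt=1.0):
    """Ratio of the analysis trend between steps n-1 and n to the observed trend."""
    return (analysis[n] - analysis[n - 1]) / (dt * w)


for a in (0.5, 0.8):
    series = simulate_analysis(100, a, 0.01)
    print(f"a={a}: ratio at N=20 is {trend_ratio(series, 20):.4f}, "
          f"at N=100 is {trend_ratio(series, 100):.4f}")
```

Under these assumptions the ratio at step 20 exceeds 0.99 for *a* = 0.5 and 0.95 for *a* = 0.8, in line with the values quoted for Fig. 1.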

Figure 2 displays the ratio of the reanalysis trend after a sufficiently large number of analysis cycles (e.g., *N* > 100) to the trend in the observations as a function of *a* (the weight assigned to the first-guess field in a reanalysis) and the parameter (*δt*/*τ*). It shows that when the model adjustment time scale from the assimilated temperature to the model equilibrium temperature is much longer than the time interval between two adjacent analysis cycles [say, (*δt*/*τ*) < 10^{−3}], the reanalysis trend is virtually identical to the trend in the observations even for small observational weights. However, if the model adjustment time scale is comparable to the time interval of the analysis cycles, the strength of the reanalysis trend can be severely compromised. This is particularly true if the weight assigned to the observations is small. For example, for (*δt*/*τ*) = 0.1, the weight assigned to the observations has to be larger than 0.6 (or *a* < 0.4) in order to assure that the reanalysis trend is no less than 90% of the observed trend. In the worst scenario, namely, (*δt*/*τ*) = 1, the ratio is exactly equal to (1 − *a*), the weight assigned to the observations. Therefore, when (*δt*/*τ*) is close to unity, the reanalysis trend made with a frozen model would be significantly weaker than the observed trend unless adequate observations are used (so that much larger weights are assigned to observations).
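The dependence shown in Fig. 2 can be tabulated directly. The sketch below assumes that the large-*N* ratio of the reanalysis trend to the observed trend has the form (1 − *a*)/{1 − *a*[1 − (*δt*/*τ*)]}, which is what iterating the first-guess relaxation and the weighted-average analysis of this section yields. The function name `asymptotic_trend_ratio` and the sampled parameter values are illustrative choices, not taken from the figure.

```python
def asymptotic_trend_ratio(a, dt_over_tau):
    """Large-N ratio of reanalysis trend to observed trend.

    Assumed form: (1 - a) / (1 - a * (1 - dt/tau)), where `a` is the
    first-guess weight and `dt_over_tau` is the ratio of the analysis
    interval to the model adjustment time scale.
    """
    return (1.0 - a) / (1.0 - a * (1.0 - dt_over_tau))


first_guess_weights = (0.2, 0.4, 0.6, 0.8)
for x in (1e-3, 1e-2, 1e-1, 1.0):
    row = ", ".join(
        f"a={a}: {asymptotic_trend_ratio(a, x):.3f}" for a in first_guess_weights
    )
    print(f"dt/tau={x:g} -> {row}")
```

For (*δt*/*τ*) = 1 the function returns exactly (1 − *a*), and for (*δt*/*τ*) = 0.1 with *a* = 0.4 it returns 0.9375, consistent with the 90% threshold discussed above.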

As illustrated in the appendix, the parameter (*δt*/*τ*) is on the order of 10^{−2} for a simple energy balance model. According to Fig. 2, the strength of the reanalysis trend can be easily above the 95% level of the observed trend as long as the weight assigned to the observations is larger than 0.2 (or *a* < 0.8). For a general circulation model, it is expected that the adjustment time scale *τ* is longer than that in a simple energy balance model, implying that (*δt*/*τ*) < 10^{−2}. It follows that the reanalysis would be able to reproduce an observational trend very closely with only a small percentage reduction after a short transient (a few tens of analysis cycles).
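As a quick check of the 95% figure, assume the asymptotic trend ratio has the form (1 − *a*)/{1 − *a*[1 − (*δt*/*τ*)]}, the large-*N* limit of the scalar analysis cycle. The borderline case *a* = 0.8 with (*δt*/*τ*) = 0.01 then works out as

$$
\frac{1-a}{1-a\left[1-(\delta t/\tau)\right]} = \frac{0.2}{1-0.8\times 0.99} = \frac{0.2}{0.208} \approx 0.962,
\qquad
\frac{a\,(\delta t/\tau)}{1-a\left[1-(\delta t/\tau)\right]} = \frac{0.008}{0.208} \approx 0.038,
$$

so the reduction in the trend is just under 4%, and it becomes smaller for any larger observational weight.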

Furthermore, it can be easily shown that even the presence of a constant model systematic error would have little impact on the reanalysis trend beyond a similar short transient. In fact, adding a constant model bias *Eδt* in the equation for the first guess, namely (4), results in an extra term

$$
\frac{1-a^{N}\left[1-(\delta t/\tau)\right]^{N}}{1-a\left[1-(\delta t/\tau)\right]}\,aE\delta t
$$

in (10). Again, because 0 < *a* < 1 and 0 < (*δt*/*τ*) < 1, with large *N*, this extra term becomes *aEδt*/{1 − *a*[1 − (*δt*/*τ*)]}, independent of the analysis step. This implies that after a short transient, the trend between two consecutive analysis cycles can still be described by (12) even though the model used in the data assimilation has a constant systematic error.
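The effect of a constant model bias can also be checked by direct iteration. This is a hedged sketch under the same assumptions as the scalar analysis cycle of this section, with a bias *Eδt* added to each first guess; the function name `run_cycles` and the values of *a*, (*δt*/*τ*), and *E* are arbitrary illustrative choices.

```python
def run_cycles(n_steps, a, dt_over_tau, bias=0.0, w=1.0, dt=1.0, t_eq=0.0):
    """Scalar analysis cycle with an optional constant model bias per step.

    Illustrative helper: `bias` plays the role of E, so each first guess
    receives an extra increment bias * dt.
    """
    analysis = [t_eq]
    for n in range(1, n_steps + 1):
        obs = t_eq + w * n * dt  # observation with a constant linear trend
        first_guess = (
            analysis[-1]
            - dt_over_tau * (analysis[-1] - t_eq)  # relaxation to equilibrium
            + bias * dt                            # constant systematic error
        )
        analysis.append(a * first_guess + (1.0 - a) * obs)
    return analysis


a, x, bias = 0.8, 0.01, 0.5
clean = run_cycles(200, a, x)
biased = run_cycles(200, a, x, bias=bias)

offset = biased[-1] - clean[-1]            # constant shift caused by the bias
expected_offset = a * bias / (1.0 - a * (1.0 - x))
trend_clean = clean[-1] - clean[-2]        # trend per cycle without bias
trend_biased = biased[-1] - biased[-2]     # trend per cycle with bias

print(offset, expected_offset)
print(trend_clean, trend_biased)
```

After the transient, the biased run differs from the unbiased one only by a constant offset, so the two per-cycle trends agree.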

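Equation (10), with the bias term included, also behaves as expected at the limiting values of *a*. For example, one can easily verify that for *a* = 0 (which implies that the model forecast is not used in the data assimilation), we have

$$
T_A(N\delta t) = T_O(0) + WN\delta t = T_O(N\delta t).
$$

For *a* = 1 (the observations are not used in data assimilation), (10) becomes

$$
T_A(N\delta t) = T_O(0) + E\tau\left\{1-\left[1-(\delta t/\tau)\right]^{N}\right\},
$$

which reduces to *T*_{O}(0) for an unbiased model (*E* = 0). Therefore, without using observations, the analysis simply relaxes to the model equilibrium state. If we neglect the model adjustment toward the model equilibrium solution (i.e., *τ* → ∞), the model bias *E* would produce a linear bias trend in the analysis when observations are not used (i.e., *a* = 1). This can be verified by taking the limits of *a* → 1 and *τ* → ∞ in (10), leading to

$$
T_A(N\delta t) = T_O(0) + EN\delta t.
$$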

## 3. Concluding remarks

In summary, we have shown that for a simple scalar equation, a long reanalysis can detect a trend present in observations assimilated by the reanalysis, even if the physical processes responsible for the trend are completely absent from the model used to create a first guess, and the first guess suffers from a drift due to the imbalance between the model equilibrium temperature and the assimilated temperature. The trend can be detected nearly at its full strength (at least the 95% level even if the observations are given low weights) after a short transient. Model errors do not affect the reanalysis trend as long as the model used remains constant, except for a similar short transient. The imbalance does contribute a systematic reduction of the trend in the reanalysis compared to the observed trend. The reduction can be constrained by two factors: (i) a relatively large weight assigned to the observations and (ii) the rapidity of the data assimilation cycles compared to the model adjustment time scale. Our estimate based on a simple energy balance model indicates that the reduction in the reanalysis trend is less than 5% as long as the weights assigned to observations are larger than 0.2 and the observations are available for twice-daily data assimilation.

As we stated in the introduction, there are other major issues about using reanalysis for a long-term trend analysis, particularly the impact from the major changes in the global observing system in the 1950s and in 1979. Here we merely prove mathematically that the frozen model used in a reanalysis does not cause meaningful harm to the fidelity of the long-term trend in the reanalysis. Although these results were derived for a single scalar “analysis,” we believe that our analytical proof is still relevant to the more complex data assimilation schemes used in the reanalysis because there exists a similarity between the case of single scalar analysis and the complete 3D multivariate statistical interpolation approach used in the NCEP–NCAR reanalysis and other long reanalyses (e.g., Kalnay 2003, p. 155). Particularly, we wish to point out that our mathematical proof is based on the following two important generic features of a data assimilation system (whether complex or simple): (i) the analysis made by a data assimilation system is essentially a linear combination of observations and the first-guess field and (ii) the first-guess field is obtained by integrating the model starting from the analysis field made at the previous analysis cycle. The rapidity (order of 10 h) in making use of the observations in the reanalysis is one of the fundamental requirements in a model with the frozen physics in order for the reanalysis trend to capture the observed trend faithfully. This also implies that the observational data have to be available at least every 12 h. This requirement is satisfied in the NCEP–NCAR reanalysis after the advent of satellite data in 1979 and over data-rich regions such as the continents of the Northern Hemisphere, since the late 1950s.

By working with a scalar equation, we implicitly assume that the space coverage of the observational data is uniform and the observations are sufficiently adequate (e.g., one observation for one model variable) when applying our theoretical argument to a more complex 3D data assimilation system. Therefore, it remains to be shown whether or not the use of a frozen model may result in a noticeable “watering down” effect over the areas where observations are severely lacking, such as over the Southern Ocean before the advent of satellite data, and whether this lack of information is transported by the model. A definite assessment of the reanalysis trend can only be done by directly comparing the rawinsonde observations and the reanalysis. A nonuniform observational network is equivalent to a sparsely distributed observational network, or an inadequate observational network. We would argue that an inadequate observational network (e.g., a sparse space coverage of the observations versus a high-resolution model grid) is equivalent to effectively assigning a low weight to observations during the data assimilation procedure. According to Fig. 2, a smaller weight to observations in general would not compromise the fidelity of the reanalysis trends, provided that the observations are available at time intervals much shorter than the model adjustment time scale. Our estimate based on a simple energy balance model indicates that the model adjustment time scale is on the order of 500 h, about 100 times longer than the interval at which observations are available or the time interval between two adjacent analysis cycles. According to Fig. 2, the effective weights assigned to observations have to be larger than 0.2 in order to retain the observed trends in the reanalysis (at the 95% level). Given the global coverage of the observations used in the NCEP–NCAR reanalysis, we expect that such a small minimum weight requirement should be satisfied.

We note that even an exactly constant increase of greenhouse gas concentration may result in a long-term trend that varies with time because of the presence of various thermodynamic/dynamic feedback processes. However, because of the extremely short transient (e.g., *N* is of the order of 100), we believe that the reanalysis with a frozen model should be able to capture long-term trends in the observations that vary in time. As a possible extension, our results seem to suggest that the reanalysis could also be used to infer information about cloud feedback as long as the observations influenced by clouds are included in the reanalysis and the duration of the presence of cloud trends exceeds the transient period described before.

This work was supported by a grant from the NASA Seasonal-to-Interseasonal Prediction Project (NASA-NAG-55825). The comments and suggestions from Dr. Francis Zwiers and the two anonymous reviewers are greatly appreciated.

## REFERENCES

Andersen, U., E. Kaas, and P. Alpert, 2001: Using analysis increments to estimate atmospheric heating rates following volcanic eruptions. *Geophys. Res. Lett.*, **28**, 991–994.

Gibson, J. K., P. Kållberg, S. Uppala, A. Nomura, A. Hernandez, and E. Serrano, 1997: ERA description. ECMWF Re-Analysis Project Report Series 1, 74 pp.

Kalnay, E., 2003: *Atmospheric Modeling, Data Assimilation and Predictability*. Cambridge University Press, 341 pp.

Kalnay, E., and M. Cai, 2003: Impact of urbanization and land-use change on climate. *Nature*, **423**, 528–531.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Kanamitsu, M., and Coauthors, 2002: NCEP–DOE AMIP-II reanalysis (R-2). *Bull. Amer. Meteor. Soc.*, **83**, 1631–1643.

Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-year reanalysis: Monthly means CD-ROM and documentation. *Bull. Amer. Meteor. Soc.*, **82**, 247–267.

# APPENDIX

## Adjustment Time Scale of a Global Energy Balance Model

We estimate the model adjustment time scale, *τ*, by considering a zero-dimensional energy balance model for the global atmosphere with mean temperature *T*,

$$
\frac{C_p P_0}{g}\frac{dT}{dt} = R_0 - \sigma T^{4},
$$

where *P*_{0} is the surface pressure, *g* is the gravitational acceleration, *C*_{p} is the air heat capacity at constant pressure, *σ* is the Stefan–Boltzmann constant, and *R*_{0} represents the net radiation absorbed by the atmosphere. The equilibrium temperature of the model, *T*_{O}, can be determined by

$$
\sigma T_O^{4} = R_0.
$$

Therefore, the linear tendency due to imbalance between an assimilated temperature *T*_{A} and the equilibrium temperature can be written as

$$
\frac{C_p P_0}{g}\frac{d(T_A - T_O)}{dt} = -4\sigma T_O^{3}\,(T_A - T_O).
$$

The imbalance results from the lack in the model of the Δ*R* that produces the assimilated temperature *T*_{A} in the observations. It follows that the time scale, *τ*, can be estimated from

$$
\tau = \frac{C_p P_0}{4 g \sigma T_O^{3}}.
$$

A typical analysis cycle step is 6 h. Therefore, we have (*δt*/*τ*) ∼ 0.01. It should be pointed out that in a typical atmospheric general circulation model the time scale of adjustment to the observations due to the lack of updated physics in the model is expected to be longer than the one from the zero-dimensional energy balance model. This would further strengthen the argument that using a frozen model in a reanalysis does not have any major impact on the trend caused by the external forcing that is absent in the model.
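The order of magnitude of *τ* can be checked with a short calculation. This is a rough sketch under stated assumptions: the energy balance model is taken to have the form (*C*_{p}*P*_{0}/*g*) d*T*/d*t* = *R*_{0} − *σT*^{4}, whose linearization about equilibrium gives *τ* = *C*_{p}*P*_{0}/(4*gσT*_{O}^{3}). The numerical values used for *P*_{0}, *g*, *C*_{p}, and *T*_{O} are typical textbook values assumed here for illustration and are not specified in the text; the function name `adjustment_time_scale_hours` is likewise hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def adjustment_time_scale_hours(t_eq, p0=1.0e5, g=9.81, cp=1004.0):
    """Linearized relaxation time tau = cp * p0 / (4 * g * sigma * t_eq**3).

    Assumed inputs (not given in the text): t_eq in K, surface pressure
    p0 in Pa, gravity g in m s^-2, heat capacity cp in J kg^-1 K^-1.
    Returns tau in hours.
    """
    tau_seconds = cp * p0 / (4.0 * g * SIGMA * t_eq ** 3)
    return tau_seconds / 3600.0


tau = adjustment_time_scale_hours(288.0)  # assumed equilibrium temperature
dt_hours = 6.0                            # typical analysis cycle step
print(f"tau ~ {tau:.0f} h, dt/tau ~ {dt_hours / tau:.3f}")
```

With these assumed values, *τ* comes out at roughly 500 h and (*δt*/*τ*) at about 0.01 for a 6-h cycle, matching the order of magnitude quoted in the text.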