## 1. Introduction

Determinism was the basic tenet of physics from the time of Newton (late 1600s) until the late 1800s. Simply stated, the future state of a system is completely determined by the present state of the system. The evolution of the state is governed by causal relationships such as the Newtonian equations of motion. Orbital mechanics, especially its application to the prediction of motion of the heavenly bodies, was the test bed for determinism. Pierre-Simon Laplace became champion of determinism, or the mechanistic view of the universe, and Carl Gauss exhibited its power when he predicted the reappearance of the planetoid Ceres after its conjunction with the sun (in 1801–02) (Gauss 1963).

Numerical weather prediction (NWP) initially followed the path of determinism. It was Jule Charney’s theoretical treatment of the scales of motion in the atmosphere that laid the foundation for the first successful NWP (Charney 1948). For the larger-scale motion of the atmosphere, he convincingly demonstrated that it was appropriate to predict changes of the hemispheric flow pattern by advecting the geostrophic vorticity with the geostrophic wind (the quasigeostrophic assumption). And with guidance and institutional support from John von Neumann at Princeton’s Institute for Advanced Study, Charney and his team of researchers used this principle to make two successful 24-h forecasts of the transient features of the large-scale flow (initialized on 30 January and 13 February 1949) (Charney et al. 1950; Platzman 1979). It should be mentioned, however, that other 24-h forecasts made by the team (5 January 1949, e.g.) were not particularly good. It is instructive to read the even-handed accounts of these events by two of the participants, George Platzman and Joseph Smagorinsky (Platzman 1979; Smagorinsky 1983).

The success at Princeton set the meteorological world abuzz, and interest in NWP quickly spread to institutional efforts worldwide [notably at the University of Stockholm’s International Institute of Meteorology (“Rossby’s Institute”; Wiin-Nielsen 1991) and the Air Force Cambridge Research Laboratories (AFCRL; Thompson 1983)]. By mid-decade, operational short-range NWP based on these quasigeostrophic principles was taking place in the United States and in Sweden. The meteorological community was filled with an optimistic fervor (Smagorinsky 1983, p. 25), and by the late 1950s there was hope that extended-range prediction would be possible (beyond several days). In this milieu, international politico–scientific efforts were underway to promote a coordinated collection of weather data over the globe—in essence, an observational program that would increase the chances of successful long-range weather prediction (the order of weeks).

Despite promising results from the NWP community (in both research and operations), questions began to arise regarding the limits of deterministic prediction—limits governed by the growth of error. The initial state is generally erroneous and the models are imperfect. Furthermore, as would be made known in the early 1960s, the essential character of the causal laws—unstable systems characterized by nonperiodicity—placed limits on the predictability of the system. Whereas Gauss and contemporaries found that the two-body problem of celestial mechanics tolerated small error in the initial state, meteorological prediction under nonperiodic constraints would be found to be less forgiving of these uncertainties.

The line of research that explored the limits of deterministic prediction is the subject of this historical study. The thrust of this research led to an alternative approach to NWP—a stochastic–dynamic approach that coupled probability with determinism. Its practical implementation in meteorology has come to be called ensemble prediction.

To set the stage for discussions of the historical work, the zeitgeist associated with the early decades of NWP is reconstructed. The contributions of the scientists who laid the foundation for stochastic–dynamic prediction in meteorology are examined. This is followed by a discussion of events that built on this framework and led to the practice of ensemble forecasting in meteorology. Finally, a schematic diagram is presented in the appendix that links stochastic–dynamic NWP to the fundamental research traditions in the history of science.

## 2. Zeitgeist

Following the successful demonstration of NWP at Princeton in 1950, and the operational production of NWP forecasts in the mid-1950s, a wide range of reactions to these developments took place in the meteorological community. In an effort to reconstruct the spirit of this time, vignettes are offered from meteorologists who were actively involved in administration, research, or operations during this period—the 1950s and 1960s.

### a. Frederick Shuman/Francis Reichelderfer/Aksel Wiin-Nielsen

There was some reluctance on the part of the field forecasters to accept the NWP products. But at the center [National Meteorological Center (NMC)]^{1} there was enthusiasm among those in the Analysis Section, especially Harlan Saylor, Bill Burnett, and Ed Fawcett. It was evident even when things didn’t look that great, when they looked “very chancy,” sort of like Lincoln’s charge to Grant. (F. Shuman 2004, personal communication)

Shuman’s assessment is echoed in the following reminiscence of Aksel Wiin-Nielsen, an NWP pioneer from Denmark who later worked at NMC and became the first director of the European Centre for Medium-Range Weather Forecasts (ECMWF) in the 1970s:

There was a degree of animosity against NWP by the forecasters and operations people in meteorology. They said, “We really can’t use this stuff you’re putting out. Why don’t you get some humidity and clouds into it, real weather!” (A. Wiin-Nielsen 1992, personal communication)

### b. Frederick Sanders

Well, I remember this seminar in the mid-1950s by a theoretician who shall remain unnamed. At one point he was prompted to say: “A hierarchy of ever more powerful and sophisticated models will be brought to bear on meteorology until our problems will crumble before the onslaught of mathematical physics.” I didn’t share in this unbridled enthusiasm nor did I anticipate better forecasts because of the computer. (F. Sanders 1994, personal communication)

### c. Eric Eady

The separate but interlocking contributions of Charney and Eady to the theory of baroclinic instability^{2} are masterfully reviewed by Gill (1982, sections 13.3 and 13.4). The basic condition of the atmospheric state under which infinitesimal perturbations grow is the fundamental underpinning of their theoretical development. And in the last paragraph of his paper, Eady (1949) speculates on the prospects of dynamical weather prediction. He further elaborates on this theme in his review article in the *Compendium of Meteorology* (Eady 1951, p. 464):

. . . we never know what small perturbations may exist below a certain margin of error. Since the perturbations may grow at an exponential rate, the margin of error in the forecast (final) state will grow exponentially as the period of forecast is increased, and this possible error is unavoidable whatever our method of forecasting . . . if we are to glean any information at all about developments beyond the limited time interval, we must extend our analysis and consider the properties of the set or “ensemble” (corresponding to the Gibbs-ensemble of statistical mechanics) of all possible developments. Thus, long-range forecasting is necessarily a branch of statistical physics in its widest sense: both our questions and answers must be expressed in terms of probabilities.

## 3. Extended-range forecasting

The period between 1955 and 1960, bounded by the onset of operational NWP in the United States and the first international symposium on NWP, in Tokyo, Japan, can be characterized both as a period of ferment and as one of expansiveness and hopefulness for longer-range prediction. The ferment stemmed from concerns about the limits of deterministic short-range prediction alongside technical/numerical issues related to the computer models, while the hopefulness came from progressive improvement in operational forecasts and the eminently successful numerical experiment performed on the Electronic Numerical Integrator and Computer (ENIAC) by Norman Phillips (Phillips 1956; Lewis 1998).

### a. The ferment

Development of the models was of course the crucial first step in operational NWP, but identification and correction of systematic problems/errors became the major theme during the first five years of operations (Thompson 1987). Examples of problems were the following:

- kinetic energy associated with the mean motion increased with time while the energy in the fluctuating component decreased with time, leading to serious errors after 48 h;
- rapid retrogression of the large-scale components of motion;
- only 65% of the variance in the day-to-day changes in the large-scale circulation was accounted for by the quasigeostrophic models.

The rapid retrogression was rectified by an empirical method based on Fourier decomposition: remove the long waves from the initial state, make the forecast with the truncated analysis, and then reinsert the long waves into the forecast. Inclusion of boundary layer friction reduced the spurious growth of kinetic energy in the mean flow. The problem with variance remained until the primitive equations replaced the quasigeostrophic constraints in the mid-1960s.
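The long-wave fix lends itself to a compact illustration. The sketch below (the field and the wavenumber cutoff are illustrative assumptions, not values from the operational scheme) splits a field that is periodic in longitude into its long-wave part and the remainder via Fourier decomposition:

```python
import numpy as np

def split_long_waves(field, k_max=3):
    """Split a field that is periodic in longitude (last axis) into its
    long-wave part (zonal wavenumbers 1..k_max) and the remainder, using
    a Fourier decomposition along each latitude circle."""
    spec = np.fft.rfft(field, axis=-1)
    long_spec = np.zeros_like(spec)
    long_spec[..., 1:k_max + 1] = spec[..., 1:k_max + 1]  # keep only long waves
    long_part = np.fft.irfft(long_spec, n=field.shape[-1], axis=-1)
    return long_part, field - long_part

# In the empirical fix: forecast with the truncated analysis, then
# reinsert the (unforecast) long waves into the final product.
z500 = np.random.default_rng(1).normal(size=(10, 72))  # stand-in analysis
long_part, truncated = split_long_waves(z500)
assert np.allclose(long_part + truncated, z500)        # exact decomposition
```

The decomposition is exact, so adding the long waves back after the forecast restores precisely what was removed from the initial state.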

This pattern of small-scale noise, commonly referred to as “noodling,” was later found to be related to aliasing errors—errors produced by nonlinear interaction, that is, the production of waves beyond the limit of resolution (Phillips 1959). It was Arakawa’s conviction that these aliasing errors need not grow with time and destroy the forecast. This conviction eventually led him to develop mathematically consistent finite-difference forms of the advective terms in the governing equations, the “Jacobian” terms. These finite-difference forms, later known as “Arakawa Jacobians,” allowed long-term integration of the prediction equations without the presence of artificially large diffusion to dampen the noise (Arakawa 1966; Lilly 1997). As Arakawa recalled:

Also it was important that I was familiar with Fjørtoft’s [1952] argument that no systematic cascade of energy takes place in a barotropic atmosphere as a consequence of the energy and enstrophy conservation. Thus, I began to get the feeling that something is crucially different between the dynamics of the continuous system and that for the discrete system. (A. Arakawa 1997, personal communication)
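The scheme itself can be written down compactly. The following is a sketch of the commonly quoted nine-point form of the Arakawa Jacobian on a doubly periodic grid (the grid, spacing, and fields here are illustrative, and this is a transcription of the standard form rather than code from any historical model):

```python
import numpy as np

def arakawa_jacobian(p, q, d):
    """Nine-point Arakawa (1966) Jacobian J(p, q) = p_x q_y - p_y q_x on a
    doubly periodic grid with spacing d: the average of three second-order
    discretizations.  The average conserves the domain means of J, p*J,
    and q*J, so energy and enstrophy cannot grow through aliasing."""
    def s(a, i, j):  # a evaluated at offsets (i, j), with periodic wraparound
        return np.roll(np.roll(a, -i, axis=0), -j, axis=1)

    j_pp = ((s(p, 1, 0) - s(p, -1, 0)) * (s(q, 0, 1) - s(q, 0, -1))
            - (s(p, 0, 1) - s(p, 0, -1)) * (s(q, 1, 0) - s(q, -1, 0)))
    j_px = (s(p, 1, 0) * (s(q, 1, 1) - s(q, 1, -1))
            - s(p, -1, 0) * (s(q, -1, 1) - s(q, -1, -1))
            - s(p, 0, 1) * (s(q, 1, 1) - s(q, -1, 1))
            + s(p, 0, -1) * (s(q, 1, -1) - s(q, -1, -1)))
    j_xp = (s(q, 0, 1) * (s(p, 1, 1) - s(p, -1, 1))
            - s(q, 0, -1) * (s(p, 1, -1) - s(p, -1, -1))
            - s(q, 1, 0) * (s(p, 1, 1) - s(p, 1, -1))
            + s(q, -1, 0) * (s(p, -1, 1) - s(p, -1, -1)))
    return (j_pp + j_px + j_xp) / (12.0 * d * d)
```

Each of the three discretizations alone is second-order accurate but lets quadratic invariants drift; only their average cancels the offending terms identically, which is why the scheme permits long integrations without heavy artificial diffusion.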

In addition to grappling with important pragmatic problems, the more philosophical problem of predictability was addressed by Phil Thompson (Thompson 1957). The premise that the deterministic forecast was imperfect was met with some resistance in the meteorological community. As stated in an oral history interview: “They didn’t really want to introduce any element of uncertainty into what was pleasingly deterministic” (Thompson 1987). Thompson credits C.-G. Rossby with giving him the encouragement to pursue this line of research in the face of the resistance (Lewis 1996). Thompson is pictured in his office at National Center for Atmospheric Research (NCAR) in Fig. 2.

The stimulus for Thompson’s work came from noting 48-h forecast errors over southern Canada that stemmed from the lack of observations over the Pacific Ocean—namely, an error in response to the absence of upper-air observations from weather ship *Papa* (stationed approximately 400 miles south of Adak, Alaska). In the absence of this observation, 48-h forecast errors over the continent were sometimes as great as 200–300 m at the 500-mb level.

Thompson’s work was analytical and tied to the arguments found in Eady (1949, 1951)—namely, limits of predictability are governed by the growth of errors in baroclinic flow. A two-level quasigeostrophic model was used in his study. He calculated the first derivative of the error at the initial time (a doubling rate of ∼2 days) and then extrapolated this rate of growth to conclude that the limit of prediction is the order of a week. This limit is defined as that point in time when the forecast is no better than that achieved with an arbitrarily chosen atmospheric state (in effect, a guess). Following Thompson’s lead, E. Novikov, a student of the Russian theoretician A. M. Obukhov, made further refinements and estimated that the predictability limit would be roughly 2 weeks (Novikov 1959).
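The extrapolation amounts to simple exponential-growth arithmetic. The numbers below are illustrative stand-ins (not Thompson's actual error levels), chosen only to show how a doubling time of about 2 days yields a limit "the order of a week":

```python
import numpy as np

# Assumed values: initial-state error e0 (m) that doubles every Td days,
# and a saturation level e_sat at which the forecast is no better than
# an arbitrarily chosen atmospheric state.
e0, Td, e_sat = 20.0, 2.0, 100.0

# e(t) = e0 * 2**(t/Td)  =>  predictability limit t* = Td * log2(e_sat/e0)
t_limit = Td * np.log2(e_sat / e0)   # ~4.6 days with these numbers
```

Because the growth is exponential, the limit depends only logarithmically on the assumed error levels: halving the initial error buys just one extra doubling time.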

### b. Guarded optimism

By the mid-1960s, there were no fewer than eight of these hemispheric general circulation models (GCMs) in the United Kingdom, United States, USSR, Australia, and France (four in the United States and one in each of the other countries; see Edwards 2000). Smagorinsky assessed Phillips’s contribution as follows:

The enabling innovation by Phillips was to construct an energetically complete and self-sufficient two-level quasigeostrophic model which could sustain a stable integration for the order of a month of simulated time. Despite the simplicity of the formulation of energy sources and sinks, the results were remarkable in their ability to reproduce the salient features of the general circulation. A new era had been opened. (Smagorinsky 1983, p. 25)

The quasigeostrophic constraints that were the basis of operational short-range NWP gave way to the more general primitive equation (PE) constraints in the case of most GCMs. Hinkelmann (1951, 1959) suggested that the PE set might be a better set of constraints for NWP, Charney (1955) explored it further, and of course, Richardson (1965) used these equations in his gallant, but failed, attempt to predict weather in the precomputer age. The PE constraints accommodate the mass/wind balance of the tropical latitudes and provide an easier pathway for inclusion of “weather elements” (moisture, phase change of water, and turbulence)—distinct advantages over the quasigeostrophic models. However, computational demands with the PE model, specifically the requirement for much smaller incremental time steps and the greater number of variables with correspondingly greater requirements on the storage of information, made operational forecasts with these constraints untenable until 1965.

Cecil (Chuck) Leith, a physicist/mathematician at Lawrence Livermore Laboratory (LLL), was the first to develop a PE-based GCM (Leith 1997, 1965). He accomplished this in the summer of 1960 while at the International Institute for Meteorology in Stockholm, Sweden. In the fall of that same year, the model code was executed on the 1-K-memory Livermore Automatic Research Calculator (LARC) (Leith 1997). Leith’s model extended over the global band, equator to 60°N, and incorporated a 5° latitude × 5° longitude horizontal grid over five vertical levels. It was the first numerical model to include moisture, cloud, and rain. Joseph Smagorinsky’s group at the General Circulation Research Section of the USWB [renamed the Geophysical Fluid Dynamics Laboratory (GFDL) in 1963] followed close behind Leith in developing their own version of a PE-based GCM (Smagorinsky 1963).

Leith (1997) recalled the institutional tension between the two modeling communities:

Now, there was this problem within the Weather Service between Smagorinsky and Cressman. Cressman was interested in numerical weather prediction . . . On the other hand, Smagorinsky, also essentially in the Weather Service, was interested in GCMs rather than weather predictions, and there must have been I suppose an issue of resources going to those two different directions . . . (Leith 1997)

The scientific divorce between these two complementary components of meteorology (operational NWP and general circulation modeling) had some deleterious effect on the progress of numerical prediction. In Leith’s (1997) reminiscence, he points to the disadvantages of this separation, disadvantages that reflect on the issue of unresolved scales (specifically cloud and turbulence). He contends that the operational NWP models drifted to their own “climate,” which was not compatible with the more reasonable “climate” of the GCMs. And he further argued that the GCM modelers would have benefited from testing their cloud/radiation interactions on a day-to-day basis with the benefit of satellite observations—in effect, are the GCMs getting clouds in the right place and at the right time? He believed that this type of validation would have gone far to improve the cloud parameterization. Arakawa’s historical review of parameterization research related to GCM development succinctly discusses the problematical aspects of prediction in the presence of unresolved scales of motion (Arakawa 1997).

## 4. The epiphany

Edward Lorenz’s entry into the study of atmospheric predictability was fortuitous. It occurred in 1956 when he was adamant about disproving a well-held contention among statistical meteorologists. These meteorologists were convinced that Norbert Wiener (Wiener 1956) advocated the adequacy of using linear regression methodology to forecast under the constraint of nonlinear dynamics. Lorenz did not believe this contention and set out to disprove it by counterexample.^{3} After an arduous search and associated numerical experimentation, which he fully discusses in his “scientific biography” *The Essence of Chaos* (Lorenz 1993), he found a nonperiodic evolution of the atmospheric state. His governing dynamics was a truncated version of the two-level quasigeostrophic model—a low-order model with 12 parameters. The disproof of the contention came by way of an “early day” Observation System Simulation Experiment (OSSE).^{4} That is, he used model output as data for the regression scheme. The power spectrum of the model parameters was viewed in the context of earlier work by Kolmogorov (1941) and Wiener (1947)—in essence, a time series that is highly predictable from its own past will exhibit extreme values of the spectrum that differ by several orders of magnitude. Based on this principle, Lorenz speculated that the 1-day forecast from linear regression should be good, but the 3- to 4-day forecast should be poor. And indeed, results confirmed this conjecture. These results are found in Lorenz (1962).

These results were obtained just prior to Lorenz’s trip to the NWP Symposium in Tokyo (see Fig. 3). His presentation focused on the inability of the statistical regression model to forecast under the constraint of nonperiodicity. But in the discussion phase of the meeting, he was able to present the predictability results. In the discussion he said,

. . . these small errors of three decimal places had amplified so much in the course of two months [simulated time] that they drowned out the signal. And I found this very exciting because this implied that if the atmosphere behaved this way, then long-range forecasting was impossible because we certainly don’t measure things as accurately as that . . . It came really almost as a shock when I realized that it was behaving this way. (Thompson and Lorenz 1986, 9–10)

The quantitative results were reported in the symposium proceedings:

We did one experiment to investigate the growth of errors over 4-day period[s] with the error superimposed upon initial conditions . . . We tried it for 40 different initial conditions . . . the average increase [in error] over the 4-day period was a factor slightly less than 3. (Lorenz 1962)

Arakawa remembers the presentation as follows:

The reaction of the audience including myself was perhaps best summarized by Charney at the end of his speech presented at the Panel Discussion of the Symposium: “What happens in a system of this character if there are many more degrees of freedom” . . . although he [Lorenz] clearly had the point he later elaborated upon in the 1960s, the results with Saltzman’s model [Lorenz 1963] were more influential.^{5} (A. Arakawa 2002, personal communication)

Akira Kasahara, also in attendance at this talk, echoed Arakawa:

Ed Lorenz presented his thought on atmospheric predictability at the first international conference on NWP in Tokyo, 1960. It was the last talk in the conference and I got an impression that people were puzzled by the results he got from the time integration of his famous equations. Of course, by 1964 at the WMO–IUGG Symposium on Research and Development of Long-Range Forecasting [WMO 1965], in Boulder, the question of predictability and the need of statistical consideration were central to achieving useful long-range forecasting. (A. Kasahara 2004, personal communication)

A photo of both Arakawa and Kasahara in the company of other members of the NWP group in Tokyo, Japan, is shown in Fig. 4.

The seeds of ensemble prediction are found in Lorenz’s subsequent statement of a stochastic approach to forecasting:

The proposed procedure chooses a finite ensemble of initial states, rather than the single observed initial state. Each state within the ensemble resembles the observed state closely enough so that the differences might be ascribed to errors or inadequacies in observation. A system of dynamic equations previously deemed to be suitable for forecasting is then applied to each member of the ensemble, leading to an ensemble of states at any future time. From an ensemble of future states, the probability of occurrence of any event, or such statistics as the ensemble mean and ensemble standard deviation of any quantity, may be evaluated. Between the near future, when all states within an ensemble will look about alike, and the very distant future, when two states within an ensemble will show no more resemblance than two atmospheric states chosen at random, it is hoped that there will be an extended range when most of the states in an ensemble, while not constituting good pin-point forecasts, will possess certain important features in common. It is for this extended range that the procedure may prove useful. (Lorenz 1965a, p. 207)

## 5. Feasibility of the two-week forecast

Realizing the importance of Phillips’s numerical experiment, Charney arranged for a meeting between Phillips and von Neumann in early 1955. After discussion with Phillips, von Neumann hastily arranged for a colloquium at Princeton to discuss the future of extended-range prediction (Pfeffer 1960). On 26 October 1955, von Neumann opened the meeting with a tightly woven talk that expressed his view on the future of NWP (von Neumann 1955). He essentially indicated that we could expect difficulty in forecasting weather for intermediate time scales—the order of weeks. He deftly argued his case based on the importance of both initial conditions and the energy inputs and dissipation. Further, Arnt Eliassen addressed this question at the NWP conference in Tokyo (Eliassen 1962). His arguments rested on analogy between long-term weather prediction and statistical thermodynamics. In his concluding remarks, he said, “I suspect that it would be very important also for long range prediction studies to have a prediction method that is able to give as good short range predictions as possible.” And indeed, within the decade, his astute conjecture was validated (Lorenz 1969b).

In the early 1960s, there arose a wave of politico–scientific support for a global observation program, a program that would build on the success of the International Geophysical Year (IGY). In President Kennedy’s presentation before the United Nations in September 1961, a plea was made for the peaceful use of space and space technology, out of which the notion of using satellites for meteorological purposes developed (Smagorinsky 1971). Kennedy’s speech provided fuel for the adoption of United Nations Resolutions 1721 and 1802 (in 1962), resolutions that established the World Weather Watch, in which the Global Atmospheric Research Program (GARP) would eventually become an integral component (in 1968). Within the United States, a panel on international meteorological cooperation was formed in 1963. The panel, headed by Jule Charney, was tasked with establishing basic premises that would justify GARP. Among the premises were 1) the atmosphere is inherently predictable to at least two weeks, and 2) for this time scale, a global observing system is necessary (Smagorinsky 1978).

The predictability question was framed as follows (Committee on the Atmospheric Sciences 1966):

The question is then, how fast will a given error, interpreted as a perturbation of the atmospheric flow, grow before the perturbed motion differs from the unperturbed motion by as much as two randomly chosen flows.

The methods of perturbing the flow were substantially different for each model, and these details are found in section 3 of Committee on the Atmospheric Sciences (1966). The collective assessment of the results follows:

We may summarize our results in the statement that, based on the most realistic of general circulation models available, the limit of deterministic predictability for the atmosphere is about two weeks in the winter and somewhat longer in summer.

Arakawa viewed this statement retrospectively:

This report represents one of the major steps toward the planning of the GARP (see, for example, GARP Topics, *Bull. Amer. Meteor. Soc.*, 30 (1969), 136–141). It showed, for the first time using a realistic model of the atmosphere, the existence of a deterministic predictability limit the order of weeks. The report specifically says that the limit is two weeks, which became a matter of controversy later. To me, there is no reason that it is a fixed number. It should depend on many factors, such as the part of the time/space spectrum, climate and weather regimes, region of the globe and height in the vertical, season, etc. The important indication of the report is that the limit is not likely to be the order of days or the order of months for deterministic prediction of middle-latitude synoptic disturbances. (A. Arakawa 2002, personal communication)

The use of the word “feasibility” in the title of the Committee on the Atmospheric Sciences (1966) paper (“The feasibility of a global observation and analysis experiment”) was crafted by Charney. Although the GCM experiments appeared to justify the two-week forecast, results from Lorenz’s numerical experiments (Lorenz 1962, 1963) were not so encouraging. And, indeed, in the aftermath of the Tokyo meeting, Charney became a strong supporter of Lorenz’s line of research on predictability (E. Lorenz 2002, personal communication). Lorenz recalls the situation:

In the early 1960s, some people including some in high places in meteorology were saying that when we get this thing going [GARP] we’ll have two-week forecasts . . . Charney got a little worried about this and his recommendation was that we investigate the feasibility of the two-week forecast rather than promising to make them. Of course, the promise of making them was the means to getting the money. (E. Lorenz 2002, personal communication)

And indeed the money did come, due in no small measure to the Herculean efforts of Charney, the GCM modelers, and other members of the panel. A laudatory tribute to these meteorologists is found in the first chapter of Smagorinsky (1978).

## 6. Dispersion of the glob: A measure of uncertainty

In the early 1960s as a professor at University of Michigan, Edward Epstein began to use Monte Carlo methods, probabilistic methods that flourished in the post–World War II computer age at places like Los Alamos National Laboratory (Metropolis and Ulam 1949). Although trained as a meteorologist under Hans Panofsky, he had a long and abiding interest in probability that stemmed from his exposure to the subject as an undergraduate at Harvard University. Specifically, this exposure came in the form of classroom instruction under Professor Frederick Mosteller.

Epstein’s was a geometrical interpretation: the motion began from an initial position subject to indeterminacy, but once started, the motion followed the deterministic law [cryptodeterministic in Edmund Whittaker’s terminology (Whittaker 1937)]. This point of view provides the link between the probabilistic and dynamical aspects of the system. The variables are no longer regarded as deterministic variables, but rather as random variables with associated probabilistic, that is, stochastic, properties. Epstein credits Lorenz with sharpening this view:

Lorenz’s paper, as I recall, was less relevant to the issue of uncertainty but his clear presentation greatly sharpened my view of phase space and the correspondence of uncertainty with a glob of points each of which would follow its own deterministic path. (E. Epstein 2002, personal communication)

With this concept firmly in mind, Epstein took a sabbatical at the International Institute of Meteorology in Stockholm beginning in fall 1968. The one-year appointment was secured with the help of Wiin-Nielsen, chair of the Meteorology and Oceanography Department at Michigan (E. Epstein 2002, personal communication). In the company of a stimulating and supportive group at the Institute—Bert Bolin, Bo Döös, and Hilding Sundqvist—Epstein began work on a stochastic–dynamic (SD) approach to NWP. The culmination of this sabbatical work was his paper published in *Tellus*, simply titled “Stochastic dynamic prediction” (Epstein 1969).

In the spirit of Lorenz, Epstein investigated SD prediction by adopting a simple yet nontrivial system of equations—the maximum simplification equations of Lorenz (1960), a three-component spectral form of the barotropic constraint. This set of three coupled nonlinear differential equations describes the energy exchange between the mean (zonal) flow and two disturbances. The derivation of the governing SD equations is straightforward: the spectral amplitudes are first represented as the sum of mean values and deviations, followed by ensemble averaging of the equations. The derivation of the equations for the higher-order moments follows in a similar fashion (forming the variance and covariance equations by appropriate multiplications of the governing set). These details are pedagogically presented in Epstein (1969), whose stated aim reads:

The present study is an attempt to deal explicitly with the problems imposed on meteorological prediction by the patent impossibility of observing the atmosphere either in sufficient detail or with sufficient accuracy to consider the initial state as known with certainty. (Epstein 1969)
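In schematic form (generic notation, not Epstein's exact equations), the moment expansion reads as follows for spectral amplitudes governed by quadratic dynamics:

```latex
% Generic quadratic spectral dynamics
\frac{dX_i}{dt} = \sum_{j,k} a_{ijk}\, X_j X_k + \sum_{j} b_{ij}\, X_j .
% Writing X_i = \mu_i + x_i' and ensemble averaging gives the mean equations
\frac{d\mu_i}{dt} = \sum_{j,k} a_{ijk}\left(\mu_j \mu_k + \sigma_{jk}\right)
                  + \sum_{j} b_{ij}\, \mu_j ,
\qquad \sigma_{jk} \equiv \langle x_j' x_k' \rangle .
% Multiplying the perturbation equations and averaging yields covariance
% equations that contain the third moments
\tau_{jkl} \equiv \langle x_j' x_k' x_l' \rangle .
```

The appearance of the second moments in the mean equations, and of the third moments in the covariance equations, is the moment hierarchy that forces a closure assumption.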

The number of equations grows rapidly with the number of spectral components and retained moments. In this case, with three spectral components and two moments retained (the means plus the variances and covariances), the SD set comprises nine equations: three for the means and six for the distinct second moments. Yet these nine equations involve the third-order moments, and thus, to obtain a closed system, the third-order moments must be discarded or expressed in terms of the lower-order moments—not unlike the closure issues in turbulence theory.

In Epstein’s numerical experiments, he attains closure by dropping the combination of terms in each of the variance/covariance equations that include the third-order moments—apparently a somewhat less severe assumption than strictly setting the third-order moments to zero. Although Epstein assumes multivariate Gaussian distributions (completely described by first and second moments) as a measure of uncertainty in the initial state, the nonlinearity of the dynamics produces probability density functions that possess higher-order moments. As a standard for comparison with his approximate solution (due to the closure assumption), Epstein used the Monte Carlo method. Discrete initial points in phase space (space of the spectral amplitudes) are chosen by a random process such that the likelihood of selecting a given point is proportional to the assumed probability density. The ensemble sample size is typically 1000 for his experiments.

The deterministic predictions are initialized by using the ensemble mean from the Monte Carlo method. He compared the SD prediction with the deterministic prediction and used the Monte Carlo as the standard (truth). In these tests, two versions of the maximum-simplification equations were used: the oscillatory and unstable modes. There is little exchange of energy in the oscillatory mode, but in the unstable mode the disturbances grow at the expense of the zonal flow. In the oscillatory mode, the SD and deterministic predictions show little difference through 4 days of simulated time, and the SD results are nearly identical to the standard. On the other hand, the deterministic prediction exhibits serious error after 2 days in the unstable mode while the SD forecast faithfully reproduces the evolution for the limit of integration (6 days). The SD prediction of the variances is close to the standard throughout.

A recent picture of Epstein is shown in Fig. 5.

> The closure problem is more severe than I originally thought. The same sensitivity to initial conditions that gives rise to chaos is the death knell to any closure based on an assumed ensemble distribution. (E. Epstein 2002, personal communication)

## 7. Bridge to operational ensemble forecasting

The period between the late 1960s and the mid-1980s was one of waiting: waiting for operational implementation of SD forecasting, which was keyed to the availability of affordable parallel-processing machinery. More importantly, it was a period that required clarification of the limits of predictability and attention to certain technical issues: Monte Carlo versus Epstein’s SD approximation, perturbation methodology, and flow-dependent uncertainty.

### a. Clarifying predictability

There were two companion papers by Lorenz during this period that solidified our understanding of predictability (Lorenz 1969a, 1982). The first (Lorenz 1969a) offered a refreshing view of the atmosphere’s instability, an instability measured by the divergence of analogues. His approach was free of modeling constraints: he searched through the archive of upper-air data (850-, 500-, and 200-mb geopotential fields) in the hope of finding quality analogues, those characterized by small rms differences. He experienced difficulty in finding these analogues and was forced to work with what he called “near analogues.” The doubling time for differences in these near analogues was about 8 days; yet, after quadratic extrapolation of these difference curves to smaller errors, he estimated that the doubling time was roughly 2.5 days.
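The doubling-time arithmetic behind such estimates is elementary: if rms differences grow approximately as exp(*at*), the doubling time is ln 2/*a*. A sketch of such a fit on synthetic data (illustrative numbers, not Lorenz's):

```python
import numpy as np

def doubling_time(times, rms_errors):
    """Error-doubling time from a log-linear fit, assuming the errors
    grow approximately exponentially over the fitted span."""
    growth_rate = np.polyfit(times, np.log(rms_errors), 1)[0]  # per day
    return np.log(2.0) / growth_rate  # days

# Synthetic difference curve with a built-in doubling time of 2.5 days.
t = np.arange(0.0, 5.0, 0.5)          # days
errors = 10.0 * 2.0 ** (t / 2.5)      # rms differences (arbitrary units)
print(round(doubling_time(t, errors), 2))  # -> 2.5
```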

With this same philosophy in mind, Lorenz (1982) created analogues by using the archival information at ECMWF: both the 500-mb analyses and the forecasts that extended to 10 days over an entire “winter” season (to be precise, 1 December 1980 to 10 March 1981). Figure 1 from Lorenz (1982) is reproduced here as Fig. 6. Lorenz’s caption to this figure is retained, where *E*_{jk} is the root-mean-square (rms) difference between *j*-day and *k*-day prognoses for the same day, averaged over the globe and over all days in the sample (100 days). Thus, *E*_{0k} is the rms error of the model prediction for day *k*, the heavy (topmost) curve. The thin or light curves represent the divergence of model solutions for the various magnitudes of initial difference, the lowest thin curve representing “moderately small errors.” These are the “best” analogues: the analysis and the 1-day forecast verifying at the time of the analysis. Analogues with larger differences are formed by taking the analysis and the 2-day forecast verifying at the time of analysis, etc. Thus, nine sets of analogues and their divergence with time are represented by the thin curves in Fig. 6. Two of the principal results of the study are the following: the doubling time of small errors is about 2.5 days (again found by quadratic extrapolation of the evolution curve for moderately small errors to *t* = 0), and the limit of extended-range forecasting is slightly greater than 2 weeks (elaborated upon below).
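The statistic *E*_{jk} is simple to compute from a forecast archive; a sketch with synthetic stand-in fields (array layout and names are assumptions, not ECMWF's):

```python
import numpy as np

rng = np.random.default_rng(1)

# forecasts[d, j] : the j-day prognosis verifying on day d, stored as a
# flattened 500-mb field (synthetic data; 100 verification days, leads 0-10).
n_days, n_leads, n_points = 100, 11, 500
forecasts = rng.normal(size=(n_days, n_leads, n_points))

def E(j, k):
    """rms difference between j-day and k-day prognoses for the same day,
    averaged over the field and over all days in the sample."""
    diff = forecasts[:, j, :] - forecasts[:, k, :]
    return float(np.sqrt(np.mean(diff ** 2)))

# E(0, k): rms "error" of the k-day forecast when the verifying analysis
# (the 0-day prognosis) is taken as truth -- the heavy curve in Fig. 6.
print(E(0, 1))
```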

And, continuing in this line of thought, he speculates on the limit of useful predictions:

> The rate at which separate solutions of the model diverge is supposed to approximate the rate at which separate solutions of the true atmospheric equations diverge. If it does, and if, at some time during the forecast, the model could suddenly be replaced by the true equations, the remainder of the heavy curve would follow one of the thin curves. The excess slope of the heavy curve over that of an intersecting thin curve may therefore be regarded as a measure of the maximum amount by which the model may still be improved. (Lorenz 1982, p. 509)

He exhibited some cautiousness regarding this predictability limit, however, as he looked ahead to the time when more accurate model forecasts would become available, with an associated reduction in error doubling times. In such a case, the rate of climb of the lower curve in Fig. 6 would increase.

> Even without further improvement in the one-day prediction, the performance of the perfect model should then be given by the lowest curve in Fig. 1 [Fig. 6 here] and its extrapolation to the right, and skilful forecasts more than two weeks ahead should ultimately be expected. (Lorenz 1982, p. 509)

### b. Computational constraints

Phil Thompson became interested in making Epstein’s SD approximation computationally competitive with deterministic prediction. Without some form of simplification, the computational requirements for an SD system with *n* spectral components increase by a factor of roughly *n*^{2} compared to the same deterministic system (when moments up to the third order are retained). Thompson achieved some success in this line of research when he explored SD prediction under quasi-geostrophic constraints. He was able to take advantage of the conservation property of the dynamics to express the covariances as linear combinations of the variances, thereby reducing the number of equations and variables (Thompson 1985).

Leith, with a superb background in computer simulation of atomic and nuclear reactions, began to employ Monte Carlo methods in his investigations of turbulence (Leith 1971, 1974; Leith and Kraichnan 1972). In Leith (1974), a 2D turbulence model was used to explore the statistical properties of a finite Monte Carlo sample. Although he was unable to determine the precise sample size needed to determine higher-order moments in forecast error statistics, he stated, “Adequate accuracy should be obtained for the best mean estimate of the forecast field with sample sizes as small as 8.” Leith, who reviewed Epstein’s *Tellus* paper (E. Epstein 2002, personal communication), had the following exchange with Epstein:

> When Chuck Leith speculated (I believe it was in early 1970) that the average of eight appropriately chosen forecast runs could give a useful forecast of the mean, I agreed, but admonished that it would not allow you to specify the second moment adequately. [See Leith 1974.] (E. Epstein 2002, personal communication)

### c. Perturbation strategy

Lorenz (1965b), in his study of a 28-variable model, found that error growth was dominated by a few rapidly growing modes, a result that justified working with a *small* ensemble of initial errors. He speculated on the applicability of this result to more realistic models:

> If more realistic models with many thousands of variables also have the property that a few of the eigenvalues of **AA**^{T} [the matrix that controls the error growth in the model] are much larger than the remaining, a study based upon a small ensemble of initial errors should, as already suggested, give a reasonable estimate of the growth rate of random error . . . It would appear then, that the best use could be made of computational time by choosing only a small number of error fields for superposition upon a particular initial state . . . (Lorenz 1965b)

Thompson later offered a similar, if guarded, assessment:

> Well, it may mean that in many-mode systems, the probability of distribution in phase-space may move or spread in certain preferred directions. In that case, one might introduce some bias in the choice of initial conditions represented in the sample, and thereby reduce the number of realizations that have to be carried out in Monte Carlo calculation. But I wouldn’t bet the store on it. (Thompson 1987, p. 51)

Mathematical details on the SV decomposition can be found in Mureau et al. (1993) and Buizza and Palmer (1995).

> Our studies at ECMWF suggested that we wish to find the initial perturbation, consistent with the statistics of initial error, which evolves into the perturbation with largest total energy. (T. Palmer 2004, personal communication)
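In linear terms, the initial perturbation that acquires the largest final-time norm is the leading singular vector of the tangent propagator; a minimal numpy sketch, with a random matrix standing in for the tangent linear model:

```python
import numpy as np

rng = np.random.default_rng(2)

# M stands in for the tangent linear propagator over the optimization
# interval (random here; in practice it comes from the forecast model, and
# the norm is the total-energy norm rather than the plain Euclidean one).
M = rng.normal(size=(6, 6))

U, s, Vt = np.linalg.svd(M)
v1 = Vt[0]       # leading initial-time singular vector (unit norm)
growth = s[0]    # amplification factor of that perturbation

# Evolving v1 with M yields the final-time perturbation of largest norm.
evolved = M @ v1
print(growth, np.linalg.norm(evolved))  # these two numbers agree
```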

Variations on this theme that link initial error covariance with fast-growing modes continue to generate interest (Fisher and Courtier 1995; Barkmeijer 1996; Hamill et al. 2003). In their study, Hamill et al. (2003) cleverly exploit the mechanics of SV algebra (using square root filters) to algorithmically derive the SVs of the forecast error covariance from the SVs associated with the error covariance of the initial analysis.

> . . . my studies suggested that singular vectors were crucial to the study of predictability. Some time shortly after these studies, I discovered that Lorenz had pretty much said the same thing [Lorenz 1965b]. (T. Palmer 2004, personal communication)

At the National Centers for Environmental Prediction (NCEP), the methodology used to create initial perturbations rests on a Lyapunovian-stability principle (Legras and Vautard 1996). In short, solutions to the governing prediction equations are generated by using two initial states: the standard operational analysis and an analysis generated by superimposing a random perturbation on the standard analysis. One-day forecasts are made with each of the analyses, and the resulting difference field is used to generate an “improved” perturbation. This perturbation is scaled to match the magnitude of the initial perturbation. The new perturbation is then superimposed on the standard operational analysis at *t* = 0 to create the perturbed analysis. The process is repeated, and with continued iteration, the damped-mode component becomes smaller and smaller while the growing-mode component remains. The term “breeding modes” has been used to describe this process of creating the initial perturbations (Tracton and Kalnay 1993). A stimulating discussion of the differences and similarities between the breeding vectors (BV) and SVs is found in Buizza and Palmer (1995).
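The breeding cycle described above lends itself to a few lines of code; the "model" below is a stand-in linear map with one growing and one damped direction, and the rescaling convention is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in "1-day forecast" map: eigenvalues 1.2 (growing) and 0.5 (damped).
A = np.array([[1.2, 0.1],
              [0.0, 0.5]])

size0 = 0.01                     # prescribed perturbation magnitude
control = np.array([1.0, 1.0])   # stand-in operational analysis
perturbation = size0 * rng.normal(size=2)

# Breed: forecast both states, difference them, rescale the difference to
# the initial magnitude, and superimpose it on the next analysis. With each
# cycle the damped component shrinks while the growing component survives.
for _ in range(30):
    diff = A @ (control + perturbation) - A @ control
    perturbation = size0 * diff / np.linalg.norm(diff)

print(perturbation)  # now essentially aligned with the growing direction
```

For a linear map this is exactly power iteration, which is why the bred perturbation converges to the dominant growing direction.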

### d. Flow-dependent uncertainty

> I want to make a 30-day forecast of the sudden warming phenomenon with a general circulation model.

Sudden warming is a stratospheric phenomenon that occurs in association with the breakdown of the polar-night vortex (in spring), and it is often accompanied by blocking action in the troposphere. The substance of Miyakoda’s bold statement came to fruition by the beginning of the next decade when he and his colleagues used the GFDL model to forecast the breakdown of the polar-night vortex [Miyakoda et al. (1970), previewed by Smagorinsky (1969)].

There are events, such as sudden warming, that appear to be more predictable than average situations, and certainly there are events less predictable than average. The term that embodies this view of predictability has come to be called flow-dependent predictability or uncertainty (Lorenz 1969b). It has been most dramatically evident in studies with low-order systems such as the Lorenz attractor (Palmer 1993a). In these experiments, the initial position of a point on the attractor is known only within a circle of uncertainty. The dispersion of “this glob of points,” to use Epstein’s phraseology, exhibits a wide range of uncertainty dependent on the initial position of the glob in the attractor space. The uncertainty of prediction is “flow dependent.” Thomas Gleeson, a professor of meteorology at The Florida State University, was an early investigator of this type of flow dependency in simple systems, and his work had a significant influence on Epstein’s thinking (Gleeson 1968; E. Epstein 2002, personal communication).

Beyond the pedagogical examples afforded by the low-order systems, the work by Shukla (1981), Miyakoda et al. (1986), and Palmer (1988) has indicated low-frequency long-wave patterns are predictable up to a month, and even up to 45 days in the case of Shukla’s study. In the studies by Miyakoda et al. (1986) and Palmer (1988), the predictability in the Pacific Ocean–North American region was shown to be critically dependent on the mode of the low-frequency variability, where the modes are defined in terms of the positions of large-scale anomalies in the pressure and wind patterns (e.g., see Wallace and Gutzler 1981).

### e. Real-time and lagged-time probabilistic forecasting

A glimmer of hope for operational SD forecasting appeared in late 1985. With parallel-processing computers still too expensive for routine operational prediction, a limited real-time experiment in ensemble forecasting took place at the Synoptic Climatology Branch of the Met Office (Murphy and Palmer 1986). The 11-level PE-based climate model, an upgrade to the 5-level model developed in the 1970s by Corby et al. (1977), was used to make seven-member ensemble forecasts in real time. The ensemble 30-day forecasts were made at 12-h intervals over a 3-day period as part of an operational long-range forecasting conference in the United Kingdom. Results indicated that the ensemble-mean forecast generally improved on the individual forecasts. Furthermore, and just as important, the results indicated a splitting of the forecasts into two distinct clusters with respect to the circulation patterns of the Pacific–North American region. After this initial success, the Synoptic Branch “. . . anticipated that these real-time ensembles of forecasts will be run at least once a quarter and will have a major impact on the operational requirements of the Synoptic Climatology Branch” (Murphy and Palmer 1986). And, as viewed retrospectively (T. Palmer 2004, personal communication), “This indeed turned out to be true.”

In the presence of the limited computational resources to conduct ensemble experiments, a novel approach for generation of ensemble statistics was developed by Hoffman and Kalnay (1983). With an efficient strategy reminiscent of Lorenz’s work with the ECMWF model (Lorenz 1982), these researchers devised a lagged-average ensemble forecast that exhibited stochastic structure. By making use of an archive of forecasts (in their case, a truncated spectral model), an ensemble-mean forecast at a given time was constructed by combining forecasts of varying length, that is, forecasts that were initialized at precedent times and verified at a fixed time. The idea built on the earlier work of Miyakoda and Talagrand (1971) and Leith (1978) where time-weighted forecasts from earlier times were used to reduce error at the end time.
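The lagged-average construction requires no extra model runs, only the forecast archive; a sketch of the ensemble mean verifying at a fixed time (equal weights here, whereas a time-weighted version would discount the older, longer-lead forecasts):

```python
import numpy as np

rng = np.random.default_rng(4)

# archive[i] : the forecast initialized i days before the verification time,
# all verifying at that same fixed time (synthetic stand-in fields).
n_lags, n_points = 5, 100
archive = rng.normal(size=(n_lags, n_points))

# Equal weights for simplicity; weighted variants discount long leads
# because they carry larger forecast error.
weights = np.full(n_lags, 1.0 / n_lags)

lagged_average_forecast = weights @ archive  # the lagged-average ensemble mean
print(lagged_average_forecast.shape)  # -> (100,)
```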

## 8. Ensemble Prediction Systems (EPSs)

The combination of more-affordable parallel-processing machinery and improved operational forecasting systems (improvements in both model physics and data assimilation) led to operational stochastic–dynamic prediction at ECMWF, NMC (now NCEP), and the Meteorological Service of Canada (MSC) in the early 1990s. The approach to ensemble prediction used at the operational centers exhibits subtle differences when compared with the standard Monte Carlo method. In the Monte Carlo method, it is assumed that the initial probability density function (pdf) is known and that it is sampled randomly. In the method used operationally, which we will call the “ensemble” method, the pdf is generally not sampled randomly (i.e., initial perturbations produced by the singular vector or breeding vector approaches are not random). In Buizza et al. (2005), we find a valuable comparison of these three operational EPSs. Specific details on the EPSs are found in Palmer (1993b) and Molteni et al. (1996) (ECMWF), Toth and Kalnay (1993) and Tracton et al. (1993) (NCEP), and Houtekamer et al. (1996) (MSC).

The improvements in deterministic forecasting that have justified operational ensemble forecasting are linked to improved analysis (reduction of initial state error) and improved model resolution alongside more accurate model physics (Simmons and Hollingsworth 2002; Bengtsson 1999). A summary of the improvements in the ECMWF forecast/analysis system over the past two decades is found in Simmons and Hollingsworth (2002). These improvements are succinctly summarized in their Fig. 6, here reproduced as Fig. 7. The mode of presentation follows Lorenz (1982) where results from the winter of 1981 are compared with the recent results from 2001. Whereas Lorenz (1982) analyzed global data during the winter of 1981, this later study has restricted its analysis to data from the Northern Hemisphere; this accounts for differences between results displayed in their Fig. 6 and those displayed in the top part of Fig. 7.

The salient features of the forecast errors and the divergence of solutions presented in Fig. 7 are the following:

- The forecast errors (solid curves) exhibit significant reduction from 1981 to 2001 across the entire forecast range.
- The doubling time of error (the dashed curves) decreases between 1981 and 2001.

Cursory examination of the dashed curves in Fig. 7 indicates that the errors at day 1 are roughly 25–30 m (1981) and 10 m (2001). The associated doubling time of error is ∼3 days in 1981 and ∼2 days in 2001. The doubling time of 3 days for the 1981 data is close to Lorenz’s estimate of 3.5 days (Lorenz 1982). The doubling times decrease further, to 2.5 days (1981) and 1.5 days (2001), when estimates are based on the growth of smaller errors at *t* = 0 (obtained by extrapolation). [See Lorenz (1982) and Simmons et al. (1995) for details.] It is remarkable that Lorenz (1982) anticipated the general nature of these results (see discussion at the end of section 4).

To give an idea of the progress in EPS, we show the evolution of the system at ECMWF (Table 1). Operational implementation began in 1992 using a global spectral model with 19 vertical levels and a horizontal resolution of T63 [truncated (T) at zonal wavenumber 63], and where the ensemble set had 33 members. By 1996, the number of ensemble members increased to 51 and this number has remained unchanged to the present day. Increases in both the vertical and horizontal resolution have systematically occurred over this same 9-yr period.

Since the operational data assimilation is underpinned by model error covariance statistics, there is an important connection between data assimilation and ensemble prediction. The coupling between the forecast and data assimilation is especially apparent for systems that use SVs. In this case, the tangent linear model and its adjoint are the bonds—these models are used to generate the SVs and to determine the gradient of the cost function used in data assimilation (see Fisher and Courtier 1995). This area of research is one of the most active, and indeed, one of the most challenging in the age of ensemble prediction. There are a variety of avenues that are being explored, among them the ensemble Kalman filter (see Evensen and van Leeuwen 1996; Burgers et al. 1998; Hamill et al. 2000).

Another area of research that is receiving significant attention is the so-called multimodel approach to ensemble forecasting. In effect, “joint” ensembles are created by combining forecasts from different models. Justification for the approach rests on the apparent lack of divergence of model solutions over the short range (the order of a week or less; see Ehrendorfer 1997). The Observing System Research and Predictability Experiment (THORPEX), an international global atmospheric research program following in the steps of GARP, is aimed at improving the accuracy of 1–14-day weather forecasts. The THORPEX Interactive Grand Global Experiment (TIGGE), a component of the program, has the multimodel approach to ensemble forecasting as its centerpiece. And, in anticipation of this program, ECMWF has reviewed the current status of operational EPSs; some of the results from this review are displayed in Table 2. As seen in this table, there are now eight operational EPS centers worldwide. This rapid growth is reminiscent of the 10-yr period from the mid-1950s to the mid-1960s when there was a proliferation of GCMs throughout the world. The widely varying nature of the EPS approaches (the number of members, perturbation strategies, model resolution, frequency of execution, etc.) speaks to a most active period in operational ensemble weather prediction.

## 9. Epilogue

When things were “pleasingly deterministic” in the mid- to late 1950s, Philip Thompson raised an amber-colored flag warning of the growth of error in the quasigeostrophic models. In essence, the flag had been waved earlier, and with a more vociferous cry; that cry came from Eric Eady, whose statements adamantly warned that dynamic weather prediction must be couched in terms of probabilities. Early experiences with operational NWP and the GCMs exhibited these errors, which were controlled in part by excessively large viscosity or by the more elegant finite-difference schemes designed by Akio Arakawa.

In the milieu, there was politico–scientific pressure to exhibit the viability of 2-week forecast skill, skill that would virtually guarantee funding for GARP. Charney was cautious, and rightfully so in light of the error growth associated with certain nonperiodic systems of equations akin to those used in meteorological prediction—exhibiting a fundamental instability of atmospheric flow. And it was Ed Lorenz who clearly linked this instability to the limits of deterministic weather prediction.

Those scientists who forged a path toward a probabilistic–dynamic approach to weather forecasting did it in an environment that was less than enthusiastic. Support for each of them, however, came from a stronghold: Epstein received it from that cadre of close associates at Stockholm—especially Sundqvist, and also Bolin and Döös; Lorenz had Charney; and Thompson got “a shot in the arm” (Thompson 1987) from Rossby.

Edward Epstein was outside that mainstream of dynamic meteorology, yet he was aware of the concerns of this group regarding predictability. From his perch in the statistical domain, he reached out and was entrained into the stochastic–dynamic vein by an insightful presentation by Lorenz. He thoroughly investigated SD prediction with a simple nontrivial model and stimulated an important segment of the meteorological community.

The practical implementation of SD prediction had to lie dormant for more than a decade, but affordable parallel-processing computers did finally arrive. Epstein’s SD approximation was not attractive for operations, nor did he ever view it in that context. Yet the Monte Carlo method that he and Chuck Leith used as a standard in their preliminary work had wide appeal. In practice, the ensemble method has replaced Monte Carlo in operations; in essence, its design was dictated by the need to overcome sampling problems inherent in Monte Carlo. The current operational practice is substantively similar to the view Lorenz espoused in 1964. Yet the most appropriate way to perturb the models, the number of samples in the ensemble, the value of the multimodel approach, and the coupling of data assimilation with ensemble forecasting all remain to be determined. Work continues at a breakneck pace in this arena, and without doubt, operational NWP with an SD component is a solid fixture in meteorology’s future.

## Acknowledgments

I am most grateful for the lengthy written correspondence and oral history I have received from four of the protagonists in this scientific history: Akio Arakawa (written correspondence: 2 June 1992; 14 April 1997; and 28 August 2002), Edward Epstein (written correspondence: 26 July 2002), Edward Lorenz (oral history: 22 April 2002), and Philip Thompson (oral history: 19 May 1990). An important addition to these first-hand accounts is Chuck Leith’s oral history (2 July 1997). Paul Edwards conducted this interview and graciously allowed me to draw information from the associated transcript.

My initial efforts in this study lay dormant for several years, but Martin Ehrendorfer rekindled my interest with a stimulating talk on predictability at the Adjoint Workshop in April 2002. And Tim Palmer, scientific leader of the ensemble prediction effort at ECMWF, gave me access to many unpublished documents and personal notes related to his work in this field. He made me aware of the linkage between the current singular vector (SV) work and the little-known component of Lorenz’s earlier work (discussed in section 7c). His colleague, Roberto Buizza, supplied valuable information on the current state of affairs in operational ensemble weather prediction. Parts of his summaries appear in the tables.

My colleague S. Lakshmivarahan helped me interpret results from a number of papers referenced in this historical study, and the librarians at Desert Research Institute (Melanie Scott, John Ford, and Ginger Peppard) worked conscientiously to locate historical documents germane to ensemble prediction.

Finally, I salute the following meteorologists, pioneers in NWP and general circulation modeling, who provided me with letters of reminiscence and/or oral histories: Fred Bushby, Larry Gates, Akira Kasahara, Syukuro Manabe, Kikuro Miyakoda, Norman Phillips, Katsuyuki Ooyama, Fred Shuman, Joseph Smagorinsky, and Aksel Wiin-Nielsen.

Photo acquisition credit belongs to George Platzman (Fig. 3), Akira Kasahara (Fig. 4), and Edward Epstein (Fig. 5). Figure credit: Lennart Bengtsson (Fig. 7).

Assistance with the electronic publication process was generously provided by Domagoj Podnar and Ming Xiao.

A special thanks to Chief Editor Dave Jorgensen for advice on structuring the paper such that it served both as a review and historical paper. Then, Editor Dave Schultz chose four knowledgeable and insightful reviewers whose hard work went far to improve the draft manuscript.

## REFERENCES

AMS, 1968: *Proceedings of the First Statistical Meteorology Conference*. Hartford, CT, Amer. Meteor. Soc., 179 pp.

Anderson, J., 1996: Selection of initial conditions for ensemble forecasts in a simple model framework. *J. Atmos. Sci.*, **53**, 22–36.

Arakawa, A., 1966: Computational design for long-term numerical integration of the equations of atmospheric motion. *J. Comput. Phys.*, **1**, 119–143.

Arakawa, A., 1997: Cumulus parameterization: An ever-challenging problem in tropical meteorology and climate modeling. Preprints, *22d Conf. on Hurricanes and Tropical Meteorology*, Fort Collins, CO, Amer. Meteor. Soc., 7–12.

Arakawa, A., 2000: A personal perspective on the early years of general circulation modeling at UCLA. *General Circulation Model Development (Past, Present, and Future)*, D. Randall, Ed., International Geophysical Series, Vol. 70, Academic Press, 1–65.

Barkmeijer, J., 1996: Constructing fast-growing perturbations for the nonlinear regime. *J. Atmos. Sci.*, **53**, 2838–2851.

Bengtsson, L., 1999: From short-range barotropic modelling to extended-range global prediction: A 40-year perspective. *Tellus*, **51A-B**, 13–32.

Bengtsson, L., 2004: Global weather prediction—Possible developments in the next decades. *Extended Abstracts, 50th Anniversary of Operational Numerical Weather Prediction*, Adelphi, MD, Amer. Meteor. Soc., 23 pp.

Bohm, D., 1957: *Causality and Chance in Modern Physics*. Harper and Bros., 170 pp.

Buizza, R., and T. Palmer, 1995: The singular vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456.

Buizza, R., P. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. *Mon. Wea. Rev.*, **133**, 1076–1097.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724.

Charney, J., 1947: The dynamics of long waves in a baroclinic westerly current. *J. Meteor.*, **4**, 135–162.

Charney, J., 1948: On the scale of the atmospheric motions. *Geofys. Publ.*, **17** (2).

Charney, J., 1955: The use of the primitive equations of motion in numerical weather prediction. *Tellus*, **7**, 22–26.

Charney, J., R. Fjørtoft, and J. von Neumann, 1950: Numerical integration of the barotropic vorticity equation. *Tellus*, **2**, 237–254.

Committee on the Atmospheric Sciences, 1966: The feasibility of a global observation and analysis experiment. NAS-NRC Publication 1290, Washington, DC, 172 pp. [Published in abbreviated form in *Bull. Amer. Meteor. Soc.*, **47**, 200–220.]

Corby, G., A. Gilchrist, and P. Rountree, 1977: United Kingdom Meteorological Office five-level general circulation model. *Methods Comput. Phys.*, **17**, 67–110.

Eady, E., 1949: Long waves and cyclone waves. *Tellus*, **1**, 33–52.

Eady, E., 1951: The quantitative theory of cyclone development. *Compendium of Meteorology*, T. Malone, Ed., Amer. Meteor. Soc., 464–469.

ECMWF, 2004: Representing model uncertainty in weather and climate prediction. Item 7.2 of 33d Session of the Scientific Advisory Committee, 16 July 2004, 27 pp.

Edwards, P., 2000: A brief history of atmospheric general circulation modeling. *General Circulation Model Development (Past, Present, and Future)*, D. Randall, Ed., Academic Press, 67–90.

Ehrendorfer, M., 1997: Predicting the uncertainty of numerical weather forecasts: A review. *Meteor. Z.*, **6**, 147–183.

Eliassen, A., 1962: Predictability. *Proc. Int. Symp. on Numerical Weather Prediction*, Tokyo, Japan, Meteorological Society of Japan, 644–646.

Epstein, E., 1969: Stochastic dynamic prediction. *Tellus*, **21**, 739–759.

Evensen, G., and P. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. *Mon. Wea. Rev.*, **124**, 85–96.

Farrell, B., 1990: Small error dynamics and the predictability of atmospheric flows. *J. Atmos. Sci.*, **47**, 2191–2199.

Fisher, M., and P. Courtier, 1995: Estimating the covariance matrices of analysis and forecast error in variational data assimilation. ECMWF Research Department Tech. Memo. 220, 28 pp. [Available from ECMWF, Shinfield Park, Reading RG29AX, United Kingdom.]

Fjørtoft, R., 1952: On a numerical method of integrating the barotropic vorticity equation. *Tellus*, **4**, 179–194.

Fleming, R., 1971: On stochastic-dynamic prediction. *Mon. Wea. Rev.*, **99**, 851–872.

GARP, 1969: GARP topics. *Bull. Amer. Meteor. Soc.*, **50**, 136–141.

Gauss, C., 1963: *Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections*. Dover, 326 pp. [Originally published in 1809 as *Theoria Motus Corporum Coelestium in Sectionibus Conicus solem Ambientium*; first translated in 1857 by C. H. Davis and published by Little, Brown, and Co.]

Gill, A., 1982: *Atmosphere-Ocean Dynamics*. Academic Press, 662 pp.

Gillispie, C., 1981: *Dictionary of Scientific Biography*. 18 vols., Scribner.

Gleeson, T., 1968: A modern physical basis for meteorological prediction. *Proc. First Statistical Meteorology Conf.*, Hartford, CT, Amer. Meteor. Soc., 1–10.

Hamill, T., S. Mullen, C. Snyder, Z. Toth, and D. Baumhefner, 2000: Ensemble forecasting in the short to medium range: Report from a workshop. *Bull. Amer. Meteor. Soc.*, **81**, 2653–2664.

Hamill, T., C. Snyder, and J. Whitaker, 2003: Ensemble forecasts and the properties of flow-dependent analysis-error covariance singular vectors. *Mon. Wea. Rev.*, **131**, 1741–1758.

Hinkelmann, K., 1951: Der Mechanismus des meteorologischen Lärmes. *Tellus*, **4**, 285–296.

Hinkelmann, K., 1959: Ein numerisches Experiment mit den primitiven Gleichungen. *The Atmosphere and Sea in Motion*, B. Bolin and E. Eriksson, Eds., Rockefeller Institute Press, 486–500.

Hoffman, R., and E. Kalnay, 1983: Lagged average forecasting, an alternative to Monte Carlo forecasting. *Tellus*, **35A**, 100–118.

Houtekamer, P., L. Lefaivre, J. Derome, H. Richie, and H. Mitchell, 1996: A system simulation approach to ensemble prediction. *Mon. Wea. Rev.*, **124**, 1225–1242.

Kolmogorov, A., 1941: Interpolation and extrapolation. *Bull. Acad. Sci., USSR Ser. Math.*, **5**, 3–14.

Lacarra, J., and O. Talagrand, 1988: Short-range evolution of small perturbations in a barotropic model. *Tellus*, **40A**, 81–95.

Legras, B., and R. Vautard, 1996: A guide to Lyapunov vectors. *Proc. ECMWF Seminar on Predictability*, Vol. I, Reading, United Kingdom, ECMWF, 143–156.

Leith, C., 1965: Numerical simulation of the earth’s atmosphere. *Methods in Computational Physics*, Vol. 4, B. Adler, S. Fernbach, and M. Rotenberg, Eds., Academic Press, 1–28.

Leith, C., 1971: Atmospheric predictability and two-dimensional turbulence. *J. Atmos. Sci.*, **28**, 145–161.

Leith, C., 1974: Theoretical skill of Monte Carlo forecasts. *Mon. Wea. Rev.*, **102**, 409–418.

Leith, C., 1978: Predictability of climate. *Nature*, **276**, 352–355.

Leith, C., 1997: Oral history interview. Interviewed by Paul Edwards at Stanford University, 2 July 1997. [Copy on file at Center for the History of Physics, American Institute of Physics, College Park, MD 20740.]

Leith, C., and R. Kraichnan, 1972: Predictability of turbulent flows. *J. Atmos. Sci.*, **29**, 1041–1058.

Lewis, J., 1996: Philip Thompson: Pages from a scientist’s life. *Bull. Amer. Meteor. Soc.*, **77**, 107–113.

Lewis, J., 1998: Clarifying the dynamics of the general circulation: Phillips’s 1956 experiment. *Bull. Amer. Meteor. Soc.*, **79**, 39–60.

Lilly, D., 1997: Introduction to “Computational design for long-term numerical integration of the equations of fluid motion: Two-dimensional incompressible flow. Part I.” *J. Comput. Phys.*, **135**, 101–102.

Lorenz, E., 1960: Maximum simplification of the dynamic equations. *Tellus*, **12**, 243–254.

Lorenz, E., 1962: The statistical prediction of solutions of dynamic equations. *Proc. Int. Symp. on Numerical Weather Prediction*, Tokyo, Japan, Meteorological Society of Japan, 629–634.

Lorenz, E., 1963: Deterministic nonperiodic flow. *J. Atmos. Sci.*, **20**, 130–141.

Lorenz, E., 1965a: On the possible reasons for long-period fluctuations of the general circulation. *Proc. WMO-IUGG Symp. on Research and Development Aspects of Long-Range Forecasting*, Boulder, CO, World Meteorological Organization, WMO Tech. Note 66, 345 pp.

Lorenz, E., 1965b: A study of the predictability of a 28-variable atmospheric model. *Tellus*, **17**, 321–333.

Lorenz, E., 1968: On the range of atmospheric predictability. *Proc. First Statistical Meteorology Conf.*, Hartford, CT, Amer. Meteor. Soc., 11–19.

Lorenz, E., 1969a: Atmospheric predictability as revealed by naturally occurring analogues. *J. Atmos. Sci.*, **26**, 636–646.

Lorenz, E., 1969b: The predictability of a flow which possesses many scales of motion.

,*Tellus***21****,**289–307.Lorenz, E., 1982: Atmospheric predictability experiments with a large numerical model.

,*Tellus***34****,**505–513.Lorenz, E., 1993:

*The Essence of Chaos*. University of Washington Press, 319 pp.Metropolis, N., and S. Ulam, 1949: The Monte Carlo method.

,*J. Amer. Stat. Assoc.***44****,**335–341.Miyakoda, K., and O. Talagrand, 1971: The assimilation of past data in dynamical analysis. I.

,*Tellus***23****,**310–317.Miyakoda, K., R. Strickler, and G. Hembree, 1970: Numerical simulation of the breakdown of the polar-night vortex in the stratosphere.

,*J. Atmos. Sci.***27****,**139–154.Miyakoda, K., J. Sirutis, and J. Plosay, 1986: One month forecast experiments—Without anomaly boundary forcings.

,*Mon. Wea. Rev.***114****,**2363–2401.Molteni, F., R. Buizza, T. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and validation.

,*Quart. J. Roy. Meteor. Soc.***122****,**73–119.Mureau, R., F. Molteni, and T. Palmer, 1993: Ensemble prediction using dynamically conditioned perturbations.

,*Quart. J. Roy. Meteor. Soc.***119****,**299–323.Murphy, J., and T. Palmer, 1986: Experimental monthly long-range forecast by an ensemble of numerical integrations.

,*Meteor. Mag.***115****,**337–349.Novikov, E., 1959: On the problem of predictability of synoptic processes.

,*Izv. Acad. Sci. USSR, Geophys. Ser.***11****,**1209–1211.Palmer, T., 1988: Medium and extended range predictability and stability of the Pacific/North American mode.

,*Quart. J. Roy. Meteor. Soc.***114****,**691–713.Palmer, T., 1993a: Extended-range atmospheric prediction and the Lorenz model.

,*Bull. Amer. Meteor. Soc.***74****,**49–65.Palmer, T., 1993b: Ensemble prediction.

*Proc. 1992 ECMWF Seminar*, Reading, United Kingdom, ECMWF. [Available from ECMWF, Shinfield Park, Reading RG29AX, United Kingdom.].Pfeffer, R., 1960:

*Dynamics of Climate—Proceedings of a Conference on the Application of Numerical Integration Techniques to the Problem of the General Circulation*. Pergamon Press, 137 pp.Phillips, N., 1956: The general circulation of the atmosphere: A numerical experiment.

,*Quart. J. Roy. Meteor. Soc.***82****,**123–164.Phillips, N., 1959: An example of non-linear computational instability.

*The Atmosphere and Sea in Motion*, B. Bolin and E. Eriksson, Eds., Rockefeller Institute Press, 501–504.Pitcher, E., 1977: Application of stochastic dynamic prediction to real data.

,*J. Atmos. Sci.***34****,**1–21.Platzman, G., 1979: The ENIAC computations of 1950—Gateway to numerical weather prediction.

,*Bull. Amer. Meteor. Soc.***60****,**302–312.Richardson, L., 1965:

*Weather Prediction by Numerical Process*. Dover, 236 pp.Saltzman, B., 1962: Finite amplitude free convection as an initial value problem—I.

,*J. Atmos. Sci.***19****,**329–341.Shukla, J., 1981: Dynamical predictability of monthly means.

,*J. Atmos. Sci.***38****,**2547–2572.Simmons, A., and A. Hollingsworth, 2002: Some aspects of the improvements in skill of numerical weather prediction.

,*Quart. J. Roy. Meteor. Soc.***128****,**647–677.Simmons, A., R. Mureau, and T. Petroliagis, 1995: Error growth and predictability estimates for the ECMWF forecasting system.

,*Quart. J. Roy. Meteor. Soc.***121****,**1739–1771.Smagorinsky, J., 1963: General circulation experiments with the primitive equations. I. The basic experiment.

,*Mon. Wea. Rev.***91****,**99–164.Smagorinsky, J., 1969: Problems and promises of deterministic extended range forecasting.

,*Bull. Amer. Meteor. Soc.***50****,**286–311.Smagorinsky, J., 1971: Oral history interview. R. Mertz, Interviewer, National Museum of American History, Smithsonian Institution, Washington, DC, 100 pp.

Smagorinsky, J., 1978: History and progress. The Global Weather Experiment—Perspectives on Implementation and Exploitation, FGGE Advisory Panel Rep., National Academy of Sciences, Washington, DC, 4–12.

Smagorinsky, J., 1983: The beginnings of numerical weather prediction and general circulation modeling: Early recollections.

*Advances in Geophysics*, Vol. 25, Academic Press, 3–37.Thompson, P., 1957: Uncertainty of initial state as a factor in predictability of large-scale atmospheric flow patterns.

,*Tellus***9****,**275–295.Thompson, P., 1983: A history of numerical weather prediction in the United States.

,*Bull. Amer. Meteor. Soc.***84****,**755–769.Thompson, P., 1985: Prediction of probable errors in predictions.

,*Mon. Wea. Rev.***113****,**248–259.Thompson, P., 1987: Oral history interview. A. Kasahara and J. Tribbia, Interviewers, 52 pp. [Available from NCAR Archives, P.O. Box 3000, Boulder, CO 80303.].

Thompson, P., and E. Lorenz, 1986: Dialogue between Phil Thompson and Ed Lorenz on 31 July 1986. N. Gauss, Moderator, Amer. Meteor. Soc. Tape Recorded Interview Project, 19 pp. [Available from NCAR Archives, P.O. Box 3000, Boulder, CO 80303.].

Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations.

,*Bull. Amer. Meteor. Soc.***74****,**2317–2330.Tracton, S., S. Kalnay, and E. Kalnay, 1993: Operational ensemble prediction at the National Meteorological Center: Practical aspects.

,*Wea. Forecasting***8****,**379–398.von Neumann, J., 1955: Some remarks on the problem of forecasting climate fluctuations.

*Dynamics of Climate*, R. Pfeffer, Ed., Pergamon Press, 9–11.Wallace, J., and D. Gutzler, 1981: Teleconnections in the geopotential height field during the Northern Hemisphere winter.

,*Mon. Wea. Rev.***109****,**785–812.Whittaker, E., 1937:

*A Treatise on the Analytical Dynamics of Particles and Rigid Bodies (With an Introduction to the Problem of Three Bodies)*. Dover, 456 pp.Wiener, N., 1947:

*Extrapolation, Interpolation, and Smoothing of Stationary Time Series*. Technology Press and John Wiley and Sons, 163 pp.Wiener, N., 1956: Nonlinear prediction and dynamics.

*Proceeding of the Third Berkeley Symposium on Mathematics, Statistics, and Probability*, Vol. III, University of California Press, 247–252.Wiin-Nielsen, A., 1991: The birth of numerical weather prediction.

,*Tellus***43AB****,**36–52.WMO, 1965:

*WMO–IUGG Symposium on Research and Development Aspects of Long-Range Forecasting.*Boulder, CO, World Meteorological Organization, Tech. Note 66, 345 pp.

## APPENDIX

### Genealogy of Ensemble Prediction

The development of ensemble weather prediction in the late twentieth century can be traced to several fundamental lines of research in the history of science. A schematic diagram depicting these lines is presented in Fig. A1. Support for construction of the chart generally comes from information in the *Dictionary of Scientific Biography* (Gillispie 1981).

Ensemble forecasting combines deterministic prediction with probability, and thus two fountainheads appear at the top of the chart: one associated with the work of Isaac Newton, the other with that of Blaise Pascal and Pierre de Fermat (the chronology of the scientific work is depicted by a timeline on the right side of the schematic diagram). Laplace became the champion of determinism, or the mechanistic view, and he and contemporaries such as Euler, Lagrange, and Gauss placed determinism on what then seemed to be the firmest of foundations. Gauss and Laplace also made important contributions to the theory of observational errors, which justifies their connection with the probability line.

By the late 1800s, limitations of the deterministic view began to appear, most notably through the work of James Maxwell, Ludwig Boltzmann, and J. Willard Gibbs, who offered statistical explanations for the laws of thermodynamics and other macroscopic properties of matter in place of the traditional deterministic approach (Bohm 1957). Thus, on the schematic, we show determinism and probability linking to give rise to work in statistical thermodynamics. At about this same time, Henri Poincaré began a heightened mathematical exploration of determinism—for example, investigations into the existence of solutions to problems in mechanics such as the three-body problem; this line of research came to be called dynamical systems. The Harvard mathematician G. D. Birkhoff was the primary successor of Poincaré in this branch of mathematics. The work of Edward Lorenz in the late 1950s and 1960s, research related to the extreme sensitivity of certain nonperiodic systems (later referred to as chaotic systems), established his position in this line. Lorenz had studied under Birkhoff in the early 1940s, though not in the area of dynamical systems. Lorenz’s wartime experience as a weather forecaster and a subsequent conversation with Henry Houghton, head of MIT’s Meteorology Department, inclined him toward a career in meteorology (E. Lorenz 2002, personal communication).

In addition to the field of dynamical systems, the line of research that came to have great bearing on ensemble forecasting was stochastic–dynamic prediction, an approach that stemmed from mathematical issues related to Brownian motion and statistical thermodynamics, among other processes that exhibited randomness (Bohm 1957). The foundations of this field of study are associated with the names of Alexei Kolmogorov and Norbert Wiener.

Running alongside dynamical systems, we show the line of dynamical weather prediction that stems from the deterministic laws of Newton coupled with the deterministic laws of thermodynamics established by mid-nineteenth century. With the advent of digital computers, these ideas led to operational numerical weather prediction and shortly thereafter to numerical experiments that explored the atmosphere’s general circulation.

On the far right of the schematic, we identify the Monte Carlo method, a probabilistic approach that found wide application in the investigation of branching processes associated with nuclear bombardment. It, of course, is traced back to the Manhattan Project and ultimately to quantum mechanics, the fundamental break with Newtonian mechanics. Stanislaw Ulam, mathematician and colleague of Los Alamos National Laboratory physicists during World War II, was the originator of this probabilistic approach.

The element in the schematic labeled GCMs/GARP, with subtitles of “feasibility” and “long-range forecasting,” signifies the point in time when the limits of determinism in weather prediction came into focus. These issues have been explored in the main body of this paper. The stochastic–dynamic approach to weather forecasting associated with the work of Edward Epstein lay outside the mainstream of extended-range forecasting (thus the bypass around the GCM box). Yet Epstein’s work significantly influenced the thinking of the dynamicists. The issues of atmospheric predictability—especially the limits to deterministic prediction that result from uncertainty in the initial state, unresolved scales, and the extreme sensitivity of model output to these uncertainties—made it clear that a stochastic–dynamic approach was justified. With the availability of the parallel-processing computers of the 1980s and 1990s, the ensemble forecast, a forecast based on a variant of the Monte Carlo method, became a fixture of NWP.

Milestones for the EPS at ECMWF. (Courtesy of R. Buizza, ECMWF.) HRES: horizontal resolution of forecast model (number of spectral components); VRES: vertical resolution (number of vertical levels in forecast model); MEM: number of members in ensemble. In 2004, the SV sampling strategy was changed.

Characteristics of EPS worldwide. (Courtesy of R. Buizza, ECMWF.) FNMOC: Fleet Numerical Meteorology and Oceanography Center; KMA: Korean Meteorological Administration; JMA: Japan Meteorological Agency; AS: analysis cycle (see Houtekamer et al. 1996); BV: bred vectors; EOF: empirical orthogonal functions; SV: singular vectors; HRES: horizontal resolution (number of spectral components); MEM: number of members in ensemble; VRES: number of vertical levels.

^{1} Information inserted into the quotations by the author is bracketed.

^{2} Eady was awarded the Ph.D. in 1948 by the Department of Mathematics, Imperial College, London, United Kingdom. The title of his unpublished dissertation, “The theory of development in dynamical meteorology,” is essentially the same as that of Eady (1949).

^{3} After a careful reading of Wiener (1956) at a later time, Lorenz realized that the statistical meteorologists had misinterpreted Wiener (E. Lorenz 2002, personal communication).

^{4} In the 1970s, the OSSE became a standard method of testing the sensitivity of model forecasts to the input of observations, oftentimes in the form of model-generated data.

^{5} This contribution (Lorenz 1963), using a truncated form of the atmospheric convection equations of Barry Saltzman (Saltzman 1962), laid the foundation for the field of chaotic systems.