Attribution of Seasonal Wildfire Risk to Changes in Climate: A Statistical Extremes Approach

Troy P. Wixson, Colorado State University, Fort Collins, Colorado

and
Daniel Cooley, Colorado State University, Fort Collins, Colorado


Abstract

Wildfire risk is greatest during high winds after sustained periods of dry and hot conditions. This paper is a statistical extreme-event risk attribution study that aims to answer whether extreme wildfire seasons are more likely now than under past climate. This requires modeling temporal dependence at extreme levels. We propose the use of transformed-linear time series models, which are constructed similarly to traditional autoregressive–moving-average (ARMA) models while having a dependence structure that is tied to a widely used framework for extremes (regular variation). We fit the models to the extreme values of the seasonally adjusted fire weather index (FWI) time series to capture the dependence in the upper tail for past and present climate. We simulate 10 000 fire seasons from each fitted model and compare the proportion of simulated high-risk fire seasons to quantify the increase in risk. Our method suggests that the risk of experiencing an extreme wildfire season in Grand Lake, Colorado, under current climate has increased dramatically relative to the risk under the climate of the mid-twentieth century. Our method also finds some evidence of increased risk of extreme wildfire seasons in Quincy, California, but large uncertainties do not allow us to reject a null hypothesis of no change.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Troy P. Wixson, twixson@rams.colostate.edu

1. Introduction

The Sixth Assessment Report of the IPCC suggests, with high confidence, that climate change has led to warmer and drier conditions, which have increased wildfire risk in North America (Hicke et al. 2022). These worsening conditions have led to “increased burned area in recent decades in western North America” and thus in “the USA, annual costs of federal wildland fire suppression have increased by a factor of 4 since 1985” (Hicke et al. 2022, p. 1948). These national and regional trends have been echoed at state levels. The Colorado Division of Fire Prevention and Control reports that “20 of 20 largest wildfires have occurred in the last 20 years (since 2001)” and the three largest fires, which burned more than 500 000 acres, occurred in 2020 (State of Colorado 2023). Data from the California Department of Forestry and Fire Protection (State of California 2023) show that 18 of the 20 largest fires, 18 of the 20 most destructive fires, and 11 of the 20 most deadly fires in California history occurred between 2003 and 2021. Abatzoglou and Williams (2016) state that the “increased forest fire activity across the western continental United States (US) in recent decades has likely been enabled by a number of factors, including the legacy of fire suppression and human settlement, natural climate variability, and human-caused climate change.” The focus of this study is the climate signal; we do not consider questions of forest management, fuel availability, or the impact of more people in the wildland–urban interface. This paper is an extreme-event risk attribution study that aims to quantify how much more likely these extreme fire seasons are now than they were previously, due to observed changes in climate.

To narrow our focus to the climate signal we use the well-recognized Canadian forest fire weather index (FWI) as the object of this study (Van Wagner 1987). The FWI system is a series of equations that takes month, latitude, 24-h precipitation, noon wind speed, noon relative humidity, and noon temperature as inputs. These location and weather variables are used daily to compute the fine fuel moisture code (FFMC), duff moisture code (DMC), and drought code (DC), which represent dryness, and thus fuel availability, at differing levels of depth and time scales. The FFMC is combined with wind speed to compute the initial spread index (ISI). The buildup index (BUI), which represents fuel availability, is computed from the two longer range moisture codes (DMC and DC). Daily FWI values are computed from the ISI and BUI. The FWI was designed to represent the energy that would be released at the edge of a fire but can be interpreted in several ways including fire intensity risk. FWI is based on weather variables and thus changes discovered in the analysis are due to changes in observed climate.

The 153-day time series of FWI values for the 2020 fire season near Grand Lake, Colorado, is shown in Fig. 1. Highlighted are the 14 days with FWI values above the 0.975 empirical quantile of present climate. Each highlighted, high-risk day was the result of processes like the drying of fuels through extended periods of low precipitation coupled with hot days and high winds. The time-dependent nature of wildfire risk makes it clear that simple estimation of some high marginal quantile would not capture the phenomenon we are concerned with. Wildfire risk, though quantifiable on smaller time scales (e.g., daily), is more sensibly thought of as a seasonal quantity and thus our aim is to model entire wildfire seasons.

Fig. 1. Daily FWI values from 2020, ERA5 data near Grand Lake.

Citation: Journal of Applied Meteorology and Climatology 62, 11; 10.1175/JAMC-D-23-0072.1

We perform statistical attribution; a statistical model for entire wildfire seasons is built and fit to each climate period (past and present). The fitted models are used to simulate wildfire seasons. The proportions of simulated high-risk seasons from each period are compared to make our attribution statement. We consider a season to be high risk if it has at least as many days above a high quantile as were observed in the most extreme year in the region. These years, 2020 in Colorado and 2021 in California, are used to define high-risk seasons so that communication of results is simple and relatable.

This study uses a reanalysis product and weather station data. Use of these data sources allows us to directly make within-product comparisons as we can use the 2020 (2021) season from each source to define high risk. However, these data sources could have confounding factors. Reanalysis products are a data assimilation of model output and global observational data from ground sensors and satellites, but satellite data are only available in the present period. Weather station data collection methods transitioned from manual to automated measurements during the study period and there may have been changes in the location of the measurements. Additionally, weather station data have the usual challenges inherent in true observational data; missing observations, short historical records for some variables, and the noisiness expected with truly local measurements.

An alternative and often used attribution approach employs climate models that were run under factual and counterfactual worlds (i.e., without anthropogenic forcings). Our year-specific definition of high risk makes it difficult to use climate model output in the estimation of the proportion of high-risk seasons. Comparing climate model output with observed conditions in 2020 (2021) would require accounting for model bias but model bias is not well understood for FWI. A recent attribution approach studies specific events using custom climate model runs that were set up to adequately reproduce the event [e.g., Patricola and Wehner (2018) study 15 tropical cyclones with an emphasis on hurricane Katrina]. These studies require extensive computing resources and, to our knowledge, none have studied season-long events. An advantage of our method is its simplicity; we use available data. A disadvantage is that our attribution statement can only reference differences between past and present climate and cannot speak causally about an anthropogenic effect.

This study compares the proportion of high-risk seasons expected under past climate to the proportion expected under present climate. We define past climate as the 20 earliest available years (1959–78) and present climate as the years from 2002 to 2021. Our definitions of past and present climate do not consider the potential effects of cyclical climate patterns (ENSO, etc.). Additionally, each climate period only has 20 observed seasons and thus uncertainty (computed through bootstrapping) is large. Within each climate period we consider a fire season to be 1 June through 31 October (153 days). We first consider an area north of Grand Lake that was burned in the East Troublesome Fire in 2020 and then repeat the analysis on a region just outside the burn area of the 2020 North Complex and 2018 Camp Fires in California.

We use time series models to capture dependence throughout the fire season. Classical linear time series models [like autoregressive–moving-average (ARMA) models] are based on the autocovariance function, which averages the linear relationship around the mean and thus may not accurately capture the dependence at extreme levels. We characterize tail dependence with the tail pairwise dependence function (TPDF) and employ the transformed-linear time series models for extreme values of Mhatre and Cooley (2021). These models resemble the familiar ARMA models from classical time series but are tied to regular variation (a common framework in extremes) and thus are a natural choice for this study.

This attribution study fits into the extreme event attribution (EEA) framework of Jézéquel et al. (2018) as a “risk based” study because we focus on computing an increase in risk rather than explaining the link between climate change and the physical processes that led to the event. We compute how much more likely a high-risk season is now than in a past climate and thus are interested in a “class of events.” This differs from studies that are focused on computing the probability that a single event (i.e., the 2020 fire season in Colorado) was caused by climate change. Although some studies are conditional on sea surface temperature, greenhouse gas concentrations, and so on, our study uses the observational record to compute the changes in risk based on any detectable changes in climate and thus is considered “unconditional.”

This paper is organized as follows: We review statistical preliminaries and introduce the models in section 2. In section 3 we explain our method in detail as it is applied to ERA5 data from one location inside the East Troublesome burn area in Colorado. We discuss the data and preprocessing in section 3a, estimation of the pairwise dependence and model fitting in section 3b, and simulation from the models and attribution in section 3c. In section 4 we apply our method to weather station observations [Remote Automatic Weather Station (RAWS) data] from the same location as in section 3. In section 5 we apply our method to ERA5 data from a location near the 2020 North Complex Fire in California. We conclude the paper with discussion of the benefits and some limitations of our method.

2. Statistical preliminaries

a. Regular variation and dependence

To model dependence in the upper tail of our time series, we rely on the framework of regular variation. Regular variation is a common mathematical framework in the study of extremes as its dependence structure is found in characterizations of multivariate extreme value distributions, and its definition only describes the joint tail behavior. Additionally, regular variation is a useful framework for capturing asymptotic dependence (Coles 2001, section 8.4), which loosely implies that variables can be at their most extreme levels at the same time. Our FWI time series appears to exhibit asymptotic dependence at short lags.

A p-dimensional random vector must have heavy tails in each dimension to be a multivariate regularly varying random vector. The decay rate of those tails is described by the tail index α. The distribution of a regularly varying random vector at infinity can be decomposed into independent radial (distance from the origin) and angular (the point where the vector intersects the unit ball) components (Fig. 2). We use this near-independent (in the large but not limiting case) decomposition to describe the pairwise tail dependence between the dimensions of the random vector. Consider a two-dimensional random vector $X = (X_1, X_2)$ that has positive components with probability 1 (i.e., the random vector takes values in the first quadrant of the x–y plane). Further assume that $X$ is large in at least one component. Information about the tail dependence between components of the vector is contained in the angular measure $H_X$, which describes the distribution of angles (between 0° and 90°). If there is strong tail dependence we would expect that when $X_1$ is extreme $X_2$ will also be extreme. In this case $H_X$ will have mass near 45°, and we would expect to see points like A in Fig. 2. When the tail dependence is weak (or zero), knowing that $X_1$ is extreme tells us little about $X_2$. This suggests that $X_2$ will likely be in the bulk of its distribution (which is near the axis because of the distribution's heavy tail), and we would expect extreme points to be like point B in Fig. 2. In this case the mass of the angular distribution $H_X$ is concentrated near the axes.

Fig. 2. Polar decomposition of points A and B. The radial component is the norm of the point and the angular component places the point on the unit ball. We expect points like A (large together) under strong tail dependence and points like B (large separately) under weak tail dependence.


Tail dependence of a p-dimensional $X$ continues to be described by its angular measure $H_X$, but estimating or modeling this measure, which lies on the p-dimensional unit ball, becomes increasingly difficult as p grows. Multivariate models tend to be employed in moderate (p ≈ 5) dimensions.

When analyzing time series, we treat the vector of observations as a single (partial) realization of an infinite-dimensional random vector. In our analysis we consider each season as an observation from the climate of that period, and thus we have 20 (partial) realizations of length 153 (the number of days in our definition of a fire season). The advantage of taking a time series approach over simply viewing the data as realizations of a 153-dimensional random vector is that we can characterize and model dependence as a function of lag (time difference between variables). Classical (nonextreme) time series analysis generally assumes a time series is weakly stationary and restricts focus to characterizing only pairwise dependencies through the autocovariance function.

Here, we summarize the angular measure using a pairwise metric analogous to the autocovariance function, but which characterizes tail behavior. Our pairwise metric is the TPDF (1), which is the time series analog to the tail pairwise dependence matrix (TPDM) introduced by Cooley and Thibaud (2019) and employed by Jiang et al. (2020) to perform extremal principal component analysis for U.S. precipitation. Let the time series $\{X_t\}$ be regularly varying with tail index α = 2 for all t = 1, 2, … and tail stationary (Mhatre and Cooley 2021); that is, the TPDF is a function of lag only. Define the unit ball in the positive orthant $S_1^+ = \{x \in \mathbb{R}^2 : x_1, x_2 \ge 0 \text{ and } \|x\|_2 = 1\}$ (where $\|\cdot\|_2$ is the Euclidean norm) and, for each lag h, let the radial component be $r_t = \|(x_t, x_{t+h})\|_2$. The TPDF is

$$
\sigma(h) = \sigma(X_t, X_{t+h}) = \int_{S_1^+} \frac{x_t}{r_t}\,\frac{x_{t+h}}{r_t}\, dH_{X_t, X_{t+h}}(s). \tag{1}
$$

Although regular variation assumes that the data are heavy tailed, it can be used as a dependence model for data that are not heavy tailed. We will transform data such as the FWI data so that they can be assumed to come from a tail-stationary regularly varying time series with tail index α = 2. This idea is not uncommon in classical extreme value analysis, where characterizations of the extreme value distributions are typically made assuming a particular marginal distribution (de Haan and Ferreira 2006, section 6.1.2), and is similar in spirit to transforming time series data (such as applying a square root or logarithmic transformation) to make them appear more Gaussian in order to fit traditional time series models. Our assumption that α = 2 is made for convenience. Kiriliouk and Zhou (2022) recently extended the definition of the TPDM to a general tail index, but that definition includes α in the integrand.

b. Transformed-linear time series models for extreme values

We want models specifically designed to fit the upper tail as this is when fire risk becomes a concern. The ARMA-like transformed-linear extremes time series models of Mhatre and Cooley (2021) are constructed with the transformed-linear operations of Cooley and Thibaud (2019) so that they only take on positive values. Transformed-linear operations are defined component-wise and involve a map, f, from the real line to the positive half-line. For any two vectors in the positive orthant, $X_1, X_2 \in \mathbb{R}_+^p$, transformed-linear addition, denoted ⊕, is performed by mapping the components of $X_1$ and $X_2$ back to the real line, adding the two vectors, and then transforming the components back to the half-line: $X_1 \oplus X_2 = f\{f^{-1}(X_1) + f^{-1}(X_2)\}$. Transformed-linear multiplication, denoted ∘, works similarly. We use the function $f(x) = \log\{1 + \exp(x)\}$ as it has a negligible effect on the upper tail [i.e., $\lim_{x \to \infty} f(x)/x = 1$] and thus regular variation is preserved under these operations (Cooley and Thibaud 2019).
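The transformed-linear operations described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and scalar multiplication is written as a ∘ x = f{a f⁻¹(x)}, following the definition in Cooley and Thibaud (2019).

```python
import math


def f(y):
    """Softplus map from the real line to the positive half-line: log(1 + exp(y))."""
    # Numerically stable form: max(y, 0) + log(1 + exp(-|y|)).
    return max(y, 0.0) + math.log1p(math.exp(-abs(y)))


def f_inv(x):
    """Inverse of f for x > 0: log(exp(x) - 1)."""
    # Stable form: x + log(1 - exp(-x)).
    return x + math.log(-math.expm1(-x))


def tl_add(x1, x2):
    """Transformed-linear addition of two positive numbers."""
    return f(f_inv(x1) + f_inv(x2))


def tl_mul(a, x):
    """Transformed-linear multiplication of a positive number x by a real scalar a."""
    return f(a * f_inv(x))
```

For large arguments the operations behave like ordinary arithmetic (for example, `tl_add(50.0, 60.0)` is essentially 110), which is the sense in which f has a negligible effect on the upper tail, while results always remain positive even for negative scalars.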

Transformed-linear time series are constructed using transformed-linear operations in the place of classic arithmetic operations on a noise sequence $\{Z_t\}$ of independent, tail-stationary, regularly varying (α = 2) random variables. We say that $\{X_t\}$ is a transformed-linear extremes moving-average (MA)(q) process if, for all t,

$$
X_t = \bigoplus_{j=0}^{q} \theta_j \circ Z_{t-j}. \tag{2}
$$

For any transformed-linear ARMA(p, q) process,

$$
X_t \oplus (-\psi_1) \circ X_{t-1} \oplus \cdots \oplus (-\psi_p) \circ X_{t-p} = Z_t \oplus \theta_1 \circ Z_{t-1} \oplus \cdots \oplus \theta_q \circ Z_{t-q}, \tag{3}
$$

there exists an equivalent transformed-linear MA(∞) process. Because of this equivalence, and the ease of fitting transformed-linear MA(q) models for arbitrarily large q, we restrict our attention to the transformed-linear MA(q) processes that have TPDF

$$
\sigma(h) = \sum_{j=0}^{q} \theta_j^{(0)} \theta_{j+h}^{(0)}, \tag{4}
$$

where $a^{(0)} = \max(0, a)$ and $\theta_0 = 1$ (Mhatre and Cooley 2021).
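The TPDF of a transformed-linear MA(q) process is simple to evaluate from its coefficients. The sketch below is illustrative only; it assumes the convention that coefficients beyond order q are zero, and the function name and example coefficients are hypothetical.

```python
def ma_tpdf(theta, h):
    """Theoretical TPDF at lag h for MA coefficients theta = [theta_0, ..., theta_q].

    Negative coefficients are replaced by zero, i.e., a^(0) = max(0, a), and
    coefficients with index above q are treated as zero (an assumption here).
    """
    positive_part = [max(0.0, t) for t in theta]
    return sum(
        positive_part[j] * positive_part[j + h]
        for j in range(len(positive_part) - h)
    )


# Hypothetical MA(2) coefficients with theta_0 = 1.
example_theta = [1.0, 0.5, -0.3]
example_tpdf = [ma_tpdf(example_theta, h) for h in range(4)]
```

In this toy example the negative coefficient contributes nothing, so the TPDF is 1.25 at lag 0, 0.5 at lag 1, and zero from lag 2 onward.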

In classical time series, the innovations algorithm recursively computes the one-step predictors and uses the associated prediction errors (innovations) to estimate the MA coefficients at each step. The transformed-linear extremes innovations algorithm (Mhatre 2022) is similar and is used to fit the transformed-linear MA(q). Mhatre (2022) showed that even if the underlying model is not a transformed-linear model, use of the extremal innovations algorithm will result in a transformed-linear model with a TPDF that closely matches the TPDF of the underlying model; thus the pairwise dependence will match even if the model is not the correct one.

3. Application to ERA5 data in Colorado

a. Data and preprocessing

We first apply our method to a reanalysis data product (ERA5 from the Copernicus Climate Change Service; ECMWF 2023) on one grid box in Grand Lake. ERA5 data are produced on a globally complete, hourly, 30-km grid from 1959 to about two months prior to access. The grid box we considered includes latitude 40.27 and longitude −105.84, which is the location of a RAWS and is in the burn area of the East Troublesome Fire. Proximity to a weather station will allow us to repeat analyses performed on ERA5 data using observed weather for comparison (section 4). The weather variables downloaded were the hourly 10-m wind component in the eastern and northern directions (used to calculate wind speed), 2-m dewpoint temperature (used to calculate relative humidity), 2-m air temperature, and total precipitation. The time series of noontime measurements for wind speed, relative humidity, and temperature were combined with the time series of 24-h precipitation to compute daily FWI values for the season.
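As a rough illustration of how the downloaded variables relate to the FWI inputs, the sketch below computes wind speed from the two 10-m wind components and relative humidity from the 2-m temperature and dewpoint. The source does not say which humidity formula was used; the Magnus approximation and its coefficients here are an assumption, as are the function names.

```python
import math


def wind_speed(u10, v10):
    """Wind speed (m/s) from the eastward and northward 10-m wind components."""
    return math.hypot(u10, v10)


def relative_humidity(temp_k, dewpoint_k):
    """Relative humidity (%) from 2-m temperature and dewpoint in kelvin.

    Uses the Magnus approximation for saturation vapor pressure
    (an assumed choice; the paper does not state its formula).
    """
    a, b = 17.625, 243.04  # Magnus coefficients for temperatures in deg C
    t_c = temp_k - 273.15
    td_c = dewpoint_k - 273.15
    vapor = math.exp(a * td_c / (b + td_c))
    saturation = math.exp(a * t_c / (b + t_c))
    return 100.0 * vapor / saturation
```

The common prefactor of the Magnus formula cancels in the ratio, so only the exponential terms are needed.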

An early task in any time series analysis is the assessment of seasonality. It is reasonable to expect that wildfire risk changes as each wildfire season progresses (Fig. 3) and thus we must address the seasonality before assessing the dependence. We consider each day of the wildfire season to have its own distribution under two simplifying assumptions. First, we assume there is no meaningful trend within each defined climate period. Exploratory analysis does not provide evidence against this assumption. Second, we assume that the marginal (daily) distribution of FWI changes throughout the year in a smooth manner. This assumption, while untested, seems reasonable as we do not expect the distribution of fire risk on 15 July to be meaningfully different than the distribution on 16 July. This smoothness assumption allows us to borrow strength from nearby days in the estimation of each day’s marginal distribution without introducing much bias.

Fig. 3. Daily 0.975 quantile of the ERA5 FWI time series from Grand Lake for both past and present periods with daywise confidence bounds computed through bootstrapping. The confidence bounds correspond to 95% intervals for the differences between periods. Quantiles were computed by borrowing strength from surrounding days using a 29-day moving window. The marginal distribution at high quantiles seems to have shifted up in a nearly uniform fashion.


Seasonal behavior was explored with daily high quantile plots (0.975 in Fig. 3) that were computed and smoothed using a (2k + 1)-day moving window. Exploratory analysis led us to choose k = 14 resulting in a 29-day window (e.g., FWI values for 1–29 July are used to estimate quantiles for 15 July). As anticipated, seasonal behavior was evident and will need to be accounted for via marginal transformation. However, the bimodal subseasonality seen in high quantiles of the FWI in Fig. 3 was unexpected, and was apparent across a wide range of window sizes (smaller windows exhibited the predictable increase in noise). Investigating this bimodality further, we found that the high quantiles of the DMC exhibit similar bimodality. This high quantile bimodality is not as evident in the FFMC and is not in the DC. The DMC captures dryness of middepth fuels, retains moisture information for around 2 weeks, and precipitation less than 1.5 mm is considered too little to reach these fuels. Estimated precipitation quantiles (between 0.05 and 0.9) are largest between mid-July and early August, which suggests that precipitation large enough to dampen the DMC is more common at this time than during the rest of the season.
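The moving-window quantile estimate can be sketched as follows. This is a simplified illustration under stated assumptions: `seasons` is a hypothetical list of equal-length daily series (one per year), the window is truncated at the edges of the season rather than extended with out-of-season days, and the quantile uses linear interpolation between order statistics, which may differ from the estimator used in the paper.

```python
def empirical_quantile(values, p):
    """Empirical p-quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def windowed_quantile(seasons, p=0.975, k=14):
    """Daily p-quantile pooling all years over a (2k + 1)-day window.

    The window is truncated at the start and end of the season (an assumption).
    """
    n_days = len(seasons[0])
    quantiles = []
    for t in range(n_days):
        start, stop = max(0, t - k), min(n_days, t + k + 1)
        pooled = [value for season in seasons for value in season[start:stop]]
        quantiles.append(empirical_quantile(pooled, p))
    return quantiles
```

With 20 years and k = 14, each interior day pools 20 × 29 = 580 values, which is what makes a quantile as high as 0.975 estimable day by day.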

The marginal shift in high quantiles shown in Fig. 3 suggests an increase in fire risk; values at the past climate 0.975 quantile are now observed more frequently. We subsampled years for both periods and computed the 0.975 quantile for each day to test whether the quantiles were significantly different. Subsampling seasons allows the bootstrap to account for temporal dependence in the data. With each of these 500 bootstrap samples we computed the differences (for each day) of the quantiles and computed 95% bootstrap intervals for the difference. Only the intervals for 1–3 June, 17 June, 8–12 July, and 13–31 October contained zero. To illustrate uncertainty in Fig. 3, daywise confidence bands plotted were computed using the method of Goldstein and Healy (1995), and overlap of these confidence bands closely mimics the dates found by our hypothesis test. We will see that the tail dependence is similar between climate periods (Fig. 4), which suggests that most of the increase in risk that we will find is due to this marginal shift of extreme levels.

Fig. 4. Empirical and fitted TPDF for (top) past and (bottom) present periods of transformed Colorado ERA5 FWI time series. Our fitted model captures the empirical dependence well. The dependence appears to be similar between periods. We note that the plots appear nearly constant around 0.1 after 10–15 lags.


Quantile plots of the underlying subindices, moisture codes, and weather variables were explored to better understand the meteorological drivers contributing to this observed change in FWI. This shift toward more extreme values under current climate than under past climate is evident across BUI quantiles. ISI quantiles have also increased, but that increase fades in mid-October. These subindex increases can be traced back through the FFMC, DMC, and DC to an increase in temperatures and a decrease in moisture. The median change between study periods in daily 0.975 quantile of temperature is 2°C. A similar increase in temperatures is evident across quantiles. Decreases in the distribution of precipitation and relative humidity are also apparent. For example, the 0.5 quantile of daily precipitation under past climate has a similar seasonal pattern and level to the 0.6 quantile under present climate. Wind speed quantiles appear relatively unchanged between periods.

To fit the models of Mhatre and Cooley (2021), we transform the data so that they exhibit weak tail stationarity and the marginal distribution is regularly varying with tail index α = 2. We first transform the marginal distribution to be uniform and, in doing so, handle the observed seasonal behavior.

We do this daily with the same 29-day moving window as we used to plot the daily FWI high quantile. Each day's marginal distribution, $F_t^p$, is estimated with the empirical cumulative distribution function (ECDF) below the 0.975 quantile, $\mu_t$, and a fitted generalized Pareto distribution (GPD) above. The GPD is a common model for exceedances above a threshold μ and can be defined with the conditional survival function $P(Y > y \mid Y > \mu) = \{1 + \xi(y - \mu)/\psi\}^{-1/\xi}$, where $\xi \in \mathbb{R}$ is the shape parameter and ψ > 0 is the scale parameter (Coles 2001). Our estimated daily semiparametric marginal distribution is

$$
\hat{F}_t^p(y) =
\begin{cases}
[n_p(2k+1)+1]^{-1} \sum_{i=1}^{n_p} \sum_{j=t-k}^{t+k} I(y_{ij}^p < y), & y \le \hat{\mu}_t^p, \\
1 - 0.025\,[1 + \hat{\xi}^p (y - \hat{\mu}_t^p)/\hat{\psi}^p]^{-1/\hat{\xi}^p}, & y > \hat{\mu}_t^p,
\end{cases}
\tag{5}
$$

where $y_{ij}^p$ is the observed FWI from day j of year i in period p (past or present), $n_p$ is the number of years in that period, and 2k + 1 accounts for the 29-day moving window. The denominator of the ECDF is increased by one so that the observed maximum does not have ECDF value one (Coles 2001, definition 2.4), and the scaling by 0.025 in the GPD arises from modeling exceedances over the 0.975 quantile. Likelihood ratio tests do not provide evidence against the null hypothesis that the scale and shape parameters of the daily GPD were the same for each day in the year. Mean residual life plots confirm the use of the daily 0.975 quantile as the threshold parameter of the GPD. The inverse Fréchet CDF, $x_t = \{\log(1/u_t)\}^{-1/2}$, is applied to the time series with uniform marginals to make it regularly varying with tail index α = 2.
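The two-stage marginal transformation (semiparametric CDF, then inverse Fréchet CDF) can be sketched as below. The GPD parameters are taken as given rather than fitted, and all names and default values are hypothetical; this is an illustration of the formulas, not the authors' implementation.

```python
import math


def semiparametric_cdf(y, pooled, mu, xi, psi, tail_prob=0.025):
    """ECDF at or below the threshold mu and a GPD tail above it.

    `pooled` holds the n_p * (2k + 1) FWI values in the moving window;
    xi and psi are GPD shape and scale estimates supplied by the caller.
    """
    if y <= mu:
        # Denominator increased by one, as described in the text.
        return sum(value < y for value in pooled) / (len(pooled) + 1)
    if abs(xi) < 1e-12:
        # Exponential limit of the GPD as the shape parameter tends to zero.
        return 1.0 - tail_prob * math.exp(-(y - mu) / psi)
    z = 1.0 + xi * (y - mu) / psi
    if z <= 0.0:
        # Beyond the finite upper endpoint that exists when xi < 0.
        return 1.0
    return 1.0 - tail_prob * z ** (-1.0 / xi)


def to_frechet(u):
    """Inverse Frechet (alpha = 2) CDF applied to a uniform value 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    return math.log(1.0 / u) ** -0.5
```

Values of exactly 0 or 1 are rejected by `to_frechet` in this sketch; how such boundary cases are handled in practice is not specified in the source.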

b. Dependence estimation and model fitting

To estimate the dependence we use the natural TPDF estimator of Cooley and Thibaud (2019). We reparameterize the lag-h pairs of points $(x_t, x_{t+h})$, $t = 1, \ldots, n_h$, with polar coordinates (Fig. 2). The radial component is the L2-norm: $r_t = \|(x_t, x_{t+h})\|_2$. The angular component $w = (w_t, w_{t+h}) = (x_t, x_{t+h})/r_t$ places the point on the unit ball. The TPDF estimator under this parameterization is

$$
\hat{\sigma}(h) = \frac{2}{\sum_{t=1}^{n_h} I(r_t > r_0)} \sum_{t=1}^{n_h} w_t w_{t+h}\, I(r_t > r_0), \tag{6}
$$

where the 2 arises from the known (after transformation) marginal distribution. This estimator replaces the angular measure by its empirical estimate and considers only points above some high threshold $r_0$ (we used the 0.975 quantile).

The estimated TPDF (Fig. 4) is similar between the two climate periods. It is well known that extremal dependence measures are biased when tail dependence is weak (see, e.g., Huser et al. 2016) and this bias was noticed and explored in Mhatre (2022). We follow Mhatre’s suggestion and subtract off the mean of the time series before estimating the dependence. This correction reduces, but does not eliminate, bias. Our tail-dependence estimates decrease with lag, but they appear to level off around 0.1 near lag 15 presumably because of remaining bias. Simulations of transformed-linear MA(q) time series (q = 3, 5, 10) had TPDF values, after bias adjustment, beyond lag q close to 0.1.

The order q of our transformed-linear MA(q) is chosen by fitting and comparing several orders. The extremal innovations algorithm is used to estimate the parameters of MA(q) models for q = 1, 2, …, 30. Comparing the estimated TPDF plots with the theoretical TPDF from each iteration of the innovations algorithm demonstrates that the pairwise tail dependence of our fitted model is very similar to that of our data and is our first step in model choice [Fig. 4 compares the empirical TPDF with the fitted MA(15)]. The empirical TPDF appears to level off between lags 10 and 15 under both past and present climate, which suggests that the unbiased TPDF would likely be zero somewhere between lags 10 and 20. We compare the ability of these models to reproduce summary statistics (mean run length above high quantiles and high quantile of the sum of five consecutive terms) and note, once again, that models of order 10 through 20 are reasonable. Because of the leveling off of the empirical TPDF after lag 15, we continue the analysis with the MA(15) for both periods.

c. Simulation of seasons and attribution

Once we have fit our chosen models, we simulate 10 000 seasons from each period. To simulate seasons, we first generate a noise sequence of 168 (length of the season 153 plus model order 15) independent Fréchet random variables with scale 1 and shape 2. This independent noise sequence is iteratively put through (2) with q = 15, $\theta_0 = 1$, and $\theta_j$ (j = 1, 2, …, 15) equal to the coefficients that were estimated using the extremal innovations algorithm for the given period. This ensures the pairwise tail dependence of the simulated season matches what was observed. To reintroduce the observed seasonality into these 10 000 stationary simulated seasons, we perform a two-step back-transformation. We first use the ECDF to transform to a uniform marginal distribution. We then use the inverse of (5) to transform the uniform marginal time series into seasons on the original FWI scale.
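The simulation of one stationary season from a transformed-linear MA(q) can be sketched as follows. The coefficients would come from the extremal innovations algorithm, which is not implemented here; the back-transformation to the FWI scale is also omitted. Function names are hypothetical, and the Fréchet draw relies on the fact that E^(-1/2) is Fréchet with scale 1 and shape 2 when E is a unit exponential.

```python
import math
import random


def f(y):
    """Softplus map log(1 + exp(y)) in a numerically stable form."""
    return max(y, 0.0) + math.log1p(math.exp(-abs(y)))


def f_inv(x):
    """Inverse softplus log(exp(x) - 1) for x > 0."""
    return x + math.log(-math.expm1(-x))


def simulate_season(theta, n_days=153, rng=random):
    """Simulate n_days of a transformed-linear MA(q) with theta = [1, theta_1, ..., theta_q]."""
    q = len(theta) - 1
    # n_days + q independent Frechet(scale 1, shape 2) noise terms.
    noise = [rng.expovariate(1.0) ** -0.5 for _ in range(n_days + q)]
    # Map the noise to the real line once, then combine linearly and map back.
    real_noise = [f_inv(z) for z in noise]
    return [
        f(sum(theta[j] * real_noise[t - j] for j in range(q + 1)))
        for t in range(q, n_days + q)
    ]
```

Because transformed-linear sums reduce to an ordinary linear combination on the real line followed by a single application of f, the whole season is produced in one pass; with q = 15 the noise sequence has the 168 terms mentioned in the text.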

Attribution is done by comparing probabilities of observing high-risk seasons under past and present climates. We consider a season to be high risk if it had at least as many days above a high threshold as were observed in 2020. Our thresholds were the 0.95, 0.975, and 0.99 quantiles of the observed present climate FWI time series (34.41, 39.47, and 44.94, respectively; Table 1). In 2020 there were 22, 14, and 6 days over the respective thresholds. Under past climate, 31 of the 10 000 simulated seasons had at least 22 days with FWI values above 34.41, as compared with 322 of the simulated present climate seasons. The ratio of the point estimates (10.39) suggests that a season as extreme as 2020 is more than 10 times as likely under present climate as it was under past climate. The ratios for the 0.975 and 0.99 quantile thresholds were 4.33 and 8.77, respectively.
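The high-risk definition and the ratio of point estimates can be written out as a short Python sketch. The function names are hypothetical, and "above" is read here as strictly greater than the threshold; only the counts 31 and 322 are taken from the text.

```python
def is_high_risk(season, threshold, min_days):
    # A season is high risk if at least min_days daily values exceed the threshold.
    return sum(1 for value in season if value > threshold) >= min_days


def high_risk_proportion(seasons, threshold, min_days):
    # Proportion of simulated seasons that meet the high-risk definition.
    hits = sum(is_high_risk(season, threshold, min_days) for season in seasons)
    return hits / len(seasons)


def risk_ratio(p_present, p_past):
    # Present-to-past ratio; raises ZeroDivisionError if no past season is high risk.
    return p_present / p_past


# Counts reported in the text for the 0.95 quantile threshold (34.41, 22 days)
ratio = risk_ratio(322 / 10_000, 31 / 10_000)
print(round(ratio, 2))  # 10.39
```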

Table 1.

Results from Grand Lake ERA5 data using a transformed-linear MA(15) for past and present climate. Columns 1 and 2 report three high thresholds of the present climate, and the number of days in 2020 that exceeded those high quantiles is in column 3 (definition of high-risk season). The proportion of simulated high-risk seasons is in columns 4 and 5 for the respective periods. Column 6 is the multiplicative change from past to present (ratio of column 5 to column 4). Bootstrapped 95% confidence intervals are reported in parentheses.

Uncertainty in our estimates is reported with intervals computed through bootstrapping. Each bootstrapped estimate was computed by sampling 20 years, with replacement, from each climate period. We use the observed data from these 20 years to estimate the marginal distribution (5), transform to be regularly varying, estimate the TPDF, fit the MA(15), simulate 10 000 seasons, and compute proportion of high-risk seasons.
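The year-level resampling can be sketched as follows. This is a generic Python sketch under stated assumptions: the `statistic` argument is a hypothetical placeholder for the full pipeline (marginal fit, transformation, TPDF estimation, MA(15) fit, simulation, and proportion of high-risk seasons), and the percentile interval is an assumed choice, since the text does not say which bootstrap interval was used.

```python
import math
import random


def bootstrap_interval(past_years, present_years, statistic,
                       n_boot=500, level=0.95, rng=None):
    """Percentile bootstrap interval obtained by resampling whole years.

    past_years and present_years are lists with one entry per observed year;
    statistic(past_sample, present_sample) returns a single number.
    """
    rng = random.Random() if rng is None else rng
    estimates = []
    for _ in range(n_boot):
        # sample as many years as were observed, with replacement, in each period
        past_sample = rng.choices(past_years, k=len(past_years))
        present_sample = rng.choices(present_years, k=len(present_years))
        estimates.append(statistic(past_sample, present_sample))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[math.floor(tail * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - tail) * (n_boot - 1))]
    return lower, upper
```

Because each replicate is independent of the others, the loop body can be distributed across cores without any change to the logic.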

The uncertainty bounds computed from these bootstrapped results are wide because only 20 observed seasons are available with which to estimate all aspects of our method. Despite their width, the confidence intervals for the ratios lie entirely above 1, and thus the change in attributed seasonal risk is significant. The bootstrapping is computationally intensive but simple to implement. Five hundred full analyses (minus model order selection) were completed, each using less than 0.5 gigabytes (GB) of RAM while running on a single thread, and thus the analysis could be completed on a laptop computer. Access to a computer with 64 2.7-GHz cores and 128 GB of RAM, combined with the embarrassingly parallel nature of the bootstrapping, allowed for rapid results from all 500 analyses (each bootstrap analysis took around 12 min to complete on this machine).

Sensitivity to model order was explored by completing the analysis with MA orders 10 and 20. When using the fitted MA(10) to simulate seasons, the ratios of the proportions of extreme fire seasons (present to past) were 24, 20.68, and 9.88, respectively. There were fewer simulated high-risk seasons with the MA(10) than with either of the other models, but the ratio is increased because the reduction was proportionally larger for the past than for the present. When using the fitted MA(20) to simulate seasons, the ratios were 8.02, 4.94, and 9.79, respectively, which are similar to those obtained with our chosen model. We expect the ratio estimates to be relatively stable between models of orders that accurately capture the tail dependence.

We summarize our findings with the following attribution statement. Applied to FWI data generated from ERA5 output, our method estimates that a wildfire season like the one observed in 2020 near Grand Lake is 4–10 times as likely under recently observed climate as under the climate of roughly 50 years ago. Our method rejects the null hypothesis that the risk of observing a season like 2020 is unchanged between these two periods. Figures 3 and 4 indicate that this increase in risk is due to a shift in the marginal distribution of risk, not the tail dependence.

4. Application to RAWS data in Colorado

a. Data and preprocessing

We apply the same method to RAWS weather station data (National Interagency Fire Center 2023) from Harbison Meadow, Colorado (NWS identifier 050402), which is located within the grid box used for the ERA5 analysis in section 3. Repeating the analysis on weather station data allows for comparison with reanalysis products, and station data are easily explainable to a broader audience. The RAWS data for Harbison Meadow were downloaded by using the “Weather Data” link on the Wildland Fire Application Information Portal website (https://www.wildfire.gov/application/fire-and-weather-data-extract).

The present period had a nearly complete record but the past period had only 12 years of usable data. Harbison Meadow has data recorded from 1964 to the present, but the years from 1972 through 1974 were deemed unusable for two reasons: approximately one-half of the data in each year were missing, and the maximum FWI value for those years was 0.47, which is well below the next lowest maximum value of the past period, 16.04, and is the 0.56 quantile of the past period.

Seasonal behavior of the RAWS FWI time series was explored in the same manner as was done with the ERA5 data. The FWI daily high quantile (Fig. 5) again shows a clear shift in the marginal distribution at the highest quantiles, which is often larger in magnitude than it was in the ERA5 data. The maximum difference between present and past estimated 0.975 quantiles in the ERA5 data was 18 units. There were 62 days in the RAWS data (40% of the season) that had differences greater than 18 units. The present period indicates bimodality in the highest quantiles, but the second peak is missing in the past period. The 0.975 quantile does not remain at its highest values for as much of the season in the RAWS data as it did in the ERA5 data, but these highest values are greater than what was observed in the ERA5 data. We expect that some of this difference is due to the different spatial scales and model bias. Exploration of high quantiles of FWI components indicated that extreme temperatures are hotter, extreme wind speeds are faster, and humidity is lower in the current period than in the past period. Additionally, high quantiles of all three fuel moisture codes (FFMC, DMC, and DC) and of the fire behavior subindices (ISI and BUI) show an increase in risk.
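The daily high-quantile curves compared here are computed by pooling values from surrounding days with the 29-day moving window used for the daily quantiles in this paper. A minimal Python sketch of that calculation follows. The function name, the truncation of the window at the ends of the season, and the linear interpolation between order statistics are all assumptions made for illustration.

```python
def daily_quantile(years, prob=0.975, window=29):
    """Daily quantile curve from a list of seasons (one list of daily values per year).

    For each day, values within the moving window are pooled across all years.
    """
    half = window // 2
    n_days = len(years[0])
    curve = []
    for day in range(n_days):
        # assumed edge handling: the window is truncated at the season boundaries
        start, stop = max(0, day - half), min(n_days, day + half + 1)
        pooled = sorted(v for season in years for v in season[start:stop])
        # linear interpolation between neighbouring order statistics
        pos = prob * (len(pooled) - 1)
        i = int(pos)
        upper = pooled[min(i + 1, len(pooled) - 1)]
        curve.append(pooled[i] + (pos - i) * (upper - pooled[i]))
    return curve
```

Applying the function separately to the past and present periods and differencing the two curves gives day-by-day differences of the kind summarized above.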

Fig. 5.

As in Fig. 3, but from the RAWS Harbison Meadow data.

Citation: Journal of Applied Meteorology and Climatology 62, 11; 10.1175/JAMC-D-23-0072.1

The high quantiles for past and present climate are significantly different. This was determined using joint daily 95% bootstrap intervals that only overlap at the end of the season (Fig. 5).

The TPDF for the past period appears to differ from the present period TPDF (Fig. 6), which was not apparent in the ERA5 data (Fig. 4). This suggests that there may be a change in the dependence as well as a change in the marginal distribution. We explore the effects of this by fitting models of orders 10, 15, and 20 to both periods. The results were similar in all cases, including when we used different orders for the two periods. The following results are from the MA(15) fitted to both periods because that matches the previous analysis.

Fig. 6.

As in Fig. 4, but for RAWS Harbison Meadow.

b. Attribution

Table 2 summarizes results from the analysis of the RAWS data. The 0.95, 0.975, and 0.99 quantiles of the present climate RAWS data were 34.96, 42.97, and 51.54, respectively. In 2020 there were 32, 24, and 11 days with FWI values above those respective high thresholds. Of 10 000 simulated present climate seasons there were 78, 104, and 951 that were high risk. Zero of the past climate seasons were high risk. The GPD used to transform the upper tail of the simulated past climate seasons has a bounded tail that is below each high quantile for the majority of the season. This suggests that the FWI values observed in 2020 were not plausible under past climate.

Table 2.

As in Table 1, but results from Harbison Meadow RAWS data using a transformed-linear MA(15) model.

We performed an additional analysis that ensured that the extreme FWI values observed in 2020 were possible under past climate. We repeated the analysis but enforced Gumbel (unbounded but light) tails on the GPD. This has the added benefit of assessing sensitivity to the GPD parameters in the back-transformation. With the GPD restricted in this manner, there were 0, 0, and 8 simulated past climate seasons of 10 000 that were high risk.
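The contrast between a bounded GPD tail and the enforced light, unbounded tail can be illustrated with the standard GPD formulas. In the Python sketch below, the Gumbel-type restriction is read as fixing the GPD shape parameter at zero (an exponential tail). The function names are hypothetical, and the parameter values in the usage lines are illustrative only, not the fitted values from this study.

```python
import math


def gpd_upper_endpoint(u, sigma, xi):
    # With threshold u, scale sigma, and shape xi, the GPD has a finite
    # upper endpoint u - sigma / xi only when xi is negative.
    return u - sigma / xi if xi < 0 else math.inf


def gpd_quantile(p, u, sigma, xi):
    # Quantile of the exceedance distribution above u for 0 <= p < 1.
    if xi == 0.0:
        # shape zero: exponential tail, unbounded but light
        return u - sigma * math.log1p(-p)
    return u + sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


# Illustrative values only: a bounded tail that cannot reach a threshold of 42.97
print(gpd_upper_endpoint(30.0, 5.0, -0.5))  # 40.0
# With the shape fixed at zero, the same threshold and scale give an unbounded tail
print(gpd_upper_endpoint(30.0, 5.0, 0.0))   # inf
```

When the upper endpoint lies below a high threshold, no back-transformed simulated value can exceed that threshold, which is why the bounded fit yields zero high-risk past seasons.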

As before, we summarize with the following attribution statement. Applied to RAWS data, our method estimates that the extreme fire weather observed in 2020 was not possible under the observed climate of roughly 50 years ago. Our method rejects the null hypothesis that seasonal wildfire risk is unchanged between the two studied periods. When reanalyzed using an approach that forces a possibility of observing high-risk seasons under past climate, the method estimates that the risk of observing a season like 2020 is at least 138 times as great under the recently observed climate. The increased risk appears to be due to a shift in the marginal distribution at high quantiles (Fig. 5) and an increase in the tail dependence (Fig. 6).

5. Application to ERA5 data in California

a. Data and preprocessing

We apply our method to an ERA5 grid box containing Quincy, California. This grid box was burned in the 2020 North Complex Fire (over 300 000 acres burned) and is adjacent to a grid box burned by the 2018 Camp Fire, the deadliest fire in California history (Reyes-Velarde 2019). There is a RAWS weather station in the grid box (Quincy Road station, NWS identifier 040910) that could be used for comparison (this comparison was omitted for brevity).

The high quantile seasonality observed in Quincy (Fig. 7) is unimodal and does not show the large marginal shift between periods we noticed in Colorado. The statistically significant change in the marginal distribution is that the season is longer; that is, the FWI time series are at the highest levels for a longer period of time in the present period than in the past period. Exploratory analyses show that the high quantiles of temperature are at the highest levels longer and precipitation and relative humidity quantiles are at their lowest levels for longer periods under present climate than they were under past climate.

Fig. 7.

As in Fig. 3, but from Quincy.

We note that the tail dependence in the California region (Fig. 8) seems to level off at shorter lags than in Colorado (Figs. 4 and 6). This is likely because the dependence estimated by the TPDF is in the “residuals” after accounting for seasonality. In comparison with Colorado, there is less season-to-season variation in the FWI in California, and TPDF values seem to be driven more by variables (like ISI) with shorter time scales. TPDF values also appear slightly stronger and longer lasting under past climate than under present climate. This may be attributable to the observed change in the fire behavior subindex related to dryness (BUI), which has stronger and longer-lasting tail dependence under current climate and thus may be included while accounting for seasonality. To assess sensitivity to model order, we completed the analysis with models of orders 5 and 10 and found similar results. The reported estimates are from using model order 10 for both periods.

Fig. 8.

As in Fig. 4, but for Quincy.

b. Attribution

Table 3 shows results from our attribution study. The 0.95, 0.975, and 0.99 quantiles of the present climate ERA5 data in California were 67.5, 70.1, and 72.3, respectively. In 2021 there were 22, 12, and 5 days with FWI values above those respective high thresholds. Forty-five of the simulated past climate seasons were classified as high risk at the 0.975 quantile as compared with 55 of the present climate seasons.

Table 3.

As in Table 1, but results from Quincy ERA5 data using a transformed-linear MA(10) model.

Applied to FWI data generated from ERA5 output for the grid cell containing Quincy, our method estimates that a wildfire season like the one observed in 2021 is 1.22 to 1.89 times as likely under recently observed climate as under the observed climate of roughly 50 years ago. However, bootstrap confidence intervals contain one and thus our method fails to reject a null hypothesis of no change. It appears that under current climate, high quantiles of FWI are at higher values for a longer portion of the season, and bootstrap-based hypothesis tests support this conclusion.

6. Discussion

This paper develops a relatively simple, computationally inexpensive, theoretically justifiable method to quantify how much more likely an extreme fire season is now than it was 50 years ago. We quantify the increase in risk in reference to a well-recognized high-risk season, which enables easy communication of results; that is, we perform extreme-event attribution of seasonal risk. Our method relies on time series models that specifically focus on extreme behavior. Because these models capture the pairwise tail dependence well, and because the method treats the two time periods in the same manner, our approach is useful for making meaningful comparisons.

Our method was applied to FWI data from two different locations and two different data sources. In the Grand Lake area of Colorado, our method estimates a dramatic increase in the risk of observing a fire season like the one observed in 2020, and this was seen in both reanalysis and weather station data. This increased risk is mostly attributable to an upward shift of the quantiles of the marginal FWI distribution. In California, our method’s point estimates suggest an increase in the risk, but uncertainty associated with these estimates does not allow one to reject a null hypothesis of no change in seasonal risk. It is possible that analyzing a different fire weather index would yield different results. The FWI was developed in Canada for use with forests that are primarily jack and lodgepole pine, and it may not be the best index to summarize weather-driven fire risk in California. Fortunately, our analysis could readily be repeated with any other indicator of fire risk. As this paper only considers the fire risk associated with the meteorologically derived FWI, it focuses specifically on risk due to changes in observed climate and does not directly consider other factors, such as past forest management, which could contribute to overall fire risk.

Like any analysis, our method requires several modeling choices. Perhaps chief among these is the selection of the MA(q)’s order. Our order selection method was based on subjectively interpreting TPDF plots; fortunately, sensitivity analysis showed that so long as q was chosen to be sufficiently large, conclusions were not affected. Another choice was to fit an MA(q) rather than an ARMA(p, q), which possibly could capture the dependence seen in the TPDF with a smaller number of parameters. We chose an MA(q) because we employed the innovations algorithm to fit our model, and model fitting of general ARMA models is an area for future investigation. Our approach also employs a two-step estimation method, first fitting the marginal distribution and transforming, and then assessing the dependence. This two-step approach propagates any error from the first step forward; however, our bootstrap method for assessing uncertainty accounts for this error propagation. One observed shortcoming of our model is that, in Colorado, we observed seasons in which the FWI remained moderate for the entire season, whereas every simulated season was extreme at least once. This shortcoming is consistent between climate periods, and thus the past-to-current comparison should still be relevant. Because model fitting uses only large observations, our approach should only be used to assess extreme behavior and should not be used to assess quantities associated with the bulk of the distribution.

Acknowledgments.

Author Cooley was partially supported by National Science Foundation Grant DMS-1811657.

Data availability statement.

ERA5 data (ECMWF 2023) were accessed from the Climate Data Store (https://cds.climate.copernicus.eu/). RAWS data (National Interagency Fire Center 2023) were downloaded using the “Weather Data” link (https://www.wildfire.gov/application/fire-and-weather-data-extract). Raw data and formatted files can be accessed online (https://github.com/twixson/seasonal_wildfire_risk_attribution). All analyses were performed using R (R Core Team 2022). We used the following packages: cffdrs (Wang et al. 2017), chron (James and Hornik 2023), evd (Stephenson 2002), humidity (Cai 2019), lattice (Sarkar 2008), lubridate (Grolemund and Wickham 2011), knitr (Xie 2014), ncdf4 (Pierce 2023), and tidyverse (Wickham et al. 2019).

REFERENCES
