1. Introduction
The Sixth Assessment Report of the IPCC suggests, with high confidence, that climate change has led to warmer and drier conditions, which have increased wildfire risk in North America (Hicke et al. 2022). These worsening conditions have led to “increased burned area in recent decades in western North America” and thus in “the USA, annual costs of federal wildland fire suppression have increased by a factor of 4 since 1985” (Hicke et al. 2022, p. 1948). These national and regional trends have been echoed at state levels. The Colorado Division of Fire Prevention and Control reports that “20 of 20 largest wildfires have occurred in the last 20 years (since 2001)” and the three largest fires, which burned more than 500 000 acres, occurred in 2020 (State of Colorado 2023). Data from the California Department of Forestry and Fire Protection (State of California 2023) show that 18 of the 20 largest fires, 18 of the 20 most destructive fires, and 11 of the 20 most deadly fires in California history occurred between 2003 and 2021. Abatzoglou and Williams (2016) state that the “increased forest fire activity across the western continental United States (US) in recent decades has likely been enabled by a number of factors, including the legacy of fire suppression and human settlement, natural climate variability, and human-caused climate change.” The focus of this study is the climate signal; we do not consider questions of forest management, fuel availability, or the impact of more people in the wildland–urban interface. This paper is an extreme-event risk attribution study that aims to quantify how much more likely these extreme fire seasons are now than they were previously, due to observed changes in climate.
To narrow our focus to the climate signal we use the well-recognized Canadian forest fire weather index (FWI) as the object of this study (Van Wagner 1987). The FWI system is a series of equations that takes month, latitude, 24-h precipitation, noon wind speed, noon relative humidity, and noon temperature as inputs. These location and weather variables are used daily to compute the fine fuel moisture code (FFMC), duff moisture code (DMC), and drought code (DC), which represent dryness, and thus fuel availability, at differing levels of depth and time scales. The FFMC is combined with wind speed to compute the initial spread index (ISI). The buildup index (BUI), which represents fuel availability, is computed from the two longer range moisture codes (DMC and DC). Daily FWI values are computed from the ISI and BUI. The FWI was designed to represent the energy that would be released at the edge of a fire but can be interpreted in several ways including fire intensity risk. FWI is based on weather variables and thus changes discovered in the analysis are due to changes in observed climate.
The 153-day time series of FWI values for the 2020 fire season near Grand Lake, Colorado, is shown in Fig. 1. Highlighted are the 14 days with FWI values above the 0.975 empirical quantile of present climate. Each highlighted, high-risk day was the result of processes like the drying of fuels through extended periods of low precipitation coupled with hot days and high winds. The time-dependent nature of wildfire risk makes it clear that simple estimation of some high marginal quantile would not capture the phenomenon we are concerned with. Wildfire risk, though quantifiable on smaller time scales (e.g., daily), is more sensibly thought of as a seasonal quantity and thus our aim is to model entire wildfire seasons.
Fig. 1. Daily FWI values from 2020, ERA5 data near Grand Lake.
Citation: Journal of Applied Meteorology and Climatology 62, 11; 10.1175/JAMC-D-23-0072.1
We perform statistical attribution; a statistical model for entire wildfire seasons is built and fit to each climate period (past and present). The fitted models are used to simulate wildfire seasons. The proportions of simulated high-risk seasons from each period are compared to make our attribution statement. We consider a season to be high risk if it has at least as many days above a high quantile as were observed in the most extreme year in the region. These years, 2020 in Colorado and 2021 in California, are used to define high-risk seasons so that communication of results is simple and relatable.
This study uses a reanalysis product and weather station data. Use of these data sources allows us to directly make within-product comparisons as we can use the 2020 (2021) season from each source to define high risk. However, these data sources could have confounding factors. Reanalysis products are a data assimilation of model output and global observational data from ground sensors and satellites, but satellite data are only available in the present period. Weather station data collection methods transitioned from manual to automated measurements during the study period and there may have been changes in the location of the measurements. Additionally, weather station data have the usual challenges inherent in true observational data: missing observations, short historical records for some variables, and the noisiness expected with truly local measurements.
An alternative and often used attribution approach employs climate models that were run under factual and counterfactual worlds (i.e., without anthropogenic forcings). Our year-specific definition of high risk makes it difficult to use climate model output in the estimation of the proportion of high-risk seasons. Comparing climate model output with observed conditions in 2020 (2021) would require accounting for model bias, but model bias is not well understood for FWI. A recent attribution approach studies specific events using custom climate model runs that were set up to adequately reproduce the event [e.g., Patricola and Wehner (2018) study 15 tropical cyclones with an emphasis on Hurricane Katrina]. These studies require extensive computing resources and, to our knowledge, none have studied season-long events. An advantage of our method is its simplicity; we use available data. A disadvantage is that our attribution statement can only reference differences between past and present climate and cannot speak causally about an anthropogenic effect.
This study compares the proportion of high-risk seasons expected under past climate to the proportion expected under present climate. We define past climate as the 20 earliest available years (1959–78) and present climate as the years from 2002 to 2021. Our definitions of past and present climate do not consider the potential effects of cyclical climate patterns (ENSO, etc.). Additionally, each climate period only has 20 observed seasons and thus uncertainty (computed through bootstrapping) is large. Within each climate period we consider a fire season to be 1 June through 31 October (153 days). We first consider an area north of Grand Lake that was burned in the East Troublesome Fire in 2020 and then repeat the analysis on a region just outside the burn area of the 2020 North Complex and 2018 Camp Fires in California.
We use time series models to capture dependence throughout the fire season. Classical linear time series models [like autoregressive–moving-average (ARMA) models] are based on the autocovariance function, which averages the linear relationship around the mean and thus may not accurately capture the dependence at extreme levels. We characterize tail dependence with the tail pairwise dependence function (TPDF) and employ the transformed-linear time series models for extreme values of Mhatre and Cooley (2021). These models resemble the familiar ARMA models from classical time series but are tied to regular variation (a common framework in extremes) and thus are a natural choice for this study.
This attribution study fits into the extreme event attribution (EEA) framework of Jézéquel et al. (2018) as a “risk based” study because we focus on computing an increase in risk rather than explaining the link between climate change and the physical processes that led to the event. We compute how much more likely a high-risk season is now than in a past climate and thus are interested in a “class of events.” This differs from studies that are focused on computing the probability that a single event (i.e., the 2020 fire season in Colorado) was caused by climate change. Although some studies are conditional on sea surface temperature, greenhouse gas concentrations, and so on, our study uses the observational record to compute the changes in risk based on any detectable changes in climate and thus is considered “unconditional.”
This paper is organized as follows: We review statistical preliminaries and introduce the models in section 2. Our method is explained in detail as it is applied to ERA5 data from one location inside the East Troublesome burn area in Colorado in section 3. We discuss the data and preprocessing in section 3a, estimation of the pairwise dependence and model fitting in section 3b, and simulation from the models and attribution in section 3c. In section 4 we apply our method to weather station observations [Remote Automatic Weather Station (RAWS) data] from the same location as in section 3. In section 5 we apply our method to ERA5 data from a location near the 2020 North Complex Fire in California. We conclude the paper with a discussion of the benefits and some limitations of our method.
2. Statistical preliminaries
a. Regular variation and dependence
To model dependence in the upper tail of our time series, we rely on the framework of regular variation. Regular variation is a common mathematical framework in the study of extremes as its dependence structure is found in characterizations of multivariate extreme value distributions, and its definition only describes the joint tail behavior. Additionally, regular variation is a useful framework for capturing asymptotic dependence (Coles 2001, section 8.4), which loosely implies that variables can be at their most extreme levels at the same time. Our FWI time series appears to exhibit asymptotic dependence at short lags.
A p-dimensional random vector must have heavy tails in each dimension to be a multivariate regularly varying random vector. The decay rate of those tails is described by the tail index α. The distribution of a regularly varying random vector at infinity can be decomposed into independent radial (distance from the origin) and angular (the point where the vector intersects the unit ball) components (Fig. 2). We use this near-independent (in the large but not limiting case) decomposition to describe the pairwise tail dependence between the dimensions of the random vector. Consider a two-dimensional random vector X = (X1, X2) that has positive components with probability 1 (i.e., the random vector takes values in the first quadrant of the x–y plane). Further assume that X is large in at least one component. Information about the tail dependence between components of the vector is contained in the angular measure HX, which describes the distribution of angles (between 0° and 90°). If there is strong tail dependence, we would expect that when X1 is extreme, X2 will also be extreme. In this case HX will have mass near 45°, and we would expect to see points like A in Fig. 2. When the tail dependence is weak (or zero), knowing X1 is extreme tells us little about X2. This suggests X2 will likely be in the bulk of its distribution (which is near the axis due to the distribution’s heavy tail) and we would expect extreme points to be like point B in Fig. 2. In this case the mass of the angular distribution HX is concentrated near the axes.
Fig. 2. Polar decomposition of points A and B. The radial component is the norm of the point and the angular component places the point on the unit ball. We expect points like A (large together) under strong tail dependence and points like B (large separately) under weak tail dependence.
Tail dependence of a p-dimensional X continues to be described by its angular measure HX, but estimating or modeling this measure that lies on the p-dimensional unit ball becomes increasingly difficult as p grows. Multivariate models tend to be employed in moderate (p ≈ 5) dimensions.
When analyzing time series, we treat the vector of observations as a single (partial) realization of an infinite-dimensional random vector. In our analysis we consider each season as an observation from the climate of that period, and thus we have 20 (partial) realizations of length 153 (the number of days in our definition of a fire season). The advantage of taking a time series approach over simply viewing the data as realizations of a 153-dimensional random vector is that we can characterize and model dependence as a function of lag (time difference between variables). Classical (nonextreme) time series analysis generally assumes a time series is weakly stationary and restricts focus to characterizing only pairwise dependencies through the autocovariance function.
b. Transformed-linear time series models for extreme values
We want models specifically designed to fit the upper tail as this is when fire risk becomes a concern. The ARMA-like transformed-linear extremes time series models of Mhatre and Cooley (2021) are constructed with the transformed-linear operations of Cooley and Thibaud (2019) so that they only take on positive values. Transformed-linear operations are defined component-wise and involve a map, f, from the real line to the positive half line. For any two vectors in the positive orthant,
In classical time series, the innovations algorithm recursively computes the one-step predictors and then uses the associated prediction errors (innovations) to estimate that step’s MA coefficients. The transformed-linear extremes innovations algorithm (Mhatre 2022) is similar and is used to fit the transformed-linear MA(q). Mhatre (2022) showed that even if the underlying model is not a transformed-linear model, use of the extremal innovations algorithm will result in a transformed-linear model with a TPDF that closely matches the TPDF of the underlying model, and thus the pairwise dependence will match even if the model is not the correct one.
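The transformed-linear operations referred to in this subsection can be illustrated with a minimal sketch. It assumes the softplus map f(y) = log(1 + e^y) used by Cooley and Thibaud (2019); the function names (`f`, `f_inv`, `tl_add`, `tl_scale`) and the example vectors are illustrative and do not come from this paper.

```python
import numpy as np

def f(y):
    """Softplus map from the real line to the positive half line (assumed form)."""
    return np.logaddexp(0.0, y)

def f_inv(x):
    """Inverse softplus, log(exp(x) - 1), written to stay stable for large x."""
    x = np.asarray(x, dtype=float)
    return x + np.log(-np.expm1(-x))

def tl_add(x1, x2):
    """Transformed-linear addition: add on the real line, then map back."""
    return f(f_inv(x1) + f_inv(x2))

def tl_scale(a, x):
    """Transformed-linear multiplication of a positive vector by a real scalar."""
    return f(a * f_inv(x))

# Two arbitrary vectors in the positive orthant.
x1 = np.array([0.5, 3.0, 40.0])
x2 = np.array([2.0, 0.1, 35.0])

total = tl_add(x1, x2)       # stays positive in every component
halved = tl_scale(0.5, x1)   # also stays positive
```

For large components the map is nearly the identity, so the operations behave like ordinary addition and scaling in the upper tail (the last component of `total` is essentially 75) while results remain positive everywhere.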
3. Application to ERA5 data in Colorado
a. Data and preprocessing
We first apply our method to a reanalysis data product (ERA5 from the Copernicus Climate Change Service; ECMWF 2023) on one grid box in Grand Lake. ERA5 data are produced on a globally complete, hourly, 30-km grid from 1959 to about two months prior to access. The grid box we considered includes latitude 40.27 and longitude −105.84, which is the location of a RAWS and is in the burn area of the East Troublesome Fire. Proximity to a weather station will allow us to repeat analyses performed on ERA5 data using observed weather for comparison (section 4). The weather variables downloaded were the hourly 10-m wind component in the eastern and northern directions (used to calculate wind speed), 2-m dewpoint temperature (used to calculate relative humidity), 2-m air temperature, and total precipitation. The time series of noontime measurements for wind speed, relative humidity, and temperature were combined with the time series of 24-h precipitation to compute daily FWI values for the season.
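The derivation of the FWI inputs from the downloaded variables can be sketched as follows. The paper does not state which humidity formula was used; this sketch assumes the Magnus approximation for saturation vapor pressure, and the function names are hypothetical. ERA5 reports temperatures in kelvin and wind components in meters per second, whereas the FWI system expects wind speed in kilometers per hour, so unit conversions are included.

```python
import numpy as np

def wind_speed_kmh(u10, v10):
    """Wind speed (km/h) from eastward and northward 10-m components (m/s)."""
    return 3.6 * np.hypot(u10, v10)

def relative_humidity(temp_k, dewpoint_k):
    """Relative humidity (%) from 2-m temperature and dewpoint in kelvin.

    Uses the Magnus approximation (an assumption of this sketch).
    """
    a, b = 17.625, 243.04  # Magnus coefficients for temperatures in deg C
    t = np.asarray(temp_k, dtype=float) - 273.15
    td = np.asarray(dewpoint_k, dtype=float) - 273.15
    saturation = np.exp(a * t / (b + t))
    actual = np.exp(a * td / (b + td))
    return 100.0 * actual / saturation

# Example: a 3 m/s eastward and 4 m/s northward wind, 25 deg C air, 10 deg C dewpoint.
speed = wind_speed_kmh(3.0, 4.0)
rh = relative_humidity(298.15, 283.15)
```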
An early task in any time series analysis is the assessment of seasonality. It is reasonable to expect that wildfire risk changes as each wildfire season progresses (Fig. 3) and thus we must address the seasonality before assessing the dependence. We consider each day of the wildfire season to have its own distribution under two simplifying assumptions. First, we assume there is no meaningful trend within each defined climate period. Exploratory analysis does not provide evidence against this assumption. Second, we assume that the marginal (daily) distribution of FWI changes throughout the year in a smooth manner. This assumption, while untested, seems reasonable as we do not expect the distribution of fire risk on 15 July to be meaningfully different than the distribution on 16 July. This smoothness assumption allows us to borrow strength from nearby days in the estimation of each day’s marginal distribution without introducing much bias.
Fig. 3. Daily 0.975 quantile of the ERA5 FWI time series from Grand Lake for both past and present periods with daywise confidence bounds computed through bootstrapping. The confidence bounds correspond to 95% intervals for the differences between periods. Quantiles were computed by borrowing strength from surrounding days using a 29-day moving window. The marginal distribution at high quantiles seems to have shifted up in a nearly uniform fashion.
Seasonal behavior was explored with daily high quantile plots (0.975 in Fig. 3) that were computed and smoothed using a (2k + 1)-day moving window. Exploratory analysis led us to choose k = 14 resulting in a 29-day window (e.g., FWI values for 1–29 July are used to estimate quantiles for 15 July). As anticipated, seasonal behavior was evident and will need to be accounted for via marginal transformation. However, the bimodal subseasonality seen in high quantiles of the FWI in Fig. 3 was unexpected, and was apparent across a wide range of window sizes (smaller windows exhibited the predictable increase in noise). Investigating this bimodality further, we found that the high quantiles of the DMC exhibit similar bimodality. This high quantile bimodality is not as evident in the FFMC and is not in the DC. The DMC captures dryness of middepth fuels, retains moisture information for around 2 weeks, and precipitation less than 1.5 mm is considered too little to reach these fuels. Estimated precipitation quantiles (between 0.05 and 0.9) are largest between mid-July and early August, which suggests that precipitation large enough to dampen the DMC is more common at this time than during the rest of the season.
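The moving-window quantile described above can be sketched as follows, assuming the FWI values for one climate period are stored as a (years × days) array. The function name and the synthetic gamma-distributed data are illustrative. How the window is handled at the start and end of the season is not stated in the text; this sketch simply truncates the window at the season boundaries.

```python
import numpy as np

def windowed_quantile(fwi, k=14, q=0.975):
    """Day-wise quantile pooled over all years and a (2k + 1)-day window.

    fwi : array of shape (n_years, n_days)
    Windows are truncated at the season edges (an assumption of this sketch).
    """
    fwi = np.asarray(fwi, dtype=float)
    n_days = fwi.shape[1]
    out = np.empty(n_days)
    for d in range(n_days):
        lo, hi = max(0, d - k), min(n_days, d + k + 1)
        out[d] = np.quantile(fwi[:, lo:hi], q)
    return out

# Synthetic stand-in for 20 seasons of 153 daily FWI values.
rng = np.random.default_rng(0)
fwi_demo = rng.gamma(shape=2.0, scale=10.0, size=(20, 153))
q975 = windowed_quantile(fwi_demo, k=14, q=0.975)
```

With k = 14 and 20 seasons, each interior day's estimate pools 20 × 29 = 580 values, which is what makes a 0.975 quantile estimable from so few years.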
The marginal shift in high quantiles shown in Fig. 3 suggests an increase in fire risk; values at the past climate 0.975 quantile are now observed more frequently. We subsampled years for both periods and computed the 0.975 quantile for each day to test whether the quantiles were significantly different. Subsampling seasons allows the bootstrap to account for temporal dependence in the data. With each of these 500 bootstrap samples we computed the differences (for each day) of the quantiles and computed 95% bootstrap intervals for the difference. Only the intervals for 1–3 June, 17 June, 8–12 July, and 13–31 October contained zero. To illustrate uncertainty in Fig. 3, daywise confidence bands plotted were computed using the method of Goldstein and Healy (1995), and overlap of these confidence bands closely mimics the dates found by our hypothesis test. We will see that the tail dependence is similar between climate periods (Fig. 4), which suggests that most of the increase in risk that we will find is due to this marginal shift of extreme levels.
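A minimal sketch of the year-resampling bootstrap for the day-wise quantile differences is given below. It assumes years are resampled with replacement and that percentile intervals are used; neither detail is spelled out in the text, and the function names and synthetic data are hypothetical. Whole seasons are resampled so that within-season temporal dependence is preserved in each bootstrap sample.

```python
import numpy as np

def windowed_quantile(fwi, k=14, q=0.975):
    """Day-wise quantile pooled over years and a (2k + 1)-day window."""
    n_days = fwi.shape[1]
    out = np.empty(n_days)
    for d in range(n_days):
        lo, hi = max(0, d - k), min(n_days, d + k + 1)
        out[d] = np.quantile(fwi[:, lo:hi], q)
    return out

def bootstrap_quantile_difference(past, present, n_boot=500, k=14,
                                  q=0.975, level=0.95, seed=0):
    """Percentile intervals for (present - past) day-wise quantiles."""
    rng = np.random.default_rng(seed)
    diffs = np.empty((n_boot, past.shape[1]))
    for b in range(n_boot):
        # Resample whole seasons (rows) with replacement within each period.
        past_b = past[rng.integers(0, past.shape[0], past.shape[0])]
        present_b = present[rng.integers(0, present.shape[0], present.shape[0])]
        diffs[b] = (windowed_quantile(present_b, k, q)
                    - windowed_quantile(past_b, k, q))
    alpha = 1.0 - level
    lower = np.quantile(diffs, alpha / 2.0, axis=0)
    upper = np.quantile(diffs, 1.0 - alpha / 2.0, axis=0)
    return lower, upper

# Synthetic example: the "present" period is shifted upward by 30 units.
rng = np.random.default_rng(1)
past_demo = rng.gamma(2.0, 10.0, size=(20, 153))
present_demo = rng.gamma(2.0, 10.0, size=(20, 153)) + 30.0
# n_boot is reduced here for speed; the study used 500 bootstrap samples.
lower, upper = bootstrap_quantile_difference(past_demo, present_demo, n_boot=100)
```

Days whose interval excludes zero are those for which the quantile difference would be declared significant at the chosen level.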
Fig. 4. Empirical and fitted TPDF for (top) past and (bottom) present periods of transformed Colorado ERA5 FWI time series. Our fitted model captures the empirical dependence well. The dependence appears to be similar between periods. We note that the plots appear nearly constant around 0.1 after 10–15 lags.
Quantile plots of the underlying subindices, moisture codes, and weather variables were explored to better understand the meteorological drivers contributing to this observed change in FWI. This shift toward more extreme values under current climate than under past climate is evident across BUI quantiles. ISI quantiles have also increased, but that increase fades in mid-October. These subindex increases can be traced back through the FFMC, DMC, and DC to an increase in temperatures and a decrease in moisture. The median change between study periods in daily 0.975 quantile of temperature is 2°C. A similar increase in temperatures is evident across quantiles. Decreases in the distribution of precipitation and relative humidity are also apparent. For example, the 0.5 quantile of daily precipitation under past climate has a similar seasonal pattern and level to the 0.6 quantile under present climate. Wind speed quantiles appear relatively unchanged between periods.
To fit the models of Mhatre and Cooley (2021), we transform the data so that they exhibit weak tail stationarity and the marginal distribution is regularly varying with tail index α = 2. We first transform the marginal distribution to be uniform and, in doing so, handle the observed seasonal behavior.
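The two-stage marginal transformation can be sketched as follows. This is a simplified illustration and not the paper's transformation: it uses a purely empirical day-wise CDF pooled over a (2k + 1)-day window, whereas the paper also models the upper tail with a GPD. The target margin is assumed to be Fréchet with scale 1 and shape α = 2, matching the noise distribution used later for simulation. Function names and data are hypothetical.

```python
import numpy as np

def to_uniform(fwi, k=14):
    """Day-wise probability integral transform with a pooled empirical CDF.

    Each day's values are ranked within the pool of all years and the
    surrounding (2k + 1)-day window, which removes the seasonal cycle.
    """
    fwi = np.asarray(fwi, dtype=float)
    n_days = fwi.shape[1]
    u = np.empty_like(fwi)
    for d in range(n_days):
        lo, hi = max(0, d - k), min(n_days, d + k + 1)
        pool = np.sort(fwi[:, lo:hi].ravel())
        ranks = np.searchsorted(pool, fwi[:, d], side="right")
        u[:, d] = ranks / (pool.size + 1)  # strictly inside (0, 1)
    return u

def uniform_to_frechet(u, alpha=2.0):
    """Inverse CDF of a unit-scale Frechet distribution with shape alpha."""
    return (-np.log(u)) ** (-1.0 / alpha)

# Synthetic stand-in for 20 seasons of 153 daily FWI values.
rng = np.random.default_rng(2)
fwi_demo = rng.gamma(2.0, 10.0, size=(20, 153))
u_demo = to_uniform(fwi_demo)
x_demo = uniform_to_frechet(u_demo)
```

After the transformation the series has (approximately) a common heavy-tailed margin on every day of the season, so remaining structure is dependence rather than seasonality.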
b. Dependence estimation and model fitting
The estimated TPDF (Fig. 4) is similar between the two climate periods. It is well known that extremal dependence measures are biased when tail dependence is weak (see, e.g., Huser et al. 2016) and this bias was noticed and explored in Mhatre (2022). We follow Mhatre’s suggestion and subtract off the mean of the time series before estimating the dependence. This correction reduces, but does not eliminate, bias. Our tail-dependence estimates decrease with lag, but they appear to level off around 0.1 near lag 15 presumably because of remaining bias. Simulations of transformed-linear MA(q) time series (q = 3, 5, 10) had TPDF values, after bias adjustment, beyond lag q close to 0.1.
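One common form of empirical TPDF estimator can be sketched as follows. The assumptions are that the series has already been transformed to unit-scale margins with α = 2 (so the bivariate angular measure has total mass 2), that pairs are formed within seasons only, and that the radial threshold is the 0.95 quantile of the pairwise radii. The threshold choice and function name are illustrative, and the mean-subtraction bias correction discussed above is not implemented.

```python
import numpy as np

def tpdf_hat(x, max_lag=30, q_radial=0.95):
    """Empirical tail pairwise dependence function for lags 0..max_lag.

    x : array of shape (n_seasons, n_days) with unit-scale, alpha = 2 margins.
    """
    x = np.asarray(x, dtype=float)
    n_days = x.shape[1]
    sigma = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        a = x[:, : n_days - h].ravel()   # X_t
        b = x[:, h:].ravel()             # X_{t+h}, same season
        r = np.hypot(a, b)               # radial component of each pair
        big = r > np.quantile(r, q_radial)
        # Average product of the angular components over large-radius pairs,
        # scaled by the assumed total mass of 2.
        sigma[h] = 2.0 * np.mean(a[big] * b[big] / r[big] ** 2)
    return sigma

# Independent Frechet(alpha = 2) noise: true tail dependence is zero for h > 0.
rng = np.random.default_rng(3)
noise = (-np.log(rng.uniform(size=(20, 153)))) ** -0.5
sigma_demo = tpdf_hat(noise, max_lag=10)
```

Even for independent noise the uncorrected estimates at positive lags are clearly above zero, which illustrates the positive bias under weak tail dependence that motivates the correction.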
The order q of our transformed-linear MA(q) is chosen by fitting and comparing several orders. The extremal innovations algorithm is used to estimate the parameters of MA(q) models for q = 1, 2, …, 30. Comparing the estimated TPDF plots with the theoretical TPDF from each iteration of the innovations algorithm demonstrates that our fitted model has pairwise tail dependence very similar to that of our data and is our first step in model choice [Fig. 4 compares the empirical TPDF with the fitted MA(15)]. The empirical TPDF appears to level off between lag 10 and 15 under both past and present climate, which suggests that the unbiased TPDF would likely be zero somewhere between lag 10 and 20. We compare the ability of these models to reproduce summary statistics (mean run length above high quantiles and high quantile of sum of five consecutive terms) and note, once again, that models of order 10 through 20 are reasonable. Because of the leveling off of the empirical TPDF after lag 15 we continue the analysis with the MA(15) for both periods.
c. Simulation of seasons and attribution
Once we have fit our chosen models, we simulate 10 000 seasons from each period. To simulate seasons, we first generate a noise sequence of 168 (length of the season 153 plus model order 15) independent Fréchet random variables with scale 1 and shape 2. This independent noise sequence is iteratively put through (2) with q = 15, θ0 = 1, and θj (j = 1, 2, …, 15) equal to the coefficients that were estimated using the extremal innovations algorithm for the given period. This ensures the pairwise tail dependence of the simulated season matches what was observed. To reintroduce the observed seasonality into these 10 000 stationary simulated seasons we perform a two-step back-transformation. We first use the ECDF to transform to a uniform marginal distribution. We then use the inverse of (5) to transform the uniform marginal time series into seasons on the original FWI scale.
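The simulation step can be sketched as follows, under two assumptions: that the transformed-linear MA(q) is the transformed-linear sum of θj-scaled lagged noise terms, and that the underlying map is the softplus function of Cooley and Thibaud (2019). Under those assumptions the recursion is an ordinary moving average applied on the preimage scale. The coefficients below are hypothetical placeholders and are not the values estimated in the study.

```python
import numpy as np

def f(y):
    """Softplus map to the positive half line (assumed form of the transform)."""
    return np.logaddexp(0.0, y)

def f_inv(x):
    """Inverse softplus, stable for large x."""
    return x + np.log(-np.expm1(-x))

def simulate_tl_ma(theta, n_days=153, n_seasons=1, seed=0):
    """Simulate stationary transformed-linear MA(q) seasons.

    theta : coefficients (theta_0, ..., theta_q), with theta_0 = 1.
    Returns an array of shape (n_seasons, n_days).
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    q = theta.size - 1
    # Independent Frechet noise with scale 1 and shape 2: n_days + q values per season.
    z = (-np.log(rng.uniform(size=(n_seasons, n_days + q)))) ** -0.5
    y = f_inv(z)
    # Ordinary MA(q) filter on the preimage scale, then map back to positive values.
    lin = np.stack([np.convolve(row, theta, mode="valid") for row in y])
    return f(lin)

# Hypothetical coefficients with geometric decay (illustration only).
theta_demo = 0.7 ** np.arange(16)   # q = 15, theta_0 = 1
seasons = simulate_tl_ma(theta_demo, n_days=153, n_seasons=5, seed=4)
```

Each season consumes 153 + 15 = 168 noise values, matching the count given in the text. The simulated seasons are stationary and still need the two-step back-transformation before they are on the FWI scale.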
Attribution is done by comparing probabilities of observing high-risk seasons under past and present climates. We consider a season to be high risk if it had at least as many days above a high threshold as were observed in 2020. Our thresholds were the 0.95, 0.975, and 0.99 quantiles of the observed present climate FWI time series (34.41, 39.47, and 44.94, respectively) (Table 1). In 2020 there were 22, 14, and 6 days over the respective thresholds. Under past climate 31 of the 10 000 seasons had at least 22 days with FWI values above 34.41 as compared with 322 simulated present climate seasons. The ratio of the point estimates (10.39) suggests that a season as extreme as 2020 is more than 10 times as likely under present climate as it was under past climate. The ratios for the 0.975 and 0.99 quantile thresholds were 4.33 and 8.77, respectively.
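The attribution calculation can be sketched as follows. The helper `high_risk_proportion` is a hypothetical name; it counts the days above a threshold in each simulated season and returns the share of seasons meeting the high-risk definition. The worked numbers reproduce the 0.95-quantile case reported above (31 past and 322 present high-risk seasons out of 10 000).

```python
import numpy as np

def high_risk_proportion(seasons, threshold, n_days_required):
    """Share of seasons with at least n_days_required days above threshold.

    seasons : array of shape (n_seasons, n_days) on the FWI scale.
    """
    exceed_days = (np.asarray(seasons) > threshold).sum(axis=1)
    return float(np.mean(exceed_days >= n_days_required))

# Toy check: seasons 1 and 2 have at least two days above 4; season 3 has none.
toy = np.array([[1, 5, 6], [7, 8, 9], [0, 0, 0]])
toy_prop = high_risk_proportion(toy, threshold=4, n_days_required=2)

# Worked example for the 0.95-quantile threshold (34.41, at least 22 days).
n_sim = 10_000
p_past = 31 / n_sim
p_present = 322 / n_sim
risk_ratio = p_present / p_past
print(round(risk_ratio, 2))  # 10.39
```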
Table 1. Results from Grand Lake ERA5 data using a transformed-linear MA(15) for past and present climate. Columns 1 and 2 report three high thresholds of the present climate, and the number of days in 2020 that exceeded those high quantiles is in column 3 (definition of high-risk season). The proportion of simulated high-risk seasons is in columns 4 and 5 for the respective periods. Column 6 is the multiplicative change from past to present (ratio of column 5 to column 4). Bootstrapped 95% confidence intervals are reported in parentheses.
Uncertainty in our estimates is reported with intervals computed through bootstrapping. Each bootstrapped estimate was computed by sampling 20 years, with replacement, from each climate period. We use the observed data from these 20 years to estimate the marginal distribution (5), transform to be regularly varying, estimate the TPDF, fit the MA(15), simulate 10 000 seasons, and compute proportion of high-risk seasons.
The uncertainty bounds computed from these bootstrapped results are large, due to only having 20 observed seasons with which to estimate all aspects of our method. Despite the width of the confidence intervals, the ratios are entirely above 1, and thus the change in attributed seasonal risk is significant. The bootstrapping is computationally intensive, but simple to implement. Five hundred full analyses (minus model order selection) were completed with each using less than 0.5 gigabytes (GB) of RAM running on a single thread and thus the analysis could be completed on a laptop computer. Access to a computer with 64 2.7-GHz cores and 128 GB of RAM, combined with the embarrassingly parallel nature of the bootstrapping, allowed for rapid results from all 500 analyses (each bootstrap analysis took around 12 min to complete on this machine).
Sensitivity to model order was explored by completing the analysis with MA orders 10 and 20. The ratios of proportion of extreme fire seasons (present to past) were 24, 20.68, and 9.88, respectively, when using the fitted MA(10) to simulate seasons. There were fewer simulated high-risk seasons with the MA(10) than with either of the other models, but the ratio is increased because there were proportionally fewer high-risk seasons in the past than in the present. When using the fitted MA(20) to simulate seasons the ratios were 8.02, 4.94, and 9.79, respectively, which are similar to those obtained with our chosen model. We expect the ratio estimates to be relatively stable between models of orders that accurately capture the tail dependence.
We summarize our findings with the following attribution statement. Applied to FWI data generated from ERA5 output, our method estimates that a wildfire season like the one observed in 2020 near Grand Lake is 4–10 times as likely under recently observed climate as under the climate of roughly 50 years ago. Our method rejects the null hypothesis that the risk of observing a season like 2020 is unchanged between these two periods. Figures 3 and 4 indicate that this increase in risk is due to a shift in the marginal distribution of risk, not the tail dependence.
4. Application to RAWS data in Colorado
a. Data and preprocessing
We apply the same method to RAWS weather station data (National Interagency Fire Center 2023) from Harbison Meadow, Colorado (NWS identifier 050402), which is located within the grid box used for the ERA5 analysis in section 3. The analysis is repeated on weather station data to allow for comparison with the reanalysis product and because station observations are easily explained to a broader audience. The RAWS data for Harbison Meadow were downloaded by using the “Weather Data” link on the Wildland Fire Application Information Portal website (https://www.wildfire.gov/application/fire-and-weather-data-extract).
The present period had a nearly complete record but the past period had only 12 years of usable data. Harbison Meadow has data recorded from 1964 to the present, but the years from 1972 through 1974 were deemed unusable for two reasons: approximately one-half of the data in each year were missing, and the maximum FWI value for those years was 0.47, which is well below the next lowest maximum value of the past period, 16.04, and is the 0.56 quantile of the past period.
Seasonal behavior of the RAWS FWI time series was explored in the same manner as was done with the ERA5 data. The FWI daily high quantile (Fig. 5) again shows a clear shift in the marginal distribution at the highest quantiles, which is often larger in magnitude than it was in the ERA5 data. The maximum difference between present and past estimated 0.975 quantiles in the ERA5 data was 18 units. There were 62 days in RAWS data (40% of the season) that had differences that were greater than 18 units. The present period indicates bimodality in the highest quantiles but the second peak is missing in the past period. The 0.975 quantile does not remain at its highest values for as much of the season in the RAWS data as it did in the ERA5 data, but these highest values are greater than what was observed in the ERA5 data. We expect that some of this difference is due to the different spatial scales and model bias. Exploration of high quantiles of FWI components indicated extreme temperatures are hotter, extreme wind speeds are faster, and humidity is lower in the current period than in the past period. Additionally, high quantiles of all three fuel moisture codes (FFMC, DMC, and DC) and of the fire behavior subindices (ISI and BUI) show an increase in risk.
Fig. 5. As in Fig. 3, but from the RAWS Harbison Meadow data.
The high quantiles for past and present climate are significantly different. This was determined using joint daily 95% bootstrap intervals that only overlap at the end of the season (Fig. 5).
The TPDF for the past period appears to differ from the present period TPDF (Fig. 6), which was not apparent in the ERA5 data (Fig. 4). This suggests that there may be a change in the dependence as well as a change in the marginal distribution. We explore the effects of this by fitting models of orders 10, 15, and 20 to both periods. The results were similar in all cases including when we compared different orders for the two periods. The following results are from the MA(15) fitted to both periods because that matches the previous analysis.
Fig. 6. As in Fig. 4, but for RAWS Harbison Meadow.
b. Attribution
Table 2 summarizes results from the analysis of the RAWS data. The 0.95, 0.975, and 0.99 quantiles of the present climate RAWS data were 34.96, 42.97, and 51.54, respectively. In 2020 there were 32, 24, and 11 days with FWI values above those respective high thresholds. Of 10 000 simulated present climate seasons there were 78, 104, and 951 that were high risk. Zero of the past climate seasons were high risk. The GPD used to transform the upper tail of the simulated past climate seasons has a bounded tail that is below each high quantile for the majority of the season. This suggests that the FWI values observed in 2020 were not plausible under past climate.
Table 2. As in Table 1, but results from Harbison Meadow RAWS data using a transformed-linear MA(15) model.
An additional analysis was performed, which ensured that the extreme FWI values observed in 2020 were possible under past climate. We repeated the analysis but enforced Gumbel (unbounded but light) tails on the GPD. This has the added benefit of assessing sensitivity to the GPD parameters in the back-transformation. When the GPD is restricted in this manner there were 0, 0, and 8 simulated past climate seasons of 10 000 that were high risk.
As before, we summarize with the following attribution statement. Applied to RAWS data, our method estimates that the extreme fire weather observed in 2020 was not possible under the observed climate of roughly 50 years ago. Our method rejects the null hypothesis that seasonal wildfire risk is unchanged between the two studied periods. When reanalyzed using an approach that forces a possibility of observing high-risk seasons under past climate, the method estimates that the risk of observing a season like 2020 is at least 138 times as great under the recently observed climate. The increased risk appears to be due to a shift in the marginal distribution at high quantiles (Fig. 5) and an increase in the tail dependence (Fig. 6).
5. Application to ERA5 data in California
a. Data and preprocessing
We apply our method to an ERA5 grid box containing Quincy, California. This grid box was burned in the 2020 North Complex Fire (over 300 000 acres burned) and is adjacent to a grid box burned by the 2018 Camp Fire, the deadliest fire in California history (Reyes-Velarde 2019). There is a RAWS weather station in the grid box (Quincy Road station, NWS identifier 040910) that could be used for comparison (this comparison was omitted for brevity).
The high quantile seasonality observed in Quincy (Fig. 7) is unimodal and does not show the large marginal shift between periods that we noticed in Colorado. The statistically significant change in the marginal distribution is that the season is longer; that is, the FWI time series are at their highest levels for a longer period of time in the present period than in the past period. Exploratory analyses show that the high quantiles of temperature remain at their highest levels for longer, and precipitation and relative humidity quantiles remain at their lowest levels for longer, under present climate than under past climate.
Fig. 7. As in Fig. 3, but from Quincy.
We note that the tail dependence in the California region (Fig. 8) seems to level off at shorter lags than in Colorado (Figs. 4 and 6). This is likely because the dependence estimated by the TPDF is in the “residuals” after accounting for seasonality. In comparison with Colorado, there is less season-to-season variation in the FWI in California, and TPDF values seem to be driven more by variables (like ISI) with shorter time scales. TPDF values also appear slightly stronger and longer lasting under past climate than under present climate. This may be attributable to the observed change in the BUI, the fire behavior subindex related to dryness, which has stronger and longer-lasting tail dependence under current climate and thus may be included while accounting for seasonality. To assess sensitivity to model order, we completed the analysis with models of orders 5 and 10 and found similar results. The reported estimates use model order 10 for both periods.
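For readers unfamiliar with how a TPDF value at a given lag can be estimated, the following Python sketch shows one common form of the estimator. It is an illustration under stated assumptions and not the authors' R implementation: we assume the series has already had seasonality removed, we transform it by ranks to an approximately Fréchet margin with tail index 2 and unit scale, and we estimate the lag-h value as twice the average product of the angular components of (x_t, x_{t+h}) over pairs with large radius. The function names, the 0.95 radial quantile, and the toy data are all hypothetical choices.

```python
import numpy as np


def to_frechet2(x):
    """Rank-transform a series to approximately Frechet margins (alpha = 2)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1          # ranks 1..n
    u = ranks / (len(x) + 1.0)                     # empirical CDF in (0, 1)
    return (-np.log(u)) ** (-0.5)                  # inverse of exp(-x**-2)


def tpdf_estimate(z, lag, radial_quantile=0.95):
    """Estimate tail pairwise dependence between z_t and z_{t+lag}.

    ASSUMPTION: z is positive with alpha = 2 tails and unit scale, so the
    angular mass of each pair is 2 and the lag-0 value equals 1.
    """
    z = np.asarray(z, dtype=float)
    a, b = (z, z) if lag == 0 else (z[:-lag], z[lag:])
    r = np.sqrt(a**2 + b**2)                       # radial component
    keep = r > np.quantile(r, radial_quantile)     # keep only large pairs
    w_product = (a[keep] / r[keep]) * (b[keep] / r[keep])
    return 2.0 * w_product.mean()


# Toy data: an independent series, so dependence at lag 1 should be weak.
rng = np.random.default_rng(0)
z = to_frechet2(rng.normal(size=2000))
sigma0 = tpdf_estimate(z, lag=0)
sigma1 = tpdf_estimate(z, lag=1)
print(sigma0, sigma1)
```

By construction the lag-0 value is 1, and values near 1 at longer lags indicate that large values tend to persist. For an independent series the estimate at nonzero lags should be well below 1, although a finite radial threshold biases it upward from 0.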
Fig. 8. As in Fig. 4, but for Quincy.
b. Attribution
Table 3 shows results from our attribution study. The 0.95, 0.975, and 0.99 quantiles of the present climate ERA5 data in California were 67.5, 70.1, and 72.3, respectively. In 2021 there were 22, 12, and 5 days with FWI values above those respective high thresholds. Forty-five of the simulated past climate seasons were classified as high risk at the 0.975 quantile as compared with 55 of the present climate seasons.
Applied to FWI data generated from ERA5 output for the grid cell containing Quincy, our method estimates that a wildfire season like the one observed in 2021 is 1.22 to 1.89 times as likely under recently observed climate as under the observed climate of roughly 50 years ago. However, bootstrap confidence intervals contain one and thus our method fails to reject a null hypothesis of no change. It appears that under current climate, high quantiles of FWI are at higher values for a longer portion of the season, and bootstrap-based hypothesis tests support this conclusion.
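The point estimate of the risk ratio follows directly from the counts of high-risk simulated seasons. The short Python sketch below is an illustration with a hypothetical function name, not the authors' code. It assumes the same number of simulated seasons in both periods; we use 10 000 as in the Colorado analysis, which is an assumption for California, although the ratio does not depend on that number when the two are equal. The counts 55 and 45 are those reported above at the 0.975 quantile.

```python
import math


def risk_ratio(present_hits, past_hits, n_present, n_past):
    """Ratio of the present to the past proportion of high-risk seasons."""
    if past_hits == 0:
        # No high-risk past seasons: the ratio is unbounded (or undefined
        # if there are no high-risk present seasons either).
        return math.inf if present_hits > 0 else math.nan
    return (present_hits / n_present) / (past_hits / n_past)


# Counts at the 0.975 quantile from the text; n = 10 000 is an ASSUMPTION.
n_sim = 10_000
rr_0975 = risk_ratio(present_hits=55, past_hits=45, n_present=n_sim, n_past=n_sim)
print(round(rr_0975, 2))
```

The result, 55/45 ≈ 1.22, appears to correspond to the lower end of the reported range. The zero-count branch shows why a point estimate cannot be formed when no simulated past season is high risk, as in the initial RAWS analysis in Colorado.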
6. Discussion
This paper develops a relatively simple, computationally inexpensive, theoretically justifiable method to quantify how much more likely an extreme fire season is now than it was 50 years ago. We quantify the increase in risk in reference to a well-recognized high-risk season, which enables easy communication of results; that is, we perform extreme-event attribution of seasonal risk. Our method relies on time series models that specifically focus on extreme behavior. Because these models capture the pairwise tail dependence well, and because the method treats the two time periods in the same manner, our approach is useful for making meaningful comparisons.
Our method was applied to FWI data from two different locations and two different data sources. In the Grand Lake area of Colorado, our method estimates a dramatic increase in the risk of observing a fire season like the one observed in 2020, and this was seen in both reanalysis and weather station data. This increased risk is mostly attributable to an upward shift of the quantiles of the marginal FWI distribution. In California, our method’s point estimates suggest an increase in the risk, but the uncertainty associated with these estimates does not allow one to reject a null hypothesis of no change in seasonal risk. It is possible that analyzing a different fire weather index would yield different results. The FWI was developed in Canada for use with forests that are primarily jack and lodgepole pine, and it may not be the best index to summarize weather-driven fire risk in California. Fortunately, the same analysis could readily be repeated with any other indicator of fire risk. Because this paper considers only the fire risk associated with the meteorologically derived FWI, it focuses specifically on risk due to changes in observed climate and does not directly consider other factors, such as past forest management, that could contribute to overall fire risk.
Like any analysis, our method requires several modeling choices. Perhaps chief among these is the selection of the order q of the MA(q) model. Our order selection was based on subjectively interpreting TPDF plots; fortunately, sensitivity analysis showed that so long as q was chosen to be sufficiently large, conclusions were not affected. Another choice was to fit an MA(q) rather than an ARMA(p, q), which possibly could capture the dependence seen in the TPDF with a smaller number of parameters. We chose an MA(q) because we employed the innovations algorithm to fit our model; model fitting of general ARMA models is an area for future investigation. Our approach also employs a two-step estimation method, first fitting the marginal distribution and transforming, and then assessing the dependence. This two-step approach propagates any error from the first step forward; however, our bootstrap method for assessing uncertainty accounts for this error propagation. One observed shortcoming of our model is that, in Colorado, we observed seasons in which the FWI remained moderate for the entire season, whereas every simulated season was extreme at least once. This shortcoming is consistent between climate periods, and thus past-to-current comparisons should still be relevant. Because model fitting uses only large observations, our approach should be used only to assess extreme behavior and should not be used to assess quantities associated with the bulk of the distribution.
Acknowledgments.
Author Cooley was partially supported by National Science Foundation Grant DMS-1811657.
Data availability statement.
ERA5 data (ECMWF 2023) were accessed from the Climate Data Store (https://cds.climate.copernicus.eu/). RAWS data (National Interagency Fire Center 2023) were downloaded using the “Weather Data” link (https://www.wildfire.gov/application/fire-and-weather-data-extract). Raw data and formatted files can be accessed online (https://github.com/twixson/seasonal_wildfire_risk_attribution). All analyses were performed using R (R Core Team 2022). We used the following packages: cffdrs (Wang et al. 2017), chron (James and Hornik 2023), evd (Stephenson 2002), humidity (Cai 2019), lattice (Sarkar 2008), lubridate (Grolemund and Wickham 2011), knitr (Xie 2014), ncdf4 (Pierce 2023), and tidyverse (Wickham et al. 2019).
REFERENCES
Abatzoglou, J. T., and A. P. Williams, 2016: Impact of anthropogenic climate change on wildfire across western US forests. Proc. Natl. Acad. Sci. USA, 113, 11 770–11 775, https://doi.org/10.1073/pnas.1607171113.
Cai, J., 2019: Humidity: Calculate water vapor measures from temperature and dew point, version 0.1.5. R package, https://github.com/caijun/humidity.
Coles, S., 2001: An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics, Springer-Verlag, 208 pp.
Cooley, D., and E. Thibaud, 2019: Decompositions of dependence for high-dimensional extremes. Biometrika, 106, 587–604, https://doi.org/10.1093/biomet/asz028.
de Haan, L., and A. Ferreira, 2006: Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering, Vol. 3, Springer, 418 pp.
ECMWF, 2023: ERA5: Data documentation. ECMWF, https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation.
Goldstein, H., and M. J. R. Healy, 1995: The graphical presentation of a collection of means. J. Roy. Stat. Soc., 158A, 175–177, https://doi.org/10.2307/2983411.
Grolemund, G., and H. Wickham, 2011: Dates and times made easy with lubridate. J. Stat. Software, 40 (3), 1–25, https://doi.org/10.18637/jss.v040.i03.
Hicke, J., and Coauthors, 2022: North America. Climate Change 2022: Impacts, Adaptation and Vulnerability, H. O. Pörtner et al., Eds., Cambridge University Press, 1929–2042, https://doi.org/10.1017/9781009325844.016.
Huser, R., A. C. Davison, and M. G. Genton, 2016: Likelihood estimators for multivariate extremes. Extremes, 19, 79–103, https://doi.org/10.1007/s10687-015-0230-4.
James, D., and K. Hornik, 2023: chron: Chronological objects which can handle dates and times, version 2.3-61. R package, https://CRAN.R-project.org/package=chron.
Jézéquel, A., V. Dépoues, H. Guillemot, M. Trolliet, J.-P. Vanderlinden, and P. Yiou, 2018: Behind the veil of extreme event attribution. Climatic Change, 149, 367–383, https://doi.org/10.1007/s10584-018-2252-9.
Jiang, Y., D. Cooley, and M. F. Wehner, 2020: Principal component analysis for extremes and application to US precipitation. J. Climate, 33, 6441–6451, https://doi.org/10.1175/JCLI-D-19-0413.1.
Kiriliouk, A., and C. Zhou, 2022: Estimating probabilities of multivariate failure sets based on pairwise tail dependence coefficients. arXiv, 2210.12618v1, https://doi.org/10.48550/ARXIV.2210.12618.
Mhatre, N., 2022: Transformed-linear models for time series extremes. Ph.D. dissertation, Colorado State University, 98 pp.
Mhatre, N., and D. Cooley, 2021: Transformed-linear models for time series extremes. arXiv, 2012.06705v3, https://doi.org/10.48550/arXiv.2012.06705.
National Interagency Fire Center, 2023: Remote Automatic Weather Stations (RAWS). https://www.nifc.gov/about-us/what-is-nifc/remote-automatic-weather-stations.
Patricola, C. M., and M. F. Wehner, 2018: Anthropogenic influences on major tropical cyclone events. Nature, 563, 339–346, https://doi.org/10.1038/s41586-018-0673-2.
Pierce, D., 2023: ncdf4: Interface to Unidata netCDF (version 4 or earlier) format data files, version 1.21. R package, https://CRAN.R-project.org/package=ncdf4.
R Core Team, 2022: R: A language and environment for statistical computing. R Foundation for Statistical Computing, https://www.R-project.org/.
Reyes-Velarde, A., 2019: California’s Camp Fire was the costliest global disaster last year, insurance report shows. Los Angeles Times, 11 January, https://www.latimes.com/local/lanow/la-me-ln-camp-fire-insured-losses-20190111-story.html.
Sarkar, D., 2008: Lattice: Multivariate Data Visualization with R. Springer, 268 pp., http://lmdvr.r-forge.r-project.org.
State of California, 2023: Stats and events. CAL FIRE, accessed 26 October 2023, https://www.fire.ca.gov/stats-events/.
State of Colorado, 2023: Historical wildfire information. Division of Fire Prevention and Control, accessed 15 March 2023, https://dfpc.colorado.gov/sections/wildfire-information-center/historical-wildfire-information.
Stephenson, A. G., 2002: evd: Extreme value distributions. R News, No. 2(2), R Foundation, Vienna, Austria, 31–32, https://CRAN.R-project.org/doc/Rnews/.
Van Wagner, C. E., 1987: Development and structure of the Canadian forest fire weather index system. Canadian Forestry Service Forestry Tech. Rep. 35, 48 pp., https://d1ied5g1xfgpx8.cloudfront.net/pdfs/19927.pdf.
Wang, X., B. M. Wotton, A. S. Cantin, M.-A. Parisien, K. Anderson, B. Moore, and M. D. Flannigan, 2017: cffdrs: An R package for the Canadian forest fire danger rating system. Ecol. Process., 6, 5, https://doi.org/10.1186/s13717-017-0070-z.
Wickham, H., and Coauthors, 2019: Welcome to the Tidyverse. J. Open Source Software, 4, 1686, https://doi.org/10.21105/joss.01686.
Xie, Y., 2014: knitr: A comprehensive tool for reproducible research in R. Implementing Reproducible Computational Research, V. Stodden, F. Leisch, and R. D. Peng, Eds., Chapman and Hall/CRC, https://www.taylorfrancis.com/chapters/edit/10.1201/9781315373461-1/knitr-comprehensive-tool-reproducible-research-yihui-xie.