Hurricane Sandy (2012, referred to as Current Sandy) was among the most devastating storms to impact Connecticut’s overhead electric distribution network, resulting in over 15 000 outage locations that affected more than 500 000 customers. In this paper, the severity of tree-caused outages in Connecticut is estimated under future-climate Hurricane Sandy simulations, each exhibiting strengthened winds and heavier rain accumulation over the study area from large-scale thermodynamic changes in the atmosphere and track changes in the year ~2100 (referred to as Future Sandy). Three machine-learning models used five weather simulations and the ensemble mean of Current and Future Sandy, along with land-use and overhead utility infrastructure data, to predict the severity and spatial distribution of outages across the Eversource Energy service territory in Connecticut. To assess the influence of increased precipitation from Future Sandy, two approaches were compared: an outage model fit with a full set of variables accounting for both wind and precipitation, and a reduced set with only wind. Future Sandy displayed an outage increase of 42%–64% when using the ensemble of WRF simulations fit with three different outage prediction models. This study is a proof of concept for the assessment of increased outage risk resulting from potential changes in tropical cyclone intensity associated with late-century thermodynamic changes driven by the IPCC AR4 A2 emissions scenario.
Hurricane Sandy was among the three major storms that affected Connecticut in the past decade (alongside Tropical Storm Irene and the October 2011 nor’easter). Though technically classified as a posttropical cyclone when it made landfall (Blake et al. 2012), Sandy severely affected Connecticut’s largest electric utility, the Connecticut Light and Power Company, doing business as Eversource Energy (Eversource). At the peak of the restoration over 500 000 customers were affected, with some customers without power for nine days. In addition, more than 15 000 outages were repaired by a workforce 6 times as large as Eversource’s normal operating workforce (Caron et al. 2013). Figure 1 shows the spatial distribution of outages across the Eversource service territory. Most of the outages were concentrated in Fairfield County (southwestern Connecticut), where substantial overhead electric distribution infrastructure and population are present. Although storm surge was extensive during Sandy (Fanelli et al. 2013), the majority of outages in the Eversource service territory were caused by wind and trees affecting overhead lines (T. Layton, Eversource Energy, 2015, personal communication).
Weather is found to be responsible for nearly 44% of power outages in the United States (Campbell 2013), with hurricanes and tropical storms affecting an average of 782 695 customers per event (Hines et al. 2008). The annual cost of power outages (in 2012 USD) has been estimated at between $28 billion and $209 billion, with annual weather-related outages estimated to cost between $25 billion and $70 billion (Abraham et al. 2013). In addition to impacts on the economy, utilities can also incur direct costs of tens to hundreds of millions of dollars for labor and equipment because of a storm (Northeast Utilities 2013).
Given that Sandy was particularly impactful for utilities in the mid-Atlantic states and New England (Henry and Ramirez-Marquez 2016), in this paper, we present a proof of concept for assessing the impacts of Sandy within a future climate scenario as it pertains to overhead electric distribution networks (distribution networks). A case study or storyline approach is consistent with the pseudo-global-warming (PGW) approach taken here as a viable means to evaluate the impacts of climate warming on an observed weather event (Schär et al. 1996; Trenberth et al. 2015; Shepherd 2016). The present study complements existing long-term hurricane planning efforts in the United States and answers the following question: how many more outages would occur if Hurricane Sandy impacted Connecticut in the future, forced by a different large-scale climate scenario? As noted by Staid et al. (2014), there is less consensus about whether the frequency of tropical cyclones will increase (Emanuel 2005) or decrease (Emanuel et al. 2008; Knutson et al. 2010). Nevertheless, there is consensus that the strongest tropical cyclones will strengthen to some degree (Pielke 2007; Knutson et al. 2010). Yates et al. (2014) examined the potential of substations being flooded under Future Sandy scenarios and found that coastal flooding in Long Island, New York (close proximity to Connecticut), could nearly double in some areas.
This study is facilitated by the recent work of Lackmann (2015) who investigated Hurricane Sandy track scenarios under current (Current Sandy), future (~2100, Future Sandy), and past climate (~1890, Past Sandy) thermodynamic and sea surface conditions. Lackmann (2015) found that while Past Sandy tracks are indistinguishable from the Current Sandy simulations, Future Sandy scenarios appear to be stronger and shifted farther north toward New England. See Lackmann (2015) for plots of sea level pressure; lower sea level pressure is consistent with stronger storm-centric winds. While the Lackmann (2015) study has caveats, including whether or not Sandy would form under future conditions, the goal of his study was to isolate the influence of changes in the large-scale thermodynamic environment on the intensity and track of a system like Sandy.
Using Lackmann’s technique (Lackmann 2015), we demonstrate a case study of how a potential storm, a hypothetical future Hurricane Sandy, might affect the electrical grid in Connecticut. Generalized conclusions about how future tropical cyclones can affect the distribution network require examination of many events, featuring a variety of tracks and intensities. Nevertheless, the added value of the presented methodology is that it can be implemented when data for future tropical cyclones become available. To assess hypothetical future changes in overhead electric distribution grid outages based on simulation of a single storm event, it is necessary to recognize that impact changes will be a function of (i) changes in the intensity and size of the storm itself and (ii) changes in the track of the storm. This study combines these two aspects through Lackmann’s (2015) simulations, which we believe provides a framework for emergency managers to evaluate the impacts of climate data on infrastructure networks they manage.
The paper is structured as follows: Section 2 discusses the weather simulation modeling framework, and a comparison of the Current and Future Sandy storms; section 3 provides details on the outage prediction modeling, including an overview of the nonparametric models and our methodology; section 4 contains the results and a discussion on how track and severity influenced the occurrence of power outages, as well as the limitations of the study; section 5 contains major findings and future research directions.
2. Weather data
The IPCC Fourth Assessment Report (AR4) describes several future emissions scenarios and their associated impacts on global average temperature and sea level rise; these range from holding emissions constant at year-2000 levels to scenarios with increasing emissions. In our study, we utilized the A2 emissions scenario, which describes a heterogeneous world with increasing population and carbon emissions through the year 2100 (Nakicenovic and Swart 2000). It is the second-highest emissions scenario among those used at that time, loosely corresponding to the RCP8.5 scenario in the IPCC Fifth Assessment Report (AR5).
For the work presented here we relied on the Weather Research and Forecasting (WRF) Model (Skamarock et al. 2008) simulations reported in Lackmann (2015). Two different five-member ensemble simulation sets of Sandy were used: one for the current and one for the future climate scenario. The model simulations included three gridded domains with 54-, 18-, and 6-km horizontal grid spacing using one-way nesting for the two inner grids. From the 17 members described by Lackmann (2015), 5 were selected to supply the outage prediction model input. The WRF members were selected based on the availability of the 6-km domain and the variations in the physical parameterization schemes. To achieve a sample of available WRF configurations, the variations included cumulus parameterization, microphysics, and planetary boundary layer schemes. A summary of the variations in the physical parameterizations for each WRF ensemble member is provided in Table 1 herein and in Lackmann (2015), and the 6-km domain is displayed in Fig. 2a.
The initial and boundary conditions for the Current Sandy ensemble set were obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (Dee et al. 2011), with an approximate spatial grid of 0.7°. The PGW procedure used to generate future simulations of Sandy was described in detail by Lackmann (2015), and here we describe the essential aspects. Thermodynamic changes between the 1990s and 2090s were computed using a subset of general circulation model (GCM) projections from the CMIP3 project (Meehl et al. 2007) for the A2 emissions scenario. The GCM-based temperature change fields were applied to initial and lateral boundary conditions as well as to lower boundaries (sea surface and soil temperatures) in the original ensemble. At constant relative humidity, warming was associated with an increase in specific humidity. A hydrostatically balanced geopotential field was then computed based on the modified virtual temperature. The digital filter initialization (DFI) procedure in WRF was used to ensure balance between the wind and mass fields in the model initial conditions. Thus, the future simulations essentially answer the following question: If the synoptic weather pattern preceding Sandy were to take place in a warmer, moister tropospheric environment, how would the track and intensity of the system change?
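The link between warming at constant relative humidity and increased specific humidity can be illustrated with a short calculation. The sketch below is not the PGW procedure of Lackmann (2015); it assumes the Bolton (1980) approximation for saturation vapor pressure, and the temperature, relative humidity, pressure, and warming values are hypothetical and chosen only for illustration.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Saturation vapor pressure over liquid water (hPa), Bolton (1980) approximation."""
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))

def specific_humidity(temp_c, rel_hum, pressure_hpa):
    """Specific humidity (kg/kg) from temperature (deg C), RH (0-1), and pressure (hPa)."""
    e = rel_hum * saturation_vapor_pressure_hpa(temp_c)  # actual vapor pressure
    return 0.622 * e / (pressure_hpa - 0.378 * e)

# Hypothetical near-surface state and warming increment (illustration only).
temp_c, rel_hum, pressure_hpa, delta_t = 20.0, 0.80, 1000.0, 3.0

q_now = specific_humidity(temp_c, rel_hum, pressure_hpa)
q_future = specific_humidity(temp_c + delta_t, rel_hum, pressure_hpa)
pct_increase = 100.0 * (q_future - q_now) / q_now

print(f"q now: {q_now:.4f} kg/kg, q future: {q_future:.4f} kg/kg")
print(f"increase at constant RH: {pct_increase:.1f}%")
```

Under these assumed values, a 3-K warming at fixed relative humidity raises specific humidity by roughly one-fifth, which is consistent with the moister future environment described above.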
The authors have much experience using gridded, numerical weather prediction (NWP) model outputs for predicting storm-related power outages (Wanik et al. 2015; He et al. 2017; Wanik et al. 2017). Similar to our previous work, the WRF simulations were processed into a set of features that serve as input to the outage model. Specifically, within the simulated hours enclosing the storm period across the study area, wind and precipitation variables were postprocessed to summarize the storm temporal evolution. Wind speed at 10-m height, precipitation accumulation, and surface gust [see Wanik et al. (2015) for computation] were reduced into the storm maxima and durations exceeding wind thresholds at each grid point within the area covering the Eversource service territory in Connecticut [see Table 2; a detailed postprocess description is given by Wanik et al. (2015)].
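The reduction of a simulated time series into storm maxima and threshold-exceedance durations can be sketched as follows. This is a minimal illustration rather than the postprocessing of Wanik et al. (2015): the function name, the hourly time step, the use of hourly precipitation increments, and the wind thresholds are all assumptions made for the example.

```python
def storm_features(wind_ms, gust_ms, precip_mm, wind_thresholds=(9.0, 13.0, 18.0)):
    """Summarize hourly series at one grid point into storm-level features.

    wind_ms, gust_ms: hourly 10-m wind speed and gust (m/s).
    precip_mm: hourly precipitation increments (mm); assumed, not cumulative totals.
    wind_thresholds: hypothetical thresholds (m/s) for duration counts.
    """
    features = {
        "max_wind": max(wind_ms),
        "max_gust": max(gust_ms),
        "total_precip": sum(precip_mm),
    }
    for thr in wind_thresholds:
        # With hourly data, the count of exceedances equals the duration in hours.
        features[f"hours_wind_gt_{thr:g}"] = sum(1 for w in wind_ms if w > thr)
    return features

# Hypothetical six-hour series at a single grid point.
wind = [6.0, 10.0, 14.0, 19.0, 15.0, 8.0]
gust = [10.0, 16.0, 22.0, 30.0, 24.0, 12.0]
precip = [0.0, 2.0, 5.0, 8.0, 3.0, 0.5]

features = storm_features(wind, gust, precip)
print(features)
```

Because each feature is a maximum, a total, or a duration, the summary does not depend on exactly when the peak occurs within the storm period.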
b. Evaluation of Current Sandy WRF simulations
To evaluate the consistency of the Current Sandy simulations, we compared the simulated wind speed and precipitation with available observations. We used wind speed observations from airport (METAR) stations provided by the National Centers for Environmental Prediction (NCEP) Automated Data Processing (ADP) Global Upper Air and Surface Weather Observations (NOAA/NCEP 1997) and precipitation from the NCEP Stage IV analysis data (radar and gauges; Lin and Mitchell 2005). The statistical error metrics are listed in appendix A.
We present a comparison of model-simulated temperature from the Current Sandy control (CNTRL) simulation valid 1800 UTC 28 October 2012, and the closely corresponding actual temperature as shown by a GOES-13 IR image from 1815 UTC 28 October 2012. The comparison demonstrates that the CNTRL simulation captured the asymmetrical structure of Current Sandy, and this builds confidence in the accuracy of the WRF simulations we use. The time series of 10-m wind speed (Fig. 3) revealed a temporal bias, but overall the WRF Model was able to depict the highest wind speeds across all simulations. The wind speed RMSE varied between 2.6 and 4.5 m s−1 and the mean bias (MB) between 0.01 and 3.2 m s−1, depending on station and WRF simulation (Table 3). The model-predicted precipitation exhibited low bias and errors relative to Stage IV radar gauge data for the gridded domain over Connecticut for all WRF simulations (Table 3). The RMSE varied between 1.94 and 3.48 mm (2.62 mm for the ensemble mean) and the MB between 0.53 and 0.83 mm (0.69 mm for the ensemble mean). Spatial distribution and magnitude of the predicted accumulated precipitation agreed with the Stage IV data (Fig. 4) in that all members depicted high accumulation at the southwest region of the domain, which was left of Sandy’s landfall. Accumulated precipitation at the northeastern part of the domain exhibited the same spatial pattern and magnitude as shown in the Stage IV plot. Precipitation is believed to contribute to power outages by wetting the soil and allowing for easier uprooting of trees (Foster 1988; McRoberts et al. 2018).
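For readers unfamiliar with the two error metrics quoted above, a minimal sketch is given here. The exact definitions used in this study are those in appendix A; the sketch assumes the common convention of simulated minus observed for the mean bias, and the station values are hypothetical.

```python
import math

def mean_bias(simulated, observed):
    """Mean bias (MB): average of simulated minus observed (assumed sign convention)."""
    return sum(s - o for s, o in zip(simulated, observed)) / len(observed)

def rmse(simulated, observed):
    """Root-mean-square error between paired simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed))

# Hypothetical paired 10-m wind speeds (m/s) at one station.
sim = [10.0, 12.0, 15.0, 9.0]
obs = [8.0, 11.0, 15.0, 10.0]

mb_value = mean_bias(sim, obs)
rmse_value = rmse(sim, obs)
print(f"MB = {mb_value:.2f} m/s, RMSE = {rmse_value:.2f} m/s")
```

A positive MB under this convention indicates that the simulation overestimates the observations on average, whereas RMSE penalizes errors of either sign.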
c. Comparison between Current and Future Sandy simulations
Changes in simulated future storm impacts in Connecticut may be attributable both to the more northward track and to the lower minimum sea level pressure of Future Sandy. The cause of Sandy’s more northward future track was discussed in Lackmann (2015) and is also consistent with the simulations of Yates et al. (2014). Increased tropical cyclone intensity with warming has been analyzed by Hill and Lackmann (2011) and others, and can be interpreted as the result of increased condensational heating. In this section, we evaluate how the change in track drives the change in resulting wind and precipitation intensity across the Eversource service territory. In later sections, we will incorporate lessons learned about the consequence of each simulation’s track into the results of the outage prediction modeling.
1) Storm-track comparison
The best track (thick, dashed black line on Fig. 5), as defined by the National Hurricane Center (NHC), is a smoothed representation of the tropical cyclone’s location and intensity (e.g., latitude, longitude, maximum sustained surface winds, and minimum sea level pressure at 6-hourly intervals). The simulated storm tracks of Current Sandy agreed with the NHC best track, while the simulated Future Sandy tracks deviated toward the northeast rather than the mid-Atlantic states, with the Future Sandy center passing considerably closer to the state of Connecticut. The ensemble (ENS) simulation made landfall closest to the NHC track in comparison with the other five WRF simulations in Current Sandy. We have confidence in the representativeness of the Sandy WRF runs from Lackmann (2015) because they accurately represented Sandy’s track and intensity in the current climate, and they capture the asymmetry of the cloud and precipitation shield (Figs. 2 and 4). See Table 1 for a list of all evaluated WRF Model simulations.
All tracks except the “no TC flux” (NOTCFLX) simulation made landfall south of the NHC track in Current Sandy (Figs. 4 and 5). The NOTCFLX simulation made landfall farther northeast in both Current and Future Sandy, and the CNTRL simulation had the most southerly track of all members in Current Sandy. The Goddard scheme (GODDARD), Morrison scheme (MORRIS), and WRF double-moment 6-class microphysics scheme (WDM6) tracks were all very similar in Current Sandy, while the GODDARD, CNTRL, and ENS tracks were very similar for Future Sandy. The MORRIS and NOTCFLX simulations were the only ones in which the storm center passed over Connecticut in Future Sandy. The GODDARD, CNTRL, and ENS simulations had Future Sandy tracks that made landfall on Long Island, and the WDM6 simulation made landfall in New Jersey.
2) Storm magnitude discussion
We evaluated the change in wind and precipitation magnitude by creating cumulative distribution function (CDF) plots of total accumulated precipitation (Fig. 6), maximum gust (Fig. 7), and maximum wind at 10 m (Fig. 8) for Current and Future Sandy. Each CDF plot shows the distribution of the 2-km grid cells for the variable of interest strictly within the Eversource service territory. The shift to the right of Future Sandy in each plot relative to Current Sandy indicates an increase in the magnitude of wind and precipitation variables.
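The interpretation of a rightward shift in a CDF can be made concrete with a small example. The sketch below builds an empirical CDF from grid-cell values; the function names and the gust values are hypothetical and are not taken from the simulations.

```python
def empirical_cdf(values):
    """Return (value, cumulative fraction) pairs for an empirical CDF."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def cdf_at(values, threshold):
    """Fraction of grid cells with a value at or below the threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)

# Hypothetical maximum gust (m/s) in five grid cells for each scenario.
current = [20.0, 22.0, 25.0, 27.0, 30.0]
future = [22.0, 25.0, 27.0, 30.0, 33.0]

frac_current = cdf_at(current, 25.0)
frac_future = cdf_at(future, 25.0)
print(f"cells at or below 25 m/s: current {frac_current:.1f}, future {frac_future:.1f}")
```

A curve shifted to the right has a smaller cumulative fraction at any fixed threshold, meaning that more grid cells exceed that threshold in the future scenario.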
The GODDARD simulation shows that the maximum gust and wind at 10 m in some of the upper percentiles decreased in Future Sandy relative to Current Sandy scenarios (Figs. 7 and 8). The GODDARD simulation exhibited very similar distributions of maximum wind at 10 m between Current and Future Sandy. Comparatively, the WDM6, CNTRL, and ENS simulations had greater separation between the Current and Future Sandy distributions. The increase in the cumulative distributions of total precipitation, gust, and 10-m wind speed between Current and Future Sandy indicates that Future Sandy was more intense in most simulations. Table 4 provides the average for each of the distributions in Figs. 6–8. On average, the individual model maximum gust increased 3%–10%, maximum 10-m wind speeds increased 6%–13%, and total precipitation increased 60%–187% when changing from Current to Future scenario in the Eversource service territory. In comparison, the ensemble mean increase of maximum 10-m wind speed (6%) and gust (4%) was lower relative to the individual ensemble members, and more similar to values from the GODDARD simulation.
The spatial distribution of changes in wind and precipitation variables shows that the majority of WRF simulations exhibit an increase in magnitude of the evaluated weather variables across most of the Eversource service territory (Fig. 9). Specifically, for each 2-km grid cell, we subtracted the Current Sandy value from the Future Sandy value, such that positive values on the map indicate an increase in magnitude and negative values indicate a decrease. The increase in the spatial distribution of total precipitation was mostly concentrated in southwest and central Connecticut, while the changes in gust and 10-m wind speed distribution varied depending on the simulation. Given that the tracks shifted northward toward Connecticut, we initially expected all variables to increase in southwest Connecticut, but this did not occur. While precipitation increased heavily in southwestern Connecticut, the majority of increased gust and wind at 10 m actually occurred over eastern Connecticut. The NOTCFLX and WDM6 simulations showed the greatest increases in total accumulated precipitation, up to 200 and 300 mm per grid cell in southwestern Connecticut. The WDM6 simulation was the most southerly of the Future Sandy tracks evaluated, and wind at 10-m height and gust were increased in eastern Connecticut, while there were decreases in western Connecticut. Although each WRF simulation is a hypothetical scenario, each should be treated as equally plausible as each accurately captured the Current Sandy track (Fig. 5), and wind (Fig. 3) and precipitation magnitude (Fig. 4).
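The per-cell differencing and the territory-average percent change can be sketched in a few lines. The cell identifiers and values below are hypothetical and serve only to show the sign convention (Future minus Current).

```python
def difference_field(current, future):
    """Future minus Current for each grid cell; positive means an increase."""
    return {cell: future[cell] - current[cell] for cell in current}

def mean_percent_change(current, future):
    """Percent change of the territory-average value from Current to Future."""
    mean_current = sum(current.values()) / len(current)
    mean_future = sum(future[cell] for cell in current) / len(current)
    return 100.0 * (mean_future - mean_current) / mean_current

# Hypothetical maximum gust (m/s) in three grid cells.
current = {"A": 20.0, "B": 25.0, "C": 30.0}
future = {"A": 24.0, "B": 24.0, "C": 33.0}

diff = difference_field(current, future)
pct = mean_percent_change(current, future)
print(diff)
print(f"territory-average change: {pct:.1f}%")
```

In this toy case the territory average increases even though one cell decreases, which mirrors how a simulation can show an overall increase alongside local decreases.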
3. Outage prediction model (OPM)
There has been much research in the fields of hurricane outage modeling (e.g., predicting locations needing repair), outage monitoring (e.g., detecting locations with power outages), and outage duration modeling (e.g., estimating time until power is restored) for electric distribution networks. Early research leveraged parametric models, such as generalized linear models (Li et al. 2010) and generalized linear mixed models (Guikema and Davidson 2006; Liu et al. 2008), and later researchers explored probabilistic methods (Mensah and Duenas-Osorio 2014) and nonparametric methods, including classification and regression trees (Quiring et al. 2011; Wanik et al. 2015), neural networks (Cole et al. 2017), Bayesian additive regression trees (Nateghi et al. 2011; He et al. 2017), and random forest (Nateghi et al. 2014; Wanik et al. 2017). Beyond building models for specific utilities with proprietary outage data (Nateghi et al. 2014; Wanik et al. 2015; He et al. 2017), outage models have been recalibrated with publicly available data such that the models can be generalized to other geographic regions (Guikema et al. 2014). Other recent research has investigated how tropical cyclone risk would affect customer outages under different climate change scenarios (Staid et al. 2014).
This study is an extension of the outage prediction system previously created for Eversource (Wanik et al. 2015; He et al. 2017) to predict outages associated with synoptic-scale weather systems. The response variable in both models was the count of outages per 2-km grid cell, defined as locations that require manual intervention to restore power. Given that some outage records were missing geographic coordinates (e.g., latitude and longitude), we used 15 251 of the 16 460 recorded outages from Current Sandy for modeling. The outage prediction modeling framework from the referenced works consisted of multiple machine-learning models that used atmospheric conditions, infrastructure, and land use surrounding the overhead power lines to predict outages for upcoming weather events (Fig. 10). Electric grid infrastructure was represented by the counts of isolating devices (i.e., transformers, switches, reclosers, and fuses) per 2-km grid cell. In this paper, land-use and infrastructure variables were aggregated on the same 2-km grid by which outages were aggregated. Our research group previously demonstrated how including land-use and infrastructure data contributed to improved spatial accuracy of outage predictions (Wanik et al. 2015), how results can be improved by including an indicator for tree-leaf condition (He et al. 2017), and how different machine-learning models yielded more accurate point estimates and predictive intervals depending on the unit of aggregation (i.e., grid cells, towns, and service territories) (He et al. 2017). Key differences between the data used in this paper and the Wanik et al. (2015) and He et al. (2017) papers are 1) the use of different storms to train and validate the model, 2) the grid spacing of weather simulation data, and 3) that there is no tree-leaf condition indicator in this study as we assume the storm will have the same tree-leaf condition in Current and Future Sandy.
For this study, the infrastructure and land use are static variables, whereas the atmospheric conditions were obtained using NWP simulations. Atmospheric variables from each WRF simulation were then used as inputs for three machine-learning models (see section 3b). In addition to the five individual WRF simulations, the ensemble mean of the five WRF simulations was used as input for the outage models. The atmospheric variables used for outage modeling were based on the 6-km nested domain of the WRF grid. In this study, we used the same 2-km aggregated land-use and infrastructure data as in Wanik et al. (2015) and joined each 2-km grid cell to the nearest 6-km grid-cell centroid of the atmospheric forcing data. A list of all data included in the outage models is presented in Table 2.
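A nearest-centroid join of the kind described above can be sketched as follows. This is an illustrative sketch only: it assumes that both grids are expressed in a projected coordinate system with units of kilometers so that planar distance is appropriate, and the identifiers and coordinates are hypothetical.

```python
import math

def nearest_centroid(point, centroids):
    """Return the id of the centroid closest to point (planar Euclidean distance)."""
    return min(centroids, key=lambda cid: math.dist(point, centroids[cid]))

def join_to_weather_grid(fine_cells, weather_centroids):
    """Map each fine (2-km) cell id to the id of its nearest coarse (6-km) centroid."""
    return {cell_id: nearest_centroid(xy, weather_centroids)
            for cell_id, xy in fine_cells.items()}

# Hypothetical centroids in projected kilometers.
weather_centroids = {"w1": (3.0, 3.0), "w2": (9.0, 3.0)}
fine_cells = {"c1": (1.0, 1.0), "c2": (5.0, 3.0), "c3": (7.0, 3.0)}

joined = join_to_weather_grid(fine_cells, weather_centroids)
print(joined)
```

With this join, all 2-km cells assigned to the same 6-km centroid share identical atmospheric inputs and differ only in their land-use and infrastructure variables.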
b. Nonparametric models
Nonparametric models have been used frequently in the power outage modeling community (Nateghi et al. 2014; Wanik et al. 2015; He et al. 2017) because they require fewer assumptions about the underlying relationship between the explanatory variables and the response variable than parametric models. In this study we used three nonparametric, machine-learning (ML) models to evaluate each of the different Sandy scenarios: Bayesian additive regression trees (BART), boosted trees (BT), and random forest (RF). The model parameters were estimated using the R packages bartMachine (Kapelner and Bleich 2014), gbm (Ridgeway 2007), and randomForest (Liaw and Wiener 2002), respectively. We previously used BT and RF in Wanik et al. (2015) and BART in He et al. (2017) to predict power outages in the Eversource service territory for a wide variety of storms (i.e., blizzards, thunderstorms, and hurricanes).
The BART model is a derivation of the Bayesian classification and regression trees model that takes advantage of a back-fitting Markov chain Monte Carlo algorithm in generating the posterior sample of classification and regression trees (Chipman et al. 2010). BART as a Bayesian model utilizes a likelihood maximization procedure that benefits from well-selected prior distributions and parallel-grown decision trees. The BT model is a decision-tree-based stochastic gradient boosting algorithm that fits a decision tree on the residuals of the previous tree so that overall fit becomes the cumulative effort of many “weak learners” (Friedman 2001). The RF model uses a random subset of the explanatory variables (with replacement) to build multiple decision trees, and the average of the predictions across all decision trees is used as the final prediction. The RF model was also ideal for our study because of its robustness to outliers and full use of the candidate variables (Breiman 2001).
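The idea of fitting each new tree to the residuals of the current fit can be illustrated with a deliberately simplified sketch. It is not the gbm implementation used in this study: it uses single-split stumps on one explanatory variable, squared-error loss, and no random subsampling (so it is gradient boosting without the stochastic component). All names and data values are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on 1-D inputs (least squared error)."""
    best = None
    for split in sorted(set(x))[:-1]:  # every split leaves both sides nonempty
        left = [r for xi, r in zip(x, residuals) if xi <= split]
        right = [r for xi, r in zip(x, residuals) if xi > split]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    return best[1:]  # (split, left mean, right mean)

def stump_predict(stump, xi):
    split, lmean, rmean = stump
    return lmean if xi <= split else rmean

def fit_boosted_stumps(x, y, n_trees=50, learning_rate=0.1):
    """Each stump is fit to the residuals left by the stumps before it."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump_predict(stump, xi)
                 for xi, pi in zip(x, preds)]
    return base, stumps, learning_rate

def boosted_predict(model, xi):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)

# Hypothetical data: maximum gust (m/s) versus outages per grid cell.
x = [10, 12, 14, 16, 18, 20]
y = [1.0, 1.0, 2.0, 4.0, 7.0, 9.0]
model = fit_boosted_stumps(x, y)

mean_y = sum(y) / len(y)
mse_base = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_boost = sum((yi - boosted_predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
print(f"MSE of mean-only fit: {mse_base:.2f}; MSE after boosting: {mse_boost:.2f}")
print(boosted_predict(model, 20), boosted_predict(model, 30))
```

The last line shows a general property of tree-based learners: the prediction for an input beyond the largest training value (30 m/s here) is the same as for the largest training value, because no split exists beyond the observed range.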
Each nonparametric model evaluated has advantages and disadvantages, which are a function of how each handles the input data and relates it to the response variable (Mackinnon and Glick 1999; Vapnik 1999). An advantage of nonparametric models is that they are able to nonlinearly relate the input data to the response variable without requiring the analyst to assume a functional form. Another advantage of these models is that one does not need to eliminate correlated explanatory variables; the correlated variables increase the time needed to train the models, but will not detract from predictive accuracy. A general disadvantage of nonparametric models is that they may not be good at extrapolating beyond the dynamic range of the independent or explanatory variables. For this reason we have evaluated a full and reduced weather data input with the ML models [see section 3c(1)] and also explored the impacts of a limited dynamic range in section 3c(4). Also, while it is possible to explain the method by which each nonparametric model was fit to the data, it can be difficult to interpret the actual fitted model (e.g., there is no regression equation with coefficients for inference). For example, in the case of the RF model, the final model is the average of many decision-tree models, and the average of the rules from the individual trees is incomprehensible. Therefore, we will rely on variable importance [section 4a(1)] and partial dependence plots [section 4a(2)] to analyze how these nonparametric models fit to the data. This is key to determining whether a nonparametric model has fit on an unusual pattern within the data (known as overfitting).
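Partial dependence can be computed for any fitted model that exposes a prediction function. The sketch below shows the general calculation: one variable is set to each value on a grid for every observation, and the predictions are averaged. The toy prediction function, variable names, and values are hypothetical stand-ins and do not represent the fitted BART, BT, or RF models.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve

def toy_predict(row):
    """Hypothetical stand-in for a fitted outage model (illustration only)."""
    return 0.5 * max(row["max_gust"] - 20.0, 0.0) + 0.01 * row["devices"]

# Hypothetical grid cells with a weather variable and an infrastructure count.
rows = [
    {"max_gust": 25.0, "devices": 100},
    {"max_gust": 30.0, "devices": 300},
]

curve = partial_dependence(toy_predict, rows, "max_gust", [20.0, 30.0])
print(curve)
```

Plotting such a curve for each important variable indicates whether the fitted response is physically sensible (for example, outages that do not decrease as gust increases) or reflects an unusual pattern in the training data.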
1) Full and reduced OPM data inputs
Increased precipitation (Figs. 6 and 9) in conjunction with attempting to address ML shortcomings (section 3b) are the reasons for employing full and reduced OPM data inputs. The full data input included both precipitation and wind variables, while the reduced model included only wind variables (Table 2). The combination of three ML models with two data inputs (full and reduced) and six weather simulations (five WRF simulations and their calculated ensemble mean) yielded 36 scenarios each to be evaluated for Current and Future Sandy.
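The count of 36 scenarios follows directly from the product of three models, two data inputs, and six weather simulations. A short enumeration, using the labels from this paper, makes the bookkeeping explicit; the variable names are illustrative.

```python
from itertools import product

ml_models = ["BART", "BT", "RF"]
data_inputs = ["full", "reduced"]
# Five individual WRF simulations plus their ensemble mean (ENS).
simulations = ["CNTRL", "GODDARD", "MORRIS", "NOTCFLX", "WDM6", "ENS"]

scenarios = list(product(ml_models, data_inputs, simulations))
print(len(scenarios))  # 3 x 2 x 6 = 36
print(scenarios[0])
```

Each tuple identifies one model fit on Current Sandy and the corresponding Future Sandy prediction.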
2) Outage prediction model for Current Sandy
We first establish that the WRF simulations could be used to predict Current Sandy outages with each of the three ML models. We refer to “model training” as using the tuned models for in-sample prediction on the Current Sandy data. The model training results were not included in this paper as they are not a true measure of model performance. Instead we present results from a leave-one-observation-out cross-validation (LOOCV) using the tuned models on Current Sandy to demonstrate their performance (we refer to this as “model validation”; “observations” are defined as 2-km grid cells). The following error metrics were calculated for each simulation and model across all grid cells: Pearson correlation (“correlation”; r) between actual and predicted outages per grid cell, mean absolute error (MAE) of predicted outages per grid cell, root-mean-square error (RMSE) of predicted outages per grid cell, and the sum of predicted outages over the service territory (to show estimation error of the predicted outages). A description of the calculation of these error metrics is provided in appendix A.
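The LOOCV procedure and several of the listed metrics can be sketched generically. In the sketch below, a one-nearest-neighbor learner on a single variable is used purely as a stand-in so that the example is self-contained; it is not one of the three ML models in this study, and the data values are hypothetical. The metric definitions used in the paper are those of appendix A.

```python
import math

def loocv_predictions(fit, predict, X, y):
    """Predict each observation from a model fit on all other observations."""
    preds = []
    for i in range(len(y)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, X[i]))
    return preds

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def mae(pred, actual):
    """Mean absolute error of the predictions."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

# Stand-in learner (assumption): predict the outcome of the nearest training point.
def fit_1nn(X, y):
    return list(zip(X, y))

def predict_1nn(model, x):
    return min(model, key=lambda pair: abs(pair[0] - x))[1]

# Hypothetical grid cells: maximum gust (m/s) and observed outages.
X = [10.0, 13.0, 15.0, 18.0, 22.0, 27.0]
y = [0, 1, 1, 3, 5, 8]

preds = loocv_predictions(fit_1nn, predict_1nn, X, y)
total_error_pct = 100.0 * (sum(preds) - sum(y)) / sum(y)
print(preds)
print(f"r = {pearson_r(preds, y):.2f}, MAE = {mae(preds, y):.2f}, "
      f"storm-total error = {total_error_pct:.1f}%")
```

Because every grid cell is predicted by a model that never saw that cell, the resulting metrics are an out-of-sample measure, unlike the in-sample training fit.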
3) Outage prediction model for Future Sandy
Once we established through model validation that the WRF simulations could predict outages for Current Sandy, we then performed an independent test to evaluate how the trained models would predict outages for a corresponding Future Sandy simulation. We refer to “model testing” as using our trained and validated models from Current Sandy to predict Future Sandy outages. With the knowledge that some weather simulations may be inherently biased (section 2c), we assumed that any bias was consistent between the Current and Future Sandy simulations and absorbed these biases into our modeling framework (Fig. 10) by fitting pairwise outage models to account for the chosen configurations (i.e., an individual Current Sandy simulation from Table 1 was used to predict the corresponding Future Sandy outages, along with the ensemble mean).
In summary, the Current Sandy WRF simulation (joined with actual Current Sandy outages from Eversource) will be used for training and validated using LOOCV, and the Future Sandy simulation will be treated as an independent model test (e.g., holdout sample) of the trained model. In section 4, we provide discussion on the validity of the predictions by examining the magnitude and distribution of predicted outages related to the input weather data.
4) Proof-of-concept results from our previous research
As mentioned, in our previous work we showed how storms of different types and magnitudes could be used to predict outages (Wanik et al. 2015; He et al. 2017). However, a technique that was not previously demonstrated was the use of a single hurricane to predict outages from another hurricane. To build confidence in our methodology, we used Hurricane Irene (2011) to train the outage prediction models (using BART, BT, and RF), and used the trained models with full and reduced data inputs [see section 3c(1)] to predict outages from Hurricane Sandy (2012) as an independent holdout, and vice versa. These storms had similar storm outage totals despite differences in track and magnitude of wind and precipitation (Fig. B1 in appendix B); Sandy had a more extreme distribution of wind-related variables, and Irene had higher total accumulated precipitation over the Eversource-CT service territory.
The results from this proof of concept can be found in Table 5, and show that each ML model we investigated (BART, BT, and RF) was able to predict the outages for each hurricane to within a range between −26% and +28% of the actual total outages for the full data input for both storms. However, using the reduced data input resulted in predictions that were within a range from −25% to 2% of the actual total outages for Sandy, while Irene was overestimated by 18%–56%.
In addition to using a single hurricane to predict another hurricane’s power outages, we also investigated whether an OPM trained on a large number of extratropical storms (n = 76 storms) along with one hurricane could be used to improve the predictions for the other hurricane. For context, an extratropical cyclone is an asymmetric cyclone that usually occurs at the midlatitudes because of temperature and/or humidity gradients and wind shear. Their main characteristic is the presence of frontal systems slowly rotating counterclockwise (in the Northern Hemisphere) around the cyclone center. Their impact on the territory is usually manifested with long-duration winds, gusts, and precipitation, and outages ranged from 20 to 4000 outages per storm (many fewer than the >15 000 outages in Irene and Sandy). In comparison, tropical cyclones have instead a symmetric structure, typically with an eyewall near the center, and are not characterized by frontal structures. The results from this exercise were comparatively worse, with outage predictions typically underestimated by greater than 50% (Table B1 in appendix B).
Given these additional proof-of-concept results, we note the uncertainty that can arise in the Future Sandy predictions when the forecasted weather has a different range than the historical storms. As shown in appendix B (see Fig. B1), Sandy and Irene were more similar to each other with respect to maximum gust and wind than to the extratropical storms, much like Current and Future Sandy (Figs. 7 and 8). Therefore, we will proceed assuming that Current Sandy can be used to predict Future Sandy impacts, with the knowledge that these predictions may be underestimated given the dynamic range of the weather data input.
4. Results and discussion
We will now show that although each nonparametric model was able to represent Current Sandy outage impacts for each WRF simulation (section 4a), there was a divergence in Future Sandy impacts owing to the nonlinear response between the explanatory variables and power outages (section 4b). We also highlight how the inclusion of precipitation influenced outage prediction model accuracy for Current Sandy, and substantially altered the Future Sandy predictions. Note: from now on we will often refer to variable names as they appear in Table 1.
a. Outage predictions for Current Sandy scenarios (model validation)
BT and BART accurately predicted Current Sandy, while the RF model tended to underestimate the outages and had poorer error metrics (Table 6). The BART and BT models had similar performance for both the full and reduced data inputs, with high correlation values between actual and predicted outages (0.85–0.87), low RMSE (4.58–4.77), and low MAE (2.38–2.5) per 2-km grid cell. In contrast, the RF model had comparatively lower correlation (0.54–0.8), higher RMSE (6.88–7.98), and higher MAE (3.3–4.07) values than the BART and BT models. Interestingly, the models trained on the full data input resulted in little change of MAE per grid cell (e.g., up to 1.7% improved MAE, or 3.3% worsened MAE) across all WRF simulations (Table 7).
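As a minimal sketch of how the per-grid-cell validation metrics quoted above could be computed, the following assumes the standard definitions of MAE and RMSE and assumes that the reported correlation is the Pearson coefficient (the correlation type is not stated in this section). The function name `error_metrics` and the example values are hypothetical and are not taken from the study data.

```python
import math


def error_metrics(actual, predicted):
    """Return (correlation, RMSE, MAE) for paired outage counts per grid cell.

    Assumption: 'correlation' is the Pearson coefficient; the section does
    not state which coefficient was used for model validation.
    """
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    # Error is defined here as predicted minus actual.
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    corr = cov / math.sqrt(var_a * var_p)
    return corr, rmse, mae


# Hypothetical outage counts for four 2-km grid cells.
corr, rmse, mae = error_metrics([0, 2, 4, 6], [1, 2, 5, 6])
```

Because RMSE squares the errors before averaging, it penalizes a few badly predicted grid cells more heavily than MAE does, which is one reason the two metrics are reported side by side.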
The model validation results (Table 6) show that the BT and BART models were superior at predicting Current Sandy outages across all five individual weather simulations and the ensemble mean (e.g., high correlation, low error metrics). These low LOOCV error metrics provide confidence in the outage prediction model and suggest that the Future Sandy outage predictions from these ML models will also be reliable. As previously discussed (section 2), temporal lags between simulation and observation did not affect the outage model performance, as the dependency was removed at the postprocessing stage by converting the time series into variables representing the storm peak and severity (Table 2). In addition, note that the MAE values for the BT and BART validation were improved relative to results from our previous studies on Current Sandy (Wanik et al. 2015; He et al. 2017) and comparable to others who also conducted hurricane outage modeling studies (Han et al. 2009a,b; Nateghi et al. 2014; McRoberts et al. 2018). However, comparison with these studies should be done with caution as each study referenced uses a different storm, outage data, geographic regions, aggregations, and spatial resolutions. Given that RF consistently underestimated the storm total outages for Current Sandy, we expected an underestimation of Future Sandy predictions relative to the BT and BART models that more accurately captured the storm total outages (Table 6). However, this did not occur and we will motivate the Future Sandy results (section 4b) by analyzing the Current Sandy variable importance and partial dependence plots in the next subsections.
1) Variable importance for Current Sandy
Variable importance measures the contribution of each variable in an ML model, and each ML model’s corresponding R package had its own method for measuring it. We will now provide high-level detail on the variable importance calculations for each ML model, and the reader may refer to the R package documentation cited in section 3b for a thorough description of how variable importance was calculated in the BART (e.g., inclusion proportion), BT (e.g., relative influence), and RF models (e.g., increase in node purity). In general, the higher the variable importance is, the more influential a variable will be in determining the predicted response variable. In the BART model, variable importance was the inclusion proportion for any given predictor, that is, the proportion of times that variable is chosen as a splitting rule out of all splitting rules among the posterior draws of the sum-of-trees model. The importance score for a variable in the RF model was calculated by measuring the change in out-of-bag forecasting accuracy that occurs from shuffling the values for a particular predictor and dropping the out-of-bag observations down each tree. In the BT model, the reduction in the loss function attributed to each variable at each split was tabulated and the sum returned, which was then summed over each boosting iteration.
Though not shown here, there was moderate positive correlation between the count of assets and actual outages per grid cell during Current Sandy, and the count of assets was the most important variable across all combinations of ML model and WRF simulation evaluated. To facilitate comparison of variable importance, we computed the relative variable importance for each ML model and WRF simulation by normalizing variable importance values by the largest non-asset variable (as the assets had importance values that were generally double those of the next most important variable). Hence, a value of 100 means that this variable was the most important non-asset variable for that WRF simulation and ML model, and the importance of all other variables was scaled by this quantity.
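The normalization described above can be sketched as follows. This is an illustration only: the function name `relative_importance`, the choice to keep the asset variable in the output (where it can exceed 100), and the raw importance values are all assumptions made for the example rather than details confirmed by the text.

```python
def relative_importance(importance, asset_name="Assets"):
    """Scale raw importance scores so the largest non-asset variable equals 100.

    importance: dict mapping variable name -> raw importance score.
    Assumption: the asset variable is kept and scaled by the same reference,
    so it may take a value above 100.
    """
    non_asset = {k: v for k, v in importance.items() if k != asset_name}
    if not non_asset:
        raise ValueError("at least one non-asset variable is required")
    reference = max(non_asset.values())
    if reference <= 0:
        raise ValueError("the largest non-asset importance must be positive")
    return {k: 100.0 * v / reference for k, v in importance.items()}


# Hypothetical raw scores; Assets is roughly double the next variable.
raw = {"Assets": 40.0, "MAXGust": 20.0, "TotPrec": 5.0}
scaled = relative_importance(raw)
```

Scaling by the largest non-asset variable rather than by the assets keeps the weather and land-use variables spread over the full 0 to 100 range, which makes differences among them easier to see.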
The variable importance of each ML model is presented in Fig. 11 for the full data input (wind and rain variables), and the values are color coded to help the reader discern which variables were most important (e.g., darker colors represent more important variables). One will notice that BART and RF have much more coloring than BT, indicating that many more variables had an impact closer to the impact of the assets.
The BART models had high variable importance for land-use variables (PercConif, PercDecid, and PercDev) followed by wind and precipitation variables. Similar to BART, the RF models were influenced by a comparatively larger subset of explanatory variables than BT. In comparison, the BT model had many variables at 0, which suggests that only a subset of variables was used for prediction.
2) Partial dependence for Current Sandy
Partial dependence plots were created for each of the Current Sandy simulations. Each plot visualizes how an explanatory variable of interest influences the response variable after accounting for all other variables. The x axis represents the explanatory variable of interest, and the y axis shows the predicted outages per grid cell with all other variables at their mean. Note that the y axis will change between ML models. A positive, increasing trend on each subpanel represents increased predicted outages and vice versa. A flat line represents no change in the predicted y values for given x values.
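The construction described above, in which one explanatory variable is varied while all others are held at their mean, can be sketched as follows. The function `partial_dependence`, the toy model, and the input values are hypothetical; note also that some software packages compute partial dependence by averaging predictions over the data rather than by evaluating the model at the mean of the other variables, so this sketch follows the description in the text and not necessarily any particular package.

```python
def partial_dependence(predict, rows, feature, grid):
    """Evaluate `predict` along `grid` for one feature, others at their mean.

    predict: callable taking a dict {variable name: value} and returning
             predicted outages per grid cell.
    rows:    list of dicts holding the calibration data.
    feature: name of the explanatory variable of interest (x axis).
    grid:    values of that variable at which to evaluate the model.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # Hold every variable at its calibration-data mean.
    baseline = {
        name: sum(row[name] for row in rows) / len(rows) for name in rows[0]
    }
    curve = []
    for value in grid:
        point = dict(baseline)
        point[feature] = value  # vary only the variable of interest
        curve.append((value, predict(point)))
    return curve


def toy_model(x):
    """Hypothetical stand-in for a fitted outage model."""
    return 0.5 * x["Assets"] + max(0.0, x["MAXGust"] - 20.0)


data = [{"Assets": 10.0, "MAXGust": 15.0}, {"Assets": 30.0, "MAXGust": 25.0}]
curve = partial_dependence(toy_model, data, "MAXGust", [10.0, 20.0, 30.0])
```

In this toy case the curve is flat below a gust of 20 and rises above it, which mirrors how a flat segment in a partial dependence plot indicates no change in predicted outages over that range of the explanatory variable.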
We present three groups of partial dependence plots that correspond to a subset of the most important variables listed in Fig. 11. More specifically, Fig. 12 contains partial dependence plots of geographic and land-use variables, Fig. 13 contains wind-related variables, and Fig. 14 contains precipitation-related variables. Each figure is grouped by ML model, and each subpanel contains six lines that correspond to the WRF simulations (colors correspond to Fig. 5). For brevity, we will focus on the most interesting observed patterns. Note that there is not the same amount of data in each section of the x axis, and patterns observed at the extreme values of the x axis may be influenced by a few data points that are not representative of the entire calibration data (e.g., see Assets for the BART model in Fig. 12).
The Assets were most important in each ML model and WRF simulation, and partial dependence plots show that this resulted in the highest predicted outages (up to 25 outages per grid cell), with all other variables held at their mean values (Fig. 12). Similarly, all ML models and WRF simulations predicted higher outages for increased PercDecid. The wind-related variables in Fig. 13 show a trade-off between the calculated mean and maximum variables across ML models: variables of the same group (i.e., wind related) may show an increase for one variable and flat lines for other correlated variables. The presence of a flat line does not necessarily indicate that the variable was not “important” (e.g., see MEANWind10m for BART in Fig. 13, for which the variable importance was similar to that of the other wind-related variables). Within the RF model for wind-related variables (Fig. 13), we see that most WRF simulations show a positive trend except the NOTCFLX and WDM6 simulations, which show a negative trend for MAXGust. For GODDARD and MORRIS, the MAXGust variable shows a large positive trend in BT and RF, which agrees well with the variable importance listed in Fig. 11 and confirms that it is among the most important variables. The same is true for CNTRL and ENS simulations within the BT and RF models for MAXWind10m and MEANGust. The precipitation-related variables are shown in Fig. 14, where most WRF simulations show a positive trend for TotPrec except for the ENS simulation. Within the BT and RF models, TotPrec had a variable importance that was much lower than that of the most important wind-related variables, but its variable importance was similar to that of the wind-related variables in the BART model. There was a mix of trends across all other ML models and WRF simulations for MAXPreRate and MEANPreRate.
b. Outage predictions for Future Sandy scenarios (model test)
Each Future Sandy WRF simulation was treated as an equally likely scenario, and each exhibited differences in the landfall location and magnitude of wind and precipitation within the Eversource service territory (Fig. 9). As previously mentioned, total precipitation increased drastically in some simulations, so we compared models with full and reduced data inputs. Each Future Sandy simulation evaluated had differing predicted outage counts, but there were some consistent trends (Tables 8 and 9). The vast majority of the scenarios evaluated show higher outages for Future Sandy than for Current Sandy, except for the following combinations: GODDARD (BT full, BT reduced, and BART reduced) and MORRIS (BART full) simulations. Generally, the full model resulted in higher predicted outages than the reduced model, and these full model predictions are displayed in Figs. 15 and 16 by ML model and simulation. Note how the change in outages was most pronounced in areas with the highest population density (Fig. 1), which is inherently related to the amount of electric grid infrastructure.
1) Comparison between full and reduced data inputs on Future Sandy outages
The change in Future Sandy predicted outages varied between the full and reduced data inputs depending on which WRF simulation and ML model was considered (Table 9). The BART model predicted from −30% to +31% storm total outages for Future Sandy across the five individual WRF simulations when precipitation variables were included, while the BT model predicted from −12% to +20% of total outages, and RF predicted from +11% to +53% (Table 9). It was interesting to see that the full data input resulted in consistently increased outages over the reduced data input for the RF model (Table 9), even though the RF model had a slightly less accurate calibration when precipitation variables were included (Table 7). In comparison, the BART and BT calibrations for Current Sandy were slightly improved (e.g., lower LOOCV error metrics) by including precipitation variables, and resulted in increased outage predictions for WDM6 and ENS, while all other WRF simulations had no discernible patterns.
Despite underestimating Current Sandy outages (Table 6), RF predicted the most outages from Future Sandy for both the arithmetic average of the five simulations (97% full; 74% reduced) and the ensemble mean (116% full; 75% reduced; Table 8). Figure 17 shows the quantile–quantile (Q–Q) plot relating the actual Current Sandy outages to the Future Sandy predicted outages for the full data input for all WRF simulations. Generally, RF had the highest change in outages, followed by BART and lastly BT. Decreases below the 45° line for the 95th percentile in the Q–Q plot show that the ML models did not merely predict the extremely large (and rare) values from the distribution of Current Sandy outages.
2) Influence of storm track on Future Sandy outages
Three of the six Future Sandy tracks made landfall in the center of Long Island. These include the GODDARD, CNTRL, and ENS simulations. The WDM6 simulation made landfall in New Jersey, farther south than all other WRF simulations. The MORRIS and NOTCFLX simulations made landfall in eastern Long Island, and the centers of these tracks made landfall over southwestern Connecticut (Figs. 4 and 5). From track alone, we would have expected MORRIS and NOTCFLX to have the highest outages, and WDM6 to have the lowest outages for Future Sandy, yet this did not occur. Further, though not shown here, we found that there was little correlation between the change in latitude or longitude at which a storm made landfall and the change in predicted outages, which supports the interpretation that wind and precipitation magnitude, and not track, influences power outages.
3) Influence of storm magnitude on Future Sandy outages
We now focus our discussion on the behavior of ML models that used the full data input (both wind and precipitation variables), and how they used changes in wind and precipitation magnitude to predict Future Sandy outages. To support this analysis, the reader can check Fig. 1 for a labeled map of Connecticut counties. Much of our analysis will focus on the most populated county in the territory, Fairfield County (southwest Connecticut, population of ~945 000 residents), which also had more outages than any other county in the service territory during Current Sandy (Fig. 1).
Qualitatively, one can compare the colors on Fig. 9 (wind and precipitation magnitude changes) and Fig. 16 (outage magnitude changes) to see the changes between Current and Future Sandy. To quantify this relationship, the Spearman rank correlation coefficient ρ was computed between changes in wind and precipitation magnitude per grid cell and the change in predicted outages from Current Sandy to Future Sandy for the full and reduced data input (Fig. 18), and select results were presented for the service territory and Fairfield County (note: service territory results include grid cells from Fairfield County). Spearman’s rank correlation coefficients that are close to 1 have a strong positive relationship, values close to 0 have no relationship, and values close to −1 have a strong negative relationship.
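For readers unfamiliar with the statistic, the following is a minimal sketch of Spearman’s ρ computed as the Pearson correlation of average ranks, which is the standard tie-aware definition. The function names and the per-grid-cell values are hypothetical and are not drawn from the study data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical changes per grid cell (Future minus Current Sandy).
gust_change = [-1.0, 0.5, 1.0, 2.5, 4.0]
outage_change = [-3.0, 0.0, 1.0, 2.0, 9.0]
rho = spearman_rho(gust_change, outage_change)
```

Because the statistic depends only on ranks, it captures any monotonic relationship between a weather change and an outage change, not only a linear one, which suits the nonlinear response of the ML models.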
There were minor differences between Spearman correlations in the full and reduced dataset (Fig. 18). This supports that inclusion of precipitation-related variables did not substantially alter the Future Sandy outage predictions despite the increased accumulated precipitation (Tables 8 and 9). Also worth noting is that the correlations within Fairfield County were generally stronger than those across the entire Eversource Connecticut service territory, which we suspect may be related to the vast amount of electric infrastructure present as compared with that in other parts of the territory. The BART and BT models generally had weak to moderate positive correlation between changes in wind-related variables and changes in outage magnitude for Fairfield County (up to ρ = 0.63), and the Eversource service territory (up to ρ = 0.38). In comparison, the RF model generally had a negative correlation for wind-related variables (except for WDM6) and moderate positive correlation for total precipitation. This suggests that most outage changes within RF were driven by precipitation, while those within BART and BT were driven by wind-related variables.
As mentioned, the correlations listed in Fig. 18 can also be verified with visual inspection between Figs. 9 and 16. The decreases in gust and wind in the southwest (Fairfield County) during the CNTRL, GODDARD, MORRIS, and ENS simulations appear to match the corresponding outage decreases in BART and BT, but not in the RF model. The GODDARD simulation had nearly unchanged winds in Hartford County and decreased winds in Fairfield County, which may explain why the GODDARD simulation generally had the lowest predicted outage impacts for Future Sandy across ML models. The MORRIS simulation had increased winds and gusts in central and coastal Connecticut, with unchanged or decreased gusts in the southwest and northwest. MORRIS also had a region of increased precipitation in eastern Connecticut. The BART model correspondingly had decreased outages in both southwest (Fairfield County) and eastern Connecticut (Windham County and Fairfield County), resulting in this Future Sandy simulation with the full data input having fewer outages than the Current Sandy input (14 595 outages). Interestingly, the reduced data input for BART gave increased outages (20 735 outages), which may indicate that BART with the full data input was overfit on the precipitation data. Another example of potential overfitting on the precipitation data was that the BART and BT models predicted decreased outages in Fairfield County for the NOTCFLX simulation, despite this region having the most accumulated precipitation with unchanged gusts and winds. The WDM6 simulation had the lowest increases in precipitation and large increases in gust and wind across Connecticut. There was moderate positive correlation between gust and predicted outages (ρ > 0.5) in Fairfield County, and weak correlations for wind. The ML models had consistently higher outage predictions with the full data input for WDM6, even though the increases in accumulated precipitation were low relative to other WRF simulations.
4) Comparison of machine-learning models
While the BT model had excellent calibration metrics for Current Sandy (Table 6), it usually gave the lowest outages for Future Sandy (Table 8). However, BT had stronger correlation between changes in wind and gust and changes in predicted outages than RF [see section 4b(3)]. We suspect the reason for divergence between BART and BT for Future Sandy was that the BT model was more influenced by the assets per grid cell than the weather variables; hence increases in weather variable severity across the simulations did not result in increased outages. As a result of its design, the RF model uses all explanatory variables in the input data, while the BT and BART models only use variables that improve model accuracy (see section 3b for details). For the same five weather simulations, BT had the smallest average increased outages, while RF and BART had comparatively larger increased outages (Table 8). The partial dependence plots from Current Sandy showed that increased assets per grid cell led to increased outage predictions across all ML models, but it is worth noting that the dynamic range of the y axis in the partial dependence plots for the BT model was typically much smaller than that for the BART and RF models, further suggesting that it could have been overfit on the assets. For the sake of this paper, we treat all nonparametric models as equally valid, highlighting the model dependence of results.
Changes in the severity of outages from future tropical cyclones can be caused by several factors, but here we isolated the meteorological components, namely that (i) the future storm was more intense and (ii) it made landfall much closer to the study area. As mentioned earlier, we have accounted for both changes in track and intensity by making direct use of the Lackmann (2015) WRF simulations. Differentiating between those two mechanisms requires shifting the track of Current Sandy to match the future one, while keeping the same intensity. This scenario is challenging to achieve because changes in storm-relative coastline orientation would not allow us to simply translate the future-scenario model wind speeds relative to the present cyclone track. Therefore, this scenario would require further model simulations, which is beyond the scope of the current study.
Our ability to generalize these results is limited by the use of a single case, and we are aware that for this particular storm, the track changed in a way that helped to potentially maximize winds and precipitation of Future Sandy across the Eversource service territory. Simulations of many cases, with various tracks and intensities, are necessary to reach more general conclusions about how future tropical cyclone impacts could change with warming. How the frequency and intensity of tropical cyclones will change with climate warming remains an area of active research in the atmospheric sciences community. While we do not believe that we have modeled the worst-case scenario for utilities to prepare for, the case study presented in this paper serves as a proof-of-concept method that can be readily implemented when weather data for many additional cases of future tropical cyclones becomes available.
There are many other factors that may play a role in modifying how an electrical distribution system responds to adverse weather. Utilities invest in structural and electrical hardening initiatives that may increase resilience to extreme weather events—depending on the level and type of investment, the grid may respond differently to severe weather (Kuntz et al. 2002). The level of foliage (Ennos 1999; James et al. 2014), which is a function of the day that a storm would occur in the future (Fahey 2016; Carter et al. 2017), would also alter the relationship between wind, trees and resultant outages. On a broader level, tree species mixes may also change as a function of altered temperature and precipitation (Rustad et al. 2012). Further, if a utility were to alter the tree conditions such that the trees were less prone to impact through vegetation management activities, future outages may be limited (Guikema et al. 2006; Wanik et al. 2017). However, the presence of invasive species, such as emerald ash borer (Poland and McCullough 2006), will weaken roadside trees and forests and may lead to greater outage counts in select regions. The electric distribution network typically follows population by necessity, thereby increasing the exposure of the network and contributing to potential outages by virtue of simply having more infrastructure where the system is overhead as population increases (Larsen et al. 2016). Should future population growth occur in cities (Heath 2001; Dawson 2007) rather than rural communities, infrastructure exposure and associated risk may be comparatively lowered as distribution infrastructure tends to be underground in urban areas.
This case study was based on a scientific question about the change in severity of power outages if a storm similar to Hurricane Sandy were to impact Connecticut in the future, taking place with warmer atmospheric conditions. We have presented a case study of how we would expect future outages to occur under different future Hurricane Sandy scenarios given end-of-century atmospheric thermodynamic conditions informed by numerical weather prediction simulations from a recent published work (Lackmann 2015). We acknowledge that changes in both track and intensity affect changes in outage impacts. We did not attempt to separate these effects as our purpose here was to provide a case study of potential power outages owing to a stronger storm and altered track induced by future climate conditions and to illustrate a technique that could be used with a more complete set of future tropical cyclone scenarios. For example, applying this technique to multiseason simulations of future climate scenario (or historic) hurricane tracks and severities could provide a more thorough treatment of the problem.
Relative to Current Sandy, Future Sandy was shown to increase power outages in Connecticut by 42% (reduced data input) to 64% (full data input) when the ensemble mean of each atmospheric variable from the five WRF simulations was used to run the outage models, and by 55% (reduced data input) to 64% (full data input) when the arithmetic average of the five ensemble member outage simulations was used (Table 8).
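The two aggregation approaches compared above need not agree when the outage response is nonlinear. The following sketch illustrates this with a hypothetical threshold-type response; the function `toy_outage_model`, its threshold, and the gust values are assumptions made purely for illustration and do not represent the fitted models or the WRF output.

```python
def toy_outage_model(gust):
    """Hypothetical nonlinear response: no outages below a gust threshold,
    then rapidly increasing outages above it."""
    threshold = 20.0
    return max(0.0, gust - threshold) ** 2


# Hypothetical gust values (one per ensemble member) for a single grid cell.
member_gusts = [15.0, 18.0, 22.0, 25.0, 30.0]

# Approach 1: average the weather inputs first, then run the outage model once.
ensemble_mean_gust = sum(member_gusts) / len(member_gusts)
outages_from_ensemble_mean = toy_outage_model(ensemble_mean_gust)

# Approach 2: run the outage model for each member, then average the outputs.
member_outages = [toy_outage_model(g) for g in member_gusts]
average_of_member_outages = sum(member_outages) / len(member_outages)
```

In this toy case the second approach gives the larger value because averaging the inputs smooths out the extreme members before the nonlinear response is applied. The direction and size of the difference in practice depend on the shape of the fitted response, so the sketch should not be read as predicting which approach yields more outages in the study.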
To limit weather-related outages, many utilities are investing in multimillion dollar grid-resilience projects to address substation flooding, vegetation management, and pole integrity improvements (Consolidated Edison 2013; Eversource Energy 2013; Public Service Enterprise Group 2016). The study did not account for activity to harden the electricity grid, which would likely moderate future storm impacts. Storm surge and inland flooding, while not evaluated in our model, are also expected to contribute to increased outages in future hurricane scenarios because Future Sandy’s stronger winds and closer track to the Eversource service territory relative to Current Sandy would generate a higher storm surge. In addition, soil moisture may increase power outages in both drought and saturated soil conditions by making tree branches more likely to break (Meir et al. 2015) or making trees more likely to be uprooted (James et al. 2014; Vogel 1996). Past research has explored the use of soil moisture data for improving the accuracy of hurricane outage prediction models (Han et al. 2009a,b), and such data were recently demonstrated to be useful for predicting outages during Hurricane Matthew (Gorder 2016).
Although we have only analyzed impacts on the electric distribution network by tree-caused damage, there are many other types of infrastructure that would likely be informed by an analysis of this type (e.g., water supply, wastewater, and telecommunications). A future extension of this analysis that includes simulations of many tropical cyclones (not just Hurricane Sandy), with various tracks and intensities, will allow us to reach more general conclusions about how future tropical cyclone impacts could change in a warming climate.
We gratefully acknowledge the support of Eversource and the Eversource Energy Center at the University of Connecticut, which provided funding and data for this research. WRF is developed and maintained by the National Center for Atmospheric Research, funded by the National Science Foundation (NSF). Precipitation NCEP/EMC 4KM Gridded Stage IV data are provided by NCAR/EOL under sponsorship of NSF (http://data.eol.ucar.edu/). We acknowledge ECMWF for provision of the ERA-Interim and NCAR for making the ERA-Interim available.
The statistical metrics used in the model evaluation analyses are presented below. The modeled variable (i.e., wind, precipitation, outages) is represented by Y, the observed variable is represented by X, and N is the total number of data points used in the calculations: