Understanding Climate Risk in Future Energy Systems: An Energy–Climate Data Hackathon


What: Approximately 40 participants, with expertise spanning energy, computer science, and weather and climate research, joined a week-long Energy-Climate data “hackathon” in June 2021. It was hosted by the Universities of Oxford and Reading in partnership with the UK Met Office as part of a series of themed hackathons supported by the Met Office and held in the run-up to the UN COP26 conference. Six projects were initiated and developed by teams over the course of the week, supported by access to state-of-the-art computational resources on the UK’s CEDA-JASMIN service, and stimulated by keynote speakers from industry and academia. The hackathon concluded with teams presenting their outputs to a panel of invited experts. Several teams plan to build on their hackathon success in publications, ongoing collaborations and research funding proposals.
When: 18th May (half-day “scoping” event) & 21st-25th June 2021 (main hackathon) 
Where: Online via Zoom and Gather.Town, supported by Slack communication channels 
Affiliations: Initiated by the University of Oxford (Dr Sarah Sparrow, Professor David Wallom, Professor Tim Woollings) and the University of Reading (Professor David Brayshaw, Dr Hannah Bloomfield), in partnership with the Met Office, the UK’s national meteorological service, and with support from the UK’s CEDA-JASMIN service and Gurobi optimization software.

Decarbonization is driving rapid change in the production, transmission, and consumption of energy. Nonetheless, current pandemic recovery plans are projected to bring emissions to a new maximum in 2023, from which they will continue to rise with fossil fuels' continued growth (IEA 2021). Investment into clean energy, especially in emerging and developing economies, is therefore required to keep emissions on a Paris-compliant trajectory.
Growing installations of renewable generation and the electrification of heating and transport are increasing the energy sector's exposure to weather and climate (Deakin et al. 2021), and the impact of growing weather sensitivity is further compounded by an uncertain and changing climate (Bloomfield et al. 2021a; Jaroszweski et al. 2021).
Accurate quantification of weather and climate risk on scales relevant to energy networks is therefore essential in supporting a rapid transition to low carbon and renewable energy systems, a key COP26 priority theme, though there remain many technical and scientific challenges (Bloomfield et al. 2021b).
The Energy-Climate Data Hackathon, initiated by the Universities of Oxford and Reading and supported by the Met Office, therefore sought to address three aims: to investigate the risks and challenges facing future energy systems, to develop novel solutions to support exchange of both data and science understanding, and to foster the growth of an integrated community working at the interface of energy and climate research. The hackathon was open to everyone working in the fields of energy or climate, including from industry, policy, or academia. Participants' experience spanned energy system science and modeling, climate science and modeling, software development, data analytics, and other mathematical and physical modeling backgrounds. Projects within the hackathon's scope included developing decision support aids, producing improved solutions for data exchange between energy and climate science, model development, and creating novel end-user applications. Project initiators brought their ideas forward at an initial scoping meeting (18 May), with the aim of building a team to develop the projects over the course of the week-long hackathon (21-25 June).

Event structure
The hackathon was advertised from April 2021, with the choice to register as a participant or to put forward a project idea as a potential "team lead" (several potential team leads were also identified and approached directly by the organizing team). Team leaders were encouraged to discuss their ideas with the organizers in advance of the scoping meeting, and benefited from support in developing their initial ideas into suitable hackathon projects.
On the scoping day, participants were given an overview of the event, alongside technical information regarding the United Kingdom's CEDA-JASMIN computing platform (Lawrence et al. 2013), which was provided for their use, and details regarding common datasets (e.g., meteorological data from the United Kingdom's CEDA archive). A series of short talks showcasing existing tools and datasets available to use during the hackathon such as the Copernicus Climate Data Store (Copernicus 2021) or the Earth System Model Evaluation Tool (Eyring et al. 2020) were presented by external speakers. After a short break, all the team leads seeking to put forward a hackathon project were invited to give 3-min lightning talks to introduce their proposal.
Following the lightning talks, participants joined breakout rooms to further discuss the topic(s) in which they were interested. The scoping day concluded with a final plenary discussion, and details about how to prepare for the next steps of the hackathon. This included the team leads preparing a one-page "project proposal" that detailed their anticipated software/data requirements as well as an expanded outline of the project concept.
Seven projects were presented on the scoping day. However, between then and the start of the hackathon, some groups working on similar ideas chose to merge, while other new groups formed based on discussions between participants. The one-page project proposals were also circulated to all participants in advance of the event enabling them to indicate their preferences for which project they wished to be involved in. The distribution of participants across projects was sufficiently well spread and balanced that all participants could join their first-choice team.
In the run-up to the start of the main hackathon event, participants joined a dedicated communications platform on Slack, which catalyzed the further development and discussion of project ideas. Technical assistance was given on preparing workstations with relevant tools (including SSH connection to the High Performance Computing resources hosted on the U.K. CEDA-JASMIN platform and use of Jupyter Notebooks; Kluyver et al. 2016). Hackathon time was primarily taken up by dedicated "group hack time" (see Fig. 1 for the event timetable), during which participants worked in team groups or subgroups communicating via Slack channels and Zoom breakout rooms. Keynote talks were held on the first four days. Participants also heard from a representative of Meteomatics (a private weather service provider) on the first day, with an introduction to their weather data API.
Troubleshooting sessions were offered to participants on the second and fourth evenings (but were in practice little used). Brief update presentations held before lunch on days 2-4 gave teams the opportunity to share their progress, and exchange feedback and ideas across groups. Social events took place on Gather.Town, meeting on a remote "island" located in the digital ether.
The event concluded with final presentations on the afternoon of day 5, recordings of which have been made available online (Oxford e-Research Centre 2021). A panel of invited experts commented on the merits and successes of the different projects, giving feedback on how they might be taken forward and feeding into the Met Office's lead-up to COP26 (Met Office 2021). Support for open-access and open-source science was taken as a guiding principle for the hackathon, with groups uploading code and data to repositories on GitHub (Energy-Climate Hackathon 2021). Teams additionally shared resources such as presentation slide PDFs and tutorial documents through Google Drive, which have now been included in the main GitHub repository. A brief summary of each project is given below.

Projects and their outcomes
PV output forecasting using smart meter data of nearby PVs. The premise of this project was that with the recent uptake of rooftop solar photovoltaics, more accurate very short-term forecasting techniques are needed to assist in load balancing within networks. The aim was to develop a forecasting technique that uses the output data of nearby PV systems to yield high-quality PV output forecasts of a center station. Improving forecasts in this way could create new business models for prosumers and enhance services of energy suppliers. This hackathon project used data from University of Reading sustainability services (Sustainability Services 2021) as well as crowdsourced data from local residents. Wind speed, wind direction, and incoming solar radiation were used as meteorological inputs. During the hackathon, a novel feature selection algorithm was proposed and combined with machine learning algorithms to produce the forecasts. The developed technique is currently being extended and validated with data from 202 rooftop PV systems from the "PV-sensor field" collected by Utrecht University (Elsinga and van Sark 2017). This project highlighted the ongoing challenges of needing high-quality open energy data and very high-resolution (sub-half hourly) meteorological data for the optimal operation of these algorithms.
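The general idea of forecasting a center station from its neighbors can be illustrated with a minimal sketch. This is not the team's feature selection algorithm, which is not described here. As a stand-in, the sketch ranks neighbors by lagged correlation with the center station and fits a one-variable least squares line. The station names, the one-step lag, and the synthetic "cloud" signal are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_neighbour(target, neighbours, lag=1):
    """Pick the neighbour whose output `lag` steps earlier best tracks the target."""
    scores = {
        name: abs(pearson(series[:-lag], target[lag:]))
        for name, series in neighbours.items()
    }
    return max(scores, key=scores.get)


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


# Synthetic data (assumption): a cloud field reaches the "upwind" station
# one time step before it reaches the center station.
random.seed(0)
cloud = [random.random() for _ in range(200)]
upwind = [c + random.gauss(0, 0.02) for c in cloud]
center = [0.0] + [c + random.gauss(0, 0.02) for c in cloud[:-1]]
unrelated = [random.random() for _ in range(200)]
neighbours = {"upwind": upwind, "unrelated": unrelated}

best = select_neighbour(center, neighbours)
a, b = fit_line(neighbours[best][:-1], center[1:])
# Forecast the next, not yet observed, center output from the latest neighbour reading.
forecast = a + b * neighbours[best][-1]
print(best, round(b, 2), round(forecast, 2))
```

A real application would use measured PV output and the meteorological inputs listed above, and would likely combine several neighbors rather than a single one.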
Identifying key features of meteorological data relevant for power system studies. Power system models perform complex calculations on a set of input parameters, to achieve a specific goal (e.g., minimizing the total cost of electricity, which includes both building and operating the generation equipment). Due to computational constraints, power system studies commonly only use singular representations of the parameters. The results obtained from the calculations are, however, found to be strongly sensitive to different choices of the input parameters (Schyska et al. 2021). The meteorological input, for instance, can have a substantial impact on the power system model results (Bloomfield et al. 2016; Schlott et al. 2018; Kies et al. 2021). This project compared the outputs from power system simulations using three EURO-CORDEX climate model simulations spanning the whole twenty-first century (Jacob et al. 2014) to identify the key features of meteorological input data with relevance for the output of power system simulations.
The hackathon team developed advanced statistical comparisons of model simulations, including a very large covariance correlation matrix on many key input/output parameters. This highlighted some important similarities and marked differences between climate models. For example, a stark difference (sign flip) was seen in the temporal correlation of onshore wind capacity factor and solar capacity factor between models. Based on feature maps, links could be made between the simulation outcomes and different characteristics of the input data. The group is continuing with this work after the project, including more climate models and further analysis metrics to increase the robustness of the results.
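A check for sign flips of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the team's analysis code: the two "models" are synthetic series, the variable names (`wind_cf`, `solar_cf`, `demand`) are hypothetical, and the wind-solar relationship is deliberately given opposite slopes so that a flip exists to be found.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_table(series):
    """Pairwise correlations for a dict mapping variable name -> time series."""
    return {
        (a, b): pearson(series[a], series[b])
        for a, b in itertools.combinations(sorted(series), 2)
    }


def sign_flips(table_1, table_2):
    """Variable pairs whose correlation has opposite sign in the two tables."""
    return [pair for pair in table_1 if table_1[pair] * table_2[pair] < 0]


def synthetic_model(solar_wind_slope, seed, n_steps=500):
    """Made-up stand-in for one climate model's energy variables."""
    rng = random.Random(seed)
    wind = [rng.random() for _ in range(n_steps)]
    solar = [0.5 + solar_wind_slope * (w - 0.5) + rng.gauss(0, 0.1) for w in wind]
    demand = [1.0 - 0.5 * w + rng.gauss(0, 0.1) for w in wind]
    return {"wind_cf": wind, "solar_cf": solar, "demand": demand}


model_a = synthetic_model(-1.0, seed=1)  # solar falls when wind rises
model_b = synthetic_model(+1.0, seed=2)  # solar rises when wind rises
flips = sign_flips(correlation_table(model_a), correlation_table(model_b))
print(flips)
```

With real simulations, each dictionary would hold the capacity factor and demand series derived from one climate model, and pairs with correlations close to zero would need a significance test before a flip is taken seriously.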
Tutorials for using subannual data within pyam. The open-source Python package pyam provides a suite of tools for analysis, processing, and visualization of output from integrated assessment models (Huppmann et al. 2021). The source code is available on GitHub (pyam 2021). The package already supports working with subannual time series data as used by several other projects in this hackathon. Alas, there are to date no tutorials for these use cases, which may restrict new users from benefiting from the full capabilities of pyam. As part of this hackathon project, a hands-on tutorial was held for interested participants, and several functions were implemented to improve performance (run times and memory usage) and improve the plotting module.
Developing methods to estimate subdaily energy relevant climate variables from daily data. Power system model simulations are commonly performed using half-hourly or hourly input time series (such as regional electricity demand or renewable generation). The high temporal resolution is important for accurate estimates of power system properties like peak load or ramps in renewable power generation. As the weather dependence of power systems continues to increase, many power system modeling inputs can now be constructed using outputs from numerical weather prediction models and climate models. Unfortunately, most climate model datasets are only produced at temporal resolutions coarser than the hourly data required for power system models [e.g., the U.K. Climate Projections (UKCP) global and regional datasets provide output at daily resolution (Murphy et al. 2018) and the Subseasonal to Seasonal (S2S) Prediction project database (Vitart and Mladek 2020) produces a few outputs at 6-hourly resolution and the rest at daily resolution]. There are a limited number of models that produce some variables at hourly resolution but these are available for either limited areas or for limited periods in the future (Kendon et al. 2021).
This project investigated machine learning methodologies to estimate hourly data from daily data, using the ERA5 dataset as this provides hourly data against which methods can be verified. The validation focused on three high-impact events where hourly data were crucial to balancing energy supply and demand. The fields of interest were 2-m temperature, as this is a key input for energy demand, and 10-m wind speed and surface solar irradiance, as these are key inputs for wind and solar power generation, respectively. The hackathon team compared multiple machine learning methodologies, finding that simpler methods of linear interpolation for downscaling temperature and solar irradiance proved hard to beat (compared to, e.g., linear regression, random forests, and neural networks). For wind speed, a linear-regression-based approach modeled the data best. It was much better at capturing the hour-to-hour variability in wind speed than previous approaches, but still had issues accurately capturing wind speed magnitude. This project is part of the U.K. Climate Resilience Programme, and feeds into ongoing work at the Met Office, which will support the codevelopment of a climate service prototype for the energy sector.
Extreme weather for electricity systems. This project explored a dataset of gridded meteorological "adverse weather scenarios" produced by the Met Office in collaboration with the National Infrastructure Commission and Climate Change Committee (Dawkins et al. 2021). The team took several approaches to exploring this dataset: data visualizations to study the extreme events, comparison to existing multidecade climatological datasets (Bloomfield et al. 2020a,b), and use of the dataset as input to the power system model Calliope (Pfenninger and Pickering 2018) to study the impact on energy system reliability, successfully linking events with stress indicators. The project showed that this dataset can aid testing future energy system resilience to weather and climate extremes, and results are being condensed into material for a publication, as well as training material to be available with the dataset.
Quantifying uncertainty in power system design using distributed computing. Power system models are a widely used tool in the design and simulation of potential future energy systems. Such models use optimization-based techniques to solve an algebraic program to seek a future power system that minimizes the combined cost of building and running the associated infrastructure. This optimization problem typically involves numerous constraints to ensure, e.g., that the supply of electricity matches the demand for electricity at an hourly level. Recent research has clearly demonstrated the importance of solving these models over "climatologically robust" weather samples (i.e., weather records spanning many decades; e.g., Hilbers et al. 2019; Bloomfield et al. 2016), but the complexity of the optimization problem has rendered a thorough investigation of current and future climate uncertainty computationally intractable. This hackathon project sought to overcome this computational-complexity barrier by prototyping a distributed computing experiment using the citizen science climateprediction.net framework (Stainforth et al. 2004).
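The linear interpolation baseline mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the team's code: the hourly "truth" is a synthetic temperature-like signal standing in for ERA5, each daily mean is placed at the center of its day, and the comparison baseline simply repeats the daily mean for every hour.

```python
import math


def daily_means(hourly):
    """Average each complete block of 24 hourly values."""
    return [sum(hourly[d * 24:(d + 1) * 24]) / 24 for d in range(len(hourly) // 24)]


def interpolate_to_hourly(daily):
    """Place each daily mean at the center of its day (hour 11.5) and join
    neighboring days with straight lines; hold the value constant at both ends."""
    centres = [d * 24 + 11.5 for d in range(len(daily))]
    hourly = []
    for h in range(len(daily) * 24):
        if h <= centres[0]:
            hourly.append(daily[0])
        elif h >= centres[-1]:
            hourly.append(daily[-1])
        else:
            i = int((h - 11.5) // 24)  # day whose center is at or before hour h
            w = (h - centres[i]) / 24
            hourly.append((1 - w) * daily[i] + w * daily[i + 1])
    return hourly


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic hourly reference (assumption): a 10-day weather cycle plus a
# smaller diurnal cycle, over 20 days.
truth = [
    10 + 5 * math.sin(2 * math.pi * h / 240) + math.sin(2 * math.pi * h / 24)
    for h in range(20 * 24)
]
daily = daily_means(truth)

interpolated = interpolate_to_hourly(daily)
stepwise = [daily[h // 24] for h in range(len(truth))]  # repeat the daily mean

rmse_interp = rmse(interpolated, truth)
rmse_step = rmse(stepwise, truth)
print(round(rmse_interp, 3), round(rmse_step, 3))
```

Neither method in this sketch can recover the diurnal cycle from daily means alone, which is one reason additional predictors or learned models are worth testing against this baseline.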
Climateprediction.net is the world's largest climate modeling experiment, running climate model experiments on users' home computers. The hackathon team developed a prototype version of a climateprediction.net workflow, taking raw climate data from a GCM (which can be run on home computers), bias adjusting and converting the output to produce hourly demand and renewable generation time series, and then running through a simple "U.K.-Ireland" power system model implemented in a package known as Calliope (Pfenninger and Pickering 2018). The project forms the groundwork for future project proposals using the framework to do a full sensitivity analysis.
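The cost-minimization problem described above, and its sensitivity to the weather sample, can be illustrated with a toy example. This sketch is not Calliope and does not use an optimization solver: it swaps in a plain grid search over wind capacity for a two-technology system (wind plus gas backup). All costs, the flat 1000 MW demand, and the two synthetic "weather years" are made-up values chosen only to show the mechanism.

```python
def system_cost(wind_capacity, demand, wind_cf,
                wind_capex=100_000.0, gas_capex=40_000.0, gas_fuel=50.0):
    """Annual cost of meeting hourly demand with wind plus gas backup.

    Capacities are in MW, capex in cost per MW per year, fuel in cost per MWh.
    All cost values are illustrative assumptions.
    """
    # Demand left over after wind generation in each hour.
    residual = [max(0.0, d - wind_capacity * cf) for d, cf in zip(demand, wind_cf)]
    # Gas must cover the worst hour so that supply meets demand in every hour.
    gas_capacity = max(residual)
    return (wind_capex * wind_capacity
            + gas_capex * gas_capacity
            + gas_fuel * sum(residual))


def cheapest_design(demand, wind_cf, candidates):
    """Grid search: the candidate wind capacity with the lowest total cost."""
    return min(candidates, key=lambda cap: system_cost(cap, demand, wind_cf))


hours = 8760
demand = [1000.0] * hours  # flat 1000 MW demand (assumption)

# Two synthetic weather years: repeating capacity factor patterns.
windy_year = [[0.0, 0.2, 0.4, 0.6][h % 4] for h in range(hours)]  # mean 0.3
calm_year = [[0.0, 0.1, 0.3, 0.4][h % 4] for h in range(hours)]   # mean 0.2

candidates = range(0, 4001, 100)
windy_best = cheapest_design(demand, windy_year, candidates)
calm_best = cheapest_design(demand, calm_year, candidates)
print(windy_best, calm_best)
```

The two weather years lead to different least-cost designs, which is the motivation for solving such models over many decades of weather. A full power system model adds further technologies, storage, and transmission, so each solve is far more expensive than this grid search.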

Feedback and lessons learned
In response to requests for participant feedback, the event was described as a "very collaborative and insightful experience" where participants enjoyed the networking opportunities with others from around the globe across the energy and climate disciplines. The technical support from the organizers and the JASMIN team was described as very prompt and supportive, with participants reflecting, "It is simply hard to fault the organization and execution of the hackathon and the enthusiasm and skills of participants" and "A lot of work was put into the event by the hackathon organisers and this went a long way." Despite the overall success of the hackathon, there remain several areas that could be improved in subsequent events.
In terms of "open research", which was a guiding principle for the hackathon, time constraints proved a challenge. In particular, although project code and notebooks were uploaded to GitHub, without dedicated time to write good documentation many of the projects contain only a brief description of their intended purpose and operational instructions on GitHub. This is unfortunate, as the final presentations each team prepared were highly informative, including many helpful graphs and figures. Future hackathons might improve the availability of documentation by dedicating additional time for documenting projects and improving code readability.
Although a significant number of participants dropped out between registration and participation (from 90 registrations to 40 participants), there was strong commitment during the main week-long event. Also, although the event ran very smoothly on a technical level (with Zoom, Slack, and JASMIN providing an excellent platform), the virtual hackathon format was challenging for participants joining from very different time zones. The virtual format may have increased participation by letting group members drop in and out and attend parallel meetings more easily than an in-person event would have allowed, with the downside that the proportion of participants' time devoted to the event may have been lower than at a face-to-face event.
Overall, however, a core message that emerges from the hackathon is the importance of preparation and engagement from a core team before the event (particularly the organizers, the project team leaders, and technical support). Identifying the projects early on and engaging with team leaders to develop the ideas and understand their needs were key factors that allowed participants to work effectively. Having the means to prepare compute and data resources and tutorials before the event, for a common computing platform, helped.

Summary
Connecting researchers virtually with video conferencing and communications software is low cost, has a small carbon footprint, and made the hackathon accessible to a wide audience of researchers (particularly early career researchers with limited travel budgets, and researchers with other work or family responsibilities). Participants benefited from the sharing of knowledge relating to the use of tools key to the energy-climate community, including application of power system models, optimizers, and processing tools for working with meteorological data in Python.
Rapid progress was aided through the hackathon format, which enabled a deep level of practical engagement between researchers from different disciplines. Climate modelers were able to provide assistance in reading large volumes of meteorological data while energy modelers could provide details on how their various models worked and the spatial and temporal resolutions of meteorological data that would be most useful as inputs. Industry experts also contributed useful context to how the projects could provide most useful outputs for them.
The projects tackled very different problems in the quantification of climate impacts to energy systems. Some groups focused on small time scale operation of future energy systems, while other groups were interested in multi-decade-long planning and risk assessment. The pace of progress was impressive across the board, with each team broadly achieving their aims. Teams developing new tools, including the solar PV prediction and power system distributed computing, were able to rapidly develop prototypes, identifying where difficulties arise and further work is needed. While none of the projects would be considered finalized after only one week, they have exceeded the organizers' expectations in providing prototypes and scoping work that can form the basis of novel research and development work. Indeed, it is considered a very positive result that several groups are looking at future collaboration and turning their research into papers or grant proposals.
Acknowledgments. The hackathon benefited from funding from the Met Office and Reading University. Thanks are given to the project team leaders and participants for their hard work and contributions throughout the event. Special mention to Bryn Pickering for helping multiple groups to set up their own Calliope power system modeling environments. Thanks are given to the members of staff at the University of Reading who responded to Hannah's request for rooftop solar PV data for one of the projects.
We are very grateful to the United Kingdom's CEDA-JASMIN service for providing JASMIN access and support before and during the event and to Gurobi Optimization for generously providing gratis use of their commercial solver for the duration of the workshop.
The work reported in the "Developing methods to estimate subdaily energy relevant climate variables from daily data" section is part of the U.K. Climate Resilience programme, which is supported by the UKRI and codelivered by the Met Office and NERC on behalf of UKRI partners AHRC, EPSRC, and ESRC.
Data availability statement. The full output of projects is available at https://github.com/2021-Energy-Climate-Hackathon.