1. Introduction
Hurricane Ida made landfall on 29 August 2021, bringing sustained winds of 150 mi h−1 (67 m s−1), heavy rainfall, and catastrophic storm surge to the Gulf Coast, before threatening the Northeast with flooding and tornadoes and killing almost 100 people (Hanchey et al. 2021). Natural disasters pose a constant threat of economic damage and loss of life for the United States. In 2021 alone, extreme weather resulted in 688 lives lost, in addition to over USD 145 billion worth of damage. Unfortunately, this toll only continues to rise (Smith 2020). According to the National Oceanic and Atmospheric Administration (NOAA), the average yearly cost of natural disaster damage over the past 5 years has reached USD 148.4 billion, triple that of the 37 years prior (adjusted for inflation) (Smith 2020). Furthermore, a 2012 Intergovernmental Panel on Climate Change (IPCC) report found that extreme weather events will continue to undergo unprecedented changes in frequency, intensity, spatial extent, and duration as a direct result of global warming (IPCC 2012). Nonetheless, according to a 2020 Pew Research Center poll, only about 60% of Americans feel that climate change is a major threat to the country (Poushter and Fagan 2020).
Because of their broad accessibility and widespread popularity in the United States throughout the past decade, social media platforms represent effective environments in which to study public opinions on climate change. Twitter, a robust platform for real-time opinion sharing by experts and nonexperts alike, was specifically chosen for this analysis of public climate change sentiment due to its character limit on posts, enabling sentiment and volume analysis. Given the inextricable connection between extreme weather events and climate change, this study hypothesized that an increase in Twitter posts and news articles relating to natural disasters would positively correlate with both an increase in climate change media volume and sentiment reflective of a higher general belief in global warming.
In previous work, Kirilenko et al. (2015) demonstrated the plausibility of using social media data including Twitter posts for surveys of public opinion, specifically on public climate change perceptions. They confirmed that the public in fact associated extreme temperature changes with climate change. Additionally, Yeo et al. (2017) explored whether Twitter users react differently to the phrases “climate change” and “global warming,” concluding that Twitter audiences tend to equate the two terms. More recently, Al-Saqaf and Berglez (2019) and Berglez and Al-Saqaf (2021) studied the relationships between social media discussions of climate change and those of heat waves, droughts, and floods, as well as the correlation between intensified discussions of extreme weather in general and climate change. These studies validate the use of social media platforms for quantifying public response to extreme weather and climate change. However, these relatively small studies (with regard to Twitter volume) did not gauge public sentiment as a means of determining the intent behind tweet volume patterns, nor did they include analyses of trends across a large variety of events over a sufficiently long timeframe, all of which are highlighted elements in this study.
Methods of climate change sentiment analysis have previously been investigated by Mohamad Sham and Mohamed (2022), using lexicon-based, machine learning, and hybrid approaches. They ultimately concluded that the hybrid method outperformed the other approaches, although its accuracy was approximately 75%. Cody et al. (2015) classified climate change sentiment in Twitter posts based on expressed happiness and determined that climate rallies, book releases, and green ideas contests most effectively increased this sentiment. For this study, however, complex natural language processing models, including gated recurrent unit (GRU) and deep long short-term memory (LSTM) networks, were trained to detect sentiment specifically indicative of belief in climate change. This enabled the study's analyses to examine the direct link between natural disasters and actual climate change belief sentiment, as opposed to general emotions surrounding climate change.
Therefore, the purpose of this study is to examine the impact of major natural disasters on public media sentiment and volume surrounding climate change to inform ongoing environmental messaging strategies in the United States. Specifically, the study sought to answer the following research questions: How have climate change–related media volume and sentiment changed between 2010 and 2020? What immediate impact do major natural disasters have on both social and traditional media related to climate change? How do variations in natural disaster and climate change media volume and sentiment correspond? This study builds on existing literature by increasing the scale and complexity of both media sentiment and volume analyses, while isolating the effects of natural disasters.
2. Methods
a. General
The PyCharm Integrated Development Environment (IDE) and Google Colaboratory Pro were both used for Python development. This study consisted of three main phases: 1) data collection, 2) media volume relationships and correlation analysis, and 3) climate change sentiment analysis.
b. Data collection/scraping
Twitter posts published publicly in the United States between 2010 and 2020 were scraped by specific keywords, using the sntwitter module of the snscrape application programming interface (API; Khattak et al. 2020; JustAnotherArchivist 2022). Tweets published prior to 8 November 2017 conform to a 140-character limit, whereas later tweets conform to a 280-character limit. For the purpose of the study, “natural disaster tweets/news” refers to corresponding media collected using the following keywords selected on the basis of the weather events considered to be most linked to climate change: “natural disaster,” “cyclone,” “hurricane,” “tornado,” “wildfire,” “flood,” “tsunami,” and “drought” (Sauerborn and Ebi 2012). Similarly, “climate change tweets/news” refers to corresponding media collected with the specific keywords, “climate change” and “global warming,” given the Twitter audience’s demonstrated lack of differentiation between the two terms (Yeo et al. 2017). “Dual-category tweets/news” refers to climate change media that also contains natural disaster keywords. All data were sourced from the United States.
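The three media categories defined above can be illustrated with a simple keyword matcher. This is a minimal sketch, assuming case-insensitive substring matching; the function name `categorize` and the matching rule are illustrative assumptions, not the study's actual collection code, which queried the scraper by keyword.

```python
# Keyword lists as given in the text; substring matching is an assumption.
DISASTER_KEYWORDS = ("natural disaster", "cyclone", "hurricane", "tornado",
                     "wildfire", "flood", "tsunami", "drought")
CLIMATE_KEYWORDS = ("climate change", "global warming")


def categorize(text):
    """Return the set of categories a piece of media text falls into."""
    lowered = text.lower()
    is_climate = any(k in lowered for k in CLIMATE_KEYWORDS)
    is_disaster = any(k in lowered for k in DISASTER_KEYWORDS)
    categories = set()
    if is_climate:
        categories.add("climate change")
    if is_disaster:
        categories.add("natural disaster")
    if is_climate and is_disaster:
        categories.add("dual-category")
    return categories
```

Under this rule a dual-category item is counted in all three sets, which matches the definition of dual-category media as climate change media that also contains natural disaster keywords.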
In total, 131 804 385 natural disaster tweets and 34 876 556 climate change tweets were collected and cataloged. All tweets and their associated likes, replies, and retweets were counted and recorded by date. Climate change tweets were saved and further examined for natural disaster keywords.
News articles were similarly queried by date using the above keywords, through the RSS feeds of Google News, representing a conglomeration of a variety of reputable media sources (https://news.google.com/). Natural disasters, climate change, and dual-category articles were all tallied by date, while climate change article titles were also saved. A total of 71 141 climate change news articles and 233 023 natural disaster news articles were documented during the study.
Major disasters, along with their corresponding date ranges, damage costs, and associated deaths, were obtained from NOAA’s database of historical U.S. natural disasters that contributed to upward of USD 1 billion in damage (Smith 2020). Because of the propensity of long-term natural disasters (such as wildfires, droughts, and other forms of extreme weather continuing for multiple weeks or longer) to dilute media discussion, all events with a listed date range longer than 1 week were excluded from this study’s analyses.
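The exclusion rule for long-running events can be sketched as a simple filter. The example below assumes each event is a (name, start, end) tuple and reads "longer than 1 week" as a span of more than 7 days; the records shown are hypothetical and are not taken from the NOAA dataset.

```python
from datetime import date

MAX_DURATION_DAYS = 7  # "longer than 1 week" is read here as more than 7 days


def short_events(events):
    """Keep events whose listed date range spans at most one week."""
    kept = []
    for name, start, end in events:
        if (end - start).days <= MAX_DURATION_DAYS:
            kept.append((name, start, end))
    return kept


# Hypothetical example records, not taken from the NOAA dataset.
events = [
    ("storm A", date(2015, 5, 1), date(2015, 5, 4)),
    ("drought B", date(2015, 6, 1), date(2015, 9, 30)),
]
```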
c. Analysis of media volume trends between natural disasters and climate change
1) Tools
The following Python packages and modules were critical to the study. NumPy was used for advanced data transformations and statistical functions (https://numpy.org/). Pandas was used for reading CSV files, handling time series values, and quick data transformations (Pandas 2022). Matplotlib was used for graphing analyses and figure generation (Matplotlib 2022). Scikit-learn (sklearn) was used for correlation coefficient calculations and linear regression (scikit-learn 2022).
2) Overview metrics
Overview metrics were calculated using Pandas’ aggregation functions and time series manipulations.
3) Long-term trends
Because of the study’s 11-yr period of analysis, long-term trend figures were created by plotting daily, weekly, and monthly averages. Ratios and percentages were calculated day by day, according to figure descriptions. Long-term trends refer to trends throughout the duration of the study’s collection time interval (2010–20), while short-term trends and analyses generally involve the days, weeks, or months around specific events.
4) Volume increases around major natural disasters
Average volume percent increases around specific major natural disasters were calculated between the average daily media volume of the 15 days preceding each event as a baseline and each event’s start date, end date, peak, and daily average volumes. Analyses were conducted for 1) all of the NOAA dataset natural disasters and 2) the subset of these events in the highest quartile of media coverage (denoted as Q4).
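The baseline comparison described above can be sketched as follows. This is an illustration only, using the standard library rather than the Pandas routines the study relied on; the function name, the dictionary layout, and the treatment of missing days as zero are assumptions, and a zero baseline would need separate handling.

```python
from datetime import date, timedelta
from statistics import mean

BASELINE_DAYS = 15  # baseline window stated in the text


def volume_increases(daily_volume, start, end):
    """Percent increases over the pre-event baseline for one event.

    daily_volume maps datetime.date -> count; days missing from the
    mapping are treated as zero (an assumption of this sketch).
    """
    baseline_days = [start - timedelta(days=d) for d in range(1, BASELINE_DAYS + 1)]
    baseline = mean(daily_volume.get(d, 0) for d in baseline_days)
    n_days = (end - start).days + 1
    event = [daily_volume.get(start + timedelta(days=i), 0) for i in range(n_days)]

    def pct(value):
        return 100.0 * (value - baseline) / baseline

    return {
        "start": pct(event[0]),
        "end": pct(event[-1]),
        "daily_average": pct(mean(event)),
        "peak": pct(max(event)),
    }


# Hypothetical example: flat baseline of 100 tweets/day, then a 3-day event.
volume = {date(2020, 1, 1) + timedelta(days=i): 100 for i in range(15)}
volume.update({date(2020, 1, 16): 110, date(2020, 1, 17): 150, date(2020, 1, 18): 120})
result = volume_increases(volume, date(2020, 1, 16), date(2020, 1, 18))
```

Averaging these per-event dictionaries across all events, or across the Q4 subset, would give the kind of summary values the analysis reports.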
5) Correlation analyses around major natural disasters
All correlation values represent Pearson correlation coefficients and were calculated between climate change and natural disaster media volume, as well as predicted sentiment, around specific major events in the NOAA dataset. Pearson correlation assumptions including level of measurement, related pairs, normality, absence of outliers, and linearity were met prior to conducting analyses. Bar plots were generated by averaging correlation coefficients at specific days around each event’s date range.
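For reference, the Pearson coefficient used throughout can be computed as in the sketch below. It is written in plain Python for self-containment; the study reports using library routines, and how the series were windowed around each event is not specified here, so any window passed to this function is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    # Undefined (division by zero) if either series is constant.
    return cov / sqrt(var_x * var_y)
```

Here `x` and `y` would be, for example, daily natural disaster and climate change tweet counts over the same set of days.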
6) Corresponding increases between climate change and natural disaster media
Relationships between media volume increases were analyzed by plotting standard deviation increases of natural disasters media against both percent and standard deviation increases of climate change media volume. Month-by-month standard deviations and means were used to control for long-term trends/increases. All climate change media volume increases and natural disaster media volume increases represent averages and were calculated against 1) the entire study time interval (2010–20), 2) the intervals within 5 days of any given major natural disaster, and 3) the intervals within 15 days of any given major natural disaster, in an effort to filter potential noise outside the context of specific events. Correlations between corresponding increases were then calculated before linear regression models were fitted to each of the plots. Sentiment-related analyses used the same general method, with sentiment ratios (positive/negative) substituted for pure volumes.
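The month-by-month standardization can be sketched as a per-month z-score. The example below is an assumption-laden illustration: it uses the population standard deviation (the text does not say whether population or sample SD was used), and the function and variable names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def monthly_sd_increases(daily_volume):
    """Map each date to (value - month mean) / month standard deviation."""
    by_month = defaultdict(list)
    for day, value in daily_volume.items():
        by_month[(day.year, day.month)].append(value)
    stats = {m: (mean(v), pstdev(v)) for m, v in by_month.items()}
    result = {}
    for day, value in daily_volume.items():
        mu, sigma = stats[(day.year, day.month)]
        # A month with constant volume has no spread; report 0 for it.
        result[day] = (value - mu) / sigma if sigma else 0.0
    return result


# Hypothetical example: three quiet days and one busy day in the same month.
volume = {date(2020, 1, 1): 10, date(2020, 1, 2): 10,
          date(2020, 1, 3): 10, date(2020, 1, 4): 30}
scores = monthly_sd_increases(volume)
```

Because each day is compared only with its own month, a steady multi-year rise in volume does not register as an increase.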
7) Directly attributing elevated climate change tweet volume
Analyzing the frequency with which elevated climate change tweet volume correlated with actual natural disaster events, outside of the context of the events in the NOAA dataset alone, provided grounds for broader results. Per such analysis, all dates containing climate change media volume at least 2 standard deviations above the mean for their given months (means and standard deviations were calculated on a monthly basis, to control for long-term trends) were identified. Natural disaster news volume on these dates was also measured, to determine the percentages of dates that most probably correspond to an uptick in extreme weather. The relationship between increased climate change media and corresponding natural disasters news volume ranging from the 50th to 90th percentiles was plotted. Correlations were then calculated, and a linear regression model was fitted to the data.
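A minimal sketch of this attribution step follows. It assumes population standard deviations, a nearest-rank percentile cutoff, and an "at or above" comparison; these choices, along with all names and the example data, are illustrative assumptions rather than the study's implementation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def elevated_dates(daily_volume, threshold_sd=2.0):
    """Dates whose volume is >= threshold_sd SDs above their month's mean."""
    by_month = defaultdict(list)
    for day, value in daily_volume.items():
        by_month[(day.year, day.month)].append(value)
    flagged = []
    for day, value in sorted(daily_volume.items()):
        values = by_month[(day.year, day.month)]
        mu, sigma = mean(values), pstdev(values)
        if sigma > 0 and value >= mu + threshold_sd * sigma:
            flagged.append(day)
    return flagged


def share_above_percentile(flagged, news_volume, percentile):
    """Percent of flagged dates with news volume at or above a percentile cutoff."""
    ranked = sorted(news_volume.values())
    # Nearest-rank percentile; other percentile definitions would also work.
    index = min(len(ranked) - 1, int(len(ranked) * percentile / 100))
    cutoff = ranked[index]
    hits = sum(1 for day in flagged if news_volume.get(day, 0) >= cutoff)
    return 100.0 * hits / len(flagged) if flagged else 0.0


# Hypothetical example: nine quiet days and one spike in January 2020.
tweets = {date(2020, 1, d): 10 for d in range(1, 10)}
tweets[date(2020, 1, 10)] = 100
news = {date(2020, 1, d): d for d in range(1, 11)}
```

Evaluating `share_above_percentile` at several percentiles between 50 and 90 would produce the points to which a linear model could then be fitted.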
d. Climate change sentiment analysis for belief and perceived severity
Climate change media sentiment analysis at scale for the purpose of this study consisted of identifying model training/testing data, in addition to conducting data preprocessing and tokenization, after which four competing models were developed to achieve the most accurate results possible with finite resources. The ultimate goal of these processes was to determine whether or not a given tweet represented positive climate change sentiment (belief in global warming and its severity) or negative climate change sentiment (skepticism toward anthropogenic global warming).
1) Data preprocessing
Data preprocessing consisted of data cleaning and augmentation. Datasets from 1) Kabaghe and Qin (2018) and 2) CrowdFlower (2016) were used for model training, forming a baseline random sample of 43 600 labeled tweets. Model testing data were taken from the Explore Data Science Academy (EDSA) Climate Change Belief Analysis (EDSA 2020). After data collection, non-ASCII characters, punctuation, popular stopwords, and articles were removed from the initial tweets, and duplicate and neutrally classified tweets were discarded. The second iteration of tweet cleaning included the fixing of contractions using the open-source Python “contractions” library, lemmatization using spaCy’s en_core_web_sm module, and spell correction using SymSpellPy (because of the tendency of misspelled words to disproportionately expand models’ vocabularies) (van Kooten 2022; Weischedel et al. 2022; SymSpellPy 2022). The Python “wordninja” package was also used to split concatenated words before numeric characters were iteratively removed from the tweets (Anderson 2022).
Because of the relatively small size of the original training dataset, cleaned tweets were then augmented using a variety of methods. First, using the Python “googletrans” package (built on the popular Google Translate API), tweet strings were translated into a total of 14 different common languages, including French, Spanish, Russian, Mandarin Chinese, and Japanese, before being translated back into English, a popular NLP data augmentation strategy known as back-translation (Han 2022). Through this process, tweets retained their meaning, but grammar structures and exact diction were altered. Furthermore, in an effort to increase the models’ vocabularies, the nltk.corpus wordnet package was used for randomized synonym replacement (Natural Language Toolkit 2022). Word order swapping was not used, to avoid skewing the meaning of individual tweets. After the completion of data augmentation, the study’s baseline training dataset had grown to approximately 2.6 million unique tweets, with a rough balance of positive and negative sentiment classifications (as opposed to the original disparities in classification totals).
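Randomized synonym replacement can be illustrated with the sketch below. To stay self-contained it uses a small hypothetical synonym table in place of WordNet lookups, and the replacement rate and function name are assumptions; the study's actual replacement policy is not described in detail.

```python
import random

# Toy synonym table standing in for WordNet lookups (hypothetical entries).
SYNONYMS = {
    "big": ["large", "huge"],
    "storm": ["tempest"],
}


def synonym_replace(text, rate=0.5, seed=None):
    """Randomly replace words that have a known synonym; keep word order."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)
```

Word order is preserved by construction, consistent with the decision not to use word order swapping.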
2) Tokenization
After data cleaning and augmentation, the tweet messages were tokenized (converted to vectors) for model training using 1) global vectors for word representation (GloVe) and 2) DistilBertTokenizerFast from HuggingFace transformers (Pennington et al. 2014; Sanh et al. 2020). Specific GloVe vector asset embeddings were taken from an open-source embedding dataset (200 byte) (Yadav 2014). GloVe was implemented iteratively with single embedding lengths of 150 words and a 30 000-word maximum vocabulary. The GloVe tokenization method was used for the GRU, LSTM, and stacked-LSTM models, reviewed in the next section. The pretrained distilbert-base-uncased model, introduced in “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” was implemented in DistilBertTokenizerFast, utilizing transfer learning (as discussed in the next section) (Sanh et al. 2020).
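The fixed sequence length and vocabulary cap used with the GloVe embeddings can be illustrated as follows. This is a sketch in plain Python, not the study's tokenizer: whitespace splitting, the reserved padding and unknown-word indices, and the function names are all assumptions. The resulting indices would select rows of an embedding matrix.

```python
from collections import Counter

MAX_VOCAB = 30_000   # maximum vocabulary size stated in the text
MAX_LEN = 150        # sequence length stated in the text
PAD, UNK = 0, 1      # reserved indices (a convention chosen for this sketch)


def build_vocab(texts, max_vocab=MAX_VOCAB):
    """Map the most frequent words to integer indices starting at 2."""
    counts = Counter(word for text in texts for word in text.split())
    most_common = counts.most_common(max_vocab - 2)
    return {word: i + 2 for i, (word, _) in enumerate(most_common)}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to a fixed-length list of indices, truncating or padding."""
    ids = [vocab.get(word, UNK) for word in text.split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))
```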
3) Models
The following four model architectures were trained and evaluated for performance on the dataset: a GRU, a single-layer LSTM model, a stacked (four layer) LSTM model, and a DistilBert transformer (Fig. 1).
The GRU model was first introduced in 2014 by Cho et al. (2014); however, this study’s specific implementation architecture was borrowed from Kabaghe and Qin (2018) and included dropout, a bidirectional GRU, average and maximum pooling, and a fully connected layer from the Keras layers module (Cho et al. 2014; Kabaghe and Qin 2018; Keras Team 2022). See Fig. 1 for a visual diagram of a gated recurrent unit. All the models used in this study were compiled with Adam optimizers and used categorical cross-entropy loss. After eight epochs (training iterations), the GRU achieved a net testing accuracy of 94.2%, with 96.0% accurate classification of positive climate change sentiment tweets and 92.3% accurate classification of negative sentiment tweets from the testing data.
The LSTM model was first introduced in 1997 by a paper of the same name as a variation of the recurrent neural network (Hochreiter and Schmidhuber 1997). See Fig. 1 for a visual diagram of an LSTM. This study incorporated two LSTM-based architectures: 1) a single-layer framework and 2) a stacked model. The single-layer LSTM consisted of dropout, one LSTM layer, followed by a sigmoid-activated fully connected layer. After five epochs of training, the single-layer LSTM model achieved a net testing accuracy of 96.1% on testing data, with 98.3% accurate classification of positive climate change sentiment and 94.0% accurate classification of negative sentiment. The stacked-LSTM model consisted of four LSTM layers, with the first and last layers using a dropout of 0.2, and the middle two hidden layers implemented with a dropout of 0.4. High dropout rates were crucial to prevent overfitting caused by data augmentation, which yielded a large vocabulary but contributed to many training tweets having similar grammatical structures. After three epochs of training (low number of epochs due to computing power and time limitations), the stacked-LSTM model achieved a net testing accuracy of 96.1%, with 97.9% accurate classification of positive climate change sentiment tweets and 90.3% accurate classification of negative sentiment tweets from the testing data.
The final model implemented in this study for climate change sentiment analysis was a DistilBertForSequenceClassification transformer, which inherits pretrained weights from the same baseline distilbert-base-uncased model used in tokenization. Due to GPU RAM limitations for the study, the transformer was only trained on a small part of the dataset: approximately 43 000 tweets, roughly divided between categories of positive and negative sentiment. See Fig. 1 for a visual depiction of the DistilBert architecture. After three epochs of training, the transformer model achieved a net testing accuracy of 94.6%, with 89.7% accurate classification of positive climate change sentiment tweets and 99.5% accurate classification of negative sentiment tweets from the testing data.
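The accuracy figures reported for each model combine a net accuracy with the accuracy within each true class. A minimal sketch of that bookkeeping is given below; the label encoding (1 for positive, 0 for negative) and the function name are assumptions.

```python
def class_accuracies(y_true, y_pred, positive=1, negative=0):
    """Return net accuracy and accuracy within each true class."""
    pairs = list(zip(y_true, y_pred))

    def acc(subset):
        return sum(t == p for t, p in subset) / len(subset) if subset else float("nan")

    return {
        "net": acc(pairs),
        "positive": acc([(t, p) for t, p in pairs if t == positive]),
        "negative": acc([(t, p) for t, p in pairs if t == negative]),
    }
```

Net accuracy is the average of the two class accuracies weighted by class size, so it sits near their midpoint only when the test set is roughly balanced.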
The single-layer LSTM was chosen as the model most suitable for deployment based on the following criteria: 1) prediction speed, 2) evaluation accuracy, and 3) scalability for testing on the large dataset (RAM usage represented the primary scaling limitation of the study).
4) Implementation and use
All saved climate change tweet texts were iteratively classified by the single-layer LSTM model. Classifications were counted by category and saved by date.
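Counting classifications by date and forming the positive/negative sentiment ratio can be sketched as follows. The input layout (pairs of date and label) and the decision to skip days with no negative tweets are assumptions made for this illustration.

```python
from collections import defaultdict


def daily_sentiment_ratio(classified):
    """Return {date: positive/negative ratio} from (date, label) pairs."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for day, label in classified:
        counts[day][label] += 1
    ratios = {}
    for day, c in counts.items():
        # Days with no negative tweets have an undefined ratio; skip them.
        if c["negative"]:
            ratios[day] = c["positive"] / c["negative"]
    return ratios
```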
3. Results
a. Data overview and exploration
1) Tweet and news volume and sentiment overview metrics
Between 2010 and 2020, approximately 35 million climate change tweets were recorded. With the growth of Twitter and more widespread climate awareness, the total number of climate change tweets per year increased from 1 100 000 in 2010 to over 4 000 000 in 2020, a 300% increase (Table 1). In total, over 275 million likes, 71 million retweets, and 17 million replies were recorded during the 11-yr interval of the study, while 13 million tweets were classified as popular, with at least one like, retweet, or reply. Approximately 3.51% of all climate change tweets also mentioned natural disasters. On average, 18 climate change news articles were recorded per day with increases seen from 2010 (1079 articles) to 2020 (10 408 articles). See Table A1 in the appendix for media overview metrics by year and Table A2 in the appendix for detailed media frequency statistics.
Table 1. Twitter and news volume overview metrics (number and percent). Metrics were calculated for climate change and natural disasters in 2010, 2015, and 2020, as well as total from 2010 through 2020. Dual-category tweets contain climate change and natural disasters keywords.
2) Major natural disasters from NOAA dataset overview metrics
A total of 122 individual events (with a combined toll of USD 762 billion in damage and 5000 lives lost) between 2010 and 2020 from the NOAA dataset were used for these analyses. On average, each catastrophe cost approximately USD 6.25 billion and killed 40 people. However, median values were significantly lower, at USD 2 billion in damage cost and 3 deaths, suggesting a right skew caused by a select few notable events. See Table A3 of the appendix for more natural disaster overview metrics.
b. Detailed volume and sentiment analyses
1) Long-term trends
Even with long-term climate change tweet volume increases controlled for, the percentage of dual-category tweets drastically increased from 1.93% to 3.97% for all tweets, and from 2.0% to 4.3% for popular tweets, suggesting that public awareness of the relationship between climate change and extreme weather events effectively doubled during the 11-yr study period (Table 1). A similar trend is evident in the number of news articles from the same period.
The ratio of positive climate change tweets to negative climate change tweets by day also increased over the 11-yr study period, suggesting greater belief in climate change over time, reflected in all tweets, as well as popular tweets. Monthly averages of sentiment ratios also grew from less than 2 in 2010 to well over 5 in 2020, a 150% increase (Fig. 2).
2) Volume and sentiment changes around major natural disasters
On average, climate change tweet volume increased approximately 4% between the start dates of each natural disaster and the baseline 15 days prior to the event (Table 2). The end date and daily mean increases from the baseline during disasters were 11.70% and 9.15%, respectively, while the peak climate change tweet volume during major natural disasters demonstrated a 32.47% increase from the baseline, on average. These climate change tweet volume increases were significantly higher for events that fell in Q4 of natural disasters tweet volume at 21.45%, 21.67%, 24.16%, and 50.68% for the start date, end date, daily average, and peak volumes inside events’ date ranges, respectively. Similarly, the number of dual-category tweets rose on average 31.24%, 69.21%, and 49.95% for the start, end, and daily averages for each date range. Events in Q4 of natural disasters tweet volume once again amplified results with dual-category tweets increasing by 102.31%, 162.31%, and 164.02% for the start date, end date, and daily average volumes. Dual-category tweet peak mean increases were 115.58% for all natural disasters and approximately 260% for Q4 natural disasters, connecting climate change tweet volume increases with concurrent extreme weather events.
Table 2. Climate change media around major natural disasters from 2010 through 2020. Increases are calculated in comparison with daily averages for the 15 days prior to each disaster. “Q4 disasters only” refers to the subset of natural disasters in the highest quartile of media coverage. Positive values indicate an increase, and negative values indicate a decrease.
Climate change news volume exhibited similar increases on average around major natural disasters from the NOAA dataset: mean volume percent increases between the baseline period (15 days prior to each event) and the start date, end date, date range daily average, and peak volumes of each event were 6.92%, 9.34%, 8.10%, and 62.81%, respectively (Table 2). These increases were once again dwarfed by mean Q4 natural disaster event increases, which measured 10.25%, 23.17%, 14.72%, and 70.40% at the same time points. Therefore, between 2010 and 2020, on average, climate change media volume increased dramatically around natural disasters, particularly for events in Q4 of natural disasters tweet volume. However, the ratio of climate change tweets reflective of belief in global warming (positive sentiment) to tweets reflective of negative sentiment tended to decrease around natural disasters: mean percent changes from the average 15-day pre-event sentiment ratio were approximately −3%, −10%, and −7% for the start date, end date, and daily average ratios of major natural disasters, respectively.
3) Correlations between media types around major natural disasters
Although correlation coefficients of climate change and natural disaster media volume were generally low, they increased considerably around large natural disasters (Fig. 3). On average, between 20 and 30 days prior to an event’s date range, the correlation between natural disasters tweet volume and climate change tweet volume was approximately 0.21, exhibiting little change over that interval. The correlations then assumed an approximately normal-shaped distribution, reaching a maximum of 0.3 (approximately 40% increase) at an average of 2 days after each event, before returning to a stable correlation of roughly 0.25. The correlations between climate change and natural disasters news volumes were slightly higher, starting with an average baseline value of roughly 0.51 (20–30 days prior to the event). News volume correlations also assumed a roughly normal-shaped distribution, reaching a peak value of 0.65 (approximately 30% increase), 3 days after each event. Despite overall somewhat normal distributions, the average media volume correlations in the actual natural disaster date ranges themselves were low (relative to the overall distribution shape), likely as a result of specific outliers. Regardless, correlations between climate change and natural disasters news and tweet volumes increase dramatically around extreme weather events. The average correlations between climate change sentiment ratio (number of positive tweets/number of negative tweets) and natural disasters tweet volume were also calculated and exhibited a far steeper increase than the aforementioned plots: the starting, baseline correlation value was 0.02, growing to a peak of roughly 0.12 (500% increase) 2 days before each event, on average. Although these correlation values are extremely low, the drastic increases around major natural disasters assert the influence of extreme weather events on public attitudes toward climate change.
4) Corresponding increases between climate change and natural disaster media
As stated above, correlation values of climate change and natural disasters media volume were generally quite low, perhaps partly as a result of the two variables’ tendency to increase/decrease at different speeds (a slightly nonlinear relationship). To control for this and focus analysis solely on the relationship between increases in both variables, standard deviation and percent increases for climate change media volume were computed and compared with corresponding natural disaster media volume standard deviation increases, on a day-to-day basis. Correlations were then calculated for the relationships between these increases of both variables, yielding more meaningful values. When calculated for values in 5- and 15-day intervals around specific major natural disasters only, these correlations (between climate change tweet percent increases and natural disaster tweet standard deviation increases) were 0.907 and 0.914, respectively; overall correlation was 0.720 (Fig. 4 and Table 3). Climate change news volume percent and natural disaster news volume standard deviation increases exhibited similar high correlations of 0.909 and 0.910, for values within 5- and 15-day intervals of specific disasters, and 0.838 overall, for the entire study interval.
Table 3. Correlations and models for climate change media and natural disaster media volume increases. Correlation values were calculated between the specified metric for climate change–related media (news and tweet volume) and standard deviation increases in natural-disaster-specific news article and tweet volume. The linear expressions in the “model” columns represent the actual equations for the linear regression trends present in Fig. 4. The first two columns of values represent the tweet volume data, whereas the rightmost two columns represent values for news article volume data.
Linear regression models were fitted to the graphs of each of the six relationships between volume increases displayed in Fig. 4. Interestingly, each news model had roughly 2 times the slope of its tweet model counterpart: 7.86 vs 3.92, 6.04 vs 3.06, and 2.73 vs 1.33 for the 5-day, 15-day, and overall intervals, respectively. Across each of the figures, smaller intervals almost always resulted in higher model slopes: models for the 5-day intervals around specific disasters usually had the highest slopes, while the overall interval relationship trends had the lowest slopes across all analyses. In general, increases in natural disaster media volume, both overall and especially around specific major natural disasters, corresponded closely to increased climate change media volume for both tweets and news, with clear positive linear relationships emerging between the two parameters.
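The slopes compared above come from simple linear fits. As a reference for how such a slope and intercept are obtained, an ordinary least-squares fit can be sketched in plain Python as below; the study reports using a library implementation, so this is an illustration of the technique and not the study's code.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx  # undefined if all x values are identical
    return slope, my - slope * mx
```

In this setting `x` would hold natural disaster media standard deviation increases and `y` the corresponding climate change media increases, so the slope expresses the climate change increase associated with one additional standard deviation of natural disaster coverage.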
5) Direct attribution of elevated climate change media to extreme weather
Seventy-eight percent of climate change tweet maxima (at least 2 standard deviations above the monthly mean) corresponded to natural disasters news volume above the 50th percentile (Fig. 5). Similarly, over 50% of climate change tweet volume maxima corresponded with Q4 natural disasters news volume, and 26% with 90th percentile or higher natural disaster news volume. The relationship between these two variables is strong, with a clear negative linear trend (correlation of −0.992 and R squared of 0.985), further supporting the notion that climate change discussion tends to spike around natural disasters (indicated by high news volume, which is a less noisy factor than tweet volume). The associated model can be used to interpolate additional data points between the 50th and 90th percentiles of natural disaster news volume.
4. Discussion
a. Long-term trends and patterns
Public association of climate change with extreme weather disasters, quantified through tweets and other media, effectively doubled between 2010 and 2020. According to a Ball State University poll of 300 individuals across 43 states, approximately 85% of Americans are to some degree fearful of extreme weather events (Ransford and Strategist 2014). Thus, given Americans’ higher propensity for extreme weather fear than climate change fear (85% vs 60%), public association of the two goes a long way toward improving environmental awareness and desire to act on climate change. Furthermore, public belief in climate change grew between 2010 and 2020, likely because of the success of increased environmental awareness messaging. This also contributed toward a developing U.S. political landscape surrounding the importance of environmental issues. For example, in 2020, 60% of Americans were shown to be seriously worried about climate change, while that same metric was only 32% in 2010 (Kohut et al. 2010; Poushter and Fagan 2020). Alongside this increase in concern, 65% of Americans now believe the government is doing too little to combat climate change, a direct and measurable shift in attitude (Tyson and Kennedy 2020).
In accordance with long-term growing public association trends, this study found strong evidence to suggest that individual extreme weather events often directly relate to sizable increases in climate change tweet and news volume, a trend exacerbated for disasters that sparked greater general media conversation, as would be expected. Thus, natural disasters, especially prominent ones, clearly encourage more climate change discussion. Similarly, around specific extreme weather events, the number of dual-category (climate change and natural disaster) tweets and news articles seems to increase more than general climate change media volume, corroborating the direct connection between these natural disasters and increased climate change discussion.
However, the ratio of positive (belief in anthropogenic global warming) to negative (skeptical view of anthropogenic global warming) climate change tweets tended to decrease around specific natural disasters. Based on these results, extreme weather events seem to elicit more public media input from those who do not accept global warming than from those who support environmental action, suggesting that many remain unconvinced of the relationship between climate change and natural disasters. These findings represent a worrisome trend: although natural disasters encourage greater discussion of global warming, public messaging has yet to convince the general population of both 1) the broad importance of the climate crisis at large and 2) its inextricable link to increasing extreme weather events. Thus, ongoing environmental awareness campaigns seem to have missed an opportunity to leverage both increased attention on global warming and fear of natural disasters to drive support for greater climate action in the United States.
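As a minimal illustration of the sentiment ratio discussed above, the following Python sketch compares the positive-to-negative tweet ratio in a baseline window with the ratio in a window around a disaster. The toy tweet labels, the window lengths, and the function name `sentiment_ratio` are assumptions made purely for illustration and do not reproduce the study's data or pipeline; only the event date (Hurricane Ida's landfall) comes from the text.

```python
from datetime import date, timedelta


def sentiment_ratio(tweets, start, end):
    """Positive-to-negative tweet ratio for start <= day <= end (inclusive)."""
    in_window = [label for day, label in tweets if start <= day <= end]
    positive = in_window.count("positive")
    negative = in_window.count("negative")
    # Return NaN rather than dividing by zero when no negative tweets exist.
    return positive / negative if negative else float("nan")


# Landfall date of Hurricane Ida, used here only as an example anchor.
event_day = date(2021, 8, 29)

# Hypothetical labeled tweets as (day, label) pairs; NOT real study data.
tweets = []
for offset in (-10, -9, -8):  # quiet days well before the event
    day = event_day + timedelta(days=offset)
    tweets += [(day, "positive")] * 3 + [(day, "negative")]
for offset in (-1, 0, 1):  # days surrounding the event
    day = event_day + timedelta(days=offset)
    tweets += [(day, "positive")] * 4 + [(day, "negative")] * 2

# Assumed windows: 14 to 8 days before the event vs. 3 days either side.
baseline_ratio = sentiment_ratio(
    tweets, event_day - timedelta(days=14), event_day - timedelta(days=8)
)
event_ratio = sentiment_ratio(
    tweets, event_day - timedelta(days=3), event_day + timedelta(days=3)
)

print("baseline ratio:", baseline_ratio)
print("event ratio:", event_ratio)
```

In this toy example both tweet volume and the number of positive tweets rise around the event, yet the ratio falls because negative tweets grow faster, which is the pattern the paragraph above describes.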
b. Interpreting correlations between media volume and sentiment
Further study results suggest that correlations between natural disaster media volume and climate change media volume tend to increase around individual natural disasters. These findings are likely a result of extreme weather events creating concrete context for environmental discussion. This closer relationship between climate change and natural disaster media volume around specific disasters is therefore consistent with the increases in dual-category media during extreme weather events. Similarly, changes in the climate change sentiment ratio proved to be more correlated with natural disaster tweet volume around specific events, albeit still with low correlation values. This aspect of the study thereby suggests that the volume and content of climate change discussion are shaped considerably by natural disasters and their accompanying media coverage (both through published articles and posted tweets).
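The comparison described above, correlation over a full period versus correlation restricted to the days around an event, can be sketched as follows. This is a hedged illustration only: the daily counts are invented toy values, the event window is an arbitrary assumption, and the hand-written `pearson` helper stands in for whatever correlation routine the study actually used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined when either series is constant
    return cov / (spread_x * spread_y)


# Hypothetical daily media counts over 12 days; NOT real study data.
disaster_volume = [10, 12, 9, 11, 30, 60, 45, 25, 14, 10, 12, 9]
climate_volume = [20, 18, 22, 19, 25, 40, 33, 24, 21, 22, 18, 21]

# Assumed event window: days 4 through 8 (0-indexed, inclusive).
event_window = slice(4, 9)

r_all = pearson(disaster_volume, climate_volume)
r_event = pearson(disaster_volume[event_window], climate_volume[event_window])

print(f"whole period: r = {r_all:.3f}")
print(f"event window: r = {r_event:.3f}")
```

With real data, the same comparison would be repeated for each disaster, and a short window contains few points, so a high coefficient inside a single window should be read cautiously.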
c. Limitations
The study has several limitations, including those that surround the role of social media in society. Twitter posts alone are, at best, an incomplete summary of public sentiment in the United States and, at worst, potentially biased toward extreme views because of voluntary response bias. However, the research of Kirilenko et al. (2015), Yeo et al. (2017), and others has already demonstrated the validity of using Twitter data in this context. Furthermore, individuals who write about climate change on a platform like Twitter are perhaps ultimately more influential in discussing and shifting sentiment about the topic because their opinions are publicly advertised. Although this study’s separate analyses of Twitter posts and traditional news articles yielded similar results, further investigation into the relationship between these forms of media may yield additional insight (Anderson 2009). The use of Twitter and traditional media from the entire United States without differentiation by location may also overgeneralize results, ignoring the nuances of public sentiment in specific geographic regions of the country, while also failing to address the impact of extreme weather events on climate sentiment in other countries. Last, additional newsworthy events related to climate change that do not involve extreme weather could have affected the results of this study if they happened to occur around the same time as a given natural disaster.
d. Key takeaways and lessons for the future
Despite clear increases in correlations between natural disaster and climate change media volume around specific extreme weather events, these correlations were still too low to draw significant conclusions. However, when comparing corresponding increased levels of natural disaster and climate change media volume directly, much greater correlation coefficients emerge. These results suggest a strong positive relationship between the two variables’ increases, especially around individual natural disaster events. This highlights the importance of individuals and media organizations discussing extreme weather when trying to raise climate awareness.
Unfortunately, climate change sentiment ratios exhibited the opposite effect, with strong negative relationships: as natural disaster media volume increased, the relative share of positive climate change media tended to decrease, corroborating previously discussed findings. Thus, public messaging should focus on amplifying understanding of climate change’s relationship to individual natural disasters, to convert the increased climate change media discussion sparked by these weather events (and the innate fear of them) into greater support for actively combating climate change in the future. The relationship between climate change and natural disaster coverage in social and conventional media therefore represents an opportunity to encourage broader public support for solving climate change going forward.
Thus, climate messaging strategies must change to incorporate an even greater focus on the relationship between climate change and extreme weather to facilitate broader public consensus on anthropogenic global warming in the United States.
5. Conclusions
Natural disasters and their corresponding media discussion led to evident increases in climate change tweet and news article volumes, as predicted. However, per this study’s analyses, natural disasters generally seem to amplify climate skepticism as opposed to inciting more support for environmental action. Thus, public campaigns should focus on explaining the critical connections between extreme weather and climate change to best strengthen environmental awareness in the United States.
Acknowledgments.
The authors thank David Wilcox and George Negroponte for their mentorship.
Data availability statement.
All data generated for this study were collected directly by the authors, utilizing the APIs discussed. All other datasets used in this study were open source and cited.
APPENDIX
Additional Metrics and Statistics
Table A1 provides media overview metrics by year. Table A2 provides detailed media frequency statistics. Table A3 provides additional natural disaster overview metrics.
Table A1. Detailed yearly volume and sentiment basic overview metrics.
Table A2. Detailed time-interval volume overview metrics.
Table A3. Detailed overview metrics on NOAA dataset natural disasters. Costs are in millions of U.S. dollars.
REFERENCES
Al-Saqaf, W., and P. Berglez, 2019: How do social media users link different types of extreme events to climate change? A study of Twitter during 2008–2017. J. Extreme Events, 6, 1950002, https://doi.org/10.1142/S2345737619500027.
Anderson, A., 2009: Media, politics and climate change: Towards a new research agenda. Sociol. Compass, 3, 166–182, https://doi.org/10.1111/j.1751-9020.2008.00188.x.
Anderson, D., 2022: wordninja: Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies. GitHub, accessed 21 March 2022, https://github.com/keredson/wordninja.
Berglez, P., and W. Al-Saqaf, 2021: Extreme weather and climate change: Social media results, 2008–2017. Environ. Hazards, 20, 382–399, https://doi.org/10.1080/17477891.2020.1829532.
Cho, K., B. van Merrienboer, D. Bahdanau, and Y. Bengio, 2014: On the properties of neural machine translation: Encoder–decoder approaches. Proc. Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, Doha, Qatar, Association for Computational Linguistics, 103–111, https://doi.org/10.3115/v1/W14-4012.
Cody, E. M., A. J. Reagan, L. Mitchell, P. S. Dodds, and C. M. Danforth, 2015: Climate change sentiment on Twitter: An unsolicited public opinion poll. PLOS ONE, 10, e0136092, https://doi.org/10.1371/journal.pone.0136092.
CrowdFlower, 2016: Sentiment of climate change. Data.World, accessed 21 March 2022, https://data.world/crowdflower/sentiment-of-climate-change.
EDSA, 2020: Climate change belief analysis. Kaggle, accessed 21 March 2022, https://kaggle.com/c/edsa-climate-change-belief-analysis.
Han, S., 2022: Googletrans: Free Google Translate API for Python. GitHub, accessed 21 March 2022, https://github.com/ssut/py-googletrans.
Hanchey, A., A. Schnall, T. Bayleyegn, S. Jiva, A. Khan, V. Siegel, R. Funk, and E. Svendsen, 2021: Notes from the field: Deaths related to Hurricane Ida reported by media—Nine states. Morb. Mortality Wkly. Rep., 70, 1385–1386, https://doi.org/10.15585/mmwr.mm7039a3.
Hochreiter, S., and J. Schmidhuber, 1997: Long short-term memory. Neural Comput., 9, 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.
IPCC, 2012: Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation. C. B. Field et al., Eds., Cambridge University Press, 582 pp., https://www.ipcc.ch/site/assets/uploads/2018/03/SREX_Full_Report-1.pdf.
JustAnotherArchivist, 2022: Snscrape. GitHub, accessed 20 December 2021, https://github.com/JustAnotherArchivist/snscrape.
Kabaghe, C., and J. Qin, 2018: Classifying tweets based on climate change stance natural language processing. Stanford University Rep., 6 pp., http://cs229.stanford.edu/proj2019spr/report/80.pdf.
Keras Team, 2022: Keras documentation: Keras layers API. Accessed 21 March 2022, https://keras.io/api/layers/.
Khattak, A. M., R. Batool, F. A. Satti, J. Hussain, W. A. Khan, A. M. Khan, and B. Hayat, 2020: Tweets classification and sentiment analysis for personalized tweets recommendation. Complexity, 2020, 8892552, https://doi.org/10.1155/2020/8892552.
Kirilenko, A. P., T. Molodtsova, and S. O. Stepchenkova, 2015: People as sensors: Mass media and local temperatures influence climate change discussion on Twitter. Global Environ. Change, 30, 92–100, https://doi.org/10.1016/j.gloenvcha.2014.11.003.
Kohut, A., C. Doherty, M. Dimock, and S. Keeter, 2010: Little change in opinions about global warming. Pew Research Center, https://www.pewresearch.org/politics/2010/10/27/about-the-survey-322/.
Matplotlib, 2022: Visualization with Python. Accessed 20 March 2022, https://matplotlib.org/.
Mohamad Sham, N., and A. Mohamed, 2022: Climate change sentiment analysis using lexicon, machine learning and hybrid approaches. Sustainability, 14, 4723, https://doi.org/10.3390/su14084723.
Natural Language Toolkit, 2022: Sample usage for wordnet. Accessed 21 March 2022, https://www.nltk.org/howto/wordnet.html.
Pandas, 2022: Python data analysis library. Accessed 20 March 2022, https://pandas.pydata.org/.
Pennington, J., R. Socher, and C. D. Manning, 2014: GloVe: Global vectors for word representation. Accessed 21 March 2022, https://nlp.stanford.edu/projects/glove/.
Poushter, J., and M. Fagan, 2020: Americans see spread of disease as top international threat, along with terrorism, nuclear weapons, cyberattacks. Pew Research Center’s Global Attitudes Project, https://www.pewresearch.org/global/2020/04/13/americans-see-spread-of-disease-as-top-international-threat-along-with-terrorism-nuclear-weapons-cyberattacks/.
Ransford, M., and S. C. Strategist, 2014: Scared of storms? One in 10 Americans may suffer from severe weather phobia. Ball State University, accessed 24 March 2022, https://www.bsu.edu/news/press-center/archives/2014/10/one-in-10-suffer-severe-weather-phobia.
Sanh, V., L. Debut, J. Chaumond, and T. Wolf, 2020: DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv, 1910.01108v4, https://doi.org/10.48550/arXiv.1910.01108.
Sauerborn, R., and K. Ebi, 2012: Climate change and natural disasters—Integrating science and practice to protect health. Global Health Action, 5, 1–7, https://doi.org/10.3402/gha.v5i0.19295.
scikit-learn, 2022: Machine learning in python—Scikit-learn 1.0.2 documentation. Accessed 20 March 2022, https://scikit-learn.org/stable/.
Smith, A. B., 2020: U.S. billion-dollar weather and climate disasters, 1980–present. NOAA National Centers for Environmental Information, https://doi.org/10.25921/STKW-7W73.
SymSpellPy, 2022: A symspell python port—SymSpellPy 6.7.6 documentation. Accessed 21 March 2022, https://symspellpy.readthedocs.io/en/latest/.
Tyson, A., and B. Kennedy, 2020: Two-thirds of Americans think government should do more on climate: Bipartisan backing for carbon capture tax credits, extensive tree-planting efforts. Pew Research Center, https://www.pewresearch.org/science/2020/06/23/two-thirds-of-americans-think-government-should-do-more-on-climate/.
van Kooten, P., 2022: Contractions: Fixes contractions such as ‘you’re’ to ‘you are’. GitHub, accessed 21 March 2022, https://github.com/kootenpv/contractions.
Weischedel, R., and Coauthors, 2022: English. spaCy, accessed 21 March 2022, https://spacy.io/models/en/.
Yadav, T., 2014: glove.twitter.27b.200d.txt. Accessed 21 March 2022, https://kaggle.com/datasets/fullmetal26/glovetwitter27b100dtxt.
Yeo, S. K., Z. Handlos, A. Karambelas, L. Y.-F. Su, K. M. Rose, D. Brossard, and K. Griffin, 2017: The influence of temperature on #ClimateChange and #GlobalWarming discourses on Twitter. J. Sci. Commun., 16, A01, https://doi.org/10.22323/2.16050201.