Measuring Sharpness of AI-Generated Meteorological Imagery

Imme Ebert-Uphoff a Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA
b Electrical and Computer Engineering, Colorado State University, Fort Collins, CO, USA

Lander Ver Hoef a Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA

John S. Schreck c NSF National Center for Atmospheric Research, Boulder, CO, USA

Jason Stock d Computer Science, Colorado State University, Fort Collins, CO, USA

Maria J. Molina e Department of Atmospheric and Oceanic Science, University of Maryland, College Park, MD, USA
c NSF National Center for Atmospheric Research, Boulder, CO, USA

Amy McGovern f School of Computer Science and School of Meteorology, University of Oklahoma, Norman, OK, USA

Michael Yu f School of Computer Science and School of Meteorology, University of Oklahoma, Norman, OK, USA

Bill Petzke c NSF National Center for Atmospheric Research, Boulder, CO, USA

Kyle Hilburn a Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA

David M. Hall g NVIDIA, Santa Clara, CA

David John Gagne II c NSF National Center for Atmospheric Research, Boulder, CO, USA

William F. Campbell h U.S. Naval Research Laboratory, Marine Meteorology Division, Monterey, California

Jacob T. Radford a Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA
i Global Systems Laboratory, Oceanic and Atmospheric Research, National Oceanic and Atmospheric Administration, Boulder, Colorado, USA

Jebb Q. Stewart i Global Systems Laboratory, Oceanic and Atmospheric Research, National Oceanic and Atmospheric Administration, Boulder, Colorado, USA

Sam Scheuerman j Mathematics, Colorado State University, Fort Collins, CO, USA

Open access

Abstract

AI-based algorithms are emerging in many meteorological applications that produce imagery as output, including global weather forecasting models. However, the imagery produced by AI algorithms, especially by convolutional neural networks (CNNs), is often described as too blurry to look realistic, partly because CNNs tend to represent uncertainty as blurriness. This blurriness can be undesirable since it might obscure important meteorological features. More complex AI models, such as generative AI models, produce images that appear to be sharper. However, improved sharpness may come at the expense of a decline in other performance criteria, such as standard forecast verification metrics. To navigate any trade-off between sharpness and other performance metrics, it is important to quantitatively assess those other metrics along with sharpness. While there is a rich set of forecast verification metrics available for meteorological images, none of them focuses on sharpness. This paper seeks to fill this gap by 1) exploring a variety of sharpness metrics from other fields, 2) evaluating properties of these metrics, 3) proposing the new concept of Gaussian Blur Equivalence as a tool for their uniform interpretation, and 4) demonstrating their use for sample meteorological applications, including a CNN that emulates radar imagery from satellite imagery (GREMLIN) and an AI-based global weather forecasting model (GraphCast).

© 2025 American Meteorological Society. This is an Author Accepted Manuscript distributed under the terms of the default AMS reuse license. For information regarding reuse and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Imme Ebert-Uphoff, iebert@colostate.edu
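
The abstract names Gaussian Blur Equivalence as a way to interpret sharpness metrics uniformly but does not define it here. The sketch below shows one plausible reading, offered only as an illustration and not as the authors' method: a sharpness value is translated into the standard deviation (sigma, in pixels) of the Gaussian blur that would have to be applied to a sharp reference image to produce the same value. The metric (`mean_squared_gradient`), the function names, the edge handling, and the bisection search are all assumptions made for this example, and the search assumes the metric decreases as blur increases.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 4 sigma."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, math.ceil(4 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(img, sigma):
    """Separable Gaussian blur of a 2-D list of floats (edges replicated)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(k[j + r] * row[clamp(x + j, w)] for j in range(-r, r + 1))
             for x in range(w)] for row in img]
    return [[sum(k[j + r] * rows[clamp(y + j, h)][x] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def mean_squared_gradient(img):
    """Illustrative sharpness metric: mean squared forward-difference gradient."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            total += gx * gx + gy * gy
    return total / ((h - 1) * (w - 1))

def blur_equivalent_sigma(reference, target_value, metric=mean_squared_gradient,
                          lo=0.0, hi=8.0, iters=30):
    """Bisect for the sigma at which metric(blur(reference, sigma)) matches
    target_value. Assumes the metric is non-increasing in sigma (hypothetical
    helper, not taken from the paper)."""
    if target_value >= metric(reference):
        return 0.0  # target is at least as sharp as the reference
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if metric(blur(reference, mid)) > target_value:
            lo = mid  # still too sharp: more blur is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Synthetic checkerboard standing in for a sharp reference field.
    ref = [[1.0 if ((x // 4) + (y // 4)) % 2 == 0 else 0.0 for x in range(24)]
           for y in range(24)]
    blurry = blur(ref, 1.5)  # stand-in for a blurry model output
    print(blur_equivalent_sigma(ref, mean_squared_gradient(blurry)))
```

Expressing a metric value as an equivalent sigma puts different metrics on a shared, physically interpretable scale (pixels of blur), which is the kind of uniform interpretation the abstract describes. A squared-gradient metric is used here because it shrinks steadily under blurring, which keeps the bisection well behaved.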
