Search Results

You are viewing items 1–10 of 15 for:

  • Model performance/evaluation
  • The 1st NOAA Workshop on Leveraging AI in the Exploitation of Satellite Earth Observations & Numerical Weather Prediction
  • All content
Ryan Lagerquist, Amy McGovern, Cameron R. Homeyer, David John Gagne II, and Travis Smith

are 1D with lower spatial resolution), which would present a major difficulty for non-ML-based postprocessing methods such as SSPF. The rest of this paper is organized as follows. Section 2 briefly describes the inner workings of CNNs [a more thorough description is provided in Lagerquist et al. (2019), hereafter L19], section 3 describes the input data and preprocessing, section 4 describes the experiments used to find the best CNNs, section 5 evaluates the performance of the best CNNs, and

Restricted access
Christina Kumler-Bonfanti, Jebb Stewart, David Hall, and Mark Govett

the model learns to fit the training data. The paper is organized as follows. In section 2, the U-Net architecture is described, as are the numerical metrics used to evaluate its success. Section 3 describes, both qualitatively and quantitatively, the design and performance of the best U-Net model obtained for identifying tropical cyclones using the Global Forecast System (GFS) total precipitable water field as input. In section 4, three additional U-Net models are introduced that identify

Restricted access
John L. Cintineo, Michael J. Pavolonis, Justin M. Sieglaff, Anthony Wimmers, Jason Brunner, and Willard Bellon

trained model, which is useful in selecting hyperparameters (see section 2d). However, by choosing hyperparameter values that optimize performance on the validation set, the hyperparameters can be overfit to the validation set, just as model weights (those adjusted by training) can be overfit to the training set. Thus, the selected model is also evaluated on the testing set, which is independent of the data used to fit both the model weights and the hyperparameters.

c. Model architecture

CNNs use a
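The three-way split described in this snippet can be sketched with a toy example: weights are fit on the training set, a hyperparameter is chosen on the validation set, and the final score comes from the untouched test set. Polynomial degree stands in as the hyperparameter here; the data and split sizes are illustrative, not from the paper.

```python
import numpy as np

def select_degree(x_train, y_train, x_val, y_val, degrees):
    """Pick the polynomial degree (a hyperparameter) that minimizes
    validation MSE; the weights (coefficients) are fit on training data only."""
    best_deg, best_mse, best_coefs = None, np.inf, None
    for deg in degrees:
        coefs = np.polyfit(x_train, y_train, deg)   # fit weights on the training set
        val_mse = np.mean((np.polyval(coefs, x_val) - y_val) ** 2)
        if val_mse < best_mse:
            best_deg, best_mse, best_coefs = deg, val_mse, coefs
    return best_deg, best_coefs

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 300)
y = 2.0 * x + 0.1 * rng.normal(size=300)          # true relation is linear plus noise
x_tr, y_tr = x[:100], y[:100]                     # training: fits weights
x_va, y_va = x[100:200], y[100:200]               # validation: selects the hyperparameter
x_te, y_te = x[200:], y[200:]                     # testing: independent final evaluation
deg, coefs = select_degree(x_tr, y_tr, x_va, y_va, degrees=range(1, 8))
test_mse = np.mean((np.polyval(coefs, x_te) - y_te) ** 2)
```

Because the test set influences neither the weights nor the degree, `test_mse` is an honest estimate of generalization error, which is the point the snippet makes.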

Restricted access
Kyle A. Hilburn, Imme Ebert-Uphoff, and Steven D. Miller

values. Performance is evaluated using metrics including the mean-square error (MSE), coefficient of determination (R²), categorical metrics (probability of detection, false-alarm rate, critical success index, and categorical bias) at various output threshold levels, and evaluation of the root-mean-square difference (RMSD) binned over the range of true output values. A potential disadvantage of ML is that it is statistically based, making it harder to interpret. So, besides producing a trained and
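The categorical metrics listed above all derive from a 2×2 contingency table obtained by thresholding the continuous fields. A minimal sketch (toy values, not the paper's data; note that "false-alarm rate" is computed here as the false-alarm ratio, FA/(hits+FA), one of the two common conventions):

```python
import numpy as np

def categorical_scores(truth, pred, threshold):
    """POD, false-alarm ratio, CSI, and categorical bias from a 2x2
    contingency table built by thresholding truth and prediction fields."""
    obs = truth >= threshold
    fcst = pred >= threshold
    hits = np.sum(fcst & obs)
    false_alarms = np.sum(fcst & ~obs)
    misses = np.sum(~fcst & obs)
    pod = hits / (hits + misses)                   # probability of detection
    far = false_alarms / (hits + false_alarms)     # false-alarm ratio
    csi = hits / (hits + false_alarms + misses)    # critical success index
    bias = (hits + false_alarms) / (hits + misses) # categorical (frequency) bias
    return pod, far, csi, bias

truth = np.array([0.1, 0.6, 0.8, 0.2, 0.9, 0.4])
pred = np.array([0.2, 0.7, 0.3, 0.6, 0.8, 0.1])
pod, far, csi, bias = categorical_scores(truth, pred, threshold=0.5)

# The continuous metrics from the same snippet:
mse = np.mean((pred - truth) ** 2)
r2 = 1.0 - np.sum((truth - pred) ** 2) / np.sum((truth - truth.mean()) ** 2)
```

Sweeping `threshold` over a range of output levels, as the snippet describes, yields one set of categorical scores per level.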

Open access
Noah D. Brenowitz, Tom Beucler, Michael Pritchard, and Christopher S. Bretherton

deep convective self-aggregation above uniform SST. J. Atmos. Sci., 62, 4273–4292, doi:10.1175/JAS3614.1.

Chevallier, F., and J.-F. Mahfouf, 2001: Evaluation of the Jacobians of infrared radiation models for variational data assimilation. J. Appl. Meteor., 40, 1445–1461, doi:10.1175/1520-0450(2001)040<1445:EOTJOI>2.0.CO;2.

Chevallier, F., F. Chéruy, N. A. Scott, and

Restricted access
Andrew E. Mercer, Alexandria D. Grimes, and Kimberly M. Wood

doi:10.1175/2009WAF2222280.1.

Kaplan, J., and Coauthors, 2015: Evaluating environmental impacts on tropical cyclone rapid intensification predictability utilizing statistical models. Wea. Forecasting, 30, 1374–1396, doi:10.1175/WAF-D-15-0032.1.

Karpatne, A., I. Ebert-Uphoff, S. Ravela, H. A. Babaie, and V. Kumar, 2018: Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng., 31, 1544–1554

Restricted access
Hanoi Medina, Di Tian, Fabio R. Marin, and Giovanni B. Chirico

global NWP models (Bauer et al. 2015). The representation of these processes is especially challenging over continental areas of the Southern Hemisphere, where abundant vegetation and sparse observations for evaluation and data assimilation have limited model accuracy. Recent progress in forecasting tropical convection (Bechtold et al. 2014; Subramanian et al. 2017) and the increasing quantity and quality of global information encourage the use of NWP for tropical precipitation

Full access
Dan Lu, Goutam Konapala, Scott L. Painter, Shih-Chieh Kao, and Sudershan Gangrade


Hydrologic predictions at rural watersheds are important but also challenging due to data shortage. Long Short-Term Memory (LSTM) networks are a promising machine learning approach and have demonstrated good performance in streamflow prediction. However, because of their data-hungry nature, most LSTM applications have focused on well-monitored catchments with abundant, high-quality observations. In this work, we investigate the predictive capabilities of LSTM in poorly monitored watersheds with short observation records. To address three main challenges of LSTM applications in data-scarce locations, i.e., overfitting, uncertainty quantification (UQ), and out-of-distribution prediction, we evaluate different regularization techniques to prevent overfitting, apply a Bayesian LSTM for UQ, and introduce a physics-informed hybrid LSTM to enhance out-of-distribution prediction. Through case studies in two diverse sets of catchments, with and without snow influence, we demonstrate that (1) when hydrologic variability in the prediction period is similar to that in the calibration period, LSTM models can reasonably predict daily streamflow with Nash–Sutcliffe efficiency above 0.8, even with only two years of calibration data; (2) when the hydrologic variability in the prediction and calibration periods is dramatically different, LSTM alone does not predict well, but the hybrid model can improve out-of-distribution prediction with acceptable generalization accuracy; (3) the L2-norm penalty and dropout can mitigate overfitting, and the Bayesian and hybrid LSTMs show no overfitting; and (4) the Bayesian LSTM provides useful uncertainty information that improves prediction understanding and credibility. These insights have vital implications for streamflow simulation in watersheds where data quality and availability are critical issues.
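The Nash–Sutcliffe efficiency used as the skill threshold in this abstract has a compact definition: 1 minus the ratio of the model's squared error to the variance of the observations about their mean. A minimal sketch with made-up streamflow values (not the study's data):

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 means the model
    is no better than predicting the mean of the observations; values above
    about 0.8, as in the abstract, indicate a good daily-streamflow fit."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return 1.0 - (np.sum((observed - simulated) ** 2)
                  / np.sum((observed - observed.mean()) ** 2))

obs = np.array([3.0, 5.0, 8.0, 6.0, 4.0])   # hypothetical observed streamflow
sim = np.array([2.8, 5.4, 7.5, 6.2, 4.1])   # hypothetical simulated streamflow
score = nse(obs, sim)
```

Unlike MSE, NSE is scale-free, which is why it is the standard skill score for comparing streamflow models across catchments with very different flow magnitudes.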

Restricted access
Imme Ebert-Uphoff and Kyle Hilburn

, by trying different sets of hyperparameters, training a complete model for each set, evaluating the resulting model, and then deciding which hyperparameter set yields the best performance. Algorithms range from simple exhaustive grid search (as illustrated in the "Using performance measures for NN tuning" section) to sophisticated algorithms (Kasim et al. 2020; Hertel et al. 2020).

Sample application: Image-to-image translation from GOES to MRMS.

We demonstrate many of the concepts in this
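The exhaustive grid search mentioned in this snippet can be sketched in a few lines: enumerate every combination of hyperparameter values, train-and-evaluate one complete model per combination, and keep the best. The `toy_eval` function and its parameter names are hypothetical stand-ins for a real training run.

```python
import itertools
import numpy as np

def grid_search(train_and_eval, grid):
    """Exhaustive grid search: train one complete model per hyperparameter
    combination and keep the combination with the best (lowest) score."""
    best_params, best_score = None, np.inf
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = train_and_eval(params)   # in practice: train a model, return validation loss
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_eval(params):
    """Hypothetical stand-in for 'train a model, return its validation loss';
    its minimum sits at learning_rate=0.01, depth=3 by construction."""
    return (params["learning_rate"] - 0.01) ** 2 + (params["depth"] - 3) ** 2

grid = {"learning_rate": [0.001, 0.01, 0.1], "depth": [2, 3, 4]}
best, score = grid_search(toy_eval, grid)
```

The cost grows multiplicatively with each added hyperparameter, which is why the snippet points to more sophisticated search algorithms for larger problems.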

Full access
Anthony Wimmers, Christopher Velden, and Joshua H. Cossuth

However, it is difficult to generalize this difference because of the small sample size for category 5. Overall, the improvement is enough to justify limiting the remaining model evaluation to only the two-channel version of DeepMicroNet going forward.

Fig. 5. (a) Intensity error (RMSE) according to best track MSW for the three model versions labeled in the legend, and (b) average standard deviation of the PDFs according to best track MSW.

b. Model performance

The following describes a two
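The per-category RMSE comparison behind Fig. 5a can be sketched as a simple grouped computation; the error values and category labels below are made up for illustration, not taken from the paper.

```python
import numpy as np

def rmse_by_category(errors, categories):
    """Root-mean-square error of intensity errors, binned by storm category
    (the paper bins by best track MSW)."""
    out = {}
    for cat in np.unique(categories):
        e = errors[categories == cat]
        out[int(cat)] = float(np.sqrt(np.mean(e ** 2)))
    return out

errors = np.array([3.0, -4.0, 5.0, -12.0, 9.0])  # hypothetical intensity errors (kt)
cats = np.array([1, 1, 1, 5, 5])                 # hypothetical category labels
scores = rmse_by_category(errors, cats)
```

With only two samples in category 5, the binned RMSE is dominated by individual cases, which mirrors the small-sample caveat the snippet raises.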

Full access