Performance of Decision-Tree-Based Ensemble Classifiers in Predicting Fog Frequency in Ungauged Areas

Daeha Kim, Department of Civil Engineering, Jeonbuk National University, Jeonju-si, Jeollabuk-do, South Korea

https://orcid.org/0000-0001-8478-1278
Eunhee Kim, Department of Civil Engineering, Jeonbuk National University, Jeonju-si, Jeollabuk-do, South Korea

Eunji Kim, Department of Civil Engineering, Jeonbuk National University, Jeonju-si, Jeollabuk-do, South Korea


Abstract

Fog is a phenomenon that exerts significant impacts on transportation, aviation, air quality, agriculture, and even water resources. While data-driven machine learning algorithms have shown promising performance in capturing nonlinear fog events at point locations, their applicability to different areas and time periods is questionable. This study addresses this issue by examining five decision-tree-based classifiers in a South Korean region, where diverse fog formation mechanisms are at play. The five machine learning algorithms were trained at point locations and tested with other point locations for time periods independent of the training processes. Using the ensemble classifiers and high-resolution atmospheric reanalysis data, we also attempted to establish fog occurrence maps in a regional area. Results showed that machine learning models trained on the local datasets exhibited superior performance in mountainous areas, where radiative cooling predominantly contributes to fog formation, compared to inland and coastal regions. As the fog generation mechanisms diversified, the tree-based ensemble models appeared to encounter challenges in delineating their decision boundaries. When they were trained with the reanalysis data, their predictive skills were significantly decreased, resulting in high false alarm rates. This prompted the need for postprocessing techniques to rectify overestimated fog frequency. While postprocessing may ameliorate overestimation, caution is needed to interpret the resultant fog frequency estimates, especially in regions with more diverse fog generation mechanisms. The spatial upscaling of machine learning–based fog prediction models poses challenges owing to the intricate interplay of various fog formation mechanisms, data imbalances, and potential inaccuracies in reanalysis data.
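The abstract refers to high false alarm rates and overestimated fog frequency without defining the scores used. As an illustrative sketch only, the snippet below computes three standard categorical verification scores from binary fog observations and predictions: probability of detection, false alarm ratio, and frequency bias. The choice of these scores, the function names, and the sample data are all assumptions made for illustration; they are not taken from the study.

```python
from dataclasses import dataclass
from math import nan


@dataclass
class Contingency:
    hits: int               # fog predicted and observed
    false_alarms: int       # fog predicted but not observed
    misses: int             # fog observed but not predicted
    correct_negatives: int  # no fog predicted, none observed


def tally(observed, predicted):
    """Count the four outcomes of a binary (fog / no fog) prediction."""
    h = fa = m = cn = 0
    for o, p in zip(observed, predicted):
        if p and o:
            h += 1
        elif p:
            fa += 1
        elif o:
            m += 1
        else:
            cn += 1
    return Contingency(h, fa, m, cn)


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the score is undefined."""
    return numerator / denominator if denominator else nan


def scores(c):
    """Standard categorical scores (assumed here, not taken from the paper)."""
    return {
        # Fraction of observed fog events that were predicted.
        "pod": ratio(c.hits, c.hits + c.misses),
        # Fraction of predicted fog events that did not occur.
        "far": ratio(c.false_alarms, c.hits + c.false_alarms),
        # Predicted fog frequency relative to observed; > 1 means overestimation.
        "bias": ratio(c.hits + c.false_alarms, c.hits + c.misses),
    }


# Hypothetical hourly labels: 1 = fog, 0 = no fog.
observed = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

table = tally(observed, predicted)
result = scores(table)
print(table)
print(result)
```

In this made-up sample every observed fog hour is detected, yet the classifier predicts fog two and a half times as often as it occurs. That is the pattern of overestimation which, according to the abstract, motivated postprocessing.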

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Daeha Kim, daeha.kim@jbnu.ac.kr

Supplementary Materials

    • Supplemental Materials (PDF 0.5907 MB)