• Al Banna, M. H., K. A. Taher, M. S. Kaiser, M. Mahmud, M. S. Rahman, A. S. Hosen, and G. H. Cho, 2020: Application of artificial intelligence in predicting earthquakes: State-of-the-art and future challenges. IEEE Access, 8, 192 880–192 923, https://doi.org/10.1109/ACCESS.2020.3029859.

  • Bari, D., and A. Ouagabi, 2020: Machine-learning regression applied to diagnose horizontal visibility from mesoscale NWP model forecasts. SN Appl. Sci., 2, 556, https://doi.org/10.1007/s42452-020-2327-x.

  • Benjamin, S. G., and Coauthors, 2004: Assimilation of METAR cloud and visibility observations in the RUC. 11th Conf. on Aviation, Range, Aerospace, Hyannis, MA, Amer. Meteor. Soc., 9.13, https://ams.confex.com/ams/11aram22sls/webprogram/Paper81992.html.

  • Bergstra, J., and Y. Bengio, 2012: Random search for hyper-parameter optimization. J. Mach. Learn. Res., 13, 281–305, http://www.jmlr.org/papers/v13/bergstra12a.html.

  • Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone, 2017: Classification and Regression Trees. Routledge, 368 pp., https://doi.org/10.1201/9781315139470.

  • Chen, T., T. He, M. Benesty, and XGBoost contributors, 2022: Package ‘xgboost.’ R Reference Document, 66 pp., https://cran.r-project.org/web/packages/xgboost/xgboost.pdf.

  • Clark, P. A., S. A. Harcourt, B. MacPherson, C. T. Mathison, S. Cusack, and M. Naylor, 2008: Prediction of visibility and aerosol within the operational Met Office Unified Model. I: Model formulation and variational assimilation. Quart. J. Roy. Meteor. Soc., 134, 1801–1816, https://doi.org/10.1002/qj.318.

  • Claxton, B. M., 2008: Using a neural network to benchmark a diagnostic parametrization: The Met Office’s visibility scheme. Quart. J. Roy. Meteor. Soc., 134, 1527–1537, https://doi.org/10.1002/qj.309.

  • Cornejo-Bueno, L., C. Casanova-Mateo, J. Sanz-Justo, E. Cerro-Prada, and S. Salcedo-Sanz, 2017: Efficient prediction of low-visibility events at airports using machine-learning regression. Bound.-Layer Meteor., 165, 349–370, https://doi.org/10.1007/s10546-017-0276-8.

  • Creighton, G., E. Kuchera, R. Adams-Selin, J. McCormick, S. Rentschler, and B. Wickard, 2014: AFWA diagnostics in WRF. University Corporation for Atmospheric Research, 17 pp., https://www2.mmm.ucar.edu/wrf/users/docs/AFWA_Diagnostics_in_WRF.pdf.

  • Deng, H., H. Tan, F. Li, M. Cai, P. W. Chan, H. Xu, X. Huang, and D. Wu, 2016: Impact of relative humidity on visibility degradation during a haze event: A case study. Sci. Total Environ., 569, 1149–1158, https://doi.org/10.1016/j.scitotenv.2016.06.190.

  • Dimitrova, R., A. Sharma, H. J. Fernando, I. Gultepe, V. Danchovski, S. Wagh, S. L. Bardoel, and S. Wang, 2021: Simulations of coastal fog in the Canadian Atlantic with the Weather Research and Forecasting Model. Bound.-Layer Meteor., 181, 443–472, https://doi.org/10.1007/s10546-021-00662-w.

  • Ding, J., and Coauthors, 2022: Forecast of hourly airport visibility based on artificial intelligence methods. Atmosphere, 13, 75, https://doi.org/10.3390/atmos13010075.

  • Doran, J. A., P. J. Roohr, D. J. Beberwyk, G. R. Brooks, G. A. Gayno, R. T. Williams, J. M. Lewis, and R. J. Lefevre, 1999: The MM5 at the Air Force Weather Agency-New products to support military operations. The Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 4.17, https://ams.confex.com/ams/99annual/abstracts/1125.html.

  • Doreswamy, N., K. S. Harishkumar, K. M. Yogesh, and G. Ibrahim, 2020: Forecasting air pollution particulate matter (PM2.5) using machine learning regression models. Procedia Comput. Sci., 171, 2057–2066, https://doi.org/10.1016/j.procs.2020.04.221.

  • Fita, L., J. Polcher, T. M. Giannaros, T. Lorenz, J. Milovac, G. Sofiadis, E. Katragkou, and S. Bastin, 2019: CORDEX-WRF v1.3: Development of a module for the Weather Research and Forecasting (WRF) Model to support the CORDEX community. Geosci. Model Dev., 12, 1029–1066, https://doi.org/10.5194/gmd-12-1029-2019.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 29, 1189–1232, https://doi.org/10.1214/aos/1013203451.

  • Gulia, S., S. S. Nagendra, M. Khare, and I. Khanna, 2015: Urban air quality management-A review. Atmos. Pollut. Res., 6, 286–304, https://doi.org/10.5094/APR.2015.033.

  • Guo, B., Y. Wang, X. Zhang, H. Che, J. Zhong, Y. Chu, and L. Cheng, 2020: Temporal and spatial variations of haze and fog and the characteristics of PM2.5 during heavy pollution episodes in China from 2013 to 2018. Atmos. Pollut. Res., 11, 1847–1856, https://doi.org/10.1016/j.apr.2020.07.019.

  • Haywood, J., and Coauthors, 2008: Prediction of visibility and aerosol within the operational Met Office Unified Model. II: Validation of model performance using observational data. Quart. J. Roy. Meteor. Soc., 134, 1817–1832, https://doi.org/10.1002/qj.275.

  • Huang, H., and G. Zhang, 2017: Case studies of low‐visibility forecasting in falling snow with WRF Model. J. Geophys. Res. Atmos., 122, 12862, https://doi.org/10.1002/2017JD026459.

  • Iizumi, T., H. Takikawa, Y. Hirabayashi, N. Hanasaki, and M. Nishimori, 2017: Contributions of different bias‐correction methods and reference meteorological forcing data sets to uncertainty in projected temperature and precipitation extremes. J. Geophys. Res. Atmos., 122, 7800–7819, https://doi.org/10.1002/2017JD026613.

  • IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp.

  • Jeong, U., J. Kim, H. Lee, and Y. G. Lee, 2017: Assessing the effect of long-range pollutant transportation on air quality in Seoul using the conditional potential source contribution function method. Atmos. Environ., 150, 33–44, https://doi.org/10.1016/j.atmosenv.2016.11.017.

  • Ke, G., Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. Y. Liu, 2017: LightGBM: A highly efficient gradient boosting decision tree. 31st Conf. on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, Curran Assoc., 3146–3154.

  • Kim, B. Y., and K. T. Lee, 2018: Radiation component calculation and energy budget analysis for the Korean Peninsula region. Remote Sens., 10, 1147, https://doi.org/10.3390/rs10071147.

  • Kim, B. Y., J. W. Cha, W. Jung, and A. Ko, 2020: Precipitation enhancement experiments in catchment areas of dams: Evaluation of water resource augmentation and economic benefits. Remote Sens., 12, 3730, https://doi.org/10.3390/rs12223730.

  • Kim, B. Y., J. W. Cha, and K. H. Chang, 2021a: Twenty-four-hour cloud cover calculation using a ground-based imager with machine learning. Atmos. Meas. Tech., 14, 6695–6710, https://doi.org/10.5194/amt-14-6695-2021.

  • Kim, B. Y., J. W. Cha, K. H. Chang, and C. Lee, 2021b: Visibility prediction over South Korea based on random forest. Atmosphere, 12, 552, https://doi.org/10.3390/atmos12050552.

  • Kim, B. Y., Y. K. Lim, and J. W. Cha, 2022a: Short-term prediction of particulate matter (PM10 and PM2.5) in Seoul, South Korea using tree-based machine learning algorithms. Atmos. Pollut. Res., 13, 101547, https://doi.org/10.1016/j.apr.2022.101547.

  • Kim, B. Y., J. W. Cha, K. H. Chang, and C. Lee, 2022b: Estimation of the visibility in Seoul, South Korea, based on particulate matter and weather data, using machine-learning algorithm. Aerosol Air Qual. Res., 22, 220125, https://doi.org/10.4209/aaqr.220125.

  • Kim, D. J., G. Kang, D. Y. Kim, and J. J. Kim, 2020: Characteristics of LDAPS-predicted surface wind speed and temperature at automated weather stations with different surrounding land cover and topography in Korea. Atmosphere, 11, 1224, https://doi.org/10.3390/atmos11111224.

  • Kim, M., K. Lee, and Y. H. Lee, 2020: Visibility data assimilation and prediction using an observation network in South Korea. Pure Appl. Geophys., 177, 1125–1141, https://doi.org/10.1007/s00024-019-02288-z.

  • Kim, Y. P., and G. Lee, 2018: Trend of air quality in Seoul: Policy and science. Aerosol Air Qual. Res., 18, 2141–2156, https://doi.org/10.4209/aaqr.2018.03.0081.

  • KMA, 2013: Utilization Guide of Numerical Weather Prediction Model Data for Activation of the Weather Industry. Korea Meteorological Administration, 62 pp.

  • Lee, J. Y., W. K. Jo, and H. H. Chun, 2014: Characteristics of atmospheric visibility and its relationship with air pollution in Korea. J. Environ. Qual., 43, 1519–1526, https://doi.org/10.2134/jeq2014.02.0066.

  • Li, D. D., D. X. Yu, Z. J. Qu, and S. H. Yu, 2020: Feature selection and model fusion approach for predicting urban macro travel time. Math. Probl. Eng., 2020, 6614920, https://doi.org/10.1155/2020/6614920.

  • Lu, H., and X. Ma, 2020: Hybrid decision tree-based machine learning models for short-term water quality prediction. Chemosphere, 249, 126169, https://doi.org/10.1016/j.chemosphere.2020.126169.

  • Ma, C. J., C. S. Lim, G. U. Kang, S. A. Jung, and M. R. Jo, 2020: Visibility degradation and its contributors at an urban site in Korea. Asian J. Atmos. Environ., 14, 335–344, https://doi.org/10.5572/ajae.2020.14.4.335.

  • Maurer, M., O. Klemm, H. L. Lokys, and N. H. Lin, 2019: Trends of fog and visibility in Taiwan: Climate change or air quality improvement? Aerosol Air Qual. Res., 19, 896–910, https://doi.org/10.4209/aaqr.2018.04.0152.

  • Meyer, D., E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C. C. Chang, and C. C. Lin, 2022: Package ‘e1071.’ R Reference Document, 67 pp., https://cran.r-project.org/web/packages/e1071/e1071.pdf.

  • Mittermaier, M. P., 2008: The potential impact of using persistence as a reference forecast on perceived forecast skill. Wea. Forecasting, 23, 1022–1031, https://doi.org/10.1175/2008WAF2007037.1.

  • Nagarajan, B., L. Delle Monache, J. P. Hacker, D. L. Rife, K. Searight, J. C. Knievel, and T. N. Nipen, 2015: An evaluation of analog-based postprocessing methods across several variables and forecast models. Wea. Forecasting, 30, 1623–1643, https://doi.org/10.1175/WAF-D-14-00081.1.

  • Park, D. H., S. W. Kim, M. H. Kim, H. Yeo, S. S. Park, T. Nishizawa, A. Shimizu, and C. H. Kim, 2021: Impacts of local versus long-range transported aerosols on PM10 concentrations in Seoul, Korea: An estimate based on 11-year PM10 and lidar observations. Sci. Total Environ., 750, 141739, https://doi.org/10.1016/j.scitotenv.2020.141739.

  • Petersen, C., and N. W. Nielsen, 2000: Diagnosis of visibility in DMI-HIRLAM. Danish Meteorological Institute, Scientific Rep. 00-11, 37 pp.

  • Qu, W. J., J. Wang, X. Y. Zhang, D. Wang, and L. F. Sheng, 2015: Influence of relative humidity on aerosol composition: Impacts on light extinction and visibility impairment at two sites in coastal area of China. Atmos. Res., 153, 500–511, https://doi.org/10.1016/j.atmosres.2014.10.009.

  • Rosa, J. P., D. J. Guerra, N. C. Horta, R. M. Martins, and N. C. Lourenço, 2020: Overview of artificial neural networks. Using Artificial Neural Networks for Analog Integrated Circuit Design Automation, Springer, 21–44, https://doi.org/10.1007/978-3-030-35743-6.

  • Singh, A., J. P. George, and G. R. Iyengar, 2018: Prediction of fog/visibility over India using NWP model. J. Earth Syst. Sci., 127, 26, https://doi.org/10.1007/s12040-018-0927-2.

  • Singh, A., W. R. Avis, and F. D. Pope, 2020: Visibility as a proxy for air quality in East Africa. Environ. Res. Lett., 15, 084002, https://doi.org/10.1088/1748-9326/ab8b12.

  • Song, H. J., B. Lim, and S. Joo, 2019: Evaluation of rainfall forecasts with heavy rain types in the high-resolution unified model over South Korea. Wea. Forecasting, 34, 1277–1293, https://doi.org/10.1175/WAF-D-18-0140.1.

  • Stoelinga, M. T., and T. T. Warner, 1999: Nonhydrostatic, mesobeta-scale model simulations of cloud ceiling and visibility for an East Coast winter precipitation event. J. Appl. Meteor. Climatol., 38, 385–404, https://doi.org/10.1175/1520-0450(1999)038<0385:NMSMSO>2.0.CO;2.

  • Sun, H., J. Wang, and W. Ye, 2021: A data augmentation-based evaluation system for regional direct economic losses of storm surge disasters. Int. J. Environ. Res. Public Health, 18, 2918, https://doi.org/10.3390/ijerph18062918.

  • Sun, X., T. Zhao, D. Liu, S. Gong, J. Xu, and X. Ma, 2020: Quantifying the influences of PM2.5 and relative humidity on change of atmospheric visibility over recent winters in an urban area of East China. Atmosphere, 11, 461, https://doi.org/10.3390/atmos11050461.

  • Thach, T. Q., C. M. Wong, K. P. Chan, Y. K. Chau, Y. N. Chung, C. Q. Ou, L. Yang, and A. J. Hedley, 2010: Daily visibility and mortality: Assessment of health benefits from improved visibility in Hong Kong. Environ. Res., 110, 617–623, https://doi.org/10.1016/j.envres.2010.05.005.

  • Vinutha, H. P., B. Poornima, and B. M. Sagar, 2018: Detection of outliers using interquartile range technique from intrusion dataset. Information and Decision Sciences: Advances in Intelligent Systems and Computing, S. Satapathy et al., Eds., Vol. 701, Springer, https://doi.org/10.1007/978-981-10-7563-6_53.

  • Wang, C., Z. Jia, Z. Yin, F. Liu, G. Lu, and J. Zheng, 2021: Improving the accuracy of subseasonal forecasting of China precipitation with a machine learning approach. Front. Earth Sci., 9, 659310, https://doi.org/10.3389/feart.2021.659310.

  • Wang, J., S. Lu, S. H. Wang, and Y. D. Zhang, 2021: A review on extreme learning machine. Multimedia Tools Appl., https://doi.org/10.1007/s11042-021-11007-7.

  • Whalley, J., and S. Zandi, 2016: Particulate matter sampling techniques and data modelling methods. Air Quality—Measurement and Modeling, P. Sallis, Ed., IntechOpen, 29–54, https://doi.org/10.5772/65054.

  • WMO, 2014: Guide to Meteorological Instruments and Methods of Observation. World Meteorological Organization, 1128 pp.

  • Won, W. S., R. Oh, W. Lee, K. Y. Kim, S. Ku, P. C. Su, and Y. J. Yoon, 2020: Impact of fine particulate matter on visibility at Incheon International Airport, South Korea. Aerosol Air Qual. Res., 20, 1048–1061, https://doi.org/10.4209/aaqr.2019.03.0106.

  • Wright, M. N., and A. Ziegler, 2017: Ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Software, 77, 1–17, https://doi.org/10.18637/jss.v077.i01.

  • Wright, M. N., S. Wager, and P. Probst, 2020: Package ‘ranger.’ R Reference Document, 28 pp., https://cran.r-project.org/web/packages/ranger/ranger.pdf.

  • Wu, J., C. Fu, L. Zhang, and J. Tang, 2012: Trends of visibility on sunny days in China in the recent 50 years. Atmos. Environ., 55, 339–346, https://doi.org/10.1016/j.atmosenv.2012.03.037.

  • Wu, X., Y. Wang, S. He, and Z. Wu, 2020: PM2.5/PM10 ratio prediction based on a long short-term memory neural network in Wuhan, China. Geosci. Model Dev., 13, 1499–1511, https://doi.org/10.5194/gmd-13-1499-2020.

  • Yu, H., T. Li, and P. Liu, 2019: Influence of ENSO on frequency of wintertime fog days in eastern China. Climate Dyn., 52, 5099–5113, https://doi.org/10.1007/s00382-018-4437-3.

  • Yu, Z., Y. Qu, Y. Wang, J. Ma, and Y. Cao, 2021: Application of machine-learning-based fusion model in visibility forecast: A case study of Shanghai, China. Remote Sens., 13, 2096, https://doi.org/10.3390/rs13112096.

  • Zhang, S., D. Cheng, Z. Deng, M. Zong, and X. Deng, 2018: A novel kNN algorithm with data-driven k parameter computation. Pattern Recognit. Lett., 109, 44–54, https://doi.org/10.1016/j.patrec.2017.09.036.

  • Zhao, T., and Coauthors, 2017: Revealed variations of air quality in industrial development over a remote plateau of Southwest China: An application of atmospheric visibility data. Meteor. Atmos. Phys., 129, 659–667, https://doi.org/10.1007/s00703-016-0492-7.

  • Zhou, B., J. Du, I. Gultepe, and G. Dimego, 2012: Forecast of low visibility and fog from NCEP: Current status and efforts. Pure Appl. Geophys., 169, 895–909, https://doi.org/10.1007/s00024-011-0327-x.

  • Zong, P., Y. Zhu, H. Wang, and D. Liu, 2020: WRF-Chem simulation of winter visibility in Jiangsu, China, and the application of a neural network algorithm. Atmosphere, 11, 520, https://doi.org/10.3390/atmos11050520.

Fig. 1. Schematic diagram of the machine learning (ML) algorithms employed in this study: (a) random forest (RF), (b) extreme gradient boosting (XGB), and (c) light gradient boosting (LGB) (Kim et al. 2021a, 2022a). The circle indicates the node of the tree, and the arrow indicates the growth of the tree (RF) or direction of sequential growth of the tree (XGB and LGB).

Fig. 2. Prediction accuracy of local data assimilation and prediction system (LDAPS) variables (a) Ta, (b) Td, (c) Ta − Td, (d) RH, (e) Pa, (f) WD, (g) WS, and (h) precipitation by forecast time [red dots: bias, blue dots: root-mean-square error (RMSE), and black dots: R].

Fig. 3. Scatterplots comparing visibility predicted by (a),(b) extreme gradient boosting (XGB) and (c),(d) local data assimilation and prediction system (LDAPS) with automated synoptic observing system (ASOS) visibility observations for training and validation datasets.

Fig. 4. Relative importance of input variables to the visibility results predicted by the extreme gradient boosting (XGB) algorithm.

Fig. 5. (a),(b) Scatterplots and (c) daily mean time series of the visibility values predicted by extreme gradient boosting (XGB) (VISXGB) and local data assimilation and prediction system (LDAPS) (VISLDAPS) and observed by automated synoptic observing system (ASOS) (VISobs) for the test set.

Fig. 6. (a) Forecast time and (b) monthly equitable threat score (ETS) distribution of VISLDAPS and VISXGB.


Short-Term Visibility Prediction Using Tree-Based Machine Learning Algorithms and Numerical Weather Prediction Data

Bu-Yo Kim (https://orcid.org/0000-0002-6581-5011), Miloslav Belorid, and Joo Wan Cha

Research Applications Department, National Institute of Meteorological Sciences, Jeju, South Korea

Open access

Abstract

Accurate visibility prediction is imperative in the interests of human and environmental health. However, the existing numerical models for visibility prediction are characterized by low prediction accuracy and high computational cost. Thus, in this study, we predicted visibility using tree-based machine learning algorithms and numerical weather prediction data produced by the local data assimilation and prediction system (LDAPS) of the Korea Meteorological Administration. We then evaluated the accuracy of visibility prediction for Seoul, South Korea, through a comparative analysis using observed visibility from the automated synoptic observing system. The visibility predicted by the machine learning algorithm was compared with the visibility predicted by LDAPS. The LDAPS data employed to construct the visibility prediction model were divided into learning, validation, and test sets. The optimal machine learning algorithm for visibility prediction was determined using the learning and validation sets. In this study, the extreme gradient boosting (XGB) algorithm showed the highest accuracy for visibility prediction. Comparative results using the test set revealed a lower prediction error and a higher correlation coefficient for visibility predicted by the XGB algorithm (bias: −0.62 km, MAE: 2.04 km, RMSE: 2.94 km, and R: 0.88) than for that predicted by LDAPS (bias: −0.32 km, MAE: 4.66 km, RMSE: 6.48 km, and R: 0.40). Moreover, the mean equitable threat score (ETS) also indicated higher prediction accuracy for visibility predicted by the XGB algorithm (ETS: 0.5–0.6 for visibility ranges) than for that predicted by LDAPS (ETS: 0.1–0.2).

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Bu-Yo Kim, kimbuyo@korea.kr


1. Introduction

The World Meteorological Organization (WMO) defines visibility as the distance at which light intensity is reduced to 5% of the original intensity (WMO 2014). In unpolluted atmospheric conditions, visibility lies in the range from several tens to hundreds of kilometers; however, low visibility of several kilometers can be caused by suspended particulate matter and gases in the atmosphere (Wu et al. 2012). High energy consumption due to urbanization and industrialization leads to the frequent emission of large amounts of air pollutants, resulting in low visibility (Qu et al. 2015). Thus, visibility can be affected by both environmental and meteorological factors, with low visibility in turn leading to traffic delays or traffic accidents that result in property damage and risks to human health (Huang and Zhang 2017; Wu et al. 2020). Visibility is an effective measure of air quality and is closely related to human health indicators such as mortality, heart disease, and decreased lung function (IPCC 2013; Gulia et al. 2015; Jeong et al. 2017). In particular, large cities with substantial industrial activities and large populations emit more air pollutants, which adversely affects visibility, city administration, and human health (Zhao et al. 2017; Singh et al. 2020; Zong et al. 2020).

Visibility is often provided by numerical weather prediction (NWP) models as a direct output parameter or as an output of the model’s postprocessing package (Singh et al. 2018; Fita et al. 2019). Various visibility diagnostic algorithms have been developed based on empirical, theoretical, and diagnostic approaches (Dimitrova et al. 2021). These algorithms typically use the relationship between hydrometeor characteristics and light extinction for each hydrometeor type (Stoelinga and Warner 1999; Benjamin et al. 2004), as well as relationships with other meteorological parameters such as relative humidity (Doran et al. 1999; Creighton et al. 2014), cloud cover, temperature, and wind speed (Petersen and Nielsen 2000). Moreover, some NWP models carry information about aerosol concentration and use it for visibility diagnostics (Clark et al. 2008; Petersen and Nielsen 2000). Because visibility is sensitive to the presence of both hydrometeors and aerosols, it is difficult to predict accurately using operational NWP models (Won et al. 2020). Therefore, researchers have explored improvements in parameterization, data assimilation, and ensemble prediction to reduce the uncertainty of NWP models and increase their prediction accuracy (Zhou et al. 2012; Cornejo-Bueno et al. 2017; D. J. Kim et al. 2020). Nevertheless, NWP models still suffer from low prediction accuracy and high computational cost (Zong et al. 2020). Consequently, studies increasingly use linear or nonlinear relationships between weather variables and visibility, or employ machine learning (ML), to predict visibility with higher accuracy (Kim et al. 2021b).

The aim of this study is to develop a visibility prediction method that is more accurate than existing numerical models. Specifically, we predict visibility using meteorological forecasts from the Unified Model (UM) and several tree-based ML algorithms, and then evaluate the prediction performance. For the visibility prediction, we use data from the UM-based local data assimilation and prediction system (LDAPS) operated by the Korea Meteorological Administration (KMA) for the region of Seoul, which has the highest air pollution in South Korea. Seoul is a metropolis with high economic activity and approximately 10 million people living in an area of 605.2 km² (Y. P. Kim and Lee 2018). Prediction of visibility in metropolitan areas and analysis of municipal trends are key issues in meteorological, climate, and air quality research (Jeong et al. 2017; Zhao et al. 2017; Ma et al. 2020). Therefore, high-accuracy visibility prediction can be used to guide current and future environmental policies for air quality improvement and public health in South Korea.

2. Research data and methods

a. Research data

In this study, visibility prediction was performed for Seoul, a large metropolitan city in South Korea. The input data were air temperature (Ta; K), dewpoint temperature (Td; K), air pressure (Pa; hPa), relative humidity (RH; %), wind direction (WD; °), wind speed (WS; m s⁻¹), and precipitation (mm h⁻¹) from LDAPS (Clark et al. 2008; KMA 2013), a numerical prediction modeling system based on UM version 10.1 and operated by KMA. Forecast fields from 1 January 2018 to 31 December 2020 were used. LDAPS is a modeling system with its own analysis and prediction cycle based on three-dimensional variational data assimilation (D. J. Kim et al. 2020). LDAPS predicts meteorological variables at 1-h intervals for 36 h four times per day (at 0000, 0600, 1200, and 1800 UTC) and produces gridded data at a resolution of 1.5 km × 1.5 km (B. Y. Kim and Lee 2018). The visibility diagnostic within the model assumes a simple exponential scattering law and is a function of humidity and a prognostic aerosol content (Clark et al. 2008). The predicted visibility was evaluated using observations from a visibility sensor (Vaisala PWD-22, range: 0.01–20 km) at the Seoul automated synoptic observing system (ASOS) observatory (Station 108; 37.57°N, 126.97°E).

The collected LDAPS data were used as the input data for several tree-based ML algorithms for visibility prediction. To train, validate, and test the visibility prediction model, the corresponding datasets were constructed by random sampling without replacement from the entire dataset (152 488 cases) at a ratio of 5:3:2 (training:validation:test) (Kim et al. 2021a). Because WD exhibited a large deviation, the WD in degrees was converted to 16 direction categories (e.g., 348.75°–11.25° is north) for subsequent analysis. In addition to RH, the dewpoint depression (Ta − Td) was considered for visibility prediction. As Ta − Td becomes smaller, the atmosphere becomes more humid and condensation occurs, which may be an important factor in reducing visibility (Yu et al. 2019). The model run time (Hour) and forecast time (Time) were included to reflect characteristics that depend on them, and the Julian day, month, and week were included to reflect periodic patterns in visibility changes (Wu et al. 2020; Kim et al. 2021b). That is, the periodicity variables can indirectly reflect daily, weekly, and seasonal variations in particulate matter that affect visibility (Whalley and Zandi 2016). Additionally, to improve the accuracy of visibility prediction in cases of low visibility (less than 10 km), the model was trained on the common logarithm of the dependent variable (visibility) over the full visibility range (0.01–20 km).
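The preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the sector labels, the half-open sector boundaries, the clipping of visibility to the 0.01–20 km sensor range before the logarithm, and the seeded shuffle are all hypothetical choices.

```python
import math
import random

# Hypothetical labels for the 16 direction categories (22.5 degrees each).
DIRECTIONS_16 = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
                 "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]

def wind_direction_category(wd_deg):
    """Map a wind direction in degrees to one of 16 sectors.

    North covers 348.75 to 11.25 degrees; each sector is treated as
    half-open [lower, upper), which is an assumption of this sketch.
    """
    idx = int(((wd_deg % 360.0) + 11.25) // 22.5) % 16
    return DIRECTIONS_16[idx]

def dewpoint_depression(ta_k, td_k):
    """Dewpoint depression Ta - Td (K)."""
    return ta_k - td_k

def log_visibility(vis_km):
    """Common logarithm of visibility, clipped to 0.01-20 km (assumed)."""
    clipped = min(max(vis_km, 0.01), 20.0)
    return math.log10(clipped)

def split_532(n_cases, seed=0):
    """Random split without replacement at a 5:3:2 ratio.

    Returns three disjoint lists of case indices (train, validation, test).
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_train = n_cases * 5 // 10
    n_val = n_cases * 3 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test
```

For the 152 488 cases mentioned in the text, this split yields 76 244 training, 45 746 validation, and 30 498 test cases.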

To construct the visibility prediction model, the optimal hyperparameters for each ML algorithm were set using the training and validation sets, with the predicted visibility evaluated against the observed visibility. The evaluation results for the test set using the ML algorithm with the highest prediction accuracy are shown in section 3c. Previous visibility prediction studies by Thach et al. (2010), Lee et al. (2014), and Deng et al. (2016) were limited to meteorological conditions with no precipitation and RH of 80% or less, because hygroscopic aerosols or suspended substances in the atmosphere exhibit deliquescent properties under precipitation or high RH (Guo et al. 2020). In this study, however, the visibility prediction model was constructed without constraints on the input variables so that visibility could be predicted under various weather conditions. The accuracy of the predicted visibility (VISpre) was evaluated against the observed visibility (VISobs) using the bias, mean absolute error (MAE), root-mean-square error (RMSE), correlation coefficient (R), and equitable threat score (ETS), as shown in Eqs. (1)–(5). In addition, VISpre was compared with the visibility predicted by LDAPS (VISLDAPS). As LDAPS predicts visibility in the range of 0.01–100 km, VISLDAPS values exceeding 20 km were set to 20 km:
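$$\mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{VIS}_{\mathrm{pre},i} - \mathrm{VIS}_{\mathrm{obs},i}\right), \tag{1}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\mathrm{VIS}_{\mathrm{pre},i} - \mathrm{VIS}_{\mathrm{obs},i}\right|, \tag{2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{VIS}_{\mathrm{pre},i} - \mathrm{VIS}_{\mathrm{obs},i}\right)^{2}}, \tag{3}$$
$$R = \frac{\sum_{i=1}^{N}\left(\mathrm{VIS}_{\mathrm{pre},i} - \overline{\mathrm{VIS}_{\mathrm{pre}}}\right)\left(\mathrm{VIS}_{\mathrm{obs},i} - \overline{\mathrm{VIS}_{\mathrm{obs}}}\right)}{\sqrt{\sum_{i=1}^{N}\left(\mathrm{VIS}_{\mathrm{pre},i} - \overline{\mathrm{VIS}_{\mathrm{pre}}}\right)^{2}\sum_{i=1}^{N}\left(\mathrm{VIS}_{\mathrm{obs},i} - \overline{\mathrm{VIS}_{\mathrm{obs}}}\right)^{2}}}, \tag{4}$$
$$\mathrm{ETS} = \frac{a - e}{a + b + c - e}, \qquad e = \frac{(a + b)(a + c)}{a + b + c + d}. \tag{5}$$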

Here N is the number of data points, a is the number of correct forecasts (observed: yes, forecast: yes), b is the number of false alarms (observed: no, forecast: yes), c is the number of misses (observed: yes, forecast: no), d is the number of correct rejections (observed: no, forecast: no), and e is the number of correct forecasts expected by chance. ETS takes a value between −1/3 and 1, with 1 indicating a perfect prediction and values of 0 or less indicating no prediction capability.
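The scores defined above can be computed as in the following sketch. It is an illustration in Python with NumPy rather than the authors' implementation; the function names are hypothetical, and the "yes" event for ETS is assumed to be visibility below a given threshold (e.g., 10, 5, or 1 km), which is how the ETS values are reported in section 3c.

```python
import numpy as np


def continuous_scores(vis_pre, vis_obs):
    """Bias, MAE, RMSE, and correlation coefficient R between two series."""
    pre = np.asarray(vis_pre, dtype=float)
    obs = np.asarray(vis_obs, dtype=float)
    diff = pre - obs
    dev_pre = pre - pre.mean()
    dev_obs = obs - obs.mean()
    return {
        "bias": diff.mean(),
        "mae": np.abs(diff).mean(),
        "rmse": np.sqrt((diff ** 2).mean()),
        "r": (dev_pre * dev_obs).sum()
             / np.sqrt((dev_pre ** 2).sum() * (dev_obs ** 2).sum()),
    }


def ets(vis_pre, vis_obs, threshold_km):
    """Equitable threat score for the event 'visibility < threshold_km'."""
    forecast_yes = np.asarray(vis_pre, dtype=float) < threshold_km
    observed_yes = np.asarray(vis_obs, dtype=float) < threshold_km
    a = np.sum(forecast_yes & observed_yes)      # correct forecasts
    b = np.sum(forecast_yes & ~observed_yes)     # false alarms
    c = np.sum(~forecast_yes & observed_yes)     # misses
    d = np.sum(~forecast_yes & ~observed_yes)    # correct rejections
    e = (a + b) * (a + c) / (a + b + c + d)      # correct forecasts expected by chance
    denominator = a + b + c - e
    if denominator == 0:
        return float("nan")                      # score is undefined in this case
    return float((a - e) / denominator)
```

A forecast that is wrong in every case of a balanced sample gives the lower bound of −1/3, and a forecast that matches the observations in every case gives 1.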

b. Machine learning algorithms

The tree-based ML algorithms used in this study were the random forest (RF), light gradient boosting (LGB), and extreme gradient boosting (XGB) algorithms. All three are based on the decision tree (DT) algorithm (Breiman et al. 2017) but differ in how the trees are grown and how the model is optimized. In RF (Fig. 1a), N decision trees are constructed by randomly combining variables at each node, and the final result is an ensemble of the results from the individual trees (Wright and Ziegler 2017). This mitigates a disadvantage of the DT algorithm, namely that model complexity and overfitting increase with the amount of data (or number of variables), and provides a better prediction result. The XGB and LGB algorithms are tree-based gradient-boosting algorithms that provide ensemble predictions similar to RF, but they use boosting instead of bagging for resampling and the ensemble (Friedman 2001). Specifically, as shown in Figs. 1b and 1c, weak models (trees) are generated sequentially, each improving on the components that the previous model did not predict, and the results [f1(x) to fN(x)] are combined as a weighted mean to obtain the final predicted value [f(x)] (Friedman 2001). In this process, the XGB algorithm grows the tree level-wise, whereas the LGB algorithm grows the tree leaf-wise. That is, the XGB algorithm splits the nodes at each level until the maximum depth of the tree is reached and then determines the optimal prediction result (leaf node). The LGB algorithm instead grows the tree by splitting only the leaf whose split gives the maximum reduction in loss, without splitting all internal nodes (Ke et al. 2017). Therefore, the LGB algorithm achieves similar accuracy while training much faster than the XGB algorithm. However, depending on the data characteristics, the tree can be oversimplified and the predictive power decreased (Li et al. 2020; Sun et al. 2021).
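The sequential idea behind boosting can be illustrated with a deliberately small sketch. It uses one-split regression trees (stumps) on a single input variable with a squared-error loss, following the generic gradient-boosting scheme of Friedman (2001); it is not the XGB or LGB implementation, and the function names and default settings are assumptions for the example. The input is assumed to contain at least two distinct values.

```python
import numpy as np


def fit_stump(x, residual):
    """Find the single threshold on x that best fits the residual (least squares)."""
    best = None
    for threshold in np.unique(x)[:-1]:          # keeps both sides non-empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)


def boost(x, y, n_rounds=50, eta=0.1):
    """Fit weak learners one after another, each to the current residual."""
    base = float(np.mean(y))
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        residual = y - pred                      # what earlier learners missed
        stump = fit_stump(x, residual)
        pred = pred + eta * predict_stump(stump, x)
        stumps.append(stump)
    return base, eta, stumps


def predict(model, x):
    """Final prediction: base value plus the weighted sum of all weak learners."""
    base, eta, stumps = model
    return base + eta * sum(predict_stump(s, x) for s in stumps)
```

Each round corrects only a fraction `eta` of the remaining error, which is why the number of rounds and the learning rate are tuned together.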

Fig. 1.

Schematic diagram of the machine learning (ML) algorithms employed in this study: (a) random forest (RF), (b) extreme gradient boosting (XGB), and (c) light gradient boosting (LGB) (Kim et al. 2021a, 2022a). The circle indicates the node of the tree, and the arrow indicates the growth of the tree (RF) or direction of sequential growth of the tree (XGB and LGB).

Citation: Weather and Forecasting 37, 12; 10.1175/WAF-D-22-0053.1

These algorithms have been employed in various studies because, when sample data are abundant, they exhibit high accuracy and much faster training and prediction speeds than vector-based or deep learning algorithms (Al Banna et al. 2020; Lu and Ma 2020). The optimal hyperparameters were set by repeated, dense grid searching for each hyperparameter (Bergstra and Bengio 2012). For RF, we used the “ranger” package of R (Wright et al. 2020) and set the hyperparameters num.trees (number of trees), mtry (number of variables randomly sampled at each node), and min.node.size (minimal node size) to 550, 12, and 4, respectively. For XGB, we used the “xgboost” package of R (Chen et al. 2022) with the Gaussian distribution function, which gave the best prediction performance, and set the hyperparameters nrounds (maximum number of iterations), max_depth (maximum depth of binary tree), and eta (learning rate) to 640, 13, and 0.1, respectively. For LGB, we used the “lightgbm” package of R (Ke et al. 2017) and set the hyperparameters nrounds, max_depth, and learning_rate to 2230, 13, and 0.2, respectively. These are commonly tuned hyperparameters in ML optimization. Default values were used for the other parameters of each ML algorithm.
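A grid search of the kind described above evaluates every combination of candidate values and keeps the one with the best validation score. The sketch below shows the mechanics only. The candidate grid and the scoring function are placeholders invented for the example: in practice the score would be a validation error such as the RMSE of a model trained with the given hyperparameters, whereas here a toy function is used whose minimum is placed at the reported XGB values purely so that the example has a known answer.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Evaluate every combination in the grid; return the lowest-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)                 # lower is better (e.g., validation RMSE)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values; the study does not list the grids it searched.
xgb_grid = {
    "nrounds": [320, 640, 1280],
    "max_depth": [7, 10, 13],
    "eta": [0.05, 0.1, 0.2],
}


def toy_score(params):
    """Placeholder objective standing in for a real train-and-validate step."""
    return (abs(params["nrounds"] - 640) / 640
            + abs(params["max_depth"] - 13) / 13
            + abs(params["eta"] - 0.1))


best_params, best_score = grid_search(toy_score, xgb_grid)
```

The cost grows with the product of the grid sizes (27 evaluations here), which is one reason the number of tuned hyperparameters is usually kept small.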

3. Results and discussion

a. Prediction accuracy of LDAPS

The prediction accuracy of the LDAPS data for each forecast time within the study period is shown in Fig. 2. Each variable was compared with the ASOS data observed at the corresponding forecast time. The bias and RMSE of the LDAPS data increased slowly with forecast time, whereas R decreased slowly, with a superimposed diurnal variability. This is because the characteristics of the daytime and nighttime predictions differed depending on the forecast time for each run time (Nagarajan et al. 2015), and the boundary conditions of the LDAPS are updated every 6 h. Consequently, the Ta − Td, RH, WD, WS, and precipitation data, which have large temporal deviations, showed relatively large differences and low R (Table 1). The prediction accuracy for each run time (i.e., 0000, 0600, 1200, and 1800 UTC) was similar. In this study, we did not bias correct the LDAPS variables using ASOS observation data because the data period and bias-correction method differ for each variable (Iizumi et al. 2017), so bias correction may introduce new errors for variables with a large temporal deviation (especially WD, WS, and precipitation). Therefore, the LDAPS predictions were used in their original form as input data for ML algorithm learning and visibility prediction.

Fig. 2.

Prediction accuracy of local data assimilation and prediction system (LDAPS) variables (a) Ta, (b) Td, (c) Ta − Td, (d) RH, (e) Pa, (f) WD, (g) WS, and (h) precipitation by forecast time [red dots: bias, blue dots: root-mean-square error (RMSE), and black dots: R].

Table 1

Mean prediction accuracy of local data assimilation and prediction system (LDAPS) variables by forecast time.

b. Machine learning results for training and validation sets

Table 2 compares the visibilities predicted by the ML algorithms and the LDAPS model with ASOS visibility observations for the training and validation sets in different visibility ranges. In the training results, the RF and LGB algorithms showed generally high prediction accuracy for all data but relatively low prediction accuracy for low visibility (<10 km). In contrast, the XGB algorithm exhibited high learning accuracy, with the smallest difference and a high correlation coefficient in all visibility ranges. This is because the XGB algorithm sequentially enhances the weak learners and determines the optimal prediction result. Similarly, in the prediction results for the validation set, the XGB algorithm showed the smallest difference and a high correlation coefficient. To increase the prediction accuracy for low visibility, the models were trained on the common logarithm of the dependent variable (visibility). As a result, whereas the MAE of the LDAPS prediction increased as visibility decreased, the MAE of the ML predictions decreased. However, when many high-visibility cases were included, the negative bias was large. In this study, cases with a visibility of 20 km accounted for approximately 38% of all data. Those cases showed a bias of −0.09 km (MAE = 0.34 km) and an RMSE of 0.52 km in the training set but a bias of −1.32 km (MAE = 1.67 km) and an RMSE of 2.71 km in the validation set, which made the negative bias relatively large. Nevertheless, the visibility predicted using the XGB algorithm showed good agreement with the observed visibility (Figs. 3a,b).

Fig. 3.

Scatterplots comparing visibility predicted by (a),(b) extreme gradient boosting (XGB) and (c),(d) local data assimilation and prediction system (LDAPS) with automated synoptic observing system (ASOS) visibility observations for training and validation datasets.

Table 2

Visibility prediction accuracy by visibility range of machine learning (ML) algorithms and the local data assimilation and prediction system (LDAPS) model using training and validation datasets [units of mean, bias, mean absolute error (MAE), and root-mean-square error (RMSE) values are km].

For reference, the vector-based k-nearest neighbor (kNN; Zhang et al. 2018) and support vector regression (SVR; Meyer et al. 2022) algorithms and the deep learning–based artificial neural network (ANN; Rosa et al. 2020) and extreme learning machine (ELM; J. Wang et al. 2021) algorithms were also applied to the training and validation datasets of this study. They showed poor accuracy, with R = 0.8–0.9 for the training set and R < 0.8 for the validation set. These results were similar to those of Doreswamy et al. (2020) and Lu and Ma (2020), in which such algorithms showed lower prediction performance than the tree-based ML algorithms. In particular, the tree-based ML algorithms reduced the bias (MAE) of visibility prediction more than the other ML algorithms (vector-, deep learning–, and regularization-based algorithms, such as kNN, SVR, ANN, multilayer perceptron, partial least squares regression, stochastic gradient descent regression, etc.) (Yu et al. 2021; Ding et al. 2022). Therefore, we constructed the visibility prediction model based on the XGB algorithm, which had the highest visibility prediction accuracy among the three tree-based ML algorithms. However, the optimal ML algorithm for visibility prediction might vary depending on the characteristics of the data used (local or regional meteorological characteristics, frequency of data, types of variables, etc.) (Kim et al. 2022b).

Additionally, when the computer resources [Lenovo SD650 V2 with Intel Xeon Platinum 8368Q CPU processor 38 cores 2.6 GHz (76 threads) and 256 GB memory (DDR4 3200 MT/s)] used in this study were maximally utilized, the learning and prediction for the training set (76 244 cases) were fastest for the LGB algorithm, with mean times of 3.07 and 0.35 s, respectively. The corresponding times were 10.68 and 0.99 s for the RF algorithm and 20.54 and 0.05 s for the XGB algorithm. Although the XGB algorithm has a relatively high computational cost compared to that of the LGB algorithm, this difference can be sufficiently overcome by the use of additional computer resources. However, if the amount of data is very large (such as big data; e.g., number of data points, >1 million) or computer resources are limited, a low computational cost algorithm such as the LGB algorithm could be the most suitable for visibility prediction (Kim et al. 2022b).

For the visibility predicted by the LDAPS model, the accuracy for all visibility ranges in the training and validation sets showed a low correlation coefficient of less than 0.40, with an RMSE of 6 km or more, and low prediction accuracy. Moreover, the frequency of predicting high visibility during observations of low visibility and predicting low visibility during observations of high visibility was high (Figs. 3c,d).

Figure 4 shows the relative importance of the input variables used as learning data for the XGB algorithm. The relative importance of a variable indicates its relative contribution to the prediction result and is obtained by quantifying the improvement in node purity when the variable is used to split a node during the growth of the XGB trees (Kim et al. 2022a). It can be obtained using the “xgb.importance” function of the “xgboost” package of R. Note that the relative importance of variables may vary depending on the meteorological characteristics of the study area or the characteristics of the data (data period, data interval, variation of variables, etc.). In many previous studies, RH was selected as the variable with the greatest influence on visibility (Maurer et al. 2019; Kim et al. 2021b). However, when RH is low (i.e., less than 40%) or high (i.e., 80% or more), its correlation with visibility is rather low and the learning ability of ML is reduced, resulting in a larger error for the visibility prediction (Sun et al. 2020; Won et al. 2020). Therefore, in this study, the visibility prediction model was constructed using the Ta − Td variable and variables that may represent the periodicity of visibility changes (Yu et al. 2019; Wu et al. 2020; Kim et al. 2021b). Ta − Td showed the highest relative importance among all variables (32.89%), followed by the Julian day (19.16%). The relative importance of the run time variable (Hour) was the lowest, at 0.85%; because the prediction accuracy of the LDAPS data (used as input data in this study) did not deviate greatly by run time, the dependence on run time was low. The relative importance of precipitation for the visibility prediction was also low (1.17%) because precipitation either worsens or improves visibility depending on its amount and frequency (Kim et al. 2021b). Furthermore, among the LDAPS meteorological variables shown in Fig. 2 and Table 1, precipitation showed the largest difference in prediction accuracy. Nevertheless, including precipitation as an input variable improved ML learning and visibility prediction accuracy.
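The way a gain-based relative importance is assembled can be sketched as follows. This is a simplified illustration, not the internals of “xgb.importance”: it assumes that each split in the fitted trees is available as a (variable, gain) pair, sums the gains per variable, and expresses each total as a percentage of the overall gain. The function name and the example split records are hypothetical.

```python
from collections import defaultdict


def relative_importance(splits):
    """Percentage of total split gain attributed to each variable.

    `splits` is an iterable of (variable_name, gain) pairs, one per tree split.
    The total gain is assumed to be positive.
    """
    totals = defaultdict(float)
    for variable, gain in splits:
        totals[variable] += gain
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {variable: 100.0 * gain / grand_total for variable, gain in ranked}


# Made-up split records for illustration only; they are not results of this study.
example_splits = [("Ta-Td", 3.0), ("RH", 1.0), ("Ta-Td", 1.0)]
example_importance = relative_importance(example_splits)
```

Because the values are normalized to sum to 100%, a variable's importance is always relative to the other inputs included in the model.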

Fig. 4.

Relative importance of input variables to the visibility results predicted by the extreme gradient boosting (XGB) algorithm.

c. Prediction results of the XGB algorithm for test datasets

Figure 5 shows scatterplots and a time series of the visibility values predicted by XGB (VISXGB) and LDAPS (VISLDAPS) and observed by ASOS (VISobs) for the test set. VISLDAPS showed large scatter, similar to that for the training and validation sets, and low prediction accuracy (bias = −0.32 km, MAE = 4.66 km, RMSE = 6.48 km, and R = 0.40) (Fig. 5a). Conversely, VISXGB showed good agreement with VISobs and high prediction accuracy (bias = −0.62 km, MAE = 2.04 km, RMSE = 2.94 km, and R = 0.88) (Fig. 5b). When outliers were excluded using the interquartile range [IQR, Q3 − Q1 (75%–25%)], i.e., cases in which the difference between VISXGB and VISobs was less than Q1 − IQR × 1.5 or more than Q3 + IQR × 1.5, the prediction accuracy improved (bias = −0.52 km, RMSE = 2.15 km, and R = 0.94). The prediction accuracy for visibility < 10 km was bias = 0.83 km, RMSE = 1.89 km, and R = 0.80, and that for visibility < 5 km was bias = 1.09 km, RMSE = 1.71 km, and R = 0.64. The IQR is used to determine statistical outliers, and a multiple of 1.5 is typically employed (Vinutha et al. 2018). The prediction accuracy for each LDAPS meteorological variable in the inlier range between Q1 − IQR × 1.5 and Q3 + IQR × 1.5 was similar to that shown in Table 1. However, as shown in Table 3, the prediction accuracy for the outlier cases was low for Ta − Td, RH, and precipitation. This indicates that the accuracy of visibility prediction by the XGB algorithm can be increased by using LDAPS meteorological variables with high prediction accuracy. The daily mean time series of the predicted and observed visibility values for all data are shown in Fig. 5c. For the daily means, the prediction accuracy was better for VISXGB versus VISobs (bias = −0.61 km, MAE = 0.63 km, RMSE = 0.75 km, and R = 0.99) than for VISLDAPS versus VISobs (bias = −0.35 km, MAE = 1.72 km, RMSE = 2.16 km, and R = 0.62), with a smaller error and a higher correlation coefficient. Moreover, the prediction accuracy for VISXGB was higher than that for visibility predicted using WRF-Chem and a neural network in a study using a long-term time series (RMSE = 1.76 km and R = 0.42) (Zong et al. 2020).
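The IQR-based outlier screening described above can be sketched as follows. This is an illustrative Python/NumPy version under the assumption that the quartiles are taken over the differences VISXGB − VISobs and computed with NumPy's default percentile interpolation; the function name is hypothetical and the study's exact quartile method is not stated.

```python
import numpy as np


def iqr_inlier_mask(vis_pre, vis_obs, k=1.5):
    """True where the prediction error lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    diff = np.asarray(vis_pre, dtype=float) - np.asarray(vis_obs, dtype=float)
    q1, q3 = np.percentile(diff, [25, 75])
    iqr = q3 - q1
    return (diff >= q1 - k * iqr) & (diff <= q3 + k * iqr)


# Made-up values for illustration: the last case has an error of 21 km.
obs = np.full(9, 5.0)
pre = obs + np.array([-1.0, 0.0, 0.0, 1.0, 0.0, -1.0, 1.0, 0.0, 21.0])
mask = iqr_inlier_mask(pre, obs)
```

Scores for the inlier cases would then be computed on `pre[mask]` and `obs[mask]`, and those for the outlier cases on `pre[~mask]` and `obs[~mask]`.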

Fig. 5.

(a),(b) Scatterplots and (c) daily mean time series of the visibility values predicted by extreme gradient boosting (XGB) (VISXGB) and local data assimilation and prediction system (LDAPS) (VISLDAPS) and observed by automated synoptic observing system (ASOS) (VISobs) for the test set.

Table 3

Prediction accuracy for each local data assimilation and prediction system (LDAPS) meteorological variable when outliers and inliers between VISXGB and VISobs are excluded according to IQR × 1.5 for test datasets.

Figure 6 shows the visibility prediction performance of the XGB and LDAPS models as the ETS distribution by forecast time and by month. VISLDAPS showed a mean ETS of 0.15 for visibility < 10 km and a low mean ETS of 0.09 for visibility < 5 km. This result was similar to the ETS results for the UM models presented in Clark et al. (2008), Haywood et al. (2008), and M. Kim et al. (2020). In other words, the numerical approach for visibility prediction using NWP has limitations. Conversely, VISXGB exhibited higher mean ETS values of 0.60 at <10 km and 0.50 at <5 km. Moreover, for cases with visibility less than 1 km, the VISXGB mean ETS was 0.24, which was higher than the VISLDAPS mean ETS of 0.05 (excluding the forecast times with no visibility of less than 1 km). Therefore, the statistical approach based on iterative learning using the ML algorithm gave more suitable results for visibility prediction. Although the LDAPS predictions were used without bias correction, a certain level of prediction performance could be expected for each forecast time, unlike in the work of Clark et al. (2008), Haywood et al. (2008), and M. Kim et al. (2020). The monthly ETS was low in all months for VISLDAPS (Fig. 6b), whereas for VISXGB it was high in the dry season (spring and winter) and relatively low in the rainy season (summer and autumn). That is, because of the frequent inflow of convective precipitation systems into South Korea caused by atmospheric pressure patterns and the jangma (B. Y. Kim et al. 2020; Kim et al. 2021a), the accuracy of weather prediction in the rainy season was lower than that in the dry season (Song et al. 2019; B. Y. Kim et al. 2020), which resulted in a relatively low ETS for ML-based visibility predictions (C. Wang et al. 2021). In particular, the prediction accuracy of meteorological variables for precipitation cases in August, when the ETS was lowest, showed a larger difference and lower correlation coefficient than for cases in January, when the ETS was highest (Table 4). Nevertheless, the monthly ETS of VISXGB was approximately 0.1 (rainy season) to 0.4 (dry season) higher than that of VISLDAPS and the visibility predicted by Mittermaier (2008). In addition, the monthly ETS of VISXGB was higher than the ETS for visibility of <5 km (0.45) in a short-term study (+24 h) using a neural network with no precipitation cases (Claxton 2008). Furthermore, the ETS value in this study was higher than that of Bari and Ouagabi (2020) (XGB: 0.26, LBM: 0.24, and RF: 0.20), which involved short-term visibility prediction (+24 h) using various meteorological variables (34 variables including radiation, cloud fraction, and liquid water content, and excluding Ta − Td and periodicity variables) predicted via NWP and ML methods. In large metropolitan cities such as Seoul, the mean visibility is lower and the frequency of low visibility is higher than in nearby urban and rural areas in the dry season (Wu et al. 2012). Moreover, the frequency of low visibility during this period is highly correlated with an increase in PM10 concentration (Park et al. 2021). Such a relationship occurs frequently in large metropolitan cities such as Seoul; therefore, it is very important to predict the visibility with high accuracy during the dry season.

Fig. 6.

(a) Forecast time and (b) monthly equitable threat score (ETS) distribution of VISLDAPS and VISXGB.

Table 4

Prediction accuracy for each local data assimilation and prediction system (LDAPS) meteorological variable for January and August precipitation cases in the test dataset.

4. Summary and conclusions

In this study, visibility was predicted at the Seoul ASOS observation site using tree-based ML algorithms and compared with ASOS visibility observations measured by a visibility sensor. LDAPS meteorological variables (Ta, Td, Ta − Td, RH, WD, WS, and precipitation) obtained from the KMA, as well as variables representing periodicity (Hour, Time, and Julian day), were used as the NWP input data for ML algorithm training, validation, and testing. No restrictions were placed on meteorological conditions [e.g., dry conditions (low RH), no precipitation (sunny days), etc.]. Furthermore, to improve the prediction accuracy for low visibility, the ML algorithms were trained on the common logarithm of visibility. Visibility values predicted using the XGB algorithm exhibited the highest accuracy in each visibility range (<20, <10, and <5 km). For the test set, the prediction accuracy of VISXGB (bias: −0.62 km, MAE: 2.04 km, RMSE: 2.94 km, and R: 0.88) was better than that of VISLDAPS (bias: −0.32 km, MAE: 4.66 km, RMSE: 6.48 km, and R: 0.40). VISXGB also exhibited a mean ETS by forecast time of 0.60 for visibility < 10 km and 0.50 for visibility < 5 km, indicating better prediction performance than VISLDAPS (ETS: 0.1–0.2). In large metropolitan cities such as Seoul in South Korea, it is important to explore the anthropogenic effects of air quality improvement in relation to traffic accident reduction and public health by monitoring air quality and analyzing air composition and municipal changes. Therefore, by developing a more accurate visibility prediction model, our research has important applications for related research and for guiding policy development.

Acknowledgments.

This work was funded by the Korea Meteorological Administration Research and Development Program “Research on Weather Modification and Cloud Physics” under Grant (KMA2018-00224).

Data availability statement.

The LDAPS and ASOS data are available at https://data.kma.go.kr/cmmn/main.do.

REFERENCES

  • Al Banna, M. H., K. A. Taher, M. S. Kaiser, M. Mahmud, M. S. Rahman, A. S. Hosen, and G. H. Cho, 2020: Application of artificial intelligence in predicting earthquakes: State-of-the-art and future challenges. IEEE Access, 8, 192 880–192 923, https://doi.org/10.1109/ACCESS.2020.3029859.

  • Bari, D., and A. Ouagabi, 2020: Machine-learning regression applied to diagnose horizontal visibility from mesoscale NWP model forecasts. SN Appl. Sci., 2, 556, https://doi.org/10.1007/s42452-020-2327-x.

  • Benjamin, S. G., and Coauthors, 2004: Assimilation of METAR cloud and visibility observations in the RUC. 11th Conf. on Aviation, Range, Aerospace, Hyannis, MA, Amer. Meteor. Soc., 9.13, https://ams.confex.com/ams/11aram22sls/webprogram/Paper81992.html.

  • Bergstra, J., and Y. Bengio, 2012: Random search for hyper-parameter optimization. J. Mach. Learn. Res., 13, 281–305, http://www.jmlr.org/papers/v13/bergstra12a.html.

  • Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone, 2017: Classification and Regression Trees. Routledge, 368 pp., https://doi.org/10.1201/9781315139470.

  • Chen, T., T. He, M. Benesty, and XGBoost contributors, 2022: Package ‘xgboost.’ R Reference Document, 66 pp., https://cran.r-project.org/web/packages/xgboost/xgboost.pdf.

  • Clark, P. A., S. A. Harcourt, B. MacPherson, C. T. Mathison, S. Cusack, and M. Naylor, 2008: Prediction of visibility and aerosol within the operational Met Office Unified Model. I: Model formulation and variational assimilation. Quart. J. Roy. Meteor. Soc., 134, 1801–1816, https://doi.org/10.1002/qj.318.

  • Claxton, B. M., 2008: Using a neural network to benchmark a diagnostic parametrization: The Met Office’s visibility scheme. Quart. J. Roy. Meteor. Soc., 134, 1527–1537, https://doi.org/10.1002/qj.309.

  • Cornejo-Bueno, L., C. Casanova-Mateo, J. Sanz-Justo, E. Cerro-Prada, and S. Salcedo-Sanz, 2017: Efficient prediction of low-visibility events at airports using machine-learning regression. Bound.-Layer Meteor., 165, 349–370, https://doi.org/10.1007/s10546-017-0276-8.

  • Creighton, G., E. Kuchera, R. Adams-Selin, J. McCormick, S. Rentschler, and B. Wickard, 2014: AFWA diagnostics in WRF. University Corporation for Atmospheric Research, 17 pp., https://www2.mmm.ucar.edu/wrf/users/docs/AFWA_Diagnostics_in_WRF.pdf.

  • Deng, H., H. Tan, F. Li, M. Cai, P. W. Chan, H. Xu, X. Huang, and D. Wu, 2016: Impact of relative humidity on visibility degradation during a haze event: A case study. Sci. Total Environ., 569, 1149–1158, https://doi.org/10.1016/j.scitotenv.2016.06.190.

  • Dimitrova, R., A. Sharma, H. J. Fernando, I. Gultepe, V. Danchovski, S. Wagh, S. L. Bardoel, and S. Wang, 2021: Simulations of coastal fog in the Canadian Atlantic with the Weather Research and Forecasting Model. Bound.-Layer Meteor., 181, 443–472, https://doi.org/10.1007/s10546-021-00662-w.

  • Ding, J., and Coauthors, 2022: Forecast of hourly airport visibility based on artificial intelligence methods. Atmosphere, 13, 75, https://doi.org/10.3390/atmos13010075.

  • Doran, J. A., P. J. Roohr, D. J. Beberwyk, G. R. Brooks, G. A. Gayno, R. T. Williams, J. M. Lewis, and R. J. Lefevre, 1999: The MM5 at the Air Force Weather Agency-New products to support military operations. The Eighth Conf. on Aviation, Range, and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc., 4.17, https://ams.confex.com/ams/99annual/abstracts/1125.html.

  • Doreswamy, N., K. S. Harishkumar, K. M. Yogesh, and G. Ibrahim, 2020: Forecasting air pollution particulate matter (PM2.5) using machine learning regression models. Procedia Comput. Sci., 171, 2057–2066, https://doi.org/10.1016/j.procs.2020.04.221.

  • Fita, L., J. Polcher, T. M. Giannaros, T. Lorenz, J. Milovac, G. Sofiadis, E. Katragkou, and S. Bastin, 2019: CORDEX-WRF v1.3: Development of a module for the Weather Research and Forecasting (WRF) Model to support the CORDEX community. Geosci. Model Dev., 12, 1029–1066, https://doi.org/10.5194/gmd-12-1029-2019.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 29, 1189–1232, https://doi.org/10.1214/aos/1013203451.

  • Gulia, S., S. S. Nagendra, M. Khare, and I. Khanna, 2015: Urban air quality management—A review. Atmos. Pollut. Res., 6, 286–304, https://doi.org/10.5094/APR.2015.033.

  • Guo, B., Y. Wang, X. Zhang, H. Che, J. Zhong, Y. Chu, and L. Cheng, 2020: Temporal and spatial variations of haze and fog and the characteristics of PM2.5 during heavy pollution episodes in China from 2013 to 2018. Atmos. Pollut. Res., 11, 1847–1856, https://doi.org/10.1016/j.apr.2020.07.019.

  • Haywood, J., and Coauthors, 2008: Prediction of visibility and aerosol within the operational Met Office Unified Model. II: Validation of model performance using observational data. Quart. J. Roy. Meteor. Soc., 134, 1817–1832, https://doi.org/10.1002/qj.275.

  • Huang, H., and G. Zhang, 2017: Case studies of low‐visibility forecasting in falling snow with WRF Model. J. Geophys. Res. Atmos., 122, 12862, https://doi.org/10.1002/2017JD026459.

  • Iizumi, T., H. Takikawa, Y. Hirabayashi, N. Hanasaki, and M. Nishimori, 2017: Contributions of different bias‐correction methods and reference meteorological forcing data sets to uncertainty in projected temperature and precipitation extremes. J. Geophys. Res. Atmos., 122, 7800–7819, https://doi.org/10.1002/2017JD026613.

  • IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp.

  • Jeong, U., J. Kim, H. Lee, and Y. G. Lee, 2017: Assessing the effect of long-range pollutant transportation on air quality in Seoul using the conditional potential source contribution function method. Atmos. Environ., 150, 33–44, https://doi.org/10.1016/j.atmosenv.2016.11.017.

  • Ke, G., Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. Y. Liu, 2017: LightGBM: A highly efficient gradient boosting decision tree. 31st Conf. on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, Curran Assoc., 3146–3154.

  • Kim, B. Y., and K. T. Lee, 2018: Radiation component calculation and energy budget analysis for the Korean Peninsula region. Remote Sens., 10, 1147, https://doi.org/10.3390/rs10071147.

  • Kim, B. Y., J. W. Cha, W. Jung, and A. Ko, 2020: Precipitation enhancement experiments in catchment areas of dams: Evaluation of water resource augmentation and economic benefits. Remote Sens., 12, 3730, https://doi.org/10.3390/rs12223730.

  • Kim, B. Y., J. W. Cha, and K. H. Chang, 2021a: Twenty-four-hour cloud cover calculation using a ground-based imager with machine learning. Atmos. Meas. Tech., 14, 6695–6710, https://doi.org/10.5194/amt-14-6695-2021.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kim, B. Y., J. W. Cha, K. H. Chang, and C. Lee, 2021b: Visibility prediction over South Korea based on random forest. Atmosphere, 12, 552, https://doi.org/10.3390/atmos12050552.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kim, B. Y., Y. K. Lim, and J. W. Cha, 2022a: Short-term prediction of particulate matter (PM10 and PM2.5) in Seoul, South Korea using tree-based machine learning algorithms. Atmos. Pollut. Res., 13, 101547, https://doi.org/10.1016/j.apr.2022.101547.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kim, B. Y., J. W. Cha, K. H. Chang, and C. Lee, 2022b: Estimation of the visibility in Seoul, South Korea, based on particulate matter and weather data, using machine-learning algorithm. Aerosol Air Qual. Res., 22, 220125, https://doi.org/10.4209/aaqr.220125.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kim, D. J., G. Kang, D. Y. Kim, and J. J. Kim, 2020: Characteristics of LDAPS-predicted surface wind speed and temperature at automated weather stations with different surrounding land cover and topography in Korea. Atmosphere, 11, 1224, https://doi.org/10.3390/atmos11111224.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kim, M., K. Lee, and Y. H. Lee, 2020: Visibility data assimilation and prediction using an observation network in South Korea. Pure Appl. Geophys., 177, 11251141, https://doi.org/10.1007/s00024-019-02288-z.

    • Search Google Scholar
    • Export Citation
  • Kim, Y. P., and G. Lee, 2018: Trend of air quality in Seoul: Policy and science. Aerosol Air Qual. Res., 18, 21412156, https://doi.org/10.4209/aaqr.2018.03.0081.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • KMA, 2013: Utilization Guide of Numerical Weather Prediction Model Data for Activation of the Weather Industry. Korea Meteorological Administration, 62 pp.

  • Lee, J. Y., W. K. Jo, and H. H. Chun, 2014: Characteristics of atmospheric visibility and its relationship with air pollution in Korea. J. Environ. Qual., 43, 15191526, https://doi.org/10.2134/jeq2014.02.0066.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Li, D. D., D. X. Yu, Z. J. Qu, and S. H. Yu, 2020: Feature selection and model fusion approach for predicting urban macro travel time. Math. Probl. Eng., 2020, 6614920, https://doi.org/10.1155/2020/6614920.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Lu, H., and X. Ma, 2020: Hybrid decision tree-based machine learning models for short-term water quality prediction. Chemosphere, 249, 126169, https://doi.org/10.1016/j.chemosphere.2020.126169.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Ma, C. J., C. S. Lim, G. U. Kang, S. A. Jung, and M. R. Jo, 2020: Visibility degradation and its contributors at an urban site in Korea. Asian J. Atmos. Environ., 14, 335344, https://doi.org/10.5572/ajae.2020.14.4.335.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Maurer, M., O. Klemm, H. L. Lokys, and N. H. Lin, 2019: Trends of fog and visibility in Taiwan: Climate change or air quality improvement? Aerosol Air Qual. Res., 19, 896910, https://doi.org/10.4209/aaqr.2018.04.0152.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Meyer, D., E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C. C. Chang, and C. C. Lin, 2022: Package ‘e1071.’ R Reference Document, 67 pp., https://cran.r-project.org/web/packages/e1071/e1071.pdf.

    • Crossref
    • Export Citation
  • Mittermaier, M. P., 2008: The potential impact of using persistence as a reference forecast on perceived forecast skill. Wea. Forecasting, 23, 10221031, https://doi.org/10.1175/2008WAF2007037.1.

    • Search Google Scholar
    • Export Citation
  • Nagarajan, B., L. Delle Monache, J. P. Hacker, D. L. Rife, K. Searight, J. C. Knievel, and T. N. Nipen, 2015: An evaluation of analog-based postprocessing methods across several variables and forecast models. Wea. Forecasting, 30, 16231643, https://doi.org/10.1175/WAF-D-14-00081.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Park, D. H., S. W. Kim, M. H. Kim, H. Yeo, S. S. Park, T. Nishizawa, A. Shimizu, and C. H. Kim, 2021: Impacts of local versus long-range transported aerosols on PM10 concentrations in Seoul, Korea: An estimate based on 11-year PM10 and lidar observations. Sci. Total Environ., 750, 141739, https://doi.org/10.1016/j.scitotenv.2020.141739.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Petersen, C., and N. W. Nielsen, 2000: Diagnosis of visibility in DMI-HIRLAM. Danish Meteorological Institute, Scientific Rep. 00-11, 37 pp.

    • Crossref
    • Export Citation
  • Qu, W. J., J. Wang, X. Y. Zhang, D. Wang, and L. F. Sheng, 2015: Influence of relative humidity on aerosol composition: Impacts on light extinction and visibility impairment at two sites in coastal area of China. Atmos. Res., 153, 500511, https://doi.org/10.1016/j.atmosres.2014.10.009.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Rosa, J. P., D. J. Guerra, N. C. Horta, R. M. Martins, and N. C. Lourenço, 2020: Overview of artificial neural networks. Using Artificial Neural Networks for Analog Integrated Circuit Design Automation, Springer, 21–44, https://doi.org/10.1007/978-3-030-35743-6.

    • Crossref
    • Export Citation
  • Singh, A., J. P. George, and G. R. Iyengar, 2018: Prediction of fog/visibility over India using NWP model. J. Earth Syst. Sci., 127, 26, https://doi.org/10.1007/s12040-018-0927-2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Singh, A., W. R. Avis, and F. D. Pope, 2020: Visibility as a proxy for air quality in East Africa. Environ. Res. Lett., 15, 084002, https://doi.org/10.1088/1748-9326/ab8b12.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Song, H. J., B. Lim, and S. Joo, 2019: Evaluation of rainfall forecasts with heavy rain types in the high-resolution unified model over South Korea. Wea. Forecasting, 34, 12771293, https://doi.org/10.1175/WAF-D-18-0140.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Stoelinga, M. T., and T. T. Warner, 1999: Nonhydrostatic, mesobeta-scale model simulations of cloud ceiling and visibility for an East Coast winter precipitation event. J. Appl. Meteor. Climatol., 38, 385404, https://doi.org/10.1175/1520-0450(1999)038<0385:NMSMSO>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Sun, H., J. Wang, and W. Ye, 2021: A data augmentation-based evaluation system for regional direct economic losses of storm surge disasters. Int. J. Environ. Res. Public Health, 18, 2918, https://doi.org/10.3390/ijerph18062918.

    • Search Google Scholar
    • Export Citation
  • Sun, X., T. Zhao, D. Liu, S. Gong, J. Xu, and X. Ma, 2020: Quantifying the influences of PM2.5 and relative humidity on change of atmospheric visibility over recent winters in an urban area of East China. Atmosphere, 11, 461, https://doi.org/10.3390/atmos11050461.

    • Search Google Scholar
    • Export Citation
  • Thach, T. Q., C. M. Wong, K. P. Chan, Y. K. Chau, Y. N. Chung, C. Q. Ou, L. Yang, and A. J. Hedley, 2010: Daily visibility and mortality: Assessment of health benefits from improved visibility in Hong Kong. Environ. Res., 110, 617623, https://doi.org/10.1016/j.envres.2010.05.005.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Vinutha, H. P., B. Poornima, and B. M. Sagar, 2018: Detection of outliers using interquartile range technique from intrusion dataset. Information and Decision Sciences: Advances in Intelligent Systems and Computing, S. Satapathy et al., Eds., Vol. 701, Springer, https://doi.org/10.1007/978-981-10-7563-6_53.

    • Crossref
    • Export Citation
  • Wang, C., Z. Jia, Z. Yin, F. Liu, G. Lu, and J. Zheng, 2021: Improving the accuracy of subseasonal forecasting of China precipitation with a machine learning approach. Front. Earth Sci., 9, 659310, https://doi.org/10.3389/feart.2021.659310.

    • Search Google Scholar
    • Export Citation
  • Wang, J., S. Lu, S. H. Wang, and Y. D. Zhang, 2021: A review on extreme learning machine. Multimedia Tools Appl., https://doi.org/10.1007/s11042-021-11007-7.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Whalley, J., and S. Zandi, 2016: Particulate matter sampling techniques and data modelling methods. Air Quality—Measurement and Modeling, P. Sallis, Ed., IntechOpen, 29–54, https://doi.org/10.5772/65054.

    • Crossref
    • Export Citation
  • WMO, 2014: Guide to Meteorological Instruments and Methods of Observation. World Meteorological Organization, 1128 pp.

    • Crossref
    • Export Citation
  • Won, W. S., R. Oh, W. Lee, K. Y. Kim, S. Ku, P. C. Su, and Y. J. Yoon, 2020: Impact of fine particulate matter on visibility at Incheon International Airport, South Korea. Aerosol Air Qual. Res., 20, 10481061, https://doi.org/10.4209/aaqr.2019.03.0106.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wright, M. N., and A. Ziegler, 2017: Ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Software, 77, 117, https://doi.org/10.18637/jss.v077.i01.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wright, M. N., S. Wager, and P. Probst, 2020: Package ‘ranger.’ R Reference Document, 28 pp., https://cran.r-project.org/web/packages/ranger/ranger.pdf.

    • Crossref
    • Export Citation
  • Wu, J., C. Fu, L. Zhang, and J. Tang, 2012: Trends of visibility on sunny days in China in the recent 50 years. Atmos. Environ., 55, 339346, https://doi.org/10.1016/j.atmosenv.2012.03.037.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wu, X., Y. Wang, S. He, and Z. Wu, 2020: PM2.5/PM10 ratio prediction based on a long short-term memory neural network in Wuhan, China. Geosci. Model Dev., 13, 14991511, https://doi.org/10.5194/gmd-13-1499-2020.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Yu, H., T. Li, and P. Liu, 2019: Influence of ENSO on frequency of wintertime fog days in eastern China. Climate Dyn., 52, 50995113, https://doi.org/10.1007/s00382-018-4437-3.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Yu, Z., Y. Qu, Y. Wang, J. Ma, and Y. Cao, 2021: Application of machine-learning-based fusion model in visibility forecast: A case study of Shanghai, China. Remote Sens., 13, 2096, https://doi.org/10.3390/rs13112096.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zhang, S., D. Cheng, Z. Deng, M. Zong, and X. Deng, 2018: A novel kNN algorithm with data-driven k parameter computation. Pattern Recognit. Lett., 109, 4454, https://doi.org/10.1016/j.patrec.2017.09.036.

    • Search Google Scholar
    • Export Citation
  • Zhao, T., and Coauthors, 2017: Revealed variations of air quality in industrial development over a remote plateau of Southwest China: An application of atmospheric visibility data. Meteor. Atmos. Phys., 129, 659667, https://doi.org/10.1007/s00703-016-0492-7.

    • Search Google Scholar
    • Export Citation
  • Zhou, B., J. Du, I. Gultepe, and G. Dimego, 2012: Forecast of low visibility and fog from NCEP: Current status and efforts. Pure Appl. Geophys., 169, 895909, https://doi.org/10.1007/s00024-011-0327-x.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zong, P., Y. Zhu, H. Wang, and D. Liu, 2020: WRF-Chem simulation of winter visibility in Jiangsu, China, and the application of a neural network algorithm. Atmosphere, 11, 520, https://doi.org/10.3390/atmos11050520.

    • Crossref
    • Search Google Scholar
    • Export Citation
Save