The Weather Research and Forecasting (WRF) Model and the Nonhydrostatic Icosahedral Atmospheric Model (NICAM) are forced with the Global Forecast System (GFS) data and run over the United Arab Emirates (UAE) for two 3-day periods: one in the cold season (16–18 December 2017) and another in the warm season (13–15 April 2018). The models’ performance is evaluated against four observational datasets: weather station observations, eddy-covariance flux measurements at Al Ain, microwave radiometer–derived temperature profiles, and twice-daily radiosonde measurements at Abu Dhabi. Both models overestimate the daily mean air temperature by 1°–3°C in both periods. This warm bias is attributed to the reduced cloud cover and resulting increased surface downward shortwave radiation flux. A comparison with the eddy-covariance data suggests that both models also underestimate the observed albedo. However, when the models predict heavier amounts of precipitation, they tend to be colder than observations, typically by 2°–3°C. NICAM and WRF overpredict the near-surface wind speed at all weather stations by roughly 1–3 m s⁻¹, which has been attributed to a poor representation of its subgrid-scale fluctuations and surface drag parameterization. WRF tends to be wetter and NICAM drier than the station observations, possibly because of differences in the cloud microphysics schemes. While the performance of both models for the near-surface fields is comparable, NICAM outperforms WRF in the simulation of vertical profiles of temperature, relative humidity, and wind speed, being able to partially correct some of the biases in the GFS data.
According to projections, arid and semiarid regions are expected to expand in a hypothetical warmer world (e.g., Lu et al. 2007; Lelieveld et al. 2016; Huang et al. 2017; Ozturk et al. 2018) and may experience more extreme weather conditions in the future (e.g., Feng et al. 2014). This makes it imperative to correctly simulate specific extreme weather events and better understand the limitations of the numerical models used. One way to gain further insight into the deficiencies of a numerical model is to conduct an ensemble of simulations in which different sets of physics/dynamics options are chosen and/or the initial/boundary conditions are perturbed (e.g., Clark et al. 2008; Evans et al. 2012). This helps in determining the optimal model configuration for a given environment (e.g., Chaouch et al. 2017), with a further improvement of the model’s performance obtained by optimizing relevant tunable parameters defined in the parameterization schemes (e.g., Quan et al. 2016; Duan et al. 2017). Another option is to compare the model’s performance to that of other numerical models that have different physics and dynamics but are forced by the same dataset. The intercomparison of model predictions has been extensively performed for more than three decades now, with emphasis on the different components of the climate system such as the atmosphere only (e.g., Xiang et al. 2017), coupled atmosphere–ocean (e.g., Meehl et al. 2005), carbon cycle (e.g., Penner et al. 2006), land surface and vegetation (e.g., Rabin et al. 2017), aerosol impacts (e.g., Giorgetta et al. 2013), and cryosphere (e.g., Farinotti et al. 2017). One of the most well-known model intercomparison projects currently underway is the Coupled Model Intercomparison Project (CMIP), which began in the mid-1990s and is now in its sixth phase, CMIP6 (Eyring et al. 2016). 
In addition to general circulation models, intercomparison projects have also been conducted with regional climate models and hydrological models (e.g., Fu et al. 2005; Hattermann et al. 2017; Tabari and Willems 2018). An example of an intercomparison project of this nature is the Coordinated Regional Climate Downscaling Experiment (CORDEX; https://www.cordex.org), a project cosponsored by the World Climate Research Programme that aims to, among other things, evaluate and improve regional climate downscaling models and techniques. Zittis et al. (2014) evaluated the performance of 12 different configurations of the Weather Research and Forecasting (WRF; Skamarock et al. 2008) Model over the Middle East and North Africa (MENA) CORDEX domain, at 50-km spatial resolution. The authors concluded that the choice of the cloud microphysics scheme has the largest impact on the temperature predictions, whereas the precipitation depends more strongly on the cumulus scheme, mainly in the tropics. The role of the cumulus scheme in the temperature and precipitation predictions over the MENA domain has also been highlighted by Almazroui et al. (2016) using the Regional Climate Model version 4 (RegCM4). Zittis and Hadjinicolaou (2016) conducted a 30-yr WRF simulation over the same region employing two radiation schemes, a simpler and a more detailed one. They found that using the more complex scheme gives more accurate temperature predictions, in particular over the deserts, helping to correct the excessive longwave radiation flux at the surface at night and the associated cold bias. An increase of the model’s horizontal resolution can potentially reduce some of the temperature biases due to (i) a better representation of the topography and land–sea mask and (ii) a better simulation of the cloud cover. The advantages of employing a higher spatial resolution have also been stressed by Bucchignani et al.
(2016a,b), for simulations with the Consortium for Small-Scale Modeling (COSMO) regional climate model (COSMO-CLM), who also found an improved performance when a more accurate parameterization of the albedo and aerosol optical depth is used in COSMO-CLM. Such intercomparison projects allow for an assessment of the strengths and weaknesses of the different models and can help in their optimization and in the design of subsequent simulations. For example, after evaluating the performance of different numerical models for a period for which observations are available, their overall skill can be ranked, and their predictions weighted accordingly (e.g., Eyring et al. 2019). In addition, the model found to perform the best for the past climate can then be used for future climate change simulations (e.g., Gaur and Simonovic 2018). A variant of this approach is to run different models over selected (short) periods, instead of conducting multiyear simulations as in intercomparison projects such as CMIP, and rely on the one found to give the most accurate predictions. This methodology allows for the use of high-resolution models at a relatively lower computational cost. This is the approach followed in this work: two different models, namely the WRF Model version 3.7.1 and the Nonhydrostatic Icosahedral Atmospheric Model (NICAM; Tomita and Satoh 2004; Satoh et al. 2014) version 16.3, are run over two relatively short periods and their predictions assessed against the available observational data. The latter is a global model that uses an icosahedral grid and has been found to perform well in the convective tropics (e.g., Fudeyasu et al. 2008; Liu et al. 2009). The target area of this study is the United Arab Emirates (UAE), an arid region in the Arabian Peninsula. While both models have been widely used worldwide, their verification in the context of a hyperarid environment like the one in the UAE has not been thoroughly addressed in the literature.
Fonseca et al. (2019) ran the WRF Model over the Atacama Desert for 2-week-long periods in the local winter season and reported very large biases in the daytime surface temperature, which exceeded 11°C. What is more, in situ surface temperature measurements were found to be very different from the satellite-derived surface temperatures from the Moderate Resolution Imaging Spectroradiometer (MODIS; Wan and Dozier 1996) payload imaging sensors. This last conclusion was also reached by Wan et al. (2002) and Li et al. (2014) and stresses the fact that ground-based observational networks are needed in such remote regions given the poorer quality of satellite-derived and model products. Gunwani and Mohan (2017) also reported that WRF overestimates the daytime air temperature in an arid region in India, in particular in the warm season, which they attribute to an overestimation of the surface downward shortwave radiation flux and its subsequent impact on the surface energy budget. The authors also found that, in addition to air temperature, there are large biases in other relevant surface and near-surface fields, such as horizontal wind speed. WRF simulations over the UAE have also highlighted deficiencies in its performance over the region. Chaouch et al. (2017) tested different planetary boundary layer (PBL) parameterization schemes in the simulation of five cold season (January–March) fog events in Abu Dhabi. Local conditions and strong nighttime inversions foster the occurrence of fog events in the UAE (Aldababseh and Temimi 2017). Local PBL schemes were found to outperform nonlocal schemes, with the Quasi-Normal Scale Elimination (QNSE; Sukoriansky et al. 2005) scheme giving the best results when the model predictions were evaluated against surface observations and radiosonde profiles. As in Weston et al.
(2018), who considered eight cold season fog events, the authors reported a cold bias, which the latter managed to partially correct by switching from the Noah land surface model (Noah LSM; Chen and Dudhia 2001; Tewari et al. 2004) to a modified version of the Noah LSM with multiparameterization options (Noah-MP; Niu et al. 2011; Z.-L. Yang et al. 2011). For a five-member physics ensemble conducted for a typical warm season convective event, Schwitalla et al. (2019) also reported a cold bias in the 2-m temperature during the morning and evening transition, in addition to an underestimation of the observed cloud cover, regardless of the cloud microphysics scheme employed. The model underestimates the observed precipitation, with the 10-m wind speed generally higher than that observed except in the evening hours. Wehbe et al. (2019) used both the standalone WRF Model and WRF coupled with its hydrological modeling extension package (WRF-Hydro) to investigate the added value of coupled land surface–atmosphere modeling in the simulation of an extreme event in the UAE, which took place in March 2016. This is particularly relevant even during dry periods due to the rapidly changing land surface conditions in the UAE with the expanding green areas and urban development (Aldababseh et al. 2018). WRF-Hydro is found to outperform the standalone WRF, even though both models exhibit similar biases. For example, both models exhibit a warm bias, which may arise from (i) the dry bias in the National Centers for Environmental Prediction Global Forecast System (GFS; GFS 2018) forcing data; (ii) the uncaptured cooling mechanisms of aerosols and sea breezes; (iii) the underestimation of the observed cloud cover; and (iv) possibly discrepancies between the predicted and observed soil moisture. The underperformance over arid regions is not restricted to WRF. For example, Ozturk et al. (2012) ran RegCM version 4 (Giorgi et al.
2012) over Central Asia and concluded that it overestimates the surface temperature in the desert regions of southwestern Asia in the warm season and underpredicts it in the cold season. This model also predicts air temperatures a couple of degrees higher than those observed in the boreal summer season in the North African deserts (Konare et al. 2008). Arid and semiarid regions also suffer from a dearth of observational data, making it difficult to properly evaluate the performance of numerical models. NICAM has been found to perform well in the convective tropics (e.g., Fudeyasu et al. 2008; Liu et al. 2009). However, and to the authors’ knowledge, its performance has not been assessed in hyperarid regions such as the UAE.
The goal of this study is to thoroughly assess the performance of NICAM version 16.3 and WRF version 3.7.1 in the UAE using diverse sources of observations: ground observations from meteorological stations distributed across the UAE, observations from an eddy-covariance station, and temperature and humidity profiles from a ground-based passive microwave radiometer and radiosonde soundings. Finally, we compare the performance obtained in the UAE with that reported in other arid/semiarid regions in an attempt to understand the models’ local limitations. Both models are run over the UAE for two 4-day periods, one in the cold season and another in the warm season. The findings of this work are particularly relevant to arid and semiarid regions and provide guidance on the configuration of numerical models for weather and climate studies.
This manuscript is structured as follows. In section 2, the experimental setup of WRF and NICAM used in the simulations is described, together with the observational data sources used for their evaluations and the verification diagnostics employed. The results for the case studies are discussed in section 3, while the main conclusions are summarized in section 4.
2. Models, methods, and diagnostics
In this study, the GFS forecast data at 0.25° × 0.25° resolution are used to provide initial and boundary conditions (including for the LSM) for the WRF Model version 3.7.1, and initial conditions for the NICAM version 16.3 for a 4-day period in the cold (15–18 December 2017) and warm (12–15 April 2018) seasons, with the first 20 h regarded as model spinup. The cold season event is characterized by rather heavy precipitation amounts in the eastern half of the country, an extreme event that may occur more frequently in a hypothetical warmer world (e.g., Polade et al. 2014). It is chosen as it features the heaviest rainfall totals over the UAE for the period when weather station data are available for evaluation. The warm season event includes the heaviest precipitation at Al Ain for the period when eddy-covariance and weather station data are available at the site, allowing for a comprehensive evaluation of the surface heat and radiative fluxes in an unstable period. The focus is on the periods from 0000 local standard time (LST; LST = UTC + 4) 16 December to 2300 LST 18 December 2017 for the cold case and from 0000 LST 13 April to 2300 LST 15 April 2018 for the warm case.
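The relation between the run start, the 20-h spinup, and the start of the analysis window can be checked with a few lines of Python. This is only a sketch: the text gives LST = UTC + 4 and the start of each analysis window, but the initialization time of 0000 UTC on the first run day is an assumption made here for illustration, as are the names `lst_to_utc`, `cases`, and `spinup_hours`.

```python
from datetime import datetime, timedelta

LST_OFFSET = timedelta(hours=4)  # LST = UTC + 4, as stated in the text


def lst_to_utc(t_lst):
    """Convert a naive local-standard-time datetime to naive UTC."""
    return t_lst - LST_OFFSET


# ASSUMPTION: both runs are initialized at 0000 UTC on the first run day.
# Each entry is (assumed initialization in UTC, analysis start in LST).
cases = {
    "cold": (datetime(2017, 12, 15, 0), datetime(2017, 12, 16, 0)),
    "warm": (datetime(2018, 4, 12, 0), datetime(2018, 4, 13, 0)),
}

spinup_hours = {}
for name, (init_utc, analysis_start_lst) in cases.items():
    analysis_start_utc = lst_to_utc(analysis_start_lst)
    # Dividing two timedeltas gives the elapsed time as a float number of hours.
    spinup_hours[name] = (analysis_start_utc - init_utc) / timedelta(hours=1)
    print(f"{name}: analysis starts {analysis_start_utc:%Y-%m-%d %H%M} UTC, "
          f"spinup = {spinup_hours[name]:.0f} h")
```

Under that assumed initialization time, the elapsed time before 0000 LST on the first analysis day matches the 20-h spinup quoted in the text for both cases.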
a. WRF configuration
WRF is set up in a two-nested configuration centered at 24°N and 54°E (Fig. 1a), with a 12-km grid covering the entire Arabian Gulf, the eastern part of the Arabian Peninsula, and southern Iran, and a 4-km grid that comprises the whole domain of the UAE. The model output for each grid is stored hourly, with that of the 4-km grid used for analysis. The albedo, vegetation fraction, and leaf area index used in the WRF runs are derived from multiyear data obtained from the National Oceanic and Atmospheric Administration (NOAA) Advanced Very High-Resolution Radiometer (AVHRR) measurements (Csiszar and Gutman 1999; Gutman and Ignatov 1998). All other land surface parameters are assigned based on the dominant land use type from 1-km AVHRR data spanning April 1992 to March 1993 (Loveland et al. 2010). Soil texture and land use types were modified to reflect their actual state, following a dedicated field campaign conducted as part of the UAE Rain Enhancement Program (UAEREP). The topography is interpolated from a 30″ (~925 m) spatial resolution dataset provided by the U.S. Geological Survey (USGS) and downloaded from the model’s website.
The physics parameterization schemes are given in Table 1. A similar setup was used in previous studies over the UAE (e.g., Chaouch et al. 2017; Weston et al. 2018; Schwitalla et al. 2019). Following Chaouch et al. (2017), the QNSE PBL scheme is employed, while Weston et al. (2018) highlighted the added value of the Noah-MP LSM, and Schwitalla et al. (2019) that of the Thompson cloud microphysics scheme. The radiation and cumulus schemes considered here are those used in Chaouch et al. (2017) and Weston et al. (2018). The radiation scheme is called every 10 min, and a spatially and temporally varying climatological aerosol distribution based on Tegen et al. (1997) is added. For the Noah-MP scheme, the default settings are considered for all user-defined parameters except for the radiative transfer option (“opt_rad”), for which the modified two-stream scheme is employed, and for the surface layer drag coefficient calculation (“opt_sfc”), for which the original Noah scheme is used. While nearly all settings are more relevant for vegetated regions, the latter is found to be important for the air temperature prediction in the UAE (e.g., Weston et al. 2018), as it allows the thermal and momentum roughness lengths to be different (Chen et al. 1997). In the soil model, four layers are considered, with thicknesses of 10, 30, 60, and 100 cm, with the soil temperature, soil moisture, and unfrozen soil moisture computed for each layer. In all simulations, nudging is applied at the lateral boundaries over a four-gridpoint transition zone, while in the top 5 km, Rayleigh damping is added to the wind components and perturbation potential temperature on a 5-s time scale (Skamarock et al. 2008). In the vertical, 46 levels are considered, the same as those employed in Weston et al. (2018), with increased vertical resolution in the PBL and the lowest level at ~30 m above ground level. The model top is set to 50 hPa (~24 km).
b. NICAM configuration
NICAM is set up in a grid division level (g-level) 7 (g7) and stretching ratio (s-ratio) 64 (s64) configuration, centered at 24.57°N and 55.20°E. In its default setting, g-level of zero, the globe is represented by an icosahedron whose 20 triangular faces are grouped into ten rhombi, each comprising two triangles. Every time the g-level is increased by one, each triangle is subdivided into four, which corresponds to a halving of the horizontal grid spacing (Tomita et al. 2002). As derived in Satoh et al. (2014), for a given g-level n with globally homogeneous grid spacing, the horizontal grid spacing Δ is approximately given by

\[ \Delta \approx \sqrt{\frac{4\pi R_{\mathrm{Earth}}^{2}}{10 \times 4^{n}}}, \]
where R_Earth is Earth’s radius. Therefore, in a g7 configuration, the horizontal grid spacing is approximately 56 km. For regional simulations, and to make better use of the computational resources, having a uniform global spatial resolution is not desirable. For such purposes, the stretch NICAM (S-NICAM; Tomita 2008b; Goto et al. 2015; Uchida et al. 2016), in which the horizontal resolution varies continuously from place to place, is used. In S-NICAM, the horizontal grid spacing is decreased near the focal point (target region) and increased near the antipodal point by a user-defined factor, using a modified Schmidt transformation (Tomita 2008b). For example, in the g7-s64 configuration centered over the UAE used here, the horizontal grid spacing around this region is reduced by a factor of 8 to ~7 km, and increased by the same factor near the antipodal point in the southeastern Pacific Ocean to ~448 km, so that the ratio of the two grid spacings is 64. The NICAM grid configuration is shown in Fig. 1b.
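The grid-spacing figures quoted above can be reproduced with a short calculation. The sketch below assumes that the uniform spacing follows Δ ≈ sqrt(4πR² / (10 × 4ⁿ)) for g-level n, which returns the ~56 km quoted for g7, and that the finest and coarsest spacings each differ from the uniform value by the square root of the stretching ratio. The Earth radius value and the function names are illustrative choices, not taken from the NICAM code.

```python
import math

R_EARTH_KM = 6371.0  # ASSUMPTION: mean Earth radius in km


def uniform_grid_spacing_km(g_level):
    """Approximate spacing of an unstretched icosahedral grid.

    Assumes roughly 10 * 4**g_level equal-area cells covering the sphere.
    """
    n_cells = 10 * 4 ** g_level
    return math.sqrt(4.0 * math.pi * R_EARTH_KM ** 2 / n_cells)


def stretched_grid_spacing_km(g_level, s_ratio):
    """Return (finest, coarsest) spacing for a stretched grid.

    Assumes the spacing is divided by sqrt(s_ratio) at the focal point and
    multiplied by sqrt(s_ratio) at the antipode, so their ratio is s_ratio.
    """
    base = uniform_grid_spacing_km(g_level)
    factor = math.sqrt(s_ratio)
    return base / factor, base * factor


base = uniform_grid_spacing_km(7)
fine, coarse = stretched_grid_spacing_km(7, 64)
print(f"g7 uniform: {base:.1f} km; g7-s64: {fine:.1f} km to {coarse:.1f} km")
```

With these assumptions the finest spacing comes out close to 7 km and the coarsest slightly below the ~448 km quoted in the text, which is obtained by multiplying the rounded 56 km by 8.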
The model settings used in this study are the same as those employed in Uchida et al. (2016) and are given in Table 1. In the MATSIRO LSM scheme, the surface energy budget is evaluated at the ground and canopy surfaces, and at snow-free and snow-covered portions of the grid cell, separately. The soil model has five layers, with thicknesses of 5, 20, 75, 100, and 200 cm, with the soil temperature, moisture, and frozen soil moisture computed for each layer. The roughness lengths of momentum and heat are estimated following Watanabe (1994). They are a function of the exchange coefficients, leaf area index, canopy height, and surface roughness lengths, the latter set to 5 × 10⁻² m for momentum and 5 × 10⁻³ m for heat. In the simulations presented in this paper, both of these values are decreased by a factor of 50, to 10⁻³ m and 10⁻⁴ m, respectively. The main motivation is that, using eddy-covariance measurements taken at Al Ain’s International Airport, the momentum roughness length was estimated to be around 10⁻³ m (Nelli et al. 2020b, manuscript submitted to Earth Space Sci.). In comparison to weather station data, NICAM gives an improved performance with these new values of the surface roughness lengths (not shown). This update to the surface roughness length in WRF will be presented in a subsequent publication. It is found to have a small impact on the model predictions (Nelli et al. 2020b, manuscript submitted to Earth Space Sci.), and hence the results of the analysis conducted here will hold even if this modification were to be made.
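To give a sense of scale for the factor-of-50 change in the momentum roughness length, the sketch below evaluates the textbook neutral logarithmic-law drag coefficient, C_D = [κ / ln(z / z0)]², at a reference height of 10 m. This is not the formulation used in MATSIRO, which follows Watanabe (1994) and also depends on stability and canopy properties. The neutral log law, the 10-m reference height, the value κ = 0.4, and the function name are all assumptions made here for illustration.

```python
import math

VON_KARMAN = 0.4  # ASSUMPTION: commonly used value of the von Karman constant


def neutral_drag_coefficient(z_ref_m, z0_m):
    """Neutral log-law drag coefficient at height z_ref_m for roughness z0_m."""
    return (VON_KARMAN / math.log(z_ref_m / z0_m)) ** 2


Z_REF_M = 10.0  # ASSUMPTION: reference height, chosen to match 10-m winds

cd_default = neutral_drag_coefficient(Z_REF_M, 5e-2)  # default momentum roughness
cd_reduced = neutral_drag_coefficient(Z_REF_M, 1e-3)  # value used in the simulations

print(f"C_D (z0 = 5e-2 m): {cd_default:.2e}")
print(f"C_D (z0 = 1e-3 m): {cd_reduced:.2e}")
print(f"ratio default/reduced: {cd_default / cd_reduced:.2f}")
```

Under these simplifying assumptions, the smaller roughness length lowers the neutral drag coefficient by roughly a factor of three, which shows why the choice of roughness length matters for near-surface wind and flux predictions.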
The albedo data are taken from the International Satellite Land Surface Climatology Project Initiative II and the vegetation data are derived from the USGS Global Land Cover Characterization Simple Biosphere 2 Model. The former is interpolated from a 150″ (~5 km) resolution dataset, whereas the latter is available at a spatial resolution of about 925 m (30″). The topography is interpolated from a 30″ dataset.
No cumulus scheme is employed in the model simulations. As explained in Satoh et al. (2010), the same set of physics parameterizations should be employed over the whole domain despite the different grid spacings, as doing otherwise may lead to side effects such as artificial noise. Here, the interest is in the finer mesh region, and no cumulus parameterization scheme is considered. The model has been run with explicit convection at this resolution in several studies (e.g., Fudeyasu et al. 2008; Liu et al. 2009). NICAM employs a mixed layer ocean model, with a mixed layer depth of 15 m and the sea surface temperatures (SSTs) nudged toward the daily GFS SSTs on a time scale of 7 days. The bulk surface fluxes over the ocean are computed following Louis (1979). The sea ice mass and fraction are read from the multimodel ensemble of the CMIP Phase 3 averaged over 1979–99. In the vertical, 40 levels are considered with the lowest layer at about 162 m and the highest at about 40 km (i.e., model top at around 7 hPa).
c. Comparison of WRF/NICAM configurations
The major difference between the WRF and NICAM setups is the type of grid used (longitude–latitude in the former, and icosahedral in the latter) and its spatial resolution. Regarding the physics parameterization schemes, they are largely similar (e.g., both models are run with a 1.5-order local closure PBL scheme, and the WRF innermost grid and NICAM are run with explicit convection). The radiation schemes are of comparable complexity, both using a correlated-k approach, with the RRTMG shortwave and RRTM longwave having a total of 30 spectral bands for wavenumbers 10–50 000 cm⁻¹, while MSTRNX_AR5 has 29 spectral bands for the same range of wavenumbers. Both cloud microphysics schemes, Thompson and NSW6, have prognostic equations for six classes of water substance, namely, water vapor, cloud water, rainwater, snow, cloud ice, and graupel. However, Thompson is a double-moment scheme, in which the number concentrations of cloud ice and rainwater are explicitly predicted, while NSW6 is a single-moment scheme. Hong et al. (2010) found that double-moment schemes generate less rainfall in the light precipitation categories and enhanced rainfall in the moderate and heavy categories. This can be attributed to the fact that double-moment microphysics schemes, like the Thompson scheme used in WRF, typically overpredict the size of the raindrops, leading to faster fall speeds and a more rapid removal of the rainwater from the column, while single-moment schemes normally exhibit a stratiform-like behavior (e.g., Otkin et al. 2006). This contrasting behavior between single-moment and double-moment schemes has been found to be the case for the cloud microphysics schemes used in the WRF Model and considered here [e.g., for a summertime convective event in the Nepalese Himalayas (Orr et al. 2017), for an atmospheric river event in California (Jankov et al. 2011), and for summertime convective events in the United States (Cintineo et al. 2014; Putnam et al. 2017)].
Since one of the events targeted in this study is convective in nature, this suggests that WRF may overestimate the observed rainfall amounts, as reported by Wehbe et al. (2019) over the UAE with the Morrison double-moment scheme (Morrison et al. 2009). The LSMs employed in WRF and NICAM also share many similarities (e.g., they are one-dimensional, employ the bulk aerodynamic formula in the estimation of the surface heat fluxes, and allow the roughness lengths for momentum and heat to differ). For arid regions mostly devoid of vegetation, the main differences between the two are in the configuration of the soil model. In the Noah-MP LSM, there are four levels with the soil bottom at 2 m, whereas in MATSIRO there are five levels with the soil bottom at 4 m. The groundwater table is located below the soil bottom layer in WRF (Niu et al. 2011). In NICAM the mean water table depth is placed within the first unsaturated soil layer (for dry soils at the soil bottom), with the local water table depth estimated using the mean water table depth and the topography of the site (Stieglitz et al. 1997; Takata et al. 2003). The deeper water table and soil bottom in NICAM may lead to reduced surface moisture and hence a larger amplitude diurnal cycle with warmer daytime surface temperatures (e.g., Fan et al. 2007). Regarding the static fields, the topography used in the model runs is interpolated from the same dataset, the dominant land-use category over the UAE is barren or sparsely vegetated, and the surface albedo values over the country are comparable, mostly in the range 20%–30%.
d. Observational datasets and verification diagnostics
The models’ predictions are evaluated against four observational datasets:
Hourly automatic weather station (AWS) and airport station data at 35 sites spread out over the country, shown as circles in Fig. 1c, provided by the National Center of Meteorology (NCM). Fields available include air temperature, relative humidity (RH), water vapor mixing ratio, horizontal wind direction and speed, and for the 30 AWS stations, the downward shortwave radiation flux at the surface. At all stations the daily accumulated precipitation is also provided.
Eddy-covariance measurements at Al Ain’s Airport (24.2617°N, 55.6092°E; Nelli et al. 2020a) for the April 2018 event. Fields available include surface downward/upward shortwave/longwave radiation fluxes, sensible and latent heat fluxes, and ground heat flux every 30 min.
Twice daily radiosonde profiles at Abu Dhabi’s International Airport (24.4331°N, 54.6511°E).
Vertical temperature profiles estimated from measurements by a microwave radiometer (MWR) located at Masdar Institute in Abu Dhabi (24.4364°N, 54.6119°E; Temimi et al. 2020). The MWR also provides humidity measurements, but only temperature profiles are used for evaluation. This is justified because (i) the humidity biases have a rather large magnitude in the lower troposphere, exceeding 6 g kg⁻¹, and (ii) the elevation scanning mode gives high vertical resolution scans in the boundary layer for temperature only. Further details about the MWR performance can be found in Temimi et al. (2020).
The WRF and NICAM performance is assessed using three of the verification diagnostics proposed by Koh et al. (2012), namely the model bias, normalized bias μ, and normalized error variance α, defined in the appendix. The model bias is the mean discrepancy between its predictions and the observations. The normalized bias is the ratio of the bias to the standard deviation of the discrepancy between the model forecasts and observations. This score is used to assess whether the biases are significant: if |μ| < 0.3, the contribution of the bias to the root-mean-square error (RMSE) is less than ~5%, and hence the biases can be considered as not significant. The normalized error variance is the variance of the error arising from the discrepancies between the model predictions and observations, divided by the combined modeled and observed variances. It is given by

\[ \alpha = 1 - \rho\eta, \]
where ρ is the correlation and η is the variance similarity. The former is a measure of the phase agreement between the model forecasts and observations, while the latter gives an indication of how well the signal amplitude predicted by the model agrees with that observed. Hence, α accounts for both phase and amplitude errors and has values in the range [0, 2]. The optimal performance corresponds to zero bias, μ, and α. In other words, the closer these scores are to zero, the more skillful the model predictions are. For a random forecast based on the climatological mean, ρ = 0 and therefore α = 1, meaning that for a model forecast to be deemed practically useful, α < 1. This diagnostic is nondimensional, symmetric with respect to the observations and forecasts, and applicable to both scalar and vector variables, making it ideal for the evaluation performed in this work. Further details are given in the appendix and in Koh et al. (2012).
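As an illustration of how these scores relate to one another for a scalar field, the sketch below computes them for a synthetic forecast and observation series. It follows the verbal definitions given above (bias as the mean discrepancy, μ as the bias divided by the standard deviation of the discrepancy, α as the error variance divided by the sum of the forecast and observed variances). The explicit form of the variance similarity, η = 2σ_Fσ_O / (σ_F² + σ_O²), is an assumption chosen here so that α = 1 − ρη holds; the exact definitions are those in the appendix and in Koh et al. (2012). The function name and the synthetic data are purely illustrative.

```python
import math
from statistics import fmean, pvariance


def verification_scores(forecast, observed):
    """Bias, normalized bias (mu), correlation (rho), variance similarity (eta),
    and normalized error variance (alpha) for two equal-length scalar series."""
    discrepancy = [f - o for f, o in zip(forecast, observed)]
    bias = fmean(discrepancy)
    var_d = pvariance(discrepancy)
    var_f = pvariance(forecast)
    var_o = pvariance(observed)
    mean_f, mean_o = fmean(forecast), fmean(observed)
    cov = fmean([(f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed)])
    rho = cov / math.sqrt(var_f * var_o)
    # ASSUMPTION: variance similarity defined so that alpha = 1 - rho * eta.
    eta = 2.0 * math.sqrt(var_f * var_o) / (var_f + var_o)
    return {
        "bias": bias,
        "mu": bias / math.sqrt(var_d),
        "rho": rho,
        "eta": eta,
        "alpha": var_d / (var_f + var_o),
    }


# Synthetic 3-day hourly series (illustrative only): the "forecast" is 2 degC
# too warm, lags the "observations" by 1 h, and has a larger diurnal amplitude.
hours = range(72)
observed = [25.0 + 5.0 * math.sin(2 * math.pi * h / 24) for h in hours]
forecast = [27.0 + 6.0 * math.sin(2 * math.pi * (h - 1) / 24) for h in hours]

scores = verification_scores(forecast, observed)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this synthetic case α is well below 1, because the phase and amplitude errors are small, whereas |μ| exceeds 0.3, so the bias would be flagged as significant under the threshold used in the text.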
3. Model simulations
a. Event synoptics
The cold season event takes place in mid-December 2017 and features rather wet weather conditions in particular in the eastern half of the country. Figure 2a shows the 3-day accumulated precipitation for the 35 NCM stations. The highest total is for Al Foah, station 7 in Fig. 1c located close to the Oman border, where 125 mm of rain was recorded on 17 December. Figure 2c shows the averaged mean sea level pressure and 10-m horizontal wind vectors for the 3-day period. A weak low pressure system prevailed over the Arabian Sea most of the time, with the associated convection giving rise to the observed very heavy rainfall amounts. This system eventually moved inland on 18 December (Fig. 3). The winds blew from an easterly to northeasterly direction over most of the UAE, in association with the counterclockwise circulation around the low-pressure center. The vast majority of the precipitation occurred in a single day, on 17 December. The reduced cloud cover, in particular on 18 December and over the western half of the country, coupled with strong radiative cooling and weak wind speeds at night, led to the formation of fog at some locations, a common occurrence in the UAE (e.g., de Villiers and van Heerden 2007).
The warm season event in mid-April 2018, on the other hand, featured rather dry and warm weather conditions (Fig. 4), which are typical in the UAE. As seen in Fig. 2b, precipitation in this period was only recorded around Al Ain, station 32 in Fig. 1c, with three-day totals generally below 10 mm. As with the cold season event, the majority of the recorded rainfall occurred in a single day, 14 April. Figure 2d shows the large-scale circulation as given by the GFS data used to force the models. A weak low pressure system moved over the country, associated with the cold front bringing light precipitation to some of the stations on the high terrain. The predominant wind direction over the UAE was northwesterly to northeasterly, which is the common background circulation over the Arabian Gulf throughout the year (e.g., Eager et al. 2008).
b. Cold season event (16–18 December 2017)
Figures 5 and 6 show the bias and normalized error variance scores for air temperature, relative humidity, water vapor mixing ratio, and horizontal wind for the 3-day period for WRF and NICAM, respectively. Figure 7 shows the time series of these fields for Madinat Zayed, station 19 in Fig. 1c. Both models, NICAM in particular, have a tendency to overestimate the observed air temperature, typically by about 1°–3°C. This arises mostly from warmer daytime temperatures (see, e.g., Fig. 7), which can be attributed to an overestimation of the surface downward shortwave radiation flux (not shown), suggesting reduced cloud cover compared to observations. This is confirmed when the predicted cloud amounts are visually compared with those given by the Spinning Enhanced Visible and Infrared Imager (SEVIRI; Schmetz et al. 2002) instrument (not shown). A similar result was obtained by Gunwani and Mohan (2017) in WRF simulations over an arid region in India. Several studies have reported a tendency of the WRF Model to underestimate the observed cloud cover (e.g., Kumar et al. 2012; Diaz et al. 2015; Ruiz-Arias et al. 2016; Wehbe et al. 2019; Schwitalla et al. 2019). In particular, Schwitalla et al. (2019) noted a tendency of WRF to generate fewer clouds than those observed over the UAE for a summertime convective event, and Wehbe et al. (2019) for an extreme event in the cold season. NICAM has also been found to underpredict the observed cloud amounts, in particular, midtropospheric clouds (e.g., Kodama et al. 2012). This can arise from deficiencies in the cloud microphysics scheme, and/or the coarse vertical resolution of the model grids (Satoh et al. 2010). The exception is when heavier precipitation amounts are predicted by the models, in which case they tend to be colder than observations, typically by 2°–3°C, as seen by comparing Figs. 5 and 6 with Fig. 8. An example is Jabal Hafeet, station 16 in Fig. 1c. Further inspection of Figs.
5 and 6 reveals that NICAM has generally higher air temperature biases compared to WRF, due to a more significant overestimation of the daytime temperatures (see, e.g., Fig. 7). This is consistent with the deeper soil bottom and groundwater table depth in the MATSIRO LSM, described in section 2c. The scores are generally less than 0.4, except mainly in the mountainous terrain in the northeastern part of the country and in WRF, indicating that the WRF and NICAM temperature forecasts can be considered skillful. Here, the higher values of arise mostly from lower correlations and hence phase errors largely dominate over amplitude errors. The temperature diurnal cycle at this time of the year is suppressed compared to that in the warm season due to the shorter daylight hours and reduced solar heating of the landmasses (e.g., Elagib and Abdu 1997; Merquiol et al. 2002), and for the cold season period also because of the higher amounts of cloud cover. WRF and NICAM have deficiencies in capturing the observed diurnal temperature variability in particular at the high-elevation stations, where the surface is rougher and the cloud cover is more significant (not shown). This explains the lower values of ρ and hence at these stations.
As opposed to air temperature, RH is generally underpredicted by both models. The water vapor mixing ratio biases are rather small, generally less than 1 g kg−1, except in coastal sites in NICAM and in some of the high-elevation stations in both WRF and NICAM. Hence, the underestimation of the RH can be largely attributed to the overestimation of the air temperature. In NICAM, it is interesting to note that while in inland stations the RH tends to be underpredicted, in coastal stations it is generally higher than observations. At these stations the water vapor mixing ratio is overestimated by up to 4 g kg−1, suggesting that the moister air does not make its way inland, consistent with the warmer air temperatures inland.
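The link between a warm bias and an RH underestimation at nearly unchanged mixing ratio can be illustrated with a short calculation. The sketch below is not taken from either model; it assumes Bolton's (1980) approximation for the saturation vapor pressure, and the inputs (a mixing ratio of 8 g kg−1, a pressure of 1000 hPa, and temperatures of 20° and 22°C) are hypothetical values chosen only for illustration.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Saturation vapor pressure (hPa) from Bolton's (1980) approximation."""
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))

def relative_humidity(mixing_ratio_g_kg, temp_c, pressure_hpa):
    """RH (%) as the ratio of the mixing ratio to its saturation value."""
    es = saturation_vapor_pressure_hpa(temp_c)
    ws = 622.0 * es / (pressure_hpa - es)  # saturation mixing ratio, g/kg
    return 100.0 * mixing_ratio_g_kg / ws

# Hypothetical near-surface values: same moisture, model 2 degC too warm
rh_obs = relative_humidity(8.0, 20.0, 1000.0)
rh_model = relative_humidity(8.0, 22.0, 1000.0)
print(f"RH observed: {rh_obs:.1f}%, RH with +2 degC bias: {rh_model:.1f}%")
```

With these assumed inputs, a 2°C warm bias alone lowers the RH by several percentage points even though the mixing ratio is identical.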
While the overall pattern looks similar, the α scores for RH are slightly poorer compared to the temperature ones. However, α is still less than 1, except for a few stations over the mountains. As with the temperature, this poorer performance seems to arise mostly from phase errors. In particular, ρ is rather low for the water vapor mixing ratio, suggesting a deficient simulation of the near-surface moisture variability. A possible explanation is an incorrect representation of the observed precipitation and evaporation. Figure 8 shows the WRF and NICAM daily accumulated precipitation bias over the 3-day period. By and large, WRF tends to overestimate the observed precipitation and NICAM underestimates it. A possible explanation is the use of a double-moment cloud microphysics scheme in WRF: as highlighted in section 2c, double-moment schemes are known to overpredict moderate and heavy rainfall events when compared to single-moment microphysics schemes such as that employed in NICAM (e.g., Otkin et al. 2006; Hong et al. 2010). The biases have a larger magnitude mostly over the northeastern part of the country, possibly due to an incorrect representation of the topography, where the α values for the water vapor mixing ratio are lower.
Both models overpredict the strength of the near-surface wind at all 35 stations, at both inland/coastal and low/high-elevation sites, with biases generally in the range 1–3 m s−1. The magnitude of the wind speed bias found here is comparable to that reported by Gunwani and Mohan (2017) over an arid region in India, by Cheng and Steenburgh (2005) over the western United States, and by Carvalho et al. (2014) and Jiménez and Dudhia (2012) over Portugal and Spain, respectively. These uncertainties may arise from an incorrect setting of hard-coded parameters used in the LSMs (e.g., Cuntz et al. 2016), and/or in the PBL schemes. Both WRF and NICAM are run with a 1.5-order local closure PBL scheme. While local PBL schemes have been shown to outperform nonlocal PBL schemes for stable environments and fog prediction in the UAE (Chaouch et al. 2017), for less-stable boundary layers they are known to not fully account for the deeper vertical mixing associated with the larger-scale eddies (Cohen et al. 2015). The WRF simulation was repeated with a nonlocal PBL scheme, the Yonsei University (YSU; Hong et al. 2006) scheme, and the verification diagnostics for the horizontal wind are found to be comparable (not shown). Gunwani and Mohan (2017) attributed the higher biases of WRF’s near-surface horizontal wind predictions to an incorrect simulation of its subgrid-scale local fluctuations as well as a deficient representation of the surface drag. This may also be the case here for both WRF and NICAM, which overestimate the strength of the near-surface wind and do not capture its temporal variability well. An inaccurate representation of the strength of the near-surface wind may also arise from an erroneous simulation of the observed land and sea surface temperature differences (e.g., Eager et al. 2008). The values of α for the horizontal wind vector are the highest for all variables and both models. This arises from smaller values of ρ, which is generally in the range 0.1–0.7, being negative for some of the stations in the high terrain, where its prediction is more challenging.
In Fig. 9a, the predicted vertical profiles are compared to those obtained from radiosondes launched at Abu Dhabi’s International Airport. Shown are the temperature difference and the ratio of the water vapor mixing ratio with respect to observations for WRF, NICAM, and GFS, together with the observed and model-predicted RH and horizontal wind speed profiles from 1013 to 100 hPa, averaged over 0000 and 1200 UTC from 16 to 18 December 2017 (a total of six time steps). To generate these plots, the WRF, NICAM, and GFS profiles are interpolated in log-pressure coordinates to the set of pressure levels at which observed data are available.
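The interpolation step can be sketched as follows. This is a minimal illustration, assuming linear interpolation in ln(p) with NumPy; the function name and the sample profile are hypothetical and are not the processing code used for Fig. 9.

```python
import numpy as np

def interp_log_pressure(p_model_hpa, values, p_obs_hpa):
    """Interpolate a model profile linearly in ln(p) to observed pressure levels."""
    p_model = np.asarray(p_model_hpa, dtype=float)
    vals = np.asarray(values, dtype=float)
    order = np.argsort(p_model)  # np.interp requires increasing x values
    x_model = np.log(p_model[order])
    x_obs = np.log(np.asarray(p_obs_hpa, dtype=float))
    # Note: np.interp holds the end values constant outside the model range
    return np.interp(x_obs, x_model, vals[order])

# Hypothetical model temperature profile (degC) and radiosonde levels (hPa)
p_model = [1000.0, 850.0, 700.0, 500.0]
t_model = [20.0, 12.0, 4.0, -12.0]
p_obs = [925.0, 600.0]
t_on_obs = interp_log_pressure(p_model, t_model, p_obs)
print(t_on_obs)
```

Working in ln(p) rather than p reflects the roughly logarithmic decrease of pressure with height, so the interpolation is closer to being linear in altitude.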
While at the near-surface WRF and NICAM perform comparably, Fig. 9a shows that the latter generally gives a more accurate representation of the observed vertical profiles for this case study. Except at 100 hPa, the NICAM-predicted temperatures are within 1°C of those observed, with both positive and negative biases. For WRF, however, the biases can exceed 2°C, with the model being warmer than observations from the surface up to ~650 hPa (~3.5 km), and colder above. These temperature biases are consistent with the differences in water vapor mixing ratio: at all levels below about 700 hPa, WRF underestimates the observed moisture amounts by as much as 35%, while the NICAM biases comprise both signs and do not exceed 18%. An analysis of the diabatic heating profiles revealed that the drier (and warmer) atmosphere in WRF arises mostly from the strong sensible heating of the atmosphere, with a contribution from radiative heating at very low levels (not shown). The tendency of WRF to overpredict the temperature at lower levels has been noted [e.g., by Shin and Hong (2011) in simulations over Kansas in October 1999, and by Cremades et al. (2011) in runs over central-western Argentina in the local summer season of 2007]. Above 400 hPa for WRF and 700 hPa for NICAM, both models overestimate the observed water vapor mixing ratio, as is the case in the GFS data used to force the models, even though the amount of moisture present at these levels is roughly two to three orders of magnitude smaller than that just above the surface. The temperature profile predicted by WRF is very similar to that in the GFS data used to force the model as can be seen in Fig. 9a, indicating that NICAM is able to at least partially correct these errors. The RH vertical profile reflects the referred temperature and water vapor mixing ratio discrepancies. 
Despite simulating the overall phase of the observed horizontal wind speed profile, and except below 900 hPa, both WRF and NICAM underestimate its strength by as much as 10 m s−1. The GFS horizontal wind vertical profile exhibits a similar tendency, suggesting that the WRF and NICAM biases may be tied to the errors in the forcing data.
The discrepancies in the temperature vertical profile with respect to radiosonde estimates shown in Fig. 9a are consistent with those with respect to the MWR measurements at Masdar Institute shown in Fig. 10, as expected, since the MWR temperature profiles are found to be in close agreement with the radiosonde profiles (Temimi et al. 2020). Here, the solid lines denote PBL height estimates, defined as the height in the lowest 5 km at which the vertical potential temperature gradient is maximized. Mostly below 2.5 km, and in particular on 17 and 18 December, the WRF temperature biases can exceed +4°C, resulting in a longer-lasting convective boundary layer. The highest temperature biases for both WRF and NICAM are seen during the daytime and are consistent with the higher downward shortwave radiative fluxes at the surface and reduced cloud cover noted earlier in comparison with station data. Despite differences in the bottom 2 km, in particular on 17 and 18 December, the WRF and NICAM bias plots share some similarities, such as the negative biases between 1 and 3 km and above 4 km on 16 December and above 2 km on 17 December. These discrepancies are also seen in the comparison with radiosonde data for the available times and can be attributed to an overestimation of the observed water vapor mixing ratio (not shown).
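The PBL height definition used for Fig. 10 (the height in the lowest 5 km at which the vertical potential temperature gradient is largest) can be sketched as below. This is only an illustration: the finite-difference gradient, the assignment of the result to the layer midpoint, and the sample profile are assumptions rather than the procedure actually applied to the MWR and model data.

```python
import numpy as np

def pbl_height_m(height_m, theta_k, max_height_m=5000.0):
    """Height (m) of the largest vertical potential temperature gradient
    within the lowest max_height_m; heights must be strictly increasing."""
    z = np.asarray(height_m, dtype=float)
    theta = np.asarray(theta_k, dtype=float)
    keep = z <= max_height_m
    z, theta = z[keep], theta[keep]
    dtheta_dz = np.diff(theta) / np.diff(z)  # gradient between adjacent levels
    z_mid = 0.5 * (z[:-1] + z[1:])           # assign each gradient to the layer midpoint
    return z_mid[np.argmax(dtheta_dz)]

# Hypothetical profile: well-mixed layer capped by an inversion near 1-1.5 km
z = [0.0, 500.0, 1000.0, 1500.0, 2000.0, 3000.0, 6000.0]
theta = [300.0, 300.0, 300.2, 303.0, 304.0, 307.0, 320.0]
print(pbl_height_m(z, theta))
```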
c. Warm season event (13–15 April 2018)
Figures 11 and 12 show the verification diagnostics for air temperature, relative humidity and horizontal wind for WRF and NICAM, respectively, for this event. Figure 13 shows the time series of the referred fields for Fujairah, station 34 in Fig. 1c.
As was the case for the winter period, both WRF and NICAM generally underestimate the observed cloud cover, as concluded by visually comparing the two (not shown) and in line with other studies such as Gunwani and Mohan (2017), Schwitalla et al. (2019), and Wehbe et al. (2019), leading to enhanced downward shortwave radiation fluxes at the surface and warmer daytime air temperatures. The exception is when heavier than observed precipitation amounts are predicted, as seen by comparing Figs. 11 and 12 with Fig. 8. An example of this is NICAM at Abu Dhabi, station 31 in Fig. 1c. At this station, the model predicts 22 mm of rainfall at the end of the day on 13 April, with the associated cooling giving a daily mean temperature over the 3-day period roughly 2.24°C below that observed (Fig. 14). In addition, at a few inland stations, NICAM predicts lower nighttime temperatures than those observed, possibly due to clearer skies and enhanced radiative cooling and/or the deeper soil bottom and groundwater table depth highlighted in section 2c. Numerical models also exhibit nighttime cold biases over desert regions due to excessive longwave radiation fluxes at the surface, which can be corrected by employing a more sophisticated radiation scheme (Zittis and Hadjinicolaou 2016). A higher spatial resolution can also alleviate the temperature biases by allowing for a better representation of the topography and simulation of the observed cloud cover. The colder nighttime temperatures explain why the daily mean temperature bias is negative at some of the sites in Figs. 11 and 12. Despite biases of comparable magnitude, however, the α scores are generally lower (i.e., more skillful model predictions) for this event, in particular for WRF. An inspection of the ρ and η scores showed that the improvement in α is mostly due to an increase in ρ. In April, the daylight hours are longer and the Sun is higher in the sky than in December, which, together with the drier conditions observed in this period, leads to a larger-amplitude temperature diurnal cycle that both models seem to be more capable of simulating (not shown).
The RH biases are of a comparable magnitude to those in the winter case. However, in this season, the water vapor mixing ratio biases are of a larger magnitude, but still generally less than 2 g kg−1. This arises because the air is moister due to enhanced evaporation from the Arabian Gulf, resulting from a higher solar elevation angle and reduced cloud cover (e.g., Swift and Bower 2003). Despite this, however, the RH biases seem to be largely controlled by the air temperature biases, being generally positive when the latter are negative and vice versa. As for the other period, the ρ scores are lower than for the air temperature, mostly because of a poor simulation of the observed water vapor mixing ratio variability (not shown). This may arise from errors in the representation of the precipitation and evaporation. The former is generally overestimated by WRF and underestimated by NICAM, as concluded for the winter case and seen in Fig. 8, which can be attributed to the differences in the cloud microphysics schemes (section 2c). The horizontal wind skill scores are similar to those given in Figs. 5 and 6, with both models overestimating the strength of the near-surface wind by comparable amounts. For the winter case, this was attributed to deficiencies in the surface drag parameterization and in the representation of the subgrid-scale wind variability, which are also likely to play a role in this season.
For this period, eddy-covariance flux measurements at Al Ain’s International Airport are available (Nelli et al. 2020a) and used for model evaluation. Figure 15 shows the hourly upward and downward shortwave and longwave radiative fluxes, together with the net radiation flux, ground heat flux, sensible and latent heat fluxes at the surface for the 3-day period. Both models tend to overestimate the downward shortwave radiation flux, which is consistent with the reduced amounts of cloud cover. However, and at least for this station, they also underestimate the upward shortwave flux, in particular, WRF. This suggests that the surface albedo, defined as the ratio of the upward to the downward shortwave radiation fluxes, in the models is lower than that in observations. In particular, it is estimated to be around 0.314 for the observations, in line with the values quoted in the literature which are about 0.3 (Nelli et al. 2020a), 0.262 for NICAM and 0.216 for WRF. Should this underprediction of the albedo be the case elsewhere in the UAE, it can also explain the warmer daytime temperatures predicted by both models. However, and in light of the results of Fonseca et al. (2019), the lack of clouds and potential biases in the LSMs are likely to play the largest role in the daytime air temperature biases. As a result of this, the ground surface is warmer during the daytime, leading to an enhanced upward longwave radiation flux, mostly on 13 and 14 April. Out of all shortwave and longwave flux components, the downward longwave exhibits the smallest diurnal cycle amplitude. It is essentially driven by the atmospheric emissivity and moisture content which have a diminished diurnal variability (e.g., Wang and Dickinson 2013). The WRF and NICAM biases for this field are smaller and comprise both signs without any clear tendency.
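The albedo estimate and the size of the model underestimation can be illustrated with a short calculation. The sketch below assumes that the albedo is taken as the ratio of the daytime sums of the upward and downward shortwave fluxes, which may differ from the exact procedure used for Fig. 15; the hourly fluxes and the 50 W m−2 daytime threshold are hypothetical, while the three albedo values are those quoted in the text.

```python
import numpy as np

def surface_albedo(sw_up, sw_down, min_sw_down=50.0):
    """Albedo as the ratio of summed upward to summed downward shortwave flux,
    using only time steps with sw_down above a (hypothetical) daytime threshold."""
    up = np.asarray(sw_up, dtype=float)
    down = np.asarray(sw_down, dtype=float)
    daytime = down > min_sw_down
    return up[daytime].sum() / down[daytime].sum()

# Hypothetical hourly shortwave fluxes (W m-2) for a surface with albedo 0.3
sw_down = np.array([0.0, 100.0, 400.0, 800.0, 400.0, 100.0, 0.0])
sw_up = 0.3 * sw_down
albedo = surface_albedo(sw_up, sw_down)

# Relative underestimation implied by the albedo values quoted in the text
albedo_obs, albedo_nicam, albedo_wrf = 0.314, 0.262, 0.216
rel_nicam = (albedo_obs - albedo_nicam) / albedo_obs
rel_wrf = (albedo_obs - albedo_wrf) / albedo_obs
print(f"albedo = {albedo:.3f}; NICAM {rel_nicam:.0%} and WRF {rel_wrf:.0%} below observed")
```

In relative terms, the quoted values correspond to an albedo roughly 17% (NICAM) and 31% (WRF) lower than that estimated from the observations.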
As a result of the biases in the shortwave radiation fluxes, the net radiative flux at the surface predicted by both models is generally higher than that observed during daytime. This explains the increased ground heat flux, as more heat is transported downward into the soil, as well as the higher sensible heat flux. At night, the surface net radiative flux predicted by both models is more comparable to that observed, at times on the lower side, mainly because of an underestimation of the downward longwave radiation flux. This may be related to the drier atmosphere in the models. Given this, the ground and sensible heat fluxes in WRF, NICAM, and observations are more similar at night. Regarding the latent heat flux, it is interesting to note that the daytime maximum in NICAM is roughly 7 times larger than that in WRF. This may be explained by the higher net radiation flux in NICAM, even though in this model the top soil layer is roughly 30%–60% drier compared to that in WRF (not shown). However, in comparison with the observed estimates, the latent heat flux in NICAM is larger on 13 April but up to 3 times smaller on 15 April. While the discrepancy in the former day may be attributed to the stronger heating of the surface and subsequent evaporation, the sudden increase in the observed latent heat flux on 15 April is likely related to the precipitation that fell on 14 April, roughly 5.4 mm, and the resulting moister soil, largely underestimated by both WRF and NICAM (Fig. 8).
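For reference, the relationship between the fluxes discussed above can be summarized by the surface radiation and energy budgets. The form below is a standard textbook statement rather than an expression taken from either model; the symbols and the sign convention (radiative fluxes positive in their named direction, sensible heat flux $H$ and latent heat flux $LE$ positive upward, ground heat flux $G$ positive downward) are assumptions made here for illustration:

$$R_n = SW_{\downarrow} - SW_{\uparrow} + LW_{\downarrow} - LW_{\uparrow}, \qquad R_n \approx H + LE + G.$$

Under this convention, an overestimated $SW_{\downarrow}$ combined with an underestimated $SW_{\uparrow}$ increases $R_n$, and the excess has to be balanced by larger $H$, $LE$, and/or $G$.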
Figure 9b shows a comparison of the WRF, NICAM, and GFS predictions with radiosonde data at Abu Dhabi’s International Airport for the temperature, water vapor mixing ratio, RH, and horizontal wind speed averaged over the full period (13–15 April 2018). The performance of both models is comparable to that obtained in the winter case (Fig. 9a), with WRF being drier and warmer than observations below 550 hPa and both models overestimating the observed water vapor mixing ratio, in particular above 300 hPa. Regarding WRF, an inspection of the diabatic heating profiles showed that the warmer biases are mostly due to the radiative and sensible heating of the atmosphere, in line with the lower amounts of moisture (not shown). As opposed to the other case, however, NICAM is colder than observations below 700 hPa, by up to 2.5°C. This results mostly from a significant underestimation of the observed temperature at 1200 UTC 14 April, which arises from heavy precipitation amounts predicted by the model on 13 April, in excess of 20 mm, that are not observed (Figs. 8 and 14). The magnitude of the temperature and water vapor mixing ratio biases is also larger in this season but exhibits a similar pattern. As in the winter event, NICAM outperforms WRF for all fields shown, capturing the RH profile below 600 hPa relatively well, with a drier layer between 875 and 925 hPa and a moister layer between 600 and 700 hPa. The horizontal wind speed is better captured in this season, in particular its magnitude, which is generally higher than in the winter event, as is also the case in the GFS data used to force the models.
Figure 14 shows the temperature profiles in the bottom 5 km from the MWR located at Masdar Institute in Abu Dhabi, and the WRF and NICAM biases for the full 3-day period. As was the case in the winter event, both models are generally warmer than observations in the bottom 1 km during the afternoon hours, but the biases are reduced on 14 April, with NICAM exhibiting a very significant cold bias. During this period, the precipitation recorded at Abu Dhabi’s Airport was negligible (0.01 mm), with NICAM, in particular, predicting heavier amounts, roughly 22 mm at the end of the day on 13 April (Fig. 8). The timing of the precipitation coincides with a pronounced cold bias, in particular below 3 km. As a result of the moister surface and some cloud cover, it was cooler in NICAM than in observations on 14 April, with a rather suppressed boundary layer with maximum depths below 1 km. On 15 April, the warmer temperatures above the surface returned. Figure 14 stresses the value of having MWR data for model evaluation: its higher vertical and temporal resolution allows for a more detailed assessment when compared to the twice-daily radiosonde vertical profiles. The negative temperature biases above 3 km in both WRF and NICAM are also seen in the radiosonde data and can be attributed to a moister environment in the models, as is the case in the winter event (Fig. 10).
4. Discussion and conclusions
In this paper, WRF and NICAM are run over a 3-day period in the cold (16–18 December 2017) and warm (13–15 April 2018) seasons over the UAE, an arid country located in the Middle East. The former was characterized by very heavy rainfall, in particular over the eastern half of the country, with some stations reporting more than 100 mm in a single day. This was brought about by a low pressure system lingering over the Arabian Sea and associated convective activity. The latter was rather dry, except around Al Ain where the total precipitation recorded was just under 10 mm day−1, in association with a weak frontal system that moved over the country. The winter event is chosen due to the more extreme weather conditions observed in the country (Kumar et al. 2015), featuring the heaviest rainfall totals over the UAE for the period when weather station data are available for evaluation, an episode that may occur more frequently in the future (e.g., Shi et al. 2007; T. Yang et al. 2011). The warm season event comprises the heaviest precipitation amounts at Al Ain for a period when eddy-covariance measurements at the site are available for evaluation, allowing for a detailed assessment of the surface heat and radiative fluxes when unstable weather conditions occurred.
The main goal of the work is to better understand the limitations of the two models in a hyperarid environment, where numerical models are known to underperform and satellite-derived meteorological variables, such as surface temperature, are not very reliable (e.g., Gunwani and Mohan 2017; Ozturk et al. 2012; Wehbe et al. 2017, 2018; Fonseca et al. 2019). This study is particularly relevant as desert regions are predicted to expand in a hypothetical warmer world and are known to be very sensitive to climate change (e.g., Feng et al. 2014; Huang et al. 2017; Kumar et al. 2017; Aldababseh et al. 2018). Understanding the reasons behind and quantifying the magnitude of model biases is a necessary step to increase confidence in model projections of climate change.
WRF and NICAM are forced with GFS forecast data at a 0.25° × 0.25° horizontal grid spacing, initialized at 0600 UTC 15 December and 13 April, and run for 4 days with the first 20 h regarded as model spinup. WRF is a regional climate model, and in this work is set up in a two-nest configuration (12–4 km), with the hourly output of the innermost grid, which covers the entire UAE, used for analysis. Schwitalla et al. (2019) and Wehbe et al. (2019) investigated the performance of the WRF Model for a summertime convective event and an extreme event in the cold season, respectively. Both studies highlighted similar biases such as a tendency of the model to underestimate the observed cloud cover. However, there are also some differences; for example, Schwitalla et al. (2019) noted a cold bias in the 2-m temperature during the morning and evening transition, while the nighttime temperature biases in Wehbe et al. (2019) were much reduced; Schwitalla et al. (2019) found that WRF underestimates the observed precipitation, while in Wehbe et al. (2019) the results were mixed with some stations reporting excessive precipitation and others less than that observed. In any case, both studies highlight the capability of WRF to simulate convective events, justifying its choice in this study. NICAM is a global model (i.e., only initial conditions are needed to perform the simulations), run here in a stretched configuration (g7-s64), with the spatial resolution increased around the UAE to about 7 km, and decreased at the antipodal point (southeastern Pacific), to ~450 km. The NICAM output is also stored hourly. The distinct setup of the two models, in particular regarding the grid configuration and physics parameterization schemes, can explain the different responses in the two events. NICAM has been found to generate skillful predictions in the convective tropics (e.g., Fudeyasu et al. 2008; Liu et al. 2009), but to the authors’ knowledge its performance has not been evaluated in arid/semiarid regions such as the UAE. Given its strengths in the deep tropics, it is of interest to assess how it performs in the dry subtropical regions. The evaluation against the observational data revealed systematic biases in the two models present in both events:
A tendency to overestimate the daily mean air temperature, primarily attributed to an overprediction of the maximum temperature, typically by 1°–3°C. This can be attributed to (i) an underestimation of the surface albedo, as found in the comparison with eddy-covariance measurements at Al Ain in the April event, with an observed albedo of around 0.314 contrasting with 0.262 for NICAM and 0.216 for WRF; (ii) the underprediction of the observed cloud cover; and (iii) deficiencies in the LSMs. Fonseca et al. (2019), in simulations over the Atacama Desert, found that an increase in the surface albedo of 15% leads to a decrease of the daytime surface temperature by about 0.5°–1°C, and a change in the air temperature of an even smaller magnitude, not exceeding 0.5°C. Given this, factors (ii) and (iii) are likely to play the largest roles in the models’ air temperature biases. When the models predict heavier precipitation amounts, however, they tend to be colder than observations, by up to 2°–3°C.
By and large, WRF overestimates the observed precipitation amounts and NICAM underestimates them. A possible explanation lies in the nature of the cloud microphysics scheme employed in the models: the double-moment scheme used in WRF may overpredict moderate and heavy precipitation events, as highlighted in section 2c, and in line with published works (e.g., Hong et al. 2010; Jankov et al. 2011; Cintineo et al. 2014; Orr et al. 2017; Putnam et al. 2017). The larger precipitation biases occur over high terrain in the eastern part of the country and may also be partially explained by an incorrect representation of the observed topography.
Overestimation of the observed near-surface wind speed at all stations and both periods, typically by 1–3 m s−1, with a poor simulation of the diurnal variability of the horizontal wind vector (i.e., α generally above 0.4). This is in line with published works (e.g., Cheng and Steenburgh 2005; Jiménez and Dudhia 2012; Gunwani and Mohan 2017). Following Gunwani and Mohan (2017), the incorrect simulation of the near-surface wind can be attributed to an unrealistic representation of its subgrid-scale fluctuations and of the surface drag parameterization. This is consistent with the fact that the largest wind speed biases and the worst α scores are seen over areas with high terrain.
The model-predicted vertical profiles of temperature, water vapor mixing ratio, RH and horizontal wind speed are evaluated against those observed by radiosondes at Abu Dhabi’s Airport. NICAM is found to outperform WRF for all fields and events, with an improved performance in the December event. For this model and case study, the temperature, RH, and horizontal wind speed biases are generally within 1°C, 20%, and 5 m s−1, respectively. Some of the model biases are also seen in the GFS forcing data, with NICAM being able to at least partly correct some of those errors. The discrepancies in the temperature profile noted with respect to the radiosonde measurements are also seen in the comparison with the MWR data.
The simulations conducted in this paper highlight the limitations of both models, with neither clearly outperforming the other. To successfully simulate the weather conditions in arid regions like the UAE, WRF and NICAM need to be properly set up, with potential modifications to the cloud microphysics and LSMs, as well as to the resolution of the model grid. In addition, having access to high-resolution in situ observational data is paramount so as to conduct an in-depth evaluation of their performance and gain further insight into their deficiencies. For example, it is concluded that having a correct estimate of the observed surface albedo is important, stressing the need to measure it in dedicated field campaigns, while the choice of the momentum roughness length seems to be less relevant. In addition, intensive in situ observations of soil moisture are needed in the region (Fares et al. 2013; AlJassar et al. 2019), where, unlike in midlatitude regions, such measurements are scarce (Temimi et al. 2014). Such an effort could be useful to verify retrievals from missions like the NASA Soil Moisture Active Passive (SMAP) mission, which could enhance their potential to be used in NWP models. An extension of this work would be to consider additional cases and perturb the experimental setup by testing different physics options and changing the values of relevant surface parameters to improve the models’ performance. Once this is achieved, a long-term simulation can be conducted to learn about the local-scale meteorology. This will be presented in a subsequent paper.
This material is based on work supported by the National Center of Meteorology (NCM), Abu Dhabi, United Arab Emirates (UAE), under the UAE Research Program for Rain Enhancement Science (UAEREP). We acknowledge the NCM for kindly providing radiosonde data at Abu Dhabi’s International Airport through the University of Wyoming’s website and the weather station data used for model evaluation. Thomas Schwitalla, Hans-Dieter Wizemann, and Volker Wulfmeyer from the University of Hohenheim are acknowledged for their contributions to this work as part of the Optimizing Cloud Seeding by Advanced Remote Sensing and Land Cover Modification (OCAL) project, funded by the UAEREP. We would also like to thank three anonymous reviewers for their detailed and insightful comments and suggestions that helped to improve the quality of the paper.
In the equations below, D is the discrepancy between the model forecast F and the observations O; ⟨F⟩ and ⟨O⟩ are the mean of F and O, respectively; σO, σF, and σD are the standard deviation of the observations, model forecasts, and discrepancy between the model predictions and observations, respectively; μ is the normalized bias; ρ is the correlation; η is the variance similarity; and α is the normalized error variance:
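Written out following the definitions above and the formulation of Koh et al. (2012), the scores take the form below (the notation here is a reconstruction and may differ slightly from the published layout):

$$
\begin{aligned}
D &= F - O, \\
\mu &= \frac{\langle D \rangle}{\sigma_D}, \\
\rho &= \frac{\langle (F - \langle F \rangle)(O - \langle O \rangle) \rangle}{\sigma_F \, \sigma_O}, \\
\eta &= \frac{\sigma_F \, \sigma_O}{\tfrac{1}{2}\left(\sigma_F^2 + \sigma_O^2\right)}, \\
\alpha &= \frac{\sigma_D^2}{\sigma_F^2 + \sigma_O^2} = 1 - \rho \, \eta .
\end{aligned}
$$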
For a random forecast based on the climatological mean, ρ = 0 and hence α = 1. Therefore, for a model forecast to be regarded as practically useful, α < 1. More details about these skill scores can be found in Koh et al. (2012).
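The skill scores used throughout the paper can be computed in a few lines. The sketch below assumes the definitions of Koh et al. (2012), namely D = F − O, μ = ⟨D⟩/σD, η = σFσO/[(σF² + σO²)/2], and α = 1 − ρη, with population (not sample) standard deviations; the function name and the sample temperature series are hypothetical, and this is not the verification code used to produce the figures.

```python
import numpy as np

def skill_scores(forecast, observed):
    """Bias, normalized bias (mu), correlation (rho), variance similarity (eta),
    and normalized error variance (alpha) for paired forecast/observation series."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    d = f - o                                   # discrepancy D = F - O
    sigma_f, sigma_o, sigma_d = f.std(), o.std(), d.std()
    bias = d.mean()
    mu = bias / sigma_d                         # undefined if D is constant
    rho = np.mean((f - f.mean()) * (o - o.mean())) / (sigma_f * sigma_o)
    eta = sigma_f * sigma_o / (0.5 * (sigma_f**2 + sigma_o**2))
    alpha = 1.0 - rho * eta                     # equals sigma_d**2 / (sigma_f**2 + sigma_o**2)
    return {"bias": bias, "mu": mu, "rho": rho, "eta": eta, "alpha": alpha}

# Hypothetical air temperature series (degC) at one station
obs = [20.0, 24.0, 30.0, 27.0, 22.0]
fcst = [22.0, 25.0, 33.0, 30.0, 23.0]
scores = skill_scores(fcst, obs)
print(scores)
```

Because α combines the phase information in ρ with the amplitude information in η, a low α requires both a high correlation and similar forecast and observed variances.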