Through a case study of Hurricane Arthur (2014), the Weather Research and Forecasting (WRF) Model and the Finite Volume Coastal Ocean Model (FVCOM) are used to investigate the sensitivity of storm surge forecasts to physics parameterizations and configurations of the initial and boundary conditions in WRF. The turbulence closure scheme in the planetary boundary layer affects the prediction of the storm intensity: the local closure scheme produces lower equivalent potential temperature than the nonlocal closure schemes, leading to significant reductions in the maximum surface wind speed and surge heights. On the other hand, higher-class cloud microphysics schemes overpredict the wind speed, resulting in large overpredictions of storm surge at some coastal locations. Without cumulus parameterization in the outermost domain, both the wind speed and storm surge are grossly underpredicted as a result of large precipitation decreases in the storm center. None of the choices for the WRF physics parameterization schemes significantly affect the prediction of Arthur’s track. Sea surface temperature affects the latent heat release from the ocean surface and thus storm intensity and storm surge predictions. The large-scale atmospheric circulation models provide the initial and boundary conditions for WRF, and influence both the track and intensity predictions, thereby changing the spatial distribution of storm surge along the coastline. These sensitivity analyses underline the need to use an ensemble modeling approach to improve the storm surge forecasts.
Storm surge represents a major threat to coastal communities. Many efforts have been devoted to predicting storm surge. In a recent Coastal and Ocean Modeling Testbed (COMT) project, Kerr et al. (2013b) found that unstructured-grid coastal models including the Advanced Circulation (ADCIRC) model (Luettich et al. 1992), the Finite Volume Community Ocean Model (FVCOM; Chen et al. 2003), and the Semi-Implicit Eulerian Lagrangian Finite Element (SELFE) model (Zhang and Baptista 2008) had better predictive skill than the official National Oceanic and Atmospheric Administration (NOAA) Sea, Lake, and Overland Surges from Hurricanes (SLOSH) operational model, when used to hindcast storm surges generated by Hurricanes Ike (2008) and Rita (2005) in the Gulf of Mexico. A parallel study by Chen et al. (2013) showed that the same three unstructured-grid models achieved similar skill to each other in hindcasting storm surges generated by extratropical storms when the same mesh, meteorological forcing, and initial/boundary conditions were used. These modeling studies typically focused on improving the hydrodynamic model prediction of tides, storm surge, and waves. For example, Kerr et al. (2013a) found that the storm surge generated by Ike was sensitive to the bottom friction parameterization, especially in shelf waters where a strong shore-parallel coastal current was a key to the storm’s geostrophic setup. To obtain accurate predictions of surge water levels overland and in inland waters, Bunya et al. (2010) and Dietrich et al. (2010) employed a spatially varying bottom friction coefficient to account for different types of land surfaces such as salt marshes. Furthermore, studies by Chen et al. (2013) and Beardsley et al. (2013) showed that wave–current interactions could change the direction of storm-induced currents and increase onshore water transport and overland inundation even though they only had moderate effects on the peak surge height.
Storm surge prediction has also been shown to be sensitive to atmospheric forcing (Peng et al. 2004; Weisberg and Zheng 2006; Irish et al. 2008; Rego and Li 2009). Earlier studies used idealized wind models such as parametric surface winds that assume an idealized stationary, symmetric cyclone (Peng et al. 2004), or the planetary boundary layer (PBL) hurricane wind model (Scheffner and Fitzpatrick 1997). In a numerical study of storm surges in the Albemarle–Pamlico Sound for 10 hypothetical hurricanes, Peng et al. (2004) showed that the storm surge was sensitive to both the minimum sea level pressure (MSLP) and the radius of maximum wind (RMW). Using a high-resolution FVCOM model of Tampa Bay, Florida, Weisberg and Zheng (2006) found that the storm surge height was highly dependent on the storm’s track, intensity, and forward propagation speed. Later, Irish et al. (2008) underlined the importance of the storm size and showed that the peak surge height over mild sloping coastal regions could vary by 30% for a reasonable range of the storm size. In another study, Rego and Li (2009) noted the importance of the hurricane forward speed in storm surge prediction and found that resonance amplified the peak surge when the hurricane’s forward speed was comparable to the shallow-water wave propagation speed. However, neither the parametric surface wind model nor the PBL wind model could capture mesoscale wind structures in hurricanes and provide realistic wind fields needed to predict storm surge along complex coastlines. Therefore, sensitivity analyses using the idealized hurricane models are of limited value for improving the storm surge forecast for a specific storm, even though their computational efficiency makes them well suited for producing flood hazard maps in coastal areas.
To obtain accurate storm surge predictions, realistic meteorological forcing fields are needed to drive the hydrodynamic models. NOAA’s Hurricane Research Division used to generate real-time hurricane winds (H*Wind) from surface wind observations on buoys, automated observation platforms, ships, etc. (Powell et al. 1998), but the H*Wind winds were only available prior to landfall. Consequently, surface wind and air pressure predictions from mesoscale atmospheric models such as the fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5) and the Weather Research and Forecasting (WRF) Model have been used to drive the storm surge models (Lin et al. 2010; Zhong et al. 2010; Di Liberto et al. 2011; Chen et al. 2013; Georgas et al. 2014; Wang et al. 2014; Zambon et al. 2014). Most of these models were run in hindcast mode. Typically, the meteorological forecast that provided the best storm surge prediction was used to drive the hydrodynamic model. Given the large number of parameterization schemes used in the mesoscale atmospheric models, however, it is not clear which parameterization schemes should be used when making forecasts for storm surge. Moreover, little is known about how uncertainties in the atmospheric model prediction affect the hydrodynamic model prediction of storm surge.
The WRF Model has emerged as a preferred mesoscale atmospheric model for hurricane prediction (e.g., Liu et al. 1997; Davis et al. 2008; Li and Pu 2008; Nolan et al. 2009a, b; Lin et al. 2010; Zambon et al. 2014). Many of the important physical processes in tropical cyclones, however, cannot be resolved with WRF and must be parameterized. On the smallest scales, these include the transfer of heat, moisture, and momentum at the air–sea interface, as well as the microphysics of cloud formation and precipitation. Several schemes have been developed to calculate the surface fluxes and vertical mixing within the PBL, including the classic Mellor–Yamada scheme (Mellor and Yamada 1982; Janjic 1994) and nonlocal schemes (Zhang and Anthes 1982; Hong et al. 2006; Hong and Kim 2008). Each PBL scheme includes a surface-layer scheme that parameterizes the momentum, moisture, and heat fluxes within the lowest 10% of the PBL (e.g., Pleim 2006; Jiménez et al. 2012). Various cloud microphysics (CM) schemes are used to simulate hydrometeor phase changes, differing in their complexity and hydrometeor species (Hong et al. 2004; Hong and Lim 2006). Braun and Tao (2000) found pronounced sensitivities of high-resolution intensity simulations of Hurricane Bob (1991) to four different PBL schemes. Nolan et al. (2009a,b) investigated how PBL schemes affected the prediction of Hurricane Isabel’s (2003) intensity and forward propagation speed, and suggested the modifications that significantly improved the intensity prediction. Using a diagnostic tropical cyclone PBL model, Kepert (2012) tested a number of PBL schemes and found that the Louis and Mellor–Yamada schemes performed best. Li and Pu (2008) showed that numerical simulations of the early intensification of Hurricane Emily (2005) were very sensitive to the choice of CM and PBL schemes. Similarly, Zhu and Zhang (2006a) highlighted the possible sensitivity of inner-core structures to changes in CM. Other studies have shown that hurricane prediction is sensitive to the configurations of initial and boundary conditions in WRF (e.g., Davis et al. 2008; Lin et al. 2010). Kilic and Raible (2013) showed that hurricane tracks and intensities were affected by mesoscale differences in the sea surface temperature (SST) used as the WRF bottom boundary conditions. Recently, Glenn et al. (2016) and Seroka et al. (2016) demonstrated that resolving spatiotemporal changes in SST was the key to avoiding the intensity overprediction for Hurricane Irene (2011) over the Mid-Atlantic Bight. Glenn et al. (2016) further showed that ahead-of-eye-center cooling in the coastal ocean was an important factor influencing the intensity of landfalling hurricanes and tropical storms.
Most of the previous sensitivity analyses focused on the effects of physics parameterizations on hurricane simulation in the atmosphere (Kepert 2012; Zhu et al. 2014; Zhu and Zhu 2015; Zhang et al. 2015; Zhang and Marks 2015; Penny et al. 2016). Little research has been done to investigate how storm surge prediction depends on the physics parameterizations in the mesoscale atmospheric model. It remains unclear how uncertainties in the atmospheric model forecast propagate to the ocean model and produce uncertainties in the storm surge forecast. In this study we use an atmosphere–ocean model to conduct sensitivity analyses of the storm surge prediction. By running experiments with different physics parameterizations and different configurations of the initial and boundary conditions in the regional atmospheric model, we will examine how they affect the prediction of storm surge. Hurricane Arthur, which made landfall in North Carolina as a category 2 hurricane on 4 July 2014, is selected for this case study. Arthur produced substantial flooding in North Carolina and other mid-Atlantic states. Before Arthur’s landfall, a multi-institutional team of researchers from the Woods Hole Oceanographic Institution, Rutgers University, the University of Maryland Center for Environmental Science, the University of Maine, and the Gulf of Maine Research Institute deployed an array of storm gliders and storm data buoys in the Mid-Atlantic Bight, in order to investigate the impact of the shelf conditions on the storm intensity. This paper is part of that larger group effort to evaluate the storm surge prediction for Hurricane Arthur.
This paper is organized as follows. Section 2 describes the regional atmosphere and ocean models as well as the design of the sensitivity analyses experiments. Section 3 presents the storm surge prediction from the control model run. Section 4 shows the sensitivity analyses of storm surge prediction to physics parameterizations in WRF, and section 5 shows the sensitivity analyses of storm surge prediction to initial and boundary conditions in WRF. Section 6 presents a discussion and conclusions.
2. Description of atmosphere–ocean models and numerical experiment
To predict the storm surge generated by Hurricane Arthur, we use the Advanced Research version of WRF (version 3.8; Skamarock et al. 2008) to simulate the hurricane dynamics in the atmosphere, along with FVCOM to simulate the oceanic response to the hurricane forcing. In this atmosphere–ocean model system, hourly outputs of surface wind and air pressure fields from WRF are used to force FVCOM.
We have configured triply nested model domains for WRF (Fig. 1a). The outermost domain has a coarse resolution of 12 km and covers the western Atlantic such that WRF simulates most of the storm’s life cycle, starting from its initial formation off the coast of southeastern Florida to its eventual transition to an extratropical storm off the Gulf of Maine. The middle domain covers the south and middle Atlantic regions at a resolution of 4 km. The innermost domain uses a fine resolution of 1.33 km to resolve the region most affected by the storm surge, extending from the Outer Banks, North Carolina, to Chesapeake Bay, Virginia–Maryland. WRF hourly outputs of surface wind and air pressure from all the three domains are used to force FVCOM, but the analysis of storm dynamics is based on the results from WRF’s outermost domain. After testing 35, 40, and 50 vertical sigma levels in WRF, the choice was made to use 40 levels because that number had the best agreement with the National Hurricane Center (NHC) best-track storm intensity. At the lateral boundaries of its outermost domain, WRF is forced by the 0.5° Global Forecast System (GFS) products (http://www.emc.ncep.noaa.gov/GFS/.php) issued at 1900 LST 1 July with updates at 3-h intervals for the 5-day forecast period (i.e., a single cycle). At the ocean surface, WRF is forced by outputs from 0.5° real-time global SST (http://polar.ncep.noaa.gov/sst/rtg_low_res/) at 1-day intervals. GFS is used to initialize WRF at 1900 LST 1 July 2014, approximately 48 h prior to Arthur’s landfall at the Outer Banks. This provides sufficient time for Hurricane Arthur to spin up from its initial vortex state downscaled from GFS.
FVCOM is used to configure a coupled estuary-shelf model that includes Albemarle–Pamlico Sound, Chesapeake Bay, Delaware Bay, and the eastern U.S. continental shelf (Fig. 1b). The eastern boundary is placed several hundred kilometers from the coast while the southern and northern boundaries are roughly perpendicular to the coast, located at 34° and 41°N, respectively. The horizontal resolution ranges from ~1 km in the inner shelf to ~10 km near the open boundaries. Three submodel domains extending from the western boundary resolve Albemarle–Pamlico Sound, Chesapeake Bay, and Delaware Bay at a resolution of 0.2–1.0 km. The model is run in a two-dimensional barotropic mode in which temperature and salinity are kept constant. At the offshore open boundary, the sea level is prescribed using 10 tidal constituents according to the Oregon State University global tidal model TOPEX/Poseidon (TPXO), version 7.1 (Egbert and Erofeeva 2002). River discharges in the upper tributaries of the two estuaries are prescribed using data from U.S. Geological Survey water gauge stations. A quadratic stress is exerted at the bed, with the drag coefficient calculated from a spatially uniform Manning coefficient of 0.02 (Arcement and Schneider 1989). FVCOM is hot started at 1900 LST 1 July 2014 from a hindcast simulation that began at 0000 LST 1 May 2014, in order to allow tides to ramp up before the arrival of Hurricane Arthur and produce sufficiently long sea level time series that can be used for the tidal harmonic analysis later.
A series of numerical experiments has been conducted to investigate how the physics parameterizations and model configurations in WRF affect the storm surge prediction. Table 1 lists the physics parameterization schemes and initial and boundary condition configurations in the control run. The sensitivity analysis runs explore three PBL schemes, four CM parameterizations, three options of cumulus parameterization (CP), five SST products, and three different large-scale atmospheric forecasting products for prescribing the initial and lateral boundary conditions of WRF (Table 2). In each sensitivity analysis run, only one parameterization scheme or one model configuration is changed from the control run while everything else remains unchanged. No changes are made to the FVCOM configuration.
3. Storm surge prediction
Results from the control run are presented and compared against the atmospheric and oceanic observations. The predicted track compares well to the best track provided by NHC (Fig. 2a). Hurricane Arthur moved northeastward in the South Atlantic Bight on 2–3 July. It made landfall near Beaufort, North Carolina, on the Outer Banks around 2200 LST 3 July, and exited Albemarle–Pamlico Sound several hours later. Arthur continued on its northeastward path over the Mid-Atlantic Bight on 4 July but moved at a much faster speed. WRF accurately predicts Arthur’s track and forward propagation speed. The root-mean-square error (RMSE) between the predicted and observed tracks is 63.9 km, which is mainly due to a westward bias originating from the westward shift of the large-scale atmospheric circulation in GFS. Although the westward bias could affect the intensity prediction by increasing the time the storm spends over land, Fig. 2b shows that the predicted MSLP closely matches the observations. The observed MSLP dropped from 995 mb on 2 July to about 970 mb (1 mb = 1 hPa) on 4 July, indicating a rapid intensification of the storm intensity over this period. WRF essentially captures this intensification process. It also reproduces Arthur’s subsequent weakening over the Mid-Atlantic Bight as it encountered colder water. Arthur attained a maximum sustained wind (MSW) of 45 m s−1 at 10 m above the sea surface. This is well simulated by WRF, although the predicted peak speed of MSW lags the observed peak wind by several hours (Fig. 2b). The RMSEs for MSLP and MSW are 4.7 mb and 4.8 m s−1, respectively, while the correlation coefficients are 0.97 and 0.88.
Additional model–data comparisons are made using the measurements of surface pressure and wind speed at six tidal stations (Figs. 3a,b). These stations are selected because the Outer Banks of North Carolina and the Virginia coast encountered the highest storm surges whereas the mid- to upper Chesapeake Bay experienced the largest sea level drop (Berg 2015). WRF generally captures the observed rapid drop of surface air pressure at Hatteras, Oregon Inlet, Duck, and the Chesapeake Bay Bridge–Tunnel (CBBT), although it overpredicts the air pressure drop at Duck and underpredicts the air pressure drops at Oregon Inlet and Hatteras. Since Hurricane Arthur’s path cut through Albemarle–Pamlico Sound, winds blew northeastward at Hatteras and Oregon Inlet but southwestward at Duck and CBBT (Fig. 3b). WRF reproduces this regional difference in the wind direction. Farther north in Chesapeake Bay (e.g., Baltimore and Annapolis, Maryland), the surface air pressure slowly increased as Arthur moved away toward the northeast. The lower than observed pressure at Baltimore and Annapolis, as well as the offsets in surface air pressure at Duck, Oregon Inlet, and Hatteras, are related to WRF’s westward bias in the track prediction. Winds at the northern Chesapeake Bay locations primarily blew in the southward direction, and their magnitudes and directions are well predicted by WRF.
To examine how well WRF–FVCOM forecasts the storm surge generated by Hurricane Arthur, we compare the time series of the total and subtidal sea levels at the six tidal gauge stations (Figs. 3c,d). Stations Hatteras and Oregon Inlet are located just inside the Pamlico Sound and have weak tides. Thus, the storm surge dominated the sea level response at these sites. In contrast, sea level at Duck and CBBT included both tidal and storm surge components. The southward winds blowing down Chesapeake Bay drove the water level down at Baltimore and Annapolis. FVCOM captures these diverse sea level responses. Figure 3d compares the predicted and observed storm surges after tidal signals are removed through harmonic analysis. Hurricane Arthur caused a storm surge of about 1.5 m at Oregon Inlet and a sea level drop of 0.5 m at Baltimore. The RMSEs averaged for the total and subtidal sea levels at the six tidal gauge stations are 0.15 and 0.14 m, respectively, and the averaged correlation coefficients are 0.89 and 0.83.
4. Sensitivity of storm surge prediction to WRF physics parameterizations
Now we explore the sensitivity of the storm surge prediction to the physics parameterization schemes in WRF.
a. The planetary boundary layer
Three widely used PBL schemes in WRF are tested in this study: 1) the Yonsei University (YSU) scheme in run PBL-YSU [i.e., the control run; Hong et al. (2006)]; 2) the Asymmetric Convective Model version 2 (ACM2) scheme in run PBL-ACM2 (Pleim 2007); and 3) the Mellor–Yamada–Janjić 1.5-order (MYJ) scheme in run PBL-MYJ (Janjić 1994). MYJ is a local closure scheme. In contrast, YSU and ACM2 are nonlocal closure schemes. YSU parameterizes the nonlocal fluxes implicitly via a parameterized nonlocal term, while ACM2 treats the nonlocal fluxes explicitly.
The predicted storm tracks are very similar to each other among the three model runs, all exhibiting a slight westward bias from the observed track (Fig. 4a). However, the predicted storm intensities are quite different. Runs PBL-YSU and PBL-ACM2 predict significantly lower values of MSLP on 3–4 July than run PBL-MYJ (Fig. 4b). Similarly, MSW reaches ~44 m s−1 at the peak intensity of the storm in runs PBL-YSU and PBL-ACM2 versus ~36 m s−1 in run PBL-MYJ (Fig. 4c). MSLP in run PBL-MYJ increases between 1200 and 2200 LST 3 July, indicating a weakening of the storm, whereas the MSLP values in the other two runs remain the same. To understand this intensity difference between run PBL-YSU/PBL-ACM2 and run PBL-MYJ, we calculate the equivalent potential temperature θe inside the PBL because the moist entropy flux was previously shown to affect storm intensity (Malkus and Riehl 1960; Holland 1997). We choose a cross section through the storm’s center and average θe between 1200 and 2200 LST 3 July. The cross section is chosen to be in the southwest-to-northeast direction to capture the left–right and the front–back asymmetries in the hurricane field. As shown in Figs. 4d–f, θe in run PBL-MYJ is about 4 K lower than that in the other two runs. This result is consistent with that of Zhang and Zheng (2004), who showed that the MYJ PBL scheme tends to produce a colder PBL than that produced by the Blackadar scheme (Zhang and Anthes 1982) and the Medium-Range Forecast (MRF; Hong and Pan 1996) PBL scheme, upon which the ACM2 and the YSU PBL schemes are based, respectively. The colder PBL with the MYJ scheme may be attributed partly to less upward transfer of the net surface heat fluxes (Fig. 4g), possibly through the paired Eta similarity surface-layer scheme, and partly to the lack of countergradient heat fluxes from the capped inversion in the MYJ scheme (Zhang and Zheng 2004). Starting from late 4 July, a rapid weakening process was initiated in all three model runs as Arthur encountered colder SSTs at higher latitudes over the North Atlantic.
As a result of the intensity difference, the surge heights are different between the runs with the local and nonlocal PBL closure schemes. The storm surge predicted by run PBL-ACM2 is similar to that in run PBL-YSU and will not be discussed further. Figure 5 compares two snapshots of the wind field and nontidal sea level distribution between runs PBL-YSU and PBL-MYJ. Because of a slight difference in the forward propagation speed, the storm position in run PBL-MYJ lags behind that in run PBL-YSU by about 1 h (see the insert in Fig. 4a). The snapshots in Fig. 5 are taken when the storm simulated in the two model runs was at roughly the same geographic location rather than at an identical clock time. In the first snapshot taken around 0500 LST 4 July (Figs. 5a,b,e,f), Hurricane Arthur had just moved out of Albemarle–Pamlico Sound through the northern arc of the Outer Banks. Run PBL-YSU predicts the strong onshore winds ahead of the storm’s center, thus producing high sea levels along the Virginia coast and its adjacent shelf (Figs. 5a,b). The onshore winds predicted by run PBL-MYJ are weaker and the resulting storm surges are weaker (Figs. 5e,f). On the backside of the storm, the winds and storm surge between the two model runs have relatively smaller differences than those ahead of the storm’s center.
In the second snapshot taken around 1500 LST 4 July, Hurricane Arthur moved past 39°N, at the same latitude as the northern end of Chesapeake Bay (Figs. 5c,d,g,h). The southward winds to the left of the storm drove water away from the upper part of the estuary. Large differences in the wind speed magnitude (up to 6 m s−1) are found between runs PBL-YSU and PBL-MYJ, particularly in the middle and lower parts of the estuary (Figs. 5c,g). This results in a larger sea level depression in the upper Chesapeake Bay in run PBL-YSU than in run PBL-MYJ (Figs. 5d,h).
In WRF, the PBL schemes are paired with specific surface physics parameterizations. YSU and ACM2 are paired with the revised MM5 similarity surface-layer scheme (Jiménez et al. 2012), while MYJ is paired with the Eta similarity surface-layer scheme (Janjić 1996). It is possible that the surface-layer parameterization may have contributed to the differences in the Arthur simulations between runs PBL-YSU/PBL-ACM2 and PBL-MYJ. As a step toward discerning the effect of the surface physics parameterization, we explored two surface-layer parameterization schemes in PBL-YSU: the revised MM5 similarity scheme and the old MM5 similarity scheme (Grell et al. 1994). There were virtually no differences in storm track and MSLP between the two model runs. The differences in the storm surge heights were within a few centimeters.
b. Cloud microphysics
The four CM schemes used for the sensitivity analysis are the WRF single-moment 3-class (WSM3) scheme in run CM-WSM3 (i.e., the control run), the WRF single-moment 5-class (WSM5) scheme in run CM-WSM5, the WRF single-moment 6-class (WSM6) scheme in run CM-WSM6, and the WRF double-moment 6-class (WDM6) scheme in run CM-WDM6. WSM3 considers three categories of hydrometeors: vapor, cloud water/rain, and ice/snow. In WSM3, cloud water/rain turns into ice/snow instantly at 0°C, prohibiting the formation of supercooled water (Hong et al. 2004). WSM5 is a mixed-phase scheme, including vapor, cloud water, rain, ice, and snow (Hong et al. 2004). WSM6 extends WSM5 by including graupel (Hong and Lim 2006). Both WSM5 and WSM6 account for the behavior of supercooled water. In addition to predicting the mixing ratios of the six hydrometeors as in WSM6, WDM6 includes prognostic number concentrations of cloud, rainwater, and cloud condensation nuclei (double moment) to improve the representation of warm-phase processes (Lim and Hong 2010).
All four CM sensitivity runs produce nearly identical storm tracks (not shown). Overall, the higher-class CM schemes produce a stronger hurricane, but the MSLP and MSW values predicted by run CM-WSM3 are in better agreement with the observations (Figs. 6a,b). The minimum MSLP is 963 mb in run CM-WSM6, 964 mb in run CM-WSM5, 966 mb in run CM-WDM6, and 974 mb in run CM-WSM3 while the observed minimum MSLP is 972 mb. The MSW is 45 m s−1 in run CM-WSM3 versus 50 m s−1 in run CM-WSM6, amounting to an 11% difference. To understand what caused these different intensity predictions, we compare the cross-sectional distribution (in the southwest-to-northeast direction) of the temperature anomaly among runs CM-WSM3, CM-WSM5, CM-WSM6, and CM-WDM6 across the troposphere, up to 200 mb (Figs. 6e–h). The temperature anomaly is calculated by removing the mean temperature at each pressure level and averaging between 1200 LST 2 July and 1200 LST 3 July. The warm core at the storm’s center is located around 400 mb or roughly 7.5 km. It is smaller and colder in run CM-WSM3 than in the other runs. The maximum temperature anomaly is 10 K in runs CM-WSM5, CM-WSM6, and CM-WDM6, but less than 9 K in run CM-WSM3. The 8-K anomaly contour extends from 600 to 200 mb in runs CM-WSM5, CM-WSM6, and CM-WDM6, but is limited between 520 and 320 mb in run CM-WSM3. The weak temperature anomaly is an indication that less latent heat is released in the storm simulated with the CM-WSM3 scheme (e.g., Li and Pu 2008). Because of its relatively simple representation of cloud physics, WSM3 predicts less precipitation, less snow production, and hence less latent heat release to fuel the storm. In contrast, the higher-class CM schemes (runs CM-WSM5, CM-WSM6, and CM-WDM6) predict a much faster rate of intensification and a lower minimum value of MSLP (Fig. 6a). Compared to run CM-WSM6, run CM-WDM6 offers a more realistic simulation of the storm intensity and MSW after 1200 LST 3 July. To understand this difference, we calculate the vertical distribution of the temperature anomaly through the storm’s center and average it over two periods: one between 1200 LST 2 July and 1200 LST 3 July (before the lowest MSLP) and the other between 1200 LST 3 July and 1200 LST 4 July (after the lowest MSLP). Although the temperature anomaly predicted in run CM-WDM6 is nearly the same as in run CM-WSM6 during the former period (Fig. 6c), it lies halfway between the temperature anomalies in runs CM-WSM6 and CM-WSM3 during the later period (Fig. 6d). This explains why the MSLP and MSW predicted in run CM-WDM6 are in better agreement with the observations than run CM-WSM6 during the decelerating phase of the storm.
The overprediction of the storm intensity in run CM-WSM6 leads to the overprediction of the storm surge (Fig. 7). The storm surge produced in run CM-WSM5 is nearly the same as that in run CM-WSM6 (not shown). At 0500 LST 4 July, when Arthur had just left Albemarle–Pamlico Sound through the northern arc of the Outer Banks, run CM-WSM6 predicts a larger storm size and stronger winds than run CM-WSM3 (Figs. 7a,b). Compared to run CM-WSM6, the winds in run CM-WDM6 are weaker, a result of the rapid weakening that started at 1200 LST 3 July. The winds in the southeast and northwest quadrants of the storm are up to 10 m s−1 stronger in run CM-WSM6 than in runs CM-WSM3 and CM-WDM6. Consequently, the storm surges at the Virginia coast and northern Outer Banks are much higher in run CM-WSM6 (Figs. 7d–f). For example, the peak surge height at Duck is 0.65 m in run CM-WSM3, 0.88 m in run CM-WDM6, and 0.97 m in run CM-WSM6 while the observed surge height was 0.61 m. Hence, run CM-WSM6 overpredicts the surge height at Duck by ~60%. Inside Albemarle–Pamlico Sound and around the southern arc of the Outer Banks, the predicted sea level heights show smaller differences among the three model runs. The peak surge height at Oregon Inlet is 1.52 m in run CM-WSM3, 1.48 m in run CM-WDM6, and 1.68 m in run CM-WSM6 while the observed surge height was 1.26 m.
c. Cumulus parameterization
A cumulus parameterization scheme assumes the generation of subgrid-scale clouds that vertically transport heat and water vapor without requiring the presence of grid-box saturation (Zhang et al. 1988; Molinari and Dudek 1992). It is one of the key parameters controlling the storm intensification process, especially when grid spacing is several kilometers or larger (Smith 2000). In the control run (run CP-ON12_OFF4), WRF employs the Kain–Fritsch CP scheme (Kain 2004) for the 12-km WRF domain because it assumes that the model is unable to resolve cumulus convection at this coarse spatial scale (Weisman et al. 1997), but CP is not activated in the 4-km WRF domain. To examine if CP affects the storm and storm surge prediction, we conduct two model runs: run CP-OFF12_OFF4, in which CP is turned off in all domains, and run CP-ON12-ON4, in which CP is turned on in both the 12- and 4-km WRF domains. No CP is applied to the 1.33-km WRF domain in either of these runs.
All CP sensitivity analysis runs produce nearly identical storm tracks (not shown). Switching off CP in the 12-km domain results in a dramatic reduction in the storm intensity: the minimum MSLP increases from 972 to 990 mb while the maximum MSW decreases from 44 to 32 m s−1 (29% reduction) (Figs. 8a,b). This large reduction in storm intensity can be attributed to the decreased or delayed latent heat release associated with rainbands in the outer regions. Specifically, the activation of a CM scheme requires the presence of grid-box saturation. On the other hand, the Kain–Fritsch scheme would be activated in atmospheric columns where both conditional instability and reasonable upward lifting of the PBL are present (Kain 2004). Once activated, this method would facilitate organized mass and moist convergence, leading to the formation of grid-box saturation and the activation of a CM scheme. In fact, the total precipitation within the 200-km radius of the storm center decreases by approximately 30% in run CP-OFF12_OFF4 between 1200 LST 2 July and 1700 LST 4 July (Fig. 8c). Suppressed latent heat release causes this significant reduction in precipitation, as well as the resulting storm intensity. On the other hand, switching on or off CP in the 4-km domain has a minor effect on the storm intensity (Figs. 8a,b). The response of the storm surge to CP in the 12-km domain is strong, with the surge off the Virginia coast and the Outer Banks significantly weakened in run CP-OFF12_OFF4 (Figs. 8d–f). At Duck, the peak surge height is 0.65 m in run CP-ON12_OFF4 versus 0.45 m in run CP-OFF12_OFF4, while at Oregon Inlet the peak surge height is 1.52 m in run CP-ON12_OFF4 versus 1.31 m in run CP-OFF12_OFF4. Hence, switching off CP in the 12-km WRF domain reduces the surge heights by 13%–31%. In contrast, the surge response to CP in the 4-km domain is weak, with the sea level differences less than 0.02 m at the two sites.
5. Sensitivity of storm surge prediction to WRF initial and boundary conditions
Next, we explore if and to what extent the configurations of the initial and boundary conditions for WRF affect the storm surge prediction.
a. Sea surface temperature
Previous studies have highlighted the crucial roles of SST in determining the storm intensity (Emanuel 1999; Zhu and Zhang 2006b). Several SST products are routinely available and have been used to provide the oceanic boundary conditions to WRF: 1) daily updated real-time global SST used in run SST-RTG_low (i.e., the control run) at 0.5° resolution (approximately 45 km in the midlatitudes), 2) weekly updated GFS skin temperature (run SST-GFS) produced by GFS using optimum interpolation of satellite measurements of SST at 1° resolution (90 km), 3) daily updated 0.083°-resolution (7.5 km) real-time global SST used in run SST-RTG_high, 4) daily updated global analysis of SST obtained from the Hybrid Coordinate Ocean Model (HYCOM) and the Navy Coupled Ocean Data Assimilation (NCODA) systems (run SST-HYCOM) at 0.08° resolution (7.2 km), and 5) daily updated Advanced Very High Resolution Radiometer (AVHRR) SST (run SST-AVHRR) at 0.018° resolution (1.6 km), acquired using the “coldest dark pixel” composite technique (Glenn et al. 2016). We conduct five model runs to examine the effects of SST on the storm surge prediction. SST data are updated daily in runs SST-RTG_low, SST-RTG_high, SST-HYCOM, and SST-AVHRR, whereas SST is held constant in run SST-GFS. They are used as static SST conditions to WRF and no oceanic feedback is considered. Figure 9 shows a snapshot of these SST products in the path of Hurricane Arthur on 3 July. As expected, higher-resolution products from SST-HYCOM and SST-AVHRR depict finescale temperature structures that are not captured in SST-RTG_low and SST-GFS.
Runs SST-RTG_low and SST-GFS, which have the coarse spatial resolutions, produce similar predictions for MSLP and MSW. Run SST-AVHRR predicts stronger MSLP and MSW while run SST-HYCOM predicts weaker MSLP and MSW (Figs. 10a,b). Larger differences are found on 3 July when Arthur was over the southern Atlantic Bight, and the five SST products display large differences in SST. After Arthur’s landfall on 4 July, the intensity predictions among the five runs are similar. Because the high-resolution AVHRR SST simulation produced a less accurate prediction of storm intensity than the coarse-resolution SST products, further investigations were performed, revealing that the AVHRR SST used to force WRF on 3 July was mostly obtained from a satellite scan at 1354 LST 2 July. AVHRR SST is a multichannel SST product and consists of the coldest pixels from several satellite scans within a 3-day time window after cloud shields are removed (Glenn et al. 2016). The ocean surface cooled by ~1 K between the time of the satellite scan and the time when Arthur reached a buoy off the South Carolina coast (its location shown in Fig. 9a). Hence, AVHRR SST was warmer than the observed SST on 3 July (Fig. 9g), resulting in a stronger storm in run SST-AVHRR.
The divergence in the intensity prediction among the five model runs is consistent with the divergence in the surface latent heat flux (Fig. 10c). When the storm moved over the ocean (on 3 July), the latent heat flux within a radius of 200 km of the storm center is on average 50 W m−2 higher in run SST-AVHRR than that in run SST-HYCOM. During the storm’s transit over Albemarle–Pamlico Sound, however, the differences in the latent heat flux are much smaller. The widening divergence of MSLP and the surface latent flux on 4 July are a manifestation of Arthur’s reentry into the North Atlantic at higher latitudes, owing to differences in SST products over this region. Figure 10d shows small differences in the storm track among the five model runs.
Due to the relatively small differences in the intensity prediction around the time of Arthur’s landfall, the surge heights predicted by the five models have minor differences (Figs. 10e,f). At Oregon Inlet, along the southern arc of the Outer Banks, the peak surge heights between runs SST-RTG and SST-HYCOM are almost identical while the differences in the peak surge heights between runs SST-AVHRR and SST-GFS are less than 0.2 m. At Duck, along the northern arc of the Outer Banks, the differences in the peak surge heights are even smaller. All five models predict a peak surge height of around 0.7 m. However, the peak surge height arrives 1 h later in runs SST-HYCOM, SST-RTG_low, and SST-RTG_high than in runs SST-GFS and SST-AVHRR as a result of the differences in the storm’s forward propagation speed (Fig. 10d).
b. Initial and lateral boundary conditions
Large-scale atmospheric models, which provide the initial and lateral boundary conditions (IBCs) to regional WRF models, are not always in good agreement because of their different algorithms and grid resolutions. It is of interest to examine how the discrepancies in the large-scale model predictions propagate to the regional WRF predictions and, eventually, to the storm surge predictions. Three commonly used synoptic-scale atmospheric models are tested here: 1) the GFS data used in run IBC-GFS, 2) the data from the North American Mesoscale Forecast System (NAM) used in run IBC-NAM, and 3) the data from ERA-Interim used in run IBC-ERA. GFS has a resolution of 0.5°, which is approximately 45 km in the midlatitudes. NAM has a grid resolution of 12 km and is based on the WRF Nonhydrostatic Mesoscale Model (WRF-NMM). ERA is based on the reanalysis of the European Centre for Medium-Range Weather Forecasts (ECMWF) model, with 0.7° resolution (63 km). Preliminary numerical experiments showed that the GFS and NAM simulations generated initial conditions similar to those observed but the ERA simulation poorly resolved the initial vortex. To circumvent this initialization problem, the initial vortex from the GFS is superimposed onto the ERA field to initialize run IBC-ERA. The 0.5°-resolution real-time global SST is used in this group of model runs.
Unlike the previous model runs, which show negligible differences in the storm track, the tracks predicted in runs IBC-GFS, IBC-NAM, and IBC-ERA begin to diverge north of 30°N, reflecting the influence of the synoptic-scale weather patterns on the storm’s trajectory (Fig. 11a). The track in run IBC-ERA shifts eastward from the other two tracks early on 3 July. This is likely related to the longitudinal position of the subtropical ridge, as indicated by the 511-mb pressure contour. This pressure ridge in run IBC-ERA lies over 100 km to the east of the same ridge in runs IBC-GFS and IBC-NAM (Figs. 11g–i). After Arthur’s exit from Albemarle–Pamlico Sound and its reentry into the North Atlantic, the track in run IBC-NAM shifts eastward off the track in run IBC-GFS, for the same reason that the large-scale atmospheric circulation fields diverge between the two runs.
There are also notable differences in the intensity prediction among the three runs (Figs. 11b,c). The storm in run IBC-NAM has a much slower rate of intensification prior to landfall and is probably related to its initial conditions displaying a weaker and distorted storm vortex (Fig. 11e). On the other hand, runs IBC-GFS and IBC-ERA are initialized with the same fully developed vortex (Figs. 11d,f) and predict similar storm intensities prior to landfall. During this prelandfall period, runs IBC-GFS and IBC-ERA predict almost identical MSWs while run IBC-NAM predicts lower MSW. After exiting Albemarle–Pamlico Sound and reentering the North Atlantic, the storm in run IBC-ERA propagates over the warm water west of the Gulf Stream and experiences further intensification that is absent in the other two runs (Fig. 11b). The MSW in run IBC-ERA becomes 10 m s−1 stronger while runs IBC-GFS and IBC-NAM predict similar values (Fig. 11c).
The storm surge generated in run IBC-NAM is somewhat weaker than that in run IBC-GFS (Fig. 12). Run IBC-NAM predicts a slower forward propagation speed and exits the northern arc of the Outer Banks at 0700 LST 4 July, 2 h later than that in run IBC-GFS. The predicted wind speeds in run IBC-NAM are only moderately weaker than those in run IBC-GFS, so only minor differences are seen in the storm surge between the two model runs. Because of its easterly track, the storm surge predicted in run IBC-ERA is very different from the other two runs (Figs. 12a–c). The maximum surge region along the northern arc of the Outer Banks shifts southward. The sea level depression along the southern arc of the Outer Banks is also much larger, as strong eastward winds on the backside of the storm drive the water offshore. Moreover, the water-level distribution inside Albemarle–Pamlico Sound is very different between run IBC-ERA and the other two runs. The winds are predominately northeastward in runs IBC-GFS and IBC-NAM such that the surges are highest in the northeast corner of the Pamlico Sound. The winds are mostly eastward in run IBC-ERA so that the surges are higher along most of the eastern boundary of the Pamlico Sound.
6. Discussion and conclusions
In this study we have systematically investigated the sensitivity of storm surge prediction to physics parameterizations and configurations of the initial and boundary conditions in the WRF model. These sensitivity analyses provide useful guidance for selecting certain WRF physics parameterization schemes in storm surge forecasts. For example, the local closure scheme MYJ produces a lower equivalent potential temperature than the nonlocal schemes YSU and ACM2, leading to significant underprediction of wind speeds and surge heights. Another result of this study is the demonstrated need for cumulus parameterization in the outermost 12-km domain. Without it, the storm intensity is grossly underestimated, leading to significant underprediction of storm surge. On the other hand, cumulus parameterization does not make much difference in the 4-km WRF domain. It is also interesting to note that none of the choices for the WRF parameterization schemes significantly affect the prediction of Arthur’s track.
Other results from the sensitivity analyses are somewhat counterintuitive and do not lead to obvious recommendations for a specific parameterization scheme or a particular choice for the SST, lateral boundary, and initial conditions. Higher-class cloud microphysics schemes WSM5 and WSM6 provide more sophisticated representations of cloud processes but overpredict the wind speed in Hurricane Arthur and result in large overpredictions of storm surge at some coastal locations. In contrast, the lower-class WSM3 produces a more accurate prediction of the storm intensity and storm surge. As demonstrated in the comparison between runs CM-WSM6 and CM-WDM6, the storm prediction can be quite different even in the same class of microphysics schemes, depending on whether a single-moment approach is used to predict the mixing ratio for each species or multiple moments are used to predict additional quantities such as number concentration. Another lesson that has been learned from this study is that high-resolution SST products do not necessarily lead to improved predictions of the storm intensity and storm surge. In particular, we need to examine if these SST products capture the prestorm cooling, which has been shown to be a key factor in determining storm intensity on the continental shelf (Glenn et al. 2016; Seroka et al. 2016). It is also worth pointing out that the sensitivity analyses may be different for different storms. Previous sensitivity analyses of hurricane simulations did not yield a unique optimal choice for the physics parameterization schemes (e.g., Li and Pu 2008; Kepert 2012; Zhu and Zhu 2015; Penny et al. 2016).
This leads to our recommendation for using an ensemble approach in storm surge forecasts. Although the ensemble approach is widely used in weather forecasts, it has been rarely used in storm surge modeling. Zou et al. (2013) presented a general modeling framework to predict coastal flood risk due to wave overtopping by linking meteorological forecasts, wave–tide–surge predictions, nearshore wave models, and surf zone models. Colle et al. (2015) used 11 members from an atmospheric ensemble Kalman filter system to drive the ADCIRC model and found that the storm surge generated by Hurricane Sandy (2012) in the metropolitan New York, New York, area was sensitive to moderate changes in storm track and wind speed. Figure 13 summarizes the WRF-FVCOM’s skill in predicting the storm surge generated by Hurricane Arthur. Our skill assessment focuses on six tidal gauge stations shown in Fig. 1b. The pink symbols represent the 15 individual model runs for different choices of physics parameterization schemes and model configurations. The black symbols represent the ensemble mean prediction for the storm surge. In the Taylor diagram, the correlation coefficient r, the centered RMSE, and the ratio σn of the standard deviations of the model-predicted field (i.e., the test field) and the observed field (i.e., the reference field) are displayed by the location of one point (representing the model field) in relation to the reference point (representing the observed field) (Taylor 2001). Figure 13a shows a wide spread in r and σn among the individual model runs: r varies between 0.2 and 0.95 and σn varies between 0.6 and 1.25. The ensemble mean prediction varies between different tidal stations but falls within much narrow ranges of r = (0.8, 0.95) and σn = (0.7, 1.05). The target diagram provides summary information about the pattern statistics as well as the bias, thus allowing for an assessment of their respective contributions to the total RMSE (Jolliff et al. 2009). Although the normalized biases and center RMSE in the individual model runs extend to a circle of radius of 1, the ensemble mean prediction falls within a circle of radius of 0.65. Therefore, the ensemble mean prediction substantially improves the prediction of storm surge generated by Hurricane Arthur. In future studies, it might be interesting to use a larger number of member runs in the model ensemble, including different surface-layer parameterization schemes, different numbers of vertical levels, different grid resolutions in WRF, and a three-dimensional baroclinic version of FVCOM.
The modeling experiments in this study do not consider the oceanic feedback. Chen et al. (2007) showed that high-resolution coupled atmosphere–wave–ocean models were able to capture the complex hurricane structure and intensity change observed during Hurricane Frances (2004). Zambon et al. (2014) used the uncoupled atmospheric model forced by static SST, two-way coupled atmosphere and ocean models, and three-way coupled atmosphere, wave, and ocean models to simulate Hurricane Sandy (2012) but found relatively minor differences in the predicted track and intensity between the uncoupled and coupled models. Future modeling studies of storm surge should consider using fully coupled atmosphere and ocean models because they can directly simulate the air–sea fluxes and may provide more accurate predictions of storm intensity and storm surge.
Another factor to consider in storm surge models is the possible effect of baroclinicity on the storm surge prediction. Although several previous studies found minor differences in the sea level prediction between 3D barotropic and 3D baroclinic ocean models (e.g., Zhong and Li 2006; Ma et al. 2015), Staneva et al. (2016) found up to 20% differences in the surge heights between a 2D barotropic model and a 3D baroclinic model that is coupled to a surface wave model. Some of these differences could be attributed to the effect of wave–current interactions on storm surge (Beardsley et al. 2013). However, the vertical stratification may affect vertical current shear and possibly bottom friction, which could indirectly affect the storm surge prediction.
We are grateful to NOAA (NA13OAR4830233) and the Maryland Sea Grant (NA14OAR4170090, SA75281450-H) for the financial support. FZ is supported by a Maryland Sea Grant fellowship. We thank Dr. Jun Zhang and two anonymous reviewers for their insightful comments, which have significantly improved this paper. Suggestions from Greg Seroka are greatly appreciated. The AVHRR SST data were provided by Travis Miles and Laura Palamara at the Department of Marine and Coastal Sciences, Rutgers University. Model output is available upon request. University of Maryland Center for Environmental Science Contribution Number 5388.