1. Introduction
Reference evapotranspiration (ETo) is defined as the evapotranspiration from a hypothetical reference crop in an adequately watered condition (Allen et al. 1998). ETo is one of the most important hydroclimatic variables for scheduling irrigation, driving hydrologic and crop models, and estimating actual evapotranspiration for a region (Gong et al. 2006). Predicting ETo a few months in advance would help the water management and irrigation communities make long-term planning decisions. There are many methods to estimate ETo. The Food and Agriculture Organization (FAO) Irrigation and Drainage Paper 56 (FAO-56) Penman–Monteith (PM) equation (Allen et al. 1998) is considered a globally valid standardized method to estimate ETo and was adopted by the FAO of the United Nations. A major limitation of this method is that it requires a large amount of climatic input data at the land surface level, including air temperature, wind speed, solar radiation, and dewpoint temperature or relative humidity, which are unavailable in many regions.
Coupled ocean–land–atmosphere general circulation models (CGCMs) combine models for the ocean, atmosphere, land surface, and sea ice and run from several months to 1 year ahead to produce seasonal forecasts (Troccoli 2010). CGCMs have been operationally implemented at major weather and climate forecast centers around the world (Palmer et al. 2004; Saha et al. 2006; Yuan et al. 2011). Recently, the National Centers for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA) has improved the physics and resolution of its operational CGCM and updated the forecast system to the Climate Forecast System, version 2 (CFSv2; Saha et al. 2010, 2014; Yuan et al. 2011). There are 29 yr (1982–2010) of retrospective forecasts (reforecasts or hindcasts) of CFSv2 that are archived by the National Climatic Data Center (NCDC). Recent studies have used the archived CFSv2 reforecasts for different applications such as evaluating the seasonal forecast skill of soil moisture (Mo et al. 2012), the South American monsoon (Jones et al. 2012), meteorological drought (Yoon et al. 2012), streamflow (Yuan and Wood 2012a; Yuan et al. 2013), East Asian winter monsoon (Jiang et al. 2013), summer heat waves (Luo and Zhang 2012), and tornado occurrence (Tippett et al. 2012). While the CFSv2 has shown the potential to improve seasonal forecast skill for many applications, studies using CFSv2 to predict seasonal ETo have not been conducted to date.
Seasonal predictions of CFSv2 land surface variables can provide valuable information for ETo forecasts. The archived reforecasts of CFSv2 land surface variables (e.g., temperature, wind speed, and solar radiation) provide all of the variables necessary for the FAO-56 PM equation to assess the predictability for ETo seasonal forecasts. Because of the upgraded physics and resolution, the CFSv2 has been shown to have the ability to predict near-surface air temperature and precipitation for hydrological forecasting at long leads (Yuan and Wood 2012b; Yuan et al. 2011). Besides near-surface air temperature, other land surface variables such as solar radiation and wind speed are important to forecast ETo. However, the predictability of these land surface variables for ETo forecasts has not yet been assessed.
ETo seasonal predictions are often needed at the local scale. Because CFSv2 has the horizontal resolution of T126 (equivalent to nearly 100 km), it is too coarse to meet local forecasting needs. To provide local predictions of seasonal ETo, CFSv2 forecasts need to be spatially downscaled. In general, there are two categories of downscaling methods: statistical and dynamical (Fowler et al. 2007). Statistical downscaling employs statistical relationships between the output of a CGCM and local observations and is computationally efficient and straightforward to apply. The statistical downscaling step can also be used to correct systematic bias between CGCM output and observations. One limitation to statistical downscaling methods is that they need long-term continuous forecast archives and observations to establish the statistical relationships. For dynamical downscaling, a CGCM provides the boundary and initial conditions to a regional climate model (RCM) and the RCM runs at a finer resolution to produce local-scale forecasts. The errors in the CGCM are typically propagated to the RCM and influence predictions (Hwang et al. 2011; Yoon et al. 2012). RCMs also have their own errors. Thus, dynamical downscaling requires additional statistical bias correction and is computationally intensive.
There are multiple methods for statistical downscaling and bias correction. The bias correction and spatial disaggregation (BCSD) method is an interpolation-based downscaling technique that has been extensively applied in hydrologic prediction studies (e.g., Christensen et al. 2004; Salathe et al. 2007; Maurer and Hidalgo 2008; Wood et al. 2002, 2004; Yoon et al. 2012). The BCSD method consists of bias correction using quantile mapping and then spatial disaggregation, which works effectively to correct both mean and variance of the forecasts using observations. Abatzoglou and Brown (2012) developed the spatial disaggregation with bias correction (SDBC) method by reversing the order of the BCSD procedures. This modification improved downscaling skill for reproducing local-scale temporal statistics of precipitation (Hwang and Graham 2013). The spatial disaggregation (SD) from the BCSD or SDBC methods could be adapted to an independent method by spatially interpolating the anomalies of forecasts to a finer resolution and then producing the downscaled forecasts by adding the observed climatology to the interpolated forecast anomalies. Besides the interpolation-based downscaling methods, there are parametric approaches (Schaake et al. 2007; Wood and Schaake 2008) and Bayesian merging techniques (Coelho et al. 2004; Luo and Wood 2008; Luo et al. 2007) that have also been used in downscaling seasonal climate forecasts. While the natural analog and constructed analog downscaling methods have shown good performance (Abatzoglou and Brown 2012; Hidalgo et al. 2008; Maurer and Hidalgo 2008; Maurer et al. 2010; Tian and Martinez 2012a, 2012b), the seasonal reforecast datasets are generally not long enough to perform those analog-based downscaling methods since there is a limited number of potential historical analogs.
The objectives of this study were 1) to evaluate the ability of the CFSv2 to produce downscaled ETo seasonal predictions from 0 to 9 months lead, 2) to assess the predictability of the relevant CFSv2 land surface and reference height variables for ETo forecasts, and 3) to compare the skill from two interpolation-based statistical downscaling methods to downscale CFSv2 forecasts. As members of the Southeast Climate Consortium (SECC), our goal was to develop an improved understanding of seasonal climate variability and climate predictability at local to regional scales across the southeastern United States (SEUS). Therefore, the study region includes Alabama, Georgia, and Florida in the SEUS. Sections 2 and 3 describe the data and methods used in this work. The results are presented in section 4. Conclusions and a discussion are given in section 5.
2. Data
The availability of the long-term archived CFSv2 reforecasts makes it feasible to conduct statistical downscaling. The forcing dataset of phase 2 of the North American Land Data Assimilation System (NLDAS-2) was used as observations to verify and correct errors of the downscaled forecasts.
a. NCEP CFSv2 forecasts
Fig. 1. Example of a subset of 16 grid points covering the Tampa Bay area. The small points denote where NLDAS-2 data are available. Large black squares denote where the CFSv2 reforecast data are available.
Citation: Journal of Hydrometeorology 15, 3; 10.1175/JHM-D-13-087.1
b. Forcing dataset of NLDAS-2
The 0.125° × 0.125° resolution (approximately 12 km) NLDAS-2 forcing dataset (Xia et al. 2012b,c; Fig. 1) was taken as a surrogate for long-term observations to use both for forecast verification and bias correction (as described in section 3). NLDAS-2 integrates a large quantity of observation-based and model reanalysis data to drive land surface models and executes at 0.125° × 0.125° grid spacing over North America with an hourly time step. The NLDAS-2 data provide the same land surface variables as the CFSv2 reforecasts required for ETo estimation, including 2-m Tmax, Tmin, Tmean, u10, and Rs. The u10 was converted to u2 using Eq. (1). For this work, we used 30 yr of data from January 1982 to December 2011 because the CFSv2 reforecasts (1982–2010) extend up to 9 months ahead and therefore reach into 2011. The NLDAS-2 hourly data were aggregated into daily data and then averaged to monthly means. The monthly means were converted to seasonal means as was done for CFSv2 reforecasts.
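The wind-height conversion and the monthly-to-seasonal averaging described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study. It assumes that the conversion follows the FAO-56 logarithmic wind profile (Allen et al. 1998), which may or may not be identical to Eq. (1), and that a seasonal mean is the unweighted average of three consecutive monthly means. The function names and input values are hypothetical.

```python
import math

def wind_at_2m(u_z, z=10.0):
    """Scale wind speed at height z (m) to 2 m with the FAO-56 log profile (assumed form)."""
    return u_z * 4.87 / math.log(67.8 * z - 5.42)

def seasonal_means(monthly):
    """Average each run of three consecutive monthly means (e.g., JFM, FMA, ...)."""
    return [sum(monthly[i:i + 3]) / 3.0 for i in range(len(monthly) - 2)]

u2 = wind_at_2m(4.0)                            # hypothetical 10-m wind of 4 m/s
seasons = seasonal_means([1.0, 2.0, 3.0, 4.0])  # hypothetical monthly means
```

With this profile, a 10-m wind speed is scaled by a factor of about 0.75 to obtain the 2-m value.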
The NLDAS-2 fields used in this study were based on the interpolation of the 3-h time step North American Regional Reanalysis (NARR; 0.3° × 0.3°; Mesinger et al. 2006; Xia et al. 2012b). The NLDAS-2 Rs was bias corrected using satellite-derived Rs by Pinker et al. (2003) over each grid cell using the ratio of their monthly average diurnal cycle (Xia et al. 2012b); all other NLDAS-2 fields used in this study were directly interpolated from NARR with or without adjustments to account for the vertical difference between the NARR and NLDAS-2 fields of terrain height (Xia et al. 2012a). The methods of the spatial and temporal interpolation and vertical adjustment were adopted from Cosgrove et al. (2003). Since the NARR data are a hybrid product of model simulations and observations, they contain known biases for either output fields (e.g., Markovic et al. 2009; Vivoni et al. 2008; Zhu and Lettenmaier 2007) or estimated ETo (Tian and Martinez 2012b). Thus, biases from the NARR would be propagated to the NLDAS-2 fields and consequently affect ETo estimation in this study. High biases of Rs and precipitation were found in the prior generation of forcing of the NLDAS-2 (Luo et al. 2003). The validation for the NLDAS-2 fields is still ongoing by the research community.
3. Methods
a. ETo estimation methods
For details on the calculation of each of the terms in Eq. (3), the reader is referred to Allen et al. (1998). Since the CFSv2 reforecast and forcing data of NLDAS-2 did not include dewpoint temperature (Tdew) or relative humidity (RH) that are required to calculate ea, we approximated Tdew using Tmin, which has been found to be suitable for humid regions (Allen et al. 1998).
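For reference, the daily FAO-56 PM calculation with Tdew approximated by Tmin can be sketched as below. This is a minimal sketch based on the standard FAO-56 formulas (Allen et al. 1998), not the authors' implementation. It assumes that net radiation (rn) is supplied directly rather than derived from Rs, and the function and argument names are hypothetical.

```python
import math

def sat_vapor_pressure(t_c):
    """Saturation vapor pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))

def eto_fao56(tmean, tmax, tmin, u2, rn, g=0.0, elev=0.0):
    """Daily reference ET (mm/day) from the FAO-56 Penman-Monteith equation.

    tmean, tmax, tmin in deg C; u2 in m/s; rn and g in MJ m-2 day-1; elev in m.
    Assumption: net radiation rn is given, not computed from Rs.
    """
    es = 0.5 * (sat_vapor_pressure(tmax) + sat_vapor_pressure(tmin))
    ea = sat_vapor_pressure(tmin)  # Tdew approximated by Tmin (humid regions)
    delta = 4098.0 * sat_vapor_pressure(tmean) / (tmean + 237.3) ** 2
    pressure = 101.3 * ((293.0 - 0.0065 * elev) / 293.0) ** 5.26
    gamma = 0.000665 * pressure  # psychrometric constant (kPa per deg C)
    num = 0.408 * delta * (rn - g) + gamma * 900.0 / (tmean + 273.0) * u2 * (es - ea)
    return num / (delta + gamma * (1.0 + 0.34 * u2))

eto = eto_fao56(25.0, 31.0, 19.0, 2.0, 12.0)  # hypothetical inputs
```

Because ea is tied to Tmin in this approximation, Tmin affects both the saturation and the actual vapor pressure terms of the calculation.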

b. Downscaling methods
The downscaled ETo seasonal forecasts were produced in two ways. The first was to calculate the coarse-scale ETo forecasts using the FAO-56 PM equation with the input of the CFSv2 forecasts of Tmean, Tmax, Tmin, Rs, and Wind and then downscale and bias correct the coarse-scale ETo (hereafter ETo1); the second was to downscale and bias correct the CFSv2 forecasts of Tmean, Tmax, Tmin, Rs, and Wind and then input those variables into the FAO-56 PM equation to calculate the downscaled ETo forecast (hereafter ETo2). Similarly, Yuan and Wood (2012a) conducted downscaling of the CFSv2 streamflow forecasts in two ways: 1) bias correcting streamflow predicted by the integrated land surface model in CFSv2 and 2) downscaling the meteorological seasonal forecast and using it as input to a well-calibrated hydrologic model. Two methods, SD and SDBC, were used to downscale each of those variables and ETo.
1) SD method
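Following the description in section 1, the SD method interpolates the forecast anomalies to the finer grid and adds the observed climatology to the interpolated anomalies. A minimal sketch of that idea is given below. It is not the authors' code: it assumes a 1D transect and linear interpolation for simplicity, and all names and values are hypothetical.

```python
import numpy as np

def sd_downscale(fcst, fcst_clim, coarse_x, fine_x, obs_clim_fine):
    """Spatial disaggregation along a 1D transect (simplifying assumption).

    fcst, fcst_clim: forecast and forecast climatology on the coarse grid.
    obs_clim_fine: observed climatology on the fine grid.
    """
    anomaly = np.asarray(fcst, dtype=float) - np.asarray(fcst_clim, dtype=float)
    anomaly_fine = np.interp(fine_x, coarse_x, anomaly)  # linear interpolation (assumed)
    return anomaly_fine + np.asarray(obs_clim_fine, dtype=float)

coarse_x = np.array([0.0, 1.0])
fine_x = np.array([0.0, 0.5, 1.0])
out = sd_downscale([21.0, 23.0], [20.0, 20.0], coarse_x, fine_x, [19.0, 19.5, 20.0])
```

In this toy case the coarse anomalies of 1 and 3 are interpolated to 1, 2, and 3 on the fine grid before the observed climatology is added.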

2) SDBC method
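As described in section 1, the SDBC method applies spatial disaggregation first and then bias corrects the result using quantile mapping against observations. The quantile mapping step can be sketched as follows. This is a minimal sketch that assumes empirical distributions built from the reforecast and NLDAS-2 samples at one fine grid cell and season. The plotting positions, function name, and values are hypothetical choices, not taken from the study.

```python
import numpy as np

def quantile_map(value, fcst_hist, obs_hist):
    """Map a forecast value to the observed value at the same empirical quantile."""
    fcst_sorted = np.sort(np.asarray(fcst_hist, dtype=float))
    obs_sorted = np.sort(np.asarray(obs_hist, dtype=float))
    # Plotting positions in (0, 1) for both empirical distributions (assumed form)
    p_f = (np.arange(1, fcst_sorted.size + 1) - 0.5) / fcst_sorted.size
    p_o = (np.arange(1, obs_sorted.size + 1) - 0.5) / obs_sorted.size
    prob = np.interp(value, fcst_sorted, p_f)  # quantile of the forecast value
    return float(np.interp(prob, p_o, obs_sorted))  # observed value at that quantile

corrected = quantile_map(13.0, [10.0, 12.0, 14.0, 16.0], [11.0, 14.0, 17.0, 20.0])
```

Note that np.interp clamps values outside the historical range to the sample extremes, so this sketch does not extrapolate the tails.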
c. Evaluation statistics
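The two skill scores reported in section 4, the mean squared error skill score (MSESS) and the Brier skill score (BSS), can be sketched with their standard definitions. This sketch assumes that the reference forecast is the observed climatological mean for MSESS and a constant climatological probability of 1/3 per tercile category for BSS. Any cross validation used in the study is not reproduced, and the function names are hypothetical.

```python
def msess(forecasts, observations):
    """Mean squared error skill score relative to the observed climatological mean."""
    n = len(observations)
    clim = sum(observations) / n
    mse_f = sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n
    mse_c = sum((clim - o) ** 2 for o in observations) / n
    return 1.0 - mse_f / mse_c

def bss(probabilities, outcomes, clim_prob=1.0 / 3.0):
    """Brier skill score for one tercile category (outcomes are 0 or 1)."""
    n = len(outcomes)
    bs_f = sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / n
    bs_c = sum((clim_prob - o) ** 2 for o in outcomes) / n
    return 1.0 - bs_f / bs_c
```

Under these definitions a score of 1 indicates a perfect forecast, 0 indicates no improvement over climatology, and negative values indicate forecasts worse than climatology.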




4. Results
Tables 1 and 2 show the overall mean MSESS and BSS for downscaled CFSv2 variables and ETo by the SD and SDBC methods in 0-month lead for all seasons. In Table 1, in terms of the overall mean skill scores for the CFSv2 variables, Tmax had the highest skill for both deterministic and probabilistic forecasts and was followed by Tmean, Tmin, Rs, and Wind. The skill scores for the SDBC method were slightly higher than those for the SD method. The probabilistic forecasts in the above-normal and below-normal categories had positive skill, while the forecast skill in the near-normal category was consistently negative. Failure of the near-normal forecast has been found in many previous studies (e.g., van den Dool and Toth 1991; Barnston et al. 2003). This failure is related to two facts: 1) narrowing of the forecast probability distribution increases the probability in the near-normal category and decreases the probability in the outer two categories, but the change is usually not sizeable, and 2) overall shifts in the forecast probability distribution can reduce the probability in the near-normal category but change the probabilities in the below- and above-normal categories far more substantially (van den Dool and Toth 1991). In general, for the overall mean predictive skill of the two ETo methods, ETo2 showed slightly higher skill than ETo1, and the SDBC showed slightly higher skill than the SD method (Table 2).
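The two effects on the near-normal category can be illustrated with a small worked example. The sketch below assumes a Gaussian forecast distribution and a standard normal climatology, neither of which is taken from the study, and computes the tercile probabilities for a narrowed forecast and for a shifted forecast. The function name and parameter values are hypothetical.

```python
from statistics import NormalDist

def tercile_probs(mean, sd):
    """Below-, near-, and above-normal probabilities of a Gaussian forecast,
    with tercile bounds taken from a standard normal climatology (assumed)."""
    clim = NormalDist(0.0, 1.0)
    lower, upper = clim.inv_cdf(1.0 / 3.0), clim.inv_cdf(2.0 / 3.0)
    fcst = NormalDist(mean, sd)
    below = fcst.cdf(lower)
    above = 1.0 - fcst.cdf(upper)
    return below, 1.0 - below - above, above

narrowed = tercile_probs(0.0, 0.8)  # sharper forecast, no shift
shifted = tercile_probs(0.5, 1.0)   # shifted forecast, same spread
```

In this example, reducing the spread by 20% raises the near-normal probability from 1/3 to only about 0.41. Shifting the mean by half a standard deviation lowers the near-normal probability to about 0.30 while raising the above-normal probability to about 0.53.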
Table 1. The overall average MSESS and BSS for different downscaled CFSv2 variables in 0-month lead by SD and SDBC methods. The MSESS and BSS evaluate the overall skill and tercile skill, respectively.
Table 2. As in Table 1, but for the downscaled ETo with ETo1 calculated using the CFSv2 variables before downscaling, and ETo2 calculated with the downscaled CFSv2 variables. The higher positive scores between ETo1 and ETo2 are highlighted in bold.
a. Evaluation of forecast skill in different seasons
This section compared the mean forecast skill in different seasons averaged over the entire region in 0-month lead. Figure 2 shows the forecast skill of the deterministic and tercile probabilistic forecasts in different seasons for the five downscaled CFSv2 variables for the SD and SDBC methods in 0-month lead. In terms of the MSESS and BSS in below- and above-normal categories, Tmean and Tmax had the highest skill over all seasons, Tmin and Rs were skillful only during the cold seasons, and Wind showed minor skill in warm seasons. Figure 3 shows a comparison of skill scores of downscaled ETo for the SD and SDBC methods in different seasons at 0-month lead. Both ETo1 and ETo2 showed skill in cold seasons while the skill dropped below 0 during warm seasons; the two methods for downscaling ETo did not show much difference in terms of the forecast skill, particularly in cool seasons when the skill was positive. Both Figs. 2 and 3 show there is no sizeable difference between the SD and SDBC methods.
Fig. 2. Comparison of skill scores of five CFSv2 downscaled variables (Tmean, Tmax, Tmin, Rs, and Wind) by (a)–(d) the SD and (e)–(h) the SDBC methods as a function of consecutive three-month periods (January–March to December–February) for 0-month lead time: (a) MSESS for SD; BSS for SD (b) below, (c) near, and (d) above normal; (e) MSESS for SDBC; and BSS for SDBC (f) below, (g) near, and (h) above normal.
Fig. 3. Comparison of skill scores of downscaled ETo by the SD and SDBC methods as a function of consecutive three-month periods (January–March to December–February) for 0-month lead time: (a) MSESS and BSS (b) below, (c) near, and (d) above normal.
Since ETo estimation is governed by Tmean, Tmax, Tmin, Rs, and Wind, it is useful to examine the influence of each variable on ETo. In terms of the sensitivity coefficients in Table 3, Tmax and Rs had the greatest influence on ETo, followed by Tmin and Tmean, and Wind showed only slight influence, particularly in warmer seasons. All variables had a positive influence on ETo except Tmin, which had a negative influence. While the influence of Tmax is relatively constant for all seasons, Tmean, Tmin, and Rs have greater influence in warmer seasons than in cooler seasons. During warm seasons, because of the influence of Tmin and Rs on ETo, the poor forecasts of these two variables by CFSv2 would cause the negative skill of the ETo forecast. During cold seasons, when the Tmax, Tmean, Tmin, and Rs forecasts performed relatively well, ETo showed positive forecast skill.
Table 3. Spatial average of monthly sensitivity coefficients for Tmean, Tmax, Tmin, Rs, and Wind over the SEUS.
b. Evaluation of forecast skill over space
The forecast skill in this section was the skill scores averaged over all seasons in different grid points in 0-month lead. Figures 4 and 5 show the spatial distribution of the deterministic and probabilistic forecast skill of the downscaled CFSv2 variables (Tmean, Tmax, Tmin, Rs, and Wind) for the SD and SDBC methods for all seasons in 0-month lead, respectively. For both the SD and SDBC methods, Tmean and Tmax had the highest skill over the region, followed by Tmin, Rs, and Wind (Figs. 4, 5). Figure 4 shows that, for the SD method, both deterministic and probabilistic forecasts for Tmean and Tmax were skillful in most of the region. The MSESS and BSS for Tmin were highest in southern Florida, northern Georgia, northern Alabama, and coastal areas for the below- and above-normal categories. For deterministic forecasts and above-normal forecasts of Rs, there was skill in northern Florida, while for the below- and near-normal forecasts, there was no skill in most of the area. The forecasts of Wind did not show skill anywhere in the region, with western Florida and Alabama showing the most negative skill. Compared to the other variables, Wind is the variable most influenced by land surface conditions. The failure of the Wind forecasts was likely due to two reasons. First, the boundary layer changes diurnally and seasonally, which makes it difficult to model; second, the turbulent fluxes in the boundary layer (drag force due to land surface friction) in CFSv2 may not be resolved, likely because of the coarse spatial resolution of the model configuration. Figure 5 shows that for the SDBC method, although the skill showed a spatial pattern similar to that of the SD method, there were improvements of skill over the whole area.
Fig. 4. The average skill scores of the downscaled CFSv2 variables [(top to bottom) Tmean, Tmax, Tmin, Rs, and Wind] by the SD method for the deterministic and tercile forecasts across the SEUS: (left to right) MSESS and BSS below, near, and above normal.
Fig. 5. As in Fig. 4, but for the SDBC method.
Figure 6 shows the average skill scores of the deterministic and probabilistic forecasts of the downscaled ETo1 and ETo2 for the SD and SDBC methods for 0-month lead. For the SD method, both ETo1 and ETo2 showed skillful deterministic forecasts for most of the area except southern Florida. The forecast skill for ETo1 and ETo2 was very similar. For the SDBC method, in terms of the MSESS and above-normal BSS, there were larger areas showing higher skill for ETo2 than for ETo1. There were greater areas showing high skill for the SDBC method than the SD method. The greatest improvement of skill occurred in the near-normal forecasts, though the skill was still negative in most of the area. The skill improvement for the SDBC was due to the additional procedure to bias correct the overall shape of the forecast distribution. In terms of the monthly average of the absolute sensitivity coefficients for each variable in Fig. 7, Tmax and Rs had the greatest influence on ETo, followed by Tmin, Tmean, and Wind. The absolute sensitivity coefficient for each variable was relatively homogeneous over space. Given the influence of Rs on ETo, the low skill of ETo in southern Florida is likely caused by the low skill of Rs in the area.
Fig. 6. The average skill scores of the downscaled CFSv2 ETo by the SD and SDBC methods [(top to bottom) SD ETo1, SD ETo2, SDBC ETo1, and SDBC ETo2] for the deterministic and tercile forecasts across the SEUS: (left to right) MSESS, BSS below, BSS near, and BSS above normal.
Fig. 7. The average of absolute sensitivity coefficients for (left to right) Tmean, Tmax, Tmin, Rs, and Wind over all consecutive three-month periods (January–March to December–February) in the SEUS.
c. Evaluation of forecast skill for different leads
The skill scores in this section were the spatial average over all the grid points for different months and leads. All the contours in Figs. 8–11 were smoothed for display purposes. Figures 8 and 9 demonstrate the predictive skill of the deterministic and probabilistic forecasts of the CFSv2 variables (Tmean, Tmax, Tmin, Rs, and Wind) for all seasons and leads over the entire area for the SD and SDBC methods, respectively. In general, both figures indicate Tmean and Tmax had longer skillful leads than Tmin, Rs, and Wind, with Tmin and Rs having skillful leads out to approximately 3 months during the cold seasons. Figure 8 shows that, for the SD method, the deterministic forecasts (MSESS) of Tmean and Tmax were skillful at long leads for most seasons of the year. The BSS for Tmean and Tmax indicates that the below- and above-normal forecasts were skillful at near leads. Tmin only showed skill at near leads for cold seasons and no skill for warm seasons in terms of MSESS and BSS in the outer two categories. The deterministic forecasts for Rs were skillful at long leads for cool seasons, while the probabilistic forecasts were skillful at near leads during the cold seasons. The deterministic forecasts for Wind showed some skill out to month 3 lead for warm seasons, while there was no skill for other seasons and tercile forecasts. Figure 9 shows the SDBC method improved skill for longer leads for below-normal and above-normal forecasts of Tmax and Rs and deterministic forecasts of Wind. The greatest improvement occurred for near-normal forecasts of Tmax even though the skill was still negative. There were no obvious improvements for the other forecasts.
Fig. 8. As in Fig. 4, but as a function of consecutive three-month periods (January–March to December–February) and lead times of 0–9 months. The thick contour denotes 0 skill.
Fig. 9. As in Fig. 8, but for the SDBC method.
Fig. 10. The average skill scores of (top) ETo1 and (bottom) ETo2 by the SD method as a function of consecutive three-month periods (January–March to December–February) and lead times of 0–9 months for the deterministic and tercile forecasts across the SEUS: (left to right) MSESS and BSS below, near, and above normal. The thick contour denotes 0 skill.
Figures 10 and 11 show the predictive skill of ETo1 and ETo2 as a function of seasons and leads for the SD and SDBC methods, respectively. In general, ETo1 showed similar skill to ETo2, and skillful forecasts at long leads occurred in cold seasons (Figs. 10, 11). In Fig. 10, for the SD method, the deterministic forecasts of ETo1 and ETo2 had skill at all leads for cold seasons, while there was no skill for warm seasons. Both ETo1 and ETo2 had skill for below-normal forecasts at long leads for cold seasons. For above-normal forecasts, both ETo1 and ETo2 showed skill at long leads out to month 5 in cold seasons. Figure 11 shows that the SDBC method yielded a slight improvement of skill at longer forecast leads in terms of the above- and below-normal BSS and a large improvement of skill at longer leads for the near-normal BSS, even though the skill was still negative. A Student's t test was conducted to compare the skill of SDBC and SD. For ETo1, the improvements were significant (p < 0.05) for the below-normal BSS from month 6 to month 9 leads, for the near-normal BSS at all leads, and for the above-normal BSS from month 6 to month 9 leads; for ETo2, only the near-normal BSS from month 6 to month 9 leads and the above-normal BSS from month 6 to month 8 leads showed significant improvement (p < 0.05). There were no improvements for any other forecasts.
Fig. 11. As in Fig. 10, but for the SDBC method.
d. Evaluation of forecast skill during ENSO events
Since CFSv2 has been shown to accurately predict the phase of ENSO (Kim et al. 2012), ETo might be better predicted during ENSO events. To evaluate this, we summarized the skill scores of ETo forecasts where the forecast initial seasons (Figs. 12, 13) and target seasons (Fig. 14) were classified as either El Niño [Oceanic Niño Index (ONI) exceeds +0.5°C for at least five consecutive overlapping seasons] or La Niña events (ONI is below −0.5°C for at least five consecutive overlapping seasons) according to the historical ENSO episodes issued by NOAA’s Climate Prediction Center. Figure 12 shows the predictive skill of downscaled ETo by the SDBC method in different seasons at 0-month lead during ENSO events. While the ETo forecast skill was still negative in warm seasons, it was positive in cold seasons and higher than the forecast skill for the entire period (Fig. 3). Figure 13 shows the predictive skill of the downscaled ETo by the SDBC method as a function of seasons and leads where forecast initial seasons were during ENSO events. It shows the forecast skill was higher than during the entire period at long leads (Fig. 11). Figure 14 shows the predictive skill of the downscaled ETo by the SDBC method as a function of seasons and leads where the forecast target seasons were during ENSO events. It shows the skill was negative at long leads even though it showed high skill at 0-month lead. These results indicate that the CFSv2 model can predict ETo with good skill only when the forecast initial seasons were in either the El Niño or La Niña phase of ENSO. The ETo forecasts were not skillful when the forecast target months were during an ENSO event but the forecast initial months were not.
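The ONI-based classification used above can be sketched as follows. This is a minimal sketch of the stated rule (at least five consecutive overlapping seasons beyond ±0.5°C), not the procedure used by NOAA's Climate Prediction Center or by the authors. The strict inequality follows the wording above, and the function name and sample values are hypothetical.

```python
def classify_enso(oni, threshold=0.5, min_run=5):
    """Label each overlapping season as 'El Nino', 'La Nina', or 'neutral'.

    oni: Oceanic Nino Index values (deg C), one per overlapping season.
    A season is labeled warm or cold only if it belongs to a run of at
    least min_run consecutive seasons beyond the threshold.
    """
    labels = ["neutral"] * len(oni)
    for name, sign in (("El Nino", 1.0), ("La Nina", -1.0)):
        start = None
        for i in range(len(oni) + 1):
            beyond = i < len(oni) and sign * oni[i] > threshold
            if beyond and start is None:
                start = i  # a candidate run begins
            elif not beyond and start is not None:
                if i - start >= min_run:  # run is long enough to count as an event
                    labels[start:i] = [name] * (i - start)
                start = None
    return labels

sample = [0.6, 0.7, 0.9, 1.0, 0.8, 0.2, -0.6, -0.7, -0.2]  # hypothetical ONI values
labels = classify_enso(sample)
```

In the sample series, the first five seasons form an El Niño event, while the two cold seasons that follow are too short a run to count as La Niña.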
Fig. 12. The skill scores of downscaled ETo1 (black) and ETo2 (gray) by the SDBC method as a function of consecutive three-month periods (January–March to December–February) for 0-month lead during ENSO events: (a) MSESS and BSS (b) below, (c) near, and (d) above normal.
Fig. 13. As in Fig. 10, but by the SDBC method. The forecast initial season is during ENSO events.
Fig. 14. As in Fig. 13, but the forecast target season is during ENSO events.
5. Concluding remarks
ETo is an important hydroclimatic factor for regional water resources planning and management. Based on the seasonal forecasts of Tmean, Tmax, Tmin, Rs, and Wind from the NCEP CFSv2, this study evaluated the deterministic and probabilistic forecasts of seasonal ETo and the predictability of the five relevant variables over the SEUS. The ETo was estimated by two methods. The first method (ETo1) calculated the coarse-scale ETo using the FAO-56 PM equation with the input from the seasonal forecasts of Tmean, Tmax, Tmin, Rs, and Wind from the CFSv2 and then downscaled the coarse-scale ETo to a regional 12-km grid. The second method (ETo2) downscaled each of the five CFSv2 variables to the 12-km grid and then calculated ETo using the FAO-56 PM equation with those downscaled variables. Two methods of statistical downscaling were tested for all seasons and all leads: spatial disaggregation (SD) and spatial disaggregation with bias correction (SDBC).
The CFSv2 showed potential to make deterministic and probabilistic forecasts of seasonal ETo and the five relevant variables in the SEUS. The skill for forecasting seasonal ETo varied with different seasons and across three states of the SEUS. Overall, the deterministic and probabilistic forecasts for both ETo methods were skillful at longer leads during the cold seasons but showed no skill at any leads during the warm seasons. The ETo2 had slightly higher skill than ETo1 over space; however, there was little difference in terms of the forecast skill in different seasons or at different leads. In terms of the computational time, the SD method is more efficient than the SDBC method since it does not include the quantile mapping bias correction procedure. However, the SDBC method slightly improved the probabilistic forecast skill for most of the area except part of southern Florida; the skill for probabilistic forecasts was significantly enhanced over all seasons with longer skillful leads during the cold seasons, but the skill for all leads was still negative during the warm seasons. For downscaling of the forecasts of the five variables, the SDBC method improved the skill mostly in warm seasons when the SD method showed minor or negative skill. The improvement of the skill was over most of the study area, especially over the areas showing minor or negative skill, and the enhancements of the skillful forecast leads were mostly in warm seasons.
Considering the similarity of forecast skill and the efficiency of computational time, ETo1 is preferred over ETo2 because it downscales only one variable (ETo) instead of five. However, ETo2 is helpful to identify the skillful or unskillful variables and, accordingly, to determine the ETo calculation method. Based on this information, other approximate ETo calculation methods such as the Hargreaves method (Hargreaves and Samani 1985), Turc method (Turc 1961), or Priestley–Taylor method (Priestley and Taylor 1972) that require fewer variables might be as skillful as the PM method to produce ETo forecasts in the region (Droogers and Allen 2002; Martinez and Thepadia 2010; Sperna Weiland et al. 2012; Thepadia and Martinez 2012; Todorovic et al. 2013). In addition, the replacement of unskillful variables with the climatology from reanalysis datasets has been shown to improve forecast skill (Tian and Martinez 2012a,b). The improvement of skill for the SDBC method over the SD method implies that the additional quantile mapping procedure is effective in correcting the systematic errors of the CFSv2 forecast, since the SDBC method corrects for bias in the entire shape of the distribution.
ETo was found to be better predicted by the CFSv2 model when the forecast initial seasons were in either the El Niño or La Niña phase of ENSO. However, the ETo forecasts were not skillful when the forecast target months were during an ENSO event but the forecast initial months were not. This result is consistent with the findings from the CFSv1 model, which was found to be able to capture the impact of ENSO on precipitation when the initial conditions already contained the ENSO signal (Yoon et al. 2012).
The variability of the ensemble of the forecasted variables was not evaluated in this work. Such variability of the ensemble is most likely different among different variables. However, downscaling and bias-correction procedures could reduce the forecast uncertainty (e.g., Wilks and Hamill 2007). There are different methods to evaluate the uncertainty of ensemble forecasts (e.g., Wang 2014). Such a study was beyond the scope of this paper. Future work could be conducted to evaluate the uncertainties associated with each of the forecasted variables and how much the downscaling and bias correction could reduce such uncertainty.
The CFSv2-based seasonal prediction of ETo showed moderate skill in cold seasons but no skill in warm seasons. The low performance of ETo prediction in summer was caused by the drops in skill of Rs and Tmin. Our explanation is that more convective heating occurs in summer than in winter. Such convection could generate different weather conditions (e.g., clouds) at a small scale that are not captured by the CFSv2 because of its coarse resolution. When the forecast target is in summer, most of the forecasts except Tmean and Tmax showed no skill. These variables, including ETo, Tmin, Rs, and Wind, may not be ready for applications in planning and decision making in summer in the SEUS. Since Rs had no skill during summer and was found to be one of the most influential variables for ETo estimation in the SEUS, efforts such as running the CFSv2 at a finer grid spacing could improve the Rs forecasts in the summer. When the forecast target is in winter, using the forecast product could potentially bring benefits for planning or decision making for different sectors even though the skill is moderate. For the application of ETo forecasts in agricultural water management, summer is not the only growing season for many crops in the SEUS because of the warmer climate; the forecast information in other seasons could be useful for farmers. The evaluation of the economic value for using the seasonal forecasts could be conducted in future work.
The ETo forecasts produced in this work were bias corrected using the NLDAS-2 fields, which are based on interpolation of the NARR. Although the NLDAS-2 data were used as a surrogate for observations, they may contain biases relative to station-based observations; the downscaled and bias-corrected CFSv2 predictions produced in this work are therefore still affected by those biases. Nevertheless, the methods for ETo estimation and forecasting used in this work can readily be extended to other regions. The evaluation of forecast skill provides valuable information for users of the seasonal forecast product. The downscaled seasonal ETo forecast product could potentially be used to drive hydrological models, urban water supply and demand models, and crop models; to inform irrigation schedules; to guide water resources planning; and to assess the risk of climate variability. Such uses could improve the reliability of decision making and reduce risk in societal sectors such as water management, resource management, and agricultural production management.
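The propagation of reference-data bias into the corrected forecasts can be illustrated with a small numerical sketch. It assumes a simple additive mean-bias correction, which is not necessarily the procedure used in this work, and all numbers and names (`mean_bias_correct`, `reference_bias`, the sample ETo values) are hypothetical.

```python
from statistics import mean


def mean_bias_correct(forecast, hindcasts, reference):
    """Additive mean-bias correction (an assumed, simplified scheme):
    shift the forecast by the difference between the reference
    climatology and the hindcast climatology."""
    return forecast - mean(hindcasts) + mean(reference)


# Hypothetical values in mm/day; none are taken from this study.
truth = [3.0, 3.4, 2.8, 3.2]        # unobserved "true" ETo
reference_bias = 0.3                # assumed bias of the gridded reference data
reference = [t + reference_bias for t in truth]
hindcasts = [4.1, 4.4, 3.9, 4.2]    # raw model hindcasts for the same periods

corrected = [mean_bias_correct(h, hindcasts, reference) for h in hindcasts]

# The correction removes the model bias relative to the reference,
# but the bias of the reference itself remains in the corrected forecasts.
residual_bias = mean(corrected) - mean(truth)
print(f"residual bias after correction: {residual_bias:.2f} mm/day")
```

In this toy setting the corrected forecasts match the reference climatology exactly, so whatever offset the reference has relative to the truth is inherited unchanged.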
Acknowledgments
This research was supported by the NOAA/Climate Program Office SARP and RISA program Grants NA10OAR4310171 and NA12OAR4310130. The forcing data of NLDAS-2 used in this effort were acquired as part of the activities of NASA’s Science Mission Directorate and are archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC).
APPENDIX
Derivation of the FAO Penman–Monteith Equation for the Hypothetical Grass Reference Crop
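As a compact reference for the equation named in this appendix, the sketch below evaluates the daily FAO-56 PM equation in the form given by Allen et al. (1998), ETo = [0.408Δ(Rn − G) + γ(900/(T + 273))u2(es − ea)] / [Δ + γ(1 + 0.34u2)]. The function and argument names, the default pressure, and the sample inputs are illustrative assumptions and are not taken from this study; it is a minimal sketch rather than the implementation used here.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def fao56_pm_eto(t_mean, rn, u2, es, ea, g=0.0, pressure=101.3):
    """Daily grass reference evapotranspiration (mm/day), FAO-56 PM form.

    t_mean   : mean daily air temperature at 2 m (deg C)
    rn, g    : net radiation and soil heat flux (MJ m-2 day-1)
    u2       : wind speed at 2 m (m/s)
    es, ea   : saturation and actual vapor pressure (kPa)
    pressure : atmospheric pressure (kPa); the default is an assumption
    """
    # Slope of the saturation vapor pressure curve (kPa per deg C)
    delta = 4098.0 * saturation_vapor_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Psychrometric constant (kPa per deg C)
    gamma = 0.000665 * pressure
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Hypothetical daily inputs for illustration only.
eto = fao56_pm_eto(t_mean=25.0, rn=14.0, u2=2.0, es=3.2, ea=2.2)
print(f"ETo = {eto:.2f} mm/day")
```

The inputs map directly onto the variables discussed in the text: Rs enters through the net radiation Rn, Tmin and Tmax through the mean temperature and es, and Wind through u2.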

REFERENCES
Abatzoglou, J. T., and T. J. Brown, 2012: A comparison of statistical downscaling methods suited for wildfire applications. Int. J. Climatol., 32, 772–780, doi:10.1002/joc.2312.
Allen, R. G., L. S. Pereira, D. Raes, and M. Smith, 1998: Crop evapotranspiration: Guidelines for computing crop water requirements. FAO Irrigation and Drainage Paper 56, 300 pp. [Available online at www.fao.org/docrep/X0490E/X0490E00.htm.]
Barnston, A. G., S. J. Mason, L. Goddard, D. G. DeWitt, and S. E. Zebiak, 2003: Multimodel ensembling in seasonal climate forecasting at IRI. Bull. Amer. Meteor. Soc., 84, 1783–1796, doi:10.1175/BAMS-84-12-1783.
Christensen, N. S., A. W. Wood, N. Voisin, D. P. Lettenmaier, and R. N. Palmer, 2004: The effects of climate change on the hydrology and water resources of the Colorado River basin. Climatic Change, 62, 337–363, doi:10.1023/B:CLIM.0000013684.13621.1f.
Coelho, C., S. Pezzulli, M. Balmaseda, F. Doblas-Reyes, and D. Stephenson, 2004: Forecast calibration and combination: A simple Bayesian approach for ENSO. J. Climate, 17, 1504–1516, doi:10.1175/1520-0442(2004)017<1504:FCACAS>2.0.CO;2.
Cosgrove, B. A., and Coauthors, 2003: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8842, doi:10.1029/2002JD003118.
Droogers, P., and R. Allen, 2002: Estimating reference evapotranspiration under inaccurate data conditions. Irrig. Drain. Syst., 16, 33–45, doi:10.1023/A:1015508322413.
Fowler, H. J., S. Blenkinsop, and C. Tebaldi, 2007: Linking climate change modelling to impacts studies: Recent advances in downscaling techniques for hydrological modelling. Int. J. Climatol., 27, 1547–1578, doi:10.1002/joc.1556.
Gong, L., C.-Y. Xu, D. Chen, S. Halldin, and Y. D. Chen, 2006: Sensitivity of the Penman–Monteith reference evapotranspiration to key climatic variables in the Changjiang (Yangtze River) basin. J. Hydrol., 329, 620–629, doi:10.1016/j.jhydrol.2006.03.027.
Hargreaves, G. H., and Z. A. Samani, 1985: Reference crop evapotranspiration from temperature. Appl. Eng. Agric., 1, 96–99, doi:10.13031/2013.26773.
Hidalgo, H. G., M. D. Dettinger, and D. R. Cayan, 2008: Downscaling with constructed analogues: Daily precipitation and temperature fields over the United States. PIER Final Project Rep. CEC-500-2007-123, California Energy Commission, Sacramento, CA, 48 pp. [Available online at www.energy.ca.gov/2007publications/CEC-500-2007-123/CEC-500-2007-123.PDF.]
Hwang, S., and W. D. Graham, 2013: Development and comparative evaluation of a stochastic analog method to downscale daily GCM precipitation. Hydrol. Earth Syst. Sci. Discuss., 10, 2141–2181, doi:10.5194/hessd-10-2141-2013.
Hwang, S., W. D. Graham, J. L. Hernández, C. J. Martinez, J. W. Jones, and A. Adams, 2011: Quantitative spatiotemporal evaluation of dynamically downscaled MM5 precipitation predictions over the Tampa Bay region, Florida. J. Hydrometeor., 12, 1447–1464, doi:10.1175/2011JHM1309.1.
Jensen, M. E., R. D. Burman, and R. G. Allen, Eds., 1990: Evapotranspiration and Irrigation Water Requirements. ASCE Manuals and Reports on Engineering Practice, No. 70, American Society of Civil Engineers, 332 pp.
Jiang, X., S. Yang, Y. Li, A. Kumar, W. Wang, and Z. Gao, 2013: Dynamical prediction of the East Asian winter monsoon by the NCEP Climate Forecast System. J. Geophys. Res. Atmos., 118, 1312–1328, doi:10.1002/jgrd.50193.
Jones, C., L. M. V. Carvalho, and B. Liebmann, 2012: Forecast skill of the South American monsoon system. J. Climate, 25, 1883–1889, doi:10.1175/JCLI-D-11-00586.1.
Kim, H.-M., P. Webster, and J. A. Curry, 2012: Seasonal prediction skill of ECMWF system 4 and NCEP CFSv2 retrospective forecast for the Northern Hemisphere winter. Climate Dyn., 39, 2957–2973, doi:10.1007/s00382-012-1364-6.
Kingston, D. G., M. C. Todd, R. G. Taylor, J. R. Thompson, and N. W. Arnell, 2009: Uncertainty in the estimation of potential evapotranspiration under climate change. Geophys. Res. Lett., 36, L20403, doi:10.1029/2009GL040267.
Luo, L., and E. F. Wood, 2008: Use of Bayesian merging techniques in a multimodel seasonal hydrologic ensemble prediction system for the eastern United States. J. Hydrometeor., 9, 866–884, doi:10.1175/2008JHM980.1.
Luo, L., and Y. Zhang, 2012: Did we see the 2011 summer heat wave coming? Geophys. Res. Lett., 39, L09708, doi:10.1029/2012GL051383.
Luo, L., and Coauthors, 2003: Validation of the North American Land Data Assimilation System (NLDAS) retrospective forcing over the southern Great Plains. J. Geophys. Res., 108, 8843, doi:10.1029/2002JD003246.
Luo, L., E. F. Wood, and M. Pan, 2007: Bayesian merging of multiple climate model forecasts for seasonal hydrological predictions. J. Geophys. Res., 112, D10102, doi:10.1029/2006JD007655.
Markovic, M., C. G. Jones, K. Winger, and D. Paquin, 2009: The surface radiation budget over North America: Gridded data assessment and evaluation of regional climate models. Int. J. Climatol., 29, 2226–2240, doi:10.1002/joc.1860.
Martinez, C., and M. Thepadia, 2010: Estimating reference evapotranspiration with minimum data in Florida. J. Irrig. Drain. Eng., 136, 494–501, doi:10.1061/(ASCE)IR.1943-4774.0000214.
Maurer, E., and H. Hidalgo, 2008: Utility of daily vs. monthly large-scale climate data: An intercomparison of two statistical downscaling methods. Hydrol. Earth Syst. Sci., 12, 551–563, doi:10.5194/hess-12-551-2008.
Maurer, E., H. Hidalgo, T. Das, M. Dettinger, and D. Cayan, 2010: The utility of daily large-scale climate data in the assessment of climate change impacts on daily streamflow in California. Hydrol. Earth Syst. Sci., 14, 1125–1138, doi:10.5194/hess-14-1125-2010.
Mesinger, F., and Coauthors, 2006: North American Regional Reanalysis. Bull. Amer. Meteor. Soc., 87, 343–360, doi:10.1175/BAMS-87-3-343.
Mo, K. C., S. Shukla, D. P. Lettenmaier, and L.-C. Chen, 2012: Do Climate Forecast System (CFSv2) forecasts improve seasonal soil moisture prediction? Geophys. Res. Lett., 39, L23703, doi:10.1029/2012GL053598.
Monteith, J. L., 1964: Evaporation and environment. Symp. Soc. Exp. Biol., 19, 205–234.
Palmer, T., and Coauthors, 2004: Development of a European Multimodel Ensemble System for Seasonal to Interannual Prediction (DEMETER). Bull. Amer. Meteor. Soc., 85, 853–872, doi:10.1175/BAMS-85-6-853.
Pinker, R. T., and Coauthors, 2003: Surface radiation budgets in support of the GEWEX Continental-Scale International Project (GCIP) and the GEWEX Americas Prediction Project (GAPP), including the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8844, doi:10.1029/2002JD003301.
Priestley, C. H. B., and R. J. Taylor, 1972: On the assessment of surface heat flux and evaporation using large-scale parameters. Mon. Wea. Rev., 100, 81–92, doi:10.1175/1520-0493(1972)100<0081:OTAOSH>2.3.CO;2.
Saha, S., and Coauthors, 2006: The NCEP climate forecast system. J. Climate, 19, 3483–3517, doi:10.1175/JCLI3812.1.
Saha, S., and Coauthors, 2010: The NCEP climate forecast system reanalysis. Bull. Amer. Meteor. Soc., 91, 1015–1057, doi:10.1175/2010BAMS3001.1.
Saha, S., and Coauthors, 2014: The NCEP climate forecast system version 2. J. Climate, 27, 2185–2208, doi:10.1175/JCLI-D-12-00823.1.
Salathe, E. P., P. W. Mote, and M. W. Wiley, 2007: Review of scenario selection and downscaling methods for the assessment of climate change impacts on hydrology in the United States Pacific Northwest. Int. J. Climatol., 27, 1611–1621, doi:10.1002/joc.1540.
Schaake, J., and Coauthors, 2007: Precipitation and temperature ensemble forecasts from single-value forecasts. Hydrol. Earth Syst. Sci. Discuss., 4, 655–717, doi:10.5194/hessd-4-655-2007.
Shuttleworth, W. J., 1993: Evaporation. Handbook of Hydrology, D. R. Maidment, Ed., McGraw-Hill, 4.1–4.53.
Sperna Weiland, F. C., C. Tisseuil, H. H. Dürr, M. Vrac, and L. P. H. van Beek, 2012: Selecting the optimal method to calculate daily global reference potential evaporation from CFSR reanalysis data for application in a hydrological model study. Hydrol. Earth Syst. Sci., 16, 983–1000, doi:10.5194/hess-16-983-2012.
Thepadia, M., and C. J. Martinez, 2012: Regional calibration of solar radiation and reference evapotranspiration estimates with minimal data in Florida. J. Irrig. Drain. Eng., 138, 111–119, doi:10.1061/(ASCE)IR.1943-4774.0000394.
Tian, D., and C. J. Martinez, 2012a: Forecasting reference evapotranspiration using retrospective forecast analogs in the southeastern United States. J. Hydrometeor., 13, 1874–1892, doi:10.1175/JHM-D-12-037.1.
Tian, D., and C. J. Martinez, 2012b: Comparison of two analog-based downscaling methods for regional reference evapotranspiration forecasts. J. Hydrol., 475, 350–364, doi:10.1016/j.jhydrol.2012.10.009.
Tippett, M. K., A. H. Sobel, and S. J. Camargo, 2012: Association of U.S. tornado occurrence with monthly environmental parameters. Geophys. Res. Lett., 39, L02801, doi:10.1029/2011GL050368.
Todorovic, M., B. Karic, and L. S. Pereira, 2013: Reference evapotranspiration estimate with limited weather data across a range of Mediterranean climates. J. Hydrol., 481, 166–176, doi:10.1016/j.jhydrol.2012.12.034.
Troccoli, A., 2010: Seasonal climate forecasting. Meteor. Appl., 17, 251–268, doi:10.1002/met.184.
Turc, L., 1961: Évaluation des besoins en eau d’irrigation, évapotranspiration potentielle, formule climatique simplifiée et mise à jour. Ann. Agron., 12, 13–49.
van den Dool, H. M., and Z. Toth, 1991: Why do forecasts for “near normal” often fail? Wea. Forecasting, 6, 76–85, doi:10.1175/1520-0434(1991)006<0076:WDFFNO>2.0.CO;2.
Vivoni, E. R., H. A. Moreno, G. Mascaro, J. C. Rodriguez, C. J. Watts, J. Garatuza-Payan, and R. L. Scott, 2008: Observed relation between evapotranspiration and soil moisture in the North American monsoon season. Geophys. Res. Lett., 35, L22403, doi:10.1029/2008GL036001.
Wang, H., 2014: Evaluation of monthly precipitation forecasting skill of the National Multi-model Ensemble in the summer season. Hydrol. Processes, doi:10.1002/hyp.9957, in press.
Wilks, D. S., 2011: Statistical Methods in the Atmospheric Sciences. 3rd ed. Academic Press, 476 pp.
Wilks, D. S., and T. M. Hamill, 2007: Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Wea. Rev., 135, 2379–2390, doi:10.1175/MWR3402.1.
Wood, A. W., and J. C. Schaake, 2008: Correcting errors in streamflow forecast ensemble mean and spread. J. Hydrometeor., 9, 132–148, doi:10.1175/2007JHM862.1.
Wood, A. W., E. P. Maurer, A. Kumar, and D. P. Lettenmaier, 2002: Long-range experimental hydrologic forecasting for the eastern United States. J. Geophys. Res., 107, 4429, doi:10.1029/2001JD000659.
Wood, A. W., L. R. Leung, V. Sridhar, and D. P. Lettenmaier, 2004: Hydrologic implications of dynamical and statistical approaches to downscaling climate model outputs. Climatic Change, 62, 189–216, doi:10.1023/B:CLIM.0000013685.99609.9e.
Xia, Y., M. Ek, H. Wei, and J. Ming, 2012a: Comparative analysis of relationships between NLDAS-2 forcings and model outputs. Hydrol. Processes, 26, 467–474, doi:10.1002/hyp.8240.
Xia, Y., and Coauthors, 2012b: Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J. Geophys. Res., 117, D03109, doi:10.1029/2011JD016048.
Xia, Y., and Coauthors, 2012c: Continental-scale water and energy flux analysis and validation for North American Land Data Assimilation System project phase 2 (NLDAS-2): 2. Validation of model-simulated streamflow. J. Geophys. Res., 117, D03110, doi:10.1029/2011JD016051.
Yoon, J.-H., K. Mo, and E. F. Wood, 2012: Dynamic-model-based seasonal prediction of meteorological drought over the contiguous United States. J. Hydrometeor., 13, 463–482, doi:10.1175/JHM-D-11-038.1.
Yuan, X., and E. F. Wood, 2012a: Downscaling precipitation or bias-correcting streamflow? Some implications for coupled general circulation model (CGCM)-based ensemble seasonal hydrologic forecast. Water Resour. Res., 48, W12519, doi:10.1029/2012WR012256.
Yuan, X., and E. F. Wood, 2012b: On the clustering of climate models in ensemble seasonal forecasting. Geophys. Res. Lett., 39, L18701, doi:10.1029/2012GL052735.
Yuan, X., E. F. Wood, L. Luo, and M. Pan, 2011: A first look at Climate Forecast System version 2 (CFSv2) for hydrological seasonal prediction. Geophys. Res. Lett., 38, L13402, doi:10.1029/2011GL047792.
Yuan, X., E. F. Wood, J. K. Roundy, and M. Pan, 2013: CFSv2-based seasonal hydroclimatic forecasts over the conterminous United States. J. Climate, 26, 4828–4847, doi:10.1175/JCLI-D-12-00683.1.
Zhu, C., and D. P. Lettenmaier, 2007: Long-term climate and derived hydrology and energy flux data for Mexico: 1925–2004. J. Climate, 20, 1936–1946, doi:10.1175/JCLI4086.1.