## 1. Introduction

Solar radiation that reaches the earth’s surface, called global solar radiation *R*_{g}, is a fundamental driving variable of many plant physiological processes, such as evapotranspiration, photosynthesis, carbohydrate partitioning, and biomass growth (Cengiz et al. 1981; Boote and Loomis 1991; Allen 1997; Wu et al. 2007). In addition to other weather variables, *R*_{g} is required by most crop models as a key input to simulate crop responses because crop growth is based on several plant physiological processes that involve the utilization of *R*_{g} (Meinke et al. 1995; Mahmood and Hubbard 2002; Mavromatis and Jagtap 2005). To reflect the seasonal variability and trends in crop production, crop models need weather data of sufficient length (Meinke et al. 1995). Therefore, complete and accurate site-specific data on *R*_{g} are of considerable importance (Wu et al. 2007). Increasing interest in modeling radiation-driven processes has created a higher demand for *R*_{g} data (Castellvi 2001).

Because of financial, technical, or institutional limitations, however, data on *R*_{g} are absent, incomplete, or inaccessible in many parts of the world (Winslow et al. 2001; Wu et al. 2007). As compared with precipitation and temperature, *R*_{g} is not a routinely measured meteorological variable, and only a few stations have records of *R*_{g} (Allen 1997; Liu and Scott 2001; Rivington et al. 2005). The dearth of *R*_{g} data for use in crop models is a serious limitation in many countries (Rivington et al. 2005; Yorukoglu and Celik 2006; Muneer et al. 2007; Wu et al. 2007), including the United States (Richardson 1985; Hook and McClendon 1992; Mahmood and Hubbard 2002). Even the stations with long-term *R*_{g} records sometimes have gaps from several days to several months long that are due to failure in reading by the equipment. These problems have often seriously hindered the research activities that require daily *R*_{g} data.

To address this problem, a number of methods ranging from simple empirical relationships to complex radiative transfer schemes have been developed for synthesizing nonexisting or incomplete *R*_{g} data. Although complex methods may be capable of making more accurate estimates, operationally they tend to be too data intensive and complex (Goodin et al. 1999). Values of *R*_{g} can also be estimated directly from satellite observations. The satellite-derived data are rarely available on a real-time basis, however, and also cannot be used in historical, long-term studies (De Jong and Stewart 1993; Donatelli et al. 2003). Moreover, they have inadequate spatial and temporal resolutions (Winslow et al. 2001). Therefore, much effort has been directed at using methods that are simple and are based on commonly measured meteorological variables (De Jong and Stewart 1993; Mavromatis and Jagtap 2005). On the basis of the approach that the simple methods use to generate radiation data, they can be categorized as either stochastic or deterministic (Liu and Scott 2001; Mavromatis and Jagtap 2005).

A stochastic method, also called a weather generator, aims to simulate a weather variable with statistical characteristics that are identical to those that naturally occur in a given location. Stochastic methods make estimates from data averages, which hide the specific sequence of cold-or-warm, wet-or-dry days that is an important factor for a plant production system (Winslow et al. 2001). A stochastic method may fail to represent the statistical properties of observed weather series (Hayhoe 1998) and also to make reliable estimates of weather extremes (Liu and Scott 2001). Being nondeterministic, a stochastic method may make reliable estimates only for generic, average days, but not for any given day (Goodin et al. 1999). Also, they cannot be used for model validation and simulation analysis for a specific period (Mavromatis and Jagtap 2005).

The common alternative is to use empirical deterministic models (Allen 1997; Hayhoe 1998). An empirical model comprises a set of equations to estimate *R _{g}* from commonly measured meteorological variables. An empirical model that is based on precipitation and temperature may provide better estimates than can a stochastic model (Hayhoe 1998). Simple empirical models are also practical, because they do not need costly and scarce data (Castellvi 2001). There are a number of radiation models that use an empirical approach.

Most empirical methods estimate *R*_{g} as a fraction of daily total solar radiation at the top of the atmosphere, called extraterrestrial radiation *R*_{x}. Variables used in the relationships include cloud cover, elevation, humidity, latitude, precipitation, sunshine duration, temperature, or some combination of the above. Among these variables, using sunshine duration probably would give more accurate results than using other variables, but it is generally absent or incomplete (Wu et al. 2007; Almorox 2011). Cloud-cover data are usually not available and are also less reliable, because the observation of cloudiness is somewhat subjective (Linacre 1992). Temperature- and/or precipitation-based models, on the other hand, are valuable and attractive because thermopluviometric weather stations are common and often have historical records of several decades (De Jong and Stewart 1993; Castellvi 2001). Models that use temperature and/or precipitation are widely used because long-term records of daily temperature and rainfall are readily available in many countries around the world, including the United States. Also, gaps in the records of these variables are rare relative to those in the radiation data and can generally be filled from a nearby weather station (Meinke et al. 1995; Rivington et al. 2006).

The southeastern region of the United States is a major agricultural area, especially for field crops such as corn, cotton, peanut, soybean, and sugarcane, as well as fruits and vegetables (USDA 1998). In accord with this emphasis, several agricultural research activities are carried out in this region and need data on *R*_{g}. Although a large number of models exist that can estimate *R*_{g} from commonly available meteorological variables, researchers have used only a limited number of methods for generating *R*_{g} or have explored the performance of only a few methods for the southeastern United States. For instance, Thornton and Running (1999) evaluated the reformulated Bristow and Campbell (1984) model using the weather data of 40 U.S. stations, including 5 stations in the Southeast. Ball et al. (2004) evaluated four models (two models of Hargreaves and Samani (1982) and two variants of the Bristow and Campbell (1984) model) for 13 sites in the United States, including 5 sites in the Southeast. Mavromatis and Jagtap (2005) investigated the predictability of temperature-based models implemented by “RadEst 3.00” (Donatelli et al. 2003; CRA-ISCI 2011) for several locations in Florida. RadEst 3.00 is an *R*_{g}-estimating software package consisting of three radiation models of Bristow and Campbell (1984), Campbell and Donatelli (1998), and Donatelli and Bellocchi (2001) and a modular model, called “DCBB,” that allows inclusion or exclusion of different components from the three models implemented. Except for the Donatelli–Bellocchi model, Mavromatis and Jagtap (2005) evaluated the other three models on a monthly basis, whereas most crop models need daily data. The Donatelli–Bellocchi model was investigated with a daily time step. Spokas and Forcella (2006) developed an hourly empirical model and tested it in 15 locations around the country, including 3 locations in Florida. Garcia y Garcia et al. (2008) studied the effect of *R*_{g} generated through the Weather Generator for Solar Radiation (WGENR) model, a stochastic *R*_{g} generator originally developed by Hodges et al. (1985) and later modified for the southeastern conditions by Garcia y Garcia and Hoogenboom (2005), on simulated yield of cotton, maize, and peanut at various locations in Georgia. The modified WGENR has been used as a principal solar radiation generator for crop modeling purposes in this region (Cabrera et al. 2006; Fraisse et al. 2006, 2007; Fraisse 2007; Garcia y Garcia et al. 2007, 2010; Paz et al. 2007; Persson et al. 2009a,b, 2010a,b).

The preceding paragraph relates that only a limited number of studies have investigated solar radiation models for the southeastern United States and that these studies have evaluated only a limited number of models in this region. These studies did not carry out the comprehensive, comparative performance analyses of a large number of models or include all models that were in the current best state of model development or were based on commonly measured weather variables. The number of models considered even by the studies that evaluated various methods in other regions of the country was limited. The objective of the study presented here is to evaluate the performance of 16 models that estimate daily global solar radiation using commonly measured meteorological variables (namely, temperature and precipitation) and that are representative of the current best state of model development.

## 2. Materials and methods

### a. Sites and data

On the basis of the availability of daily weather data (maximum and minimum near-surface air temperatures, global solar radiation, and total precipitation), the length of the time span of the data, and geographical representation, 30 locations were selected in the southeastern United States: 10 each in Florida, Georgia, and Mississippi (Fig. 1; Table 1). For Alabama, no solar radiation data of sufficient length were available for any weather station in the state.

Table 1. Abbreviations (Abbr) and spatial information of various locations, from south to north, selected for the study.

For each location, 8 yr of daily weather data—spanning from 1 January 2003 through 31 December 2010—were used. For locations in Florida, Georgia, and Mississippi, the weather data were obtained from the Internet sites of the Florida Automated Weather Network (UF-IFAS 2011), the Georgia Automated Environmental Monitoring Network (UG-CAES 2011), and the Delta Agricultural Weather Center (DREC-MSU 2011), respectively.

### b. Models evaluated

The models were selected by taking into account commonly measured meteorological variables (temperature and precipitation), simple and empirical relationships, less complex computational procedure and implementation, available software, previous performance, basic method, wide use, and the current state of model development. A total of 16 models were evaluated. From the input variables used, these models can be divided into three groups: temperature based only (*T* based), precipitation based only (*P* based), and both temperature and precipitation based (*TP* based).

#### 1) *T*- or *P*-based models

To estimate the radiation transmissivity of the atmosphere, *T*-based models use daily maximum and minimum air temperatures, assuming that maximum temperature *T*_{mx} decreases but minimum temperature *T*_{mn} increases with increased cloudiness and that *T*_{mx} increases but *T*_{mn} decreases with clearer skies.

[…] *R*_{g} from extraterrestrial radiation *R*_{x} and daily temperature difference *T*_{d}, a general indicator of cloudiness and atmospheric transmittance, as

[Eq. (1)]

where *T*_{d} = *T*_{mx} − *T*_{mn} and *α*_{1}, *α*_{2}, and the other Greek letters in the following equations are model parameters. Bristow and Campbell (1984) estimated *R*_{g} using *R*_{x} and diurnal air temperature range *T*_{r}, an indicator of overcast conditions, as

[Eq. (2)]

where *T*_{r} is computed for day *t* as *T*_{r,t} = *T*_{mx,t} − (*T*_{mn,t} + *T*_{mn,t+1})/2 and ^{m}*T*_{r} is mean monthly *T*_{r}. The *T*_{r,t} is computed from *T*_{mn,t+1} to reduce the effect of large-scale horizontal advection of air masses. Richardson (1985) used *T*_{d} instead of *T*_{r} to reflect the effect of cloudiness as

[Eq. (3)]

Ratkowsky (1990) followed the Bristow and Campbell (1984) approach but used *T*_{d}:

[Eq. (4)]

[…] *T*_{av} and *T*_{mn} in addition to *T*_{r}, Campbell and Donatelli (1998) derived the following model from Eq. (2):

[Eq. (5)]

Accounting for seasonal-variation effects in clear-sky transmissivity and *T*_{r} at a wide variety of sites, Donatelli and Bellocchi (2001) derived the following model from Eq. (2):

[Eq. (6)]

where *d* is the day of year, *f*(*φ*_{3}) = 1 − 1.9[*φ*_{3} − integer(*φ*_{3})] + 3.83[*φ*_{3} − integer(*φ*_{3})]^{2}, and ^{w}*T*_{r} is mean weekly *T*_{r}. Mahmood and Hubbard (2002) followed the Richardson (1985) approach but used corrected clear-sky radiation *R*_{c} rather than *R*_{x} to compute *R*_{g} as

[Eq. (7)]

[…] *η*_{2} to zero:

[Eq. (8)]

where *f*(*η*_{3}) = 1 − 1.9[*η*_{3} − integer(*η*_{3})] + 3.83[*η*_{3} − integer(*η*_{3})]^{2} and ^{w|m}*T*_{r} = ^{w}*T*_{r} or ^{m}*T*_{r}. Using piecewise linear regression, Mavromatis (2008) estimated *R*_{g} from *R*_{x} and *T*_{r} as

[Eq. (9)]

where *b* is the breakpoint at which the behavior of *R*_{g} changes. McCaskill (1990) estimated *R*_{g} using rain-day information as an indicator of overcast conditions. This is a *P*-based model:

[Eq. (10)]

where *P*′ is a rain day (yes/no), computed as *P*′ = 0 if *P* = 0 or *P*′ = 1 if *P* > 0, where *P* is the total precipitation on day *t*, *t* − 1, or *t* + 1.
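The predictor variables defined in this subsection can be sketched in a few lines of Python. This is only an illustration of the definitions of *T*_{d}, *T*_{r,t}, *P*′, and the function *f* as stated in the text; the function names are hypothetical, `integer()` is assumed to mean the floor of a nonnegative value, and the last day of a series is assumed to have no *T*_{r} because the next-day minimum is unavailable.

```python
import math

def temperature_difference(t_mx, t_mn):
    """T_d = T_mx - T_mn, computed day by day."""
    return [mx - mn for mx, mn in zip(t_mx, t_mn)]

def temperature_range(t_mx, t_mn):
    """T_r,t = T_mx,t - (T_mn,t + T_mn,t+1) / 2.

    The final day has no following-day minimum, so None is returned
    for it (an assumption of this sketch, not stated in the text).
    """
    n = len(t_mx)
    out = []
    for t in range(n):
        if t + 1 < n:
            out.append(t_mx[t] - (t_mn[t] + t_mn[t + 1]) / 2)
        else:
            out.append(None)
    return out

def rain_day(precip):
    """P' = 0 if P = 0, or P' = 1 if P > 0."""
    return [1 if p > 0 else 0 for p in precip]

def seasonal_f(x):
    """f(x) = 1 - 1.9*c + 3.83*c**2, where c = x - integer(x)."""
    c = x - math.floor(x)  # assumes integer() is the floor function
    return 1 - 1.9 * c + 3.83 * c ** 2
```

For example, `temperature_range([30.0, 28.0], [20.0, 18.0])` uses the minimum of the second day when computing the range of the first day, which is the feature that distinguishes *T*_{r} from *T*_{d}.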

#### 2) *TP*-based models

[…] *R*_{g}, incorporating a precipitation function into the *T*-based model of Richardson (1985), as

[Eq. (11)]

Replacing *T*_{r} by *T*_{d}, adding *T*_{mx}, and assuming the effect of precipitation to be additive, Hunt et al. (1998) modified Eq. (11) to

[Eq. (12)]

Accounting for seasonal changes in atmospheric transmissivity, Liu and Scott (2001) estimated *R*_{g} from *T*_{r} and *P*′ as

[Eq. (13)]

[…] *R*_{a} in Eq. (14) with some adjustment factors, Garcia y Garcia and Hoogenboom (2005) modified WGENR, a solar radiation generator, developed by Hodges et al. (1985) on the basis of the Weather Generator (WGEN) algorithm of Richardson (1981):

[Eq. (14)]

where *ρ* is a random component describing the correlation between *T*_{mx} or *T*_{mn} and *R*_{g} (Richardson 1981), *σ* is the standard deviation conditioned to wet or dry days, *R*_{a} and *R*_{m} are annual average and amplitude solar radiation values, respectively, *d* is the day of year, and *c* is a constant that is 0 (183) for the Northern (Southern) Hemisphere. Assuming temperature and precipitation effects on radiation to be multiplicative, Wu et al. (2007) estimated *R*_{g} from *T*_{d}, *T*_{av}, and *P*′ as

[Eq. (15)]

[…] *TP*-based models. From the results of previous researchers, such as De Jong and Stewart (1993), Wu et al. (2007), and Almorox (2011), it was assumed that a *TP*-based model can make better estimates than the ones that are either *T* or *P* based. It was further assumed that a PLR-based model might perform better than those that are not PLR based. In the PLR process, two separate equations were formulated: one for *R*_{g} ≤ the breakpoint *p* and the other for *R*_{g} > *p*. The *p* is a value at which the behavior of daily *R*_{g} changes:

[Eq. (16)]

Although the modified WGENR is not an empirical model among the 16 models evaluated, it was compared with other models because it has been used as a major solar radiation generator in the southeastern United States (Cabrera et al. 2006; Fraisse et al. 2006, 2007; Fraisse 2007; Garcia y Garcia et al. 2007, 2010; Paz et al. 2007; Persson et al. 2009a,b, 2010a,b). Equation (16) was added to the list of the models evaluated as an alternative to the Wu et al. (2007) model [Eq. (15)]. For brevity, the names of the above models are abbreviated as presented in Table 2. These abbreviated names will be used to refer to the models hereinafter.
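The piecewise linear regression (PLR) idea described above, with one equation for *R*_{g} at or below the breakpoint and another for *R*_{g} above it, can be illustrated with a minimal sketch. It assumes a single generic predictor `x` as a stand-in for the actual right-hand side of Eq. (9) or Eq. (16), and it takes the breakpoint to be the mean of the observed calibration values, as the study does. The function names are hypothetical, and how the appropriate equation is selected at estimation time is not described in this section, so that step is left out.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_piecewise(x, rg_obs):
    """Fit two separate lines, split at the mean observed R_g.

    Returns (breakpoint, (a_low, b_low), (a_high, b_high)).
    """
    breakpoint = sum(rg_obs) / len(rg_obs)
    low = [(xi, yi) for xi, yi in zip(x, rg_obs) if yi <= breakpoint]
    high = [(xi, yi) for xi, yi in zip(x, rg_obs) if yi > breakpoint]
    low_fit = fit_line(*zip(*low))    # records with R_g <= breakpoint
    high_fit = fit_line(*zip(*high))  # records with R_g > breakpoint
    return breakpoint, low_fit, high_fit
```

Each subset needs at least two records with distinct predictor values for the fit to be defined; a real calibration with 4 yr of daily data would satisfy this easily.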

Table 2. Abbreviations and categories of various solar radiation generation methods evaluated in this study.

### c. Analyses

For each location, a daily time series dataset of precipitation, *R*_{g}, *T*_{mx}, and *T*_{mn} was prepared for the entire period of 8 yr. Then, the whole 8-yr dataset was split into two subsets: one for calibration (estimating model parameters) and the other for evaluation of models. Each subset included 4 yr of daily records that were chosen randomly from the original dataset. Daily values of *R*_{x} were computed for each location and year as a function of latitude and day of year (DOY), using the procedure described by Allen et al. (1998).
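As an illustration of how daily extraterrestrial radiation depends only on latitude and DOY, the sketch below follows the daily formulation published in FAO Irrigation and Drainage Paper 56 (Allen et al. 1998). It is assumed here, not confirmed by the text, that this is the exact variant the study used; the function name is hypothetical.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m^-2 min^-1, value used in FAO-56

def extraterrestrial_radiation(lat_deg, doy):
    """Daily extraterrestrial radiation (MJ m^-2 day^-1), FAO-56 form."""
    phi = math.radians(lat_deg)
    # Inverse relative earth-sun distance and solar declination
    d_r = 1 + 0.033 * math.cos(2 * math.pi * doy / 365)
    delta = 0.409 * math.sin(2 * math.pi * doy / 365 - 1.39)
    # Sunset hour angle; the argument is clamped so that polar day or
    # night does not raise a math domain error
    arg = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    omega_s = math.acos(arg)
    return (24 * 60 / math.pi) * SOLAR_CONSTANT * d_r * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )
```

The worked example in FAO-56 (latitude 20°S on 3 September, DOY 246) gives about 32.2 MJ m^-2 day^-1, which this function reproduces to within rounding.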

For models implemented by RadEst 3.00, site-specific parameters were estimated one at a time for each model following the optimization procedure described in the RadEst 3.00 documentation (CRA-ISCI 2011). For WGENR, there is no need to optimize because it is a stochastic generator. The model directly generates *R*_{g} once the values of the input variables are provided. For models other than WGENR and the ones implemented by RadEst 3.00, the location-specific parameters were estimated through linear or nonlinear regressions depending on the type of the model. For the PLR-based MM and WP models, the daily mean value of the observed *R*_{g} from the calibration dataset comprising 4 yr for a location was used as the breakpoint [*b* in Eq. (9) and *p* in Eq. (16)] for that location as initially used by Mavromatis (2008). Once the parameters were estimated for a model, values of *R*_{g} were estimated by the model using values of the input variables in the evaluation dataset and the values of estimated parameters. The same procedure was followed for each location and model.

[…] *R*^{2} and scatterplots were used to provide supplemental information. The closer the value of the intercept is to zero and that of the slope is to unity, the better is the performance of a model. Also, positive values of both intercept and slope (<1) indicate an overestimation of low and underestimation of high *R*_{g} values, whereas a negative intercept with slope greater than 1 indicates an underestimation of low and overestimation of high *R*_{g} values. Using the least squares approach, values of the intercept and slope for a model are estimated through the regression of estimated and observed values. The RMSE denotes the average distance of a data point from the predicted values measured along a vertical line and is used to assess the error associated with prediction. The RMSE is computed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(E_i - O_i\right)^2},$$

where *O*_{i} and *E*_{i} are the observed and estimated values of *R*_{g} for day *i*, respectively, and *n* is the total number of observations. The ME denotes the average distance between the observed and estimated values relative to the average distance between the observed and mean observed values. The ME is computed from the ratio of the mean-square error to the mean-square deviation of an observed value from the mean observed value as

$$\mathrm{ME} = 1 - \frac{\sum_{i=1}^{n}\left(E_i - O_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2},$$

where $\bar{O}$ is the mean observed *R*_{g} and the other variables are as defined above. A greater ME value indicates that the estimated values are closer to the observed values than the mean observed value is. Values of ME range from −∞ to 1. A negative value indicates that the observed mean is a better predictor than the model, whereas a positive value signifies that the model is a better predictor of the observations than is the observed mean. An ME of 0 indicates that the model predictions are as accurate as the mean of the observed data. Like the Willmott index of agreement (*d* index; Willmott 1981), an ME value closer to 1 indicates better performance, and an ME of 1 corresponds to a perfect match of the modeled values to the observed data. Unlike the *t* statistic (Stone 1993), however, ME does not determine whether a model’s estimations are significantly different from the observed values, in addition to assessing the model’s performance. The *t* statistic was not used as a goodness-of-fit measure because a model performance significance test was not the objective of this study.
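The two difference-based measures can be computed directly from paired observed and estimated daily values. The sketch below follows the verbal definitions given above (RMSE as the root of the mean-square error, and ME as one minus the ratio of the mean-square error to the mean-square deviation of the observations from their mean); the function names are hypothetical.

```python
import math

def rmse(observed, estimated):
    """Root-mean-square error between observed and estimated values."""
    n = len(observed)
    sse = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(sse / n)

def modeling_efficiency(observed, estimated):
    """ME = 1 - SSE / sum of squared deviations of observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    ssd = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / ssd
```

A model that always returns the observed mean yields an ME of 0, and a perfect model yields an ME of 1 and an RMSE of 0, matching the interpretation given in the text.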

The effects of topographic variables (namely, latitude, longitude, and elevation) on model performance were assessed on the basis of *R*^{2}, which describes the proportion of variability in the observed data that can be accounted for by the model. Its values range from 0 to 1, with 1 meaning perfect agreement between the observed and predicted values and 0 meaning no agreement at all. For each model, an *R*^{2} value was computed through the least squares technique using ME and the topographic variable as predictand and predictor, respectively (Fig. 2).
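For a least squares fit with a single predictor, *R*^{2} equals the squared correlation between predictor and predictand. The following sketch shows that computation for one model, with per-location ME as the predictand and a topographic variable such as latitude as the predictor. The function name is hypothetical and no values from the study are reproduced.

```python
def r_squared(predictor, predictand):
    """R^2 of a simple least squares regression of predictand on predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(predictand) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in predictand)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictor, predictand))
    return sxy ** 2 / (sxx * syy)
```

A value near 1 would indicate that most of the location-to-location variation in ME is explained by the topographic variable, and a value near 0 that almost none is.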

## 3. Results and discussion

### a. Regression-based evaluation

The overall order of the 16 models with decreasing intercept values was HS > RS > MC > DS > RK > WU > HT > LS > DM > WM > MH > MM > WP > CD > BC > DB (Fig. 3a). In a similar way, the order with increasing values of slope was HS < MC < RS < DS < RK < DM < WU < HT < MH < LS < WM < DB < MM < CD < BC < WP (Fig. 3b). When compared with other models, MM, WP, CD, BC, and DB had smaller intercepts and larger slopes, indicating smaller biases for low and high radiation values. Because of large biases for intermediate values, however, the CD, BC, and DB models performed less accurately than several other models, including MM and WP (Fig. 4). Although MM or WP had a larger intercept and a smaller slope than those of CD, BC, and DB, the performances of MM and WP were better because of less bias for the intermediate values of radiation. Because of larger intercepts, smaller slopes, and larger biases than those of MM and WP but smaller biases than those of CD, BC, and DB, the DS, RS, HS, RK, HT, WU, and LS models performed less efficiently than MM and WP but more accurately than CD, BC, and DB (Figs. 3 and 4). When compared with the other models, the performances of MC, WM, and DM were poorer because of larger biases, larger intercepts, and/or smaller slopes. Models with larger intercepts and smaller slopes indicate increased overestimation of lower and underestimation of higher radiation values. The WP was the best model according to regression-based analyses, and the order of the 16 models in increasing performance was MC < WM < DM < DB < MH < BC < CD < DS < RS < HS < RK < HT < WU < LS < MM < WP (Figs. 3 and 4).

### b. Difference-based evaluation

According to the difference-based measures (RMSE and ME), the *TP*-based models, except for DS and WM, performed better than *T*- or *P*-based models, except for MM (Fig. 5). De Jong and Stewart (1993), Wu et al. (2007), and Almorox (2011) also found *P*- and *T*-based models to be less accurate than *TP*-based models. They, however, did not analyze the performance of MM. Liu and Scott (2001) observed that a model that has more input variables generally gets a better fit of observed data and that using temperature and precipitation instead of just temperature explains more variance in the data. One of the first empirical models to estimate *R*_{g} from air temperature and extraterrestrial radiation was HS. Most of the *T*-based models formulated later are based on the basic concept used by HS. The poor performance of the *T*-based models relative to those of the *TP*-based models was because HS is primarily intended for monthly application rather than daily. Using a daily time step would introduce greater uncertainty because many factors besides temperature are involved in the daily process, such as wind speed, humidity, elevation, and precipitation (Allen 1997). Because of the limited number of available *P*-based empirical models, comparisons among these models were not carried out.

MM, a *T*-based model, was superior to all *TP*-based models except WP because MM contains two separate relationships for low and high radiation levels, which is discussed in the following paragraphs. Although DS, a *TP*-based model, was superior to most *T*-based models, its performance was poorer than that of RK (in addition to MM) because of the inferiority of models that are based on temperature range *T*_{r} relative to models that are based on temperature difference *T*_{d}, as discussed in the following paragraphs. Whereas RK is a *T*_{d}-based model, DS is based on *T*_{r}. The performance of WM was poorer than that of most *T*-based models because as a stochastic generator it probably did not represent the statistical properties of the data on a daily time scale. These results indicated that the *T*–*TP* basis is not the sole factor that determines the superiority of a model and that model performance is also influenced by other factors such as the *T*_{r}–*T*_{d} basis and the piecewise relationship.

The efficiency of the models increased not only with an inclusion of an additional weather variable, such as precipitation, but also with an increase in the dimension of a given weather variable. For instance, HT performed better than DS because the former used *T*_{d} and *T*_{mx} instead of just *T*_{r}, which the latter used. In a similar way, CD outperformed BC by using *T*_{av} and *T*_{mn} in addition to *T*_{r}, and DM outperformed BC through the use of *T*_{mn} in addition to *T*_{r}. Likewise, RK performed more accurately than RS by using *T*_{d}, and […] *T*_{d}, which RS used. Also, using the rain-day information of a longer period was found to be a better indicator of overcast conditions. For instance, the performance of LS, which used […]

Models that used rain-day information ([…] *P*-based models suggested that rain-day information is a better indicator of cloudiness than the total amount of daily precipitation. Liu and Scott (2001) also concluded that the approach of using precipitation as a binary quantity is better than the approach of using the absolute values of total precipitation. The *T*_{d}-based […] as shown by the predictions of MC being more accurate than those of DB, BC, CD, and DM. These results supported the conclusion of Liu and Scott (2001) that precipitation-based models account for less variation in *R*_{g} than do *T*_{d}-based models.

The performance of MH, a *T*_{d}-based model, was better than those of CD and the other *T*_{r}-based models. In a similar way, HT, which is another *T*_{d}-based model, made more accurate predictions than did DS, which is another *T*_{r}-based model. To determine how *T*_{d}-based MM would compare with the original *T*_{r}-based MM, values of *R*_{g} were computed also as a function of *T*_{d} (by replacing *T*_{r}) in Eq. (9). In this comparison also, the efficiency of *T*_{d}-based MM (RMSE = 3.13 MJ m^{−2} day^{−1}; ME = 0.80) was better than that of *T*_{r}-based MM (RMSE = 3.20 MJ m^{−2} day^{−1}; ME = 0.79) for all locations (the RMSE and ME values are averages of 30 locations). These results indicated that a model that uses *T*_{d} as an input variable can make more accurate predictions of *R*_{g} than can a model that uses *T*_{r}. When compared with *T*_{r}, *T*_{d} is a better indicator of cloudiness and thus a better predictor of *R*_{g} because *T*_{d} accounts for the effect of clouds only during the daytime since it is computed from the *T*_{mn} and *T*_{mx} of the same day, whereas *T*_{r} also accounts for the effect of clouds during the following evening and night because it is computed from the *T*_{mn} of the following day in addition to the *T*_{mn} and *T*_{mx} of the same day. Because *T*_{mn} and *T*_{mx} are generally recorded in the morning and afternoon of the day, respectively, clouds during the evening and night have actually nothing to do with the transmittance of radiation during the preceding daytime.

PLR-based models, which used a set of two equations, were considerably more accurate than non-PLR-based models, which used just one equation. Depending on models and locations, the MEs of both MM and WP were about 5% to more than 100% higher than those of the non-PLR-based models. In a similar way, the RMSE values of MM and WP each were about 10%–55% lower than those of the non-PLR-based methods, depending on models and locations. A single equation of a non-PLR-based model probably could not represent the input–output relationship of *R _{g}* and its input variables for all values of

*R*, that is, for the entire year. Results of several past studies also indicated this phenomenon. For instance, Meinke et al. (1995) found that BC overpredicted larger values but underpredicted smaller values of

_{g}*R*. Castellvi (2001) also observed that HS tended to increase errors for cases with larger values, possibly during summer. Mahmood and Hubbard (2002) also discovered that both BC and MH had a tendency to underpredict larger values. Mavromatis and Jagtap (2005) found that DB generally overestimated during winter and underestimated during summer. These prediction errors were potentially due to local heat advection, frontal movements, the regression approach used, and improperly installed and poorly calibrated instruments (Mahmood and Hubbard 2002; Mavromatis and Jagtap 2005). These results indicate that a single equation cannot sufficiently map the input–output relationship for both high and low radiation levels. Because MW and WP, the PLR-based models, mapped the input–output relationships for low and high radiation levels separately, they performed better than did the non-PLR-based models.

The WP model had the highest performance based on the RMSE and ME values, and the overall order of all models in increasing performance was DB < WM < BC < CD < DM < MC < MH < HS < RS < DS < RK < HT < WU < LS < MM < WP (Fig. 5). For ease of discussion, this order has been split into two groups: the order of *T*- or *P*-based models and the order of *TP*-based models. For *T*- or *P*-based models, the overall order in increasing performance was DB < BC < CD < DM < MC < MH < HS < RS < RK < MM. This order is in agreement with those of several past studies: BC < HS < RS (Liu and Scott 2001), DB < CD (Rivington et al. 2005), and MC < HS (Almorox 2011). The efficiency of BC was higher than that of DB because the former does not contain the complex seasonal component that the latter has. Instead of improving on BC, the use of a seasonal component in DB made this model less efficient than BC, probably because some effects of *T*_{r} are offset by the values of the seasonal component, a function of DOY. The performance of CD was better than that of BC because CD is a function of *T*_{r}, *T*_{av}, and *T*_{mn}, whereas BC is a function of just *T*_{r}. The addition of *T*_{r} and *T*_{av} to CD perhaps provided the model with the ability to represent multidimensional effects of temperature. The modeling efficiency of DM was better than that of CD, possibly because DM also accounted for the effects of seasonal variations in clear-sky transmissivity and diurnal temperature range. The *P*-based model MC estimated *R*_{g} more accurately than did DB, BC, CD, and DM, the *T*_{r}-based models, indicating that precipitation-based models are superior to *T*_{r}-based models and that rain-day information is a better indicator of overcast conditions relative to *T*_{r}, *T*_{av}, or *T*_{mn}. When compared with MH, HS, RS, and RK, the *T*_{d}-based models, however, the performance of MC was worse, suggesting that precipitation-based models account for less variation in radiation than do *T*_{d}-based models and thus are inferior. Liu and Scott (2001) came to similar conclusions. The superiority of MC to *T*_{r}-based models and the inferiority of MC to *T*_{d}-based models indicated that *T*_{d}-based models are superior to *T*_{r}-based models. The better performance of MH relative to *T*_{r}- and precipitation-based models was because of using *T*_{d}. The modeling efficiency of HS was greater than that of MH, indicating that using *R*_{x} is better than using *R*_{c} for predicting *R*_{g}. When compared with HS, RS was slightly better because the parameter associated with *T*_{d} is location specific for RS, whereas it is fixed (0.5) for HS. The performance of RK was better than that of RS as a result of more input variables. Model RS uses only *T*_{d}, whereas RK uses two additional variants of *T*_{d}. The MM model outperformed all *T*- or *P*-based models because it uses two separate relationships for high and low radiation levels. This result was in line with that of Mavromatis (2008), where MM was superior to the rest of the *T*-based models that were evaluated.
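The contrast between HS and RS drawn above comes down to whether the exponent on *T*_{d} is fixed or fitted. The following is a minimal Python sketch, assuming both models share the common power-law form R_g = a · R_x · T_d^b. The function names, the coefficient `a`, and the sample inputs are illustrative assumptions and are not values from this study.

```python
def hs_estimate(r_x, t_d, a=0.16):
    """HS-type estimate: the exponent on t_d is fixed at 0.5.

    The default a = 0.16 is only an illustrative placeholder.
    """
    return a * r_x * t_d ** 0.5


def rs_estimate(r_x, t_d, a, b):
    """RS-type estimate: same form, but the exponent b is fitted per location."""
    return a * r_x * t_d ** b


# Illustrative inputs only (assumed units: MJ m-2 d-1 for r_x, deg C for t_d).
r_x, t_d = 35.0, 12.0
print(hs_estimate(r_x, t_d))
print(rs_estimate(r_x, t_d, a=0.16, b=0.5))   # coincides with HS when b = 0.5
print(rs_estimate(r_x, t_d, a=0.16, b=0.45))  # a site-specific b shifts the estimate
```

With `b` free, RS can only match or improve on the HS fit at a calibration site, which is consistent with the small advantage reported for RS.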

For *TP*-based models, the overall order in increasing performance was WM < DS < HT < WU < LS < WP. This order agrees with those of previous studies: DS < HT < LS (Liu and Scott 2001) and HT < WU < LS (Almorox 2011). Of all of the *TP*-based models, WM performed the worst. Being a stochastic generator, WM might not have been able to represent the statistical properties of the data and thus make reliable estimates for any given day. Another reason for the poor performance was the use of the same values for the adjustment factors, the parameters of the model, for all locations in the region. The modeling efficiency of HT was higher than that of DS not only because of the use of more temperature-based variables in HT (namely, *T*_{d} and *T*_{mx}) than in DS, which uses only *T*_{d}, but also because of the use of *T*_{d}, a better predictor, instead of *T*_{r}. The WU model was more efficient than HT mainly because it uses rain-day information, a better indicator of cloudiness than the amount of precipitation. The performance of LS was better than that of WU as a result of the rain-day information of three consecutive days: […] *T*_{d}, *T*_{av}, and *P*′), 3) *T*_{d} rather than *T*_{r}, 4) rain-day information instead of rainfall, and 5) two separate relationships for low and high radiation levels.

### c. Latitude effect

The effect of latitude on the performance of models varied depending on the model used (Table 3). In general, the effect was more evident for *T*-based models than for *TP*-based models. The larger effect of latitude on *T*-based models was because these models estimate *R*_{g} as a function of *R*_{x} and temperature, both of which are affected by latitude. Mavromatis and Jagtap (2005) also found that a *T*-based model (DB) performs better in northern locations than in the southern ones in Florida. The better performance of *T*-based models in higher latitudes than in the lower ones was probably due to more weather stability. Weather conditions in northern locations generally fluctuate less than those in southern locations, which are frequently affected by factors such as advection, sea breezes, and sea fog. The performances of *TP*-based models, except WM, were not significantly influenced by latitude because these models also use precipitation, which is generally not influenced by latitude. It is possible that the effect of temperature could have been offset by that of precipitation. In the case of WM (a stochastic *R*_{g} generator), however, the better performance in higher latitudes was probably due to more stable weather conditions, which led to a better representation of the statistical properties of the weather data by the generator and thus more accurate estimates. Longitude and elevation, however, were not found to significantly influence the behavior of any *T*- or *TP*-based models. The insignificant influence was because neither longitude nor elevation affected the patterns of temperature and precipitation in the study locations. Elevation could affect the temperatures across locations if their elevation differences were significant, but that was not the case for the locations selected. Even so, the effect of elevation, if any, might have been masked by local weather conditions.

Table 3. Values of *R*^{2} associated with the ME of various methods as influenced by latitude, longitude, and elevation. The key to the method abbreviations is given in Table 2.
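One way to read the *R*^{2} values described in the caption above is as the coefficient of determination from a simple linear regression of per-station ME on a site attribute such as latitude. The sketch below assumes that interpretation. The helper name `r_squared` and the station values are hypothetical and serve only to show the calculation.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical station latitudes (deg N) and per-station ME for a single model.
latitude = [26.0, 28.0, 30.0, 32.0, 34.0]
me_by_station = [0.50, 0.55, 0.57, 0.63, 0.65]

print(round(r_squared(latitude, me_by_station), 3))
```

A value near 1 would indicate that ME changes almost linearly with latitude for that model, whereas a value near 0 would indicate no linear dependence.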

## 4. Conclusions

Of the 16 models evaluated, WP predicted *R*_{g} the most accurately as a result of the use of both temperature and precipitation, *T*_{d} rather than *T*_{r}, rain-day information instead of rainfall, and separate relationships for low and high radiation levels. To estimate *R*_{g} with considerable accuracy, especially for the southeastern United States, therefore, one may use WP when both temperature and precipitation data are available. Among *T*- or *P*-based models, MM was the most efficient, suggesting that when temperature is the only input variable available, this model should be used. The overall order of the 16 models by increasing performance was DB < WM < BC < CD < DM < MC < MH < HS < RS < DS < RK < HT < WU < LS < MM < WP.
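An ordering by increasing performance such as the one above can be produced mechanically once RMSE and ME are computed for each model. The sketch below assumes that ME is the Nash–Sutcliffe-type modeling efficiency (Nash and Sutcliffe 1970 appears in the reference list). The model labels and daily values are made up for illustration and are not data from this study.

```python
import math


def rmse(obs, est):
    """Root-mean-square error between observed and estimated values."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def modeling_efficiency(obs, est):
    """Nash-Sutcliffe-type efficiency.

    1 is a perfect fit; 0 is no better than using the observed mean.
    """
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up daily values, for illustration only.
observed = [10.0, 14.0, 18.0, 22.0]
estimates = {
    "model_a": [11.0, 13.0, 19.0, 21.0],
    "model_b": [13.0, 11.0, 21.0, 19.0],
}

# Sort in increasing performance (lowest ME first).
ranking = sorted(estimates, key=lambda m: modeling_efficiency(observed, estimates[m]))
for name in ranking:
    print(name, rmse(observed, estimates[name]),
          modeling_efficiency(observed, estimates[name]))
```

Sorting on ME alone is a simplification. When RMSE and ME disagree between two models, a tie-breaking rule would have to be chosen explicitly.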

In general, models with more input variables or their variants were more efficient because of the inclusion of several relevant predictor variables and the representation of multidimensional input–output relationships and thus the ability to explain more variance in the data. Models that are based on rain-day information estimated *R*_{g} more accurately than the ones based on precipitation amount, suggesting that rain-day information is a better indicator of cloudiness than is the amount of precipitation. Moreover, models using rain-day information of more days performed better as a result of capturing overcast conditions of a longer period. The performances of *T*_{d}-based models were better than those of *T*_{r}-based models, indicating that *T*_{d} is a better indicator of cloudiness relative to *T*_{r}. The PLR-based models, namely MM and WP, were far superior to non-PLR-based models, suggesting that a model with separate equations for low and high radiation levels can represent the relationship between *R*_{g} and its input variables better than can non-PLR-based models.

Elevation and longitude did not have significant effects on the performances of models, and neither did latitude except in *T*-based models that estimate *R*_{g} as a function of *R*_{x} and temperature, both of which are affected by latitude. The *T*-based models performed better than other types of models in higher latitudes, possibly because of more stable weather conditions. This was also true for WM, a *TP*-based stochastic solar radiation generator.
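The advantage credited above to separate equations for low and high radiation levels can be illustrated with a generic two-regime linear fit. This sketch is not the MM or WP formulation. It assumes that the regimes are split at a known threshold on a single predictor, and the data, the threshold, and all function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(x, y, slope, intercept):
    """Sum of squared residuals of a fitted line."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))


def two_regime_sse(x, y, threshold):
    """Fit separate lines below and at/above the threshold; return the combined SSE."""
    low = [(a, b) for a, b in zip(x, y) if a < threshold]
    high = [(a, b) for a, b in zip(x, y) if a >= threshold]
    total = 0.0
    for part in (low, high):
        px, py = zip(*part)
        total += sse(px, py, *fit_line(px, py))
    return total


# Synthetic data whose slope changes between x = 4 and x = 5 (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, 4.0, 6.0, 8.0, 8.5, 9.0, 9.5, 10.0]

single = sse(x, y, *fit_line(x, y))
split = two_regime_sse(x, y, threshold=4.5)
print(single, split)
```

On data with a genuine change in slope, the two-regime fit leaves a much smaller residual than a single line, which is the kind of gain the comparison between PLR-based and non-PLR-based models points to.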

## Acknowledgments

This material is based upon work performed through the Sustainable Energy Research Center at Mississippi State University and is supported by the U.S. Department of Energy under Award DE-FG3606GO86025.

## REFERENCES

Allen, R. G., 1997: Self-calibrating method for estimating solar radiation from air temperature. *J. Hydrol. Eng.*, **2**, 56–67.

Allen, R. G., L. S. Pereira, D. Raes, and M. Smith, 1998: Crop evapotranspiration (guidelines for computing crop water requirements). FAO Irrigation and Drainage Paper 56, 326 pp. [Available online at http://www.kimberly.uidaho.edu/ref-et/fao56.pdf.]

Almorox, J., 2011: Estimating global solar radiation from common meteorological data in Aranjuez, Spain. *Turk. J. Phys.*, **35**, 53–64.

Ball, R. A., L. C. Purcell, and S. K. Carey, 2004: Evaluation of solar radiation prediction models in North America. *Agron. J.*, **96**, 391–397.

Boote, K. J., and R. S. Loomis, 1991: The prediction of canopy photosynthesis. *Modeling Photosynthesis—From Biochemistry to Canopy,* K. J. Boote and R. S. Loomis, Eds., CSSA Special Publication 19, Crop Science Society of America/American Society of Agronomy, 109–140.

Bristow, C. L., and G. S. Campbell, 1984: On the relationship between incoming solar radiation and daily maximum and minimum temperature. *Agric. For. Meteor.*, **31**, 159–166.

Cabrera, V. E., C. W. Fraisse, D. Letson, G. Podestá, and J. Novak, 2006: Impact of climate information on reducing farm risk by optimizing crop insurance strategy. *Trans. ASABE*, **49**, 1223–1233.

Campbell, G. A., and M. Donatelli, 1998: A simple model to estimate global solar radiation. *Proc. Fifth European Society of Agronomy Congress,* Nitra, Slovak Republic, European Society of Agronomy, 133–134.

Castellvi, F., 2001: A new simple method for estimating monthly and daily solar radiation—Performance and comparison with other methods at Lleida (NE Spain); a semiarid climate. *Theor. Appl. Climatol.*, **69**, 231–238.

Cengiz, H. S., J. M. Gregory, and J. L. Sebaugh, 1981: Solar radiation prediction from other climatic variables. *Trans. ASABE*, **24**, 1269–1272.

CRA-ISCI, cited 2011: Tools for agrometeorology and agricultural modelling—RadEst. Agriculture Research Council–Research Institute for Industrial Crops, Bologna, Italy. [Available online at http://www.sipeaa.it/ASP/ASP2/RadEst.asp.]

De Jong, R., and D. W. Stewart, 1993: Estimating global solar radiation from common meteorological observations in western Canada. *Can. J. Plant Sci.*, **73**, 509–518.

Donatelli, M., and G. Bellocchi, 2001: Estimate of daily global solar radiation: New developments in the software RadEst 3.00. *Proc. Second Int. Symp. on Modelling Cropping Systems,* Florence, Italy, European Society of Agronomy, 213–214. [Available online at http://www.sipeaa.it/tools/RadEst/RadEst3.00_ISMCS2001.pdf.]

Donatelli, M., G. Bellocchi, and F. Fontana, 2003: RadEst 3.00: Software to estimate daily radiation data from commonly available meteorological variables. *Eur. J. Agron.*, **18**, 363–367.

DREC-MSU, cited 2011: Delta Agricultural Weather Center. Delta Research and Extension Center–Mississippi State University. [Available online at http://www.deltaweather.msstate.edu/.]

Fraisse, C. W., 2007: AgClimate: Crop yield risk decision support system for the southeastern USA. *Proc. 30th Southern Region Conservation Technology Center (SRCTC) Annual Meeting,* Quincy, FL, SRCTC. [Available online at http://www.ag.auburn.edu/auxiliary/nsdl/scasc/Proceedings/2007/orals/Fraisse.pdf.]

Fraisse, C. W., and Coauthors, 2006: AgClimate: A climate forecast information system for agricultural risk management in the southeastern USA. *Comput. Electron. Agric.*, **53**, 13–27.

Fraisse, C. W., V. E. Cabrera, N. E. Breuer, J. Baez, J. Quispe, and E. Matos, 2007: El Niño–Southern Oscillation influences on soybean yields in eastern Paraguay. *Int. J. Climatol.*, **28**, 1399–1407.

Garcia y Garcia, A., and G. Hoogenboom, 2005: Evaluation of an improved daily solar radiation generator for the southeastern USA. *Climate Res.*, **29**, 91–102.

Garcia y Garcia, A., L. C. Guerra, A. Suleiman, J. O. Paz, and G. Hoogenboom, 2007: Peanut water use under optimum conditions of growth and development: A simulation approach. *Proc. Georgia Water Resources Conf. 2007,* Griffin, GA, Georgia Water Resources Institute. [Available online at http://www.gwri.gatech.edu/uploads/proceedings/2007/5.6.4.pdf.]

Garcia y Garcia, A., L. C. Guerra, and G. Hoogenboom, 2008: Impact of generated solar radiation on simulated crop growth and yield. *Ecol. Modell.*, **210**, 312–326.

Garcia y Garcia, A., T. Persson, J. O. Paz, C. W. Fraisse, and G. Hoogenboom, 2010: ENSO-based climate variability affects water use efficiency of rainfed cotton grown in the southeastern USA. *Agric. Ecosyst. Environ.*, **139**, 629–635.

Goodin, D. G., J. M. S. Hutchinson, R. L. Vanderlip, and M. C. Knapp, 1999: Estimating solar irradiance for crop modeling using daily air temperature data. *Agron. J.*, **91**, 845–851.

Hargreaves, G. H., and G. A. Samani, 1982: Estimating potential evapotranspiration. *J. Irrig. Drain. Eng.*, **108**, 225–230.

Hayhoe, H. N., 1998: Relationship between weather variables in observed and WXGEN generated data series. *Agric. For. Meteor.*, **90**, 203–214.

Hodges, T., V. French, and S. K. LeDuc, 1985: Estimating solar radiation for plant simulation models. AgRISTARS Tech. Rep. JSC-20239, 21 pp.

Hook, J. E., and R. W. McClendon, 1992: Estimation of solar radiation data missing from long-term meteorological records. *Agron. J.*, **84**, 739–742.

Hunt, L. A., L. Kuchar, and C. J. Swanton, 1998: Estimation of solar radiation for use in crop modelling. *Agric. For. Meteor.*, **91**, 293–300.

Linacre, E., 1992: *Climate Data and Resources: A Reference and Guide*. Routledge, 384 pp.

Liu, D. L., and B. J. Scott, 2001: Estimation of solar radiation in Australia from rainfall and temperature observations. *Agric. For. Meteor.*, **106**, 41–49.

Mahmood, R., and K. G. Hubbard, 2002: Effect of time of temperature observation and estimation of daily solar radiation for the northern Great Plains, USA. *Agron. J.*, **94**, 723–733.

Mavromatis, T., 2008: Estimation of solar radiation and its application to crop simulation models in Greece. *Climate Res.*, **36**, 219–230.

Mavromatis, T., and S. S. Jagtap, 2005: Estimating solar radiation for crop modeling using temperature data from urban and rural stations. *Climate Res.*, **29**, 233–243.

McCaskill, M. R., 1990: Prediction of solar radiation from rainday information using regionally stable coefficients. *Agric. For. Meteor.*, **51**, 247–255.

Meinke, H., P. S. Carberry, M. R. McCaskill, M. A. Hills, and I. McLeod, 1995: Evaluation of radiation and temperature data generators in the Australian tropics and sub-tropics using crop simulation models. *Agric. For. Meteor.*, **72**, 295–316.

Muneer, T., S. Younes, and S. Munawwar, 2007: Discourses on solar radiation modeling. *Renew. Sustain. Energy Rev.*, **11**, 551–602.

Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models. Part I: A discussion of principles. *J. Hydrol.*, **10**, 282–290.

Paz, J. O., and Coauthors, 2007: Development of an ENSO-based irrigation decision support tool for peanut production in the southeastern US. *Comput. Electron. Agric.*, **55**, 28–35.

Persson, T., A. Garcia y Garcia, J. O. Paz, J. W. Jones, and G. Hoogenboom, 2009a: Maize ethanol feedstock production and net energy value as affected by climate variability and crop management practices. *Agric. Syst.*, **100**, 11–21.

Persson, T., A. Garcia y Garcia, J. O. Paz, J. W. Jones, and G. Hoogenboom, 2009b: Net energy value of maize ethanol as a response to different climate and soil conditions in the southeastern USA. *Biomass Bioenergy*, **33**, 1055–1064.

Persson, T., A. Garcia y Garcia, J. O. Paz, C. W. Fraisse, and G. Hoogenboom, 2010a: Reduction in greenhouse gas emissions due to the use of bio-ethanol from wheat grain and straw produced in the south-eastern USA. *J. Agric. Sci.*, **148**, 511–527.

Persson, T., A. Garcia y Garcia, J. O. Paz, B. V. Ortiz, and G. Hoogenboom, 2010b: Simulating the production potential and net energy yield of maize-ethanol in the southeastern USA. *Eur. J. Agron.*, **32**, 272–279.

Ratkowsky, D. A., 1990: *Handbook of Nonlinear Regression Models*. Marcel Dekker, 241 pp.

Richardson, C. W., 1981: Stochastic simulation of daily precipitation, temperature, and solar radiation. *Water Resour. Res.*, **17**, 182–190.

Richardson, C. W., 1985: Weather simulation for crop management models. *Trans. ASABE*, **28**, 1602–1606.

Rivington, M., G. Bellocchi, K. B. Matthews, and K. Buchan, 2005: Evaluation of three model estimations of solar radiation at 24 UK stations. *Agric. For. Meteor.*, **132**, 228–243.

Rivington, M., K. B. Matthews, G. Bellocchi, and K. Buchan, 2006: Evaluating uncertainty introduced to process-based simulation model estimates by alternative sources of meteorological data. *Agric. Syst.*, **88**, 451–471.

Spokas, K., and F. Forcella, 2006: Estimating hourly incoming solar radiation from limited meteorological data. *Weed Sci.*, **54**, 182–189.

Stone, R. J., 1993: Improved statistical procedure for the evaluation of solar radiation estimation models. *Sol. Energy*, **51**, 289–291.

Thornton, P. E., and S. W. Running, 1999: An improved algorithm for estimating incident daily solar radiation from measurements of temperature, humidity, and precipitation. *Agric. For. Meteor.*, **93**, 211–228.

UF-IFAS, cited 2011: Florida Automated Weather Network. University of Florida IFAS Extension. [Available online at http://fawn.ifas.ufl.edu/data/reports/.]

UG-CAES, cited 2011: Georgia Automated Environmental Monitoring Network. University of Georgia—College of Agricultural and Environmental Sciences. [Available online at http://www.georgiaweather.net.]

USDA, 1998: *Agriculture Fact Book 1998*. USDA, 269 pp. [Available online at http://www.usda.gov/news/pubs/fbook98/afb98.pdf.]

Willmott, C. J., 1981: On the validation of models. *Phys. Geogr.*, **2**, 184–194.

Winslow, J. C., E. R. Hunt Jr., and S. C. Piper, 2001: A globally applicable model of daily solar irradiance estimated from air temperature and precipitation data. *Ecol. Modell.*, **143**, 227–243.

Wu, G., Y. Liu, and T. Wang, 2007: Methods and strategy for modeling daily global solar radiation with measured meteorological data—A case study in Nanchang station, China. *Energy Convers. Manage.*, **48**, 2447–2452.

Yorukoglu, M., and A. N. Celik, 2006: A critical review on the estimation of daily global solar radiation from sunshine duration. *Energy Convers. Manage.*, **47**, 2441–2450.