## Abstract

Three methods for analyzing and modeling the global shortwave radiation reaching the earth’s surface are presented in this study. Solar radiation is a very important input for many aspects of climatology, hydrology, atmospheric sciences, and energy applications. The estimation methods consist of an atmospheric deterministic model and two data-driven intelligent methods.

The deterministic method is a broadband atmospheric model, developed for predicting the global and diffuse solar radiation incident on the earth’s surface. The intelligent data-driven methods are a new neural network approach in which the hourly values of global radiation for several years are calculated and a new fuzzy logic method based on fuzzy sets theory. The two data-driven models, calculating the global solar radiation on a horizontal surface, are based on measured data of several meteorological parameters such as the air temperature, the relative humidity, and the sunshine duration.

The three methods are tested and compared using various sets of solar radiation measurements. The comparison of the three methods showed that the proposed intelligent techniques can be successfully used for the estimation of global solar radiation during the warm period of the year, while during the cold period the atmospheric deterministic model gives better estimations.

## 1. Introduction

Solar radiation incident on the earth’s surface is a fundamental input for many aspects of climatology, hydrology, biology, and architecture. In addition, it is an important parameter in solar energy applications, in electricity generation, and in daylighting. In locations where radiation measurements are sparse, the available solar energy has to be estimated theoretically. Various algorithms have therefore been developed for predicting the available solar irradiance from other existing data, which usually consist of the standard climatological parameters that are measured extensively, such as air temperature, relative humidity, sunshine duration, and cloudiness.

In principle, the amount of solar radiation reaching the earth’s surface could be calculated by subtracting from the extraterrestrial radiation, which is known with sufficient accuracy, the radiation losses in the atmosphere, which are caused by several processes such as absorption and scattering (Iqbal 1983; Pisimanis et al. 1987). This is indeed the case for the clear-sky direct component of the radiation for which several models, both rigorous and simple, exist and provide adequate estimates.

The majority of models, which calculate clear-sky solar radiation components on a horizontal surface, consider a one-band calculation (Atwater and Ball 1978; Hoyt 1978; Bird and Hulstrom 1981; Davies and McKay 1982; Sherry and Justus 1983) or a two-band calculation (Lacis and Hansen 1974; Paulin 1980; Gueymard 1989). The most sophisticated models are those of the spectral type (Braslau and Dave 1973; Bird et al. 1983; Kneizys et al. 1988). These models are useful for applications with spectrally varying optical characteristics, but they are computationally complicated and often require very specific input data that are rarely available. Conversely, very simple one-band radiation models, such as ASHRAE (1976) and its variations (Powell 1984; Machler and Iqbal 1985) are widely used by engineers because of their computational convenience, but they are limited with respect to climatic variety (Gueymard 1986).

The effect of cloudiness has been included in various models calculating solar radiation (Angstrom 1924; Barbaro et al. 1979; Collares-Pereira and Rabl 1979; Klein and Theilacker 1981; Erbs et al. 1982; Dogniaux 1984; Page 1986; Pisimanis et al. 1987). A broadband atmospheric model designed for predicting the global and diffuse solar radiation incident on the earth’s surface under clear- or cloudy-sky cases has been developed and used in the present study. The atmospheric transmittance of each atmospheric parameter contributing to solar depletion, such as water vapor, ozone, uniformly mixed gases, molecules, and aerosols, is calculated using parameterized expressions resulting from integrated spectral transmittance functions. The beam and diffuse radiation are obtained as a function of the specific atmospheric transmittances. The model is validated using extensive sets of measurements, and a close agreement between the calculated and the measured values of global and diffuse solar radiation is observed. The validation of the model is performed for the city of Athens, which is a large near-coastal urban area. For the case of Athens, the part of the model predicting the spectral and broadband aerosol transmittance takes into account the pollution problems and the high concentrations of sea-salt particles observed in a coastal or near-coastal environment like Athens. Although many accurate atmospheric models have been proposed and tested, the present model is selected because it is designed to fit the specific climatological data of Athens, where the comparison is performed.

Furthermore, a neural network approach and a fuzzy logic method are used in this study to estimate the global solar radiation. Neural networks and fuzzy logic techniques belong to the class of data-driven approaches instead of model-driven approaches (Chakraborty et al. 1992). In the data-driven models the analysis depends only on the available data, with little rationalization about possible interactions. Relationships between variables, models, laws, and predictions are constructed after building a machine that simulates the considered data. Both neural networks and fuzzy systems have been shown to have the capability of modeling complex nonlinear processes to arbitrary degrees of accuracy.

The main objective of the present study is the presentation and comparison of three models, one deterministic atmospheric model and two intelligent data-driven models, for the estimation of the global shortwave radiation using as inputs several meteorological parameters. The atmospheric model is an analytical approach, based on parameterized expressions, which requires as inputs several climatological parameters such as air temperature, relative humidity, sunshine duration, cloudiness, surface albedo, etc. The model is able to give sufficiently accurate estimations provided that all the required input parameters are available. Taking into account that the climatological measurements network in developed countries is still in progress and that there are locations where measured data are rather sparse, the design of data-driven approaches could be very effective. Intelligent data-driven approaches such as neural networks and fuzzy logic methods present several advantages over conventional, deterministic analytical models. Besides simplicity, another major advantage is that they do not require any assumption to be made about the underlying function or model to be used. All they need are the historical data of the target and those relevant input factors for training the data-driven system. Once the system is well trained and the error between the target and the method estimations has converged to an acceptable level, it is ready for use. Various authors have already designed intelligent data-driven techniques for several energy applications (von Altrock et al. 1994; Dash et al. 1995; Mihalakakou et al. 1998).

However, the results of these methods have never been tested using accurate deterministic models. Thus, there is an uncertainty as regards the applicability of the methods, their field of applicability, and their advantages and disadvantages. The present study aims at investigating the accuracy of two intelligent data-driven methods, a neural network and a fuzzy logic technique, by comparing their results primarily with testing sets of measured data and secondarily with the outputs of an analytical and accurate atmospheric model. Finally, the present paper proposes specific information on the applicability of each model.

The paper is organized as follows: section 2 presents the three models used in the present study, each one in a separate subsection, while section 3 compares the results of the three models. Finally, the conclusions are given in the last section.

## 2. Modeling the global solar radiation

### a. The atmospheric deterministic model

A broadband atmospheric model is used in the present study. The proposed model is developed for calculating the beam, diffuse, and global solar radiation incident on the earth’s surface, under clear- or cloudy-sky cases. The revised Neckel and Labs (1981, 1984) extraterrestrial solar spectrum was used in the above model. Theoretical studies of the solar radiation absorption and scattering caused by the principal atmospheric constituents have permitted the development of corresponding transmission functions. The atmospheric transmittance of each atmospheric component contributing to solar radiation depletion, such as water vapor (Psiloglou et al. 1994); atmospheric ozone (Psiloglou et al. 1996); uniformly mixed gases such as CO, CO_{2}, CH_{4}, N_{2}O, and O_{2} (Psiloglou et al. 1995); and molecules and aerosols (Psiloglou et al. 1997), was calculated using parameterized expressions resulting from integrated spectral transmittance functions. The beam and diffuse radiation components were obtained as a function of the specific atmospheric transmittances.

#### 1) Clear-sky radiation model

The beam (*I*_{b}) under clear-sky conditions on a horizontal surface can be expressed as

where *I*_{o} is the extraterrestrial solar radiation; *θ*_{z} is the zenith angle; and the *T* terms are the broadband transmission functions for water vapor (*T*_{w}), uniformly mixed gases (*T*_{mg}), ozone absorption (*T*_{O3}), Rayleigh scattering (*T*_{R}), and aerosol total extinction due to scattering and absorption (*T*_{A(ext)}).
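Read together, these definitions suggest a multiplicative form for the beam component, in which the extraterrestrial radiation projected on the horizontal plane is attenuated by each transmittance in turn. The expression below is a reconstruction under that assumption and is not necessarily the authors' exact formula:

```latex
I_b = I_o \cos\theta_z \; T_w \, T_{mg} \, T_{O_3} \, T_R \, T_{A(ext)}
```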

The diffuse solar radiation (*I*_{d}) under clear-sky conditions and on a horizontal surface is regarded as the sum of a portion of the beam solar radiation that is single scattered by the atmospheric constituents (*I*_{d1}), and of a multiple-scattering component (*I*_{d2}) that is caused by a single reflection of the (*I*_{b}) and (*I*_{d1}) components at the earth’s surface followed by backscattering from the atmospheric constituents. Thus, the diffuse radiation can be modeled as follows:

where

where *T*_{A(abs)} is the aerosol broadband transmission function due only to absorption attenuation, *T*_{A(sct)} is the aerosol broadband transmission function due only to scattering attenuation, *a*_{g} is the ground surface albedo, and *a*_{s} is the albedo of the cloudless sky.

The global solar radiation (*I*_{t}) for clear sky conditions can be expressed as follows:
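Because global radiation on a horizontal surface is by definition the sum of its beam and diffuse parts, a form consistent with the components introduced above would be the following. It is given as an assumption about the omitted expression, using only symbols already defined in this section:

```latex
I_t = I_b + I_d = I_b + I_{d1} + I_{d2}
```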

The atmospheric albedo (*a*_{s}) for clear-sky conditions can be approximated using the following form:

where *a*_{r} represents the albedo due to molecular Rayleigh scattering and *a*_{a} is the atmospheric aerosol albedo due to aerosol scattering.

The transmission functions for water vapor and ozone absorption can be expressed by the following equation (Psiloglou et al. 1994, 1996):

where *A, B, C,* and *D* are parameters, given in Table 1, for water vapor, O_{3}, CO_{2}, CO, N_{2}O, CH_{4}, and O_{2}. Here *M* is the relative optical air mass and *U*_{i} is the absorber amount in a vertical column.

The broadband transmission function due to uniformly mixed gases’ total absorption is calculated by the following equation:

where *T*_{CO2}, *T*_{CO}, *T*_{N2O}, *T*_{CH4}, and *T*_{O2} are the transmittances due to absorption of CO_{2}, CO, N_{2}O, CH_{4}, and O_{2}, respectively.
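If the absorptions of the individual gases are assumed to act independently, the combined transmittance is the product of the five terms. The line below is a sketch consistent with the definitions above, not a quotation of the original equation:

```latex
T_{mg} = T_{CO_2} \, T_{CO} \, T_{N_2O} \, T_{CH_4} \, T_{O_2}
```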

The transmittance corresponding to Rayleigh scattering is calculated from the following expression:

The absorption and scattering aerosol broadband transmittance functions, *T*_{A(abs)} and *T*_{A(sct)}, are calculated as follows:

#### 2) Cloud-sky radiation model

The beam (*I*_{cb}) and the diffuse (*I*_{cd}) solar radiation under cloudy-sky conditions are represented by the following forms:

where

where *n*/*N* is the relative insolation (the ratio of the actual sunshine duration, *n,* to the maximum possible sunshine duration, *N*), and *a*_{cs} is the albedo of the cloudy sky.
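The text does not say how the maximum possible sunshine duration *N* is obtained. A common choice is the astronomical day length computed from the sunset hour angle, and the sketch below assumes that choice together with Cooper's approximation for the solar declination. The function names are hypothetical and serve only to illustrate how *n*/*N* could be evaluated.

```python
import math


def solar_declination_rad(day_of_year: int) -> float:
    """Solar declination in radians (Cooper's approximation, an assumption here)."""
    return math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)


def max_sunshine_hours(latitude_deg: float, day_of_year: int) -> float:
    """Astronomical day length N in hours from the sunset hour angle."""
    phi = math.radians(latitude_deg)
    delta = solar_declination_rad(day_of_year)
    # Clamp so that polar day and polar night do not break acos().
    cos_sunset = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    sunset_hour_angle_deg = math.degrees(math.acos(cos_sunset))
    return (2.0 / 15.0) * sunset_hour_angle_deg


def relative_insolation(sunshine_hours: float, latitude_deg: float, day_of_year: int) -> float:
    """Ratio n/N, limited to the interval [0, 1]."""
    day_length = max_sunshine_hours(latitude_deg, day_of_year)
    if day_length <= 0.0:
        return 0.0
    return max(0.0, min(1.0, sunshine_hours / day_length))
```

For the NOA latitude (37.967°N) in mid-July this gives a day length of roughly 14 hours, so a day with about 7 hours of recorded sunshine would have a relative insolation near 0.5.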

The global solar radiation, (*I*_{ct}), for cloudy sky on a horizontal surface is represented as follows:

where *a*_{cs} = *a*_{r} + *a*_{a} + *a*_{c} and *a*_{c} is the albedo of the clouds.

The inputs to the above model were the air temperature, the relative humidity, the air pressure, the total ozone amount in a vertical column, the sunshine duration, and the surface albedo.

The accuracy of the model has been verified by comparing the theoretical results with the corresponding detailed radiation data measured at two stations with slightly different characteristics [National Observatory of Athens (NOA) and Penteli] in the Athens basin, where global and diffuse radiation measurements are available, for a period of 34 months for NOA and 23 months for Penteli. The NOA (altitude: 107 m) station is located on a small hill near the center of Athens, while the Penteli station (altitude: 500 m) is situated in a relatively less populated area in the northern part of Athens. The clear-sky part of the model was tested for 70 individual “clear” days with 2-min intervals, while the whole model was checked with “monthly mean” days and hourly mean values.

Close agreement is observed between the model predictions and the measured values of global and diffuse radiation, which verifies the accuracy of the proposed solar radiation expressions. For the NOA station the root-mean-square errors between the measured and estimated global solar radiation values for the monthly mean day varied between 2.6% and 5.9%. Similarly, for the Penteli station the root-mean-square errors ranged between 1.5% and 5.2%.

As an example, Fig. 1 shows the temporal variation of the global solar radiation values estimated with the atmospheric model and of those measured at the NOA station for the monthly mean day of January and July 1995.

It can be seen from this figure that there is good agreement between the measured and estimated values. The root-mean-square error between the measured and the model-estimated values was 4.46% for January and 2.69% for July.

### b. The neural network approach

#### 1) Neural network architecture

Artificial neural networks are computing systems containing many simple nonlinear computing units or nodes interconnected by links:

A neuron with a single input and no bias is shown in Eq. (5). The scalar input *p* is transmitted through a connection that multiplies its strength by the scalar weight *w* to form the product *wp,* again a scalar. The weighted input *wp* is the only argument of the transfer function *F,* which produces the scalar output *a* (Demuth and Beale 1994):

The neuron in Eq. (6) has a scalar bias *b.* The bias can be viewed as simply being added to the product *wp.* The transfer function net input *n,* again a scalar, is the sum of the weighted input *wp* and the bias *b.* The *F* is a transfer function, typically a step function, a linear or a sigmoid function, that takes the argument *n* and produces the output *a.*
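The single-neuron description above can be written in a few lines. The sketch below follows the text directly: the output is *a* = *F*(*wp*) without bias and *a* = *F*(*wp* + *b*) with bias. The function names `logsig`, `linear`, and `neuron` are illustrative choices, not taken from the paper.

```python
import math


def logsig(n: float) -> float:
    """Log-sigmoid transfer function, mapping any real n into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def linear(n: float) -> float:
    """Linear transfer function: the output equals the net input."""
    return n


def neuron(p: float, w: float, b: float = 0.0, transfer=logsig) -> float:
    """Single-input neuron: a = F(w * p + b). With b = 0 this is the no-bias case."""
    net_input = w * p + b
    return transfer(net_input)


# A net input of exactly zero gives the midpoint of the log-sigmoid.
midpoint = neuron(2.0, 0.5, b=-1.0)
# With a linear transfer function and no bias, the output is just w * p.
product = neuron(2.0, 0.5, transfer=linear)
```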

In a feed-forward network, the units can be partitioned into layers, with links from each unit in the *k*th layer being directed to each unit in the (*k* + 1)th layer. Inputs from the environment enter the first layer, and outputs from the network are manifested in the last layer. A *d–n*–1 network is a three-layer feed-forward network with *d* inputs, *n* units in the intermediate “hidden” layer, and one unit in the output layer (Weigend et al. 1990; Chakraborty et al. 1992). A weight is associated with each link, and the network learns or is trained by modifying these weights. A multilayer feed-forward neural network can be observed in Fig. 2. The network consists of three layers: an input layer, an output layer, and an intermediate or hidden layer. The neurons in the input layer act only as buffers for distributing the input signals to the neurons in the hidden layer. The dotted lines in Fig. 2 mean that there are more neurons in each layer than are represented in this figure.

In nonlinear estimation problems the artificial neural networks provide an implementation for the real-time estimation of parameters and the reconstruction of signals corrupted by random noise and distorted by parasitic components. Therefore, for estimation problems a suitable neural network model is constructed, which, when subject to an input parameter *u*(*k*), produces an output *ω*(*k*), which estimates the output *y*(*k*) of the system in the sense that the specified cost function of the errors *e*(*k*) = *y*(*k*)−*ω*(*k*) is minimal. The cost function (*E*) is defined as follows (Cichocki and Unbehauen 1993):

For the nonlinear system estimation the models can take the general form (Korenberg and Paarmann 1991)

where **u**(*k*) and **y**(*k*) are, respectively, the multidimensional system input and output; **F** is the multidimensional system function; and **ω**(*k*) is the estimated vector of **y**(*k*).

The estimation problem can be separated into three successive steps or subproblems:

1. model building or neural network architecture,
2. the learning or training procedure, and
3. the testing or diagnostic checking.

In the present study a multilayer network based on a back-propagation learning procedure is designed for estimating the global solar radiation. The selected neural network architecture consists of one hidden layer of 15 log-sigmoid neurons followed by an output layer of one linear neuron. Linear neurons are those that have a linear transfer function, while the sigmoid neurons use a sigmoid transfer function. Back-propagation networks use the log-sigmoid (logsig) or the tan-sigmoid (tansig) transfer function.

Several learning techniques exist for optimization of neural networks (Rumelhart and McClelland 1986). In the present neural network approach learning is achieved using the back-propagation algorithm of Rumelhart et al. (1986). Mathematically, back propagation is the gradient descent of the mean-square error as a function of the weights (Weibel et al. 1995). If the mean-square error exceeds some small predetermined value, a new “epoch” (cycle of presentations of all training inputs) is started after termination of the current one. One of the main parameters of the back-propagation algorithm is the learning rate. The learning rate specifies the size of changes that are made in the weights and biases at each epoch. A learning rate of 0.2 was selected, while the number of epochs varied between 3000 and 4000 in all cases.
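To make the architecture and the training loop concrete, the following is a minimal pure-Python sketch of a *d–n*–1 network with a log-sigmoid hidden layer, one linear output neuron, and gradient descent on the squared error. It is not the authors' implementation. The class name `FeedForwardNet`, the weight initialization, the per-sample update order, and the assumption that inputs are scaled to a small range are all illustrative choices.

```python
import math
import random


def logsig(n: float) -> float:
    return 1.0 / (1.0 + math.exp(-n))


class FeedForwardNet:
    """d-n-1 feed-forward network: logsig hidden layer, one linear output."""

    def __init__(self, n_inputs: int = 4, n_hidden: int = 15, seed: int = 0):
        rng = random.Random(seed)
        # Small random initial weights (an assumption; the paper gives no details).
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return (hidden activations, scalar output) for one input vector."""
        hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def predict(self, x) -> float:
        return self.forward(x)[1]

    def train_epoch(self, inputs, targets, learning_rate: float = 0.2) -> float:
        """One epoch of per-sample gradient descent on 0.5 * error**2.

        Returns the mean-square error accumulated during the pass.
        """
        total_sq_err = 0.0
        for x, y in zip(inputs, targets):
            hidden, out = self.forward(x)
            err = out - y
            total_sq_err += err * err
            for j, h in enumerate(hidden):
                # Back-propagated error for hidden unit j (uses the old w2[j]).
                delta_h = err * self.w2[j] * h * (1.0 - h)
                self.w2[j] -= learning_rate * err * h
                for i, xi in enumerate(x):
                    self.w1[j][i] -= learning_rate * delta_h * xi
                self.b1[j] -= learning_rate * delta_h
            self.b2 -= learning_rate * err
        return total_sq_err / len(inputs)
```

In use, `train_epoch` would be called repeatedly until the returned mean-square error falls below a chosen threshold or a maximum number of epochs is reached, mirroring the stopping rule described above.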

#### 2) Results and discussion

Global solar radiation measured on a horizontal surface at the NOA has been simulated using the neural network approach. For the global solar radiation estimation the measurements of three meteorological parameters were used: air temperature, relative humidity, and sunshine duration.

The NOA Institute is situated on a hill at the center of Athens (37.967°N, 23.717°E, altitude: 107 m). Continuous observations of standard meteorological parameters have been performed at this location, the close surroundings of which have remained unaltered since 1864.

Integrated hourly, daily, and monthly values of global solar radiation in MJ m^{−2} are measured at the observatory with Kipp–Zonen and Eppley actinometers and pyranometers, respectively. Sunshine hours are measured with a Campbell–Stokes heliograph.

Hourly values of air temperature, relative humidity, and sunshine duration as well as hourly integrated values of global solar radiation for 12 yr (1984–95) and for various months of the year were used for training and testing the network. Specifically, 11 years (1984–94) were used for training the neural network and one year (1995) for testing the trained network. The nighttime values of global solar radiation, which are presumably zero, are omitted from both the training and the testing sets.

The network was trained over a certain part of the climatic data, and once training was completed, the network was tested over the remaining data.
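The data partition described above amounts to a chronological split with the nighttime records removed. The sketch below illustrates one way to express it. The record layout (dictionaries with `year` and `global_radiation` keys) and the use of a strictly positive radiation value to identify daytime are assumptions made for the example.

```python
def split_by_year(records, train_years, test_years):
    """Drop nighttime records, then split the rest by calendar year.

    `records` is an iterable of dicts with the hypothetical keys
    'year' and 'global_radiation' (hourly integrated value).
    """
    # Assumption: nighttime is identified by a non-positive radiation value.
    daytime = [r for r in records if r["global_radiation"] > 0.0]
    train = [r for r in daytime if r["year"] in train_years]
    test = [r for r in daytime if r["year"] in test_years]
    return train, test


# Years as given in the text: 1984-94 for training, 1995 for testing.
TRAIN_YEARS = set(range(1984, 1995))
TEST_YEARS = {1995}
```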

The input parameters of the neural network model were the following:

- air temperature measurements in °C,
- relative humidity measurements (percent),
- sunshine duration measurements in hours, and
- calculated extraterrestrial radiation values in kJ m^{−2} h^{−1}.

Analytically, the extraterrestrial irradiation (*I*_{0}) was calculated using the equation of Iqbal (1983): *I*_{0} = *I*_{SC}*E*_{0}(sin*δ* sin*ϕ* + cos*δ* cos*ϕ* cos*ω*_{i}), where *I*_{SC} is the solar constant, *E*_{0} is the eccentricity correction factor, *δ* is the solar declination, *ϕ* is the geographic latitude, and *ω*_{i} is the hour angle. The output was the global solar radiation values. Training is performed using hourly values of the input climatic parameters for the estimation of integrated hourly global solar radiation values for 11 years (1984–94) and for various months of the year. As learning occurs the mean-square error decreases. Results from trial runs indicated that adding more hidden layers or nodes did not significantly improve the network’s prediction capabilities; rather this only slowed the convergence. Calculations have been performed for various months of the year, and the following two time periods were selected for the presentation of results.
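The extraterrestrial irradiance formula quoted above can be evaluated directly once *E*_{0}, *δ*, and *ω*_{i} are known. The sketch below assumes a solar constant of 1367 W m^{−2}, a simple cosine approximation for the eccentricity correction, and Cooper's approximation for the declination; the paper does not state which expressions were used, so these are illustrative choices. The result is in kJ m^{−2} h^{−1}, the unit listed for this network input.

```python
import math

# Assumed solar constant: 1367 W m^-2, converted to kJ m^-2 h^-1.
SOLAR_CONSTANT_KJ_M2_H = 1367.0 * 3.6


def eccentricity_correction(day_of_year: int) -> float:
    """E0, simple cosine approximation (an assumption)."""
    return 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)


def solar_declination_rad(day_of_year: int) -> float:
    """Declination in radians, Cooper's approximation (an assumption)."""
    return math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)


def hour_angle_rad(solar_hour: float) -> float:
    """Hour angle: zero at solar noon, 15 degrees per hour."""
    return math.radians(15.0 * (solar_hour - 12.0))


def extraterrestrial_irradiance(latitude_deg: float, day_of_year: int,
                                solar_hour: float) -> float:
    """I0 = I_SC * E0 * (sin d sin phi + cos d cos phi cos w), clipped at zero."""
    phi = math.radians(latitude_deg)
    delta = solar_declination_rad(day_of_year)
    omega = hour_angle_rad(solar_hour)
    cos_zenith = (math.sin(delta) * math.sin(phi)
                  + math.cos(delta) * math.cos(phi) * math.cos(omega))
    # Below the horizon the expression is negative; report zero instead.
    return max(0.0, SOLAR_CONSTANT_KJ_M2_H * eccentricity_correction(day_of_year) * cos_zenith)
```

Clipping at zero keeps nighttime hours from producing negative inputs, which is consistent with the omission of nighttime values from the training and testing sets.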

- The cold period of the year, which consists of the months of December, January, February, and March. The month of January was regarded as representative of the cold period for the presentation of results.
- The warm period of the year, which consists of the months of June, July, August, and September. Accordingly, the month of July was considered to be the representative month of the warm period for the presentation of results.

Figures 3a and 3b show the comparison of the measured integrated hourly global solar radiation values with the neural network estimated ones for two years from the training set of data (1987 and 1989) and for the months of July and January, respectively. As can be seen from these figures, there is a good agreement between measured and estimated values.

For most cases, the radiation differences are less than 0.25 MJ m^{−2}, while the root-mean-square error between the measured and the estimated values was found to be equal to 0.16 MJ m^{−2} for the month of July and 0.19 MJ m^{−2} for the month of January.

The accuracy of the neural network estimations was tested by comparing the measurements of the testing set of data, which consists of the radiation values of the year 1995, with the estimated results of the neural network approach. Figures 4a and 4b show the comparison between the estimated hourly values for the year 1995 and the measured values of the testing set of data (1995) for July and January, respectively. The root-mean-square errors were found to be equal to 0.22 MJ m^{−2} for July and 0.20 MJ m^{−2} for January. The present results are quite encouraging for developing a feed-forward back-propagation neural network approach able to simulate and predict the future values of global solar radiation time series by extracting knowledge from their past values.

Figures 5a and 5b show the temporal variation of the estimated and measured global solar radiation values for two randomly selected days of the warm period (2 July 1992 and 15 July 1995). Similarly, for the cold period, Figs. 5c and 5d present the temporal variation of the estimated and measured radiation for two randomly selected days (7 January 1993 and 12 January 1995). In these figures, the continuous line indicates the measured global solar radiation values, while the cross symbols indicate the model estimations. As shown, there is a good agreement between the estimated and the measured data. Similar performance was seen for the whole set of data.

### c. The fuzzy logic method

Fuzzy set theory provides a means for representing uncertainties. A real system could be very complicated, but as humans learn more and more about it, its complexity decreases (Zadehb 1975). As complexity decreases, the precision afforded by computational methods becomes more useful in modeling the system. For systems with little complexity, hence little uncertainty, closed-form mathematical expressions provide precise descriptions of the system. For systems that are a little more complex, but for which significant data exist, model-free methods, such as artificial neural networks, provide a powerful and robust means to reduce some uncertainty through the process of learning. Finally, for the most complex systems where few numerical data exist and where only ambiguous or imprecise information may be available, fuzzy methods provide a way to understand the system behavior by allowing us to interpolate approximately between observed input and output situations.

Fuzzy logic starts with the concept of a fuzzy set. A fuzzy set is a set without a crisp, clearly defined boundary. It can contain elements with only a partial degree of membership.

A fuzzy logic method for modeling nonlinear functions of arbitrary complexity is based on the following processes (Ross 1995).

1. The classification of the system’s variables into categories or classes, where the value of each parameter participates in each class with a certain degree of membership. The degree of membership of a variable’s value in each class is defined by the membership function. A membership function is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1. The membership function embodies all fuzziness for a particular fuzzy set, and its description is the essence of a fuzzy property or operation (Dubois and Prade 1980).

2. The fuzzification, which is the process of making a crisp quantity fuzzy and creating the fuzzy sets. It also involves the development of the membership functions. During this process the membership functions are assigned to the fuzzy variables. This assignment process can be intuitive or it can be based on some algorithmic or logical operations. There are several methods for developing membership functions, such as intuition, inference, neural networks, genetic algorithms, and fuzzy statistics.

3. The formation and application of several conditional rules when observing some complex process.

4. The defuzzification, which is the conversion of a fuzzy quantity to a precise quantity, just as fuzzification is the conversion of a precise quantity to a fuzzy quantity. The output of a fuzzy process can be the logical union of two or more fuzzy membership functions.
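The last two processes can be illustrated with a small sketch. The paper does not state which inference operator or defuzzification method was used, so the example below assumes two common choices: the minimum operator for combining the antecedents of a rule, and the weighted-average method for converting the activated output classes into one crisp value. All function names are hypothetical.

```python
def rule_strength(*memberships: float) -> float:
    """Firing strength of an IF-AND rule, using the minimum operator (assumed)."""
    return min(memberships)


def defuzzify_weighted_average(activations, centers) -> float:
    """Crisp output as the activation-weighted mean of output class centers."""
    total = sum(activations)
    if total == 0.0:
        raise ValueError("no output class is activated")
    return sum(a * c for a, c in zip(activations, centers)) / total


# Two rules fire with strengths 0.25 and 0.75 on output classes
# centered at 1.0 and 2.0 (illustrative numbers only).
strength_low = rule_strength(0.25, 0.9)
strength_high = rule_strength(0.75, 1.0)
crisp = defuzzify_weighted_average([strength_low, strength_high], [1.0, 2.0])
```

The crisp value lies between the two class centers and closer to the one whose rule fired more strongly, which is the interpolation behavior described earlier in this section.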

In classification, the most important issue is deciding what criteria to classify against. For time series applications the process of pattern recognition is extensively used for the classification of the system parameters. Pattern recognition can be defined as a process of identifying structure in data by comparisons to known structure (Fukunaga 1972; Bezdek 1981). The purpose of the pattern recognition is to assign each input to one of possible pattern classes (or data clusters). Presumably, different input observations should be assigned to the same class if they have similar features and to different classes if they have dissimilar features.

The data used to design a pattern recognition system are usually divided into the following two categories much like the categorization used in neural networks:

- the training data and
- the testing data.

The training data are used to establish the algorithmic parameters of the pattern recognition system, while the testing data are used to test the overall performance of the pattern recognition system.

In the present study, for the estimation of the global solar radiation on a horizontal surface, the same input parameters as in the neural network approach were used. Moreover, the same sets of measured data of the Institute of Meteorology and Physics of the Atmospheric Environment, National Observatory of Athens, and for the same time period as in the neural network model were used for training and testing the system.

The training data were classified as follows:

- air temperature data in four classes,
- relative humidity data in three classes, and
- sunshine duration in eight classes.

The global solar radiation data were classified in 13 classes. For all classes the symmetric triangular membership function was used.
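A symmetric triangular membership function and a partition of a variable into a given number of classes can be sketched as follows. The evenly spaced class centers and the 0–30 °C range used in the example are assumptions for illustration; the paper does not report the class boundaries.

```python
def triangular(x: float, center: float, half_width: float) -> float:
    """Symmetric triangular membership: 1 at the center, 0 beyond half_width."""
    return max(0.0, 1.0 - abs(x - center) / half_width)


def uniform_partition(lower: float, upper: float, n_classes: int):
    """Evenly spaced (center, half_width) pairs covering [lower, upper]."""
    step = (upper - lower) / (n_classes - 1)
    return [(lower + i * step, step) for i in range(n_classes)]


def fuzzify(x: float, partition):
    """Degree of membership of x in every class of the partition."""
    return [triangular(x, center, half_width) for center, half_width in partition]


# Air temperature in four classes over a hypothetical 0-30 degC range.
temperature_classes = uniform_partition(0.0, 30.0, 4)
memberships = fuzzify(12.5, temperature_classes)
```

With this overlapping layout a value generally belongs to two neighboring classes at once, and its memberships sum to one.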

Figures 6a and 6b show the comparison of the measured integrated hourly solar radiation values with the fuzzy logic method estimated ones for two years from the training set of data (1987 and 1989) and for the months of July and January, respectively. A very good agreement is observed from the comparison. For most cases the radiation differences are less than 0.20 MJ m^{−2}, while the root-mean-square error between the measured and the estimated values was found equal to 0.16 MJ m^{−2} for July and 0.21 MJ m^{−2} for January.

In order to check the accuracy of the method estimations, the results were tested using the measurements of the testing set of data, which consists of the radiation values of the year 1995. Figures 7a and 7b show the comparison between the estimated and measured total daily values of global solar radiation for the months of July and January, respectively. As shown in these figures, there is a relatively good agreement between the measured and the estimated values. The root-mean-square errors were 0.26 MJ m^{−2} for January and 0.22 MJ m^{−2} for July.

## 3. Comparison of the three models

The global solar radiation values estimated from each one of the three models were compared with the corresponding measured values at the station of the National Observatory of Athens. The comparison was performed for the solar radiation hourly values of the year 1995.

Once a model has been fitted, it must be checked whether it provides an adequate description of the data. This is usually done by working out the residuals, which are defined as measured values minus estimated values. The visual inspection of a plot of the residuals themselves is an indispensable first step in the checking process.

Figures 8 and 9 show the temporal variation of the relative difference (%RD) between the measured global solar radiation values and those estimated by each of the three models for the monthly mean day of July and for the monthly mean day of January, respectively:

where *R*_{meas} and *R*_{est} are the measured global solar radiation values and the values estimated by the three models, respectively.
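The two error measures used in this comparison can be sketched in code. The relative difference below is assumed to follow the residual convention stated earlier (measured minus estimated) and to be normalized by the measured value; the exact normalization is not visible in the text, so this is a hedged reading rather than the authors' definition.

```python
import math


def relative_difference_percent(measured: float, estimated: float) -> float:
    """%RD = 100 * (R_meas - R_est) / R_meas (assumed form)."""
    return 100.0 * (measured - estimated) / measured


def root_mean_square_error(measured, estimated) -> float:
    """RMSE between paired measured and estimated values, in the input units."""
    squared = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))
```

Under this convention a negative %RD means the model overestimates the measurement. The measure also grows when *R*_{meas} is small, so low radiation values produce large relative differences even for modest absolute errors.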

For the month of July, which in Athens consists mostly of clear days and which represents the warm period of the year, there is a close agreement between the atmospheric model estimations and the measured data, with the relative difference ranging from −5.5% to 2.2%. Similarly, a relatively good agreement has been observed between the data-driven models’ estimations and the measured data. The performance of the neural network approach was quite satisfactory, and the relative difference varied between −4.7% and 5.3%. As for the fuzzy logic method, the relative differences ranged from −5.7% to 6.6%. The performance of the atmospheric model is marginally better because it is more analytical and takes into account many more parameters than the two data-driven models. However, using too few inputs can result in inadequate modeling, whereas too many inputs can excessively complicate the model. The data-driven procedure provides a good performance for the warm period of the year despite the unavailability of an analytical theoretical model underlying the observed phenomena.

In all three models the highest and lowest values of the relative difference occur early in the morning (0600 or 0700 LT) or late in the afternoon (1700 or 1800 LT), when the global solar radiation is relatively small. During the rest of the day the relative difference varied between −3% and 3%.
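The arithmetic behind this pattern can be illustrated with a short sketch: the same absolute error produces a much larger relative difference when the measured radiation is small. The error of 10 W m^-2 and the radiation levels below are hypothetical, and the percentage is taken relative to the measured value as an assumption.

```python
def percent_rd(measured, estimated):
    """Relative difference in percent, taken relative to the measured value (assumed form)."""
    return 100.0 * (measured - estimated) / measured


absolute_error = 10.0  # hypothetical constant underestimate in W m^-2

# Hypothetical radiation levels: early morning, mid-morning, near noon.
rd_by_level = {}
for measured in (100.0, 400.0, 800.0):
    estimated = measured - absolute_error
    rd_by_level[measured] = percent_rd(measured, estimated)

print(rd_by_level)
```

An error that amounts to 10% of a low early-morning value shrinks to little more than 1% near noon, which is consistent with the extreme relative differences appearing at the start and end of the day.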

Therefore, for the warm period of the year the two data-driven models perform satisfactorily, at a level quite similar to that of the deterministic atmospheric model.

January, a month with a great number of cloudy days, represents the cold period of the year in Athens. The relative difference between the atmospheric model estimates and the measurements fluctuated between −10.5% and 3.2%. The lowest value (−10.5%) is observed in the morning, while during the day the relative difference varied between 0% and 3%.

By comparison, the neural network model estimates gave a relative difference ranging from −6.7% to 14.3%, while for the fuzzy logic method the relative differences varied between −13.2% and 14.3%. As in July, the two data-driven models gave their highest and lowest relative differences in the morning and in the afternoon, when the global solar radiation is low. During the day the relative difference varied between −4% and 4%.

In January, the atmospheric model gives better estimates of the global solar radiation than the two data-driven models, especially on cloudy days. The main explanation is that the atmospheric model consists of several formulations that calculate the beam, diffuse, and global solar radiation separately for clear and for cloudy days, and that it takes into account a large number of the parameters involved. In contrast, the two data-driven models cannot use as many inputs, because that would imply slower training and slower convergence. Moreover, January in Athens has a great number of cloudy days and frequent weather phenomena such as cloud cover, rainfall, and storms, and the data-driven models cannot always simulate such days successfully because their results depend strongly on the training data.

## 4. Summary and conclusions

The hourly values of global solar radiation are estimated in the present study using the following three models.

A deterministic atmospheric model.

A new neural network system based on back-propagation techniques, designed and trained to model the global solar radiation. Training the networks to learn the hourly radiation values proved highly successful. After training, the network was tested on a separate set of data not used in the training procedure, and its predictions were found to agree well with this testing set of measurements.

A new fuzzy logic method based on fuzzy sets for modeling nonlinear functions. The method comprises the classification of the system’s variables into classes, the fuzzification, the formation and application of conditional rules, and the defuzzification process. The same sets of data as in the neural network approach were used for training and testing the system. The values estimated by the trained system were compared with the corresponding measured values and were found to be in close agreement.
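The four stages of the fuzzy logic method listed above can be sketched with a deliberately small example. This is not the system developed in the study: the single input (sunshine fraction), the three classes, the rule outputs, and the weighted-average defuzzification over singleton outputs are all hypothetical simplifications chosen for brevity.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# 1. Classification: three hypothetical classes of sunshine fraction (0 to 1).
CLASSES = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# 3. Conditional rules: IF sunshine is <class> THEN output is <value>.
# The outputs are hypothetical fractions of the extraterrestrial radiation.
RULES = {"low": 0.25, "medium": 0.45, "high": 0.70}


def estimate_fraction(sunshine_fraction):
    """Run fuzzification, rule application, and defuzzification for one input."""
    if not 0.0 <= sunshine_fraction <= 1.0:
        raise ValueError("sunshine fraction must lie between 0 and 1")
    # 2. Fuzzification: degree of membership of the input in every class.
    memberships = {
        name: triangular(sunshine_fraction, *params)
        for name, params in CLASSES.items()
    }
    # 4. Defuzzification: average of the rule outputs weighted by rule strength.
    total = sum(memberships.values())
    return sum(memberships[name] * RULES[name] for name in RULES) / total


print(estimate_fraction(0.75))
```

A sunshine fraction of 0.75 belongs equally to the "medium" and "high" classes in this sketch, so the estimate falls midway between the two rule outputs. A realistic system would use several meteorological inputs and rules tuned on the training data.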

The measured data and the results of the three models were compared, and this comparison led to the following observations.

For the warm period of the year, which in Athens consists mainly of clear and sunny days, all three models give accurate estimates. The atmospheric model performs marginally better, which can be attributed to the larger number of input parameters that it uses. However, the performance of the two data-driven models can be characterized as very satisfactory for the summer period. The data-driven models also have two major advantages: they are simple, and they do not require any assumption about the underlying function or model. The proposed data-driven models can therefore be used successfully for estimating the global solar radiation in Athens during this period.

During the cold period of the year, which usually consists of a great number of cloudy days and various weather phenomena, the atmospheric model gives considerably better estimates than the two data-driven models. This can be explained by the fact that the atmospheric model consists of various formulations that simulate the global solar radiation separately under several weather conditions. The proposed data-driven models, on the other hand, cannot work efficiently when a large number of inputs are involved. Moreover, the results of the data-driven models depend strongly on the training sets of data, and it is not possible to make long-term estimates for chaotic time series such as the global solar radiation data during the cold period of the year, which is characterized by a high frequency of different weather phenomena.

## Footnotes

*Corresponding author address:* Dr. G. Mihalakakou, University of Athens, Department of Physics, Division of Applied Physics, Laboratory of Meteorology, University Campus, Bldg. PHYS-V, Athens 15784, Greece.

Email: msantam@atlas.uoa.gr