## 1. Introduction

The sea surface temperature (SST) is an important parameter in oceanography and ocean technology, as it forms a prerequisite to understanding and predicting weather and climate, and planning of various offshore activities, such as recreational events and fishing. The prediction of SST, however, is highly uncertain because of large variations in heat flux, radiation, and diurnal wind near the sea surface. To predict the SST, physics-based or data-driven approaches are used. The former provides information over a large spatial domain, whereas the latter can be suitable for site-specific predictions. There are many data-driven techniques available for this purpose that range from traditional stochastic to modern artificial intelligence approaches. The conventional statistical techniques include a Markov model (Xue and Leetmaa 2000), empirical canonical correlation analysis (Collins et al. 2004), regression (Kug et al. 2004), genetic algorithms, and empirical orthogonal functions (Neetu et al. 2011). However, one of the techniques that has become the most popular in recent times is the neural network (NN), also called the artificial neural network.

Numerous past studies have used NNs to predict the SST. Seasonal predictions of the SST over a certain region in the tropical Pacific were made by Tangang et al. (1997) using an NN that took as input the empirical orthogonal functions of wind stress as well as the SST itself. The accuracy of this method over a lead time of 6 months was comparable to that of the El Niño–Southern Oscillation (ENSO)-based models. A comparison of an NN with canonical correlation analysis and linear regression in predicting equatorial Pacific SST was made by Tangang et al. (1997). A study by Pozzi et al. (2000) indicated the usefulness of the NN as a complementary tool to more conventional approaches and paleoceanographic data analysis. Wu et al. (2006) predicted the SST over the tropical Pacific with an NN that forecast the SST principal components at lead times of 3 to 15 months from inputs of SST and sea level pressure. Tanvir and Mujtaba (2006) developed many NN-based relationships to predict the rise in the temperature of seawater during a desalination process.

The monthwise SST data for a certain region in the Indian Ocean were analyzed by Tripathi et al. (2008), and monthly averaged SST predictions were made. This study involved SST predictions based on past SST data, while the one by Garcia-Gorriz and Garcia-Sanchez (2007) considered meteorological variables as input to predict targeted satellite-derived SST values in the western Mediterranean Sea. The networks trained in this way predicted the seasonal and interannual variability of the SST well. Gupta and Malmgren (2009) compared the prediction skills of different methods based on certain transfer functions, regressions, and an NN at the Antarctic and Pacific Oceans and found that the NN generally performed better than the other methods. The prediction skills of the NN with support vector machine and linear regression were also compared by Aguilar-Martinez and Hsieh (2009), who found that the NN provided better overall predictions than the support vector machine. Lee et al. (2011) used an NN to identify sources of errors in satellite-based SST estimates. The temperature of air, the direction of wind, and the relative humidity affected the SST derivations significantly. More recently Mahongo and Deo (2013) predicted the SST over a subsequent month and season at sites near an East African shore using different NNs and also based on the autoregressive integrated moving average (ARIMA) method. It was observed that among all of the NNs, the nonlinear autoregressive network had better skills not only in forecasting monthly and seasonal SST anomalies, but also in capturing ENSO and Indian Ocean dipole (IOD) events. This observation with respect to single-time-step predictions was later confirmed by Patil et al. (2014) when multiple-time-step predictions were attempted.

A review of the readily available publications mentioned above showed that the technique of NN is promising in predicting SST but that it needs a variety of applications and more experimentation to exploit its full potential.

Many agencies around the world provide real-time forecasts of SST through web-based platforms. These are based on numerical modeling. Numerical models explain the process of heat transfer across the atmosphere and the oceans. These models are essentially designed to obtain average information over a large spatial domain. However, for some applications such as fishing and sports events, site-specific information is more desirable. In this work we investigated whether the numerical forecasts of SST are suitable for location-specific predictions, and if not, whether this can be satisfactorily done by a neural network.

## 2. Materials and methods

### a. Study area and data

The long-term data required to predict an SST can be derived from the product of numerical models of the atmosphere and ocean specified over geographical grids of a certain size. Often these products are refined by assimilating in situ–based or satellite-based observations. There are a variety of instruments for in situ measurements, such as thermometers and thermistors mounted on drifting or moored buoys, Argo, and moorings. The satellite-based technique may consist of sensing the ocean radiation of certain wavelengths of an electromagnetic spectrum and relating it to the SST. Microwave radiometry based on an imaging radiometer called the Moderate Resolution Imaging Spectroradiometer is also in use.

The monthly raw or unprocessed SST was converted into anomalies by subtracting appropriate mean values before using them as input in modeling. This was necessary in view of the small range of fluctuations around the means compared to the changes in absolute values. Anomalies were obtained by subtracting the corresponding long-term mean, computed over the entire sample, from the given (unprocessed or absolute) SST value. The means of observed and modeled data were calculated separately, and they were different for each type of data (monthly, weekly, and daily). For monthly data and a given month, say January, the mean was calculated over all January SST values, and this mean was subtracted from the given (unprocessed) January SST value to get the SST anomaly. The daily database already consisted of anomalies computed with respect to long-term means.
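As a minimal sketch of the monthly anomaly computation described above, the following Python fragment subtracts from each value the long-term mean of its calendar month. The function name `monthly_anomalies` and the sample values are illustrative assumptions, not part of the study.

```python
from collections import defaultdict

def monthly_anomalies(months, sst):
    """Return SST anomalies: each value minus the long-term mean of its month.

    months: calendar month (1-12) of each sample; sst: SST values in deg C.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for m, value in zip(months, sst):
        sums[m] += value
        counts[m] += 1
    # Long-term mean for each calendar month, computed over the whole sample.
    climatology = {m: sums[m] / counts[m] for m in sums}
    return [value - climatology[m] for m, value in zip(months, sst)]

# Two illustrative years of January and February values (made-up numbers).
months = [1, 2, 1, 2]
sst = [27.0, 28.0, 28.0, 29.0]
anomalies = monthly_anomalies(months, sst)
```

With these made-up values the January mean is 27.5°C and the February mean is 28.5°C, so each anomaly is the departure of a value from the mean of its own month.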

The SST data were extracted at six different locations around India and within the Indian Ocean, as shown in Fig. 1, which also provides the coordinates of these locations. The locations and the code name for each location are as follows: Arabian Sea (AS), Bay of Bengal (BoB), west of Indian Ocean (WEIO), east of Indian Ocean (EEIO), off the African coast (THERMO), and south of the Indian Ocean (SOUTHIO). The predictions were made based on three different time scales: daily, weekly, and monthly. Accordingly, different sources of data were selected as in Table 1 per their availability in appropriate sample sizes. The sample size for daily analysis was 28 months (2012–14), that for weekly analysis was 34 years (1981–2014), and that for monthly predictions was 145 years (1870–2014). The characteristics of these data are given below:

*INCOIS data.* These numerical SST data, provided by the Earth System Science Organisation (ESSO)–Indian National Centre for Ocean Information Services (INCOIS), were available every 6 h and over spatial grids of size 0.25° × 0.25°. The data involved modeled products of the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) and were used as numerical data in this work in daily SST analysis.

*NOAA (OISSTv2) data.* These are high-resolution and optimally interpolated SST (OISST, version 2; OISSTv2) data developed by the National Oceanic and Atmospheric Administration (NOAA) of the United States. The data use Advanced Very High Resolution Radiometer (AVHRR) infrared satellite SST and in situ data from ships and buoys (http://www.ncdc.noaa.gov/sst/description.php). The data involve optimum interpolation for specification over a grid resolution of 0.25° × 0.25° and a temporal resolution of 1 day. These were treated as observations for daily and weekly analysis. Among other SST products of NOAA, these data are more suitable for daily predictions due to their high sample size, near-surface measurements, and accounting for diurnal effects.

It may be noted that NOAA’s National Environmental Satellite, Data, and Information Service (NESDIS) generates a variety of SST products. The NOAA OISSTv2 dataset we used represents daily OISST maps prepared from single-day satellite observations averaged over a grid size of 1/4°. The observations are restricted to the upper 0.5- or 1-m water depth, and they reflect the diurnal behavior of the SST and thus are suitable for modeling daily values. Higher-resolution SST products are also available, for example, the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) dataset, which has a grid size of 1/20°. However, these data are observed at greater water depths and may not capture the diurnal variability of the SST; moreover, their sample size is small. Such information represents foundation SST products and serves as boundary conditions for weather forecasting models.

*CCCma data.* These numerical SST data were generated by the Canadian Centre for Climate Modelling and Analysis (CCCma), from the Second Generation Earth System Model (CanESM2). The data resolution was 0.93°N × 1.40°E. These data served as numerical data in weekly and monthly analyses (http://www.cccma.ec.gc.ca/data/cgcm4/CanESM2/index.shtml).

*The Hadley SST data.* This high-quality observation dataset belonged to the U.K. Hadley Centre’s collection called the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) (http://www.metoffice.gov.uk/hadobs/hadisst/). The dataset incorporates observations received through the Global Telecommunications System, which are specified over a spatial resolution of a 1° × 1° grid.

Table 1. Details of the datasets used.

Table 1, referred to above, also shows the pairs of “numerical” and corresponding “observed” datasets considered in this study over the scales of daily, weekly, and monthly, and their corresponding durations.

### b. Artificial neural network

In the model of a single neuron, $O$ denotes the output from the neuron; $x_1, x_2, \ldots$ denote the input values; $w_1, w_2, \ldots$ denote the weights along the linkages connecting two neurons, which indicate the strength of the connections; and $\theta$ denotes the bias value.

Many applications of NNs in ocean engineering are based on the feed-forward type of network rather than the feedback or recurrent type. A feed-forward multilayer network consists of an input layer, one or more hidden layers, and an output layer of neurons. Before it is actually applied, the network is calibrated or trained using mathematical algorithms and a set of input–output examples. Details of such training methods can be seen in textbooks like Wasserman (1993) and Wu (1994). To arrive at the most accurate estimation of target values, different algorithms, along with their training parameters such as learning rate, learning tolerance, and momentum factor, have to be tried before a selection is made.
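The computation performed by a single neuron can be illustrated with a short Python sketch. It assumes a sigmoid activation applied to the weighted sum of the inputs plus the bias, which is a common choice and is consistent with the sigmoid network mentioned later in the paper; the function name `neuron_output` and the numbers are hypothetical.

```python
import math

def neuron_output(inputs, weights, bias):
    """Output O of one neuron: sigmoid of the weighted input sum plus the bias.

    The sigmoid activation is an assumption made for this illustration.
    """
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))

# Two inputs x1, x2 with weights w1, w2 and a zero bias (made-up numbers).
out = neuron_output([0.5, -1.0], [2.0, 1.0], 0.0)
```

In a multilayer feed-forward network the outputs of one layer of such neurons become the inputs of the next layer.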

### c. Wavelet neural network

One of the recent types of neural networks is a hybrid type and called wavelet neural network (WNN), in which the input into the network is preprocessed for ease of learning using wavelet transform. Applications of this wavelet transform–neural network combination have been recently reported typically in forecasting significant wave height (Deka and Prahlada 2012; Dixit et al. 2015) and also in runoff predictions (Shoaib et al. 2014). Basic information on wavelet networks is given below; however, for details readers are referred to Alexandridis and Zapranis (2013), Mallat (1998), and Moraud (2009).

In WNN input preprocessing is done by decomposing a given series by passing it through high- and low-pass filters and providing such decomposed components as input to the network in place of the original irregular and noisy signal. The breaking of the series in this way is done through a cyclic function called a wavelet. A wavelet transform (WT) essentially isolates a particular feature (frequency) in a given time series from its surrounding features for correct identification of the former. The WT gives frequency-dependent information without losing time-dependent knowledge, such as trends, periodicities, and discontinuities. This is done by filtering or moving a cyclic function or wavelet of a certain width along the time axis. The width of the function can be made larger while analyzing low-frequency components, and smaller while dealing with high-frequency components. These cyclic functions or wavelets are prepared from a “mother wavelet” *ψ* using two transformations: (i) dilation or expansion by a scale parameter *s* and (ii) a change in form or position or shift by translation parameter *u*. The scale and translation parameters determine the resolution or width of the window or filtered portion.

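The wavelet derived from the mother wavelet $\psi(x)$ is given by

$$\psi_{s,u}(x) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{x-u}{s}\right).$$

For a given series $f(x)$, the wavelet transform $\mathrm{WT}_{s,u}[f(x)]$ is mathematically given by

$$\mathrm{WT}_{s,u}[f(x)] = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(x)\,\psi^{*}\!\left(\frac{x-u}{s}\right)\,dx,$$

where $\psi^{*}(\cdot)$ denotes the complex conjugate of $\psi(\cdot)$.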

For a given series, the mother wavelet acts as a high- and low-pass filter, separating these two components. The separated high- [detail (D)] and low- [approximate (A)] frequency components are analyzed separately. The approximate component is important, and hence it is further decomposed into detail and approximate parts. Such decomposition continues over a few levels. Each decomposed wavelet is a scaled and shifted version of the mother wavelet. Figure 3 shows an example of third-level decomposition (at site AS). The underlying wavelet was of discrete Meyer (dmey) type, which is explained later. The bottommost series is the original SSTA series, while the top three series (D1, D2, and D3) represent the decomposed first, second, and third detail series, respectively. The fourth series from the top is the third-level approximate component A3. The necessity and sufficiency of third-level decomposition was determined by trials aimed at obtaining the best testing performance. As shown in Fig. 4, these four decomposed components are provided as input to the NN in place of the original SSTA series. In our work, WNN is used to denote a sigmoid neural network for which the input was derived from wavelet analysis, although conventionally WNN indicates the network in which the wavelet is used as an activation function. The wavelet decomposition was carried out over the training set only because the test data were not supposed to be available for actual application.
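The repeated splitting of the approximate component can be sketched in a few lines of Python. For simplicity the sketch substitutes the Haar wavelet for the dmey wavelet used in the study, because the Haar filters reduce to pairwise sums and differences; the function names and the sample series are hypothetical. The coefficient arrays here halve in length at each level, and how the components were aligned with the original time axis in the study is not stated in this section.

```python
import math

def haar_step(signal):
    """One level of the Haar transform: (approximation, detail), each half as long."""
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail

def decompose(signal, levels=3):
    """Return ([D1, ..., Dn], An) by repeatedly splitting the approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the low-frequency (approximate) part is decomposed further.
        approx, detail = haar_step(approx)
        details.append(detail)
    return details, approx

series = [0.1, 0.3, -0.2, 0.0, 0.4, 0.2, -0.1, -0.3]  # made-up SSTA values
(d1, d2, d3), a3 = decompose(series, levels=3)
```

The four outputs `d1`, `d2`, `d3`, and `a3` play the roles of the D1, D2, D3, and A3 components that replace the original series as network input.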

The mother wavelet functions could be of different types or families and can be classified by the width of the filter window (which influences the localization) and also by the order of signal moments incorporated in it (which affects the extent of representation of the signal as a polynomial) (Addison 2002). With respect to the latter classification, a wavelet function called Daubechies3 typically indicates signal representation through three features: constant, linear terms, and quadratic terms.

In this study, seven of the more common families of wavelet functions were compared. The variation of their signal strengths over time is shown in Fig. 5. The Haar wavelet has the simplest shape, that of a step function, and thus it can effectively deal with discontinuous series. The Symlet, Daubechies, and Coiflet wavelets have comparable shapes but vary in their extent of phases. The Coiflet wavelet has the advantage that both the wavelet and scaling functions have vanishing moments, facilitating efficient wavelet transformation. The dmey type has the distinction that the wavelet and scaling functions efficiently operate in the frequency domain. The biorthogonal and reversed biorthogonal wavelets exhibit the property of the linear phase, which is helpful in signal and image reconstruction.

## 3. Results and discussion

### a. Accuracy of the numerical forecasts

The basic statistics of minimum, mean, and maximum values of observed SST are given in Table 2 for all locations. The minimum SST has a fair amount of variation across all sites, which is not so with the maximum SST, and the mean SST values are higher at EEIO and WEIO compared to other sites.

Table 2. Basic statistics of observed SST at all locations over daily, weekly, and monthly time scales.

To begin, the accuracy of numerical SST estimations versus corresponding observations was checked over the total duration involved in various time scales. This was done by evaluating the coefficient of correlation *r*, the root-mean-square error (RMSE), and the mean absolute error (MAE). The use of multiple error criteria enabled a balanced understanding of the differences between the estimated and observed values. While *r* indicates the degree of linear association of the two datasets, RMSE and MAE measure the actual deviations between them; because RMSE squares the deviations, it gives more weight to large differences than MAE does.
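The three error measures can be computed as in the following Python sketch, which uses their standard definitions. The function name `error_stats` and the sample values are illustrative assumptions.

```python
import math

def error_stats(predicted, observed):
    """Return (r, RMSE, MAE) between two equally long sequences."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r = cov / math.sqrt(var_p * var_o)  # Pearson correlation coefficient
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
    mae = sum(abs(p - o) for p, o in zip(predicted, observed)) / n
    return r, rmse, mae

# Three small misses and one large miss (made-up numbers, deg C).
observed = [0.0, 0.2, 0.4, 0.6]
predicted = [0.1, 0.1, 0.5, 1.5]
r, rmse, mae = error_stats(predicted, observed)
```

In this made-up sample the single large miss raises the RMSE above the MAE, which illustrates why the two measures are reported together.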

Table 3 shows a comparison of the SST anomaly (SSTA) that resulted in terms of measures mentioned above between the raw numerical estimates and the corresponding observations at all six locations and over three different time scales. It is clear from the table that the raw numerical estimations are associated with very low values of *r* and high values of RMSE and MAE. Overall, the highest value of *r* was 0.57, which is much less than the desired value of 1.0. The magnitudes of RMSE and MAE were high in the range of (0.38°C, 0.84°C) and (0.30°C, 0.69°C), respectively, indicating the limitations of extracting a site-specific numerical outcome. Data-driven methods are an alternative to such physics-based numerical evaluations; however, considering the general reluctance of the user community to rely only on data-driven approaches, it may be worthwhile to adopt a hybrid physics- and data-based approach. The next section describes how the numerical results can be combined with neural networks to make attractive SST predictions over multiple time steps in the future.

Table 3. Numerical vs observed SSTA at six locations of study (RMSE and MAE in °C).

### b. Combining numerical and neural methods

The procedure of combining numerical and neural outputs began with obtaining the error or difference between numerical estimations and corresponding observations at a given time step and forming an error time series as a result. Thereafter, time series forecasting was accomplished with the help of a neural network for which the input was a sequence of a few preceding errors for a given time step and the output was predicted errors over multiple time steps in the future. Such predicted errors were added to the numerical estimation, and SSTA predictions over multiple time steps were made. Such a methodology thus brings together the advantages of both physics-based and data-driven approaches, which might be expected to produce a reliable and accurate outcome.
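The combination procedure can be outlined in Python as follows. This is only a sketch under stated assumptions: the error is taken as observation minus numerical estimate (so that adding the predicted error corrects the estimate), numerical estimates are assumed to be available for the future steps, and a simple persistence rule stands in for the trained neural network. All function names and values are hypothetical.

```python
def error_series(numerical, observed):
    """Error at each time step, defined here as observed minus numerical."""
    return [o - n for n, o in zip(numerical, observed)]

def persistence_forecast(errors, steps):
    """Placeholder forecaster: repeat the last known error for every lead time.

    A trained neural network would take this function's place.
    """
    return [errors[-1]] * steps

def corrected_forecast(numerical_future, predicted_errors):
    """Add the predicted errors to the numerical estimates."""
    return [n + e for n, e in zip(numerical_future, predicted_errors)]

numerical_past = [0.20, 0.10, 0.30]   # made-up SSTA values, deg C
observed_past = [0.50, 0.30, 0.60]
numerical_future = [0.25, 0.15]

errors = error_series(numerical_past, observed_past)
predicted = persistence_forecast(errors, steps=len(numerical_future))
ssta_forecast = corrected_forecast(numerical_future, predicted)
```

Keeping the forecaster as a separate function makes it clear that only the error series is learned from data, while the numerical estimate supplies the physics-based part of the prediction.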

A great amount of experimentation with network architectures and learning algorithms was done. This included the ordinary feed-forward backpropagation, as well as radial basis function, generalized regression, and nonlinear autoregression types of neural networks. For details about these network architectures and other basic features of neural networks, readers are referred to the textbooks of Hagan et al. (2014), Haykin (1999), and Wasserman (1993), and also to neural network reviews by the ASCE Task Committee (2000a,b), Maier and Dandy (2000), and Jain and Deo (2006).

A typical feed-forward autoregressive network takes time-delayed (TDL) input, namely, SSTA(*t*), SSTA(*t*−1), SSTA(*t*−2), …, where SSTA(*t*−*n*) is the SST anomaly at time step *t* − *n*. This input is fed through the input nodes, and after processing, the prediction for the desired future time step (*t* + *n*) is collected from the output node. The network is thus allowed to learn an unknown hidden pattern in the preceding sequence and to predict the future value by moving ahead in time in a sliding-window fashion.
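The sliding-window arrangement of time-delayed inputs and future targets can be sketched in Python as below. The function name `make_windows` and its parameters `n_lags` and `lead` are hypothetical, and integer values are used in place of SSTA so that the pairing is easy to follow.

```python
def make_windows(series, n_lags, lead):
    """Build (inputs, target) pairs for prediction `lead` steps ahead.

    inputs = [x(t), x(t-1), ..., x(t-n_lags+1)]
    target = x(t + lead)
    """
    pairs = []
    for t in range(n_lags - 1, len(series) - lead):
        inputs = [series[t - k] for k in range(n_lags)]
        pairs.append((inputs, series[t + lead]))
    return pairs

series = [1, 2, 3, 4, 5, 6]
pairs = make_windows(series, n_lags=3, lead=2)
# pairs == [([3, 2, 1], 5), ([4, 3, 2], 6)]
```

Each pair is one training example: the window slides forward by one time step, and the target always lies `lead` steps beyond the most recent input.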

Before its actual application, a network is required to be trained, or the connection weight and bias values are required to be fixed using a mathematical optimization algorithm that varies from a simple gradient descent to a complex genetic algorithm. These algorithms usually operate by minimizing the sum-squared error between the target output and the network-yielded output. The errors are formed when a series of known input–output pairs are fed one after another to the network.
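The idea of fixing weight and bias values by minimizing the sum-squared error can be illustrated with the simplest case, a single linear neuron trained by batch gradient descent. This Python sketch is an illustration of the principle only, not the training algorithm used in the study; the function name, learning rate, and data are assumptions.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=500):
    """Fit y ~ w*x + b by batch gradient descent on the sum-squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Gradients of sum((w*x + b - y)**2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples)
        grad_b = sum(2 * (w * x + b - y) for x, y in samples)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Known input-output pairs generated from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_neuron(samples)
```

For this noise-free example the weight and bias approach 2 and 1, the values that generated the data. A multilayer network follows the same principle, with the gradients propagated backward through the layers.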

Initial trials had shown that instead of the above-mentioned basic network, a nonlinear autoregressive type of network would work more effectively, since this type brings more memory and nonlinearity into the autoregression through an additional feedback loop. The network training was done using the initial 70% of the data, and testing was carried out with the remaining 30%. To impart adequate training commensurate with the time scale of the underlying time series, separate networks were developed for daily, weekly, and monthly SSTA values. The number of input and hidden nodes was determined by trial and error. In every trial a fixed number of input and hidden nodes was considered, and the network was trained on the training dataset and then tested over the testing dataset. The number of input and hidden nodes was varied from one onward, and in a given architecture the two numbers were not necessarily the same. The pair (number of input nodes, number of hidden nodes) that gave the best out-of-sample results was selected for further analysis.

An example of the performance of such a network during testing is shown in Fig. 6, in which time histories of the target and network-yielded SSTA and their respective scatter are compared. This example pertains to a 3-day-ahead prediction of SSTA at the site BoB. This prediction was associated with an *r* of 0.41, an RMSE of 0.57°C, and an MAE of 0.46°C. The overall error statistics between the network predictions and actual observations at this site for a lead time of 5 days are shown in Table 4. The table indicates low values of *r* and high values of RMSE and MAE. Although these statistics were an improvement over those of the raw numerical estimates at the given time (first row of Table 4), they indicated a need to further improve the predictions by more experimentation.

Table 4. The testing error statistics at site BoB; the NN predictions pertain to the nonlinear autoregressive NN (RMSE and MAE in °C).

Therefore, we examined a recent type of hybrid network called a WNN, discussed in section 3c.

### c. Results with WNN

The prediction of SSTA was made over three different scales: daily, weekly, and monthly. The wavelet decomposition was carried out over the training set only, since the test data were not supposed to be available in actual application. At every scale, SSTA was predicted over five time steps in the future. In general, the “dmey” wavelet with three levels of decomposition was found to yield relatively better results in terms of the error statistics of *r*, RMSE, and MAE in comparison with the six wavelets discussed earlier. An example is given in Figs. 7–9 pertaining to station AS. These figures show the variation of *r*, RMSE, and MAE, respectively, against prediction intervals across seven wavelet types. The clear edge of the dmey wavelet indicates that in this application, the preprocessing of the SSTA series was efficiently handled by the frequency-domain operation of this wavelet and the scaling function envisaged in it. A similar observation was also made over the remaining five sites. Therefore, the following discussion is based on this type of WNN only.

#### 1) Daily predictions

The performance of the network in daily SSTA predictions is given in Figs. 10a–c at the location WEIO as an example. These figures compare WNN results with corresponding observations during the testing period and show how *r*, RMSE, and MAE changed over the five time steps in the future at the location WEIO. The *r* values were very high over all steps and near their ideal value of 1.0. The RMSE and MAE were less than approximately 0.20°C for 5 days.

The output of the numerical SST models was not available in real-time or online mode, and hence a comparison of the numerical estimates and corresponding observations was made at the same time step only as an offline performance assessment. Such numerical estimations, when compared with corresponding measured values for the same testing period as above, yielded error statistics of *r* = 0.37, RMSE = 0.47°C, and MAE = 0.38°C. These much lower values of *r* and higher values of RMSE and MAE compared with those in Fig. 10 show a significant increase in accuracy by adoption of the suggested prediction method.

An overall testing performance at all six stations and over all five time intervals is given in Table 5. A generally very satisfactory prediction can be noted. This was associated with high values of *r* and low values of RMSE (lower than 0.25°C) and MAE (lower than 0.20°C) with only a couple of exceptions at the highest time intervals.

Table 5. Testing performance of the daily prediction models over all locations and prediction intervals. (Note: The underlying network is WNN.)

#### 2) Weekly predictions

The network performance in weekly SSTA predictions at station BoB is given in Figs. 11a–c. Over the testing period, these figures compare WNN results with corresponding observations and show as an example at site BoB how *r*, RMSE, and MAE changed over five time steps in the future. The *r* values were above 0.85 over all steps and near their ideal value of 1.0. The RMSE and MAE values were less than approximately 0.27°C. The weekly numerical estimations, when compared with corresponding measured values for the same testing period, yielded error statistics of *r* = 0.27, RMSE = 0.78°C, and MAE = 0.64°C, showing significantly lower accuracy in numerical estimations.

Table 6 presents the overall testing performance at all stations and over all time intervals considered. The high *r* values (above 0.90), with a couple of exceptions at the longest interval, indicate that the predictions strongly correlated with the observations. Further, these predictions had the RMSE contained within 0.26°C and MAE within 0.22°C, indicating very satisfactory predictions.

Table 6. Testing performance of the weekly prediction models over all locations and prediction intervals. (Note: The underlying network is WNN.)

#### 3) Monthly predictions

Figures 12a–c show the network performance in monthly SSTA predictions at station EEIO over the testing period. They compare WNN results with corresponding observations and show, as an example at site EEIO, how *r*, RMSE, and MAE changed over five time steps in the future. The figures indicate that *r* values were more than approximately 0.60 and that RMSE and MAE were less than around 0.30° and 0.25°C, respectively, for all steps. The monthly numerical estimations, when compared with the corresponding measured values for the same testing period, yielded error statistics of *r* = 0.11, RMSE = 0.58°C, and MAE = 0.46°C. This shows a much lower performance of the latter method. Figure 12, along with the earlier Figs. 10 and 11 belonging to daily and weekly time scales, respectively, indicates a drop in *r* after two time steps. This might be because of the reduced memory associated with increasing time steps, making predictions difficult for an autoregressive type of network.

The testing performance at all six stations and over all five time intervals is given in Table 7. A generally satisfactory prediction can be noted, with an RMSE below 0.31°C and an MAE below 0.25°C.

Table 7. Testing performance of the monthly prediction models over all locations and prediction intervals. (Note: The underlying network is WNN.)

The past works on SSTA predictions made using NN dealt mainly with monthly predictions, and hence, the present results giving monthly, weekly, and daily predictions can be compared with them only in a limited way. Studies by Tangang et al. (1997) and Tang et al. (2000) were focused on the equatorial Pacific and incorporated inputs of empirical orthogonal functions of SSTA, sea level pressures, and wind stress. The authors could predict monthly SSTA over a 3-month horizon with *r* values up to 0.80. Over the southern part of the Indian Ocean, Tripathi et al. (2008) used a standard feed-forward network and carried out SSTA predictions over the next month only. They achieved higher *r* values of 0.84 over some cases of their analysis. Over two locations in the western Indian Ocean, Mahongo and Deo (2013) used a nonlinear autoregressive type of network with the input of only preceding SSTA and achieved a high accuracy seen in terms of *r* = 0.86. Thus, the present modeling strategy resulted in substantial improvement in linear correspondence between modeled and observed SST, and this too over monthly, weekly, and daily time scales, and also over five time steps in the future.

One of the problems sometimes faced while training a neural network is the overfitting of data, in which the network fails to generalize and instead starts learning specific examples (ASCE Task Committee 2000a,b; Maier and Dandy 2000). This might happen because of the use of a large number of hidden nodes or training patterns. By carrying out several trials on network architecture, hidden neurons, and control parameters, and the proportion of training and testing data, we have ensured that this did not happen. The high level of accuracy realized during the testing process may indicate the absence of overfitting.

## 4. Conclusions

A comparison of given numerical estimations and corresponding measurements showed large deviations. This highlighted the necessity to evolve innovative techniques for SST predictions over future time intervals. The procedure of combining numerical and neural techniques presented in this paper generally produced accurate SST predictions of daily, weekly, and monthly values over five time steps in the future at the selected locations. With very few exceptions, such methods resulted in substantially improving the linear correspondence between modeled SST and observations.

The suggested method has the advantage that it considers physics-based and data-driven approaches together when making SSTA predictions. A large amount of experimentation with alternative network architectures and learning schemes was necessary to achieve high accuracy in the results. Such exercises showed that preprocessing the input of the neural network, which predicts errors of the numerical estimation, by a wavelet transform with a “dmey” type of wavelet family worked most satisfactorily in this application. The performance of the suggested scheme with real-time numerical predictions needs to be assessed in the future.

## Acknowledgments

This study was made as part of a research project funded by ESSO-INCOIS, Ministry of Earth Sciences, government of India, Hyderabad, India, under the High Resolution Operational Ocean Forecast and Reanalysis System (HOOFS) program (F/INCOIS/HOOFS-0402013 Dt. 21.06.2013). The above work used the language platform in MATLAB R2012b (8.0.0.783) and specifically its Neural Network Toolbox, version 8.0.

## REFERENCES

Addison, P. S., 2002: *The Illustrated Wavelet Transform Handbook: Introductory Theory and Applications in Science, Engineering, Medicine and Finance*. Institute of Physics Publishing, 353 pp.

Aguilar-Martinez, S., and W. W. Hsieh, 2009: Forecasts of tropical Pacific sea surface temperatures by neural networks and support vector regression. *Int. J. Oceanogr.*, **2009**, 167239, doi:10.1155/2009/167239.

Alexandridis, A. K., and A. D. Zapranis, 2013: Wavelet neural networks: A practical guide. *Neural Networks*, **42**, 1–27, doi:10.1016/j.neunet.2013.01.008.

ASCE Task Committee, 2000a: Artificial neural networks in hydrology. I: Preliminary concepts. *J. Hydrol. Eng.*, **5**, 115–123, doi:10.1061/(ASCE)1084-0699(2000)5:2(115).

ASCE Task Committee, 2000b: Artificial neural networks in hydrology. II: Hydrologic applications. *J. Hydrol. Eng.*, **5**, 124–137, doi:10.1061/(ASCE)1084-0699(2000)5:2(124).

Collins, D. C., C. J. C. Reason, and F. Tangang, 2004: Predictability of Indian Ocean sea surface temperature using canonical correlation analysis. *Climate Dyn.*, **22**, 481–497, doi:10.1007/s00382-004-0390-4.

Deka, P. C., and R. Prahlada, 2012: Discrete wavelet neural network approach in significant wave height forecasting for multistep lead time. *Ocean Eng.*, **43**, 32–42, doi:10.1016/j.oceaneng.2012.01.017.

Dixit, P., S. Londhe, and Y. Dandawate, 2015: Removing prediction lag in wave height forecasting using neurowavelet modelling technique. *Ocean Eng.*, **93**, 74–83, doi:10.1016/j.oceaneng.2014.10.009.

Garcia-Gorriz, E., and J. Garcia-Sanchez, 2007: Prediction of sea surface temperatures in the western Mediterranean Sea by neural networks using satellite observations. *Geophys. Res. Lett.*, **34**, L11603, doi:10.1029/2007GL029888.

Gupta, S. M., and B. A. Malmgren, 2009: Comparison of the accuracy of SSTA estimates by artificial neural networks (ANN) and other quantitative methods using radiolarian data from the Antarctic and Pacific Oceans. *Earth Sci. India*, **2**, 52–75.

Hagan, M. T., H. B. Demuth, M. H. Beale, and O. De Jesùs, 2014: *Neural Network Design*. 2nd ed. Hagan and Demuth, 1012 pp.

Haykin, S., 1999: Adaptive filters. 6 pp. [Available online at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.6386&rep=rep1&type=pdf.]

Jain, P., and M. C. Deo, 2006: Neural networks in ocean engineering.

,*Ships Offshore Struct.***1**, 25–35, doi:10.1533/saos.2004.0005.Kug, J.-S., Kang I.-S. , Lee J.-Y. , and Jhun J.-G. , 2004: A statistical approach to Indian Ocean sea surface temperature prediction using a dynamical ENSO prediction.

,*Geophys. Res. Lett.***31**, L09212, doi:10.1029/2003GL019209.Lee, Y.-H., Ho C.-R. , Su F.-C. , Kuo N.-J. , and Cheng Y.-H. , 2011: The use of neural networks in identifying error sources in satellite-derived tropical SST estimates.

,*Sensors***11**, 7530–7544, doi:10.3390/s110807530.Mahongo, S. B., and Deo M. C. , 2013: Using artificial neural networks to forecast monthly and seasonal sea surface temperature anomalies in the western Indian Ocean.

,*Int. J. Ocean Climate Syst.***4**, 133–150, doi:10.1260/1759-3131.4.2.133.Maier, H. R., and Dandy G. C. , 2000: Neural networks for the prediction and forecasting of water resources variables: A review of modelling issues and applications.

,*Environ. Modell. Software***15**, 101–124, doi:10.1016/S1364-8152(99)00007-9.Mallat, S. G., 1998:

. Academic Press, 577 pp.*A Wavelet Tour of Signal Processing*Moraud, E. M., 2009: Wavelet networks. School of Informatics, University of Edinburgh, 8 pp. [Available online at http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0809/martinmoraud.pdf.]

Neetu, Sharma R. , Basu S. , Sarkar A. , and Pal P. K. , 2011: Data-adaptive prediction of sea-surface temperature in the Arabian Sea.

,*IEEE Geosci. Remote Sens. Lett.***8**, 9–13, doi:10.1109/LGRS.2010.2050674.Patil, K. R., Deo M. C. , and Ravichandran M. , 2014: Neural networks to predict sea surface temperature.

*Proc. 19th Int. Conf. on Hydraulics, Water Resources, Coastal and Environmental Engineering (HYDRO 2014)*, Bhopal, Madhya Pradesh, India, Indian Society of Hydraulics, 1317–1326.Pozzi, M., Malmgren B. A. , and Monechi S. , 2000: Sea surface-water temperature and isotopic reconstructions from nannoplankton data using artificial neural networks.

,*Palaeontol. Electron.***3**, 4. [Available online at http://palaeo-electronica.org/2000_2/neural/issue2_00.htm.]Shoaib, M., Shamseldin A. Y. , and Melville B. W. , 2014: Comparative study of different wavelet based neural network models for rainfall–runoff modeling.

,*J. Hydrol.***515**, 47–58, doi:10.1016/j.jhydrol.2014.04.055.Tang, B., Hsieh W. W. , Monahan A. H. , and Tangang F. T. , 2000: Skill comparisons between neural networks and canonical correlation analysis in predicting the equatorial Pacific sea surface temperatures.

,*J. Climate***13**, 287–293, doi:10.1175/1520-0442(2000)013<0287:SCBNNA>2.0.CO;2.Tangang, F. T., Hsieh W. W. , and Tang B. , 1997: Forecasting the equatorial Pacific sea surface temperatures by neural network models.

,*Climate Dyn.***13**, 135–147, doi:10.1007/s003820050156.Tanvir, M. S., and Mujtaba I. M. , 2006: Neural network based correlations for estimating temperature elevation for seawater in MSF desalination process.

,*Desalination***195**, 251–272, doi:10.1016/j.desal.2005.11.013.Tripathi, K. C., Rai S. , Pandey A. C. , and Das I. M. L. , 2008: Southern Indian Ocean SST indices as early predictors of Indian summer monsoon.

,*Indian J. Mar. Sci.***38**, 70–76.Wasserman, P. D., 1993:

John Wiley & Sons, Inc., 255 pp.*Advanced Methods in Neural Computing.*Wu, A., Hsieh W. W. , and Tang B. , 2006: Neural network forecasts of the tropical Pacific sea surface temperatures.

,*Neural Networks***19**, 145–154, doi:10.1016/j.neunet.2006.01.004.Wu, K. K., 1994:

*Neural Networks and Simulation Methods.*Marcel Decker, 456 pp.Xue, Y., and Leetmaa A. , 2000: Forecasts of tropical Pacific SST and sea level using a Markov model.

,*Geophys. Res. Lett.***27**, 2701–2704, doi:10.1029/1999GL011107.