The southeastern coast of Brazil is frequently affected by meteorological disturbances such as cold fronts, which are sometimes associated with intense extratropical cyclones. These disturbances cause oscillations on the sea surface, generating low-frequency motions. The relationship of these meteorologically driven low-frequency forces to storm-surge events is investigated in this work. A method that uses a neural network model (NNM) to predict coastal sea level variations related to meteorological events is presented here. Pressure and wind values from NCEP–NCAR reanalysis data and tide gauge time series from the Cananéia reference station in São Paulo State, Brazil, were used to analyze the relationship between these variables and as input to the model. Meteorological influences on the sea level fluctuations can be isolated by filtering out the astronomical tide frequencies and retaining frequencies lower than the tidal cycles (periods longer than 24 h). Thus, a low-pass filter was applied to the tide gauge and meteorological time series to identify more readily the interactions between the coastal sea level response and atmospheric driving forces. Statistical analyses in the time and frequency domains were used. The maximum correlations and coherence between the low-frequency sea level and the meteorological series were used to define the time lags of the NNM input variables. The model was tested for 6-, 12-, 18-, and 24-hourly forecasts, and the results were compared with filtered sea level values. The results show that this model is able to capture the effects of atmospheric and oceanic interactions. It can be considered an efficient model for predicting the nontidal residuals and can effectively complement the standard constant harmonic analysis model. A case study of a storm that impacted coastal areas of southeastern Brazil in March 1998 was analyzed and indicates that the neural network model can be effectively utilized in the Cananéia region.
Oscillations in sea level due to meteorological driving forces related to wind and pressure occur at different scales and frequencies in all coastal regions. Some countries are affected by flooding, with serious damage. Therefore, knowledge about the sea level height variations is very important not only for marine services but also for the protection of coastal residents, for monitoring the changes in marine ecosystems, and for designing and constructing onshore and offshore structures. Interactions between meteorological (atmospheric pressure, wind, sea surface temperature) and oceanic (salinity and deep sea) variables affect the regular tides and modify the sea level conditions in coastal regions, mainly in restricted waters such as bays.
Tropical cyclones and extratropical storms are the main cause of storm surges that can produce damage through high waves and sprawling water over large coastal areas in a single storm. The principal factors involved in the generation and modification of storm surge are the action of wind stresses on the surface water, reduction of atmospheric pressure (inverted barometer effect), waves and swells in the shallow water area, coastline configuration, and bathymetry (Pore 1964).
Frontal systems associated with extratropical cyclones introduce variations that are observed in the tide records and sometimes result in storm surges (Pugh 1987, 2005). These storms differ with respect to size and intensity, and the associated storm surges have characteristics consistent with these differences. The different characteristics of these types of storms in particular concern wind speeds and spatial scale of the storm (Storch and Worth 2008).
In the South Atlantic Ocean, along the Brazilian coastline, there are few tide gauge records with long series to analyze and predict surge events. Characteristics of the meteorological tide variations along the southeast coast of Brazil have been studied by Marone and Camargo (1994). Castro and Lee (1995) presented a study of sea level fluctuations due to wind-driven forces in the southeast continental shelf. Ribeiro (1997) investigated a surge caused by the passage of a cyclone along the Rio de Janeiro coastline that raised the sea level 0.60 m above the mean sea level datum, causing damage to coastal communities along the Guanabara Bay. Netto and Lana (1997) studied the superficial sediment characteristics of tidal flats in Paranaguá Bay. De Mesquita (2008a,b) verified a similar behavior of the mean sea level oscillations along this area of the Brazilian coastline. Mantovanelli et al. (2004) verified the tidal velocity and duration as a determinant of water transport and residual flow in Paranaguá Bay estuary. Dalazoana et al. (2005) studied the mean sea level variations using longer tide gauge temporal series from Cananéia and Fiscal Island (state of Rio de Janeiro) tide gauge and satellite altimetry to establish analysis methods applicable to Brazilian vertical datum regions.
The classical method of harmonic analysis is used to predict the astronomical tides. Tidal curves appear as periodic oscillations and can be described in terms of amplitude, period, or frequency. Harmonic analysis represents the tidal variations as the sum of N harmonic constituents of the tide (Doodson and Warburg 1944). Normally, 365 days of hourly data at a point are needed so that closely spaced constituents can be adequately separated when they are extracted by the least squares method. These constituents can then be used to provide reliable predictions of future tides at that point (Franco 1981).
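As an illustration of the least squares step described above, the following minimal sketch fits a mean level plus two constituents (M2 and S2) to an hourly series. The function name `fit_harmonics`, the choice of only two constituents, and the synthetic test series are assumptions made for illustration; an operational analysis would use many more constituents and nodal corrections.

```python
import numpy as np

# Angular speeds (degrees per hour) of two standard tidal constituents.
SPEEDS_DEG_PER_H = {"M2": 28.9841042, "S2": 30.0}

def fit_harmonics(t_hours, level, speeds=SPEEDS_DEG_PER_H):
    """Least squares fit of a mean level plus one cosine/sine pair per constituent.

    Returns the mean level, the amplitude H and the phase lag g (radians) of each
    constituent, using the form H * cos(omega * t - g).
    """
    t_hours = np.asarray(t_hours, dtype=float)
    level = np.asarray(level, dtype=float)
    omegas = np.deg2rad(np.array(list(speeds.values())))
    columns = [np.ones_like(t_hours)]
    for omega in omegas:
        columns += [np.cos(omega * t_hours), np.sin(omega * t_hours)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, level, rcond=None)
    a, b = coef[1::2], coef[2::2]          # cosine and sine coefficients
    amplitude = np.hypot(a, b)             # H = sqrt(a^2 + b^2)
    phase = np.arctan2(b, a)               # g such that a = H cos g, b = H sin g
    names = list(speeds)
    return coef[0], dict(zip(names, amplitude)), dict(zip(names, phase))

# Synthetic one-year hourly series (hypothetical amplitudes and phases).
t = np.arange(365 * 24, dtype=float)
synthetic = (1.5
             + 0.40 * np.cos(np.deg2rad(28.9841042) * t - 0.3)
             + 0.25 * np.cos(np.deg2rad(30.0) * t - 1.0))
mean_level, amp, phase = fit_harmonics(t, synthetic)
```

The fitted constituents can then be evaluated at future times to produce the astronomical prediction; the difference between an observed record and that prediction is the nontidal residual that the rest of the paper addresses.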
Predictions for reference stations are prepared from the astronomical arguments using local constituents determined by previous analysis and do not take into account meteorological influences. Thus, the observed and predicted values of the sea level are normally different. Numerical models developed to predict surges are still considered to be insufficient because of the complexity of the nonlinear processes involved. These models require a large amount of tidal and meteorological data, collecting many factors such as central pressure, speed of the cyclone, rainfall, and coastal topography (Lee 2006).
Nowadays, the neural network model (NNM) has been widely applied to modeling nonlinear dynamic systems using time series that translate the physical relations between the input variables (predictors) and the phenomenon that will be modeled (predictand). Elsner and Tsonis (1992) developed some methods for making short-term predictions of nonlinear time series data using a neural network model. These authors discuss the implications of these methods for the study of weather and climate.
The NNM has some important characteristics such as generalization, parallelism, nonlinearity, adaptability, and robustness (Haykin 2001). These models have been used in some fields of science and engineering. Sztobryn (2003) applied NNM in hydrological forecasting where the variation of water level is only wind generated. The results were successfully compared with observed sea level and other routine methods. Lee (2006) applied an NNM for forecasting storm surge in Taiwan related to the passage of three typhoons over the region. The results indicate that NNM is efficiently capable of learning and predicting from these events. Tseng et al. (2007) used a typhoon-surge forecasting model developed with a backpropagation neural network in coastal northeastern Taiwan. To determine the better forecasting model, four models were applied and tested under different compositions of the input variables. For coastal and harbor engineering applications, Chang and Lin (2006) simulated tides at multipoints considering tide-generating forces. The NNM proposed is applicable for multipoint tidal prediction in which the tidal type is similar to that of the original point.
The southeast coast of Brazil is frequently affected by cold fronts, with periods of 3–5 days. An important event that sometimes occurs because of the combination of tides and surges is a rise in sea level accompanied by waves that reach the coastline. Few NNM applications for predicting the variability of sea level along the Brazilian coastline focus on surge events. The relationships describing the response of the coastal sea level to the influence of cold fronts were analyzed using cross correlations and spectral density between the tide gauge series and meteorological variables. The maximum values and time lags from both analyses were used as inputs to the sea level forecast model. This paper presents a method to predict coastal sea level variations and surge using an NNM.
Although drastic storm surges typically do not occur along the coastal waters of Brazil, these events can cause some damage to coastal regions. A strong storm surge occurred along the southeast coast of Brazil in March 1998, causing severe flooding in these coastal areas and destroying some coastal structures. Figure 1 shows the water level curves over a 6-day period at the Cananéia tide gauge station in São Paulo, Brazil. The storm surge stands out when the curves are compared. In this case, the curves are water levels referenced to mean low water (MLW), that is, referenced to a fixed level or station datum (a tide gauge benchmark near the gauge, to which the gauge zero is referred) for the convenience of plotting with mostly positive numbers. In this paper, we demonstrate the application of the NNM to this case study.
Operational forecasting of high sea levels (storm surges) might be important in the southeast coast of Brazil, where there are registered sea level variations above the astronomical tide predictions that can consistently impact coastal zones in this area. The aim of this study is to develop an empirical prediction of storm surge by determining the relationship of the wind and pressure fields to storm surge. This proposed model can be used to complement the standard constant harmonic model to improve the prediction of sea level variations.
2. Study area
The study area lies within the Cananéia Estuary (24°50′–25°05′S and 47°45′–48°00′W), in the southeast coastal region of Brazil in São Paulo State (Fig. 2). This region is located on the continental shelf, which is wider than the shelf of the northern coast. The average width and declivity near Cananéia city in São Paulo State are about 218 km and 46 cm km−1, respectively (Filippo 2003). The isobaths are oriented from southwest to northeast, parallel to the coastline with 45° northern direction (Trucculo 1998). It has wide coastal plains, long beach barriers, and large estuaries (Angulo and Lessa 1997). The Cananéia Estuary is an important biological reserve and contains federal and state environmentally protected areas (SMA 1990, 1996). This estuarine system covers an area of 135 km² and is surrounded by a large mangrove area with high concentrations of nutrients (Besnard 1950; Shaeffer-Novelli et al. 1990).
The astronomical tide pattern at Cananéia station is semidiurnal, with the greatest amplitude H for the constituents M2 (principal lunar) and S2 (principal solar). The diurnal constituents O1 (principal lunar) and K1 (declinational lunisolar) are also present, as well as the shallow-water constituents M3 and M4 (third and quarter diurnal lunar, respectively), which indicate the influence of the propagation of the tide wave in the continental shelf (Fig. 3).
Surges that are verified in the tide gauge records normally are related to the same extreme events—the passage of cold fronts over this region (de Mesquita 2000a,b). Figure 4 shows the maximum recorded at the Cananéia gauge (3.13 m) during a storm on 25–28 March 1998.
a. Climatological description of the region
The South American continent is affected by both tropical and extratropical weather systems. The most severe weather systems in South America are cold fronts, intense extratropical cyclones near the east coast that cause intense winds, upper-level cyclonic vortices (ULCV; in some cases responsible for cyclogenesis and frontogenesis), the South Atlantic convergence zone, squall lines, mesoscale convective complexes, and the low-level jet. This region is influenced by persistent high pressure over the South Atlantic Ocean that enhances northeast flow across the area. This circulation is periodically disturbed by the passage of frontal systems caused by migrating anticyclones that move from the southwest toward the northeast along the southeast coast of Brazil. Strong cyclogenesis activity has been verified in this region (Gan and Rao 1991; Seluchi 1995); it is associated with ULCV that cross the South American west coast and cause instability in the east and northeast sector. Gan and Rao (1991) verified two regions of persistent cyclogenesis over South America: one over the San Matias Gulf in Argentina (42.5°S, 62.5°W) and another over Uruguay (31.5°S, 55°W). The climate is humid subtropical, and during the El Niño–Southern Oscillation (ENSO) phenomenon great climatic disturbances occur in this region, leading to abundant rain.
ENSO strongly influences rainfall patterns mainly in southern Brazil that tend to be above the median from November to February (Rao and Rada 1990). The El Niño event in 1997–98 was considered, by some measures, to be the strongest on record, causing major climatic impacts around the world (McPhaden 1999). During the 1997–98 El Niño event, the sea level in the Cananéia region was significantly affected by a storm surge, which is analyzed in this work.
3. Meteorological and sea level dataset
Hourly sea level records for the period 1997–98 were obtained from the tide gauge station installed at Cananéia Estuary at latitude 25°01′S and longitude 47°55′W. The equipment has been operated and maintained by the Instituto Oceanográfico/Universidade de São Paulo, and data were available (http://ilikai.soest.hawaii.edu/UHSLC/jasl.html) through the Joint Archive for Sea Level program of the University of Hawaii Sea Level Center. Atmospheric pressure and wind from the National Centers for Environmental Prediction–National Center for Atmospheric Research reanalysis data (Kistler et al. 2001) at 0000, 0600, 1200, and 1800 UTC between 25°00′ and 27°30′S and from the coastline up to 45°00′W at 2.5° × 2.5° and 1.905° × 1.875° grid points—latitude and longitude, respectively—were used for studying the local influences caused by the passage of cold fronts. Meteorological analysis and forecasts from the forecast daily bulletins transmitted by the Centro de Hidrografia da Marinha do Brasil (CHM) during this period were also used. Figure 5 shows the location of the Cananéia Estuary tide gauge and the reanalysis grid points.
4. Statistical analysis
The atmospheric pressure, wind, and tide gauge time series were analyzed statistically by estimating the center of the distribution (mean and median), variances, standard deviation, asymmetry, and kurtosis. From the percentiles analyses, a few outliers in the sea level record could be identified from box plots. They were replaced by the average values between the previous and following hourly data. Before fitting, both series were used for the period from January 1997 to December 1998 to study the coastal sea level response related to the meteorological conditions as well as the behavior of the coastal sea level in this Brazilian region.
a. Filtering data using a low-pass filter
This study focused on sea level oscillations at frequencies lower than those of the astronomically driven forces, which are related to the passage of frontal systems with periods of around 3–5 days. Tides and inertial motions usually cause high-frequency noise in sea level records used to analyze low-frequency motion in the ocean (Thompson 1983). To eliminate diurnal and shorter-period tidal oscillations from the input dataset, the Thompson low-pass filter, a symmetric digital filter, was used. This filter is defined by the following expressions:
$$Y_t = \sum_{k=-120}^{120} w_k\, x_{t+k} = w_0\, x_t + \sum_{k=1}^{120} w_k \left( x_{t+k} + x_{t-k} \right),$$
where $Y_t$ is the filtered time series, $w_{-k} = w_k$ are the symmetric weights (k = −120 to +120, for a total of 241 weights, including k = 0), and $x_{t+k}$ are the input data.
The symmetry is imposed to preserve phase information, and, to pass low frequencies correctly, the weights are constrained to sum to unity (Thompson 1983):
$$\sum_{k=-120}^{120} w_k = w_0 + 2 \sum_{k=1}^{120} w_k = 1.$$
The digital filter response is near unity at low frequencies but near zero at high frequencies, mainly at the inertial frequency ($\omega = \omega_0$) and the tidal frequencies ($\omega = \omega_t$).
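An example of a function that has a desirable shape for this filter is
$$L(\omega) = \begin{cases} 1, & 0 \le \omega \le \Omega_1, \\ \tfrac{1}{2}\left[ 1 + \cos\left( \pi\, \dfrac{\omega - \Omega_1}{\Omega_2 - \Omega_1} \right) \right], & \Omega_1 < \omega < \Omega_2, \\ 0, & \omega \ge \Omega_2, \end{cases}$$
where $\Omega_1$ and $\Omega_2$ are cutoff frequencies chosen within a definite range. The cutoff frequencies used in this study were $\Omega_1$ = 6.4° h−1 and $\Omega_2$ = 11.2° h−1, corresponding to periods of 56.25 and 32.14 h, respectively. The Thompson filter uses 15 harmonic components and the local inertial frequency (Coriolis parameter: 2Ω sinϕ, where Ω here denotes the angular velocity of Earth's rotation and ϕ is latitude) to calculate the weights that are used for filtering the hourly dataset by convolution.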
Hourly observed sea level records were then filtered to remove the oscillations or noise related to tidal frequencies. For the reanalysis dataset, the same filter was used, considering the 6-hourly intervals. After filtering, the hourly sea level series was subsampled at 6-hourly intervals to match the reanalysis data so that both datasets could be compared at the same time interval and frequencies.
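The filtering step can be sketched as a convolution with 241 symmetric weights that sum to one. The sketch below is not the Thompson (1983) design, which also places explicit zeros at the tidal and inertial frequencies; as a simpler stand-in, it assumes a half-cosine taper between the two cutoff frequencies and obtains the weights from a numerical cosine transform of that response. The function names and the synthetic test series are hypothetical.

```python
import numpy as np

def taper_response(omega, om1, om2):
    """Assumed target response: 1 below om1, 0 above om2, half-cosine taper between."""
    resp = np.zeros_like(omega)
    resp[omega <= om1] = 1.0
    mid = (omega > om1) & (omega < om2)
    resp[mid] = 0.5 * (1.0 + np.cos(np.pi * (omega[mid] - om1) / (om2 - om1)))
    return resp

def lowpass_weights(half_width=120, om1_deg=6.4, om2_deg=11.2, n_grid=4096):
    """Symmetric weights w_k, k = -half_width..half_width, for hourly data.

    Cutoffs are given in degrees per hour and converted to radians per sample.
    """
    om1, om2 = np.deg2rad(om1_deg), np.deg2rad(om2_deg)
    omega = (np.arange(n_grid) + 0.5) * np.pi / n_grid   # midpoint grid on (0, pi)
    resp = taper_response(omega, om1, om2)
    k = np.arange(-half_width, half_width + 1)
    # w_k ~ (1/pi) * integral of resp(omega) * cos(k * omega) over (0, pi)
    w = (resp[None, :] * np.cos(np.outer(k, omega))).mean(axis=1)
    return w / w.sum()                                   # enforce sum(w_k) = 1

def apply_filter(x, w):
    """Filter by convolution; drops len(w) // 2 samples at each end."""
    return np.convolve(np.asarray(x, dtype=float), w, mode="valid")

# Synthetic hourly record: mean level + 4-day oscillation + M2-like tide.
hours = np.arange(2000.0)
slow = 0.3 * np.sin(2 * np.pi * hours / 96.0)
tide = 0.5 * np.cos(2 * np.pi * hours / 12.42)
weights = lowpass_weights()
filtered = apply_filter(2.0 + slow + tide, weights)
six_hourly = filtered[::6]   # subsample to match 6-hourly reanalysis times
```

Because the weights are symmetric, the filtered series has no phase shift relative to the input, at the cost of losing 120 h of data at each end of the record.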
b. Series analysis in the time and frequency domain
Storm surge is usually considered to be driven by two processes: extreme wind stress and atmospheric pressure. Therefore, cross correlations among the filtered sea level, atmospheric pressure, and zonal and meridional wind stresses were calculated. The zonal wind stress (zws) and meridional wind stress (mws) were calculated using the following equations:
$$\mathrm{zws} = \rho\, C_d\, W\, U, \qquad \mathrm{mws} = \rho\, C_d\, W\, V,$$
where ρ = 1.22 kg m−3 (air density), $W = (U^2 + V^2)^{1/2}$ is the intensity of the wind (m s−1) calculated from the zonal (U) and meridional (V) wind components of the reanalysis dataset, and $C_d = (1.1 + 0.053W) \times 10^{-3}$ [drag coefficient for the southeast Brazil coast (Stech and Lorenzzetti 1992)]. The units used for wind stress are newtons per meter squared, where 1 hPa is equal to $10^{2}$ N m−2.
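A short sketch of the bulk wind stress calculation follows. It assumes that the drag coefficient is expressed in units of 10⁻³, as is conventional for bulk formulas (the `cd_scale` argument makes this assumption explicit), and the function name `wind_stress` is hypothetical.

```python
import numpy as np

RHO_AIR = 1.22  # air density (kg m^-3)

def wind_stress(u, v, cd_scale=1e-3):
    """Zonal and meridional wind stress (N m^-2) from wind components (m s^-1).

    cd_scale = 1e-3 is an assumption: Cd = (1.1 + 0.053 W) * 1e-3.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)                      # W = sqrt(U^2 + V^2)
    cd = (1.1 + 0.053 * speed) * cd_scale       # wind-dependent drag coefficient
    return RHO_AIR * cd * speed * u, RHO_AIR * cd * speed * v

# A 10 m/s westerly wind gives a purely zonal stress of about 0.2 N m^-2.
zws, mws = wind_stress(10.0, 0.0)
```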
Spectral analyses of the filtered meteorological data and filtered sea level records were carried out using the fast Fourier transform. Cross-spectral analyses were performed to identify the frequency characteristics of the local and remote meteorological events that influence the variation of the sea level at the Cananéia tide gauge station. The coherence between the peaks of the meteorological and sea level time series was analyzed to verify the linear correlation between the components of the bivariate process.
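The time-domain part of this analysis can be illustrated with a lagged cross correlation, which finds the delay at which a forcing series is most strongly correlated with the sea level response. This is a minimal sketch with a hypothetical function name and synthetic data; it considers only non-negative lags (forcing leading the response) and does not cover the cross-spectral and coherence estimates.

```python
import numpy as np

def lagged_correlation(forcing, response, max_lag):
    """Pearson correlation between forcing(t) and response(t + lag), lag = 0..max_lag."""
    forcing = np.asarray(forcing, dtype=float)
    response = np.asarray(response, dtype=float)
    lags = np.arange(max_lag + 1)
    r = np.empty(lags.size)
    for i, lag in enumerate(lags):
        a = forcing[: forcing.size - lag]    # forcing at time t
        b = response[lag:]                   # response at time t + lag
        r[i] = np.corrcoef(a, b)[0, 1]
    return lags, r

# Synthetic 6-hourly series in which the response trails the forcing by 5 samples.
rng = np.random.default_rng(0)
forcing = rng.normal(size=2000)
response = np.concatenate([np.zeros(5), forcing[:-5]]) + 0.1 * rng.normal(size=2000)
lags, r = lagged_correlation(forcing, response, max_lag=12)
best_lag_hours = int(lags[np.argmax(r)]) * 6   # 6-hourly sampling
```

For variables that are negatively related to sea level, such as atmospheric pressure under the inverted barometer effect, the lag of interest would be found with `np.argmax(np.abs(r))` instead.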
5. Neural network modeling
The progress of neurobiology has allowed several researchers to develop this model to emulate the brain's capacity for learning in an attempt to solve problems of a complex nature. It is capable of abstraction, representing the characteristics of a phenomenon from the information in a large database. To determine the best fit to a dataset, Rumelhart et al. (1986) developed the backpropagation learning algorithm, which is widely used in multilayer models.
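An NNM receives a set of inputs Xi, each of which is multiplied by a weight Wi, and sums the products to form a linear combination. It is expressed as
$$u_k = \sum_{i=1}^{m} W_i X_i,$$
where $u_k$ is the net input of neuron k and m is the number of inputs.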
Backpropagation is used for supervised learning with multilayer feed-forward networks. This algorithm repeatedly applies the chain rule to compute the derivative of an error function with respect to each weight in the model. The topology of a multilayer perceptron (MLP) is specified by the number of layers and the number of nodes per layer. The layers are denoted the input, hidden, and output layers. A basic element of this model is the activation function (linear, logic, and sigmoid) that computes the activation level across the NNM. The output signal is given by
$$y_k = \phi(u_k + b_k),$$
where $u_k$ is the linear combination of the weighted inputs, $\phi(\cdot)$ is the activation function, and $b_k$ is the bias (which shifts the input of the activation function).
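The computation performed by a single neuron, a weighted sum of the inputs followed by an activation function applied to the sum plus a bias, can be written in a few lines. The function name and the numerical values are hypothetical; the hyperbolic tangent is used as the default activation.

```python
import numpy as np

def neuron_output(x, w, b, activation=np.tanh):
    """Output y = phi(u + b) of one neuron, with u = sum_i w_i * x_i."""
    u = np.dot(w, x)            # linear combination of the weighted inputs
    return activation(u + b)    # bias shifts the input of the activation

# Two inputs whose weighted sum cancels, so the output is tanh(bias).
y_example = neuron_output(x=np.array([1.0, 2.0]), w=np.array([0.5, -0.25]), b=0.1)
```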
The first step of the NNM approach is to design a specific network architecture, which comprises a number of layers, each consisting of a specific number of neurons. The size and structure of the model need to match the nature of the problem. The appropriate architecture is usually not known in advance, and therefore this is not an easy task, often involving a trial-and-error approach according to the characteristics of the application domain. The network is then subjected to the training stage. In that phase, an iterative process is applied to the input variables to adjust the weights of the NNM so that it optimally predicts the data on which the training is performed. The error for each output unit is calculated and is used to update the weights. The model is considered to have attained the optimum architecture, or found a fit, when the error between the network output and the target is minimized. The NNM needs three input datasets: training, testing, and validation series. Training and test series are used for calibrating the model. After learning, the validation series, a new dataset, is used to verify the generalization of the trained network by comparing the output data (predictions) with the actual ones (Fu 1994).
In this paper, different network types and training methods were applied to find the best performance: radial basis functions (RBF), a network particularly suited to function approximation in which the hidden layer is defined by radial basis functions and the learning fits a nonlinear surface according to some stochastic criteria (Wasserman 1993); the generalized regression neural network (GRNN), a method for estimating the joint probability density function of x and y as in the standard regression technique, given only a training set (Cigizoglu and Alp 2006); and the feed-forward MLP, which uses supervised learning and the backpropagation algorithm.
All samples were used with intervals of 6 h (LT) between the observations, and this dataset was selected with 50% for training, 25% for testing, and 25% for validation. The input variables for the NNM training were atmospheric variables, filtered sea level series of previous hours, and observed wind for the actual time.
6. Results of the statistical analysis
The filtered records provided the time series of the low-frequency sea level response to the meteorological systems that were used in the NNM (Fig. 6). Figures 7a–c show the peaks of the cross correlation between the low-frequency sea level at the Cananéia gauge station and pressure, zws, and mws from one and two reanalysis grid points. The greatest cross correlations between the sea level response and the pressure, zws, and mws components occurred at time lags of 48, 30, and 0 h, respectively, with values of around 32%, 35%, and 47%.
To verify the intercorrelations of the sea level response and meteorological variables at the same frequency, values of cross-spectral densities and coherence between the series were analyzed. It was found that peaks of energy and high coherence for periods from 3 to 5 days were related to passages of cold fronts over the region.
Figures 8a and 8b show the maximum peaks at around 2.9 days for pressure and 1.7 days for zws. Another peak is also found at around 4.6 days for pressure and 2.8 days for zws. In Fig. 8c, peaks can be observed for periods of around 2.8–6.2 days for mws. These values indicate a correlation between sea level variation and meteorological events such as cold fronts that can be identified in low-frequency motions. Similar results were found by Paiva (1993) and Castro and Lee (1995) with respect to the effect of waves on the continental shelf of the Brazilian southeast coastline. High coherence values of around 75% were also found between the meteorological variables and the low-frequency sea level variation for periods from 3 to 5 days (Fig. 9).
7. NNM performances
The time lags at which the cross correlations described previously reach their maxima were used to define the model inputs; that is, each meteorological variable was lagged with respect to the sea level response. Autocorrelations of the low-frequency sea level and the wind speed for the current time were also used. Therefore, the input vector comprised pressure, zws, mws, the filtered sea level 18, 12, and 6 h earlier, and wind speed. The filtered sea level 6 h later was used as the output variable (predictand).
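One way to assemble these seven predictors and the predictand from 6-hourly series is sketched below. The exact alignment is an interpretation, not a detail given in the text: it assumes lags of 48, 30, and 0 h (8, 5, and 0 samples) for pressure, zws, and mws, taken from the cross-correlation results, and sea level values 18, 12, and 6 h before the target time. The function names and the chronological 50%/25%/25% split are likewise assumptions.

```python
import numpy as np

def build_samples(sl, p, zws, mws, wspd, horizon=1, lag_p=8, lag_zws=5, lag_mws=0):
    """Build the 7-column input matrix and the target from 6-hourly series.

    At each time index t the predictors are p(t - lag_p), zws(t - lag_zws),
    mws(t - lag_mws), sl(t - 2), sl(t - 1), sl(t) and wspd(t); the target is
    sl(t + horizon). All lags and the horizon are in 6-hourly samples.
    """
    sl, p, zws, mws, wspd = (np.asarray(a, dtype=float) for a in (sl, p, zws, mws, wspd))
    start = max(lag_p, lag_zws, lag_mws, 2)
    t = np.arange(start, sl.size - horizon)
    X = np.column_stack([p[t - lag_p], zws[t - lag_zws], mws[t - lag_mws],
                         sl[t - 2], sl[t - 1], sl[t], wspd[t]])
    y = sl[t + horizon]
    return X, y

def split_50_25_25(X, y):
    """Chronological split into training (50%), testing (25%), validation (25%)."""
    n = len(y)
    i1, i2 = n // 2, (3 * n) // 4
    return (X[:i1], y[:i1]), (X[i1:i2], y[i1:i2]), (X[i2:], y[i2:])

# Hypothetical series of 100 six-hourly values, used only to show the alignment.
idx = np.arange(100.0)
X_all, y_all = build_samples(sl=idx, p=idx + 1000.0, zws=idx, mws=idx, wspd=idx)
train, test, valid = split_50_25_25(X_all, y_all)
```

Setting `horizon` to 2, 3, or 4 gives the targets for 12-, 18-, and 24-hourly forecasts under the same assumptions.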
The filtered dataset from January 1997 to December 1998 was used for predicting sea level variations, and the predicted values were compared with the actual ones. Table 1 shows the best performances of the NNM with the correlation coefficients r. The MLP with 7–14–1 layers produced the best results.
The backpropagation algorithm was used for the NNM training. The activation function used in the hidden and output units was the hyperbolic tangent function. The software Statistica Neural Networks for Windows was employed in this work.
Table 2 shows the correlation coefficients for the selected pairs in training, testing, and validation for 6-, 12-, 18-, and 24-hourly simulations. A high correlation was observed in all stages.
The MLP (7–14–1) produced accurate forecasts of the sea level variations for 6-hourly time lags (r = 0.999). Figure 10 compares the NNM generalization (validation) for the variations of the low-frequency sea level with the target (filtered dataset). The two curves are very similar, in accordance with the statistical results shown in Table 2.
Figure 11 shows the evolution for training and validation of the MLP model to reach the error convergence. Learning rate and momentum parameters affect the speed of the convergence of the backpropagation algorithm. The stopping criterion was based on the error to be minimized to improve the performance of the network. The model attained the best performance for 700 epochs in which the training error is 0.008 276 and validation error is 0.008 531 with learning rate of 0.01 and momentum of 0.9. After 700 epochs, the process was stabilized as shown in Fig. 11.
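The training configuration described above (a 7–14–1 MLP with hyperbolic tangent units, backpropagation with a learning rate of 0.01 and momentum of 0.9, and 700 epochs) can be sketched with plain NumPy. This is not the Statistica implementation used in the study: the full-batch updates, the weight initialization, the mean-squared-error loss, and the synthetic data are all assumptions, and the targets are assumed to be scaled into the interval (−1, 1) so that a tanh output unit can represent them.

```python
import numpy as np

def train_mlp(X, y, n_hidden=14, lr=0.01, momentum=0.9, epochs=700, seed=0):
    """Full-batch backpropagation with momentum for an n_in-n_hidden-1 tanh MLP."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    history = []
    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(X @ W1 + b1)
        out = np.tanh(h @ W2 + b2)
        err = out - y
        history.append(float(np.mean(err ** 2)))
        # Backward pass: chain rule through the tanh output and hidden layers.
        d_out = 2.0 * err / len(y) * (1.0 - out ** 2)
        d_hid = (d_out @ W2.T) * (1.0 - h ** 2)
        grads = [X.T @ d_hid, d_hid.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        # Momentum update, applied in place so W1, b1, W2, b2 stay current.
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v
    return params, history

def predict(params, X):
    """Network output for inputs X, as a 1-D array."""
    W1, b1, W2, b2 = params
    return np.tanh(np.tanh(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2).ravel()

# Synthetic standardized inputs and a smooth target in (-1, 1).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 7))
y_demo = 0.8 * np.tanh(X_demo @ np.linspace(-0.5, 0.5, 7))
params, history = train_mlp(X_demo, y_demo)
```

Plotting `history` against the epoch number gives an error-convergence curve of the kind shown in Fig. 11; in practice the same curve would also be computed on the validation series to choose the stopping point.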
The scatterplots in Fig. 12 show a small disparity, illustrating that the NNM has a smaller error in the learning stage than in the validation stage, in accordance with the correlation coefficients in Table 2. This is a common result in establishing an NNM. The simulation performance of predictors is normally evaluated by the root-mean-square (RMS) error or the square of the correlation coefficient (R²), which is called the coefficient of determination. Small RMS and large R² values indicate that the simulation performance is good (Chang and Lin 2006).
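Both skill measures are straightforward to compute from paired target and simulated values. The sketch below uses hypothetical function names and made-up numbers, and defines R² as the squared Pearson correlation coefficient, following the definition given above.

```python
import numpy as np

def rms_error(target, simulated):
    """Root-mean-square difference between simulated and target values."""
    target = np.asarray(target, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((simulated - target) ** 2)))

def r_squared(target, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    r = np.corrcoef(target, simulated)[0, 1]
    return float(r ** 2)

# Made-up values, only to exercise the two functions.
target_demo = [1.0, 2.0, 3.0, 4.0]
simulated_demo = [1.1, 1.9, 3.2, 3.8]
rms_demo = rms_error(target_demo, simulated_demo)
r2_demo = r_squared(target_demo, simulated_demo)
```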
The left column of Fig. 12 shows the target and the output simulated by the NNM in the training stage for 1997. This column indicates that there is little disparity between filtered and simulated values for 6- and 12-hourly training, for which R² is 0.9981 and 0.9803, respectively. The R² values for 18- and 24-hourly training are around 0.912 and 0.7767, respectively, showing that the NNM preserves the influences of the physical processes, such as pressure and wind, on the sea level variations. The right column shows the scatterplots for 1998 in the validation stage. Only small differences between the two stages are found. The R² values for 6-, 12-, and 18-hourly forecasts are similar to those of the learning stage. For 24-hourly forecasting, the R² values were lower than for the testing stage. This may be related to the correlation between the predictors and the predictand. Strong (∼15 m s−1) southwesterly (190°–260°) winds blowing for 3–5 days over the ocean, parallel to the coastline, are the wind vectors most conducive to producing storm surge along the southeast Brazilian coast.
Figure 13 shows the curves of the sea level variation related to the storm surge that occurred on 26–28 March 1998 in southeast coastal Brazil. The value of the peak of the high water level on 26 March was 3.13 m, and the predicted tide with the harmonic model (HM) was 2.53 m. The difference between the maximum peaks was around 0.60 m, characterizing the occurrence of a surge in this region. The value predicted by NNM was around 0.63 m. Therefore, the value obtained with both models (HM + NNM) was around 3.16 m. It can also be verified in the figure that some peaks of the high water predicted with both models are above the observed sea level. The values of the low water level are very similar.
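The arithmetic behind this case study can be laid out explicitly, using only the values quoted above; the variable names are hypothetical.

```python
# Values quoted in the text for the peak of 26 March 1998 (meters).
observed_peak = 3.13      # observed high water
harmonic_tide = 2.53      # astronomical tide from the harmonic model (HM)
nnm_residual = 0.63       # low-frequency residual predicted by the NNM

observed_surge = observed_peak - harmonic_tide       # about 0.60 m
combined_forecast = harmonic_tide + nnm_residual     # about 3.16 m (HM + NNM)
overestimate = combined_forecast - observed_peak     # about 0.03 m above observed
```

The combined forecast therefore exceeds the observed peak by only about 0.03 m, whereas the harmonic prediction alone falls about 0.60 m short of it.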
Conventional numerical models developed to predict surges are still considered insufficient because of the complexity of the nonlinear processes involved. In this paper, an alternative method based on the structure of the neural network model to predict coastal sea level variations related to meteorological events was proposed.
Preprocessing of the data series in the time and frequency domain allowed definition of the input of the neural network model. Maximum correlations in the physical process could determine the time lag between the meteorological variables and the sea level response.
The results indicate that the MLP architecture of the network developed in this work could generalize satisfactorily the nonlinear behavior of the sea level fluctuations due to the interactions of ocean and atmosphere at the Cananéia tide gauge station. This model presented the best performance, with a correlation coefficient of around 0.99 for the 6-hourly time-lag simulation, and it can be efficient in forecasting storm surge, as shown in Fig. 13. The correlation coefficient of approximately 0.83 obtained for the 24-hourly time-lag simulations suggests that this model could still be used for forecasting the low-frequency sea level at this time lag with good performance. Forecasting for periods longer than 24 h could be improved by considering hydrodynamic variables such as river discharges.
The results indicate that the NNM can also be useful as a complement to the standard harmonic model and thus to improve the sea level forecast. Also, the proposed NNM for predicting the surge level can be further applied to other locations along the Brazilian coast or in other sites in the world. In addition, this NNM could be developed in conjunction with a numerical ocean model (e.g., Princeton Ocean Model) to improve forecasting water levels at the key locations.
We are thankful to CHM for supplying the tide gauge records and the meteorological information. Appreciation and thanks are also given to three anonymous reviewers for their constructive comments and suggestions to improve the manuscript.
Corresponding author address: Marilia M. F. de Oliveira, Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia—COPPE, UFRJ, Caixa Postal 68501, CEP 21945-970, Rio de Janeiro, RJ, Brazil. Email: email@example.com