1. Introduction
A mature tropical cyclone (TC), also known as a typhoon or hurricane, is an intense mesoscale vortex characterized by a calm eye surrounded by a ring of violent eyewall winds (Frank 1977). TCs are among the most severe weather systems on Earth: they bring destructive winds and extreme rainfall, cause strong surges and floods, and often result in loss of life and damage to property. According to statistics, more than 90% of marine catastrophes are caused by such meteorological disasters. Typhoon detection, monitoring, and forecasting nevertheless remain challenging tasks. Over the past few decades, meteorological observations of TCs have been collected extensively with instruments such as aircraft, dropsondes, and Doppler radar wind profilers (Powell 1980; Weatherford and Gray 1988; Aberson et al. 2006; Giammanco et al. 2012; Xie et al. 2013). TC forecasts are typically produced with process-based numerical prediction models driven by typhoon-characteristic information such as typhoon locations, central atmospheric pressures, and maximum wind speeds near typhoon centers. Even with modern high-performance computing (HPC) facilities, running these models remains computationally expensive, and the accuracy of the prediction depends heavily on the initial and boundary conditions (Bender et al. 2007; Jiménez et al. 2007; Hsiao et al. 2009).
Remote sensing technologies have great potential for providing large-scale spatial and temporal observations of environmental variables (Tralli et al. 2005; Handcock et al. 2006). Optical remote sensing data, in particular, provide better repeated spatial coverage than in situ surface meteorological station data, enabling efficient monitoring of TCs (Hasler et al. 1983; Wu et al. 2003; Bell and Montgomery 2008; Liu et al. 2010; Zhang et al. 2014). In the intertropical convergence zone (ITCZ), typhoon intensity is strongly correlated with the degree of cloud turbulence, which manifests as spiral cloud bands (SCBs) in satellite remote sensing images (Montgomery and Kallenbach 2010). Extracting intensity-associated features of SCBs therefore provides an alternative approach to classifying and predicting TC intensity. The most mature technique for assessing TC intensity, the Dvorak analysis, applies empirical rules and constraints to features of the cloud-system structure, using parameters derived from satellite cloud images such as the data T-number (DT), model expected T-number (MET), and pattern T-number (PT) (Dvorak 1975; Velden et al. 2006). However, this method demands abundant domain knowledge for feature extraction and hardly captures structural information about typhoon inner cores. At present, the implicit intensity-associated features of SCBs are extracted manually; there is therefore an urgent need for methods that classify typhoons by intensity automatically.
Researchers have already applied machine learning (ML) algorithms, such as the support vector machine (SVM), extreme learning machine (ELM), and back propagation (BP) neural network, to analyze meteorological cloud images (L. Li et al. 2015; P. Li et al. 2015; Xia et al. 2015). Traditional ML algorithms, however, usually achieve unsatisfactory classification results on the sophisticated intensity-associated information contained in satellite cloud images. Deep learning (DL), a highly generalizable subfield of machine learning, has been applied to many domains, such as pattern recognition, computer vision, and artificial intelligence (Xiao et al. 2010; Hinton et al. 2012; LeCun et al. 2015). As one of the most popular deep learning models, the convolutional neural network (CNN) has proven a promising tool for image classification because of its locally connected layers and its ability to integrate local features; it outperforms the traditional techniques mentioned above by a wide margin in image classification and scene recognition (Taigman et al. 2014; Ren et al. 2015). In the hierarchical structure of CNNs, lower layers composed of series of multiscale convolutional filters automatically learn low-level features such as edges, textures, and colors from the raw data; more abstract concepts are formed hierarchically along the feed-forward layers; and a semantic representation of the input is synthesized at the highest layer (Zeiler and Fergus 2013). Several studies have employed conventional or multidimensional (2D/3D) CNN models to estimate tropical cyclone intensity with small root-mean-square errors (RMSE; Rodés-Guirao 2019; Chen et al. 2019; Wimmers et al. 2019; Lee et al. 2020). These CNN-based approaches address intensity-based typhoon imagery classification and focus on the application and improvement of the network structure. However, in-depth exploration of the internal mathematical mechanisms of CNNs, which could further improve model accuracy, is still lacking.
In this paper, to explore the inner mechanisms of CNNs for typhoon intensity prediction, we build a Typhoon-CNNs architecture as a discriminating and automatic feature descriptor that classifies and predicts typhoon intensity from satellite cloud images. Building on the traditional CNN structure, the framework adopts a cyclic convolution strategy with multiscale filters [section 2b(1)] to extract representative hidden features of SCBs. We also focus on the activation function of the network neurons and on a cross-entropy loss function based on the minimum classification error (MCE). To further enhance the performance of Typhoon-CNNs, we optimize the selection of network hyperparameters (such as the filter scale and the dropout zero-set rate). The improved Typhoon-CNNs is trained and validated on more than 10 000 multisensor satellite cloud images from the National Institute of Informatics (NII). Finally, the image-sensitive areas that affect typhoon intensity are visualized and the experimental results are analyzed.
2. Materials and methods
a. Data acquisition and augmentation
Satellite remote sensing images for the experiments are provided by NII, which offers real-time meteorological satellite data on typhoons in the northwest Pacific from 1995 to 2018 with intensity labels, captured by the GMS-5, GOES-9, MTSAT-1R, MTSAT-2, and Himawari-8 satellites. Experimental samples are taken from the infrared band (IR3) images at multiple resolutions; details of the IR3 band for these five satellites are given in Table 1. NII establishes a benchmark that ranks typhoon intensity into the five grades outlined in Table 2, namely tropical depression (TD), tropical storm (TS), severe tropical storm (STS), typhoon (TY), and severe typhoon (STY), on the basis of wind speeds near the typhoon center. We label these grades from category 1 to 5, respectively, to perform supervised learning on the typhoon datasets.
Table 1. Details of the IR3 band for the five satellites.

Table 2. The benchmark for typhoon intensity established by NII.
The typhoon datasets contain 12 500 satellite cloud images; the 2500 images in each category are divided into 2000 training samples and 500 testing samples. For comparison with other models such as VGG16 and ResNet50 (Chatfield et al. 2014; He et al. 2015), the images are rescaled from 512 × 512 × 1 to 224 × 224 × 1 by nearest-neighbor interpolation, and normalization is performed to reduce information redundancy. Training a CNN architecture with numerous free parameters from scratch requires a large number of labeled images, so data augmentation is adopted to generate additional images from each training sample and increase the diversity of the data (Chatfield et al. 2014). Data augmentation also adds appropriate noise to the input, making the model more robust and adaptive. The augmentation used here, random rotation, does not alter the label of the image; the amplitude of the random rotation is bounded by a maximum angle between 0° and 180°. Because strong correlations between adjacent pixels usually introduce redundancy among training images, ZCA whitening is performed as a preprocessing step to reduce redundant information in the input images and the correlation between features while keeping the whitened image close to the original input (Krizhevsky et al. 2012). Figure 1 illustrates a portion of the training images for each category: the first column shows the raw data, the second the pixelated images, and the third the ZCA-whitened images derived from the raw data.
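To make these preprocessing steps concrete, the following is a minimal sketch, assuming NumPy and the Keras ImageDataGenerator, of how random rotation and ZCA whitening could be wired together; the placeholder arrays, the whitening epsilon, and the economy-SVD formulation are illustrative assumptions rather than the exact pipeline used in this study.

```python
import numpy as np
import tensorflow as tf

def zca_whiten(images, eps=1e-2):
    """Decorrelate adjacent pixels in a batch of images via ZCA whitening."""
    n, h, w, c = images.shape
    x = images.reshape(n, -1).astype("float64")
    x -= x.mean(axis=0)                          # zero-center each pixel
    # Economy-size SVD of the data matrix avoids forming the full
    # (h*w) x (h*w) pixel covariance explicitly.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    x_white = u @ np.diag(s / np.sqrt(s**2 / (n - 1) + eps)) @ vt
    return x_white.reshape(n, h, w, c).astype("float32")

# Placeholder data: grayscale cloud images resized (nearest neighbor) to
# 224 x 224 x 1 and normalized to [0, 1], with intensity labels 1-5.
x_train = np.random.rand(64, 224, 224, 1).astype("float32")
y_train = np.random.randint(1, 6, size=64)

x_train = zca_whiten(x_train)

# Random rotation, bounded here by a maximum angle of 180 deg, does not alter
# the intensity label of a cloud image.
augmenter = tf.keras.preprocessing.image.ImageDataGenerator(rotation_range=180)
x_batch, y_batch = next(augmenter.flow(x_train, y_train, batch_size=32))
```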

Fig. 1. Training images of clouds in the typhoon dataset. The number in front of “-” represents the typhoon intensity, and the letters after “-” represent different images: (a) raw images, (b) pixelated images, and (c) ZCA-whitened images.
b. Methods
1) Architecture of Typhoon-CNNs
CNNs, a milestone in deep learning, have proven highly efficient at tackling large-scale computer vision problems (LeCun et al. 2015). A typical CNN is a topological structure with a sequence of feed-forward layers, such as convolutional and pooling layers arranged alternately. A convolutional layer performs a generalized linear operation between the input images and a filter bank, with each filter corresponding to a small receptive field in the input image. A pooling layer, normally placed behind each convolutional layer, performs a subsampling step that retains the maximum or average pixel within each window of the feature maps, sampled according to the stride. Finally, a fully connected layer with a softmax nonlinearity outputs a probability for each category. Such a structure outperforms a standard (fully connected) artificial neural network by exploiting sparse neural connectivity (limited by small receptive fields) and the shared weights of the convolutional filters.
Unlike ImageNet samples derived from natural scenes and Modified National Institute of Standards and Technology (MNIST) handwriting samples with easily recognizable features, the features along the spiral radius of cloud images are indistinguishable because of the intricate atmospheric factors involved in typhoon formation. When coping with such inconspicuous features, the traditional topology of alternating convolutional and pooling layers loses some of the crucial features and weakens model performance, because the pooling operation lowers the resolution of the feature maps in exchange for faster convergence and a lighter computational burden (Sabour et al. 2017). Hence, traditional CNN architectures may be insensitive to the implicit features in typhoon cloud images.
Our study therefore adopts a cyclic convolution strategy built on the typical CNN architecture; the resulting framework is called Typhoon-CNNs. To implement the cyclic convolution process, we arrange two convolutional layers followed by one pooling layer in each group, which we refer to as the cyclic convolution strategy in this paper. The framework uses max pooling, dividing the input into rectangular pooling regions and taking the maximum of each region. The architecture of Typhoon-CNNs is depicted in Fig. 2, and the detailed configuration of each layer is outlined in Table 3. The framework contains 10 feed-forward layers: five convolutional layers, two max pooling layers, one fully connected layer, and an output layer. The input grayscale images are 224 × 224 matrices, and the first convolutional layer convolves the input with learned 11 × 11 filters. To preserve crucial image features, the convolutional layers con2 and con4 adopt a stride of 1, whereas con3 and con5 use a stride of 2, mainly to reduce the dimensionality and the amount of computation. Max pooling layers are used in the network to account for the nonlinearity of the remote sensing images and to suppress noise.
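As a rough illustration of this layout, the sketch below assembles a comparable network in Keras. Only the details stated above (64 filters of 11 × 11 in con1, strides of 1 for con2/con4 and 2 for con3/con5, 256 filters of 2 × 2 in con5, two max pooling layers, one fully connected layer, and a 5-way softmax output) follow the paper; the filter counts of con2 to con4, the con1 stride, the fully connected width, and the standard ReLU used here (in place of the T-ReLU introduced later) are placeholder assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_typhoon_cnn(num_classes=5, dropout_rate=0.6):
    """A sketch of a Typhoon-CNNs-like classifier for 224 x 224 x 1 inputs."""
    return models.Sequential([
        layers.Input(shape=(224, 224, 1)),
        # Group 1: two convolutions followed by one max pooling layer.
        layers.Conv2D(64, 11, strides=2, activation="relu", name="conv1"),
        layers.Conv2D(128, 3, strides=1, activation="relu", name="conv2"),
        layers.MaxPooling2D(pool_size=2, name="pool1"),
        # Group 2: strided convolutions reduce dimensionality and computation.
        layers.Conv2D(128, 3, strides=2, activation="relu", name="conv3"),
        layers.Conv2D(256, 3, strides=1, activation="relu", name="conv4"),
        layers.MaxPooling2D(pool_size=2, name="pool2"),
        layers.Conv2D(256, 2, strides=2, activation="relu", name="conv5"),
        # Classifier head: one fully connected layer with dropout, 5-way softmax.
        layers.Flatten(),
        layers.Dense(1024, activation="relu", name="fc6"),
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax", name="output"),
    ])

model = build_typhoon_cnn()
model.summary()
```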

Fig. 2. The overall architecture of Typhoon-CNNs.
Table 3. The Typhoon-CNNs classifier configurations.
2) Improved activation function
After the convolution operation, each filter produces a feature map through the activation function of the neurons, realizing automatic extraction of implicit features. A nonlinear activation function should satisfy the requirements of nonlinearity, saturability, continuity, and monotonicity so as to strengthen the back-propagated signal. Traditional activation functions are generally chosen from the logistic sigmoid, Tanh, ReLU (rectified linear unit), and ELU (exponential linear unit) functions; these functions induce sparsity in the hidden units, giving the model sparse feature representations of the input images (Rehn and Sommer 2007; Nair and Hinton 2010). Selecting an appropriate activation function for the typhoon dataset therefore improves the generalized classification performance of Typhoon-CNNs.
Currently, ReLU is the most mature activation function: it mitigates gradient vanishing through unilateral inhibition and sparse activation, consistent with the activity of biological neurons (Glorot et al. 2012). When a neuron receives a negative signal, however, it enters an inhibitory state and becomes a "dead neuron." The weights of such inactive neurons are never updated when gradient descent is applied during back-propagation in CNNs.

Fig. 3. The original and derivative of the T-ReLU activation function.
In Eq. (2), when z_n > 0, T-ReLU′(z_n) = 1, so the gradient is unchanged and the neuron remains active; the residual δ_n is then determined only by the error back-propagated from the following layer.
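The snippet below is only a hedged sketch of a T-ReLU-style activation consistent with the behavior described here and in section 3c: the identity for positive inputs (derivative of 1), and negative inputs compressed into (−0.2, 0) through a Tanh branch whose derivative stays well below 1. The exact form in Eqs. (1) and (2) may differ, and the 0.2 scale is an assumption.

```python
import numpy as np

def t_relu(x, alpha=0.2):
    """Sketch of a T-ReLU-style activation (assumed form, not Eqs. (1)-(2))."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0, x, alpha * np.tanh(x))

def t_relu_grad(x, alpha=0.2):
    """Derivative of the sketch: 1 for positive inputs, below 0.2 otherwise."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0, 1.0, alpha * (1.0 - np.tanh(x) ** 2))

z = np.array([-3.0, -1.0, -0.1, 0.5, 2.0])
print(t_relu(z))       # negative inputs stay in (-0.2, 0); positives pass through
print(t_relu_grad(z))  # gradient is 1 for z > 0 and well below 1 for z <= 0
```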
3) Improved cross-entropy loss function
The loss function is a crucial component of CNN algorithms, and its value largely reflects the robustness of the model. In this paper we propose a loss function called CE-FMCE, which combines cross entropy (CE) with an improved minimum classification error (MCE) criterion. Various loss functions, such as the mean square error (MSE), CE, and MCE, have been widely used to provide the weight gradients in the back-propagation algorithm. The cross-entropy loss surface has fewer flat regions than MSE and can escape local optima more easily, so cross entropy is more applicable to multiclass classification tasks (Krizhevsky et al. 2012; Szegedy et al. 2016). However, the cross-entropy loss mainly aims to raise the output probability of the correct category and ignores the probabilities of the incorrect categories without distinction, which makes it difficult for the classifier to separate true from false categories and is usually accompanied by a decline in model accuracy.
A loss function based on MCE takes the most confused false category into consideration and reduces the misclassification probability of traditional shallow networks (McDermott et al. 2006). However, MCE is built on a sigmoid function that saturates easily during back-propagation, causing gradient vanishing. In particular, the direction of gradient descent in MCE is contrary to that of cross entropy in misclassification situations, which leads to incomplete training and slow convergence in CNNs.
According to the above definitions of the improved loss function, the CE-FMCE function can not only raise the output probability of the correct category but also reduce the probability of misclassification, by combining cross entropy with the improved MCE function.
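As a rough illustration of this idea, the sketch below combines standard cross entropy with a λ-weighted, rectified penalty on the most confused incorrect category, so that the extra term pulls down the strongest wrong class while its descent direction remains consistent with cross entropy. The margin-style penalty shown here is an illustrative stand-in and is not the exact improved MCE term defined above; the λ value of 0.03 follows section 3d.

```python
import tensorflow as tf

def ce_fmce_loss(y_true, y_pred, lam=0.03, eps=1e-7):
    """Sketch of a CE-FMCE-style loss.

    y_true: one-hot labels (batch, classes); y_pred: softmax outputs.
    """
    y_pred = tf.clip_by_value(y_pred, eps, 1.0)
    # Standard cross entropy on the correct category.
    ce = -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1)
    # Probability of the most confused incorrect category for each sample.
    wrong = tf.reduce_max(y_pred * (1.0 - y_true), axis=-1)
    correct = tf.reduce_sum(y_pred * y_true, axis=-1)
    # Rectified penalty: active only when the confused class competes with the
    # true class, so its gradient direction agrees with cross entropy.
    penalty = tf.nn.relu(wrong - correct + 1.0)
    return tf.reduce_mean(ce + lam * penalty)

# Example: 5-way typhoon-intensity output for a batch of 2 samples.
y_true = tf.constant([[0, 0, 1, 0, 0], [1, 0, 0, 0, 0]], dtype=tf.float32)
y_pred = tf.constant([[0.1, 0.2, 0.5, 0.1, 0.1],
                      [0.6, 0.2, 0.1, 0.05, 0.05]], dtype=tf.float32)
print(ce_fmce_loss(y_true, y_pred).numpy())
```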
Table 4. Analysis of the gradient of the loss functions.
3. Experimental results and discussion
All of the proposed approaches are implemented with the Keras framework on TensorFlow, a Python library that greatly simplifies the construction and training of CNNs. The experiments are conducted on a CPU (Intel Xeon X5650 at 2.67 GHz) with 16 GB of memory under the Windows operating system.
Test accuracy and training time of Typhoon-CNNs are taken together as the index of model performance. Typhoon-CNNs is trained on more than 10 000 grayscale samples, 20% of which serve as a validation set for tuning the hyperparameters each epoch, and 2500 images are selected to test the model. After 200 epochs the performance of Typhoon-CNNs becomes stable, with a validation accuracy of 89.31% and a loss value of 0.12. Figures 4a and 4b depict the accuracy and loss curves, respectively, during training, and Figs. 5a and 5b depict the test accuracy and loss at different epochs. The detailed settings of the model hyperparameters are listed in Table 5.
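A minimal training and evaluation sketch of this setup is shown below, reusing the build_typhoon_cnn() sketch from section 2b(1). The 20% validation split, 200 epochs, and test batch size of 64 follow the text; the optimizer, learning rate, training batch size, and placeholder arrays (with the five intensity grades encoded as 0-4) are illustrative assumptions rather than the settings listed in Table 5.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the prepared NII samples.
x_train = np.random.rand(200, 224, 224, 1).astype("float32")
y_train = np.random.randint(0, 5, size=200)
x_test = np.random.rand(50, 224, 224, 1).astype("float32")
y_test = np.random.randint(0, 5, size=50)

model = build_typhoon_cnn(num_classes=5)
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

history = model.fit(
    x_train, y_train,
    validation_split=0.2,   # 20% of the training samples tune hyperparameters
    epochs=200,
    batch_size=64,
)

test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=64)
```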

Fig. 4. (a) Accuracy curves and (b) loss value curves during the training phase.

Fig. 5. (a) Test accuracies and (b) test losses under different epochs.
Table 5. The experimental settings of the model hyperparameters. LR stands for "learning rate."
The training accuracy curve in Fig. 4a converges to 0.93 ± 0.01 after 50 epochs, whereas the validation accuracy converges more slowly, reaching 0.88 ± 0.02 after approximately 140 epochs. Typhoon-CNNs keeps the difference between training and validation accuracy within 4%, which alleviates overfitting to a certain extent. A total of 2500 image samples not used in training are chosen to test the classification performance of Typhoon-CNNs; during testing, the batch size is set to 64 for the entire test set. As depicted in Fig. 5a, after 150 epochs the mean test accuracy on the typhoon test set reaches 88.74%, and the test loss reaches 0.09 after 200 epochs. In total, 2218 images are correctly classified and 282 images are wrongly predicted. The following sections describe the process of establishing Typhoon-CNNs in detail.
a. Selection for optimal sizes of convolution filters
The size of the convolutional filters in Typhoon-CNNs affects the performance of feature extraction from associated pixels in typhoon cloud images. If the chosen size is too small, the associated pixels can hardly be extracted fully, resulting in low learning efficiency, whereas a filter that is too large increases redundancy and loses some valuable information (Park and Bang 2010). Because the input images have a resolution of 224 × 224, the first convolutional layer is equipped with 64 filters of 11 × 11, and the subsequent three convolutional layers share a common size of 1 × 1, 3 × 3, 5 × 5, or 7 × 7. For computational convenience, the last convolutional layer adopts 256 filters of 2 × 2. Five groups of comparative experiments are planned according to filter size, and the corresponding cross-entropy loss curves after 300 epochs are depicted in Fig. 6. Table 6 outlines the training and validation accuracy of Typhoon-CNNs for the various filter sizes on the typhoon dataset.

Fig. 6. Comparison of loss value with different kernel sizes.
Table 6. Model performance with convolutional filters of various sizes.
As shown in Fig. 6, the loss curve for the 3 × 3 filter size converges fastest, reaching the minimum cross-entropy loss of 0.06 ± 0.03 after 120 epochs, which is 0.10 lower than the 5 × 5 scale, 0.17 lower than 7 × 7, and 0.38 lower than 1 × 1. The model simultaneously achieves the best training accuracy of 91.35% and validation accuracy of 82.90% on the typhoon dataset. The loss curve for the 1 × 1 size converges slowest and achieves the lowest accuracy, as outlined in Table 6. The comparative experiments show that 3 × 3 convolutional filters are the most consistent with the feature distribution in the spiral cloud bands of a typhoon.
b. Optimization for rate of dropout in fully connected layer
Overfitting occurs regardless of which convolutional filter size is adopted (Table 6 in section 3a): the mean difference between training and validation accuracy exceeds 8%. To alleviate the overfitting caused by the large number of free trainable parameters in Typhoon-CNNs, a dropout strategy is applied to the fully connected layers. We randomly freeze 10%–90% of the neurons in the last fully connected layer (fc6 in Table 3) and, over 10 experiments, compare the results with the classification performance of Typhoon-CNNs without frozen neurons. Figure 7 depicts the test accuracy of the network when different proportions of neurons in the fully connected layer are frozen; the relation between the freezing ratio and the generalization of Typhoon-CNNs is not monotonic. According to Fig. 7, a zero-set rate of 60% improves the test accuracy to 86.72%, and the difference between training and test accuracy falls below 5%, which alleviates overfitting significantly. The dropout strategy reduces complex co-adaptations of neurons, making the weight updates independent of the inherent relations among hidden nodes in the fully connected layers (Hinton et al. 2012). In general, these comparative experiments alleviate the overfitting of Typhoon-CNNs to some extent.
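The comparison described above could be scripted roughly as below, rebuilding the classifier with different zero-set rates for the fc6 layer and recording test accuracy. Here build_typhoon_cnn() is the sketch from section 2b(1), and the data arrays, optimizer, and epoch count are the same placeholder assumptions as in the earlier training sketch.

```python
import numpy as np

results = {}
for rate in np.round(np.arange(0.1, 1.0, 0.1), 1):
    # Rebuild the classifier with the current dropout (zero-set) rate on fc6.
    model = build_typhoon_cnn(num_classes=5, dropout_rate=float(rate))
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, validation_split=0.2,
              epochs=200, batch_size=64, verbose=0)
    _, test_acc = model.evaluate(x_test, y_test, batch_size=64, verbose=0)
    results[float(rate)] = test_acc

# In the paper's experiments, a zero-set rate of 0.6 gave the best test accuracy.
best_rate = max(results, key=results.get)
print(best_rate, results[best_rate])
```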

Fig. 7. Test accuracy when freezing different proportions of neurons.
c. Comparison among different activation functions
Figure 8 compares common activation functions such as Tanh, ReLU, and ELU with T-ReLU by evaluating the performance of Typhoon-CNNs under each activation function on the typhoon dataset. Figure 8 shows the training and validation accuracy under the four activation functions; the improved T-ReLU activation that we propose yields the best classification accuracy on both the training and validation datasets, reaching 92.09% and 87.71%, respectively, 2.8% higher than the ReLU function. The Tanh function performs worst and tends to overfit. The ELU function adds a nonzero gradient to neurons in the inhibitory state, allowing negative information to participate in training. However, ELU compresses all negative signals into the range between −1 and 0, and its derivative within that range is close to 1, which means all negative values affect the weight update of every layer in the network (Zheng et al. 2018). Unlike ELU, the T-ReLU function compresses negative signals into the range between −0.2 and 0; its derivative within that range is far less than 1, which ensures that negative values participate in the weight updates of only a few network layers.
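This contrast can be checked numerically with the standard ELU (α = 1) derivative and the T-ReLU sketch from section 2b(2); the T-ReLU form remains an assumption, while the ELU derivative shown is the standard one.

```python
import numpy as np

def elu_grad(x):
    # Standard ELU (alpha = 1): derivative is exp(x) for x <= 0, approaching 1 near zero.
    return np.where(x > 0, 1.0, np.exp(x))

def t_relu_grad(x, alpha=0.2):
    # T-ReLU sketch from section 2b(2): derivative stays at or below 0.2 for x <= 0.
    return np.where(x > 0, 1.0, alpha * (1.0 - np.tanh(x) ** 2))

z = np.array([-2.0, -1.0, -0.5, -0.1])
print(np.round(elu_grad(z), 3))     # [0.135 0.368 0.607 0.905]
print(np.round(t_relu_grad(z), 3))  # [0.014 0.084 0.157 0.198]
```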

Fig. 8. Performance with various activation functions.
d. Comparison among different loss functions
Considering the limitations of the cross-entropy and MCE functions in improving classification performance, a max-margin minimum classification error (M3CE) training approach inspired by traditional MCE has also been proposed; M3CE aims not only to increase the posterior probability of the true category but also to decrease the output of the most confused category (Feng et al. 2016). In this section, the MCE, MSE, M3CE, and cross-entropy loss functions are compared with CE-FMCE to demonstrate the feasibility of using the CE-FMCE function in Typhoon-CNNs. The batch size is set to 40, and the initial value of λ is 0.025 for this batch size; after optimization, λ in CE-FMCE is 0.03. Figure 9 shows the training accuracy curves of Typhoon-CNNs under the five loss functions.

Fig. 9. Training accuracy of Typhoon-CNNs using different loss functions.
The accuracy curve of the CE-FMCE function converges fastest, after 52 epochs, and achieves the highest training accuracy of 93.37%, mainly because the descent direction of the cross-entropy term is consistent with that of the improved MCE term. Although M3CE overcomes the gradient-vanishing shortcoming of MCE, it achieves the lowest accuracy of 82.51% and converges more slowly than CE-FMCE in Fig. 9, mainly because the descent direction of its gradient differs from that of cross entropy. Notably, MSE achieves a training accuracy of 85.4%, 7.90% lower than CE-FMCE; MSE evaluates the posterior probability poorly in classification tasks and is more applicable to regression problems (Guo et al. 2011). Table 7 details the training and test accuracy of Typhoon-CNNs, showing that the CE-FMCE loss function is well suited to the back-propagation of residuals during the training of Typhoon-CNNs.
Table 7. Model accuracy with various loss functions.
In conclusion, the CE-FMCE function both enhances the output probability of the correct category and takes the incorrect categories into account.
e. Comparison with the existing works
Many deep learning approaches, especially deep convolutional neural networks (DCNNs), have achieved excellent performance on a variety of computer vision problems. In particular, three representative networks, VGGNet, GoogLeNet, and ResNet, have demonstrated prominent feature-learning ability on the ImageNet dataset in the Large Scale Visual Recognition Challenge (ILSVRC) (Simonyan and Zisserman 2014; He et al. 2015; Szegedy et al. 2015). The Typhoon-CNNs proposed in our research is compared with these three representative DCNNs trained on the typhoon dataset in Table 8, and Fig. 10 shows the time required to train each of the four models until it converges and stabilizes. From Table 8 and Fig. 10, Typhoon-CNNs outperforms the other three models, consuming about 12 400 s over 100 epochs and achieving a test accuracy of 88.74%, which is 7.43% higher than ResNet50, 10.27% higher than InceptionV3, and 14.71% higher than VGG16. The three DCNNs tend to overfit and achieve unsatisfactory performance during testing, which results from the large discrepancy between the natural-scene ImageNet dataset and the remote sensing typhoon dataset mentioned in section 2b.
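For reference, the three comparison networks could be instantiated for the same task roughly as follows: trained from scratch (weights=None) on 224 × 224 × 1 grayscale inputs with a 5-way softmax head. The global-average-pooling classification head and random initialization are assumptions made for this sketch; the exact configurations used in the comparison are not reproduced here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16, ResNet50, InceptionV3

def build_reference_model(backbone_cls, num_classes=5):
    """Wrap a standard backbone for 5-way typhoon-intensity classification."""
    backbone = backbone_cls(include_top=False, weights=None,
                            input_shape=(224, 224, 1))
    return models.Sequential([
        backbone,
        layers.GlobalAveragePooling2D(),
        layers.Dense(num_classes, activation="softmax"),
    ])

reference_models = {
    "VGG16": build_reference_model(VGG16),
    "ResNet50": build_reference_model(ResNet50),
    "InceptionV3": build_reference_model(InceptionV3),
}
for name, m in reference_models.items():
    m.compile(optimizer="sgd", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
    # m.fit(...) and m.evaluate(...) proceed as in the Typhoon-CNNs sketch.
    print(name, m.count_params())
```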

Fig. 10. Training times consumed by different models.
Table 8. The accuracies on the typhoon cloud image datasets.
f. Visualization of relevant patterns
Visualization allows us to inspect the patterns detected at each layer of the deep network (Zeiler and Fergus 2013). This procedure helps open the hierarchies of CNNs, usually regarded as black-box models, by highlighting the regions of the input images most relevant to the network output probabilities. In this paper, we visualize the features extracted hierarchically by Typhoon-CNNs from a single typhoon cloud image. Figure 11 visualizes the feature maps of selected layers in Typhoon-CNNs. From Fig. 11a, the first convolutional layer (conv1) learns features that represent the presence or absence of edges at particular orientations and at the location of the typhoon center in the cloud images. As the number of convolutional layers increases, the features extracted by the filters become more abstract, as seen in the feature maps of the conv2 layer in Fig. 11b. After the pooling operation (pool2), the outline around the typhoon eye becomes clearer, and the texture of the cloud system provides more abundant, accurate, and reliable detail, as shown in Fig. 11c. Figure 11 thus verifies the basic principle of Typhoon-CNNs: edge features are first identified by the shallow convolutional layers and then combined into more abstract concepts, so the whole process of feature extraction from an input typhoon cloud image proceeds from shallow to deep, from low level to high level.
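A sketch of this layer-wise inspection, assuming the Keras sketch of Typhoon-CNNs from section 2b(1) (with layer names "conv1", "conv2", and "pool2") and placeholder inputs, is given below; in practice the trained model and a real preprocessed 224 × 224 × 1 cloud image would replace the placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

model = build_typhoon_cnn()                                   # placeholder: trained model
cloud_image = np.random.rand(224, 224, 1).astype("float32")  # placeholder sample

# Build a model that returns the activations of selected hidden layers.
layer_names = ["conv1", "conv2", "pool2"]
activation_model = tf.keras.Model(
    inputs=model.input,
    outputs=[model.get_layer(name).output for name in layer_names],
)
activations = activation_model.predict(cloud_image[np.newaxis, ...])

# Plot the first few feature maps of each selected layer.
fig, axes = plt.subplots(len(layer_names), 8, figsize=(16, 6))
for row, (name, fmap) in enumerate(zip(layer_names, activations)):
    for col in range(8):
        axes[row, col].imshow(fmap[0, :, :, col], cmap="viridis")
        axes[row, col].set_axis_off()
    axes[row, 0].set_title(name, loc="left")
plt.show()
```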

Fig. 11. The features visualized in partial hierarchies of Typhoon-CNNs.
To further confirm that the 3 × 3 filter size found in section 3a is optimal for Typhoon-CNNs on the typhoon dataset, we select four cloud images and visualize the feature maps of the conv4 layer for filter sizes of 1 × 1, 3 × 3, 5 × 5, and 7 × 7, corresponding to Figs. 12b–e. The 3 × 3 filters capture comprehensive and detailed features (yellow regions in the feature maps) representing the typhoon eye, the cloud wall, and the spiral cloud bands in the original images; Typhoon-CNNs is more sensitive to these areas than to the fibrous cloud at the edge of a typhoon. The feature-extraction capability of the 3 × 3 kernel is better than that of the 1 × 1 and 5 × 5 filters, which explains its fastest convergence in Fig. 6. As the convolution kernel grows, redundant information also increases to a certain extent; therefore, the feature-extraction ability of the model is not improved simply by increasing the kernel size (Wimmers et al. 2019). The 7 × 7 filters increase the redundancy of similar typhoon image regions and lose local feature information. Thus, for every filter size, focusing on the features in the yellow regions is what benefits the classification task of Typhoon-CNNs. This conclusion is also consistent with other research on typhoons, such as locating the typhoon eye, segmenting dense-shadowing cloud areas, and extracting characteristics of spiral cloud bands (Kuo et al. 2008; Wei et al. 2011; Chen et al. 2013).

Fig. 12. Feature extraction using different convolution kernel sizes: (a) the original image, (b) 1 × 1 kernel size, (c) 3 × 3 kernel size, (d) 5 × 5 kernel size, and (e) 7 × 7 kernel size.
Based on traditional CNNs, our proposed Typhoon-CNNs is constructed with the cyclic convolution strategy, in which the convolutional and pooling layers are arranged alternately in groups. To compare the performance of the different networks more intuitively, we visualize the hidden layers of traditional CNNs, VGG16, and Typhoon-CNNs in Fig. 13. Figures 13a and 13b show the abstract features learned by traditional CNNs, and Figs. 13c and 13d show those learned by VGG16 and Typhoon-CNNs, respectively. Comparing the extents of the yellow regions in Fig. 13, traditional CNNs lose a large proportion of the local pivotal features and therefore perform worse than Typhoon-CNNs and VGG16, because the pooling operation drops a certain number of crucial features in the yellow regions during training.

Fig. 13. Visualization of feature maps in the hidden layers of different models: (a),(b) traditional CNN hidden layers, (c) VGG16 hidden layer, and (d) Typhoon-CNNs hidden layer.
g. Comparison with various datasets
The experimental dataset comprises IR3 satellite cloud images captured by the GMS-5, GOES-9, MTSAT-1R, MTSAT-2, and Himawari-8 satellites. Figure 14 depicts the performance of Typhoon-CNNs on five datasets whose samples are captured, respectively, by these five satellites in the infrared band. The test accuracy on the multisource experimental dataset is compared with that on the single-source datasets in Fig. 15.

Fig. 14. Accuracy curves of six datasets.

Fig. 15. Prediction accuracy on six datasets.
Comparing Fig. 14 with Fig. 15, the performance of Typhoon-CNNs on the single-source datasets is superior to that on the multisource experimental dataset we used. Because of the inconsistent spectral resolution of samples from different satellites, the test accuracy of 88.74% is lower than that obtained on any dataset from a single satellite. If the input is limited to one sensor, the accuracy increases, but the adaptability and robustness of the model decrease. The comparative experiments show that, within an acceptable reduction in accuracy, the proposed model can adapt to merged multisource datasets from different sensors.
4. Conclusions
Typhoons, among the most devastating natural disasters, cause extreme damage to human life and production, and precise classification of typhoon intensity is an urgent problem. Satellite remote sensing has become the most convenient and effective means of monitoring TCs, thanks to the high temporal and spatial resolution of the images captured by meteorological satellites. Satellite cloud images provide abundant weather information from which deep learning approaches can extract meteorological characteristics automatically.
In this paper, Typhoon-CNNs, a model based on convolutional neural networks (CNNs), is proposed to classify typhoon intensity from remote sensing cloud images. Its novelty and contributions lie in the following aspects:
- Typhoon-CNNs architecture. Based on the traditional CNN method, the typhoon classifier adopts a cyclic convolution strategy that can extract sensitive features of spiral cloud bands. We optimize the selection of the filter parameters to enhance the representation ability of Typhoon-CNNs. The dropout zero-set operation prevents the network from overfitting by randomly freezing a proportion of neurons; selecting the optimized zero-set rate improves the accuracy on the typhoon dataset from 82.90% to 86.72%.
- Improved activation function, T-ReLU. On the basis of ReLU and Tanh, we propose an improved activation function, T-ReLU, for Typhoon-CNNs. T-ReLU maps negative neuron signals into a proper numerical range while retaining monotonicity on the positive axis, which alleviates the dead-neuron phenomenon that exists in ordinary activation functions. Typhoon-CNNs with the T-ReLU activation achieves a training accuracy of 92.09% and a test accuracy of 88.74%; these results show a significant improvement of Typhoon-CNNs with T-ReLU over the other activation functions.
- Improved loss function, CE-FMCE. The cross-entropy loss function is widely used in CNNs for error back-propagation, but it neglects the influence of the most confused category in misclassification situations. We propose the CE-FMCE function, which combines the cross-entropy and MCE functions and adds a rectified term so that the direction of gradient descent remains consistent with cross entropy under misclassification. The CE-FMCE function accelerates model convergence and finally raises the training accuracy to 93.37% and the test accuracy to 88.74%. On the basis of the above work, Typhoon-CNNs was established and compared with other classical CNN methods on the typhoon datasets; both the accuracy and the time consumption of Typhoon-CNNs are better than those of the other DCNNs.
Using visualization techniques, we observe that Typhoon-CNNs automatically detects specific features in the sensitive regions of typhoon cloud images and generates abstract feature representations layer by layer, which verifies the feasibility of using Typhoon-CNNs for automatic feature extraction. The Typhoon-CNNs model built on the multisource dataset is more robust and can adapt to multiple sensors.
Acknowledgments.
This study was supported by grants from the NSF China (41671431), STCSM capacity-building projects in local universities (17050501900), scientific project from Shanghai Engineering Research Center of Estuarine and Oceanographic Mapping, and open fund project of Science Technology Key Laboratory in SOA Digital Ocean. We acknowledge NII for providing basic data on the typhoon events. The authors thank the reviewers in advance for their comments and suggestions.
REFERENCES
Aberson, S. D., M. L. Black, R. A. Black, R. W. Burpee, J. J. Cione, C. W. Landsea, and F. D. Marks, 2006: Thirty years of tropical cyclone research with the NOAA P-3 aircraft. Bull. Amer. Meteor. Soc., 87, 1039–1056, https://doi.org/10.1175/BAMS-87-8-1039.
Bell, M. M., and M. T. Montgomery, 2008: Observed structure, evolution, and potential intensity of category 5 Hurricane Isabel (2003) from 12 to 14 September. Mon. Wea. Rev., 136, 2023–2046, https://doi.org/10.1175/2007MWR1858.1.
Bender, M. A., I. Ginis, R. Tuleya, B. Thomas, and T. Marchok, 2007: The operational GFDL coupled hurricane ocean prediction system and a summary of its performance. Mon. Wea. Rev., 135, 3965–3989, https://doi.org/10.1175/2007MWR2032.1.
Chatfield, K., K. Simonyan, A. Vedaldi, and A. Zisserman, 2014: Return of the devil in the details: Delving deep into convolutional nets. Proceedings of the British Machine Vision Conference 2014, M. Valstar, A. French, and T. Pridmore, Eds., BMVA Press, 6.1–6.12, https://doi.org/10.5244/C.28.6.
Chen, B., B. Chen, H. Lin, and R. L. Elsberry, 2019: Estimating tropical cyclone intensity by satellite imagery utilizing convolutional neural networks. Wea. Forecasting, 34, 447–465, https://doi.org/10.1175/WAF-D-18-0136.1.
Chen, X., L. I. Yan, K. F. Mao, and S. M. Fei, 2013: Automatic location of typhoon center based on nonlinear fitness function from IR satellite cloud images. J. Trop. Meteor., 29, 155–160.
Dvorak, V. F., 1975: Tropical cyclone intensity analysis and forecasting from satellite imagery. Mon. Wea. Rev., 103, 420–430, https://doi.org/10.1175/1520-0493(1975)103<0420:TCIAAF>2.0.CO;2.
Feng, Z., Z. Sun, and L. Jin, 2016: Learning deep neural network using max-margin minimum classification error. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Shanghai, China, Institute of Electrical and Electronics Engineers, 2677–2681, https://doi.org/10.1109/ICASSP.2016.7472163.
Frank, N. L., 1977: Atlantic tropical systems of 1971. Mon. Wea. Rev., 122, 307–314, https://doi.org/10.1175/1520-0493(1972)100%3C0268:ATSO%3E2.3.CO;2.
Giammanco, I. M., J. L. Schroeder, and M. D. Powell, 2012: GPS dropwindsonde and WSR-88D observations of tropical cyclone vertical wind profiles and their characteristics. Wea. Forecasting, 28, 77–99, https://doi.org/10.1175/WAF-D-11-00155.1.
Glorot, X., A. Bordes, and Y. Bengio, 2012: Deep sparse rectifier neural networks. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, Vol. 15, G. Gordon, D. Dunson, and M. Dudík, Eds., MLR Press, 315–323, https://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf.
Guo, D., Y. Wu, S. S. Shitz, and S. Verdu, 2011: Estimation in Gaussian noise: Properties of the minimum mean-square error. IEEE Trans. Inf. Theory, 57, 2371–2385, https://doi.org/10.1109/TIT.2011.2111010.
Handcock, R. N., A. R. Gillespie, K. A. Cherkauer, J. E. Kay, S. J. Burges, and S. K. Kampf, 2006: Accuracy and uncertainty of thermal-infrared remote sensing of stream temperatures at multiple spatial scales. Remote Sens. Environ., 100, 427–440, https://doi.org/10.1016/j.rse.2005.07.007.
Hasler, A. F., R. Mack, and A. Negri, 1983: Stereoscopic observations from meteorological satellites. Adv. Space Res., 2, 105–113, https://doi.org/10.1016/0273-1177(82)90130-2.
He, K., X. Zhang, S. Ren, and J. Sun, 2015: Deep residual learning for image recognition. IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, Institute of Electrical and Electronics Engineers, 770–778.
Hinton, G. E., N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, 2012: Improving neural networks by preventing co-adaptation of feature detectors. arXiv, 18 pp., https://arxiv.org/abs/1207.0580.
Hsiao, L. F., M. S. Peng, D. S. Chen, K. N. Huang, and T. C. Yeh, 2009: Sensitivity of typhoon track predictions in a regional prediction system to initial and lateral boundary conditions. J. Appl. Meteor. Climatol., 48, 1913–1928, https://doi.org/10.1175/2009JAMC2038.1.
Jiménez, P., R. Parra, and J. M. Baldasano, 2007: Influence of initial and boundary conditions for ozone modeling in very complex terrains: A case study in the northeastern Iberian Peninsula. Environ. Modell. Software, 22, 1294–1306, https://doi.org/10.1016/j.envsoft.2006.08.004.
Juang, B. H., and S. Katagiri, 1992: Discriminative learning for minimum error classification. IEEE Trans. Sig. Proc., 40, 3043–3054, https://doi.org/10.1109/78.175747.
Krizhevsky, A., I. Sutskever, and G. E. Hinton, 2012: ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25 (NIPS 2012), F. Pereira et al., Eds., NeurIPS, 1097–1105, https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.
Kuo, H. C., C. P. Chang, Y. T. Yang, and H. J. Jiang, 2008: Western north Pacific typhoons with concentric eyewalls. Mon. Wea. Rev., 137, 3758–3770, https://doi.org/10.1175/2009MWR2850.1.
LeCun, Y., Y. Bengio, and G. Hinton, 2015: Deep learning. Nature, 521, 436–444, https://doi.org/10.1038/nature14539.
Lee, J., J. Im, D.-H. Cha, H. Park, and S. Sim, 2020: Tropical cyclone intensity estimation using multi-dimensional convolutional neural networks from geostationary satellite data. Remote Sens., 12, 108, https://doi.org/10.3390/rs12010108.
Li, L., Y. Chen, T. Xu, R. Liu, K. Shi, and C. Huang, 2015: Super-resolution mapping of wetland inundation from remote sensing imagery based on integration of back-propagation neural network and genetic algorithm. Remote Sens. Environ., 164, 142–154, https://doi.org/10.1016/j.rse.2015.04.009.
Li, P., L. Dong, H. Xiao, and M. Xu, 2015: A cloud image detection method based on SVM vector machine. Neurocomputing, 169, 34–42, https://doi.org/10.1016/j.neucom.2014.09.102.
Liu, C. C., G. R. Liu, T. H. Lin, and C. C. Chao, 2010: Accumulated rainfall forecast of Typhoon Morakot (2009) in Taiwan using satellite data. J. Meteor. Soc. Japan, 88, 785–798, https://doi.org/10.2151/jmsj.2010-501.
McDermott, E., T. J. Hazen, J. L. Roux, A. Nakamura, and S. Katagiri, 2006: Discriminative training for large-vocabulary speech recognition using minimum classification error. IEEE Trans. Audio Speech Lang. Process., 15, 203–223, https://doi.org/10.1109/TASL.2006.876778.
Montgomery, M. T., and R. J. Kallenbach, 2010: A theory for vortex Rossby‐waves and its application to spiral bands and intensity changes in hurricanes. Quart. J. Roy. Meteor. Soc., 123, 435–465, https://doi.org/10.1002/qj.49712353810.
Nair, V., and G. E. Hinton, 2010: Rectified linear units improve restricted Boltzmann machines. ICML’10: Proc. of the 27th Int. Conf. on Machine Learning, Haifa, Israel, Association for Computing Machinery, 807–814, https://dl.acm.org/doi/10.5555/3104322.3104425.
Park, D. R., and S. W. Bang, 2010: Method and apparatus for processing line pattern using convolution kernel. U.S. Patent US20060182365A1, filed 10 February 2006, issued 2 February 2010.
Powell, M. D., 1980: An evaluation of diagnostic marine boundary layer models applied to tropical cyclones. Wind Eng., 108, 133–143.
Rehn, M., and F. T. Sommer, 2007: A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields. J. Comput. Neurosci., 22, 135–146, https://doi.org/10.1007/s10827-006-0003-9.
Ren, S., K. He, R. Girshick, and J. Sun, 2015: Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell., 39, 1137–1149, https://doi.org/10.1109/TPAMI.2016.2577031.
Rodés-Guirao, L., 2019: Deep learning for digital typhoon: Exploring a typhoon satellite image dataset using deep learning. Dissertation, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 70 pp., http://www.diva-portal.org/smash/get/diva2:1304600/FULLTEXT01.pdf.
Sabour, S., N. Forsst, and G. E. Hinton, 2017: Dynamic routing between capsules. Proc. of the 31st Int. Conf. on Neural Information Processing Systems, Long Beach, CA, Association for Computing Machinery, 3856–3866, https://dl.acm.org/doi/10.5555/3294996.3295142.
Sharma, A., X. Liu, X. Yang, and D. Shi, 2017: A patch-based convolutional neural network for remote sensing image classification. Neural Networks, 95, 19–28, https://doi.org/10.1016/j.neunet.2017.07.017.
Simonyan, K., and A. Zisserman, 2014: Very deep convolutional networks for large-scale image recognition. arXiv, 14 pp., https://arxiv.org/abs/1409.1556.
Szegedy, C., and Coauthors, 2015: Going deeper with convolutions. 2015 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, MA, Institute of Electrical and Electronics Engineers, 1–9, https://doi.org/10.1109/CVPR.2015.7298594.
Szegedy, C., V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, 2016: Rethinking the inception architecture for computer vision. 2016 IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, Institute of Electrical and Electronics Engineers, 2818–2826, https://doi.org/10.1109/CVPR.2016.308.
Taigman, Y., M. Yang, M. A. Ranzato, and L. Wolf, 2014: DeepFace: Closing the gap to human-level performance in face verification. 2014 IEEE Conf. on Computer Vision and Pattern Recognition, Columbus, OH, Institute of Electrical and Electronics Engineers, 1701–1708, https://doi.org/10.1109/CVPR.2014.220.
Tralli, D. M., R. G. Blom, V. Zlotnicki, A. Donnellan, and D. L. Evans, 2005: Satellite remote sensing of earthquake, volcano, flood, landslide and coastal inundation hazards. ISPRS J. Photogramm. Remote Sens., 59, 185–198, https://doi.org/10.1016/j.isprsjprs.2005.02.002.
Velden, C., and Coauthors, 2006: The Dvorak tropical cyclone intensity estimation technique: A satellite-based method that has endured for over 30 years. Bull. Amer. Meteor. Soc., 87, S6–S9, https://doi.org/10.1175/BAMS-87-9-Velden.
Weatherford, C. L., and W. M. Gray, 1988: Typhoon structure as revealed by aircraft reconnaissance. Part II: Structural variability. Mon. Wea. Rev., 116, 1044–1056, https://doi.org/10.1175/1520-0493(1988)116<1044:TSARBA>2.0.CO;2.
Wei, K., Z. L. Jing, Y. X. Li, and S. L. Liu, 2011: Spiral band model for locating tropical cyclone centers. Pattern Recognit. Lett., 32, 761–770, https://doi.org/10.1016/j.patrec.2010.12.011.
Wimmers, A., C. Velden, and J. H. Cossuth, 2019: Using deep learning to estimate tropical cyclone intensity from satellite passive microwave imagery. Mon. Wea. Rev., 147, 2261–2282, https://doi.org/10.1175/MWR-D-18-0391.1.
Wu, C. C., K. H. Chou, H. J. Cheng, and Y. Wang, 2003: Eyewall contraction, breakdown and reformation in a landfalling typhoon. Geophys. Res. Lett., 30, L017653, https://doi.org/10.1029/2003GL017653.
Xia, M., W. Lu, J. Yang, Y. Ma, W. Yao, and Z. Zheng, 2015: A hybrid method based on extreme learning machine and k -nearest neighbor for cloud classification of ground-based visible cloud image. Neurocomputing, 160, 238–249, https://doi.org/10.1016/j.neucom.2015.02.022.
Xiao, J., J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, 2010: SUN database: Large-scale scene recognition from abbey to zoo. 2010 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, San Francisco, CA, IEEE, 3485–3492, https://doi.org/10.1109/CVPR.2010.5539970.
Xie, F., Z. He, M. Q. Esguerra, F. Qiu and V. Ramanathan, 2013: Determination of heterotic groups for tropical Indica hybrid rice germplasm. Theor. Appl. Genet., 127, 407–417, https://doi.org/10.1007/s00122-013-2227-1.
Zeiler, M. D., and R. Fergus, 2013: Stochastic pooling for regularization of deep convolutional neural networks. arXiv, 9 pp., https://arxiv.org/abs/1301.3557.
Zhang, L. J., H. Y. Zhu, and X. J. Sun, 2014: China’s tropical cyclone disaster risk source analysis based on the gray density clustering. Nat. Hazards, 71, 1053–1065, https://doi.org/10.1007/s11069-013-0700-4.
Zheng, Z. S., Z. R. Liu, D. M. Huang, W. Song, G. L. Zou, Q. Hou, and J. B. Hao, 2018: Deep learning model for typhoon grade classification based on improved activation function. Comput. Sci., 45, 177–181, https://doi.org/10.11896/j.issn.1002-137X.2018.12.028.
Zheng, Z. S., Q. Hou, G. L. Zou, and L. Qi, 2019: Research on deep learning based on improved minimal classification error criterion algorithm: Take typhoon satellite image as example. Jisuanji Yingyong Yanjiu, 36, 3160–3163.