1. Introduction
The microphysical characteristics of atmospheric ice particles are of great interest for a number of reasons. Weather forecasting models depend on accurate estimates of ice particle mass (Heymsfield et al. 2013) and terminal velocity (Heymsfield and Westbrook 2010) to represent ice clouds accurately. Precipitation rates and surface accumulation forecasts also depend strongly on these properties. In-cloud ice particle characteristics can have significant impacts on precipitation development (Sulia et al. 2021). Measurements of atmospheric ice particle characteristics at small particle sizes (sub-100 μm) carry significant uncertainty because depth-of-field issues and potential particle shattering make cloud particle concentrations from traditional optical array probes uncertain (Korolev et al. 2013). The early stages of ice crystal growth are a frequent topic of study in cloud chambers such as Aerosol Interaction and Dynamics in the Atmosphere (AIDA; Schnaiter et al. 2016). Ice fog and diamond dust in Arctic regions have been described as a natural laboratory for the study of early cloud particle growth (Gultepe et al. 2017). Air temperatures can be similar to those in midlatitude cirrus clouds, with the added benefit that measurements can be made at the surface.
The Particle Phase Discriminator mark 2, Karlsruhe edition (PPD-2K; Kaye et al. 2008), and its airborne cousin the Small Ice Detector, version 3 (SID-3; Ulanowski et al. 2012, 2014), are well suited to increase our understanding of small ice particle sizes and shapes. The PPD-2K and SID-3 differ from standard forward scattering probes in that they record a two-dimensional image of the forward scattered light rather than a single forward scattered (or one forward and one backward) light intensity value. The forward scattering pattern images can provide a wealth of detail on the shape of the ice particle (Schnaiter et al. 2016; Vochezer et al. 2016; Järvinen et al. 2014; Schmitt et al. 2016). The PPD-2K and SID-3 have been actively used for ice particle characterization in both laboratory and field experiments. Vochezer et al. (2016) used the PPD-2K to study mixed-phase cloud characteristics at Jungfraujoch research station in Switzerland. Schnaiter et al. (2016) used SID-3 to study ice particle surface roughness in the AIDA cloud chamber. Schmitt et al. (2016) used SID-3 measurements from the NASA WB-57 to investigate small particle characteristics in midlatitude ice clouds. Järvinen et al. (2014) used the PPD-2K to study the light scattering properties of corona-producing cirrus clouds through laboratory experiments. While traditional methods of information extraction have led to quality results, this study aims to investigate if further useful information can be easily extracted through machine learning (ML) techniques.
Ice fog and diamond dust particles typically form during cold periods (T < −20°C), often associated with strong temperature inversions. As ice fog and diamond dust particles are typically smaller than 100 μm, the PPD-2K is a well-suited probe for their measurements. Diamond dust is defined as small ice crystals falling from an apparently cloudless sky (American Meteorological Society 2012, definition diamond dust). Diamond dust is common during strong thermal inversions when air aloft reaches saturation either through mixing with cooler air or simply by radiative cooling. Diamond dust is often associated with strong optical effects (sun pillars or halos) due to the typical pristine nature of the ice particles. Ice fog particles, in low visibility situations, are often quasi-spherical droxtal-shaped particles (Thuman and Robinson 1954) which generally do not produce halo features.
Particle habit classification studies have often been conducted using particle imagery data. Using edge detection methods, the physical dimensions of a particle are determined and the particle is classified based on empirically developed relationships (Lawson et al. 2006, 2008). More recently, ML has been used to identify further particle characteristics as in Przybylo et al. (2022) and O’Shea et al. (2016) (cloud particle imager probe; Lawson et al. 2001) and Touloupas et al. (2020) (holographic instrument; Fugal and Shaw 2009). These studies have been based on images of the particles. Given the aforementioned challenges with sampling small ice crystals as well as the insufficient pixel resolution for most two-dimensional optical array probes (2D OAPs), SID-3 and PPD-2K forward scattering pattern measurements provide a unique opportunity to characterize the microphysical properties of small ice crystals.
The shapes and light scattering properties of atmospheric ice particles are significant for Earth’s radiation budget (Lawson et al. 2019). Fu (1996) provides single scattering properties for hexagonal cirrus cloud particles for use in climate models. This type of study was advanced by the inclusion of a wider range of ice particle shapes, up to and including aggregates of plate-shaped crystals (Baum et al. 2011) and droxtals (Yang et al. 2003). Yang et al. (2008) investigated the scattering characteristics of ice particles with surface texture and roughness. While these microphysical characteristics are understood to have different and important impacts on Earth’s radiation budget, the identification of these shapes remains poorly understood. Most 2D OAPs are useful for particle shape identification only at particle sizes larger than a few times their pixel size at best. The SID-3 and PPD-2K size range includes atmospheric particles smaller than 100 μm, and the measured scattering patterns can provide considerable microphysical information, such as shape and surface roughness, for particles as small as 5–10 μm.
For three winters starting in December 2019, PPD-2K measurements were conducted of boundary layer ice particles at the surface in Fairbanks, Alaska. Numerous earlier studies had indicated that the size range of ice particles in Fairbanks area ice fog would be ideally suited for the PPD-2K (Thuman and Robinson 1954; Schmitt et al. 2013; Benson 1970). Measurements were focused on potential ice fog conditions, but the PPD-2K was often operated for significant periods either before or after anticipated cold conditions. The result is a large dataset of ice particle measurements during times when either or both ice fog and diamond dust were present.
In this study, a Visual Geometry Group (VGG16) convolutional neural network (Simonyan and Zisserman 2014) is trained to categorize light-scattering patterns collected during the Fairbanks, Alaska, measurement campaign. The CNN training was partially supervised in that the training dataset was expanded by using a previously trained (on a smaller dataset) CNN. The earlier trained CNN was applied to randomly selected images from the full dataset to increase the training dataset, but each image was individually scrutinized before inclusion. The results from the CNN are compared to traditional methods of PPD-2K analysis, and strengths and weaknesses are discussed. Results suggest that a combined approach including the standard methods and CNN analysis provides the most detailed information on the particle population. A companion publication submitted in tandem with this article uses the results of this study to quantify particle type based on the observed atmospheric conditions (Schmitt et al. 2024).
2. Data and methods
a. Field site
Figure 1 shows a map of the Fairbanks region. Fairbanks is situated at the northern edge of the Tanana Flats region between the Alaska Range to the south and a ring of small hills to the north. The research campaign was conducted at Fort Wainwright Army Base on the east side of Fairbanks. The goal of the research project was to study the impact of the Fort Wainwright powerplant on ice fog, and thus, measurements were conducted nearby. Generally, the Fort Wainwright powerplant did not directly influence the microphysical measurements, but occasionally, weak winds sent the powerplant plume toward the measurement location. Instrumentation was located at the Cold Regions Research and Engineering Laboratory (CRREL) Alaska Regional Office building approximately 1 km northwest of the powerplant near the hospital.
Fig. 1. Google map of the Fairbanks area. Measurements were conducted at Fort Wainwright (red circle) east of downtown Fairbanks. Yellow squares indicate powerplant locations. Red triangles indicate locations of additional atmospheric measurements associated with the study.
Citation: Artificial Intelligence for the Earth Systems 3, 3; 10.1175/AIES-D-23-0091.1
b. Instrumentation: PPD-2K
The ice particle microphysical measurements were conducted using a PPD-2K, developed at the University of Hertfordshire, United Kingdom. This instrument is similar to the instrument described in Kaye et al. (2008). The PPD-2K is a laboratory version of the SID-3 (Ulanowski et al. 2012, 2014). A brief description of the operation of the PPD-2K is given here. For a complete description of the operation of the PPD-2K, see Vochezer et al. (2016) and the aforementioned publications. The PPD-2K operates by detecting laser light that is scattered through interaction with atmospheric cloud particles. A frequency doubled Nd:YAG laser shines a 100-mW 532-nm laser beam through a sample volume. A vacuum pump pulls 5 L min−1 of air into the instrument which is accelerated and focused into the sample volume so that all particles intersect the laser at a similar speed. When a cloud particle passes through the laser beam, the scattered light is measured in two ways. A beam splitter sends a portion of the light to a trigger detector which triggers a 582 × 592 CCD camera to collect an image of the scattered light. The CCD camera samples at 30 Hz meaning that some particles could pass through the sample volume without an image being collected since the trigger detector operates continuously. In addition to counting, the trigger detector records the trigger intensity for every particle. The recorded trigger detector intensity is used to estimate the particle size. Size estimates are made for all particles that pass through the sample volume while scattering pattern images are only saved if the CCD camera is ready.
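The difference between particles that are counted and particles that are imaged can be illustrated with a small simulation. The following is a minimal sketch, assuming that the camera becomes ready again exactly 1/30 s after its last saved image and that particle arrival times are supplied by the caller. The function name and the readiness model are illustrative assumptions, not a description of the instrument's actual acquisition logic.

```python
def count_imaged(arrival_times_s, frame_rate_hz=30.0):
    """Return (n_triggered, n_imaged) for sorted particle arrival times.

    Every arrival is counted by the trigger detector; an image is saved
    only if the camera is ready (assumed: 1/frame_rate after last image).
    """
    min_gap = 1.0 / frame_rate_hz
    next_ready = float("-inf")
    n_imaged = 0
    for t in arrival_times_s:
        if t >= next_ready:
            n_imaged += 1
            next_ready = t + min_gap
    return len(arrival_times_s), n_imaged


# Ten particles arriving 10 ms apart: the camera keeps every fourth one.
arrivals = [0.01 * i for i in range(10)]
triggered, imaged = count_imaged(arrivals)
```

In this toy case, all ten particles receive a size estimate from the trigger detector, but only three produce a scattering pattern image.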
In total, approximately 2.15 million scattering patterns were recorded by the PPD-2K over the 3-yr project. The PPD-2K was not operated continuously; rather, it was operated during time periods when ice particles were anticipated and when personnel were available to monitor the instrument and surroundings. In total, the PPD-2K was operated for approximately 315 h with measurement strategies varying from constant operation to operating for two 5-min periods every hour. The PPD-2K was operated with two different CCD camera gain settings, leading to two separate datasets of approximately 2 150 000 and 500 000 images, which were analyzed with separate CNNs. Note that the data presented here are from the larger dataset, although a CNN was trained similarly for the second dataset with similar results.
c. Machine learning study
PPD-2K scattering pattern images are of uniform size with no rotational bias, and scattering pattern types can be easily identified by eye. There can be a high degree of variability in the light-scattering patterns, but several pattern types appear frequently and are thought to originate from particles with similar shapes. For uniformity, the images were clipped to 582 × 582 pixels. Figure 2 shows several examples of ice particle scattering patterns from six categories imaged by the PPD-2K. An initial dataset of seven categories, each with approximately 200 human-identified images, was used to test different CNN architectures using PyTorch (Torchvision 2019). The training dataset was augmented by flipping and randomly rotating the images. The initial dataset was used to train CNNs with the VGG11, VGG13, VGG16, and VGG19 (Simonyan and Zisserman 2014), DenseNet (Huang et al. 2016), and AlexNet (Krizhevsky et al. 2012) architectures. It was quickly discovered that the VGG architecture worked very well on the dataset: VGG13 and VGG16 performed equally well, with VGG11 and VGG19 close behind. The PyTorch cross-entropy loss function and a learning rate of 0.0001 were used for all VGG models. The DenseNet and AlexNet architectures both failed to converge to better than 70% even when allowed 2–5 times more epochs.
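The clipping and augmentation steps can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the text does not state where the clip window was placed or which rotation angles were drawn, so a centered crop and rotations in quarter-turn steps (which need no interpolation) are assumed here. All function names are hypothetical.

```python
import random


def center_crop_square(image):
    """Crop a 2D list of pixel rows to a centered square."""
    rows, cols = len(image), len(image[0])
    side = min(rows, cols)
    r0, c0 = (rows - side) // 2, (cols - side) // 2
    return [row[c0:c0 + side] for row in image[r0:r0 + side]]


def rotate90(image):
    """Rotate a 2D list by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image, rng):
    """Apply a random flip and a random number of quarter turns."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]   # left-right flip
    if rng.random() < 0.5:
        out = out[::-1]                    # up-down flip
    for _ in range(rng.randrange(4)):
        out = rotate90(out)
    return out


# Toy 2 x 3 "image"; a real frame would be cropped to 582 x 582.
frame = [[1, 2, 3],
         [4, 5, 6]]
square = center_crop_square(frame)
augmented = augment(square, random.Random(0))
```

Because the scattering patterns have no preferred orientation, any flipped or rotated copy is an equally valid member of its category, which is what makes this form of augmentation appropriate for these images.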
Fig. 2. Example scattering pattern images from the PPD-2K. Examples from six of the labeled categories are shown. “Small rough” and “Rough” scattering patterns show large and small speckles, respectively. Scattering patterns from “Pristine” show distinctive diffraction patterns, often with the 22° halo visible and very symmetric, while “Irregular” patterns have distinctive linear features but are not generally symmetric around the center. “Spherical” scattering patterns have near perfect concentric rings while scattering patterns from “Sublimating” particles show rounded features. The two categories not shown are “Saturated” and “Small,” essentially too much light and not enough light to characterize.
PPD-2K images were categorized as pristine, rough, spherical, sublimating, or irregular. Additional categories for saturated and small (too much light or not enough light) were also used to categorize images with less useful information. Scattering patterns from pristine particles show very distinctive features created by light scattering through hexagonal prisms. Rough particles are characterized by speckled scattering patterns (Schnaiter et al. 2016). Spherical particles scatter light according to Mie theory: distinctive perfectly circular rings with the spacing based on particle size (Mie 1908). Sublimating particles often have scattering patterns that have concentric rings like the spherical particles but that are distorted as if the particles have sublimated to imperfect rather than perfect spheres. Based on the positive results from the initial study, a more advanced study was carried out.
For the advanced study, eight categories were used, with rough particles split into two categories based on speckle size (as in Ulanowski et al. 2012). Scattering patterns with small speckles are thought to be created by larger particles, while large speckles are more likely from smaller particles. Based on the preliminary results, the VGG13 architecture was chosen for its combination of speed and accuracy for developing the training dataset. Rather than visually identifying 1500 images for each category, an iterative approach was used to augment the training dataset. A VGG13 model was first trained on an initial eight-category dataset with approximately 120 human-selected images in each category. A program then randomly selected images from the full 2.15 million image dataset and categorized them with this initial CNN to identify candidate images. Once the desired number of new candidates had been identified for each category (the 50% increase plus an extra 10%–20% to allow for miscategorized candidates), each candidate image was visually inspected by a human to identify erroneously categorized images. Any images that were obviously miscategorized were placed into the training dataset for the correct category for the next round. Any images that were ambiguous were removed from the candidate folder. Typically, fewer than 20% were moved or removed, depending on the category and the skill developed up to that point. This method was chosen because it was thought to provide a more representative labeled dataset. A second program was then used to randomly select images from each scrutinized candidate folder to increase the total training folder to a set amount. The random selection step was taken so that the process would be less subjective than selecting only the images that appeared to be the best. The training dataset was augmented by roughly 50% with each iteration of the process.
The training dataset was increased in this manner from the initial 120 per category to approximately 175, 325, 500, 1000, and 1500 per category per each successive iteration. Note that liquid water droplets (spheres) were extremely rare in the dataset, so the spherical category tended to have approximately half of the number of categorized particles as the other categories. At each iteration, the images were separated into training and validation subsets by randomly selecting 20% of the images for validation and using the remaining 80% for training. With the new training dataset, a new VGG13 model was then trained. A flowchart of the process is shown in Fig. 3. Once a dataset of 1500 per category was established, each of the four different VGG architectures was tested. Comparative results are shown graphically in Fig. 4. Each successive iteration took between 2 and 6 h for the hand sorting of the candidate images. For comparison, by randomly sorting through the entire dataset at a rate of five images per minute, it would have taken approximately 160 h to identify the 700 liquid droplet scattering patterns.
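The iterative augmentation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: the classifier is a stand-in function, the human review step is represented only by a comment, and the function names and the 20% candidate margin are hypothetical choices made for the example rather than details of the notebooks used in the study.

```python
import random


def gather_candidates(image_ids, classify, quota, rng):
    """Randomly draw images and bin them by predicted category until
    every category in `quota` holds its requested number of candidates."""
    pool = list(image_ids)
    rng.shuffle(pool)
    bins = {cat: [] for cat in quota}
    for image_id in pool:
        cat = classify(image_id)
        if cat in bins and len(bins[cat]) < quota[cat]:
            bins[cat].append(image_id)
        if all(len(bins[c]) >= quota[c] for c in quota):
            break
    return bins


def top_up(training, reviewed, target, rng):
    """Randomly move human-reviewed candidates into the training set
    until the category reaches `target` images (or candidates run out)."""
    need = max(0, target - len(training))
    picks = rng.sample(reviewed, min(need, len(reviewed)))
    return training + picks


def split_train_val(items, val_fraction, rng):
    """Random split; the text uses 20% validation and 80% training."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def toy_classify(image_id):
    """Stand-in for the CNN: even ids are 'pristine', odd are 'rough'."""
    return "pristine" if image_id % 2 == 0 else "rough"


rng = random.Random(42)
# 55 new images are needed to go from 120 to 175; 66 allows a 20% margin.
bins = gather_candidates(range(1000), toy_classify,
                         {"pristine": 66, "rough": 66}, rng)
# Human review would happen here; this toy keeps every candidate.
train_set = top_up(list(range(-120, 0)), bins["pristine"], 175, rng)
train, val = split_train_val(train_set, 0.2, rng)
```

Drawing the final additions at random from the reviewed candidates, rather than picking the clearest examples, mirrors the stated goal of keeping the labeled dataset representative.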
Fig. 3. Flowchart for the training dataset augmentation process. Rectangles indicate time intensive human actions. Diamonds indicate decisions that the user must make. Ovals are different Jupyter Notebook programs written in Python.
Fig. 4. (a) Skill comparison for VGG13 architecture as training dataset size increased as well as comparisons between the different VGG architectures with the final training dataset. (b) Comparison between the VGG architectures and AlexNet and DenseNet skills on full dataset. (c) Confusion matrix for final VGG16 CNN.
Increasing the number of images in each category led to a slow improvement in skill, and all four VGG architectures performed similarly on the full dataset, with VGG16 having a slight edge over the others. The results in Fig. 4 are the best skill achieved within the first 25 epochs. For each CNN training step, the CNN was saved after any epoch in which it achieved the highest skill to that point, as well as after the 25th epoch. At each step during the training dataset augmentation phase, a confusion matrix was created to identify where the model was making mistakes. Each column of the confusion matrix corresponds to a human-assigned label, and the rows show how the CNN categorized those images. Thus, of the scattering patterns labeled as spherical, 711 were categorized as spherical and one each was categorized as small or sublimating. Conversely, 74 patterns labeled sublimating and 2 labeled small were categorized as spherical. While a CNN occasionally achieved a higher skill score before the 25th epoch, it was observed that the confusion matrix was more uniform after the 25th epoch. The final confusion matrix is shown in Fig. 4c. The confusion matrix clearly showed overlap between the categories. This was expected given the similarities between scattering patterns in different categories. The cutoff between small rough and rough scattering patterns is chosen subjectively from the speckle sizes in the scattering pattern. Sublimating and spherical scattering patterns both show concentric rings, with the spherical rings being perfectly concentric circles while the sublimating rings are not circular. Saturated images often show signs of speckles and are therefore often classified as rough.
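The column and row convention of the confusion matrix can be made concrete with the spherical counts quoted above. The sketch below is a minimal illustration: only the counts given in the text are used, every other matrix entry is left at zero because it is not quoted, and the function names are hypothetical.

```python
def confusion_matrix(human_labels, cnn_labels, categories):
    """Rows are CNN categories, columns are human labels (as in the text)."""
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(categories)
    matrix = [[0] * n for _ in range(n)]
    for truth, predicted in zip(human_labels, cnn_labels):
        matrix[index[predicted]][index[truth]] += 1
    return matrix


def recall_and_precision(matrix, k):
    """Recall uses column k (human label); precision uses row k (CNN)."""
    correct = matrix[k][k]
    column_total = sum(row[k] for row in matrix)
    row_total = sum(matrix[k])
    return correct / column_total, correct / row_total


# Only the spherical counts quoted in the text are filled in here;
# all other entries are left at zero because they are not quoted.
categories = ["spherical", "sublimating", "small"]
human = (["spherical"] * 713) + (["sublimating"] * 74) + (["small"] * 2)
cnn = (["spherical"] * 711 + ["small", "sublimating"]
       + ["spherical"] * 74 + ["spherical"] * 2)
matrix = confusion_matrix(human, cnn, categories)
recall, precision = recall_and_precision(matrix, 0)
```

With these counts, nearly all human-labeled spherical patterns are recovered (711 of 713), while about one in ten patterns that the CNN calls spherical carries a different human label (76 of 787), most of them sublimating.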
The CNNs appeared to perform well when applied to unlabeled data. To confirm that the CNNs were focused on important characteristics, saliency maps were created. Constructing a saliency map involves extracting gradient values from a CNN during the classification process; an image created from these gradient values highlights the regions that were important for making the categorization decision. Saliency maps are particular to the image being categorized; thus, a saliency map should highlight the areas of the image that the human eye would identify as critical features for distinguishing the image from another (Simonyan et al. 2013), and it can be compared directly to the image being classified. Several individual images and their associated saliency maps are shown in Fig. 5. Subjectively, the saliency maps suggest that the CNNs were working as expected, with the critical regions being highlighted. Notably, the saliency maps show highlighted circles for the particle identified as liquid, and the cross patterns common to pristine hexagon scattering patterns are also highlighted. As would be expected, the bright spot in the center of the scattering pattern is never highlighted in the saliency maps, indicating that although it is bright, it is not useful for categorization. This suggests that the CNNs should be useful for SID-3 data, in which the central bright spot is blocked. A separate CNN was developed for each of the two CCD camera gain settings used in data collection.
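The idea behind a gradient-based saliency map can be shown on a toy example. The sketch below is a simplification, not the procedure used in the study: instead of extracting gradients from a trained CNN, it estimates the gradient of a stand-in class score by finite differences, one pixel at a time. The scoring function `toy_ring_score`, which deliberately ignores the central pixel, is a hypothetical placeholder for a network output.

```python
def saliency_map(score_fn, image, eps=1e-3):
    """Absolute finite-difference gradient of score_fn at each pixel."""
    base = score_fn(image)
    saliency = []
    for r, row in enumerate(image):
        out_row = []
        for c, value in enumerate(row):
            bumped = [list(x) for x in image]
            bumped[r][c] = value + eps
            out_row.append(abs(score_fn(bumped) - base) / eps)
        saliency.append(out_row)
    return saliency


def toy_ring_score(image):
    """Stand-in class score that only looks at the border pixels."""
    last = len(image) - 1
    return sum(v for r, row in enumerate(image) for c, v in enumerate(row)
               if r in (0, last) or c in (0, last))


# Bright center, faint ring: the center is bright but carries no gradient.
image = [[0.2, 0.4, 0.2],
         [0.4, 9.0, 0.4],
         [0.2, 0.4, 0.2]]
smap = saliency_map(toy_ring_score, image)
```

The toy map is zero at the bright central pixel and nonzero on the ring, which is the same qualitative behavior reported for the central bright spot of the measured scattering patterns.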
Fig. 5. Saliency maps and corresponding PPD-2K images. For the pristine scattering pattern, notice that the saliency map shows faint lines matching the diffraction pattern in the image. The saliency map for the rough scattering pattern is relatively uniformly speckled over the full scattering pattern. For the spherical scattering pattern, concentric arcs appear to be highlighted in some regions of the saliency map.
3. Results
For the purposes of this publication, we present the results of the ML study as compared to standard microphysical characterization techniques for the PPD-2K as described in Schnaiter et al. (2016) and Vochezer et al. (2016). A companion article (Schmitt et al. 2024) details the relationship between the identified microphysical characteristics and atmospheric conditions. Figure 6 shows the full categorization of ice particles observed by the PPD-2K through the three winters of study.
Fig. 6. Categorization of particle types for full dataset based on VGG16 CNN.
Two of the ML categories can easily be compared to similar categories from the traditional analysis. Schnaiter et al. (2016) describe the standard method to identify scattering patterns from particles with rough surfaces. They use the gray-level co-occurrence matrix (GLCM) method described in Lu et al. (2006). The GLCM is a matrix describing how frequently pairs of gray levels (pixel values) occur in a texture image. From this matrix, an energy value or K value (K-val) can be extracted that has been shown to be related to surface roughness and to have little dependence on image brightness. These results can be directly compared to the ML categories for rough-surfaced particles. A K-val cutoff of 4.6 is typically used to separate particles with rough surfaces from particles without rough surfaces, and this threshold is indicated as a line on the K-val graphs. For the labeled dataset, scattering pattern images were subjectively selected for the rough categories based on their speckled appearance without knowledge of the K-val. The VGG-type CNNs were able to learn this pattern type reasonably well. Figure 7 shows the range of K-vals that correspond to both rough and small rough particles as determined by the ML algorithm. While the small rough category includes values below the 4.6 threshold, the rough category is mostly above that threshold. For the full dataset, a scatterplot comparing the relative percentage of scattering patterns from rough particles determined with the K-val cutoff to that from the combination of the ML categories rough and small rough is also shown in Fig. 7 for each hour when more than 100 scattering patterns were recorded. The offset is likely due to the ML-defined small rough particles that fell below the K-val = 4.6 cutoff.
Visual inspection of scattering patterns with K-val just below 4.6 did not show distinctly different patterns, just less contrast across speckles that appeared slightly larger. Pearson’s correlation coefficient between the percentages determined by the two techniques is 0.845, indicating good agreement. Also note that the particle sizes for the small rough category tend to be smaller than for the rough category, which supports the Ulanowski et al. (2012) results regarding speckle size.
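The GLCM itself is simple to construct. The sketch below is a minimal illustration that assumes a single horizontal pixel offset and images already quantized to a small number of gray levels. It computes the sum of squared matrix entries, one common definition of GLCM energy (some libraries reserve "energy" for the square root of this sum). The conversion from such a texture measure to the K-val used with the 4.6 threshold is not reproduced here, and the function names are hypothetical.

```python
def glcm(image, levels, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(len(image)):
        for c in range(len(image[0])):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < len(image) and 0 <= c2 < len(image[0]):
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def glcm_energy(p):
    """Sum of squared GLCM entries (also called angular second moment)."""
    return sum(v * v for row in p for v in row)


flat = [[1, 1, 1, 1]] * 4                      # uniform texture
speckled = [[0, 1, 0, 1], [1, 0, 1, 0]] * 2    # alternating speckle
energy_flat = glcm_energy(glcm(flat, levels=2))
energy_speckled = glcm_energy(glcm(speckled, levels=2))
```

The uniform texture concentrates all co-occurrences in one matrix entry and gives an energy of 1, while the alternating pattern spreads them over two entries and gives 0.5, showing how the measure responds to texture rather than to overall brightness.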
Fig. 7. Comparison of the portion of rough particles in each time period as determined by the two different methods. (a) Scatterplot of ML rough vs GLCM rough portion per hour period through full dataset. Histograms of (b),(d) K-val, with GLCM roughness cutoff indicated and (c),(e) particle size for scattering pattern images identified as small rough or rough and size histogram for each category.
The second category where the standard techniques can be compared to ML is the pristine category. Pristine particles, which are more likely associated with diamond dust, are identified by symmetries in the scattering patterns. As can be seen in Fig. 2, the pristine particles often have 2-, 4-, or 6-fold symmetry. The standard analysis technique determines this by “unwrapping” the scattering pattern to create an integrated azimuthal intensity graph. A fast Fourier transform (FFT) can then identify these symmetries. Vochezer et al. (2016) use these symmetries to categorize particles as either columns (having 2- or 4-fold symmetries) or plates (having strong sixfold symmetry). In the example images in Fig. 2, notice that the training dataset for pristine particles all includes distinct sharp diffraction patterns while many of the scattering patterns in the irregular category do have features that would suggest strong symmetries, but do not include the sharp diffraction patterns. This has led to the ML algorithm being more discerning in identifying the diffraction patterns. While the ML algorithm was not able to separate fourfold from sixfold symmetry, it appears to have handled well the diffraction patterns associated with light scattering from near perfect hexagon prisms. Figure 8 shows a similar scatterplot showing the relationship between the ML pristine category and the proportion of particles designated as plates or columns by the FFT analysis. Pearson’s correlation coefficient between the percentages determined by two techniques is 0.842. A number of the scattering patterns determined symmetric by the FFT technique were classified as irregular by the ML algorithm due to the lack of strong diffraction features. Figure 8 also shows the K-vals for the ML pristine and irregular categories as well as which FFT symmetries were strongest for those categories. The ML pristine category had high counts with FFT values of 2, 4, or 6. 
Approximately 40% of the ML irregular particles had FFT values of 2, 4, or 6 indicating that a number of FFT defined plates and columns did not have strong diffraction patterns. The scattering patterns used for training the ML irregular category are typically not speckled and therefore would not go into the rough categories. The ML sublimating category also included scattering patterns that were classified by the FFT technique as columns or plates which also added to the offset in the scatterplot. Also shown in Fig. 8 are the ML categories for scattering patterns that had FFT symmetries for plates and columns. The scattering patterns that led the FFT analysis to suggest twofold symmetry led to a higher count of ML irregular patterns than ML pristine, while fourfold and sixfold symmetries led to twice as many pristine particles as compared to irregular.
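The unwrapping and symmetry detection steps of the standard analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the pattern is assumed to be centered on the image midpoint, the annulus limits and number of azimuth bins are arbitrary, and a direct discrete Fourier sum stands in for the FFT so that no numerical library is required. The function names are hypothetical.

```python
import cmath
import math


def azimuthal_profile(image, r_min, r_max, n_bins=72):
    """Sum pixel intensity into azimuth bins within an annulus centered
    on the image midpoint ("unwrapping" the pattern)."""
    cy, cx = (len(image) - 1) / 2, (len(image[0]) - 1) / 2
    profile = [0.0] * n_bins
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            radius = math.hypot(y - cy, x - cx)
            if r_min <= radius <= r_max:
                angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
                profile[int(angle / (2 * math.pi) * n_bins) % n_bins] += value
    return profile


def strongest_harmonic(profile, max_order=8):
    """Order (1..max_order) of the largest discrete Fourier amplitude."""
    n = len(profile)

    def amplitude(k):
        return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                       for i, v in enumerate(profile)))

    return max(range(1, max_order + 1), key=amplitude)


# Synthetic azimuthal profile with six evenly spaced bright arms.
six_fold = [1.0 + math.cos(6 * 2 * math.pi * i / 72) for i in range(72)]
order = strongest_harmonic(six_fold)
```

For the synthetic sixfold profile, the strongest harmonic is order 6, which in the scheme described above would point toward a plate. A pattern can return a strong 2-, 4-, or 6-fold harmonic without showing sharp diffraction features, which is consistent with the overlap between FFT-defined plates and columns and the ML irregular category.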
Fig. 8. Comparison of the portion of pristine particles determined by ML compared to the portion of scattering patterns designated as from columns or plates. (a) Scatterplot of ML pristine portion vs FFT plates + columns portion per hour for full dataset. Histograms of (b),(d) K-val and (c),(e) strongest FFT harmonic for scattering pattern images identified as pristine or irregular. ML assigned categories for scattering patterns with (f)–(h) either 2-, 4-, or 6-fold symmetries.
The ML category labeled “sublimating” has not been reliably identified using standard digital image analysis tools. As in the examples shown in Fig. 2, sublimating scattering patterns can range from modestly rounded, nearly pristine scattering patterns to nearly concentric ovals. In the AIDA cloud chamber, the transition from pristine to concentric ovals is well documented during periods when the chamber is subsaturated. Schnaiter et al. (2016) show the progression of scattering patterns in an experiment where column-shaped crystals were grown and then sublimated. Figure 9 provides a similar set of scattering patterns from particles with different degrees of sublimation from the Fairbanks measurement campaign. Numerous publications mention that droxtal-shaped ice crystals are common in ice fog. Droxtals are thought to be frozen droplets (Thuman and Robinson 1954) and have been geometrically described in Yang et al. (2003). When a liquid droplet freezes, it is thought that the resulting ice particle will be nearly spherical but with many flat surfaces. Near-spherical particles with flat rather than curved surfaces scatter light differently. Järvinen et al. (2016) and Schnaiter et al. (2016) discuss scattering patterns created by distorted spherical particles. Using a light scattering code, they simulated the SID-3 response to ice spheres with varying degrees of distortion. The calculated scattering patterns of the less distorted spheres are similar in appearance to scattering patterns categorized as sublimating. In a companion publication (Schmitt et al. 2024), it is shown that scattering patterns categorized as sublimating occur at higher percentages under two distinctly different circumstances. First, they are more common when the temperature is increasing, which is not a surprise because those are conditions under which sublimating particles would be expected to be present.
Second, they are very common during the coldest time periods when particle concentrations were highest. It is suspected that these particles are actually droxtals rather than sublimating particles. They likely develop due to the very high particle concentrations which necessitate that the relative humidity be 100%. In this situation, sharp edges would be eroded while surfaces that are closer to flat would tend to collect water molecules. The result is that after time the particles would tend toward quasi-spherical shapes. Figure 9 also shows the K values determined for scattering patterns in the sublimating category as well as the particle sizes. Similar to the K-vals determined for the theoretical scattering patterns in Schnaiter et al. (2016) appendix B, the sublimating category is strongly skewed to the lower K-vals.
Fig. 9. (a) Scattering patterns for increasingly sublimated column-shaped crystals. The first three were categorized as pristine while the final three were classified as sublimating. Histograms of (b) K-val and (c) particle size for scattering patterns categorized as sublimating.
4. Summary and conclusions
Machine learning is a valuable tool for the analysis of forward scattering particle probes such as the PPD-2K and SID-3. Measured light-scattering patterns are distinctively different for differently shaped particles. Additionally, measured light-scattering patterns are uniformly sized and have no preferred orientation enabling further training dataset augmentation through rotation and flipping of images. VGG-type CNNs have been found to be able to categorize different scattering pattern types. This study presents the results of a ML study to categorize 2.15 million PPD-2K scattering patterns from boundary layer cloud particle measurements in ice fog and diamond dust conditions from Fairbanks, Alaska.
A partially supervised approach was used for the training of the CNN. Scattering patterns were randomly selected from the 2.15 million image dataset and categorized by a preliminary CNN in order to augment the training dataset. This strategy had several advantages. It made it possible to identify strengths and weaknesses of the CNN at early stages, saving computing time if category changes were necessary. It also made it possible to keep the number of images in each labeled category similar. Even though every image was scrutinized by eye before it went into the training dataset, weeding through candidate images was far less time consuming than sifting through the full dataset by hand to assemble a similar training dataset would have been.
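The candidate selection step described above can be sketched as follows. This is an illustration under stated assumptions and not the code used in this study: `predict` stands in for the preliminary CNN, `confirm` stands in for the visual check by a person, and the category names, the per-category target, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def propose_candidates(image_ids, predict, sample_size, counts, target, seed=0):
    """Draw a random sample and keep candidates for under-filled categories.

    predict : callable, image id -> category (stand-in for the preliminary CNN)
    counts  : dict, category -> number of images already labeled
    target  : desired number of labeled images per category (assumed value)
    """
    rng = random.Random(seed)
    sample = rng.sample(image_ids, min(sample_size, len(image_ids)))
    candidates = defaultdict(list)
    for image_id in sample:
        category = predict(image_id)
        needed = target - counts.get(category, 0)
        # Stop collecting for a category once enough candidates are queued.
        if needed > 0 and len(candidates[category]) < needed:
            candidates[category].append(image_id)
    return dict(candidates)


def review(candidates, confirm, training_set):
    """Add only candidates that pass the manual check to the training set."""
    for category, ids in candidates.items():
        for image_id in ids:
            if confirm(image_id, category):
                training_set.setdefault(category, []).append(image_id)
    return training_set


# Toy stand-ins (hypothetical): a deterministic "CNN" and "reviewer".
CATEGORIES = ["pristine", "rough", "sublimating"]


def toy_predict(image_id):
    return CATEGORIES[image_id % 3]


def toy_confirm(image_id, category):
    return image_id % 7 != 0  # the reviewer rejects some candidates


labeled_counts = {"pristine": 8, "rough": 2, "sublimating": 0}
candidates = propose_candidates(
    list(range(300)), toy_predict, 240, labeled_counts, target=10
)
training_set = review(candidates, toy_confirm, {})
```

In this toy run, the category that is already close to the target receives few new candidates, while the empty category receives the most, which is how the per-category counts stay similar.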
The VGG CNN categorized scattering patterns similarly to traditional methods and was also able to identify categories that could not be categorized otherwise. Scattering patterns for rough-surfaced particles were identified nearly equally by the ML and by the GLCM method used by Schnaiter et al. (2016). While the VGG CNN was not able to separate scattering patterns created by plate-shaped crystals from those created by column-shaped crystals, it was better able to identify strong diffraction patterns. The VGG CNN was also able to identify scattering patterns from particles suspected to be sublimating particles or droxtals. This is significant because no other means of categorizing sublimating particles exists.
The results of this study show that machine learning can be used to categorize light-scattering patterns from SID-3/PPD-2K type particle probes. Because in numerous cases the scattering patterns transition fluidly from one type to another, as with sublimating to pristine and rough to small rough, it is critical for a user to confirm that the ML algorithms separated the scattering patterns appropriately for a particular study. A combined approach could also be very useful. For example, when only pristine particles are of interest, the ML pristine category could be combined with those ML sublimating scattering patterns for which FFT analysis indicates plates or columns. In this case, the FFT pristine but ML sublimating scattering patterns typically showed rounded diffraction patterns, adding to the category, while the ML pristine category performed better than the FFT analysis at identifying classic diffraction patterns. A combined approach of this kind could provide the best results for a particular scientific task.
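As a minimal sketch of such a combined selection, the snippet below keeps a scattering pattern when the ML category is pristine, or when the ML category is sublimating and the FFT analysis indicates a plate or a column. The record layout and the label strings are assumptions made for illustration, not the output format of the tools used in this study.

```python
def select_pristine(records):
    """Return ids of patterns kept by the combined ML + FFT rule.

    Each record is a dict with hypothetical keys:
      "id"  : pattern identifier
      "ml"  : ML category, e.g. "pristine", "sublimating", "rough"
      "fft" : FFT-based habit, e.g. "plate", "column", or None
    """
    selected = []
    for rec in records:
        fft_habit = rec["fft"] in ("plate", "column")
        if rec["ml"] == "pristine" or (rec["ml"] == "sublimating" and fft_habit):
            selected.append(rec["id"])
    return selected


records = [
    {"id": 1, "ml": "pristine", "fft": "column"},
    {"id": 2, "ml": "pristine", "fft": None},
    {"id": 3, "ml": "sublimating", "fft": "plate"},
    {"id": 4, "ml": "sublimating", "fft": None},
    {"id": 5, "ml": "rough", "fft": "column"},
]
kept = select_pristine(records)
```

Here patterns 1 to 3 are kept, while pattern 4 (sublimating without an FFT habit) and pattern 5 (rough) are excluded.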
An extensive study using the ML results is submitted as a companion publication (Schmitt et al. 2024). In that publication, ice particle characteristics gleaned from this study are related to atmospheric conditions in the Fairbanks region.
Acknowledgments.
This material is based upon work supported by the Broad Agency Announcement Program from the U.S. Army Cold Regions Research and Engineering Laboratory (ERDC-CRREL) under Contract W913E521C0017 from the U.S. Army Basic Research Program (Program Element 0603119A, Ground Advanced Technology). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Broad Agency Announcement Program or ERDC-CRREL. The authors would also like to thank Dominique Pride, Heike Merkel, and Tom Douglas for their leadership in this project.
Data availability statement.
Data are available at https://akclimate.org/projects/icefog.
REFERENCES
American Meteorological Society, 2012: Diamond dust. Glossary of Meteorology, https://glossary.ametsoc.org/wiki/Diamond_dust.
Baum, B. A., P. Yang, A. J. Heymsfield, C. G. Schmitt, Y. Xie, A. Bansemer, Y.-X. Hu, and Z. Zhang, 2011: Improvements in shortwave bulk scattering and absorption models for the remote sensing of ice clouds. J. Appl. Meteor. Climatol., 50, 1037–1056, https://doi.org/10.1175/2010JAMC2608.1.
Benson, C. S., 1970: Ice Fog: Low Temperature Air Pollution. Vol. 121. Corps of Engineers, U.S. Army, Cold Regions Research and Engineering Laboratory, 198 pp.
Fu, Q., 1996: An accurate parameterization of the solar radiative properties of cirrus clouds for climate models. J. Climate, 9, 2058–2082, https://doi.org/10.1175/1520-0442(1996)009<2058:AAPOTS>2.0.CO;2.
Fugal, J. P., and R. A. Shaw, 2009: Cloud particle size distributions measured with an airborne digital in-line holographic instrument. Atmos. Meas. Tech., 2, 259–271, https://doi.org/10.5194/amt-2-259-2009.
Gultepe, I., A. J. Heymsfield, M. Gallagher, L. Ickes, and D. Baumgardner, 2017: Ice fog: The current state of knowledge and future challenges. Ice Formation and Evolution in Clouds and Precipitation: Measurement and Modeling Challenges, Meteor. Monogr., No. 58, Amer. Meteor. Soc., https://doi.org/10.1175/AMSMONOGRAPHS-D-17-0002.1.
Heymsfield, A. J., and C. D. Westbrook, 2010: Advances in the estimation of ice particle fall speeds using laboratory and field measurements. J. Atmos. Sci., 67, 2469–2482, https://doi.org/10.1175/2010JAS3379.1.
Heymsfield, A. J., C. Schmitt, and A. Bansemer, 2013: Ice cloud particle size distributions and pressure-dependent terminal velocities from in situ observations at temperatures from 0° to −86°C. J. Atmos. Sci., 70, 4123–4154, https://doi.org/10.1175/JAS-D-12-0124.1.
Huang, G., Z. Liu, L. van der Maaten, and K. Q. Weinberger, 2016: Densely connected convolutional networks. arXiv, 1608.06993v5, https://doi.org/10.48550/arXiv.1608.06993.
Järvinen, E., P. Vochezer, O. Möhler, and M. Schnaiter, 2014: Laboratory study of microphysical and scattering properties of corona-producing cirrus clouds. Appl. Opt., 53, 7566–7575, https://doi.org/10.1364/AO.53.007566.
Järvinen, E., and Coauthors, 2016: Quasi-spherical ice in convective clouds. J. Atmos. Sci., 73, 3885–3910, https://doi.org/10.1175/JAS-D-15-0365.1.
Kaye, P. H., E. Hirst, R. S. Greenaway, Z. Ulanowski, E. Hesse, P. J. DeMott, C. Saunders, and P. Connolly, 2008: Classifying atmospheric ice crystals by spatial light scattering. Opt. Lett., 33, 1545–1547, https://doi.org/10.1364/OL.33.001545.
Korolev, A. V., E. F. Emery, J. W. Strapp, S. G. Cober, and G. A. Isaac, 2013: Quantification of the effects of shattering on airborne ice particle measurements. J. Atmos. Oceanic Technol., 30, 2527–2553, https://doi.org/10.1175/JTECH-D-13-00115.1.
Krizhevsky, A., I. Sutskever, and G. E. Hinton, 2012: Imagenet classification with deep convolutional neural networks. 25th Conf. on Neural Information Processing Systems, Lake Tahoe, NV, NeurIPS, 1097–1105, https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html.
Lawson, R. P., B. A. Baker, C. G. Schmitt, and T. L. Jensen, 2001: An overview of microphysical properties of Arctic clouds observed in May and July 1998 during FIRE ACE. J. Geophys. Res., 106, 14 989–15 014, https://doi.org/10.1029/2000JD900789.
Lawson, R. P., B. A. Baker, P. Zmarzly, D. O’Connor, Q. Mo, J.-F. Gayet, and V. Shcherbakov, 2006: Microphysical and optical properties of atmospheric ice crystals at South Pole Station. J. Appl. Meteor. Climatol., 45, 1505–1524, https://doi.org/10.1175/JAM2421.1.
Lawson, R. P., B. Pilson, B. Baker, Q. Mo, E. Jensen, L. Pfister, and P. Bui, 2008: Aircraft measurements of microphysical properties of subvisible cirrus in the tropical tropopause layer. Atmos. Chem. Phys., 8, 1609–1620, https://doi.org/10.5194/acp-8-1609-2008.
Lawson, R. P., and Coauthors, 2019: A review of ice particle shapes in cirrus formed in situ and in anvils. J. Geophys. Res. Atmos., 124, 10 049–10 090, https://doi.org/10.1029/2018JD030122.
Lu, R.-S., G.-Y. Tian, D. Gledhill, and S. Ward, 2006: Grinding surface roughness measurement based on the co-occurrence matrix of speckle pattern texture. Appl. Opt., 45, 8839–8847, https://doi.org/10.1364/AO.45.008839.
Mie, G., 1908: Beiträge zur Optik trüber Medien, speziell kolloidaler Metallösungen. Ann. Phys., 330, 377–445, https://doi.org/10.1002/andp.19083300302.
O’Shea, S. J., and Coauthors, 2016: Airborne observations of the microphysical structure of two contrasting cirrus clouds. J. Geophys. Res. Atmos., 121, 13 510–13 536, https://doi.org/10.1002/2016JD025278.
Przybylo, V. M., K. J. Sulia, C. G. Schmitt, and Z. J. Lebo, 2022: Classification of cloud particle imagery from aircraft platforms using convolutional neural networks. J. Atmos. Oceanic Technol., 39, 405–424, https://doi.org/10.1175/JTECH-D-21-0094.1.
Schmitt, C. G., M. Stuefer, A. J. Heymsfield, and C. K. Kim, 2013: The microphysical properties of ice fog measured in urban environments of Interior Alaska. J. Geophys. Res. Atmos., 118, 11 136–11 147, https://doi.org/10.1002/jgrd.50822.
Schmitt, C. G., M. Schnaiter, A. J. Heymsfield, P. Yang, E. Hirst, and A. Bansemer, 2016: The microphysical properties of small ice particles measured by the Small Ice Detector-3 probe during the MACPEX field campaign. J. Atmos. Sci., 73, 4775–4791, https://doi.org/10.1175/JAS-D-16-0126.1.
Schmitt, C. G., D. Vas, M. Schnaiter, E. Järvinen, L. Hartl, T. Wong, V. Cassella, and M. Stuefer, 2024: Microphysical characterization of boundary layer ice particles: Results from a 3-year measurement campaign in interior Alaska. J. Appl. Meteor. Climatol., 63, 699–716, https://doi.org/10.1175/JAMC-D-23-0190.1.
Schnaiter, M., and Coauthors, 2016: Cloud chamber experiments on the origin of ice crystal complexity in cirrus clouds. Atmos. Chem. Phys., 16, 5091–5110, https://doi.org/10.5194/acp-16-5091-2016.
Simonyan, K., and A. Zisserman, 2014: Very deep convolutional networks for large-scale image recognition. arXiv, 1409.1556v6, https://doi.org/10.48550/arXiv.1409.1556.
Simonyan, K., A. Vedaldi, and A. Zisserman, 2013: Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv, 1312.6034v2, https://doi.org/10.48550/arXiv.1312.6034.
Sulia, K. J., Z. J. Lebo, V. M. Przybylo, and C. G. Schmitt, 2021: A new method for ice–ice aggregation in the adaptive habit model. J. Atmos. Sci., 78, 133–154, https://doi.org/10.1175/JAS-D-20-0020.1.
Thuman, W. C., and E. Robinson, 1954: Studies of Alaskan ice-fog particles. J. Meteor., 11, 151–156, https://doi.org/10.1175/1520-0469(1954)011<0151:SOAIFP>2.0.CO;2.
Torchvision, 2019: Torchvision models. PyTorch, https://pytorch.org/vision/stable/models.html.
Touloupas, G., A. Lauber, J. Henneberger, A. Beck, and A. Lucchi, 2020: A convolutional neural network for classifying cloud particles recorded by imaging probes. Atmos. Meas. Tech., 13, 2219–2239, https://doi.org/10.5194/amt-13-2219-2020.
Ulanowski, Z., E. Hirst, P. H. Kaye, and R. Greenaway, 2012: Retrieving the size of particles with rough and complex surfaces from two-dimensional scattering patterns. J. Quant. Spectrosc. Radiat. Transfer, 113, 2457–2464, https://doi.org/10.1016/j.jqsrt.2012.06.019.
Ulanowski, Z., P. H. Kaye, E. Hirst, R. S. Greenaway, R. J. Cotton, E. Hesse, and C. T. Collier, 2014: Incidence of rough and irregular atmospheric ice particles from small ice detector 3 measurements. Atmos. Chem. Phys., 14, 1649–1662, https://doi.org/10.5194/acp-14-1649-2014.
Vochezer, P., E. Järvinen, R. Wagner, P. Kupiszewski, T. Leisner, and M. Schnaiter, 2016: In situ characterization of mixed phase clouds using the small ice detector and the particle phase discriminator. Atmos. Meas. Tech., 9, 159–177, https://doi.org/10.5194/amt-9-159-2016.
Yang, P., B. A. Baum, A. J. Heymsfield, Y. X. Hu, H.-L. Huang, S.-C. Tsay, and S. Ackerman, 2003: Single-scattering properties of droxtals. J. Quant. Spectrosc. Radiat. Transfer, 79–80, 1159–1169, https://doi.org/10.1016/S0022-4073(02)00347-3.
Yang, P., G. W. Kattawar, G. Hong, P. Minnis, and Y. Hu, 2008: Uncertainties associated with the surface texture of ice particles in satellite-based retrieval of cirrus clouds—Part I: Single-scattering properties of ice crystals with surface roughness. IEEE Trans. Geosci. Remote Sens., 46, 1940–1947, https://doi.org/10.1109/TGRS.2008.916471.