1. Introduction
Arctic snow is an important component of the global water–energy budget, with connections to water resource management, flood forecasting, and ecosystem sustainability (Pörtner et al. 2019; Gray and Landine 2011; Buttle et al. 2016; Gergel et al. 2017). However, extended monitoring of high-latitude snowfall is challenging due to the vast size and remote nature of the region (Doesken and Robinson 2009). The current sparse in situ measurement network contains large observational gaps, with less than 1% of all active terrestrial weather stations operating above 70°N (Mekis et al. 2018). Spaceborne observations from satellites provide far more spatial coverage than stationary surface gauges and offer a new perspective toward snowfall measurements in remote regions (Dietz et al. 2012; Skofronick-Jackson et al. 2019; Stephens et al. 2018). The National Aeronautics and Space Administration (NASA) CloudSat satellite is one such source of observations through its Cloud Profiling Radar (CPR) instrument, which makes use of a 94-GHz (W band) radar to retrieve estimates of atmospheric hydrometeor activity, including snowfall, by observing the inner structures of clouds (Stephens et al. 2008, 2002).
Comparisons between gridded CPR-derived surface snow accumulation estimates, reanalysis, and in situ observations have previously demonstrated good agreement at high latitudes above 70°N (Kulie et al. 2020; King and Fletcher 2020; Duffy et al. 2021; King and Fletcher 2021). However, CloudSat observations are impacted by surface clutter contamination in the lowest 1.2 km of the retrieved profile over land, which limits the CPR’s ability to view hydrometeors in this region (Palerme et al. 2019; Bennartz et al. 2019). This radar blind zone contributes significant uncertainty to CloudSat-derived snowfall estimates due to potential missed shallow cumuliform snowfall events and cases of virga, further contributing substantial error to calculations of surface snow accumulation and estimations of the regional water–energy budget (Hudak et al. 2008; Kulie et al. 2016).
Mixed-phase cloud (MPC) systems, spanning vast Arctic regions for most of the year, are key drivers of Arctic snowfall (Shupe et al. 2006; Morrison et al. 2012). These low-level clouds, single or multilayered, consist of supercooled liquid water droplets near the cloud top, which freeze into ice crystals as they descend toward the surface (Pinto 1998; Morrison et al. 2009). Unlike low-latitude MPCs, Arctic MPCs can last for multiple days to weeks under various atmospheric conditions, such as synoptic-scale forcing and large-scale subsidence (Zuidema et al. 2005). Their constant liquid water production leads to significant snowfall, which critically impacts surface radiative fluxes and the regional water–energy budget (Morrison et al. 2009; Shupe et al. 2006). However, due to the aforementioned blind zone, many of these shallow snowfall systems are missed. For instance, Maahn et al. (2014) found that CloudSat fails to detect 9%–11% of high-latitude precipitation, with similar uncertainties also reported for CPR virga classifications (Kodamana and Fletcher 2021). Similarly, McIlhattan et al. (2020) observed that CloudSat misses 25% of cloud liquid water (CLW) snowfall events at Greenland’s Summit Station due to the blind zone. These inaccuracies lead to a substantial underestimation of accumulated surface water quantities since CLW events are responsible for generating over half of the total observed snowfall in these regions.
Accurately reconstructing blind zone reflectivity values would therefore facilitate a more reliable retrieval of snowfall produced by Arctic MPCs. In the realm of computer vision, image inpainting techniques offer a solution to this type of problem, using information stored in surrounding pixels to fill in missing portions of an image (Elharrouss et al. 2020). Linear extrapolation techniques are easy to interpret and do not require training (Safont et al. 2014) but often result in blurry, unrealistic inpainted regions for complex images since they are unable to recognize and learn from the wider, global context (Jam et al. 2021). Machine learning (ML)-based inpainting techniques have also been around for decades (Efros and Leung 1999; Bertalmio et al. 2000), with recent leaps in their technical proficiency largely attributed to the use of increasingly sophisticated convolutional neural networks (CNNs) and generative adversarial networks (GANs) trained on large input datasets (Yan et al. 2018; Chen et al. 2021; Demir and Unal 2018). These advancements have notably enhanced the capability of generative deep learning models to fill in missing portions of high-resolution imagery with large image gaps (Liu et al. 2018; Yu et al. 2018; Yeh et al. 2017; Elharrouss et al. 2020; Guillemot and Le Meur 2014).
In particular, a specific subclass of CNNs known as U-Nets have demonstrated promise for learning subtle, latent features within images to better characterize objects in a scene (Zhou et al. 2018; Huang et al. 2020). These techniques have recently been adopted in the atmospheric sciences for inpainting regions of radar beam blockage, radar measurement defects, and simulated spaceborne blind zones, by learning about latent features in the surrounding image to intelligently “fill in” missing gaps (Lops et al. 2021; Pondaven et al. 2022; Tan et al. 2023). However, to our knowledge, there have been no detailed studies investigating the use of these techniques on the CPR blind zone or their relation to high-latitude cloud, atmospheric reanalysis, and snowfall.
Applying an ML-based inpainting model to spaceborne data retrieved by the CPR initially requires a model trained on an extensive, similar wavelength surface dataset with a clear blind zone for assessing model performance and robustness. Building on the work of Geiss and Hardin (2021), our proposed solution involves the development of a 3Net+-style convolutional neural network, trained on surface radar and corresponding atmospheric data, to inpaint a 1.2-km simulated surface blind zone at two sites in northern Alaska. Comparing this model to traditional linear gap-filling techniques helps gauge the resilience of the ML-based methods and examine their applicability to CloudSat data in future work. The incorporation of additional atmospheric input channels from reanalysis allows for evaluation of the model’s ability to discern intricate, physically based relationships related to near-surface reflectivity. Explainability analyses offer an exploration into model behavior to understand key inputs contributing to model accuracy, with the goal of enhancing future snowfall retrieval algorithms. This proof-of-concept, bottom-up study paves the way for a universal spaceborne blind zone inpainting algorithm for operational use in current and future snowfall retrieval missions.
The main objectives of this work are therefore to
- Develop a U-Net to reconstruct a multiyear surface radar simulated blind zone at two Arctic locations;
- Assess the U-Net’s pixel- and scene-level inpainting proficiency compared to linear methods and evaluate its capability to accurately identify near-surface cloud, shallow snowfall, and virga;
- Determine whether incorporating atmospheric data via a multichannel approach improves the performance of the model and identify the most important input image features for reconstructing the radar blind zone;
- Examine model inpainting skill when applied to site-coincident spaceborne reflectivity observations from CloudSat.
2. Data
a. Study sites
Surface meteorology observations and vertical profiling radar measurements were collected from two U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) climate research facilities along the northern Alaskan coast over 6.5 years. The North Slope Alaska (NSA) atmospheric observing site is situated 8 m above sea level at (71.323°, −156.615°) with an observational record stretching from 2012 to 2018 (53 total months). The second location, Oliktok Point (OLI), is another coastal facility situated just 2 m above sea level, 260 km from NSA at (70.495°, −149.886°), with data spanning 2015–18 (25 months). The locations of each site are shown in Fig. 1a, with additional site details summarized in Table 1.
Table 1. Summary of study site locations, measurement periods, and sample sizes.
Each research facility was equipped with dozens of instruments for continuous monitoring of clouds, aerosols, precipitation, energy, and other meteorological variables, including surface temperature sensors, precipitation gauges, and Ka-band ARM Zenith Radar (KaZR) instruments. Across both sites, there are a combined 78 total months of data (Fig. 1b), with 41 of these months falling within the cold season (i.e., months where surface meteorology observations of daily maximum 2-m air temperature remain below 2°C for the entire month). This cold season dataset provides an extensive data record containing a diverse set of cloud structures observed at both locations, used later to evaluate model performance in reconstructing cases of shallow snowfall and virga.
b. Model inputs
1) Vertical profiling radar
Prior to applying an inpainting model to CloudSat’s spaceborne measurements, it is crucial to have a similar radar dataset without a 1.2-km blind zone for training. The KaZR active remote sensing of clouds (KaZR-ARSCL) value-added product (VAP), which utilizes well-characterized 94-GHz CloudSat-CPR observations, was developed by Kollias et al. (2019) for this purpose. This derived VAP recalibrated the reflectivity estimates for KaZR-ARSCL observations at both NSA and OLI using offsets derived from near-coincident CloudSat overpasses. The resulting product is called the KAZR-ARSCL-CLOUDSAT VAP, from which the “reflectivity_best_estimate” quality-controlled variable is used as ground truth training data (herein referred to as the KaZR VAP).
The idea of using CloudSat as a global radar calibration tool has been around since 2011 as described in Protat et al. (2011), with the general concept of using spaceborne radar for calibrating ground-based radars tracing its roots to even earlier work from Anagnostou et al. (2001). More recently, Kollias et al. (2019) refined and enhanced this calibration approach in the development of the NSA and OLI VAPs. The calibration methodology consists of a statistical comparison of mean reflectivity values between CloudSat and the surface radar for nonprecipitating ice clouds, using CloudSat overpasses within 200 km of each site with less than a 1-h time difference from the surface radar observations. A reflectivity frequency by altitude diagram is then constructed over a 6-month time period at each location for a variety of calibration offsets ranging between −15 and 15 dBZ (using increments of 0.1 dBZ). Root-mean-square error (RMSE) is then calculated between each radar profile for each calibration offset to determine which offset (when applied to the surface radar) provides the most representative profile that minimizes RMSE. Following this statistical calibration, the ground-based profiling radar systems are accurate within 1–2 dB. Additional details describing this process and its associated uncertainties can be found in section 2.2 of Kollias et al. (2019).
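To make the offset-selection procedure concrete, the following is a minimal NumPy sketch of the sweep described above; the array names, the simple CFAD construction, and the matched nonprecipitating ice-cloud inputs are illustrative assumptions rather than the exact implementation of Kollias et al. (2019).

```python
import numpy as np

def best_calibration_offset(kazr_dbz, cpr_dbz, bins=np.arange(-60.0, 30.5, 1.0)):
    """Sweep calibration offsets (-15 to 15 dBZ in 0.1-dBZ steps) and return
    the offset minimizing the RMSE between reflectivity-frequency-by-altitude
    diagrams (CFADs) of the surface radar and CloudSat.
    kazr_dbz, cpr_dbz: (n_profiles, n_altitude_bins) matched reflectivity
    samples from nonprecipitating ice clouds (illustrative shapes)."""
    offsets = np.arange(-15.0, 15.05, 0.1)
    n_alt = cpr_dbz.shape[1]

    def cfad(dbz):
        # Normalized reflectivity frequency per altitude bin.
        counts = np.stack([np.histogram(dbz[:, k], bins=bins)[0] for k in range(n_alt)])
        return counts / max(counts.sum(), 1)

    reference = cfad(cpr_dbz)
    rmse = [np.sqrt(np.mean((cfad(kazr_dbz + off) - reference) ** 2)) for off in offsets]
    return offsets[int(np.argmin(rmse))]
```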
The KaZR VAP has a temporal resolution of 4 s and a vertical range resolution of 30 m and extends up to 18 km above the surface (with the first bin at 160 m). The lowest 10 km of KaZR radar data are used in this study, as this layer contains the vast majority of regional cloud activity. The sensitivity range of the calibrated KaZR dataset is [−60, 30] dBZ (additional details in Table 2). The CloudSat-calibrated KaZR product is only available during periods when there were both KaZR observations and nearby monthly CloudSat measurements available for calibration, which results in 6.5 years of available data (i.e., millions of vertical profiles that can be used during the model training process).
Table 2. Model input summary descriptions and associated dataset sources.
2) Surface meteorology
Surface temperature data are also collected from the ARM surface meteorology system instrument suites at both NSA and OLI to constrain the test set of observations to the cold season. Temperature data are collected in 1-min intervals using Vaisala temperature sensors installed at 2-m height and are temporally matched to the nearest minute of KaZR data at each location (Ritsche 2011). During overlapping observational periods, NSA and OLI exhibit very similar maximum surface temperatures (Fig. 1b), with most winter months (October–April) displaying maximum monthly temperatures below 2°C [i.e., periods where precipitation occurrence is primarily in a solid phase (Liu 2008)].
3) Atmospheric reanalysis
Reanalysis estimates of temperature, specific humidity, and wind speed (both u and v components) from the fifth major global reanalysis produced by ECMWF (ERA5; Hersbach et al. 2020) were also included as inputs to provide additional context to the model. While reflectivity information is typically the primary source of information when retrieving certain surface properties, like snow accumulation, additional atmospheric context can be useful in distinguishing between cases with similar vertical reflectivity structures (King et al. 2022a,b; Pettersen et al. 2021; Shates et al. 2021; Mateling et al. 2023).
ERA5 atmospheric climate variables are hourly products produced at 0.25° × 0.25° spatial resolution (Schulzweida 2022). Here, the ERA5 grid cell containing each study site’s latitude and longitude was selected. Comparisons to radiosonde temperature and humidity observations were also performed (not shown) to confirm that no substantial biases existed in the ERA5 estimates at each location. All ERA5 variables and the KaZR dataset are summarized in Table 2. Since the primary objective of this work is to train a model using surface data which can then be easily applied to similarly calibrated CloudSat-CPR observations, the models examined here are constrained to 2D training variables that are already spatiotemporally aligned to CPR observations. These datasets include the derived CPR products and the ECMWF auxiliary (ECMWF-AUX) product, an intermediate product containing ancillary ECMWF state variable data that has already been interpolated to each CPR bin (Cronk 2017).
c. Data preprocessing
Before training the CNN, both the ERA5 and KaZR datasets are vertically aligned to 128 bins up to 10 km above the surface. This is achieved by downsampling the KaZR data from 333 bins and upsampling the ERA5 data from 20 pressure levels via linear interpolation, yielding a common 78-m bin resolution across all products and a 128-pixel vertical dimension that can easily be adapted for model training. Initial tests using other vertical resolutions (e.g., 256 pixels) were also considered, but 128 pixels yielded similar performance with faster training times and matched more closely with CloudSat’s 125 vertical bins for future applications. ERA5 data are then temporally aligned to the nearest hour for each KaZR reflectivity profile, and the data are chunked into 128-bin temporal slices (each representing 8.5 min) to produce the final 128 × 128 chunks used for model training. For testing longer periods, such as multihour overpasses from CloudSat, a sliding horizontal 64-pixel window can be employed to combine multiple inpainted predictions in a single pass.
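The following is a minimal sketch of this vertical regridding and temporal chunking, assuming height-ordered NumPy arrays; the function and variable names are illustrative rather than taken from the project code.

```python
import numpy as np

N_BINS = 128   # common vertical grid (~78-m spacing up to 10 km)
N_TIMES = 128  # profiles per training chunk (~8.5 min of KaZR data)

def regrid_column(values, src_heights_km, top_km=10.0):
    """Linearly interpolate one vertical column onto the common 128-bin grid;
    handles both KaZR downsampling (333 bins) and ERA5 upsampling (20 pressure
    levels). src_heights_km must be monotonically increasing."""
    target = np.linspace(0.0, top_km, N_BINS)
    return np.interp(target, src_heights_km, values)

def chunk_scene(field, stride=N_TIMES):
    """Split a (128, n_profiles) field into 128 x 128 chunks; a stride of 64
    gives the overlapping sliding window used for long CloudSat-style scenes."""
    n_prof = field.shape[1]
    return [field[:, i:i + N_TIMES] for i in range(0, n_prof - N_TIMES + 1, stride)]
```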
3. Methods
a. Inpainting techniques
Simple repeating extrapolation and marching average inpainting techniques are used as baseline methods for comparison against the U-Net-style approaches. The aim is to discern if the simpler, more interpretable methods, which do not require an expensive training process, perform comparably to sophisticated CNNs. The repeating extrapolation (REP) method is a basic technique that duplicates the reflectivity value at the blind zone threshold vertically 16 times down to the lowest surface range gate. This simplistic method lacks vertical frequency information in the reconstructed blind zone, often resulting in unphysical vertical reflectivity streaks (but can perform well in cases of homogeneous cloud). Marching averages (MAR), akin to REP, use a top-down approach, creating a smoother product by calculating the average of the four reflectivity bins above each point in the blind zone column for each of the 16 bins. Both methods predominantly rely on information at, or directly above, the blind zone threshold.
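A minimal sketch of the two baselines is given below, assuming a 128 × 128 scene with row 0 at the top of the profile and the lowest 16 rows masked as the blind zone (the names and orientation are our own convention, not from the original implementation).

```python
import numpy as np

BZ = 16  # vertical bins in the 1.2-km blind zone

def inpaint_rep(scene):
    """Repeating extrapolation (REP): copy the reflectivity at the blind
    zone threshold straight down through the lowest 16 range gates."""
    out = scene.copy()
    out[-BZ:, :] = scene[-BZ - 1, :]  # bin directly above the blind zone
    return out

def inpaint_mar(scene, k=4):
    """Marching average (MAR): fill top-down, setting each blind zone bin
    to the mean of the k bins directly above it (including bins already
    filled on previous steps), which smooths the reconstruction."""
    out = scene.copy()
    for row in range(out.shape[0] - BZ, out.shape[0]):
        out[row, :] = out[row - k:row, :].mean(axis=0)
    return out
```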
Three U-Net-style CNNs, including the UNet++ from Zhou et al. (2018) as well as two more recently developed and computationally efficient 3Net+ models, are also compared to REP and MAR to evaluate the performance benefits of using a nonlinear inpainting algorithm. The two 3Net+ models are newer iterations of the UNet++ architecture developed by Huang et al. (2020) and include both a single-channel model (denoted 3+_1) and a multichannel model (3+_5). The 3+_1 model is trained using only KaZR reflectivity data as an input, while 3+_5 is trained using both the reflectivity data and ERA5 state variables to evaluate the potential benefits of the additional atmospheric contextual information in the inpainting process. All models examined in this work are summarized in Table 3.
Table 3. Summary of inpainting models, algorithm schemes, predictor sets, and input/output shapes. Inputs include reflectivity r, temperature t, specific humidity q, wind speed u component u, and wind speed v component v.
Table 4. Summary details of hardware used for model training using TensorFlow v2.5. vCPU: virtual central processing unit.
Table 5. Hyperparameter sweep summary details and final tuned values.
Two loss functions were evaluated during the model training phase: a weighted mean absolute error (MAE) loss calculated across the 16 × 128 pixels in the masked blind zone of the reflectivity channel and a hybrid loss based on Huang et al. (2020), which consolidates pixel-, patch-, and scene-level losses into one function. MAE was selected as the primary loss, as it yielded similar performance to the more complex hybrid loss in early tests but with greater stability. Additionally, Monte Carlo dropout (MCDropout) is applied at inference (n = 50 iterations), as outlined in Gal and Ghahramani (2016), to derive a Bayesian approximation of model uncertainty for the 3Net+ models. While not physically based, this technique does provide some insight into where the models have low confidence in an inpainted prediction. Using the ensemble’s mean prediction and standard error, an average inpainted blind zone is produced along with a corresponding uncertainty estimate, with areas of high standard error (>1 SE) masked. Last, the results discussed in subsequent sections are based on the unseen test set of observations.
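The TensorFlow sketch below illustrates these two training-phase pieces, an unweighted variant of the blind zone MAE loss and MCDropout inference; the channel ordering, profile orientation, and the exact weighting of the loss are assumptions rather than the project’s implementation.

```python
import tensorflow as tf

BZ = 16  # blind zone depth: lowest 16 of 128 vertical bins

def blind_zone_mae(y_true, y_pred):
    """MAE over the 16 x 128 masked blind zone of the reflectivity channel
    (assumed channel 0, rows ordered top-of-profile first); the weighted
    variant used in training would scale this term before reduction."""
    return tf.reduce_mean(tf.abs(y_true[:, -BZ:, :, 0] - y_pred[:, -BZ:, :, 0]))

def mcdropout_predict(model, x, n=50):
    """MCDropout inference (Gal and Ghahramani 2016): keep dropout active
    (training=True) over n forward passes and summarize the ensemble by its
    mean and standard error; high-SE pixels can then be masked downstream."""
    preds = tf.stack([model(x, training=True) for _ in range(n)], axis=0)
    mean = tf.reduce_mean(preds, axis=0)
    se = tf.math.reduce_std(preds, axis=0) / tf.sqrt(float(n))
    return mean, se
```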
b. Fidelity metrics
We also evaluate model skill in terms of predicting the location of the KaZR VAP’s lowest activity-filled reflectivity bin, to gauge how effectively the models inpaint near-surface gaps and descended clouds, thereby assessing cloud structure within the blind zone. Furthermore, the probability of detection (POD), success rate (SR), and critical success index (CSI) are compared between each model for classifying instances of shallow cumuliform snowfall and virga within the blind zone. Definitions of each of these metrics can be found in section c.1 of Chase et al. (2022). A simple snowfall classification scheme is applied here, where a −20-dBZ cutoff is used to identify potential precipitating layers (i.e., reflectivity ≥ −20 dBZ) both directly above the 1.2-km blind zone threshold and in the lowest available KaZR VAP range gate bin at 160 m. Cases where it is snowing above the blind zone but not at the surface are classified as virga, and the reverse cases are classified as shallow snowfall.
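A minimal sketch of this classification scheme and the associated contingency scores is shown below; the profile orientation, sentinel values, and helper names are illustrative assumptions.

```python
import numpy as np

CUTOFF = -20.0    # dBZ threshold for a potentially precipitating layer
NO_CLOUD = -60.0  # assumed clear-sky sentinel value

def lowest_active_bin(profile_dbz):
    """Index of the lowest activity-filled bin in a column (row 0 = top);
    returns -1 for a fully clear column."""
    active = np.flatnonzero(profile_dbz > NO_CLOUD)
    return int(active[-1]) if active.size else -1

def classify_profile(dbz_above_bz, dbz_surface):
    """Label a profile from reflectivity directly above the 1.2-km blind
    zone threshold and at the lowest available range gate (160 m)."""
    above = dbz_above_bz >= CUTOFF
    below = dbz_surface >= CUTOFF
    if above and not below:
        return "virga"
    if below and not above:
        return "shallow"
    return "snowing" if above else "clear"

def pod_sr_csi(hits, misses, false_alarms):
    """Standard contingency-table scores (see section c.1 of Chase et al. 2022)."""
    pod = hits / (hits + misses)
    sr = hits / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, sr, csi
```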
c. Explainability
While it is infeasible to fully comprehend the U-Net behavior with millions of trainable parameters, gaining insights into some of its high-level features is still crucial for interpreting results. Associating macroscale model decisions with physically consistent environmental factors can increase model confidence and verify that correct results are generated for the correct reasons. Additionally, identifying the most significant input areas and channels contributing to the model’s effectiveness can guide the development of future models and retrievals. To that end, multiple prevalent explainability techniques are applied to glean insights into the behavior of the U-Net models.
Quantifying the impact of individual input channels on model skill involves performing a training sweep of n = 16 subset channel models, composed of a single radar channel (i.e., the same model as 3+_1) plus all combinations of radar and temperature, specific humidity, wind (u component), and wind (v component). These subsets represent all possible input channel combinations and follow directly from a standard binomial coefficient calculation over the four ancillary ERA5 variables. It is then possible to observe the corresponding training curves and validation loss after a short training run (n = 25 epochs) to compute an estimate of the marginal contribution of each predictor to a reduction in MAE. To maintain reasonable training times, a data subset covering the year 2016 at both NSA and OLI is used to perform these experiments.
While sophisticated feature importance analyses often employ techniques like Shapley value calculations (SHAP) from Lundberg et al. (2020) or local interpretable model-agnostic explanations (LIME) from Ribeiro et al. (2016), these methods are not readily compatible with the U-Net structure and its separate channel predictors. This necessitates computing independent importance approximations. While not as comprehensive as a full SHAP analysis, the drop column (more accurately, drop channel) importance scores calculated here are well validated in prior work (Fu et al. 2022; Molnar et al. 2021; Parr et al. 2020; Altmann et al. 2010) and still provide insights into the relative marginal contribution of each predictor used in the model. However, it is worth noting that these scores do not capture potential nonlinear interactions between inputs.
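As a concrete rendering of this procedure, the sketch below enumerates the 16 predictor subsets and computes a simple unweighted drop-channel score from their final validation losses; the exact weighting applied to the marginal scores in this work may differ.

```python
from itertools import combinations

ANCILLARY = ["t", "q", "u", "v"]  # ERA5 channels added to the radar channel "r"

def channel_subsets():
    """All 2^4 = 16 predictor sets: radar alone plus every ancillary combination."""
    for r in range(len(ANCILLARY) + 1):
        for combo in combinations(ANCILLARY, r):
            yield frozenset(("r",) + combo)

def drop_channel_importance(val_loss):
    """Unweighted drop-channel score: the mean increase in final validation
    loss when a channel is removed from every subset that contains it.
    val_loss: dict mapping each frozenset from channel_subsets() to its
    validation MAE after the short (25 epoch) training run."""
    scores = {}
    for ch in ANCILLARY:
        deltas = [val_loss[s - {ch}] - val_loss[s] for s in val_loss if ch in s]
        scores[ch] = sum(deltas) / len(deltas)
    return scores
```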
Additionally, pixel attribution vanilla gradient saliency maps are examined, inspired by the work of Simonyan et al. (2014), to gain further insight into areas the model identifies as crucial contributors to high inpainting accuracy for a given input. These saliency maps are generated by passing an image through the network and extracting the gradients of the output with respect to the input across all channels, using a gradient tape to monitor a single backpropagation pass (Kim et al. 2019). These gradients are then directly mapped to a 128 × 128 pixel heatmap and compared to each of the input channels. While simplistic, this method is particularly useful for visualizing which parts of the observed image are deemed significant when inpainting the blind zone reflectivity values, allowing for direct plotting of the activation gradients (Szczepankiewicz et al. 2023).
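A minimal TensorFlow sketch of this saliency computation follows; the output slicing (blind zone rows of an assumed reflectivity channel 0) is an assumption about the model’s output layout.

```python
import tensorflow as tf

def vanilla_saliency(model, x, bz=16):
    """Vanilla gradient saliency (Simonyan et al. 2014): gradients of the
    mean inpainted blind zone with respect to the input, captured with a
    gradient tape over a single backpropagation pass.
    x: (1, 128, 128, n_channels) input; returns per-channel |gradient| maps."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        pred = model(x, training=False)
        # Assumed layout: reflectivity in channel 0, blind zone in the
        # lowest bz rows of the vertical dimension.
        target = tf.reduce_mean(pred[:, -bz:, :, 0])
    grads = tape.gradient(target, x)
    return tf.abs(grads)[0]  # (128, 128, n_channels) heatmaps per input channel
```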
4. Results
a. Inpainted blind zone case studies
The evaluation of each model’s proficiency starts with an analysis of several case study illustrations. These depict common cloud patterns both above and beneath the blind zone thresholds at NSA and OLI. These examples highlight the strengths and weaknesses of traditional, linear inpainting methods when compared with the more sophisticated U-Net-style reconstructions. As noted in section 3a, hatched regions in the blind zone represent areas with considerable uncertainty within the U-Net.
In Fig. 4, REP and MAR copy the signal and intensity at the blind zone threshold and perform well if that signal does in fact extend down to the surface (i.e., deep, homogeneous reflectivity profiles). However, cases of reflectivity gradients below the blind zone threshold, or multilayer cloud structures, cannot be resolved using these techniques. In many cases, the U-Net models can capture these complex structures at both locations. Because the MAE loss used during U-Net training minimizes pixel-scale error, the inpainted cloud is often slightly blurry in appearance (a common feature of these types of models); however, the larger-scale features are preserved.
Figure 5 brings attention to a further set of precipitating cloud cases that frequently occur in Arctic regions but are challenging for traditional inpainting methods to resolve, specifically the presence of shallow MPCs and gaps in near-surface cloud coverage (Kulie et al. 2016; McIlhattan et al. 2020; Pettersen et al. 2018). If no activity is detected directly at or above the 1.2-km threshold, neither REP nor MAR methods can yield accurate predictions for shallow MPC, a phenomenon commonly observed across the Arctic for most of the year (Shupe et al. 2006). Furthermore, the U-Nets demonstrate superior performance in predicting instances of gaps in cloud coverage between the 1.2-km threshold and the surface, such as potential virga cases.
Figure 6 highlights the challenges often faced by the U-Net to accurately identify cases of shallow snowfall using information from distant clouds. In both Figs. 6a and 6b, the reflectivity profiles of the cloud aloft appear very similar. However, in Fig. 6b, a near-surface cloud is present, a feature that is absent in Fig. 6a. Notably, none of the linear methods can resolve this feature, which is expected due to their reliance on blind zone threshold reflectivity values. Nevertheless, when coupled with atmospheric data from ERA5, the 5-channel 3Net+ can often detect such cloud structures and potential shallow cumuliform snowfall cases.
b. Model robustness
To evaluate the general performance of each model, an examination of their average ability to correctly produce reflectivity structures within the blind zone throughout all winter periods is conducted. Vertical power spectral density curves (Fig. 7a) show a 60% enhancement in the 3+_5 model’s capability to model small-scale variability in blind zone reflectivity estimates, in comparison to MAR (REP does not appear in Fig. 7a because the inpainted reflectivity values lack any vertical frequency information). The 3Net+ models exhibit a slight low bias on average (relative to the UNet++); however, all U-Nets are closer in scale and shape to the KaZR observations than the traditional, linear inpainting methods.
The U-Net models also demonstrate a greatly enhanced ability in capturing the observed lowest reflectivity bin, as illustrated in the probability density distribution plot in Fig. 7b. Near-surface bin estimates in the 3+_5 model typically fall within ±1 bin of the true location, making it the closest on average across all tested models. Conversely, MAR and REP are typically off by 20 bins (1500 m) on average. Generally, REP or MAR might not be expected to capture these trends effectively, as they cannot account for discontinuous clouds in the blind zone. Despite this, the strong performance of the U-Nets within the blind zone holds promise, potentially aiding in classifying near-surface snowfall events that would otherwise go undetected.
Figures 7c–g summarize the reconstructed blind zone fine-scale variability, with the structure of the 2D histograms of reflectivity appearing much closer to ground truth when using the 3Net+. While the linear inpainting methods can capture the general macroscale reflectivity gradient present at each site (depicted in the left-to-right intensity gradients of each heatmap), they entirely miss the localized clusters of reflectivity that represent shallow clouds and common locations of cloud gaps. A slight low reflectivity bias is again noted in the 3Net+ models (most prevalent in the 3+_5 model), likely because the model predicts closer to the negative end of the spectrum (−60 dBZ or “no cloud”), which is the safer, more commonly observed case at these sites, compared to predicting the higher-intensity positive reflectivity values of extreme (but far less common) winter blizzards.
Figure 8a shows the average monthly MAE for each model as a measure of pixel-level accuracy, with overall average MAE comparisons shown in Fig. 8b. On average, the 3+_5 and REP models perform the best, with many of the high error cases in the U-Net stemming from incorrectly hallucinated cloud, coupled with a tendency for estimates to be slightly too low on average (most notably in December). However, even a slightly misplaced cloud can visually resemble the ground truth in shape and intensity but still result in a high MAE. Therefore, scene-level accuracy [i.e., the Sørensen–Dice coefficient (DSC)] is also important to consider alongside MAE. Notably, the 3+_5 model excels over the 3+_1 model during late winter/early spring (with an overall reduction in MAE of 6%), where there is a clear divergence in MAE performance, suggesting some benefits from the ERA5 data for inpainting during this period.
Scene-level DSC scores are summarized in Figs. 8c and 8d for each month and their overall average performance, respectively. The U-Nets significantly surpass the linear techniques in accurately predicting the shape and location of a cloud, with a 38% higher DSC score from the 3+_5 model over REP. A performance enhancement in the 3+_5 DSC scores over the single channel model is noted across nearly all months, further highlighting the benefit of using additional atmospheric covariates in the inpainting process for correctly positioning clouds.
While pixel- and patch-level accuracy metrics describe each model’s ability to reconstruct small-scale variability and cloud position, interest also lies in how well the models capture the presence of near-surface cloud, shallow snowfall, and virga within the blind zone. Figure 9 shows POD, SR, and CSI values for each model/class combination (where models positioned toward the top right indicate “better” performance). The three symbols in Fig. 9 (i.e., cloud, shallow, and virga) represent cases of detected hydrometeors anywhere within the blind zone (with activity at or above the blind zone threshold), detected hydrometeors in the blind zone at the surface with no activity at the blind zone threshold, and finally no detected hydrometeors at the surface with activity at the blind zone threshold, respectively. The 3+_5 model generally outperforms all others in terms of CSI scores, displaying an 18% increase in the detection of each of these cases compared to traditional techniques and a value that is typically less biased on average (i.e., closer to the one-to-one line). This translates to thousands of additional missed instances of shallow snowfall and virga when using the linear methods. Typically, the easiest detection task (i.e., is there a cloud somewhere in the blind zone) achieves the highest CSI score across all models, with shallow snowfall detection being the most challenging.
Shallow snowfall cases are intuitively the hardest to model due to the many similar-looking cases of distant clouds, some of which include near-surface precipitation while others have none (refer to Fig. 6). Due to this ambiguity, reflectivity alone is often insufficient to constrain near-surface estimates, and additional context (e.g., from ERA5) is needed. Recent investigations from Pettersen et al. (2018) and Shates et al. (2021) also come to similar conclusions regarding the tightly coupled nature of synoptic-scale processes and atmospheric circulation patterns that govern shallow snowfall processes. Interestingly, the UNet++ model performs better at shallow snowfall detection, likely due to its less conservative approach in predicting near-surface cloud and a reduced low bias relative to the 3Net+ models. Overall, the suite of U-Net models significantly surpasses the traditional techniques for classifying each case within the blind zone.
c. Feature importance and visual explanations
Using the drop channel importance methodology from section 3c, an ensemble of learning curves can be generated, as shown in Fig. 10a. Here, warmer colors represent model training runs with more predictors, with the deep red line representing the 3+_5 model and the black dashed line indicating 3+_1. Typically, models with more predictors learn faster and yield the lowest overall validation loss after 25 epochs. This trend is evident when analyzing the validation loss at the end of each 25 epoch run, as shown by the general downward slope of the bars to the right of Fig. 10b. Note that just adding more predictors does not always decrease loss, and certain combinations can lead to higher error (e.g., “tv” versus “t”). When calculating the drop column weighted marginal importance scores across each combination of these final validation loss values, wind speed data emerge as the most significant contributor to skill, leading to the highest marginal increase in MAE if dropped.
Understanding where the model focuses within an image can help connect inpainting behaviors to the physical properties of a cloud and its surrounding atmospheric state. For instance, key patterns in the 32 feature maps generated from the radar channel of a deep cloud case in the first encoder layer of the 3+_5 model are easily recognizable (Fig. 11). First, the model pays close attention to cloud edges (Fig. 11, panels 3, 5, 8, 11, 13, 27, 29), which may help it learn about the general cloud shape, depth, and position in the scene. Second, it strongly considers the gradients of reflectivity, especially near the 1.2-km blind zone threshold, likely to distinguish between increasing or decreasing precipitation intensities toward the surface (Fig. 11, panels 3, 6, 10, 19, 20, 22). Finally, many of the filters highlight the multilayer nature of the scene and the existence of a second cloud above the first (Fig. 11, panels 3, 6, 10, 13, 19, 20, 29). While these patterns may change based on the input case and individual interpretations may vary, these features commonly recur across the test dataset.
By plotting saliency map gradient values for a handful of examples (Fig. 12), it is possible to further compare the differences in importance between the 3+_5 and 3+_1 models. For multilayer clouds that intersect with the blind zone cutoff, both models focus on the top of the cloud and the 1.2-km boundary threshold, as these systems often continue to extend down to the surface with similar reflectivity intensities. Both models also typically focus on and around cloud gaps in deeper systems (e.g., Fig. 12a); however, a distinct halo of importance toward the tropopause is noticeable in the 3+_5 model. This recurring feature likely reflects the incorporation of upper-troposphere wind and humidity data into predictions of near-surface reflectivity. Interestingly, the 3+_1 model does not focus solely on the areas of high reflectivity in the scene but also on the regions around the cloud. This is observed in the cases of Figs. 12b and 12c, where it is postulated that the model is investigating whether this is a multicloud system by also looking elsewhere in the scene. A similar pattern is observed in the 3+_5 model but in different locations, presumably due to a higher reliance on the ERA5 data in these reflectivity-sparse cases. Last, for clouds with a pronounced horizontal component (as in Fig. 12d), both models focus on the cloud’s edge similarly, but the 3+_5 model also seems to consider clusters of parallel horizontal streaks to incorporate additional context from the ERA5 temperature and humidity data.
Some of the primary differences between the generated saliency maps can be intuited from a physical perspective based on the provided model inputs. Wind components, for example, provide information about the horizontal movement of air masses near the site. In the upper troposphere, wind patterns can hint at the large-scale atmospheric dynamics, such as the presence of high or low pressure systems, fronts, and jet streams. Combined with atmospheric temperature gradients (indicating the presence of atmospheric fronts, associated cloud formation, and precipitation) and humidity data (to understand the likelihood of cloud presence and hydrometeor intensity based on the moisture content of the atmosphere), the model can more accurately reconstruct near-surface reflectivity.
It is important to emphasize that these exploratory techniques do not provide a definitive answer to exactly why the model makes certain predictions, and human interpretation is still required. However, they do offer some insights into model behavior to produce less of a black box solution. In summary, as observed in the model intercomparisons, ERA5 atmospheric data appear to provide a slight edge to U-Net model predictions, and it appears that all individual channels provide some useful predictive contribution to skill.
5. Applications
We now consider applying these models to spaceborne observations to demonstrate that the trained U-Net continues to provide reasonable inpainted predictions, consistent with what has been shown in section 4. Since the KaZR radar is calibrated using coincident CloudSat observations, CloudSat was selected as the most logical source of spaceborne reflectivity observations as inputs to the model. Spaceborne reflectivity information is sourced from the CloudSat CPR’s R05 2B-GEOPROF Radar_Reflectivity product, which is derived from a combination of echo power and other ancillary datasets as described in Li et al. (2017).
For these comparisons, we consider two types of common wintertime snowfall overpasses at this location: a deep system and a shallow system. Both cases were randomly selected overpasses that crossed within a 50-km radius of NSA while observing the same storm system as the surface radar. Although both the KaZR and CPR are observing the same storm, it should be noted that CloudSat does not directly pass over NSA in either case. As a result, there are minor differences in the intensity of reflectivity reported by each instrument since they are observing different portions of the storm. Therefore, instead of focusing on the absolute error of each model’s reconstructed blind zone, we consider how well they capture the general structure of the storm as observed by the KaZR (similar to section 4b).
The inpainting models compared here (i.e., REP, MAR, and 3+_1) are identical to those presented earlier in this work (no additional training was performed). For simplicity, the 3+_1 model was selected for this proof-of-concept application to spaceborne radar, as it has fewer degrees of freedom than the 3+_5 model and its performance is generally more conservative. However, there exists a key difference in the shape of the inputs that needs to be considered. As CloudSat has a different vertical extent from that of the KaZR (i.e., CloudSat has 125 vertical bins up to 30 km, while the current models were trained using 128 vertical bins up to 10 km from the KaZR), we use the entire 125-bin vertical extent of CloudSat [with three empty (i.e., −60 dBZ) rows appended to the top of the profile to reach 128 vertical bins]. This adds additional uncertainty to the model, as the scaled height of the clouds will be different when the model is applied to CloudSat (additional details in section 6). The CloudSat data are otherwise treated in the exact same manner as the KaZR data, and a blind zone mask is generated between the surface and approximately 1.5 km above the ground, as this was manually identified as the location of the first clutter-free bin.
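A minimal sketch of this input preparation is given below, assuming a height-ordered (125, 128) CPR reflectivity array and the −60-dBZ clear-sky convention used during training; the function name is illustrative.

```python
import numpy as np

NO_CLOUD = -60.0  # dBZ clear-sky value used during training

def prepare_cloudsat_scene(cpr_dbz, clutter_bins=6):
    """Shape a 128-profile CloudSat segment for the KaZR-trained U-Net.
    cpr_dbz: (125, 128) reflectivity array with row 0 at the top of the
    profile. Pads 125 -> 128 vertical bins with empty rows at the top and
    masks the lowest clutter-contaminated bins (~1.5 km here) as clear sky."""
    pad = np.full((3, cpr_dbz.shape[1]), NO_CLOUD)
    scene = np.vstack([pad, cpr_dbz])          # (128, 128)
    scene[-clutter_bins:, :] = NO_CLOUD        # clutter-contaminated blind zone mask
    return scene[np.newaxis, ..., np.newaxis]  # (1, 128, 128, 1) model input
```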
a. Deep CloudSat overpass
A deep storm system was observed from an ascending CloudSat overpass (granule 52103), which came within 15 km southwest of NSA at 2233 UTC 12 February 2016. The overpass track is plotted on the map in Fig. 13a, with the red portion depicting CloudSat’s orbital path, the blue portion showing the NSA-coincident 128 CPR soundings, and the white point showing the geographically closest retrieved CPR sounding to NSA. This storm system exhibits an extended high-intensity reflectivity core (>0 dBZ) reaching 6–8 km above the surface, during a period with below 0°C atmospheric and surface temperatures, indicating likely surface snowfall (Fig. 13b). However, due to surface clutter interference, the lowest 6 CPR bins from the surface up to 1.5 km are contaminated and are therefore masked. Through the selection of 128 CloudSat soundings near the NSA site (blue points in Fig. 13a and marked by the dashed lines in Fig. 13b), we can extract a CloudSat-derived 128 × 128 pixel image that can then be fed into the various inpainting models.
The NSA KaZR was also operating during this period and displays a similar storm structure to that observed by CloudSat at 2233 UTC (Fig. 13c). The KaZR notes a similar cloud-top height of 6 km and high-intensity reflectivity values extending downward from 4 km toward the surface. Due to the distance between CloudSat and NSA, each radar is observing slightly different portions of the same storm, resulting in an approximate +10 dBZ increase in reflectivity intensity in the KaZR observations compared to the CPR. Aligning the CloudSat overpass from Fig. 13b to the KaZR observations produces Fig. 13d, which depicts the structure of the storm below the blind zone threshold (i.e., an increasing reflectivity gradient to 1 km and then a decreasing gradient from 1 km to the surface).
Using the inpainting models from the previous section and applying them to the aforementioned CloudSat data (i.e., Fig. 13e) produces Figs. 13f–h. Examining the inpainted regions, the REP and MAR models create unphysical and disjointed streaks of reflectivity toward the surface. The 3+_1 model displays the only reconstructed blind zone that replicates the reflectivity gradient observed by the reference KaZR. Further, when considering the reflectivity profile from the closest CloudSat sounding (i.e., the white dot in Fig. 13a and the white dashed lines in Figs. 13b,d–h), the reconstructed U-Net blind zone displays a much higher Pearson correlation with the KaZR (r3+_1 = 0.77) compared to MAR (rMAR = 0.45), suggesting that it is more accurately inpainting the near-surface storm structure for this case.
b. Shallow CloudSat overpass
The second CloudSat overpass we examine, a shallow cumuliform snowfall case, occurred 35 km northeast of NSA. This ascending overpass (granule 56661) occurred at 2226 UTC 21 December 2016 (Figs. 14a,b). The storm system displayed a consistent cloud-top height of 2.5 km in both the KaZR and CloudSat observations, with a −10-dBZ reflectivity value at the blind zone threshold coupled with below 0°C atmospheric and surface temperatures during this period. Similar to the deep case, there exists a band of increased reflectivity intensity within the blind zone from the KaZR, suggestive of surface snowfall (Figs. 14c,d).
Performing a similar blind zone inpainting analysis to the previous deep case, we note that when given Fig. 14e as input, the MAR and REP models produce a less physically consistent inpainted blind zone than the U-Net. In this case, the REP model produces unphysical streaks of reflectivity and cloud gaps toward the surface, while the MAR model strongly underestimates the reflectivity values below the blind zone threshold due to the thin cloud layer and surrounding clear-sky conditions reducing the average reflectivity. The 3+_1 model displays the reconstruction most consistent with that of the KaZR, with an increased reflectivity band around 1 km that decreases in intensity from 1 km to the surface. If we consider the nearest reflectivity profile between CloudSat and NSA, we again note a strong increase in Pearson correlation using the U-Net model over the traditional techniques (i.e., r3+_1 = 0.74 and rMAR = 0.13), suggesting that the U-Net continues to display enhanced skill in reconstructing near-surface reflectivity structures when extended to shallow snowfall cases observed from spaceborne radar.
6. Discussion and conclusions
This study introduces a deeply supervised U-Net-style CNN designed to address the challenge faced by spaceborne radar in detecting near-surface hydrometeor activity. The uncertainty produced from ground clutter interference in the lowest 1–2 km of the atmosphere is a major source of retrieval error in current spaceborne systems. The 3Net+ model produced here effectively predicts cold season reflectivity profiles in simulated blind zone regions, integrating latent cloud features above the blind zone with atmospheric state variables like temperature, humidity, and wind speed. Comparative analyses show that this CNN surpasses traditional linear extrapolation methods in accuracy, with a wintertime reduction in mean absolute error, a 38% improvement in the Sørensen–Dice coefficient, and vertical reflectivity distributions that are 60% closer to observed values. Furthermore, the U-Net demonstrates enhanced capability in detecting near-surface clouds and shallow snowfall events. Explainability analyses emphasize the crucial role of both reflectivity and atmospheric state variables in model efficacy. First impressions using CloudSat-CPR observations as inputs to the U-Net display promising results consistent with earlier simulated blind zone tests, highlighting the operational potential of these models for enhancing future spaceborne precipitation missions.
However, despite the demonstrated superiority of U-Net-style models in predicting the location and intensity of near-surface cloud compared to traditional methods, these models are not infallible and can frequently hallucinate to generate false positives (Huang et al. 2020). For this reason, additional investigations into the use of model uncertainties should be performed with the goal of establishing a threshold of uncertainty below which prediction information can be deemed reliable. Further, the decision was made to not use ERA5 vertical wind velocity information in this analysis since it is not currently available in the ECMWF-AUX product (Cronk 2017). However, it would be worthwhile to also incorporate the vertical component of the wind data from ERA5 to better capture convective updrafts and downdrafts within the system (Ikeda et al. 2010; Weisman and Klemp 1982).
The way in which the 3+_5 model incorporates the ERA5 atmospheric data in combination with the KaZR VAP is intriguing. However, the prominence of wind data could pose a problem if the model only recognizes coastal wind patterns. If applied to new regions present in spaceborne data (given that CloudSat is mobile), differing wind patterns might lead to inconsistencies and errors in model predictions. Another concern in this application of transfer learning stems from the fact that the KaZR VAP’s horizontal sampling depends on the speed of advection, while CloudSat’s is constant. Thus, to further assess the robustness of these methods, the U-Nets should be tested on additional datasets outside the region, such as those from vertically pointing radar systems in the Canadian Arctic Archipelago (e.g., Eureka, Resolute Bay), as well as sites further east (e.g., Greenland’s Summit Station) and others available throughout northern Europe before being applied to CloudSat.
Implementing additional tests using GANs might also prove beneficial to determine if a different network architecture results in better pixel-level and scene-level accuracy. GAN architecture differs from the U-Net in that two networks, a generator and discriminator, compete against one another during training to produce new synthetic instances of inpainted radar that can pass as being authentic (Goodfellow et al. 2020). GAN-based approaches have been considered in previous work by Geiss and Hardin (2021) and Tan et al. (2023) for radar inpainting, often producing a more realistic-looking inpainted image; however, this was found to be at the cost of increased per-pixel error levels. Early tests were performed in this project using a series of simple GAN architectures for inpainting at both OLI and NSA. However, as is common with this more complex training paradigm, model stability was found to be poorer (Aggarwal et al. 2021; Creswell et al. 2018), and the GANs were more challenging to converge in a physically consistent manner when compared to the U-Net architectures. Diffusion-based inpainting techniques may also be relevant to explore for radar applications, as they train more reliably than GANs and have also demonstrated skill in generating semantically meaningful predictions in regions with arbitrary masks (Lugmayr et al. 2022; Alt et al. 2022).
Early applications of the U-Net model to CloudSat-CPR observations near NSA appear consistent with our previous results tested on a simulated blind zone. However, additional tests are required to quantitatively evaluate the robustness of these models on spaceborne data. To further enhance the U-Net models going forward, the resolution of the images being trained on needs to be further reduced, due to CloudSat’s coarse vertical resolution (with only 41 vertical pixels of information from the surface to 10 km). With this reduced resolution, only 4–6 pixels would need to be inpainted instead of the currently predicted 16 vertical pixels. Additional resolution-matching techniques should therefore be compared to the methods used in section 5, to better understand how the vertical extent of CloudSat observations, and the scaling of cloud features, impacts model accuracy. Additionally, a more robust comparison methodology would need to be considered to accurately quantify model skill. Future tests should also consider the impacts of other atmospheric inputs to the model (e.g., ECMWF-AUX atmospheric variables). If performance is satisfactory in these cases, the model could then be expanded to the wider Arctic region to further assess its impact on Arctic snowfall quantities.
Improved representation of near-surface MPCs would have a substantial impact on the accuracy of current spaceborne remote sensing–based retrievals of snowfall (Morrison et al. 2012; Pettersen et al. 2021). Current large uncertainties stemming from the complex relationships between various macro- and microphysical processes coming together to form MPCs, combined with a limited number of observations across Arctic regions, have led to limited success in simulating these systems using traditional numerical models (Curry et al. 2000; Klein et al. 2009; Inoue et al. 2006). The findings of this study suggest that the 3Net+ offers a new perspective for detecting the presence of near-surface clouds, shallow snowfall cases, and virga events in radar blind zones in MPCs, given its ability to correctly position the lowest reflectivity bin and its more realistic vertical structure of reflectivity in the blind zone. The improved representation of near-surface cloud could therefore help reduce the underestimation of shallow snowfall from CloudSat as reported in Maahn et al. (2014), Bennartz et al. (2019), and McIlhattan et al. (2020), to improve our understanding of the Arctic’s water–energy budget.
However, hallucinations remain an issue (especially during the summer), and additional work should therefore investigate additional model inputs and ML architectures and use more training data to reduce current low prediction biases and further improve inpainted generalizability. The inpainted outputs from the 3Net+, along with their corresponding uncertainty estimates, could potentially enhance traditional retrieval methods, such as optimal estimation, by providing additional context about a critically important region in the radar profile, which is typically masked and ignored (Wood et al. 2014; Stephens et al. 2002).
While the CloudSat mission is nearing its end, upcoming generations of cloud profiling satellites, like European Space Agency’s EarthCARE satellite and the NASA Atmosphere Observing System, will still exhibit a surface blind zone, and therefore, similar snowfall retrieval uncertainties will persist (Lamer et al. 2020). Further, the relationships identified between the reflectivity data and the surrounding atmospheric state variables could be useful in enhancing future passive retrievals, particularly those heavily reliant on atmospheric climate parameters such as temperature and humidity. Ultimately, the integration of data-driven ML inpainting techniques not only unlocks valuable insights into current retrieval uncertainties but also paves the way for enhanced precision and confidence in future spaceborne snowfall estimates.
Acknowledgments.
This study was primarily supported by a NASA New (Early Career) Investigator Program (NIP) Grant 80NSSC22K0789, with additional support provided by the Natural Sciences and Engineering Research Council of Canada (577912). We thank the U.S. Department of Energy, the Atmospheric Radiation Measurement Climate Research Facility, and the European Centre for Medium-Range Weather Forecasts for providing access to the data used in model training. We would also like to recognize the Microsoft Corporation for providing access to their Azure computing cluster. Finally, we thank Andrew Geiss and Joseph C. Hardin, whose earlier work in Geiss and Hardin (2021) was instrumental in the continued development of this project.
Data availability statement.
As an open-source project, the code used in model training and testing is publicly available on GitHub (https://github.com/frasertheking/blindzone_inpainting/) for others to use and adapt [example datasets have also been made publicly available on Google Drive (https://drive.google.com/drive/folders/1VpVWexR5soQkjcIWsuc6jf-hVBlgtptN?usp=share_link)]. Further, the KAZR-ARSCL-CLOUDSAT VAP and associated MET datasets at NSA and OLI are publicly available for download on the ARM data product repository (https://www.arm.gov/capabilities/science-data-products/vaps/kazrarsclcloudsat). ERA5 hourly data on single levels are also publicly available via the ECMWF Climate Data Store online service (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview).
REFERENCES
Aggarwal, A., M. Mittal, and G. Battineni, 2021: Generative adversarial network: An overview of theory and applications. Int. J. Inf. Manage. Data Insights, 1, 100004, https://doi.org/10.1016/j.jjimei.2020.100004.
Alt, T., P. Peter, and J. Weickert, 2022: Learning sparse masks for diffusion-based image inpainting. Pattern Recognition and Image Analysis, A. J. Pinho et al., Eds., Lecture Notes in Computer Science, Vol. 13256, Springer, 528–539, https://doi.org/10.1007/978-3-031-04881-4_42.
Altmann, A., L. Toloşi, O. Sander, and T. Lengauer, 2010: Permutation importance: A corrected feature importance measure. Bioinformatics, 26, 1340–1347, https://doi.org/10.1093/bioinformatics/btq134.
Anagnostou, E. N., C. A. Morales, and T. Dinku, 2001: The use of TRMM precipitation radar observations in determining ground radar calibration biases. J. Atmos. Oceanic Technol., 18, 616–628, https://doi.org/10.1175/1520-0426(2001)018<0616:TUOTPR>2.0.CO;2.
Bennartz, R., F. Fell, C. Pettersen, M. D. Shupe, and D. Schuettemeyer, 2019: Spatial and temporal variability of snowfall over Greenland from CloudSat observations. Atmos. Chem. Phys., 19, 8101–8121, https://doi.org/10.5194/acp-19-8101-2019.
Bertalmio, M., G. Sapiro, V. Caselles, and C. Ballester, 2000: Image inpainting. SIGGRAPH ’00: Proc. 27th Annual Conf. on Computer Graphics and Interactive Techniques, New Orleans, LA, ACM Press/Addison-Wesley Publishing Co., 417–424, https://doi.org/10.1145/344779.344972.
Buttle, J. M., and Coauthors, 2016: Flood processes in Canada: Regional and special aspects. Can. Water Resour. J./Rev. Can. Resour. Hydriques, 41, 7–30, https://doi.org/10.1080/07011784.2015.1131629.
Chase, R. J., D. R. Harrison, A. Burke, G. M. Lackmann, and A. McGovern, 2022: A machine learning tutorial for operational meteorology, Part I: Traditional machine learning. Wea. Forecasting, 37, 1509–1529, https://doi.org/10.1175/WAF-D-22-0070.1.
Chen, Y., H. Zhang, L. Liu, X. Chen, Q. Zhang, K. Yang, R. Xia, and J. Xie, 2021: Research on image inpainting algorithm of improved GAN based on two-discriminations networks. Appl. Intell., 51, 3460–3474, https://doi.org/10.1007/s10489-020-01971-2.
Creswell, A., T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, 2018: Generative adversarial networks: An overview. IEEE Signal Process. Mag., 35, 53–65, https://doi.org/10.1109/MSP.2017.2765202.
Cronk, H., 2017: CloudSat ECMWF-AUX auxiliary data product process description and interface control document. NASA, accessed 14 July 2023, https://www.cloudsat.cira.colostate.edu/data-products/ecmwf-aux.
Curry, J. A., and Coauthors, 2000: FIRE Arctic clouds experiment. Bull. Amer. Meteor. Soc., 81, 5–30, https://doi.org/10.1175/1520-0477(2000)081<0005:FACE>2.3.CO;2.
Demir, U., and G. Unal, 2018: Patch-based image inpainting with generative adversarial networks. arXiv, 1803.07422v1, https://doi.org/10.48550/arXiv.1803.07422.
Dietz, A. J., C. Kuenzer, U. Gessner, and S. Dech, 2012: Remote sensing of snow – A review of available methods. Int. J. Remote Sens., 33, 4094–4134, https://doi.org/10.1080/01431161.2011.640964.
Doesken, N. J., and D. A. Robinson, 2009: The challenge of snow measurements. Historical Climate Variability and Impacts in North America, L.-A. Dupigny-Giroux and C. J. Mock, Eds., Springer, 251–273, https://doi.org/10.1007/978-90-481-2828-0_15.
Duffy, G., F. King, R. Bennartz, and C. G. Fletcher, 2021: Seasonal estimates and uncertainties of snow accumulation from CloudSat precipitation retrievals. Atmosphere, 12, 363, https://doi.org/10.3390/atmos12030363.
Efros, A. A., and T. K. Leung, 1999: Texture synthesis by non-parametric sampling. Proc. Seventh IEEE Int. Conf. on Computer Vision, Kerkyra, Greece, Institute of Electrical and Electronics Engineers, 1033–1038, https://doi.org/10.1109/ICCV.1999.790383.
Elharrouss, O., N. Almaadeed, S. Al-Maadeed, and Y. Akbari, 2020: Image inpainting: A review. Neural Process. Lett., 51, 2007–2028, https://doi.org/10.1007/s11063-019-10163-0.
Fu, X., J. Peng, G. Jiang, and H. Wang, 2022: Learning latent features with local channel drop network for vehicle re-identification. Eng. Appl. Artif. Intell., 107, 104540, https://doi.org/10.1016/j.engappai.2021.104540.
Gal, Y., and Z. Ghahramani, 2016: Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. arXiv, 1506.02142v6, https://doi.org/10.48550/arXiv.1506.02142.
Geiss, A., and J. C. Hardin, 2021: Inpainting radar missing data regions with deep learning. Atmos. Meas. Tech., 14, 7729–7747, https://doi.org/10.5194/amt-14-7729-2021.
Gergel, D. R., B. Nijssen, J. T. Abatzoglou, D. P. Lettenmaier, and M. R. Stumbaugh, 2017: Effects of climate change on snowpack and fire potential in the western USA. Climatic Change, 141, 287–299, https://doi.org/10.1007/s10584-017-1899-y.
Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, 2020: Generative adversarial networks. Commun. ACM, 63, 139–144, https://doi.org/10.1145/3422622.
Gray, D. M., and P. G. Landine, 2011: An energy-budget snowmelt model for the Canadian prairies. Can. J. Earth Sci., 25, 1292–1303, https://doi.org/10.1139/e88-124.
Guillemot, C., and O. Le Meur, 2014: Image inpainting: Overview and recent advances. IEEE Signal Process. Mag., 31, 127–144, https://doi.org/10.1109/MSP.2013.2273004.
Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.
Huang, H., and Coauthors, 2020: UNet 3+: A full-scale connected UNet for medical image segmentation. arXiv, 2004.08790v1, https://doi.org/10.48550/arXiv.2004.08790.
Hudak, D., P. Rodriguez, and N. Donaldson, 2008: Validation of the CloudSat precipitation occurrence algorithm using the Canadian C band radar network. J. Geophys. Res., 113, D00A07, https://doi.org/10.1029/2008JD009992.
Ikeda, K., and Coauthors, 2010: Simulation of seasonal snowfall over Colorado. Atmos. Res., 97, 462–477, https://doi.org/10.1016/j.atmosres.2010.04.010.
Inoue, J., J. Liu, J. O. Pinto, and J. A. Curry, 2006: Intercomparison of Arctic regional climate models: Modeling clouds and radiation for SHEBA in May 1998. J. Climate, 19, 4167–4178, https://doi.org/10.1175/JCLI3854.1.
Jam, J., C. Kendrick, K. Walker, V. Drouard, J. G.-S. Hsu, and M. H. Yap, 2021: A comprehensive review of past and present image inpainting methods. Comput. Vis. Image Underst., 203, 103147, https://doi.org/10.1016/j.cviu.2020.103147.
Kim, B., J. Seo, S. Jeon, J. Koo, J. Choe, and T. Jeon, 2019: Why are saliency maps noisy? Cause of and solution to noisy saliency maps. 2019 IEEE/CVF Int. Conf. on Computer Vision Workshop (ICCVW), Seoul, South Korea, Institute of Electrical and Electronics Engineers, 4149–4157, https://doi.org/10.1109/ICCVW.2019.00510.
King, F., and C. G. Fletcher, 2020: Using CloudSat-CPR retrievals to estimate snow accumulation in the Canadian Arctic. Earth Space Sci., 7, e2019EA000776, https://doi.org/10.1029/2019EA000776.
King, F., and C. G. Fletcher, 2021: Using CloudSat-derived snow accumulation estimates to constrain gridded snow water equivalent products. Earth Space Sci., 8, e2021EA001835, https://doi.org/10.1029/2021EA001835.
King, F., G. Duffy, and C. G. Fletcher, 2022a: A centimeter-wavelength snowfall retrieval algorithm using machine learning. J. Appl. Meteor. Climatol., 61, 1029–1039, https://doi.org/10.1175/JAMC-D-22-0036.1.
King, F., G. Duffy, L. Milani, C. G. Fletcher, C. Pettersen, and K. Ebell, 2022b: DeepPrecip: A deep neural network for precipitation retrievals. Atmos. Meas. Tech., 15, 6035–6050, https://doi.org/10.5194/amt-15-6035-2022.
Klein, S. A., and Coauthors, 2009: Intercomparison of model simulations of mixed-phase clouds observed during the ARM Mixed-Phase Arctic Cloud Experiment. I: Single-layer cloud. Quart. J. Roy. Meteor. Soc., 135, 979–1002, https://doi.org/10.1002/qj.416.
Kodamana, R., and C. G. Fletcher, 2021: Validation of CloudSat-CPR derived precipitation occurrence and phase estimates across Canada. Atmosphere, 12, 295, https://doi.org/10.3390/atmos12030295.
Kollias, P., B. Puigdomènech Treserras, and A. Protat, 2019: Calibration of the 2007–2017 record of Atmospheric Radiation Measurements cloud radar observations using CloudSat. Atmos. Meas. Tech., 12, 4949–4964, https://doi.org/10.5194/amt-12-4949-2019.
Kulie, M. S., L. Milani, N. B. Wood, S. A. Tushaus, R. Bennartz, and T. S. L’Ecuyer, 2016: A shallow cumuliform snowfall census using spaceborne radar. J. Hydrometeor., 17, 1261–1279, https://doi.org/10.1175/JHM-D-15-0123.1.
Kulie, M. S., L. Milani, N. B. Wood, and T. S. L’Ecuyer, 2020: Global snowfall detection and measurement. Satellite Precipitation Measurement, V. Levizzani et al., Eds., Advances in Global Change Research, Vol. 69, Springer, 699–716, https://doi.org/10.1007/978-3-030-35798-6_12.
Lamer, K., P. Kollias, A. Battaglia, and S. Preval, 2020: Mind the gap – Part 1: Accurately locating warm marine boundary layer clouds and precipitation using spaceborne radars. Atmos. Meas. Tech., 13, 2363–2379, https://doi.org/10.5194/amt-13-2363-2020.
Li, L., S. Durden, and S. Tanelli, 2017: Level 1B CPR process description and interface control document. JPL Doc. D-20308, 24 pp., https://www.cloudsat.cira.colostate.edu/cloudsat-static/info/dl/1b-cpr/1B-CPR_PDICD.P_R04.20070627.pdf.
Liu, G., 2008: Deriving snow cloud characteristics from CloudSat observations. J. Geophys. Res., 113, D00A09, https://doi.org/10.1029/2007JD009766.
Liu, G., F. A. Reda, K. J. Shih, T.-C. Wang, A. Tao, and B. Catanzaro, 2018: Image inpainting for irregular holes using partial convolutions. arXiv, 1804.07723v2, https://doi.org/10.48550/arXiv.1804.07723.
Lops, Y., A. Pouyaei, Y. Choi, J. Jung, A. K. Salman, and A. Sayeed, 2021: Application of a partial convolutional neural network for estimating geostationary aerosol optical depth data. Geophys. Res. Lett., 48, e2021GL093096, https://doi.org/10.1029/2021GL093096.
Lugmayr, A., M. Danelljan, A. Romero, F. Yu, R. Timofte, and L. Van Gool, 2022: RePaint: Inpainting using denoising diffusion probabilistic models. 2022 IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, Institute of Electrical and Electronics Engineers, 11 451–11 461, https://doi.org/10.1109/CVPR52688.2022.01117.
Lundberg, S. M., and Coauthors, 2020: From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell., 2, 56–67, https://doi.org/10.1038/s42256-019-0138-9.
Maahn, M., C. Burgard, S. Crewell, I. V. Gorodetskaya, S. Kneifel, S. Lhermitte, K. Van Tricht, and N. P. M. van Lipzig, 2014: How does the spaceborne radar blind zone affect derived surface snowfall statistics in polar regions? J. Geophys. Res. Atmos., 119, 13 604–13 620, https://doi.org/10.1002/2014JD022079.
Mateling, M. E., C. Pettersen, M. S. Kulie, and T. S. L’Ecuyer, 2023: Marine cold-air outbreak snowfall in the North Atlantic: A CloudSat perspective. J. Geophys. Res. Atmos., 128, e2022JD038053, https://doi.org/10.1029/2022JD038053.
McIlhattan, E. A., C. Pettersen, N. B. Wood, and T. S. L’Ecuyer, 2020: Satellite observations of snowfall regimes over the Greenland Ice Sheet. Cryosphere, 14, 4379–4404, https://doi.org/10.5194/tc-14-4379-2020.
Mekis, E., N. Donaldson, J. Reid, A. Zucconi, J. Hoover, Q. Li, R. Nitu, and S. Melo, 2018: An overview of surface-based precipitation observations at Environment and Climate Change Canada. Atmos.–Ocean, 56, 71–95, https://doi.org/10.1080/07055900.2018.1433627.
Molnar, C., T. Freiesleben, G. König, G. Casalicchio, M. N. Wright, and B. Bischl, 2021: Relating the partial dependence plot and permutation feature importance to the data generating process. arXiv, 2109.01433v1, https://doi.org/10.48550/arXiv.2109.01433.
Morrison, H., and Coauthors, 2009: Intercomparison of model simulations of mixed-phase clouds observed during the ARM Mixed-Phase Arctic Cloud Experiment. II: Multilayer cloud. Quart. J. Roy. Meteor. Soc., 135, 1003–1019, https://doi.org/10.1002/qj.415.
Morrison, H., G. de Boer, G. Feingold, J. Harrington, M. D. Shupe, and K. Sulia, 2012: Resilience of persistent Arctic mixed-phase clouds. Nat. Geosci., 5, 11–17, https://doi.org/10.1038/ngeo1332.
Palerme, C., C. Claud, N. B. Wood, T. L’Ecuyer, and C. Genthon, 2019: How does ground clutter affect CloudSat snowfall retrievals over ice sheets? IEEE Geosci. Remote Sens. Lett., 16, 342–346, https://doi.org/10.1109/LGRS.2018.2875007.
Parr, T., J. D. Wilson, and J. Hamrick, 2020: Nonparametric feature impact and importance. arXiv, 2006.04750v1, https://doi.org/10.48550/arXiv.2006.04750.
Pettersen, C., R. Bennartz, A. J. Merrelli, M. D. Shupe, D. D. Turner, and V. P. Walden, 2018: Precipitation regimes over central Greenland inferred from 5 years of ICECAPS observations. Atmos. Chem. Phys., 18, 4715–4735, https://doi.org/10.5194/acp-18-4715-2018.
Pettersen, C., and Coauthors, 2021: The precipitation imaging package: Phase partitioning capabilities. Remote Sens., 13, 2183, https://doi.org/10.3390/rs13112183.
Pinto, J. O., 1998: Autumnal mixed-phase cloudy boundary layers in the Arctic. J. Atmos. Sci., 55, 2016–2038, https://doi.org/10.1175/1520-0469(1998)055<2016:AMPCBL>2.0.CO;2.
Pondaven, A., M. Bakler, D. Guo, H. Hashim, M. Ignatov, and H. Zhu, 2022: Convolutional neural processes for inpainting satellite images. arXiv, 2205.12407v1, https://doi.org/10.48550/arXiv.2205.12407.
Pörtner, H.-O., and Coauthors, 2019: The Ocean and Cryosphere in a Changing Climate. Cambridge University Press, 755 pp., https://doi.org/10.1017/9781009157964.
Protat, A., D. Bouniol, E. J. O’Connor, H. K. Baltink, J. Verlinde, and K. Widener, 2011: CloudSat as a global radar calibrator. J. Atmos. Oceanic Technol., 28, 445–452, https://doi.org/10.1175/2010JTECHA1443.1.
Ribeiro, M. T., S. Singh, and C. Guestrin, 2016: “Why should I trust you?”: Explaining the predictions of any classifier. arXiv, 1602.04938v3, https://doi.org/10.48550/arXiv.1602.04938.
Ritsche, M., 2011: ARM surface meteorology systems handbook. Tech. Rep. DOE/SC-ARM/TR-086, 19 pp., https://doi.org/10.2172/1007926.
Safont, G., A. Salazar, A. Rodriguez, and L. Vergara, 2014: On recovering missing ground penetrating radar traces by statistical interpolation methods. Remote Sens., 6, 7546–7565, https://doi.org/10.3390/rs6087546.
Schulzweida, U., 2022: CDO user guide. Zenodo, 215 pp., https://doi.org/10.5281/zenodo.7112925.
Shates, J. A., C. Pettersen, T. S. L’Ecuyer, S. J. Cooper, M. S. Kulie, and N. B. Wood, 2021: High-latitude precipitation: Snowfall regimes at two distinct sites in Scandinavia. J. Appl. Meteor. Climatol., 60, 1127–1148, https://doi.org/10.1175/JAMC-D-20-0248.1.
Shupe, M. D., S. Y. Matrosov, and T. Uttal, 2006: Arctic mixed-phase cloud properties derived from surface-based sensors at SHEBA. J. Atmos. Sci., 63, 697–711, https://doi.org/10.1175/JAS3659.1.
Simonyan, K., A. Vedaldi, and A. Zisserman, 2014: Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv, 1312.6034v2, https://doi.org/10.48550/arXiv.1312.6034.
Skofronick-Jackson, G., M. Kulie, L. Milani, S. J. Munchak, N. B. Wood, and V. Levizzani, 2019: Satellite estimation of falling snow: A Global Precipitation Measurement (GPM) core observatory perspective. J. Appl. Meteor. Climatol., 58, 1429–1448, https://doi.org/10.1175/JAMC-D-18-0124.1.
Stephens, G., D. Winker, J. Pelon, C. Trepte, D. Vane, C. Yuhas, T. L’Ecuyer, and M. Lebsock, 2018: CloudSat and CALIPSO within the A-Train: Ten years of actively observing the Earth system. Bull. Amer. Meteor. Soc., 99, 569–581, https://doi.org/10.1175/BAMS-D-16-0324.1.
Stephens, G. L., and Coauthors, 2002: The CloudSat mission and the A-Train. Bull. Amer. Meteor. Soc., 83, 1771–1790, https://doi.org/10.1175/BAMS-83-12-1771.
Stephens, G. L., and Coauthors, 2008: CloudSat mission: Performance and early science after the first year of operation. J. Geophys. Res., 113, D00A18, https://doi.org/10.1029/2008JD009982.
Szczepankiewicz, K., A. Popowicz, K. Charkiewicz, K. Nalecz-Charkiewicz, M. Szczepankiewicz, S. Lasota, P. Zawistowski, and K. Radlak, 2023: Ground truth based comparison of saliency maps algorithms. Sci. Rep., 13, 16887, https://doi.org/10.1038/s41598-023-42946-w.
Tan, S., H. Chen, S. Yao, and V. Chandrasekar, 2023: Weather radar beam blockage correction using deep learning. 2023 United States National Committee of URSI National Radio Science Meeting (USNC-URSI NRSM), Boulder, CO, Institute of Electrical and Electronics Engineers, 296–297, https://doi.org/10.23919/USNC-URSINRSM57470.2023.10043151.
Weisman, M. L., and J. B. Klemp, 1982: The dependence of numerically simulated convective storms on vertical wind shear and buoyancy. Mon. Wea. Rev., 110, 504–520, https://doi.org/10.1175/1520-0493(1982)110<0504:TDONSC>2.0.CO;2.
Wood, N. B., T. S. L’Ecuyer, A. J. Heymsfield, G. L. Stephens, D. R. Hudak, and P. Rodriguez, 2014: Estimating snow microphysical properties using collocated multisensor observations. J. Geophys. Res. Atmos., 119, 8941–8961, https://doi.org/10.1002/2013JD021303.
Yan, Z., X. Li, M. Li, W. Zuo, and S. Shan, 2018: Shift-net: Image inpainting via deep feature rearrangement. arXiv, 1801.09392v2, https://doi.org/10.48550/arXiv.1801.09392.
Yeh, R. A., C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, 2017: Semantic image inpainting with deep generative models. arXiv, 1607.07539v3, https://doi.org/10.48550/arXiv.1607.07539.
Yu, J., Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang, 2018: Generative image inpainting with contextual attention. arXiv, 1801.07892v2, https://doi.org/10.48550/arXiv.1801.07892.
Zhou, Z., M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, 2018: UNet++: A nested U-net architecture for medical image segmentation. arXiv, 1807.10165v1, https://doi.org/10.48550/arXiv.1807.10165.
Zuidema, P., and Coauthors, 2005: An Arctic springtime mixed-phase cloudy boundary layer observed during SHEBA. J. Atmos. Sci., 62, 160–176, https://doi.org/10.1175/JAS-3368.1.