Strictly Enforcing Invertibility and Conservation in CNN-Based Super Resolution for Scientific Datasets

Andrew Geiss, Pacific Northwest National Laboratory, Richland, Washington

https://orcid.org/0000-0002-2571-4603
and
Joseph C. Hardin, Pacific Northwest National Laboratory, Richland, Washington, and ClimateAI, Inc., San Francisco, California

https://orcid.org/0000-0002-8489-4763
Open access


Abstract

Recently, deep convolutional neural networks (CNNs) have revolutionized image “super resolution” (SR), dramatically outperforming past methods for enhancing image resolution. They could be a boon for the many scientific fields that involve imaging or any regularly gridded datasets: satellite remote sensing, radar meteorology, medical imaging, numerical modeling, and so on. Unfortunately, while SR-CNNs produce visually compelling results, they do not necessarily conserve physical quantities between their low-resolution inputs and high-resolution outputs when applied to scientific datasets. Here, a method for “downsampling enforcement” in SR-CNNs is proposed. A differentiable operator is derived that, when applied as the final transfer function of a CNN, ensures the high-resolution outputs exactly reproduce the low-resolution inputs under 2D-average downsampling while improving performance of the SR schemes. The method is demonstrated across seven modern CNN-based SR schemes on several benchmark image datasets, and applications to weather radar, satellite imager, and climate model data are shown. The approach improves training time and performance while ensuring physical consistency between the super-resolved and low-resolution data.

Significance Statement

Recent advancements in using deep learning to increase the resolution of images have substantial potential across the many scientific fields that use images and image-like data. Most image super-resolution research has focused on the visual quality of outputs, however, and is not necessarily well suited for use with scientific data where known physics constraints may need to be enforced. Here, we introduce a method to modify existing deep neural network architectures so that they strictly conserve physical quantities in the input field when “super resolving” scientific data and find that the method can improve performance across a wide range of datasets and neural networks. Integration of known physics and adherence to established physical constraints into deep neural networks will be a critical step before their potential can be fully realized in the physical sciences.

© 2023 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Andrew Geiss, avgeiss@gmail.com


1. Introduction

Image “super resolution” (SR) involves increasing the resolution of images beyond their native resolution and is a long-standing problem in the field of image processing. Here, we focus on single image SR (SISR), which involves estimating subpixel-scale values based only on a single coarsely resolved input image (Nasrollahi and Moeslund 2014) [as opposed to SR frameworks that utilize multiple images (Richard et al. 2020; Tao and Muller 2018; Huang et al. 2018; Baker and Kanade 1999)]. The simplest approach to SISR is 2D interpolation, though many schemes exist that produce high-resolution (HR) outputs of various qualities. Sophisticated SISR schemes can perform different operations at different locations in an image depending on the local low-resolution (LR) pixel data: using a dictionary of image-patch exemplars, for instance (Timofte et al. 2015). Nasrollahi and Moeslund (2014) provide an overview of SISR schemes.

a. SR with neural networks

Recently, deep convolutional neural networks (CNNs) have been applied to SISR and have significantly outperformed past algorithms. Initially, Dong et al. (2016) used a three-layer CNN to achieve state-of-the-art SISR results, and the approach was quickly expanded to use significantly deeper CNNs (Kim et al. 2016). SISR-CNN architectures have developed rapidly, and complex SISR networks are now built from many blocks of convolutional layers that include skip connections, such as dense blocks (Huang et al. 2017), residual blocks (He et al. 2016), and channel attention blocks (Bastidas and Tang 2019). The CNNs’ internal spatial upsampling operators (which perform the resolution increase) have progressed from bicubic upsampling (Dong et al. 2016), to learned kernels (Long et al. 2014), to the “pixel-shuffle” approach (Shi et al. 2016). The loss functions have also evolved from simple pixelwise errors to feature loss (based on features learned by image classification CNNs) and adversarial loss (Ledig et al. 2017), which allow the CNNs to “hallucinate” plausible subpixel-scale features (i.e., generate features that could not be directly inferred from the LR data). Current CNN SR schemes combine many of these concepts (Anwar and Barnes 2019). Wang et al. (2020) review CNN-based SISR.

b. Invertible super-resolution networks

SISR-CNNs are not typically invertible. Training them usually involves degrading HR images and tasking the CNN with reconstructing them, but applying the same degradation to the CNN output does not necessarily reproduce the input image. SISR is an ill-posed problem because there are usually multiple HR images that produce the same LR image when downsampled, and constraining SISR-CNN outputs to this manifold of possible HR images is desirable (Menon et al. 2020). Several studies have approximated this type of invertibility with a modified loss function that computes the pixel error between the downsampled output and the LR input (Menon et al. 2020; Abdal et al. 2019; Ulyanov et al. 2018; Zhang et al. 2020). Additionally, in cases in which the HR image is known but needs to be intentionally degraded (e.g., for image compression), training two CNNs simultaneously to perform upsampling and downsampling provides better performance than other SR schemes (Sun and Chen 2020; Kim et al. 2018). These approaches only approximate invertibility, however, and may not be sufficient for “super resolving” scientific datasets.

c. Downscaling in the environmental sciences

In the environmental sciences, mapping data from a low-resolution grid to a high-resolution grid is often referred to as “downscaling,” and CNN-based SR can be seen as a type of statistical downscaling (Wilby et al. 2004). To clarify terminology: “downscaling” should not be confused with “downsampling,” which, as used here, involves reducing the resolution of data. “Downscaling” is a broad term, and there are other types of downscaling, such as dynamical downscaling, in which a high-resolution simulation is used to enhance a low-resolution simulation (Xu et al. 2019; Wang et al. 2021), or model output statistics (Glahn and Lowry 1972), in which gridded weather model output is mapped to point forecasts. In these scenarios, strict enforcement of invertibility is not necessarily useful, but in the context of statistical downscaling that maps low-resolution data to a high-resolution grid, our method can ensure strict preservation of the physical quantities in the input field.

d. Contributions and impacts

This study introduces a method—referred to here as “downsampling enforcement” (DE)—that strictly constrains a super-resolution CNN’s output to be exactly invertible under 2D-average downsampling. This is accomplished using a transfer function that is applied after the last convolutional layer of the CNN. We demonstrate the method on scientific datasets from a weather radar, satellite imager, and climate model, all of which are cases in which strictly enforcing conservation of quantities in the low-resolution input data is potentially important, and we find that it improves the performance of the CNNs in each case. We then demonstrate this method using seven different CNN SISR architectures on five common benchmark image datasets, and show that it consistently improves performance in more typical SISR applications.

e. Motivation

There are many scientific fields where CNN-based SR could be applied to image data or, more generally, regularly gridded datasets. In these applications, guaranteed invertibility under 2D averaging can ensure the SR scheme’s inputs and outputs are physically consistent. For example, satellite imagers often have a much larger dynamic range than handheld cameras and undergo rigorous calibration and validation to ensure that measured radiances are accurate (Markham et al. 2014); this should be considered when applying CNN-based SR (Liebel and Körner 2016; Lanaras et al. 2018; Müller et al. 2020). Other possible applications of this method include data from ranging instruments such as radars (Geiss and Hardin 2020), sonars (Feldens 2020), and lidars (Liu et al. 2020). Weather radars can be used to estimate precipitation rates, for instance (Fulton et al. 1998), a physical quantity that should be conserved under spatial averaging. Super resolution can be used to enhance output from gridded numerical simulations (Baño-Medina et al. 2020), and CNN-SISR has already been demonstrated on several real-world numerical simulation problems, including precipitation modeling (Wang et al. 2021, who focus on dynamical downscaling), wind and solar modeling (Stengel et al. 2020), and climate modeling (Vandal et al. 2018). In climate simulations, strict conservation of physical quantities in super-resolved fields is of particular importance because climate signals can be relatively weak. Also, if SR is used during model integration, even small errors can grow rapidly over many time steps and significantly impact results. The lack of strictly enforced physical consistency in CNNs, and of internal representation of physics in general, has been identified as a major hurdle that must be addressed before their impressive capabilities can be fully brought to bear on important imaging and modeling problems in the physical sciences (Reichstein et al. 2019; Tsagkatakis et al. 2019; Beucler et al. 2021; McGovern et al. 2021). In these SISR applications, and many others, strict conservation of large-scale statistical properties of the LR input fields is often just as important as the visual fidelity of the HR outputs, and our method can ensure both.

2. The downsampling enforcement operator

Typically, during training, CNN-SISR schemes are provided LR input images produced by degrading HR images and tasked with recovering the original. If $I_{\mathrm{HR}}$ and $I_{\mathrm{LR}}$ are the high- and low-resolution images, respectively, $D\{I\}$ represents applying a downsampling operator to an image $I$, and $S\{I\}$ represents applying an operator to super-resolve an image, CNN-SISR schemes try to find $S$ such that $I_{\mathrm{HR}} \approx S\{I_{\mathrm{LR}}\}$, and during training, $I_{\mathrm{HR}} \approx S\{D\{I_{\mathrm{HR}}\}\}$. Here, we derive a DE operator that can be incorporated into most common SR-CNNs and ensures the CNN also satisfies $I_{\mathrm{LR}} = D\{S\{I_{\mathrm{LR}}\}\}$. We assume that $D$ represents 2D-average downsampling, although solutions can likely be derived for other downsampling schemes. The DE operator f(x, P) operates on each N × N pixel block in the HR image. Here, P denotes the value of a single pixel in the LR image, and x_i ∈ x are the N × N corresponding HR-image pixels output by the last convolutional layer in the CNN; P and x_i are assumed to have pixel intensities bounded by [−1, 1]:

$$f(\mathbf{x},P)_i = \begin{cases} x_i + \left(\dfrac{P - \bar{x}}{1 - \bar{x}}\right)(1 - x_i), & \bar{x} < P \\ x_i, & \bar{x} = P \\ x_i + \left(\dfrac{P - \bar{x}}{1 + \bar{x}}\right)(1 + x_i), & \bar{x} > P, \end{cases} \tag{1}$$

where

$$\bar{x} = \frac{1}{N^2} \sum_{x_j \in \mathbf{x}} x_j.$$

Equation (1) can also be written as

$$f(\mathbf{x},P)_i = x_i + (P - \bar{x})\left(\frac{\sigma + x_i}{\sigma + \bar{x}}\right), \tag{2}$$

where $\sigma = \operatorname{sign}(\bar{x} - P)$. A detailed derivation of f(x, P) is provided in section 1 of the online supplemental material. This formulation of f has several useful properties:

$$\frac{1}{N^2} \sum_{i=1}^{N^2} f(\mathbf{x},P)_i = P, \tag{3}$$

$$f(\mathbf{x},P)_i \in [-1, 1], \tag{4}$$

$$x_i > x_j \Rightarrow f(\mathbf{x},P)_i \geq f(\mathbf{x},P)_j, \quad \text{and} \tag{5}$$

$$f(\mathbf{x},P)_i \ \text{is piecewise differentiable}. \tag{6}$$

Equation (3) ensures invertibility under 2D-average downsampling. Constraint (4) bounds f(x, P) to the input image’s dynamic range of [−1, 1]. Constraint (5) maintains the order of the initial output pixels’ intensities. Constraint (6) is necessary because f(x, P) will be included as a part of the CNN during training and must be differentiable for backpropagation to work. Short proofs that (1) and (2) satisfy these conditions are given in section 2 of the online supplemental material.
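To make the operator concrete, the following is a minimal TensorFlow sketch of (2) applied to the tanh-activated output of the final convolutional layer. The function name, the NHWC tensor layout, and the guard against the 0/0 case when x̄ = P exactly are our own illustrative choices, not details taken from the reference implementation.

```python
import tensorflow as tf

def downsampling_enforcement(x, lr, n=4):
    """Apply Eq. (2) so that each n x n block of `x` averages to its LR pixel.

    x:  (batch, H, W, C) initial HR output in [-1, 1] (after tanh).
    lr: (batch, H/n, W/n, C) low-resolution input, also in [-1, 1].
    """
    # Block means x_bar of the initial output, one value per LR pixel:
    xbar = tf.nn.avg_pool2d(x, ksize=n, strides=n, padding='VALID')
    # Broadcast the LR pixels P and the block means back to the HR shape:
    up = lambda t: tf.repeat(tf.repeat(t, n, axis=1), n, axis=2)
    p, xbar = up(lr), up(xbar)
    # Eq. (2): f_i = x_i + (P - x_bar)(sigma + x_i)/(sigma + x_bar), with
    # sigma = sign(x_bar - P). When x_bar == P the correction is zero, so
    # the denominator is replaced by 1 there to avoid a 0/0.
    sigma = tf.sign(xbar - p)
    denom = tf.where(tf.equal(sigma, 0.0), tf.ones_like(sigma), sigma + xbar)
    return x + (p - xbar) * (sigma + x) / denom
```

In a Keras model, this function can be wrapped in a `Lambda` layer that takes the initial CNN output and the LR input as its two arguments, so the constraint is part of the network during both training and inference.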

Equation (2) has a physical interpretation: it operates on the initial SISR-CNN outputs [the last convolutional layer of the CNN prior to applying (2) has 3-channel RGB output and a tanh transfer function]. Equation (2) is a correction applied independently to each channel that ensures the intensity of each N × N block of HR output pixels x exactly averages to the value of the corresponding LR input pixel P. When P exceeds x̄, the remaining unused output pixel intensity (1 − x_i) is computed for each output pixel, and a constant fraction of it is added to the output pixel values. A similar approach is applied when x̄ > P. Figure 1 shows the magnitude of the correction [f(x, P)_i − x_i] when the DE operator is applied to a hypothetical block of output pixels (spanning [−1, 1] with a mean of 0) for a range of LR input pixel values. It demonstrates how the correction term varies smoothly with respect to P and x_i.

Fig. 1. Visualization of the correction f(x, P)_i − x_i as a function of varying P for a sample input of 16 x_i values ranging from −1 to 1, with a mean of 0.

3. Application to scientific datasets

Here, we apply the DE super-resolution method to three scientific datasets in applications where strict conservation of quantities in the input fields is important and compare it with identical CNNs trained without DE. These datasets come from diverse sources: a satellite imager, a weather radar, and a numerical weather model. Each dataset was divided into independent training and testing sets. In each experiment, we train a single CNN type without parameter tuning, so a separate validation set was not used. Figure 2 shows an example input, degraded image, and SR output for each dataset. Figure 3 shows the loss evaluated on the test sets at various stages during training. In each case, we find that inclusion of DE allows for exact physical consistency between the CNN inputs and outputs while providing a modest improvement in performance.

Fig. 2. (a),(d),(g) Example degraded inputs; (b),(e),(h) super-resolved outputs with DE; and (c),(f),(i) ground truth for three different scientific datasets. Super-resolved outputs from the conventional CNNs can be seen in Fig. 3 in the online supplemental material.

Fig. 3. Test set RMSE evaluated for each scientific dataset throughout training. Final test set scores (without/with DE) were (a) GOES: 13.72/13.71 W m−2 sr−1 μm−1; (b) ERA5: 7.56%/7.55%; and (c) SEVIR: 0.507/0.502 kg m−2.

a. Neural network

For each dataset, we use the Enhanced Deep Residual Network (EDRN) architecture (Lim et al. 2017). EDRN is based on ideas from He et al. (2016) and is composed of a series of residual blocks. These blocks pass their input through several convolutional layers and then add the result to the original input. The intuition behind this is that residual skip connections provide a more direct path between the input and output of the network and allow the neural network to learn which residual blocks and which features to use as it trains, combating the vanishing gradient problem (the tendency for gradients to approach zero in very deep neural networks during training, which prevents them from learning). “Enhanced” refers to removing several aspects of the residual blocks used in He et al. (2016) that are not beneficial for SR (batch normalization, for instance). We note that, in the examples of SR applied in the physical sciences given in section 1e, EDRN is the most commonly used model framework, with Stengel et al. (2020), Lanaras et al. (2018), and Liu et al. (2020) all using EDRN-like architectures. Our implementation has 16 residual blocks with 128-filter convolutional layers. The CNN operates at low resolution and then upsamples using a pixel shuffle (Shi et al. 2016) as one of the final operations in the network. The implementation used here has 5.2 × 10^6 trainable parameters.
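As an illustration of the block structure described above, a bare-bones Keras residual block might look like the following; the filter count matches the 128-filter layers noted in the text, but other details (kernel sizes, any residual scaling) are assumptions rather than a transcription of our implementation, which is available in the repository referenced in section 4a.

```python
from tensorflow.keras import layers

def residual_block(x, filters=128):
    """EDRN-style residual block: two 3x3 convolutions with a ReLU between
    them, added back onto the block input; no batch normalization."""
    skip = x
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    x = layers.Conv2D(filters, 3, padding='same')(x)
    return layers.Add()([x, skip])
```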

b. Training

This study uses 2D-average downsampling for image and data degradation. Bicubic downsampling with antialiasing (specifically, the MATLAB scheme) is more common when working with images in the SISR literature, but 2D averaging was assumed in deriving (2) because it is the more common physical relation between high- and low-resolution data. The CNNs here perform 4× SISR, converting 48 × 48 pixel inputs to 192 × 192 outputs. Inputs are standardized to a [−1, 1] scale, and a tanh activation is applied to the output. Outputs are then scaled back to the appropriate (dimensional) range. In the DE cases, the tanh activation is applied before applying (2). Each CNN is trained for 300 epochs, with the learning rate reduced by a factor of 10 after the 200th epoch. Epochs are 1000 batches of 16 image chips selected randomly from the training set with random flips and rotations. Pixelwise mean-square error (MSE) is used as the loss function, and the Adam optimizer is used with an initial learning rate of 10^−4, β1 = 0.9, and β2 = 0.999. A lower initial learning rate of 2 × 10^−5 was used for the Storm Event Imagery (SEVIR) data because the model did not successfully train with the higher learning rate.
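In Keras terms, the training configuration above corresponds roughly to the sketch below, where `model` and `train_batches` (a generator yielding batches of 16 augmented LR/HR chip pairs) are assumed to already exist:

```python
from tensorflow import keras

# Adam with the optimizer settings given above:
optimizer = keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.9, beta_2=0.999)
model.compile(optimizer=optimizer, loss='mse')

# Reduce the learning rate by a factor of 10 after the 200th epoch:
lr_schedule = keras.callbacks.LearningRateScheduler(
    lambda epoch, lr: 1e-5 if epoch >= 200 else 1e-4)

# 300 epochs of 1000 batches each:
model.fit(train_batches, epochs=300, steps_per_epoch=1000,
          callbacks=[lr_schedule])
```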

c. GOES-17 L1b radiance

The Geostationary Operational Environmental Satellite 17 (GOES-17) (NOAA 2019a) is a geostationary satellite currently orbiting above the equatorial Pacific at 137.2°W. Here, we apply SR to level-1b radiance data from the Advanced Baseline Imager band 2 (the red 640-nm band), which has a resolution of 0.5 km at nadir (NOAA 2019b). The images selected for this study are full-disk scans taken near 1200 LST on various days during 2019 and 2020. The images are cropped to pixels between 2452 and 19 348 in the height dimension and from 8100 to 17 700 in the width dimension to avoid extreme viewing angles and low illumination near the Earth’s limb. The exact file names for the training and test data are given in section 3 of the online supplemental material. The L1b radiances have units of watts per meter squared per steradian per micrometer, and enforcing strict conservation of the low-resolution radiances yields a slight performance improvement for the SR scheme (Fig. 3a). Example outputs are shown in Figs. 2a–c. The final RMSE on the test set was 13.72 W m−2 sr−1 μm−1 for the conventional CNN and 13.71 W m−2 sr−1 μm−1 with DE. The RMSE between the inputs and downsampled outputs for the conventional CNN was 0.20 W m−2 sr−1 μm−1. For the DE CNN, this value is mathematically constrained to be zero, though in practice it is a very small nonzero number whose magnitude depends on the floating-point precision used when evaluating the CNN.

d. ERA5 cloud fraction

The fifth major global reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ERA5) is a reconstruction of the past state of the atmosphere from 1979 to the present. The reanalysis is performed by assimilating historical atmospheric observations with a numerical weather model (Hersbach et al. 2020). Here we apply SR to cloud fraction data that represent the fraction (0%–100%) of each model grid cell’s area covered by cloud. We use daily 0.25° × 0.25° resolution data between latitudes ±45° at 0000 UTC for the period from 1 January 1979 to 31 December 2018 for training and use data from 2019 for testing. Note that these data are on a latitude–longitude grid, whereas our DE technique assumes an equal-area grid, so here we have used only data near the equator, where there is less distortion. It is possible to modify the DE technique to include latitude weightings, but we leave this for future work. Nonetheless, enforcing physical consistency between the LR and HR data in this context provides a small performance improvement for the SR scheme (Fig. 3b). Examples are shown in Figs. 2d–f. The final RMSE on the test set was 7.56% for the conventional CNN and 7.55% with DE. The RMSE between the inputs and downsampled outputs for the conventional CNN was 0.10%.

e. NEXRAD vertically integrated liquid water

The SEVIR weather dataset (Veillette et al. 2020) is composed of collocated satellite and radar observations of 20 000 weather events over the continental United States between 2017 and 2020. Here we train EDRN to perform 4× SR on 192 × 192 pixel (1-km resolution) chips of vertically integrated liquid water (VIL), a radar-derived product from the NEXRAD radar network. VIL has units of kilograms per meter squared, and the DE approach ensures that the CNN is mass conserving; for example, the super-resolution process cannot inadvertently add or remove water that was not present in the input field. We use the 25 time samples from the first 18 000 events to train and the remaining 2000 to test. Figures 2g–i show an example case from the test set. The DE method achieves a slight performance improvement over conventional training (Fig. 3c). The final RMSE on the test set was 0.507 kg m−2 for the conventional CNN and 0.502 kg m−2 with DE. The RMSE between the inputs and downsampled outputs for the conventional CNN was 0.05 kg m−2.

4. SR of images

The majority of the SISR literature focuses on applications to photographs, and several well-established benchmark image datasets and evaluation metrics have become the standard for evaluating new SISR schemes. Our DE approach is effective when applied to scientific datasets and can even provide a slight performance improvement in addition to enforcing physical consistency between the inputs and outputs of the CNN, but does it improve performance on conventional image datasets, where enforcing invertibility of the SR image is not necessary? Here, we have implemented a selection of SISR-CNNs from the recent literature and trained them under identical conditions, both with and without the DE operator, using common benchmark image datasets. These experiments are meant to evaluate our scheme in a way that is more comparable to the bulk of past SISR research.

a. Neural networks

Here, we have reproduced seven different CNN architectures from the recent SISR literature. For unbiased intermodel comparison, we have altered each model slightly so that they all have a similar number of trainable parameters: 5 × 10^6 (except for SR-CNN and Lap-SRN, which have fewer). The CNNs were implemented in Keras with a TensorFlow backend, and the code and model diagrams can be found on GitHub (https://github.com/avgeiss/invertible_sr). We provide a more detailed overview of the CNNs and our implementations in section 4 of the online supplemental material. The CNNs are SR-CNN (Dong et al. 2016), Laplacian Pyramid Super-Resolution Network (Lap-SRN; Lai et al. 2017), Dense U-Net (DUN; Ronneberger et al. 2015; Huang et al. 2017; Geiss and Hardin 2020), Deep Back Projection Network (DBPN; Haris et al. 2018), Dense SR Net (DSRN; Tong et al. 2017), EDRN (Lim et al. 2017), and Residual Dense Network (RDN; Zhang et al. 2018). Each was trained both with and without strictly enforced invertibility. We note that most of the examples of CNN-based SR applied to environmental datasets given in section 1e either directly use these CNN architectures or are close relatives of them.

b. Image datasets

The Div2k dataset (Agustsson and Timofte 2017) was used for training. It contains 800 high-resolution training images with a 100-image test set. The last 10 training images are typically held out and used to compute validation scores in SISR studies focused on development of new SISR models (Lim et al. 2017). Here, we do not perform hyperparameter tuning, but we do withhold these 10 images during training and use them to evaluate each model throughout the training process (Fig. 4). Trained CNNs are evaluated on several image test datasets chosen because of their prevalence in the SISR literature (Wang et al. 2020): “Set5” (Bevilacqua et al. 2012), “Set14” (Zeyde et al. 2010), “BSDS100” (Martin et al. 2001), “Manga109” (Fujimoto et al. 2016), “Urban 100” (Huang et al. 2015), and the 100-image “Div2k” test set (Agustsson and Timofte 2017). Manga109 contains illustrated images, and Urban 100 contains photographs of urban scenes, whereas the other datasets contain miscellaneous photographs.

Fig. 4. (a)–(g) Div2k validation set PSNR during training for the seven CNN architectures with (red) and without (black) downsampling enforcement. The blue line in (e) uses an additional loss function term instead of DE. (h) Application of conventional CNN-SISR (black) and DE-SISR that incorrectly assumes 2D-average downsampling (green) to bicubic-downsampled images.

c. Training and testing

This section uses the exact training procedure described in section 3b. This training procedure and choice of optimizer parameters are very similar to those of both Lim et al. (2017) and Zhang et al. (2018). Two evaluation metrics are used for the image datasets: peak signal-to-noise ratio (PSNR; Wang et al. 2020) and the structural similarity index (SSIM; Wang et al. 2004). PSNR is computed on the luminance (Y) channel after converting the CNN’s output to the luminance (Y), chroma blue (Cb), and chroma red (Cr) (YCbCr) color space (Dong et al. 2016). SSIM is a scalar metric designed to be more representative of the perceptual quality of an image than pixelwise metrics. It ranges from −1 to 1, and higher values are better. During model testing, each LR image is broken into 48 × 48 pixel chips using a 24-pixel stride, and PSNR and SSIM are then calculated on the 96 × 96 center portions of each of the HR outputs. For each CNN, both with and without DE, a five-member ensemble was trained from randomly initialized weights, and the ensemble-mean test scores are reported in Table 1. Figure 4 shows PSNR computed throughout training on the 10-image Div2k validation set for the first ensemble member of each CNN.
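For reference, PSNR on the luminance channel can be sketched as follows; the BT.601 luma weights are the standard ones used in the RGB-to-YCbCr conversion, while the function name and the assumption of [0, 255] pixel values are ours:

```python
import tensorflow as tf

def psnr_y(hr_true, hr_pred):
    """PSNR computed on the luminance (Y) channel of [0, 255] RGB images."""
    w = tf.constant([65.481, 128.553, 24.966]) / 255.0  # BT.601 luma weights
    y_true = tf.reduce_sum(hr_true * w, axis=-1) + 16.0
    y_pred = tf.reduce_sum(hr_pred * w, axis=-1) + 16.0
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    return 10.0 * tf.math.log(255.0 ** 2 / mse) / tf.math.log(10.0)
```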

Table 1. Evaluation of several SR CNN architectures, both with (“w/DE”) and without downsampling enforcement, applied to standard test datasets for image SR. Entries show PSNR/SSIM, with higher scores in boldface type. Values are averaged across five training runs with random initializations, and italicized entries pass a 99% confidence test for difference in means.

d. Results

Figures 4a–g show the Div2k validation set PSNR during training for a single CNN of each type (panels), both with (red lines) and without (black lines) the DE operator. All of the CNN architectures perform comparably or better when DE is added, with the largest advantage early in training.

Sample outputs from each of the CNNs for each of the training sets, both with and without DE, are included in the online supplemental material (Figs. 1 and 2 in the online supplemental material). While there are some differences on close inspection, the small differences in PSNR shown in Fig. 4 do not correspond to any dramatic change in the perceptual quality of the output. These figures demonstrate that the DE approach can achieve state-of-the-art SISR performance while strictly enforcing physical consistency within the CNN architecture.

Table 1 summarizes final performance for every CNN–test-set pair, with the better scores denoted by boldface text. Adding DE to the CNN improved performance in all but one case (the Dense-Net had slightly worse PSNR on Set5, although Set5 contains only five images, so results in this column are more likely to be affected by small sample size). The improvements are often small but are of comparable size to recent generational improvements in CNN architectures (e.g., EDRN versus RDN; https://paperswithcode.com/sota/image-super-resolution-on-bsd100-4x-upscaling, accessed 28 January 2021). Furthermore, in most cases the improvement in the mean test score due to adding DE passes a 99% confidence test (one-sided t test for difference in means; Rice 2006); these cases are shown in boldface italics. Overall, the results in Table 1 show that, in addition to achieving the primary goal of exact invertibility, our approach yields robust and consistent performance improvements when applied across a large sampling of the most prevalent CNN types and image datasets in the SISR literature.

5. Additional experiments and discussion

In this section, we discuss the limitations of our method, compare it with recent literature, and outline potential areas of future research.

a. Related methods

No existing methods, at the time of writing, strictly enforce invertibility of CNN SR, but past studies (Menon et al. 2020; Abdal et al. 2019; Ulyanov et al. 2018; Zhang et al. 2020) have highlighted its importance and have used loss functions to approximate invertibility. The pulse generative adversarial network (PulseGAN; Menon et al. 2020) adds an MSE term, computed between the LR input and the downsampled output, to an SR GAN’s adversarial loss function, ensuring more physically plausible outputs. This method does not improve CNNs that optimize pixelwise MSE, because it effectively imposes the same loss function twice at different resolutions. We confirm this by training the EDRN-CNN with the loss function

$$L = \mathrm{MSE} + \lambda\,\overline{(D\{x\} - D\{\hat{x}\})^2},$$

where MSE is the pixelwise mean-square error; $x$ and $\hat{x}$ are ground-truth and predicted HR pixel values, respectively; $D\{\,\}$ is a 4 × 4 averaging downsampling operator; $\lambda$ is a weighting coefficient (here, $\lambda = 16$); and the overbar represents averaging. Validation PSNR throughout training is shown in Fig. 4e as a blue line. The PSNR/SSIM computed on the Div2k test set was 29.95/0.8327, lower than the scores in Table 1. The key difference of the DE approach is that it directly modifies the CNN, not the loss function, and it guarantees exact rather than approximate invertibility.
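A sketch of this baseline loss, with $D$ implemented as 4 × 4 average pooling and $\lambda = 16$ as in the text (the function name is our own):

```python
import tensorflow as tf

def consistency_loss(y_true, y_pred, lam=16.0, n=4):
    """Pixelwise MSE plus an MSE penalty between downsampled images."""
    down = lambda t: tf.nn.avg_pool2d(t, ksize=n, strides=n, padding='VALID')
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    return mse + lam * tf.reduce_mean(tf.square(down(y_true) - down(y_pred)))
```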

b. Other downsampling schemes

Assuming 2D-average downsampling is often correct for enforcing conservation laws, but it is not always used to train SISR-CNNs. We demonstrate the impact of an incorrectly assumed downsampling scheme by training the EDRN-CNN on bicubic-downsampled data, both with and without DE that assumes 2D averaging. Predictably, the conventional EDRN outperforms the DE version in this case (Fig. 4h). For the Div2k test set, the conventional scheme had a PSNR/SSIM of 30.26/0.8393, while the DE-EDRN scored 30.13/0.8356. Although 2D averaging is used here, it may be possible to derive similar operators for other downsampling schemes, for instance by modifying (1) and (2) to include the weights of a bicubic downsampling kernel.

c. Checkerboard artifacts

Some of the outputs from the SR-CNNs contain faint checkerboard artifacts (e.g., in the trees in the BSDS100 sample image in Figs. 1 and 2 in the online supplemental material). Because (1) imposes the low-resolution grid on the high-resolution CNN outputs, CNNs that use DE may be more prone to this “artifacting.” We perform a detailed analysis of checkerboard artifacts in section 5 of the online supplemental material and find that, for the 4 × 4 SR used in this paper, the primary contributors to checkerboard artifacts are the use of an L2 instead of an L1 loss and the use of 2D-average downsampling instead of the bicubic scheme to prepare the training samples. When larger resolution increases (8 × 8) are performed, however, the DE operator contributes substantially more. We note that in specific applications to scientific datasets, the introduction of even very small discontinuities in the output fields could be physically problematic, and this is a limitation of our algorithm. Strategies for mitigating this effect will be a useful area of future research.

d. The magnitude of pixel corrections

In section 2, f(x, P) is interpreted as a correction applied to an intermediate image output by the CNN. Here, we investigate whether the quality of this intermediate output improves during training, or whether the CNN learns to rely on the correction, by examining $\overline{|x - f(\mathbf{x}, P)|}$, the mean difference between the initial output and the corrected output. On the Div2k test set for the EDRN model, the average difference was 108 (when RGB pixel values are scaled to 0–255), meaning the intermediate HR image requires a substantial correction to ensure that it will downsample to the input image. Figures 5a and 5b show the intermediate output and the corrected output, respectively, for an example HR image chip from the Div2k test set (the original image is shown in Fig. 5e). The magnitude of the correction can be reduced by retraining with a regularizer in the loss function:

$$L = \mathrm{MSE} + \lambda\,\overline{|x - f(\mathbf{x}, P)|},$$

where $\lambda = 100$. After training with this regularization term, $\overline{|x - f(\mathbf{x}, P)|}$ averaged over the Div2k test set was 0.3. This is demonstrated in Figs. 5c and 5d, where the intermediate output (Fig. 5c) is now a near-perfect match for the final output (Fig. 5d). The overall performance of the SISR scheme was not substantially altered, and the regularized CNN had a PSNR of 30.07 and an SSIM of 0.8359 on the Div2k test set, comparable to the results without regularization. Because the final outputs are nearly identical with or without the regularizer, it is not necessary to include it for most SR use cases, but the ability to increase the accuracy of the CNN’s intermediate output may be useful for future applications.
Fig. 5. Examination of intermediate outputs from our downsampling enforcement implementation of EDRN (Lim et al. 2017) for a sample image from BSDS100 (Martin et al. 2001) before and after the DE operator is applied: (a) output prior to the DE layer; (b) final output after the DE layer; (c),(d) as in (a) and (b), but with regularization on the magnitude of the correction; and (e) the ground-truth image for reference.
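The regularized objective can be sketched as below, assuming the model is set up to expose both the intermediate output x and the DE-corrected output f(x, P); the names and the two-output arrangement are illustrative assumptions:

```python
import tensorflow as tf

LAM = 100.0  # the weighting coefficient lambda from the text

def regularized_loss(y_true, x_initial, x_final):
    """Pixelwise MSE on the DE-corrected output plus a penalty on the mean
    absolute magnitude of the DE correction |x - f(x, P)|."""
    mse = tf.reduce_mean(tf.square(y_true - x_final))
    return mse + LAM * tf.reduce_mean(tf.abs(x_initial - x_final))
```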

e. Alternate formulations for other use cases

Equation (1) was formulated to enforce physical consistency while also enforcing strict upper and lower bounds on the CNN outputs. Alternate and simpler formulations are possible if these bounds do not need to be enforced. In the case that only a lower bound needs to be enforced, the data can be transformed to be greater than or equal to zero, the initial output from the CNN can be passed through a rectified linear unit transfer function, and the result can then be scaled by a constant to enforce physical consistency. This approach is applicable to downscaling atmospheric chemistry data (Geddes et al. 2016), where chemical concentrations are nonnegative and upper bounds can be ignored for trace gases. If no strict bounds are required, a simple constant can instead be added to the initial CNN outputs to ensure physical consistency. This would be appropriate for super resolving wind vector data, for instance (Salameh et al. 2009).
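A minimal sketch of the lower-bound-only variant described above, under the same 2D-average downsampling assumption (the function name and the small epsilon guarding all-zero blocks are ours):

```python
import tensorflow as tf

def enforce_nonnegative(x, lr, n=4, eps=1e-12):
    """ReLU the initial output, then rescale each n x n block by a constant
    so that it averages to the corresponding (nonnegative) LR pixel."""
    x = tf.nn.relu(x)
    xbar = tf.nn.avg_pool2d(x, ksize=n, strides=n, padding='VALID')
    up = lambda t: tf.repeat(tf.repeat(t, n, axis=1), n, axis=2)
    return x * up(lr) / (up(xbar) + eps)
```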

6. Conclusions

Here, we demonstrated a new method to ensure that the output from any super-resolution CNN, when downsampled with 2D averaging, exactly reproduces the low-resolution input. In addition to providing physical consistency between the input and output data, this approach improves the CNN performance for many different super-resolution architectures across several common image datasets and three scientific datasets examined here. The method involves constructing the CNN with downsampling enforcement and does not require any modifications to the data, training procedure, or loss function.

CNN-based SR is applicable to many types of imagery and gridded data where a guarantee that the statistics of the LR data are preserved is valuable. Here, we demonstrated how this approach can be used to generate high-resolution satellite imagery without introducing nonphysical radiances, to downscale coarse-resolution output from a numerical model without breaking physical consistency, or to super resolve radar data while preserving vertically integrated water mass. In these applications, preserving the LR data statistics in the HR data is paramount, and the technique presented here can deliver the high visual fidelity provided by CNN-based super-resolution schemes without sacrificing physical consistency.

Acknowledgments.

Author Geiss conceived the method, performed the experiments, and wrote the draft paper; Author Hardin conceived the experiments and contributed to the paper. A preprint of the work in this paper was uploaded on 11 November 2020, and an update was provided on 26 October 2021 (https://arxiv.org/abs/2011.05586). This research has been supported by the U.S. Department of Energy (Grant DE-AC05-76RL01830) and by the Department of Energy’s Atmospheric Radiation Measurement program.

Data availability statement.

All of the datasets used here are publicly available: GOES-17 (https://registry.opendata.aws/noaa-goes/), ERA-5 (https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5), and SEVIR (https://sevir.mit.edu/). For image datasets, see the corresponding references: Div2k (Agustsson and Timofte 2017), Set5 (Bevilacqua et al. 2012), Set14 (Zeyde et al. 2010), BSDS100 (Martin et al. 2001), Manga109 (Fujimoto et al. 2016), and Urban 100 (Huang et al. 2015).

REFERENCES

  • Abdal, R., Y. Qin, and P. Wonka, 2019: Image2StyleGAN: How to embed images into the StyleGAN latent space? 2019 IEEE/CVF Int. Conf. on Computer Vision, Seoul, South Korea, Institute of Electrical and Electronics Engineers, 4431–4440, https://doi.org/10.1109/ICCV.2019.00453.

  • Agustsson, E., and R. Timofte, 2017: NTIRE 2017 challenge on single image super-resolution: Dataset and study. IEEE Conf. on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, Institute of Electrical and Electronics Engineers, 1122–1131, https://doi.org/10.1109/CVPRW.2017.150.

  • Anwar, S., and N. Barnes, 2019: Densely residual Laplacian super-resolution. arXiv, 1906.12021v2, https://doi.org/10.48550/arXiv.1906.12021.

  • Baker, S., and T. Kanade, 1999: Super resolution optical flow. Carnegie Mellon University Robotics Institute Tech. Rep. CMU-RI-TR-99-36, https://www.ri.cmu.edu/publications/super-resolution-optical-flow/.

  • Baño-Medina, J., R. Manzanas, and J. M. Gutierrez, 2020: Configuration and intercomparison of deep learning neural models for statistical downscaling. Geosci. Model Dev., 13, 2109–2124, https://doi.org/10.5194/gmd-13-2109-2020.

  • Bastidas, A. A., and H. Tang, 2019: Channel attention networks. 2019 IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, Institute of Electrical and Electronics Engineers, 881–888, https://doi.org/10.1109/CVPRW.2019.00117.

  • Beucler, T., M. Pritchard, S. Rasp, J. Ott, P. Baldi, and P. Gentine, 2021: Enforcing analytic constraints in neural networks emulating physical systems. Phys. Rev. Lett., 126, 098302, https://doi.org/10.1103/PhysRevLett.126.098302.

  • Bevilacqua, M., A. Roumy, C. Guillemot, and M. L. Alberi-Morel, 2012: Low-complexity single-image super-resolution based on nonnegative neighbor embedding. Proc. British Machine Vision Conf., Surrey, United Kingdom, BMVA, 135.1–135.10, http://dx.doi.org/10.5244/C.26.135.

  • Dong, C., C. C. Loy, K. He, and X. Tang, 2016: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell., 38, 295–307, https://doi.org/10.1109/TPAMI.2015.2439281.

  • Feldens, P., 2020: Super resolution by deep learning improves boulder detection in side scan sonar backscatter mosaics. Remote Sens., 12, 2284, https://doi.org/10.3390/rs12142284.

  • Fujimoto, A., T. Ogawa, K. Yamamoto, Y. Matsui, T. Yamasaki, and K. Aizawa, 2016: Manga109 dataset and creation of metadata. MANPU’16: Proc. First Int. Workshop on coMics Analysis, Cancun, Mexico, Association for Computing Machinery, 1–5, https://doi.org/10.1145/3011549.3011551.

  • Fulton, R. A., J. P. Breidenbach, D.-J. Seo, D. A. Miller, and T. O’Bannon, 1998: The WSR-88D rainfall algorithm. Wea. Forecasting, 13, 377–395, https://doi.org/10.1175/1520-0434(1998)013<0377:TWRA>2.0.CO;2.

  • Geddes, J. A., R. V. Martin, B. L. Boys, and A. van Donkelaar, 2016: Long-term trends worldwide in ambient NO2 concentrations inferred from satellite observations. Environ. Health Perspect., 124, 281–289, https://doi.org/10.1289/ehp.1409567.

  • Geiss, A., and J. C. Hardin, 2020: Radar super resolution using a deep convolutional neural network. J. Atmos. Oceanic Technol., 37, 2197–2207, https://doi.org/10.1175/JTECH-D-20-0074.1.

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor. Climatol., 11, 1203–1211, https://doi.org/10.1175/1520-0450(1972)011<1203:TUOMOS>2.0.CO;2.

  • Haris, M., G. Shakhnarovich, and N. Ukita, 2018: Deep back-projection networks for super-resolution. IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, Institute of Electrical and Electronics Engineers, 1664–1673, https://doi.org/10.1109/CVPR.2018.00179.

  • He, K., X. Zhang, S. Ren, and J. Sun, 2016: Deep residual learning for image recognition. IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, Institute of Electrical and Electronics Engineers, 770–778, https://doi.org/10.1109/CVPR.2016.90.

  • Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.

  • Huang, G., Z. Liu, L. van der Maaten, and K. Q. Weinberger, 2017: Densely connected convolutional networks. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, Institute of Electrical and Electronics Engineers, 2261–2269, https://doi.org/10.1109/CVPR.2017.243.

  • Huang, J.-B., A. Singh, and N. Ahuja, 2015: Single image super resolution from transformed self-exemplars. 2015 IEEE Conf. on Computer Vision and Pattern Recognition, Boston, MA, Institute of Electrical and Electronics Engineers, 5197–5206, https://doi.org/10.1109/CVPR.2015.7299156.

  • Huang, Y., W. Wang, and L. Wang, 2018: Video super-resolution via bidirectional recurrent convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell., 40, 1015–1028, https://doi.org/10.1109/TPAMI.2017.2701380.

  • Kim, H., M. Choi, B. Lim, and K. M. Lee, 2018: Task-aware image downscaling. European Conference on Computer Vision, Lecture Notes in Computer Science, Vol. 11208, Springer, 399–414, https://doi.org/10.1007/978-3-030-01225-0_25.

  • Kim, J., J. K. Lee, and K. M. Lee, 2016: Accurate image super-resolution using very deep convolutional networks. arXiv, 1511.04587v2, https://doi.org/10.48550/arXiv.1511.04587.

  • Lai, W.-S., J.-B. Huang, N. Ahuja, and M.-H. Yang, 2017: Fast and accurate image super-resolution with deep Laplacian pyramid networks. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, Institute of Electrical and Electronics Engineers, 5835–5843, https://doi.org/10.1109/CVPR.2017.618.

  • Lanaras, C., J. Bioucas-Dias, S. Galliani, E. Baltsavias, and K. Schindler, 2018: Super-resolution of sentinel-2 images: Learning a globally applicable deep neural network. ISPRS J. Photogramm. Remote Sens., 146, 305–319, https://doi.org/10.1016/j.isprsjprs.2018.09.018.

  • Ledig, C., and Coauthors, 2017: Photo-realistic single image super-resolution using a generative adversarial network. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, Institute of Electrical and Electronics Engineers, 105–114, https://doi.org/10.1109/CVPR.2017.19.

  • Liebel, L., and M. Körner, 2016: Single-image super resolution for multispectral remote sensing data using convolutional neural networks. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLI-B3, 883–890, https://doi.org/10.5194/isprs-archives-XLI-B3-883-2016.

  • Lim, B., S. Son, H. Kim, S. Nah, and K. M. Lee, 2017: Enhanced deep residual networks for single image super-resolution. IEEE Conf. on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, Institute of Electrical and Electronics Engineers, 1132–1140, https://doi.org/10.1109/CVPRW.2017.151.

  • Liu, G., J. Ke, and E. Y. Lam, 2020: CNN-based super-resolution full-waveform Lidar. Proc. Imaging and Applied Optics Congress, Washington, DC, Optical Society of America, JW2A.29, https://doi.org/10.1364/3D.2020.JW2A.29.

  • Long, J., E. Shelhamer, and T. Darrell, 2014: Fully convolutional networks for semantic segmentation. arXiv, 1411.4038v2, https://doi.org/10.48550/arXiv.1411.4038.

  • Markham, B., and Coauthors, 2014: Landsat-8 operational land imager radiometric calibration and stability. Remote Sens., 6, 12 275–12 308, https://doi.org/10.3390/rs61212275.

  • Martin, D., C. Fowlkes, D. Tal, and J. Malik, 2001: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proc. Eighth IEEE Int. Conf. on Computer Vision, Vancouver, BC, Institute of Electrical and Electronics Engineers, 416–423, https://doi.org/10.1109/ICCV.2001.937655.

  • McGovern, A., I. Ebert-Uphoff, D. J. Gagne II, and A. Bostrom, 2021: The need for ethical, responsible, and trustworthy artificial intelligence for environmental sciences. arXiv, 2112.08453v1, https://doi.org/10.48550/arXiv.2112.08453.

  • Menon, S., A. Damian, S. Hu, N. Ravi, and C. Rudin, 2020: PULSE: Self-supervised photo upsampling via latent space exploration of generative models. arXiv, 2003.03808v3, https://doi.org/10.48550/arXiv.2003.03808.

  • Müller, M. U., N. Ekhtiari, R. M. Almeida, and C. Rieke, 2020: Super resolution of multispectral satellite images using convolutional neural networks. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-1-2020, 33–40, https://doi.org/10.5194/isprs-annals-V-1-2020-33-2020.

  • Nasrollahi, K., and T. B. Moeslund, 2014: Super-resolution: A comprehensive survey. Mach. Vision Appl., 25, 1423–1468, https://doi.org/10.1007/s00138-014-0623-4.

  • NOAA, 2019a: GOES-R product algorithm theoretical basis documents. NOAA, accessed 25 October 2020, https://www.goes-r.gov/resources/docs.html.

  • NOAA, 2019b: GOES R-series product definition and users’ guide. NOAA, 442 pp., https://www.goes-r.gov/users/docs/PUG-L1b-vol3.pdf.

  • Reichstein, M., G. Camps-Valls, B. Stevens, M. Jung, J. Denzler, N. Carvalhais, and Prabhat, 2019: Deep learning and process understanding for data-driven Earth system science. Nature, 566, 195–204, https://doi.org/10.1038/s41586-019-0912-1.

  • Rice, J. A., 2006: Mathematical Statistics and Data Analysis. Nelson Education, 688 pp.

  • Richard, A., I. Cherabier, M. R. Oswald, V. Tsiminaki, M. Pollefeys, and K. Schindler, 2020: Learned multi-view texture super-resolution. arXiv, 2001.04775v1, https://doi.org/10.48550/arXiv.2001.04775.

  • Ronneberger, O., P. Fischer, and T. Brox, 2015: U-net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, N. Navab et al., Eds., Lecture Notes in Computer Science, Vol. 9351, Springer, 234–241.

  • Salameh, T., P. Drobinski, M. Vrac, and P. Naveau, 2009: Statistical downscaling of near-surface wind over complex terrain in southern France. Meteor. Atmos. Phys., 103, 253–265, https://doi.org/10.1007/s00703-008-0330-7.

  • Shi, W., J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang, 2016: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, Institute of Electrical and Electronics Engineers, 1874–1883, https://doi.org/10.1109/CVPR.2016.207.

  • Stengel, K., A. Glaws, D. Hettinger, and R. N. King, 2020: Adversarial super-resolution of climatological wind and solar data. Proc. Natl. Acad. Sci. USA, 117, 16 805–16 815, https://doi.org/10.1073/pnas.1918964117.

  • Sun, W., and Z. Chen, 2020: Learned image downscaling for upscaling using content adaptive resampler. IEEE Trans. Image Process., 29, 4027–4040, https://doi.org/10.1109/TIP.2020.2970248.

  • Tao, Y., and J.-P. Muller, 2018: Super-resolution restoration of MISR images using the UCL MAGiGAN system. Remote Sens., 11, 52, https://doi.org/10.3390/rs11010052.

  • Timofte, R., V. De Smet, and L. Van Gool, 2015: A+: Adjusted anchored neighborhood regression for fast super-resolution. Computer Vision—ACCV 2014. ACCV 2014: 12th Asian Conf. on Computer Vision, Lecture Notes in Computer Science, Vol. 9006, Springer, 111–126, https://doi.org/10.1007/978-3-319-16817-3_8.

  • Tong, T., G. Li, X. Liu, and Q. Gao, 2017: Image super-resolution using dense skip connections. 2017 IEEE Int. Conf. on Computer Vision, Venice, Italy, Institute of Electrical and Electronics Engineers, 4809–4817, https://doi.org/10.1109/ICCV.2017.514.

  • Tsagkatakis, G., A. Aidini, K. Fotiadou, M. Giannopoulos, A. Pentari, and P. Tsakalides, 2019: Survey of deep-learning approaches for remote sensing observation enhancement. Sensors, 19, 3929, https://doi.org/10.3390/s19183929.

  • Ulyanov, D., A. Vedaldi, and V. Lempitsky, 2018: Deep image prior. IEEE Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, Institute of Electrical and Electronics Engineers, 9446–9454, https://doi.org/10.1109/CVPR.2018.00984.

  • Vandal, T., E. Kodra, S. Ganguly, A. Michaelis, R. Nemani, and A. R. Ganguly, 2018: Generating high resolution climate change projections through single image super-resolution: An abridged version. Proc. 27th Int. Joint Conf. on Artificial Intelligence, Stockholm, Sweden, International Joint Conferences on Artificial Intelligence Organization, 5389–5393, https://doi.org/10.24963/ijcai.2018/759.

  • Veillette, M., S. Samsi, and C. Mattioli, 2020: SEVIR: A storm event imagery dataset for deep learning applications in radar and satellite meteorology. Advances in Neural Information Processing Systems, Vol. 33, Curran Associates, Inc., 22 009–22 019, https://proceedings.neurips.cc/paper/2020/file/fa78a16157fed00d7a80515818432169-Paper.pdf.

  • Wang, J., Z. Liu, I. Foster, W. Chang, R. Kettimuthu, and V. R. Kotamarthi, 2021: Fast and accurate learned multiresolution dynamical downscaling for precipitation. Geosci. Model Dev., 14, 6355–6372, https://doi.org/10.5194/gmd-14-6355-2021.

  • Wang, Z., A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, 2004: Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process., 13, 600–612, https://doi.org/10.1109/TIP.2003.819861.

  • Wang, Z., J. Chen, and S. Hoi, 2020: Deep learning for image super-resolution: A survey. arXiv, 1902.06068v2, https://doi.org/10.48550/arXiv.1902.06068.

  • Wilby, R. L., S. P. Charles, E. Zorita, B. Timbal, P. Whetton, and L. O. Mearns, 2004: Guidelines for use of climate scenarios developed from statistical downscaling methods. IPCC Doc., 27 pp., https://www.ipcc-data.org/guidelines/dgm_no2_v1_09_2004.pdf.

  • Xu, Z., Y. Han, and Z. Yang, 2019: Dynamical downscaling of regional climate: A review of methods and limitations. Sci. China Earth Sci., 62, 365–375, https://doi.org/10.1007/s11430-018-9261-5.

  • Zeyde, R., M. Elad, and M. Protter, 2010: On single image scaleup using sparse-representations. Curves and Surfaces 2010, Springer, 711–730.

  • Zhang, Y., Y. Tian, Y. Kong, B. Zhong, and Y. Fu, 2018: Residual dense network for image super-resolution. 2018 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, Institute of Electrical and Electronics Engineers, 2472–2481, https://doi.org/10.1109/CVPR.2018.00262.

  • Zhang, Y., Z. Zhang, S. DiVerdi, Z. Wang, J. Echevarria, and Y. Fu, 2020: Texture hallucination for large-factor painting super-resolution. Computer Vision—ECCV 2020, A. Vedaldi et al., Eds., Lecture Notes in Computer Science, Vol. 12352, Springer, 209–225.


  • Fig. 1.

    Visualization of the correction f(x_i, P) − x_i as a function of varying P, for a sample input of 16 values x_i ranging from −1 to 1 with a mean of 0. A minimal code sketch illustrating the conservation property appears after the figure captions.

  • Fig. 2.

    (a),(d),(g) Example degraded inputs, (b),(e),(h) super-resolved outputs with DE, and (c),(f),(i) ground truth for three different scientific datasets. Super-resolved outputs from the conventional CNNs can be seen in Fig. 3 in the online supplemental material.

  • Fig. 3.

    Test set RMSE evaluated for each scientific dataset throughout training. Final test set scores (without/with DE) were (a) GOES: 13.72/13.71 W m−2 sr−1 μm−1, (b) ERA5: 7.56%/7.55%, and (c) SEVIR: 0.507/0.502 kg m−2.

  • Fig. 4.

    (a)–(g) DIV2K validation set PSNR during training for the seven CNN architectures with (red) and without (black) downsampling enforcement. The blue line in (e) uses an additional loss function term instead of DE. (h) Application of conventional CNN-SISR (black) and DE-SISR that incorrectly assumes 2D-average downsampling (green) to bicubic-downsampled images.

  • Fig. 5.

    Examination of intermediate outputs from our downsampling enforcement implementation of EDRN (Lim et al. 2017) for a sample image from BSDS100 (Martin et al. 2001) before and after the DE operator is applied: (a) output prior to DE layer; (b) final output after DE layer; (c),(d) as in (a) and (b) but with regularization on the magnitude of the correction; and (e) the ground-truth image for reference.
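The conservation constraint visualized in Fig. 1 can be demonstrated in a few lines of Python. The sketch below is not the operator derived in the paper (which is differentiable everywhere and distributes the correction nonlinearly across pixels); it uses the simplest possible additive correction, and the function name enforce_average is ours, for illustration only. It shows what downsampling enforcement guarantees: after correction, 2D-average downsampling of the high-resolution block exactly recovers the low-resolution pixel value P.

    import numpy as np

    def enforce_average(x, P):
        # Shift a block of high-resolution pixel values so their mean equals
        # the low-resolution pixel value P. This minimal additive correction
        # illustrates the conservation property only; the DE operator derived
        # in the paper is more sophisticated.
        return x + (P - x.mean())

    # Mirror the setup of Fig. 1: 16 values x_i spanning [-1, 1] with mean 0.
    x = np.linspace(-1.0, 1.0, 16)
    for P in (-0.5, 0.0, 0.5):
        y = enforce_average(x, P)
        correction = y - x  # f(x_i, P) - x_i (constant here; per-pixel nonlinear in the paper)
        assert np.isclose(y.mean(), P)  # averaging y exactly recovers P

Regularizing the magnitude of the correction, as examined in Figs. 5c and 5d, would in a sketch like this amount to adding a penalty proportional to mean(correction**2) to the training loss.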

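For reference, the peak signal-to-noise ratio tracked in Fig. 4 follows the standard definition sketched below; the paper's exact evaluation settings (color handling, border cropping, and so on) may differ, so this illustrates the metric rather than reproducing the reported numbers.

    import numpy as np

    def psnr(reference, estimate, max_val=255.0):
        # Peak signal-to-noise ratio in decibels: 10 * log10(MAX^2 / MSE).
        mse = np.mean((np.asarray(reference, dtype=np.float64)
                       - np.asarray(estimate, dtype=np.float64)) ** 2)
        return np.inf if mse == 0.0 else 10.0 * np.log10(max_val ** 2 / mse)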