## 1. Introduction

Ground-based solar radiation measurement systems are generally considered simple, reliable, and necessary to validate satellite measurements and retrievals (Smirnov et al. 2000; Krotkov et al. 2005). Separating the clear-sky and cloudy portions of the measurements is an essential requirement that all these ground systems have to fulfill during in situ calibration and while producing retrievals of atmospheric properties [e.g., aerosol optical depth (AOD)]. One of the most common in situ calibration methods for direct-beam-measuring instruments [e.g., sunphotometer and Multifilter Rotating Shadowband Radiometer (MFRSR)] is Langley regression (Stephens 1994), which can be applied only to cloud-screened data.
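As a minimal sketch of what Langley regression does, assume Beer's law in the form ln V = ln V0 − m·τ, where V is the measured voltage, m the air mass, τ the total optical depth, and V0 the extraterrestrial (calibration) voltage. The function name and the synthetic values below are illustrative assumptions, not the UVMRP implementation.

```python
import numpy as np

def langley_regression(air_mass, voltage):
    """Fit ln(V) = ln(V0) - tau * m by least squares.

    Returns (ln_v0, tau): the intercept at zero air mass and the
    total optical depth implied by the slope.
    """
    m = np.asarray(air_mass, dtype=float)
    ln_v = np.log(np.asarray(voltage, dtype=float))
    slope, intercept = np.polyfit(m, ln_v, 1)
    return intercept, -slope

# Synthetic clear-sky morning (hypothetical values): V0 = 1200 mV, tau = 0.35
m = np.linspace(1.5, 3.0, 16)
v = 1200.0 * np.exp(-0.35 * m)
ln_v0, tau = langley_regression(m, v)
```

The fit is meaningful only if τ is nearly constant over the fitted points, which is why cloud-contaminated points have to be removed first.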

The U.S. Department of Agriculture (USDA) UV-B Monitoring and Research Program (UVMRP) has been observing solar UV radiation at 37 sites across the United States for over a decade. The primary instrument it uses is the ultraviolet version of the MFRSR (UV-MFRSR). The UV-MFRSR measures the direct normal, diffuse horizontal, and total horizontal solar radiation at seven UV channels, which have 2-nm full width at half maximum (FWHM) bandpasses and are nominally centered at wavelengths of 300, 305, 311, 317, 325, 332, and 368 nm (Slusser et al. 2000).

Chen et al. (2013) reviewed ground-based cloud screening methods published over the last two decades. There are four common types of cloud screening. In the first type, cloud screening is performed on uncalibrated voltage data—the standard Langley analysis or its variants fall into this category. In the second type, cloud screening is performed on calibrated irradiance data. In the third type, cloud screening is performed on derived AOD data. In the fourth type, cloud screening is performed using collocated auxiliary equipment/data, such as a Total Sky Imager (TSI). Here we briefly review some common algorithms.

If the cloud screening method is based on ratios of measured voltages or irradiances rather than their absolute quantities, then this cloud screening method can apply to both the first and second types because the uncalibrated voltage data and the calibrated irradiance data have a constant ratio. Most type 2 cloud screening algorithms are not designed for the purpose of calibration. It is desirable to have a cloud screening algorithm suitable for both applications.

The current cloud screening module used by the UVMRP is the Langley Analyzer (LA), which was developed by the Atmospheric Solar Radiation Group at the University at Albany, State University of New York, Albany, New York. This technique, which uses the methodology described in Harrison and Michalsky (1994), is a two-step filter on a series of log-transformed voltage (ln*V*) and airmass points. In the first step, points are sorted by air mass in ascending order, and segments beginning at the point where ln*V* starts to increase and ending at the point where ln*V* starts to decrease are classified as cloudy (Chen et al. 2013). In the second step, the concavity test, the points at which the slope of ln*V* exceeds a given threshold are classified as cloudy. One of the LA variants, developed by Lee et al. (2010), uses the maximum value composite (MVC) technique to select the largest voltage values in narrow airmass intervals. Those composite values are considered to be close to voltages measured under clear-sky conditions.
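The MVC idea can be illustrated with a short sketch: group points into narrow airmass bins and keep the largest voltage in each. The bin width and function name are assumptions made for illustration; Lee et al. (2010) should be consulted for the actual interval choice.

```python
import numpy as np

def maximum_value_composite(air_mass, voltage, bin_width=0.05):
    """Return the indices of the largest voltage in each airmass bin.

    bin_width is a hypothetical interval size, not a published value.
    """
    m = np.asarray(air_mass, dtype=float)
    v = np.asarray(voltage, dtype=float)
    bins = np.floor(m / bin_width).astype(int)
    keep = []
    for b in np.unique(bins):
        idx = np.flatnonzero(bins == b)      # points falling in this bin
        keep.append(idx[np.argmax(v[idx])])  # brightest point in the bin
    return np.array(keep)

# Five points in two 0.1-wide airmass bins (made-up numbers)
mvc_idx = maximum_value_composite(
    [1.51, 1.52, 1.53, 1.61, 1.62], [10.0, 12.0, 11.0, 9.0, 8.0], bin_width=0.1
)
```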

Long and Ackerman (2000) developed a cloud screening algorithm for irradiance data from pyranometers. The downwelling irradiances—the total and the diffuse components—are normalized by air mass. Applying thresholds on these two airmass-normalized components eliminates scenes containing optically thick clouds or haze, as well as high thin clouds, such as cirrus. Applying thresholds on irradiance variation over time and in comparison to a normalized diffuse ratio may eliminate some other cloudy points. When new cloudy points are detected in these processes, threshold values mentioned above will be adjusted and a new iteration of screening is triggered. Otherwise, the surviving points are the final clear-sky points.

Alexandrov et al. (2004) proposed an automated cloud screening algorithm for the time series of AOD derived from a single MFRSR channel. The method renormalizes the AOD time series by removing the local AOD average and calculates the corresponding inhomogeneity index (ε′) for each point. The statistical distribution of ε′ over a long period will show two distinctive maxima that correspond to the aerosol and cloud modes. Applying a threshold between the two maxima, the method separates the clear-sky points and cloudy points.

Smirnov et al. (2000) developed an automatic cloud screening algorithm on the time series of AOD derived from calibrated sunphotometers of the Aerosol Robotic Network (AERONET). Essentially, this algorithm requires that clear-sky points 1) are within a certain AOD range, 2) have stable and smooth AOD in nearby points, and 3) do not exceed a certain standard deviation of AODs from the daily AOD average.

The limitations of the existing cloud screening algorithms include

- being able to perform only on the time series of optical depth (e.g., the index of inhomogeneity algorithm; Alexandrov et al. 2004), which requires accurate calibration that may not be available at the current step;
- requiring consideration of the variation of nearby points or points in a relatively short window (e.g., in the index of inhomogeneity algorithm, the measurements taken within 5 min (or 17 points) are used to determine the cloudiness of the center point); or
- missing some clear intervals due to their short duration or contamination by slight noise. For example, in LA, the slope between nearby points cannot exceed a certain threshold.

The following sections describe the new total optical depth (TOD) pairing cloud screening algorithm, show its advantages over other methods (especially the LA method), and demonstrate its ability to obtain more Langley offsets from the cloud-screened data.

## 2. The algorithm description

### a. The basis

*t*;

*t* at one channel;

*t*.

### b. The transformed coordinate system

Instead of the original measurement pair (*t* directly because the slant paths have already been corrected or normalized by moving the

Figure 1 illustrates the transformed coordinate system. The mathematical basis for the whole figure is given by Eq. (2). In the normal Langley calibration method,

The coordinate transformed measurements are sorted by

### c. The outline of the algorithm

At the beginning, the algorithm treats every measurement as an indeterminate point, which means the cloudiness of this measurement has not been determined. The methodology of the procedure is to start by comparing the TOD of this point with other points to determine whether it is cloudy; and if it is not definitely cloudy, then to use it in determining whether other points are cloudy. The point whose cloudiness is to be determined is called the target point. All indeterminate points within a certain time range surrounding but excluding the target point constitute the local window for that target point.

The procedure pairs an indeterminate point in the local window with all other indeterminate points in the same local window. For a pair of such points, a particular type of weighted average TOD is calculated. If the two points of a pair represent nearly clear-sky points, then the TOD difference between the target and the weighted average of such a pair is an indicator of the target point’s cloudiness. The sections below mathematically explain why the algorithm uses the weighted TOD average rather than the standard average in comparing with the target’s TOD. Then a description is given of how to calculate the TOD difference between the target and a paired points’ weighted averages without knowing the value of

Figure 2 gives an example of a target (the purple pentagon) and one pair of points in its local window (green triangles).
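The following derivation is a hedged sketch of why a weighted average can be compared with the target without calibration. It assumes Beer's law and introduces its own symbols: $V_0$ for the unknown extraterrestrial voltage, $m$ for air mass, $\tau$ for TOD, and subscripts T, A, and B for the target and the two paired points. The particular weights are an assumption chosen so that $V_0$ cancels, and they may differ in form from those in the original equations.

$$
\ln V_k = \ln V_0 - m_k \tau_k
\quad\Longrightarrow\quad
\tau_k = \frac{\ln V_0 - \ln V_k}{m_k}
$$

$$
\Delta\tau = \tau_T - \left(w_A \tau_A + w_B \tau_B\right)
= \ln V_0 \left(\frac{1}{m_T} - \frac{w_A}{m_A} - \frac{w_B}{m_B}\right)
+ \frac{w_A \ln V_A}{m_A} + \frac{w_B \ln V_B}{m_B} - \frac{\ln V_T}{m_T},
\qquad w_A + w_B = 1
$$

$$
w_A = \frac{1/m_T - 1/m_B}{1/m_A - 1/m_B},
\quad w_B = 1 - w_A
\quad\Longrightarrow\quad
\Delta\tau = \frac{w_A \ln V_A}{m_A} + \frac{w_B \ln V_B}{m_B} - \frac{\ln V_T}{m_T}
$$

With the standard average ($w_A = w_B = 1/2$) the term multiplying $\ln V_0$ vanishes only when $1/m_T$ happens to be the mean of $1/m_A$ and $1/m_B$, so the comparison would otherwise depend on the unknown calibration.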

In practice, comparing the target’s TOD to the weighted average of only one pair may not be conclusive because the pair may include cloudy points. Therefore, an assumption is made that there are many pairs of clear-sky points in the local window and that the differences between the target’s TOD and those pairs’ weighted TOD averages cluster at one value. We then calculate the TOD differences between the target and the weighted averages of all pairs in the local window. Under this assumption, outliers of the TOD differences are removed and the mean TOD difference of the remaining pairs is taken as a more robust indicator of the target’s cloudiness. If the mean TOD difference is positive and greater than a reasonable threshold, then the target is definitely cloudy.

Applying the screening routine described above to every indeterminate point completes one iteration; the order in which the points are examined does not matter. Cloudy points retain their cloudy status and are excluded from future operations. A new iteration is triggered if any new cloudy points were found in the last iteration. If no new cloudy points were found in the last iteration, then the cloud screening finishes. The surviving indeterminate points are considered clear points.
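The iteration can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the coordinates x = 1/m and y = ln(V)/m, the interpolation weights, the median-absolute-deviation outlier rule, the window half-width, and the threshold are all hypothetical choices made to keep the example self-contained.

```python
import numpy as np
from itertools import combinations

def pair_tod_differences(x, y, target, others):
    """TOD(target) minus each pair's weighted-average TOD.

    Assumes x = 1/m and y = ln(V)/m; the weights interpolate the pair
    to the target's x, so the unknown calibration constant cancels.
    """
    diffs = []
    for a, b in combinations(others, 2):
        if x[a] == x[b]:
            continue  # weights undefined for equal air mass
        w_a = (x[target] - x[b]) / (x[a] - x[b])
        diffs.append(w_a * y[a] + (1.0 - w_a) * y[b] - y[target])
    return np.array(diffs)

def robust_mean(values, n_mad=3.0):
    """Mean after dropping outliers (MAD rule; a stand-in assumption)."""
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    if mad == 0.0:
        return float(med)
    keep = np.abs(values - med) <= n_mad * 1.4826 * mad
    return float(values[keep].mean())

def screen_clouds(time, air_mass, voltage, half_window=30.0, threshold=0.02):
    """Return a boolean mask that is True for points surviving as clear."""
    t = np.asarray(time, dtype=float)
    m = np.asarray(air_mass, dtype=float)
    x = 1.0 / m
    y = np.log(np.asarray(voltage, dtype=float)) / m
    cloudy = np.zeros(t.size, dtype=bool)
    while True:
        new_cloudy = []
        for k in np.flatnonzero(~cloudy):
            # Local window: indeterminate points near the target, target excluded
            in_window = (~cloudy) & (np.abs(t - t[k]) <= half_window)
            in_window[k] = False
            others = np.flatnonzero(in_window)
            if others.size < 2:
                continue
            diffs = pair_tod_differences(x, y, k, others)
            if diffs.size and robust_mean(diffs) > threshold:
                new_cloudy.append(k)
        if not new_cloudy:
            return ~cloudy  # no new cloudy points: screening finishes
        # Flags are applied after the full pass, so examining order is irrelevant
        cloudy[new_cloudy] = True

# Synthetic example: tau = 0.35 everywhere except two cloudy points (tau = 0.65)
t = np.arange(16) * 3.0                  # minutes
m = np.linspace(1.5, 3.0, 16)
tau = np.full(16, 0.35)
tau[[5, 6]] = 0.65
v = 1200.0 * np.exp(-tau * m)
clear = screen_clouds(t, m, v)
```

Because the weights extrapolate when the target lies outside the pair, pairs with nearly equal air mass can amplify noise; a production implementation would probably need to restrict or down-weight such pairs.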

Following the algorithm design, the points with higher TODs are more likely to be labeled cloudy in the early iterations, while the points with TODs close to the baseline are more likely to survive the screening as would be anticipated.

### d. Algorithm implementation

#### 1) Pairing and TOD differences

*k*th point (

The standard average is a special case of a linear combination. Fortunately, for our purpose—comparing the target’s TOD to a pair of local window points (A and B)—any linear combination of

#### 2) Outliers

In practice, examining the difference between the target’s TOD and a single pair’s average TOD in the window may not be conclusive because of the possibility of including cloudy points in the pair. Figure 3 shows an example of the histogram of *x* axis is *y* axis is the frequency (number of cases) of the target’s pairs’

#### 3) Threshold

Since cloudy points have higher total optical depth values than clear points, we can set a threshold value,

If the purpose of using this cloud screening algorithm is to identify the clearest points in a period and to obtain the Langley offset from those points, a lower value of

The flowchart of the whole cloud screening algorithm is presented in Fig. 5.

### e. Accuracy assessment

Although it has been assumed that there are adequate clear-sky points to obtain a cluster of AODs near the clear-sky value, it is useful to confirm that the points identified as clear actually are clear. As an example of how this can be achieved, the measured diffuse-to-direct ratio (DDR) can be compared to the modeled value for a clear sky. The UV-MFRSR provides the direct normal and diffuse horizontal irradiance measurements simultaneously. Before calibration, these irradiances are recorded as voltages in units of millivolts. However, the DDR of the calibrated irradiances and the DDR of the uncalibrated voltages are the same because the same responsivity is used to convert both voltage measurements to their respective irradiance values. If the AOD, the Rayleigh optical depth (ROD), the ozone optical depth, solar geometry, and site location are known, then a radiative transfer routine such as the Moderate Resolution Atmospheric Transmission (MODTRAN) model for UV and visible channels or the tropospheric UV model (Madronich 1993) for UV channels is capable of simulating DDR. Solar geometry and site location are measured and known properties. ROD is a function of ground pressure and site location (Bodhaine et al. 1999) and is relatively stable over a day. Ozone optical depths are negligible at the 368-nm channel. For channels where ozone absorption is important, total column ozone data are available from satellites. By elimination, AOD is the main unknown that affects the DDR simulation. Therefore, AODs can be estimated by matching the MODTRAN-modeled and UV-MFRSR-measured DDRs on points indicated as clear. Using the retrieved mean AOD value, the direct normal irradiances at the clear-sky points are simulated by MODTRAN. The Langley regression can be applied to the measurements indicated as clear and the

## 3. Results

The HI02 site at Mauna Loa, Hawaii, is a climatology site operated by the USDA UV-B Monitoring and Research Program (UVMRP). Because of its high altitude (3409 m) and great distance from any continents, its atmosphere is less affected by varying aerosol loading and is relatively stable, which makes HI02 a good site for applying Langley regression. The original Langley analysis at HI02 gives more

The stable atmosphere at the HI02 site is optimal for Langley calibration; however, for many sites where air pollution, clouds, and the combination of both are frequent, the Langley method is not as reliable as a means of calibration. In contrast, the FL02 site at Homestead, Florida, is characterized by frequent and fast-moving stratocumulus clouds. The internal cloud screening module of the original Langley analysis often misses short periods of clear points. As a result, there may not be sufficient clear points to calculate

Figure 9 shows an example of the cloud screening performance of the original Langley analysis (top) and the new method (bottom) at the FL02 site on 26 September 2013. The point sets for the two methods are the same: morning points with an airmass range between 1.5 and 3.0. For the Langley analysis 12 points survived from its cloud screening procedure and 5 clear points (3 in transition between clear and cloudy and 2 in short clear periods) are missed in comparison to the new method. After the removal of regression outliers, the Langley analysis allows 9 points, which is less than its minimum requirement (12 points), and therefore the original Langley analysis does not generate

To demonstrate the performance of ^{−5}, while the MSE value for a day inappropriate for Langley analysis may be two to three orders of magnitude higher. This suggests that the clear-sky points have nearly equal AOD values. It is also evident that cloudy points determined by the new cloud screening algorithm have much lower irradiances than clear-sky points (Fig. 10).

Table 1 lists the statistics of

The statistics of

## 4. Summary

A new cloud screening algorithm for narrowband direct-beam measurements is developed. The mathematical basis of this algorithm is Beer’s law. Measurements are reorganized to a converted coordinate system that emphasizes the relative magnitude of measurements’ total optical depth (TOD). Instead of examining the fluctuation of a target measurement with nearby points, this algorithm calculates the TOD difference between a target and pairs of all indeterminate points and considers the target a cloudy point if the TOD difference exceeds a certain threshold value. All points are of indeterminate status at the beginning of cloud screening. Each point is examined with all other indeterminate points. If new cloudy points are found, then a new iteration of examination is triggered. The cloud screening finishes when no new cloudy points are found in the last iteration. The surviving indeterminate points are considered clear points. The new cloud screening method is verified by comparing the Langley voltage offsets (

This work is supported by the USDA UV-B Monitoring and Research Program under USDA NIFA projects 2012-34263-19736 and 2013-34263-20931.

# APPENDIX A

## TOD Comparison (Direct)

*k* lies within the target’s local window. Using Eqs. (4) and (6) on point

*k*, we get

In Eq. (A2),

Using a coarsely estimated

# APPENDIX B

## TOD Comparison (Standard Average)

# APPENDIX C

## Important MODTRAN Parameters

To simulate the 368-nm-channel direct normal and the diffuse horizontal solar irradiance at the FL02 site on 26 September 2013, the following MODTRAN parameters are used. The parameters are for MODTRAN, version 5.3.

### Card 1

MODEL = 1 Tropical atmosphere

ITYPE = 3 Vertical or slant path to ground

IEMSCT = 4 Execute in spectral solar radiance mode with no thermal scatter

IMULT = 1 Execute with multiple scattering

### Card 1A

DIS = T Use the Discrete Ordinate Radiative Transfer (DISORT) multiple scattering algorithm

DISALB = T Calculate spectral albedo and diffuse transmittance

NSTR = 8 Number of streams to be used by DISORT

O3STR = ‘a0.2784’ Column ozone amount (atm-cm); data source: Earth Observing System (EOS) *Aura* Ozone Monitoring Instrument (OMI) daily level 3 global 0.25° gridded data (http://gdata1.sci.gsfc.nasa.gov/daac-bin/G3/gui.cgi?instance_id=omi)

LSUNFL = T Read user-specified top-of-atmosphere (TOA) solar irradiance data

LBMNAM = T Read the root name of the band model parameter data

### Card 1A1

DATA/SUN01SAO2010.dat Data provided by Chance and Kurucz (2010)

### Card 1A2

01_2009 The root name of 1.0 band model

### Card 2

IHAZE = 1 Rural extinction

VIS = −0.103 Negative of the 550-nm vertical aerosol optical depth

GNDALT = 0.000 Altitude of surface relative to sea level (km)

### Card 3

H1ALT = 0.000 Initial altitude (km)

OBSZEN = 180.000 Initial zenith angle (°) as measured from H1ALT

### Card 3A1

IPARM = 2 Method of specifying solar geometry on Card 3A2

IPH = 2 Select Mie-generated database of aerosol phase functions

IDAY = 269 Day of year (26 September 2013)

ISOURC = 0 Extraterrestrial source is the sun

### Card 3A2

PARM1 Azimuth angle, which varies at each observation

PARM2 Solar zenith angle, which varies at each observation

### Card 4

V1 = 26789 Initial frequency in wavenumber (cm^{−1})

V2 = 27526 Final frequency

DV = 1 Frequency increment

FWHM = 2 Slit function full width at half maximum

FLAGS(7:7) = F Write a spectral flux (.flx) file

MLFLX = 1 Number of atmospheric levels for which spectral fluxes are output
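As a quick check on the Card 4 spectral range, the wavenumber limits can be converted to wavelengths with λ (nm) = 10^7 / ν (cm^{−1}). The helper name below is hypothetical; the V1 and V2 values are those listed above.

```python
def wavenumber_to_wavelength_nm(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm

v1, v2 = 26789.0, 27526.0                      # Card 4 limits (cm^-1)
longest_nm = wavenumber_to_wavelength_nm(v1)   # low wavenumber, long wavelength
shortest_nm = wavenumber_to_wavelength_nm(v2)  # high wavenumber, short wavelength
print(f"{shortest_nm:.2f} nm to {longest_nm:.2f} nm")
```

The resulting range is roughly 363.3 to 373.3 nm, about ±5 nm around the nominal 368-nm channel center.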

## REFERENCES

Alexandrov, M. D., A. Marshak, B. Cairns, A. A. Lacis, and B. E. Carlson, 2004: Automated cloud screening algorithm for MFRSR data. *Geophys. Res. Lett.*, **31**, L04118, doi:10.1029/2003GL019105.

Bodhaine, B. A., N. B. Wood, E. G. Dutton, and J. R. Slusser, 1999: On Rayleigh optical depth calculations. *J. Atmos. Oceanic Technol.*, **16**, 1854–1861, doi:10.1175/1520-0426(1999)016<1854:ORODC>2.0.CO;2.

Chance, K., and R. L. Kurucz, 2010: An improved high-resolution solar reference spectrum for earth’s atmosphere measurements in the ultraviolet, visible, and near infrared. *J. Quant. Spectrosc. Radiat. Transfer*, **111**, 1289–1295, doi:10.1016/j.jqsrt.2010.01.036.

Chen, M., J. Davis, H. Tang, C. Ownby, and W. Gao, 2013: The calibration methods for Multi-Filter Rotating Shadowband Radiometer: A review. *Front. Earth Sci.*, **7**, 257–270, doi:10.1007/s11707-013-0368-9.

Harrison, L., and J. Michalsky, 1994: Objective algorithms for the retrieval of optical depths from ground-based measurements. *Appl. Opt.*, **33**, 5126–5132, doi:10.1364/AO.33.005126.

Krotkov, N., and Coauthors, 2005: Aerosol ultraviolet absorption experiment (2002 to 2004), part 1: Ultraviolet multifilter rotating shadowband radiometer calibration and intercomparison with CIMEL sunphotometers. *Opt. Eng.*, **44**, 041004, doi:10.1117/1.1886818.

Lee, K. H., and Coauthors, 2010: Aerosol optical depth measurements in eastern China and a new calibration method. *J. Geophys. Res.*, **115**, D00K11, doi:10.1029/2009JD012812.

Long, C. N., and T. P. Ackerman, 2000: Identification of clear skies from broadband pyranometer measurements and calculation of downwelling shortwave cloud effects. *J. Geophys. Res.*, **105**, 15 609–15 626, doi:10.1029/2000JD900077.

Madronich, S., 1993: The atmosphere and UVB radiation at ground level. *Environmental UV Photobiology*, A. R. Young et al., Eds., Plenum Press, 1–39.

Slusser, J., J. Gibson, D. Bigelow, D. Kolinski, P. Disterhoft, K. Lantz, and A. Beaubien, 2000: Langley method of calibrating UV filter radiometers. *J. Geophys. Res.*, **105**, 4841–4849, doi:10.1029/1999JD900451.

Smirnov, A., B. N. Holben, T. F. Eck, O. Dubovik, and I. Slutsker, 2000: Cloud screening and quality control algorithms for the AERONET database. *Remote Sens. Environ.*, **73**, 337–349, doi:10.1016/S0034-4257(00)00109-7.

Stephens, G. L., 1994: Passive sensing: Extinction and scattering. *Remote Sensing of the Lower Atmosphere: An Introduction*, Oxford University Press, 261–327.