1. Introduction
Observing and characterizing the spatial and temporal distribution of clouds, together with their microphysical properties, is fundamental to eventually ascertaining the net effect of clouds on climate. Whereas the horizontal distribution of clouds can be accurately estimated by satellite observations over most regions (Rossow and Garder 1993), inferring the vertical distribution of clouds from a satellite is extremely difficult (Baum et al. 1995). With existing technology, collocated, low power radar/lidar pairs appear to hold the most promise for accurate, continuous mapping of the vertical distribution of clouds (Spinhirne 1993; Kropfli et al. 1995; Uttal et al. 1995). Low power is necessary for both types of instruments to make them safe and reliable during continuous 24-h-a-day operation over many years.
Active remote sensing with a low power radar and lidar makes the detection of significant power returns from clouds difficult, as these signals are often small compared to the instrument and atmospheric noise. In this work we describe the extension of an algorithm that we first applied to the power returns from The Pennsylvania State University 94-GHz cloud radar (Clothiaux et al. 1995) to the micropulse lidar developed by Spinhirne (1993). The main purpose of the current algorithm is to extend the identification of significant cloud power returns from the cloud base to all range resolution volumes in each vertical profile of power returns without exact knowledge of the lidar system specifications. As opposed to lidar cloud detection algorithms that use changes in sign of the slope of the lidar backscatter powers versus altitude (e.g., Pal et al. 1992), we construct lidar clear-sky power return profiles from the data to test profiles for the presence of cloud.
The micropulse lidar that we utilized in this study was installed at the Southern Great Plains (SGP) Central Facility (36.61°N, 97.58°W) of the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) Program (Stokes and Schwartz 1994) in 1993; it had a pulse energy of 4 μJ (Spinhirne et al. 1995). After two years of continuous operation, optical contamination had degraded the effective pulse energy to less than 1 μJ. We tested our algorithm with data from this low-performance period in 1995. After this period an instrument with a pulse energy of 10 μJ was introduced at the SGP central facility. Importantly, instruments of this second type are currently deployed at the DOE ARM sites, and their signal-to-noise ratio is at least an order of magnitude greater than that of the data used here. Therefore, adequate performance of our algorithm on the tested data should translate to a high level of confidence in the detection of cloud in the lidar signals that are now being generated over extended periods at the DOE ARM sites.
For lower tropospheric clouds that completely extinguish the lidar beam, features of this algorithm, or any other lidar-based cloud detection algorithm, may not be important. However, for optically thin clouds, algorithms of this sort can identify the height of the cloud top. Knowledge of cloud-top height is useful in the validation of satellite-derived cloud-top heights obtained from the same clouds, as well as for radiative transfer studies that attempt to ascertain the effects of clouds on the energy budget of the earth.
To validate algorithm performance in the lower troposphere, we compare the cloud-base heights that are retrieved in this analysis both with the cloud-base heights produced by a more traditional algorithm developed by Scott and Spinhirne (V. S. Scott 1996, personal communication) that is applied to the same data and with the cloud-base heights generated by the commercially available algorithm that is applied to Belfort laser ceilometer data (TransTechnology Corporation 1993, Technical and Operator’s Manual for the Model 7013C Ceilometer). Since we do not have a validation dataset at cirrus altitudes, we perform a subjective validation of the algorithm performance by comparing the algorithm-derived cloud detections with the original photoelectron count data.
2. Methods
To derive estimates of both the Rayleigh backscattered power Pr,ij and the noise ΔPd,ij generated by the detection process, we start with the observation that the power Ps,i originating from the sun is a smoothly varying function of time during cloudless daylight hours. Consequently, the least squares error of a linear least squares fit to the powers Ps,i over a short period of time (i.e., 10–15 min) appears to be a useful predictor for the presence of cloud in the field of view of the lidar receiver, with small values of the error indicating clear sky. Inspecting the least squares errors over several clear-sky days, we arrived at a single threshold value for this error that separates clear- and cloudy-sky periods, and we applied this single threshold to the entire dataset.
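The clear-sky test on the solar background can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of the root-mean-square residual as the "least squares error," the processing in nonoverlapping blocks, and the threshold value are all assumptions made for the example.

```python
from typing import List, Sequence


def linear_fit_rms_error(times: Sequence[float], powers: Sequence[float]) -> float:
    """Return the RMS residual of a straight-line least squares fit."""
    n = len(times)
    if n < 3 or n != len(powers):
        raise ValueError("need at least three matching (time, power) samples")
    t_mean = sum(times) / n
    p_mean = sum(powers) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    if s_tt == 0.0:
        raise ValueError("sample times must not all be identical")
    s_tp = sum((t - t_mean) * (p - p_mean) for t, p in zip(times, powers))
    slope = s_tp / s_tt
    intercept = p_mean - slope * t_mean
    squared = sum((p - (intercept + slope * t)) ** 2 for t, p in zip(times, powers))
    return (squared / n) ** 0.5


def flag_clear_sky(times, powers, window: int, max_rms_error: float) -> List[bool]:
    """Label profiles clear (True) block by block.

    Assumption: blocks are nonoverlapping and hold `window` profiles each;
    profiles in a short trailing block are left flagged as not clear.
    """
    flags = [False] * len(times)
    for start in range(0, len(times) - window + 1, window):
        stop = start + window
        error = linear_fit_rms_error(times[start:stop], powers[start:stop])
        if error < max_rms_error:
            flags[start:stop] = [True] * window
    return flags
```

In practice the window length would be chosen to span the 10–15 min mentioned above, and the threshold would be tuned by inspecting known clear-sky days.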
Having identified all of the clear-sky profiles during the course of a day using the above test on Ps,i, we calculate the average and the standard deviation of the clear-sky power returns at each resolution volume using temporally nonoverlapping groups of contiguous clear-sky profiles. To ameliorate the effects of dew that may have formed on the instrument shelter window during the preceding night, we discard all of the clear-sky averages and standard deviations generated for each resolution volume that did not occur within approximately 2 h of local solar noon. (At times close to local solar noon, the background Ps,i and its contribution to ΔPd,ij are at a maximum.) We then set the power Pr,ij for the jth resolution volume to the average clear-sky profile power return whose value is a maximum at this height over all of the remaining average values for this day. Since Pr,ij is no longer strictly dependent upon the time of day, we omit the subscript i and write it as Pr,j. Similarly, we set ΔPd,ij for the jth resolution volume equal to the standard deviation whose value is also a maximum at this height. Again omitting i, we write the final estimate as ΔPd,j.
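The construction of the daily clear-sky reference described above can be sketched as follows. The data layout (each group as a pair of an offset from local solar noon in hours and a list of profiles), the function name, and the use of the population standard deviation are assumptions made for illustration; the text does not specify them.

```python
from statistics import mean, pstdev


def clear_sky_reference(groups, max_hours_from_noon=2.0):
    """Return (p_r, dp_d): per-volume maxima of the group means and group
    standard deviations, using only groups close to local solar noon.

    `groups` is a list of (hours_from_noon, profiles) pairs, where `profiles`
    is a list of equal-length power profiles (hypothetical layout).
    """
    group_means, group_stds = [], []
    for hours_from_noon, profiles in groups:
        if abs(hours_from_noon) > max_hours_from_noon:
            continue  # discard groups that may be affected by dew on the window
        columns = list(zip(*profiles))  # one tuple of powers per resolution volume
        group_means.append([mean(column) for column in columns])
        group_stds.append([pstdev(column) for column in columns])
    if not group_means:
        raise ValueError("no clear-sky groups within the noon window")
    p_r = [max(values) for values in zip(*group_means)]
    dp_d = [max(values) for values in zip(*group_stds)]
    return p_r, dp_d


groups = [
    (-3.0, [[100.0, 100.0], [100.0, 100.0]]),  # too far from noon, ignored
    (0.5, [[10.0, 4.0], [12.0, 4.0]]),
    (1.5, [[9.0, 6.0], [9.0, 8.0]]),
]
p_r, dp_d = clear_sky_reference(groups)
print(p_r, dp_d)
```

Note that the maxima are taken independently at each resolution volume, so the reference profile may combine values from different groups.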
We use the clear-sky profile estimate Pr,j + ΔPd,j generated for each day to perform two tests for the significance of the power returns Pt′,ij during the same 24-h period. The motivation for the two tests is illustrated in Fig. 1, where we show a schematic picture of what the power Pt′,ij might look like for returns from two cloud layers. For consecutive lidar range resolution volumes with no cloud contribution to the power returns, the ratio Pt′,ij/Pr,j is a constant if noise is neglected. This ratio has a value of one in region A, a local maximum greater than one in region B, a constant value less than one in region C, another local maximum in region D, and a constant minimum value in region E. If any cloud layer completely extinguishes the lidar beam, then the ratio goes to zero.
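The behavior of the ratio across regions A to E can be illustrated with a small synthetic profile. The numbers below are invented for the illustration (one resolution volume per region, noise neglected) and the function name is hypothetical.

```python
def power_ratio(p_total, p_clear):
    """Return the ratio of observed to clear-sky power at each resolution volume."""
    if len(p_total) != len(p_clear):
        raise ValueError("profiles must have the same number of resolution volumes")
    return [observed / clear for observed, clear in zip(p_total, p_clear)]


# Synthetic clear-sky profile and observed profile, one value per region A..E.
p_clear = [8.0, 4.0, 2.0, 1.0, 0.5]
p_total = [8.0, 12.0, 1.2, 1.5, 0.15]

for region, ratio in zip("ABCDE", power_ratio(p_total, p_clear)):
    print(region, round(ratio, 3))
```

In this example the ratio is one below the first cloud (A), peaks inside each cloud (B and D), and settles at successively lower constant values above each cloud (C and E) because of attenuation along the two-way path.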








Once we have “calibrated” the lidar system to the observed clear-sky data by adjusting the overlap factors, we estimate βp,i(rj) for each resolution volume with a significant power return Pt′,ij. We start with j = 1 and iteratively work upward to j = Ng, substituting Pt′,ij for Ptheory,ij in (6) and solving for βp,i(rj). If the power return due to a particular resolution volume is not significant, we assign a value of zero to βp,i(rj).
Since the assignment of a cloud-base height to the lidar signal profile is to an extent arbitrary, we report three heights that are computed from the values of βp,i(rj) using a scheme similar to Pal et al. (1992). The first height rb is the smallest range rj for which βp,i(rj) > 0. The second height rcb is our estimate of cloud-base height, that is, the smallest range rj for which βp,i(rj) > βcloud, where we set βcloud to 3.3 × 10⁻⁶ m⁻¹ sr⁻¹, 6.7 × 10⁻⁶ m⁻¹ sr⁻¹, or 10.0 × 10⁻⁶ m⁻¹ sr⁻¹ below 5 km and to 0 m⁻¹ sr⁻¹ above 5 km. Assuming a typical aerosol extinction to backscatter ratio to be on the order of 30 (Reagan et al. 1984), these thresholds on βp,i(rj) correspond to total aerosol extinction optical depths τcloud of 0.03, 0.06, and 0.09 over a depth of 300 m. (During the experimental period, the total column aerosol optical depth inferred from sunphotometer measurements never exceeded 0.20.) The third height rp that we report is the range rj for which βp,i(rj) is maximum.
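The three reported heights, and the conversion from a backscatter threshold to an optical depth, can be sketched as follows. The function names are hypothetical, the range is treated as height above the surface, and a resolution volume centered exactly at 5 km is assumed to take the upper threshold; the text does not specify that boundary case.

```python
def layer_optical_depth(beta, extinction_to_backscatter=30.0, depth_m=300.0):
    """Optical depth of a layer with uniform backscatter `beta` (m^-1 sr^-1)."""
    return beta * extinction_to_backscatter * depth_m


def cloud_heights(ranges_m, beta, beta_cloud, switch_height_m=5000.0):
    """Return (r_b, r_cb, r_p) for one profile, or None where undefined.

    r_b  : smallest range with backscatter > 0
    r_cb : smallest range with backscatter > threshold
           (beta_cloud below switch_height_m, zero at and above it)
    r_p  : range of the maximum backscatter
    """
    r_b = r_cb = r_p = None
    for r, b in zip(ranges_m, beta):
        threshold = beta_cloud if r < switch_height_m else 0.0
        if r_b is None and b > 0.0:
            r_b = r
        if r_cb is None and b > threshold:
            r_cb = r
    if r_b is not None:
        peak_index = max(range(len(beta)), key=lambda j: beta[j])
        r_p = ranges_m[peak_index]
    return r_b, r_cb, r_p


for beta_cloud in (3.3e-6, 6.7e-6, 10.0e-6):
    print(beta_cloud, round(layer_optical_depth(beta_cloud), 2))

ranges_m = [150.0, 450.0, 750.0, 1050.0]
beta = [0.0, 2.0e-6, 8.0e-6, 5.0e-6]
print(cloud_heights(ranges_m, beta, beta_cloud=6.7e-6))
```

For example, 3.3 × 10⁻⁶ m⁻¹ sr⁻¹ × 30 × 300 m ≈ 0.03, which is how the three thresholds map to the three values of τcloud.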
3. Results
We validated the performance of the algorithm by comparing the cloud-base heights it generated with the heights produced by a more traditional algorithm developed by Scott and Spinhirne (V. S. Scott 1996, personal communication) and with the cloud-base heights generated by the commercially available algorithm that comes with the Belfort laser ceilometer system (TransTechnology Corporation 1993, Technical and Operator’s Manual for the Model 7013C Ceilometer). The first steps in the Scott algorithm are to correct the micropulse lidar average power returns by subtracting the background, taken as the signal average from 20 to 30 km, and normalizing for the output energy return. (The output energy return is a reading from the system energy monitor.) The corrected signal is then divided by a calculated estimate of the received signal due to a combination of typical aerosol backscatter and molecular backscatter that includes an overlap function correction. The resulting ratio signal is subsequently searched for a cloud-base height. The search is based on empirically determined threshold functions.
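The preprocessing steps of the Scott algorithm, as described above, might look like the following sketch. It is not the Scott and Spinhirne code: the function name and argument layout are hypothetical, normalization is assumed to be a simple division by the energy monitor reading, and the modeled clear-air signal (aerosol plus molecular backscatter with overlap correction) is taken as a given input rather than computed.

```python
def scott_ratio_signal(ranges_km, raw_signal, energy_reading, modeled_clear_signal,
                       background_window_km=(20.0, 30.0)):
    """Return the ratio signal that is later searched for a cloud base.

    Steps: subtract the mean signal in the far-range background window,
    divide by the energy monitor reading, then divide by the modeled
    clear-air signal at each range.
    """
    low, high = background_window_km
    far_range = [s for r, s in zip(ranges_km, raw_signal) if low <= r <= high]
    if not far_range:
        raise ValueError("no samples fall inside the background window")
    if energy_reading <= 0.0:
        raise ValueError("energy monitor reading must be positive")
    background = sum(far_range) / len(far_range)
    return [
        (signal - background) / energy_reading / modeled
        for signal, modeled in zip(raw_signal, modeled_clear_signal)
    ]


ratio = scott_ratio_signal(
    ranges_km=[1.0, 2.0, 25.0],
    raw_signal=[110.0, 60.0, 10.0],
    energy_reading=2.0,
    modeled_clear_signal=[25.0, 25.0, 1.0],
)
print(ratio)
```

A ratio near one then indicates a return consistent with the modeled clear air, while larger values are candidates for the range-dependent threshold test.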
To generate the threshold functions, the cloud-base heights for a large sample of data were manually determined by visual inspection of data displays of the micropulse lidar power returns. For each true (i.e., by visual inspection) cloud base, the corresponding signal value at cloud base was taken as the threshold value for cloud detection. A scatterplot of threshold value versus range was generated from all of the data, and empirical function fits to the data in the scatterplot were derived. A number of empirical testing procedures were then employed to improve reliability. Visual pattern recognition is the most valid method for cloud-base location, and the algorithm results were almost as accurate as the manual pattern recognition from the displayed data.
The Belfort laser ceilometer system is a commercially developed system, and the source code for the two cloud-base height algorithms that are available with the system is not readily available. In the current comparisons we used the cloud-base heights produced by the threshold algorithm available with the system. The technical manual provides the following description of the threshold method.
The units for this threshold are counts, which are also referred to as ‘A/D counts’ in this document. Counts are the number of times, out of the 5120 laser-firings in a data acquisition cycle, that a range cell had a large enough signal-to-noise ratio to produce a ‘1’ at the output of the input comparator on the receiver printed circuit board. The threshold method of cloud detection can report a maximum of 3 different clouds. The algorithm for this method searches for consecutive range cells that form a peak above this threshold (‘MinCloudSig’). The 3 highest peaks above this threshold, if any, are reported as clouds.
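Because the Belfort source code is not available, the following is only a guess at how the quoted threshold method could work. It assumes that a "peak" is the largest count within a run of consecutive range cells above the threshold and that "highest" refers to the largest counts; the function name and the example counts are invented.

```python
def threshold_clouds(counts, min_cloud_sig, max_clouds=3):
    """Return the range-cell indices of up to `max_clouds` peaks, lowest cell first.

    A peak is the cell with the largest count inside each run of consecutive
    cells whose counts exceed `min_cloud_sig`.
    """
    peaks = []  # (peak count, cell index) for each run above the threshold
    best = None  # index of the largest count in the current run
    for index, count in enumerate(counts):
        if count > min_cloud_sig:
            if best is None or count > counts[best]:
                best = index
        elif best is not None:
            peaks.append((counts[best], best))
            best = None
    if best is not None:  # close a run that reaches the last cell
        peaks.append((counts[best], best))
    strongest = sorted(peaks, reverse=True)[:max_clouds]
    return sorted(index for _, index in strongest)


counts = [0, 5, 900, 1200, 800, 3, 0, 400, 2, 2000, 1, 350, 0]
print(threshold_clouds(counts, min_cloud_sig=300))
```

Here four runs exceed the threshold, and the weakest one (350 counts) is dropped because at most three clouds are reported.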
Comparing the descriptions of the three algorithms, several differences between the algorithms are apparent. The two micropulse lidar algorithms use an average profile of returns generated over the data acquisition cycle of 150 000 pulses as the starting point, whereas the Belfort laser ceilometer system threshold algorithm tests the return signal at each range gate as a result of each pulse for significance. The range-dependent thresholds for the Scott algorithm are derived from a regression analysis of the average power returns at cloud-base height derived from manual interpretation of a large sample of micropulse lidar data. These thresholds are then applied to new data in real time. The current algorithm, on the other hand, attempts to characterize the micropulse lidar clear-sky average signal profile on a daily basis. The current algorithm then compares each range gate signal of each profile with the corresponding range gate signal of the clear-sky profile. Significant departures of range gate signals from the clear-sky signal are identified as containing cloud contributions. Unlike the two micropulse lidar algorithms, the Belfort laser ceilometer system threshold algorithm identifies the three range gates with the highest number of significant signals above a minimum count (in a sample set of 5120 returns) as containing cloud contributions. The threshold for a significant signal level is set by the Belfort laser ceilometer system receiver input comparator.
For this study we used data collected by a micropulse lidar and a Belfort laser ceilometer that were located at the Southern Great Plains (SGP) Central Facility (36.61°N, 97.58°W) of the DOE ARM Program (Stokes and Schwartz 1994) from 22 September to 29 October 1995. Analyzing the 50 903 profiles collected over the experiment period, we found that the average cloud-base height difference between the Scott algorithm and the current algorithm (i.e., the Scott algorithm heights minus the current algorithm heights) dropped from 172 to −12 m as the value of τcloud increased from 0.03 to 0.09 (Table 3). Since τcloud was set to 0 above 5 km, the largest changes in the average height difference with increasing τcloud occurred below 5 km. The two algorithms produced identical cloud-base height estimates for 69% of the profiles that contained cloud signals, and the height estimates in 92% of the cloud-containing profiles were not different by more than a single gate. Furthermore, in over 99% of the profiles rb and rp bracketed the cloud-base heights reported by the Scott algorithm.
Relative to the 11 007 profiles when the Scott algorithm identified the presence of cloud, the numbers of false positives and false negatives below 5 km produced by the current algorithm amounted to 6.7% and 0.3%, respectively, for τcloud = 0.03 (Table 3). For τcloud = 0.09, the number of false positives dropped to 2.1%, while the number of false negatives increased to 9.5%. These changes, as well as the drop in the average cloud-base height difference, with increasing τcloud are not surprising, as increasing τcloud both reduces the number of profiles that are labeled as containing cloud and can only lead to an increase in the cloud-base height estimates. Above 5 km the numbers of false positives and false negatives were 11.4% and 0.5%, respectively. These differences above 5 km were primarily due to slightly different sensitivities of the two algorithms to weak cirrus signals.
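The comparison statistics reported here can be computed along the following lines. This is a sketch under stated assumptions: the function name and the dictionary keys are hypothetical, a missing detection is represented by None, percentages are taken relative to the number of reference detections as in the text, and the split into heights below and above 5 km is omitted for brevity.

```python
def compare_cloud_bases(reference, candidate):
    """Compare per-profile cloud-base heights (in m) from two algorithms.

    `reference` and `candidate` hold one height per profile, or None when
    no cloud was reported. Height differences are reference minus candidate.
    """
    if len(reference) != len(candidate):
        raise ValueError("both algorithms must cover the same profiles")
    n_reference = sum(height is not None for height in reference)
    if n_reference == 0:
        raise ValueError("the reference algorithm reported no cloud")
    pairs = list(zip(reference, candidate))
    false_pos = sum(r is None and c is not None for r, c in pairs)
    false_neg = sum(r is not None and c is None for r, c in pairs)
    diffs = [r - c for r, c in pairs if r is not None and c is not None]
    return {
        "false_positive_pct": 100.0 * false_pos / n_reference,
        "false_negative_pct": 100.0 * false_neg / n_reference,
        "mean_height_difference_m": sum(diffs) / len(diffs) if diffs else None,
        "identical_fraction": sum(d == 0 for d in diffs) / len(diffs) if diffs else None,
    }


scott = [1000.0, None, 2000.0, 3000.0, None]
current = [1000.0, 500.0, 1700.0, None, None]
print(compare_cloud_bases(scott, current))
```

With this convention a positive mean height difference means that the reference algorithm places cloud base higher than the candidate algorithm.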
Comparing the current algorithm results to the cloud-base heights derived from the algorithm that is applied to the Belfort laser ceilometer data, we find much the same trends that we obtained in the comparison with the Scott algorithm (Table 3). As τcloud increases from 0.03 to 0.09, the average cloud-base height difference between the two algorithms (i.e., the Belfort laser ceilometer system heights minus the current algorithm heights) drops from −82 to −240 m and the number of false negative cloud detections below 5 km of the current algorithm relative to the Belfort laser ceilometer system increases from 14% to 25%. (Note that 672 of the profiles contributing to these percentages occurred during one nighttime period when the micropulse lidar output signal was extremely low.) We attribute the large number of false positives above 5 km (approximately 23% relative to the number of Belfort system cloud detections) to high-level clouds that go undetected by the Belfort laser ceilometer system. Relative to the total sample size of 50 903 profiles, the results in Table 3 indicate that 1) the two instruments agree on the presence or absence of a cloud in approximately 90.5% of the profiles, 2) the Belfort ceilometer does not detect high-level clouds in 4.9% of the cases, 3) in 5.5% of the profiles the instruments disagree on the presence or absence of cloud in the lowest 3.5 km, and 4) approximately 1.0% of the profiles are from periods when the Belfort ceilometer, but not the micropulse lidar, detected high-level clouds or the micropulse lidar, but not the Belfort laser ceilometer system, detected low-level clouds. The comparison of the Scott algorithm cloud-base heights to the cloud-base heights from the Belfort laser ceilometer system led to similar results (Table 3).
Based on the results illustrated in Table 3, we conclude that the optimal value of τcloud in the pool of three values is 0.06. For this value of τcloud the magnitude of the average cloud-base height difference below 5 km between the Scott algorithm and the current algorithm was 20 m, while the magnitude of the height difference above 5 km between the two algorithms was not substantially different from the results for τcloud = 0.09. As Fig. 2a illustrates, the difference in height of 20 m was the result of a high percentage of identical cloud-base heights, together with nearly equal numbers of −300- and 300-m height differences, below 5 km. Above 5 km the current algorithm rarely reported a cloud-base height above the height generated by the Scott algorithm, leading to a cloud-base height that was 315 m lower on average. For τcloud = 0.06 the current algorithm and the Scott algorithm performed nearly identically in comparison to the Belfort laser ceilometer system cloud-base heights (Fig. 2b). On average the cloud-base heights produced by these two algorithms were approximately 170 m higher than the cloud-base heights from the Belfort laser ceilometer system. Below 5 km the micropulse lidar cloud-base heights were approximately 195 m higher than the Belfort laser ceilometer heights, while above 5 km the micropulse lidar heights were generally lower than the Belfort laser ceilometer system heights by 125 to 150 m. All of the height differences reported above are within the minimal uncertainty of approximately 150 to 300 m, which is attributable to the 300-m resolution of the micropulse lidar.
One possible source of the difference in the cloud-base heights below 5 km reported by the algorithms applied to the micropulse lidar data and the Belfort laser ceilometer system data is the substantially larger micropulse lidar resolution volume of 300 m as compared to the Belfort laser ceilometer resolution volume of 7.62 m. In the current micropulse lidar algorithms we assumed the cloud-base height to be at the height of the center of the micropulse lidar resolution volume. This assumption will not lead to a bias if the actual cloud-base heights occur randomly across the full vertical extent of the resolution volume that is identified as containing cloud base. However, if the resolution volume must fall completely within the cloud for a significant cloud detection to occur, then the cloud-base height will be high by at least 150 m if referenced to the center of the range resolution volume. Referencing the micropulse lidar cloud-base heights to the bottom of the resolution volume, the Belfort laser ceilometer cloud-base heights would be only approximately 20 m lower than the cloud-base heights reported by the algorithms applied to the micropulse lidar data. However, this line of reasoning does not correct the discrepancies between the two instrument systems above 5 km.
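The effect of the reference point within the resolution volume is simple arithmetic, shown here as a short sketch. The function name is hypothetical; the 171-m average offset is the value quoted in the conclusions, and shifting every height by half a resolution volume shifts the average offset by the same amount.

```python
def rereference_to_volume_bottom(center_height_m, volume_depth_m=300.0):
    """Move a height referenced to the volume center down to the volume bottom."""
    return center_height_m - 0.5 * volume_depth_m


# Average amount by which the micropulse lidar cloud base exceeds the
# Belfort ceilometer cloud base when referenced to the volume center.
mean_offset_center_m = 171.0
mean_offset_bottom_m = rereference_to_volume_bottom(mean_offset_center_m)
print(mean_offset_bottom_m)
```

The remaining offset of about 20 m is small compared with the 300-m resolution of the micropulse lidar.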
Validation of algorithm performance above 5 km was problematic because there was no “truth” dataset against which we could compare the retrieved heights. For example, data from a high-powered lidar were not available from the site during the experimental period. Comparison against the cloud masks generated by millimeter-wave radars operating nearby can be used for a consistency check of the micropulse lidar results in some cases, but millimeter-wave radars are unable to detect thin ice clouds near the tropopause. Therefore, for the current study we validated the micropulse lidar algorithm performance by visual inspection of the resulting cloud masks. We developed a single set of algorithm thresholds (i.e., Table 1) that eliminated regions in the mask that had no obvious corresponding significant returns in the original photoelectron count data. Importantly, the resulting masks contained few false negatives as well.
4. Conclusions
We described a lidar algorithm for detecting clouds that is based on the construction of lidar clear-sky power return profiles from the data. The algorithm produced cloud-base heights that were consistent with the values produced by the cloud-base height algorithm developed by Scott and Spinhirne (V. S. Scott 1996, personal communication). The cloud-base heights produced by the current algorithm applied to the micropulse lidar data were also in fair agreement with the heights generated by a Belfort laser ceilometer system. On average, the cloud-base heights derived from the micropulse lidar system were either 171 or 21 m higher than the heights reported by the Belfort laser ceilometer system, depending upon the choice for the cloud-base height reference within the micropulse lidar range resolution volume. A 30-m resolution micropulse lidar that is currently planned for the ARM SGP central facility should enable these differences to be more thoroughly investigated.
The Belfort laser ceilometer system generally detected more clouds in the boundary layer than the algorithms applied to the micropulse lidar data. This result is not surprising since the Belfort system has a higher temporal and spatial resolution. The micropulse lidar system, however, detected more cirrus. Although we were not able to rigorously validate algorithm performance on cirrus clouds, manual inspection of the resulting lidar cloud masks indicated that the current algorithm detected most of the returns that were clearly above the noise.
The micropulse lidar is intended to be one piece of a multi-instrument system for identifying the presence and location of cloud hydrometeors in the vertical atmospheric column above the DOE ARM SGP central facility. The micropulse lidar is perhaps the most effective instrument for identifying the presence of high, thin cirrus in otherwise clear-sky conditions. The final validation of the current algorithm work will occur in the context of an evaluation of the multi-instrument system, where consistency of results between instruments is potentially the most powerful tool for assessing algorithm performance.
Acknowledgments
This research was funded in part by the Environmental Sciences Division of the U.S. Department of Energy (under Grant DE-FG02-90ER61071) and Battelle Pacific Northwest Laboratory (Subcontract 091572-A-Q1). Connor Flynn at the Pacific Northwest National Laboratory of Battelle was tireless as he answered time and again our inquiries about the micropulse lidar and Belfort laser ceilometer system located at the Atmospheric Radiation Measurement Program Southern Great Plains site. The comments and suggestions of the three reviewers were extremely helpful.
REFERENCES
Baum, B. A., and Coauthors, 1995: Satellite remote sensing of multiple cloud layers. J. Atmos. Sci., 52, 4210–4230.
Clothiaux, E. E., M. A. Miller, B. A. Albrecht, T. P. Ackerman, J. Verlinde, D. M. Babb, R. M. Peters, and W. J. Syrett, 1995: An evaluation of a 94-GHz radar for remote sensing of cloud properties. J. Atmos. Oceanic Technol., 12, 201–229.
Kropfli, R. A., and Coauthors, 1995: Cloud physics studies with 8 mm wavelength radar. Atmos. Res., 35, 299–313.
Pal, S. R., W. Steinbrecht, and A. I. Carswell, 1992: Automated method for lidar determination of cloud-base height and vertical extent. Appl. Opt., 31, 1488–1494.
Reagan, J. A., M. V. Apte, T. V. Bruhns, and O. Youngbluth, 1984: Lidar and balloon-borne cascade impactor measurements of aerosols: A case study. Aerosol Sci. Technol., 3, 259–275.
Rossow, W. B., and L. C. Garder, 1993: Cloud detection using satellite measurements of infrared and visible radiances for ISCCP. J. Climate, 6, 2341–2369.
Spinhirne, J. D., 1993: Micro pulse lidar. IEEE Trans. Geosci. Remote Sens., 31, 48–55.
——, J. A. R. Rall, and V. S. Scott, 1995: Compact eye safe lidar system. Rev. Laser Eng., 23, 112–118.
Stokes, G. M., and S. E. Schwartz, 1994: The atmospheric radiation measurement (ARM) program: Programmatic background and design of the cloud and radiation testbed. Bull. Amer. Meteor. Soc., 75, 1201–1221.
Uttal, T., E. E. Clothiaux, T. P. Ackerman, J. M. Intrieri, and W. L. Eberhard, 1995: Cloud boundary statistics during FIRE II. J. Atmos. Sci., 52, 4276–4284.

Fig. 1. Schematic representation of the changes in lidar return power (thick solid line) as the lidar beam penetrates two cloud layers located in regions B and D. The clear-sky power return is represented by the thin solid line.
Citation: Journal of Atmospheric and Oceanic Technology 15, 4; 10.1175/1520-0426(1998)015<1035:AAAFDO>2.0.CO;2


Fig. 2. (a) Frequency of occurrence of the difference between the cloud-base heights reported by the Scott algorithm and the cloud-base heights generated by the current algorithm. Solid circles represent the histogram generated by the complete dataset, while open circles (triangles) represent differences when the Scott algorithm cloud-base heights were below (above) 5 km. Positive differences occur when the Scott algorithm cloud-base heights are higher. (b) Frequency of occurrence of the difference between the cloud-base heights reported by the Belfort laser ceilometer system and the Scott algorithm (solid circles) and the Belfort laser ceilometer system and the current algorithm with τcloud = 0.06 (open circles). Positive differences occur when the Belfort laser ceilometer system cloud-base heights are higher.
Table 1. Values of the system-dependent thresholds p1(Nc, d[n], l[h]) (top number) and p2(Nc, d[n], l[h]) (bottom number) as a function of Nc, the time of day, that is, day (d) or night (n), and the height above the surface, that is, low (l) or high (h).


Table 2. Values of the system-dependent empirical fit threshold coefficients X, A, and B in the Scott algorithm.


Table 3. Comparisons of the current algorithm (CA) cloud-base heights with the cloud-base heights reported by the Scott algorithm (SA) and the Belfort laser ceilometer (BLC) system. The false positives and false negatives are for the CA results with respect to the SA and BLC results in the first two cases; in the third case the false positives and false negatives are for the SA results with respect to the BLC results. The average height differences are given by the SA heights minus the CA heights, the BLC heights minus the CA heights, and the BLC heights minus the SA heights.

