## 1. Introduction

Stereophotography has the potential to provide a unique window into the behavior of clouds, enabling the calculation of four-dimensional trajectories of cloud surfaces in exquisite detail. The fine spatial resolution and high frame rate allow for the tracking of individual clouds and convective turrets, providing a perspective on cloud life cycles that cannot be replicated with other existing technologies. Radar, with its ability to measure the condensates and motions within clouds, comes closest to this capability. For example, vertically pointing W-band radar can map the distribution and vertical velocity of shallow, nonprecipitating clouds (Kollias et al. 2001, 2003, 2007; Kollias and Albrecht 2010; Ghate et al. 2011), but its “soda straw” view of the atmosphere cannot provide information on the life cycle of any individual cloud. To measure cloud life cycles, individual clouds must be imaged repeatedly, which would require the use of a scanning radar. The downside of a scanning radar is its coarse spatial and temporal resolution, caused by typical beamwidths on the order of 1° and the time it takes for the radar to mechanically scan the sky. For example, scanning weather radar are typically used to generate cloud maps with a spatial and temporal resolution of ~1 km and ~10 min, respectively (e.g., Davies-Jones 1979; Yuter and Houze 1995; Collis et al. 2013), which is too coarse to track the life cycle of cloud turrets with either size ≲1 km or updraft speed ≳10 m s^{−1}.

To make a more apples-to-apples comparison between a scanning radar and a digital camera, we should consider a scanning mode that covers an atmospheric volume similar to a camera’s field of view. For example, the boundary layer range–height indicator (BL-RHI) scan, as employed by the W-band scanning radar at the Atmospheric Radiation Measurement Program (ARM) sites (Mather and Voyles 2013; Ackerman and Stokes 2003), covers 90° of elevation (horizon to zenith) and 80° of azimuth. This is comparable to the fields of view for the cameras used in this study: the horizontal fields of view are equal to 76° and 67° for the left and right cameras, respectively. With the BL-RHI scan, the sky is mapped with a resolution of 1° in elevation and 2° in azimuth once every 5 min (Fielding et al. 2013; Kollias et al. 2014). By comparison, an off-the-shelf digital camera has an angular resolution of about 0.1°, and it can capture images at frame rates greater than 1 Hz. Consider a convective cloud at a distance of 20 km rising at 10 m s^{−1}. A BL-RHI scan will image the cloud with a transverse resolution of 700 m and a vertical displacement of the cloud between scans of 3 km. A camera with a modest 0.1-Hz frame rate will image the cloud with a resolution of a few tens of meters and a vertical displacement between images of 100 m. With this spatial and temporal resolution, stereophotogrammetry can provide detailed cloud life cycle data.
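The numbers in this comparison follow from small-angle arithmetic. The short Python sketch below reproduces them; the function names are illustrative, and the 2° (radar azimuth) and 0.1° (camera) angular resolutions, the 5-min scan period, and the 0.1-Hz frame rate are the values quoted above.

```python
import math

def transverse_resolution_m(distance_m, angular_resolution_deg):
    """Footprint (m) subtended by a given angular resolution at a given range."""
    return distance_m * math.radians(angular_resolution_deg)

def rise_between_samples_m(updraft_m_per_s, sampling_interval_s):
    """Vertical distance (m) a cloud top travels between consecutive samples."""
    return updraft_m_per_s * sampling_interval_s

distance_m = 20e3   # cloud 20 km away
updraft = 10.0      # rising at 10 m/s

# BL-RHI scan: 2 deg in azimuth, one scan every 5 min.
radar = (transverse_resolution_m(distance_m, 2.0),
         rise_between_samples_m(updraft, 5 * 60))
# Camera: about 0.1 deg per pixel, one frame every 10 s (0.1 Hz).
camera = (transverse_resolution_m(distance_m, 0.1),
          rise_between_samples_m(updraft, 1 / 0.1))

print(f"BL-RHI scan: {radar[0]:.0f} m footprint, {radar[1]:.0f} m rise between scans")
print(f"Camera:      {camera[0]:.0f} m footprint, {camera[1]:.0f} m rise between frames")
```

The radar footprint comes out near 700 m with a 3-km rise between scans, while the camera footprint is about 35 m with a 100-m rise between frames.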

Stereophotogrammetry dates back more than 100 years, with early cloud studies making use of theodolites (Koppe 1896). Analog photographs were used in the stereophotogrammetry of clouds from at least the 1950s to obtain cloud positions and velocities (Malkus and Ronne 1954; Kassander and Sims 1957; Orville and Kassander 1961; Bradbury and Fujita 1968; Warner et al. 1973). More recently, digital photographs have been used (Allmen and Kegelmeyer 1996; Kassianov et al. 2005; Zehnder et al. 2007; Damiani et al. 2008), which opens up the possibility of using feature-matching algorithms to automate the reconstruction process (Seiz et al. 2002; Zehnder et al. 2006).

In all of these studies, the essential first step is the careful calibration of camera positions and orientations. Orientation of the cameras—that is, the determination of the camera’s three Euler angles—is often the trickiest step. Previous approaches have used the positions of known landmarks, such as topographic features (e.g., Hu et al. 2009) or the positions of stars (e.g., Seiz et al. 2002). In this study, we describe how to calibrate a pair of cameras in the absence of landmarks, as needed for the case of daytime images of land-free scenes. This method is applied to images collected by a pair of cameras looking out over Biscayne Bay in Miami, Florida, during April 2013. Section 2 reviews the concept of three-dimensional reconstruction with stereo cameras. Section 3 discusses the setup and calibration of the two cameras. Section 4 calculates the precision of the stereo reconstruction. The accuracy of the reconstructions is validated by comparison with lidar in section 5 and radiosondes in section 6. Section 7 summarizes the results.

## 2. Calibration without landmarks

Given two contemporaneous photographs from two cameras with known positions and orientations, we can reconstruct the position of any object that appears in both images. This “stereo reconstruction” of an object’s three-dimensional position is simply a matter of triangulation. An example in two dimensions is given in Fig. 1. Each of the two cameras (left and right) is represented by a pinhole through which light is admitted (black circle) and an image line where a photosensitive array would be located (black line). Each camera reports a single number: the position of the object in the camera’s image plane. We can write these positions as *x*′_{l} and *x*′_{r} for the left and right cameras, respectively. The disparity *d* is defined as the difference in these two image-plane positions. Together, the two positions determine the location of the object (*x* and *y*) in two-dimensional space. The left camera tells us that the object lies somewhere on the locus of points denoted by the green line, for all objects on that line will project onto the same position in the left camera’s image plane. The right camera constrains the object to a second such line, and the object lies at the intersection of the two.

The line passing through the two pinholes is referred to as the baseline. In Fig. 1, the two cameras are drawn with parallel baseline and image lines. In some applications, this type of arrangement is both desirable and practical. For example, if a short baseline is appropriate, then the two cameras can be physically attached to one another, thereby guaranteeing that the baseline and image lines are parallel. In some applications, however, the required baseline is too long for this approach to be feasible. Baselines are chosen to be about one or two orders of magnitude shorter than the distance from the cameras to the objects being imaged (Gallup et al. 2008). This reflects a trade-off between 1) the greater disparity with a larger baseline, which makes the reconstruction more precise; and 2) the increased difficulty of identifying matching points (and a greater occurrence of occlusions) with a larger baseline, which reduces the amount of data. If we wish to capture clouds with a resolution of 100 m or better with a 4-mm focal-length lens and a sensor that is 4 mm wide with 1000 pixels across, then the basic pinhole model geometry (Forsyth and Ponce 2003) tells us that we are restricted to looking at clouds within 100 km. These distances are best sampled with a baseline on the order of 1–10 km, which is far too large to have the cameras attached to each other by a rigid structure.
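As a rough check of the 100-km figure, the following worked estimate applies the small-angle pinhole relation to the numbers quoted above. The symbols $w$ (sensor width), $N$ (number of pixels across), $\delta$ (target resolution), $\Delta\phi$ (angle subtended by one pixel), and $D_{\max}$ (maximum useful distance) are introduced here only for illustration.

$$
\Delta\phi \approx \frac{w/N}{f} = \frac{4\ \text{mm}/1000}{4\ \text{mm}} = 10^{-3}\ \text{rad},
\qquad
D_{\max} \approx \frac{\delta}{\Delta\phi} = \frac{100\ \text{m}}{10^{-3}} = 100\ \text{km}.
$$

A baseline one to two orders of magnitude shorter than $D_{\max}$ then falls in the 1–10-km range.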

Therefore, for stereophotography of clouds, the two cameras must be sited and oriented independently of one another. This leads to a potential problem: errors in the assumed position and orientation of the cameras will generate errors in the reconstructed positions of objects. In Fig. 1, the reconstructed *x* and *y* will be erroneous if we believe that the two image lines are parallel with the baseline when, in fact, one of them is not. To mitigate this source of error, careful attention must be paid to the calibration (i.e., measurement) of camera position, internal optical parameters, and camera orientation. Camera position can be determined by using the global positioning system (GPS) or by collocation of the camera with known landmarks. Internal optical parameters can be obtained from the manufacturer or from controlled laboratory tests of the camera. Orientation, on the other hand, must be measured in the field upon deployment.

A standard approach to calibrating orientation is to use landmarks with known positions. For example, in the two-dimensional case, at least one landmark is needed in the field of view to determine the one angle of orientation. In three dimensions, there are three Euler angles (roll, pitch, and yaw), which can be measured with at least two landmarks. This is the approach taken by recent stereophotogrammetric studies (e.g., Zehnder et al. 2007; Hu et al. 2009). But, what if there are fewer than two landmarks or no landmarks at all? In the study of maritime clouds, cameras looking out over the ocean will see, in general, only ocean and sky, with no fiducial landmarks in the field of view. Here, we report on a calibration technique for this scenario, that is, a calibration method that does not require the use of landmarks.

To explain calibration without landmarks, let us return, for the moment, to Fig. 1. In two dimensions, there is almost always a solution to the intersection of two lines of sight (the rare exception being when the lines are supposedly parallel). Therefore, the cameras will report an unambiguous position for almost any object, regardless of errors in calibration. This can be stated mathematically as follows: for supposedly parallel baseline and image lines, any pair of image-plane positions with nonzero disparity corresponds to a unique pair of *x* and *y*. While this may sound like a good thing, it can cause problems by masking errors in the calibration: stereo cameras will report unambiguous reconstructions whether they are well calibrated or not.

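In three dimensions, however, this is not the case. Generally, any error in the calibration will cause the lines of sight to fail to intersect in three-dimensional space. This is illustrated in Fig. 2, which shows the image planes for the left and right cameras. In the left camera, the object’s image is denoted by the green square, which has a position in the image plane denoted by (*x*′_{l}, *y*′_{l}). This position defines a line of sight in three-dimensional space, and the projection of that line of sight onto the right camera’s image plane is called the epipolar line. If the calibration is perfect, then the object’s image in the right camera lies on this epipolar line; if the calibration is in error, then the image will, in general, be displaced from it.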

In the absence of landmarks, we can use this mismatch to calibrate the cameras. The idea is simple in principle. Given a large number of points in the field of view (spots on clouds, ships in the ocean, birds in the sky), define an objective function as the sum (over all points and both image planes) of the deviation (in the image plane) between the image and the associated epipolar line. Then, using an optimization algorithm, find the set of calibration parameters that minimizes this objective function. In practice, this is an optimization problem of high dimensionality (six position parameters, six Euler angles, and several internal optical parameters), and so is numerically intractable without a good initial estimate of the parameters. In the case described in section 3c, estimates are obtained using Google Earth for the camera positions (GPS would also work for this purpose), the position of the sun for yaw (a compass would work just as well, correcting for magnetic declination), and the ocean horizon for pitch and roll.
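The objective function described here can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation. It assumes that the epipolar geometry is summarized by a 3 × 3 fundamental matrix `F` (introduced in section 3c), written in the convention **x**′_{l}^{T} **F** **x**′_{r} = 0, and that the deviation is measured as a squared perpendicular pixel distance. The function names and the toy matrix are hypothetical.

```python
import math

def _line_distance(line, point):
    """Signed perpendicular distance from pixel `point` to the line a*x + b*y + c = 0."""
    a, b, c = line
    return (a * point[0] + b * point[1] + c) / math.hypot(a, b)

def epipolar_cost(F, pairs):
    """Sum, over all point pairs and both image planes, of the squared distance
    between each image and the epipolar line generated by its counterpart.

    Assumed convention: x_l^T F x_r = 0, so F x_r is the epipolar line in the
    left image and F^T x_l is the epipolar line in the right image.
    """
    cost = 0.0
    for x_l, x_r in pairs:
        hl = (x_l[0], x_l[1], 1.0)
        hr = (x_r[0], x_r[1], 1.0)
        line_in_left = [sum(F[i][j] * hr[j] for j in range(3)) for i in range(3)]
        line_in_right = [sum(F[i][j] * hl[i] for i in range(3)) for j in range(3)]
        cost += _line_distance(line_in_left, x_l) ** 2
        cost += _line_distance(line_in_right, x_r) ** 2
    return cost

# Toy fundamental matrix for two identical, perfectly aligned cameras displaced
# sideways: matching points must share the same image row (y_l = y_r).
F_aligned = [[0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0],
             [0.0, 1.0, 0.0]]

matched = [((100.0, 50.0), (80.0, 50.0)), ((300.0, 200.0), (250.0, 200.0))]
mismatched = [((100.0, 52.0), (80.0, 50.0))]
print(epipolar_cost(F_aligned, matched), epipolar_cost(F_aligned, mismatched))
```

In a calibration, `F` would be recomputed from the trial camera parameters at every iteration, and an optimizer would adjust those parameters to reduce the cost.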

## 3. Camera setup and calibration

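Let **X** = (*x*, *y*, *z*, 1)^{T} denote the homogeneous coordinates of a point in 3D space. In homogeneous coordinates, (*αx*, *αy*, *αz*, *α*) defines the same point for all *α* ≠ 0. Similarly, let **x**′ = (*x*′, *y*′, 1)^{T} be the homogeneous coordinates of the image in the image plane. The definitions and units of the notations used throughout the paper are listed in Table 1. For a pinhole camera, the relation between **X** and **x**′ is given by the 3 × 4 projection matrix **P**:

$$
\mathbf{x}' \propto \mathbf{P}\,\mathbf{X}, \qquad
\mathbf{P} = \mathbf{K}\,\mathbf{R}\,[\,\mathbf{I} \mid -\mathbf{X}_0\,], \qquad
\mathbf{K} = \begin{pmatrix} f k_x & 0 & c_x \\ 0 & f k_y & c_y \\ 0 & 0 & 1 \end{pmatrix},
$$

where *f* is the focal length; *k*_{x} (*k*_{y}) is the density of pixels, that is, number of pixels per distance, in the image plane’s *x*′ (*y*′) direction; and *c*_{x} (*c*_{y}) is the principal point coordinate of the image plane. The rotation matrix **R** rotates the world coordinates (*x*, *y*, *z*) into the coordinate system aligned with the camera (*x*′ and *y*′ in the image plane and *z*′ perpendicular to the image plane in the direction of the line of sight). Letting *θ*_{1}, *θ*_{2}, and *θ*_{3} denote the three Euler angles of pitch, yaw, and roll, respectively, **R** can be written as a product of three elementary rotations through these angles. The vector **X**_{0} holds the world coordinates of the camera center, *x*_{0}, *y*_{0}, and *z*_{0}. The subscripts *l* and *r* refer to the left and right cameras, respectively. Stereo reconstruction is the process of calculating **X** from the pair of images **x**′_{l} and **x**′_{r}.

Table 1. List of variables and their units.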
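The pinhole relation can be illustrated with a short Python sketch. It takes the rotation matrix as an input rather than building it from Euler angles, because the order in which pitch, yaw, and roll are applied is a convention that this sketch does not assume. The function name and the numerical values are hypothetical.

```python
def project_pinhole(point, camera_center, R, f, kx, ky, cx, cy):
    """Project a world point (x, y, z) to pixel coordinates (x', y').

    R is a 3x3 rotation matrix taking world axes to camera axes, with the third
    camera axis (z') pointing along the line of sight.
    """
    # Position of the point relative to the camera center.
    rel = [point[i] - camera_center[i] for i in range(3)]
    # Rotate into the camera-aligned coordinate system.
    xc, yc, zc = (sum(R[i][j] * rel[j] for j in range(3)) for i in range(3))
    if zc <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective division, then conversion from distance to pixels.
    return f * kx * xc / zc + cx, f * ky * yc / zc + cy

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical camera: 4-mm focal length and 204 250 pixels per meter, so that
# f*k = 817 pixels; principal point at the center of a 1024 x 768 image.
pixel = project_pinhole(point=(100.0, 50.0, 1000.0), camera_center=(0.0, 0.0, 0.0),
                        R=IDENTITY, f=0.004, kx=204250.0, ky=204250.0,
                        cx=512.0, cy=384.0)
print(pixel)
```

Scaling the offset between the point and the camera center by any positive factor leaves the pixel unchanged, which is the content of the homogeneous-coordinate statement above.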

### a. Camera setup

The two cameras used in this study are located on Virginia Key in Miami, where they overlook Biscayne Bay (25.6°N, 80.2°W; see Fig. 3). One of the cameras (a 5-megapixel Stardot NetCam SC) is mounted on the roof of the Maritime and Science Technology (MAST) Academy. This camera will be referred to as the right camera throughout the text. The left camera (a 3-megapixel Stardot NetCam XL) is mounted on the roof of the Rosenstiel School of Marine and Atmospheric Science (RSMAS) as part of South Florida’s Cloud–Aerosol–Rain Observatory (CAROb; http://carob.rsmas.miami.edu). It is located 296 m to the east and 822 m to the south of the right camera. The right and left cameras face 6.56° and 18.19° west of south, respectively. The right and left cameras capture images at 1296 × 960 and 1024 × 768 pixel resolution, and at 30- and 3-s intervals, respectively. The left camera was acquired in 2011 as part of CAROb, but the right camera was acquired in 2012 to launch the stereo reconstruction research; hence, it has a higher resolution than the left camera. The two cameras are connected to separate servers and so are controlled independently from each other, but they are synchronized to a common time server. Both cameras have wide-angle lenses, which cause noticeable radial distortion; unlike in previous work (e.g., Hu et al. 2009), this distortion cannot be ignored.

### b. Camera lens distortion

In the radial distortion model implemented in OpenCV, the observed and corrected positions of a point in the image plane are related by

$$x_d = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \tag{7a}$$

$$y_d = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \tag{7b}$$

$$r^2 = x^2 + y^2, \tag{7c}$$

where (*x*_{d}, *y*_{d}) is the observed (distorted) position and (*x*, *y*) is the corrected position (i.e., where on the image plane the point would lie if the lens were replaced by a pinhole). Equations (7a) and (7b) establish a relation between the observed and corrected pixel positions in terms of unknown radial distortion parameters *k*_{i}. A checkerboard pattern is printed out and held in front of the camera at various angles and positions. The distorted corner-point positions are extracted from the captured images by an automatic corner-extraction algorithm in OpenCV, an open source computer-vision library (Bradski and Kaehler 2008). A lens correction algorithm from the same OpenCV library is then used to estimate the *k*_{i} using Levenberg–Marquardt optimization (Press et al. 2007). With the *k*_{i} estimated in this way, the mapping of Eqs. (7a)–(7c) is applied to compensate for the distortion. Samples of images before and after this correction are displayed in Fig. 4. The OpenCV lens distortion correction algorithm also gives an estimate of *fk*, *c*_{x}, and *c*_{y}, together with *k*_{i}.
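To make the correction step concrete, the sketch below applies a three-coefficient polynomial radial model and inverts it by fixed-point iteration. It is an assumption-laden illustration: the polynomial form is the radial part of OpenCV's standard distortion model, coordinates are taken relative to the principal point in normalized (dimensionless) units, and the coefficient values are invented. It does not call OpenCV and is not the study's code.

```python
def radial_factor(x, y, k1, k2, k3):
    """Radial scaling 1 + k1 r^2 + k2 r^4 + k3 r^6 at the position (x, y)."""
    r2 = x * x + y * y
    return 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3

def distort(x, y, k1, k2, k3):
    """Map a corrected (pinhole) position to the observed, distorted position."""
    s = radial_factor(x, y, k1, k2, k3)
    return x * s, y * s

def undistort(xd, yd, k1, k2, k3, iterations=50):
    """Recover the corrected position from the observed one.

    The model has no closed-form inverse, so iterate x <- xd / s(x, y), starting
    from the observed position. This converges for moderate distortion.
    """
    x, y = xd, yd
    for _ in range(iterations):
        s = radial_factor(x, y, k1, k2, k3)
        x, y = xd / s, yd / s
    return x, y

# Hypothetical barrel-distortion coefficients, for illustration only.
K1, K2, K3 = -0.3, 0.1, 0.0
observed = distort(0.4, 0.3, K1, K2, K3)
recovered = undistort(observed[0], observed[1], K1, K2, K3)
print(observed, recovered)
```

The round trip returns the starting position (0.4, 0.3) to within numerical precision for these coefficients.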

### c. Camera calibration

Camera “calibration” refers to the estimation of the intrinsic parameters (given by the elements of the intrinsic matrix **K**) and the extrinsic parameters of each camera. The cameras’ lenses have a focal length *f* varying between 4 and 8 mm. This sets a lower and upper bound on *f*, but the precise value is unknown a priori. The principal point coordinate, denoted by (*c*_{x}, *c*_{y}), is often assumed to be at the center of the image plane, but it may deviate from the center depending on the lens used.

Camera extrinsic parameters include the world coordinate of the camera center and the three angles of the camera’s yaw, roll, and pitch. These parameters must be determined with high accuracy to achieve meaningful reconstructions. The Google Earth mapping software provides the coordinates of both cameras in 3D. The sun provides a way of estimating the yaw when it is low enough to be captured by the cameras in January and February. By recording the time of day when the sun is centered horizontally on the image plane, the azimuth of the sun at that time (obtained from http://www.esrl.noaa.gov/gmd/grad/solcalc) gives the absolute yaw of the right camera and an initial estimate of the relative yaw between the two cameras. This and other parameters are then calculated using the epipolar constraint and the horizon constraint, described below.

#### 1) The epipolar constraint

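Given the homogeneous world coordinate **X** of an object, **P**_{l} maps **X** to its image **x**′_{l} in the left image plane, and **P**_{r} maps **X** to its image **x**′_{r} in the right image plane. The line of sight through one camera’s image projects onto the other camera’s image plane as the epipolar line. If **P**_{l} and **P**_{r} are known perfectly, then each image lies on the epipolar line generated by its counterpart. If **P**_{l} and **P**_{r} contain error, then images will not, in general, coincide with the epipolar lines. In particular, it is error in the internal parameters (focal length and principal point) and in the relative displacement, relative pitch, relative yaw, and relative roll of the two cameras that causes these deviations. Note that changes in displacement, pitch, yaw, and roll that are applied equally to both cameras cannot cause images to deviate from epipolar lines; only relative errors in these quantities manifest as deviations of images from epipolar lines.

This constraint is expressed through the fundamental matrix **F**, which is computed from **P**_{l} and **P**_{r} (for details, see Hartley and Zisserman 2003; Forsyth and Ponce 2003). The elements of **F** are functions of the camera parameters. If the estimate of **P**_{l} and **P**_{r} is perfect, then

$$\mathbf{x}'^{\,T}_{l}\,\mathbf{F}\,\mathbf{x}'_{r} = 0. \tag{8}$$

The fundamental matrix is a 3 × 3 matrix with rank 2. Equation (8) implies that the fundamental matrix can be estimated from point correspondences, in the least squares error sense. Note that, although **P**_{l} and **P**_{r} uniquely define the fundamental matrix, the reverse is not true. The fundamental matrix determines **P**_{l} and **P**_{r} up to an overall displacement, pitch, yaw, and roll; in other words, it constrains only the relative position and orientation of the two cameras.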

#### 2) The horizon constraint

Consider a camera located at (*x*, *y*, *z*) = (0, 0, *h*) using the standard meteorological convention of *x* increasing to the east, *y* increasing to the north, and *z* increasing with altitude. To use the horizon for calibration, we must first determine where the horizon is located in world coordinates. Let us denote the horizontal displacement of the horizon from the camera by *d*_{H} and the vertical displacement of the horizon from the camera’s sea level by […], where *R*_{e} is the radius of the earth. This implies that […] *h* is 1 km, so […]. Given *d*_{H}, we know that the horizon for the right camera at (*x*_{r}, *y*_{r}, *h*_{r}) can be described by the set of 3D points […]. Figure 6 shows the projection of the horizon onto the image plane for *h* = 1, 10, and 100 m with zero roll angle. As an observer moves up off the surface, the curvature of the earth increasingly manifests itself as a bending of the horizon (until, in outer space, the horizon is clearly a circle). As seen in Fig. 6, however, the projection of the horizon (i.e., its outline in the photograph) deviates from a straight line by less than one pixel for camera heights up to 100 m. Hence, we can assume that the projection of the horizon is a line in the right camera’s image plane, which can be described by the set of pixels (*x*′, *y*′) satisfying *y*′ = *m*_{r}*x*′ + *a*_{r} for some slope *m*_{r} and intercept *a*_{r}. The values of *m*_{r} and *a*_{r} can be obtained easily by regressing a line on the image of the horizon. We know, therefore, that the projection matrix **P**_{r} must satisfy the condition that the horizon points project onto this line.
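The location of the horizon in world coordinates follows from the geometry of a line of sight tangent to a sphere. The Python sketch below works this out under simplifying assumptions that are ours: a perfectly spherical earth, no atmospheric refraction, and a mean radius of 6371 km. The function and variable names are illustrative.

```python
import math

R_EARTH_M = 6.371e6  # mean earth radius in meters (assumed value)

def horizon_geometry(h):
    """Horizon seen from height h (m) above a spherical, refraction-free earth.

    Returns (d_h, drop, dip): the horizontal displacement of the horizon from
    the camera (m), the depth of the horizon below the camera's local sea
    level (m), and the dip of the line of sight below horizontal (rad).
    """
    r = R_EARTH_M
    tangent = math.sqrt(h * (2.0 * r + h))  # straight-line distance to the horizon
    d_h = r * tangent / (r + h)             # r * sin(central angle)
    drop = r * h / (r + h)                  # r * (1 - cos(central angle))
    dip = math.atan2(tangent, r)            # equal to the central angle
    return d_h, drop, dip

for h in (1.0, 10.0, 100.0):
    d_h, drop, dip = horizon_geometry(h)
    print(f"h = {h:5.0f} m: d_H = {d_h / 1e3:5.1f} km, "
          f"drop = {drop:6.2f} m, dip = {math.degrees(dip):.3f} deg")
```

For camera heights far smaller than the earth's radius, the horizontal displacement is close to the square root of 2 *R*_{e} *h*, and the horizon lies approximately a distance *h* below the camera's local sea level.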

#### 3) Calibration and reconstruction

We define the cost function *C* as […]. Denoting the product of the fundamental matrix and the right image point, **F** **x**′_{r}, as (*a*, *b*, *c*)^{T}, the epipolar line in the left image plane is defined by *ax*′ + *by*′ + *c* = 0. The vector (*a*, *b*) points perpendicular to this epipolar line. Therefore, |*d*_{e}| is the minimum distance of **x**′_{l} from the epipolar line, where *d*_{e} is defined by

$$d_e = \frac{a\,x'_l + b\,y'_l + c}{\sqrt{a^2 + b^2}}.$$

[…] **P**_{r}. Therefore, the second term in the cost function is the sum of squared distances (in the *y*′ direction) of the actual horizon from the projected horizon. Levenberg–Marquardt optimization (Press et al. 2007) is used to minimize the cost function and, thereby, solve for the unknown parameters of the projection matrices.

[…] where **P**^{(i)} refers to the *i*th row of the projection matrix.
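Once the projection matrices are known, a world point can be recovered from a pair of matched pixels by linear triangulation, which is built from the rows of each projection matrix. The Python sketch below shows the textbook linear least-squares form of that step (see Hartley and Zisserman 2003); whether it matches the reconstruction equations used in this study in every detail is an assumption. The helper names and the two toy cameras are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve the 3x3 linear system m p = rhs by Cramer's rule."""
    d = det3(m)
    return [det3([[rhs[i] if j == k else m[i][j] for j in range(3)]
                  for i in range(3)]) / d for k in range(3)]

def triangulate(P_l, x_l, P_r, x_r):
    """Least-squares world point (x, y, z) from two 3x4 projection matrices."""
    rows = []
    for P, (u, v) in ((P_l, x_l), (P_r, x_r)):
        # With P^(i) the i-th row of P, the combinations u*P^(3) - P^(1) and
        # v*P^(3) - P^(2) must each give zero when applied to (x, y, z, 1).
        rows.append([u * P[2][j] - P[0][j] for j in range(4)])
        rows.append([v * P[2][j] - P[1][j] for j in range(4)])
    # Normal equations for the three unknowns (the fourth column is the constant).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [-sum(r[i] * r[3] for r in rows) for i in range(3)]
    return solve3(m, rhs)

def project(P, X):
    """Pixel coordinates of the world point X = (x, y, z) under the matrix P."""
    h = [sum(P[i][j] * c for j, c in enumerate((*X, 1.0))) for i in range(3)]
    return h[0] / h[2], h[1] / h[2]

# Two toy cameras with identical intrinsics and no rotation, 900 m apart along x.
P_left = [[800.0, 0.0, 500.0, 0.0], [0.0, 800.0, 400.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P_right = [[800.0, 0.0, 500.0, -720000.0], [0.0, 800.0, 400.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

target = (300.0, 200.0, 5000.0)
estimate = triangulate(P_left, project(P_left, target), P_right, project(P_right, target))
print(estimate)
```

With noise-free pixels the estimate reproduces the target; with real data the four equations are inconsistent and the least-squares solution is a compromise between the two lines of sight.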

## 4. Estimating the reconstruction precision

In this study, the baseline between the two cameras is 296 m in the east–west direction and 822 m in the north–south direction. The left camera is pointed 18.19° west of south, and the right camera is pointed 6.56° west of south. This configuration is depicted in Fig. 7. The yaw angles given here are obtained from the sun’s position, but the relative yaw angle between the cameras is obtained by the epipolar constraint. The fields of view of the left and right cameras are 76° and 67°, respectively. By coincidence, the leftmost viewing angle of the left camera coincides nearly perfectly with the baseline.

This configuration poses some unique challenges. Consider a line of sight from one of the cameras that is collinear with the baseline. In this case, the two lines of sight are collinear. As a result, there is no information that can be gathered about the depth of any object on this line: all positions along this line will project onto the same pair of image-plane pixels. Therefore, we see that the reconstruction must fail for objects near the baseline. In most applications of stereophotogrammetry, the baseline is not within the field of view of both cameras, so this is not an issue. In our configuration, however, we must be careful to quantify the reconstruction error near the baseline.

To get a feeling for the magnitude and spatial distribution of this reconstruction error, let us derive the error analytically for the case of two identical cameras in 2D with the same relative displacement and relative yaw as the Miami cameras. To this end, let us derive the disparity *d* as a function of the object’s position. Let *x*_{l} and *z*_{l} denote the coordinates of the object in the left camera’s reference frame (i.e., with the origin at the left camera, the positive *z*_{l} axis pointing in the direction of the left camera’s line of sight, and with *x*_{l} increasing to the right of the left camera). Similarly, let *x*_{r} and *z*_{r} denote the object’s coordinates in the right camera’s reference frame. Let (*x*_{l}, *z*_{l}) = (*x*_{b}, *z*_{b}) be the location of the right camera in the left camera’s reference frame; here, the subscript *b* refers to the baseline. Likewise, (*x*_{r}, *z*_{r}) = (−*x*_{b}, −*z*_{b}) is the location of the left camera in the right camera’s reference frame. In addition, let the right camera’s image plane be tilted counterclockwise at an angle *θ* relative to the left camera’s image plane.

Figure 8 depicts this geometry, with the perspectives from the two cameras overlaid on a common image plane. The horizontal line at the bottom of the figure is the image plane, and the two diagonal lines are the two lines of sight, which intersect each other at the pinhole. In the left camera’s reference frame, the object is located at (*x*_{l}, *z*_{l}), which is at the end of the line of sight that slopes up and to the right. In the right camera’s reference frame, the object is located at (*x*_{r}, *z*_{r}), which is at the end of the line of sight that slopes up and to the left. The dashed lines aid the eye in seeing the trigonometric relationships to be used in the derivation below. As depicted in Fig. 8, *x*_{b} > 0, *z*_{b} < 0, and *x*_{r} < 0.

[…] *L* is given by […] Δ*x* = *x*_{l} − *x*_{r}, Δ*z* = *z*_{r} − *z*_{l}, and […] *x*_{r} and *z*_{r} are given by […] *x*_{l} and *z*_{l}, we obtain […] *x*_{r} and *z*_{r} are defined in terms of *x*_{l} and *z*_{l} by Eq. (19).

For simplicity, assume that there is no error in the measurement of the image location in the left camera’s image plane. This measurement fixes the ratio *x*_{l}/*z*_{l}. Now, imagine that the measurement of the image in the right camera is uncertain. This introduces uncertainty into the disparity *d*, which generates an uncertainty in the values of *x*_{l} and *z*_{l}. Let us assume that the error in the right camera’s image plane is on the order of one pixel. We would like to know how much fractional error that generates in *x*_{l} and *z*_{l} (the fractional error will be the same for the two, since we assume their ratio is known exactly).

We use *kf* = 817 in place of *f* in Eq. (20) to give *d* in pixels (*kf* = 817 is the average of 651 for the left camera and 983 for the right camera). As stated above, the relative position and relative yaw of the two cameras are chosen to be the same as for the Miami cameras. Figure 9 plots the resulting fractional error in *x*_{l} and *z*_{l} per pixel of error in the position of the object in the right camera’s image plane. As seen in Fig. 9, the error in the reconstructed depth of the object goes to infinity at the baseline, which just happens to coincide with the leftmost view from the left camera. Despite this fact, the vast majority of the field of view out to 8 km has an estimated reconstruction error of less than 2% (see the left panel), and the vast majority of the field of view out to 80 km has an estimated reconstruction error of less than 20% (see the right panel).
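The qualitative behavior described here can be explored numerically. The Python sketch below is an illustration built on our own conventions, which may differ in sign or form from Eqs. (19) and (20): the right-camera frame is obtained by translating to the right camera and rotating the axes counterclockwise by *θ*, the disparity is the difference between *x*_{l}/*z*_{l} and *x*_{r}/*z*_{r} scaled by *kf*, and the sensitivity is obtained by finite differences along the left camera's line of sight. All function names are hypothetical.

```python
import math

def to_right_frame(x_l, z_l, x_b, z_b, theta):
    """Object coordinates in the right camera's frame (assumed convention)."""
    dx, dz = x_l - x_b, z_l - z_b
    x_r = math.cos(theta) * dx + math.sin(theta) * dz
    z_r = -math.sin(theta) * dx + math.cos(theta) * dz
    return x_r, z_r

def disparity_px(x_l, z_l, x_b, z_b, theta, kf):
    """Disparity in pixels for an object at (x_l, z_l) in the left frame."""
    x_r, z_r = to_right_frame(x_l, z_l, x_b, z_b, theta)
    return kf * (x_l / z_l - x_r / z_r)

def fractional_error_per_pixel(x_l, z_l, x_b, z_b, theta, kf, eps=1e-6):
    """Fractional change in depth produced by a one-pixel change in disparity.

    The ratio x_l / z_l is held fixed (the left image is taken as exact), so
    the object is moved along the left line of sight by a factor 1 +/- eps.
    """
    d_near = disparity_px(x_l * (1 - eps), z_l * (1 - eps), x_b, z_b, theta, kf)
    d_far = disparity_px(x_l * (1 + eps), z_l * (1 + eps), x_b, z_b, theta, kf)
    slope = (d_far - d_near) / (2 * eps)  # d(disparity) / d(ln depth)
    return math.inf if slope == 0 else 1.0 / abs(slope)

# Side-by-side parallel cameras 874 m apart (the length of a baseline with
# 296-m and 822-m components), kf = 817 pixels, object 8 km straight ahead.
err_ahead = fractional_error_per_pixel(0.0, 8000.0, 874.0, 0.0, 0.0, 817.0)
# Object lying on the extension of the baseline: the disparity carries
# essentially no depth information, so the error estimate becomes enormous.
err_baseline = fractional_error_per_pixel(600.0, 1600.0, 300.0, 800.0, 0.0, 817.0)
print(f"{100 * err_ahead:.2f}% per pixel straight ahead")
```

For parallel cameras the result reduces to the familiar expression of depth divided by the product of *kf* and the baseline, which is about 1% per pixel at 8 km for these values.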

In 3D, the behavior of the reconstruction error is more nuanced. For cameras at sea level, the lines of sight to an object can be collinear with the baseline only if the object is at sea level. With the Miami cameras, which are located near sea level, the lines of sight to clouds are always pitched upward. As a result, the lines of sight to clouds are never parallel to the baseline, even though the baseline is in the field of view. This motivates a hypothesis that the reconstruction error diminishes near the leftmost field of view as the cloud height increases. To check this, we can calculate maps of estimated reconstruction error using the full 3D projection matrices for the Miami cameras. These error maps are shown in Fig. 10 for hypothetical objects at altitudes of 2, 6, and 12 km. The estimated error in these maps is the fractional change (in percent) of the reconstructed depth due to a one-pixel error in the disparity. Since the image planes are two dimensional, the plotted error is the root-sum-square of the errors generated by a one-pixel error in the *x*′ and *y*′ directions. The white regions are those areas that are not visible by the two cameras; note that the object’s altitude affects where it enters the field of view. With regard to the hypothesis, we see that the error in the leftmost field of view decreases as the height of the reconstructed object increases, as predicted.

## 5. Validation against lidar

The preceding analysis gives some sense for the precision of the stereo reconstructions. To assess accuracy, however, the reconstructions must be compared against meteorological data from independent observations. In this section, reconstructed cloud bases are compared against data from lidar. In the next section, the reconstructed horizontal motions of clouds are compared against wind profiles from radiosondes.

For the validation of reconstructed heights, cloud bases are calculated from three pairs of still images taken at 2025 UTC 8 April 2013, 1344 UTC 4 April 2013, and 1120 UTC 9 April 2013. At these times, we observed stratocumulus (Sc), altocumulus (Ac), and cirrocumulus (Cc) at heights of 2, 6, and 12 km, respectively. For each pair of images, authors Öktem and Romps manually identified pairs of matching cloud-base feature points in the two images. A mouse-controlled graphical user interface (GUI) was designed for manual identification of feature points and automatic reconstruction. Once a point is selected in one image, the GUI displays the corresponding epipolar line in the other image to serve as a guide for identifying the matching point. For each pair of image points (i.e., the matching points in the left and right images), the three-dimensional position of the feature is then reconstructed.

To validate that the stereo cameras are giving accurate cloud-base altitudes, the histogram of stereophotogrammetric heights at the one instant in time is compared against lidar data collected over a 20-min period from 10 min prior to 10 min after the photographs were taken. For the stratocumulus and altocumulus, cloud-base heights were obtained with a Vaisala CL31 ceilometer that is collocated with the left camera as part of the CAROb instrumentation. This ceilometer reports cloud-base heights up to 7.5 km at 10-m resolution every 15 s. For the cirrocumulus, cloud-base heights were obtained with a Sigma MPL-4B-IDS-532 micropulse lidar also collocated with the left camera, which reports backscattered intensity up to 15 km at 30-m resolution every 15 s. The micropulse lidar samples only the zenith view with a 1.2-m footprint at 12 km, associated with 100-*μ*rad pulse width. From visual inspection, the clouds in the lidar data were clearly above 11 km, so twice the maximum logarithm of backscattered intensity between 9 and 11 km was used as the threshold to identify cloud bases. Since the thickness of the cirrus layer is about 1 km, practically any method for identifying cloud bases gives the same answer; in fact, using all cloudy points (not just cloud bases) produces a very similar distribution of height. Table 2 lists the mean and standard deviation for the observed cloud bases, obtained from both the stereo cameras and the lidars.

Table 2. Cloud-base statistics from the stereo cameras and ceilometers (Vaisala CL31 for Sc and Ac, and Sigma MPL for Cc).

On 8 April 2013, when the view was dominated by stratocumulus, 440 pairs of feature points were sampled from the pair of photographs taken at 2025 UTC. The top panels of Fig. 11 show the locations of the feature points in the two images. The reconstructed heights included three outliers: one data point at 828 m and two data points at an altitude of around 8 km. Those points were sampled from distant cumulus and cirrus (at depths from the cameras exceeding 20 km), which are not captured by the ceilometer at that time. Those three samples are discarded from the results in Table 2 and Fig. 11. The samples of the stratocumulus layer were located 2.2–13 km south of the right camera. As seen in the middle row of Fig. 11, the histogram of stereo-camera cloud-base heights closely resembles the histogram of ceilometer cloud-base heights. The means of the two distributions are 1804 and 1805 m, respectively. The standard deviations are 56 and 42 m, respectively. The bottom panel of Fig. 11 shows the spatial distribution of reconstructed heights: the color of each point gives the fractional deviation of that point’s height relative to the mean of the ceilometer heights. These deviations have no discernable spatial pattern other than the clump of three high values near the leftmost field of view, where we expect the reconstruction to be the poorest.

For 4 April 2013, when the view was dominated by altocumulus, 416 pairs of feature points were sampled from the pair of photographs taken at 1344 UTC. The top panels of Fig. 12 display the feature points for this pair of images. The altocumulus feature points range in distance from 5.6 to 34 km south of the right camera. As seen in the middle row of Fig. 12, the histogram of stereo-camera cloud-base heights closely resembles the histogram of ceilometer cloud-base heights. As is true for all three cases, the number of points collected from the instantaneous pair of stereophotographs is significantly larger than the number of data points collected by lidar during the surrounding 20-min interval. This leads to a superior sampling of the distribution of cloud-base heights by the stereo cameras. Nevertheless, the means and standard deviations of the two distributions are very similar: the stereo cameras and the ceilometer report means of 5904 and 5913 m and standard deviations of 174 and 173 m, respectively. The bottom panel of Fig. 12 suggests that the variance of reconstructed heights increases somewhat up and to the left, as anticipated by Fig. 10. It also appears that the altocumulus at distances greater than 20 km south of the cameras is lower in altitude than the altocumulus closer to the cameras.

For 9 April 2013, when the view was dominated by cirrocumulus, 307 pairs of feature points were sampled from the pair of photographs taken at 1120 UTC. The images used for this analysis, as well as the locations of feature points, are shown in Fig. 13. The cirrocumulus feature points range in distance from 11 to 39 km south of the right camera. Unlike the previous two cases, the histograms from the lidar and the stereo cameras have some noticeable differences. As seen in the middle row of Fig. 13, the histogram of stereo-camera cloud-base heights is broader, extending to larger altitudes. At the low end, the 10th percentiles of the lidar and stereo-camera distributions are quite similar: 11.3 and 11.1 km, respectively. The 90th percentiles, on the other hand, are quite different: 11.7 km for the lidar and 12.7 km for the stereo cameras. The bottom panel of Fig. 13 suggests a reason for this difference: the cirrocumulus appears to slope upward in the southward direction. Since there is no indication from the previous two cases of a positive height bias for distant clouds, it is quite plausible, although difficult to verify by other means, that the cirrocumulus was tilted in this way.

In stereo reconstruction, there are two main sources of error: image-identification error and parametric uncertainty. When identifying the pixel location of images in the image plane (either manually or algorithmically), some amount of error is to be expected, typically on the order of one pixel. These errors generate a disparity error, which produces a reconstruction error proportional to the distance from the camera, as depicted in Fig. 10. A one-pixel inaccuracy in image-point locations results in 15, 133, and 565 m of error in the mean of the measured cloud-base heights of 8, 4, and 9 April, respectively; that is, less than 1%, 3%, and 5% of the lidar-measured cloud-base averages for each of the three cases, respectively. With regard to parametric uncertainty, Table 3 lists the uncertainties in the relative camera positions (*x*_{0,r} − *x*_{0,l} and *y*_{0,r} − *y*_{0,l}), camera heights (*z*_{0,r} and *z*_{0,l}), horizon line-fitting parameters (*a*_{r} and *m*_{r}), and lens distortion correction parameters (*k*_{1}, *k*_{2}, and *k*_{3}), and the corresponding errors that these uncertainties would generate in the mean reconstructed cloud-base heights. Uncertainty in the absolute yaw angle, which is estimated from the sun, does not affect the height reconstruction. The right camera is mounted on the outer rails of a cylindrical tower, and the left camera is mounted on the roof of a building. Both of these sites can be clearly identified in the aerial and satellite photographs provided by Google Earth. The camera elevations are measured with Google Earth’s 3D view enabled, whereas the *x* and *y* positions are measured with 3D view disabled. The relative positions (*x*_{0,r} − *x*_{0,l} and *y*_{0,r} − *y*_{0,l}) vary by no more than 3 m when accounting for the range of possible camera locations in the Google Earth images. Similarly, an uncertainty of 0.5 m is estimated for the heights (*z*_{0,r} and *z*_{0,l}). 
The horizon parameters *a*_{r} and *m*_{r} are calculated by identifying two points close to the right and left ends of the visible ocean horizon in the right camera view, and by deriving the equation of the line passing through these two points. A one-pixel uncertainty in the location of these two points leads to a one-pixel uncertainty in *a*_{r} and a 0.002 uncertainty in the slope. To estimate the error bound in lens distortion correction parameters, we re-executed the distortion-correction algorithm several times by omitting one of the calibration pattern images from the training set in each execution.

Table 3. The impacts of parametric uncertainties (first seven data rows) and image-identification error (disparity) on the reconstructed cloud-base heights. See Table 1 for the definition of the listed parameters. The total error for the reconstructed height of any one feature is given by the root-sum-square of all of the errors, and is rounded to the first significant digit. The relative error in the last row is obtained by dividing by the mean heights from Table 2 and rounding to the first significant digit.

Table 3 lists these uncertainties and the typical magnitudes of the errors that they generate for reconstructed cloud-base heights on 8 April (Sc), 4 April (Ac), and 9 April (Cc). Note that the disparity error (i.e., the image-identification error) is a random error, whereas the uncertainty in the parameters leads to a bias. When averaging over many reconstructed cloud-base heights, the disparity error has a negligible effect on the mean. The total error in this table is the square root of the sum of all of the squared errors above (including the disparity error), and so it should be thought of as the expected error for the reconstructed height of any one feature; the expected error for the mean of the cloud-base heights is smaller (i.e., it is obtained by excluding the disparity error). Note from Table 3 that these uncertainties have a negligible impact on the height reconstructions for low-altitude clouds: the total error for the height of an individual feature on a shallow cumulus is only about 30 m (less than 2% of the height). As the cloud altitudes increase, the errors increase due to the larger distances to the clouds being measured: the total error for the height of a feature on a cirrocumulus cloud is about 900 m (8% of the height).
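As a minimal sketch of the error bookkeeping described above, the following Python snippet combines independent error contributions by root-sum-square and rounds the result to the first significant digit. The parametric error values are placeholders chosen purely for illustration (they are not the entries of Table 3); only the 15-m disparity error for 8 April is taken from the text, and the function names are hypothetical.

```python
import math


def root_sum_square(errors):
    """Combine independent error contributions in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


def round_to_first_significant_digit(x):
    """Round x to one significant digit (e.g., 27.4 -> 30.0)."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, -exponent)


# Placeholder height errors (m) caused by parametric uncertainties.
# These are illustrative assumptions, NOT the values from Table 3.
parametric_errors_m = [10.0, 5.0, 20.0]

# One-pixel disparity error for the 8 April case, as quoted in the text.
disparity_error_m = 15.0

# Expected error for the height of a single feature: include every term.
single_feature_error_m = root_sum_square(parametric_errors_m + [disparity_error_m])

# Expected error for the mean height: the random disparity error averages
# out over many features, so only the parametric (bias) terms remain.
mean_height_error_m = root_sum_square(parametric_errors_m)

print(round_to_first_significant_digit(single_feature_error_m))
print(round_to_first_significant_digit(mean_height_error_m))
```

Keeping the disparity term out of the second sum mirrors the distinction drawn above between a random error, which averages out over many features, and the parametric errors, which act as a bias.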

## 6. Validation against radiosondes

The eastward and northward components of the horizontal velocity of each cloud feature (*u* and *υ*, respectively) are calculated as

$$
u_i = \frac{x_i(t + \delta t) - x_i(t)}{\delta t}, \qquad \upsilon_i = \frac{y_i(t + \delta t) - y_i(t)}{\delta t},
$$

where [*x*_{i}(*t*), *y*_{i}(*t*)] is the horizontal position of feature point *i* at time *t*. For each of the four cloud layers studied (one each on 4 and 8 April, and two on 9 April), *N* independent cloud features were identified by Öktem from several frames over a given time interval, and those same features were identified in images *δt* later. The number of features *N* used for each case is listed in Table 4. For each cloud layer, a fixed *δt* was used.
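The velocity estimate described here, in which the same feature is reconstructed at two times separated by *δt*, can be sketched in Python as follows. This is only an illustration under stated assumptions: the feature positions are made-up numbers, the function names are hypothetical, and the positions are taken to be already reconstructed in a local east–north frame in meters.

```python
import statistics


def horizontal_velocities(xy_start, xy_end, dt_s):
    """Finite-difference velocities (u, v) in m/s for matched feature points.

    xy_start, xy_end: lists of (x, y) positions in meters for the same
    features at times t and t + dt_s.
    """
    u = [(x1 - x0) / dt_s for (x0, _), (x1, _) in zip(xy_start, xy_end)]
    v = [(y1 - y0) / dt_s for (_, y0), (_, y1) in zip(xy_start, xy_end)]
    return u, v


def layer_statistics(u, v):
    """Mean and sample standard deviation of u and v over one cloud layer."""
    return {
        "u_mean": statistics.mean(u),
        "v_mean": statistics.mean(v),
        "u_std": statistics.stdev(u),
        "v_std": statistics.stdev(v),
    }


# Hypothetical reconstructed positions (m) of three features, 30 s apart.
xy_at_t = [(0.0, 3000.0), (500.0, 3500.0), (-200.0, 4000.0)]
xy_at_t_plus_dt = [(150.0, 2940.0), (660.0, 3450.0), (-60.0, 3930.0)]

u, v = horizontal_velocities(xy_at_t, xy_at_t_plus_dt, dt_s=30.0)
stats = layer_statistics(u, v)
print(stats)
```

The per-layer means and standard deviations computed this way correspond to the kind of summary reported for each cloud layer.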

Table 4. Horizontal velocities of cloud bases measured by stereo reconstruction.

The interval *δt* was chosen subjectively for the following reasons. Recall from section 4 that the error in the reconstructed positions increases with distance between the cameras and the observed feature. When *δt* is small and the distance to the feature is large, the position-estimation error can exceed the distance the cloud feature travels in *δt*, which results in a low-accuracy velocity estimate. On the other hand, if *δt* is too large, then the cloud feature may exit the field of view, be obscured by another cloud, or evaporate during that interval. Hence, it was not possible to define a general rule for setting *δt*. A short interval of *δt* = 30 s was used for the low-level clouds of 8 and 9 April, whose feature points were in the range 2–5 km from the cameras. For the high-altitude clouds of 4 and 9 April, whose feature points were at distances exceeding 8 km from the cameras, a longer time interval of *δt* = 300 s was used.
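The trade-off between position error and *δt* can be made concrete with a rough sketch. It assumes, for illustration only, that the position errors at the two times are independent with the same standard deviation, so that the velocity error is √2 times the position error divided by *δt*. The position-error values below are hypothetical and are not taken from this study.

```python
import math


def velocity_error(position_error_m, dt_s):
    """Velocity error (m/s) of a two-point finite difference.

    Assumes independent position errors of equal standard deviation
    at the two times (an illustrative assumption).
    """
    return math.sqrt(2.0) * position_error_m / dt_s


def min_dt_for_target(position_error_m, target_velocity_error_ms):
    """Shortest interval (s) that keeps the velocity error below a target."""
    return math.sqrt(2.0) * position_error_m / target_velocity_error_ms


# Hypothetical position errors: small for nearby clouds, large for distant ones.
near_error_m = 20.0
far_error_m = 200.0

print(velocity_error(near_error_m, 30.0))    # nearby feature, short interval
print(velocity_error(far_error_m, 30.0))     # distant feature, short interval
print(velocity_error(far_error_m, 300.0))    # distant feature, long interval
print(min_dt_for_target(far_error_m, 1.0))   # interval needed for 1 m/s error
```

Under these assumptions, a tenfold larger position error requires a tenfold longer interval to reach the same velocity error, which is consistent with using a longer *δt* for the more distant clouds.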

The results are listed in Table 4 for the altocumulus layer of 4 April, the stratocumulus layer of 8 April, and the cirrocumulus and shallow cumulus layers of 9 April. For each of the four cloud layers, this table reports the mean of the height, *u*, and *υ* of the sampled feature points, as well as the standard deviations of those distributions. These data are plotted in Fig. 14 along with radiosonde measurements from the nearby Miami airport (obtained from the website of the Department of Atmospheric Science at the University of Wyoming, http://weather.uwyo.edu/upperair/sounding.html). The location of the radiosonde releases is about 10 km inland from the cameras. The radiosondes are launched twice daily, at 1105 UTC (local morning) and 2305 UTC (local evening). Figure 14 displays the morning soundings for 4 and 9 April, and the evening sounding for 8 April. It is not always possible to observe clouds contemporaneously with radiosonde releases, so we picked the times as close to the radiosonde release times as we could. These radiosonde releases are about 2.5 h before the stereo data on 4 April, 2.5 h after the stereo data on 8 April, and nearly contemporaneous with the stereo data on 9 April. In spite of the fact that the radiosonde data are collected at a somewhat different time and location than the stereo data, the agreement between the reconstructed winds and the radiosonde winds is quite good.

The error bars in Fig. 14 show the standard deviations in the measurements; that is, they correspond to the two rightmost columns of Table 4. Table 5 lists the expected errors in the horizontal velocities caused by parametric uncertainties and image-identification errors. From the last row of this table, we see that the relative errors are significantly larger for the *υ* speeds than for the *u* speeds. This can be understood as follows. Since the cameras face approximately south, the *υ* speeds correspond to motion roughly parallel with the lines of sight, whereas the *u* speeds correspond to motion roughly perpendicular to the lines of sight. With a single camera, we can readily detect motions that are transverse to the line of sight, but we cannot detect any motion of a point moving along the line of sight. With a second camera, the motion of a point moving along the first camera’s line of sight can be detected, but the displacement in the image plane is small compared to the displacement of a point moving with the same speed transverse to the first camera’s line of sight. This reduces the signal-to-noise ratio when calculating speeds from displacements in the image planes. Therefore, the speeds of objects moving in the north–south direction are harder to measure accurately in our setup than the speeds of objects moving in the east–west direction.
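The geometric argument above can be illustrated with an idealized two-dimensional sketch. It assumes two cameras on a baseline perpendicular to the first camera's line of sight, which is a simplification and not the actual camera layout of this study; the baseline, range, and displacement values are hypothetical. The snippet compares the angular shift produced by a transverse displacement with the shift that the second camera sees for the same displacement along the first camera's line of sight.

```python
import math


def angular_shift(camera_xy, point_before, point_after):
    """Absolute change (rad) in the bearing of a point as seen from a camera."""
    a0 = math.atan2(point_before[1] - camera_xy[1], point_before[0] - camera_xy[0])
    a1 = math.atan2(point_after[1] - camera_xy[1], point_after[0] - camera_xy[0])
    return abs(a1 - a0)


# Hypothetical geometry (m): camera 1 at the origin looking along +y,
# camera 2 offset by the baseline along +x, feature on camera 1's line of sight.
baseline_m = 1000.0
range_m = 10000.0
step_m = 100.0

camera_1 = (0.0, 0.0)
camera_2 = (baseline_m, 0.0)
feature = (0.0, range_m)

# Motion transverse to camera 1's line of sight, seen by camera 1.
transverse_shift = angular_shift(camera_1, feature, (step_m, range_m))

# Motion along camera 1's line of sight: invisible to camera 1 ...
along_shift_cam1 = angular_shift(camera_1, feature, (0.0, range_m + step_m))
# ... and only weakly visible to camera 2.
along_shift_cam2 = angular_shift(camera_2, feature, (0.0, range_m + step_m))

ratio = along_shift_cam2 / transverse_shift
print(along_shift_cam1, ratio, baseline_m / range_m)
```

In this idealized geometry the ratio of the two angular shifts is roughly the baseline divided by the range, so the image-plane signal for along-sight motion shrinks as the feature gets farther away relative to the camera separation.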

Table 5. As in Table 3, but for horizontal velocities. The relative error in the last row is obtained by dividing by the mean speeds from Table 4 and rounding to the first significant digit.

Comparing the last two columns of Table 4 to the second-to-last row of Table 5, we see that the standard deviations of the measured speeds for the 9 April Cu are significantly larger than the total expected errors for that case. The video sequence of 9 April shows that the sampled cumulus clouds are active, and that the features are not very sharp due to the early-morning light not yet illuminating those clouds (see the low-altitude and gray-looking cumuli in Fig. 13). These two conditions deteriorate the performance of feature identification and matching between contemporaneous and consecutive frames, and this leads to the large standard deviation of velocity measurements for the 9 April cumuli.

## 7. Summary

This paper presents a method for performing stereophotogrammetry of clouds over the open ocean. By eliminating the requirement of landmarks in the field of view, this approach widens the potential uses of stereophotogrammetry to the study of tropical oceanic convection. The key to calibrating without landmarks is to tune the calibration parameters to minimize the sum of squared distances (in the image plane) of 1) image points from their respective epipolar line for a collection of random objects (clouds, birds, ships, etc.), and 2) the actual horizon from the projected horizon.

In the field, the location of cameras is often dictated by practical constraints, such as the availability of electrical power and Internet connectivity. This can result in the baseline—the line connecting the two cameras—being close to or within the field of view. As shown here, the precision of the reconstruction suffers the closer a cloud feature is to the baseline. Precision also degrades with distance from the camera, as expected.

Despite these limitations, the accuracy of the stereo reconstruction is quite good. The stereo-reconstructed and lidar-measured cloud-base heights show good agreement, as do the stereo-reconstructed and radiosonde-measured winds. An error analysis leads to the conclusion that the uncertainty in the reconstructed heights of shallow clouds is less than 2%, with the uncertainty increasing with altitude to values as large as 8% for cirrocumulus. Although the comparisons with lidar and radiosondes did not reveal any obvious biases in the reconstructions, a more detailed comparison is precluded by the fact that the lidar is not collocated with the imaged clouds, and the radiosonde releases are neither collocated nor contemporaneous. A more definitive statement on the accuracy will require new measurements, such as images collected by a third camera or contemporaneous measurements by a scanning cloud radar.

Like any observational technique, stereophotogrammetry has its limitations. For example, data collection can be impeded when the view of the clouds being investigated is blocked by other clouds. This occlusion can occasionally interrupt attempts to track a cloud continuously in time. Environmental light can also impact the quality of photogrammetric measurements. Low light can result in a low signal-to-noise ratio and excess light can result in overexposure of images, both of which degrade the accuracy of feature matching. Nevertheless, even with basic off-the-shelf cameras, stereophotogrammetry can provide a large amount of instantaneous data for clouds in a wide field of view. For example, the development of a deep-convective turret 20 km away can be imaged with a resolution of a few tens of meters and a 0.1-Hz frame rate. Although manual tracking of features over image sequences is labor intensive and time consuming, algorithms from the field of computer vision should make it possible to automate the feature identification and matching process, enabling the processing of vast amounts of data in seconds.

## Acknowledgments

This work was supported initially by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under the U.S. Department of Energy Contract DE-AC02-05CH11231 and, subsequently, by the Hellman Fellows Fund. The Marine and Science Technology (MAST) Academy hosted one of the cameras, and many thanks are due to the school administration and technical staff, without whom this project would not have been possible.

## REFERENCES

Ackerman, T. P., and G. M. Stokes, 2003: The atmospheric radiation measurement program. *Phys. Today*, **56**, 38–44, doi:10.1063/1.1554135.

Allmen, M. C., and P. Kegelmeyer, 1996: The computation of cloud-base height from paired whole-sky imaging cameras. *J. Atmos. Oceanic Technol.*, **13**, 97–113, doi:10.1175/1520-0426(1996)013<0097:TCOCBH>2.0.CO;2.

Bradbury, D. L., and T. Fujita, 1968: Computation of height and velocity of clouds from dual, whole-sky, time-lapse picture sequences. SMRP Paper 70, Dept. of Geophysical Sciences, University of Chicago, 34 pp.

Bradski, G., and A. Kaehler, 2008: *Learning OpenCV: Computer Vision with the OpenCV Library*. O’Reilly, 555 pp.

Brown, D. C., 1966: Decentering distortion of lenses. *Photogramm. Eng.*, **32**, 444–462.

Collis, S., A. Protat, P. T. May, and C. Williams, 2013: Statistics of storm updraft velocities from TWP-ICE including verification with profiling measurements. *J. Appl. Meteor. Climatol.*, **52**, 1909–1922, doi:10.1175/JAMC-D-12-0230.1.

Damiani, R., and Coauthors, 2008: The Cumulus, Photogrammetric, In Situ, and Doppler Observations Experiment of 2006. *Bull. Amer. Meteor. Soc.*, **89**, 57–73, doi:10.1175/BAMS-89-1-57.

Davies-Jones, R. P., 1979: Dual-Doppler radar coverage area as a function of measurement accuracy and spatial resolution. *J. Appl. Meteor.*, **18**, 1229–1233, doi:10.1175/1520-0450-18.9.1229.

Fielding, M. D., J. C. Chiu, R. J. Hogan, and G. Feingold, 2013: 3D cloud reconstructions: Evaluation of scanning radar scan strategy with a view to surface shortwave radiation closure. *J. Geophys. Res. Atmos.*, **118**, 9153–9167, doi:10.1002/jgrd.50614.

Forsyth, D. A., and J. Ponce, 2003: *Computer Vision: A Modern Approach*. Prentice Hall, 689 pp.

Gallup, D., J.-M. Frahm, P. Mordohai, and M. Pollefeys, 2008: Variable baseline/resolution stereo. *2008 IEEE Conference on Computer Vision and Pattern Recognition*, IEEE, 2562–2569, doi:10.1109/CVPR.2008.4587671.

Ghate, V. P., M. A. Miller, and L. DiPretore, 2011: Vertical velocity structure of marine boundary layer trade wind cumulus clouds. *J. Geophys. Res.*, **116**, D16206, doi:10.1029/2010JD015344.

Hartley, R., and A. Zisserman, 2003: *Multiple View Geometry in Computer Vision*. Cambridge University Press, 655 pp.

Hu, J., A. Razdan, and J. A. Zehnder, 2009: Geometric calibration of digital cameras for 3D cumulus cloud measurements. *J. Atmos. Oceanic Technol.*, **26**, 200–214, doi:10.1175/2008JTECHA1079.1.

Kassander, A. R., and L. L. Sims, 1957: Cloud photogrammetry with ground-located K-17 aerial cameras. *J. Meteor.*, **14**, 43–49, doi:10.1175/0095-9634-14.1.43.

Kassianov, E., C. N. Long, and J. Christy, 2005: Cloud-base-height estimation from paired ground-based hemispherical observations. *J. Appl. Meteor.*, **44**, 1221–1233, doi:10.1175/JAM2277.1.

Kollias, P., and B. A. Albrecht, 2010: Vertical velocity statistics in fair-weather cumuli at the ARM TWP Nauru Climate Research Facility. *J. Climate*, **23**, 6590–6604, doi:10.1175/2010JCLI3449.1.

Kollias, P., B. A. Albrecht, R. Lhermitte, and A. Savtchenko, 2001: Radar observations of updrafts, downdrafts, and turbulence in fair-weather cumuli. *J. Atmos. Sci.*, **58**, 1750–1766, doi:10.1175/1520-0469(2001)058<1750:ROOUDA>2.0.CO;2.

Kollias, P., B. A. Albrecht, and F. D. Marks Jr., 2003: Cloud radar observations of vertical drafts and microphysics in convective rain. *J. Geophys. Res.*, **108**, 4053, doi:10.1029/2001JD002033.

Kollias, P., E. E. Clothiaux, M. A. Miller, B. A. Albrecht, G. L. Stephens, and T. P. Ackerman, 2007: Millimeter-wavelength radars: New frontier in atmospheric cloud and precipitation research. *Bull. Amer. Meteor. Soc.*, **88**, 1608–1624, doi:10.1175/BAMS-88-10-1608.

Kollias, P., N. Bharadwaj, K. Widener, I. Jo, and K. Johnson, 2014: Scanning ARM cloud radars (SACR’s). Part I: Operational sampling strategies. *J. Atmos. Oceanic Technol.*, **31**, 569–582, doi:10.1175/JTECH-D-13-00044.1.

Koppe, C., 1896: *Photogrammetrie und Internationale Wolkenmessung*. Braunschweig Verlag, 108 pp.

Malkus, J. S., and C. Ronne, 1954: On the structure of some cumulonimbus clouds which penetrated the high tropical troposphere. *Tellus*, **6A**, 351–366, doi:10.1111/j.2153-3490.1954.tb01130.x.

Mather, J. H., and J. W. Voyles, 2013: The ARM Climate Research Facility: A review of structure and capabilities. *Bull. Amer. Meteor. Soc.*, **94**, 377–392, doi:10.1175/BAMS-D-11-00218.1.

Orville, H. D., and A. R. Kassander Jr., 1961: Terrestrial photogrammetry of clouds. *J. Meteor.*, **18**, 682–687, doi:10.1175/1520-0469(1961)018<0682:TPOC>2.0.CO;2.

Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, 2007: *Numerical Recipes: The Art of Scientific Computing*. Cambridge University, 1256 pp.

Seiz, G., E. P. Baltsavias, and A. Gruen, 2002: Cloud mapping from the ground: Use of photogrammetric methods. *Photogramm. Eng. Remote Sens.*, **68**, 941–951.

Warner, C., J. Renick, M. Balshaw, and R. Douglas, 1973: Stereo photogrammetry of cumulonimbus clouds. *Quart. J. Roy. Meteor. Soc.*, **99**, 105–115, doi:10.1002/qj.49709941910.

Yuter, S. E., and R. A. Houze, 1995: Three-dimensional kinematic and microphysical evolution of Florida cumulonimbus. Part I: Spatial distribution of updrafts, downdrafts, and precipitation. *Mon. Wea. Rev.*, **123**, 1921–1940, doi:10.1175/1520-0493(1995)123<1921:TDKAME>2.0.CO;2.

Zehnder, J. A., L. Zhang, D. Hansford, A. Radzan, N. Selover, and C. Brown, 2006: Using digital cloud photogrammetry to characterize the onset and transition from shallow to deep convection over orography. *Mon. Wea. Rev.*, **134**, 2527–2546, doi:10.1175/MWR3194.1.

Zehnder, J. A., J. Hu, and A. Razdan, 2007: A stereo photogrammetric technique applied to orographic convection. *Mon. Wea. Rev.*, **135**, 2265–2277, doi:10.1175/MWR3401.1.