## 1. Introduction

There is a wide range of problems that involve transport processes in the atmospheric boundary layer (ABL). These include pollution emission, dispersion, and control and emergency actions in cases of chemical spills and chemical or biological agent dispersion. However, our knowledge of boundary layer processes is limited due in large part to the frequently changing large-scale forcings and the lack of measurements of nearly any atmospheric variable at higher altitudes. To gain new insights into the ABL transport processes and the dispersion of contaminants in and around cities to address some concerns of homeland security, the Joint Urban 2003 atmospheric dispersion field experiment (JU2003) was held in Oklahoma City, Oklahoma. Most noteworthy about this experiment is that for the first time two Doppler light detection and ranging (lidar) systems were deployed to collect hours of radial velocity data in the urban ABL.

Scanning lidars can make fast measurements of atmospheric variables to provide some critical information about the ABL dynamics (Banta et al. 1996, 2004). These measurements, however, are scattered in space and time. Integration of the information to provide a more complete picture of the atmospheric state is highly desirable. Thus, if data from active remote sensors are used to drive a numerical prediction model, the model can be used to derive the atmospheric conditions that were necessary to create the observed phenomena. In this way, we achieve a result that is greater than that which the models and sensors could provide independently. The four-dimensional variational data assimilation (4DVAR) method is based upon this notion to recover a time sequence of 3D velocity and temperature fields from limited observational radial velocity data. The 4DVAR uses optimal control theory to minimize the difference between model prediction and limited measurement by variational adjustment of the model’s control variables. The 4DVAR was used by Sun and Crook (1994) to assimilate mesoscale Doppler radar data in a dry gust-front case. Later, Sun and Crook (1997, 1998) applied it to a simulated convective storm. Gao et al. (2004) applied the 3DVAR to Doppler radar observations of a supercell storm. Hu et al. (2006a,b) also used the 3DVAR to study a cluster of tornadic thunderstorms. The 3DVAR is a simplified version of the 4DVAR that omits the time dependence of measurements. The 4DVAR was further applied to recover microscale ABL flow structures from single-Doppler-lidar data (Lin et al. 2001; Lin and Chai 2002; Chai and Lin 2003; Chai et al. 2004). Newsom and Banta (2004) later also developed a 4DVAR algorithm (hereafter, the NB method) for ABL retrieval, which is restricted to periodic boundary conditions and a fixed eddy viscosity profile.

In principle, the 4DVAR can assimilate datasets from multiple lidars for accurate retrieval. To reduce operational cost, a 4DVAR based upon a single-lidar dataset is desirable. However, for the single-lidar retrieval the velocity components in the direction orthogonal to the laser beam, that is, the cross-beam direction, are not available, making the accuracy of the retrieved flow structures uncertain. One way to assess the uncertainty and accuracy of the single-lidar retrieval is the identical twin experiment (ITE) (Lin et al. 2001), which uses the large-eddy simulation (LES) technique to generate synthesized ABL data. Because both synthesized observations and predicted variables are generated by the model, the model error cannot be accounted for in the ITE. Thus, the ITE tends to yield optimistic results. In the JU2003 field campaign, dual Doppler lidars were deployed to measure radial velocities in the urban boundary layer. The deployment of the two lidars was coordinated so that the second lidar could provide the missing cross-beam radial velocity data by scanning from an orthogonal or nearly orthogonal direction. This makes it possible to determine the fidelity of the data retrieved by the 4DVAR from a single-lidar dataset. There are 6 volume scans of U.S. Army Research Laboratory (ARL) and Arizona State University (ASU) dual-lidar data and 19 volume scans of ARL single-lidar data measured in the afternoon of 11 July 2003 during the JU2003 experiment. The 6-volume dual-lidar data are used in this paper (Part I) for accuracy assessment, and the 19-volume ARL single-lidar data are used in Lin et al. (2008, hereafter Part II) for the study of the ABL dynamics.

The objectives of this paper are threefold. The first objective is to examine the effect of eddy viscosity models on the quality of retrieved velocity fields. The second objective is to determine the fidelity of the data retrieved from a single-lidar dataset and evaluate the difference between single- and dual-lidar retrievals. The third objective is to identify turbulent structures above some geospatial features in the Oklahoma City metropolitan area. This is the first time that dual lidars were deployed together in a major field experiment. In particular, the second Doppler lidar supplies the cross-beam velocity information that was missing in the first Doppler lidar. That is, the dual-lidar datasets allow validation of the data retrieved from the single-lidar dataset and assessment of the error incurred by the missing cross-beam information. Although Newsom et al. (2005) had performed a similar analysis using the NB method, an additional independent assessment without the constraints of periodicity and fixed eddy viscosity, and with more dual-lidar volume data, is needed. The major differences between the current 4DVAR and the NB method are boundary conditions and control variables. The NB method uses periodic lateral boundary conditions. The current method uses lateral boundary conditions derived from the lidar data. The eddy viscosity and thermal diffusivity in the NB method are prescribed a priori using the profile of Troen and Mahrt (1986), which was developed for use in large-scale models whose spatial resolution is insufficient for resolving the ABL flow structures. The current method (Chai and Lin 2003) treats the eddy viscosity and thermal diffusivity as control variables. Development of an eddy viscosity model in the 4DVAR framework should consider two factors: easy implementation and well-posedness in optimization. The eddy viscosity model of Chai and Lin (2003) was developed to include these features. However, their method was only evaluated by the ITE approach. Furthermore, six volume scans of dual-lidar data are used in this study instead of two volumes as in Newsom et al. (2005). Use of different 4DVAR algorithms, boundary conditions, eddy viscosity models, and more lidar data over a longer time period can potentially contribute to the uncertainty of the quality of retrieved data. Thus, the reassessment of eddy viscosity models and 4DVAR algorithms based upon more dual-lidar data would certainly provide valuable information on the confidence of applying data assimilation techniques to study the ABL.

For the third objective, we aim to identify flow structures above some geospatial features on the land surface of the Oklahoma City metropolitan area for two purposes. The first purpose is to demonstrate that some coherent structures, such as roll vortices and thermal plumes, exist above notable ground building structures before presenting a more extensive analysis in Part II. The second purpose is to demonstrate the limitation of the 4DVAR that only a few physical realizations of the ABL structures over a short time duration can be retrieved at a time, leaving several questions about the flow physics and the land–atmosphere correlation unanswered. A technique that can process the retrieved data over a longer time period is needed to address these issues, leading to Part II of the paper. The instantaneous flow structures retrieved from the single-lidar dataset are discussed so that the discussions in Part II, which is based upon single-lidar retrieval, can be related to those in this Part I.

The paper is organized as follows. Section 2 reviews the numerical formulations of the present 4DVAR including the prediction model, the adjoint equations, and the optimization method. Section 3 describes the JU2003 field experiment observation. Section 4 discusses single- and dual-lidar retrieval results. The retrieved results are compared with lidar radial velocity data. The quality of single- and dual-lidar-retrieved data is assessed. The accuracy of the cross-beam velocity component retrieved from single-lidar data is discussed. The spatial correlation between retrieved ABL flow structures and building data in the Oklahoma City metropolitan area is identified. Concluding remarks are given in section 5.

## 2. Numerical method

In the current 4DVAR system, the 3D wind and temperature fields at the beginning of the assimilation period and the model parameters of eddy viscosity and thermal diffusivity are treated as control variables. The difference between model predictions and field observational data is measured by the cost function. All control variables are updated iteratively to minimize the cost function. A set of adjoint equations is derived via a variational approach to calculate the cost function gradients with respect to the control variables. The system consists of a prediction model and an adjoint-based optimization process, described briefly below. For a detailed description of the system, please refer to Lin et al. (2001), Lin and Chai (2002), Chai and Lin (2003), and Chai et al. (2004). Chai et al. (2004) applied this system to the radial velocity data measured by a high-resolution Doppler lidar (HRDL) during the Cooperative Atmospheric Surface Exchange Study in 1999 (CASES-99) and retrieved a dry microburst in the convective boundary layer (CBL).

### a. Prediction model

In the governing equations [Eqs. (1)–(3)], **U** = (*U*, *V*, *W*) in the (*x*, *y*, *z*) coordinates; **G** = (0, 0, *g*) with the gravitational acceleration *g*; *ρ*_{o} is the reference density; *θ*, Θ, and Θ_{o} are the fluctuating, background (function of *z* only), and reference virtual potential temperatures, respectively. Lowercase variables denote fluctuating parts of variables. Eddy viscosity *ν*(*z*) and thermal diffusivity *κ*(*z*) are assumed functions of height. Dependent variables are arranged on a staggered, orthogonal grid. The time and space derivatives in the governing equations are discretized using the second-order Adams–Bashforth method and the second-order finite difference method. A pressure Poisson equation derived from Eqs. (1) and (2) is solved to enforce the mass conservation. Although the Boussinesq approximation is reasonable for most ABL situations that meet the shallow convection conditions (Stull 1988), the effect of density variation should be taken into account for mesoscale atmospheric flows. The modification of the current 4DVAR to account for the density effect can be achieved by invoking an anelastic approximation as in Sun and Crook (1997).
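As an illustration of the time discretization named above, the following minimal sketch applies the second-order Adams–Bashforth scheme to a scalar ordinary differential equation. The function name `adams_bashforth2`, the forward Euler start-up step, and the linear-decay test problem are assumptions made for this example; they are not taken from the 4DVAR code.

```python
import math

def adams_bashforth2(rhs, y0, dt, n_steps):
    """Integrate dy/dt = rhs(t, y) from t = 0 with second-order Adams-Bashforth."""
    t, y = 0.0, y0
    f_prev = rhs(t, y)
    # The two-step scheme needs one earlier tendency, so the first step
    # is taken with forward Euler (an assumption of this sketch).
    y = y + dt * f_prev
    t += dt
    for _ in range(n_steps - 1):
        f_curr = rhs(t, y)
        # AB2 update: extrapolate the tendency from the last two time levels.
        y = y + dt * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
        t += dt
    return y

# Hypothetical test problem: dy/dt = -y with y(0) = 1, exact solution exp(-t).
approx = adams_bashforth2(lambda t, y: -y, 1.0, 0.01, 100)
error = abs(approx - math.exp(-1.0))
```

In a flow solver the same update would be applied to each prognostic field, with `rhs` standing for the discretized advection, diffusion, and buoyancy terms.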

To apply the 4DVAR model to a CBL, a surface momentum and temperature flux model is implemented to enforce the Monin–Obukhov similarity theory at the first vertical grid level immediately above the surface. At the domain top, the Neumann gradient-free boundary condition is imposed for *U*, *V*, and *θ*, whereas a nonpermeable boundary condition *W* = 0 is applied for *W*. At the lateral boundaries, horizontal mean *U*, *V*, *W* (=0) derived from the lidar data (section 3c), and *θ* (=0) are used to impose the Dirichlet boundary condition.

### b. Optimization method

Details on the control variables, namely the initial fields **U** and *θ* and the model parameters *ν*(*z*) and *κ*(*z*), which are time independent throughout the assimilation time window, can be found in Lin et al. (2001) and Chai and Lin (2003). The initial guess for **U** is the mean value derived from a velocity–azimuth display (VAD) technique to be described in section 3c. The initial guess for *ν*(*z*) is based upon the profile of Troen and Mahrt (1986), which was used by Newsom et al. (2005) in their 4DVAR study [Eq. (4)], where *z*_{max} is the height of the computational domain, *ν*_{max} = 20 is the maximum value of *ν*(*z*), and *χ* = 4 is a parameter that controls the shape of the eddy viscosity profile. The initial profile of eddy diffusivity *κ*(*z*) is approximated by *κ*(*z*) = 2*ν*(*z*). These profiles will be adjusted accordingly during the optimization process. Leredde et al. (1999) showed that data assimilation can estimate a time- and space-dependent turbulent viscosity using a 3D oceanic circulation model. In their study, the eddy viscosity was the only control variable. Here, the 3D wind and temperature fields are the primary control variables. Addition of time-dependent 3D eddy viscosities to the control variables changes the ratio of observations to control variables and subsequently the condition number of the optimization problem, which may result in an ill-posed problem (Chai and Lin 2003).

In the cost function [Eq. (5)], *u*_{r} and *u*_{ob} are the retrieved and observational radial velocities, respectively; Σ_{t} and Σ_{x,y,z} denote summations over time and space, respectively; **∇** and ∇^{2} are the respective divergence and Laplacian operators; and *α* is the validity coefficient indicative of the quality of observational data, serving to reduce the contribution from data points with low signal-to-noise ratios (SNR). In Chai et al. (2004), *α* was taken as unity for good observations and zero for bad ones. In the present study, it is determined by *α* = 1/*σ*^{2} (Kalnay 2003), where *σ* is the radial velocity observational error (Newsom and Banta 2004). From single-lidar retrieval to dual-lidar retrieval, the major change made in the cost function is the first term on the right-hand side of Eq. (5). The index *P* denotes the *P*th lidar, and the contribution from each lidar is added to the cost function. The second term is a penalty term representing the nondivergence constraint on the initial velocity field, with the penalty coefficient *β* = 100 (Lin et al. 2001). The spatial smoothness constraint imposed through the last term is limited to no more than 20% of the total cost function by adjusting *ξ*_{f} dynamically (Lin and Chai 2002).

The adjoint equations provide the gradients of the cost function with respect to the control variables. These gradients are then used by the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method (Liu and Nocedal 1989) to find the optimal initial **U** and *θ* fields and the optimal model parameters *ν*(*z*) and *κ*(*z*) for the prediction model. The optimization process is repeated until the convergence criterion that measures the difference between retrieved data and observational data is met.
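The role of the validity coefficient can be illustrated with a small sketch of an observation-misfit sum weighted by *α* = 1/*σ*^{2}. This is only an assumed form of the first cost-function term: any normalization factor, the summation over lidars, and the penalty and smoothness terms are omitted, and the function name `observation_cost` and the sample values are hypothetical.

```python
def observation_cost(u_r, u_ob, sigma):
    """Sum of alpha * (u_r - u_ob)**2 with alpha = 1 / sigma**2.

    Observations flagged as bad (u_ob is None) contribute nothing.
    """
    total = 0.0
    for retrieved, observed, err in zip(u_r, u_ob, sigma):
        if observed is None:
            continue  # data-void or rejected point
        alpha = 1.0 / err ** 2  # noisier observations get smaller weight
        total += alpha * (retrieved - observed) ** 2
    return total

# Hypothetical radial velocities (m/s): the second point has a larger misfit
# but also a larger observational error, so it is down-weighted.
cost = observation_cost([1.0, 2.0, 3.0], [1.2, 1.0, None], [0.2, 1.0, 0.5])
```

With these numbers the two valid points contribute about equally, although their raw misfits differ by a factor of five.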

## 3. Observational data

### a. Dual-lidar data

The dual-Doppler lidar data were measured during the JU2003 field experiment, which was conducted in Oklahoma City from 28 June to 31 July 2003. The dataset used in the present study was collected from 1756:37 to 1805:17 UTC [locally from 1256:37 to 1305:17 central daylight time (CDT)] 11 July 2003. It was a typical convective early afternoon, which was cloud free and hot. The lidar data were grouped by volume scan. A total of six volume scans of dual-lidar data were measured during that period. The time duration for each volume scan was 87 s. According to Lin et al. (2001), two volume scans of lidar data are needed for the 4DVAR. Thus, three sets of 4DVAR experiments can be conducted with each using two volume scans of data. The assimilation time window for two volume scan data is *t*_{w} = 2 × 87 = 174 s. The ARL and ASU lidars are nearly identical in their designs (WindTracer by Coherent Technologies, Inc. in Lafayette, Colorado) except that the ARL lidar is one model number later. The solid-state laser transceivers employed in these lidars were operated at a wavelength of 2 *μ*m with a 400-ns (60 m) 2-mJ pulse. Both lidars were configured with the same signal processing parameters and performed a series of range–height indicator (RHI) scans. For the ARL lidar, there are 13 RHI sectors in each scan volume with the azimuth angle from 184.4° to 244.4° clockwise from north in 5° steps. The RHI sectors are spanned with elevation angles varying from 0° to 45° at a speed of 8° s^{−1}. The ASU lidar performed a similar type RHI scan with azimuth angles ranging from 274.8° to 334.8°. The observational data were processed with 66-m range gates and 100 pulse averaging at a pulse repetition frequency (PRF) of 500 Hz. There was no oversampling in space and time. Thus the time resolution is 0.2 s and the space resolution is 66 m.
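The scan parameters quoted above are mutually consistent, as the short arithmetic check below suggests. It assumes that the 60 m quoted for the pulse refers to the range extent *cτ*/2 of a 400-ns pulse; the variable names are chosen for this example only.

```python
C_LIGHT = 299_792_458.0  # speed of light (m/s)

pulse_duration_s = 400e-9   # 400-ns pulse
pulses_averaged = 100       # pulses per radial velocity estimate
prf_hz = 500.0              # pulse repetition frequency
scan_duration_s = 87.0      # one volume scan
scans_per_window = 2        # volume scans per 4DVAR experiment

# Range extent of the pulse, assuming the two-way-travel factor of 1/2.
pulse_length_m = C_LIGHT * pulse_duration_s / 2.0

# Time needed to accumulate one averaged estimate.
time_resolution_s = pulses_averaged / prf_hz

# Assimilation time window t_w.
assimilation_window_s = scans_per_window * scan_duration_s

# Number of RHI sectors between 184.4 and 244.4 degrees in 5-degree steps.
n_rhi_sectors = round((244.4 - 184.4) / 5.0) + 1
```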

There are notable ground building structures in the satellite map of Oklahoma City (Fig. 1), such as parks (denoted by “PARK” in the figure), central business districts (CBD), restaurants (R), highways, and rivers. The coordinates of the ARL lidar acquired by handheld GPS receivers are 35°28.385′N, 97°30.266′W and those of the ASU are 35°26.330′N, 97°29.553′W. The locations of the two Doppler lidars are represented by two small circles in the figure. The ARL lidar was deployed 1400 m east-northeast of CBD atop a parking garage. The ASU lidar was deployed 3804 m to the south and 1090 m to the east of the ARL lidar.

### b. Observational error

The original lidar dataset contains a variety of data, such as radial velocity, SNR, time, azimuth angle, and elevation angle. The SNR is a performance measure for coherent lidar observational samples. Low SNR data usually correspond to weak return signals. The SNR threshold technique is used for quality control. The lidar data whose SNR are lower than a certain value are tagged as “bad” points. Those “bad” radial velocities are not used in the 4DVAR.

Figure 2 displays the radial velocity and SNR data for the ARL and ASU lidars taken at the middle of the assimilation time window, *t* = *t*_{w}/2 = 87 s. Figures 2a and 2c show that the data quality of the ARL lidar decreases beyond a range of 5 km. Figures 2b and 2d show that there are few good ASU observational data beyond 4 km. Therefore the observational data beyond 4 km for the ASU and 5 km for the ARL are excluded in the 4DVAR. The data quality of the ARL and ASU lidars below the height of 1200 m is better than in the region above. This height roughly corresponds to the location of the capping layer.

The random noise, which is induced by the mean frequency estimator, is the dominant source of error in the Doppler lidar observational data (Rye and Hardesty 1993; Frehlich and Yadlowsky 1994). The relationship between radial velocity observational error *σ* and SNR is displayed in Fig. 3, which is obtained experimentally (Newsom et al. 2005). Both lidars have similar radial velocity errors of approximately 15 cm s^{−1} in the high SNR regime. The ARL curve is computed from the fixed-beam data, which were measured between 2228:58 and 2258:56 UTC 8 July 2003. The ASU curve is computed from the fixed-beam data, which were acquired between 2306:46 and 0646:46 UTC 14 July 2003. The radial velocity observational error *σ* is factored into the optimization algorithm through the validity parameter *α* in Eq. (5) to reduce the influence of low SNR data on the cost function. Because the SNR data of the ASU lidar are generally lower than those of the ARL lidar and vary between −4 and −7, incorporation of the observational error in the 4DVAR has more effect on the ASU data than on the ARL data.

### c. Mean wind and temperature profiles

During the JU2003 field campaign, the sounding data of pressure, air temperature, and dewpoint temperature were collected using a radiosonde that was released from a site located 2.6 km west and slightly south of the ARL lidar inside the CBD. This was at 1800 UTC (1300 LT) 11 July 2003, very close to the time 1756:37–1805:17 UTC for the 4DVAR. The virtual potential temperature profile calculated using those sounding data is exhibited in Fig. 4 (Jacobson 1999). The height of the capping inversion layer is about 1200 m. The large slope of virtual potential temperature profile near the ground is different from the observation during the CASES-99 (refer to Fig. 8 in Chai et al. 2004), indicating the presence of a deep superadiabatic layer below 500 m.

Equation (6) is used to estimate the mean velocities *U*_{b} and *V*_{b} at a given height; *φ* is the elevation angle, *ϕ* is the azimuth angle, and *u*_{ob} is the same observational radial velocity as that used in Eq. (5). The running index *j* represents each of the observations in a given bin. The mean vertical velocity is assumed zero.
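A VAD-type estimate of this kind can be sketched as a two-parameter least-squares fit. The sketch assumes the common projection *u*_{ob} = *U*_{b} cos *φ* sin *ϕ* + *V*_{b} cos *φ* cos *ϕ*, with the azimuth measured clockwise from north, *U*_{b} eastward, *V*_{b} northward, and zero mean vertical velocity. The function name `fit_mean_wind` and the synthetic observations are hypothetical, and the exact functional minimized in Eq. (6) may differ.

```python
import math

def fit_mean_wind(observations):
    """Least-squares fit of (U_b, V_b) from (azimuth_deg, elevation_deg, u_ob).

    Assumes u_ob = U_b*cos(el)*sin(az) + V_b*cos(el)*cos(az), with azimuth
    clockwise from north and zero mean vertical velocity.
    """
    saa = sab = sbb = sau = sbu = 0.0
    for az_deg, el_deg, u_ob in observations:
        az, el = math.radians(az_deg), math.radians(el_deg)
        a = math.cos(el) * math.sin(az)  # projection weight of U_b (eastward)
        b = math.cos(el) * math.cos(az)  # projection weight of V_b (northward)
        saa += a * a
        sab += a * b
        sbb += b * b
        sau += a * u_ob
        sbu += b * u_ob
    # Solve the 2x2 normal equations by Cramer's rule.
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("beam directions do not constrain both components")
    u_b = (sau * sbb - sbu * sab) / det
    v_b = (sbu * saa - sau * sab) / det
    return u_b, v_b

# Synthetic bin: a pure southerly wind of 8 m/s sampled over 13 azimuths.
true_u, true_v = 0.0, 8.0
obs = []
for k in range(13):
    az_deg, el_deg = 184.4 + 5.0 * k, 10.0
    a = math.cos(math.radians(el_deg)) * math.sin(math.radians(az_deg))
    b = math.cos(math.radians(el_deg)) * math.cos(math.radians(az_deg))
    obs.append((az_deg, el_deg, true_u * a + true_v * b))
u_b, v_b = fit_mean_wind(obs)
```

Repeating the fit for the observations in each height bin gives a mean velocity profile.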

The mean velocity profiles estimated from Eq. (6) are displayed in Fig. 5. The profiles obtained from the ARL lidar data are similar to those from the ASU lidar data below the capping layer. The slight discrepancy above the capping layer is attributable to the lack of good ASU data, as shown in Fig. 2. The mean velocity profiles obtained from the ARL lidar data are used in the specification of lateral boundary conditions as well as the first-guess fields for both single- and dual-lidar retrievals. Because the mean *U* velocity is nearly zero below the capping layer, the mean flow there is a strong southerly wind, which is common for this location at this time of year.

### d. Data mapping and domain size

Since lidar observations are recorded in the spherical coordinate system, they are mapped to the Cartesian computational grid. The grid resolutions for the data are 5° in azimuth angle, approximately 1.7° in elevation angle, and 66 m in the radial direction. For any computational grid point inside the scan volume, the radial velocity, its corresponding observational time, and SNR data are calculated by averaging data at the eight vertices of the spherical grid element that encloses the grid point. The data are weighted by the distance between the vertices and the grid point. Bad observational data determined by the lidar quality control algorithm are excluded from averaging. If all eight vertices are bad, the radial velocity at that grid point is set to a very large number, signifying a void point in the scan volume.
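The mapping step can be sketched as a weighted average over the vertices of the enclosing spherical cell. The text does not state the exact weight function, so inverse-distance weights are assumed here; the sentinel value `VOID` and the function name `map_to_grid_point` are likewise hypothetical.

```python
import math

VOID = 1.0e10  # hypothetical sentinel marking a data-void grid point

def map_to_grid_point(grid_xyz, vertices):
    """Average radial velocity from the vertices of the enclosing cell.

    vertices: list of ((x, y, z), radial_velocity, is_good) tuples.
    Assumption of this sketch: inverse-distance weights.
    """
    weight_sum = 0.0
    value_sum = 0.0
    for position, velocity, is_good in vertices:
        if not is_good:
            continue  # skip points rejected by quality control
        distance = math.dist(grid_xyz, position)
        if distance == 0.0:
            return velocity  # grid point coincides with a good observation
        weight = 1.0 / distance
        weight_sum += weight
        value_sum += weight * velocity
    if weight_sum == 0.0:
        return VOID  # every vertex was bad
    return value_sum / weight_sum

# Three vertices for brevity (a full cell has eight); the third is bad.
cell = [
    ((0.0, 0.0, 0.0), 1.0, True),
    ((2.0, 0.0, 0.0), 3.0, True),
    ((0.0, 2.0, 0.0), 9.0, False),
]
value = map_to_grid_point((1.0, 0.0, 0.0), cell)
```

The same routine could be reused for the observational time and SNR by passing those quantities in place of the radial velocity.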

A computational domain of 5.5 km × 5.5 km × 2.0 km is employed in the 4DVAR analysis. All cases use a grid of 66 × 66 × 34. Axes *x*, *y*, and *z* are directed east, north, and upward, respectively, consistent with the lidar coordinates given below. The grid distance is chosen to be consistent with the grid resolution along the laser beam. The origin is located at the lower-left corner of the domain. The ARL lidar is located at (5000, 5000, 0) m, and the ASU lidar is at (6090, 1196, 0) m in the coordinate system of the computational domain. This domain is chosen to cover most of the scan volume. Following the suggestion of Chai et al. (2004), we use a large model domain with buffer zones, which are data-void regions between the domain boundaries and the lidar scan volume. These buffer zones help minimize effects induced by lateral boundary conditions. In the computational domain, there are approximately 40 000 good observational data points from the ARL lidar and only 21 000 from the ASU lidar.

## 4. Retrieval results

In this section, the retrieval results using both single- and dual-lidar data are presented to address three objectives. The first objective is to evaluate the sensitivity of retrieval to eddy viscosity models. Chai and Lin (2003) concluded from the identical twin experiments that the dynamic model approach is superior to the prescribed viscosity model whose eddy viscosity profile is given a priori and is fixed during the 4DVAR. Here we aim to assess the effect of the fixed model approach on the accuracy of retrieval by comparing it with the dynamic model solutions. The second objective is to quantify the difference between single- and dual-lidar retrieved data. The third objective is to identify the physical correlation between turbulent structures and the geospatial features in the Oklahoma City metropolitan area. For single-lidar retrieval, the ARL observational data are adopted because the ARL data are about twice as abundant as the ASU data. The single-lidar retrieved velocities are then compared with the ARL and ASU observational radial velocities. For the dual-lidar retrieval, the retrieved results are compared with the ARL and ASU data as well as with the single-lidar retrieval. In what follows, we first discuss the convergence criterion and accuracy measure before addressing these objectives.

### a. Convergence criterion and accuracy measure

The retrieved radial velocity is denoted by *u*_{r}, whereas the lidar observational radial velocity is denoted by *u*_{ob}. The root-mean-square (rms) misfit between *u*_{r} and *u*_{ob} is denoted by Δ. Figure 6 shows that the misfit Δ decreases with increasing iteration. In the first 10 iterations Δ is reduced from 0.85 to 0.4 m s^{−1}. The convergence rate slows down after about 50 iterations. For instance, from iteration 50 to 200, Δ is reduced only from 0.14 to 0.08 m s^{−1}. From Fig. 3, the radial velocities measured by both lidars have a precision of slightly less than 0.2 m s^{−1}. Thus, we adopt the convergence criterion given by Eq. (7). As will be discussed in section 4c, the use of a strict convergence criterion Δ ≤ 0.1 m s^{−1} leads to retrieved data that agree better with one lidar dataset but worse with the other.

The correlation coefficient between two variables *ξ* and *η* is defined by Eq. (8). The rms deviation is given by Eq. (9). The overbars in Eqs. (8) and (9) denote averaging over the grid points within the overlap region of the two lidar scans where both ARL and ASU observational radial velocity data are available for comparison. CORR(*ξ*, *η*) and RMS(*ξ*, *η*) are used to measure the difference between *u*_{r} and *u*_{ob}, namely, (*ξ*, *η*) = (*u*_{r}, *u*_{ob}).
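For readers who want to reproduce such statistics, the sketch below computes a correlation coefficient and an rms deviation for two samples. It assumes the standard Pearson form with the means removed and a plain root-mean-square difference; the definitions in Eqs. (8) and (9) may differ in detail, and the function names and sample values are hypothetical.

```python
import math

def corr(xi, eta):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xi)
    mean_xi = sum(xi) / n
    mean_eta = sum(eta) / n
    cov = sum((a - mean_xi) * (b - mean_eta) for a, b in zip(xi, eta))
    var_xi = sum((a - mean_xi) ** 2 for a in xi)
    var_eta = sum((b - mean_eta) ** 2 for b in eta)
    return cov / math.sqrt(var_xi * var_eta)

def rms(xi, eta):
    """Root-mean-square deviation between two equal-length samples."""
    n = len(xi)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, eta)) / n)

# Hypothetical retrieved and observed radial velocities (m/s) at grid points
# in the overlap region of the two scans.
u_r = [1.0, 2.0, 3.0, 4.0]
u_ob = [1.1, 1.9, 3.2, 3.8]
corr_value = corr(u_r, u_ob)
rms_value = rms(u_r, u_ob)
```

Void grid points would have to be filtered out before calling either function, since both assume every pair is a valid comparison.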

The height-based correlation coefficient [Eq. (10)], evaluated between vertical grid levels *M*_{1} and *M*_{2}, is used to compare the retrieved fluctuating components, where *f* represents any retrieved fluctuating velocity or temperature at a vertical grid level *k*. The subscripts *i* and *j* denote cases *i* and *j*, for instance, single-lidar case *i* and dual-lidar case *j*. The height-based rms error is calculated by Eq. (11). The overbars in Eqs. (10) and (11) denote averaging at a vertical grid level *k*. In the calculation of Eqs. (10) and (11), *M*_{1} = 1 and *M*_{2} = 21, that is, below the capping layer where *z* = 1200 m, are used unless otherwise noted.

### b. Effect of eddy viscosity models

In this section, the results of the dynamic eddy viscosity model are compared with those of the fixed eddy viscosity model to assess the effect of eddy viscosity on the retrieved field. The fixed viscosity profile is given by Eq. (4) as in Newsom et al. (2005). Figure 7a shows the retrieved fluctuating velocity field in a vertical plane at *x* = 4000 m for the fixed model case 1a. This plane passes the CBD at *y* = 4300 m. Figure 7b shows the corresponding retrieved fluctuating virtual potential temperature field. The updraft (downdraft) regions correspond to the positive (negative) fluctuating virtual potential temperature, which is typical in the CBL.

The cross sections for the dynamic model case 1b at the same location as case 1a are exhibited in Fig. 8. A comparison of Figs. 7 and 8 shows that both models capture similar turbulent structures. To quantify the difference with the observational data, Table 2 lists the CORR coefficient and rms deviation. In the table, the “would be” radial velocity *u*_{r} based upon the coordinates of the ARL and ASU lidars are reconstructed and compared with the lidar observational radial velocity *u*_{ob}. The results show that the dynamic model yields slightly better results than the fixed model regardless of lidar volume data, for example, cases 1a versus 1b, cases 2a versus 2b, and cases 3a versus 3b. This is expected because the fixed eddy viscosity profile of Troen and Mahrt (1986) is for use in large-scale models that have an insufficient spatial resolution for the ABL. To visualize the differences, the 3D velocity vectors of case 1a are subtracted from those of case 1b. Figure 9 shows the difference vectors in a vertical plane at *x* = 4000 m. The discrepancy in the *w* velocity component appears to be most evident in several places. The height-based correlation coefficients *γ*_{u}, *γ*_{υ}, *γ*_{w}, and *γ*_{θ} between cases 1a and 1b are 0.985, 0.989, 0.980, and 0.953, respectively. The rms errors ε_{u}, ε_{υ}, and ε_{w} are 0.125, 0.113, and 0.155 m s^{−1}, and ε_{θ} is 0.130 K. Figure 10 further displays the vertical distributions of the relative rms velocity discrepancies Δ*U*, Δ*V*, and Δ*W* between cases 1a and 1b normalized by the velocity magnitude of case 1b. Throughout the ABL depth except near the ground *z* < 80 m, the relative rms velocity discrepancy Δ*W* is greater than Δ*U* and Δ*V*. Thus, the difference between the two retrieved fields mainly lies in the *w* velocity component, which subsequently affects the *θ* field because of the coupling of Eqs. (2) and (3) through the buoyancy term. Since there is no temperature observational data, *γ*_{θ} is expected to be smallest among all variables.

To understand why the *w* velocity component is most affected by the viscosity model, the vertical distributions of velocity variances for case 1b are displayed in Fig. 11. As in a typical CBL, the *w* variance has a local maximum around the mid-CBL height. A comparison of the eddy viscosity profiles in Fig. 12 shows that the fixed viscosity is much greater than the retrieved viscosity, especially in the range of *z* = 200–600 m. The excessive viscosity could greatly affect the *w* velocity field, resulting in the difference in Figs. 9 and 10. The retrieved eddy viscosity profiles for cases 2b and 3b (not shown) also consistently show that the fixed model overpredicts viscosity except near the wall and the capping layer. The viscosity profile retrieved from the dual-lidar retrieval case 1c is also displayed in Fig. 12 for comparison. The similarity between single- and dual-lidar-retrieved profiles supports the notion that the eddy viscosity profile is dependent upon instantaneous turbulent structures and suggests that the viscosity profile cannot be accurately prescribed a priori. Hereafter, we shall adopt a dynamic model for assessment of the accuracy of single- and dual-lidar retrieval.

### c. Accuracy of single-lidar retrieval

In this section, we assess the accuracy of the single-lidar-retrieved data by comparing with the observational data. The correlations CORR(*u*^{ARL}_{r}, *u*^{ARL}_{ob}) for single-lidar cases 1b, 2b, and 3b are 0.994–0.995 with a rms deviation of 0.16–0.17 m s^{−1} (Table 2). Because the mean velocity below the capping layer is about 8 m s^{−1} (Fig. 5), the relative rms deviation in the along-beam direction is about 2%. This indicates that *u*^{ARL}_{r} and *u*^{ARL}_{ob} are well correlated. This is expected because the retrieved results are derived from the ARL observational radial velocities. Figure 13a shows that the histogram of the difference *u*^{ARL}_{r} − *u*^{ARL}_{ob} for case 1b sharply peaks at zero, and Fig. 13c shows that the correlation diagram exhibits an almost linear relationship between *u*^{ARL}_{r} and *u*^{ARL}_{ob}.

To assess how well the ARL single-lidar retrieval can recover the missing information in the cross-beam direction contained in the ASU data, we compare the values of the CORR coefficient and rms deviation between *u*^{ASU}_{r} and *u*^{ASU}_{ob} in Table 2. It is noted that *u*^{ASU}_{r} is reconstructed from the ARL single-lidar-retrieved velocity field. The CORR coefficients for cases 1b, 2b, and 3b vary from 0.964 to 0.927 and the rms deviations range from 0.907 to 1.367 m s^{−1}. Figure 13b shows that the histogram distribution of *u*^{ASU}_{r} − *u*^{ASU}_{ob} for case 1b still peaks around zero. Figure 13d shows that the correlation diagram exhibits a quasi-linear relationship with an rms deviation of 0.907 m s^{−1}. Using the mean velocity of 8 m s^{−1} as a reference, the relative rms deviations in the cross-beam direction vary from about 11% to 17%. In other words, in spite of the missing information in the cross-beam direction, the current single-lidar 4DVAR is able to retrieve radial velocity fields with an accuracy of 98% in the along-beam direction and 80%–90% in the cross-beam direction.
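The relative deviations quoted in this section follow from dividing the rms deviations by the reference mean speed, as the short check below shows. The numbers are those stated in the text; the variable names are chosen for this example only.

```python
mean_speed = 8.0  # m/s, approximate mean velocity below the capping layer

# Smallest and largest rms deviations (m/s) over cases 1b, 2b, and 3b.
rms_deviation = {
    "along-beam (ARL)": (0.16, 0.17),
    "cross-beam (ASU)": (0.907, 1.367),
}

# Relative rms deviation in percent of the mean speed.
relative_percent = {
    label: tuple(100.0 * value / mean_speed for value in bounds)
    for label, bounds in rms_deviation.items()
}
```

The along-beam values come out near 2% and the cross-beam values between roughly 11% and 17%, which matches the ranges quoted above.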

Newsom et al. (2005) noted in their single-lidar retrieval that the histogram distribution for the difference *u*^{ASU}_{r} − *u*^{ASU}_{ob} is skewed toward positive values and peaks at 0.3 m s^{−1}, suggesting a slight systematic offset in the scanner alignments. Here, the differences *u*^{ARL}_{r} − *u*^{ARL}_{ob} for cases 1b, 2b, and 3b do not exhibit any biased distribution. For the ASU data, however, the distributions *u*^{ASU}_{r} − *u*^{ASU}_{ob} for cases 1b, 2b, and 3b peak at 0.09 (Fig. 13b), 0.169, and 0.318 m s^{−1}, respectively. Because case 3b uses the same volume scans as Newsom et al. (2005), the current result for case 3b is consistent with their observation. Nonetheless, the almost zero skewness in case 1b does not support the notion of a “systematic” offset in the scanner alignments; the biased distribution may have other, unknown causes. Newsom et al. (2005) also found in their single-lidar retrieval that the CORR coefficient and rms deviation between *u*^{ARL}_{r} and *u*^{ARL}_{ob} are 0.99 and 0.24 m s^{−1}, and those between *u*^{ASU}_{r} and *u*^{ASU}_{ob} are 0.94 and 1.18 m s^{−1}. For case 3b, which uses the same volume scans (Table 2), the corresponding values are 0.994 and 0.174 m s^{−1} for the ARL data and 0.927 and 1.367 m s^{−1} for the ASU data. Thus, the current ARL single-lidar retrieval agrees better with the ARL data than theirs does, but agrees less well with the ASU data. A single-lidar case similar to case 3b, but with a stricter convergence criterion Δ ≤ 0.1 m s^{−1}, was also tested. The results show that the agreement with the ARL data becomes better than in case 3b, but the agreement with the ASU data becomes worse. These observations suggest that the ASU and ARL data might not be consistent at microscales, perhaps because of a small asynchronicity between the lidars.
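The location of the histogram peak used above as a bias indicator can be estimated with a simple binning procedure. The following sketch is illustrative only: the bin width of 0.1 m s^{−1}, the function name, and the sample differences are assumptions and do not correspond to the binning used for Fig. 13.

```python
import math
from collections import Counter


def histogram_peak(differences, bin_width=0.1):
    """Return the centre of the most populated bin of a list of differences."""
    if not differences:
        raise ValueError("no differences supplied")
    # Bin index k covers the interval [k * bin_width, (k + 1) * bin_width).
    counts = Counter(math.floor(d / bin_width) for d in differences)
    # Ties are broken in favour of the bin closest to zero.
    best_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -abs(kv[0])))
    return (best_bin + 0.5) * bin_width


# Made-up differences u_r - u_ob (m/s) with a positive offset.
diffs = [0.31, 0.33, 0.35, 0.02, -0.12, 0.38, 0.62]
peak = histogram_peak(diffs)
```

A peak that stays displaced from zero across all cases would point to a systematic offset, whereas a peak that moves from case to case, as found here, would not.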

### d. Accuracy of dual-lidar retrieval

To assess the accuracy of the dual-lidar-retrieved data, we show the histograms and correlation diagrams of the “would be” and observational radial velocities in Fig. 14. The distributions in both diagrams indicate good agreement with the ASU and ARL observational radial velocities. Although the retrieval is driven by data from both lidars, there are twice as many ARL data as ASU data. Consequently, the retrieved data may be biased toward the ARL data, resulting in better agreement with the ARL data than with the ASU data. Table 2 indeed shows that the dual-lidar-retrieved data are in better agreement with the ARL data than with the ASU data.

In comparison with the single-lidar retrieval, the correlation coefficient CORR(*u*^{ARL}_{r}, *u*^{ARL}_{ob}) decreases slightly from 0.995 (case 1b) to 0.993 (case 1c), and RMS(*u*^{ARL}_{r}, *u*^{ARL}_{ob}) increases slightly from 0.161 to 0.2 m s^{−1}. On the other hand, CORR(*u*^{ASU}_{r}, *u*^{ASU}_{ob}) improves from 0.964 (case 1b) to 0.989 (case 1c), and RMS(*u*^{ASU}_{r}, *u*^{ASU}_{ob}) decreases from 0.907 to 0.503 m s^{−1}. The slight changes in agreement with the ARL data for the dual-lidar cases may be attributed to the uncertainties of the ARL and ASU data at small scales caused by observational errors as well as model errors. The inconsistency between the two lidar datasets at small scales was also noted above for the single-lidar retrieval. Using the mean velocity of 8 m s^{−1} as a reference, the maximum relative rms deviation is now reduced from 11.3% (case 1b) to 6.3% (case 1c), from 16.3% (case 2b) to 11.4% (case 2c), and from 17% (case 3b) to 10.1% (case 3c). Therefore, with the cross-beam information available, the current 4DVAR is able to retrieve radial velocity fields with an accuracy of 97%–98% in the along-beam direction of the ARL lidar, and 90%–94% in the along-beam direction of the ASU lidar. It can be concluded that single-lidar 4DVAR yields reasonably good retrieved data in terms of the CORR coefficient and the rms deviation.
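The relative rms deviations quoted above follow directly from dividing the rms values by the 8 m s^{−1} reference velocity. The short sketch below reproduces that arithmetic for the rms values stated in the text; the dictionary layout and function names are illustrative, and reading "accuracy" as 100% minus the relative rms deviation is an interpretation rather than a definition given in the paper.

```python
MEAN_SPEED = 8.0  # m/s, reference mean velocity below the capping layer

# rms deviations (m/s) against the ASU observations, as quoted in the text.
rms_asu = {"1b": 0.907, "1c": 0.503, "3b": 1.367}


def relative_rms_percent(rms, reference=MEAN_SPEED):
    """Relative rms deviation in percent of the reference velocity."""
    return 100.0 * rms / reference


def accuracy_percent(rms, reference=MEAN_SPEED):
    """Assumed reading of 'accuracy': 100% minus the relative rms deviation."""
    return 100.0 - relative_rms_percent(rms, reference)


relative = {case: round(relative_rms_percent(r), 1) for case, r in rms_asu.items()}
```

With these inputs, cases 1b and 1c give 11.3% and 6.3%, matching the reduction reported for the dual-lidar retrieval.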

Figure 15 shows the fluctuating velocity and temperature fields in the vertical cross section of the dual-lidar case 1c at *x* = 4000 m and *t*_{w} = 87 s. The flow structures resemble those of the ARL single-lidar case 1b shown in Fig. 8. Because there are no ASU observational data within *y* = 0–1500 m, the dual-lidar results in that region are essentially the same as the single-lidar results. For the overlap region from *y* = 1500–4500 m, the flow structures retrieved from dual- and single-lidar cases have similar flow patterns, but vary in intensity. Because *w* and *θ* are coupled, the difference in the velocity field also reflects on the temperature distribution shown in Figs. 8b and 15b. The difference between single- and dual-lidar retrievals is quantified in terms of *γ*_{f} and ε_{f} in Table 3. The major observations are 1) *γ*_{u}, *γ*_{υ}, and *γ*_{w} vary from 0.6 to 0.8 and *γ*_{υ} in general is better than the other two because *υ* is approximately parallel to the along-beam direction of the ARL lidar; 2) the rms errors ε_{u}, ε_{υ}, and ε_{w} vary from 0.4 to 0.9 m s^{−1} and ε_{υ} is the smallest; 3) *γ*_{θ} is the lowest among all and ε_{θ} varies from 0.24 to 0.46 K; 4) the agreement between single- and dual-lidar cases is better in the region away from the surface, for example, above 200 m. In summary, the results indicate that the single-lidar 4DVAR is able to recover the 3D fluctuating velocity field with 70%–80% accuracy in the along-beam direction and 60%–70% accuracy in the cross-beam direction if the dual-lidar-retrieved data are assumed perfect, containing no errors. It should be noted that the correlation between fluctuating velocity fields is usually lower than that of total velocity fields because of the absence of mean velocity.

In the dual-lidar retrieval of Newsom et al. (2005), the CORR coefficient and rms deviation are 0.99 and 0.27 m s^{−1} for the ARL data, and 0.98 and 0.63 m s^{−1} for the ASU data. Even after correcting the so-called azimuth offset discussed in section 4c, their CORR and rms values did not improve. As compared with dual-lidar case 3c in Table 2, which uses the same volume scans as theirs, case 3c yields slightly better results only when compared with the ARL data. This suggests again that the ARL and ASU data might not be consistent at small scales. We shall also compare with the ITE in Chai et al. (2004), which used LES to generate the synthetic radial velocity field. From their Table 2, the height-based correlation coefficient *γ*_{f} and the rms error ε_{f} for *u*, *υ*, and *w* are about 0.88 and 0.5 m s^{−1}. By assuming that dual-lidar retrieved data are perfect and free of errors, the best result in Table 3 shows that *γ*_{f} and ε_{f} for cases between cases 1b and 1c are about 0.7–0.8 and 0.43–0.57 m s^{−1}. Given the possibilities that the ARL and ASU data at small scales might be contaminated with observational errors and that the model errors might exist, the current single-lidar retrieval seems to be able to retrieve reasonable results.
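A height-based comparison of two retrieved fields can be sketched as below. The exact definitions of the height-based correlation coefficient and rms error are not restated in this section, so the sketch assumes that fluctuations are deviations from the horizontal mean at each height, that the correlation is the Pearson correlation of those fluctuations over the horizontal plane, and that the rms error is the rms of their difference. The function names and the tiny test fields are hypothetical.

```python
import math


def fluctuation(plane):
    """Subtract the horizontal mean from all values on one height level."""
    mean = sum(plane) / len(plane)
    return [value - mean for value in plane]


def height_based_scores(single, dual):
    """Return a list of (gamma, epsilon) pairs, one per height level.

    single, dual: lists of height levels, each a flat list of values on the
    horizontal plane. The dual-lidar field is treated as the reference.
    """
    scores = []
    for plane_s, plane_d in zip(single, dual):
        fs, fd = fluctuation(plane_s), fluctuation(plane_d)
        num = sum(a * b for a, b in zip(fs, fd))
        den = math.sqrt(sum(a * a for a in fs) * sum(b * b for b in fd))
        gamma = num / den if den > 0.0 else float("nan")
        eps = math.sqrt(sum((a - b) ** 2 for a, b in zip(fs, fd)) / len(fs))
        scores.append((gamma, eps))
    return scores


# Two height levels with four horizontal points each (made-up values).
single_field = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
dual_field = [[2.0, 4.0, 6.0, 8.0], [1.0, 0.0, 1.0, 0.0]]
scores = height_based_scores(single_field, dual_field)
```

In this toy case the first level is perfectly correlated but differs in amplitude, so its correlation is 1 while its rms error is nonzero. This mirrors the situation described above, in which single- and dual-lidar structures share a pattern but vary in intensity.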

### e. Instantaneous flow structures

In this section, the ground building data in the metropolitan area of Oklahoma City are used to interpret the retrieved flow structures above certain ground buildings. We first focus on the flow structure in the vicinity of the downtown airport and parks shown in Fig. 1; the airport is located at (*x*, *y*, *z*) = (2400, 2200, 0) m. Figures 16, 17, and 18 exhibit the time sequence of the fluctuating velocities and temperature in the vertical plane of *y* = 2200 m with a time interval of 174 s. They are taken from the middle of the assimilation time window of cases 1b, 2b, and 3b. In Fig. 16a, a strong updraft is located right above *x* = 2400 m, where the airport is located. The updraft is correlated with a local positive temperature fluctuation, as shown in Fig. 16b, and generates large-scale counterrotating roll structures throughout the CBL height. Figure 17a also indicates the presence of strong updrafts and roll structures. However, two local maxima of positive fluctuating temperature are identified in Fig. 17b above *x* = 2200 and 2700 m. Two smaller vortices are found at *x* = 2600 and 4000 m. Figure 18a shows that about 174 s later, the roll structures become much less evident, with the local maximum temperature shifting to above *x* = 3300 m.

We then inspect the flow structures near the CBD, whose coordinates are (*x*, *y*, *z*) = (4000, 4300, 0) m in Fig. 1. Figures 8, 19, and 20 show the time sequence of the fluctuating velocities and temperature in the vertical plane of *x* = 4000 m. Although the CBD is located near the edges of the sectorlike scanning region of the lidars, there exist vortices and local positive temperature fluctuations near the CBD and sometimes a strong downdraft above the CBD, as in Fig. 20a. Strong updrafts and downdrafts associated with these vortices are also found between *y* = 500 and 1500 m. This region corresponds to the box marked by R in Fig. 1. The satellite map shows no distinct ground buildings comparable to those of the CBD in this region. Nonetheless, Fig. 21 shows that this region is roughly located at the intersection of South Robinson Avenue (which leads northward to the CBD) and Commerce Street, with coordinates (*x*, *y*, *z*) = (4000, 1100, 0) m. Google Earth further reveals that several restaurants are clustered at this intersection. However, the above snapshots of instantaneous flow realizations are insufficient to establish any long-term physical correlation between these flow structures and the geospatial ground features. A technique that can process the retrieved data over a longer time period is needed to understand the ABL dynamics and land–atmosphere correlation.

## 5. Summary

In the JU2003 atmospheric dispersion field experiment held in Oklahoma City, two coherent Doppler lidars were deployed to study the boundary layer transport processes and the dispersion of contaminants in and around cities. In this paper, the dual-lidar data are used to assess the effect of eddy viscosity models on retrieval quality and the accuracy of single- and dual-lidar retrievals. The retrieved flow fields are then compared with the building data to identify flow structures above some notable ground building structures. The goals of the paper are to perform a quantitative assessment of the current 4DVAR and to establish a preliminary connection between the geospatial features of the city and the turbulent coherent structures in the urban boundary layer. They serve as a foundation for the second part of the paper, which will utilize a total of 19 volumes of lidar data for the 4DVAR to study the long-term effect (on a time scale of 30 min) of the coherent structures and their multiscale nature and interplay in an urban environment.

The study confirms that the dynamic eddy viscosity model improves retrieval quality. The fixed model prescribes excessive viscosity in the mid CBL and particularly affects the *w* velocity component because the *w* variance has a local maximum near the mid CBL. Prescribing accurate viscosity a priori is difficult, if not impossible.

Two different measures are adopted to assess the accuracy of the retrieved data. The first measure is based on comparison with lidar observational radial velocity data. Namely, the retrieved velocity field is used to reconstruct the “would be” radial velocity, which is then compared with the ARL and ASU observational radial velocities. The single-lidar 4DVAR is found to retrieve radial velocity fields with an accuracy of 98% in the along-beam direction and 80%–90% in the cross-beam direction when compared with the ARL and ASU radial velocity data. For the dual-lidar 4DVAR, we can assess the accuracy of the retrieved data along two along-beam directions. One is the along-beam direction of the ARL lidar, and the other is that of the ASU lidar. The current 4DVAR is able to retrieve radial velocity fields with an accuracy of 97%–98% in the along-beam direction of the ARL lidar, and 90%–94% in the along-beam direction of the ASU lidar. The better agreement with the ARL lidar data is attributable to the fact that there are about twice as many ARL lidar data as ASU data.

The second measure quantifies the difference in fluctuating quantities between single- and dual-lidar retrievals. Because of the absence of the mean quantity, the correlation coefficient is expected to be lower than that obtained from the first measure. Assuming that the dual-lidar-retrieved data are perfect and free of errors, the results show that the single-lidar 4DVAR recovers the 3D fluctuating velocity field with 70%–80% accuracy in the along-beam direction and 60%–70% accuracy in the cross-beam direction.

The retrieved wind and temperature fields are then used to identify turbulent structures above some ground buildings. Large-scale convective roll structures are found above the downtown airport and parks. The location and strength of these structures vary in time. Strong velocity fluctuations are observed near the CBD, perhaps due to the roughness effect of the tall buildings. Strong updrafts and downdrafts are also observed above a cluster of restaurants at the intersection of South Robinson Avenue and Commerce Street. These observations suggest a physical correlation between flow structures and buildings in the urban boundary layer. Nonetheless, a statistical analysis of a long time sequence of retrieved velocity fields is needed to support this correlation.

In conclusion, the current 4DVAR can be considered as a variation of typical LES for study of the ABL, but is driven by observational lidar data rather than by numerical initial and boundary conditions alone. It is believed that the retrieved 3D data are genuine in the sense that the flow-driven conditions are more realistic, can provide different perspectives on urban ABL structures than those provided by LES data, and may lead to new insight on the multiscale processes in the urban boundary layer.

## Acknowledgments

This work is supported by the National Science Foundation through Grant ATM 0352193. We thank David Ligon of ARL for providing ARL lidar data and Young Yee of ARL for providing radiosonde data.

## REFERENCES

Banta, R. M., L. D. Olivier, P. H. Gudiksen, and R. Lange, 1996: Implications of small-scale flow features to modeling dispersion over complex terrain. *J. Appl. Meteor.*, **35**, 330–342.

Banta, R. M., R. K. Newsom, J. K. Lundquist, Y. L. Pichugina, R. L. Coulter, and L. D. Mahrt, 2002: Nocturnal low-level jet characteristics over Kansas during CASES-99. *Bound.-Layer Meteor.*, **105**, 221–252.

Banta, R. M., L. S. Darby, J. D. Fast, J. O. Pinto, C. D. Whiteman, W. J. Shaw, and B. W. Orr, 2004: Nocturnal low-level jet in a mountain basin complex. Part I: Evolution and implications to other flow features. *J. Appl. Meteor.*, **43**, 1348–1365.

Chai, T., and C-L. Lin, 2003: Optimization of turbulent viscosity and diffusivity in adjoint recovery of atmospheric boundary layer flow structures. *Multiscale Model. Simul.*, **1**, 196–220.

Chai, T., C-L. Lin, and R. Newsom, 2004: Retrieval of microscale flow structures from high-resolution Doppler lidar data using an adjoint model. *J. Atmos. Sci.*, **61**, 1500–1520.

Frehlich, R. G., and M. J. Yadlowsky, 1994: Performance of mean-frequency estimators for Doppler radar and lidar. *J. Atmos. Oceanic Technol.*, **11**, 1217–1230.

Gao, J., M. Xue, K. Brewster, and K. K. Droegemeier, 2004: A three-dimensional variational data analysis method with recursive filter for Doppler radars. *J. Atmos. Oceanic Technol.*, **21**, 457–469.

Hu, M., M. Xue, and K. Brewster, 2006a: 3DVAR and cloud analysis with WSR-88D level-II data for the prediction of the Fort Worth, Texas, tornadic thunderstorms. Part I: Cloud analysis and its impact. *Mon. Wea. Rev.*, **134**, 675–698.

Hu, M., M. Xue, J. Gao, and K. Brewster, 2006b: 3DVAR and cloud analysis with WSR-88D level-II data for the prediction of the Fort Worth, Texas, tornadic thunderstorms. Part II: Impact of radial velocity analysis via 3DVAR. *Mon. Wea. Rev.*, **134**, 699–721.

Jacobson, M. Z., 1999: *Fundamentals of Atmospheric Modeling*. Cambridge University Press, 656 pp.

Kalnay, E., 2003: *Atmospheric Modeling, Data Assimilation, and Predictability*. Cambridge University Press, 341 pp.

Leredde, Y., J-L. Devenon, and I. Dekeyser, 1999: Turbulent viscosity optimized by data assimilation. *Ann. Geophys.*, **17**, 1463–1477.

Lin, C-L., and T. Chai, 2002: On smoothness constraints for four-dimensional data assimilation. *J. Comput. Phys.*, **181**, 430–453.

Lin, C-L., T. Chai, and J. Sun, 2001: Retrieval of flow structures in a convective boundary layer using an adjoint model: Identical twin experiments and coherent structures in the convective planetary boundary layer. *J. Atmos. Sci.*, **58**, 1767–1783.

Lin, C-L., Q. Xia, and R. Calhoun, 2008: Retrieval of urban boundary layer structures from Doppler lidar data. Part II: Proper orthogonal decomposition. *J. Atmos. Sci.*, **65**, 21–42.

Liu, D. C., and J. Nocedal, 1989: On the limited memory BFGS method for large scale optimization. *Math. Program.*, **45**, 503–528.

Newsom, R. K., and R. Banta, 2004: Assimilating coherent Doppler lidar measurements into a model of the atmospheric boundary layer. Part I: Algorithm development and sensitivity to measurement error. *J. Atmos. Oceanic Technol.*, **21**, 1328–1345.

Newsom, R. K., D. Ligon, R. Calhoun, R. Heap, E. Cregan, and M. Princevac, 2005: Retrieval of microscale wind and temperature fields from single- and dual-Doppler lidar data. *J. Appl. Meteor.*, **44**, 1324–1345.

Rye, B. J., and R. M. Hardesty, 1993: Discrete spectral peak estimation in Doppler lidar. I: Incoherent spectral accumulation and the Cramer-Rao bound. *IEEE Trans. Geosci. Remote Sens.*, **31**, 16–27.

Stull, R. B., 1988: *An Introduction to Boundary Layer Meteorology*. Kluwer Academic, 666 pp.

Sun, J., and A. Crook, 1994: Wind and thermodynamic retrieval from single-Doppler measurements of a gust front observed during Phoenix II. *Mon. Wea. Rev.*, **122**, 1075–1091.

Sun, J., and A. Crook, 1997: Dynamical and microphysical retrieval from Doppler radar observations using a cloud model and its adjoint. Part I: Model development and simulated data experiments. *J. Atmos. Sci.*, **54**, 1642–1661.

Sun, J., and A. Crook, 1998: Dynamical and microphysical retrieval from Doppler radar observations using a cloud model and its adjoint. Part II: Retrieval experiments of an observed Florida convective storm. *J. Atmos. Sci.*, **55**, 835–852.

Troen, I. B., and L. Mahrt, 1986: A simple model of the atmospheric boundary layer; sensitivity to surface evaporation. *Bound.-Layer Meteor.*, **37**, 129–148.

Descriptions of cases; NTER: number of iterations.

Correlation coefficients and rms deviations between retrieved and observational radial velocities.

Height-based correlation coefficients *γ*_{f} and rms errors ε_{f} between single- and dual-lidar retrieval below *z* = 1200 m, at *t*_{w} = 87 s.