## Abstract

A prototype high-resolution (1 km, 5 min) multiradar 3D gridded reflectivity product, including a suite of derived 2D vertical column products, has been developed for the Single European Skies Air Traffic Management Research program. As part of this, a new method for mapping radar data to grid points is being used, based on the concept of a binary space partitioning (BSP) tree that treats radar data as a set of points in a 3D point cloud. This allows the resulting analysis to be based on a complete picture of the nearby data from overlapping radars and can easily adapt to irregular grid configurations. This method is used with a Barnes successive corrections technique to retrieve finescale features while avoiding problems of undersmoothing in data-sparse regions. This has been tested using 3D domains enclosing the terminal maneuvering areas surrounding Paris, France, and London, United Kingdom, and using reflectivity plan position indicator scan data from the French and U.K. operational networks, encoded using the standard European Operational Programme for the Exchange of Weather Radar Information (OPERA) Data Information Model format. Quantitative intercomparisons between the new method, in various configurations; a high-resolution version of an existing method, in operational use at Météo-France; and a method that was developed by the National Oceanic and Atmospheric Administration for use with the Weather Surveillance Radar-1998 Doppler radar network, have been done using simulated radar scans derived from 3D synthetic radar reflectivity fields in stratiform and convective regimes.

## 1. Introduction

The Single European Skies Air Traffic Management Research (SESAR) program is a public–private partnership, founded in 2007 by Eurocontrol and the European Union, that aims to provide the next generation of European air traffic management (ATM) systems by the end of this decade. A collaborative project between Météo-France and the Met Office has been set up in order to develop a solution for the SESAR Work Package WP11.2–MET Information Systems. The project is currently in phase III, after successful completion of phase II in 2014.

In aviation, major sources of atmospheric hazards include wake vortices, wind shear, clear-air turbulence, icing, and severe convection. While airport-based radar and lidar can be important for diagnosing low-level hazards—for example, see Merritt et al. (1989)—the focus of this work is to diagnose severe convection, facilitated by weather radar network data provided by national meteorological services (NMSs). These networks can provide coverage over horizontal domains ~1000 km wide and with vertical extents exceeding 12 km or 40 kft MSL.

A radar network, in this context, consists of a number of radars that perform plan position indicator (PPI) scans of overlapping 3D volumes of the atmosphere. A typical scanning pattern is made at a series of elevation angles repeated on a 5-min cycle. The choice of scan pattern is made so that the volume of interest is sufficiently well sampled within the cycle time. Most national radar networks favor the lowest scan elevations because they are generally of the most use for surface quantitative precipitation estimation (QPE).

Operational systems that deliver 3D (and 2D) gridded multiradar reflectivity products, often called mosaics, have been demonstrated by others. For example, the continental United States (CONUS) mosaic (Zhang et al. 2011) is produced on a 3D domain covering the entire contiguous United States at ~1-km horizontal resolution, with 31 vertical levels (variable spacing) and a 2.5-min update cycle. Another example is the Météo-France system, which provides Doppler wind and reflectivity analyses at 2.5-km horizontal resolution, with 24 vertical levels (500-m spacing) and a 15-min update cycle where radar scans at different times are temporally synchronized using motion vectors diagnosed at previous time steps. This has recently been extended to cover nearly all of metropolitan France^{1} (Bousquet and Tabary 2014). For a more detailed review of the current 3D mosaic creation methods, see Lakshmanan and Humphrey (2014).

Although 3D information is valuable for ATM, 2D column-integrated products can easily be derived from the 3D grids. These are also useful indicators of convective storm severity that can more readily be visualized on display screens in operational environments. Commonly used 2D products include the maximum reflectivity in a column (MAXDBZ), the echo-top heights [18 dB*Z* (TOP18)/45 dB*Z* (TOP45)], and the vertically integrated liquid (VIL; Greene and Clark 1972).

Reflectivity and reflectivity-derived products of this type are well established and understood. More recently, others have investigated using dual-polarization radar to provide useful hazard warning products, for example, for hail detection and icing conditions (Williams et al. 2011). Liu et al. (2007) have demonstrated how dual-polarization information can give improved quality weightings when combining data from multiple radars to form a mosaic. Although European networks do not operationally exchange dual-polarization data at present, future operational systems will benefit from being designed to accommodate this additional information.

A number of schemes are in use whereby 3D grids of radar reflectivity can be constructed from radar network data. The method used for the CONUS mosaic was proposed by Zhang et al. (2005); a nearest-neighbor method is used to grid reflectivity values for a single-elevation scan and then a linear interpolation is used to interpolate vertically between scans from the same radar. The 3D mosaic is then constructed by combining the single-radar grids from multiple radars, at collocated target grid points, in a weighted average. This is a distance-weighted mean (DWM), where the weightings are dependent on the distance of each radar from the analysis point. A slightly different approach is used in the Météo-France system, whereby the single-radar grids are produced using a Cressman weighting (Cressman 1959) in 3D before they are combined using the DWM.

The gridding technique known as objective analysis (OA) was developed originally to create analyses of meteorological variables for assimilation into NWP using irregularly spaced observation data, such as those provided by surface observing sites or by radiosonde ascents. The early schemes of Cressman (1959) and Barnes (1964) are probably the best known and have been used extensively for gridding of radar data (e.g., Henja and Michelson 1999). In addition to gridding, these schemes also introduce low-pass filtering, preventing undesirable small-scale noise in the analysis. They have the advantage of being very computationally efficient compared to techniques based on minimum variance estimates, such as statistical interpolation (Gandin 1963) and kriging (Matheron 1963). A notable disadvantage is that OA can suffer from problems related to the clustering of data (Barnes 1994). Despite this, the schemes continue to be used for radar applications today.

A comprehensive study of radar data OA in 2D was carried out by Trapp and Doswell (2000). They compare several commonly used methods for regridding the radar data from polar coordinates, including OA, to other techniques that are sometimes used, such as bilinear interpolation or nearest neighbor. It was demonstrated that although the optimal choice of scheme depends largely on the application, Barnes OA avoids some of the problems associated with other methods. In terms of the Fourier components of the retrieved analysis, these problems include the introduction of phase shifts and wave amplitudes that were not present in the original data.

OA does not take into account the sampling errors inherent in the measurement of the data that are used in the analysis. For radar data at long range, these errors are significant and therefore it is desirable to take account of them when creating a multiradar mosaic. An inverse method to produce 3D mosaics has recently been proposed by Roca-Sancho et al. (2014). This shows promise because the radar sampling (in terms of the spread of the radar beam power) is properly specified in the model. However, the computational cost of such methods is typically high and it is not clear whether this method is ready to be used to create 3D mosaics in an operational environment.

As outlined by Lakshmanan et al. (2006), 3D mosaics can be constructed by either finding the best estimate from each radar individually (at collocated points) and then combining the estimates together [effectively a two-stage approach as is used by Zhang et al. (2011)] or putting data from multiple radars together and treating them equally in a single pass. In this work, the latter option is considered, where multipixel, multiradar data near each analysis point are enumerated using the results of an efficient spatial search algorithm and then multipass Barnes OA has been used to estimate the reflectivity at these points. For relatively dense radar networks, such as those in the United Kingdom and France, this method was developed with the expectation that it can give improved real-time 3D gridded analyses, compared to the existing methods that use a DWM.

In the next section, the coverage of the U.K. and French radar networks, which have been used for a prototype 3D system, is described (section 2). This is followed by a description of a proposed method to efficiently map irregularly spaced data from the radar network to the target grid (section 3). A technical description of the 3D retrieval algorithms (section 4) is then given. Next, a method by which the prototype is evaluated is described and results of this evaluation are presented (section 5). Following that, a few examples of the output of the prototype for a case study are shown (section 6). Finally, the conclusions resulting from the evaluation are reported and future work is discussed (section 7).

## 2. Radar network coverage of Paris and London TMAs

For the purposes of demonstrating the prototype, two 3D domains were chosen that enclose the terminal maneuvering areas (TMAs) of Heathrow Airport (LHR), in Greater London, United Kingdom, and Charles de Gaulle Airport (CDG), northeast of Paris, France. Figure 1 shows the boundaries of the domains and the relative positions of the radars covering these areas. All of the radars used are C band, with a 1.1° beamwidth and range gate spacings of Δ*r* = 1.0 km and Δ*r* = 0.6 km for the French and U.K. radars, respectively. The maximum range of the radars is ≈255 km but, for clarity, the diagram shows the coverage of the radars out to 100 km.

### a. Specification of prototype domains

The domains have been chosen so that they extend 200 km in ground distance in the east, west, north and south directions from the locations of LHR and CDG, giving a horizontal extent of 400 km × 400 km, while the vertical extent was chosen to be 12 km because nearly all commercial aircraft fly below this level and it is extremely rare for convective cells to exceed this height at the latitudes of LHR and CDG. A horizontal resolution of 1 km and a vertical resolution of 500 m are being used, resulting in 3D grid dimensions of 400 × 400 × 24.

### b. Scan strategy and radar coverage

The reflectivity scan elevations (Doppler and polarimetric excluded) that cover the LHR and CDG domains are shown in Table 1. The U.K. reflectivity scans are repeated every 5 min, while some of the French reflectivity scans are operated on a *supercycle*; usually the lowest elevations are repeated on a 5-min cycle but high-elevation or vertically pointing scans may be only every 15 min.

To illustrate the vertical coverage, Fig. 2 shows a vertical cross section of a typical scanning pattern for a U.K. radar, where standard formulas giving the height of the radar beam above the earth’s surface have been used (Doviak and Zrnić 1984). Notably, there is a large cone-shaped gap directly above, often called the *cone of silence*, and also below the lowest scan, due to the curvature of the Earth and the positive elevation angle.

### c. Map transformation to 3D domain coordinates

To give the Cartesian coordinates (*X*, *Y*) on the earth’s surface, the polar stereographic map projection (Snyder 1987) has been used, while the *Z* coordinate is given by the height above the World Geodetic System 1984 (WGS 84) geoid at (*X*, *Y*). To minimize map scale distortions, the latitude of true scale was shifted to the airport, meaning that the Cartesian coordinates (*X*, *Y, Z*) refer to a space that is very close to Euclidean within the domain of interest.

In the following sections, radar data coordinates (centroid of the radar sampling volume) are expressed in Cartesian coordinates. These coordinates are the result of map transformations, achieved using the software package PROJ4^{2} (Evenden 1990), from the sampling volume centroids in the radar domain coordinates to the 3D mosaic domain coordinates , assuming the standard beam height formulas (Doviak and Zrnić 1984). The set of all coordinates is subsequently referred to as the *point cloud*.

## 3. Point cloud retrieval method

A method for mapping radar data to grid pixels is proposed here. It is achieved by performing an iteration over the retrieval grid points *i*, where at each point a spatial search in the point cloud is done to enumerate the nearby radar data points from multiple radars at once. This allows all of the nearby information from the radar network, irrespective of which radar or radar scan each datum is associated with, to be available when the grid point is evaluated and therefore provides complete flexibility to combine the radar data in a way that is most suitable for the application.

### a. Examples of existing search methods

Most existing methods for mapping of radar grids to other target grid types (or vice versa) rely on either assumptions about the spatial arrangement of the grid points or some precalculated lookup table. For example, the Météo-France system iterates over radar data points *p* and updates a running total of the reflectivity at each target grid point that is within range of *p*. It assumes a regularly spaced Cartesian target grid and so the grid point indexes can be calculated efficiently by dividing by the Cartesian grid spacing. Another example is the approach used for the Operational Programme for the Exchange of Weather Radar Information (OPERA) European radar mosaic (Dupuy et al. 2010). In this case, the algorithm is efficient enough to allow a 15-min update cycle because the mappings of radar pixels to Cartesian pixels are precalculated and retrieved from disk at run time, as needed. It relies on the fact that the mappings do not often change from one cycle to the next.

### b. A k-d tree search in the radar point cloud

In general, finding the set of data points within a specified radius of the grid point is a computational problem. Using a brute-force method, which requires the 2-norm distance to be evaluated for all points in the cloud (at each grid point), would be intractable.^{3} The solution used in this work is to arrange the coordinate data in a specialized data structure that facilitates efficient searching. Such data structures are often used in computer graphics and are known as binary space partitioning (BSP) trees. A type of BSP tree, called the *k*-dimensional (k-d) tree, proposed originally by Bentley (1975), works with Cartesian coordinate systems of arbitrary dimensionality to allow fast searching within the point cloud, either by retrieving the *n* nearest points to an arbitrary query point or all points within a specific radius (Euclidean 2-norm distance). A brief description of the k-d tree algorithm is given in appendix A but, for more detail, the authors recommend the text by Moore (1990, chapter 6) as well as numerous other sources online.^{4}

There are two steps to the k-d tree approach, as is shown in Fig. 5. A precalculation step is done first, where the k-d tree is constructed using the point cloud coordinates . Then a k-d tree search is done for each target *i* to retrieve efficiently the indexes of the data near each target grid point location . These indexes can then be used to retrieve all of the measured radar moments associated with those radar points, in particular the reflectivity but also, for example, Doppler radial winds or dual-polarization moments.

The Fast Library for Approximate Nearest Neighbors^{5} (FLANN; Muja and Lowe 2009) was chosen to provide the radius-based k-d tree search implementation due to its high run time efficiency, compared to other alternatives. This software was used, in an unmodified form, with the 3D point cloud data in the coordinate system of the target grid.

### c. Efficiency considerations

A k-d tree search for all data points within a specified search radius of a query point can be accomplished with a run time of ~log*n*_{p}, compared to ~*n*_{p} for a brute-force method. The total time to enumerate the nearby pixels at all *n*_{i} grid points is therefore ~*n*_{i}log*n*_{p}. Precalculation of the k-d tree is necessary, which takes a time of ~*n*_{p}log*n*_{p}. The choice of search radius depends on the desired use of the data, as discussed in section 4c. The search time is affected strongly by this choice.

If the goal is to achieve a many-to-one mapping of radar points to grid points, then the random-access memory (RAM) required for the k-d tree method can be relatively small compared to other approaches that use a lookup table (LUT). This is because it stores the index of each radar point only once for all pixel-to-grid mappings, as described in appendix A, but a LUT may need to assign the same radar pixel to multiple grid pixels (when the search radius is large enough).

### d. Examples of use

This approach could be advantageous for a number of applications and, because it is a general solution to a common problem, these are not limited to aviation. One example is for aircraft trajectory planning, where data could be extracted along a proposed flight path that is not known in advance. Another example is where an adaptive target grid is desired to resolve better the features of interest; in this case the grid spacing will be variable and determined by the present weather. It allows for rapid changes in the scanning strategy of the radar (or even its location). Furthermore, it could be used for gridding multiple types of observation together in a multisensor mosaic where the different observations are arranged in different ways in space, or changing location with time, for example, upper-level winds from radiosonde ascents. Additionally, it provides a convenient way to retrieve observations after they have been subject to an advection correction to temporally synchronize the data, which in general results in an irregular distribution of data points.

## 4. Methods for retrieval on to 3D grids

For the remainder of this article, attention is restricted to the case where the retrieval grids are regularly spaced, as has been used for the prototype system. A multiradar 3D point cloud mosaic (PCM) method, based on the k-d tree, is the main focus of this section, which uses Barnes OA to linearly weight data from multiple radars in a single summation.

Figure 5 shows a program flowchart for the PCM. It can be run in either a single-pass mode or a multipass mode. For the single-pass mode, a Barnes OA was performed by linear weighting grid points from all radar pixels (irrespective of radar) near each grid point, using the method described in section 4a. For the multipass mode, the single-pass mode was produced as before and then updated using the successive corrections method (SCM) by applying increments at each grid point, as described in section 4b.

For all of the methods used in this work, the radar data were encoded in the European standard OPERA Data Information Model file format (ODIM; Michelson et al. 2014), enabling the software to ingest PPI data produced by nearly all of the NMSs in Europe, including the Met Office and Météo-France.

### a. Single-pass Barnes OA

There are numerous texts describing the Barnes and Cressman OA methods—for example, see Trapp and Doswell (2000) for a description in the context of radar data OA—so these are not described in detail here. For single-pass Barnes OA, the analysis *G*_{i} at *i* is given by linear weighting all data in the cloud, represented by the vector **F**, with the Barnes weighting , which is dependent on the *Barnes smoothing parameter * and the 2-norm distances between the grid point coordinate and the datum coordinates . This can be written in a vector notation:

where the superscript (0) indicates the first pass and is the matrix of normalized weightings , subject to for all *i*.

To help with the discussion in the later sections, we recall the result of Barnes (1964)—that the Barnes OA acts as a low-pass filter on the data. In the limit that the data spacing —that is, a continuous field—the frequency response of the Barnes OA on the data can be evaluated; it is the Fourier transform of :

where *k* is the spatial frequency, or wavenumber. That is to say, the wave amplitude at the frequency *k* in the original data will be reduced by a factor when passed through the Barnes filter.

In reality, the discrete nature of the data and the fact that the data spacing for radar network data is somewhat variable means that the frequency response of the Barnes filter will differ from the above. Pauley and Wu (1990) provide a study of the theoretical response of the Barnes weighting for the case where observations are treated as discrete and evenly spaced points. Equation (2) is a good approximation for evenly spaced data unless the Barnes smoothing parameter is chosen to be too narrow for the given data spacing. The effect of having unevenly distributed data points can be somewhat more problematic, especially at data boundaries. Attempts have been made by others to quantify this effect (e.g., see Barnes 1994; Achtemeier 1986), but because of its complexity, it is not considered further here.

### b. Successive corrections method

The details of the SCM are given in this section. In this study, the standard form of SCM has been used but also a modification has been proposed in section 4f. For further details of the SCM, the authors recommend the text by Daley (1991). With SCM, the first guess [Eq. (1)] is updated by a series of correction passes to improve the agreement between the retrieved field and the observations. The estimate of the field , after *n* correction passes, can be written as

where the quantity is called the *increment* and is the *increment weights* for pass *n*. For Barnes SCM, the weights are again a Gaussian function of the distance between the grid points and the data points with an adjustable smoothing parameter that can be chosen to suit the data. The increment is the difference between the field retrieved at the previous iteration , evaluated at by means of an interpolation operation **T**, and the point cloud values . Using the increment weights, a Barnes interpolation is done of the increments at back on to the grid points . Then the result is added to the field from the previous pass. Since the target grid used in this work is regularly spaced, the operator **T** has been achieved using a relatively simple trilinear interpolation.

In the standard SCM, the smoothing width does not vary from one iteration to the next—that is, —but it is common practice to reduce *κ* at successive iterations in order to speed up the convergence to the smaller length scales; that is, , where is the *convergence parameter* for *n*, with . The details of the smoothing parameters used in this work are given in section 5d.

In the original paper by Barnes (1964), was calculated after *N* iterations with fixed *κ*:

Each successive iteration serves to increase the steepness of the response function, allowing for better resolution of the smaller length scales. Later Barnes (1973) argued that the desired frequency response can be achieved after just one correction pass by using a convergence parameter *γ*_{1}, resulting in

However, the need for additional correction passes and the recommended approach for choosing smoothing parameters is still a matter of debate (e.g., see Spencer et al. 2007). In the section 5 it is shown that, for the configurations tested, the additional correction passes can improve the analyses.

### c. Limit on the range of the Barnes weighting

When the distance of an observational datum from the analysis grid point becomes large compared to *κ*, the weighting assigned to it becomes relatively small. It is appropriate to stop considering data beyond a certain cutoff range for reasons of computational efficiency. Barnes (1964) used a cutoff radius , where *E* = 4 is chosen so that nearly all of the weighting (98% in this case) from the data is represented by points within . If there are sufficiently many points within , there will be very little benefit to including data outside this range. For radar data, values for the smoothing parameter are typically κ ~ 1 km^{2}, which gives values of . When searching in the point cloud, using the k-d tree, the value of can be used as the search radius.

In this work, was chosen to be large enough to ensure that at least one radar datum from a given radar is included at maximum range from that radar. However, no allowance has been made for gaps in the radar network, where the cone of silence or the volume underneath the lowest scan is not covered by other radars. It was determined that was sufficient for the LHR and CDG domains to avoid gaps due to the increase in data spacing with range.

### d. Effects of undersmoothing

When is chosen to be small compared to the data spacing (), the effects of undersmoothing become apparent. In this limit, the smoothing function behaves like a nearest-neighbor interpolation algorithm; a “blocky” image can be seen in places where the weighting function assigns nearly all of the weight to the nearest-neighbor pixel.

Even when , the analysis will still be subject to aliasing effects, as outlined by Pauley and Wu (1990) and others, in the form of undesirable high-frequency noise. To suppress this, Koch et al. (1983) suggested using a minimum value of .

### e. Multiradar mosaic

Zhang et al. (2005) use a DWM to create a mosaic from multiple single-radar analyses and favored a weighting function that is a Gaussian function of the distance *r* between the radar and the analysis grid point:

where *K* is a parameter that is tuned to optimize the analysis for the desired application. They found that a value of *K* = 50 km, which produces a steep drop-off in the weight, going from short to long range, was preferred over a weighting function that dropped off more slowly, because it allowed for good resolution of finescale storm features where they were available from radar data at short range. The Météo-France system used a similar scheme but with *K* = 200 km.

### f. Point density weighting

The data spacing for a given radar will increase with range due to an increase in spatial separation between adjacent radar azimuths and adjacent radar scan elevations, meaning that the local density of data points will decrease. If multiple radars are used in a single Barnes OA weighted average, then the radar at longer range will have a lower total weighting relative to a radar that is at a shorter range. Effectively this behaves like a range-dependent radar weighting and is referred to as point density weighting (PDW). It is similar to a DWM but with some important differences. An advantage of PDW is that the combined contributions from all overlapping radars will give an effective point density that is higher than for any individual radar and, therefore, the expectation is that more detail can be retrieved by using a narrower Barnes weighting function, permitted by this increased density. Also, since it is purely a Barnes analysis, the frequency response applied by the retrieval process is well defined and can be adjusted according to requirements.

A limitation of the PDW is that, unless *κ* is large enough that all radars are sufficiently smoothed, the relative weighting of radars that are undersmoothed can fluctuate strongly depending on the location of the analysis point relative to the nearby data points. In Fig. 6 the total Barnes weight from nearby data points has been calculated for a theoretical case, assuming evenly spaced data points , with spacing Δ*n* = 1.0, in one dimension, for a few values of *κ*. As can be seen, the total Barnes weight as a function of the analysis point coordinate *x* oscillates with an amplitude of about 2% of the maximum for , but when *κ* is reduced to , the amplitude is about 50% of the maximum. This is not an issue in itself, when dealing with a Barnes analysis for a single radar, because the total weight is always normalized, that is, divided by . However, when the sum is over data points from all radars, as is the case for the PDW, the oscillations will be retained in the analysis because the unnormalized single-radar weights give the relative weighting of each radar. This can lead to *spokes* when two radars are overlapping and at least one is heavily undersmoothed. An example is shown in Fig. 7a.

For a single-pass method, the effects of undersmoothing in the PCM can be avoided by selecting a Barnes smoothing parameter that is large enough to avoid the oscillations at maximum range; that is, , where Δϕ is the azimuth separation and Δθ is the spacing between successive elevation scans. Obviously this is a compromise; the loss of fine detail at short range, as a result of the increased smoothing, needs to be balanced against the reduction in undersmoothing effects seen at longer ranges.

For the SCM, it is possible to start with a value of that avoids undersmoothing at all ranges and to add the finer detail by means of the correction passes. This produces measureable improvements to the accuracy, as is shown later. However, the undersmoothing will again become a problem in places where has decreased to a sufficiently small value compared to the data spacing.

These artifacts can be visually misleading. To avoid them, either must remain large enough during the correction passes or it is possible to adjust the increment weightings when it is known this will occur. One possibility is simply to set equal to zero the weighting of increments that would be undersmoothed (when the range from the radar exceeds a critical range determined by the smoothing). However, this can lead to difficulties when an increment is set to zero for one grid pixel but is unmodified in an adjacent pixel, leading to a spuriously large increment after back interpolation from the results of the previous pass. Instead, a radar range weighting factor *f* can be applied to the increments arising from each radar datum when and where is a critical smoothing amount determined by the azimuthal data spacing at range *r*. The following form was tested with the four-pass (PCM3) algorithm using values of *κ* that produced heavy undersmoothing:

where *a* and *b* are tunable constants. The effect of applying this factor is shown in Fig. 7. After some experimentation, the values for the constants used were *a* = 0.5 and *b* = 3.0. In regions where the coverage was already good enough, there was no apparent change to the mosaic.

## 5. Verification by stochastic simulation of a radar network

In this section, a method is described that compares 3D gridded retrievals from synthetic radar network observations, which are the result of a simulated measurement of a synthetic reference field, to the synthetic field itself. The differences between the retrieval from the synthetic observations and the original synthetic field are then studied for different retrieval algorithms, thereby allowing them to be compared quantitatively to each other. This approach, often referred to as an observing system simulation experiment (OSSE), has been used in the context of verifying radar grids in 2D (Askelson et al. 2000; Trapp and Doswell 2000) and for radar QPE (Sánchez-Diezma et al. 2001).

### a. Generation of the 3D reference field

The statistical properties of high-resolution gridded reflectivity analyses, derived from real data, were calculated and then these were used to generate synthetic observations with similar features. This allows a relatively economical approach, without reference to other sources of observations or the need to develop and understand a physical model. Arguably there is a reduced amount of physical realism, compared to other approaches—for example, Caumont et al. (2006)—which use NWP data in a forward model, but the synthetic fields generated in this way capture the most important features, that is, the variability over a range of scales.

Randomly generated synthetic 3D gridded reference fields , composed of a range of spatial frequencies that were inferred from high-resolution (500 m × 500 m × 250 m) retrievals on real data, were produced at 500 m × 500 m × 250 m. The increased resolution used in the reference field allows the variability that would be present in real radar data at scales smaller than the retrieval resolution (1 km × 1 km × 500 m) to be retained when simulating the radar measurement (as described in the next subsection).

The method used to generate is described in more detail in appendix B. Examples of the horizontal structure of this synthetic data are shown in Figs. 8 and 9, which are matched to the statistics of retrievals using real PPIs from convective and stratiform cases, respectively.

### b. Simulation of the radar PPIs

The second part of the OSSE requires a means to simulate the radar sampling of the 3D reflectivity field. This was modeled by applying an operation on to simulate the effect of radar beam broadening and is based on the form of the radar equation for the mean received power over the sampling volume [Doviak and Zrnić 1984, Eq. (4.11)]:

with

where *ϕ* and *θ* are the azimuth and elevation angles, respectively; and are the angles measured relative to the beam center azimuth and elevation ; *η* is the reflectivity; and *l* is the one-way attenuation loss factor. With this form of the radar equation in mind, the effect of beam broadening on the reflectivity was approximated by the following integral, as has been used in other simulation work—for example, see Anagnostou and Krajewski (1997)—to give a simulated reflectivity for a range gate at :

This assumes that the factor is constant over the sampling volume, which is reasonable if the range gate length is small compared to the range from the radar. A Gaussian form was used as an approximation for the transverse beam power pattern, following Probert-Jones (1962), specified in terms of the 3-dB beamwidths, and :

### c. Reference implementations of the 3D gridding

To provide a baseline performance measure, reference implementations of the retrieval methods by Zhang et al. (2005, hereafter ZM) and the reflectivity component of the Météo-France multiple-Doppler synthesis and continuity adjustment technique (MUSCAT) method by Bousquet and Chong (1998, hereafter MRM) were developed. For ZM, the vertical interpolation (VI) method, as described in their paper, was used to create single-radar 3D analyses on the LHR domain that were then combined using the DWM, with the recommended width of *K* = 50 km. For MRM, the Cressman weighting, with a fixed horizontal radius of and variable vertical radius matched to the beamwidth, was used to create the single-radar analyses, again on the LHR domain. These were then combined using the default setting of *K* = 200 km.

### d. Intercomparison of retrieval algorithms

Using the methods described above, a quantitative intercomparison between the various retrieval algorithms was made. This included the ZM, MRM, Barnes single-pass (PCM0), the two-pass (PCM1), and the PCM3 algorithms. The parameters of the algorithm configurations are shown in Table 2. The mean error (ME) and root-mean-square error (RMSE) for each configuration were calculated by comparing each 3D grid pixel in the retrieved reflectivity fields, using simulated U.K. radars covering the LHR domain, to the corresponding grid pixels in the reference synthetic reflectivity fields . The sample statistics were calculated from an aggregate of 10 independent realizations of (and the corresponding PPIs) in order to avoid biases that may have occurred due to having synthetic reflectivity features localized to certain parts of the radar network.

#### 1) Stratiform case

For the OSSE based on stratiform data, Figs. 11 and 12 show the ME and RMSE as a function of height level, while Table 3 shows the scores calculated over the whole domain. These results show that the PCM3 algorithm gives a modest improvement in performance over the reference ZM implementation and is substantially better than the single-pass (PCM0), two-pass (PCM1), and MRM algorithms.

#### 2) Convective case

For the OSSE based on convective data, Figs. 13 and 14 show the ME and RMSE as a function of height level, while Table 4 shows the scores calculated over the whole domain. The same picture is seen as in the stratiform case; the results show that the PCM3 algorithm gives a modest improvement in performance over the reference ZM implementation and is substantially better than the PCM0, PCM1, and MRM algorithms.

### e. Comparison of computational costs

Rudimentary tests were carried out to find the precalculation cost to build the k-d tree for the PCM with a typical configuration using the LHR domain ( and ). It was found that both the computation time (~1 s on a 3-GHz CPU) and the RAM storage needed (~100 MB) were not a significant factor.

Some basic tests were performed to determine the total time for the various algorithm configurations. This was carried out on a Linux Red Hat desktop machine with eight central processing unit (CPU) cores running at 2.3 GHz and 6 GB of RAM. The MRM algorithm was not configured to run in parallel, so only one CPU core was used but the PCM algorithm used all eight CPU cores, with each vertical column of grid points constituting a single parallelization unit. The elapsed times and total CPU times required to carry out the retrieval using a simulated PPI network (as described in the next section) using the LHR domain ( and as above) are shown in Table 5. The times for ZM are not shown because the implementation used was not optimized for run time. However, it is believed it would be comparable to that of MRM.

## 6. Case study: Supercell thunderstorm in the Midlands region of England, 28 June 2012

On 28 June 2012, outbreaks of supercell thunderstorms developed over England and Wales that were of a magnitude rarely seen in the United Kingdom.

Supercell formation requires strong convective updrafts, associated with a high convective available potential energy (CAPE) and strong vertical wind shear. These updrafts can persist for a relatively long period and often promote hail growth.

Clark and Webb (2013) give a detailed description of the meteorological evolution of this storm. They reported a swathe of hail extending over 100 km horizontally with hailstone sizes exceeding 75 mm and integrated lightning strike densities exceeding 10 km^{−2} along the storm track. Severe flight disruption was reported at Birmingham International Airport in the West Midlands.

Figures 15 and 16 demonstrate a 3D mosaic retrieval for this event, at 1330 UTC, using the PCM3 algorithm. This corresponds to the smaller 100 km × 100 km domain indicated in Fig. 1, covering the Nottinghamshire region of the English Midlands. Horizontal cross sections, at constant height levels of 2.5, 5.0, 7.5, and 10.0 km are shown in Figs. 15a–d.

Vertical cross sections, corresponding to the above, at constant distance east *X* = 45 km and north *Y* = 50 km, are shown in Figs. 16a and 16b. The anvil shape of the supercell is particularly evident in these figures.

In addition to the 3D mosaic, a suite of 2D column-integrated products were produced (Figs. 17a–d). This includes the MAXDBZ, the VIL (calculated according to Greene and Clark 1972), the height of TOP18 and the height of TOP45. The TOP45 product (Fig. 17d) shows very clearly a central core that, given the reports on that day, is likely to be responsible for some of the large hailstones seen.

## 7. Summary, conclusions, and future work

A k-d tree method has been described that creates a many-to-one mapping of multipixel, multiradar data, within a specified 3D radius around each point in a target analysis grid, that may be irregularly spaced. This has been shown to be efficient enough to allow real-time applications to create 3D multipixel, multiradar mosaics. By using the k-d tree search in this way, a prototype system was demonstrated that uses a Barnes successive corrections method to calculate the analyzed values. This allows smoothed interpolation of the data at the same time as applying a Gaussian low-pass filter with a prescribed width. By starting with a wide enough smoothing width, it was demonstrated how undersmoothing can be avoided at long range, where the separation between adjacent data is greater. It was then shown that the detail provided by the improved resolution at shorter range from the radar can be retrieved by using the additional correction passes.

In section 5, an observing system simulation experiment was described, allowing an intercomparison of the various retrieval methods to be carried out. It was shown, using the mean error (ME) and root-mean-square error (RMSE) statistics of the retrieval from simulated synthetic PPIs versus the reference synthetic field, that, for the cases studied, the PCM3 is more skillful than the ZM method and that, compared to PCM0 and PCM1, the additional correction passes improve the skill substantially.

The results suggest that a multipixel, multiradar analysis can be superior to one that combines data into single-radar analyses and then creates a mosaic from a weighted average of those values. This approach could be adapted to use more sophisticated SCM-based methods—for example, as proposed by Bratseth (1986)—or a later modification by Pedder (1993), which can deal with the problems of overweighting due to the clustering of observations by introducing a correlation matrix of radar observations. Radar errors (and error correlations) could be included explicitly in this type of interpolation method, assuming they can be quantified. This allows more control over the weighting of radar data and could allow additional quality information—for example, from dual-polarization radar, as described by Liu et al. (2007)—to be used in the weighting. This may represent a reasonable intermediate step between the variational or inverse methods, which have a high computational cost, and the most basic OA schemes, which introduce the largest errors in the analyses.

It can be argued that the radar beam broadening acts like a Gaussian low-pass filter on the radar reflectivity field, in the transverse direction to the radar beam axis. This is suggested by the form of Eq. (8), which is approximately a convolution of and the Gaussian beam power pattern . This would imply that the frequency response is also a Gaussian (the Fourier transform of the beam power pattern), with a range-dependent width. The combined effect of the radar sampling and the Barnes smoothing would be to apply two Gaussian filters to the original reflectivity field, giving an overall filter width equal to the sum of the widths of the individual filters. Although this has not been investigated, it could be of interest in future work.

Ways to extend the prototype in order to provide additional products are being considered. An estimate of the probability of hail (PoH) can be easily calculated from the difference in height between the TOP45 and the wet-bulb freezing level (provided by an NWP forecast), as described by Delobbe and Holleman (2006). The k-d tree approach is not limited to the retrieval of reflectivity, and it is a trivial extension of the method to retrieve other radar moments as part of the same k-d tree search. The flexibility afforded by the multipixel, multiradar approach combined with dual-polarization could allow better hazard products to be derived using techniques similar to those described by Williams et al. (2011) and, more generally, facilitating real-time hydrometeor classification (e.g., see Al-Sakka et al. 2013) products in 3D space.

## Acknowledgments

This work has been funded by SESAR (www.sesarju.eu). The authors would also like to thank Malcolm Kitchen, Katie Norman, Caroline Sandford, Pierre Tabary, and Nicolas Gaussiat for their useful feedback on the technical aspects and for assistance with the preparation of this article.

### APPENDIX A

#### Brief Description of the k-d Tree Radius Search Algorithm

The first step in the k-d tree approach is to precalculate the binary tree based on the set of point coordinates in the point cloud. This is done by dividing the space (3D space in this case) into two half-spaces using a *splitting plane* that is perpendicular to one of the coordinate axes and intersects one point in the cloud. This defines the root node of a binary tree, where the points in the left half-space will be on the left side (or branch) of the tree and similarly for the right half-space. To ensure that the tree is *balanced*—that is, an equal amount of points on each side—a median-finding algorithm is typically used to find the center point along the selected axis. Each half-space is then divided again, using a splitting plane perpendicular to the previous, which further subdivides the space; this gives the second tier of the tree. This is repeated recursively until there is only one point in the half-space, that is, the *leaf node* of the tree. For points in the cloud, the depth of the tree will be approximately . At each node, the following information is stored: the index of the point, the *splitting dimension* and the indexes of the left and right *child nodes*. Once the tree has been constructed, a spatial search can be achieved for the chosen query point by starting at the root node and recursing down the tree, by choosing the appropriate half-space at each tier, until a leaf node is reached. Then Euclidean distances from the query point of the points corresponding to nodes near to the leaf are then checked and it is thereby determined which points lie within the search radius.

### APPENDIX B

#### Method for Generating 3D synthetic reflectivity

In the following, the concept of a reflectivity indicator *I* has been used. It can take one of two values depending on whether the radar detected a signal above a noise threshold (typically two standard deviations above the receiver noise mean):

The PCM3 algorithm was used on U.K. radar network data over the LHR domain to create a reference field of 3D gridded values for the reflectivity and the reflectivity indicator using the same grid dimensions and resolution as required for . Randomly generated synthetic fields were then produced with the aim to match the following quantities, calculated from and :

The 2D power spectra and at a representative height level

The means and standard deviations , and , at each height level

*k*The Pearson correlation matrices and formed from all pairs of height levels

*k*_{1}and*k*

The discrete Fourier transforms and were calculated using the fast Fourier transform (FFT) at a constant height level of 4 km (where the observation data density is greatest and therefore assumed to be the most representative of all height levels). The values , , , and are simply the sample means and standard deviation of *Z* and *I*, respectively, averaged over all grid points at each height level *k*. The Pearson correlation matrices and are formed from the Pearson correlation coefficients between each pair of height levels, where each matrix element is calculated by sampling over the whole horizontal extent of the domain.

Random noise was generated in 2D using a scale-decomposition-based synthetic noise generator that is part of the Short-Term Ensemble Prediction System (STEPS), developed at the Met Office and the Bureau of Meteorology and is described in detail by Seed et al. (2013). This allows the creation of gridded 2D normally distributed noise fields that match a given 2D power spectrum. Using this approach, 2D noise fields were produced over the horizontal domain, with a mean and standard deviation *N*(0, 1) that match the power spectra of the real data and . By matching the power spectra, it is known that the horizontal features in will have some similarity to those in .

Because of the variation in the correlations between vertical levels with height, a different approach was necessary to get a matching vertical correlation structure in . This was achieved using a principal component analysis (PCA) on the Pearson correlation matrices.

For the cases studied, it was found that eight principal components (PCs) are enough to represent more than 99% of the vertical variability because neighboring height levels are highly correlated. By using eight horizontal synthetic noise fields (uncorrelated with each other) a 3D field of *N*(0,1) noise is represented in the principal component system. The full 48-level simulation can then be constructed by inverting the principal component transformation, that is, to transform the 8 uncorrelated height levels into 48 correlated height levels. It can be shown that this transformation preserves the horizontal power spectrum.

Given this 3D field of correlated *N*(0, 1) noise, was then constructed by scaling the mean and standard deviation at each grid point according to the values calculated at each height level *k* from the real data , .

By repeating the process mentioned above, a second independent realization of the 3D *N*(0, 1) noise can be constructed using the properties of . By thresholding these *N*(0,1) values to give the correct rain fraction at each height level, a synthetic indicator field was then generated.

The indicator was then applied to the reflectivity field, thereby flagging as “dry” pixels those that were above the threshold. Because the reflectivity and reflectivity indicator were modeled as independent processes, it was deemed appropriate to adjust the local mean *μ* near to the edges of the nonzero reflectivity regions in the image, that is, where there is a transition from dry to wet, to avoid unnaturally sharp transitions. This was done according to the *dry drift* adjustment method described by Schleiss et al. (2014).

## REFERENCES

## Footnotes

*Publisher’s Note:* This article was revised on 17 March 2016 to correct an editorial error in the first paragraph.

^{1}

The coverage excludes Corsica.

^{2}

Available online at http://trac.osgeo.org/proj/.

^{3}

For one radar on the LHR domain, this is in excess of 10^{12} evaluations.

^{4}

For example, see https://en.wikipedia.org/wiki/K-d_tree.

^{5}

Available at http://www.cs.ubc.ca/research/flann/.

^{6}

This was provided by the software package Cuba, available at http://www.feynarts.de/cuba/.