## 1. Introduction

The utility of clustering as a component in any ensemble diagnostic toolkit for operational forecasting is now well recognized and documented in textbooks (e.g., Inness and Dorling 2013, 135–137). More generally, the application of clustering methods to meteorological and climatological datasets predates the ensemble era (Wilks 2006, p. 549). Ensemble clustering has been applied diagnostically to assess sources of variability among ensemble members for forecasts on seasonal (e.g., Straus and Molteni 2004) to short-range (0–72 h; e.g., Alhamed et al. 2002; Branković et al. 2008) projection times. In other clustering work, the emphasis has been on forecast applications for ensemble clustering, covering seasonal (e.g., Nakaegawa and Kanamitsu 2006), extended-range (beyond 10 days; e.g., Palmer et al. 1990), medium-range (days 4–10; e.g., Ferranti and Corti 2011), and short-range (e.g., Yussouf et al. 2004; Johnson et al. 2011) forecast projection times. Clustering has also been utilized in forecast verification (e.g., Marzban et al. 2008). This article focuses primarily on the forecast application and some performance aspects of a divisive clustering method applied over the medium-range time frame.

In the mid-1990s, the National Centers for Environmental Prediction (NCEP) began generating clustering output for the Global Ensemble Forecast System (GEFS) using the method described by Tracton and Kalnay (1993). On 27 September 1999, a fire in the NCEP computing facility destroyed disk drives attached to the Cray 90 computing system resulting in the loss of the software used to do the clustering. Since that incident, cluster output for the NCEP GEFS has not been available. Recently, clustering utilizing the methods described by Alhamed et al. (2002) was implemented operationally for the NCEP Short-Range Ensemble Forecast (SREF) based on 500-hPa heights. Implementation of operational clustering for the longer-range global ensembles available to NCEP is anticipated at an indefinite future date.

During the time when the NCEP GEFS clusters were available, medium-range (3–5 days into the future at that time, now extended to 7 days) forecasters in the NCEP Hydrometeorological Prediction Center [now the NCEP Weather Prediction Center (WPC)] found clusters to be useful. With the return of clustering in the offing, WPC sought to reacquaint medium-range forecasters with the operational use of clusters and investigate the broader use of clusters in the contemporary forecast process [manually edited blends of model guidance as described in appendix A with reference to Novak et al. (2014a)]. Furthermore, WPC forecasters wanted the clustering applied to the combined NCEP GEFS and European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble. To accomplish these goals, a simple, fast, and reasonably effective clustering algorithm was needed. Existing clustering algorithms described in textbooks (e.g., Wilks 2006) and the literature (e.g., Alhamed et al. 2002) are computationally demanding and fairly complex in terms of the coding required. Therefore, the development of a simple algorithm utilizing existing software ensued.

The pursuit of a clustering capability must be accompanied by recognition of the fact that, within the context of operational weather forecasting, clustering of an ensemble has not only strengths but weaknesses. Clusters are ambiguous. First, the problem of clustering an ensemble does not have a unique solution. The choice and configuration of clustering algorithms, the choice of parameters upon which to base clustering, and even postprocessing of ensemble output prior to clustering can influence the clustering outcome leading to different results. Second, ambiguity arises because of forecast uncertainty. Clusters can depict the uncertainty in a more focused way than bulk ensemble statistics (e.g., spread, median, and mean) but cannot resolve the uncertainty. The ambiguity associated with the clustering problem has been identified in previous work (e.g., Nakaegawa and Kanamitsu 2006; Branković et al. 2008). Another challenge is in interpreting the meaning of clusters within the context of the full ensemble taking into account the properties of the clustering algorithm; for example, do clusters define modes of the ensemble distribution? The problem of interpreting clusters from an ensemble has motivated the development of alternatives to clustering (Atger 1999). Only with a full understanding of their limitations can clusters be helpful to forecasters.

The goals of this article are to describe the divisive clustering algorithm (DCA) and to present evidence supporting its use in operational applications pending the availability of some future operationally supported clustering capability. In the following, section 2 describes the clustering algorithm and discusses its characteristics. Statistical verification of cluster forecasts is discussed in section 3. Section 4 presents a particular case demonstrating the possible utility of clusters for the WPC medium-range forecast process. Section 5 provides a summary and concluding discussion.

## 2. The divisive clustering algorithm

The idea for the divisive clustering algorithm was inspired by an existing WPC east–west phase error verification approach based on the discrete Fourier transformation (harmonic analysis). A repurposing of existing Fortran code and a bit of UNIX shell scripting resulted in quick implementation of a clustering algorithm. In technical terms, the DCA is a univariate divisive clustering method. The approach is “divisive” because it begins with a single cluster containing all of the ensemble members. Ironically, divisive clustering methods are rarely employed because they are usually computationally intensive and slow (Wilks 2006, p. 559). It is not uncommon to use empirical orthogonal functions (EOFs) to reduce the number of dimensions associated with the clustering problem prior to applying the actual clustering itself (e.g., Ferranti and Corti 2011). In the approach described below, analytic orthogonal functions, the cosines of the one-dimensional discrete Fourier transformation, are used.

To avoid confusion, it may be helpful to state what the DCA is not. The DCA does not involve the spectral decomposition of a similarity matrix. It is a nonmatrix method. It is not Ward’s method (Wilks 2006, p. 552), which is an agglomerative nonmatrix method. It is not the *K*-means method (Wilks 2006, p. 559), which is a nonhierarchical method. Given its simplicity, the DCA may be described as a “poor man’s” clustering algorithm.

The DCA is applied over a region formed by a truncated zonal band (TZB) of points on a latitude–longitude array (grid). Deviations from the ensemble mean of the 500-hPa isobaric height field defined discretely at these grid points provide the pattern for cluster matching after meridional averaging is performed to reduce the data to one spatial dimension in the zonal direction. The locations and areas covered by the cold and warm season TZBs are shown in Fig. 1. The summer (May–September) TZB is 5° of latitude farther north than the winter (October–April) TZB to account for the northward shift of the polar jet stream during the warm season. The longitudinal extent of the TZB used is 124°; the latitudinal extent is 20°. The mean length of the warm season TZB is 9750 km, whereas the cold season TZB is 10 562 km in mean length.
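As a rough arithmetic check on the quoted TZB lengths, the sketch below computes the length of a 124° zonal arc on a spherical Earth. The central latitudes of 45°N (warm season) and 40°N (cold season) and the Earth radius of 6371 km are assumptions made here for illustration; they are not stated in the text, although they differ by the stated 5° and reproduce the quoted lengths to within about a kilometer.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def zonal_arc_length_km(latitude_deg, longitude_extent_deg=124.0):
    """Length of a zonal arc with the given longitudinal extent along one latitude circle."""
    return (
        EARTH_RADIUS_KM
        * math.cos(math.radians(latitude_deg))
        * math.radians(longitude_extent_deg)
    )


# Hypothetical central latitudes; the text only says the bands differ by 5 degrees.
warm_season_km = zonal_arc_length_km(45.0)
cold_season_km = zonal_arc_length_km(40.0)
print(round(warm_season_km), round(cold_season_km))
```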

### a. The clustering procedure

Figure 2 shows the first four harmonics of the cosine series^{1} along the TZB from an actual ensemble member (details of the particular case are not important for this discussion). Across ensemble members, phase shifts and amplitudes will vary and serve as the basis for clustering as described in detail below. The functions shown in Fig. 2 are the Fourier transformation cosines and have the following mathematical form:

$$
y_k(x) = C_k \cos\left(\frac{2\pi x}{\lambda_k} - \phi_k\right), \qquad (1)
$$

where $C_k$ is the amplitude for wavenumber *k*, $\lambda_k$ is the wavelength for wavenumber *k* (equal to the length of the domain for *k* = 1, half the length of the domain for *k* = 2, and so on), $\phi_k$ is the phase shift angle for wavenumber *k*, and *x* is the distance along the TZB measured from its western edge in the same units as wavelength. Wilks (2006, 375–388) gives a full treatment of harmonic analysis and the discrete Fourier transformation.

Fig. 2. Example depiction of the first four harmonics of the cosine series of the discrete Fourier transformation applied to a zonal series of meridionally averaged 500-hPa deviations from the ensemble mean for one ensemble member. The first harmonic is shown as a thick brown curve, the second harmonic is the blue curve, the third harmonic is the green curve, and the fourth harmonic is the red curve.

Citation: Weather and Forecasting 30, 4; 10.1175/WAF-D-14-00137.1

The most concise way to describe the clustering algorithm is in the following outline form:

1) For each member of the ensemble, the 500-hPa height field (m) is used as follows:

   (a) Subtract the ensemble mean at every grid point (1° × 1° resolution) in the TZB shown in Fig. 1.

   (b) Average along the meridional direction to reduce the TZB array of deviations from the ensemble mean to a single strand of values along the west-to-east (zonal) direction.

   (c) Perform the discrete Fourier transformation for the spatial series of values created in step (b) above.

   (d) Compute the amplitudes and phase shifts for the first four harmonics of the cosine series obtained from the Fourier transformation.

   (e) Identify the wavenumber having the largest amplitude among the first four harmonics and record the wavenumber, its phase shift angle, and the member name in a selection list. (Each ensemble member is represented by exactly one entry in the selection list.) For example, the ensemble member depicted in Fig. 2 has the largest amplitude for wavenumber 2.

2) Separate the selection list according to wavenumber into as many as four groups for wavenumbers 1–4. These groups constitute the first tier of clusters in the divisive procedure. For example, the ensemble member depicted in Fig. 2 would be placed in the group for wavenumber 2.

3) Perform cluster selection. Within each wavenumber group formed in step 2), the varying phase shift angles of the cosine function become the basis for this final clustering step. Members having nearly the same phase angle will have nearly overlapping cosine functions and be placed in the same cluster. If the phase angle difference between two members is greater than one-fifth wavelength (72°), then this algorithm does not place those two together in the same cluster. Select final clusters by applying the following steps to each of the four wavenumber groups:

   (a) Sort the phase angles (converted to degrees) within each wavenumber group from lowest to highest values. These are the wavenumber order statistics.

   (b) Append members having phase angles within 72° of 360° to the lower end of the order statistics after subtracting 360°.

   (c) Append members having phase angles within 72° of 0° to the upper end of the order statistics after adding 360°.

   (d) Traverse the phase angle order statistics from lowest to highest value searching for clusters of phase angles falling within a 72° range. Members having phase angles within a 72° window of the phase angle order statistics are assigned to a cluster. Any member included in a cluster at the lower (negative) end of the phase angle order statistics is excluded from consideration at the upper end of the order statistics.

   (e) Perform a look-ahead search beginning with each member of an existing cluster to check for either a larger cluster or an equally sized more compact cluster than the one found in the previous step. [Some members found in step (d) above may be dropped in this step.]
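The per-member harmonic analysis and the first-tier grouping by dominant wavenumber can be sketched as follows. This is a minimal illustration in Python rather than the operational Fortran and shell code; the function names, the list-of-lists field layout, and the phase sign convention (a cosine of the form amplitude × cos(angle − phase)) are assumptions made for the sketch. The zonal series is treated as exactly one period of the fundamental harmonic.

```python
import math


def zonal_deviation_series(member_field, mean_field):
    """Meridional average of (member - ensemble mean); fields are [lat][lon] lists."""
    n_lat, n_lon = len(member_field), len(member_field[0])
    return [
        sum(member_field[i][j] - mean_field[i][j] for i in range(n_lat)) / n_lat
        for j in range(n_lon)
    ]


def harmonic(series, k):
    """Amplitude and phase shift (degrees in [0, 360)) of harmonic k of the series."""
    n = len(series)
    a = 2.0 / n * sum(v * math.cos(2.0 * math.pi * k * j / n) for j, v in enumerate(series))
    b = 2.0 / n * sum(v * math.sin(2.0 * math.pi * k * j / n) for j, v in enumerate(series))
    return math.hypot(a, b), math.degrees(math.atan2(b, a)) % 360.0


def dominant_harmonic(series, max_k=4):
    """Wavenumber with the largest amplitude among harmonics 1..max_k."""
    best_k = max(range(1, max_k + 1), key=lambda k: harmonic(series, k)[0])
    amplitude, phase = harmonic(series, best_k)
    return best_k, amplitude, phase


def first_tier_groups(series_by_member):
    """Group members by dominant wavenumber: {wavenumber: {member: phase angle}}."""
    groups = {}
    for name, series in series_by_member.items():
        k, _, phase = dominant_harmonic(series)
        groups.setdefault(k, {})[name] = phase
    return groups
```

Each member appears in exactly one group, mirroring the single entry per member in the selection list.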

In the algorithm outlined above, any given member is assigned to only one cluster. A cluster must have at least four members. Reaching the end of the phase angle order statistics is the stopping point for the algorithm. If there are no four-member clusters, the order statistics are traversed again to find the first three members having phase angles within a 60° range. It is possible that no clusters will be found if the ensemble has few members. Not all ensemble members are necessarily assigned to a cluster, especially if the ensemble has many members. The number of clusters found is determined by the algorithm and varies with forecast projection time.
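The cluster selection within one wavenumber group can be illustrated with the simplified sketch below. It is not the operational implementation: it uses a greedy traversal of the wrapped phase angle order statistics and omits the look-ahead refinement, and the function names and dictionary input are hypothetical. The text does not say whether the three-member fallback keeps one cluster or several; this sketch keeps every one it finds.

```python
def phase_clusters(phases, window=72.0, min_size=4):
    """Greedy clusters of members whose phase angles fall within `window` degrees.

    phases: dict mapping member name -> phase angle in degrees, in [0, 360).
    """
    # Order statistics with wrap-around copies at the lower and upper ends.
    entries = [(angle, name) for name, angle in phases.items()]
    entries += [(a - 360.0, n) for n, a in phases.items() if a >= 360.0 - window]
    entries += [(a + 360.0, n) for n, a in phases.items() if a <= window]
    entries.sort()

    assigned, clusters = set(), []
    for i, (start_angle, start_name) in enumerate(entries):
        if start_name in assigned:
            continue  # each member belongs to at most one cluster
        group = []
        for angle, name in entries[i:]:
            if angle - start_angle > window:
                break
            if name not in assigned:
                group.append(name)
        if len(group) >= min_size:
            clusters.append(group)
            assigned.update(group)
    return clusters


def select_clusters(phases):
    """Four-member, 72-degree search, with the three-member, 60-degree fallback."""
    clusters = phase_clusters(phases, window=72.0, min_size=4)
    if not clusters:
        clusters = phase_clusters(phases, window=60.0, min_size=3)
    return clusters
```

For example, members with phase angles of 350°, 5°, 20°, and 40° form a single cluster because the copy of the 350° member at −10° brings all four within one 72° window.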

The phase angle range criterion (72° or 60°) for this clustering method is arbitrary. If it is set lower, fewer large clusters are found. If it is set higher, clusters are both larger in size (number of members) and fewer in number and have more spread within the clusters. Figure 3 is an illustration of a case for which five ensemble members have a maximum amplitude for wavenumber 1. Four of the five have phase angles falling within a 72° range. The fifth member (black, dashed) has a phase angle outside of the cluster range as it is noticeably shifted to the left of the clustered members. The number and range of harmonics used is less arbitrary. Using the first four harmonics addresses the larger-scale patterns of 500-hPa heights on the TZB that are most predictable in the medium-range time frame. Extending the range of harmonics could affect both the size and number of clusters found. Moving the range of harmonics away from the fundamental (first) harmonic addresses features of smaller scale but could yield clusters having larger intracluster spread because the same cluster can include members having different large-scale patterns, although the nonlinearity of atmospheric dynamics may mitigate this effect as discussed below.

Fig. 3. Schematic depiction of a possible four-member cluster based on the first cosine function of the discrete Fourier transformation (wavenumber 1) on the TZB. The solid colored curves depict the cosine functions for the clustered members, having phase angles within 72° of one another. The black dashed curve depicts the cosine function for another member also having the largest amplitude for the first harmonic, but with a phase angle placing it more than 72° from the four clustered members. Angular distance is in degrees. See text for details.

### b. Clustering statistics

The cluster algorithm is applied to the 70-member^{2} combined ECMWF ensemble (50 members) and the NCEP GEFS (20 members) beginning 66 h from the initial time and continuing at 6-h intervals through 204 h. The overall character of applying the DCA to the combined 70-member ensemble is assessed statistically in Table 1. As forecast hour increases, the ensemble member solutions diverge resulting in a larger percentage of mixed clusters having at least one member drawn from the ECMWF ensemble and at least one member drawn from the NCEP GEFS. Even early in the forecast projection time the two ensemble systems seem to “play” well together, with nearly two-thirds of the clusters being mixed. For reference, assuming random cluster member selection (equally likely selection of members), the hypergeometric probability (appendix B) of a proportionately populated seven-member cluster (two members from the NCEP GEFS and five from the ECMWF ensemble) is approximately 0.34. Actual cluster member selection is not random, clusters are generally not proportionately populated, and the cluster sizes vary as discussed below. The relatively high percentage of mixed clusters found for the DCA is a better result than is typically obtained for multimodel short-range ensembles for which the members tend to cluster by component models and data assimilation systems (e.g., Johnson et al. 2011; Yussouf et al. 2004).
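The quoted hypergeometric probability can be checked with a few lines of Python. This is only an arithmetic sketch of the random-selection reference case described above; the function name and argument names are illustrative.

```python
from math import comb


def hypergeometric_probability(n_gefs_drawn, n_ecmwf_drawn, n_gefs=20, n_ecmwf=50):
    """Probability of drawing the given mix of members without replacement."""
    cluster_size = n_gefs_drawn + n_ecmwf_drawn
    return (
        comb(n_gefs, n_gefs_drawn)
        * comb(n_ecmwf, n_ecmwf_drawn)
        / comb(n_gefs + n_ecmwf, cluster_size)
    )


# Seven-member cluster with two GEFS members and five ECMWF members.
probability = hypergeometric_probability(2, 5)
print(round(probability, 2))
```

The result rounds to the value of approximately 0.34 given in the text.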

Table 1. Statistical summary of cluster composition and enumeration, including the average for the number of members not selected to be in any cluster. These are accumulated statistics for the 8-month period from 1 Oct 2013 through 31 May 2014. Each range of forecast hours shown in the header row for columns 3–6 covers five projection times at 6-h intervals. The range in the last column covers only four projection times.

The number of clusters found at any given forecast projection varies, ranging from as few as 3 or 4 to as many as 12 or 13, the average being 7 or 8, without much change with increasing forecast projection time (Table 1). The number of clusters depends on the tendency of the phase angle distribution within each wavenumber group to exhibit clumps of similar values falling within intervals having a range not exceeding 72°. As mentioned previously, the number of clusters could be decreased by increasing the phase angle range encompassing a cluster. Both Ferranti and Corti (2011) and Branković et al. (2008) limit the number of clusters for the 51-member ECMWF ensemble to six and three, respectively. Given the larger number of members for the combined ensemble used here, seven or eight clusters on average seems reasonable.

The average number of members not selected to be in any cluster has a very weak downward trend with increasing forecast projection time as shown in the last row of Table 1. Again, the number not selected could be decreased by increasing the phase angle range. The omitted members are not necessarily outliers with respect to either the ensemble mean or any cluster mean, nor do these members together constitute a separate cluster.

The distribution of the number of clusters found as a function of cluster size and forecast projection hour is shown in Fig. 4. A large majority of the clusters are in the 5–10-member size range. The distribution shape is roughly the same for all five ranges of forecast hours shown in Fig. 4. Four-member clusters are fairly common, whereas large clusters having 14 or more members are comparatively rare. The average cluster size, seven members, is the same for each of the five forecast time ranges. During the analysis period, the DCA never resorted to the three-member alternative allowed when no clusters having four or more members exist.

Fig. 4. Histogram chart showing the distribution of the number of clusters found (indicated on the ordinate) as a function of cluster size (number of members in the cluster; indicated along the abscissa) and ranges of forecast projection time (different colored bars). The counts are cumulative from 1 Oct 2013 through 31 May 2014. The histogram color key above the bar graph gives the forecast projection hour ranges for forecasts at 6-h intervals.

The DCA was applied to a few cases in the short range to test the robustness of the algorithm when the differences between members and the ensemble mean are quite small. In the short range, the algorithm tends to behave much as it does in the medium range, producing 6–10 clusters of varying sizes. The cluster means at short lead times are not very different from the ensemble mean. The DCA is sensitive to the pattern of differences between individual members and the ensemble mean. The pattern discrimination is first based on the relative amplitudes of the first four harmonics of the cosine series representing the meridionally averaged differences from the ensemble mean for a given member. These amplitudes are small for the short lead times and become larger with increasing lead time.

### c. Intracluster agreement

In an ideal scenario, the clustering algorithm results would be compared to clusters selected objectively and independently by human inspection (e.g., Alhamed et al. 2002). However, in this endeavor the size of the combined ensemble (70 members) and a paucity of human resources rendered impractical extensive subjective testing of the clustering. Instead, individual members of clusters were occasionally examined to ascertain the plausibility of the clustering result for the 500-hPa height field pattern. All cases appeared reasonable, with an acceptably low degree of spread among the members within each cluster examined, but this result is based on subjective human judgment of a limited number of specific cases.

One case of a nine-member cluster is shown by way of Figs. 5 and 6, which depict one of eight clusters found for the 7-day (168 h) forecast from 0000 UTC 26 June valid at 0000 UTC 3 July 2014. In this case, the amplitude of the 500-hPa pattern along the U.S.–Canadian border is reminiscent of the cold season. Figure 5 is a display of the 500-hPa height field for each member of the cluster. Two members labeled numerically (1–50) are from the ECMWF ensemble; the other seven members labeled alphabetically (A–T) come from the NCEP GEFS. Figure 6 shows the nine-member cluster mean 500-hPa height and color-shaded deviations from the 70-member ensemble mean 500-hPa height. The nine-member cluster mean pattern of 500-hPa height contours in Fig. 6 is reflected to varying degrees in the individual members (Fig. 5): a pronounced trough along the Pacific Northwest coast of the United States and the coast of British Columbia, a negatively tilted ridge in west-central Canada, and a trough in eastern Canada extending more or less southward into the Great Lakes area and the Ohio valley.

Fig. 5. Depiction of 500-hPa height (dm; green contours) fields for a nine-member cluster at forecast hour 168 valid at 0000 UTC 3 Jul 2014. The initial time and date are 0000 UTC 26 Jun 2014. The individual member alphabetic or numeric descriptor used in Table 2 is given at the end of the title written below. See text for details.

Fig. 6. Depiction of 500-hPa height (dm; green contours) cluster mean for the nine-member cluster shown in Fig. 5. The 500-hPa height deviations (m) from the combined ensemble mean are shaded according to the color bar on the left. The member list above the map is given in terms of the alphabetic and numeric descriptors used in Table 2.

The members of this particular cluster have maximum amplitudes for wavenumber 2 on the warm season TZB. Table 2 gives the amplitudes and phase shift angles associated with the second harmonic [*k* = 2 in (1)] for the members of this nine-member cluster. Considering a single cosine function, intracluster diversity arises from both the differences in phase shift and the differences in the amplitude. The phase shift range is constrained by the clustering criterion. The amplitude range is not constrained once it is established that the magnitudes for the other three wavenumbers are less for each member. Table 2 shows that the range of phase shift angles is 65°, and the range of amplitudes is 34.8 m.

Table 2. Amplitudes of deviation from the ensemble mean and phase shift angles for a nine-member cluster for which wavenumber 2 is dominant. The order from top to bottom reflects the order of discovery in the phase angle order statistics. Members denoted by alphabetic characters (A–T) are from the NCEP GEFS; members denoted numerically (1–50) are from the ECMWF ensemble.

The other sources of intracluster diversity are amplitudes and phase shifts associated with the other cosine functions having lesser amplitudes or wavenumbers higher than four (not to mention the meridional averaging). If the atmosphere were a linear system, clustering based on a single cosine function would likely fail in almost all cases. Fortunately, the atmosphere behaves nonlinearly, with an appreciable exchange of energy among scales as described by Dutton (1986, p. 540). Although it is beyond the scope of this article to investigate, it is reasonable to hypothesize that ensemble members exhibiting a similar pattern for one cosine function also have similar patterns for others as a result of coupling associated with the nonlinear interactions characterizing atmospheric dynamics.

### d. Visual impressions of intercluster contrasts

A desirable feature of any clustering method is the selection of clusters that are distinctly different from one another. The DCA often produces clusters appearing to have similar 500-hPa height fields upon casual inspection, but the pattern of deviations from the ensemble mean like that shown in Fig. 6 reveals noticeable differences among the clusters. Such differences often have ramifications for forecasts of frontal positions, sea level pressure, and precipitation. The degree of difference among clusters depends on the spread of the combined ensemble. After applying the one-dimensional Fourier transformation and computing the parameters of the cosine series, the clustering algorithm makes its first divisive separation of ensemble members into groups based on the relative amplitudes of the first four Fourier cosine functions representing deviations from the ensemble mean. If the combined ensemble spread is small, these deviations will be small and the differences among clusters will be smaller than in the opposite case of large spread and large deviations. Daily mean sea level pressure (MSLP) and temperature verification data provide a more systematic demonstration of cluster contrast and will be discussed in the next section.

### e. Operational use

The DCA is applied independently for each forecast projection time at 6-h intervals beginning at 66 h and ending at 204 h. A crude internal Internet display for each individual forecast hour is provided to WPC forecasters. A limited set of fields is displayed: 500-hPa height with shaded deviation from ensemble mean, MSLP with shaded deviation from ensemble mean, 24-h probability of precipitation exceeding 0.0254 cm, 6-h maximum 2-m temperature, and 6-h minimum 2-m temperature. A more extensive, forecaster-friendly internal web page was developed and tailored to the WPC medium-range forecast process (appendix C). The cluster means are not yet available in the WPC blending tool described in appendix A.

## 3. Limited statistical verification

For a typical ensemble, individual members do not outperform the ensemble mean in cumulative statistical verification (Tracton and Kalnay 1993). By induction, means for small clusters of ensemble members would not be expected to perform much better than individual members. Therefore, means for small clusters would not be expected to outperform the ensemble means used as benchmarks in this verification and are not verified. [This is not to say that a small cluster mean cannot depict a better forecast than the ensemble mean for individual cases (e.g., Palmer et al. 1990).] There is no reason to expect means for large clusters to outperform ensemble means either, but it is an interesting matter for investigation. For this limited statistical verification only the two largest clusters are archived, ensuring a complete data record since at least three clusters can be expected at each forecast hour (Table 1). The deterministic NCEP GFS, the NCEP GEFS mean, the deterministic ECMWF, and the ECMWF mean forecasts are also archived. The verification focus here is on MSLP and the maximum and minimum temperatures. For clusters and ensembles, means over constituent members are verified. The time interval covered by the verification is from 1 October 2013 through 31 May 2014.

The MSLP verification metric is a standardized anomaly correlation [specifically, a centered anomaly correlation as described by Wilks (2006, p. 311)] based on climatological means and standard deviations derived from the NCEP–NCAR reanalyses (Kistler et al. 2001) for a 41-yr period (1958–98). The verifying analysis for the MSLP is the WPC surface analysis existing on a 97-km-resolution grid. Owing to the difficulties associated with sea level pressure reduction over complex terrain, the MSLP verification region is restricted to the eastern portion of North America within the blue rectangle in Fig. 7. The verification is only done for model runs initialized at 0000 UTC against the WPC MSLP analysis valid at 1200 UTC for forecast days 3–7.
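A centered anomaly correlation on standardized anomalies can be sketched as follows. This is a minimal illustration, not the verification system used here: it assumes flat lists of grid-point values, applies no area weighting, and uses hypothetical function names.

```python
import math


def standardized_anomalies(values, clim_mean, clim_std):
    """Departure from the climatological mean in units of climatological std dev."""
    return [(v - m) / s for v, m, s in zip(values, clim_mean, clim_std)]


def centered_anomaly_correlation(forecast, analysis, clim_mean, clim_std):
    """Correlation of forecast and analysis anomalies after removing their map means."""
    f = standardized_anomalies(forecast, clim_mean, clim_std)
    o = standardized_anomalies(analysis, clim_mean, clim_std)
    f_bar = sum(f) / len(f)
    o_bar = sum(o) / len(o)
    numerator = sum((fi - f_bar) * (oi - o_bar) for fi, oi in zip(f, o))
    denominator = math.sqrt(
        sum((fi - f_bar) ** 2 for fi in f) * sum((oi - o_bar) ** 2 for oi in o)
    )
    return numerator / denominator
```

A forecast identical to the verifying analysis gives a correlation of 1, and the score is undefined if either anomaly field is constant over the region.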

Verification regions for MSLP anomaly correlations (blue rectangle) and daily temperature extrema MAEs (east of heavy green line).

Citation: Weather and Forecasting 30, 4; 10.1175/WAF-D-14-00137.1

The maximum and minimum temperatures are determined for 6-h intervals ending at 0000 and 1200 UTC, respectively. The verifying analysis is the 2.5-km-resolution NCEP Real-Time Mesoscale Analysis (RTMA; De Pondeca et al. 2011) dataset, which uses different time windows for determining the daily extrema, as discussed in appendix D. Therefore, the verification region for the temperature extrema is the eastern two-thirds of the contiguous United States (CONUS), the area east of the heavy green line in Fig. 7. As for the MSLP, only 0000 UTC model runs are verified, corresponding to guidance used for the early day shift WPC medium-range day-3–7 temperature product suite. These temperature fields are not bias corrected but are downscaled to 5-km resolution in the manner described by Novak et al. (2014a).

The forecast verification system (fvs) used here is described by Novak et al. (2014a, their appendix B). The resampling method of Hamill (1999) is applied to obtain the distribution of differences of performance metrics for pairs of forecast sources (e.g., cluster 1 vs NCEP GFS). In Figs. 8–10 referenced below, the error bars (barred line segments) associated with histogram bars show confidence intervals obtained from the distribution of paired differences for the forecast source represented by that histogram bar compared to the forecast source of the first histogram bar. All forecast sources are compared with the first one. The vertical extent of each error bar is determined by the level of the statistical significance test (0.05) and depicts the 95% confidence interval (ranging from the 2.5th percentile to the 97.5th percentile value in the order statistics of resampled differences). Each error bar is plotted with the zero value of the distribution of randomly resampled differences aligned with the value along the ordinate of the performance metric of the first forecast source, a position consistent with the null hypothesis of no difference. Therefore, an error bar partially overlapping the color of the underlying histogram bar indicates no statistically significant difference between that forecast source and the first one. An error bar completely overlapping a color bar or completely clear of a color bar indicates a statistically significant difference. The number of samples used in the bootstrap resampling is 5000.
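
The significance test can be sketched as follows, assuming the performance metric is a simple mean of per-case scores (as for MAE) and that each resample randomly swaps the two forecast-source labels within each case, in the spirit of Hamill (1999). The function and variable names are hypothetical, and the fvs code may differ in detail.

```python
import numpy as np

def resampled_difference_interval(scores_a, scores_b, n_samples=5000,
                                  alpha=0.05, seed=0):
    """Return (observed difference, lower bound, upper bound).

    scores_a, scores_b: per-case scores for two forecast sources, paired
    by valid time. The bounds are the alpha/2 and 1 - alpha/2 percentiles
    of the resampled differences under the null hypothesis of no difference.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    rng = np.random.default_rng(seed)
    # For each resample and case: True keeps the labels, False swaps them
    keep = rng.integers(0, 2, size=(n_samples, scores_a.size)).astype(bool)
    resampled_a = np.where(keep, scores_a, scores_b)
    resampled_b = np.where(keep, scores_b, scores_a)
    diffs = resampled_a.mean(axis=1) - resampled_b.mean(axis=1)
    lower, upper = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    observed = scores_a.mean() - scores_b.mean()
    return observed, lower, upper

# Synthetic daily MAE values: source B is worse by 0.5 in every case
rng = np.random.default_rng(42)
mae_a = rng.normal(2.0, 0.3, size=200)
mae_b = mae_a + 0.5
observed, lower, upper = resampled_difference_interval(mae_a, mae_b)
significant = not (lower <= observed <= upper)
```

An observed difference lying outside the interval corresponds to an error bar that is either completely clear of or completely inside the histogram bar being compared.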

The MSLP anomaly correlations (%; indicated on the left ordinate) as a function of forecast hour (indicated along the abscissa). The plus signs give the number of contributing cases (i.e., valid times contributing to the cumulative verification) plotted against the right ordinate. The color key for the histogram bars is given above the graph. The inset box gives the confidence level and number of random samples used in generating the error bars for assessing statistical significance. See text for details.

The TMAX MAE (°C; indicated on the left ordinate) as a function of forecast hour along the abscissa. The plus signs give the number of contributing cases plotted against the right ordinate as in Fig. 8. The color key for the histogram bars is given above the graph. The inset box gives the confidence level and number of random samples used in generating the error bars for assessing statistical significance. See text for details.

As in Fig. 9, but for TMIN.

Figure 8 shows results for the MSLP verification. The centered standardized anomaly correlation is positively oriented, meaning higher values indicate better performance. The larger of the two largest clusters tends to outperform the next-largest cluster by a statistically significant margin at forecast hours 84, 132, and 156. The largest cluster is always significantly better than the NCEP GFS deterministic model and the NCEP GEFS, except for the latter at 156 and 180 h. The largest cluster cannot outperform the ECMWF deterministic model, but the differences are not statistically significant at and beyond 132 h. Finally, the ECMWF ensemble mean is significantly better than the largest cluster at all forecast hours.

Figure 9 shows verification results for the maximum temperature TMAX in terms of the mean absolute error MAE, which is negatively oriented, meaning lower values indicate better performance. For TMAX, the larger of the two largest clusters is significantly better than the next-largest cluster at only 96 and 144 h; while, at 120, 168, and 192 h, the differences are not significant although the first cluster is always better (lower MAE). The NCEP GFS model is always worse than the largest cluster, and the differences are significant after 120 h. The NCEP GEFS mean is significantly worse than the largest cluster before 168 h, after which it is not significantly different. The ECMWF deterministic model TMAX is significantly better than the largest cluster mean TMAX at all forecast hours except 192. Finally, as with MSLP, the ECMWF ensemble mean TMAX significantly outperforms the largest cluster at all forecast hours.

Figure 10 displays verification results for the minimum temperature TMIN in terms of MAE. The largest cluster outperforms the next-largest cluster significantly only at 132 h. The largest cluster is actually worse than the next-largest one at 180 h, the difference being not quite significant at the 0.05 level. For TMIN, the largest cluster has significantly lower MAEs than the NCEP GFS model only at 156 and 180 h with no significant differences at 84, 108, and 132 h. Interestingly, the NCEP GEFS mean TMIN is significantly better than the largest cluster at all forecast hours except 108 and 132. The ECMWF deterministic model is significantly better for TMIN at all forecast hours except 180, and the ECMWF ensemble mean is significantly better at all forecast hours for TMIN.

The statistical verification demonstrates that the objectively selected largest clusters cannot outperform the ECMWF ensemble mean over many cases. In fact, the superior performance of the ECMWF ensemble mean is statistically significant against all forecasts verified for this article. The superiority of the ECMWF ensemble mean in this verification supports the findings of Keune et al. (2014), except that the results here are unambiguously significant at the 0.05 level. The statistical verification suggests clusters may lie on a statistical performance continuum, falling somewhere between deterministic runs and the ensemble mean. The case for clusters therefore rests not on statistical verification but on their ability to depict scenarios not evident via ensemble means and not necessarily captured by deterministic forecasts.

Returning to the previously discussed topic of intercluster contrast, the fact that significant differences can exist between the two largest clusters in the verification discussed above, especially in the case of MSLP, indicates the two largest clusters are likely to be noticeably different. Such differences will certainly vary from case to case. As demonstrated by the example case described in section 4, the two largest clusters can be quite different. Since seven or eight clusters are usually found (Table 1), it is also possible for the two largest clusters to be somewhat similar, while other clusters of the same or slightly smaller size are quite different. The existing limited verification cannot reflect the result of such contrasts.

## 4. Example clustering forecast case

The statistical verification presented in the previous section indicates that clustering cannot be of singular utility in any particular forecast case. The totality of guidance must always be considered. In the WPC, the human deterministic forecast remains central to the process for creating the suite of medium-range forecast products (appendix A). In some cases, the information conveyed via clustering may enable the human forecaster to sharpen the deterministic forecast toward the outcome expected after having considered all available guidance, forecast continuity, guidance trends, and collaborative input from other forecasters. In practice, sharpening the forecast with a cluster would be accomplished by including the mean of one or more clusters along with the ensemble mean to produce a consensus forecast (blend), effectively a weighted ensemble mean with some ensemble members weighted a bit more than others. The utility of clusters may be evident in situations exhibiting large forecast uncertainty. Even if not used directly in creating a consensus forecast, cluster depictions of contrasting possibilities may aid in communicating the uncertain aspects of the forecast.
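
The equivalence between such a blend and a weighted ensemble mean can be checked numerically. The sketch below uses synthetic fields, a hypothetical nine-member cluster, and an arbitrary blend weight of 0.3; none of these values come from the WPC blender.

```python
import numpy as np

def member_weights(n_members, cluster_idx, w_cluster):
    """Per-member weights implied by blending a cluster mean (weight
    w_cluster) with the full ensemble mean (weight 1 - w_cluster)."""
    weights = np.full(n_members, (1.0 - w_cluster) / n_members)
    weights[cluster_idx] += w_cluster / len(cluster_idx)
    return weights

rng = np.random.default_rng(1)
members = rng.normal(size=(70, 5, 8))   # synthetic 70-member ensemble of small grids
cluster_idx = np.arange(9)              # hypothetical nine-member cluster
w = 0.3                                 # hypothetical blend weight for the cluster mean

# Blend of the cluster mean and the ensemble mean
blend = w * members[cluster_idx].mean(axis=0) + (1.0 - w) * members.mean(axis=0)

# The same field expressed as a weighted mean over all members
weights = member_weights(70, cluster_idx, w)
weighted_mean = np.tensordot(weights, members, axes=1)
```

With these numbers each cluster member carries a weight of 0.3/9 + 0.7/70, while every other member carries 0.7/70, so the cluster members are weighted more heavily but no member is excluded.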

The day-6–7 forecast valid 11–12 May 2014 serves as an example case. The ensemble guidance for these forecasts was initialized at 0000 UTC 5 May 2014. The focus here is on day 7 ending at 0000 UTC 13 May 2014 (the 192-h forecast). Spaghetti plots [described by Inness and Dorling (2013, p. 137)] for the 564-dm 500-hPa height contour at and prior to the 192-h projection time (not shown) indicated a marked split between ensemble members predicting a slow-moving trough over the western CONUS and those predicting a progressive trough over the eastern CONUS. At 192 h there were eight clusters, for which the 500-hPa heights are plotted in Fig. 11. Three of the clusters, comprising a total of 21 members, had the trough over the eastern half of the CONUS (Figs. 11a,d,g). One eight-member cluster showed nearly zonal flow across the CONUS (Fig. 11c). Finally, four of the eight clusters comprising 30 members depicted the trough over the western half of the CONUS (Figs. 11b,e,f,h), which is what actually verified (Fig. 11i). In fact, the verifying trough was anomalously deep, resulting in unusually cool temperatures for the western portion of the northern plains and adjacent Rocky Mountain areas as discussed below.

(a)–(h) Depiction of 500-hPa height (dm; green contours) cluster means for eight clusters found at 192 h from the 0000 UTC 5 May initial time valid at 0000 UTC 13 May 2014 and (i) the verifying NCEP GDAS analysis valid 0000 UTC 13 May 2014. The size of the clusters is written in (a)–(h).

The similarities and contrasts in Figs. 11a–h demonstrate the ambiguous nature of clustering results. The similarities suggest that a different clustering method might produce fewer clusters, each having more members. The contrasts reflect the forecast uncertainty, which the clustering cannot resolve. The means of the two most populated clusters^{3} at 192 h (valid 0000 UTC 13 May 2014) depict the bifurcation of the ensemble quite well, as shown in Figs. 11a and 11b and in more detail in Fig. 12. The largest cluster (Fig. 12a) has 10 members and depicts the eastern position of the trough. The membership of this 10-member cluster is dominated by the NCEP GEFS with only three of the members coming from the ECMWF ensemble. (The NCEP GEFS members are represented by letters, and the ECMWF ensemble members are represented by integers in Fig. 12.) The next-largest cluster (Fig. 12b) has nine members and exhibits a strong trough over the western CONUS. This nine-member cluster is entirely from the ECMWF ensemble. The dashed contours in Fig. 12 show the 500-hPa height contours for the combined ensemble mean, which weakly favors the western trough solution. The shading in Fig. 12 depicts the large deviations from the combined ensemble mean for these two clusters.

Depiction of 500-hPa height (dm; green contours) cluster mean and deviations (m; shaded according to color bar on the left) from the combined ensemble mean for (a) the 10-member largest cluster and (b) the 9-member next-largest cluster found at 192 h from the 0000 UTC 5 May initial time valid at 0000 UTC 13 May 2014. The dashed lines depict the combined ensemble mean 500-hPa height field (dm). The cluster membership is indicated by letters for NCEP GEFS members and integers for ECMWF ensemble members above.

Since the clusters are not yet available for the WPC forecasters to use in their blending tool (appendix A), the following discussion is hypothetical. In light of the ambiguities associated with clustering and considering the verification results, forecasters will have to be cautioned not to overweight blends toward cluster means. Nevertheless, the questions are how might these clusters be helpful in this particular case, assuming one could use them in the blender, and what might be the outcome for sensible weather parameters like the maximum 2-m temperature, a parameter not used in the DCA itself? The blending options currently consist of deterministic model runs and ensemble means. In general, WPC medium-range forecasters, particularly in the day-6–7 projection time range, weight the ensemble means strongly and may apply smaller weights to one or more of the deterministic runs to nudge toward the solution considered to be most likely. The previous WPC forecast is often given some weight to maintain a degree of forecast continuity. Clusters could provide blender input lying between the detailed, sharp deterministic solutions and the often washed-out ensemble means; therefore, blending a cluster mean with the ensemble mean would sharpen the forecast in a preferred way.

For the particular case at hand, the forecaster might be tempted to select the largest cluster (Fig. 12a) to nudge the blend toward the trough over the eastern CONUS, which would push the forecast toward an inaccurate solution. However, such action would not be supported by a broader view of the cluster output (see previous discussion), nor is it supported by the combined ensemble mean and forecast continuity (not shown). In addition, the ECMWF deterministic model run (not shown) indicated the western location of the trough, but not the intensity shown in Fig. 12b. The totality of guidance suggests that the forecaster might reasonably include the next-largest cluster (Fig. 12b) in a weighted average (blend) along with the ensemble mean to sharpen the forecast toward the western trough solution.

Assuming the hypothetical forecaster made the decision to include the second-largest cluster in the blend, what might its influence be on the maximum 2-m temperature? Figure 13, in approximate terms (see appendix D), suggests an answer to this question by comparing the mean 6-h maximum 2-m temperature valid ending at 0000 UTC 13 May 2014 from the 9- (Fig. 13a) and 10-member clusters (Fig. 13c) to both the WPC maximum temperature forecast (Fig. 13b) and the verifying daily maximum 2-m temperature (Fig. 13d) obtained from the RTMA data, remapped to the 1° × 1° resolution of the cluster output. The remapping technique (Im et al. 2006) preserves area averages. Figure 13a shows 2-m temperatures consistent with the nine-member cluster mean 500-hPa height trough (Fig. 12b). The pattern of relatively cool temperatures over Wyoming and Colorado in Fig. 13a matches the RTMA much better than either the 10-member cluster mean or the WPC forecast. However, the WPC forecast (Fig. 13b) does indicate some degree of coolness in the area of interest, reflecting the fact that the WPC forecaster chose to use a nearly equally weighted blend of the ECMWF ensemble mean, the NCEP GEFS mean, and the previous WPC forecast (continuity). As shown by the contours of the combined ensemble mean in Figs. 12a and 12b, the actual forecaster’s blend choice weakly favors the western trough solution.

The 2-m TMAX (°C; shaded according to color bars on the left) forecasts valid on day 7 ending at 0000 UTC 13 May 2014 for (a) the 9-member cluster mean, (b) the WPC forecast blend, and (c) the 10-member cluster mean. (d) The verifying RTMA TMAX.

## 5. Summary and conclusions

The divisive clustering algorithm (DCA) discussed herein is posed as a stop-gap measure for practical application and experimentation in the WPC until an approach is implemented operationally that may better account for differences in spatial patterns and, perhaps, temporal evolution. The DCA is based on the one-dimensional discrete Fourier transformation applied to ensemble member deviations from the ensemble mean of 500-hPa heights in a truncated zonal band at a given forecast projection time. The ensemble is the 70-member combination of the NCEP GEFS and the ECMWF ensemble. The DCA is a divisive algorithm in which the largest amplitude of the first four wavenumbers from the Fourier transformation cosine series determines the initial division of ensemble members into four groups. The second division of each of these four groups into final clusters is based on the cosine function phase angle order statistics. The advantages of the DCA are several: 1) it is easy to develop and maintain, 2) it executes very quickly for a large ensemble, and 3) it produces subjectively plausible clusters. The two disadvantages of the DCA are 1) as configured it typically orphans about a dozen of the 70 ensemble members by not including them in any cluster and 2) it does not have an extended-in-time feature such as described by Ferranti and Corti (2011). However, the important point is that the DCA allows the WPC to develop methods for utilizing clusters in the medium-range forecast process while awaiting improved capability.
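
The first division can be sketched roughly as follows. The sketch assumes the height deviations have already been reduced to a one-dimensional series along the zonal band for each member and that the band is treated as periodic by the transform; both are assumptions, as are all names. The second division by phase-angle order statistics is not reproduced here beyond returning the phase angle itself.

```python
import numpy as np

def first_division(deviations, n_waves=4):
    """Assign each member to the group of its dominant wavenumber.

    deviations: array (n_members, n_lon) of member-minus-ensemble-mean
    500-hPa heights along the band (assumed layout).
    Returns (dominant wavenumber in 1..n_waves, phase angle in radians).
    """
    n_lon = deviations.shape[1]
    # Fourier coefficients for wavenumbers 1..n_waves (wavenumber 0 is skipped)
    coeffs = np.fft.rfft(deviations, axis=1)[:, 1:n_waves + 1]
    amplitude = 2.0 * np.abs(coeffs) / n_lon
    dominant = amplitude.argmax(axis=1)
    # Phase phi in the cosine form A * cos(k * x - phi)
    phase = -np.angle(coeffs[np.arange(coeffs.shape[0]), dominant])
    return dominant + 1, phase

# Two synthetic members with known dominant waves
x = 2.0 * np.pi * np.arange(72) / 72
deviations = np.stack([
    30.0 * np.cos(2 * x - 0.5) + 10.0 * np.cos(x),
    20.0 * np.cos(4 * x - 1.0),
])
wavenumber, phase = first_division(deviations)
```

Within each of the four groups, the members could then be ordered by phase angle before the second division is applied.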

The DCA clusters are not yet available in the WPC blending tool, but viewing the cluster means can provide meteorologically coherent pictures of outcomes not captured sharply by ensemble means or deterministic model output. Limited statistical verification indicates that objectively chosen clusters based on the greatest number of members cannot outperform the mean of the largest component ensemble, the ECMWF ensemble, for mean sea level pressure, maximum temperature, and minimum temperature. The value of clusters is not necessarily revealed by statistical verification, but rather by the demonstrated ability of clusters to depict a variety of outcomes, each supported by more than just a single ensemble member. Although it cannot be known in advance which cluster is more likely to verify best, a case study presented in this article demonstrates that several clusters can exhibit a forecast to be considered favorably based on numbers of members and agreement with weaker indications given by ensemble means and deterministic model guidance. Thus, it is reasonable to anticipate the usefulness of including clusters as an option in the WPC blending tool allowing forecasters to nudge the prediction toward a particular solution not well defined by ensemble means and not clearly represented by contradictory deterministic model output but favored in consideration of continuity, consistency, collaborative thinking, and the trend of the guidance. However, it is important that forecasters receive proper training to understand the inherent ambiguity of clustering and the perpetual imperative to consider the totality of guidance.

## Acknowledgments

The work of NCEP/HPC (WPC) 2011 summer intern Mr. William Davis demonstrated the viability of the divisive clustering algorithm applied to NCEP GEFS and is much appreciated. The suggestions and comments of WPC internal reviewer, Dr. David Novak, are much appreciated. The graphics for this article were produced using the MATLAB and GEMPAK software packages. Thanks to Ms. Lauren Morone for providing the date of the Cray 90 fire. The authors are grateful to the anonymous reviewers whose comments and suggestions led to significant improvements in the manuscript. Funds for publication were provided by NOAA/NWS/NCEP/WPC. Any opinions expressed in this article are those of the authors and do not necessarily state a NOAA/NWS position.

## APPENDIX A

### WPC Medium-Range Forecast Process Overview

The WPC medium-range forecast process is oriented toward depicting the most likely outcome as a deterministic forecast. While the WPC embraces probabilistic forecasting and plans to expand the probabilistic forecast content for the medium-range time frame, the human contribution will continue to be a deterministic forecast. There are several important reasons for this: 1) human forecasters are most comfortable making deterministic forecasts, 2) users continue to require deterministic forecasts, 3) human forecasters cannot manually create a consistent set of probabilistic products within operational time constraints, and 4) human deterministic forecasts perform competitively compared to model guidance (Novak et al. 2014a). To make probabilistic products, the WPC assumes that the human deterministic forecast is the mode of a probability density function (PDF). The available ensemble data provide the variance of the PDF, and the position of the human deterministic forecast in the ensemble order statistics determines the skewness of the PDF. A PDF is created at every point on a grid covering the forecast domain. This method is described for winter precipitation by Novak et al. (2014b).

A blending tool (blender) used by the WPC medium-range forecast desk allows the forecaster to create deterministic forecasts from model guidance. The forecaster uses the blender interface to specify weights as percentage values for an average of deterministic models and ensemble means. The blender currently does not allow the forecaster to select the individual members of an ensemble. The forecaster-selected blend information (weights and descriptors) is used in postprocessing to generate bias-corrected, downscaled forecast fields for various parameters (Novak et al. 2014a).

## APPENDIX B

### Hypergeometric Probability Calculation

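The binomial coefficient is defined as

$$\binom{a}{b} = \frac{a!}{b!\,(a-b)!},$$

where *a* and *b* are integers such that 0 ≤ *b* ≤ *a*. The hypergeometric probability of the random selection of 5 ECMWF members in drawing 7 members from the 70-member ensemble consisting of 50 ECMWF members and 20 NCEP GEFS members is given by the following calculation:

$$P = \frac{\binom{50}{5}\binom{20}{2}}{\binom{70}{7}} \approx 0.336.$$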
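
The hypergeometric probability described in this appendix can be evaluated directly with the Python standard library. The function name and argument names below are illustrative only.

```python
from math import comb

def hypergeom_pmf(k, n_draw, n_success, n_total):
    """Probability of exactly k 'successes' in n_draw draws without
    replacement from n_total items, n_success of which are successes."""
    return (comb(n_success, k) * comb(n_total - n_success, n_draw - k)
            / comb(n_total, n_draw))

# 5 ECMWF members among 7 drawn from 50 ECMWF + 20 NCEP GEFS members
p = hypergeom_pmf(5, 7, 50, 70)
```

The value of `p` is about 0.336, so a seven-member cluster drawn at random would contain exactly five ECMWF members roughly one time in three.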

## APPENDIX C

### Cluster Displays in the WPC Operational Environment

Before describing the forecaster-friendly cluster guidance in more detail, the issue of forecast temporal continuity for looped time series displays and accumulations [e.g., for quantitative precipitation forecasts (QPFs)] using the clusters must be discussed. From one forecast projection time to the next, individual ensemble members can drift from one cluster to another, joining different members to make a new subsequent cluster. This problem has been addressed in various ways by others. For the NCEP SREF clusters mentioned in the main text, the clustering at the forecast hour for which the cluster count is a maximum within a 24–27-h time window establishes the cluster membership for the duration of that time window [J. Du, NCEP/Environmental Modeling Center (EMC), 2014, personal communication]. The ECMWF uses extended-in-time EOFs applied to time windows increasing in size with increased projection time as described by Ferranti and Corti (2011). For the DCA, a very simple approach is taken: the clusters found at 96 h are perpetuated over the projection time interval 66–96 h, those found at 144 h apply over 102–144 h, and the clusters at 192 h apply over 150–204 h. The verification presented in the main text uses clusters generated at 6-h intervals.
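
The perpetuation rule for the DCA amounts to a lookup from the displayed forecast hour to the hour at which the clusters were formed. A minimal sketch, assuming 6-h display intervals and a hypothetical function name:

```python
def reference_cluster_hour(forecast_hour):
    """Return the forecast hour whose clusters are used for display
    at the given forecast hour (hours assumed to be on a 6-h grid)."""
    windows = {
        96: range(66, 97, 6),     # 66-96 h use the clusters found at 96 h
        144: range(102, 145, 6),  # 102-144 h use the clusters found at 144 h
        192: range(150, 205, 6),  # 150-204 h use the clusters found at 192 h
    }
    for reference_hour, hours in windows.items():
        if forecast_hour in hours:
            return reference_hour
    raise ValueError(f"forecast hour {forecast_hour} is outside 66-204 h at 6-h steps")
```

For example, `reference_cluster_hour(120)` returns 144, so a loop through 102–144 h shows the same cluster membership at every frame.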

The forecaster-friendly cluster-based forecast guidance is currently displayed via pregenerated graphics posted on the WPC internal web page. A graphical user interface featuring mouse-over activation, point-and-click selection, and pulldown menus allows a forecaster to select the cycle time (0000 or 1200 UTC), the forecast hour (66–204 at 6-h intervals), and the parameter. Each selection results in a multipanel display of cluster means, one panel for each cluster associated with the chosen forecast time. The parameters available to select are as follows: MSLP overlaid on 1000–500-hPa thickness; 6-, 24-, or 48-h accumulated precipitation; 12-h maximum (0000 UTC) and minimum (1200 UTC) temperatures; 12-h maximum (0000 UTC) and minimum (1200 UTC) temperature anomalies; 850-hPa temperature; 500-hPa height and vorticity; 200-hPa height and winds; precipitable water; and convective available potential energy (CAPE). WPC meteorologists can use these selections to view meteorologically coherent depictions of outcomes that may differ markedly from the NCEP GEFS and ECMWF ensemble means.

## APPENDIX D

### Maximum and Minimum Temperature Time Windows

The time of daily maximum temperature at any location in the CONUS typically occurs within several hours of the time of maximum solar elevation angle, but varies depending on low-level temperature advection, cloud cover, and precipitation. The time of the daily minimum temperature is usually near sunrise but can vary because of the same factors influencing maximum temperature. To account for this variability, the WPC processing uses 18-h time windows consisting of three consecutive 6-h time intervals for determining daily extremes: from 1200 to 0600 UTC the following day for maximum temperatures and from 0000 to 1800 UTC for minimum temperatures. The time windows used by the RTMA are different: from 1200 to 0300 UTC for maximum temperature and from 0000 to 1600 UTC for minimum temperature. Thus, the 6-h time resolution of the WPC forecast does not allow an exact match to the RTMA verifying analysis.

## REFERENCES

Alhamed, A., S. Lakshmivarahan, and D. J. Stensrud, 2002: Cluster analysis of multimodel ensemble data from SAMEX. *Mon. Wea. Rev.*, **130**, 226–256, doi:10.1175/1520-0493(2002)130<0226:CAOMED>2.0.CO;2.

Atger, F., 1999: Tubing: An alternative to clustering for the classification of ensemble forecasts. *Wea. Forecasting*, **14**, 741–757, doi:10.1175/1520-0434(1999)014<0741:TAATCF>2.0.CO;2.

Branković, Č., B. Matjačić, S. Ivatek-Šahdan, and R. Buizza, 2008: Downscaling of ECMWF ensemble forecasts for cases of severe weather: Ensemble statistics and cluster analysis. *Mon. Wea. Rev.*, **136**, 3323–3342, doi:10.1175/2008MWR2322.1.

De Pondeca, M. S. F. V., and Coauthors, 2011: The real-time mesoscale analysis at NOAA’s National Centers for Environmental Prediction: Current status and development. *Wea. Forecasting*, **26**, 593–612, doi:10.1175/WAF-D-10-05037.1.

Dutton, J. A., 1986: *The Ceaseless Wind.* Dover Publications, 617 pp.

Ferranti, L., and S. Corti, 2011: New clustering products. *ECMWF Newsletter*, No. 127, Reading, United Kingdom, 6–11. [Available online at http://old.ecmwf.int/publications/newsletters/pdf/127.pdf.]

Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. *Wea. Forecasting*, **14**, 155–167, doi:10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.

Im, J.-S., K. Brill, and E. Danaher, 2006: Confidence interval estimation for quantitative precipitation forecasts (QPF) using Short-Range Ensemble Forecasts (SREF). *Wea. Forecasting*, **21**, 24–41, doi:10.1175/WAF902.1.

Inness, P., and S. Dorling, 2013: *Operational Weather Forecasting.* J. Wiley and Sons, 231 pp.

Johnson, A., X. Wang, M. Xue, and F. Kong, 2011: Hierarchical cluster analysis of a convection-allowing ensemble during the Hazardous Weather Testbed 2009 Spring Experiment. Part II: Ensemble clustering over the whole experiment period. *Mon. Wea. Rev.*, **139**, 3694–3710, doi:10.1175/MWR-D-11-00016.1.

Keune, J., C. Ohlwein, and A. Hense, 2014: Multivariate probabilistic analysis and predictability of medium-range ensemble weather forecasts. *Mon. Wea. Rev.*, **142**, 4074–4090, doi:10.1175/MWR-D-14-00015.1.

Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. *Bull. Amer. Meteor. Soc.*, **82**, 247–267, doi:10.1175/1520-0477(2001)082<0247:TNNYRM>2.3.CO;2.

Marzban, C., S. Sandgathe, and H. Lyons, 2008: An object-oriented verification of three NWP model formulations via cluster analysis: An objective and a subjective analysis. *Mon. Wea. Rev.*, **136**, 3392–3407, doi:10.1175/2007MWR2333.1.

Meyer, P. L., 1970: *Introductory Probability and Statistical Applications.* 2nd ed. Addison-Wesley, 367 pp.

Nakaegawa, T., and M. Kanamitsu, 2006: Cluster analysis of the seasonal forecast skill of the NCEP SFM over the Pacific–North America sector. *J. Climate*, **19**, 123–138, doi:10.1175/JCLI3609.1.

Novak, D. R., C. Bailey, K. F. Brill, P. Burke, W. A. Hogsett, R. Rausch, and M. Schichtel, 2014a: Precipitation and temperature forecast performance at the Weather Prediction Center. *Wea. Forecasting*, **29**, 489–504, doi:10.1175/WAF-D-13-00066.1.

Novak, D. R., K. F. Brill, and W. A. Hogsett, 2014b: Using percentiles to communicate snowfall uncertainty. *Wea. Forecasting*, **29**, 1259–1265, doi:10.1175/WAF-D-14-00019.1.

Palmer, T. N., C. Brankovic, F. Molteni, S. Tibaldi, L. Ferranti, A. Hollingsworth, U. Cubasch, and E. Klinker, 1990: The European Centre for Medium-Range Weather Forecasts (ECMWF) program on extended-range prediction. *Bull. Amer. Meteor. Soc.*, **71**, 1317–1330, doi:10.1175/1520-0477(1990)071<1317:TECFMR>2.0.CO;2.

Straus, D. M., and F. Molteni, 2004: Circulation regimes and SST forcing: Results from large GCM ensembles. *J. Climate*, **17**, 1641–1656, doi:10.1175/1520-0442(2004)017<1641:CRASFR>2.0.CO;2.

Tracton, M. S., and E. Kalnay, 1993: Ensemble forecasting at NMC: Operational implementation. *Wea. Forecasting*, **8**, 379–398, doi:10.1175/1520-0434(1993)008<0379:OEPATN>2.0.CO;2.

Wilks, D. S., 2006: *Statistical Methods in the Atmospheric Sciences.* 2nd ed. Academic Press, 630 pp.

Yussouf, N., D. J. Stensrud, and S. Lakshmivarahan, 2004: Cluster analysis of multimodel ensemble data over New England. *Mon. Wea. Rev.*, **132**, 2452–2462, doi:10.1175/1520-0493(2004)132<2452:CAOMED>2.0.CO;2.

^{1} Although the direction is longitudinal, the angular displacement in the frequency domain is not longitude, which varies considerably less than 360° along the TZB as stated previously and shown in Fig. 1.

^{2} Control members are excluded only because of the organization of the data in the NCEP file system.

^{3} If there are two or more clusters having the same size, then the first two found are the largest two. If there is a tie for the second-largest cluster, the first found is used.