Characteristics of Diagnostics for Identifying Elevated Convection over the British Isles in a Convection-Allowing Model

Identifying modes of convection can be useful in both forecasting and research. For example, it allows for potentially different impacts to be determined in forecasting contexts and stratification of model behavior in research contexts. One area where identification could be particularly beneficial is elevated convection. Elevated convection is not routinely examined (outside of an operational environment) within a physical-process perspective in operational numerical weather prediction model evaluation or verification. Using convection-allowing model (CAM) output, the characteristics of four elevated convection diagnostics [based on boundary layer, convective available potential energy (CAPE) ratios, downdraft, and inflow layer properties] are examined in operational forecasts during the U.K. Testbed Summer 2021 run at the Met Office. A survey of the practical use of these diagnostics in a simulated operational environment revealed that diagnostics based on CAPE ratios and inflow layer properties were preferred. These diagnostics were the smoothest varying in both space and time. Treating the CAPE ratio and downdraft properties diagnostics as proxies for updrafts and downdrafts, respectively, showed that updrafts were slightly more likely to be resolved than downdrafts. However, a substantial proportion of both are unresolved in current CAMs. Filtering the CAPE ratios by the inflow layer properties led to improved spatial and temporal characteristics, and thus indicates a potentially useful diagnostic for both research and forecasting.


Introduction
Identification of convective mode in models can be important from a forecasting perspective. For example, the type of convection may have an impact on the hazard that could be present in severe convection (e.g., in the United States over 90% of deaths from tornadoes are attributed to supercells; Schoen and Ashley 2011; in the United Kingdom, the majority of tornadoes are associated with quasi-linear convective systems; Clark and Smart 2016). However, identification of convective mode can also be useful within a research context. For example, different convective types behave differently in models (e.g., more organized modes of convection tend to have better specified forecast location compared to less organized events; Fowle and Roebber 2003). One area where the identification of the type of convection remains somewhat uncertain, in the modeling community, is that of elevated versus surface-based convection.
Elevated convection is defined as convection which initiates above the planetary boundary layer (e.g., Berry et al. 1945). It is often prone to impactful forecast busts and thus remains a forecasting challenge (Corfidi et al. 2008). However, elevated convection is not routinely examined from a physical process perspective in numerical weather prediction (NWP) models outside of the operational environment. Instead, the focus is on specific well-observed events during field campaigns [e.g., Plains Elevated Convection At Night (PECAN); Geerts et al. 2017; Stelten and Gallus 2017; Weckwerth et al. 2019]. Furthermore, it is noted by operational meteorologists that, for many convection-related field campaigns, these types of events often have the poorest forecasts (e.g., Convective Storm Initiation Project: Browning et al. 2007; Convective Precipitation Experiment: Leon et al. 2016).
The definition of elevated convection presented above is, arguably, broad. Using this definition, it has been shown that elevated convection has many distinct characteristics compared to its surface-based counterpart. For example, in comparison to surface-based convection, elevated convection is often associated with greater precipitation totals, increased frequency of positively charged lightning, reduced frequency of convective wind gusts at the surface, and greater prevalence during nocturnal hours (e.g., Colman 1990; Horgan et al. 2007; Reif and Bluestein 2017). Elevated convection is also associated with the phenomenon widely referred to as "thundersnow" (e.g., Market et al. 2002).
Given the potential range of hazards, evaluation of elevated convection in operational NWP models should be given more attention. However, for effective and unbiased evaluation to take place, diagnostics that can objectively identify elevated convection in the model and observations (using national networks) are needed. The latter is outside the scope of this paper but is currently under investigation. Instead, we focus on the former.
Traditionally, elevated convection is identified by factors including the lack of surface-based convective available potential energy (SBCAPE); the presence of midlevel CAPE; the effective inflow layer; or from the ratio of downdraft convective inhibition to downdraft CAPE (DCIN and DCAPE, respectively; Thompson et al. 2007; Corfidi et al. 2008; Nowotarski et al. 2011; White et al. 2016; Market et al. 2017). Furthermore, forecasters will use the formation mechanisms (e.g., presence of low-level jets, density currents or bores, the positioning of fronts and the likelihood of convection ahead of fronts, gravity waves, initiation from pre-existing convection; e.g., Johns and Hirt 1987; Trier and Parsons 1993; Przybylinski 1995; Rochette and Moore 1996; Moore et al. 2003; Wilson and Roberts 2006; Trier et al. 2011; Browning et al. 2012; Keene and Schumacher 2013; Haghi et al. 2017; Wilson et al. 2018; Parsons et al. 2019) to help identify if the convection is elevated.
Part of the debate about how best to identify elevated convection arises from considerations that if there is small, but nonzero, SBCAPE, surface air parcels can be ingested into the convection and so it is not elevated (e.g., Marsham et al. 2011; Nowotarski et al. 2011; Schumacher 2015). It is, perhaps, interesting to then consider the idea of convection occurring as part of a spectrum. In the spectrum, surface-based convection and elevated convection would represent its two extremes (e.g., Corfidi et al. 2008). The spectrum approach has been successfully used for convective regimes: the spectrum that exists between convective quasi-equilibrium and nonequilibrium convection (e.g., Zimmer et al. 2011). This approach has shown that forecast behavior is different along the regime spectrum (e.g., Flack et al. 2018, 2021); this observation presents evidence for spectrum-based thinking being a powerful concept. However, to consider forecast behavior across a spectrum, a diagnostic that sensibly places the convection on the spectrum is required.
To allow further investigation of the behavior of elevated and surface-based convection in NWP models, four environment-based diagnostics have been compared during summer 2021 over the British Isles during the U.K. Testbed Summer 2021 run at the Met Office. The comparison occurs using a convection-allowing model (CAM) to understand the diagnostics' properties and applicability. Specifically, this paper aims to answer the following questions:
• Which diagnostics are practically useful in current CAMs?
• What are the properties of the diagnostics in CAMs?
• Do the properties of the diagnostics indicate any reasons behind why some diagnostics have less practical use?
The focus of this paper is on assessing the practical use of the diagnostics presented, rather than aiming to identify the "best," given they are all theoretically viable. Here, practical use is defined as being of additional benefit to the operational meteorologists over current practices (be that in terms of speed or in indicating different areas where elevated convection could occur, and does occur in reality). Thus, any diagnostic with a lower rating compared to the others indicates either a missing or poorly resolved/parameterized process or feature within current CAMs or a diagnostic where the assumptions may not be fully applicable.
The remainder of this paper is organized as follows. The model and U.K. Testbed Summer 2021 are described in section 2, the elevated convection diagnostics are described in section 3, subjective results from the U.K. Testbed Summer 2021 are presented in section 4, the properties of the diagnostics are examined in section 5, a combination of the preferred diagnostics is presented in section 6, and a summary is drawn in section 7. The appendix provides a list of the cases examined in the U.K. Testbed Summer 2021.

Methodology
The framework of the U.K. Testbed Summer 2021 (section 2a) and the model used (section 2b) are described within the following subsections.
a. The U.K. Testbed Summer 2021
As part of the U.K. Testbed Summer 2021 an elevated convection survey was conducted to determine the practical use (in CAMs) of the elevated convection diagnostics (section 3). The U.K. Testbed Summer 2021 followed a similar format to that of the Met Office winter testbed 2020 (Bain et al. 2022). Like the winter testbed, the summer testbed was "on demand." However, the summer testbed was divided into two periods: four weeks in July and three in September. Furthermore, owing to the ongoing COVID-19 pandemic restrictions, the testbed was scaled back, restricted to virtual participation, and run in addition to participants' regular roles.
Cases were chosen based on the following considerations, in order: potential impact from severe weather, precipitating convection, and dry convection. The cases were chosen irrespective of the convective mode. The testbed was "on demand"; therefore, if cases were deemed too similar or too close together, one of two options was taken: (i) the case would not be used, or (ii) the case would be optional. All cases were considered within T + 6 to T + 36 h (0600-1200 UTC the following day) from forecast initiation. This time window was chosen as this is the period, as the lead time reduces, when there will be greatest pressure on the operational meteorologist to make more detailed decisions about warnings and advice. Thus, the timing loosely mimics that for the operational forecasting bench.
The elevated convection survey ran between 17-26 July 2021 and 6-24 September 2021. A full list of cases is presented in the appendix.
The elevated convection survey consisted of four parts: (i) scoring maps of current practice against observations; (ii) scoring the different diagnostics against observations; (iii) scoring the different diagnostics against the current practice; (iv) using (i)-(iii) to determine the practical use of the different diagnostics.
The current practice and observations maps were created jointly by the lead author and the operational meteorologist (forecaster) providing the testbed weather briefing.
The current practice map was derived from the U.K. Variable (UKV) resolution configuration of the Unified Model (UM), which is described in section 2b. To derive the map, model tephigrams, pressure, geopotential height, wind, temperature, precipitation, divergence, cloud, and boundary layer fields were used alongside local, operational, and research knowledge of convection. Particular attention was paid to the identification of unstable layers and formation mechanisms.
The equivalent observations map was created from analysis of radiosonde ascents, aircraft meteorological data relays (AMDARs), satellite and radar data. Where necessary, to supplement the observations, ECMWF analyses or short-range (1-3 h) forecasts were used to identify the convection type. The analyses or short-range forecasts were given a lower importance compared to the observations.
Given the dependence of the analysis on sparse observations, objective measures would, perhaps, be less effective; therefore, subjective measures were assessed in the survey (e.g., Seshadrinathan et al. 2010; Jahedi and Méndez 2014).

b. The U.K. variable resolution configuration of the Unified Model
For the elevated convection survey during the U.K. Testbed Summer 2021, and subsequent analysis, the operational UKV configuration of the UM was used. The operational model was operational suite 44 (OS44) at UM version 11.5. The UKV at OS44 uses the science settings and parameterizations as defined in the regional atmosphere and land configuration at version 2 for the midlatitudes (RAL2M; Bush et al. 2023). The UKV has an interior grid length of 1.5 km and tapers from 4 to 1.5 km at the edges (Tang et al. 2013). Thus, convection is represented explicitly. The results focus on the interior, fixed grid length domain. There are 70 vertical levels up to 40 km (Hanley et al. 2015) and hourly cycling 4D-variational data assimilation.
During the U.K. testbed the forecasts from the 0000 UTC cycle were used. This cycle time runs out to a maximum lead time of 54 h. Further analysis into the properties of the diagnostics presented within this paper (sections 5 and 6) considers all forecasts initiated at 0000 UTC daily between 17 July and 24 September 2021. The subsequent analysis focuses on the 24-h period between T + 12 and T + 36 h (1200-1200 UTC) to provide more reliable statistics of the properties of the diagnostics.

Elevated convection diagnostics
The four diagnostics used to objectively identify elevated convection are described in the following subsections. These diagnostics are used to indicate environmental properties associated with elevated convection and are summarized in Table 1. Although simplistic, when considered with diagnostics for triggering methods, these diagnostics should allow the identification of the type of convection present in the model when it is produced. Throughout the rest of the paper the diagnostics are referred to via shortened names related to the properties they identify ("boundary layer," "CAPE ratios," "downdraft," and "inflow layer"). Figure 1 shows a case study example of the diagnostics from the U.K. Testbed Summer 2021. The case shows a typical situation for elevated convection events in the United Kingdom: a mesoscale convective system advecting in from France.

a. Boundary layer properties: Convection above a stable boundary layer
The first diagnostic (Fig. 1a) represents a zeroth-order diagnostic. The diagnostic is based on the properties of the boundary layer. It determines if the model is producing convection above a stable boundary layer. The basic principle is that air within a stable boundary layer cannot convect; therefore, any convection will be decoupled from the surface (e.g., Corfidi et al. 2008).
Convection is identified by an arbitrary threshold of 30 dBZ in the maximum column reflectivity. This reflectivity value represents areas with the most intense rain, while not limiting it to severe convection. The boundary layer type is derived by the boundary layer parameterization (for more details on the different boundary layer types and their diagnosis in the model see Lock et al. 2000). It is worth noting that within the model time step, the boundary layer scheme is run after the large-scale precipitation scheme. This ordering is important as precipitation locally cools the boundary layer. This local cooling results in the diagnosis of a stable boundary layer directly underneath the precipitation in the same time step. Therefore, the boundary layer type at the preceding time step is used to indicate the type of environment the convection is moving into for the diagnostic. A caveat to this approach is that slow-moving convection may be incorrectly identified as elevated.
The diagnostic is defined when convection is present in the model. It is a binary diagnostic: a value of one implies elevated convection, and a value of zero implies surface-based convection.
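
As an illustration of the logic described in this subsection, the following is a minimal Python sketch for a single grid point. The function and argument names are hypothetical and are not part of the UM or of the authors' code; in particular, the stable/unstable flag is assumed to have already been derived from the boundary layer type at the preceding time step, and "convection present" is assumed to mean reflectivity strictly exceeding the threshold.

```python
from typing import Optional

# Threshold on maximum column reflectivity quoted in the text (dBZ).
REFLECTIVITY_THRESHOLD_DBZ = 30.0


def boundary_layer_diagnostic(max_reflectivity_dbz: float,
                              previous_bl_is_stable: bool) -> Optional[int]:
    """Binary boundary layer diagnostic at one grid point.

    Returns None where no convection is present (diagnostic undefined),
    1 for convection above a stable boundary layer (elevated), and
    0 otherwise (surface based).

    previous_bl_is_stable is a hypothetical flag: True if the boundary
    layer type at the preceding time step was stable.
    """
    # Assumption: convection requires reflectivity above the threshold.
    if max_reflectivity_dbz <= REFLECTIVITY_THRESHOLD_DBZ:
        return None
    return 1 if previous_bl_is_stable else 0
```

Using the preceding time step's boundary layer type, rather than the current one, reflects the ordering of the schemes described above; the sketch therefore inherits the same caveat for slow-moving convection.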

b. CAPE ratio
The second diagnostic (Fig. 1b) is based on convective instabilities: the SBCAPE and most-unstable CAPE (MUCAPE). The diagnostic is defined as

$$\text{CAPE ratio} = \frac{\mathrm{MUCAPE} - \mathrm{SBCAPE}}{\mathrm{MUCAPE}}.$$

The diagnostic is based on the idea that for elevated convection to occur there needs to be instability aloft. This "elevated" instability should be larger than that from a surface-based parcel ascent. Therefore, the larger the MUCAPE is compared to the SBCAPE, the more likely the convection will be elevated (e.g., Clark et al. 2012; Gallo et al. 2016).
The CAPE ratio varies smoothly between zero and unity. Therefore, it acknowledges the idea that convection exists in a continuum between surface-based and elevated convection (e.g., Corfidi et al. 2008). This diagnostic is not restricted to where convection is present in the model. This independence implies that it will indicate environments suitable for producing elevated convection. Values close to zero imply SBCAPE is close to the MUCAPE; therefore, the convection will most likely be surface based. Values close to one imply the convection will be elevated as the SBCAPE will be small or nonexistent. Thresholds are not applied to either CAPE value in the ratio. However, if the CIN associated with the most unstable parcel exceeds a magnitude of 75 J kg$^{-1}$ the ratio is treated as undefined, as it is assumed the MUCAPE is unlikely to be realized. Furthermore, should there be no CAPE present (i.e., SBCAPE and MUCAPE are zero) this quantity is undefined.
If the diagnostic is focused on elevated initiation, then it is most applicable at convective initiation. However, it is also applicable over time as when the CAPE is released the ratio can identify if an event has the potential to show more elevated characteristics (i.e., MUCAPE is not released as quickly as SBCAPE and so the ratio remains in favor of elevated convection).
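
The behavior of the CAPE ratio described in this subsection can be sketched as follows. This is a minimal illustration, assuming the ratio takes the form (MUCAPE − SBCAPE)/MUCAPE, which is consistent with the stated limits (zero when SBCAPE equals MUCAPE, one when SBCAPE vanishes). The function name and the use of `None` for undefined points are choices made for the example only.

```python
from typing import Optional

# Magnitude limit on the CIN of the most unstable parcel (J kg^-1),
# as quoted in the text.
MU_CIN_LIMIT = 75.0


def cape_ratio(sbcape: float, mucape: float, mu_cin: float) -> Optional[float]:
    """CAPE ratio at one grid point, or None where it is undefined.

    Assumed form: (MUCAPE - SBCAPE) / MUCAPE, so that 0 indicates
    surface-based and 1 indicates elevated environments.
    """
    # No CAPE present: the ratio is undefined.
    if mucape <= 0.0:
        return None
    # MUCAPE assumed unlikely to be realized under strong inhibition.
    if abs(mu_cin) > MU_CIN_LIMIT:
        return None
    return (mucape - sbcape) / mucape
```

For example, `cape_ratio(250.0, 1000.0, -20.0)` gives 0.75, an environment leaning toward elevated convection, whereas the same CAPE values with a most-unstable CIN of magnitude 100 J kg$^{-1}$ would be left undefined.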

c. Downdraft properties
The third diagnostic (Fig. 1c) is an orthogonal definition of elevated convection compared to the CAPE ratio. It is defined as the ratio between the DCIN and DCAPE:

$$\frac{\mathrm{DCIN}}{\mathrm{DCAPE}}.$$

This diagnostic was first introduced by Market et al. (2017, 2019). It is based on the principle that if the convective downdraft cannot reach the surface, an updraft originating from the surface will be unable to convect. The interpretation of this diagnostic changes depending upon how mature the convection is. For example, for mature convection it becomes more of a hazard identification diagnostic (for downbursts) rather than an elevated convection diagnostic. Thus, this diagnostic can have a double meaning. For our version of this diagnostic, we define the DCAPE and DCIN from a fixed model level with an altitude of approximately 4 km, equivalent to approximately 700 hPa. This is a simplistic definition and may not be the best level to use to calculate DCIN and DCAPE as it could result in a less continuous field. However, we feel this is a useful first attempt to understand its utility for the British Isles. This diagnostic varies between zero and, theoretically, infinity (assuming DCAPE and DCIN are defined). Values exceeding one are more likely to indicate elevated convection than those below one. An important caveat to this diagnostic is that it is most applicable for identifying elevated convection at convective initiation. The caveat arises as over time downdrafts from elevated convection can erode the capping inversion and may, eventually, be able to reach the surface (Horgan et al. 2007; Market et al. 2017) and so it may or may not indicate the convection is elevated. Given the propensity of convection being advected to the United Kingdom from the near European continent (e.g., Flack et al. 2016) it is likely that this diagnostic may not have as much practical use in U.K. CAMs compared to a domain where convective initiation is more likely to happen within it.
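
A minimal sketch of the downdraft ratio and its interpretation is given below. The function names are hypothetical. The sketch assumes both energies are supplied in J kg$^{-1}$, takes the magnitude of DCIN so that the result does not depend on sign convention, and treats the ratio as undefined where DCAPE is not positive; none of these choices is confirmed by the text.

```python
from typing import Optional


def downdraft_ratio(dcin: float, dcape: float) -> Optional[float]:
    """Ratio of DCIN to DCAPE at one grid point, or None where undefined.

    Assumption: the magnitude of DCIN is used, so the ratio lies
    between zero and (theoretically) infinity.
    """
    if dcape <= 0.0:
        return None
    return abs(dcin) / dcape


def suggests_elevated(ratio: Optional[float]) -> bool:
    """True where the ratio exceeds one, i.e. the downdraft is unlikely
    to reach the surface and convection is more likely elevated."""
    return ratio is not None and ratio > 1.0
```

For instance, a DCIN of magnitude 300 J kg$^{-1}$ with a DCAPE of 150 J kg$^{-1}$ gives a ratio of 2, which would be read as more likely elevated, subject to the life cycle caveat above.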

d. Inflow layer properties
The fourth diagnostic (Fig. 1d) considers the properties of the inflow layer (Thompson et al. 2007). The effective inflow layer indicates the most likely layer of the atmosphere from which air parcels are ingested into the convective event. Thus, intuitively, if the base of the inflow layer is within the boundary layer there will be an element of surface forcing. The diagnostic is defined as a conditional relationship:

$$\begin{cases} 1, & \mathrm{EIB} > h, \\ 0, & \mathrm{EIB} \le h, \end{cases}$$

for the effective inflow layer base (EIB), defined as the lowest height (above sea level) in the layer in which certain CAPE and CIN thresholds are met: CAPE > 100 J kg$^{-1}$ and |CIN| < 250 J kg$^{-1}$, as in Thompson et al. (2007). The boundary layer height h is defined as a height above sea level to ensure a fair comparison with the EIB, and so is strictly the model-derived boundary layer height plus the model orography. This diagnostic, as with the first, is binary. Values of one imply elevated convection is likely to occur within that environment. A value of zero implies the convection (if likely to happen) is surface based.
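
The conditional relationship can be sketched in Python as below. This is an illustration only: the profile layout (a list of height, CAPE, and CIN values for parcels lifted from each model level) and the function names are assumptions, and the treatment of an EIB exactly equal to h as surface based is inferred from the text rather than stated in it.

```python
from typing import List, Optional, Tuple

CAPE_MIN = 100.0  # J kg^-1, threshold quoted in the text
CIN_MAX = 250.0   # J kg^-1, magnitude threshold quoted in the text

# One entry per model level: (height above sea level in m,
# CAPE and CIN of a parcel lifted from that level in J kg^-1).
Profile = List[Tuple[float, float, float]]


def effective_inflow_base(profile: Profile) -> Optional[float]:
    """Lowest height at which both thresholds are met, else None."""
    for height, cape, cin in sorted(profile):  # search from the bottom up
        if cape > CAPE_MIN and abs(cin) < CIN_MAX:
            return height
    return None


def inflow_layer_diagnostic(profile: Profile,
                            bl_depth_m: float,
                            orography_m: float) -> Optional[int]:
    """1 if the EIB lies above the boundary layer top, 0 otherwise.

    Returns None where no effective inflow layer exists.
    """
    eib = effective_inflow_base(profile)
    if eib is None:
        return None
    # Boundary layer height above sea level: model depth plus orography.
    h = bl_depth_m + orography_m
    return 1 if eib > h else 0
```

Adding the orography to the boundary layer depth before the comparison mirrors the requirement that both heights are expressed above sea level.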

Testbed analysis
The U.K. Testbed Summer 2021 was used to determine the practical use of the elevated convection diagnostics within current operational CAMs. Ten participants scored the diagnostics in post-event analysis during the testbed: six people in both July and September, with two of the six scoring both periods. A base participant also scored all the cases. This base participant had been conducting longer-term monitoring of these diagnostics prior to the testbed and can thus act as a measure for consistency between the different backgrounds of the participants, as there are not enough samples for statistical tests. We acknowledge that this is a small sample size (due to circumstances described in section 2a). However, as the idea is to get a general improved understanding of how the diagnostics perform in relation to each other, and given the range of cases, we feel the sample size is large enough for this purpose.
The scores were checked for consistency from the participants (i.e., the scores for the diagnostics when showing surface-based instead of elevated convection were treated consistently by the same participant). This checking ensures that the variation in subjectivity was genuine, rather than due to inconsistent scoring; it also allows for understanding of any cognitive biases that may be present (e.g., anchoring the scores on the performance of identification of elevated convection rather than all types of convection). Cognitive biases are considered as they can impact the interpretation of the survey results (e.g., Jahedi and Méndez 2014). For example, if anchoring was present, it could give an unfair disadvantage to those diagnostics that are good at identifying surface-based convection and not elevated convection; if the diagnostic does not indicate elevated convection, it could be more heavily penalized. It is worth noting that while some anchoring bias on elevated convection was present, particularly for the downdraft diagnostic, it had limited impact on the overall interpretation of the results.
To understand the practical ratings of the scores, the model performance for representing the type of convection is considered. Figure 2 shows the subjective scores of the current forecasting methods from the model against observations. As expected, and consistent with all convective forecasts (e.g., Done et al. 2006; Surcel et al. 2016), case-to-case variability is the dominant mode of variability (Fig. 2a). The average score across all the events for the model performance is three. The lowest scores are related to a heatwave breakdown (e.g., a blocking high is replaced by an unstable plume). The highest scores are related to events either existing within the domain beforehand or events associated with the frontal-overrunning mechanism, loosely agreeing with Weckwerth et al. (2019). September cases tended to score higher than the July cases as a result of different synoptic conditions and the predominance of long-lived convection or events coming in from the boundaries (Table A1 in the appendix).
Stratification of the data by participant background (Fig. 2b) shows that the operational meteorologists tended toward more positive scoring than the research participants, who were evenly spread across the different events. However, this trend is weak and within range of the base participant scores, suggesting that the groups' views on the model performance are consistent.
The outcome of the survey was practical use ratings for each of the diagnostics compared to current practice (section 2a). As previously discussed, all the diagnostics have a theoretical basis; therefore, a lower practical rating does not imply a bad diagnostic. Instead, a lower rating indicates a potential model deficiency.
Figure 3 shows the practical use ratings for the different diagnostics. Like the model performance (Fig. 2), some of the diagnostics, such as boundary layer properties (Figs. 3a,b), show large case-to-case variability (relative to the median value). Diagnostics based on the CAPE ratio and inflow layer properties scored more consistently (Figs. 3c,g). This consistency suggests reduced dependence on the representation of the convection in the model and a stronger dependence on the representation of the background environment. The dependence on convection or environment is likely to manifest in the spatial properties of the diagnostics (section 5b).
The summary of the ratings by testbed participant type (Figs. 3b,d,f,h) shows similar themes to the overall model performance (Fig. 2b). As with the model performance, there is a slight trend for the operational meteorologists to rate the diagnostics higher than researchers. The base participant again falls in the middle of the two distributions, suggesting consistency in the overall ratings of the scores between the participants.
Given the overall similarity between the scoring (Figs. 2 and 3), and the subjective nature of the survey, there is strong agreement on the preferred diagnostics (CAPE ratio and inflow layer properties). There is also agreement on the diagnostic with the lowest practical rating (downdraft properties). To gain insight into why certain diagnostics are preferred, a longer-term examination of the properties of the diagnostics is required. The preferences could be related to spatiotemporal properties of, or agreement between, the different diagnostics.

Properties of the elevated convection diagnostics
The U.K. Testbed Summer 2021 has been used to identify which diagnostics are preferred for practical use. Here, longer-term analysis focuses on three properties of the diagnostics identified from comments during the U.K. Testbed Summer 2021 and Fig. 1 to explain reasons behind the ratings. The three properties examined are (i) the agreement between the diagnostics; (ii) the spatial coverage; and (iii) temporal variation. These themes are examined for all forecasts between 17 July and 24 September 2021. The forecasts are initiated at 0000 UTC, with analysis occurring for lead times between T + 12 and T + 36 h (1200-1200 UTC).

a. Agreement between diagnostics
An important aspect to consider for diagnostics with similar purposes is their agreement. To ensure comparisons are as fair as possible all diagnostics are restricted to where convection is present in the model, based on maximum reflectivity exceeding 30 dBZ. Including all points where the diagnostics are defined reduces the agreement between the boundary layer properties diagnostic and the other diagnostics (not shown). This increased disagreement arises because the boundary layer diagnostic is only defined in the presence of convection; all other diagnostics are defined based on the environment and are not dependent upon convection being present in the model.
Figure 4 shows 2D histograms for the different diagnostic comparisons and their respective Kendall $\tau$ coefficient (Kendall 1938). This coefficient considers the relative quadrants in which the two quantities being correlated appear (e.g., for our situation both near one, one near one, one near zero, and both near zero). If there is stronger agreement, the coefficient provides values close to one; if there is disagreement it is closer to zero. All comparisons show some degree of agreement. This agreement arises, in part, due to the rarity of elevated convection compared to surface-based convection. The diagnostics with the strongest agreement are the boundary layer properties and CAPE ratio; CAPE ratio and inflow layer properties; and boundary layer and inflow layer properties. Reasons for the differences between the other diagnostics and the boundary layer properties diagnostic are related to slow-moving events cooling the boundary layer, as previously discussed (section 3a). Differences related to the CAPE ratio and the inflow layer properties are related to the threshold applied to the CAPE to define the inflow layer (e.g., Thompson et al. 2007) and lack of threshold applied to the CAPE ratio. The lack of threshold allows small CAPE values to be indicative of environments suitable for convection even if the instability is unlikely to be released.
The reasoning for the reduced agreement between the downdraft properties diagnostic and the other diagnostics is less straightforward. One difference will be related to the interpretation of the diagnostic at different stages of the convective life cycle (e.g., Market et al. 2017). This is most clearly seen in Fig. 4d, where most of the values are populated within the first column of the downdraft properties but spread evenly across the CAPE ratio. It is likely that many of these events refer to areas of mature convection in which the DCIN has been eroded. If the DCIN has eroded, the downdrafts can penetrate to the surface. Therefore, for mature convection the downdraft properties diagnostic becomes a hazard identification diagnostic rather than an elevated convection identification diagnostic. However, this may not be the only reason as to why there is reduced agreement between this diagnostic and the other diagnostics. Therefore, other properties of the diagnostics need to be examined to help determine the reason why this diagnostic is perceived to have reduced practical use.
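
To make the rank-agreement measure concrete, the following is a minimal pure-Python sketch of a Kendall coefficient for two diagnostics sampled at the same grid points. Because two of the diagnostics are binary and ties are therefore common, the sketch uses the tie-adjusted "tau-b" variant; which variant was used for Fig. 4 is not stated in the text, so this choice is an assumption. In general the coefficient ranges from −1 (complete disagreement in ranking) through 0 (no association) to +1 (complete agreement).

```python
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Tie-adjusted Kendall coefficient (tau-b) for paired samples.

    Simple O(n^2) pair counting, intended for illustration rather than
    for full model grids.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both: excluded from every count
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        return float("nan")  # one of the inputs is constant
    return (concordant - discordant) / denominator
```

As a usage example, comparing a binary diagnostic `[0, 0, 1, 1]` with a continuous one `[0.1, 0.2, 0.8, 0.9]` gives a coefficient of about 0.82: every untied pair is concordant, but ties in the binary field prevent the value from reaching one.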

b. Spatial properties of the diagnostics
The spatial properties of the diagnostics are considered here. Figure 5 shows the spatial distribution of the different diagnostics. Larger values imply elevated convection occurs more frequently within that region, per the diagnostics. As with the agreement, the diagnostics based on boundary layer properties, CAPE ratio, and inflow layer properties show greater agreement with one another compared to the diagnostic based on downdraft properties (cf. Figs. 5a,b,d with Fig. 5c). Common areas indicated to be preferential areas for the model to produce elevated convection are predominantly coastal areas or channels. There are no clear preferential areas over land. The CAPE ratio diagnostic strongly reduces the frequency over large urban areas (e.g., London, Birmingham, Dublin, Paris) and both the CAPE ratio and inflow layer properties reduce the frequency around orographic areas (e.g., the Highlands of Scotland). These reductions make physical sense, with orography leading to a greater chance of surface-based convection. Furthermore, processes such as the urban heat island leading to a longer-lived, well-mixed boundary layer will impact the convection over urban areas (e.g., Collier 2006).
In terms of the spatial distribution, the practically preferred diagnostics (Figs. 5b,d) appear smoother than the other diagnostics (Figs. 5a,c).
The downdraft properties appear particularly localized. To quantify this localization, object statistics (akin to cell statistics; e.g., Hanley et al. 2015) are presented in Fig. 6. Unlike the cell statistics, the object statistics allow grid box values to be defined as objects. Percentages of the frequency of grid box values and objects that are less than 25 grid boxes in area, hereafter unresolved objects, are presented in Table 2. An area of 25 grid boxes is chosen as this represents a size of $(5\Delta x)^2$, for the grid length $\Delta x$. This size represents the smallest scale at which it is expected atmospheric phenomena will be resolved in NWP models (e.g., Skamarock et al. 2014). From Fig. 6a and Table 2, the diagnostic describing the downdraft properties is the most localized. The other three diagnostics are similar in terms of their spatial distributions. For the downdraft properties diagnostic, 70% of events are a single grid point. Conversely, for the other diagnostics this percentage varies between 25% and 37%. The localization of the downdraft diagnostic helps explain its reduced agreement with the other diagnostics.
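
The object statistics can be illustrated with a short pure-Python sketch that labels contiguous regions of a thresholded diagnostic field and reports the percentage of objects below a given area. The function names, the use of four-point (edge-sharing) connectivity, and the mask representation are assumptions made for the example; the text does not specify how objects were identified.

```python
from collections import deque
from typing import List

# (5 * grid length)^2 expressed in grid boxes, as quoted in the text.
UNRESOLVED_AREA = 25


def object_sizes(mask: List[List[bool]]) -> List[int]:
    """Areas (in grid boxes) of contiguous True regions in a 2D mask.

    Assumption: grid boxes belong to the same object only if they
    share an edge (four-point connectivity).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited grid box.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                i, j = queue.popleft()
                size += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and mask[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            sizes.append(size)
    return sizes


def percent_smaller_than(sizes: List[int], area: int) -> float:
    """Percentage of objects whose area is strictly less than `area`."""
    if not sizes:
        return 0.0
    return 100.0 * sum(1 for s in sizes if s < area) / len(sizes)
```

With this sketch, `percent_smaller_than(sizes, 2)` gives the percentage of single-grid-point objects and `percent_smaller_than(sizes, UNRESOLVED_AREA)` the percentage of unresolved objects, where the mask would be built by comparing a diagnostic field with its chosen threshold.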
For the diagnostics based on CAPE ratio and downdraft properties, these percentages have been checked for sensitivity against the definition of elevated convection (Table 2). Greater sensitivity of spatial distribution to the definition of elevated convection (defined by an arbitrary threshold for each diagnostic, λ) arises at the gridpoint scale compared to the resolved scale, as would be expected for convection-permitting models (e.g., Roberts and Lean 2008).
Given the downdraft diagnostic is the most localized diagnostic and has a lower rating, it is likely that the physical process the diagnostic represents is not resolved in the model. To examine the extent of this idea we consider the downdraft properties and CAPE ratio as proxy diagnostics for downdrafts and updrafts, respectively. This is reasonable given Kirkpatrick et al. (2009), who note that, to some extent, the CAPE and DCAPE are related to the maximum updraft and minimum downdraft speeds, respectively. This can robustly be achieved by considering all convection (setting the threshold to identify objects in the diagnostics to zero). Figure 6b shows a similar qualitative structure compared to the elevated events (Fig. 6a). However, examination of the percentage values in Table 2 indicates a difference; those values suggest that approximately 35% of updrafts are a single grid point and 46% of downdrafts are a single grid point. Furthermore, considering all unresolved objects increases the percentages to approximately 89% and 99%, respectively. Therefore, there is a substantial proportion of both updrafts and downdrafts that remain unresolved at convection-allowing resolutions, in agreement with previous studies (e.g., Stein et al. 2015). Therefore, increasing the model resolution could improve the representation of convection, and as such could increase the practical use of the downdraft properties diagnostic in NWP. It is worth noting that the downdraft properties diagnostic is more sensitive at the gridpoint scale to the definition of elevated convection than the CAPE ratio. This may suggest the downdraft has an important role in the representation of elevated convection, or that downdrafts are smaller in elevated convection.
The spatial properties help show a preference for certain locations around the United Kingdom for the formation or presence of elevated convection: coastal regions. Thus, it is likely that land-sea interactions and the sea breeze (e.g., Fovell 2005) have an important role in the presence or maintenance of elevated convection over the United Kingdom. The spatial properties have also indicated that some of the diagnostics, while of theoretical value, are currently not practically useful in CAMs. For example, the downdraft properties diagnostic has less practical use in part due to CAMs being unable to resolve the downdrafts. Thus, fundamental processes in representing convection are currently missing and scale-aware convection parameterizations could be useful within CAMs.

c. Temporal properties of the diagnostics
Finally, the temporal properties of the diagnostics are considered. These are considered via the forecast evolution over a 24-h period to give an idea of the diurnal cycle. Figure 7 shows the forecast evolution for the four diagnostics. As expected, the peak in elevated convection activity is during the nocturnal hours (e.g., Geerts et al. 2017).

FIG. 6. Object-area relative frequency histograms for the different elevated convection diagnostics based on data from UKV forecasts initiated daily at 0000 UTC between 17 Jul and 24 Sep 2021 for forecast lead times between T + 12 and 36 h (1200-1200 UTC). Solid black lines represent convection over a stable boundary layer, dashed red lines represent the CAPE ratios, dotted blue lines represent downdraft properties, and magenta dot-dashed lines represent the diagnostic based on inflow layer properties. The panels are for (a) where elevated convection is defined as being equal to or above 0.99 in the diagnostics and (b) for all defined objects in the nonbinary diagnostics. The bin widths are one grid point.

TABLE 2. Percentage of objects of different areas (expressed in grid points) and the variation with a threshold (λ) to the definition of elevated convection. A threshold of 0 is used to consider all events as opposed to just elevated convective events. Forecasts considered were from the UKV initiated daily at 0000 UTC between 17 Jul and 24 Sep 2021 for forecast lead times between T + 12 and 36 h (1200-1200 UTC).
The boundary layer properties diagnostic shows a double peak in its evolution (Fig. 7a), one corresponding to around local sunset and the other (larger peak) in the early morning (approximately 0600 UTC). It is likely that this first peak is related to long-lived convection or convection forming over very shallow stable layers and may not be "truly" elevated (e.g., Nowotarski et al. 2011; Gropp and Davenport 2018). The second peak is likely a result of nocturnal formation or formation before the convective boundary layer has developed.
As with the spatial properties, the CAPE ratio and inflow layer properties (Figs. 7b,d) are the smoothest. It is notable that the CAPE ratio has a broader peak compared to the inflow layer properties.
Finally, the diagnostic based on the downdraft properties (Fig. 7c) is the most variable. It notably peaked at approximately T + 18 h (1800 UTC). This peak is likely linked to the reduced spatial coverage (as a result of the inability to resolve the convective downdrafts) and as such there are not enough events to produce a smooth diurnal cycle.
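The normalized forecast evolution discussed in this subsection (normalization by the mean value across the period, as in Fig. 7) can be sketched as follows. This is an illustrative outline only: the function name, the input layout (one list of per-lead-time values for each forecast), and the step of averaging across forecasts before normalizing are assumptions, as the exact aggregation used in the study is not reproduced here.

```python
def normalized_evolution(values_by_forecast):
    """Average a diagnostic across forecasts at each lead time, then
    divide by the mean of that average across all lead times.

    values_by_forecast: list of forecasts, each a list with one value
    per lead time (hypothetical layout chosen for this sketch).
    """
    n_forecasts = len(values_by_forecast)
    n_leads = len(values_by_forecast[0])
    if any(len(fc) != n_leads for fc in values_by_forecast):
        raise ValueError("all forecasts must cover the same lead times")

    # Mean across forecasts at each lead time.
    mean_by_lead = [
        sum(fc[k] for fc in values_by_forecast) / n_forecasts
        for k in range(n_leads)
    ]

    # Mean across the whole period, used as the normalization factor.
    period_mean = sum(mean_by_lead) / n_leads
    if period_mean == 0:
        raise ValueError("cannot normalize: the period mean is zero")

    return [value / period_mean for value in mean_by_lead]


# Illustrative (made-up) values: two forecasts, three lead times.
example = normalized_evolution([[0, 2, 4], [2, 2, 2]])
```

With this normalization, values above one indicate lead times where the diagnostic is more frequent than the period average, and values below one indicate lead times where it is less frequent.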
Thus overall, the temporal properties follow the same ordering from smoothest to most variable as the spatial properties. These are also linked to the practical rating, suggesting that the spatiotemporal smoothness of the CAPE ratio and inflow layer properties allowed these to be more trusted in a practical sense in current CAMs. Given the agreement between the two favored diagnostics, they are combined into a single diagnostic next.

Development of a combined diagnostic
The smoothest properties, both spatially and temporally, are the CAPE ratios and inflow layer properties. Considering the feedback from the U.K. Testbed Summer 2021 (section 4), these were the ones deemed to have the greatest practical use. However, the CAPE ratio diagnostic is perceived to have too much spatial coverage (due to the lack of thresholding the CAPE values) and there is no consideration of the height of the "elevated" instability. Filtering the CAPE ratio by the inflow layer properties to create a new diagnostic could mitigate these factors. The filtering occurs via the multiplication of the CAPE ratio by the inflow layer properties diagnostic (as the latter is a binary field). Thus, the filtered diagnostic represents the CAPE ratio where the inflow layer diagnostic is supportive of elevated convection. This filtered diagnostic varies continuously between zero and one: values near one represent elevated convection; values near zero represent surface-based convection.
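The filtering step described above, a pointwise multiplication of the CAPE ratio by the binary inflow layer field, can be sketched as below. The function name and the nested-list representation of the two gridded fields are assumptions for illustration; this is not the operational implementation.

```python
def filtered_cape_ratio(cape_ratio, inflow_flag):
    """Multiply a CAPE-ratio field (values between 0 and 1) by a binary
    inflow-layer field (0 or 1), grid point by grid point.

    Both arguments are 2D nested lists on the same grid (hypothetical
    layout chosen for this sketch).
    """
    if len(cape_ratio) != len(inflow_flag):
        raise ValueError("fields must have the same number of rows")
    filtered = []
    for ratio_row, flag_row in zip(cape_ratio, inflow_flag):
        if len(ratio_row) != len(flag_row):
            raise ValueError("fields must have the same number of columns")
        # Where the inflow flag is 0 the result is 0; where it is 1
        # the CAPE ratio is passed through unchanged.
        filtered.append([r * f for r, f in zip(ratio_row, flag_row)])
    return filtered


# Illustrative (made-up) 2 x 2 fields.
example = filtered_cape_ratio(
    [[0.9, 0.4], [1.0, 0.0]],
    [[1, 0], [1, 1]],
)
```

Because the inflow layer field is binary, the product keeps the nonbinary character of the CAPE ratio while restricting it to the area flagged by the inflow layer diagnostic.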
Figure 8 shows the combined diagnostic for the different properties previously considered. For the case study (Fig. 8a), it indicates the same areas as Fig. 1d but has the nonbinary values of Fig. 1b. The spatial distribution (Fig. 8b) shows a blend of the component diagnostics (Figs. 5b,d) with the same preferential points described as before. There is also a slight reduction in the lack of preference over large urban areas. The biggest difference is the reduction of preference in the southwest of the domain. This reduction comes from the CAPE ratio diagnostic and agrees with the model's preference to produce large SBCAPE values within that region (e.g., Flack et al. 2016), and CAPE climatologies (e.g., Romero et al. 2007; Riemann-Campe et al. 2009). The object statistics (Fig. 8c) agree with those presented in Fig. 6a for the two constituent diagnostics. The forecast evolution peak (Fig. 8d) is broader compared to the inflow layer properties diagnostic (Fig. 7d), but not as broad as the CAPE ratio (Fig. 7b).
Considering all properties of this combined diagnostic suggests that it has the potential to be a useful tool for operations and research going forward. This combined diagnostic allows multiple factors about the environment (including atmospheric profiles) of a region to be determined at the same time. This diagnostic was produced after the testbed; therefore, it would be beneficial to get feedback around the improvements from operational meteorologists. This further analysis could potentially come in the form of longer-term testing alongside the other diagnostics.

Summary
Elevated convection is a phenomenon where there is relatively little agreement on the use of a common diagnostic to uniquely characterize forecast performance or support model development (e.g., Thompson et al. 2007; Corfidi et al. 2008; Clark et al. 2012; Market et al. 2017). Given the wide range of initiation mechanisms (e.g., Trier and Parsons 1993; Moore et al. 2003; Browning et al. 2012; Parsons et al. 2019) this is perhaps unsurprising. The identification of elevated convection is not easily solved by one parameter; however, we present a step forward in this direction by characterizing four diagnostics to identify environments suitable for elevated convection within this paper. The diagnostics are based on boundary layer properties, CAPE ratios, downdraft properties, and inflow layer properties. All these diagnostics have a theoretical basis. However, they may not apply in all convective environments. Therefore, the emphasis of this paper has been to examine the practical use and characteristics of the diagnostics in CAMs. Specifically, the questions considered are as follows:
• Which diagnostics are practically useful in current CAMs?
• What are the properties of the diagnostics in CAMs?
• Do the properties of the diagnostics indicate any reasons behind why some diagnostics have less practical use?
These questions have been considered via the U.K. Testbed Summer 2021 and spatiotemporal analysis of the diagnostics for a 70-day period (17 July-24 September 2021). The following has been shown:
• The diagnostics based on the CAPE ratio and inflow layer properties are the (practically) preferred diagnostics.
(i) The localized nature of the downdraft properties diagnostic.
(ii) The diagnostic is of most use for identifying elevated convection at convective initiation (e.g., Market et al. 2017, 2019) and thus, given the nature of convection over the United Kingdom being advected in from different areas (e.g., Hand 2005; Hand et al. 2004), it is perhaps not as useful in the current domain.
Overall, the favored diagnostics consider the atmospheric profiles and are more spatially and temporally smooth. However, they both have limits: one is binary (inflow layer properties) and the other is perceived to have too much spatial coverage (CAPE ratio). Combining these diagnostics results in CAPE ratios where the MUCAPE is defined above the boundary layer, and thus produces a narrower definition for elevated convection.
There are several caveats to this study, including the "on demand" testbed approach limiting the cases used to test the diagnostics, so false alarm rates could not be identified. Furthermore, the diagnostics have yet to be quantitatively verified due to a lack of available diagnostics. Thus, further testing to include a larger selection of cases (to include potential false alarm situations) and verification is planned future work. There are further questions remaining, such as how "elevated" the elevated convection is, and whether other diagnostics such as moist static energy (Marquet 1993) or vertically integrated extent of realizable symmetric instability (Dixon et al. 2002) could also be useful diagnostics. Furthermore, changing the definition of the downdraft properties diagnostic to determine the impact of using different heights (such as the height of the lowest wet-bulb potential temperature) could be beneficial, and the inflow layer definition could be examined further to investigate the sensitivity of its definition of elevated convection.
The testbed and the characterization of the diagnostics have shown that current CAMs are unable to fully resolve elevated convection due to the size of updrafts and downdrafts. Therefore, diagnostics that are too localized by focusing on specific unresolved phenomena have less practical use. Two diagnostics (CAPE ratio and inflow layer properties) currently have clear practical benefits, and those, along with their combined form (filtered CAPE ratio by inflow layer properties), will be considered in longer-term testing. This longer-term testing will act to alleviate concerns around the "on-demand" and potential limited range of events from testbeds for potential development into an operational diagnostic. It is expected that the use of these diagnostics in future operations would be as a quick-glance complementary tool to aid the forecasting of severe convection. These diagnostics will also be used within a research context to help identify differences in the representation of elevated convection in different model configurations, including sensitivity to model physics and grid spacing.

Acknowledgments.
The authors thank all participants of the U.K. Testbed Summer 2021. The participants who completed the elevated convection survey and gave permission for their data to be used within this manuscript were Chris Bulmer, Abdullah Kahraman, Anne McCabe, Aurore Porson, Kristin Raykova, Jessica Renz, Nigel Roberts, William Rosling, David Walters, Brent Walker, and an anonymous participant. Further thanks go to Jessica Renz, William Rosling, and Brent Walker, who were the operational meteorologists that helped produce the current practices and observations maps. Further thanks go to Aurore Porson for facilitating the U.K. Testbed Summer 2021. We also acknowledge the valuable discussions with Sylvia Bohnenstengel, John Marsham, Douglas Parker, and Jon Petch around the manuscript and diagnostics. We further acknowledge the insightful comments from four anonymous reviewers, which have led to improvements to the manuscript.

FIG. 2. Violin plots for the UKV model performance using current forecasting practices to identify environments suitable for elevated convection compared to observations for (a) the cases and (b) all cases grouped by the participant background. The white dot shows the median, the black box shows the interquartile range, the black whiskers extend out to the 5th and 95th percentiles, and the colored region shows the PDF of the distribution. A score of 5 is a perfect forecast. Cases marked with an asterisk (*) were optional during the testbed.

FIG. 3. Violin plots for the practical use rating of the different diagnostics. A score of 3 is equivalent use to current forecasting practices, above 3 is more use compared to just using current practices, and below 3 is less use compared to the current practices. The panels are for the different elevated convection diagnostics that are based on (a),(b) boundary layer properties; (c),(d) CAPE ratios; (e),(f) downdraft properties; and (g),(h) inflow layer properties for (left) scores by cases and (right) scores by participant background. The white dot shows the median, the black box shows the interquartile range, the black whiskers extend out to the 5th and 95th percentiles, and the colored region shows the PDF of the distribution. Cases marked with an asterisk (*) were optional during the testbed.

FIG. 4. 2D relative frequency histograms showing the correspondence between the different diagnostics and their respective Kendall τ coefficient. The solid black lines are to give an indication of where the convection is expected to be elevated beyond that point, and the dashed lines for where the convection is most likely to be surface based beneath that point. These lines apply to the nonbinary diagnostics. The elevated convection diagnostics comparisons are based on (a) boundary layer properties vs CAPE ratios, (b) boundary layer properties vs downdraft properties, (c) boundary layer properties vs inflow layer properties, (d) CAPE ratios vs downdraft properties, (e) CAPE ratios vs inflow layer properties, and (f) downdraft properties vs inflow layer properties. The color scale applies to all plots. The histograms use all values where the model identifies convection from UKV forecasts initiated at 0000 UTC daily between 17 Jul and 24 Sep 2021 for lead times between T + 12 and 36 h (1200-1200 UTC).

FIG. 5. Temporal averages of the different elevated convection diagnostics between T + 12 and 36 h (1200-1200 UTC) for UKV forecasts initiated at 0000 UTC daily between 17 Jul and 24 Sep 2021. Zero values have been included. The panels depict the elevated convection diagnostics based on (a) boundary layer properties, (b) CAPE ratios, (c) downdraft properties, and (d) inflow layer properties. Note the different color scales for each of the panels; in all the panels a larger (blue) value implies the environment is more suitable for elevated convection.

FIG. 7. The normalized forecast evolution between T + 12 and 36 h (1200-1200 UTC) from UKV forecasts initiated at 0000 UTC daily between 17 Jul and 24 Sep 2021. The forecast evolution has been normalized by the mean value across the period. The panels show the diagnostics based on (a) boundary layer properties, (b) CAPE ratio, (c) downdraft properties, and (d) inflow layer properties. Larger values imply increased frequency of elevated convection, and smaller values imply reduced frequency of elevated convection. The gray shading represents the nocturnal hours between approximately 1900 and 0500 UTC (T + 19-T + 29); note local time is UTC + 1 h.

FIG. 8. The filtered CAPE ratio by inflow layer properties for (a) 0600 UTC 14 Sep 2021 initialized from 0000 UTC 14 Sep 2021, (b) the temporal average between 17 Jul and 24 Sep 2021 between T + 12 and 36 h (1200-1200 UTC) from forecasts initialized daily at 0000 UTC, (c) object area relative frequency histogram for the same period as used for (b) using bins of width of 1 grid box, and (d) the normalized forecast evolution between T + 12 and 36 h for the same period as used in (b); the forecast evolution has been normalized by the mean value across the period, and the gray shading represents the nocturnal period [approximately between 1900 and 0500 UTC (T + 19-T + 29)], with the local time equivalent to UTC + 1 h.
Data availability statement. The data used are from operational output of the Unified Model at Operational Suite 44. Permission to use the data from the U.K. Testbed Summer 2021 from the participants has been acquired before use. The data have been anonymized before use. All participants have been thanked in the form that they requested within the acknowledgments. The data are available by contacting the corresponding author and subject to licensing constraints.

APPENDIX

An Overview of U.K. Testbed Summer 2021 Cases

Table A1 provides the dates, forecast initiation times, and a brief description of the cases for which the elevated convection survey was completed during the U.K. Testbed Summer 2021.

TABLE 1. A summary of the main attributes of the four diagnostics to identify elevated convection.
• The preferred diagnostics are both spatially and temporally smooth in comparison to the diagnostics based on boundary layer and downdraft properties.
• Treating the downdraft properties diagnostic as a proxy for downdrafts, and the CAPE ratio as a proxy for updrafts, indicates that a substantial proportion of downdrafts and updrafts are unresolved in CAMs (approximately 99% and 90%, respectively). This shows that the downdrafts are less likely to be resolved, which theoretically agrees with observational studies that have shown downdrafts to have smaller diameters than updrafts (e.g., Yang et al. 2016).
• Combining the results and theoretical properties of the downdraft properties diagnostic identifies two main factors that result in the reduced practical use within the UKV:

TABLE A1. An overview of the different cases for which the diagnostics were tested during the U.K. Testbed Summer 2021. Cases with a (*) denote optional cases.