Connecting Land–Atmosphere Interactions to Surface Heterogeneity in CHEESEHEAD19

: The Chequamegon Heterogeneous Ecosystem Energy-Balance Study Enabled by a High-Density Extensive Array of Detectors 2019 (CHEESEHEAD19) is an ongoing National Science Foundation project based on an intensive field campaign that occurred from June to October 2019. The purpose of the study is to examine how the atmospheric boundary layer (ABL) responds to spatial heterogeneity in surface energy fluxes. One of the main objectives is to test whether lack of energy balance closure measured by eddy covariance (EC) towers is related to mesoscale atmospheric processes. Finally, the project evaluates data-driven methods for scaling surface energy fluxes, with the aim to improve model–data comparison and integration. To address these questions, an extensive suite of ground, tower, profiling, and airborne instrumentation was deployed over a 10 km × 10 km domain of a heterogeneous forest ecosystem in the Chequamegon– Nicolet National Forest in northern Wisconsin, United States, centered on an existing 447-m tower that anchors an AmeriFlux/NOAA supersite (US-PFa/WLEF). The project deployed one of the world’s highest-density networks of above-canopy EC measurements of surface energy fluxes. This tower EC network was coupled with spatial measurements of EC fluxes from aircraft; maps of leaf and canopy properties derived from airborne spectroscopy, ground-based measurements of plant productivity, phenology, and physiology; and atmospheric profiles of wind, water vapor, and temperature using radar, sodar, lidar, microwave radiometers, infrared interferometers

L and-atmosphere exchanges of energy, water, and carbon influence weather and climate.The biological processes that mediate these exchanges with the atmosphere occur at multiple spatial and temporal scales, necessitating cross-scale observational platforms.Accurate accounting of land-atmosphere interactions is critical for improving the performance of numerical weather and climate models.Unfortunately, there is a persistent mismatch between the scales of observations and models.This scale mismatch is problematic because the land surface exhibits substantial heterogeneity, which means that observations are not always accurate reflections of the entire model grid cell.Furthermore, the atmosphere is strongly influenced by nonlinear two-way interactions with radiation, land cover, and soil, so that the spatial and temporal scaling of surface fluxes is fundamental to assessing the parameterizations used in atmospheric models to represent land-atmospheric interactions.
The notion that land surface heterogeneity influences the surface energy balance-and resulting atmospheric responses-emerged from early model simulations showing the importance of soil moisture, vegetation, albedo, roughness, and heating on the atmosphere (Garratt 1993;Mahrt 2000;Betts et al. 1996;Charney 1975;Avissar 1995;Pielke et al. 1998).Theories on how land surface variations drive atmospheric boundary layer (ABL) growth vary (e.g., Desai et al. 2006;Reen et al. 2014;Platis et al. 2017;Gantner et al. 2017), with no consensus on whether responses scale linearly or nonlinearly and whether they differ for dry versus moist dynamics (Raupach and Finnigan 1995).Modeling studies have been developed from limited sets of observations of prior field experiments and from specialized modeling domains using simplified boundary conditions (e.g., Kang et al. 2007;Hill et al. 2008Hill et al. , 2011;;Zhu et al. 2016).From these previous studies, scaling laws have been derived based on numerical simulations (van Heerwaarden et al. 2014;Rihani et al. 2015), but a systematic regional-scale observational experiment that quantifies the multiscale nature of subgrid scaling and patterning has never been realized (Steinfeld et al. 2007).
An issue related to how heterogeneity influences transport processes in the ABL is the energy balance closure problem.This refers to an observed tendency in eddy covariance (EC) flux measurements in which the sum of incoming available energy [net radiation

E423
(R N ) minus ground heat flux (G)] exceeds surface turbulent sensible and latent heat fluxes (H S and H L ) over subhourly time scales (Foken et al. 2011).Systematic studies have ruled out instrument errors as the primary cause (Twine et al. 2000;Liu et al. 2011).Incomplete observation of submeasurement height storage flux accounts for only some of this lack of closure (Leuning et al. 2012;Xu et al. 2018).Advection terms are not expected (in most cases) to have a systematic direction that would always lead to lack of closure (e.g., Aubinet et al. 2010;Barr et al. 2013;Nakai et al. 2014;Zitouna-Chebbi et al. 2012;Eder et al. 2015), while topography contributes mostly in extreme cases (McGloin et al. 2018).
EC sites with more variable land cover tend to have larger closure imbalances (Stoy et al. 2013;Z. Xu et al. 2017).One proposed hypothesis for lack of closure in the energy budget is that surface heterogeneity generates mesoscale features not adequately resolved by traditional EC methods (e.g., Charuchittipan et al. 2014;Gao et al. 2016;Foken et al. 2011;Mauder et al. 2007b).An intensive suite of energy flux measurements between surface and atmosphere at the mesoscale, on the order of tens of kilometers, can help address this key uncertainty in land-atmosphere exchange (Xu et al. 2020).

Experimental goals
The Chequamegon Heterogeneous Ecosystem Energy-Balance Study Enabled by a High-Density Extensive Array of Detectors 2019 (CHEESEHEAD19) was designed to provide a new level of observation density and instrumentation reliability to test hypotheses on spatial heterogeneity and atmospheric feedbacks.The two main research objectives for the CHEESEHEAD19 experiment were 1) to investigate causes of energy balance nonclosure over heterogeneous ecosystems and 2) to address the problem of scaling surface energy fluxes.
There is currently no definitive answer as to what is responsible for energy balance nonclosure.CHEESEHEAD19 was designed specifically to test the hypothesis that heterogeneity is responsible for generating organized (sub)mesoscale structures that are not resolved by traditional EC methods.
Various theories suggest that "spatial" EC, where multiple towers are combined to estimate the mesoscale contribution to the total flux, could be used to analyze this contribution and "close" the energy balance (Steinfeld et al. 2007;Mauder et al. 2008b).To calculate spatial fluxes, CHEESEHEAD19 deployed an EC tower network and airborne EC measurements.These measurements provide spatial patterns of surface energy fluxes across various vegetation and surface types in the heterogeneous landscape.Alongside this EC flux network, multiple platforms were deployed to characterize the atmospheric environment by profiling relevant atmospheric characteristics across a range of scales.This allows us to determine the existence of and characterize the nature of organized mesoscale structures.We can investigate the degree to which mesoscale eddies are responsible for energy balance nonclosure in EC measurements, and whether land surface energy partitioning and atmospheric responses differ from the sum of their individual components.
To systematically address surface energy balance variability in the heterogeneous forested landscape, a precampaign large-eddy simulation (LES; see the appendix for a list of acronyms used throughout the paper) analysis of the study domain was conducted.It was found that, while 12 flux towers would be sufficient to adequately sample land-cover variation, >15 flux towers are required to sample mesoscale eddy structures and close the energy budget; a similar result to Steinfeld et al. (2007).Therefore, the CHEESEHEAD19 field campaign deployed 20 flux towers, a marked increase over previous experiments.
CHEESEHEAD19 asks how we can optimally observe and simulate the terms of the surface energy balance and the corresponding atmospheric responses to heterogeneous surface forcings.The objective is to evaluate methods for scaling surface energy fluxes, with the aim of improving model-data comparisons.To this end, we conduct LES and machine-learning

E424
scaling experiments to simulate submesoscale responses.These will be compared to measured quantities to test theory and to improve our understanding of how scale-dependent transport processes in the lower atmosphere respond to surface heterogeneity.The dataset collected during this study will help test multiple scaling methodologies across heterogeneous land cover.Specifically, it aims to test the environmental response function-virtual control volume (ERF-VCV) approach (Metzger 2018;Xu et al. 2018), which combines the strengths of both data-driven and mechanistic strategies.
Several additional research objectives are addressed by using the unique data resources of CHEESEHEAD19.These include a separately funded study to use CO 2 fluxes of Integrated Surface Flux System (ISFS) towers and hyperspectral imagery of canopy functional traits to determine the principal drivers of variation in net primary production (NPP) and carbon use efficiency across a broad array of forest ecosystems.Additionally, concurrent measurements of ozone (O 3 ) mixing ratios at 30 and 122 m on the tall tower were made using a chemical ionization time-of-flight mass spectrometer (CI-ToFMS; TOFWERK AG and Aerodyne Research Inc.) and a photometric analyzer (Model 49i; Thermo Fisher) to obtain vertical O 3 profiles above the forest canopy (Bertram et al. 2011;Novak et al. 2020).These measurements were accompanied by flights of a small unoccupied aircraft system (sUAS)-mounted lightweight O 3 monitor (POM; 2B) that obtained vertical concentration gradients.These measurements are being used to determine the relative contributions of stomatal uptake and other nonstomatal loss pathways to O 3 deposition within a mixed forest canopy.

The experiment
Overview.CHEESEHEAD19 investigators deployed an extensive suite of ground, tower, profiling, and airborne instrumentation over a 10 km × 10 km domain in a forested and aquatic landscape in northern Wisconsin, United States (Fig. 1; Table 1), centered on the existing Park Falls 447-m-tower AmeriFlux/NOAA supersite (US-PFa/WLEF).The main components of the CHEESEHEAD19 field campaign were 1) ground-based fluxes and meteorology, 2) airborne fluxes and meteorology, 3) atmospheric profiling, and 4) surface environment characterization.
The EC tower network consisted of 17 towers from the NSF Lower Atmosphere Observing Facility (LAOF) ISFS, two additional towers, and the central tall AmeriFlux tower, US-PFa.Ground-based measurement of vegetation occurred at 41 plots in the domain, plus an additional 10 plots for measuring phenology.Airborne spectroscopy imaging was used to map leaf chemistry and canopy properties.
The suite of atmospheric profiling instruments included the LAOF Integrated Sounding System (ISS;Fig. 2c,Fig. ES1 in the online supplemental material) and the University of Wisconsin (UW) Space Science and Engineering Center Portable Atmospheric Research Center (SPARC) system (Fig. 2a).Additional instrument systems were brought by collaborators, including the combined ATMONSYS lidar for measuring aerosol, T, and H 2 O profiles and two Doppler wind lidars brought by Karlsruhe Institute of Technology (KIT), two Collaborative Lower Atmospheric Profiling Systems (CLAMPS, from NOAA/NSSL), two 915-MHz radar wind profilers with radio acoustic sounding systems (RASS) with microwave radiometers (MWRs, from NOAA/PSL), and the Surface Radiation Budget Network (SURFRAD, from NOAA/GML) systems for measuring incoming and outgoing radiation and cloud properties.While many of these instruments were located within the 10 km × 10 km CHEESEHEAD19 domain, some instruments were located at the Prentice and Lakeland airports, located approximately 45 km

E427
south and east of the WLEF tower, respectively, to provide information on the spatial variability of boundary layer structure and cloud and radiation fields.Three 7-day intensive observation periods (IOPs) occurred on 7-13 July, 18-24 August, and 22-28 September.During these IOPs the University of Wyoming King Air (UWKA) flew transects over an extended 30 km × 30 km domain to measure EC fluxes, ABL depth, and atmospheric profiles of water vapor and temperature.These observations will be used to test flux tower scaling, observe atmospheric mesoscale patterning, and evaluate LES.Also, during the IOPs, a team from NOAA/ARL/Atmospheric Turbulence and Diffusion Division (ATDD) brought multiple sUASs for measuring profiles of meteorological variables (T, H 2 O, U, P; see the appendix for a list of variables used in this paper) and land surface temperature.Additional information on the spatial variations of surface meteorology was obtained using mobile observing systems operated in pedestrian, boat, and car modes.
The 4-month deployment spanned the summer to fall transition, capturing the shift in surface energy balance from a more uniform evapotranspiration (latent heat flux) dominated landscape to a patchy sensible heat flux dominated landscape.These energy balance shifts arise from seasonal changes in plant phenological phases, ecosystem water use for photosynthesis, and available net radiation.These shifts provide a "natural experiment" with which to test hypotheses on how heterogeneity influences energy balance closure and spatial scaling.
The study domain was partly chosen due to the history of atmospheric science research in the region.Since 1995, university and NOAA investigators have sampled greenhouse gas profiles, meteorology, and EC flux measurements (energy, carbon, momentum) at 30, 122, and 396 m above ground level (AGL; Fig. 2b) on the WLEF tall tower (Bakwin et al. 1998;Davis et al. 2003).The site also includes a Fourier transform infrared solar-pointing spectrometer (TCCON) for total greenhouse column observations operated by CalTech and NASA JPL.Two additional EC towers (US-WCr, 30 m in mature forest, and US-Los, 10 m in

E428
shrub fen wetland) have been operating for 20 years, approximately 20 km from the tall tower (Cook et al. 2004;Desai et al. 2005;Sulman et al. 2009).The scales of atmospheric motions at the site are characteristic of a midlatitude continental site-with synoptic westerlies dominating and weather systems passing through on a multiday period, associated with the movement of the midlatitude jet.Being 75 km south of Lake Superior and 225 km west of Lake Michigan, the Great Lakes do not typically influence mesoscale motions at the site.CHEESEHEAD19 builds upon previous tower mesonet experiments, including BOREAS (Sellers et al. 1995), CASES99 (Poulos et al. 2002), SGP97 (Desai et al. 2006), IHOP (Kang et al. 2007), LITFASS-2003(Beyrich et al. 2006), EBEX (Oncley et al. 2007), BEAREX (Alfieri et al. 2012), HiWATER-MUSOEXE (Wang et al. 2015), SCALE-X (Wolf et al. 2017), and GRAPEX (Kustas et al. 2018), that were aimed at understanding scaling of nonlinear land-atmosphere interaction.

Instrumentation and measurements
Ground-based measurements.Towers sampled three-dimensional wind velocity, temperature, moisture, and CO 2 at 20 Hz to measure land-atmosphere fluxes (τ, H S , H L , F CO2 ).Each tower also measured net radiation, soil heat flux at 5-cm depth (and soil temperature profiles, heat capacity, and moisture to determine soil heat storage), and included a three-level air temperature and humidity profile to estimate canopy heat storage.A majority of the sites were forested and had flux instruments mounted 33 m AGL (Fig. 3; Table ES2).Instruments for wetland, grass, and lake sites were mounted between 1 and 3 m AGL to maintain consistent vegetation within the flux footprint.Tower placement within the 10 km × 10 km study domain followed a stratified random grid pattern, taking into account practical considerations including distance to road, a suitable gap in trees for a tower, U.S. Forest Service (USFS)-owned land, etc.Individual towers were an average of 1.4 km from their nearest neighboring tower and an average of 3.5 km from the tall tower.This meant that under certain conditions (e.g., high wind speeds, stable stratification) several of the towers shared overlapping flux footprints; a favorable condition for applying some of the data-driven scaling methods used in the project.Additionally, the semi-random placement meant that the towers were

E429
not chosen by distributing the towers in the centers of the most homogeneous areas of the various land-cover types.Thus, within the individual footprint of each tower there was often spatial variability in vegetation height and type (deciduous vs evergreen).While this can complicate analyses of flux measurements, it generates more representative data from these types of mixed forests.Furthermore, we expect it will enhance the ability of the data-driven methods for estimating domain-wide fluxes.
A suite of high-quality radiation sensors was deployed in the ISS field as a complement to the net radiometers installed on each flux tower.The full suite included high-quality upwelling and downwelling broadband surface radiation measurements to determine the surface radiation budget, as well as ancillary measurements of meteorological parameters, photosynthetically active radiation (PAR), and clouds as described in Table ES3.Radiation measurements were manually screened and then processed through an automated data quality procedure (Long and Shi 2008).Clear-sky radiation fluxes are estimated using the Radiative Flux Analysis method (Long and Ackerman 2000;Long and Turner 2008), from which derivation of cloud radiative effects as well as other data products such as fractional sky cover (Long et al. 2006;Dürr and Philipona 2004) and cloud optical depth (Barnard and Long 2004;Niple et al. 2016) are calculated.Measurements of cloud properties will allow us to quantify their impacts on the radiative and turbulent heat fluxes to better understand the two-way coupling between cloud-radiative interactions and boundary layer evolution, and to investigate the effect on EC nonclosure.
A smaller suite of radiation, cloud, and surface meteorological measurements were deployed at the Prentice and Lakeland airports, approximately 45 km south and east from the ISS field, respectively (Fig. 1), to characterize the larger-spatial-scale inhomogeneities.These measurements include downwelling shortwave and longwave irradiance as well as diffuse and direct components of shortwave irradiance (Table ES3); sufficient information to derive cloud radiative effects and fractional sky cover using the Radiative Flux Analysis method described above.Ceilometers deployed at the two airport sites provided additional cloud and boundary layer information.
airborne measurements.During each IOP the UWKA flew over the study area to measure spatial EC fluxes of heat, water vapor, and CO 2 .The purpose of the airborne observations was to test flux tower scaling and observe atmospheric mesoscale patterning.The UWKA also measured cross-sectional profiles of water vapor and temperature below the flight level using a downward-pointing Compact Raman Lidar (CRL; Wu et al. 2016) and ABL depth with the upward-looking Wyoming Cloud Lidar (WCL; Wang et al. 2009).
Flights over the domain occurred on 4 days during each of the three IOPs (Table ES4).On each day there were two 3-h flights, one in the morning (1400-1700 UTC) and one in the afternoon (1900( -2200 UTC) UTC).Flights consisted of ten 30-km down-and-back transects across the domain.The first leg of each transect was flown at 400 m AGL, while the return leg was flown at 100 m AGL.Flight transects alternated between straight and diagonal passes.

E430
Three different flight patterns were determined prior to the experiment (oriented SE-NW, SW-NE, and W-E; Fig. 4).Flying them either in forward or reverse order resulted in six distinct flight sequences that maximize data coverage under different wind conditions (see the sidebar).The main objectives were to maximize 1) the number of independent atmospheric eddies and 2) surface flux footprint observed by the aircraft EC measurements, while 3) ensuring crew safety.This was achieved by designing a parsimonious set of only three flight patterns that allowed the UWKA to fly perpendicular to the prevailing winds within a range of ±45° on any given day (Metzger et al. 2021).The 30-km flight legs extended an average of 10 km beyond the domain to compute a robust mesoscale eddy flux (Mauder et al. 2007a(Mauder et al. , 2008a) ) by capturing enough eddies and mesoscale variation to properly compute statistics for fluxes using the wavelet decomposition method.
The low-altitude legs were primarily used to measure EC fluxes.The altitude 100 m AGL was chosen to ensure flux measurements were made in the surface layer, as well as to minimize flux footprint errors over the 10 km × 10 km sampling domain.It was also the lowest altitude deemed safe to fly, as canopy height extended up to 35 m.The low-altitude legs were also used to identify ABL depth with the upward pointing 355 nm WCL.The primary purpose of the high-altitude legs (400 m AGL) was to map temperature and moisture profiles of the atmosphere with the CRL.These data were collected to estimate mesoscale development and calculate flux divergence and storage terms.atmospheric profilinG.Characterizing the mesoscale environment over the study domain was accomplished with a range of platforms and instruments to measure profiles of wind, water vapor, temperature, aerosols, and gases at different temporal and spatial scales (Fig. 1; Table 1).
The NCAR ISS was located in a field 1.6 km west of the tall tower (45.946°N, 90.294°W).It deployed a radar wind profiler, sodar-RASS, ceilometer, all-sky camera, and a surface meteorology station to measure ABL depth, winds, water vapor, and temperature.The 449-MHz modular wind profiler measured 30-min wind profiles with 150-m vertical resolution up to several kilometers AGL, while the sodar-RASS was capable of higher resolution (20 m; 10 min), but only penetrated to ~400 m AGL.Meteorological profiles were also measured with 172 radiosonde launches (daily 1800 UTC soundings and 3-4 additional soundings on IOP days).These instruments characterized the ABL from nocturnal boundary layer (sunrise sounding), through ABL development (midmorning and afternoon), to peak ABL (late afternoon sounding).In mid-September, one of the MWRs located at the Prentice Airport was relocated to this location, due to the failure of the AERI at the tall tower site in early September.
Several profiling systems were deployed at the base of the tall tower.SPARC (Wagner et al. 2019) was located 50 m north of the WLEF tower and was equipped with an Atmospheric Emitted Radiance Interferometer (AERI, a zenith-pointing infrared radiometer; Knuteson et al. 2004), a high spectral resolution lidar (HSRL; Eloranta 2005), and a ceilometer.Profiles of boundary layer temperature and humidity were retrieved from the AERI radiance observations (Turner and Löhnert 2014;Turner and Blumberg 2019).The HSRL sampled ABL aerosol backscatter and depolarization ratio at 532 and 1,064 nm.The ceilometer provided an additional measure of ABL depth.
The ATMONSYS system was placed beside the SPARC system, measuring atmospheric water vapor, temperature, and aerosol.The primary light source of the ATMONSYS lidar is a 100-Hz diode pumped Nd:Yag laser with the harmonic generation of 532 and 355 nm.The 532-nm light (P ≃ 27 W) is used for optical pumping a Ti:Sapphire laser, generating 817 nm (P ≃ 2 W) for water vapor profiling with the high-resolving DIAL (differential absorption lidar) method as well as for profiling aerosol backscatter.The 355-nm light is used for temperature profiling from rotational Raman backscatter.The system setup as installed during CHEESEHEAD19 F E B R UA RY 2 0 2 1 E431 (Vogelmann et al. 2020) allows for spatial sampling of 7.5 m and integration times of 20 s for aerosols and water vapor measurements and 300 s for temperature profiling.
In the field to the east of the trailers were three Doppler wind lidars.One lidar (LVS) measured in vertical stare mode throughout the measurement campaign.The other two lidars (LA, LB) were placed 90 m away from the LVS and made range-height indicator (RHI) scans (66°-87° elevation angle) pointing toward the LVS.This setup constitutes a virtual tower that provides vertical wind speed measurements and calculates average horizontal wind speed at multiple height levels above the LVS (Calhoun et al. 2006;Klein et al. 2015;Wulfmeyer et al. 2018).Additionally, the collocation of lidars for measuring 3D winds, temperature, and water vapor facilitates calculation of flux profiles of τ, H S , and H L , as well as flux divergence (Wulfmeyer et al. 2016).
Two precipitation instruments [a Precipitation Imaging Package (PIP) and a Micro Rain Radar Pro (MRRPro; Metek GmbH)] were installed at WLEF.The PIP is a video disdrometer system that records information about hydrometers and produces end user products such as particle size distributions, fall speeds, and rain rate at 1-min resolution (Newman et al. 2009;Pettersen et al. 2020a,b).The MRRPro is a 24-GHz, frequency modulated continuous wave, vertically profiling Doppler radar (Klugmann et al. 1996) that is used for observations of both rain (i.e., Peters et al. 2002) and snow (Kneifel et al. 2011).
Additional thermodynamic profiling systems were operated at the Prentice and Lakeland airports throughout the experiment to characterize the boundary layer variability and evolution around the CHEESEHEAD19 domain.The primary motivation of these two profiling sites was to characterize the mesoscale transport and role of advection on the ABL mass balance.At each location, a 915-MHz wind profiler with radio acoustic sounding system was deployed together with a multichannel MWR.These instruments provided profiles of horizontal wind and temperature, and low-vertical-resolution profiles of water vapor.
Prior to IOP 3, two mobile CLAMPS facilities (Wagner et al. 2019) were deployed at Prentice and Lakeland.The systems contained a Doppler lidar wind profiler, an AERI, and an MWR.The information content in the AERI observations is higher than in the MWR, and thus the retrieved water vapor and temperature profiles have better vertical resolution and accuracy (Löhnert et al. 2009;Blumberg et al. 2015).The Doppler lidars complemented the radar wind profilers, providing higher-temporal-resolution and higher-vertical-resolution measurements than the radar systems up to 1.5 km AGL on average, while the radars were able to extend wind profiles up to 3.0 km on average, albeit at coarser resolution (Table ES1).
Two sUASs were flown to characterize surface and near-surface conditions (Fig. ES2).During IOP1 (IOP2), a DJI S-1000 (e.g., Lee et al. 2019) was flown adjacent to the SW2 tower (WLEF tall tower) to quantify the variability in surface sensible heat flux (e.g., Lee et al. 2017).During all three IOPs, the Meteomatics Meteodrone SSE sUAS was used to sample the evolution of near-surface profiles of temperature, moisture, and wind up to 213 m AGL, which was the maximum altitude to which we could operate our sUAS per our cooperative agreement with the Federal Aviation Administration (FAA).Additionally, the Meteodrone SSE was used to sample the horizontal variability in temperature, moisture, and wind fields over a ~100 m × 100 m box surrounding the SW2 and WLEF towers.Over the three IOPs, 26 (103) flights were conducted with the DJI S-1000 (Meteodrone SSE).
surface environment.Data describing the ecological environment were collected to provide the boundary conditions of canopy type, activity, and stress, needed for estimating scaling properties.This was done with a variety of methods, including airborne imaging spectroscopy, ground-based phenological characterization, and tree growth measurements.
Foliar functional traits such as leaf mass per area (LMA) and nitrogen concentration strongly influence photosynthetic capacity and plant growth (i.e., NPP) (Niinemets 2001; Kattge et al. 2009) and can be mapped using imaging spectroscopy (a.k.a.hyperspectral remote sensing; Kampe et al. 2011;Singh et al. 2015).To map foliar functional traits across the domain a full-range imaging spectroscopy system comprising two co-aligned imagers (VNIR-1800 and SWIR-384; HySpex, Skedsmokorset, Norway) was operated from a Cessna 210 at 1,400 m AGL on 4 days (26 June, 11 July, 4 August, 30 August), producing images with 1-m spatial resolution.The HySpex collects 474 bands with a spectral resolution of 3.26 nm in the VNIR (400-1,000 nm) and 5.45 nm in the SWIR (1,000-2,500 nm).
Extensive ground-based vegetation samples were collected to support the hyperspectral image analyses.These included 41 plots in the domain for measuring tree species (400+ trees), root growth, tree height, diameter at breast height (DBH), NPP, biometry, and leaf area index (LAI).In addition, 122 top-of-canopy foliar samples were collected to estimate leaf level function traits following the protocol from Serbin et al. (2014).
In combination with an existing extensive database of foliar traits and image spectra (Wang et al. 2020), we will use the 122 foliar samples to develop and validate 1-m-resolution maps of numerous foliar functional traits hypothesized to influence NPP (including LMA, nitrogen concentration, chlorophyll and other pigments, phosphorus, nonstructural carbohydrates, fiber and lignin, and phenolics) for all four dates of the hyperspectral image collection.From this, we will test the relationship between functional traits and gross primary production (GPP; as derived from towers) and peak-season integrated NPP (early July to early September, derived from the 41 plots).We will generate 1-m maps of NPP and GPP and identify the foliar factors that most influence each.
Additional plots were used to measure vegetation phenology as it changed through the season, building upon several years of previous phenological observations collected in the domain.Autumn tree leaf color and fall phenology levels were visually observed and recorded at least twice weekly over 6 weeks during the senescence period (1 September to 25 October) for a group of 214 individual trees (at 10 sites distributed over the 10 km × 10 km area) that were representative of the major species.
Forest canopy structure was characterized using an sUAS-based lidar system (Routescene; Edinburgh, Scotland) acquiring high density point clouds (500 points per square m) within footprints from 11 CHEESEHEAD19 flux tower sites including aspen, pine, poplar, larch, cedar, and hardwood forests (e.g., Antonarakis et al. 2014).Areas surveyed ranged between F E B R UA RY 2 0 2 1 E433 0.25 and 1 km 2 per site.Additional canopy information for the entire domain came from leaf-off lidar from USFS sampling (1 m × 1 m resolution) conducted for the three counties that comprise the study area between 2014 and 2017.
Land surface temperature (LST) is a key environmental driver of the surface energy balance (e.g., Anderson et al. 2008;Metzger et al. 2013;K. Xu et al. 2017).Spatially explicit LST can be acquired from satellite remote sensing (Fig. 5).However, there are tradeoffs in space and time resolutions such that no single sensor provides sufficient resolution for use as a land surface driver to map heat fluxes across space at subkilometer and hourly time steps required for the hypotheses here.Also, remote sensing methods may not be able to distinguish between true surface temperature and upper canopy temperature.Here, we are investigating multisensor fusion using a combination of in situ thermal drone and infrared camera imagery, ECOSTRESS, Landsat, VIIRS, and/or GOES (Wu et al. 2013).
Data analysis and modeling.Two analysis approaches have been proposed to test the hypotheses of this study.The first is the application of ERF-VCV-a data-driven approach that can be used to account for the dispersive fluxes missed by single-tower EC measurements, and to upscale fluxes across the CHEESEHEAD19 domain (Metzger 2018;Xu et al. 2018Xu et al. , 2020)).ERF-VCV uses a machine learning algorithm to find relationships between measured fluxes and their meteorological and surface drivers within the flux footprints (see sidebar).
The second analysis approach will be to perform LES for the IOP days using the Parallelized LES Model (PALM) (Raasch and Schröter 2001;Maronga et al. 2015Maronga et al. , 2020)).Building on previous LES work (Stoll et al. 2020), we will emulate airborne and tower-mounted flux observations to compare them against the "real world" observations with the ability to also evaluate flux footprints using Lagrangian particle modeling (Steinfeld et al. 2008), radiation footprints, and storage fluxes at various locations and points in time.To simulate the physical processes as observed during the IOPs of the field experiment as realistically as possible, we will assume realistic topography for the experiment site, and apply a land surface model (LSM) with a coupled soil and radiation model, as well as a plant canopy model (PCM).Both models are built into PALM.The use of the LSM and PCM runs instead of prescribed surface fluxes will allow us to study land-atmosphere feedbacks such as self-reinforcement of mesoscale circulations over the heterogeneous study domain.The LSM will be set up for each IOP test case, with land-use classes, soil, and vegetation data as observed during the field experiment.Further, in order to account for synoptic-scale processes during the IOPs (e.g., advection of air masses with different characteristics) we will nest the LES domain into a larger-scale model.
One proposed goal is to derive a parametric heterogeneity correction of dispersive fluxes by setting up virtual towers within the LES, applying the correction to CHEESEHEAD19 tower flux field data, and evaluating the correction with ERF-VCV flux grids.Therefore, tower-level turbulence characteristics will be simulated as observed during the field campaign to investigate the energy balance nonclosure problem.Additionally, by emulating real-world measurements we intend to help interpret the observations-such as giving hints where secondary circulations occur or how far heterogeneity signals extend downwind.

Preliminary results
Over the course of the 4-month study period the region exhibited light winds (diurnal means from 1 to 4 m s −1 ) from all directions, with the most prevalent direction being southwesterly.Air and soil temperatures decreased over the period, while soil moisture increased (Figs. 6a,b).Daily mean net radiation decreased over the course of the study, which showed a direct relationship with ABL height (measured as the height of the inversion on the daily 1800 UTC radiosonde

Continuity through Environmental Response Functions
CHEESEHEAD19 disentangles how land surface heterogeneity relates to atmospheric transport of mesoscale eddies, which contributes to the discrepancy between EC flux observations and model predictions.We strive to create a new class of observational flux data product that reconciles energy balance closure biases on the order of 10% (Chen et al. 2011;Foken et al. 2011) and reveals actual surface emissions.For nonuniform exchange surfaces such as in CHEESEHEAD19, this requires us to evaluate the conservation of mass and energy continuously in time and space throughout the study domain (e.g., Finnigan 2008).However, even intensive field instrumentation campaigns such as CHEESEHEAD19 cannot produce observations everywhere, all the time.Here, environmental response functions (ERF; Metzger et al. 2013;Metzger 2018) can help attain the necessary information continuum from individual observation plots to model grid scale.To achieve this, ERFs complement information across disciplines and observation types by using a machinelearning algorithm to find relationships between measured fluxes and their meteorological and surface drivers within the flux footprints (Fig. SB1a).This provides a powerful approach not only for post-field data synthesis, but already in the experiment planning stage, e.g., in combination with large-eddy simulations (Fig. SB1b).Maximizing scientific return on experimental investment (Fig. SB1c; Metzger et al. 2021) is one example of how ERFs can help close the circle among obtaining "knowledge from data" and "data from knowledge" (Reichstein et al. 2019).F E B R UA RY 2 0 2 1 E435 launches; Fig. 6c).One of the most relevant seasonal changes with respect to energy balance was the change in the daytime Bowen ratio (H S /H L ) which averaged 0.5 in the summer and 1.0 in the fall, with the latter period having more variability than the former (Fig. 6d).Diurnal cycles of sensible and latent heat flux show that latent heat flux is much larger in the summer when the canopy is fully evapotranspiring compared to the fall, when senescence of broadleaf trees reduces H L , allowing H S to comprise a larger share of the total heat flux over the region (Figs.6e-h).
As is typical for EC measurements, we observed energ y f luxes that were lower in magnitude than the available energy (R N − G), when averaged across all sites (Fig. 7).The magnitude of the energy balance residual (C EB ) was largest during the daytime, when incoming solar radiation was highest.The reverse in sign of C EB from day to night in part can be attributed to heat storage in the canopy.However, the magnitudes of the daytime values are larger than the nighttime values, which results in a daily mean imbalance.
The energy balance residual peaked under conditions of low turbulence (Fig. 8).It is during such periods of calm wind and strongly unstable stratification in which thermally induced mesoscale eddies resulting from landscape-scale heterogeneity are expected (Steinfeld et al. 2007).This lends support to the hypothesis that mesoscale eddies are responsible for the energy balance nonclosure.
Landscape heterogeneity was observed for a range of environmental variables, including vegetation, canopy height, surface temperature, and energy fluxes.Variability in surface H S was quantified by combining tower measurements with in situ measurements

E436
of air temperature and land surface temperature from the DJI S-1,000, following Lee et al. (2017).The example from 12 July 2019 showed significant temperature and H S variability; temperature differences of 10°C and H S differences of 100 W m −2 over the 500 m × 500 m area surrounding the SW2 tower (Fig. 9).Spatial variability in temperature and H S around the tower were directly related to heterogeneity in local surface characteristics.
Landscape heterogeneity in vegetation spectral characteristics and canopy height were captured from downwardlooking airborne remote sensing instruments (Figs.10b,c).False color HySpex imagery is being used to differentiate plant functional types at 1 m × 1 m resolution.Canopy structure was measured with the Routescene lidar at 11 leaf-on flux sites (Fig. 10c) and across the entire domain from the State of Wisconsin leaf-off lidar dataset.These data are being used to identify surface roughness in the flux footprints of the EC towers.In addition, these spatial data are being used as input drivers within the ERF-VCV machine-learning approach.
There is also spatial variation in the energy balance components across the domain on a typical day (Fig. 11).This variability includes the relative weighting of latent and sensible heat fluxes, as well as the magnitude of the energy balance residual.The mean energy balance closure [calculated as (H S + H L )/(R N − G)] across all the sites over the entire study period was 0.8.This is typical for EC towers and supports the need for the advanced methods put forth by this study.
To address this spatial and temporal variability we are testing different types of spatial EC techniques, which have been suggested as a means of mitigating errors arising from single-site, temporal EC (Steinfeld et al. 2007;Mauder et al. 2008b;Engelmann and Bernhofer 2016).Using LES, Xu et al. (2020) found that standard spatial EC improved closure over standard temporal EC, while a combined spatiotemporal method performed better still.Further, by applying the ERF-VCV approach, the energy balance was found to be almost completely closed.
Spatial fluxes were calculated for aircraft data and across the tower network.The spatial fluxes for aircraft data were calculated using a wavelet decomposition.This dataset provided good spatial coverage but had limited temporal resolution-even though with 72 flight hours spread across 12 days, it is one of the largest airborne EC datasets collected to date.Spatial fluxes were also calculated for the 20 flux towers, which provided a continuous flux record through the campaign.Flux footprint calculations for 26 September 2019 show that

E437
spatial coverage of the towers, including WLEF, covered roughly 8% of the domain (Fig. 12; using Kljun et al. 2015).This is a significant increase compared to a single tower set up (typically <<1% of a 10 km × 10 km domain).This is important because it provides sampling over a wider range of physical environments.By combining the tower and aircraft EC datasets we had excellent coverage (~80%) of the study domain on flight days (Fig. 12).Both flux datasets are  Last, the characterization of the ABL and identification of mesoscale eddies is being performed using lidar measurements of wind, water vapor, temperature, and backscatter.Figure 13 shows an example of this on 24 September 2019.Increasing water vapor through the day is representative of a large-scale warm, wet air mass entering the domain (Figs.13c,d, Fig. ES3a).This characterizes the variation in water vapor throughout the collection of the morning UWKA CRL dataset (Fig. 13a).The afternoon CRL dataset (Fig. 13b) shows a more evenly mixed ABL, with variation in water vapor due to local pockets of relatively moist and dry air.These two examples show the varying applications of the CRL data depending on the atmospheric environment, with the afternoon flight illustrating the potential of the dataset for determining the degree of ABL heterogeneity arising from surface heterogeneity.Further analysis will investigate relationships with underlying vegetation and LST.
Around 1200 UTC (0700 local time) net radiation becomes positive (Fig. 13g) and soon after we see the breakup of the surface inversion (Fig. 13d).Around 1400-1500 UTC the ABL grows and is followed by development of large-scale structures revealed by strong oscillations in vertical wind speed (±2 m s −1 ; Fig. 13e).During peak hours the angle of attack of the wind vectors oscillate between roughly −10° and 15° on time scales of 10 min to an hour.These angles far exceed those of the underlying terrain, suggesting that these periodic updrafts and downdrafts are the result of mesoscale eddies.

E439
Around 1900 UTC the domain clouds over, seen in R N and backscatter (Figs.13f,g, Fig. ES3b).This causes the strength of the oscillations in vertical wind to decrease (Fig. 13e), which coincides with a change in the relative weighting of the different energy balance components, with both R N and H S decreasing strongly, while H L decreases only slightly (Fig. 13g).An increase in R N around 2000 UTC corresponds to a reversal of these trends.Further analyses will investigate the prevalence of this result across the entire dataset and examine specific drivers and possible implications for EC energy balance closure.These datasets show that changes in ABL development are closely tied to changes in the surface energy fluxes, highlighting the potential research applications of the CHEESEHEAD19 data.

Educational outreach
Several public events were conducted to introduce and communicate the science goals and objectives of the project.These include a pre-experiment communitywide public presentation at the Park Falls Public Library and a summer open house at several sites, enabling members of the community to visit data collection locations, meet CHEESEHEAD19 team members, and participate in demonstrations of the instruments (Fig. 14).CHEESEHEAD19 team members also participated in sur veys and in training on fieldwork bullying and sexual harassment prevention (Fischer et al. 2019, manuscript submitted to Bull. Amer. Meteor. Soc.).
The project also worked with two local school groups, one from the Butternut, Wisconsin, K-12 school and another from Chequamegon High School of Park Falls, Wisconsin, to include them as supporting data collectors.The GLOBE (Global Learning and Observations to Benefit the Environment) program trained Butternut K-12 students and a teacher to collect land-cover classification data, soil properties, and atmospheric data at seven of the tower sites at multiple times throughout the summer.The high school group installed 10 tree temperature sensors at 5 of the forest flux tower sites, which are being used to estimate biomass heat storage.We also hosted two undergraduate university field classes (University of Wisconsin-Madison and University of South Carolina), which conducted independent research projects on micrometeorology and carbon cycling.

Data and code availability
The database of observations and models is currently online and freely available to the community and public for general use or for further scientific investigation.The datasets and supporting information (including site characterizations, project logistics, and setup) have been gathered together in the NCAR Earth Observatory Laboratory (EOL) data repository which can be accessed through the project web page at www.eol.ucar.edu/field_projects/cheesehead.The project has open data and code policies, in which other researchers are encouraged to use CHEESEHEAD19 resources for their own research.The policies can be accessed through the above web page.
Additionally, data are stored and are being used for in-depth analysis and modeling purposes on the NSF-funded cloud computing platform CyVerse, with the goal of having a central location for users to bring their code to the data in a way that maintains data and code provenance for collaborative, multi-user projects.Additional information about the project, including descriptions of the sites, photographs, and data plots can be found on the CHEESEHEAD19 website, located at www.cheesehead19.org.

Conclusions
The data collected during the CHEESEHEAD19 field campaign show a distinct seasonal shift in surface energy fluxes, as well as spatial patterning that appears to be directly related to the characteristics of the underlying surface environment.Consequently, the imbalance in the energy budget displays both temporal and spatial variability, with the imbalance becoming larger under periods of low turbulence.The broad coverage of the measured fluxes using the 20-tower network and airborne EC, combined with the collection of spatial data of surface characteristics like LST, vegetation type, and canopy structure, will enable thorough investigation of the causes of energy balance nonclosure.Additionally, the suite of atmospheric profiling instrumentation characterizes the mesoscale structure of atmospheric flows over the study domain to an unprecedented degree, helping to determine how mesoscale eddies contribute to measured imbalances.The observational dataset provided by CHEESEHEAD19 enables machine-learning approaches and LES for testing hypotheses on scaling and parameterization of subgrid processes in mesoscale meteorological models.Findings emerging from this project are expected to have broad implications for heterogeneous terrestrial regions beyond the specific study domain.

FFig. 1 .F
Fig. 1.Maps and schematic diagram of CHEESEHEAD19 domain.(a) Map of the study area in Wisconsin.(b) Map of the location of all measurements made during the field campaign.Insets show Lakeland and Prentice airports where SURFRAD (in addition to the one in ISS field), radar wind profilers with RASS, and CLAMPS systems were installed.(c) Schematic diagram shows instrument location and a conceptual model of airborne data collection.Figure ES1 shows detailed maps of instrument locations at ISS and WLEF sites (Wisconsin Department of Natural Resources 2019).

Fig. 3 .
Fig. 3. (a) EC tower SW1-an example of the 33 m AGL telescoping towers deployed by NCAR ISFS and (b) EC instruments mounted at the top of tower SW2.

Fig. 2 .
Fig. 2. (a) HSRL beam next to WLEF tall tower, (b) EC instruments at 396 m AGL on the WLEF tall tower, and (c) the ISS field with modular wind profiler, MWR, ceilometer, sodar-RASS, SURFRAD, EC, and meteorological towers with UWKA flying overhead and WLEF tall tower in the distance.

Fig. 4 .
Fig. 4. The location (superimposed) of all 480 UWKA flight legs completed during the CHEESEHEAD19 field campaign.The yellow square represents the study domain and the red dots indicate the flux tower locations.

Fig. 5 .
Fig. 5. Land surface temperature on 15 Jun 2019 from (a) ERA5 and (b) derived from Landsat-8, where subgrid spatial resolution is present, but temporal resolution is low at one image every 8 days (Gerace et al. 2020; Landsat 8 data courtesy of the U.S. Geological Survey; ERA5 data generated using Copernicus Climate Change Service Information 2020).

Fig. SB1 .
Fig. SB1.(a) ERFs augment sparse response observations (e.g., tower and aircraft EC) with abundant driver observations (e.g., meteorological stations and satellites).High-rate time-frequency decomposition and source area modeling facilitate data joins among these response and driver observations at minute and meter scale.Machine learning then extracts a driver-response process model from the resulting space-and time-aligned dataset.Ultimately, this driver-response process model complements the properties of response and driver observations in the response data product.In the present example these are meter-scale sensible heat flux maps, which can be used to more reliably evaluate the conservation of energy across the nonuniform CHEESEHEAD19 experiment domain.(b) During the experiment planning stage we used LESs to create synthetic atmospheres over the CHEESEHEAD19 domain for different synoptic conditions.We simultaneously sampled the synthetic atmospheres as observed by different virtual experiment designs.Each experiment design resulted in a separate set of virtual observations that we independently processed through the ERFs in (a).(c) We benchmarked the different experiment designs against their ability to reproduce the LES reference in the form of flux grids that ERF reconstructed from the virtual observations alone.Identifying the optimal experiment design not only allowed us to double the scientific return on experimental investment, but also to simplify flight plans and increase crew safety.For additional details, see the full study by Metzger et al. (2021).

Fig. 6 .
Fig. 6.Daily (24 h) mean across all ISFS towers of (a) temperature and relative humidity, (b) soil moisture at 5-cm depth and rain, (c) net radiation and ABL height (measured at ISS field), and (d) Bowen ratio (daytime only).Aerial view of site NE2 on (e) 12 Jul 2020 and (f) 9 Oct 2020.Diurnal cycles of sensible and latent heat averaged across all ISFS sites for the weeks of (g) 7-14 Jul and (h) 4-11 Oct.

Fig. 8 .
Fig. 8. Daily mean energy balance residuals (C EB ) normalized by net radiation minus ground heat flux (R N − G) plotted against friction velocity (u * ) for all ISFS EC towers for the entire CHEESEHEAD19 dataset (excludes individual towers on days without complete quality-controlled data).

Fig. 7 .
Fig. 7.The diurnal cycle of energy balance components: latent and sensible heat flux (H L and H S ), ground flux (G), net radiation (R N ; where negative represents net incoming radiation), and energy balance residual (C EB ), averaged across all flux towers over the entire study period.

Fig. 9 .
Fig. 9. (a) Surface temperature and (b) H s from a downward-pointing infrared camera flown on the DJI S-1000 sUAS surrounding the SW2 tower between 1504 and 1518 UTC 12 Jul 2019.(c),(d) As in (a) and (b), but between 1614 and 1628 UTC 12 Jul 2019.H s is computed following Lee et al. (2017).As the technique requires an initial H s to derive the variability in H s and H s was unavailable from SW2 on 12 Jul, H s at SW2 was estimated using a linear regression with data from nearby towers.The mean plus and minus one standard deviation is shown at the bottom of each panel.

Fig. 10 .
Fig. 10.Surface maps showing spatial variation around tower site SW2 in (a) surface temperature measured by the DJI S-1000 (as in Fig. 10a), (b) vegetation spectral characteristics measured by the HySpex shown as a false color image (849 nm, red; 1,650 nm, green; 2,217 nm, blue), and (c) surface/canopy height measured by the sUAS Routescene lidar.

Fig. 11 .
Fig. 11.Average daily mean energy balance pie charts for the flux towers over the entire study period.The pie chart with the cyan outline (bottom center) was a buoy EC system deployed on a small lake where G was not measured.

Fig. 13 .
Fig. 13.(a),(b) CRL cross sections of H 2 O mixing ratio [cut to domain size; color bar in (c) is for (a)-(c)] for each of 10 legs on research flights 17 and 18 (at 1351-1626 and 1911-2131 UTC 24 Sep); time series profiles at WLEF tall tower on 24 Sep 2019 of (c) H 2 O mixing ratio and (d) T measured by the ground-based MWR, (e) vertical wind speed from wind lidar LVS, and (f) 532-nm backscatter from the ground-based HSRL.(g) Stacked energy balance components: available energy (R N − G), sensible and latent heat fluxes (H S and H L ), and energy balance residual (C EB ), averaged across all EC towers, with C EB during the passing of a moist synoptic system [see (a)-(c)].C EB increases at the breakup of the morning inversion [see (d)], is highest when mesoscale eddies are present midday [see (e)], and dips when clouds form in the afternoon [see (f)].

Fig. 14 .
Fig. 14. Bill Brown (just right of the radiosonde balloon) describing the capabilities of the ISS facility during the community open house.

Table 1 . List of instruments and data collected during CHEESEHEAD19. Definitions of variable abbreviations are located in the appendix. Profiling instrument measurement ranges and resolutions are in Table ES1. Footnotes in Measured variables and Period columns correspond with Location(s). Data source Data provider Location(s) Measured variables Period Ground-based measurements
a Measurements were taken at ISS field.bMeasurements were taken at Prentice airport.cMeasurements were taken at Lakeland airport.Unauthenticated | Downloaded 03/21/24 10:43 AM UTC