The National Ecological Observatory Network (NEON) is a multidecadal and continental-scale observatory with sites across the United States. Since NEON entered its operational phase in 2018, its data products, software, and services have become available to facilitate research on the impacts of climate change, land-use change, and invasive species. An essential component of NEON is its set of 47 tower sites, where eddy-covariance (EC) sensors are operated to determine the surface–atmosphere exchange of momentum, heat, water, and CO2. EC tower networks such as AmeriFlux, the Integrated Carbon Observation System (ICOS), and NEON are vital for providing the distributed observations needed to address interactions at the soil–vegetation–atmosphere interface. NEON represents the largest single-provider EC network globally, with standardized observations and data processing explicitly designed for intersite comparability and analysis of feedbacks across multiple spatial and temporal scales. Furthermore, EC is tightly integrated with soil, meteorology, atmospheric chemistry, isotope, phenology, and rich contextual observations such as airborne remote sensing and in situ sampling bouts. Here, we present an overview of NEON’s observational design, field operation, and data processing that yield community resources for the study of surface–atmosphere interactions. Near-real-time data products are available from the NEON Data Portal, and EC and meteorological data are ingested into AmeriFlux and FLUXNET globally harmonized data releases. Open-source software for reproducible, extensible, and portable data analysis includes the eddy4R family of R packages underlying the EC data product generation. These resources strive to integrate with existing infrastructures and networks, to suggest novel systemic solutions, and to synergize ongoing research efforts across science communities.
The National Ecological Observatory Network (NEON) makes it possible to study interactions between biotic and abiotic systems across previously inaccessible space and time scales.
OBSERVATORY PURPOSE AND DESIGN.
Environmental science has long been successful at answering questions, discerning processes, and developing theories regarding the state of ecosystems and the services they provide. As environmental change accelerates, there is an increased need for both monitoring and developing predictive capacity across a wide range of scales (Dietze et al. 2018; Heffernan et al. 2014; Kuhlman et al. 2016; Peters et al. 2014; Soranno and Schimel 2014). For example, six out of seven “grand challenges in the environmental sciences” highlighted by the National Research Council (2001; Table ES1 in the online supplement, available at https://doi.org/10.1175/BAMS-D-17-0307.2) relate to environmental forecasting. As a result, environmental science increasingly finds itself confronted by questions that involve larger geographic areas and more extended time periods, much like synoptic-scale meteorology and climatology (Peters et al. 2014; Soranno and Schimel 2014). This shift has led to the concept of the “macrosystem”: hierarchical ecological systems comprising biological, geophysical, and social components, which interact with one another across fine to broad scales (Folke et al. 2011; Heffernan et al. 2014). The National Academy of Sciences (2013) as well as Canadell et al. (2000) and Sutherland et al. (2013) further highlight the need for confronting questions at the macrosystem scale with the necessary distributed observations. For example, Swann et al. (2018) describe the continental consequences of regional forest die-offs via ecoclimatic teleconnections. This illustrates the kind of macrosystem-scale ecological question that coordinated observation networks make it possible to address.
Coordinating and linking distributed observations has long been recognized as necessary to monitor and predict large-scale changes in the terrestrial biosphere (Running et al. 1999). To generate observational datasets covering a wide range of spatial and temporal scales, environmental science has thus far relied upon several strategies: 1) synthesize smaller existing datasets; 2) compile data from remote sensing platforms; 3) link spatially distributed sensor measurements that share compatible methods; and 4) implement large observatories that span continental scales, use standardized methods, and are designed to enable scaling and address macrosystem research questions (Richter et al. 2018; Soranno and Schimel 2014). Bottom-up, self-organizing networks of independent research stations such as FLUXNET, Long Term Ecological Research (LTER), and Critical Zone Observatory (CZO) have performed strategies 1–3 to reveal continental and even global-scale patterns and relationships of ecosystem functioning (e.g., Baldocchi 2008; Gosz et al. 2010; Guo and Lin 2016; Novick et al. 2018). This success has led to the advent of centrally managed observatories (strategy 4) to realize further benefits of standardization and coordinated observations (Franz et al. 2018). The National Ecological Observatory Network (NEON) is one such observatory, designed to act as a single, coordinated continental-scale instrument to generate, store, and share data relevant to monitoring and predicting environmental change (Collinge 2018; Peters et al. 2014; Richter et al. 2018).
As early as 1999, ecologists and biologists discussed the need for a Biodiversity Observation Network (BON). By 2000, the idea had developed into a more comprehensive “ecological observatory network”: an instrument to advance the ability of scientists to examine and understand the interactions between life and the environment across the United States. Through 2005 a series of workshops concretized the idea and completed an initial plan for NEON in 2006. Over the next five years the National Science Foundation (NSF), the National Science Board and Congress formally reviewed and revised the NEON design, plan, and budget, and in 2011 approved NEON construction: the first life science project solely constructed with NSF Major Research Equipment and Facilities Construction (MREFC) funding. Between 2011 and 2019 NEON established independent meteorological, soil, organismal, biogeochemical, and freshwater sampling at 81 research sites, accompanied by three airborne remote sensing observatories, a central operating facility, and a cyberinfrastructure center. In 2019 NEON entered full operations, funded by the NSF Division of Biological Infrastructure (DBI) and managed by Battelle. It continuously produces over 175 data products at various temporal and spatial scales over a 30-yr timeframe.
The NEON geographic design was based on a statistically rigorous analysis using algorithms for multivariate clustering (Hargrove and Hoffman 1999, 2004). Based on national datasets for ecoclimatic variables, the United States, including Hawaii, Alaska, and Puerto Rico, was partitioned into 20 ecoclimatic domains (Fig. 1). These domains capture the full range of U.S. ecological and climatic diversity as well as distinct regions of vegetation, landforms, and ecosystem dynamics. In each domain, a core (30 yr) site that represents the predominant “wildlands” ecosystem acts as an anchor for additional research sites designed to address specific environmental or human-dimension questions (e.g., land use, management, disturbance, or recovery) within the respective domain. A map and list of all NEON field sites with links to individual site descriptions are accessible online (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/sites).
This NEON stratified design links diverse approaches and technologies in a coordinated and standardized framework crucial to understanding regional- and continental-scale events. For example, Mahecha et al. (2017) compared extreme event detection rates between two in situ measurement networks, NEON and AmeriFlux (Novick et al. 2018): with only one-third of the number of AmeriFlux sites at the time, NEON would still have captured two-thirds of the AmeriFlux-detected extreme events that occurred in the United States from 2009 to 2014. This speaks to the efficacy of a systematically designed top-down network to detect change at the macrosystem scale, and to add previously underrepresented ecoclimates to the joint site distribution (Fig. 1). Conversely, bottom-up networks like AmeriFlux provide a rich mechanistic, site-based, and technical understanding that NEON alone cannot supply. For example, efforts to evaluate uncertainty in flux records linked to the choice of instrumentation should remain a hallmark of this approach (Novick et al. 2018). Indeed, AmeriFlux scientists have been at the forefront of reviewing NEON designs through leadership in NEON’s Technical Working Groups (TWGs) and Science, Technology, and Educational Advisory Committee (STEAC). In addition, bottom-up networks will always have the advantage of greater flexibility to redesign sites and instrumentation in response to emerging questions and technology (e.g., methane flux). Thus, bottom-up and top-down networks can fill complementary niches: bottom-up networks provide testbeds for new scientific inquiry, and top-down networks facilitate the identification of patterns and relationships at the macrosystem scale. For this reason, NEON developments have focused on both the robustness of our systems and their capacity for community development and interoperability with existing and future networks.
The 47 terrestrial tower sites shown in Fig. 1 (blue points) are an essential component of NEON and are the focus of the following sections. There, eddy covariance (EC; Aubinet et al. 2012) sensors are operated to determine the surface–atmosphere exchange of momentum, heat, water, and CO2, alongside meteorology, atmospheric composition, and soil sensor assemblies. Here, we present an overview of NEON’s terrestrial tower sites and associated workflows to serve as a citable guide for future users interested in leveraging NEON resources. This systemwide overview also provides a framework for the detailed documentation and technical discussion of individual components elsewhere (e.g., Metzger et al. 2016, 2017; Starkenburg et al. 2016) and forthcoming insights from their combined use. Topics covered within the remainder of this overview include the location and design of tower infrastructure and the architecture of the NEON turbulent exchange and storage exchange assemblies. We then provide a summary of closely related assemblies (i.e., meteorological, atmospheric composition, soil sensing), and linkages with contextual NEON systems across disciplinary boundaries. Additionally, we discuss elements of the data processing pipeline, data dissemination, and the integration of NEON into the broader landscape of scientific networks.
SITE INFRASTRUCTURE AND SENSOR DEPLOYMENTS.
NEON’s terrestrial tower sites host the terrestrial instrument system (TIS) with a focus on the pedosphere–biosphere–atmosphere interface (Fig. 2). Through NEON’s integrated design, TIS measurements are collocated and coordinated with NEON’s terrestrial observation system and airborne observation platform. A separate set of 34 aquatic sites is typically adjacent to the 47 TIS locations and connected through watersheds. These sites host aquatic instrument and observation systems with a focus on the hydrosphere–biosphere interface. In addition, a mobile deployment platform is available for rapid response to natural phenomena or researchers’ interests and can be requested by the research community for separate investigations.
Tower location and design.
The stratified observatory design process yields target ecosystems within each NEON domain that maximize the scientific return on investment for operating a TIS network across the United States. Subsequently, NEON personnel identified candidate locations for each domain–ecosystem combination and assessed them for technical, logistical, and scientific feasibility. In addition to spatial representativeness of the target ecosystem (minimum 80% flux footprint contribution with a design goal of 90%; Leclerc and Foken 2014), cross-network synergies through collocation were essential considerations in this process (Fig. 3).
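The footprint criterion above can be illustrated with a short calculation. The following is a minimal sketch, not NEON's siting procedure: it assumes a gridded flux footprint (relative weights per cell) and a matching land-cover mask are already available, and the function names and toy grids are hypothetical. Only the 80% minimum and 90% design goal are taken from the text.

```python
def target_contribution(weights, is_target):
    """Fraction of total footprint weight that falls on target-ecosystem cells.

    weights   -- 2D list of non-negative footprint weights (hypothetical input)
    is_target -- 2D list of booleans, True where the cell is the target ecosystem
    """
    total = 0.0
    on_target = 0.0
    for weight_row, mask_row in zip(weights, is_target):
        for weight, target in zip(weight_row, mask_row):
            total += weight
            if target:
                on_target += weight
    if total <= 0.0:
        raise ValueError("footprint weights must sum to a positive value")
    return on_target / total


def classify_site(fraction, minimum=0.80, goal=0.90):
    """Compare a contribution fraction with the minimum and design-goal thresholds."""
    if fraction >= goal:
        return "meets design goal"
    if fraction >= minimum:
        return "meets minimum"
    return "below minimum"


# Toy 3 x 3 footprint centered on the tower; three corner cells are not
# part of the target ecosystem (values are illustrative only).
weights = [[0.05, 0.10, 0.05],
           [0.10, 0.40, 0.10],
           [0.05, 0.10, 0.05]]
is_target = [[True, True, False],
             [True, True, True],
             [False, True, False]]

fraction = target_contribution(weights, is_target)
print(round(fraction, 2), classify_site(fraction))
```

In practice the footprint would come from a source area model and would vary with wind direction and atmospheric stability, so the fraction would be evaluated over a climatology of footprints rather than a single grid.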
Among other benefits, collocation provides interoperability anchor points through comparison of variables that are collected in multiple networks. This facilitates their joint, unbiased use in a network of networks also at non-collocated sites, thus improving total ecoclimatic coverage (e.g., Novick et al. 2018). Where collocated, such redundancy of variables can be further utilized as multiple constraints for a best estimate of the truth with substantially reduced uncertainty. Data fusion approaches such as ensemble Kalman filtering are one example (e.g., Rastetter et al. 2010). In other cases, observations complement each other across networks. Here, broader and deeper environmental insights are made possible through an expanded range of data products that can be used synergistically (e.g., Baatz et al. 2018). In this way, previously inaccessible, cross-disciplinary questions become tractable, such as how changes in biotic species distribution and abundance influence biogeochemical cycles with feedbacks to the atmosphere. The sidebar “Environmental relationships and continuous information” below provides an example of such complementary use of data products across networks.
Here, we portend an “environmental science future through cohesive research resources” to elucidate fundamental processes at the surface–atmosphere interface: we join NEON’s coordinated biological observations, tower-based flux and meteorological data and airborne remote sensing at the University of Kansas field site (UKFS; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/ukfs) with satellite remote sensing and regional reanalysis (Fig. SB1). For this purpose, we use the environmental response function information discovery approach (Metzger 2018; Metzger et al. 2013). As a result, insights on landscape-scale processes among environmental drivers and responses become accessible, and the spatiotemporal relationship between fluxes and stocks can be assessed consistently across a wide range of space and time scales.
The tower design itself strives to trade off safety, stability, durability, accessibility, thermal mass, reflectivity, harmonic, and flow distortion requirements (Munger et al. 2012). The result is a galvanized-steel walk-up tower with fall-restraint system, and a tower base of 2 m × 2 m to mimic natural ecosystem structures and openings for most forest ecosystems found across NEON sites (Fig. 4). The height of the tower, corresponding to the measurement height of the turbulent exchange assembly (see “Turbulent exchange assembly” section below) is a site-dependent function of the ecosystem structure. For short-stature ecosystems (e.g., grasslands, shrublands, or crops) with canopy height hc lower than 3 m, a fixed tower height hm of 8 m is used. Over forested or more structurally complex ecosystems, a tower height corresponding to hm ≈ d + 4(hc – d) is used, with d being the zero-plane displacement height (Paeschke 1937). This aims to maximize the period of time when the turbulent exchange assembly is sampling above the roughness sublayer and below the free atmosphere, so the observations can be directly related to surface processes. In addition to tower-top measurements, several vertical tower measurement levels (4–8 levels) are installed according to the site-specific ecosystem structure to capture meteorologically and ecologically meaningful vertical profiles.
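As a worked illustration of the height rule above, the sketch below evaluates hm for a given canopy height. The 3-m threshold, the fixed 8-m height, and the relation hm ≈ d + 4(hc – d) come from the text; the default d = (2/3)hc is a common rule of thumb assumed here for illustration only, since the text does not state how d is obtained at each site. The function name is hypothetical.

```python
def measurement_height(canopy_height_m, displacement_ratio=2.0 / 3.0):
    """Approximate tower-top measurement height hm in meters.

    canopy_height_m    -- canopy height hc in meters
    displacement_ratio -- assumed ratio d / hc (rule of thumb, not from the text)
    """
    if canopy_height_m < 0.0:
        raise ValueError("canopy height must be non-negative")
    if canopy_height_m < 3.0:
        # Short-stature ecosystems: fixed tower height.
        return 8.0
    d = displacement_ratio * canopy_height_m  # zero-plane displacement height
    return d + 4.0 * (canopy_height_m - d)


for hc in (0.5, 12.0, 30.0):
    print(hc, measurement_height(hc))
```

With the assumed ratio of 2/3 the relation reduces to hm ≈ 2hc, so a 30-m forest canopy yields a measurement height of roughly 60 m.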
The instrument support arms (booms) are 4 m long in order to extend beyond the immediate flow separation and recirculation zones around the tower. Furthermore, the tower–boom combination was designed to minimize harmonic effects in the 0–20-Hz range in which the turbulent exchange assembly performs its observations. Last, depending on the wind patterns around a site, the boom arms are oriented to minimize the occurrence of wind observations from the leeward side of the tower (in which case air first passes through the tower and then the sensor; Munger et al. 2012). Remaining leeward-side observations are disqualified for analysis.
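The exclusion of leeward-side observations can be sketched as a simple wind-direction test. This is an illustrative example rather than the NEON quality-control algorithm: the boom azimuth convention (direction in which the boom points away from the tower), the ±30° half-width of the distorted sector, and the function names are all assumptions.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two compass directions, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def is_leeward(wind_dir_deg, boom_azimuth_deg, half_width_deg=30.0):
    """True if air passes through the tower before reaching the sensor.

    wind_dir_deg     -- direction the wind blows FROM (meteorological convention)
    boom_azimuth_deg -- direction the boom points, seen from the tower (assumed)
    half_width_deg   -- hypothetical half-width of the distorted sector
    """
    # The tower lies on the side of the sensor opposite to the boom direction.
    tower_side_deg = (boom_azimuth_deg + 180.0) % 360.0
    return angular_difference(wind_dir_deg, tower_side_deg) <= half_width_deg


# Boom pointing south (180 degrees): northerly winds cross the tower first.
for wind_dir in (0.0, 350.0, 40.0, 180.0):
    print(wind_dir, is_leeward(wind_dir, boom_azimuth_deg=180.0))
```

Orienting the boom into the prevailing wind keeps the flagged sector in the least frequent wind directions, which is the design intent described above.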
Turbulent exchange assembly.
The eddy covariance turbulent exchange assembly aims to maximize data coverage and quality through close integration of a suite of sensors and auxiliary components. For each type of sensor, the manufacturer, model, acquisition frequency, deployment locations, and calibration cycle are listed in Table 1. In short, the three-dimensional wind components at the tower top are measured by a sonic anemometer operating at a rate of 20 Hz. An attitude and heading reference system is attached to the sonic anemometer and includes a gyroscope, accelerometer, and magnetometer that collect data at a rate of 40 Hz. This allows rotated deployment of the sonic anemometer following the streamlines at sloped sites and makes it possible to quantify and correct residual boom and tower motions. The H2O and CO2 concentrations are measured at 20 Hz by an enclosed-path infrared gas analyzer in combination with a mass flow controller that ensures a flow rate of 12 standard liters per minute through the sampling cell. Last, a validation system supplies reference gas concentrations to the infrared gas analyzer every 23.5 h for periodic validation to quantify and correct sensor drift.
Storage exchange assembly.
The eddy covariance storage exchange assembly consists of a suite of sensors that record air temperature and CO2 and H2O concentration profiles, some of which are shared with the meteorology assembly (see “Meteorology assembly” section below). Sensor manufacturers, models, acquisition frequencies, deployment locations, and calibration cycle can be found in Table 2. In short, the storage exchange assembly provides vertical atmospheric profiles of air temperature, CO2, and H2O for the calculation of corresponding storage fluxes. Air temperature is measured at a rate of 1 Hz using triple-redundant aspirated platinum resistance thermometers at the tower top, and single aspirated platinum resistance thermometers at the remaining vertical profile levels. To measure CO2 and H2O concentrations at a rate of 1 Hz, an infrared gas analyzer is located in the instrument hut, which is programmed to operate in two modes: sampling and field validation. During sampling mode, the infrared gas analyzer cycles through air samples that are drawn with a dedicated vacuum pump for each vertical profile level on the tower. The site-specific flow rate of 5–11 standard liters per minute is controlled to maintain the sample line pressure at 40%–50% of the ambient pressure to prevent condensation. At each measurement level, the infrared gas analyzer measures CO2 and H2O for about 180 s. The first 60 s are used for flushing sample air from the previous level from the plumbing system, and only the remaining 120 s are used for calculation. After 180 s, the measurement switches to the next level. During the field validation mode, reference gas concentrations are supplied to the infrared gas analyzer, similar to the turbulent exchange assembly (see “Turbulent exchange assembly” section above).
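The level-switching scheme described above can be sketched as follows. The 180-s dwell time, the 60-s flushing period, and the 1-Hz sampling rate are taken from the text; the assumptions are that the record starts at the beginning of a cycle on the first level, that timing is exact, and that no field validation interrupts sampling. Function and variable names are hypothetical.

```python
DWELL_S = 180  # time spent on each profile level (s)
FLUSH_S = 60   # leading part of each dwell discarded for flushing (s)


def level_schedule(n_levels, t_seconds):
    """Return (level index, usable flag) for a time in seconds since cycle start."""
    cycle_s = n_levels * DWELL_S
    t_in_cycle = t_seconds % cycle_s
    level = t_in_cycle // DWELL_S
    t_in_level = t_in_cycle % DWELL_S
    return level, t_in_level >= FLUSH_S


def level_means(samples, n_levels):
    """Average 1-Hz samples per level, skipping the flushing period."""
    sums = [0.0] * n_levels
    counts = [0] * n_levels
    for t, value in enumerate(samples):
        level, usable = level_schedule(n_levels, t)
        if usable:
            sums[level] += value
            counts[level] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Synthetic CO2 record (ppm) for one cycle over two levels: readings during
# flushing still reflect the previous level and are excluded from the mean.
samples = [999.0] * 60 + [400.0] * 120 + [400.0] * 60 + [410.0] * 120
print(level_means(samples, n_levels=2))
```

A full cycle through n levels therefore takes n × 180 s, so each level is revisited every 12 to 24 min for the 4 to 8 profile levels installed at NEON towers.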
Meteorology assembly.
The meteorology assembly measures baseline meteorological variables at frequencies between 0.1 and 1.0 Hz. For each sensor type, the manufacturer, model, acquisition frequency, deployment locations, and calibration cycle are listed in Table 2 (air temperature) and Table 3 (all other sensors). In short, these include incoming/outgoing shortwave and longwave radiation, in addition to air temperature, relative humidity, dewpoint temperature, two-dimensional wind speed and direction vertical profiles, barometric pressure, and precipitation. Further variables such as direct and diffuse radiation, throughfall precipitation, biological temperature, and photosynthetically active radiation vertical profile are also measured, and phenology imagery is taken.
Atmospheric composition assembly.
The atmospheric composition assembly comprises a suite of sensors that further characterize wet deposition, atmospheric isotopes, and particulate matter. Corresponding sensor manufacturers, models, acquisition frequencies, deployment locations, and calibration cycles are listed in Table 4. In short, gaseous-phase stable carbon and water isotopes are measured along the tower vertical profile. Wet deposition sampling occurs at the tower top across 37 TIS sites, which were selected to represent a range of concentrations of nitrate, ammonium, and sulfate. Analysis occurs at the Illinois State Water Survey laboratory at the University of Illinois, which also handles the analysis for other atmospheric deposition sampling programs. Stable isotopes in wet deposition are also sampled and analyzed by the Stable Isotope Ratio Facility for Environmental Research at the University of Utah. Samples are shipped in the 2.5-L high-density polyethylene (HDPE) plastic bottles used for collection. No temperature control mechanisms are used during shipping, but samples are refrigerated before analysis. Samples are analyzed within two weeks of collection. Archived samples are stored at 4°C and are currently slated for preservation for five years, mirroring the National Atmospheric Deposition Program (NADP) protocol for sample archiving. Particulate matter less than or equal to 10 µm in diameter is monitored via PM10 sampling and continuous particle size monitoring along a transect of six sites across the Rocky Mountains. Aerosol optical depth measurements at the tower top complete the atmospheric composition assembly.
Each TIS site includes five soil plots containing a suite of sensors (Fig. 5). Sensor manufacturers, models, acquisition frequencies, deployment locations, and calibration cycles are listed in Table 3 (quantum-line photosynthetically active radiation, net radiometer, infrared biological temperature, and throughfall precipitation) and in Table 5 (all other sensors). The plots are typically arranged along a transect within the zone of mutual representativeness (see “Benefits of integrated and standardized design” section below) among the TIS and the terrestrial observation system, and in the locally dominant (∼1-km2 scale) soil type. Spacing between soil plots was determined by assessing spatial variation in soil temperature and moisture over 1 ha (Loescher et al. 2014) to maximize spatial independence, while being limited to ≤40-m spatial separation because of cost constraints. Vertical profiles of soil temperature (nine or fewer sensors per plot), moisture (eight or fewer sensors per plot), and CO2 concentration (three sensors per plot) are measured in each plot, with measurement depths based on absolute soil depth, soil horizon thicknesses, and other soil properties (Fig. 5). Soil heat flux is measured 8 cm below the soil surface in three of the soil plots. All soil sensor measurements are made at 0.1 Hz. In addition, radiation and throughfall precipitation measurements are made near the soil surface (Fig. 5).
The unique value of NEON TIS measurements stems from their close integration with contextual systems, which allows the joint use of data across traditional disciplinary boundaries.
The terrestrial observation system comprises plot-based biometric observations by field personnel. At the site scale, collocated observations of small mammals, insects, birds, soils, and vegetation diversity, biomass, and chemistry can be linked with TIS tower, soil plot, and meteorological data to generate deep insights into the changing biosphere (e.g., Flanagan and Johnson 2005; Heald and Spracklen 2015; Mainka and Howard 2010). It is the novelty of NEON that will allow researchers to expand these site-level studies across the many ecosystems that span North America (see Table ES3 for specific use cases of NEON data). Within the zone of mutual representativeness shared with the TIS (Fig. 2; see “Benefits of integrated and standardized design” section below), additional plots focus on quantifying aboveground productivity and belowground biomass stocks. Together, these data enable understanding of relationships between ecosystem-scale fluxes, individual species and vegetation components (e.g., Kao et al. 2012; Thorpe et al. 2016), alongside consistency checks (e.g., Babst et al. 2014; Curtis et al. 2002; Luyssaert et al. 2009).
Next, the airborne observation platform visits terrestrial sites during peak greenness [defined as the date range when normalized difference vegetation index (NDVI) values were 90% or greater of the average peak for a given site, derived from historical Moderate Resolution Imaging Spectroradiometer (MODIS) NDVI data] to provide high-resolution (1 m) spatially continuous information about land-cover physical and biogeochemical properties. A high-resolution waveform and discrete lidar, hyperspectral imaging spectrometer, and high-resolution red–green–blue (RGB) camera are used to accomplish this task. The flight area covers a bounding box with a minimum size of 10 km × 10 km encompassing terrestrial as well as aquatic site locations, providing spatial linkages between the systems (Kampe et al. 2010).
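The peak-greenness definition above lends itself to a short example. The sketch below is one possible reading of that definition, not the operational flight-planning code: it assumes a multiyear mean NDVI curve has already been derived from MODIS composites, takes the maximum of that curve as the "average peak", and returns the contiguous date range around the peak where NDVI stays at or above 90% of it. The function name and the NDVI values are hypothetical.

```python
def peak_greenness_window(day_of_year, mean_ndvi, fraction=0.90):
    """Return (first day, last day) of the contiguous window around the NDVI peak.

    day_of_year -- composite dates as day of year, in ascending order
    mean_ndvi   -- multiyear mean NDVI for each composite date (assumed input)
    fraction    -- share of the peak value that defines the window
    """
    if not day_of_year or len(day_of_year) != len(mean_ndvi):
        raise ValueError("inputs must be non-empty and of equal length")
    peak = max(range(len(mean_ndvi)), key=lambda i: mean_ndvi[i])
    threshold = fraction * mean_ndvi[peak]
    start = peak
    while start > 0 and mean_ndvi[start - 1] >= threshold:
        start -= 1
    end = peak
    while end < len(mean_ndvi) - 1 and mean_ndvi[end + 1] >= threshold:
        end += 1
    return day_of_year[start], day_of_year[end]


# Illustrative 16-day composites for a temperate site (values are made up).
doy = [97, 113, 129, 145, 161, 177, 193, 209, 225, 241, 257, 273]
ndvi = [0.30, 0.40, 0.55, 0.70, 0.78, 0.80, 0.79, 0.75, 0.71, 0.60, 0.45, 0.35]
print(peak_greenness_window(doy, ndvi))
```

Restricting the window to the contiguous run around the peak avoids picking up a secondary green-up elsewhere in the year; sites with two growing seasons would need a different treatment.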
The aquatic instrument system consists of continuously monitoring sensors in streams, lakes, and rivers, shallow groundwater wells, and riparian meteorological stations to characterize physical hydrology and water quality. This is complemented by the aquatic observation system, which collects data on aquatic organism abundance and diversity, biogeochemical properties of the surface, groundwater, and sediment, and physical hydrologic and geomorphic properties (Goodman et al. 2015).
Last, the mobile deployment platform comprises a “TIS in a box”: a core subset of TIS instrumentation and capabilities are integrated onto a mobile platform that can be rapidly deployed across the country. This includes an instrumented Rohn tower for meteorological and eddy flux measurements, an array of soil instruments, and a hut that houses instrumentation, validation gases, and command and control and data acquisition infrastructure. The mobile deployment platform also includes a separate power generation trailer for cases when line power is not available.
NEON also offers an “assignable asset” capability through which community researchers can make use of the airborne and mobile deployment platforms, request access and add instrumentation to existing NEON sites, or request NEON Field Ecologists for data and sample collection. Prospective users of this service submit a request form describing their proposed use of the assignable asset, which NEON will then assess for technical, logistical, and scientific feasibility. Assuming a successful evaluation, NEON will provide a cost estimate alongside a letter of support/collaboration that the requestor can then include in a grant proposal to the National Science Foundation or another funding agency. Learn more online (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/aa).
NEON samples and specimens can be accessed from the NEON Biorepository located at the Arizona State University Biocollections (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/biorepo). These include voucher specimens, whole organisms, tissues, and samples that are collected and processed for chemistry, disease, and genetics.
BENEFITS OF INTEGRATED AND STANDARDIZED DESIGN.
The integrated, top-down design of NEON allows the adoption of value-engineering principles to create a transparent and traceable “assembly line” of environmental observations. Front and center in such an approach is the purpose of the product, which in the case of NEON is the ability to address grand-challenge questions in the environmental sciences (summarized in Table ES1; National Research Council 2001). Ideally, one expends only as much effort as needed to achieve the data tolerances required to address the purpose. This is accomplished by translating and tracing the grand-challenge questions (Table ES1) through the cascade of high-level science requirements (Table ES2), over specific science-use cases (Table ES3), into engineering requirements [e.g., supplement to Metzger et al. (2016): accuracy, precision, sample rate, range, dimensions, power consumption, mean failure rate, maintenance frequency, environmental impacts, etc.]. The result is an end-to-end integrated, requirements-based design of hardware, software, and people-to-people interactions.
NEON’s sensor selection process is one example of applying these principles within fair competition rules: first, NEON personnel capture the relevant science, engineering, and operations requirements. These are then placed in a request for information, which is circulated among a broad range of vendors. Next, NEON personnel without conflict of interest evaluate the responses and perform additional integration tests. Finally, the National Science Foundation approves sensor selection.
A host of tests was performed for the EC systems to ensure that engineering designs, in combination with scientific data analysis, meet data product requirements while also minimizing the need for frequent human intervention. All designs are vetted by a Technical Working Group consisting of 5–12 external researchers with interest in a given discipline, as well as relevant NEON personnel, to which readers may nominate themselves (for details, see https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/twg). One novel solution resulting from this optimization is the turbulent exchange infrared gas analyzer system integration to allow adaptive high-frequency spectral correction: Metzger et al. (2016) performed spectral testing to determine the optimal rain cap and intake tube heater designs that reduced spectral attenuation to within tolerances that permit the use of adaptive frequency response correction algorithms (Nordbo and Katul 2013). In another systemic solution, theoretical formulations and experimental tests were used to minimize the effect of infrared gas analyzer–induced flow distortion on the sonic anemometer and subsequently the turbulent fluxes. This led to the development of new installation requirements, as well as the addition of an attitude and heading reference system. The latter permits tilted alignment of the sonic anemometer at sloped sites, thus limiting transducer-shadowing effects by reducing the angle of attack (e.g., Frank et al. 2016; Horst et al. 2015; Nakai and Shimoyama 2012). Moreover, sonic anemometer alignment is tracked over time, and the observed wind components can be reliably related to the Earth coordinate system and corrected for boom and tower oscillations.
An example of an integrated, cross-disciplinary design is the definition of zones of mutual representativeness and exclusion (Fig. 2): Human-based biometric observations of ecosystem stocks (e.g., in situ biomass, plant phenology) and sensor-based observations of biophysical processes (e.g., evapotranspiration, light, and water-use efficiencies) must represent the same ecosystem, but not significantly influence one another. For this purpose, first, the area of mutual representativeness around the TIS flux tower was determined using source area modeling for radiation, scalar, and flux measurements (Kormann and Meixner 2001; Schmid 1994, 1997). Next, the exclusion zone was determined by applying an impact threshold (10%) to the convolution of footprints, area of land impacted by each sampling activity, impact intensity, and scientific value of collocation. Last, biometric sampling locations were selected using a stratified random sampling design, with preference for the area of mutual representativeness but outside the exclusion zone.
Moreover, NEON’s tight design integration and standardization permit a comprehensive science operations management framework for TIS quality assurance and quality control (Fig. 6). This begins with training for maintaining sensors, which is provided via curriculum lesson plans and hands-on demonstrations by qualified NEON personnel. Preventative maintenance is then performed biweekly according to procedures developed by NEON science and engineering departments and recorded electronically on a mobile application. Most sensors are rotated out of the field at specified intervals and through the NEON Calibration and Validation Laboratory, where calibration is performed to traceable national and international standards [e.g., International Temperature Scale of 1990 (ITS-90) for temperature measurements]. Sensor health is monitored by field personnel as well as remotely through software that interfaces directly with the data acquisition system at each site. An automated alert system monitors sensor health in real-time and creates trouble tickets directly in NEON’s issue management system upon actual or impending sensor malfunction. The next section addresses subsequent quality measures embedded in data processing, monitoring, and versioning.
CYBERINFRASTRUCTURE, EDDY4R SOFTWARE, AND DATA PROCESSING.
The data collected at NEON field sites are stored, curated, processed, and disseminated via NEON’s cyberinfrastructure (Fig. 6). NEON TIS alone utilizes several data processing pipelines. Among these, a dedicated, near-real-time EC pipeline aims for a 5-day turnaround from data acquisition to data product availability. The pipeline retries data processing for up to 30 days should one or more upstream dependencies not be fulfilled. Central to this premise is the Development and Operations (DevOps) model (Wurster et al. 2015) for the community-collaborative development of portable, reproducible, and extensible EC data processing and advanced analytics software. DevOps focuses on rapid development and continuous iteration, empowering community members to participate actively in the code development and synergize with ongoing research efforts. This addition of capabilities results in regular, public software releases alongside corresponding data product generation at NEON. A detailed description of the NEON EC DevOps workflow can be found in Metzger et al. (2017).
Technically, DevOps is achieved by housing the source code in a public and version-controlled repository, including review guidelines. This allows community users to access any desired functionalities for their purposes. Examples include incorporating an individual computational algorithm (definition function) into their specific workflow, or cross comparing the computational implementation of different flux processors. At the same time, community developers can view and build upon the entire history of code development as desired. This facilitates full community integration through attribution of source code and archived discussion surrounding code changes, especially as community standards evolve. Specifically, the eddy4R EC software consists of a family of modular R packages that include hierarchical and extensible sets of definition functions, wrapper functions, and workflows. Individual packages are separated according to various objectives of EC data processing, including shared utilities (eddy4R.base), quality (eddy4R.qaqc), and turbulence data processing (eddy4R.turb). This publication coincides with releases of the eddy4R.base and eddy4R.qaqc packages for community use and development via the eddy4R GitHub repository (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/repo). The repository includes an issue tracker and user forum for discussion and continued development.
The DevOps workflow continues by packaging eddy4R, alongside its computational environment and all dependencies, into a Docker image (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/docker). The Docker image only contains what is needed to run and performs the same regardless of the underlying operating system. This facilitates reproducible results and rapid deployment to all users, including through community-accessible and scalable high-performance computing resources such as CyVerse (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/cyverse). It also provides the foundation for the eddy4R wiki including the user’s guide and developers’ guide (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/wiki) and an eddy4R tutorial (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/tutorial) that accompany the source code publication (Fig. 7). The eddy4R Docker image is the same as is deployed for NEON EC data processing and generates the corresponding data products available for download from the NEON Data Portal (see “Publicly available data products and usability tools” section below).
During TIS data processing, a standard suite of quality tests is applied to calibrated sensor data at native resolution, including range, step, null/gap, and persistence tests (Hubbard et al. 2005; Taylor and Loescher 2013) as well as de-spiking (Brock 1986; Hojstrup 1993). Additional tests, such as evaluation of sensor diagnostic codes and measurement theory, are also applied on a product-specific basis (e.g., Foken et al. 2012). The quality test results are then propagated across processing levels and aggregated for data products via the extensible quality flags and quality metrics framework of Smith et al. (2014). This enables the integration of any number of tests, and the near-seamless incorporation of additional or improved tests as community standards evolve. The eddy4R.qaqc package includes functions for the standard suite of quality tests, de-spiking algorithms, and the propagation framework. Data product–specific quality tests are included in the corresponding eddy4R package (e.g., stationarity tests in eddy4R.turb).
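To make the standard suite of tests more concrete, the following minimal Python sketch illustrates range, step, null/gap, and persistence tests on a short series. It is not the eddy4R.qaqc implementation: the function names, the thresholds, the example values, and the flag convention (1 = fail, 0 = pass, -1 = test could not be evaluated) are assumptions chosen for illustration only.

```python
import math


def _missing(value):
    """Return True for None or NaN, the two gap markers assumed in this sketch."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def null_test(values):
    """Flag 1 where an observation is missing, else 0."""
    return [1 if _missing(v) else 0 for v in values]


def range_test(values, lower, upper):
    """Flag 1 outside [lower, upper], 0 inside, -1 where the value is missing."""
    return [-1 if _missing(v) else int(v < lower or v > upper) for v in values]


def step_test(values, max_step):
    """Flag 1 where the jump from the previous sample exceeds max_step.

    The first sample and any sample next to a gap cannot be evaluated (-1).
    """
    flags = [-1] * len(values)
    for i in range(1, len(values)):
        if _missing(values[i]) or _missing(values[i - 1]):
            continue
        flags[i] = int(abs(values[i] - values[i - 1]) > max_step)
    return flags


def persistence_test(values, window, min_change):
    """Flag 1 for every sample inside a gap-free window whose total
    variation (max - min) stays below min_change; missing samples get -1."""
    flags = [-1 if _missing(v) else 0 for v in values]
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        if any(_missing(v) for v in chunk):
            continue
        if max(chunk) - min(chunk) < min_change:
            for i in range(start, start + window):
                flags[i] = 1
    return flags


# Hypothetical air temperature series (degrees Celsius) with a spike,
# a gap, and a stuck sensor at the end.
temps = [20.1, 20.3, 55.0, 20.4, None, 20.4, 20.4, 20.4, 20.4]

range_flags = range_test(temps, -40.0, 50.0)
step_flags = step_test(temps, 5.0)
null_flags = null_test(temps)
persistence_flags = persistence_test(temps, window=4, min_change=0.05)
```

In this sketch each test returns one flag per observation, so the results of any number of tests can be stacked side by side and aggregated later.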
Furthermore, a flexible science review flag based on human review complements the efficiency and standardization of the above automated tests. This makes it possible to indicate that data are suspect because of known adverse conditions not captured by automated flagging. For example, field reports provide insight into measurement interference such as a blocked tipping-bucket precipitation funnel or bird droppings on a radiometer. In such cases, NEON personnel raise the science review flag, which in turn overrides final data product quality regardless of its previously computed value.
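The interplay of automated tests and human review can be sketched as follows. This is a simplified illustration rather than the framework of Smith et al. (2014) as implemented at NEON: the function name, the percentage limits (`fail_limit`, `na_limit`), and the flag convention (1 = fail, 0 = pass, -1 = test could not be evaluated) are hypothetical. The only behavior taken from the text is that a raised science review flag overrides the computed result.

```python
def final_quality_flag(flags_by_test, fail_limit=10.0, na_limit=20.0,
                       science_review_raised=False):
    """Aggregate per-observation test flags into one final quality flag.

    flags_by_test maps a test name to a list of flags (1, 0, or -1), one per
    observation in the averaging period. The limits are hypothetical
    percentages used only for this sketch.
    """
    # A raised science review flag overrides whatever the tests computed.
    if science_review_raised:
        return 1

    # Regroup flags so that each tuple holds all test results of one observation.
    per_observation = list(zip(*flags_by_test.values()))
    n = len(per_observation)

    # Percentage of observations failing at least one test.
    fail_pct = 100.0 * sum(1 in obs for obs in per_observation) / n
    # Percentage of observations with no failure but at least one test not run.
    na_pct = 100.0 * sum(-1 in obs and 1 not in obs
                         for obs in per_observation) / n

    return int(fail_pct >= fail_limit or na_pct >= na_limit)


# Hypothetical averaging period with 20 observations and two tests.
mostly_good = {"range": [0] * 20, "step": [0] * 19 + [1]}       # 5% failed
often_bad = {"range": [1, 1, 1] + [0] * 17, "step": [0] * 20}   # 15% failed

flag_good = final_quality_flag(mostly_good)
flag_bad = final_quality_flag(often_bad)
flag_reviewed = final_quality_flag(mostly_good, science_review_raised=True)
```

Because the override is checked first, data that pass every automated test are still flagged once a reviewer raises the science review flag.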
NEON periodically reprocesses data of all levels as part of a formal versioning and revision system. The goals of this system are to 1) create data traceability, 2) classify and communicate the implications of changes in data and associated algorithms, and 3) strike a reasonable balance between minimizing data latency and maximizing data improvements and quality control. The three classes of data iterations are the following:
Revision: Occurs for a data product when a sensor or processing change is so significant that data from different revisions of the data product are not directly comparable and should be used with caution when combining for use or analysis. This would occur, for example, upon the switch to an entirely new measurement technology or incorporation of a new correction in the processing algorithm.
Release: A consistently processed, static, and citable dataset over the entirety of data within a revision of a data product. Data changes between releases are minor, such as the adjustment of a quality control test threshold.
Provisional: Data that have been recorded or edited since the most recent release. These data can be updated at any time, without guarantee of reproducibility.
Releases and revisions are annotated with a summary of changes made between the current and preceding release/revision.
PUBLICLY AVAILABLE DATA PRODUCTS AND USABILITY TOOLS.
All NEON-generated data products are open and freely available via the NEON Data Portal (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/data-portal), including over 20 World Meteorological Organization essential climate variables. The NEON Data Product Catalog (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/data-catalog) provides a thorough description of available data products with search functionality allowing users to explore products by keyword, data theme, science team, and other parameters. NEON provides data in two distinct packages: basic and expanded. The basic package provides users with the core data variables and a final quality flag, while the expanded package provides a more robust quality assessment by providing individual quality metrics. When selecting a data download package, users will also have the choice to select relevant documentation to be included in the download package, such as the algorithm theoretical basis document that explains all computational processing steps performed on the data to develop the provided data products. All algorithm theoretical basis documents, protocols, etc., are also available from the NEON document library (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/documents). Data products can also be obtained through the NEON application programming interface (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/data-api), which provides users with a programmatic interface to the NEON database. The NEON code resources web page (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/code-resources) has links to multiple code repositories containing tools for interacting with the NEON application programming interface in various programming languages. The NEON-utilities R package (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/neon-utilities) also includes functions for joining data files downloaded from the NEON Data Portal. 
Other packages facilitate specific tasks (e.g., NEON spatial data handling) and perform common calculations and transformations on select NEON data products. To facilitate data use by the community, NEON also provides tutorials and other training materials for working with NEON data (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/data-tutorials). For example, tutorials are available for using NEON-utilities either directly in R or within Python.
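As a rough illustration of programmatic access, the Python sketch below builds a request to the NEON application programming interface and lists the files available for one product, site, and month. The base URL, the endpoint layout, the response fields (`data`, `files`, `name`), the product code DP4.00200.001 for the bundled EC data products, and the site code HARV are assumptions not stated in this article. Please verify them against the current API documentation before use.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the public NEON API; verify against the API documentation.
API_BASE = "https://data.neonscience.org/api/v0"


def data_url(product_code, site_code, year_month):
    """Build the (assumed) data-availability URL for one product, site, and month."""
    return f"{API_BASE}/data/{product_code}/{site_code}/{year_month}"


def list_file_names(product_code, site_code, year_month, timeout=30):
    """Query the API and return the names of the files offered for download.

    The response layout ({"data": {"files": [{"name": ...}, ...]}}) is an
    assumption; missing keys simply yield an empty list.
    """
    url = data_url(product_code, site_code, year_month)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    files = payload.get("data", {}).get("files", [])
    return [entry.get("name") for entry in files]


# Example request URL (assumed product and site codes); no network call is made here.
example_url = data_url("DP4.00200.001", "HARV", "2018-06")
```

Calling `list_file_names("DP4.00200.001", "HARV", "2018-06")` would then perform the actual request. The tools linked from the NEON code resources web page wrap such calls and are the more convenient route for routine use.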
In 2018, annual user traffic to NEON’s website (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/neon) rose nearly 100% from the previous three years, with over 200,000 users and roughly 430,000 unique page views. Nearly 60% of traffic to the site resulted from return visitors, and 40% of traffic resulted from new visitors. Data tutorials accounted for more than half of the traffic to the NEON website, which hints at increased data downloads and usage. However, these metrics are not yet robust as NEON has just entered full operations. We are working on infrastructure to improve tracking of data downloads, among other measures by implementing Google Tag Manager for the data portal and application programming interface.
NEON provides data specifically for surface–atmosphere research via a dedicated data product bundle (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/data-bundle). Scientific applications of this data bundle range from basic research, for example, on diffusivity, resistances, and surface roughness, to applied research such as ecosystem respiration, energy, water, and carbon cycling, and response to elevated CO2 concentrations. The full list of the bundled EC data products is provided in Table 6 and presented to the user in Hierarchical Data Format 5 (HDF5; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/hdf). HDF5 provides high compressibility, fast and efficient reading and writing, a directory-style file structure, and metadata attachment. Moreover, all major programming languages support HDF5, and files can also be explored intuitively, for example, with tools freely available from the HDF Group such as HDFView. The file contents are centered around results from the turbulent exchange and storage exchange assemblies, supplemented with essential variables from the meteorology and atmospheric composition assemblies. They include a range of data product types from low-level descriptive statistics of individually measured data streams, to high-level composite data products derived from multiple measurement streams. Most characteristically, files include the surface–atmosphere exchange of momentum, heat, H2O, and CO2, alongside their turbulent and storage component fluxes. Footprint metrics, source area grids, and quality and uncertainty budgets accompany these data products.
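The relation between the net surface–atmosphere exchange and its two component fluxes can be illustrated with a few lines of Python. This is a minimal sketch that assumes the net exchange is the sum of the turbulent flux measured at the tower top and the storage flux below it. The function name, the variable names, and the example values are hypothetical and do not reflect the layout of the HDF5 files.

```python
def net_surface_atmosphere_exchange(turbulent_flux, storage_flux):
    """Sum turbulent and storage fluxes period by period.

    Both inputs are equally long lists in identical units (e.g., umol m-2 s-1
    for CO2). A period with a missing component (None) yields None.
    """
    if len(turbulent_flux) != len(storage_flux):
        raise ValueError("component fluxes must cover the same averaging periods")
    return [
        None if turb is None or stor is None else turb + stor
        for turb, stor in zip(turbulent_flux, storage_flux)
    ]


# Hypothetical half-hourly CO2 fluxes; negative values denote uptake by the surface.
turbulent = [-5.0, -3.5, None]
storage = [0.5, -0.25, 0.1]

net_exchange = net_surface_atmosphere_exchange(turbulent, storage)
```

Propagating missing values rather than silently treating them as zero keeps a gap in either component visible in the net exchange.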
NEON provides data products from remaining TIS assemblies in flat ASCII files containing subhourly descriptive statistics alongside traceable quality and uncertainty estimates. Data products from the meteorology assembly provide the necessary baseline for continuously monitoring the effects of land-use change, ecosystem processes, and climate variability. Similarly, time series of radiation and throughfall precipitation measurements can be directly related to phenological events (e.g., leaf-out and litterfall). For select data products and sites, daily, monthly, and yearly aggregations are also available, including air temperature, relative humidity, dewpoint temperature, wind speed, primary precipitation, shortwave radiation, and barometric pressure.
Data products from the atmospheric composition assembly provide additional constraints for in-depth analyses, such as factoring an independent evapotranspiration estimate into a water balance, and effects of nutrient additions from wet and dry deposition. Stable carbon/water isotopes in the gaseous phase, stable isotopes in precipitation, and wet deposition chemistry are widely available across NEON sites. Dry deposition data are limited in availability; they give investigators insight into particulate transport over the Rocky Mountains and mirror the particulate monitoring protocols of the Environmental Protection Agency. Aerosol optical depth monitoring also occurs at NEON sites as part of the National Aeronautics and Space Administration (NASA) Aerosol Robotic Network (AERONET) program, and the AERONET program website (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/aeronet) hosts the corresponding data.
Data products from the soil assembly, extending from the soil surface down to 2 m below ground, are relevant to short-term predictions, such as flood or heatwave risk, as well as biological activity, including soil respiration rates and plant growth. For example, soil respiration data from CO2 concentration profiles can be used to partition ecosystem-level carbon exchange into above- and belowground components. Similarly, soil moisture data can provide a contextual understanding of temporal variation in plant production. Soil temperature and heat flux data products are also available.
NEON’s contextual systems complete the data product package to perform integrated studies on surface–atmosphere and ecosystem interactions: they provide the necessary, comprehensive data foundation to address grand challenges in the environmental sciences (Table ES1), centered around detecting continental-scale ecological change and forecasting its impacts. For example, terrestrial biometric data products include plant phenology, biomass, biogeochemistry, and litter, as well as mammal, bird, and insect abundance. Airborne data products then provide the necessary spatial linkage, for example, through vegetation indices, microtopography, and plant foliar chemistry. Aquatic data products such as surface and groundwater elevation and quality, physical hydrologic and geomorphic properties, and aquatic organism abundance and diversity contribute insights into the hydrological cycle. The sidebar “Environmental relationships and continuous information” above provides an applied example.
NEON maintains strategic partnerships with other national- and global-scale networks that are already well established in serving certain data products to facilitate distribution and utilization of NEON data by the scientific community (e.g., Bond-Lamberty 2018). To make NEON data interoperable with other networks, NEON uses traceable National Institute of Standards and Technology (NIST) standards and thoroughly quantifies instrument uncertainty. Thus, all NEON meteorological, atmospheric composition, and soil data products were developed to be interoperable with other established national networks, such as Atmospheric Radiation Measurement (ARM), National Oceanic and Atmospheric Administration (NOAA), U.S. Geological Survey (USGS), Natural Resources Conservation Service (NRCS), and the International Atomic Energy Agency (IAEA). Furthermore, the wet deposition data product is interoperable with the National Atmospheric Deposition Program (NADP). To synergize ongoing research efforts across communities, a subset of data products is served via the data portals of partner networks, such as phenocam data via the PhenoCam network and sun photometer data via AERONET. One example that will extend NEON data to a global network is the registration of NEON sites and ingestion of NEON data into AmeriFlux (Durden et al. 2017). Subsequently, these will be delivered via the globally harmonized FLUXNET datasets, alongside records from international partners including the Integrated Carbon Observation System (ICOS; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/icos) and the Terrestrial Ecosystem Research Network (TERN; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/tern), among others.
CHALLENGES AND CONTINUED DEVELOPMENT.
As noted in the 2015 NEON science capability assessment (Abbott et al. 2015), NEON is a project of unprecedented scope and complexity for the ecological community. Specifically, the assessment states, “Indeed, no previous (Major Research Equipment and Facilities Construction) project begins to approach NEON in its scope and complexity: the combination of many locations and enormous diversity of different types of measurements generate unique challenges to (the National Science Foundation) and the NEON staff at all stages of design, construction, and operations…. In addition to the complexities of building NEON, it is important to remember that the ecological research community also has no experience with a project of this scale.” With these considerations in mind, it is not surprising that NEON encountered several challenges throughout construction, including a descoping of sites and data products, and turnover of the management entity: effective management is critical in a top-down organization as all decisions permeate downward and outward through the organization and its many sites. An iteration among the research community, NEON personnel, and the NSF eventually refined the vision from “measure everything, everywhere and all the time” to an actionable set of “essential ecological variables.”
Determining this core set of variables still requires an unprecedented fleet of instruments, challenging their comparable operation and maintenance across sites. Every sensor is thus inspected upon initial receipt and configured per requirements. Next, the sensor is calibrated/validated before initial deployment, as well as periodically after that. Maintaining a regular schedule over a vast number and variety of sensors has posed a challenge, which NEON overcame through creating dedicated and largely automated calibration fixtures. An asset number facilitates maintaining a calibration database across sensors, sites, and time, and applying the correct coefficients during data postprocessing. Data quality is a concern for all large networks. A trouble ticket system has been implemented at NEON to report issues, facilitate cross-team problem resolution, and track progress. Specific components are detailed in the “Benefits of integrated and standardized design” and “Cyberinfrastructure, eddy4R software, and data processing” sections above. The quality of the resulting data rises and falls with the qualification and awareness of sensor health by the field personnel, as well as cross-team communication. Periodic meetings between domain scientists and field crews alongside training on standard operating procedures facilitate mutual understanding and collaboration. NEON personnel discuss common issues in web-based forums and post them alongside their solutions in a joint knowledge base.
Not all sensor models can be expected to outlast the multidecadal time horizon of NEON. Consequently, a sensor obsolescence strategy will be needed to ensure data continuity and comparability through time and space. Configuration management and design can facilitate such a strategy, which evaluates impacts to data products, measurement systems, and resources. NEON personnel identify dependencies across stakeholders, estimate effort and availability, request resources, and create a test plan. Data product revisioning then provides an opportunity to contrast data product requirements against changing sensor configurations.
Even though collocated and designed for cross-disciplinary use, NEON’s different observing systems utilize a variety of observation and measurement principles. These often take differing perspectives of the environment, such as Eulerian in the case of fixed-frame biometric and remote sensing observations, and Lagrangian in the case of meteorological observations of upstream phenomena including ecosystem fluxes (Schmid 1997). Thus, many resulting data products overlap only fractionally in space and time, necessitating a mathematical translation to ensure their unbiased, combined use (Nappo et al. 1982). Reconciling these differences is not trivial and is compounded by factors such as instrument properties, meteorology, and the objective of the analysis. First approaches to derive the necessary matching layer are emerging, including the Environmental Response Function (Metzger 2018; Metzger et al. 2013).
IN THE VANGUARD OF INTERDISCIPLINARY RESEARCH.
Environmental science has traditionally been in the vanguard of interdisciplinary research, from its empirical beginnings, through theoretical generalizations, to computational simulations. More recently, data-intensive information discovery promises new predictive capabilities across the “macrosystem”: hierarchical ecological systems comprising biological, geophysical, and social components that interact with one another. Together with this new paradigm also come new requirements on data capture, curation, and analysis, giving rise to a new category of observatory that is designed to act as a single, coordinated continental-scale instrument: sustained and cross-disciplinary coordinated observations, reproducible research practices, and free software tools for data discovery and analysis.
NEON, alongside U.S. Department of Energy (DOE) ARM (https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/arm), NOAA Global Greenhouse Gas Reference Network (GGGRN; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/gggrn), U.S. Department of Agriculture (USDA) Natural Resources Conservation Service (NRCS; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/nrcs2), and USGS National Streamflow Network (NSN; https://w3id.org/smetzger/Metzger-et-al_2019_NEON-BAMS/usgs) are among the few observatories in this new category to date: addressing the need for long-term standardized measurements in an era of global change. What truly sets NEON apart, however, is its purposeful transgression of traditional disciplinary boundaries following the macrosystem paradigm: monitoring the environmental continuum from the atmosphere over biosphere to hydrosphere and pedosphere equally using instrumented field sites, observations by field personnel, and airborne remote sensing. NEON thus intends to provide a platform for genuinely transformative environmental research through enabling scaling not only across space and time, but in particular across disciplines.
Many colleagues at Battelle, which operates the NEON Program, supported this work. In particular, Leslie Goldman designed Figs. 2 and 5, and Keli Goodman, Tristan Goulden, Courtney Meier, and Greg Wirth (now: Ball Aerospace) contributed to the description of the NEON contextual systems. Christine Laney and Megan Jones have been tireless in their efforts to improve user experience via the NEON Data Portal and tutorials. Moreover, nothing goes without a village of dedicated hardware, software, and systems engineers alongside enthusiastic field personnel. Henry Loescher, Jeffrey Taylor, and Thomas Gulbransen helped shepherd the TIS scientific design and data processing through construction. Our thanks go to Sharon Collinge (now: University of Colorado Boulder), Wendy Gram, and Richard Leonard for guiding this manuscript and its publication through required administrative procedures. We are grateful for the guidance received from NEON Technical Working Group and Science, Technology, and Educational Advisory Committee members. Special thanks go to Ankur Desai and Thomas Foken for commenting on an earlier version of this manuscript. Lastly, Metzger is grateful to Xunhua Zheng and Yukun Zhang at the Chinese Academy of Sciences for providing temporary workspace. The National Ecological Observatory Network is a project sponsored by the National Science Foundation and managed under cooperative agreement by Battelle. This material is based upon work supported by the National Science Foundation (Grant DBI-0752017). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
A supplement to this article is available online (10.1175/BAMS-D-17-0307.2).