1. Introduction
The vertical exchange of CO2 between the terrestrial biosphere and the atmosphere constitutes the largest, single-component flux in the global carbon cycle (e.g., Beer et al. 2010). Spatiotemporal patterns of flux exchange display pronounced variability between regions. The Pacific Northwest (PNW) of the United States represents one of the strongest carbon sinks in North America (e.g., Law et al. 2004; Law and Waring 2015). Accurate quantification of the magnitude of this sink is critically important for understanding current and future carbon cycles from regional to global scales.
Over the last decade, atmospheric inverse modeling approaches, mostly based on Bayesian optimization, have been widely applied to constrain terrestrial biosphere carbon fluxes over various temporal scales and at spatial scales ranging from state to regional (Göckede et al. 2010a; Lauvaux et al. 2012b; Schuh et al. 2013) and from continental to global (e.g., Rödenbeck et al. 2003; Bousquet et al. 2011; Carouge et al. 2010; Turner et al. 2013). To constrain rates of surface–atmosphere flux exchange within the target domain, the inverse top-down approach generally uses time series of atmospheric mixing ratios in combination with atmospheric transport models, where a priori flux fields can optionally be provided by biosphere models with different levels of complexity (Enting 2005). Top-down modeling can also serve as an independent validation for spatiotemporal flux fields produced by biosphere process models. By linking these prior flux fields to spatiotemporally corresponding observations of atmospheric CO2, inverse modeling provides an indispensable tool to evaluate the representativeness of the bottom-up modeling products, many of which still fail to capture the interdependent biosphere and atmospheric processes in the detail needed (Schwalm et al. 2010; Keenan et al. 2012; Richardson et al. 2012; Lichstein et al. 2014).
Regional assessments of the terrestrial carbon fluxes are especially challenging because they need to be nested into global-scale grids that provide boundary conditions accurately representing processes outside the modeling domain. In addition, assessments need to be made at scales appropriate for decision-making, which requires representation of finescale processes that are usually missing from coarse-scale applications.
Our study region is ideal for the development of a modeling framework that could be applied in other regions with a complex environment. With increased spatial resolution, more details about the parameterized environment can be resolved by the modeling framework, allowing one to appropriately represent finescale processes that are usually missed in larger-scale applications. The PNW is characterized by a highly heterogeneous landscape with finescale vegetation mosaics and pronounced climate gradients that make the state of Oregon particularly interesting for high-resolution modeling efforts to simulate the net ecosystem exchange (NEE) of CO2. Turner et al. (2011) compared different approaches to estimate annual carbon budgets of Oregon and found generally higher net ecosystem production (NEP) for the top-down approach compared to a relatively simple bottom-up ecosystem process model Biome-BGC (Biogeochemical; Thornton et al. 2005). The CO2 budgets from the two approaches differed by more than 80%, highlighting significant differences in results between the bottom-up and top-down perspectives.
Both modeling techniques are affected by uncertainties that make direct comparisons challenging. Atmospheric inverse modeling uses measurements of atmospheric mixing ratios of a trace gas to constrain surface fluxes of the same gas to match those mixing ratio observations. This is achieved by linking the mixing ratios to the fluxes through a transport model (Gerbig et al. 2003). While accounting for uncertainties associated with mixing ratio observations, the transport model, and the uncertainties of the fluxes before the assimilation (prior fluxes), the inferred flux estimates (posterior fluxes) minimize a cost function that mathematically accounts for differences between the observed and modeled values (Enting 2005). When applied to larger regions, this method is often influenced by a limited number of available observations and constraining information about uncertainties that is needed for every grid cell of a model domain (e.g., Tans et al. 1996; Kasibhatla et al. 2000; Göckede et al. 2010a).
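In a generic linear-Gaussian notation, the cost function described above can be written as follows. The symbols are assumptions chosen for illustration and are not necessarily those used elsewhere in this paper: x is the state vector (fluxes or scaling factors), x_p its prior estimate, y the vector of observed mixing ratios, H the transport (footprint) operator, B the prior error covariance matrix, and R the covariance matrix of observation and transport errors.

```latex
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{y}-\mathbf{H}\mathbf{x})^{\mathsf{T}}\,\mathbf{R}^{-1}\,(\mathbf{y}-\mathbf{H}\mathbf{x})
              + \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_p)^{\mathsf{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_p)

\hat{\mathbf{x}} = \mathbf{x}_p + \mathbf{B}\mathbf{H}^{\mathsf{T}}
                   \left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathsf{T}}+\mathbf{R}\right)^{-1}
                   (\mathbf{y}-\mathbf{H}\mathbf{x}_p)
```

The first term penalizes mismatch between observed and modeled mixing ratios, the second penalizes departure from the prior; for a linear transport operator and Gaussian errors, the second expression is the minimizer of J.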
It is especially challenging to comprehensively assign accurate prior uncertainties because the best estimate, which would allow exact error quantification by comparison, is what the inversion process is actually trying to derive by assimilating observations. Options for estimating prior flux uncertainties are limited because fluxes from eddy covariance towers are usually only available for a few sites scattered over a large region. In many cases, reference flux data are not available at all for distinct environmental conditions (vegetation cover, soil properties, and climate). Hence, prior flux rates as well as their uncertainties need to be based on extrapolation of site level data through, for example, geostatistical methods or mechanistic prior models (e.g., Michalak et al. 2004).
The accuracy of the error estimates significantly affects the outcome of any variational data assimilation process. Therefore, accurate estimates for both the prior model error and the error of the constraining measurements are crucial for atmospheric inversion approaches (Enting 2005; Tarantola 2005). The measurement errors and transport model errors, summarized in the so-called model–data mismatch error matrix, therefore need to be addressed in a systematic manner. Reference values for a direct error assessment such as calibration gases or airborne measurements of boundary layer heights and wind fields can be used to quantify the measurement errors. The gridded prior model results usually lack such reference values, at least for the majority of cells of a regional or global grid. More recently, analyses used data from large research networks such as FLUXNET (http://www.fluxdata.org and http://fluxnet.ornl.gov). The integration of such databases provides measured reference values to improve model parameterization directly and facilitates the derivation of sound estimates of the errors of fluxes from land models (Williams et al. 2009; Wang and Mo 2015).
Here, we apply a Scaling Factor Bayesian Inversion (SFBI; e.g., Gerbig et al. 2003; Jeong et al. 2012) in the observation space to optimize NEE and the corresponding annual NEP in Oregon. For the measurements of CO2 mixing ratios and fluxes, we use an observation network of five high-precision and accuracy NOAA Global Greenhouse Gas Reference Network CO/CO2 observation towers and three eddy covariance flux systems that provide direct measurement reference values for NEE. Our prior CO2 flux estimates were calculated with the Community Land Model, version 4.5 (CLM4.5), with a 1/24° × 1/24° spatial resolution and 3-hourly temporal resolution. Because of the substantial influence of the observations on posterior results, we use a combination of Bayesian inference and geostatistical approaches including experimental semivariogram analyses and Kriging to systematically quantify prior model uncertainties. To find the best compromise between aggregation errors and the degree of freedom in the inverse system, we developed a machine-learning approach for the spatial structuring of our domain. The unsupervised two-step clustering approach is based on spatial flux patterns as modeled by CLM4.5 and information about the land surface, soil properties, and the composition and distribution of the vegetation. Combining these Bayesian and geostatistical approaches, we present a framework to constrain statewide NEP with the tower infrastructure available in Oregon. Figure 1 shows the steps in our inversion framework and how they are connected. The data and analysis steps are presented in detail in section 2.
2. Data and methods
2.1. Tower network and atmospheric observation data
Oregon is characterized by significant micro- to mesoscale variability in climate and vegetation characteristics. The crest of the Cascade Mountain Range creates a pronounced precipitation gradient, roughly splitting the state into the mesic western part, dominated by highly productive and managed Douglas fir forests, a large valley intensively used for agriculture, and a semiarid eastern part mainly consisting of open ponderosa pine forest and juniper–sagebrush–grass communities (Figure 2).
The locations of the observation towers (Table 1) were selected to optimally represent the distinct environmental conditions found in Oregon along a strong climate gradient and changes of land cover. From west (wet) to east (dry), the towers are the Mary’s Peak mountain-top tower in the Coast Range for incoming air (OMP), Douglas fir in the Coast Range (OWA), Willamette valley crops (OSI), eastern Cascades slopes (OMT), and the northern basin and range in the High Desert (ONG). The towers are equipped with profile systems that measure high-precision, well-calibrated CO and CO2 mixing ratio time series. Meteorological measurements include wind speed, wind direction, solar radiation (total radiation, diffuse radiation, and photosynthetically active radiation), air temperature, and relative humidity. Picarro Cavity Ring-Down Spectrometer (CRDS, model G2302) analyzers are employed at four of the sites for CO2 and CO mixing ratios (Crosson 2008) and the Mature Pine site (OMT) is equipped with a closed-path infrared gas analyzer (IRGA, model LI840). To align all observation data with WMO standard references, measured mixing ratios are linearly corrected using CO and CO2 calibration gases that are sampled every 50 h. Because of the generally larger sensor drift of the IRGA (Schmidt et al. 2012; Andrews et al. 2014) compared to the CRDS analyzer, a sequence of calibration gases including a CO2-free (zero) value is sampled every 4 h at the OMT tower. Schmidt et al. (2014) present further information about the mixing ratio measurement setup for towers with CRDS devices. Details about the IRGA-based measurements’ setup at the OMT pine forest tower are provided by Göckede et al. (2010a). 
The OMT tower is also equipped with an eddy covariance (EC) flux system consisting of an open-path IRGA LI7500 and a CSAT3 sonic anemometer mounted at 34 m above ground, which measures fluxes above a vast mature ponderosa pine (Pinus ponderosa) forest in the eastern Cascade Mountains (Vickers et al. 2012). Further details on the towers and measurements can be found in appendix A.
Table 1. Observation towers in the Oregon network used for this study. Sites that are equipped with Picarro CRDS CO2/CO/H2O analyzers are marked with an asterisk (*). Towers that provide eddy covariance flux measurements are marked with a plus sign (+), and AmeriFlux towers are marked with an octothorpe (#).
Only mixing ratio data from periods with well-mixed atmospheric conditions were used for the inversion to ensure suitable performance of the transport model. After a first outlier removal, an additional two-step data filter was applied to avoid biases during the inversion process that are introduced due to weakly mixed conditions in the boundary layer. First, only mixing ratio values from 1200 to 1800 local time were used in this study. Using only noon and afternoon values has been applied as a basic filter criterion in related studies to ensure well-developed boundary layer conditions (e.g., Schuh et al. 2010; Chen et al. 2015; Wu et al. 2016; Henne et al. 2016). As a second quality criterion, we used the vertical profile mixing ratios available at the OWA, OSI, and ONG towers. Mixing ratio measurements were used if the difference between the uppermost and the lowest inlet height for the 3-hourly averages was <1 ppm.
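The two-step filter described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the `Record` type and its field names are hypothetical, the inclusive treatment of the 1200 and 1800 boundaries is an assumption, and towers without a vertical profile are assumed to pass the second step unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One 3-hourly average at a tower (hypothetical structure)."""
    local_hour: float            # local time of the average, in hours
    top_ppm: float               # CO2 at the uppermost inlet
    bottom_ppm: Optional[float]  # CO2 at the lowest inlet, None if no profile


def is_well_mixed(rec: Record, max_profile_diff_ppm: float = 1.0) -> bool:
    # Step 1: keep only noon-to-afternoon values (1200 to 1800 local time).
    if not (12.0 <= rec.local_hour <= 18.0):
        return False
    # Step 2: where a profile exists, require top-bottom difference < 1 ppm.
    if rec.bottom_ppm is not None:
        return abs(rec.top_ppm - rec.bottom_ppm) < max_profile_diff_ppm
    # Assumption: towers without profile data are kept after step 1.
    return True


def filter_records(records: List[Record]) -> List[Record]:
    return [r for r in records if is_well_mixed(r)]


example = [
    Record(13.0, 400.0, 400.5),  # kept: afternoon, small gradient
    Record(9.0, 400.0, 400.2),   # dropped: morning
    Record(15.0, 400.0, 402.0),  # dropped: gradient of 2 ppm
    Record(16.0, 399.0, None),   # kept: afternoon, no profile available
]
kept = filter_records(example)
```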
For the inversions, the mixing ratios from the uppermost inlet of each tower were used to achieve the highest spatial representativeness of the mixing ratios and the corresponding surface influence strength matrices, hereafter simply referred to as footprints (Lin et al. 2003).
2.2. The terrestrial biosphere model CLM
The CLM4.5 used to calculate our prior flux estimates is the land component of NCAR’s Community Earth System Model (CESM). CLM is a fully prognostic biosphere model with respect to all carbon and nitrogen cycles of vegetation, litter, and soil organic matter (Oleson et al. 2013). CLM was run in stand-alone mode decoupled from the climate model component of CESM. The downscaled MIROC5 climate data (Watanabe et al. 2010) from phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012) were used as input for the CLM calculations. The Multivariate Adaptive Constructed Analogs (MACA) method (Abatzoglou and Brown 2012; Abatzoglou 2013) was applied to downscale the climate data from the original horizontal resolution of the global climate models to our model grid resolution of 1/24° × 1/24° (approximately 4 km × 4 km in Oregon). Further details about the downscaled climate dataset are given in appendix B.
Distributions of subgrid fractional areas were assigned to individual plant functional types (PFTs) to allow a high-resolution description of the terrestrial biosphere in the surface parameter map and associated parameterizations. In addition to the percentage PFT distribution in each grid cell, the land surface in our CESM CLM setup is comprehensively parameterized with characteristics including soil composition and properties, reflectance for various bands governing the radiation budget, reflective and thermodynamic properties of buildings and streets, and topography (Oleson et al. 2013).
Our PFT parameterization of plant traits was based on Hudiburg et al. (2013a). Model parameters and plant physiological traits such as the fraction of nitrogen used by RuBisCo (FLNR) were obtained from our plot data on tree species in the region (Law and Berner 2015) and AmeriFlux A/Ci curves archived by the LeafWeb project (Sun et al. 2014). The land-cover data were derived from the NALCMS dataset (USGS 2013) with a spatial resolution of 250 m. Soil data from SSURGO/STATSGO2 were used to update the soil texture layer of CLM (Soil Survey Staff 2015). Soil colors were derived from Lawrence and Chase (2007). The biogenic volatile organic compounds model was updated to the Model of Emissions of Gases and Aerosols from Nature, version 2.1 (MEGAN2.1; Guenther et al. 2012).
The model’s major PFTs in the Oregon domain are needleleaf evergreen temperate trees, broadleaf deciduous temperate trees, broadleaf evergreen temperate trees, broadleaf evergreen shrubs, nonarctic C3 grass, and C3 crops (Bonan et al. 2002). Evergreen needleleaf species dominate the region, yet they are very different in climate sensitivities, so we used tree species traits from our plot data in the region (Law and Berner 2015). We separated the physiological model parameter set of this PFT class into three subclasses of (i) Douglas fir that dominates the western forests in the Coast Range Mountains and is also abundant in the western Cascades region and the Blue Mountains in northeastern Oregon, (ii) ponderosa pine that dominates the eastern Cascade Mountain Range and also has a significant abundance (9% of vegetation land cover) in the Blue Mountains region, and (iii) juniper woodlands that cover a transitional zone between the pine forest and the semiarid grass and shrubland areas in southeastern Oregon, where geographic characteristics of the northwestern Great Basin dominate the landscape and vegetation (Figure 2). Southeastern Oregon also exhibits areas with significant juniper coverage, particularly at the higher elevations of the Steens Mountain Wilderness area.
As a detailed mechanistic model, CLM4.5 aims to describe the physical and physiological processes involved in the terrestrial carbon cycle as realistically as possible. Each PFT is described by a set of 122 specific parameters (CLM release 4.5.1_r119). This complexity comes at a cost: the high number of parameters makes it increasingly difficult to improve results by refining selected single algorithms or parameters because many interdependencies exist between the environmental, climatological, and physiological functions simulated in the model. On the other hand, this complexity makes CLM4.5 a well-suited candidate for coupling with atmospheric top-down approaches because it covers many details that cannot be constrained through inversions. At the same time, the top-down inversion imposes bulk scaling factors to nudge net fluxes and appears to be the best way to improve the large-scale representativeness of biosphere fluxes of CO2 as modeled by CLM4.5. In addition to quantifying the terrestrial carbon budget better by inferring corresponding atmospheric measurements, posterior results can also be used to discover weaknesses of complex mechanistic models like CLM and to improve parameterizations or algorithms.
2.3. Atmospheric transport model setup
We used the Stochastic Time-Inverted Lagrangian Transport (STILT) model to calculate the sensitivity of measured CO2 mixing ratios to the gridded NEE over the derived tower footprints (Lin et al. 2003; Nehrkorn et al. 2010). The 4D meteorological fields used in STILT were calculated with the Weather Research and Forecasting (WRF) Model in its Advanced Research WRF configuration (WRF-ARW, version 3.7; Michalakes et al. 2001). For the WRF boundary conditions, we used the NCEP Final (FNL) operational global analysis data with 1° horizontal resolution, 6-hourly temporal resolution, and 27 vertical levels (Kalnay et al. 1990). The NCEP FNL is produced using the Global Data Assimilation System (GDAS), which continuously collects observational data from various sources. The NCEP FNL data include variables that are used as boundary conditions for the outer WRF Model domain and STILT meteorological drivers (e.g., surface pressure, geopotential height, temperature, sea surface temperature, soil values, ice cover, relative humidity, u and v winds, and vertical motion). A domain setup with three nested grids (18-, 6-, and 2-km resolution) was used for the WRF runs, with the highest-resolution grids centered on the towers to improve representation of the near-field emissions in the footprints.
The WRF output time step was chosen at 10-min resolution to serve as input for the coupled Lagrangian particle dispersion model. WRF Model physics followed the settings in Göckede et al. (2010a). The number of particles released at the receptor points in STILT was set to 500 for the backward trajectory calculations, and particle trajectories were simulated for 72 h backward in time. The WRF–STILT trajectories were used to derive the tower footprints and the initial inflow mixing ratios of CO and CO2 (background). We applied boundary conditions that were specifically provided by NOAA ESRL for the Pacific coast region (see section 2.5.2.1 and appendix C for more details).
2.4. Bayesian inversion framework
Partitioning of the domain and aggregation in the inverse framework
Most atmospheric Bayesian inversions are limited by data availability from the atmospheric mixing ratio monitoring network, that is, the total number of observations and number of observation sites. As a consequence, the degrees of freedom in the constrained flux fields need to be reduced to avoid equifinality effects as well as implausible results for parts of the model grid with low data availability.
One approach is to apply aggregation to reduce the degrees of freedom (e.g., Bocquet et al. 2011). Multiplying the 25 920 grid cells (216 east to west × 120 north to south) distributed over our entire model domain with the number of time steps (2920 yr−1 at 3-h resolution) results in over 75 million fluxes to be solved by the inversion. With five available measurement locations for atmospheric CO2 mixing ratios and the diminishing source weight of the tower footprints with increasing distance from the tower, there is not enough information from the atmospheric monitoring network to constrain the fluxes sufficiently using a pixel-based inversion. Accordingly, we use spatial aggregation to arrive at meaningful inversion results. However, this aggregation should target the best compromise between retaining crucial details in surface structure while taking into account the information content from the available tower observations.
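The size of the pixel-based problem follows directly from the grid and time-step counts given above. The short calculation below reproduces it and contrasts it with an aggregated state vector; the helper `aggregated_state_size` and the zone count passed to it are purely illustrative, since the number of aggregation areas is not stated in this paragraph.

```python
# Grid and time dimensions taken from the text.
n_cells = 216 * 120               # east-west x north-south grid cells
steps_per_year = 365 * 24 // 3    # 3-hourly steps in a non-leap year
n_fluxes = n_cells * steps_per_year

print(n_cells, steps_per_year, n_fluxes)  # 25920 2920 75686400


def aggregated_state_size(n_zones: int, n_months: int = 12) -> int:
    """Unknowns per year when one scaling factor is solved per zone and month."""
    return n_zones * n_months


# Hypothetical zone count, for illustration only.
n_unknowns_example = aggregated_state_size(n_zones=10)
```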
When aggregating to larger areas in order to retain sufficiently homogeneous land surface units, caution is needed to limit averaging and aggregation errors (Kaminski et al. 2001; Engelen et al. 2002; Göckede et al. 2010a; Turner and Jacob 2015). One important aspect is the choice of an appropriate spatial aggregation of the fluxes or, as in our case, their respective scaling factors assembled in the state vector that is to be optimized. If the state vector is aggregated and adjusted over areas that are too large, the finescale heterogeneity of the fluxes causes averaging errors and the fluxes are not representative for the vegetation and soil types in the assigned areas. Yet, if the spatial resolution of the fluxes is too fine, the number of degrees of freedom becomes large and jeopardizes the ability of the limited observations to sufficiently constrain the state vector.
Different approaches have been applied in related atmospheric inversion studies to address and limit the aggregation error, which is generally difficult to quantify exactly (e.g., Kaminski et al. 2001; Engelen et al. 2002; Thompson et al. 2011; Wu et al. 2011). Spatial aggregation areas for the state vector have been defined in very different ways, including predefined classifications of surface types based on, for example, ecoregions and land-cover classes (e.g., Göckede et al. 2010a) or statistical methods such as principal component analyses or Gaussian mixture models in combination with radial basis functions (Turner and Jacob 2015).
Here, we apply a data-driven approach for the spatial structuring of our domain that was informed by an unsupervised clustering approach to find the best compromise between aggregation errors and the degrees of freedom in our inverse system. Based on the percentage distribution of the PFTs, soil properties (amount of organic matter, sand/clay/silt composition), foliar traits, the probability densities of monthly prior flux estimates in the model domain, and the spatial assignments of grid cells to the nine Omernik Level III ecoregions in Oregon (Omernik 1995; Thorson et al. 2003), a self-organizing map neural network (SOM; Kohonen 2001) with 256 nodes arranged in a 16 × 16 hexagonal neuron grid was used to cluster the model grid cells for spatial flux aggregation. The SOM grid cell groups were further aggregated into the final areas by a subsequent k-means clustering (e.g., MacQueen 1967; Jain et al. 1999) with 1000 iterations and 50 replicate runs, each using varying initial cluster centroid positions. In contrast to classical clustering approaches, this two-step approach not only updates the winning neurons (cluster centroids) during the SOM learning phase but also their neighbors in a defined range. Its neural interconnectivity gives the SOM a better ability to recognize similarities of high dimensional datasets with smooth boundaries in variable space (e.g., Vesanto and Alhoniemi 2000; Clare and Cohen 2001; Schmidt et al. 2011). To unambiguously distinguish the derived areas from the USGS ecoregion classification, we hereafter refer to our aggregation areas as ecological feature zones (EFZs). Each EFZ exhibits strong homogeneity in terms of PFT distribution, soil properties, and (prior) CO2 flux patterns. The monthly scaling factors are optimized for each EFZ specifically.
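The second clustering step can be illustrated with a small k-means routine that is restarted from different initial centroids. This is a sketch under stated assumptions, not the implementation used in the study: the SOM training itself is not reproduced, the input is assumed to be a list of SOM node codebook vectors, and selecting the replicate with the lowest within-cluster sum of squares is an assumed criterion. All function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points: List[List[float]], k: int, max_iter: int,
                rng: random.Random) -> Tuple[List[int], float]:
    # Start from k distinct input vectors chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:            # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def kmeans_best(points: List[List[float]], k: int, n_replicates: int = 50,
                max_iter: int = 1000, seed: int = 0) -> Tuple[List[int], float]:
    """Run k-means n_replicates times and keep the lowest-inertia solution."""
    rng = random.Random(seed)
    best: Optional[Tuple[List[int], float]] = None
    for _ in range(n_replicates):
        result = kmeans_once(points, k, max_iter, rng)
        if best is None or result[1] < best[1]:
            best = result
    return best


# Toy stand-in for SOM node vectors: two well-separated groups.
nodes = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, inertia = kmeans_best(nodes, k=2)
```

The defaults mirror the 1000 iterations and 50 replicate runs quoted above; the fixed seed only makes the toy example reproducible.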
2.5. Error assessment for the inverse framework
2.5.1. Prior error covariance matrix
The primary data source to constrain prior flux errors in this study was time series of surface–atmosphere exchange fluxes provided by eddy covariance sites within the target domain. We calculated the prior flux uncertainty for each grid cell, represented by the diagonal elements of the prior error covariance matrix.
Generally, the model underestimates carbon uptake in summer and underestimates respiration (positive NEE), which dominates the average net exchange during fall and winter. This especially applies to the grass and shrubland areas in the High Desert region, where the measured respiration exceeds the prior monthly respiration values by about 11% (average of October–January over all 3 years). The residuals were assigned in quadrature to the diagonals of the prior error covariance matrix.
Although often used to quantify the model error, measured NEE fluxes are not necessarily the “true value” sought after, but the direct measurement of NEE offers the most straightforward estimate to quantify the initial error variance or covariance values, respectively. To account for errors of the EC measurements used to populate
Because direct comparison of EC flux data and model output for error estimation can only be implemented for the footprint area of three flux sites within the model domain, an extrapolation procedure needs to be designed to fill the entire
An experimental semivariogram over the annual prior NEE from 2012 to 2014 was calculated for the high spatial resolution of the 1/24° × 1/24° model grid. Generally, a semivariogram gives the variance of the residuals between variables at two locations, thereby quantifying the relation of spatially distinct variables with distance and, in case of an anisotropic semivariogram, with direction (Cressie 1993). The maximum correlation distance between prior fluxes at different locations was found to be about 100 km, where the semivariogram function converges at its sill (Figure 3a). In addition to the isotropic semivariogram, we also calculated a directional semivariogram. For the directional analysis the distance-dependent variance was calculated for 15° angular sections separately (Cressie 1993; Cressie and Wikle 2011). The result shows a maximum difference of 24% in the gradients of the variance as a function of distance for different directions (Figure 3b). Therefore, universal Kriging was applied for the geostatistical interpolation, as the universal algorithm not only accounts for changes in variance due to a changing distance between points but also considers changes as a function of direction. Detailed descriptions of the various Kriging methods can be found in, for example, Cressie (1993), Stein (1999), or Hengl (2007). To fit the experimental semivariogram to a theoretical semivariogram, as needed for the spatial interpolation with Kriging, an exponential variance decay model (McBratney and Webster 1986) was found to describe the observed semivariogram best (Figure 3a).
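The two ingredients of the semivariogram analysis, the experimental estimate and the exponential model fitted to it, can be sketched as below. This is an illustration only: it covers the isotropic case, assumes planar coordinates in kilometres, and uses the common "practical range" convention for the exponential model (factor 3 in the exponent), which may differ from the parameterization used in the study. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def experimental_semivariogram(coords: Sequence[Sequence[float]],
                               values: Sequence[float],
                               bin_width: float,
                               max_dist: float) -> Dict[float, float]:
    """Return {bin centre: gamma}, gamma(h) = sum((z_i - z_j)^2) / (2 N(h))."""
    n_bins = int(max_dist // bin_width)
    sums: List[float] = [0.0] * n_bins
    counts: List[int] = [0] * n_bins
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = math.dist(coords[i], coords[j])   # separation distance
            b = int(h // bin_width)               # distance bin index
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return {(b + 0.5) * bin_width: sums[b] / (2 * counts[b])
            for b in range(n_bins) if counts[b] > 0}


def exponential_model(h: float, nugget: float, partial_sill: float,
                      practical_range: float) -> float:
    """Exponential semivariogram; about 95% of the sill at practical_range."""
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / practical_range))


# Tiny worked example: three points on a line, 1 km apart.
sv = experimental_semivariogram(
    coords=[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    values=[1.0, 2.0, 4.0],
    bin_width=1.0,
    max_dist=3.0,
)
print(sv)  # {1.5: 1.25, 2.5: 4.5}
```

A directional semivariogram would additionally bin each pair by the bearing between the two points (15° sectors in the analysis above) before accumulating the squared differences.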
Based on the theoretical semivariogram calculated from the prior NEE estimates in combination with the error variances known from the available EC stations, the spatial Kriging interpolation algorithm assigned error variances to all cells of the model grid. The error variance values were then averaged for each EFZ to populate the diagonal elements of the prior error covariance matrix.
2.5.2. Assessment of observational errors
2.5.2.1. Biospheric CO2 signal error Sveg
The first component of Sveg is the error of the actual CO2 mixing ratio measurements Smeas. This includes the instrument uncertainty, the uncertainty added for water vapor correction when calculating dry mixing ratios, the uncertainty of calibration gases used for the linear regression correction of the raw measurement values, and the uncertainty of the WMO primary standard gas cylinders provided by the Central Calibration Laboratory at NOAA, which served as final long-term reference after calibration (Schmidt et al. 2014). More information on the calibration procedure is given in appendix A. Measurement uncertainties differ slightly between the various measurement systems with values ranging from 0.16 to 0.21 ppm for the five observation towers.
The second component contributing to Sveg is the uncertainty of the advected background mixing ratios Sbg that were removed from the tower measurements to assign observed changes in CO2 solely to the biosphere. Daily background mixing ratios of atmospheric CO and CO2 are spatially arranged in a curtain covering a height profile from 500 to 6500 m with a latitudinal resolution of 1°. The background mixing ratios also have an error estimate assigned specifically for each time, height, and latitude. The RMSE values are based on the residuals of the data used to construct the boundary curtains. A description of the methodology used to create the background product provided by NOAA ESRL can be found in Jeong et al. (2013).
The CO2 background values at the tower locations in Oregon were calculated as an average of the CO2 background mixing ratios at the latitudes that the 500 released particles crossed at the end of their STILT backward trajectories. Accordingly, the average RMSE values were used to assign the uncertainty of the background values calculated at the towers (Table 2). The calculated error variances Sbg (squared RMSE) for the study period range from 0.49 ppm to maximum values of 3.67 ppm, depending on date and tower location. More details about the calculation of the background mixing ratios at each tower are given in appendix C.
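The averaging over particle end points can be sketched as follows. This is a simplified illustration: the height dimension of the boundary curtain is omitted, the curtain is represented as hypothetical dictionaries keyed by integer latitude (matching its 1° resolution), and all numbers are invented for the example.

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def tower_background(end_lats: Sequence[float],
                     bg_ppm: Dict[int, float],
                     bg_rmse: Dict[int, float]) -> Tuple[float, float, float]:
    """Average background and RMSE over particle end points.

    Returns (background in ppm, mean RMSE, squared mean RMSE used as variance).
    """
    # Assign each particle end point to the nearest 1-degree latitude band.
    bands = [round(lat) for lat in end_lats]
    background = mean(bg_ppm[b] for b in bands)
    rmse = mean(bg_rmse[b] for b in bands)
    return background, rmse, rmse ** 2


# Illustrative values only (four particles instead of 500).
bg, rmse, variance = tower_background(
    end_lats=[44.2, 44.4, 45.3, 45.1],
    bg_ppm={44: 400.0, 45: 402.0},
    bg_rmse={44: 1.0, 45: 2.0},
)
```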
Table 2. RMSE ranges for the modeled background mixing ratios.
The third component of Sveg is Sff, the uncertainties introduced by anthropogenic fossil fuel emissions that are not already considered in the large-scale background product but originate from regional and local emissions. In our study domain anthropogenic emissions are dominated by few isolated, larger urban areas in the Willamette valley. We developed an approach to sufficiently remove the traffic-related portion of CO2 as the most important anthropogenic source of CO2 in Oregon [Oregon Department of Environmental Quality (ODEQ) 2015] using CO:CO2 emission ratios measured during a mobile campaign in 2012 in combination with mixing ratios data measured at the OSI tower (Schmidt et al. 2014).
The fossil fuel offsets were removed using the CO:CO2 ratios from road traffic sources after removing a baseline from the observations. The use of CO as a combustion tracer and for inverse model studies was successfully demonstrated in previous studies (e.g., Gerbig et al. 2003; Vardag et al. 2015). In contrast to the four other observation towers that are located in rural and remote regions, regional and local anthropogenic emissions of CO2 significantly affect the measured mixing ratios at the OSI tower that represents the large Willamette valley ecoregion. Following Schmidt et al. (2014), we used the measured CO mixing ratios as a tracer for the corresponding CO2 emission offset in combination with a weight factor based on the composition of all fossil fuel–related CO2 emissions in Oregon. This method was applied at all towers and resulted in an additional uncertainty of 0.86 ppm.
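A CO-based correction of this kind can be sketched as below. The sketch rests on several assumptions that are not specified in the text: the emission ratio is expressed in ppb CO per ppm CO2, the weight factor is applied as a simple multiplier, and negative CO excess is clipped to zero. Function names and numerical values are hypothetical.

```python
def fossil_fuel_offset_ppm(co_ppb: float,
                           co_baseline_ppb: float,
                           co_per_co2_ppb_per_ppm: float,
                           weight: float = 1.0) -> float:
    """Estimate the fossil fuel CO2 offset from the CO excess over a baseline."""
    # Assumption: CO below the baseline carries no fossil fuel signal.
    excess_co = max(co_ppb - co_baseline_ppb, 0.0)
    # Convert CO excess to CO2 via the emission ratio; the weight is assumed
    # to scale the traffic-derived value to all fossil fuel sources.
    return weight * excess_co / co_per_co2_ppb_per_ppm


def biospheric_signal_ppm(co2_ppm: float,
                          background_ppm: float,
                          ff_offset_ppm: float) -> float:
    """Observed CO2 minus advected background and fossil fuel offset."""
    return co2_ppm - background_ppm - ff_offset_ppm


# Illustrative numbers only.
ff = fossil_fuel_offset_ppm(co_ppb=150.0, co_baseline_ppb=120.0,
                            co_per_co2_ppb_per_ppm=10.0)
veg = biospheric_signal_ppm(co2_ppm=405.0, background_ppm=398.0,
                            ff_offset_ppm=ff)
```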
2.5.2.2. Stochastic particle dispersion error Spart
Spart was quantified statistically by repeating the STILT runs 50 times with unchanged model settings for the five towers over 1 year (2013) at 3-hourly time steps. Because of the stochastic nature of the error and of the error assessment method, we assume that this error does not change significantly between years. The RMSE of the modeled CO2 mixing ratios was calculated for all towers and all runs and averaged 0.56% of the total modeled CO2 concentration, including the background level and the anthropogenic fossil fuel emission offset, leading to values for the Spart-related component of 2.04 to 2.38 ppm.
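One way to compute such a spread from repeated runs is shown below. The exact definition used in the study is not given, so this sketch assumes the RMSE is the root-mean-square deviation of the replicate runs from their ensemble mean at each time step, averaged over time and expressed relative to the mean modeled concentration. The function names and the two-run toy data are hypothetical.

```python
import math
from statistics import mean
from typing import List, Sequence, Tuple


def ensemble_rmse(runs: Sequence[Sequence[float]]) -> List[float]:
    """RMS deviation from the ensemble mean at each time step; runs[r][t]."""
    n_steps = len(runs[0])
    out = []
    for t in range(n_steps):
        vals = [run[t] for run in runs]
        m = mean(vals)
        out.append(math.sqrt(mean((v - m) ** 2 for v in vals)))
    return out


def relative_particle_error(runs: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (mean RMSE in ppm, RMSE as a percentage of the mean concentration)."""
    rmse = mean(ensemble_rmse(runs))
    total = mean(v for run in runs for v in run)
    return rmse, 100.0 * rmse / total


# Two replicate runs with two time steps each (the study used 50 replicates).
toy_runs = [[399.0, 401.0], [401.0, 403.0]]
rmse_ppm, rmse_pct = relative_particle_error(toy_runs)
```

As a plausibility check on the reported numbers, 0.56% of a total mixing ratio near 400 ppm is roughly 2.2 ppm, which lies within the quoted range of 2.04 to 2.38 ppm.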
2.5.2.3. Transport model error Stransp
Flight data were used to calculate the transport model error Stransp, a composite error that includes errors of the modeled PBL height as well as errors of the modeled wind field. Seven profile flights at each tower location conducted from spring 2012 through the end of 2014 were used to compare the height of the planetary boundary layer as modeled by WRF 3.7 to the values derived from the airborne measurements. Observations were acquired from a modified Mooney TLS (M20M) aircraft equipped with a Picarro CRDS and meteorological sensors. The profile flights spiraled around the tower locations with the radii of the flight routes reaching distances from 0.2 to 2 km from the towers. Profiles were flown covering heights from about 200 to 3300 m above ground. Flight data during late fall and winter months (November–February) were not available. The flight routes between towers provide data for comprehensive sections through the model domain for the comparison of measured and modeled PBL height, wind speed, and wind direction. The airborne mixing ratio measurements show a strong mesoscale spatial variability of the tropospheric CO2 across the entire horizontal extent of the domain. Nevertheless, clear spatial patterns persisting over longer flight segments were observed with relatively steady mixing ratios (σ² < 2 ppm) that correspond to vegetation, topography, and temporally established atmospheric layers (Figure 4).
The PBL height was derived from the strongest vertical gradients (10th percentile of gradients) of three meteorological variables (virtual potential temperature, relative humidity, and wind direction) and the mixing ratios of CO2 and CH4 measured during the profile flights at the tower stations. An automated routine was applied to derive the PBL height estimates in a systematic and reproducible manner (Figure 5) instead of using visual inspection that is more prone to subjective decisions (Shaw et al. 2007; Schmid and Niyogi 2012). If a strong absolute gradient at a certain height was detected for at least three out of the five variables, we chose this point as the best PBL height estimate (Figure 5b). In cases where a strong vertical gradient was observed at more than one height with the same number of variables, the lowest of those heights was chosen as the PBL height following the theoretical definition of the PBL height (e.g., Seidel et al. 2010). The duration of the profile flight sections was 3 to 6 min. This campaign setup provides a localized average of the PBL, which spatially and temporally represents the corresponding point in space and time in the nested WRF Model domain (Figure 6).
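The voting logic of the automated routine can be sketched as follows. This is a minimal illustration, not the routine used in the study: it assumes that all variables are sampled on a common height grid, that a "strong" gradient is one whose absolute value ranks among the largest 10% of a profile, and that the PBL height is reported at the midpoint of the selected layer. The function names and the `top_fraction` and `min_votes` parameters are hypothetical.

```python
def strong_gradient_layers(heights, values, top_fraction=0.10):
    """Return indices of the layers whose absolute vertical gradient
    ranks among the strongest `top_fraction` of the profile."""
    grads = [
        abs((values[i + 1] - values[i]) / (heights[i + 1] - heights[i]))
        for i in range(len(heights) - 1)
    ]
    n_keep = max(1, int(round(top_fraction * len(grads))))
    ranked = sorted(range(len(grads)), key=lambda i: grads[i], reverse=True)
    return set(ranked[:n_keep])


def estimate_pbl_height(heights, profiles, min_votes=3):
    """Pick the PBL height from several profiles on a shared height grid.

    `profiles` maps a variable name to its list of values. A layer is a
    candidate when at least `min_votes` variables show a strong gradient
    there. Among candidates with the highest vote count, the lowest
    layer is chosen. Returns None when no layer collects enough votes.
    """
    votes = {}
    for values in profiles.values():
        for layer in strong_gradient_layers(heights, values):
            votes[layer] = votes.get(layer, 0) + 1

    candidates = {k: v for k, v in votes.items() if v >= min_votes}
    if not candidates:
        return None

    best = max(candidates.values())
    lowest = min(k for k, v in candidates.items() if v == best)
    # Report the midpoint of the selected layer (an assumption of this sketch).
    return 0.5 * (heights[lowest] + heights[lowest + 1])
```

With the five variables named in the text, `profiles` would hold virtual potential temperature, relative humidity, wind direction, CO2, and CH4. The tie-breaking rule mirrors the choice of the lowest height described above.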
Because only noon to afternoon mixing ratios were used for the inversion process, we only considered measured and modeled PBL heights during well-mixed conditions at noon to afternoon for the comparison with clearly visible vertical gradients of the meteorological variables. In particular, we used data with the air temperature decreasing vertically by 0.3 K (100 m)−1 or more. No systematic differences of errors among observation sites were observed based on a χ2 test and a 90% significance level. After comparison with the aircraft observations, we found an RMSE of 98.81 m corresponding to an average deviation of 8.1%.
To propagate the error into the sum of variances that build the diagonal elements of the model–data mismatch matrix
We derived errors for the two wind velocity components u and υ as calculated by WRF and used in STILT by simply decomposing the wind velocities measured during the flights. To compare measured values with the WRF estimates, we calculated 1-min averages of the airborne wind velocity, direction, and location when the distance was within a 500-m radius of a 3D grid cell within the 2 km × 2 km nested WRF domain, and no significant changes of altitude occurred during the averaging interval. We restricted the values used for the comparison to mean values with standard deviation equal to or less than 25% of the average of velocity and direction to further account for the effects of continuously changing measurement locations during the flights on fluctuations of wind measurements. This ensures the stability and representativeness of the measurements that were compared to the corresponding 3D grid cell values in the WRF domain. This left 224 data records available for the model–data comparison of the wind field. The stochastic error components of u and υ, respectively, were then propagated through STILT following the procedure as first described by Lin and Gerbig (2005). This yielded mixing ratio variations of 0.43 to 1.72 ppm caused by the error of the transport wind field, which corresponds to 4.7% to 19.1% of the vegetation signal of the mixing ratios used for the inversions.
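The decomposition and the stability criterion for the averaged flight records can be illustrated with a short sketch. It assumes the standard meteorological convention (direction is where the wind blows from, in degrees clockwise from north) and treats wind direction as a plain number, ignoring the wraparound at 0°/360°. The function names and the `max_rel_std` parameter are hypothetical and not taken from the study.

```python
import math
from statistics import mean, pstdev


def wind_components(speed, direction_deg):
    """Decompose wind speed and meteorological direction into (u, v).

    u is the eastward and v the northward component; the direction is
    the bearing the wind blows from.
    """
    theta = math.radians(direction_deg)
    return -speed * math.sin(theta), -speed * math.cos(theta)


def accept_interval(speeds, directions, max_rel_std=0.25):
    """Keep an averaging interval only if the standard deviation of both
    speed and direction is at most `max_rel_std` times the interval mean.

    Direction is treated as a linear quantity here, which is only
    reasonable away from the 0/360 degree wraparound.
    """
    for series in (speeds, directions):
        avg = mean(series)
        if avg == 0 or pstdev(series) > max_rel_std * abs(avg):
            return False
    return True
```

Intervals that pass `accept_interval` would then be averaged and decomposed with `wind_components` before being compared with the WRF values of the matching grid cell.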
2.5.2.4. Aggregation error Saggr
The aggregation error Saggr originates from optimizing fluxes of aggregated areas. The aggregated fluxes do not resolve the high resolution of flux patterns of the original prior flux model and therefore introduce an additional uncertainty; Saggr was estimated by calculating the RMSE between the 1/24° × 1/24° resolution NEE and the corresponding aggregated NEE values (Gerbig et al. 2003; Zhao et al. 2009) and applied to the model–data mismatch matrix for each EFZ and each month leading to a mean aggregation error of 5.9%.
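As an illustration of the RMSE calculation, the following sketch compares fine-grid NEE values with the value of the aggregated area each cell belongs to. It assumes that the aggregated value is the unweighted mean of the member cells, which the text does not state explicitly. The function and argument names are hypothetical.

```python
import math


def aggregation_rmse(fine_nee, zone_ids):
    """RMSE between fine-grid NEE and the mean NEE of each cell's zone.

    `fine_nee` and `zone_ids` are equally long sequences; `zone_ids`
    assigns every fine-grid cell to one aggregated area.
    """
    sums, counts = {}, {}
    for nee, zone in zip(fine_nee, zone_ids):
        sums[zone] = sums.get(zone, 0.0) + nee
        counts[zone] = counts.get(zone, 0) + 1
    zone_mean = {zone: sums[zone] / counts[zone] for zone in sums}

    squared = [(nee - zone_mean[zone]) ** 2 for nee, zone in zip(fine_nee, zone_ids)]
    return math.sqrt(sum(squared) / len(squared))
```

Calling the function once per EFZ and month would yield one error value for each element of the model–data mismatch matrix that the aggregation affects.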
2.5.2.5. The error components Socean and Seddy
The influence of neglected CO2 exchanges over the Pacific Ocean and the corresponding uncertainty contribution Socean is considered very small because the model domain only covers a small section of ocean at its western boundary (Figure 7). Moreover, the NOAA background product provides values specific for the Pacific Ocean coastal region, including large-scale influences of the ocean on the mixing ratios. Hence, we set Socean to a fixed value of 0.1 ppm following Göckede et al. (2010b), whose model domain had a similar extent. We also adopted their corresponding value of 0.5 ppm for the error component that accounts for eddies that cannot be resolved by the WRF–STILT model (Seddy).
2.5.3. Removing periods with fire emissions from observation data
Temporary yet potentially large biases introduced by CO2 emissions from biomass burning need to be avoided. Thus, fire events were excluded by combining time and location of wildfires in the Monitoring Trends in Burn Severity (MTBS) database with temporally corresponding tower footprints. With 72-h backward trajectories for the transport model, we assume that all fires that potentially could have affected the measurements at the tower were excluded. Offsets associated with large-scale synoptic transportation are incorporated in the large-scale background values that were subtracted. Furthermore, CO mixing ratios that serve as a tracer for emissions from biomass burning (e.g., Pechony et al. 2013) were used as an additional measure to ensure that the mixing ratio data were not affected by fire emissions. We applied a data filter with a 3-day moving window and removed data periods with peak values of CO > (mean + 3σ) from the data used for the inversions.
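A possible form of the CO filter is sketched below. It assumes a regular 3-hourly CO series, so that a 3-day window spans 24 samples, and it flags individual samples rather than whole data periods. Both choices, like the function name, are assumptions of this sketch rather than details given in the text.

```python
from statistics import mean, pstdev


def flag_co_peaks(co, window=24, n_sigma=3.0):
    """Flag samples exceeding mean + n_sigma * std within any moving window.

    `co` is a regularly sampled CO mixing ratio series; `window` is the
    window length in samples (24 samples = 3 days at 3-hourly steps).
    Returns a list of booleans, True where a sample should be removed.
    """
    flags = [False] * len(co)
    for start in range(len(co) - window + 1):
        block = co[start:start + window]
        threshold = mean(block) + n_sigma * pstdev(block)
        for offset, value in enumerate(block):
            if value > threshold:
                flags[start + offset] = True
    return flags
```

The CO2 observations at flagged times would then be dropped before the inversion, in addition to the periods identified from the MTBS fire records.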
The MTBS database provides information about wildfire events, including date, location, extent, and burn severity (Eidenshink et al. 2007). Most fires that affected tower observations occurred in the dry eastern Cascades and foothills, the High Desert region, and the eastern Blue Mountains. Tower measurements were affected by wildfire events or prescribed fires on 56 days during the study period (Figure 7). Periods when fire emissions affected the tower footprint areas were excluded from the inversion calculations, totaling 69 observations during 3-hourly periods.
3. Results and discussion
3.1. Aggregation results for Bayesian framework
Using the results of the machine-learning clustering procedure presented in section 2.4.1, the optimal number of aggregation areas was obtained by finding the minimum of the Davies–Bouldin validity index IDB (Davies and Bouldin 1979; Maulik and Bandyopadhyay 2002). The IDB accounts for the within-cluster distances between each cluster member and the center of the respective cluster as well as for the distances between the cluster centers, so that minimizing the index favors small within-cluster distances and large distances between cluster centers.
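For reference, the standard Davies–Bouldin index can be computed as in the sketch below. It uses Euclidean distances and the mean distance to the cluster center as the within-cluster scatter; whether the study used exactly this variant is not stated, and the function name is hypothetical. The sketch requires at least two clusters with distinct centers.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index for points (tuples of floats) and cluster labels.

    For each cluster i, take the worst-case ratio (S_i + S_j) / M_ij over
    all other clusters j, where S is the mean distance of members to the
    cluster center and M_ij the distance between centers. The index is
    the average of these worst-case ratios; lower values are better.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    centers = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)
```

Evaluating the index for a range of cluster counts and selecting the count at the minimum corresponds to the selection step described above.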
A stable minimum was reached for 27 clusters that represent our aggregated homogeneous areas for optimizing the fluxes. Some of the spatial clusters incorporate areas that are close simply in terms of their relative geographical locations corresponding to the ecoregion classification. Other EFZ clusters, however, are composed of areas that are not necessarily spatially close but similar in terms of their physical properties and traits accounted for during the clustering process.
To account for seasonal effects on the fluxes and mixing ratios, while also retaining a sufficient number of 3-hourly measurements to be incorporated in the inversion process, we calculated transient scaling factors for monthly intervals, resulting in 918 scaling factors (34 months × 27 EFZs) for modeled NEE. Due to low data availability, scaling factors for the first two months of 2012 remain 1 for all EFZs. Accordingly, the uncertainties (Figure 8) represent prior values for those months. Because NEE is small during winter months, the effect on the annual budget for 2012 is negligible. The calculated scaling factors λpost ranged from 0.795 to 1.484, depending on EFZ and month. Figure 8 shows the scaling factors averaged over all 27 EFZs and the corresponding average posterior uncertainty for each month.
3.2. Posterior fluxes and uncertainties
Posterior uncertainties, as quantified through the assigned posterior error covariance matrix [Equation (5)], vary among EFZs and time periods. The distribution of the scaling factor uncertainty reduction directly reflects the location and amount of available data. Hence, the greatest reduction of uncertainty from a 34-month average of 30.4% (prior) to 11% (posterior) was achieved for western Oregon including the Coast Range and the Willamette valley ecoregions where three towers (OWA, OSI, and OMP) are located, providing a strong footprint coverage associated with high measurement data density over time. The posterior uncertainty for all 3 years and EFZs averages 29% (Figure 8) and was reduced by 28.4% through the inversion process. The overall reliability of our posterior flux estimates was significantly improved compared to the prior estimates.
After optimizing the CLM4.5 NEE values for each grid cell assigned to its corresponding EFZ and monthly scaling factor, the statewide NEP averages 29.7 (±8.86) TgC yr−1 over the 3 years, indicating that the vegetation of the state acted as a strong carbon sink. The statewide NEP was higher than previous forest estimates averaged over multiple years from inventory data (15.2 ± 1.6 TgC yr−1, 1990–2001; Law et al. 2013), Biome-BGC (17.0 ± 10 TgC yr−1, 1990–2001; Turner et al. 2007), and CLM4.0 (12.8 TgC yr−1, 2001–06; Hudiburg et al. 2013b). Although our estimates include all land-cover types in the state, forests account for a majority of the total NEP. Both periods included drier than normal years, and 2001–03 reduced NEP by ~40% (Thomas et al. 2009).
The average difference between the prior and posterior annual estimates from 2012 to 2014 was 7.16 (± 1.55) TgC yr−1. This corresponds to a 32% increase in net uptake of CO2 after the inversion. The results show that the west–east gradient of the CO2 sink strength (positive values for NEP) is more pronounced in the posterior results compared to the model priors (Figure 9). Changes are dominated by increased NEP values in the western part of the state, whereas binned areas with a strong reduction of the annual NEP are limited to small patches in southeastern Oregon. The 3-yr average NEP was highest in the highly productive forests of the Coast Range, Cascades, and Klamath Mountains (9.23 ± 1.26, 6.45 ± 1.99, and 4.45 ± 1.99 TgC yr−1) and moderate in the croplands of the Willamette valley and semiarid eastern Cascades (2.81 ± 1.00 and 2.48 ± 1.48 TgC yr−1). It was lowest in the dry grasslands of the Columbia Plateau and the shrublands of the northern basin and Snake River Plain (0.81 ± 0.31, 0.72 ± 0.29, and 0.19 ± 0.13 TgC yr−1), areas that are located in the rain shadow of the Cascade Mountains in the west.
The relative ranking of NEP is more consistent with the observed spatial gradient of net primary production computed from forest inventory and plot data, which was 25.6, 14.1, 1, and 0.3 TgC yr−1 in the Coast Range, western Cascades, eastern Cascades, and northern basin for a statewide total of 109 TgC yr−1 (Hudiburg et al. 2009). The prior land model and the inverse approach using the mixing ratios and the footprints produced similar interannual patterns of NEP over the 3-yr period, with 2012 exhibiting the highest and 2013 exhibiting the lowest statewide NEP. Annual NEP from 2012 to 2014 was 39.52, 22.31, and 27.28 TgC, respectively. The temporal pattern strongly corresponds to interannual variation of precipitation in the most productive regions (Coast Range and western Cascades; Table 3). PRISM data (Daly et al. 2008) showed regional annual average precipitation of 2815, 1418, and 2277 mm in the Coast Range and 2304, 1333, and 2142 mm in the Cascades, respectively, from 2012 to 2014. Maximum posterior NEP values of 827 gC m−2 yr−1 in the mesic western fir forest region underline the importance of the Coast Range Mountain forests as CO2 sinks in the PNW region. Thus, an area that corresponds to ≈10% of the total area of Oregon accounts for 31% of the statewide NEP. The strong relative increase in NEP for that region compared to the prior (42%) cannot solely be explained by the relative data and footprint coverage because the neighboring area of the Willamette valley, with very similar data and footprint coverage, shows a much smaller relative increase in NEP after the inversion process (29%). A relative increase with the same order of magnitude was estimated in the fir and pine forests in the western Cascade Range Mountains where the average CO2 uptake was increased by 61.5% (2.48 TgC yr−1) after assimilation of the mixing ratio data.
Average posterior results and standard deviations (2012–14) of net ecosystem production of the ecoregions of Oregon.
The average annual NEP values for each ecoregion are given in Table 3 to facilitate better comparison with other studies. The largest changes of average net CO2 exchange after Bayesian optimization occurred in the western Coast Range region, where the net CO2 uptake was underestimated by 42% in our CLM4.5 prior estimates (Figure 10). NEP was increased for most areas except for the grassland and shrubland area of the northern Great Basin where the small annual uptake was decreased by 35% on average (Table 3). This decrease has a minor effect overall because NEE and NEP, respectively, are relatively small in this grassland and shrubland area with dry summers and cold winters and a short growing season.
Our posterior NEP estimates for the productive forested regions of the western Cascades and the Coast Range are roughly 2 times higher than previous studies using bottom-up approaches (Turner et al. 2007, 2011), while the NEP of most other regions agrees fairly well with differences of 25% or less. The statewide mean posterior NEP of 29.7 TgC yr−1 is about 16% lower than the average posterior result found in an earlier inversion study for Oregon that was based on a smaller observational database and used a simpler land model trained with EC flux data (Göckede et al. 2010b). Nevertheless, given the pronounced differences in the available database on prior flux structures, with an average difference of 16%, the posterior results of the two different Bayesian inversion approaches are in better agreement compared to the bottom-up approaches. Considering that the prior results of the two studies differ by more than 44% (40.0 vs 22.54 TgC yr−1), our findings demonstrate that the assimilation of atmospheric mixing ratios brings the posterior results of both studies into close agreement. The Coast Range Mountain area in western Oregon and the Cascades Mountain area, including the eastern slopes and foothills (Figure 2), are particularly important because most of the remaining large and continuous forests are found in those regions, providing the most significant land-based CO2 sink in Oregon (e.g., Turner et al. 2011).
The optimized CO2 exchange demonstrates the capabilities and limits of our inversion framework to correct the prior values efficiently. While the NEE fluxes have been optimized to better match the observations, the performance of the inversion process is predominantly restricted by the uncertainties associated with the transport model linking the mixing ratio measurements to the fluxes. Errors of the transport model are also propagated when calculating the background mixing ratios at the towers and therefore increase their uncertainties. Furthermore, accounting for local fossil fuel offsets significantly increases the overall measurement uncertainty. Hence, while both portions of the mixing ratio provide important boundary conditions (e.g., Gourdji et al. 2012), the weight in the assimilation process of the tower mixing ratios that initially exhibit a high accuracy and precision (see section 2.5.2.1) is impaired by the added uncertainty of those components.
In ongoing research, CLM is being parameterized for the foliar traits and carbon allocation patterns of additional forest species, as well as for age mapping, to improve the model’s drought sensitivity, which varies with species and age class (Law and Waring 2015). A previous version of the model (CLM4) performed well in the wet ecoregions and in older forests when compared with inventory biomass and productivity after model modifications (Hudiburg et al. 2013b). The modifications based on ecoregion-specific observations were stemwood allocation, mortality rates, and physiological variables (foliar C:N, foliar N in Rubisco, leaf longevity, fine root C:N, and specific leaf area). The modifications were carried over to our CLM4.5 setup but were generally applied to major species rather than species within the ecoregion. For example, the Douglas fir is found in several of the ecoregions, but the physiological parameters were averaged across ecoregions. This may have reduced model performance in this study, particularly in the Coast Range.
In addition to an improved land model and a refined model parameterization, a denser tower network would reduce the need to aggregate areas in order to constrain the system sufficiently and would account for spatial flux variability in regional inversions with high spatial resolutions (Göckede et al. 2010a; Lauvaux et al. 2012a).
Optimizing fluxes with no spatial aggregation will make use of the full potential of land and transport models with high native resolution if larger portions of the model domain fall into near-field footprint areas. This would also facilitate a beneficial change from a PFT or PFT group parameterization system to a finer, species-specific parameterization, which is needed to address the physiological processes more realistically in terrestrial vegetation models (Law 2014). This applies in particular to complex mechanistic models like CLM4.5, which can account for differences within diverse vegetation using high-resolution land surface information and a detailed physiological parameterization.
Similar considerations for tower densities are needed in other regions with complex terrain and strong vegetation–climate gradients when high-resolution, regional Bayesian inversions are conducted. However, a finer-resolution Bayesian inversion for a heterogeneous land surface also comes with the problem of accurately defining an error covariance matrix with increased complexity, which must be addressed in corresponding future studies.
The approach of combining flux and mixing ratio datasets along with land system modeling and inversions shows promising results even in regions with strong gradients in vegetation and climate. Diagnostics suggest where we might need to improve input data (e.g., soils characteristics), parameters, or ecophysiological sensitivity of CLM to nutrients and drought for predicting future responses to climate. For diagnostic modeling of current conditions, integrated greenhouse gas measurements and a model framework such as the one described in this study have the potential to be implemented to meet obligations of international climate agreements that require transparency and regular assessments (e.g., COP21).
Acknowledgments
This research was supported by the NOAA Earth System Science Program (Grant Number NOAA-OAR-CPO-2012-2003041), the Agriculture and Food Research Initiative of the USDA National Institute of Food and Agriculture (Grant Numbers 2014-67003-22065 and 2014-35100-22066), and by the U.S. Department of Energy (Grant Number DE-SC0012194). We thank Arlyn E. Andrews, Colm Sweeney, and Ken Masarie from NOAA ESRL for the provision of the background data. Funding for AmeriFlux core site data used within this study was provided by the U.S. Department of Energy’s Office of Science. The authors also thank David E. Rupp for the downscaling of the climate forcing data used in CLM.
APPENDIX A
Further Details of Observation Tower Network
The OSI tower is the tallest tower in our network with the uppermost sample inlet at 269-m height above ground. The mixing ratios measured at the tower represent the Willamette valley as well as large areas of the Coast Range and the western Cascades (Figure 2). During certain meteorological conditions, the OSI measurements are affected by local to regional fossil fuel emissions from the densely populated Salem area 18 km to the west and from the Portland metropolitan area about 40 km to the north. In addition, the highway I-5 corridor with daily traffic of 50 000 to 100 000 vehicles [annual average, depending on the section of this main north–south traffic route in Oregon; Oregon Department of Transportation (ODT) 2014] crosses the entire valley at a distance of 22 km west of the OSI tower. We use CO mixing ratios to effectively remove the influence of regional anthropogenic emissions affecting the CO2 measurements at the tall OSI tower (Schmidt et al. 2014). A technical description of the OWA and OMP towers is in Schmidt et al. (2014). Both towers are located in the Coast Range (Table 1).
The 40-m-tall ONG tower is located in Oregon’s High Desert region and was erected in 2012 replacing a 6-m measurement pole setup equipped with a closed-path IRGA (Göckede et al. 2010b).
The semiarid area is flat sagebrush steppe dominated by bunchgrasses (>25% of the area, mostly Festuca idahoensis) with an open shrub layer of sagebrush (Artemisia tridentata). In addition to the CRDS system, the ONG tower is equipped with an eddy covariance system measuring the vertical, turbulent fluxes of CO2, water vapor, and energy at a height of 25 m above ground using an enclosed-path LI7200 IRGA (Burba et al. 2012) in combination with a CSAT3 sonic anemometer.
As part of the AmeriFlux network, the US-MRf flux tower (Figure 2) is located in the Oregon Coast Range, measuring the atmospheric exchange of CO2, water vapor, and energy above a mature Douglas fir (Pseudotsuga menziesii) forest at 37 m above ground. Further information about the EC measurements at the US-MRf tower and the corresponding flux data processing can be found in Thomas et al. (2013). Details about the mixing ratio measurements and data processing as well as a technical description of the OWA and OMP towers, both located in the Coast Range Mountains (Table 1), are given in Schmidt et al. (2014). Three of the CO2 observation towers were newly installed or replaced in 2012, so the overall observation data availability for 2012 is reduced by 14.1% and 24.6% compared to the years 2013 and 2014.
Linear calibrations using standard gases were applied at the towers to derive the uncertainty of the CO2 mixing ratios. For each 50-hourly calibration cycle, a first-order polynomial fit is applied with known target mole fractions from the calibration tanks and the mean mole fractions from the corresponding measurements. The coefficients for the linear regression are calculated by linear interpolation between the two adjacent calibrations and then applied to the data between two calibration cycles. The final measurement uncertainty is defined as the difference between the linearly corrected CO2 mixing ratios and an additional WMO reference gas that was not used for the actual CO2 correction procedure.
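The calibration steps can be sketched as follows. The sketch assumes that the known tank values are regressed on the measured means (target = slope × measured + intercept) and that slope and intercept are each interpolated linearly in time between the two bracketing calibration cycles. The function names and the tuple layout of a calibration record are hypothetical.

```python
def fit_line(measured, target):
    """Least-squares first-order fit: target = slope * measured + intercept."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(target) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, target))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct(raw, t, cal_before, cal_after):
    """Apply time-interpolated calibration coefficients to a raw reading.

    `cal_before` and `cal_after` are (time, slope, intercept) tuples of
    the calibration cycles bracketing the measurement time `t`.
    """
    t0, slope0, icept0 = cal_before
    t1, slope1, icept1 = cal_after
    weight = (t - t0) / (t1 - t0)
    slope = slope0 + weight * (slope1 - slope0)
    intercept = icept0 + weight * (icept1 - icept0)
    return slope * raw + intercept
```

Applying `correct` to the readings of the independent reference gas and comparing the result with its assigned value would give the uncertainty estimate described above.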
APPENDIX B
The Downscaled Climate Dataset Used for CLM
The climate variables used to create the downscaled climate driver dataset for the CLM runs come from simulations with the global climate model (GCM) MIROC5 as part of CMIP5 (Taylor et al. 2012). The MIROC5 simulations were statistically downscaled and bias corrected using MACA (Abatzoglou and Brown 2012). The MACA algorithm samples from a dataset of gridded observations (training data) using spatial pattern matching between the GCM variable and the training data. The daily gridded climate data of the METDATA observational dataset (Abatzoglou 2013) with a horizontal resolution of 1/24° × 1/24° were used as training data.
To disaggregate the downscaled daily data to a 3-hourly resolution, we make use of the 3-hourly data from the original CMIP5 datasets. The method consists of rescaling the 3-hourly GCM time series to be consistent with aggregate daily values, or maximum and minimum daily values, from MACA. This entails first converting the variables in the 3-hourly global climate model datasets to dimensionless standardized values. The downscaled daily data are then disaggregated to 3-hourly resolution by multiplying the daily data by the standardized quantities. To apply the second step, each fine-grained cell in the downscaled dataset is mapped to its associated coarse-grained cell in the GCM dataset. The strength of this method is that it maintains the covariance structure of, and therefore the physical consistency between, all the variables. A potential weakness of the method is that the standardized time series for a given day is identical across the GCM cell. This means that a storm, for example, peaks at exactly the same time everywhere within the GCM cell, though the magnitude would vary across the cell. Because in CLM lateral exchanges between the cells are not considered, this drawback of the method is not important for the application in CLM.
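For a variable constrained by its daily aggregate, the rescaling can be illustrated as below. The sketch assumes that "standardized" means the ratio of each 3-hourly GCM value to the GCM daily mean, and that a day with zero GCM mean is spread evenly. Neither detail is specified in the text, the max/min case is not shown, and the function name is hypothetical.

```python
def disaggregate_day(gcm_3hourly, downscaled_daily_mean):
    """Spread one downscaled daily mean over the 3-hourly steps of a day.

    `gcm_3hourly` holds the coarse-cell GCM values for that day (eight
    values at 3-hourly resolution). Each value is divided by the GCM
    daily mean to obtain dimensionless weights, which are then multiplied
    by the downscaled daily mean of the fine-grid cell.
    """
    gcm_mean = sum(gcm_3hourly) / len(gcm_3hourly)
    if gcm_mean == 0:
        # No sub-daily signal in the GCM for this day: use a flat profile.
        weights = [1.0] * len(gcm_3hourly)
    else:
        weights = [value / gcm_mean for value in gcm_3hourly]
    return [downscaled_daily_mean * w for w in weights]
```

Because the weights depend only on the coarse GCM cell, every fine-grid cell mapped to that cell receives the same sub-daily shape scaled by its own daily value, which is the behavior noted above as a potential weakness.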
APPENDIX C
Calculation of Background Mixing Ratios at the Towers
For the WRF–STILT trajectory calculations, 500 particles were released at each receptor. If the trajectory of a particle did not cross the domain boundary, the latitude, altitude, and time of the last data record were used and the corresponding average background curtain mixing ratio assigned. Daily background mixing ratios of atmospheric CO and CO2 are spatially arranged in a curtain covering a height profile from 500 to 6500 m with a latitudinal resolution of 1°. The background values are provided specifically for the Pacific coast region of the United States by NOAA ESRL (Jeong et al. 2013). The background values were calculated as an average of the background mixing ratios at the position that the particles crossed at the end of the backward trajectory or when leaving the domain.
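The averaging over particle end points can be sketched as follows. The curtain is represented by a user-supplied lookup function, latitudes are assigned to 1° bins by rounding down, and altitudes outside the 500 to 6500 m range are clamped to the nearest curtain level. The binning, the clamping, and all names are assumptions of this sketch, not details of the NOAA product or of the study.

```python
import math


def clamp_altitude(alt_m, lowest=500.0, highest=6500.0):
    """Limit an altitude to the vertical range covered by the curtain."""
    return min(max(alt_m, lowest), highest)


def receptor_background(endpoints, curtain):
    """Average the curtain mixing ratio over all particle end points.

    `endpoints` is an iterable of (day, latitude_deg, altitude_m) taken
    where each particle left the domain or at its last record.
    `curtain` is a callable (day, lat_bin, altitude_m) -> mixing ratio.
    """
    total = 0.0
    count = 0
    for day, lat, alt in endpoints:
        lat_bin = math.floor(lat)  # 1-degree latitude bins
        total += curtain(day, lat_bin, clamp_altitude(alt))
        count += 1
    return total / count
```

For one receptor and time step, `endpoints` would hold the 500 particle end points, and the returned value would be the background that is subtracted from the tower observation.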
We compared the modeled background mixing ratios with measured background values using CO and CO2 observations from the OMP site during conditions with practically no surface exchange over the last 72 h [surface influence integrated over footprint <0.01 ppm (µmol m−2 s−1)−1] and trajectories associated with westerly wind directions (incoming air from the Pacific Ocean). Because of its exposed position on top of the highest mountain in the Oregon Coast Range, the OMP tower frequently samples air from above the atmospheric mixing layer during respective meteorological conditions. Measurements during those periods capture air masses with mixing ratios that were not altered through land surface exchange and therefore provide direct measurements of advected background mixing ratios. Observed biases in the background field mixing ratios were corrected using a second-order Fourier function (Göckede et al. 2010b) before being subtracted from the tower observation values. After correction, the measured and modeled background mixing ratio time series show a high correlation (r2 = 0.88, RMSE = 0.21 ppm).
References
Abatzoglou, J. T., 2013: Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol., 33, 121–131, doi:10.1002/joc.3413.
Abatzoglou, J. T., and T. J. Brown, 2012: A comparison of statistical downscaling methods suited for wildfire applications. Int. J. Climatol., 32, 772–780, doi:10.1002/joc.2312.
Andrews, A. E., and Coauthors, 2014: CO2, CO, and CH4 measurements from tall towers in the NOAA Earth System Research Laboratory’s Global Greenhouse Gas Reference Network: Instrumentation, uncertainty analysis, and recommendations for future high-accuracy greenhouse gas monitoring efforts. Atmos. Meas. Tech., 7, 647–687, doi:10.5194/amt-7-647-2014.
Beer, C., and Coauthors, 2010: Terrestrial gross carbon dioxide uptake: Global distribution and covariation with climate. Science, 329, 834–838, doi:10.1126/science.1184984.
Bocquet, M., L. Wu, and F. Chevallier, 2011: Bayesian design of control space for optimal assimilation of observations. Part I: Consistent multiscale formalism. Quart. J. Roy. Meteor. Soc., 137, 1340–1356, doi:10.1002/qj.837.
Bonan, G. B., S. Levis, L. Kergoat, and K. W. Oleson, 2002: Landscapes as patches of plant functional types: An integrating concept for climate and ecosystem models. Global Biogeochem. Cycles, 16, 5.1–5.23, doi:10.1029/2000GB001360.
Bousquet, P., and Coauthors, 2011: Source attribution of the changes in atmospheric methane for 2006–2008. Atmos. Chem. Phys., 11, 3689–3700, doi:10.5194/acp-11-3689-2011.
Burba, G., and Coauthors, 2012: Calculating CO2 and H2O eddy covariance fluxes from an enclosed gas analyzer using an instantaneous mixing ratio. Global Change Biol., 18, 385–399, doi:10.1111/j.1365-2486.2011.02536.x.
Carouge, C., P. J. Rayner, P. Peylin, P. Bousquet, F. Chevallier, and P. Ciais, 2010: What can we learn from European continuous atmospheric CO2 measurements to quantify regional fluxes—Part 2: Sensitivity of flux accuracy to inverse setup. Atmos. Chem. Phys., 10, 3119–3129, doi:10.5194/acp-10-3119-2010.
Chen, J. M., J. W. Fung, G. Mo, F. Deng, and T. O. West, 2015: Atmospheric inversion of surface carbon flux with consideration of the spatial distribution of US crop production and consumption. Biogeosciences, 12, 323–343, doi:10.5194/bg-12-323-2015.
Clare, A. P., and D. R. Cohen, 2001: A comparison of unsupervised neural networks and k-means clustering in the analysis of multi-element stream sediment data. Geochem.: Explor. Environ. Anal., 1, 119–134, doi:10.1144/geochem.1.2.119.
Cressie, N. A. C., 1993: Statistics for Spatial Data. Wiley, 900 pp.
Cressie, N. A. C., and C. K. Wikle, 2011: Statistics for Spatio-Temporal Data. Wiley, 588 pp.
Crosson, E. R., 2008: A cavity ring-down analyzer for measuring atmospheric levels of methane, carbon dioxide, and water vapour. Appl. Phys., 92B, 403–408, doi:10.1007/s00340-008-3135-y.
Daly, C., M. Halbleib, J. Smith, W. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 2031–2064, doi:10.1002/joc.1688.
Davies, D. L., and D. W. Bouldin, 1979: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell., 1, 224–227.
Eidenshink, J., B. Schwind, K. Brewer, Z. Zhu, B. Quayle, and S. Howard, 2007: A project for monitoring trends in burn severity. Fire Ecol., 3, 3–21, doi:10.4996/fireecology.0301003.
Engelen, R. J., A. S. Denning, and K. R. Gurney, 2002: On error estimation in atmospheric CO2 inversions. J. Geophys. Res., 107, 4635, doi:10.1029/2002JD002195.
Enting, I. G., 2005: Inverse Problems in Atmospheric Constituent Transport. Cambridge University Press, 412 pp.
Gerbig, C., J. C. Lin, S. C. Wofsy, B. C. Daube, A. E. Andrews, B. B. Stephens, P. S. Bakwin, and C. A. Grainger, 2003: Toward constraining regional‐scale fluxes of CO2 with atmospheric observations over a continent: 2. Analysis of COBRA data using a receptor‐oriented framework. J. Geophys. Res., 108, 4757, doi:10.1029/2003JD003770.
Gerbig, C., S. Körner, and J. C. Lin, 2008: Vertical mixing in atmospheric tracer transport models: Error characterization and propagation. Atmos. Chem. Phys., 8, 591–602, doi:10.5194/acp-8-591-2008.
Göckede, M., A. M. Michalak, D. Vickers, D. P. Turner, and B. E. Law, 2010a: Atmospheric inverse modeling to constrain regional scale CO2 budgets at high spatial and temporal resolution. J. Geophys. Res., 115, D15113, doi:10.1029/2009JD012257.
Göckede, M., D. P. Turner, A. M. Michalak, D. Vickers, and B. E. Law, 2010b: Sensitivity of a subregional scale atmospheric inverse CO2 modeling framework to boundary conditions. J. Geophys. Res., 115, D24112, doi:10.1029/2010JD014443.
Gourdji, S. M., and Coauthors, 2012: North American CO2 exchange: Inter-comparison of modeled estimates with results from a fine-scale atmospheric inversion. Biogeosciences, 9, 457–475, doi:10.5194/bg-9-457-2012.
Guenther, A. B., X. Jiang, C. L. Heald, T. Sakulyanontvittaya, T. Duhl, L. K. Emmons, and X. Wang, 2012: The Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): An extended and updated framework for modeling biogenic emissions. Geosci. Model Dev., 5, 1471–1492, doi:10.5194/gmd-5-1471-2012.
Hammerling, D. M., A. M. Michalak, and S. R. Kawa, 2012: Mapping of CO2 at high spatiotemporal resolution using satellite observations: Global distributions from OCO-2. J. Geophys. Res., 117, D06306, doi:10.1029/2011JD017015.
Hengl, T., 2007: A Practical Guide to Geostatistical Mapping of Environmental Variables. Office for Official Publications of the European Communities, 165 pp.
Henne, S., and Coauthors, 2016: Validation of the Swiss methane emission inventory by atmospheric observations and inverse modelling. Atmos. Chem. Phys., 16, 3683–3710, doi:10.5194/acp-16-3683-2016.
Hudiburg, T. W., B. E. Law, D. P. Turner, J. L. Campbell, D. Donato, and M. Duane, 2009: Carbon dynamics of Oregon and northern California forests and potential land-based carbon storage. Ecol. Appl., 19, 163–180, doi:10.1890/07-2006.1.
Hudiburg, T. W., B. E. Law, and P. E. Thornton, 2013a: Evaluation and improvement of the Community Land Model (CLM 4.0) in Oregon forests. Biogeosciences, 10, 453–470, doi:10.5194/bg-10-453-2013.
Hudiburg, T. W., S. Luyssaert, P. E. Thornton, and B. E. Law, 2013b: Interactive effects of environmental change and management strategies on regional forest carbon emissions. Environ. Sci. Technol., 47, 13 132–13 140, doi:10.1021/es402903u.
Jain, A. K., M. N. Murty, and P. J. Flynn, 1999: Data clustering: A review. ACM Comput. Surv., 31, 264–323, doi:10.1145/331499.331504.
Jeong, S., C. Zhao, A. E. Andrews, L. Bianco, J. M. Wilczak, and M. L. Fischer, 2012: Seasonal variation of CH4 emissions from central California. J. Geophys. Res., 117, D11306, doi:10.1029/2011JD016896.
Jeong, S., Y.-K. Hsu, A. E. Andrews, L. Bianco, P. Vaca, J. M. Wilczak, and M. L. Fischer, 2013: A multitower measurement network estimate of California’s methane emissions. J. Geophys. Res. Atmos., 118, 11 339–11 351, doi:10.1002/jgrd.50854.
Kalnay, E., M. Kanamitsu, and W. E. Baker, 1990: Global numerical weather prediction at the National Meteorological Center. Bull. Amer. Meteor. Soc., 71, 1410–1428, doi:10.1175/1520-0477(1990)071<1410:GNWPAT>2.0.CO;2.
Kaminski, T., P. J. Rayner, M. Heimann, and I. G. Enting, 2001: On aggregation errors in atmospheric transport inversions. J. Geophys. Res., 106, 4703–4715, doi:10.1029/2000JD900581.
Kasibhatla, P., M. Heimann, P. Rayner, N. Mahowald, R. G. Prinn, and D. E. Hartley, Eds., 2000: Inverse Methods in Global Biogeochemical Cycles. Geophys. Monogr., Vol. 114, Amer. Geophys. Union, 324 pp.
Keenan, T. F., and Coauthors, 2012: Terrestrial biosphere model performance for inter-annual variability of land-atmosphere CO2 exchange. Global Change Biol., 18, 1971–1987, doi:10.1111/j.1365-2486.2012.02678.x.
Kohonen, T., 2001: Self-Organizing Maps. Springer Series in Information Sciences, Vol. 30, Springer, 501 pp.
Lauvaux, T., A. Schuh, M. Bocquet, L. Wu, S. Richardson, N. Miles, and K. Davis, 2012a: Network design for mesoscale inversions of CO2 sources and sinks. Tellus, 64B, 17 980, doi:10.3402/tellusb.v64i0.17980.
Lauvaux, T., and Coauthors, 2012b: Constraining the CO2 budget of the corn belt: Exploring uncertainties from the assumptions in a mesoscale inverse system. Atmos. Chem. Phys., 12, 337–354, doi:10.5194/acp-12-337-2012.
Law, B. E., 2014: Regional analysis of drought and heat impacts on forests: Current and future science directions. Global Change Biol., 20, 3595–3599, doi:10.1111/gcb.12651.
Law, B. E., and L. T. Berner, 2015: NACP TERRA-PNW: Forest plant traits, NPP, biomass, and soil properties, 1999-2014. ORNL DAAC, accessed 13 January 2016, doi:10.3334/ORNLDAAC/1292.
Law, B. E., and R. H. Waring, 2015: Carbon implications of current and future effects of drought, fire and management on Pacific Northwest forests. For. Ecol. Manage., 355, 4–14, doi:10.1016/j.foreco.2014.11.023.
Law, B. E., D. Turner, J. Campbell, O. J. Sun, S. Van Tuyl, W. D. Ritts, and W. B. Cohen, 2004: Disturbance and climate effects on carbon stocks and fluxes across western Oregon USA. Global Change Biol., 10, 1429–1444, doi:10.1111/j.1365-2486.2004.00822.x.
Law, B. E., T. Hudiburg, and S. Luyssaert, 2013: Thinning effects on forest productivity: Consequences of preserving old forests and mitigating impacts of fire and drought. Plant Ecol. Diversity, 6, 73–85, doi:10.1080/17550874.2012.679013.
Lawrence, P. J., and T. N. Chase, 2007: Representing a new MODIS consistent land surface in the Community Land Model (CLM 3.0). J. Geophys. Res., 112, G01023, doi:10.1029/2006JG000168.
Lichstein, J. W., and Coauthors, 2014: Confronting terrestrial biosphere models with forest inventory data. Ecol. Appl., 24, 699–715, doi:10.1890/13-0600.1.
Lin, J. C., and C. Gerbig, 2005: Accounting for the effect of transport errors on tracer inversions. Geophys. Res. Lett., 32, L01802, doi:10.1029/2004GL021127.
Lin, J. C., C. Gerbig, S. C. Wofsy, A. E. Andrews, B. C. Daube, K. J. Davis, and C. A. Grainger, 2003: A near-field tool for simulating the upstream influence of atmospheric observations: The Stochastic Time-Inverted Lagrangian Transport (STILT) model. J. Geophys. Res., 108, 4493, doi:10.1029/2002JD003161.
MacQueen, J. B., 1967: Some methods for classification and analysis of multivariate observations. Statistics, L. M. Le Cam and J. Neyman, Eds., Vol. 1, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 281–297.
Maulik, U., and S. Bandyopadhyay, 2002: Performance evaluation of some clustering algorithms and validity indices. IEEE Trans. Pattern Anal. Mach. Intell., 24, 1650–1654, doi:10.1109/TPAMI.2002.1114856.
McBratney, A. B., and R. Webster, 1986: Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates. J. Soil Sci., 37, 617–639, doi:10.1111/j.1365-2389.1986.tb00392.x.
Michalak, A. M., L. Bruhwiler, and P. P. Tans, 2004: A geostatistical approach to surface flux estimation of atmospheric trace gases. J. Geophys. Res., 109, D14109, doi:10.1029/2003JD004422.
Michalakes, J., S. Chen, J. Dudhia, L. Hart, J. Klemp, J. Middlecoff, and W. Skamarock, 2001: Development of a Next Generation Regional Weather Research and Forecast Model. Developments in Teracomputing: Proceedings of the Ninth ECMWF Workshop on the Use of High Performance Computing in Meteorology, W. Zwieflhofer and N. Kreitz, Eds., World Scientific, 269–276.
Nehrkorn, T., J. Eluszkiewicz, S. C. Wofsy, J. C. Lin, C. Gerbig, M. Longo, and S. Freitas, 2010: Coupled Weather Research and Forecasting–Stochastic Time-Inverted Lagrangian Transport (WRF–STILT) model. Meteor. Atmos. Phys., 107, 51–64, doi:10.1007/s00703-010-0068-x.
ODEQ, 2015: Oregon greenhouse gas in-boundary inventory data. Accessed 2 January 2016. [Available online at http://www.oregon.gov/DEQ/AQ/Pages/Greenhouse-Gas-Inventory-Report.aspx#inventory.]
ODT, 2014: 2013 transportation volume tables. Oregon Department of Transportation, Transportation Data Section, Transportation System Unit, accessed 2 December 2015. [Available online at https://www.oregon.gov/ODOT/TD/TDATA/tsm/docs/2013_TVT_Complete_Publication.pdf.]
Oleson, K. W., and Coauthors, 2013: Technical description of version 4.5 of the Community Land Model (CLM). NCAR Tech. Note NCAR/TN-503+STR, 422 pp., doi:10.5065/D6RR1W7M.
Omernik, J. M., 1995: Ecoregions: A framework for environmental management. Biological Assessment and Criteria: Tools for Water Resource Planning and Decision Making, W. S. Davis and T. P. Simon, Eds., CRC Press, 49–62.
Pechony, O., D. T. Shindell, and G. Faluvegi, 2013: Direct top-down estimates of biomass burning CO emissions using TES and MOPITT versus bottom-up GFED inventory. J. Geophys. Res. Atmos., 118, 8054–8066, doi:10.1002/jgrd.50624.
Richardson, A. D., and Coauthors, 2012: Terrestrial biosphere models need better representation of vegetation phenology: Results from the North American Carbon Program site synthesis. Global Change Biol., 18, 566–584, doi:10.1111/j.1365-2486.2011.02562.x.
Rödenbeck, C., S. Houweling, M. Gloor, and M. Heimann, 2003: CO2 flux history 1982–2001 inferred from atmospheric data using a global inversion of atmospheric transport. Atmos. Chem. Phys., 3, 1919–1964, doi:10.5194/acp-3-1919-2003.
Schmid, P., and D. Niyogi, 2012: A method for estimating planetary boundary layer heights and its application over the ARM Southern Great Plains site. J. Atmos. Oceanic Technol., 29, 316–322, doi:10.1175/JTECH-D-11-00118.1.
Schmidt, A., C. Hanson, J. Kathilankal, and B. E. Law, 2011: Classification and assessment of turbulent fluxes above ecosystems in North-America with self-organizing feature map networks. Agric. For. Meteor., 151, 508–520, doi:10.1016/j.agrformet.2010.12.009.
Schmidt, A., C. Hanson, W. S. Chan, and B. E. Law, 2012: Empirical assessment of uncertainties of meteorological parameters and turbulent fluxes in the AmeriFlux network. J. Geophys. Res., 117, G04014, doi:10.1029/2012JG002100.
Schmidt, A., C. W. Rella, M. Göckede, C. Hanson, Z. Yang, and B. E. Law, 2014: Removing traffic emissions from CO2 time series measured at a tall tower using mobile measurements and transport modeling. Atmos. Environ., 97, 94–108, doi:10.1016/j.atmosenv.2014.08.006.
Schuh, A. E., A. S. Denning, K. D. Corbin, I. T. Baker, M. Uliasz, N. Parazoo, A. E. Andrews, and D. E. J. Worthy, 2010: A regional high-resolution carbon flux inversion of North America for 2004. Biogeosciences, 7, 1625–1644, doi:10.5194/bg-7-1625-2010.
Schuh, A. E., and Coauthors, 2013: Evaluating atmospheric CO2 inversions at multiple scales over a highly inventoried agricultural landscape. Global Change Biol., 19, 1424–1439, doi:10.1111/gcb.12141.
Schwalm, C. R., and Coauthors, 2010: A model-data intercomparison of CO2 exchange across North America: Results from the North American Carbon Program site synthesis. J. Geophys. Res., 115, G00H05, doi:10.1029/2009JG001229.
Seidel, D. J., C. O. Ao, and K. Li, 2010: Estimating climatological planetary boundary layer heights from radiosonde observations: Comparison of methods and uncertainty analysis. J. Geophys. Res., 115, D16113, doi:10.1029/2009JD013680.
Shaw, W. J., M. S. Pekour, R. L. Coulter, T. J. Martin, and J. T. Walters, 2007: The daytime mixing layer observed by radiosonde, profiler, and lidar during MILAGRO. Atmos. Chem. Phys. Discuss., 7, 15 025–15 065, doi:10.5194/acpd-7-15025-2007.
Soil Survey Staff, 2015: Web soil survey. Natural Resources Conservation Service, United States Department of Agriculture, accessed 30 May 2016. [Available online at http://websoilsurvey.nrcs.usda.gov/.]
Stein, M. L., 1999: Interpolation of Spatial Data: Some Theory for Kriging. Springer, 266 pp.
Sun, Y., and Coauthors, 2014: Asymmetrical effects of mesophyll conductance on fundamental photosynthetic parameters and their relationships estimated from leaf gas exchange measurements. Plant Cell Environ., 37, 978–994, doi:10.1111/pce.12213.
Tans, P., P. S. Bakwin, and D. W. Guenther, 1996: A feasible global carbon cycle observing system: A plan to decipher today’s carbon cycle based on observations. Global Change Biol., 2, 309–318, doi:10.1111/j.1365-2486.1996.tb00082.x.
Tarantola, A., 2005: Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM, 339 pp., doi:10.1137/1.9780898717921.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93, 485–498, doi:10.1175/BAMS-D-11-00094.1.
Thomas, C. K., B. E. Law, J. Irvine, J. G. Martin, J. C. Pettijohn, and K. J. Davis, 2009: Seasonal hydrology explains interannual and seasonal variation in carbon and water exchange in a semi-arid mature ponderosa pine forest in central Oregon. J. Geophys. Res., 114, G04006, doi:10.1029/2009JG001010.
Thomas, C. K., J. G. Martin, B. E. Law, and K. Davis, 2013: Towards biologically meaningful net carbon exchange estimates for tall, dense canopies: Multi-level eddy covariance observations and canopy coupling regimes in a mature Douglas-fir forest in Oregon. Agric. For. Meteor., 173, 14–27, doi:10.1016/j.agrformet.2013.01.001.
Thompson, R. L., C. Gerbig, and C. Rödenbeck, 2011: A Bayesian inversion estimate of N2O emissions for western and central Europe and the assessment of aggregation errors. Atmos. Chem. Phys., 11, 3443–3458, doi:10.5194/acp-11-3443-2011.
Thornton, P. E., S. W. Running, and E. R. Hunt, 2005: Biome-BGC: Terrestrial Ecosystem Process Model, version 4.1.1: Data model. Oak Ridge National Laboratory Distributed Active Archive Center, accessed 4 September 2015, doi:10.3334/ORNLDAAC/805.
Thorson, T. D., S. A. Bryce, D. A. Lammers, A. J. Woods, J. M. Omernik, J. Kagan, D. E. Pater, and J. A. Comstock, 2003: Ecoregions of Oregon (2 sided color poster with map, descriptive text, summary tables, and photographs). U.S. Geological Survey, Reston, VA. Scale 1:1,500,000. [Available online at http://www.oregon.gov/GEO/Pages/index.aspx.]
Turner, A. J., and D. J. Jacob, 2015: Balancing aggregation and smoothing errors in inverse models. Atmos. Chem. Phys., 15, 7039–7048, doi:10.5194/acp-15-7039-2015.
Turner, D. P., W. D. Ritts, B. E. Law, W. B. Cohen, Z. Yang, T. Hudiburg, J. L. Campbell, and M. Duane, 2007: Scaling net ecosystem production and net biome production over a heterogeneous region in the western United States. Biogeosciences, 4, 597–612, doi:10.5194/bg-4-597-2007.
Turner, D. P., and Coauthors, 2011: Multiple constraints analysis of regional land-surface carbon flux. Tellus, 63B, 207–221, doi:10.1111/j.1600-0889.2011.00525.x.
Turner, D. P., A. R. Jacobson, W. D. Ritts, W. L. Wang, and R. Nemani, 2013: A large portion of North American net ecosystem production is offset by emissions from harvested products, river/stream evasion, and biomass burning. Global Change Biol., 19, 3516–3528, doi:10.1111/gcb.12313.
USGS, 2013: 2005 North American land cover at 250 m spatial resolution. Natural Resources Canada/Canadian Center for Remote Sensing, United States Geological Survey, Instituto Nacional de Estadística y Geografía, Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, and Comisión Nacional Forestal, accessed 2 February 2016. [Available online at http://landcover.usgs.gov/nalcms.php.]
Vardag, S. N., C. Gerbig, G. Janssens-Maenhout, and I. Levin, 2015: Estimation of continuous anthropogenic CO2: Model-based evaluation of CO2, CO, δ13C(CO2) and Δ14C(CO2) tracer methods. Atmos. Chem. Phys., 15, 12 705–12 729, doi:10.5194/acp-15-12705-2015.
Vesanto, J., and E. Alhoniemi, 2000: Clustering of the self-organizing map. IEEE Trans. Neural Networks, 11, 586–600, doi:10.1109/72.846731.
Vickers, D., C. Thomas, C. Pettijohn, J. G. Martin, and B. E. Law, 2012: Five years of carbon fluxes and inherent water-use efficiency at two semi-arid pine forests with different disturbance histories. Tellus, 64B, 17 159, doi:10.3402/tellusb.v64i0.17159.
Wang, S., and X. Mo, 2015: Comparison of multiple models for estimating gross primary production using remote sensing data and fluxnet observations. Proc. IAHS, 368, 75–80, doi:10.5194/piahs-368-75-2015.
Watanabe, M., and Coauthors, 2010: Improved climate simulation by MIROC5: Mean states, variability, and climate sensitivity. J. Climate, 23, 6312–6335, doi:10.1175/2010JCLI3679.1.
Williams, M., and Coauthors, 2009: Improving land surface models with FLUXNET data. Biogeosciences, 6, 1341–1359, doi:10.5194/bg-6-1341-2009.
Wu, L., M. Bocquet, T. Lauvaux, F. Chevallier, P. Rayner, and K. Davis, 2011: Optimal representation of source‐sink fluxes for mesoscale carbon dioxide inversion with synthetic data. J. Geophys. Res., 116, D21304, doi:10.1029/2011JD016198.
Wu, L., G. Broquet, P. Ciais, V. Bellassen, F. Vogel, F. Chevallier, I. Xueref-Remy, and Y. Wang, 2016: What would dense atmospheric observation networks bring to the quantification of city CO2 emissions? Atmos. Chem. Phys., 16, 7743–7771, doi:10.5194/acp-16-7743-2016.
Zhao, C., A. E. Andrews, L. Bianco, J. Eluszkiewicz, A. Hirsch, C. MacDonald, T. Nehrkorn, and M. L. Fischer, 2009: Atmospheric inverse estimates of methane emissions from central California. J. Geophys. Res., 114, D16302, doi:10.1029/2008JD011671.