The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open-source terms or to credentialed users.
The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the United States. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multiagency development of coupled modeling systems; controlled experimentation and testing; and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NAVGEM), the Hybrid Coordinate Ocean Model (HYCOM), and the Coupled Ocean–Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and the Goddard Earth Observing System Model, version 5 (GEOS-5), atmospheric general circulation model.
Benefits from common modeling infrastructure and component interface standards are being realized in a suite of national weather and climate codes.
Earth system models enable humans to understand and make predictions about their environment. People rely on them for forecasting the weather, anticipating floods, assessing the severity of droughts, projecting climate changes, and countless other applications that impact life, property, and commerce. To simulate complex behaviors, the models must include a range of interlinked physical processes. These processes are often represented by independently developed components that are coupled through software infrastructure.
The software infrastructure that underlies Earth system models includes workhorse utilities as well as libraries generated by research efforts in computer science, mathematics, and computational physics. The utilities cover tasks such as time management and error handling, while research-driven libraries include areas such as high-performance input/output (I/O), algorithms for grid remapping, and programming tools for optimizing software on emerging computer architectures. Collectively, this model infrastructure represents a significant investment. As a crude comparison, a comprehensive infrastructure package like the Earth System Modeling Framework (ESMF; Hill et al. 2004; Collins et al. 2005) is comparable in size to the Community Earth System Model (CESM; Hurrell et al. 2013), each at just under a million lines of code.1
Dickinson et al. (2002) articulated the goal of common model infrastructure, a code base that multiple weather and climate modeling centers could share. This idea was shaped by an ad hoc multiagency working group that had started meeting several years earlier and was echoed in reports on the state of U.S. climate modeling (NRC 1998, 2001; Rood et al. 2000). Leads from research and operational centers posited that common infrastructure had the potential to foster collaborative development and transfer of knowledge; lessen redundant code; advance computational capabilities, model performance, and predictive skill; and enable controlled experimentation in coupled systems and ensembles. This vision of shared infrastructure has been revisited in more recent publications and venues, for example, in NRC (2012).
In this article we describe how the vision of common infrastructure is being realized, and how it is changing the approach to Earth system modeling in the United States. Central to its implementation is the Earth System Prediction Suite (ESPS), a collection of weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open-source terms or to credentialed users.
We begin by discussing how the U.S. modeling community has evolved toward a common model architecture and then explain the role of the ESMF and related projects in translating that convergence into technical interoperability. We outline the behavioral rules needed to achieve an effective level of interoperability, and describe the ESPS code suite and its target inclusion criteria. We give examples of the adoption process for different kinds of codes and of science enabled by common infrastructure. Finally, we examine the potential role of the ESPS in model ensembles and consider areas for future work.
EMERGENCE OF A COMMON MODEL ARCHITECTURE.
Several generations of model infrastructure development, described in “Linked and leveraged: The evolution of coupled model infrastructure,” allowed for the evolution and evaluation of design strategies. A community of infrastructure developers emerged, whose members exchanged ideas through a series of international meetings focused on coupling techniques (e.g., Dunlap et al. 2014); comparative analyses, such as Valcke et al. (2012); and design reviews and working group discussions hosted by community projects, such as CESM and ESMF.
Model coupling technologies were initially targeted for specific coupled modeling systems, often within a single organization. Infrastructure that arose out of model development during this period included the FMS at the Geophysical Fluid Dynamics Laboratory, the GEMS [NASA Goddard Space Flight Center (GSFC) 1997], and the Climate System Model (CSM; Boville and Gent 1998) and the Parallel Climate Model (PCM; Washington et al. 2000) flux couplers at NCAR. Each of these systems coordinated functions such as timekeeping and I/O across model components contributed by domain specialists, and implemented component interfaces for field transformations and exchanges.
Second generation (2002–06)
Recognizing similar functions and strategies across first-generation model infrastructures, a multiagency group formed a consortium to jointly develop an ESMF. ESMF was intended to limit redundant code and enable components to be exchanged between modeling centers. Also at this time, within DOE, the common component architecture (CCA; Bernholdt et al. 2006) consortium introduced a more precise definition of components into the high-performance computing community, and members of the MCT project worked with CSM (now CCSM) to abstract low-level coupling functions into the MCT general-purpose library and develop a new CCSM coupler (CPL7).
Third generation (2007–14)
A third generation of development started as multiagency infrastructures began to mature and refactor code, to assess their successes and deficiencies, and to encounter new scientific and computational challenges. Both NASA, with MAPL (Suarez et al. 2007), and NUOPC, a group of NOAA, Navy, and Air Force operational weather prediction centers and their research partners, added conventions to ESMF to increase component interoperability. Similar refactoring efforts took place in other communities, such as surface dynamics (Peckham et al. 2013) and agriculture (David et al. 2010). The demands of high-resolution modeling and the advent of unstructured grids pushed ESMF to develop new capabilities and products, and MCT and CCSM (now CESM) to introduce new communication options. In this wave of development, the capabilities of shared infrastructure began to equal or outperform those developed by individual organizations.
What next? (2015–)
Although some infrastructure projects have disappeared or merged, projects from all three generations of development are still in use, and increasingly their interfaces may coexist in the same coupled modeling system. Future development is likely to include more cross-disciplinary projects like the Earth System Bridge (see Peckham et al. 2014), which is defining a formal characterization of framework elements and behaviors [an Earth System Framework Description Language (ES-FDL)], and using it to explore how to link components that come from different communities that have their own infrastructures (e.g., climate, hydrology, ecosystem modeling).
Over time, model developers from major U.S. centers implemented similar model coupling approaches, based on a small set of frameworks: 1) ESMF; 2) the CESM Coupler, version 7 (CESM CPL7; Craig et al. 2012), which uses the lower-level Model Coupling Toolkit (MCT; Larson et al. 2005; Jacob et al. 2005) for many operations; and 3) the Flexible Modeling System (FMS; Balaji 2012). ESMF, CPL7, and FMS share several key architectural characteristics. Major physical domains, such as atmosphere, ocean, land, sea ice, and wave models, are represented as software components. Software for transforming and transferring data between components, often called a coupler, is also represented as a component. All three are single-executable frameworks, meaning that the constituent model and coupler components are called as subroutines by a driver. The driver invokes components through initialize, run, and finalize methods, which are similar in structure across frameworks. As an example, below are the application programming interfaces (APIs) of the ESMF and CESM model component run methods:
ESMF: ESMF_GridCompRun (gridcomp, importState, exportState, clock, ...)
CESM: atm_run_mct (EClock_a, cdata_aa, x2a_aa, a2x_aa)
Both argument lists include a pointer to component information (gridcomp/cdata_aa), a container structure with input fields (importState/x2a_aa), a container structure with output fields (exportState/a2x_aa), and a clock with time step and calendar information (clock/EClock_a).
This congruence in component API and overall architecture means that CESM and ESMF model components are close to being able to work in either framework.2 Where these and other frameworks have similar component APIs, a model developer can write a separate wrapper or “cap” to adapt a component written in one framework to another. Instead of calling the component directly, the framework calls the component with the cap API, and the cap internally calls the original component API. Writing a cap usually requires minimal changes in the scientific code of the component. The changes are along the lines of passing a message passing interface (MPI) communicator into the component, or accessing additional model fields. The cap for an Earth system model component usually contains assignments of input/output field data from the original model data structures to those of the target framework, by reference or copy. The model developer also writes code in the cap to translate the original model grids and time information into the equivalent framework data types.
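The cap pattern described above can be illustrated with a small sketch. It is written in Python for brevity and does not use ESMF; `NativeOcean`, `OceanCap`, `State`, and `Clock` are hypothetical names, the field names are only CF-style labels, and the "physics" is a placeholder. The point is the division of labor: the cap exposes the framework-style run signature (import container, export container, clock) and translates it into the argument list that the original model already understands.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Container of named fields, loosely analogous to an import or export state."""
    fields: dict = field(default_factory=dict)


@dataclass
class Clock:
    """Minimal clock: current model time and time step, both in seconds."""
    time: int = 0
    dt: int = 1800


class NativeOcean:
    """Hypothetical model with its own API: plain lists in, plain lists out."""

    def __init__(self, ncells):
        self.sst = [280.0] * ncells  # sea surface temperature (K)

    def step(self, heat_flux, dt_seconds):
        # Toy physics: warm each cell in proportion to the incoming flux.
        self.sst = [t + 1.0e-7 * q * dt_seconds for t, q in zip(self.sst, heat_flux)]
        return self.sst


class OceanCap:
    """Cap exposing a framework-style run(import, export, clock) signature."""

    def __init__(self, model):
        self.model = model

    def run(self, import_state, export_state, clock):
        # Translate the framework containers into the native argument list.
        heat_flux = import_state.fields["surface_downward_heat_flux"]
        sst = self.model.step(heat_flux, clock.dt)
        # Hand the native output back by reference, without copying.
        export_state.fields["sea_surface_temperature"] = sst


cap = OceanCap(NativeOcean(ncells=3))
imports = State({"surface_downward_heat_flux": [100.0, 0.0, -100.0]})
exports = State()
cap.run(imports, exports, Clock(time=0, dt=1800))
```

Note that `NativeOcean` itself is untouched by the cap, which mirrors the observation that capping usually requires minimal changes to the scientific code.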
The design convergence of U.S. models created an opportunity for coordination that a new program was ready to exploit. The National Unified Operational Prediction Capability (NUOPC; see www.nws.noaa.gov/nuopc/), a consortium of operational weather prediction centers and their research partners, was established in 2007 with goals that included creating a global atmospheric ensemble weather prediction system and promoting collaborative model development. In support of these goals, NUOPC sought further standardization of model infrastructure and introduced the concept of a common model architecture (CMA; Sandgathe et al. 2009; McCarren et al. 2013). A CMA includes the APIs of model components, the “level of componentization,” and the protocols for component interaction. Given commonalities in these areas, the ESMF, CPL7, and FMS frameworks can be said to share a CMA.
Even with a CMA, the model components running under these different frameworks still required the use of a common or reference API for component interfaces in order to achieve an effective level of interoperability. NUOPC defined this effective interoperability as the ability of a model component to execute without code changes in a driver that provides the fields that it requires and to return with informative messages if its input requirements are not met. Drivers are assumed to implement the reference API. Model components may utilize the reference framework throughout, or just supply a cap with the reference API.
The definition of effective interoperability suggests that a generic test driver could be used to check for compliant component behavior. The definition has other implications as well. The model component needs to communicate sufficient information to the driver through the API to allow the component to interact with other components (e.g., which fields the model component can provide). The driver must be able either to handle data communications among components or to invoke additional components to perform coupling tasks. Effective interoperability does not depend on the details of the coupling techniques (field merges, grid remapping methods, etc.).
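A generic test driver of the kind suggested here might look like the following sketch. It is a simplified illustration under stated assumptions, not the NUOPC tooling: `ToyIce`, `MissingInputError`, and `check_component` are hypothetical names, and a component is assumed to declare its required and provided fields as plain tuples of names. The driver checks the two halves of the definition: the component runs unchanged when its inputs are supplied, and it fails with an informative message when any one of them is withheld.

```python
class MissingInputError(Exception):
    """Raised by a component whose required import fields were not supplied."""


class ToyIce:
    """Hypothetical component that declares what it needs and what it provides."""

    required = ("air_temperature", "sea_surface_temperature")
    provided = ("sea_ice_area_fraction",)

    def run(self, imports):
        missing = [name for name in self.required if name not in imports]
        if missing:
            raise MissingInputError(
                "ToyIce cannot run; missing import fields: " + ", ".join(missing)
            )
        # Placeholder physics: ice forms where the ocean is at or below freezing.
        freezing = imports["sea_surface_temperature"] <= 271.35
        return {"sea_ice_area_fraction": 1.0 if freezing else 0.0}


def check_component(component, sample_inputs):
    """Generic test driver: return a list of problems (empty means pass)."""
    problems = []
    # 1. With every required field supplied, the component must run and
    #    export everything it claims to provide.
    outputs = component.run(dict(sample_inputs))
    for name in component.provided:
        if name not in outputs:
            problems.append(f"did not export {name}")
    # 2. With any one required field withheld, it must fail informatively.
    for name in component.required:
        partial = {k: v for k, v in sample_inputs.items() if k != name}
        try:
            component.run(partial)
            problems.append(f"ran silently without {name}")
        except MissingInputError as err:
            if name not in str(err):
                problems.append(f"error for missing {name} does not name it")
    return problems


report = check_component(
    ToyIce(), {"air_temperature": 265.0, "sea_surface_temperature": 271.0}
)
```

The driver knows nothing about sea ice; it relies only on what the component communicates through its interface, which is the property the definition is meant to secure.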
ESMF emerged as a way to implement the reference API. Unlike FMS and CESM, which are associated with specific coupled modeling systems (including scientific components and fully defined coupling strategies), ESMF was designed to support multiple systems. Using ESMF, the NUOPC consortium undertook formal codification of a CMA and its realization in widely usable (e.g., portable, reliable, efficient, documented) infrastructure software.
ESMF AND THE NUOPC LAYER.
ESMF is high-performance software for building and coupling Earth system models. It includes a superstructure for representing model and coupler components and an infrastructure of commonly used utilities, including grid remapping, time management, model documentation, and data communications (see www.earthsystemcog.org/projects/esmf/). It was developed and is governed by a set of partners that includes NASA, NOAA, the U.S. Department of Defense, and the National Science Foundation (NSF). ESMF can be used in multiple ways: 1) to create interoperable component-based coupled modeling systems, 2) as a source of libraries for commonly used utilities, 3) as a file-based offline generator of interpolation weights, and 4) as a Python package for grid remapping.
The ESMF design evolved over a period of years through weekly community reviews and thousands of user support interactions. It accommodates a wide range of data structures, grids, and component layout and sequencing options. Physical fields are represented using ESMF_Fields, which are contained in import and export ESMF_State objects in order to be passed between components. ESMF has two kinds of components: model components (ESMF_GridComp) and coupler components (ESMF_CplComp). Both must be customized, since ESMF does not provide scientific models or a complete coupler. The modeler fills in the coupling function, such as the transfer of fluxes, field merging, and handling of coastlines, or can wrap an existing coupler implementation. Likewise, ESMF can serve as the primary infrastructure for a scientific model component or, in a process made easier by a shared CMA, the modeler can write an ESMF cap. This approach enables centers to maintain local differences in coupling methodologies; longstanding coupled modeling efforts at the National Center for Atmospheric Research (NCAR), GFDL, and NASA have established organizational preferences for such operations.3 It also enables the ESMF software to coexist with the native infrastructure. The idea that a single common software framework must replace all others, a solution advanced in the 2012 National Research Council (NRC) report, proved unnecessary and arguably undesirable.
Although ESMF does not provide a complete coupler component, it does include tools for building them. The calculation and application of interpolation weights are key operations in model coupling. An ongoing collaboration between CESM and ESMF led to joint development of the parallel ESMF grid remapping tools. The source and destination fields can be discretized on logically rectangular grids (ESMF_Grid), unstructured meshes (ESMF_Mesh), or observational data streams (ESMF_LocStream). The tools support two-dimensional (2D) and three-dimensional (3D) interpolation, regional and global grids, a number of interpolation methods (e.g., bilinear, first-order conservative, higher order, nearest neighbor), and options for pole treatments. For conservative interpolation, ESMF also supports the exchange grid (ESMF_XGrid) construct developed at GFDL, which enables sensitive flux computations to be performed on a fine grid defined by superimposing the grids of the interacting components (Balaji et al. 2006). A set of ESMF utility classes, including clocks for managing model time and utilities for functions like I/O and message logging, is also available.
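To make the split between calculating and applying interpolation weights concrete, the sketch below works through first-order conservative remapping in one dimension. It is a toy illustration in plain Python, not the ESMF algorithm or API: the function names are hypothetical, cells are simple intervals, and each weight is the fraction of a destination cell covered by a source cell. The weights are computed once and stored sparsely; applying them is a sparse matrix-vector product that can be repeated every coupling step.

```python
def conservative_weights(src_edges, dst_edges):
    """Return sparse weights as (dst_index, src_index, weight) triples."""
    weights = []
    for i in range(len(dst_edges) - 1):
        d_lo, d_hi = dst_edges[i], dst_edges[i + 1]
        for j in range(len(src_edges) - 1):
            s_lo, s_hi = src_edges[j], src_edges[j + 1]
            overlap = min(d_hi, s_hi) - max(d_lo, s_lo)
            if overlap > 0.0:
                # Fraction of destination cell i that is covered by source cell j.
                weights.append((i, j, overlap / (d_hi - d_lo)))
    return weights


def apply_weights(weights, src_values, n_dst):
    """Sparse matrix-vector product: dst[i] = sum over j of w[i, j] * src[j]."""
    dst = [0.0] * n_dst
    for i, j, w in weights:
        dst[i] += w * src_values[j]
    return dst


src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]   # four source cells of width 1
dst_edges = [0.0, 2.0, 4.0]             # two destination cells of width 2
weights = conservative_weights(src_edges, dst_edges)
coarse = apply_weights(weights, [1.0, 3.0, 5.0, 7.0], n_dst=2)
print(coarse)  # [2.0, 6.0]
```

The integral of the field is preserved: the source cells sum to 16 (width 1 each), and the destination cells give 2 × 2 + 6 × 2 = 16.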
ESMF provides component interfaces, data structures, and methods with few constraints about how to use them. This flexibility enabled it to be adopted by many coupled modeling systems,4 but it limited the interoperability across these systems. To address this issue, the NUOPC consortium developed a set of coupling conventions and generic representations of coupled modeling system elements—drivers, models, connectors, and mediators—called the NUOPC Layer (see www.earthsystemcog.org/projects/nuopc/).
NUOPC drivers are responsible for invoking and sequencing model, mediator, and connector components. The NUOPC model component offers a way to write caps for science model components that are not application specific. The caps provide access to fields imported, fields exported, and clock information through the ESMF component APIs. Mediators contain custom coupling code, for example, reconciliation of masks from different model components. Mediators may leverage the ESMF grid remapping capabilities or use another grid remapping package. The driver creates connector components for models and mediators that need to exchange data. The connectors determine which exchange fields are equivalent, usually at initialization, and use this information to execute data transfers at runtime. The connectors can automatically perform simple field data transformations and transfers using ESMF library calls for redistribution and grid remapping. Table 1 summarizes NUOPC generic components and their roles. Since connectors can manage field exchanges directly between model components, a mediator component only needs to be created when custom operations are needed in the field interchange. Figure 1 is a schematic of two model configurations built using NUOPC generic components, one with a mediator and one without. The NUOPC Layer also supports more complicated component arrangements involving ensembles and component hierarchies.
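The difference between a connector and a mediator can be sketched as follows. This is an illustrative Python toy, not NUOPC code: states are plain dictionaries, fields are matched by identical name, and the mediator's custom operation is assumed to be a merge of ocean and ice temperatures weighted by ice fraction (one plausible example of custom coupling code). The connector decides what to transfer once, at initialization, and then only moves data; the mediator is needed solely because the atmosphere wants a field that no single model exports.

```python
class Connector:
    """Transfers fields from one component's export state to another's import state."""

    def __init__(self, export_state, import_state):
        self.export_state = export_state
        self.import_state = import_state
        self.matched = []

    def initialize(self):
        # Decide once which exchange fields are equivalent (same name here).
        self.matched = sorted(set(self.export_state) & set(self.import_state))

    def run(self):
        # Runtime transfer of the fields matched at initialization.
        for name in self.matched:
            self.import_state[name] = self.export_state[name]


def mediate_surface_temperature(med_import, med_export):
    """Custom coupling code: merge ocean and ice temperatures by ice fraction."""
    med_export["surface_temperature"] = [
        f * t_ice + (1.0 - f) * t_ocn
        for f, t_ice, t_ocn in zip(
            med_import["sea_ice_area_fraction"],
            med_import["sea_ice_temperature"],
            med_import["sea_surface_temperature"],
        )
    ]


# States declare their fields up front; None marks a field not yet filled.
ocn_export = {"sea_surface_temperature": [275.0, 272.0]}
ice_export = {"sea_ice_area_fraction": [0.0, 0.5], "sea_ice_temperature": [270.0, 268.0]}
med_import = dict.fromkeys(
    ["sea_surface_temperature", "sea_ice_area_fraction", "sea_ice_temperature"]
)
med_export = {"surface_temperature": None}
atm_import = {"surface_temperature": None}

to_mediator = [Connector(ocn_export, med_import), Connector(ice_export, med_import)]
to_atmosphere = Connector(med_export, atm_import)
for connector in to_mediator + [to_atmosphere]:
    connector.initialize()

# One pass through a run sequence: models -> mediator -> atmosphere.
for connector in to_mediator:
    connector.run()
mediate_surface_temperature(med_import, med_export)
to_atmosphere.run()
```

If the atmosphere could use the ocean temperature as is, a single `Connector(ocn_export, atm_import)` would suffice and no mediator would be required.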
To specialize generic components, the modeler creates callbacks to their own code at clear specialization points.5 NUOPC Layer calls mainly appear in parts of a coupled modeling system related to component creation and sequencing, and may be interspersed with calls to ESMF time management, grid remapping, and other methods. The NUOPC generic components use the ESMF component data types and their initialize/run/finalize methods.
All of the generic NUOPC components carry standard metadata that describe how to operate them. Perhaps the most important metadata are a specification of three maps: an InitializePhaseMap, a RunPhaseMap, and a FinalizePhaseMap, which associate specific, labeled phases with ESMF component initialize, run, and finalize methods, respectively. This structure, together with the import/export fields and clocks passed through the ESMF component APIs, provides the information needed to allow the model, mediator, and connector components to be managed by a generic driver. Figure 2 shows the syntax of a sample configure file that is read by a driver to invoke models, a mediator, and connectors in a run sequence.
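The role of the phase maps can be illustrated with a short sketch. It is a Python toy under stated assumptions: the phase labels (`advertise`, `realize`, `advance`, `cleanup`) and the names `ToyComponent` and `generic_driver` are hypothetical, and the run sequence is given as a Python list rather than in the configure-file syntax of Fig. 2, which is not reproduced here. Because every component publishes label-to-method maps, the driver can operate any set of components without knowing anything else about them.

```python
INITIALIZE_ORDER = ("advertise", "realize")  # assumed standard ordering of phases


class ToyComponent:
    """Hypothetical component that publishes its phases as label-to-method maps."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.initialize_phase_map = {
            "advertise": lambda: self._record("advertise"),
            "realize": lambda: self._record("realize"),
        }
        self.run_phase_map = {"advance": lambda: self._record("advance")}
        self.finalize_phase_map = {"cleanup": lambda: self._record("cleanup")}

    def _record(self, label):
        # Stand-in for real work: note which phase of which component ran.
        self.log.append(f"{self.name}:{label}")


def generic_driver(components, run_sequence, n_steps):
    """Operate components using only their phase maps and a run sequence."""
    # Initialize phase by phase, so all components advertise before any realize.
    for label in INITIALIZE_ORDER:
        for comp in components.values():
            if label in comp.initialize_phase_map:
                comp.initialize_phase_map[label]()
    # Each run sequence element names a component and one of its labeled phases.
    for _ in range(n_steps):
        for comp_name, label in run_sequence:
            components[comp_name].run_phase_map[label]()
    for comp in components.values():
        for phase in comp.finalize_phase_map.values():
            phase()


log = []
components = {"atm": ToyComponent("atm", log), "ocn": ToyComponent("ocn", log)}
generic_driver(components, [("atm", "advance"), ("ocn", "advance")], n_steps=2)
```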
While use of the NUOPC Layer cannot guarantee scientific compatibility (see sidebar “Limits of Component Interoperability”), it does guarantee a set of component behaviors related to technical interoperability. These are described in NUOPC (2016). Specifically, it ensures that a component will provide the following elements.
A GNU makefile fragment that defines a small set of prescribed variables.6 Each component keeps its native build system but extends it to include make targets that produce a library containing the NUOPC-capped version of the component together with the makefile fragment file. This makefile fragment is used by the build system of the coupled modeling system to link the external components into a single executable.
A single public entry point, called SetServices. Standardizing this name enables code that registers components to be written generically.
An InitializePhaseMap, which describes a sequence of standard initialize phases drawn from a set of Initialize Phase Definitions. One standard phase advertises the fields a model or mediator can provide, using standard names that are checked for validity against a NUOPC Field Dictionary. Standard names included with the dictionary are drawn from the climate and forecast (CF) conventions (Eaton et al. 2011). Names that are not CF compliant can be used as aliases for CF names, or added as new dictionary entries. Connectors match fields with equivalent standard names. In a later standard phase, model and mediator components check the connection status of the advertised fields and realize those fields that will be exchanged. There are additional standard initialization phases that can be used to transfer grid information between components and to satisfy data dependencies.
A RunPhaseMap, which includes labeled run phases. The modeler sets up a run sequence by adding elements to a generic driver. An element in the run sequence can be either a labeled phase from a specific component or source and destination component names that will define a connector. As it executes, each phase must check the incoming clock of the driver and the time stamps of incoming fields against its own clock for compatibility. The component returns an error if incompatibilities are detected.
Time stamps on its exported fields consistent with the internal clock of the component.
A FinalizePhaseMap, which includes a method that cleans up all allocations and file handles.
NUOPC Layer compliance guarantees certain aspects of technical interoperability, but it does not guarantee that all components of the same type—for instance, all NUOPC-wrapped atmosphere models—will be scientifically viable in a given coupled modeling system. A simple example of scientific incompatibility is one in which the exported fields available do not match the imported fields needed for a component to run. Other incompatibilities can originate in how the scope of the component is defined (i.e., which physical processes are included), and in assumptions about how the component will interact with other components.* For example, some coupled modeling systems implement an implicit interaction between atmosphere and land models, while others take a simpler explicit approach. Whether a component can adapt to a range of configurations and architectures is determined by whether scientific contingencies are built into it by the developer. The components in the ESPS are currently limited to major physical domains, since many of the models in this category, such as CAM, CICE, and HYCOM, have been built with the scientific flexibility needed to operate in multiple coupled modeling systems and coupling configurations.
*Alexander and Easterbrook (2011) provide a high-level look at variations in the component architecture of climate models.
These constraints, involving build dependencies, initialization sequencing, and run sequencing, are the focus of the NUOPC Layer because they are required to satisfy the definition of effective interoperability. The constraints nonetheless allow for the representation of many different model control sequences. They enable contingencies, such as what to do if an import field is not available, to be handled in a structured way.
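Three of the behaviors listed above (validating advertised names against a field dictionary, realizing only the fields that are connected, and checking clocks and time stamps during a run phase) can be sketched as follows. This is a simplified Python illustration, not the NUOPC implementation: the dictionary holds two CF-style names chosen for the example, the aliases are invented, and "compatible" is assumed to mean "equal," which is stricter than a real system needs to be.

```python
# Stand-in field dictionary and aliases; the entries are illustrative only.
FIELD_DICTIONARY = {"sea_surface_temperature", "air_pressure_at_sea_level"}
ALIASES = {"sst": "sea_surface_temperature", "pmsl": "air_pressure_at_sea_level"}


def advertise(names):
    """Validate advertised names, resolving aliases to dictionary entries."""
    resolved = []
    for name in names:
        standard = ALIASES.get(name, name)
        if standard not in FIELD_DICTIONARY:
            raise ValueError(f"'{name}' is not in the field dictionary")
        resolved.append(standard)
    return resolved


def realize(advertised, connected):
    """Keep only the advertised fields that a connector actually matched."""
    return [name for name in advertised if name in connected]


def check_run_phase(component_time, driver_time, import_time_stamps):
    """Raise if the driver clock or any import time stamp is incompatible."""
    if driver_time != component_time:
        raise RuntimeError(
            f"driver clock {driver_time} does not match component clock {component_time}"
        )
    stale = sorted(n for n, t in import_time_stamps.items() if t != component_time)
    if stale:
        raise RuntimeError("import fields with incompatible time stamps: " + ", ".join(stale))


advertised = advertise(["sst", "air_pressure_at_sea_level"])
realized = realize(advertised, connected={"sea_surface_temperature"})
check_run_phase(3600, 3600, {"sea_surface_temperature": 3600})  # passes silently
```

Handling a missing import then becomes a structured decision: a field that was advertised but never connected is simply absent from `realized`, and the component can decide at that point whether it can proceed.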
The ESMF/NUOPC software distribution is suitable for broad use as it has an open-source license, comprehensive user documentation, and a user support team. It is bundled with a suite of about 6,500 regression tests that runs nightly on about 30 different platform/compiler combinations. The regression tests include unit tests, system tests, examples, tests of realistic size, and tests of performance. With a few exceptions, the NUOPC Layer API has been stable and backward compatible since the ESMF, version 6.2.0, release in May 2013. The expectation is that backward compatibility will continue to be sustained through future releases. The software has about 6,000 registered downloads.
ESMF data structures can often reference native model data structures, and ESMF methods can invoke model methods without introducing significant performance overhead. Performance evaluation occurs on an ongoing basis, with reports posted online (at www.earthsystemcog.org/projects/esmf/performance). Reports show that the performance overhead of ESMF component wrappers is insignificant (see also Collins et al. 2005) and that the performance of key operations, such as sparse matrix multiply, is comparable to that of native implementations. The NUOPC version of CESM, still largely unoptimized, shows less than a 5% overhead when compared to the native CESM implementation.
The assessment of software ease of use depends to a large degree on the modeler’s past experience and preferences. ESMF and NUOPC are not based on pragma-style directives and contain little autogenerated code, except for overloading interfaces for multiple data types. This improves the readability of the infrastructure code and makes the flow of control easier to understand. Further, the capping approach to adoption keeps the infrastructure calls distinct from the native model code. The NUOPC Layer uses the logging feature that comes with ESMF to put backtraces into log files, which helps to make debugging easier.
THE EARTH SYSTEM PREDICTION SUITE.
The National Earth System Prediction Capability (National ESPC; see http://espc.oar.noaa.gov) combines the ESPC, initiated in 2010, and NUOPC, to extend the scope of the NUOPC program in several ways. The National ESPC goal is a global Earth system analysis and prediction system that will provide seamless predictions from days to decades, developed with contributions from a broad community. Expanding on NUOPC, the National ESPC includes additional research agency partners [NSF, NASA, and Department of Energy (DOE)], time scales of prediction that extend beyond short-term forecasts, and new modeling components (e.g., cryosphere, space).
To realize the National ESPC vision, major U.S. models must be able to share and exchange model components. Thus, the National ESPC project is coordinating development of an ESPS, a collection of NUOPC-compliant Earth system components and model codes that are technically interoperable, tested, documented, and available for integration and use. At this stage, ESPS focuses on coupled modeling systems and atmosphere, ocean, ice, and wave components.
ESPS partners are targeting the following inclusion criteria:
ESPS components and coupled modeling systems are NUOPC compliant.
ESPS codes are versioned.
Model documentation is provided for each version of the ESPS component or modeling system.
Regression tests are provided for each component and coupled modeling system configuration.
There is a commitment to continued NUOPC compliance and ESPS participation for new versions of the code.
ESPS is intended to formalize the steps in preparing codes for cross-agency application, and the inclusion criteria support this objective. NUOPC compliance is the primary requirement. It guarantees a well-defined, effective level of interoperability and enables the assembly of codes from multiple contributors. Table 2 shows the current NUOPC compliance status of ESPS components and coupled modeling systems.
At the time of this writing, not all of the inclusion criteria related to usability are satisfied for all candidate codes. Further, these criteria are likely to evolve. The extent of the metadata to be collected still needs to be determined, and specific requirements for regression tests have not yet been established. Refining the inclusion criteria and satisfying them for all codes is likely to take a period of years. However, a framework is now in place for moving forward. Current information is presented on the ESPS web page (www.earthsystemcog.org/projects/esps/).
Code development, compliance checking, and training tools.
The viability of ESPS depends on there being a straightforward path to writing compliant components. Several tools are available to facilitate development and compliance verification of ESPS components and coupled models. These include the command line–based NUOPC Compliance Checker and Component Explorer, both described in NUOPC (2016), and the graphical Cupid Integrated Development Environment (IDE; Dunlap 2015).
The NUOPC Compliance Checker is an analysis tool that intercepts component actions during the execution of a modeling application and assesses whether they conform to standard NUOPC Layer behaviors. It is linked by default to every application that uses ESMF and can be activated at runtime by setting an environment variable. When deactivated, it imposes no performance penalty. The Compliance Checker produces a compliance report that includes, for each component in an application, checks for the presence of the required initialize, run, and finalize phases; correct timekeeping; and the presence of required component and field metadata.
The Component Explorer is a runtime tool that analyzes a single-model component by acting as its driver. The tool offers a way of evaluating the behavior of the component outside of a coupled modeling application. It steps systematically through the phases defined by the component and performs checks, such as whether the required makefile fragment is provided, whether a NUOPC driver can link to the component, and whether error messages are generated if the required inputs are not supplied. For additional information, the Compliance Checker can be turned on while the Component Explorer is running. One practical test of NUOPC compliance is to run the candidate component in the Component Explorer with the Compliance Checker turned on and confirm that no warnings are generated. Sample output is shown in Fig. 3.
Cupid provides a comprehensive code editing, compilation, and execution environment with specialized capabilities for working with NUOPC-based codes. It is implemented as a plugin for Eclipse, a widely used IDE. A key feature of Cupid is the ability to create an outline that shows the NUOPC-wrapped components in the application; their initialize, run, and finalize phases; and their compliance status. The outline is presented to the developer side by side with a code editor, and a command line interface for compiling and running jobs. Cupid provides contextual guidance and can automatically generate portions of the code needed for compliance. Users can select among several prototype codes as the basis for training, or they can import their own model code into the environment. Figure 4 shows the Cupid graphical user interface.
Table 3 summarizes the tools described in this section and their main uses. Static analysis mode refers to the examination of code, while dynamic analysis mode refers to the evaluation of component behaviors during runtime.
ADAPTING MODELS FOR ESPS.
In this section, we describe the approach to adapting different sorts of codes for ESPS. We look at implementation of single-model components, wholly new coupled systems, and existing coupled systems.
Single-model components are the most straightforward to wrap with NUOPC Layer interfaces. Version 5 of the Modular Ocean Model (MOM5; Griffies 2012) and the Hybrid Coordinate Ocean Model (HYCOM; Halliwell et al. 1998, 2000; Bleck 2002) are examples of this case. Both ocean models had previously been wrapped with ESMF interfaces, and both had the distinct initialize, run, and finalize standard methods required by the framework. For NUOPC compliance, a standard sequence of initialize phases was added, and conformance with the NUOPC Field Dictionary was checked. The process of wrapping MOM5 and HYCOM with NUOPC Layer code required minimal changes to the existing model infrastructure. For both MOM5 and HYCOM, NUOPC changes can be switched off, and MOM5 can still run with GFDL’s in-house FMS framework.
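The wrapping pattern described above, with a sequence of initialize phases followed by run and finalize, can be illustrated with a toy Python sketch. This is not the ESMF or NUOPC API: the class, the two initialize phase names, and the driver function are hypothetical stand-ins chosen only to show the calling order a driver imposes on a wrapped component.

```python
class ToyOcean:
    """Stand-in for a model wrapped with initialize, run, and finalize methods."""

    def __init__(self):
        self.log = []
        self.time = 0

    # Two illustrative initialize phases; actual NUOPC phase definitions differ.
    def init_advertise(self):
        self.log.append("advertise fields")

    def init_realize(self):
        self.log.append("realize fields")

    def run(self, dt):
        self.time += dt
        self.log.append(f"run to t={self.time}")

    def finalize(self):
        self.log.append("finalize")


def drive(component, n_steps, dt):
    """Call the initialize phases in order, then run n_steps, then finalize."""
    for phase in (component.init_advertise, component.init_realize):
        phase()
    for _ in range(n_steps):
        component.run(dt)
    component.finalize()
    return component.log
```

Because the driver only touches the wrapper methods, the model code behind them can stay unchanged, which is the property the MOM5 and HYCOM wrapping relied on.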
The construction of newly coupled systems is the next step in complexity. The Navy’s global modeling system and the NOAA Environmental Modeling System (NEMS; Iredell et al. 2014) are examples in this category. Navy developers coupled the Navy Operational Global Atmospheric Prediction System (NOGAPS; Rosmond 1992; Bayler and Lewit 1992) and HYCOM by introducing simple NUOPC connectors between the models, and they were able to easily switch in the newer Navy Global Environmental Model (NAVGEM) atmosphere (Hogan et al. 2014) when it became available. This work leveraged ESMF component interfaces introduced into NOGAPS as part of the Battlespace Environments Institute (BEI; Campbell et al. 2010). The NUOPC-based HYCOM code from this coupled system was a useful starting point for coupling HYCOM with components in NEMS and the CESM.
NEMS is an effort to organize a growing set of operational models at the National Centers for Environmental Prediction under a unifying framework. The first coupled application in NEMS connects the Global Spectral Model [GSM; previously the Global Forecast System (GFS); EMC 2003] to HYCOM and MOM5 ocean components and the Los Alamos Sea Ice Model (CICE; Hunke et al. 2015). The NUOPC mediator manages a fast atmosphere and ice coupling loop and a slower ocean coupling loop (visible in Fig. 2). Components that are capped with NUOPC and that are in the process of being introduced into NEMS include the WaveWatch III model (Tolman 2002), the Ionosphere Plasmasphere Electrodynamics (IPE) model [based on an earlier model described in Fuller-Rowell et al. (1996) and Millward et al. (1996)], and a hydraulic component implemented using the Weather Research and Forecasting (WRF) Model Hydrological modeling extension package (WRF-Hydro) (Gochis et al. 2013). Figure 5 shows NEMS components, current and planned.
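The two nested coupling loops mentioned above can be sketched as a simple schedule generator. This is an illustrative Python sketch, not the NEMS run sequence: the function name, the component labels, and the ordering of calls within each loop are assumptions, and the time steps are arbitrary.

```python
def coupling_calls(total, slow_dt, fast_dt):
    """List (time, component) calls for a fast ATM/ICE loop nested in a slow OCN loop.

    Assumes slow_dt is a whole multiple of fast_dt and total of slow_dt.
    """
    if fast_dt <= 0 or slow_dt <= 0 or slow_dt % fast_dt != 0 or total % slow_dt != 0:
        raise ValueError("need total % slow_dt == 0 and slow_dt % fast_dt == 0")
    calls = []
    for t_slow in range(0, total, slow_dt):
        # Fast loop: the mediator serves the atmosphere and ice every fast_dt.
        for t_fast in range(t_slow, t_slow + slow_dt, fast_dt):
            calls.append((t_fast, "MED"))
            calls.append((t_fast, "ATM"))
            calls.append((t_fast, "ICE"))
        # Slow loop: the ocean advances once per slow_dt.
        calls.append((t_slow, "OCN"))
    return calls
```

With total=7200, slow_dt=3600, and fast_dt=1800 (seconds, chosen for illustration), the atmosphere and ice are each called four times and the ocean twice.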
Adapting an existing coupled modeling system for NUOPC compliance is most challenging, since adoption must work around the native code. The CESM, the Coupled Ocean–Atmosphere Mesoscale Prediction System (COAMPS; Hodur 1997; Chen et al. 2003), and ModelE (Schmidt et al. 2006) are examples of this. In CESM, a fully coupled model that includes atmosphere, ocean, sea ice, land ice, land, river, and wave components, ESMF interfaces have been supported at the component level since 2010, when it was known as the Community Climate System Model, version 4 (CCSM4). However, the CESM driver was based on the MCT data type. Recently, the driver was rewritten to accommodate the NUOPC Layer. By introducing a new component data type in the driver, either NUOPC component interfaces or the original component interfaces that use MCT data types can be invoked. These changes did not require significant modifications to the internals of the model components themselves.
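One way to picture a driver-side component type that can invoke either kind of interface is a thin handle that records which interface a model uses and dispatches accordingly. The Python sketch below is a loose illustration under that assumption and is not the CESM driver design; all names, the "nuopc" and "native" tags, and the toy ocean response are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DriverComponent:
    """Hypothetical driver-side handle hiding which interface a model uses."""
    name: str
    interface: str  # "nuopc" or "native"
    run_impl: Callable[[Dict[str, float]], Dict[str, float]]

    def run(self, fields):
        if self.interface not in ("nuopc", "native"):
            raise ValueError(f"unknown interface: {self.interface}")
        return self.run_impl(fields)


def nuopc_ocean_run(fields):
    # Stand-in for a call made through a NUOPC-style component interface.
    return {"sst": 290.0 + 0.01 * fields["heat_flux"]}


def native_ocean_run(fields):
    # Stand-in for a call made through the original (MCT-style) interface.
    return {"sst": 290.0 + 0.01 * fields["heat_flux"]}


def step(components, fields):
    """Run every component once; the driver never inspects model internals."""
    return {c.name: c.run(fields) for c in components}
```

The driver loop in step is identical for both kinds of component, which mirrors the point that the model internals did not need significant modification.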
Incorporating the NUOPC Layer into COAMPS involved refactoring the existing ESMF layer in each of its constituent model components and implementing a new top-level driver/coupler layer. As with the global Navy system, NAVGEM, ESMF component interfaces had been introduced as part of BEI. The COAMPS system includes the nonhydrostatic COAMPS atmosphere model coupled to the Navy Coastal Ocean Model (NCOM; Martin et al. 2009) and the Simulating Waves Nearshore (SWAN) model (Booij et al. 1999). Refactoring to introduce the NUOPC Layer into each model component involved changing the model ESMF initialize method into multiple standard phases. The representation of import/export fields was also changed to use the NUOPC Field Dictionary. These changes were straightforward and limited to the model ESMF wrapper layer. An effort that is just beginning involves wrapping the Navy Environmental Prediction System Utilizing the Nonhydrostatic Unified Model of the Atmosphere (NUMA) Core (NEPTUNE), a nonhydrostatic model that uses an adaptive grid scheme (Kelly and Giraldo 2012; Gaberšek et al. 2012; Kopera and Giraldo 2014; Giraldo et al. 2013), with a NUOPC Layer interface, as a candidate for the Navy's next-generation regional and global prediction systems.
When NUOPC Layer implementation began in ModelE, the degree of coarse-grained modularization was sufficiently complete that the ModelE atmosphere could be run with four different ocean models (data, mixed layer, and two dynamic versions), and the two dynamic oceans could both be run with a data atmosphere. At this time, atmosphere and mixed layer ocean models are wrapped as NUOPC components, and can be driven using a NUOPC driver. Specification of the multiphase coupled run sequence was easily handled via NUOPC constructs. Mediators will provide crucial flexibility to apply nontrivial field transformations as more complex coupled configurations are migrated.
Developers of the Goddard Earth Observing System Model, version 5 (GEOS-5), atmospheric model (Molod et al. 2012) incorporated ESMF into the model design from the start, using the framework to wrap both major components and many subprocesses. To fill in gaps in ESMF functionality, the GEOS-5 development team developed software called the Modeling Analysis and Prediction Layer (MAPL). A challenge for bringing GEOS-5 into ESPS is translating the MAPL rules for components into NUOPC components, and vice versa. A joint analysis by leads from the MAPL and NUOPC groups revealed that the systems are fundamentally similar in structure and capabilities (da Silva et al. 2014). The feature that most contributes to this compatibility is that neither NUOPC nor MAPL introduces new component data types—both are based on components that are native ESMF data types (ESMF_GridComp and ESMF_CplComp). MAPL has been integrated into the ESMF/NUOPC software distribution and set up so that refactoring can reduce redundant code in the two packages. Although the GEOS-5 model is advanced with respect to its adoption of ESMF, most of the work in translating between MAPL and NUOPC still lies ahead.
RESEARCH AND PREDICTION WITH COMMUNITY INFRASTRUCTURE.
Community-developed ESMF and NUOPC Layer infrastructure supports scientific research and operational forecasting. This section describes examples of scientific advances that ESPS and related infrastructure have facilitated at individual modeling centers, and the opportunities they bring to the management of multimodel ensembles.
Modeling and data center impacts.
This section provides examples of how the use of ESMF and NUOPC Layer software has benefited modeling efforts.
NAVGEM–HYCOM–CICE: The NAVGEM–HYCOM–CICE modeling system, coupled using NUOPC Layer infrastructure, is being used for research at the Naval Research Laboratory and is in preparation for operational transition in several years. An initial study, using just NAVGEM and HYCOM, examined the onset of a Madden–Julian oscillation (MJO) event in 2011 (M. Peng and C. Chen 2013, poster presentation). For stand-alone NAVGEM, the onset signature was basically absent. The coupled system was able to reasonably simulate the onset signature compared with Tropical Rainfall Measuring Mission (TRMM) measurements. With the addition of the CICE ice model, this system is now being used to explore the growing and melting of sea ice over the Antarctic and Arctic regions.
COAMPS and COAMPS for tropical cyclones (TC): The COAMPS model is run in research and operations by the U.S. Department of Defense and others for short-term numerical weather prediction. COAMPS-TC is a configuration of COAMPS specifically designed to improve TC forecasts (Doyle et al. 2014). Both use ESMF and NUOPC software for component coupling. The coupled aspects of COAMPS and COAMPS-TC were recently evaluated using a comprehensive observational dataset for Hurricane Ivan (Smith et al. 2013). This activity allowed for the evaluation of model performance based on recent improvements to the atmospheric, oceanic, and wave physics, while gaining a general but improved understanding of the primary effects of ocean–wave model coupling in high-wind conditions. The new wind input and dissipation source terms (Babanin et al. 2010; Rogers et al. 2012) and wave drag coefficient formulation (Hwang 2011), based on field observations, significantly improved SWAN’s wave forecasts for the simulations of Hurricane Ivan conducted in this study. In addition, the passing of ocean current information from NCOM to SWAN further improved the TC wave field.
GEOS-5: The NASA GEOS-5 atmosphere–ocean general circulation model is designed to simulate climate variability on a wide range of time scales, from synoptic time scales to multicentury climate change. Projects underway with the GEOS-5 AOGCM include weakly coupled ocean–atmosphere data assimilation, seasonal climate predictions, and decadal climate prediction tests within the framework of the Coupled Model Intercomparison Project, phase 5 (CMIP5; Taylor et al. 2012). The decadal climate prediction experiments are being initialized using the weakly coupled atmosphere–ocean data assimilation based on Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011). All components are coupled together using ESMF interfaces.
NEMS: The NEMS modeling system under construction at NOAA is intended to streamline development and create new knowledge and technology transfer paths. NEMS will encompass multiple coupled models, including future implementations of the Climate Forecast System (CFS; Saha et al. 2014), the Next Generation Global Prediction System (NGGPS; Lapenta 2015), and regional hurricane forecast models. The new CFS will couple global atmosphere, ocean, sea ice, and wave components through the NUOPC Layer for advanced probabilistic seasonal and monthly forecasts. NGGPS is being designed to improve and extend weather forecasts to 30 days, and will include ocean and other components coupled to an atmosphere. The NEMS hurricane forecasting capability will have nested mesoscale atmosphere and ocean components coupled through the NUOPC Layer for advanced probabilistic tropical storm-track and intensity prediction. Early model outputs from the atmosphere (GSM), ocean (MOM5), and sea ice (CICE) three-way coupled system in NEMS are currently being evaluated.
CESM: The CESM coupled global climate model enables state-of-the-art simulations of Earth’s past, present, and future climate states and is one of the primary climate models used for national and international assessments. A recent effort involves coupling HYCOM to CESM components using NUOPC Layer interfaces. A scientific goal of the HYCOM–CESM coupling is to assess the impact of hybrid versus depth coordinates in the representation of our present-day climate and climate variability. The project leverages an effort to couple HYCOM to an earlier version of CESM, the Community Climate System Model, version 3 (CCSM3; J. Lu et al. 2013, unpublished manuscript; J.-P. Michael et al. 2013, unpublished manuscript).
ESPS opportunities for managed and interactive ensembles.
In the weather and climate prediction communities, ensemble simulations are used to separate signal from noise, to reduce some of the model-induced errors, and to improve forecast skill. Uncertainty and errors come from several sources.
Initial condition uncertainty associated with errors in our observing systems or in how the observational estimates are used to initialize prediction systems (model uncertainty/errors play a significant role here).
Uncertainty or errors in the observed and modeled external forcing. This can be either natural (changes in solar radiation reaching the top of the atmosphere; changes in atmospheric composition due to natural forcing, such as volcanic explosions; changes in the shape and topography of continents or ocean basins), or anthropogenic (changes in the atmospheric composition and land surface properties due to human influences).
Uncertainties or errors in the formulation of the models used to make the predictions and to assimilate the observations. These uncertainties and errors are associated with a discrete representation of the climate system and the parameterization of subgrid physical processes. The modeling infrastructure development described here is ideally suited to quantify the uncertainty due to errors in model formulation, and where possible reduce this uncertainty.
To account for initial condition uncertainty, it is standard practice to perform a large ensemble of simulations with a single model by perturbing the initial conditions. The ensemble mean or average is typically thought of as an estimate of the signal, and the ensemble spread, or even the entire distribution, is used to quantify the uncertainty (or noise) due to errors in the initial conditions. In terms of uncertainty in the external forcing, the model simulations that are used to inform the Intergovernmental Panel on Climate Change (IPCC) use a number of different scenarios for projected greenhouse gas forcing to bracket possible future changes in the climate. In both of the abovementioned examples, it is also standard practice to use multiple models to quantify uncertainty in model formulation and to reduce model-induced errors.
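The roles of the ensemble mean and spread can be made concrete with a short Python sketch. It assumes each ensemble member is a list of values at the same forecast times, takes the member average as the signal estimate, and uses the sample standard deviation as one possible measure of spread; the function name and the choice of standard deviation are illustrative assumptions.

```python
from statistics import mean, stdev


def ensemble_summary(members):
    """Return (signal, spread) per forecast time for equal-length member series.

    signal: ensemble mean at each time.
    spread: sample standard deviation across members at each time.
    """
    if len(members) < 2:
        raise ValueError("need at least two ensemble members")
    n_times = len(members[0])
    if any(len(m) != n_times for m in members):
        raise ValueError("all members must cover the same forecast times")
    signal = [mean(m[t] for m in members) for t in range(n_times)]
    spread = [stdev([m[t] for m in members]) for t in range(n_times)]
    return signal, spread
```

For three members with values 1, 3, and 5 at some forecast time, the signal estimate is 3 and the spread is 2; where all members agree, the spread is 0.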
The use of multimodel ensembles falls into two general categories, both of which are easily accommodated by ESPS. The first category is an a posteriori approach where ensemble predictions from different models are combined, after the simulation or prediction has been run, into a multimodel average or probability distribution that takes advantage of complementary skill and errors. This approach is the basis of several international collaborative prediction research efforts [e.g., North American Multi-Model Ensemble, Ensemble-Based Predictions of Climate Changes and Their Impacts (ENSEMBLES)] and climate change projection (CMIP) efforts, and there are numerous examples of how this multimodel approach yields superior results compared to any single model (e.g., Kirtman et al. 2014). In this case, the multimodel average estimates the signal that is robust across different model formulations and initial condition perturbations. The distribution of model states is used to quantify uncertainty due to model formulation and initial condition errors. While this approach has proven to be quite effective, it is generally ad hoc, in the sense that the chosen models are simply those that are readily available. The ESPS development described here allows for a more systematic approach, in that individual component models can be easily interchanged within the context of the same coupling infrastructure (e.g., exchanging atmospheric components: CAM5 for GEOS-5), thus making it possible to isolate how the individual component models contribute to uncertainty and complementary skill and errors. For simplicity we refer to ensembles built by interchanging or exchanging component models as managed ensembles.
The second category can be viewed as an a priori technique, in the sense that the model uncertainty is “modeled” as the model evolves. This approach recognizes that the dynamic and thermodynamic equations have irreducible uncertainty and that this uncertainty should be included as the model evolves. This argument is the scientific underpinning for the multimodel interactive ensemble approach. The basic idea is to take advantage of the fact that the multimodel approach can reduce some of the model-induced error, with the difference that this reduction is incorporated as the coupled system evolves. In ESPS we can use the atmospheric component models from, say, CAM5 and GEOS-5 simultaneously as the coupled system evolves, and, for example, combine the fluxes (mean or weighted average) from the two atmospheric models to communicate with the single ocean component model. Moreover, it is even possible to sample the atmospheric fluxes in order to introduce state-dependent and nonlocal stochasticity into the coupled system to model the uncertainty due to model formulation. Forerunners of the approach have been implemented within the context of CCSM to study how atmospheric weather noise impacts climate variability (Kirtman et al. 2009, 2011) and seasonal forecasts in the NOAA operational prediction system (Stan and Kirtman 2008).
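The flux combination at the heart of the interactive ensemble idea can be sketched in a few lines of Python. The sketch assumes the two atmospheres deliver fluxes on the same set of ocean points, and it shows both a weighted average and a random selection between the two fields at a coupling step. The function names, the single scalar weight, and the whole-field sampling rule are illustrative assumptions, not the scheme used in the cited studies.

```python
import random


def combine_fluxes(flux_a, flux_b, weight_a=0.5):
    """Weighted average of two atmospheres' surface fluxes, point by point."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(flux_a) != len(flux_b):
        raise ValueError("flux fields must be on the same points")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(flux_a, flux_b)]


def sample_fluxes(flux_a, flux_b, rng):
    """Alternative: pass one atmosphere's whole field, chosen at random, to the ocean."""
    return list(rng.choice((flux_a, flux_b)))
```

With weight_a=0.5 the ocean sees the plain mean of the two fields; calling sample_fluxes with a seeded random.Random instance at every coupling step instead injects variability drawn from the two models.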
Next steps include continued development of NUOPC-based coupled modeling systems, ongoing improvements to ESPS metadata and user access information, exploration of the opportunities ESPS affords in creating new ensemble systems, and addition of capabilities to the infrastructure software itself. Whether to extend the ESPS to other types of components is an open question. Developers have already implemented NUOPC Layer interfaces on components that do not fall into the initial ESPS model categories, including WRF-Hydro, the Community Land Model (CLM), and the IPE model.
The continued incorporation of additional processes into models, the desire for more seamless prediction across temporal scales, and the demand for more information about the local impacts of climate change are some of the motivations for linking frameworks from multiple disciplines. The NSF-funded Earth System Bridge project is building converters that will enable NUOPC codes to be run within the Community Surface Dynamics Modeling System (CSDMS), which contains many smaller models representing local surface processes, and CSDMS codes to be run within ESMF. The ESMF infrastructure is also being used to develop web service coupling approaches in order to link weather and climate models to frameworks that deliver local and regional information products (Goodall et al. 2013).
A critical aspect of future work is the evaluation and evolution of NUOPC and ESMF software for emerging computing architectures. A primary goal for common infrastructure, such as the NUOPC Layer, is to do no harm and to allow for optimizations within component models. However, the NUOPC infrastructure also offers new optimization opportunities for coupled systems. The formalization of initialize and run phases allows components to send information to the driver about their ability to exploit heterogeneous computing resources. The driver has the potential to negotiate an optimal layout by invoking a mediator or other component that does resource mapping. This holds great potential in dealing with systems that have an increasing number of components and will benefit from running efficiently on accelerator-based computer hardware.
Among the planned extensions to NUOPC protocols are hardware resource management between components and the negotiation of data placement of distributed objects. Both extensions leverage the ESMF “virtual machine” or hardware interface layer, already extended under an ESPC initiative to be coprocessor aware. The awareness of data location can also be used to minimize data movement and reference data where possible during coupling. Finally, there is interest in optimizing the grid remapping operation between component grids in the mediator by choosing an optimal decomposition of the transferred model grid. This optimization requires extra negotiation between the components that could be made part of the existing NUOPC component interactions.
Through the actions of a succession of infrastructure projects in the Earth sciences over the last two decades, a common model architecture (CMA) has emerged in the U.S. modeling community. This has enabled high-level model components to be wrapped in community-developed ESMF and NUOPC interfaces with few changes to the model code inside, in a way that retains much of the native model infrastructure. The components in the resulting systems possess a well-defined measure of technical interoperability. The ESPS, a collection of multiagency coupled weather and climate systems that complies with these standard interfaces, is a tangible outcome of this coordination. It is a direct response to the recommendations of a series of National Research Council and other reports recommending common modeling infrastructure, and a national asset resulting from the commitment of the agencies involved in Earth system modeling to work together to address global challenges.
The National Aeronautics and Space Administration’s Computational Modeling Algorithms and Cyberinfrastructure program provides support for ESMF, the Cupid Integrated Development Environment, and integration of ESMF and the NUOPC Layer with ModelE (NNX12AP51G, NNX16AB20G). The National Aeronautics and Space Administration’s Modeling Analysis and Prediction program supports ESMF and the integration of ESMF and the NUOPC Layer with the GEOS-5 model (NNX11AL82G). The National Oceanic and Atmospheric Administration Climate Program Office provides support for ESMF and the development of the Climate Forecast System using NUOPC Layer tools. The National Weather Service supports ESMF and NUOPC Layer development, and development of the Next Generation Global Prediction System using NUOPC Layer tools (NA15OAR4310103, NA12OAR4320137). The Department of Defense Office of Naval Research supports ESMF and NUOPC development, including adaption for emerging computer architectures, and the integration of the NUOPC Layer into the Community Earth System Model and Navy models (N00014-13-1-0508, N00014-13-1-0845). The High Performance Computing Modernization Program provides support for development of asynchronous I/O capabilities in ESMF (PP-CWO-KY06-001-P3). The National Science Foundation provided support for early development of ESMF and support for integration of hydrology and land components into NEMS (1343811). Computing resources for testing infrastructure and implementing it in applications were provided by the National Center for Atmospheric Research Computational and Information Systems Laboratory (CISL), sponsored by the National Science Foundation and other agencies; the Oak Ridge Leadership Computing Facility, located in the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science (BER) of the Department of Energy; the NASA Center for Climate Simulation; and the NOAA Environmental Security Computer Center.
V. Balaji is supported by the Cooperative Institute for Climate Science, Princeton University, under Award NA08OAR4320752 from the National Oceanic and Atmospheric Administration, U.S. Department of Commerce. The statements, findings, conclusions, and recommendations are those of the authors and do not necessarily reflect the views of Princeton University, the National Oceanic and Atmospheric Administration, or the U.S. Department of Commerce. The authors thank Richard Rood and Anthony Craig for their insightful comments on the original manuscript, Donald Anderson for his guidance and advocacy, and Matthew Rothstein for his contributions to understanding the performance of NUOPC modeling applications.
Not all coupling technologies follow these architectural patterns. For example, in the Ocean Atmosphere Sea Ice Soil (OASIS) coupler (Valcke 2013) used by many European climate models, components are run as separate, linked software programs or “multiple executables” and in general do not require that fields transferred between components pass through a component interface. However, the most recent versions of the OASIS coupler now support single executables as well. Valcke et al. (2012) include some discussion of the relative advantages of single versus multiple executable strategies.
Specialization points are places where the generic code implemented in the NUOPC Layer calls back into user-provided code for a specific purpose. Specialization points are indexed by system-specified string labels, such as “label_DataInitialize,” that indicate the purpose of the specialization. Some specializations are optional, and others are required.
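The callback mechanism described above can be pictured as a registry keyed by string labels. The following Python sketch is a toy illustration rather than the NUOPC Layer API: the class and method names are hypothetical, “label_DataInitialize” is taken from the text, and “label_Advance” is used here as an example of a required label by assumption.

```python
REQUIRED = {"label_Advance"}          # assumed required for this illustration
OPTIONAL = {"label_DataInitialize"}   # label named in the text


class GenericComponent:
    """Toy generic code that calls back into user code at labeled points."""

    def __init__(self):
        self._specializations = {}

    def specialize(self, label, callback):
        """Attach user code to a known specialization point."""
        if label not in REQUIRED | OPTIONAL:
            raise KeyError(f"unknown specialization label: {label}")
        self._specializations[label] = callback

    def initialize(self, state):
        # Optional point: call the user routine only if one was registered.
        callback = self._specializations.get("label_DataInitialize")
        if callback is not None:
            callback(state)
        return state

    def advance(self, state):
        # Required point: generic code cannot proceed without user code here.
        missing = sorted(REQUIRED - set(self._specializations))
        if missing:
            raise RuntimeError(f"missing required specializations: {missing}")
        self._specializations["label_Advance"](state)
        return state
```

The generic class owns the control flow, and user code supplies only the model-specific pieces, which is the division of labor that specialization points are meant to enforce.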
For example, ESMF_DEP_INCPATH, which is the include path to find module or header files during compilation.
Other components in the process of being wrapped in NUOPC interfaces for use with NEMS include the Nonhydrostatic Mesoscale Model on the B grid (NMMB; Janjić and Gall 2012) and the Princeton Ocean Model (POM; Blumberg and Mellor 1987), to be coupled for a regional system, and an alternate ice model, KISS (Grumbine 2013).