Unprecedented changes in the climate and environment have been observed in the three poles, including the North Pole, the South Pole, and the Third Pole–Tibetan Plateau. Although considerable data have been collected and several observation networks have been built in these polar regions, the three poles remain relatively data-scarce regions because field sites are difficult to access, labor costs are high, and living conditions are harsh. To address the obstacles to better understanding the unprecedented changes in the three poles and their effects on the global environment and humans, there is a pressing need for better data acquisition, curation, integration, service, and application to support fundamental scientific research and sustainable development for the three poles. CASEarth Poles, a project within the framework of the “CAS Big Earth Data Science Engineering” program of the Chinese Academy of Sciences, aims to construct a big data platform for the three poles. CASEarth Poles will be devoted to 1) breaking the bottleneck of polar data curation, integration, and sharing; 2) developing high-resolution remote sensing products over the three poles; 3) generating atmospheric reanalysis datasets for the polar regions; 4) exploring the synchronization, asynchronization, and teleconnection of the environmental changes in the three poles; 5) investigating the climate, water cycle, and ecosystem dynamics and the interactions among the multispheres in the polar regions and their global effects; and 6) supporting decision-making with regard to sea ice forecasting, infrastructure, and sustainable development in polar regions. CASEarth Poles will collaborate with international efforts to enable better data and information services for the three poles in the big data era.
The three poles, including the North Pole, the South Pole, and the Third Pole–Tibetan Plateau, are crucial areas of global climate and environmental changes (Stocker 2003; Qiu 2008; Callaghan et al. 2011; Yao et al. 2015; Overland et al. 2016; Shepherd 2016). The changes in the climate, ecology, water, and energy cycles on the three poles not only play significant roles in the polar regions but also closely connect to other regions through complex long-range atmospheric and ocean transfer in terms of energy, mass, and momentum (Li et al. 2014; Bintanja and Selten 2014; Woodard et al. 2014; Kug et al. 2015; van Gils et al. 2016; Overland et al. 2016; DeConto and Pollard 2016; Meehl et al. 2016; Shepherd 2016; Kim et al. 2017; Lee et al. 2017; Zou et al. 2017; Chen et al. 2018; Luo et al. 2019; Rintoul et al. 2018; Xue et al. 2018; Dai et al. 2019). Additionally, with increasing human activities in the polar regions, sustainable development in the three poles is also facing an enormous challenge (McConnell et al. 2007; Chown et al. 2012; Ford et al. 2015; Siegert 2017).
Unprecedented changes have been observed in the three poles in recent years, for example, the “polar amplification effect” and the elevation-dependent warming due to global warming (Pithan and Mauritsen 2014; Pepin et al. 2015; Gao et al. 2019), decreasing sea ice in the Arctic (Parkinson and Cavalieri 2012), slightly increasing sea ice in the Antarctic (Liu and Curry 2010), sea level rise (Gardner et al. 2013; Li et al. 2019) and greening tendency in the Antarctic (Amesbury et al. 2017), increasing Arctic vegetation coverage (Pearson et al. 2013), permafrost degradation (ACIA 2004; Cheng and Wu 2007; Li et al. 2008, 2012), glacier retreat (Meier et al. 2007; Yao et al. 2012; Cogley 2017; Kraaijenbrink et al. 2017), snow melting (Musselman et al. 2017; Wu et al. 2018), accelerating loss of Antarctic ice shelves (Paolo et al. 2015), warming effect of polar aerosol on glacier melting (Tomasi et al. 2015), slowing down of the thermohaline circulation (Chen et al. 2017), continuing increase in runoff in the Arctic (Peterson et al. 2002), rapid retreat of Greenland glaciers (Mouginot et al. 2015), and rapid cryospheric melt and water cycle intensification in the Third Pole (Yao et al. 2019). In addition, the changing environments in poles have deeply affected polar biology, such as a decline in the caribou (Fauchald et al. 2017) and penguin populations (Fretwell and Trathan 2009). These changes not only raise new scientific questions, such as understanding the spatiotemporal patterns, trends, mechanisms, and effects of polar changes, but also drive stakeholders to take countermeasures to address these unprecedented changes. Due to these factors, polar research is not only a dominating theme in the integrated multisphere study of the Earth system but is also in line with the interdisciplinary research on the nature–social science of the “Future Earth” research program (Future Earth 2013; Rockström 2016).
During the past decades, many scientific findings regarding the atmosphere, climate, water cycle, ecology, hydrology, biodiversity, and sustainability of the polar regions have been reported. Furthermore, a large amount of polar scientific data, such as ground-based observations, remote sensing data, multispherical model simulations, and data assimilation products, as well as socioeconomic data, has been produced along with these scientific activities. However, although considerable valuable data regarding polar changes have been obtained, there are still obvious gaps between the observed facts of polar change and our understanding of them. The limitations behind these data gaps can be summarized as follows.
The first limitation is the lack of an efficient strategy for the acquisition, curation, exchange, and sharing of polar data. Polar data now have the characteristics of big data because a tremendous amount of polar scientific data has been produced. Currently, these data are held by different institutions, such as the World Data Centers, National Snow and Ice Data Center (NSIDC), Canadian Cryospheric Information Network (CCIN), Scott Polar Research Institute, Chinese National Arctic and Antarctic Data Center, the Third Pole Environment Database, and other data centers. Different institutions and researchers independently generate, archive, and utilize their own data in polar research with different standards and goals. Most of these data can be accessed from open-source domains. However, these domains are operated under their own strategic plans with various standards. The lack of interoperation among different data centers hinders the exchange, sharing, and utilization of polar data and further limits the extent and depth of integrated scientific research in the polar regions. Additionally, with the rapid growth in the volume of polar data, archiving, processing, analyzing, and visualizing the ever-increasing and multitype polar data also pose a large challenge (Huntington et al. 2017).
The second limitation is that current research methods and techniques used in polar research are not well suited to multisource, multiscale, high-dimensional, and heterogeneous datasets. To address this problem, a novel approach that combines multisource, multiscale, and multitemporal polar data with multispherical models and multifactor analysis methods is urgently needed. In addition, as a new mode of inquiry that relies mainly on correlation and less on causality to discover new knowledge, big data techniques have become a typical representative of the data-intensive scientific paradigm, bringing innovation to scientific research methodology (Fan et al. 2014; Guo et al. 2017), and have the potential to play a key role in scientific discovery and decision-making for polar research (Sellars 2018). Currently, however, the big data paradigm has not been widely adopted in polar scientific discovery and decision-making support; thus, developing and introducing big data methodologies into polar studies could enable new scientific discoveries and provide decision-making support.
The third limitation is that the three poles have not been practically treated as an interconnected entity, partially due to limited data sharing. Climate and environmental changes in the three poles involve complicated and interlinked relationships among multiple entities of the globe (Overland et al. 2016); the effects and feedback among the three poles, as well as between the poles and global climate system, are still unclear (Heimann and Reichstein 2008; Moore et al. 2013; Overland and Wang 2013; Slangen et al. 2017; Turner et al. 2017). Scientists have realized the importance of polar research with systematic and integrated views by considering the three poles as an interconnected entity. Therefore, the effective integration of polar data will contribute to breakthroughs in polar research, such as the teleconnection of the three poles and linkage between the polar climate and environment.
Overall, improving the acquisition, curation, integration, sharing, and application of polar data, together with encouraging the use of big data methods in the polar research community, could substantially advance polar research.
CASEarth Poles, a project within the framework of the CAS Big Earth Data Science Engineering (CASEarth) program supported by the Chinese Academy of Sciences, aims to build a comprehensive polar big data platform from an integrated and interdisciplinary perspective. This project addresses the abovementioned limitations and seeks breakthroughs in the curation, integration, and sharing of polar data; in efficient data analysis methods; and in fuller use of polar data. In this way, CASEarth Poles can play an important role in supporting polar scientific discovery and decision-making.
The geographical coverage of the North Pole (Fig. 1a), defined by the Arctic Monitoring and Assessment Programme (AMAP), extends from the high Arctic to the sub-Arctic areas of Canada, the Kingdom of Denmark (Greenland and the Faroe Islands), Finland, Iceland, Norway, the Russian Federation, Sweden, and the United States, including the associated marine areas (AMAP 1997). The South Pole refers to the area beyond the latitude of 66°34'S, including the entire Antarctic continent, the Antarctic ice sheet, and the Antarctic Peninsula (Fig. 1b). The Scientific Committee on Antarctic Research (SCAR) defines its area of interest as including Antarctica, its offshore islands, and the surrounding Southern Ocean including the Antarctic Circumpolar Current, the northern boundary of which is the Subantarctic Front (SCAR 2015). The Third Pole covers the area of 40° to 23°N and 106° to 61°E and is mainly defined by elevations higher than 4,000 m and an integration of topography and ecosystem, including the Qinghai–Tibetan Plateau, Hengduan Mountains, the Himalayas, the Hindu Kush, and the Pamir Plateau (Liu et al. 2014) (Fig. 1c).
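The numerical bounds quoted above can be written as a simple membership test. The following is a minimal sketch and not part of the CASEarth Poles software: the function names are hypothetical, coordinates are assumed to be decimal degrees with south and west negative, and the Third Pole test uses only the bounding box and the 4,000-m threshold, ignoring the topography and ecosystem criteria. The North Pole is omitted because the AMAP boundary is not a single latitude line.

```python
# Illustrative only: simplified bounds taken from the definitions in the text.
# Function names and the sign convention (south negative) are assumptions.

ANTARCTIC_LIMIT_LAT = -(66 + 34 / 60)  # 66°34'S in decimal degrees


def in_south_pole_region(lat_deg: float) -> bool:
    """True if the latitude lies poleward of 66°34'S."""
    return lat_deg < ANTARCTIC_LIMIT_LAT


def in_third_pole_region(lat_deg: float, lon_deg: float, elevation_m: float) -> bool:
    """True if the point is inside the 23°-40°N, 61°-106°E box and above 4,000 m.

    This ignores the topography/ecosystem criteria mentioned in the text.
    """
    inside_box = 23.0 <= lat_deg <= 40.0 and 61.0 <= lon_deg <= 106.0
    return inside_box and elevation_m > 4000.0
```

Such a test is useful only for coarse screening, for example to tag station records by region before a more careful boundary dataset is applied.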
The architecture of CASEarth Poles is illustrated in Fig. 2, and the core scientific objectives are as follows:
Polar big data platform. A big data platform of the three poles is to be established with functions including archiving, curation, integration, sharing, analysis, and visualization of data based on advanced information technologies, such as cloud storage, cloud computing, and big data analysis. The platform aims to contain considerable polar data, for example, remote sensing data, ground-based observations, model simulations, data assimilation products, and social–economic data. Moreover, the big data analysis toolbox and scientific models of the cryosphere, atmosphere, ecology, and hydrology will be integrated into the platform.
Data products for the polar regions. Two categories of new data products are produced: (i) remote sensing products at various resolutions and (ii) high-resolution reanalysis and ensemble prediction of the climate, water cycle, and ecosystem productivities. The data products encompass polar lake–river–sea ice, glaciers and ice sheets, snow, permafrost and periglacial processes, vegetation, microorganisms, phenology, polar lake water quality, Arctic river runoff, aerosols, cryosphere disasters, paleoclimate and paleoenvironment changes, carbon storage, socioeconomic data, and reanalysis products of land and atmosphere.
Big-data-enabled polar scientific research and information service. We will explore data-enabled scientific discovery and further supply polar information services, such as seasonal forecast systems of the Arctic sea ice, high-resolution coupled land–atmospheric data assimilation, and paleoclimate data assimilation. Based on the integration of the abovementioned data, analytics toolbox, and models, we aim to improve our understanding of (i) the teleconnection characteristics of polar–global interaction, (ii) the polar paleoenvironment and paleoclimate, (iii) the temporal and spatial patterns of the cryosphere and ecohydrology, and (iv) the scientific foundations for sustainable development in the three poles.
CASEarth Poles is divided into five tasks: 1) platform for data stewardship, big data analytics, and model management; 2) polar remote sensing products and synergic and comparative studies of the environmental changes in the three poles; 3) spatial–temporal changes in the polar water and ecosystem; 4) multispherical interactions of the polar climate system; and 5) cryosphere service and decision support for the infrastructure in polar regions. Overall, CASEarth Poles is expected to be a comprehensive platform for scientists to carry out polar data acquisition and analysis and scientific research regarding polar cryosphere, climate, hydrology, ecology, and sustainable development as well as to supply information services such as Arctic sea ice forecasts and decision-making for infrastructure in polar regions.
Implementation of CASEarth Poles
Platform for data stewardship, big data analytics, and model management at the three poles.
This task aims to establish a comprehensive platform for polar big data stewardship with functions of data acquisition, curation, integration, and sharing; big data analysis; and model management and online use (Fig. 3).
A data sharing platform has been established to archive and manage the existing polar data with the international metadata standard ISO19115, especially to realize interoperation with available polar data by international cooperation and sharing. This system provides bilingual (e.g., Chinese and English) data services on multiple terminals, such as desktops and mobile terminals. During the project operation period, a large number of new datasets will be produced via the reanalysis of ground observations, remote sensing, and model output by data assimilation. For example, the paleoclimate and paleoenvironment (e.g., temperature, precipitation, sea ice coverage, ice sheet mass balance) in polar regions are to be reconstructed by assimilating multiproxies, for example, ice core, tree ring, lake sediment, marine sediment, and speleothem data and historical documents, into advanced paleoclimate models.
A big data analytics toolbox is developed for processing, computing, analyzing, and visualizing multisource and multivariable big data in polar regions. In addition, a modeling environment is constructed to contain and manage simulation models and assessment models for the cryosphere, hydrosphere, biosphere, atmosphere, and anthroposphere in the polar regions. In particular, the advanced Earth system model FGOALS-f2.0 (Zhou et al. 2015) has been used in studies of data assimilation, climate change, climate variability, and interactions among climate systems over the three poles. The FGOALS-f2.0 model, developed by CAS, is a new-generation global climate system model with flexibility in horizontal resolutions up to 25 km and has been widely employed in investigating the Asian monsoon and Tibetan Plateau climatology (Wu et al. 2012, 2015; Duan et al. 2013; Hu and Duan 2015). FGOALS-f2.0 can satisfactorily capture extreme rainfall above 150 mm day−1 over the Tibetan Plateau and surrounding areas. Moreover, the extreme precipitation frequencies are evidently underestimated in both reanalysis rainfall products, JRA-55 and ERA-Interim (Fig. 4), while the frequency–intensity diagram in FGOALS-f2.0 is quite similar to observations.
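A frequency–intensity comparison of the kind described for Fig. 4 amounts to binning daily rainfall by intensity and comparing the relative frequencies between datasets. The sketch below illustrates that calculation only; the bin edges, function names, and any sample values are hypothetical and are not taken from the paper or from FGOALS-f2.0 output.

```python
from bisect import bisect_right


def intensity_frequencies(daily_rain_mm, bin_edges_mm):
    """Relative frequency of daily rainfall in each intensity bin.

    Bins are [edge_i, edge_{i+1}); the last bin is open-ended (>= last edge).
    Days below the first edge fall in no bin but still count toward the total.
    bin_edges_mm must be sorted in increasing order.
    """
    counts = [0] * len(bin_edges_mm)
    for value in daily_rain_mm:
        idx = bisect_right(bin_edges_mm, value) - 1
        if idx >= 0:
            counts[idx] += 1
    total = len(daily_rain_mm)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def exceedance_frequency(daily_rain_mm, threshold_mm=150.0):
    """Fraction of days with rainfall above the threshold (mm per day)."""
    if not daily_rain_mm:
        return 0.0
    return sum(1 for v in daily_rain_mm if v > threshold_mm) / len(daily_rain_mm)
```

Applying both functions to a model series, a reanalysis series, and an observed series at the same location would show whether the heavy-rain tail (for example above 150 mm per day) is underrepresented in one of them.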
Polar remote sensing products and synergic and comparative study for global environmental changes in the three poles.
The aims of this task are as follows: 1) to generate high-quality, high-resolution remote sensing products of key environmental and climate elements over the three polar regions; 2) to analyze the global change sensitivity factors and their spatiotemporal diversity and the interrelationships of their changes, and to reveal their synchrony/asynchrony, interconnections, and teleconnection mechanisms; and 3) to expound their overall impact on global-change response and feedback, the mass and energy balance, methane emissions, and carbon cycle through taking the three poles as a whole.
We plan to produce high-quality and high-resolution remote sensing products of glaciers, frozen soil, periglacial landforms, sea ice, river ice, lake ice, aerosol, and vegetation in the three poles (Fig. 5). To achieve this goal, innovative methods and algorithms have been developed by combining traditional remote sensing models and big data analytics, such as sparse autoencoders, perception models for association rule analysis, and synergistic prediction models based on a recurrent neural network (RNN). Based on these models and methods, the following datasets and products have been developed and produced: the freezing and thawing of the three-pole glacier/snow surface, the glacier movement speed, crevasses in the ice shelf, snow-cover thickness, permafrost type, sea ice extent, sea ice concentration, vegetation type, vegetation index, aerosol type, and aerosol optical thickness.
Then, based on the above remote sensing products and big data analysis methods, we can investigate the 1) spatiotemporal variation in the vegetation dynamics, sea ice, freeze–thaw state, and ice-shelf stability in the three poles; 2) mechanisms and correlation processes between the glacier and aerosol in polar regions; and 3) synchronization, asynchronization, and teleconnection between key elements such as the sea ice change in the Arctic and Antarctic (Fig. 6).
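One common first step in screening two series for synchronous or lagged co-variation is the lagged Pearson correlation. The following minimal sketch is offered under that assumption and is not the project's stated method; the function names are hypothetical, and both series are assumed to be equally long, evenly sampled, and free of gaps.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_correlations(x, y, max_lag):
    """Correlate x[t] with y[t + lag] for every lag in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[:len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(xs, ys)
    return result


def best_lag(x, y, max_lag):
    """Lag with the largest absolute correlation (positive: y follows x)."""
    corr = lagged_correlations(x, y, max_lag)
    return max(corr, key=lambda k: abs(corr[k]))
```

A peak near lag zero would suggest synchronous behavior, whereas a peak at a nonzero lag would suggest a delayed relationship. Correlation alone does not establish a teleconnection mechanism, so such a result would only be a starting point for further analysis.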
Finally, the CASEarth Poles platform will support the analysis of the effects and feedback of the correlation of the polar key elements on global change and the synergistic influence of the polar key elements on the energy balance, mass balance, and global carbon cycle.
Spatial–temporal changes in the polar water and ecosystem.
This task is to explore the past, current, and future situation of water resources and ecosystems in the polar regions by developing high-resolution ensemble predictions of hydrological and ecosystem changes.
We aim to build a high-resolution (∼10 km) dataset of surface water resources for the major rivers over the Third Pole region that includes the following: 1) the development of a multisphere hydrological model that is applicable to the Third Pole environment by incorporating enthalpy-based cryospheric components (glacier, snow, and frozen soil) into a distributed biosphere hydrological model (Wang et al. 2009a,b, 2010, 2017; Shrestha et al. 2010, 2015); 2) the construction of grid-based datasets of monthly surface water resources during 1998–2017 (∼10 km resolution); and 3) multimodel ensemble-based predictions of yearly surface water resources during 2046–65 (∼25 km resolution) for the major rivers over the Third Pole region by using the newly developed multisphere hydrological model and the latest Intergovernmental Panel on Climate Change (IPCC) climate model projections (Fig. 5).
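A multimodel ensemble prediction is typically summarized per grid cell by the ensemble mean and the spread among members. The sketch below shows that summary only, under the assumption that every model output has already been regridded to the same cells; the function name and data layout are hypothetical and do not describe the project's hydrological model.

```python
from statistics import fmean, stdev


def ensemble_summary(member_fields):
    """Per-cell ensemble mean and spread (sample standard deviation).

    member_fields: list with one entry per model, each a list of values
    on the same grid cells (flattened to 1D in the same cell order).
    """
    if len(member_fields) < 2:
        raise ValueError("need at least two ensemble members")
    n_cells = len(member_fields[0])
    if any(len(m) != n_cells for m in member_fields):
        raise ValueError("all members must cover the same grid cells")
    cells = list(zip(*member_fields))  # regroup values by grid cell
    means = [fmean(c) for c in cells]
    spreads = [stdev(c) for c in cells]
    return means, spreads
```

A large spread relative to the mean in a given cell would indicate that the climate model projections disagree there, which is useful context when interpreting predicted surface water resources.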
We will also construct grid-based products of hydrometeorological data regarding the main rivers in the Arctic (e.g., Lena, Yenisey, Ob, and Mackenzie; monthly and 10 km resolution) during 1998–2017 and an ensemble-based model prediction (yearly and 50 km resolution) during 2046–65. The research foci are 1) to analyze the major contributors (e.g., climate change, cryosphere melts, and human activity) to the Arctic river runoff changes; 2) to construct high-resolution and long-term datasets of polar ice, including glaciers, ice sheets, ice caps, and ice shelves; and 3) to compare the changes in various polar ice forms and quantify their contributions to the rising sea level.
Regarding the ecosystem changes in the polar regions, we aim to develop physically based ecological models. This goal comprises 1) producing and analyzing long-term ecological observation datasets in polar regions (e.g., long-term observations at the Naqu station above 4,500 m MSL over the Third Pole); 2) revealing the spatiotemporal patterns of polar ecological system changes through a combination of multisource datasets; 3) analyzing the driving forces of ecosystem changes and predicting their effects on polar biodiversity; and 4) comparing and analyzing the differences and similarities among ecosystem components of the three poles in response to global change.
The platform supports the investigation of the spatiotemporal patterns of polar microorganisms within the soil, glaciers, and lakes. To achieve this objective, we plan to 1) analyze the diversity of polar microorganisms; 2) test the classification of the polar isolated strains and evaluate their functions; 3) obtain massive data about polar extremophiles to reveal the community structure of microorganisms living in special biotopes; and 4) store the microbial resources and genetic resources of extremely specific microorganisms in polar regions and thus construct DNA libraries of microorganisms living in various extreme environments (Fig. 6).
In general, this task provides high-quality spatiotemporal datasets regarding polar water and ecosystems. This task can help reveal the evolution of polar water and ecological change and therefore build a more solid scientific basis for polar water resource utilization and polar ecosystem protection.
Multispherical interactions of the polar climate system.
This task aims to produce a high-quality land–atmosphere reanalysis dataset at high spatiotemporal resolution over the Third Pole (Fig. 5), to investigate multispherical interactions of the three poles and their effects on East Asian climate change, and to develop a new-generation sea ice forecast system (Fig. 6).
We plan to produce a near-surface atmosphere dataset with a high spatiotemporal resolution of 10 km and 2 h over the Tibetan Plateau and a land–atmosphere coupled reanalysis dataset of 25 km and 6 h over the Third Pole from 1981 to 2020 using data assimilation and a new-generation Earth climate system model developed by the Institute of Atmospheric Physics, CAS (Zhou et al. 2015). Furthermore, we are developing a sea ice seasonal forecast system for the Arctic by combining a multiple regression model and a CAS state-of-the-art Earth climate system model. The system aims to improve the ability to predict the Arctic sea ice density and thickness using new-generation sea ice data assimilation technology.
The scientific focus is a deeper understanding of the interaction and teleconnection of changes in the cryosphere, atmosphere, and hydrosphere of the three poles, particularly their links to the climate in East Asia, using multisource observational and reanalysis data aided by the Earth system model. Previous studies have revealed the following points: 1) the atmospheric heat change of the Tibetan Plateau is one of the important driving forces for the interannual and interdecadal changes in the Asian monsoon and the East Asian climate (Duan et al. 2013; Wu et al. 2015); 2) the acceleration of Arctic sea ice melting favors the occurrence of extreme climate events in East Asia and North America (Yao et al. 2017; Chen et al. 2018; Lei et al. 2020); and 3) the air–sea variations in the Antarctic affect climate anomalies in East Asia through the atmosphere–ocean bridge. Our research results are expected to improve our understanding of the interconnected land–air–sea ice coupled process among the three poles.
Cryosphere service and decision support for infrastructure in the polar regions.
This task aims to explore a big-data-enabled study on 1) evaluating the risks of environmental change in the cryosphere and assessing cryosphere services and 2) supporting decision-making for infrastructures in the pan-Arctic region (Fig. 6).
The first goal is an evaluation of the vulnerability of cryospheric change and associated risks in the three poles, particularly those of the socioeconomic system, which can be further divided into water resources, agricultural production, and ecological maintenance systems under the shared socioeconomic pathways (SSPs). The critical indicators for risk assessment will be identified using the pressure-state-response model to complete the evaluation from the three dimensions of exposure, sensitivity and adaptability. Then, cryosphere services will be valued using coupled cryospheric–hydrological, ecological, and economic models. We focus on two key aspects: 1) the systematic assessment of permafrost change and its impact on the ecosystem in the Arctic region to reveal the regulatory function of permafrost on carbon storage, carbon sink, and ecosystem productivity and 2) the systematic evaluation of the water resource service function on the Tibetan Plateau, which could enhance our understanding of the water tower function of the Third Pole and provide a better basis for the allocation of water resources in the upper, middle, and lower stream areas of the large river basins in the Third Pole region. These findings can be used in the assessment of UN Sustainable Development Goals (SDGs) in polar regions.
The second goal is to investigate the potential impact of the permafrost change on the infrastructure in polar regions, particularly highways and high-speed railroads, such as the Tibet highway, Sino-Russia railroad, and the pan-Arctic railway, to link Eurasia and North America. Through comprehensive analysis of the freeze–thaw disasters and service performance induced by climate change and human activities, we are able to provide a data service to evaluate the permafrost change impact on major projects (e.g., Beijing–Moscow high-speed railway; Qinghai–Tibet expressway and Arctic oil and gas resources channel; China–Pakistan highway, railway, and oil and gas pipeline) via data integration and mining to support a comprehensive evaluation of the permafrost environment and engineering conditions in polar regions.
Development of the data portal.
The data repository of the three poles is available at http://poles.tpdc.ac.cn/en/ (Fig. 7). This repository has integrated existing data repositories, including the Cold and Arid Regions Science Data Center at Lanzhou (Li et al. 2011; http://card.westgis.ac.cn/ ), a member of the World Data System, and the Third Pole Environment Database (http://en.tpedatabase.cn/). Additionally, the datasets and data services from the three poles data system and the National Tibetan Plateau Data Center (https://data.tpdc.ac.cn/en/) have been integrated into a uniform platform. These two data portals will publish all data in parallel through the cloud storage environment supported by the CASEarth cloud storage and computing facilities.
In the new platform, a big data analytic method library and a scientific model library are established to support scientific research in the regions of the three poles. The big data analytic methods are organized into seven categories: machine learning, data assimilation, parameter estimation, advanced geostatistical methods, time series analysis, postprocessing methods, and causality analysis. The scientific models include atmospheric, ecological, hydrological, and cryospheric models. These methods and models will be linked to the database, and the visualization function will be developed based on the PostGIS spatial database. The infrastructure adopts Hadoop cluster technology for distributed computing and the Spark application program interface (API) to package the inputs and outputs of these methods and models. Users can run these methods and models and carry out visualization analysis online, which saves the time otherwise needed to download the datasets.
At present, approximately 1,400 datasets of the three poles have been published in both English and Chinese. They cover geography, atmospheric science, cryospheric science, hydrology, ecology, geology, geophysics, natural resource science, social economy, and other fields and were collected from ground observation, remote sensing, and statistics in the three poles. These datasets were integrated from the Cold and Arid Regions Science Data Center at Lanzhou (CARD), the Third Pole Environment Database, and several thematic datasets [such as the Heihe Watershed Allied Telemetry Experimental Research (HiWATER); Li et al. 2013], while the new datasets were developed by CASEarth Poles.
All the datasets have been assigned a unique digital object identifier (DOI) and can be freely downloaded from our data portal (Fig. 7). A list of important data is provided in the supplemental material (https://doi.org/10.1175/BAMS-D-19-0280.2). Some examples of the newly developed datasets by CASEarth Poles include the high-resolution land cover map of the Antarctic (Hui et al. 2017), the daily freeze–thaw process in Antarctic and Arctic ice sheets (Liang et al. 2013), the ice core records from the Antarctic obtained by Chinese scientists (Xiao et al. 2004; Yang and Xiao 2018), and in situ observations of snow and frozen soil in the Pan Third Pole alpine watersheds (Che et al. 2019). Several examples integrated from existing data portals include the near-surface atmospheric forcing datasets for the Third Pole region (Yang et al. 2010, 2013; Duan et al. 2018; He et al. 2020), the long-term global snow depth datasets (Che et al. 2008; Dai et al. 2015), and the long-term surface soil freeze–thaw states dataset of China (Jin et al. 2009).
A data quality control process has been established to ensure data quality before publication in the data portal. Once the data and metadata are submitted to the system, their quality is reviewed by peers and by the technical staff of the data center. This process guarantees the integrity and validity of the data description. Additionally, we encourage data authors to provide an accuracy report and known limitations in the abstract of the metadata. Regarding the in situ observations, the quality control processes include intercomparison and calibration of sensors before the experiment, instrument maintenance during the observation period, and complete, standardized data postprocessing procedures developed for different types of observation data (Li et al. 2013; Liu et al. 2018).
Efforts are underway to develop an interoperation with data centers or data portals of domestic and international programs on the three poles, such as GEWEX/Global Atmospheric System Studies (GASS)/“Impact of initialized land temperature and snowpack on subseasonal to seasonal prediction” (LS4P; Xue et al. 2019), the Third Pole Environment (TPE) program, the GEO (Group on Earth Observations) data portal via the GEO Cold Region Initiative (GEOCRI), Integrated Global Cryosphere Information System, National Snow and Ice Data Center, and other international initiatives toward an integrated polar data center (Qiu et al. 2017).
Because different data centers use different data access and operation modes, a data access protocol conversion module is used to adapt between them. The core of the protocol conversion module is a data storage standard and a core API for data query and acquisition. To speed up data queries between systems, Redis is introduced on the protocol conversion module server as a buffer for core data information. For file interoperability between data centers, a fully connected mesh topology is used. First, the file storage information is obtained through the protocol conversion module between the two data centers; the two data centers then establish a direct network connection over HTTP or FTP and exchange files. The system will develop and publish data inventories from data centers related to polar research, which will be updated concurrently using data search engines.
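The two-step flow described above (look up file storage information through a cached conversion layer, then transfer the file directly over HTTP or FTP) can be sketched as follows. This is an illustration under stated assumptions rather than the platform's implementation: the class, the function names, the record fields, and the host name are hypothetical, and a plain in-memory dictionary stands in for Redis so that the example is self-contained.

```python
class ProtocolConverter:
    """Hypothetical adapter that resolves a dataset ID to storage information.

    The cache is consulted first; the remote data center is queried only
    on a cache miss. A dict stands in for the Redis buffer here.
    """

    def __init__(self, lookup_remote, cache=None):
        self._lookup_remote = lookup_remote  # callable: dataset_id -> dict
        self._cache = {} if cache is None else cache
        self.remote_calls = 0  # counts cache misses, for illustration

    def locate(self, dataset_id):
        info = self._cache.get(dataset_id)
        if info is None:
            self.remote_calls += 1
            info = self._lookup_remote(dataset_id)
            self._cache[dataset_id] = info
        return info


def transfer_url(info):
    """Build the direct transfer URL from the storage information."""
    scheme = info["scheme"]
    if scheme not in ("http", "ftp"):
        raise ValueError(f"unsupported transfer protocol: {scheme}")
    return f"{scheme}://{info['host']}/{info['path'].lstrip('/')}"


def fake_remote_lookup(dataset_id):
    """Stand-in for a query to another data center (hypothetical response)."""
    return {
        "scheme": "ftp",
        "host": "datacenter-b.example",
        "path": f"/archive/{dataset_id}.nc",
    }
```

In this sketch, a second call to `locate` for the same dataset ID is served from the cache, so the remote data center is contacted only once. Because every pair of data centers exchanges files directly, the conversion layer carries only small metadata lookups rather than the files themselves.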
The three poles, which play crucial roles in the regional and global environment and climate change, are of significant importance in Earth system science and natural–social interdisciplinary research in the Future Earth. However, gaps between our knowledge and polar changes remain obvious and need to be bridged. Presently, the lack of collaboration and integration of research data from the polar research community hinders our progress in further solving those problems, and this weakness is the main motivation for implementing the use of big data technologies for the three poles.
To address the challenges of polar data acquisition, curation, integration, sharing, and application in the big data era, CASEarth Poles has been formally launched. Here, we have introduced the background, scientific objectives, and overall implementation plan of CASEarth Poles.
CASEarth Poles is expected to make breakthroughs in 1) establishing an integrated data platform encompassing high-quality multidisciplinary datasets, data stewardship, big data analytics, and model management; 2) improving the capability of Arctic sea ice monitoring and forecasting to supply real-time information services for Arctic channel planning; and 3) improving the understanding of paleoclimate, teleconnection mechanisms, multispherical interactions, and cryospheric, ecological, and hydrological changes in the polar regions. All these efforts are expected to support better decision-making for sustainable development in the polar regions.
The data, models, and methods within the CASEarth Poles platform can be accessed and utilized under universal standards of exchange and sharing for researchers all over the world to best meet the diverse needs of integrated polar studies. We will collaborate with international efforts in the transformation to better data and information services for big-data-enabled scientific discoveries and to serve our common future in the three poles.
This project is supported by the “CAS Big Earth Data Science Engineering (CASEarth),” a Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDA19070000), and the 13th Five-year Informatization Plan of Chinese Academy of Sciences (Grant XXH13505-06).