High-Resolution History: Downscaling China’s Climate from the 20CRv2c Reanalysis

A Met Office Hadley Centre regional climate model, HadRM3P, is used to dynamically downscale the NOAA Twentieth Century Reanalysis, version 2c (20CRv2c), to generate a fine-resolution reconstruction of China’s climate from 1851 to 2010. The downscaled dataset has a small warm and seasonal wet bias (1.4°C; 0.9 mm day⁻¹) relative to recent observations but otherwise represents spatial and temporal trends realistically. Analysis focused on temperature and precipitation shows that downscaling 20CRv2c improves its representation of China’s climatological annual cycle, particularly over areas with sparse observational coverage such as the Tibetan Plateau. The downscaled dataset better represents the interannual variability and trends in observed temperature since 1901 and suggests that China has experienced a significant and sustained increase in temperature of 0.05°C (10 yr)⁻¹ since the 1850s. Chinese precipitation trends have not changed significantly in the recent past or over the past 160 years. This analysis serves as an initial yet imperative step toward improving in-depth understanding of the characteristics and multidecadal drivers of high-impact events over China such as heat waves, droughts, and extreme precipitation.


1. Introduction
China's climate is varied and extreme, with intense monsoons in the east, droughts in the north, and extremes of temperature in the west, all influenced by complex orography. Enhancing insight into historical climate variability aids our understanding of changes to weather extremes, such as heat waves, flooding, and droughts, with the potential to improve future resilience in human health and food security (e.g., C. Mathison et al. 2018). Recent research shows that China is experiencing widespread warming trends (Zhou et al. 2016; Zhai and Pan 2003) and changes to extreme precipitation (Burke and Stott 2017; Liu et al. 2005; Zhai et al. 2005), including more intense and frequent droughts in the north (e.g., Qian and Zhou 2014; Qian et al. 2014; Yatagai and Yasunari 1994; Hu 2003) and flooding in the south (e.g., Yu et al. 2004; Yu and Zhou 2007; Zhou et al. 2009; Li et al. 2010).
To understand variability and changes in the extremes of key climatic variables, continuous, homogeneous, and unbiased long-term observational records are essential. However, pre-1950s records of surface climate are sparse in many parts of the world including Asia. In China, surface records are especially sparsely distributed in western regions (Xie et al. 2007). Satellite datasets provide increasingly comprehensive coverage at fine resolution, but are unavailable prior to 1979 (Adler et al. 2003; Kummerow et al. 2000; Xie et al. 2007) and offer limited insight into important multidecadal drivers and trends in Chinese climate, such as the Pacific decadal oscillation (e.g., Qian and Zhou 2014; Zhang et al. 2018b) or Atlantic multidecadal oscillation (e.g., Huang et al. 2017, 2019; Qian et al. 2014). Recently, the Atmospheric Circulation Reconstructions over the Earth (ACRE) initiative (Allan et al. 2011) has attempted to fill these historical gaps by constructing reanalysis datasets from the assimilation of historical surface observations using spatially homogenous climate models (Neff et al. 2013), substantially extending the period of existing reanalyses.
Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/JAMC-D-19-0083.s1.
The Twentieth Century Reanalysis, version 2c (20CRv2c; Compo et al. 2011), provides a comprehensive global atmospheric circulation dataset, assimilating only surface pressure observations and using observed monthly sea surface temperature and sea ice distributions as boundary conditions. On a global scale, 20CRv2c is known to improve the representation of upper tropospheric temperatures and the surface-tropopause temperature gradient compared to past versions (Wegmann et al. 2017). The dataset has previously been used to investigate the Eurasian Silk Road Pattern (e.g., Wang et al. 2017), decadal-scale variability in the North Pacific (e.g., Williams et al. 2017), and interdecadal change in potential predictability of the East Asian summer monsoon (e.g., J. ). However, low biases in marine pressures are known to affect the mass-related fields, such as pressure and geopotential height, from 1851 to 1865 (Slivinski et al. 2016). Over China, Zhu et al. (2017) find that 20CRv2c reasonably reproduces the climatology and tendency of persistent temperature extremes, but reproduces precipitation extremes less well. This is often found to be symptomatic of the coarse spatial resolution of global reanalyses (Rhodes et al. 2015; Dee et al. 2011; Compo et al. 2011).
In this paper, we attempt to resolve some of the limitations associated with coarse-resolution reanalyses by dynamically downscaling 160 yr of 20CRv2c (1851-2010) and 30 yr of ERA-Interim (1981-2010; Dee et al. 2011) over China using a Met Office regional climate model (RCM). We expect to provide additional value for impact-oriented climate applications by integrating features of global climate with local physiographic details at a finer model resolution. Better representation of small-scale motions and processes should result in an improved qualitative and quantitative representation of local weather extremes, while maintaining the internal consistency with large-scale circulation of the driving reanalysis, imposed as lateral boundary conditions (Laprise 2008).
We focus our assessment of these downscaled datasets over the China region. We assess the strengths and limitations of downscaled 20CRv2c (hereinafter 20CR-DS) in representing the region's temperature and precipitation climatology with respect to multiple observations-based datasets, including a newly downscaled ERA-Interim dataset (ERAI-DS). We present trends in temperature and precipitation anomalies over the past 30-160 years, derived from RCM data that provide a finer resolution and longer time frame than has previously been available. This analysis serves as an initial, yet imperative, step toward improving understanding of the characteristics and drivers of high-impact weather events in the region, such as heat waves, droughts, and extreme precipitation.
The reanalysis and downscaled data, methods, and definitions are introduced in section 2; section 3 assesses the consistency of large-scale drivers between the downscaled and reanalysis data; section 4 compares the skill of 20CR-DS with alternative datasets for our climatological period (1981-2010); and section 5 presents trends in 20CR-DS spanning the past 160 years.
2. Description of models, observations, and experimental design

a. Driving data: 20CRv2c
The 20CRv2c provides quasi-observational data of standard atmospheric variables (winds, temperature, and humidity) at 6-hourly time steps. It has a horizontal spatial resolution of ~200 km [spherical harmonic truncation at wavenumber 62 (T62)] and 28 pressure levels spanning the period from 1850 to 2010. 20CRv2c comprises a 56-member ensemble that is generated using a deterministic ensemble Kalman filter, with each ensemble member considered to be equally likely. The 20CRv2c ensemble spread, relative to the difference between 20CRv2c and observations for 1981-2010 near-surface air temperature and precipitation over China, is small (Fig. 1). For this investigation, member 37 was selected as being suitably representative of the 20CRv2c ensemble.
The 20CRv2c has the longest time span among popular reanalysis products, which poses a significant advantage when investigating multiannual variability, trends, and extremes. Important features of observed extreme events are qualitatively, and sometimes also quantitatively, well represented in 20CRv2c (see Brönnimann et al. 2013; Jochner et al. 2013; Fischer et al. 2013; Stucki et al. 2013; Neff et al. 2013, among others) and therefore are of particular use in weather case studies (e.g., Giese et al. 2010; Webb 2011; Stucki et al. 2012).

b. Regional climate model: PRECIS
The Providing Regional Climates for Impacts Studies project (PRECIS; Jones et al. 2004; Massey et al. 2015) is an established regional climate modeling system developed by the Met Office Hadley Centre that allows its integrated RCM, HadRM3P, to be run over any area of the globe. PRECIS (HadRM3P) has been used within a range of studies from probabilistic event attribution over Europe (e.g., Massey et al. 2015) to investigating extreme temperature and precipitation changes in North America (e.g., Zhang et al. 2019) and modeling projections of future tropical cyclone activity in the Philippines (Gallo et al. 2019).

c. Experimental design
Downscaling is performed at a horizontal resolution of 25 km, with 19 vertical levels, using a rotated pole configuration (rotated North Pole is located at 51.82°N, 109.93°W to center a quasi-uniform latitude-longitude grid over China). After regridding to regular latitude-longitude coordinates, the domain spans 17°-55°N and 73°-136°E, as depicted in Fig. 2 (note that the Fig. 2 caption also defines the subregion abbreviations that are used later in this paper). The domain was chosen to minimize numerical boundary effects occurring within the region of interest, ensuring that the RCM represents important processes that occur over the ocean and Himalayan Plateau. The lateral boundary conditions (LBCs) are provided by 20CRv2c ensemble member 37 with 6-hourly updates in sea surface temperatures (SSTs) also derived from 20CRv2c. Radiative effects of time-varying greenhouse gases and sulfate aerosols are also included in the simulation [for further details, see Jones et al. (2004) and Massey et al. (2015)]. Including one year for model spinup, the 20CR-DS simulation was run for the period from December 1849 to December 2010. The 20CR-DS adheres to the maximum recommended ratio of horizontal resolution for dynamical downscaling such that the ratio of inner and outer nests does not exceed 12 (Denis et al. 2003).
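As a quick check of the resolution-jump criterion, the following minimal sketch computes the ratio between the approximate grid spacing of the driving data (~200 km, section 2a) and that of the RCM (25 km). The function name and the treatment of 12 as an inclusive upper limit are illustrative assumptions, not part of the PRECIS system.

```python
def nesting_ratio(driving_km: float, nested_km: float) -> float:
    """Ratio of driving-data grid spacing to nested-model grid spacing."""
    if driving_km <= 0 or nested_km <= 0:
        raise ValueError("grid spacings must be positive")
    return driving_km / nested_km


# Upper limit quoted in the text (Denis et al. 2003); treated as inclusive here,
# which is an assumption of this sketch.
MAX_RATIO = 12.0

# Approximate values from the text: ~200 km (T62) driving data, 25 km RCM.
ratio = nesting_ratio(200.0, 25.0)
within_limit = ratio <= MAX_RATIO
print(f"resolution jump = {ratio:.0f}, within limit: {within_limit}")
```

With the values quoted in the text, the jump is a factor of about 8, which is below the stated limit of 12.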

d. Downscaling ERA-Interim for comparison
To complement 20CR-DS, the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim (Dee et al. 2011) reanalysis was also downscaled using PRECIS. For this simulation, the domain and configuration were identical to those described in section 2c except that LBCs and SSTs were derived from ERA-Interim, and the length of the simulation is shorter because ERA-Interim is only available from 1979. Allowing for spinup, the common period for analysis of ERAI-DS is 1981-2010. Downscaling this additional reanalysis dataset allows us to address sources of uncertainty in 20CR-DS: by downscaling two reanalyses using the same RCM and setup, it is possible to distinguish whether biases are inherited from the driving data or from the RCM. Furthermore, agreement between two distinct reanalyses indicates that each is more constrained by assimilated observations than by internal model variability and thus can be reasonably assumed to reflect reality (Sterl 2004). This is especially important because 20CRv2c extends far beyond the commonly observed period, and therefore robustly gauging how realistic 20CR-DS is in the well-observed period is crucial before making inferences about its skill in the earlier period.

e. Observations
The distribution of station observations in China is spatially variable, with few instrumental datasets covering China before 1950 (Qian and Zhou 2014) and many stations limited to measuring only surface variables. Observations are particularly sparse over the Tibetan Plateau but relatively dense in eastern China, with station spacing ranging from kilometers to tens of kilometers (e.g., Zhou et al. 2016;Yin et al. 2015). Before 1950, the National Meteorological Information Center of the China Meteorological Administration (CMA) collected and maintained meteorological data for the Tibetan Plateau using seven observation stations above 2000 m, increasing to 97 by 1996 (Liu and Chen 2000). The precipitous orography of the Himalayan Mountain chain makes accurate observations simultaneously challenging but incredibly important, as the climatic processes in this region are vastly different to other locations. Consequently, the uncertainty around western China's historic climate is particularly high. Such factors limit the breadth of observation-based validation that can be reasonably conducted on our downscaled simulations, and therefore we include a comparison with multiple global reanalysis datasets to better assess issues of uncertainty.
The most comprehensive dataset available is the CMA "CN05.1" dataset (Wu and Gao 2013; Ying et al. 2009) that provides daily maximum, minimum, and mean surface air temperature (T_x, T_n, and T_m, respectively) and daily total precipitation (Pr) for 1961-2014. CN05.1 provides the most comprehensive gridded station dataset, but variables are not uniformly homogenized (Yin et al. 2015), and therefore we also consider a number of additional datasets that use different techniques and data sources. A summary of all datasets used in this paper is detailed in Table 1.

3. Large-scale drivers: Consistency with lateral boundary conditions
Both 20CR-DS and ERAI-DS assume that the PRECIS RCM can accurately reproduce the large-scale circulation, driven by boundary conditions from the driving reanalyses. The task of the RCM is to generate small-scale weather systems, such as isolated mesoscale systems that are not explicitly resolved by the driving reanalyses, or other small features embedded into synoptic-scale systems, without modifying the driving large-scale circulation. This assumption defines the "one way" nesting approach (e.g., Jones et al. 1995; Laprise et al. 2000), which explicitly excludes the possibility that the RCM might modify larger-scale systems, thereby creating an inconsistency with the downstream large-scale circulation of the driving reanalysis. In large domains, both the mean flow and the day-to-day variability in the RCM can diverge from that of the GCM on the synoptic scale, rendering the RCM solution physically inconsistent with the GCM solution external to the RCM domain (Jones et al. 1995). Other technical problems, such as incorrect interpolation procedures, can also lead to inconsistencies of this type, so this section forms an essential first step in model quality control.
We look at 850-hPa wind components as the main indicator of large-scale circulation, since they are some of the most common circulation variables used to evaluate the strength of the Asian monsoons (Wang et al. 1999, 2008). The domain used for the RCM simulations includes regions of very high and complex orography, including the Himalaya and the Tibetan Plateau, which are much better described in the regional model at 25-km resolution than in the reanalysis datasets used to drive it. To assess the RCM-reanalysis consistency, we compare seasonal means of corresponding variables from the driving reanalysis and downscaled simulations for the climatological baseline period (1981-2010) over the entire downscaled domain, except where the topography exceeds 850 hPa (shaded regions in Fig. 3).
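The exclusion of grid cells where the terrain rises above the 850-hPa level can be illustrated with a minimal sketch. It assumes that a surface pressure field (in hPa) is available on the same grid as the wind field and that a cell is excluded when its surface pressure is below 850 hPa; the function names and the toy values are hypothetical and are not taken from the analysis in this paper.

```python
from typing import List, Optional

Grid = List[List[float]]
MaskedGrid = List[List[Optional[float]]]


def mask_below_ground(field: Grid, surface_pressure_hpa: Grid,
                      level_hpa: float = 850.0) -> MaskedGrid:
    """Replace values with None where the pressure level lies below the surface.

    Assumption of this sketch: a cell is below ground at `level_hpa` when its
    surface pressure is lower than `level_hpa`.
    """
    masked = []
    for f_row, p_row in zip(field, surface_pressure_hpa):
        masked.append([f if p >= level_hpa else None
                       for f, p in zip(f_row, p_row)])
    return masked


def masked_mean(field: MaskedGrid) -> float:
    """Unweighted mean over the cells that were not masked out."""
    values = [v for row in field for v in row if v is not None]
    if not values:
        raise ValueError("all cells are masked")
    return sum(values) / len(values)


# Toy 2 x 2 example: zonal wind (m/s) and surface pressure (hPa).
u850 = [[2.0, 4.0], [6.0, 8.0]]
ps = [[1000.0, 600.0], [900.0, 840.0]]  # 600 and 840 hPa mimic high terrain

masked_u = mask_below_ground(u850, ps)
mean_u = masked_mean(masked_u)
```

A real comparison would also weight cells by area; the unweighted mean is used here only to keep the sketch short.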
Where applicable, differences between observed and downscaled datasets are quantified using the root-mean-square error (RMSE), representing the averaged magnitude of the deviation between model and observations, and the "distance between indices of simulation and observation" (DISO) metric of Hu et al. (2019) that combines correlation coefficient (measuring the strength and direction of the linear association between the model and observations), absolute error (measuring any persistent bias), and RMSE to summarize model-observation differences. For both RMSE and DISO, index values that approach zero indicate an improved model performance. We note that our calculation of correlation coefficient when applied to spatial data does not account for spatial autocorrelation.
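The two metrics can be sketched as follows. The DISO form used here, the Euclidean distance from the ideal point (r = 1, bias = 0, RMSE = 0) with bias and RMSE normalized by the observed mean, is an assumption about the formulation of Hu et al. (2019) and should be checked against that reference; all function names are illustrative.

```python
import math
from typing import Sequence


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error between model and observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def mean_bias(model: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of (model - obs), i.e. the persistent bias."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def diso(model: Sequence[float], obs: Sequence[float]) -> float:
    """Distance from the ideal point (r=1, bias=0, RMSE=0).

    Assumed form: bias and RMSE are normalized by the observed mean.
    """
    obs_mean = sum(obs) / len(obs)
    if obs_mean == 0:
        raise ValueError("observed mean is zero; normalization is undefined")
    r = pearson_r(model, obs)
    nae = mean_bias(model, obs) / obs_mean
    nrmse = rmse(model, obs) / obs_mean
    return math.sqrt((r - 1.0) ** 2 + nae ** 2 + nrmse ** 2)


# Toy example: a model that is uniformly 1 unit too warm.
obs_values = [10.0, 12.0, 14.0, 16.0]
model_values = [11.0, 13.0, 15.0, 17.0]
example_rmse = rmse(model_values, obs_values)
example_diso = diso(model_values, obs_values)
```

In this toy case the correlation is perfect, so the DISO value is determined entirely by the constant offset; both metrics fall to zero for a model that matches the observations exactly.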
Spatial correlation (Pearson r coefficient) between RCM monthly 850-hPa zonal wind (aggregated to the coarser global reanalysis resolution) and the corresponding wind component from the driving reanalysis suggests that the median correlation is high for all seasons (not shown), with medians above 0.9, for both 20CRv2c and ERA-Interim. The seasonal distribution indicates that the summer months have the lowest correlations with their driving models, an indication that the zonal winds in the RCM are diverging from reanalysis in June-August (JJA) more than in the other seasons. Agreement with driving conditions is higher for ERAI-DS than 20CR-DS, which can be explained by the finer horizontal resolution of ERA-Interim reanalysis (~0.8° vs 2° for 20CRv2c). This effect has previously been documented in specific studies on the effects of the lateral boundary condition resolution (e.g., Denis et al. 2003). Similar analysis for 850-hPa meridional winds shows a slightly lower spatial correlation but with similar patterns across seasons.
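The aggregation-then-correlation step can be sketched as below. Block averaging by an integer factor is used as a stand-in for the regridding to the reanalysis resolution, which is an assumption of this sketch (the actual regridding method is not specified here), and the toy grids are hypothetical.

```python
import math
from typing import List, Sequence

Grid = List[List[float]]


def block_mean(fine: Grid, factor: int) -> Grid:
    """Aggregate a fine grid to a coarser one by averaging factor x factor blocks.

    No area weighting is applied, which is a simplification on a
    latitude-longitude grid.
    """
    n_rows, n_cols = len(fine), len(fine[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            cells = [fine[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant field")
    return sxy / math.sqrt(sxx * syy)


def spatial_correlation(fine: Grid, coarse: Grid, factor: int) -> float:
    """Correlate a block-averaged fine field with a coarse field, cell by cell."""
    aggregated = block_mean(fine, factor)
    flat_a = [v for row in aggregated for v in row]
    flat_c = [v for row in coarse for v in row]
    return pearson_r(flat_a, flat_c)


# Toy example: a 4 x 4 "RCM" field and a 2 x 2 "reanalysis" field.
rcm_u = [[1, 1, 3, 3],
         [1, 1, 3, 3],
         [5, 5, 7, 7],
         [5, 5, 7, 7]]
reanalysis_u = [[1.0, 3.0], [5.0, 8.0]]

rcm_u_coarse = block_mean(rcm_u, 2)
r_spatial = spatial_correlation(rcm_u, reanalysis_u, 2)
```

As noted in the text, a correlation computed this way does not account for spatial autocorrelation, so it is best read as a descriptive index rather than a formal test.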
Seasonal climatological-mean 850-hPa zonal wind maps (Fig. 3) show that both downscaled reanalyses exhibit a stronger increment of eastward wind within the summer monsoonal flow over the South China Sea in JJA than do their global coarse-resolution counterparts, resulting in inflated RMSE scores: 20CR-DS RMSE = 1.72 m s⁻¹ and ERAI-DS RMSE = 1.60 m s⁻¹. This difference in eastward flow is likely to affect the monsoon circulation by reducing and shifting the low-level eastward circulation in this region (Sampe and Xie 2010) rather than transporting southwesterly monsoon rains from India. The 20CR-DS has a realistic meridional component but lacks the zonal transport, resulting in a wet bias over continental China and a strong dry bias along the southeastern coastline (subregion SE).
This basic assessment suggests that both RCM simulations typically follow the large-scale circulation from the lateral boundary conditions for both reanalyses, with some weakening of the summer monsoon signal expected in the downscaled output within our areas of interest. The similarity of differences between both reanalyses reflects the common setup of the regional model, domain, and orography. Both 20CR-DS and ERAI-DS tend to show similar patterns of differences when compared with their global counterparts, with minor variations attributable to the different resolutions of the driving LBCs.

4. Climatology analysis
The 20CR-DS representation of climatology is analyzed for the period 1981-2010 using a country mask of China matching the gridded CN05.1 observational dataset from CMA. We consider trends over the whole of China and over seven subregions (see Fig. 2), as defined by Burke and Stott (2017), with a westward extension of one subregion to span the Tibetan Plateau. These analysis regions enable independent assessment of 20CR-DS's skill at representing humid coastal areas, high-altitude arid areas, and drought- and flood-prone regions. Thirty-year multiannual monthly means of mean, minimum, and maximum near-surface temperature and of mean precipitation flux (T_m, T_n, T_x, and Pr, respectively) are calculated for the well-observed period (1981-2010). In the comparisons below, 2 standard deviations is comparable to the 5%-95% range.
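A minimal sketch of the multiannual monthly mean and of the 2-standard-deviation envelope used for comparison is given below. The record layout (year, month, value), the function names, and the toy numbers are illustrative assumptions; the sketch works on an area-mean series rather than on gridded data.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, Tuple

Record = Tuple[int, int, float]  # (year, month, area-mean value)


def monthly_climatology(records: Iterable[Record], start_year: int,
                        end_year: int) -> Dict[int, Tuple[float, float]]:
    """Return {month: (multiannual mean, interannual standard deviation)}.

    Each month needs at least two years of data for the standard deviation.
    """
    by_month = defaultdict(list)
    for year, month, value in records:
        if start_year <= year <= end_year:
            by_month[month].append(value)
    return {m: (mean(v), stdev(v)) for m, v in sorted(by_month.items())}


def within_two_sigma(model_mean: float, obs_mean: float,
                     obs_std: float) -> bool:
    """True if the model monthly mean lies within 2 observed standard deviations."""
    return abs(model_mean - obs_mean) <= 2.0 * obs_std


# Toy observed series: three Januaries and three Julys (values in deg C).
obs_records = [
    (1981, 1, 1.0), (1982, 1, 2.0), (1983, 1, 3.0),
    (1981, 7, 20.0), (1982, 7, 21.0), (1983, 7, 22.0),
    (1975, 1, -9.0),  # outside the baseline window, ignored
]
clim = monthly_climatology(obs_records, 1981, 2010)
jan_mean, jan_std = clim[1]
```

For the full analysis the window would hold 30 values per calendar month; three years are used here only to keep the example readable.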

a. Temperature
The 20CR-DS successfully reproduces the temporal distribution and magnitude of China's observed annual cycle for T_m (Fig. 4), T_x, and T_n (not shown). All datasets reflect a warm bias with respect to CN05.1 across all China [ΔT_m(20CR-DS)|CN05.1 = +1.43°C]. At a regional scale, the bias range is broader, -0.89° ≤ ΔT_m(20CR-DS)|CN05.1 ≤ 2.66°C, with improved representation of temperatures for northern regions (NC, NEC, and NE) and TP (see Fig. 4) with respect to 20CRv2c. The 20CR-DS temperatures mostly remain within 1-2 standard deviations of the CN05.1 range (global 20CRv2c anomalies often exceed 2 standard deviations).
The 20CR-DS has a seasonally consistent spatial bias in temperature with respect to CN05.1: a cold bias over the Tibetan Plateau and warm bias elsewhere in China (see Figs. 5i-l). This bias pattern is also apparent in Massey et al. (2015)  suggesting that 20CR-DS is equally skilled in representing (monthly) minimum and maximum temperatures as it is for mean temperatures. Downscaling generally improves the spatial distribution of temperature, for example by reducing 20CRv2c biases such as the spuriously warm belt spanning the Tibetan Plateau to the northeastern corner in summer (Figs. 5g,k,o) and improving the magnitude of autumn (SON) temperatures (Figs. 5h,l,p); however, a winter (DJF) warm bias is introduced in the northeast in 20CR-DS (Fig. 5m).
There is a marked improvement in the Tibetan Plateau's annual temperature cycle representation after downscaling; 20CRv2c contains strongly amplified warm summer and cold winter biases, whereas 20CR-DS exhibits a small, seasonally consistent cold bias of -0.89°C (often within 1 standard deviation of CN05.1). Note that uncertainty in the Tibetan Plateau region remains high because of complex orography and low station density (Zhou et al. 2016). In southern regions (SC, SEC, and SE), downscaling increases the temperature bias in winter (DJF) (Figs. 5e,i) and overestimates the hottest season (JJA), which has a small cold bias in the driving 20CRv2c (Figs. 5g,k). We suggest that the warm peaks in southern regions are the result of regional model dynamics: HadRM3P enhances the magnitude of these biases for both downscaled simulations (20CR-DS and ERAI-DS) with respect to their driving reanalyses, because the magnitude of the coarse-resolution driving data typically better matches the observational datasets (Fig. 4, dashed lines). Conversely, the JJA warm bias in the northern China plain (NEC and NC) is likely partially derived from the global reanalysis (20CRv2c), which is seen to overestimate temperatures (Fig. 4), a common feature of reanalyses over this region (Zhou et al. 2016).
RMSE varies across seasons and subregions for 20CR-DS: for DJF, 1.78° ≤ RMSE(T_m)|CN05.1 ≤ 4.38°C, and for JJA, 1.05° ≤ RMSE(T_m)|CN05.1 ≤ 2.37°C (see Table S2 in the online supplemental material). However, accounting for spatial correlation and persistent bias, DISO shows that JJA and SON match CN05.1 most closely (Fig. 5), with DJF being the worst-performing season. DISO in TP is often an order of magnitude larger than in other subdomains, across most seasons.
We note that temperatures deviate from the driving data across northwestern China (35°-50°N, 70°-100°E) for all seasons (Fig. 5), although the speckled nature of the fluctuations suggests that no systematic bias is introduced to this region through downscaling.

b. Precipitation
Subregional differences are more pronounced for precipitation than temperature. Precipitation over China is highly variable: southern and eastern regions are strongly affected by the East Asian summer monsoon and tropical cyclones along the coast, while western and northern regions are much more arid with moisture recycling across the land as a more dominant feature (Tian et al. 2007; van der Ent and Savenije 2011). Processes important to the accurate representation of precipitation typically occur at convective scales, below that which can be resolved at 25 km, and are therefore represented using model parameterizations in 20CR-DS and ERAI-DS. Although these can lead to deficiencies in extremes at hourly temporal resolution (Dai 2006; Hohenegger et al. 2008), parameterization has been shown to accurately describe daily and monthly extremes of precipitation over midlatitudes (Rajczak et al. 2013; Rauscher et al. 2010). Previous research suggests that ERA-Interim has high agreement with GPS-derived precipitable water over China (Zhang et al. 2018a) while 20CRv2c has a slightly wet bias (Dolinar et al. 2016). These features are generally replicated in our findings, although ERA-Interim notably exceeds the 95th percentile of CN05.1 spring and summer precipitation (Fig. 6; the peach dashed line for All China).
The 20CR-DS has a suppressed wet bias over China (0.9 mm day⁻¹), and the magnitude of the summer peak is improved relative to its driving reanalysis, although 20CR-DS still persistently exceeds the 95th percentile of CN05.1 observations. Agreement between the two downscaled datasets (20CR-DS and ERAI-DS) is high for all regions.
In general, 20CR-DS represents each subregion's rainfall cycle adequately; each subregion experiences a summer peak in annual precipitation distribution, although their magnitude and timing vary in accuracy (Fig. 6). Northern and coastal regions (NE, NEC, SEC, and SE) typically remain within 2 standard deviations of observations, with biases becoming more significant farther west. Principal subregional differences in 20CR-DS precipitation include the following:
1) Precipitation over the Tibetan Plateau exceeds the 5%-95% range of CN05.1 observations for all seasons (Fig. 6), although so too does ERA-Interim in this region of very low observational certainty. Downscaling improves the magnitude of precipitation with respect to 20CRv2c over this region (Figs. 7m-p). However, because it is well known that RCMs, including PRECIS, exaggerate the orographic enhancement over mountainous areas (e.g., Jones et al. 1995; Durman et al. 2001; Li et al. 2015), this finding emphasizes the extent of 20CRv2c's wet bias rather than confirming the accuracy of 20CR-DS's magnitude.
2) The dry bias along the southeastern coastline (subregion SE) is also seen in Massey et al. (2015). It therefore is likely due to the RCM's weaker representation of zonal winds such that 20CR-DS exhibits a damped eastward component of monsoon moisture transport, as discussed in section 3.
3) Enhanced inland spring precipitation (Fig. 6; SE and SEC) exists in both downscaled datasets (20CR-DS and ERAI-DS), exhibiting a false peak in the East Asian summer monsoon precipitation in MAM. Because the driving model data (20CRv2c and ERA-Interim) do not exhibit this false peak, this result is an artifact of downscaling.
4) A persistent inland wet bias exists in 20CR-DS, which is largely inherited from the driving data (Fig. 7). In addition, a spring enhancement exists over the North China Plain in both 20CR-DS (Fig. 7n) and ERAI-DS (Fig. S2); this could again be linked to the RCM's moisture transport mechanism, as discussed in section 3.
We note that both 20CRv2c and 20CR-DS exhibit substantial anomalous precipitation totals to the east of Bhutan across all seasons (up to 27 mm day⁻¹ in summer months for 20CR-DS) (Fig. 7k). The anomalous rainfall values in this region could be due to 1) this location's complex elevation associated with the northern extent of the Brahmaputra River basin (on the southern face of the Himalayan Mountain range), and 2) the lack of stations providing observations for CN05.1 within this area of complex orography (see Zhou et al. 2016).

5. Trends in temperature and precipitation
OCTOBER 2019 AMATO ET AL.

FIG. 7. As in Fig. 5, but for precipitation, comparing the extent of precipitation bias with respect to CN05.1, before and after downscaling.

We calculate a selection of robust linear trends using the Theil-Sen estimator (Theil 1950a,b,c; Sen 1968) for periods including 1981-2010 (the common period with observed datasets) and 1851 to 2010 (the full length of the 20CR-DS dataset). Trends are judged to be significant where the 5% confidence bounds of the Theil slope estimate do not intersect zero. Agreement between datasets and 20CR-DS's representation of interannual variability are judged using the Pearson product-moment correlation coefficient r. Figure 8 compares the skill of each dataset to represent observed interannual variability for the well-observed period (1981-2010), and Fig. 9 shows the time series for the full period (1851-2010) with trends and their significance summarized in online supplemental Tables S5 and S6.
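The trend test can be illustrated with a minimal pure-Python sketch of the Theil-Sen slope and its rank-based confidence bounds (after Sen 1968). The two-sided 95% interval used as the default, the omission of a correction for tied values, and the synthetic series are assumptions of this sketch rather than details confirmed by the analysis above.

```python
import math
from itertools import combinations
from statistics import NormalDist, median
from typing import Sequence, Tuple


def theil_sen(x: Sequence[float], y: Sequence[float],
              confidence: float = 0.95) -> Tuple[float, float, float]:
    """Return (median slope, lower bound, upper bound) of pairwise slopes."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    slopes = sorted((y[j] - y[i]) / (x[j] - x[i])
                    for i, j in combinations(range(n), 2)
                    if x[j] != x[i])
    n_slopes = len(slopes)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Standard deviation of Kendall's S statistic, without a tie correction.
    sigma = math.sqrt(n * (n - 1) * (2 * n + 5) / 18.0)
    lo_idx = max(int(round((n_slopes - z * sigma) / 2.0)) - 1, 0)
    hi_idx = min(int(round((n_slopes + z * sigma) / 2.0)), n_slopes - 1)
    return median(slopes), slopes[lo_idx], slopes[hi_idx]


def is_significant(lower: float, upper: float) -> bool:
    """A trend is judged significant when the bounds do not include zero."""
    return lower > 0.0 or upper < 0.0


# Synthetic warming series: 0.032 deg C per year over 1981-2010.
years = list(range(1981, 2011))
warming = [0.032 * (yr - 1981) for yr in years]
slope, lo, hi = theil_sen(years, warming)
slope_per_decade = 10.0 * slope

# Synthetic trendless series: values alternate between +1 and -1.
steps = list(range(20))
flat = [1.0 if i % 2 == 0 else -1.0 for i in steps]
flat_slope, flat_lo, flat_hi = theil_sen(steps, flat)
```

Because the estimator takes the median of all pairwise slopes, a few anomalous years have little influence on the trend, which is the robustness property that motivates its use for climate series.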

a. Trends from 1981
Observed annual temperature anomalies are known to have increased over China in recent decades (Burke and Stott 2017; Qian and Zhou 2014). The 1981-2010 average temperature T_m increased at a rate of 0.45°C (10 yr)⁻¹ according to CN05.1 observations, noted as a steeper gradient than other observed datasets (Gao et al. 2011) (Table S5). Downscaled simulations (20CR-DS and ERAI-DS) present a slightly lower rate of warming, with gradients of 0.32°C (10 yr)⁻¹ and 0.30°C (10 yr)⁻¹, respectively, likely associated with their persistent warm bias. This rate of warming is significant at the 1% level for all analyzed datasets. Minimum temperatures T_n increased even more rapidly than T_m (see Table S6).
The 20CR-DS accurately represents the interannual variability of China's mean temperature [r(T_m)|CN05.1 = 0.87] and that of each subregion [0.65 ≤ r(T_m)|CN05.1 ≤ 0.86], representing an improvement on 20CRv2c in all regions except NE (Fig. 8). The narrow range within observed datasets and high correlations of 20CR-DS with CN05.1 provide confidence that 20CR-DS can skillfully replicate the year-to-year variability of recent temperature change in China, as well as the general increasing linear trend.
The 20CR-DS corroborates observations in showing that no significant trend exists in annual-mean precipitation for the period 1981-2010 over China (Table S6). The range of correlations between observation-based datasets is much wider for precipitation [0.65 ≤ r(Pr)|OBS_RANGE ≤ 0.99] because agreement between datasets is lower, especially at a regional scale. Figure 8 shows that 20CR-DS [r(Pr)|CN05.1 = 0.6] better matches precipitation observations over China than 20CRv2c [r(Pr)|CN05.1 = 0.44]. Regionally, 20CR-DS correlations are highly variable [0.17 ≤ r(Pr)|CN05.1 ≤ 0.6], although it captures some significant regional events, including the "Great Flood" of 1998 (see Zong and Chen 2000), which affected SC and SEC regions (not shown). Downscaling results in 20CR-DS having improved or equal representation of variability in central regions (NC, NEC, SC, and SEC) but lower correlations in NE and SE relative to its driving data.
Regional  (Table S5). The 20CR-DS is found to be skillful in representing interannual temperature variability over China [r(T_m)|CRU TS = 0.79] for the entire twentieth century, with fair regional correlations [0.55 ≤ r(T_m)|CRU TS ≤ 0.78]. Downscaling therefore represents a quantitative improvement on 20CRv2c's ability to represent observed annual extreme temperature anomalies, as for 20CRv2c: r(T_m)|CRU TS = 0.71. Figure 9 shows that 20CR-DS and 20CRv2c diverge somewhat in the 1910s-60s, with 20CR-DS better matching the magnitude of observations, although with a slight warm tendency. The multidecadal variability and magnitude of twentieth century temperature anomalies in 20CR-DS are therefore well described and represent an improvement in accuracy over 20CRv2c. As a result, we suggest that 20CR-DS should produce realistic temperature anomalies over China for the full time series from 1851 onward.
From 1851 to 2010, both 20CR-DS and 20CRv2c show a significant warming trend (1% significance level) for China [ΔT_m(20CR-DS) = 0.05°C (10 yr)⁻¹] as well as the seven subregions. The rate of warming is lower for 20CR-DS than for 20CRv2c for all regions except TP.
Historic precipitation observations have low agreement on interannual variability; CRU TS and GPCC correlate at r = 0.56, and their deviations are clearly visible in Fig. 9. GPCC has a negative trend (1% level) while CRU TS, 20CRv2c, and 20CR-DS find no significant trend in precipitation over China from 1901 to 2010. This variation in observed precipitation therefore limits the certainty associated with validating 20CR-DS for the twentieth century, with implications for the certainty associated with the full time series. Nonetheless, Fig. 9 shows that 20CR-DS replicates the magnitude of CRU TS precipitation anomalies, rather than following the strong peaks and troughs of 20CRv2c (1920 and the 1930s). This dampening of potentially spurious extremes from the driving reanalysis continues back through the unobserved period (1851-1900). Although there are no earlier observations to validate against, we have reasonable confidence that 20CR-DS should continue to present sufficiently realistic anomalies in the earliest years of the simulation (1851-1900). Last, 20CR-DS finds no significant trend in China's precipitation change from 1851 to 2010, although it agrees with 20CRv2c that precipitation significantly decreased (1% significance level) in TP [ΔPr(20CR-DS) = -0.002 mm day⁻¹ (10 yr)⁻¹] and significantly increased (1% level) in SEC [ΔPr(20CR-DS) = 0.03 mm day⁻¹ (10 yr)⁻¹].

Summary and conclusions
The 20CR-DS represents the first reanalysis dataset to be downscaled over China for the entire twentieth century and the latter half of the nineteenth century. In this paper, we have considered the validity of the downscaling process by comparing two downscaled reanalysis datasets (20CR-DS and ERAI-DS) with their driving reanalyses and observations.
Large-scale circulation is preserved through downscaling. We used 850-hPa winds as indicators of Asian monsoon strength: our two downscaled datasets are found to be consistent with one another and their driving data. However, we note that eastward winds are strengthened below the Himalaya because of resolution differences over complex orography, and monsoonal flow is shifted to the interior of southern China. The largest departures from the driving pattern of large-scale circulation occur in summer (JJA).
With respect to dataset biases we find the following: 1) The 20CRv2c has a warm and wet bias, relative to multiple observed datasets, of approximately 1 mm day⁻¹ and 1°C relative to CN05.1 when averaged over all of China. 2) The 20CR-DS successfully reproduces the temporal distribution and magnitude of China's observed annual cycle for minimum, maximum, and mean temperatures, but with an annually averaged warm bias of 1.4°C relative to CN05.1. 3) Annually averaged temperature biases are greatest over regions where the station density of the observation network is lowest, with a warm bias over the Taklamakan Desert and a cold bias over the Himalayan Mountain chain with respect to CN05.1 (gridpoint differences up to ±10°C in boreal winter). By season, winter temperatures are anomalously high in SE, and summer temperatures are overestimated in central and eastern regions. 4) The 20CR-DS skillfully replicates the timing of anomalous hot and cold years throughout the observed period. 5) The climatological distribution and magnitude of seasonal and annual precipitation are largely well described across China, but there is a small annual-mean wet bias of 0.9 mm day⁻¹ with respect to CN05.1, with a persistent wet bias over the Himalaya, a dry bias over coastal southern China, and a wet bias over central southern China in spring. 6) Observational agreement on the timing and magnitude of wet and dry years is highly variable, so the uncertainty around 20CR-DS precipitation is nonnegligible. The 20CR-DS shows comparable correlations with the range of observation-based datasets in all regions except NE and SE.
Considering the effects of downscaling, we find the following: 1) The 20CR-DS climatological (30-yr average) annual temperature cycle is more realistic after downscaling for northern and western regions (especially over the Tibetan Plateau), whereas in the southern central region global 20CRv2c remains more realistic than downscaled simulations. 2) Downscaling improves correlation with temperature observations in all regions except NE. The 20CR-DS correlations remain high throughout the twentieth century ($0.55 \le r \le 0.79$). As such, 20CR-DS quantitatively improves and accurately represents long-term interannual variability of mean temperatures throughout China. 3) The large-scale climatological biases in precipitation found in both global reanalyses (wet continent and dry southern coast) are enhanced in the downscaled simulations. However, correlations between annually averaged downscaled simulations and their driving reanalyses (ERA-Interim and 20CRv2c) are low for all regions ($0.13 \le r \le 0.66$), suggesting that precipitation trends are not strongly constrained by the driving data after downscaling, regardless of the driving model. 4) Downscaling improves correlation with precipitation observations in most regions, whereas interannual correlation in NE and SE ($r = 0.17$ and $r = 0.35$, respectively) is notably lower than cross-observational agreement.
Trend analysis shows the following: 1) The 20CR-DS temperatures increased over China, with a linear trend of 0.32°C (10 yr)⁻¹ (significant at the 1% level) in the recent past ( China has occurred in the recent past or when considering the full 160-yr period (1851-2010). Regional trends exist in some datasets, but agreement on historic precipitation is very low.
The main aim of this analysis has been to verify key near-surface variables of a newly downscaled reanalysis product (20CR-DS) by considering monthly means over extended periods. This is an imperative first step in an ongoing assessment of this novel dataset. Future work will involve a comparison study of 20CR-DS with other RCMs over the same domain, to examine its ability to produce realistic extremes and associated teleconnections. We also hope to investigate bias correction of 20CR-DS. It is hoped that these activities will result in an improved representation of China's historic climate from 1851 to the present, and a better quantification of the associated uncertainties. Initial findings indicate that 20CR-DS is a valuable fine-resolution, extended 4D proxy for observations over China, available for use in future scientific analysis and impacts studies.