In this study, the East Asia Regional Reanalysis (EARR) is developed for the period 2013–14 and characteristics of the EARR are examined in comparison with ERA-Interim (ERA-I) reanalysis. The EARR is based on the Unified Model with 12-km horizontal resolution, which has been an operational numerical weather prediction model at the Korea Meteorological Administration since being adopted from the Met Office in 2011. Relative to the ERA-I, in terms of skill scores, the EARR performance for wind, temperature, relative humidity, and geopotential height improves except for mean sea level pressure, the lower-troposphere geopotential height, and the upper-air relative humidity. In a similar way, RMSEs of the EARR are smaller than those of ERA-I for wind, temperature, and relative humidity, except for the upper-air meridional wind and the upper-air relative humidity in January. With respect to the near-surface variables, the triple collocation analysis and the correlation coefficients confirm that EARR provides a much improved representation when compared with ERA-I. In addition, EARR reproduces the finescale features of near-surface variables in greater detail than ERA-I does, and the kinetic energy (KE) spectra of EARR agree more with the canonical atmospheric KE spectra than do the ERA-I KE spectra. On the basis of the fractions skill score, the near-surface wind of EARR is statistically significantly better simulated than that of ERA-I for all thresholds, except for the higher threshold at smaller spatial scales. Therefore, although special care needs to be taken when using the upper-air relative humidity from EARR, the near-surface variables of the EARR that were developed are found to be more accurate than those of ERA-I.
1. Introduction
A reanalysis is a high-quality climate-data product produced by assimilating long time series of observations with a consistent and state-of-the-art numerical weather prediction (NWP) model and data assimilation (DA) system to represent the best estimate of the state of the atmosphere. Reanalysis is used for a variety of applications, such as climatic variability, chemical transport, and parameterization studies (Rood and Bosilovich 2010).
Before the production of reanalyses, global analyses produced by many operational centers as initial conditions for their operational NWP models had been utilized in numerous research studies (Hoskins et al. 1989; Appenzeller and Davies 1992; Bosart et al. 1992; Price and Vaughan 1993; Mote et al. 1994). Because of the various problems encountered when analyzing the long-term trends in climate with operational analyses, however, Bengtsson and Shukla (1988) and Trenberth and Olson (1988) suggested that NWP models and assimilation methods need to be not only up to date but also kept consistent when generating analyses with historical observations to provide a valuable dataset. These suggestions and positive responses led to global reanalysis projects as shown in Table 1. At first, the 15-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-15; Gibson et al. 1997), the National Centers for Environmental Prediction–National Center for Atmospheric Research Reanalysis 1 (NCEP–NCAR R-1; Kalnay et al. 1996; Kistler et al. 2001), and the National Aeronautics and Space Administration Data Assimilation Office (NASA DAO) global reanalysis for 15 years (GEOS1; Schubert et al. 1993) were produced.
In many studies using the first-generation reanalyses, various problems and limitations were found (Kållberg 1997; Stendel and Arpe 1997; Newman et al. 2000; Trenberth et al. 2001). Hence, NCEP–U.S. Department of Energy (DOE) Reanalysis 2 (R-2; Kanamitsu et al. 2002), which is an updated version of NCEP–NCAR R-1 using an improved DA system and model, was developed. Then, NCEP produced the NCEP Climate Forecast System Reanalysis (CFSR; Saha et al. 2010) with higher resolution (approximately 38 km). Likewise, ERA-40 (Uppala et al. 2005) was carried out at ECMWF using new observation types, an improved forecast model, and a variational data assimilation method to fix the limitations of ERA-15. Then, ERA-Interim (hereinafter ERA-I; Dee et al. 2011) was generated by ECMWF. Furthermore, NASA’s Modern-Era Retrospective Analysis for Research and Applications (MERRA; Bosilovich 2008; Rienecker et al. 2011; MERRA-2; Bosilovich et al. 2015) was executed by NASA, and Japanese 25-year Reanalysis Project (JRA-25; Onogi et al. 2007) and Japanese 55-year Reanalysis (JRA-55; Ebita et al. 2011; Kobayashi et al. 2015) were developed by the Japan Meteorological Agency (JMA). Because most meteorological centers produced reanalyses for the latter half of the twentieth century, it is hard to understand the global impacts of major events in the first half of the twentieth century. Accordingly, reanalyses that include the first half of the twentieth century have recently been developed through many projects. The Twentieth Century Reanalysis (20CRv1; Compo et al. 2008; 20CRv2; Compo et al. 2011) and a twentieth-century atmospheric model ensemble (ERA-20CM; Hersbach et al. 2015) have been generated.
While most reanalyses have been produced for the global area, the coarse resolution of global reanalyses limits the analysis of regional-scale phenomena. Hence, many meteorological organizations started to produce regional reanalyses with higher resolution. NCEP produced the North American Regional Reanalysis (NARR; Mesinger et al. 2006). In Europe, the European Reanalysis and Observations for Monitoring (EURO4M) project led to European regional reanalyses (Renshaw et al. 2013). Subsequently, in the Uncertainties in Ensembles of Regional Reanalyses (UERRA) project, various European meteorological regional reanalyses are being developed (Borsche et al. 2015). In addition, the Arctic research community produced the Arctic System Reanalysis (ASR; Bromwich et al. 2010), and the Arctic Observation and Reanalysis Integrated System (ArORIS) was developed to bring together ground-based data sources, satellite, and reanalysis into a common frame for the Arctic (Christensen et al. 2016). Currently, within the Indian Monsoon Data Assimilation and Analysis (IMDAA) project, the Met Office (UKMO) has planned to produce a regional reanalysis in South Asia for the period 1978–present (Mahmood et al. 2014; Barker et al. 2015).
Regional reanalyses have already been produced or are planned to be produced in North America, Europe, the Arctic, and South Asia, but no regional reanalyses exist for East Asia. Because of the large uncertainties in global reanalyses of regional-scale phenomena in Asia (Bao and Zhang 2013; Chen et al. 2014), a high-resolution regional reanalysis is required to precisely analyze the regional climate in Asia.
In this study, the East Asia Regional Reanalysis (EARR) system is developed on the basis of the Unified Model (UM) with 12-km horizontal resolution. The UM has been an operational NWP model in the Korea Meteorological Administration (KMA) since being adopted from the UKMO in 2011. The EARR system has been run for the 2-yr period of 2013–14 to prepare for a future, long-term execution of the EARR project. For practical applications of reanalysis, validation is essential because a reanalysis is an estimate of the atmospheric state, as mentioned in Mahmood et al. (2014) and Borsche et al. (2015). Thus, the characteristics and uncertainties of EARR for the period 2013–14 are compared with ERA-I and observational data.
In section 2, the EARR system, which includes the NWP model, DA system, and observation types used to produce EARR, is described. For evaluation of the performance of EARR, a description of various verification methods used is provided in section 3. The results are shown and discussed in section 4. Section 5 contains a summary and conclusions.
2. Reanalysis system
The EARR system is developed on the basis of the UM (Davies et al. 2005) at the KMA (version 8.2, the KMA operational forecasting system in June of 2015), which has been an operational NWP model at the KMA since 2011. The horizontal resolution is 12 km (0.11° × 0.11°; 540 × 432 grid points), and there are 70 vertical levels up to 80 km from the surface. The UM uses fully compressible, nonhydrostatic, and deep-atmosphere formulations (i.e., Wood and Staniforth 2003; Davies et al. 2005; Staniforth and Wood 2008) with hybrid height-based terrain-following vertical coordinates. The UM is a gridpoint model discretized spatially with a horizontally staggered Arakawa C grid and a vertically staggered Charney–Phillips grid. Semi-implicit, semi-Lagrangian time-integration methods are used in the UM (Davies et al. 2005). The physical-process parameterization schemes used in this study are the Edwards–Slingo general two-stream scheme (Edwards and Slingo 1996) for radiation, the Joint UK Land Environment Simulator (JULES; Best et al. 2011) four-layer soil model using van Genuchten (1980) soil hydrology for the surface, first-order nonlocal boundary layer scheme on the basis of Lock et al. (2000) for the boundary layer, mass-flux convection with a convective available potential energy closure (Gregory and Rowntree 1990) for cumulus parameterization, and a mixed-phase precipitation scheme (Wilson and Ballard 1999) for microphysics parameterization.
The EARR domain is shown in Fig. 1. The lateral boundary conditions (LBCs) are provided by the operational global model of the UM at the KMA with 25-km horizontal resolution (N512) and 70 vertical levels (top at 80 km). The analysis and 3-h forecast fields produced by the global UM at the KMA are used for the LBCs, and the LBCs are updated every 3 h.
b. Data assimilation system and observations
To generate EARR, the Met Office four-dimensional variational (4DVAR) data assimilation scheme (Courtier et al. 1994; Rawlins et al. 2007) is used to assimilate the observations. The type and source of observations used to produce EARR are listed in Table 2. In addition to the observations used for the operational forecast, Infrared Atmospheric Sounding Interferometer (IASI) satellite data from the European Meteorological Operational Satellite B (MetOp-B), atmospheric motion vectors (AMV) from the Communication, Ocean and Meteorological Satellite (COMS), and additional channels of satellite data from the Advanced TIROS Operational Vertical Sounder (ATOVS) and the Atmospheric Infrared Sounder (AIRS) are assimilated. Furthermore, for EARR, the satellite data density is modified by applying a different thinning distance than that of operational forecasts. Thus, the observational data used for EARR are enhanced from those used for operational purposes.
The EARR system is developed to provide reanalysis four times per day (at 0000, 0600, 1200, and 1800 UTC) using 6-hourly analysis cycles and a 6-h assimilation window. The following three-step process explains how the reanalysis is produced at time t + 0:
1) An analysis increment is acquired at t − 3 h using 4DVAR via an iterative procedure to obtain an initial field at t − 3 h (the beginning of the assimilation window), which minimizes the differences between the observations and the forecast starting at the previous cycle within the assimilation window (t − 3 h, t + 3 h).
2) The increment obtained in step 1 is added to the 3-h forecast integrated from the reanalysis at t − 6 h that was produced in the previous cycle. The resulting field is considered to be the initial field at t − 3 h.
3) To maintain the configuration of the operational NWP model at the KMA, the 3-h forecast from the initial field at t − 3 h is considered to be the reanalysis (t + 0), and the 6–12-h forecast from the initial field at t − 3 h is used as the background state in the next 4DVAR cycle.
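The timing of one cycle can be sketched as follows. This is only an illustrative bookkeeping example of the three steps described above: the function name `cycle_schedule` and the dictionary keys are hypothetical and are not part of the EARR system.

```python
from datetime import datetime, timedelta


def cycle_schedule(analysis_time):
    """Return the key times of one 6-hourly cycle valid at `analysis_time` (t + 0)."""
    h = timedelta(hours=1)
    return {
        # Reanalysis of the previous cycle; its 3-h forecast receives the increment (step 2).
        "previous_reanalysis": analysis_time - 6 * h,
        # Start of the assimilation window; the increment and initial field are valid here.
        "window_start": analysis_time - 3 * h,
        # The 3-h forecast from the initial field is taken as the reanalysis (step 3).
        "reanalysis": analysis_time,
        "window_end": analysis_time + 3 * h,
        # The 6-12-h forecast from t - 3 h spans the next window (t + 3 h, t + 9 h).
        "next_background": (analysis_time + 3 * h, analysis_time + 9 * h),
    }


sched = cycle_schedule(datetime(2014, 1, 1, 12))
print(sched["window_start"], sched["window_end"])
```

For the 1200 UTC reanalysis, the assimilation window in this sketch runs from 0900 to 1500 UTC, and the background for the 1800 UTC cycle covers 1500 to 2100 UTC.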
For the Land DA System (LDAS), soil moisture fields produced from the global UM on the basis of 6-h updates with nudging increments provided by the Advanced Scatterometer (ASCAT) soil moisture dataset are interpolated to the EARR domain four times per day. In the UM LDAS, the background state of surface temperature is used as the surface temperature without analysis, and the sea surface temperature and sea ice are updated on the basis of the daily mean of the 5-km (1/20°) Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) from the UKMO at 0600 UTC.
3. Verification methods
a. RMSE and bias
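The RMSE is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(R_i - O_i\right)^2}, \tag{1}$$

where R represents the reanalysis interpolated to the observation sites using bilinear interpolation (Press et al. 1992), O represents the observations, and n is the number of observations. When calculating RMSE for wind, RMSEs for zonal and meridional wind are computed separately and are combined as in Dee et al. (2011):

$$\mathrm{RMSE}_{\mathrm{wind}} = \sqrt{\mathrm{RMSE}_u^2 + \mathrm{RMSE}_v^2}. \tag{2}$$

The bias (mean error) is defined as

$$\mathrm{Bias} = \frac{1}{n}\sum_{i=1}^{n}\left(R_i - O_i\right). \tag{3}$$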
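A minimal sketch of these scores is given below, assuming that the reanalysis values have already been interpolated to the observation sites. The function names are hypothetical, and the wind RMSE is assumed to be the root-sum-square of the zonal and meridional RMSEs (i.e., a vector wind RMSE).

```python
import numpy as np


def rmse(reanalysis, obs):
    """Root-mean-square error of reanalysis values R against observations O."""
    r = np.asarray(reanalysis, dtype=float)
    o = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((r - o) ** 2)))


def bias(reanalysis, obs):
    """Mean error (reanalysis minus observation)."""
    r = np.asarray(reanalysis, dtype=float)
    o = np.asarray(obs, dtype=float)
    return float(np.mean(r - o))


def wind_rmse(u_rean, u_obs, v_rean, v_obs):
    """Combine zonal and meridional RMSEs (assumed root-sum-square combination)."""
    return float(np.hypot(rmse(u_rean, u_obs), rmse(v_rean, v_obs)))


print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(bias([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(wind_rmse([3.0, 3.0], [0.0, 0.0], [4.0, 4.0], [0.0, 0.0]))
```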
b. Skill scores
The skill score (SS; Wilks 2006) is used to determine the performance of a forecast and reveals how much it is improved relative to the performance of a reference forecast. To validate the accuracy of the reanalysis in this study, the reanalysis and reference reanalysis are used instead of the forecast and reference forecast in the original definition of SS in Wilks (2006). In particular, the SS in this study is defined on the basis of the RMSE, which is the same as that of EURO4M (Renshaw et al. 2013):

$$\mathrm{SS} = \frac{\mathrm{RMSE} - \mathrm{RMSE}_{\mathrm{ref}}}{\mathrm{RMSE}_{\mathrm{perf}} - \mathrm{RMSE}_{\mathrm{ref}}} \times 100\% = \left(1 - \frac{\mathrm{RMSE}}{\mathrm{RMSE}_{\mathrm{ref}}}\right) \times 100\%, \tag{4}$$

where RMSEref represents the RMSE of the reference reanalysis from the observations, RMSEperf represents the RMSE that a perfect reanalysis would have, and RMSE represents the RMSE of the reanalysis to be verified. Because the RMSE is zero for a perfect reanalysis (i.e., RMSEperf = 0), the SS can be expressed more simply as shown in the rightmost term of Eq. (4).
The SS is 100% when the RMSE is 0. When the RMSE is equal to RMSEref, the SS is 0%, which indicates that there is no improvement relative to the reference reanalysis. If RMSE is larger than RMSEref, the SS is negative, which indicates that the performance is degraded relative to the reference reanalysis.
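The three cases described above can be checked with a short sketch. The function name `skill_score` and its argument names are hypothetical, and the default `rmse_perf=0.0` reflects the perfect-reanalysis assumption stated in the text.

```python
def skill_score(rmse, rmse_ref, rmse_perf=0.0):
    """RMSE-based skill score in percent relative to a reference reanalysis."""
    return (rmse - rmse_ref) / (rmse_perf - rmse_ref) * 100.0


print(skill_score(0.0, 2.0))  # perfect reanalysis
print(skill_score(2.0, 2.0))  # same RMSE as the reference
print(skill_score(3.0, 2.0))  # degraded relative to the reference
```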
c. Kinetic energy spectra
By analyzing the observational data from aircraft flying near the tropopause (approximately 10 km) during the Global Atmospheric Sampling Program (Perkins 1976; Papathakos and Briehl 1981), Nastrom and Gage (1985) explored the characteristics of the kinetic energy (KE) spectra as a function of the horizontal wavenumber. On the basis of these features in previous studies, the KE spectra of the model were calculated and compared with the canonical spectra to evaluate the ability of the model to reproduce the canonical spectrum structure (Koshyk and Hamilton 2001; Skamarock 2004; Hamilton et al. 2008; Skamarock et al. 2014). This ability can be regarded as evidence of accuracy of a model’s configuration, formulation, and implementation (Skamarock et al. 2014). The KE spectra of the aircraft observations in the upper troposphere approximately follow a −5/3 power slope of the horizontal wavenumber for scales corresponding to wavelengths of less than approximately 500 km and a −3 power slope for scales of larger than 500 km (Nastrom and Gage 1985; Cho et al. 1999; Lindborg 1999). The observed −3 power-slope regime is commonly explained by quasigeostrophic turbulence theory (Charney 1971), but the dynamics of the mesoscale portion, which is the −5/3 power slope, remains under discussion (e.g., Lindborg 2005, 2007; Tulloch and Smith 2009).
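One simple way to compare a computed spectrum with the canonical −5/3 and −3 slopes is a least-squares fit in log-log space over a chosen wavenumber band. The sketch below is only an illustration of that idea, not the procedure used in this study. The function name `spectral_slope` and the synthetic spectrum are hypothetical.

```python
import numpy as np


def spectral_slope(wavenumber, energy, k_min, k_max):
    """Least-squares power-law slope of E(k) for k_min <= k <= k_max."""
    k = np.asarray(wavenumber, dtype=float)
    e = np.asarray(energy, dtype=float)
    sel = (k >= k_min) & (k <= k_max)
    # Fit log E = slope * log k + intercept.
    slope, _intercept = np.polyfit(np.log(k[sel]), np.log(e[sel]), 1)
    return float(slope)


# Synthetic mesoscale-like spectrum with an exact -5/3 power law (illustrative values).
k = np.logspace(-6, -4, 50)
e = k ** (-5.0 / 3.0)
print(spectral_slope(k, e, k[0], k[-1]))
```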
d. Triple collocation
Triple collocation, introduced by Stoffelen (1998), is an effective method to estimate the random error variances of three collocated datasets with uncorrelated errors (Gruber et al. 2016). It has been widely used for the large-scale evaluation of many geophysical variables in oceanography, such as wind speed (Caires and Sterl 2003; Vogelzang et al. 2011), wave height (Caires and Sterl 2003; Muraleedharan et al. 2006; Janssen et al. 2007), and sea surface temperature (Blackmore et al. 2007; O’Carroll et al. 2008; Gentemann 2014), and in hydrometeorology, such as precipitation (Roebeling et al. 2012) and soil moisture (Scipal et al. 2008; Dorigo et al. 2010; Gruber et al. 2016). Because the comparison with point measurements is not enough to provide accurate information on the performance of a large-scale product, the triple collocation analysis can provide a much better spatial error analysis.
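The coefficients of the functional relationships between the variables and variances of their random errors are defined and calculated as follows (Caires and Sterl 2003; Muraleedharan et al. 2006). There are three sets of n observations $(x_i, y_i, z_i)$, i = 1, …, n, corresponding to measurements of certain deterministic underlying variables $T_i$, i = 1, …, n, with certain systematic deviations and subject to zero-mean random errors $(e_{x_i}, e_{y_i}, e_{z_i})$, i = 1, …, n. The observations can be derived from models, in situ data, or satellite data. For simplicity, the subscripts of variables are omitted and the average of the variable is denoted by 〈x〉, 〈xy〉, and so on. The following Eq. (5) is assumed, and the unknown parameters α1, α2, β1, and β2 and the variances of errors need to be estimated:

$$x = X + e_x, \quad y = Y + e_y, \quad z = Z + e_z, \qquad X = T, \quad Y = \alpha_1 + \beta_1 T, \quad Z = \alpha_2 + \beta_2 T, \tag{5}$$

where X, Y, and Z are the true measurements of the physical truth corresponding to the observations x, y, and z. Assume that X, Y, and Z are linearly related to the deterministic variable T. By removing the mean from each variable and denoting the result by x*, y*, z*, and T*, the model can be simplified to

$$x^* = T^* + e_x, \quad y^* = \beta_1 T^* + e_y, \quad z^* = \beta_2 T^* + e_z.$$

By calculating the cross correlations and assuming that the errors are independent, the variances of the errors can be obtained as

$$\langle e_x^2 \rangle = \langle x^{*2} \rangle - \frac{\langle x^* y^* \rangle \langle x^* z^* \rangle}{\langle y^* z^* \rangle}, \quad \langle e_y^2 \rangle = \langle y^{*2} \rangle - \frac{\langle x^* y^* \rangle \langle y^* z^* \rangle}{\langle x^* z^* \rangle}, \quad \langle e_z^2 \rangle = \langle z^{*2} \rangle - \frac{\langle x^* z^* \rangle \langle y^* z^* \rangle}{\langle x^* y^* \rangle}.$$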
More detailed derivation and discussion regarding triple collocation can be found in Caires and Sterl (2003). In this study, the functional-relationship (FR) model that is an error-in-variables model in which the underlying variables T are fixed (or deterministic) is used.
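The covariance-based estimates used in triple collocation can be sketched as follows. This is a minimal illustration under the stated assumptions (linear relation to a common underlying variable T, with x taken as the reference scale, and mutually independent zero-mean errors). The function name, the returned keys, and the synthetic data are hypothetical.

```python
import numpy as np


def triple_collocation(x, y, z):
    """Estimate slopes and random-error variances of three collocated datasets."""
    # Remove the mean from each dataset (x*, y*, z* in the text).
    xs, ys, zs = (np.asarray(a, dtype=float) - np.mean(a) for a in (x, y, z))
    c_xy, c_xz, c_yz = np.mean(xs * ys), np.mean(xs * zs), np.mean(ys * zs)
    return {
        "beta1": float(c_yz / c_xz),  # slope of y relative to x
        "beta2": float(c_yz / c_xy),  # slope of z relative to x
        "var_ex": float(np.mean(xs ** 2) - c_xy * c_xz / c_yz),
        "var_ey": float(np.mean(ys ** 2) - c_xy * c_yz / c_xz),
        "var_ez": float(np.mean(zs ** 2) - c_xz * c_yz / c_xy),
    }


# Synthetic check with known error standard deviations 0.5, 0.8, and 0.3.
rng = np.random.default_rng(0)
n = 200_000
t = rng.normal(10.0, 2.0, n)
x = t + rng.normal(0.0, 0.5, n)
y = 1.0 + 1.2 * t + rng.normal(0.0, 0.8, n)
z = -0.5 + 0.9 * t + rng.normal(0.0, 0.3, n)
result = triple_collocation(x, y, z)
print(result)
```

With a large synthetic sample, the estimated error variances should be close to 0.25, 0.64, and 0.09, and the slopes close to 1.2 and 0.9.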
e. Fractions skill score
With traditional verification scores, small-scale variabilities in high-resolution forecasts can be problematic in judging the quality of a forecast because of double penalty. Thus, when verifying high-resolution data products, spatial verification methods need to be employed. The fractions skill score (FSS) introduced by Roberts and Lean (2008) can compare forecasts of different resolutions without doubly penalizing high-resolution forecasts for representativeness errors (Mittermaier et al. 2013). Although it is more common to apply the FSS to the verification of precipitation variables, the FSS is used to verify near-surface wind variables of EARR and ERA-I in this study.
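The FSS is a variation on the Brier skill score (Roberts 2008) and is defined as

$$\mathrm{FSS} = 1 - \frac{\mathrm{FBS}}{\mathrm{FBS}_{\mathrm{worst}}},$$

where the fractions Brier score (FBS) is a mean square difference between observation and forecast fractions and is defined as

$$\mathrm{FBS} = \frac{1}{N}\sum_{i=1}^{N}\left(O_i - R_i\right)^2,$$

with $O_i$ and $R_i$ being the surface synoptic (SYNOP) observation fractions and reanalysis fractions, respectively, at each point (in this study, reanalysis is chosen instead of forecast) and having values between 0 and 1. Here, N is the number of points in the verification area of the FSS (shaded area in gray in Fig. 1). The FBSworst is the largest FBS that can be obtained from the given fractions and is defined as

$$\mathrm{FBS}_{\mathrm{worst}} = \frac{1}{N}\left[\sum_{i=1}^{N} O_i^2 + \sum_{i=1}^{N} R_i^2\right].$$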
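A minimal sketch of the FSS calculation follows, using the standard definitions of Roberts and Lean (2008): FSS = 1 − FBS/FBSworst, where FBS is the mean square difference of the fractions and FBSworst is the sum of the mean squared observation and reanalysis fractions. It assumes that the neighborhood fractions have already been computed for a given threshold and spatial scale. The function name is hypothetical.

```python
import numpy as np


def fractions_skill_score(obs_fractions, rean_fractions):
    """FSS from observation and reanalysis fractions (values between 0 and 1)."""
    o = np.asarray(obs_fractions, dtype=float)
    r = np.asarray(rean_fractions, dtype=float)
    fbs = np.mean((o - r) ** 2)
    # Largest possible FBS: no overlap between nonzero fractions.
    fbs_worst = np.mean(o ** 2) + np.mean(r ** 2)
    if fbs_worst == 0.0:
        return float("nan")  # event absent in both datasets; score undefined
    return float(1.0 - fbs / fbs_worst)


print(fractions_skill_score([0.2, 0.8], [0.2, 0.8]))  # identical fractions
print(fractions_skill_score([1.0, 0.0], [0.0, 1.0]))  # no overlap
```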
f. Hypothesis test for significant differences
When evaluating whether reanalysis 1 is better than reanalysis 2, it is necessary to test whether the difference in the verification scores is significant. Hence, in this study, the hypothesis test is conducted as follows. The null hypothesis is H0: μ1 − μ2 = 0 for dependent samples. The hypothesis test concerns a stochastic variable d = x1 − x2, which is a difference between paired values from two datasets. The test statistic t is defined as

$$t = \frac{\bar{d}}{\sqrt{s_d^2 / n}},$$

and is formed by the sample mean $\bar{d}$ and the variance of the sampling distribution of the mean $s_d^2/n$, where $s_d^2$ is the sample variance of the individual d and n is the sample size. In addition, if there is autocorrelation in the time series of data, the variance of a time average would be larger than that obtained under the assumption of time independence. Thus, the variance is inflated by considering the ratio (1 + ρ1)/(1 − ρ1) acting as a variance inflation factor (Wilks 2006; Kim and Kim 2014) as

$$\widehat{\mathrm{Var}}(\bar{d}) = \frac{s_d^2}{n}\,\frac{1 + \rho_1}{1 - \rho_1} = \frac{s_d^2}{n'}, \qquad n' = n\,\frac{1 - \rho_1}{1 + \rho_1},$$

where n′ is the estimated effective sample size and ρ1 is the lag-1 autocorrelation coefficient in the time series. The distribution of the test statistic t under H0 depends on the sample size. For sample sizes of less than 30, the Student’s t distribution with n′ − 1 degrees of freedom should be used. In this study, the Gaussian distribution that is appropriate for larger sample sizes is used and the serial correlation in the time series of data is considered. To account for contemporaneous correlation between datasets, one of the hypothesis tests developed by Diebold and Mariano (1995), Hering and Genton (2011), or DelSole and Tippett (2014) needs to be applied.
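The test can be sketched as follows for a large sample, using the Gaussian distribution for the p value. The function names are hypothetical. The lag-1 autocorrelation estimator shown is one common choice and is an assumption here, as is the use of the effective sample size n′ = n(1 − ρ1)/(1 + ρ1) implied by the variance inflation factor.

```python
import math

import numpy as np


def lag1_autocorrelation(d):
    """Simple lag-1 autocorrelation estimate of a time series."""
    a = np.asarray(d, dtype=float)
    a = a - a.mean()
    return float(np.sum(a[:-1] * a[1:]) / np.sum(a * a))


def effective_sample_size(n, rho1):
    """Effective sample size implied by the variance inflation factor."""
    return n * (1.0 - rho1) / (1.0 + rho1)


def paired_z_test(x1, x2):
    """Two-sided test of H0: mean(x1 - x2) = 0 with serial-correlation correction."""
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    n_eff = effective_sample_size(d.size, lag1_autocorrelation(d))
    z = d.mean() / math.sqrt(d.var(ddof=1) / n_eff)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Gaussian p value
    return z, p_value, n_eff


# Synthetic paired scores with a mean difference of 0.5 (illustrative only).
rng = np.random.default_rng(1)
x2 = rng.normal(0.0, 1.0, 400)
x1 = x2 + 0.5 + rng.normal(0.0, 1.0, 400)
z, p_value, n_eff = paired_z_test(x1, x2)
print(z, p_value, n_eff)
```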
4. Results
a. Skill scores
To verify the EARR accuracy, skill scores of the EARR are calculated for 2013–14 with ERA-I as a reference. To obtain the skill scores in Eq. (4), the RMSEs of each EARR and ERA-I variable at 0000 and 1200 UTC for 2013–14 are calculated. The black dashed box in Fig. 1 denotes the verification domain (20°–55°N, 103°–150°E). The wind and temperature near the surface, the mean sea level pressure (MSLP), and the wind, temperature, geopotential height, and relative humidity at 1000, 850, 500, and 200 hPa are selected for verification.
The pressure-level and near-surface variables are verified against radiosonde observations and surface SYNOP observations, respectively. Approximately 108 radiosonde and 550 SYNOP observations are used for verification at 0000 and 1200 UTC. The uncertainty of radiosonde observation from 1000 to 10 hPa is regarded as 1.3–2.1 m s−1 for zonal and meridional wind, 0.65–1.5 K for temperature, 10.5%–19% for relative humidity, and 7.8–72 m for geopotential height. SYNOP uncertainty estimates are considered to be 2 m s−1 for 10-m wind, 1.1 K for 2-m temperature, 6.2% for 2-m relative humidity, and 1 hPa for MSLP. The observations used for verification are assimilated to generate EARR and ERA-I. Thus, the requirements on independence are loosened to sustain a reasonably big dataset because strictly independent observations are difficult to get, as noted by Borsche et al. (2015).
The average, median, lower and upper quartiles (25th and 75th percentiles), and minimum and maximum values of skill scores for 2013–14 are shown in Fig. 2. Except for MSLP, the geopotential height at 850 and 1000 hPa, and the upper-air relative humidity, most of the skill scores are positive, which indicates that the EARR performance improves relative to ERA-I. The EARR accuracy is higher than that of ERA-I for surface variables relative to upper-air variables. Consistent with EURO4M (Renshaw et al. 2013), the MSLP of the EARR is degraded relative to ERA-I, which may be caused by the boundary conditions of the limited domain in the EARR. The boundary conditions of the EARR regional model are provided by the global UM; thus boundary errors originate from differences in model resolution and physical parameterization between the regional model and the global model at the regional model boundaries. These boundary errors can inevitably influence the interior of the regional model domain, and the regional model tends to simulate large-scale flows worse than the global model does (Warner et al. 1997; Renshaw et al. 2013). MSLP is a variable that represents large-scale flows. The skill scores for the upper-air relative humidity, which are negative, are discussed and analyzed in detail by comparing the EARR RMSEs with those of ERA-I in the next section.
To examine the temporal variation of the skill scores for wind and temperature near the surface, the relative humidity at 1000 hPa, and the geopotential height at 500 hPa, the time series of monthly mean skill scores are shown for 2013–14 in Fig. 3. Most of the monthly mean skill scores for the four variables are positive, which indicates improvement of EARR relative to ERA-I. For wind and temperature near the surface (Figs. 3a,b), the skill scores in winter are higher than in summer and show a similar pattern for each year. The higher SS of the temperature near the surface in winter than that in summer is consistent with the EURO4M results shown by Renshaw et al. (2013). Because the SS is the relative ratio of the EARR and ERA-I RMSEs, a significant improvement in the RMSE in winter (Figs. 5c and 6c, described in more detail below) leads to a higher SS. For the geopotential height at 500 hPa and the relative humidity at 1000 hPa (Figs. 3c,d), although there is no distinct seasonal variability in the time series of the SS, most of the skill scores are positive throughout all seasons, which indicates that EARR simulates better than ERA-I.
b. RMSE and bias
To investigate the magnitude, vertical distribution, and seasonal characteristics of the RMSE of wind, temperature, and relative humidity in the troposphere, the EARR and ERA-I RMSEs at the pressure levels are averaged for 0000 and 1200 UTC in January and July, respectively, for 2013–14 (Fig. 4). For the near-surface temperature and winds of EARR and ERA-I, the RMSEs and biases at 0000 and 1200 UTC for January and July, respectively, in 2014 are examined (Figs. 5 and 6).
Except for the upper-air meridional wind and relative humidity in January, the EARR RMSEs for all variables are smaller than those of ERA-I (Fig. 4), which indicates that EARR can simulate variables better than ERA-I can in comparison with observations. To be specific, for the zonal wind the EARR RMSEs for all levels in January and July are reduced, on average, by 0.41 and 0.60 m s−1, respectively, relative to those of ERA-I (Figs. 4a,b), and these differences are statistically significant at the 1% significance level. For the meridional wind the EARR RMSEs below 250 hPa in January and for all levels in July are, on average, smaller by 0.35 and 0.55 m s−1, respectively, relative to those of ERA-I (Figs. 4c,d), and these differences are statistically significant at the 1% significance level. The EARR RMSE for the meridional wind at 200 hPa in January is 0.12 m s−1 larger than that of ERA-I. According to So and Suh (2015), the intensity and three-dimensional variation of the jet core at 200 hPa in East Asia vary significantly depending on the reanalysis type, and the difference in the upper-air wind between different reanalyses is approximately 3 m s−1. The variability may explain why the EARR RMSE for the upper-air meridional wind in January is slightly larger than that of ERA-I. Nevertheless, the RMSE of the EARR meridional wind for all levels is smaller than that of R-2 vector wind (varying in the range of 3.2–6.4) and that of NARR vector wind (varying in the range of 2.4–4.5), as shown in Mesinger et al. (2006).
For temperature (Figs. 4e,f), the EARR RMSE is largest at the lower level and decreases with height below 500 hPa. Above 500 hPa, the EARR RMSE increases with height. This vertical profile of the temperature RMSE of the EARR is similar to that of NARR and R-2 (Mesinger et al. 2006). Throughout all levels (between 1000 and 200 hPa), the temperature RMSE of the EARR is smaller than that of ERA-I by 0.16 K in January and 0.18 K in July (on average). At each pressure level, the difference in the EARR and ERA-I RMSEs is statistically significant at the 1% significance level.
For the relative humidity (Figs. 4g,h), EARR exhibits improvements in January below 850 hPa and in July throughout all levels, relative to ERA-I, and these differences are statistically significant at the 1% significance level. The relative humidity has considerable uncertainty in reanalyses (Martin et al. 2011). In particular, Chung et al. (2013) identified that the uncertainty of the diurnal variation in upper-tropospheric humidity is substantial because the physical-process parameterizations governing upper-tropospheric humidity in reanalyses are incomplete; moreover, observations to constrain the moisture budget are not used effectively.
Because EARR (12 km) has a higher resolution than ERA-I (80 km), EARR tends to better simulate near-surface variables than upper-air variables. Jermey and Renshaw (2016) mentioned that the improvement in precipitation representation of the UKMO regional reanalysis over ERA-I is mainly due to the higher resolution of the forecast model and assimilation system of the UKMO regional reanalysis, which is particularly important when representing high-threshold and small-scale events. With respect to the near-surface variables (wind and temperature), the temporal variations of the RMSE and bias in comparison with SYNOP observations are calculated at 0000 and 1200 UTC for January and July of 2014 (Figs. 5 and 6). For the near-surface temperature, the 2-m temperature is used for the observations and ERA-I, whereas the 1.5-m temperature is used for EARR. The EARR near-surface temperature biases are generally negative, and those of ERA-I are positive (Figs. 5a,b). A temperature inversion near the surface in the stable boundary layer, where temperatures typically decrease at night primarily as a result of radiative cooling, might be one reason why EARR has negative biases at 0000 and 1200 UTC (corresponding to 0800–0900 and 2000–2100 LST in East Asia) because EARR simulates the 1.5-m temperature, which is closer to the surface than the 2-m temperature used in ERA-I and the observational data.
The average difference in magnitude (i.e., absolute values) of the biases in January between EARR and ERA-I (Fig. 5a) is 0.07 K (larger biases for EARR) and is not statistically significant at the 5% significance level. In contrast, the average magnitude of the biases in July for EARR is larger by 0.25 K than that of ERA-I (Fig. 5b), which is significantly different at the 1% significance level. Bias has limitations in estimating the accuracy of reanalyses because bias does not provide information on the representative magnitude of each error (Wilks 2006). Instead, estimating the temperature accuracy via the RMSE shows that the average EARR RMSEs decrease by 0.36 K in January and 0.18 K in July relative to those of ERA-I (Figs. 5c,d). The EARR RMSEs for the near-surface temperature are significantly different from those for ERA-I at the 1% significance level. The larger bias and smaller RMSE of EARR relative to ERA-I indicate that the ERA-I near-surface temperature varies more than that of EARR in comparison with observations. Furthermore, on average, the near-surface temperature RMSE of the EARR in July is smaller by 0.56 K than in January, which indicates that EARR can better simulate the near-surface temperature in July than in January. This pattern is found in other reanalyses, such as R-1, ERA-40, and ERA-I over Ireland (Mooney et al. 2011).
For the near-surface wind, both EARR and ERA-I have positive biases in both January and July of 2014 and have smaller biases in July than in January (Figs. 6a,b). The EARR bias in both January and July is smaller than that of ERA-I. In January and July, the average EARR RMSE is smaller by 0.91 and 0.64 m s−1, respectively, than that of ERA-I. Differences in the 10-m wind RMSE between EARR and ERA-I are significant at the 1% significance level. In addition, the average EARR RMSE for the 10-m wind in July is smaller by 0.11 m s−1 than in January.
Therefore, it is found that EARR and ERA-I can produce more accurate near-surface wind and temperature fields in July than in January. EARR is able to better simulate near-surface variables than ERA-I, and the EARR improvements over ERA-I in terms of the reduced RMSE are larger in January than in July. Although this result is consistent with the characteristics of NARR, EARR has smaller RMSEs than NARR [cf. Figs. 5 and 6 in this study with Figs. 8 and 9 of Mesinger et al. (2006)].
c. Correlation coefficient
To determine how well the reanalysis agrees with observations, correlation coefficients with respect to near-surface wind and temperature are investigated by comparing the SYNOP observations with EARR and ERA-I, respectively. The monthly means are calculated by averaging the reanalyses and observations at 0000 and 1200 UTC in January and July of 2014. The near-surface variables of the reanalysis at individual grid points are interpolated to the locations of the SYNOP observations via bilinear interpolation in the verification area. Observations at the SYNOP observation sites with fewer than one-third of the data for a given month are excluded because such data are not considered to be appropriate for calculating monthly means.
Linear regression analysis for the reanalyses and observations is conducted with the monthly means of each near-surface variable at meaningful observation sites. The near-surface wind data with standardized residuals of greater than 3 are regarded as outliers and are excluded from the linear regression analysis. The standardized residual is obtained by dividing the difference between the data value and the regressed line by the standard deviation of the residual.
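The outlier screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names are hypothetical, an ordinary least squares fit is assumed, the residual standard deviation uses n − 1 in the denominator, and the threshold of 3 is applied to the absolute value of the standardized residual.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit y = m * x + b; returns (m, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, my - m * mx

def standardized_residuals(x, y):
    """Residual from the regressed line divided by the residual standard deviation."""
    m, b = fit_line(x, y)
    res = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    mean_res = sum(res) / len(res)
    sd = math.sqrt(sum((r - mean_res) ** 2 for r in res) / (len(res) - 1))
    return [r / sd for r in res]

def drop_outliers(x, y, threshold=3.0):
    """Keep pairs whose standardized residual magnitude is at most `threshold`."""
    z = standardized_residuals(x, y)
    kept = [(xi, yi) for xi, yi, zi in zip(x, y, z) if abs(zi) <= threshold]
    return [p[0] for p in kept], [p[1] for p in kept]

# Hypothetical monthly-mean winds (m s-1): observation (x) vs reanalysis (y),
# with one station deliberately corrupted.
x = [float(i) for i in range(20)]
y = list(x)
y[10] += 10.0
kept_x, kept_y = drop_outliers(x, y)
print(len(x), "->", len(kept_x), "stations after screening")
```

After screening, the regression coefficient m and the correlation coefficient R would be recomputed from the retained pairs.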
For the near-surface wind and temperature, the correlation coefficients R from EARR for both January and July are higher than those from ERA-I, and the regression coefficients m, which are the slopes of the regressed lines, from EARR for both January and July are closer to 1 than those from ERA-I (Fig. 7). Therefore, EARR can provide a much improved representation of near-surface variables that is more consistent with observations than that of ERA-I. In addition, the differences between the EARR and ERA-I monthly means for both the near-surface wind and temperature in January and July are statistically significant at the 1% significance level.
d. Kinetic energy spectra
KE spectra are generally computed for global models; for a limited-area model, a different method is needed. Here, the spectral analysis is conducted using the discrete cosine transform as in Denis et al. (2002), and the KE spectra are obtained by averaging the spectra of the variances of the zonal and meridional wind fields. At 0000 and 1200 UTC for January and July of 2014, the KE spectra at 200, 500, and 850 hPa of the EARR and ERA-I are computed in the interior domain, excluding the 40 grid points nearest the boundaries to avoid the impact of the boundary conditions in the EARR domain.
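A minimal sketch of a discrete-cosine-transform variance spectrum is given below for illustration. It is not the implementation used in this study: the function names are hypothetical, an orthonormal type-II DCT is assumed, and the binning by rounded normalized wavenumber is a simplification of the binning in Denis et al. (2002). The direct summation is slow and is meant only for small test fields.

```python
import math

def dct_1d(x):
    """Orthonormal type-II discrete cosine transform of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(field):
    """Apply the 1D transform along rows, then along columns."""
    rows = [dct_1d(row) for row in field]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def trim_boundary(field, width):
    """Drop `width` grid points next to each lateral boundary."""
    return [row[width:len(row) - width] for row in field[width:len(field) - width]]

def variance_spectrum(field):
    """Variance per wavenumber bin; the domain mean (0, 0) is excluded."""
    nj, ni = len(field), len(field[0])
    coeffs = dct_2d(field)
    nmin = min(ni, nj)
    spec = [0.0] * (int(math.sqrt(2.0) * nmin) + 2)
    for m in range(nj):
        for n in range(ni):
            if m == 0 and n == 0:
                continue
            alpha = math.sqrt((m / nj) ** 2 + (n / ni) ** 2)  # normalized wavenumber
            spec[int(round(alpha * nmin))] += coeffs[m][n] ** 2 / (ni * nj)
    return spec

def ke_spectrum(u, v):
    """Average of the zonal and meridional wind variance spectra."""
    return [0.5 * (a + b) for a, b in zip(variance_spectrum(u), variance_spectrum(v))]
```

Because the transform is orthonormal, the bins of `variance_spectrum` sum to the spatial variance of the field, which gives a simple consistency check.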
The average KE spectra calculated at 200, 500, and 850 hPa are shown in Fig. 8. For the scales that are larger than 1000 km, the EARR and ERA-I KE spectra show generally similar energy levels at each vertical height. In particular, the EARR and ERA-I KE spectra at 200 hPa exhibit slopes closer to −3. At lower levels, the EARR and ERA-I KE spectra for scales larger than 1000 km contain less energy, which is consistent with the spectra shown by Skamarock (2004). In contrast, for the scales that are less than 1000 km, the ERA-I KE spectra at each vertical height decrease dramatically as the wavenumber increases, relative to the EARR KE spectra. As a result, the EARR slope is closer to −5/3 than that of ERA-I. Although the EARR spectral tails are also damped, EARR can still reproduce greater KE than ERA-I, which can be explained by the effective resolution resulting from differences in the inherent resolution of EARR and ERA-I (Denis et al. 2002; Skamarock 2004; Zentek et al. 2016).
According to Skamarock (2004), the effective resolution can be defined as the wavelength at which a model’s spectrum starts to decay relative to the observed spectrum or relative to a spectrum from a simulation with higher resolution, because modes with wavelengths shorter than the effective resolution are damped. The effective resolution determined from KE spectra is found to be approximately 7 times the horizontal grid spacing, because the model filters scale with the model grid size (Skamarock 2004). The EARR KE spectra at 200 hPa start to fall off at around 200 km, whereas those at 850 hPa start to fall off at approximately 84 km, as indicated by the blue vertical line in Fig. 8, which is 7 times the EARR grid spacing (12 km). At lower levels, the EARR KE spectra for scales smaller than approximately 500 km contain more KE, show a slope that is closer to −5/3, and start to fall off at smaller scales. This behavior may be associated with the numerical dissipation and filtering configurations in the model, as noted by Skamarock and Dempsey (2005).
e. Monthly mean horizontal field
To analyze the characteristics of the monthly spatial distribution of the near-surface variables from EARR and ERA-I, the monthly mean near-surface wind and temperature are obtained by averaging each variable at 0000 and 1200 UTC in January and July of 2014 (Figs. 9 and 10). Overall, the monthly mean near-surface wind and temperature from EARR for January of 2014 are similar to those from ERA-I (Fig. 9). For the near-surface temperature, however, EARR can simulate finescale features on land in greater detail than ERA-I can; for the near-surface wind, EARR is good at reproducing small-scale phenomena near Taiwan and the coast of Russia in the ocean as well as on land (Fig. 9). Overall, the EARR near-surface wind on land is underestimated relative to that of ERA-I (Figs. 9b,d), which is consistent with Fig. 7 and indicates that the EARR variable better resembles the observations than does that of ERA-I. Although the monthly means of the near-surface wind and temperature from EARR in July of 2014 are generally comparable to those from ERA-I, the small-scale phenomena in the ocean and on land are better captured in EARR than in ERA-I (Fig. 10).
f. Triple collocation
The triple collocation analysis is applied to near-surface wind and temperature fields of SYNOP observations, ERA-I, and EARR reanalysis data at 0000 and 1200 UTC for January and July of 2014. For the SYNOP, ERA-I, and EARR collocation, EARR and ERA-I data are interpolated bilinearly to the SYNOP locations in the verification area.
The scatter diagrams of comparisons between three datasets for the near-surface wind and temperature from January and July of 2014 are presented in Figs. 11 and 12, respectively. The estimates of the FR coefficients and of the variances of the errors are given in Table 3. The FR lines obtained are superposed on the scatter diagrams. For the near-surface temperature, both EARR and ERA-I underestimate SYNOP observations of high temperature and overestimate SYNOP observations of low temperature (Figs. 11a,c and 12a,c). With respect to the near-surface wind, both EARR and ERA-I overestimate SYNOP observations (Figs. 11b,d and 12b,d). In addition, estimates of the variances of the random errors in EARR are much lower than those for SYNOP and ERA-I (Table 3) for both the near-surface wind and temperature in January and July of 2014, which can be confirmed by the scatter diagrams (Figs. 11 and 12). The scatter of SYNOP versus ERA-I comparisons (Figs. 11a,b and 12a,b) is greater than that of EARR versus any other data. Thus, EARR performs better in representing both the near-surface wind and temperature than ERA-I does for January and July of 2014.
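For illustration, the idea behind estimating random-error variances from three collocated datasets can be sketched with the widely used covariance form of triple collocation. This is an assumption made for the example and is not necessarily identical to the FR formulation used in this study; the function names are hypothetical. The sketch assumes that the errors of the three datasets are mutually uncorrelated and uncorrelated with the true signal.

```python
def cov(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n

def triple_collocation_error_variances(x, y, z):
    """Error variance estimates for x, y, z from their mutual covariances."""
    cxy, cxz, cyz = cov(x, y), cov(x, z), cov(y, z)
    return (
        cov(x, x) - cxy * cxz / cyz,
        cov(y, y) - cxy * cyz / cxz,
        cov(z, z) - cxz * cyz / cxy,
    )

# Hypothetical collocated series built from mutually orthogonal patterns,
# so that the "truth" and the three error series are exactly uncorrelated.
truth = [2.0 * s for s in (1, -1, 1, -1, 1, -1, 1, -1)]
err_x = [0.5 * s for s in (1, 1, -1, -1, 1, 1, -1, -1)]
err_y = [1.0 * s for s in (1, -1, -1, 1, 1, -1, -1, 1)]
err_z = [1.5 * s for s in (1, 1, 1, 1, -1, -1, -1, -1)]
x = [t + e for t, e in zip(truth, err_x)]
y = [t + e for t, e in zip(truth, err_y)]
z = [t + e for t, e in zip(truth, err_z)]
print(triple_collocation_error_variances(x, y, z))
```

In this constructed case the estimates recover the squared error amplitudes imposed on each series; with real data the estimates are subject to sampling uncertainty and to violations of the independence assumption.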
g. Fractions skill score
Before calculating FSS, ERA-I and SYNOP observations are reprojected onto the same grid as the EARR (12-km grid spacing) so that they can be directly compared with each other. For every point, the fraction within a given square area (neighborhood) that exceeds a specific threshold is calculated for EARR, ERA-I, and SYNOP. The same process is repeated for each neighborhood size [1 Δx (12 km), 5 Δx (60 km), 9 Δx (108 km), and 11 Δx (132 km)] and threshold (2, 4, 6, and 8 m s−1). The maximum size of neighborhood is chosen to be 132 km to include the Korean Peninsula and Japan in East Asia, and the neighborhood is square. Because the SYNOP observations in the ocean are very sparse (around 10 points), only land with sufficiently reliable SYNOP coverage is considered as the verification area for FSS. The FSS can have values between 0 (a complete mismatch) and 1 (perfect skill). The FSS obtained from a random reanalysis with the same fractional coverage over the verification area as that of the observation for each threshold is called FSSrandom (fo). The FSS that is halfway between random and perfect skill is called FSSuniform (0.5 + fo/2). The smallest scale at which the FSS curve reaches FSSuniform is regarded as the useful, or skillful, scale.
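The procedure described above can be sketched as follows. This is a minimal illustration rather than the code used in this study: the function names are hypothetical, the usual ratio of the mean-square difference of the fractions to its worst-case reference is assumed for the score, and grid points outside the domain are counted as not exceeding the threshold.

```python
def neighborhood_fractions(field, threshold, size):
    """Fraction of points exceeding `threshold` in a size x size square
    centered on each grid point (`size` is an odd number of grid lengths).
    Points outside the domain count as non-exceeding (an assumption)."""
    nrow, ncol = len(field), len(field[0])
    half = size // 2
    binary = [[1 if v > threshold else 0 for v in row] for row in field]
    fractions = []
    for j in range(nrow):
        row = []
        for i in range(ncol):
            count = 0
            for jj in range(max(0, j - half), min(nrow, j + half + 1)):
                for ii in range(max(0, i - half), min(ncol, i + half + 1)):
                    count += binary[jj][ii]
            row.append(count / (size * size))
        fractions.append(row)
    return fractions

def fss(analysis, obs, threshold, size):
    """Fractions skill score: 1 is perfect skill, 0 is a complete mismatch."""
    pa = [p for row in neighborhood_fractions(analysis, threshold, size) for p in row]
    po = [p for row in neighborhood_fractions(obs, threshold, size) for p in row]
    num = sum((a - o) ** 2 for a, o in zip(pa, po))
    den = sum(a * a for a in pa) + sum(o * o for o in po)
    return 1.0 - num / den if den > 0 else float("nan")

def fss_uniform(obs, threshold):
    """Halfway between random and perfect skill: 0.5 + fo / 2."""
    values = [v for row in obs for v in row]
    fo = sum(1 for v in values if v > threshold) / len(values)
    return 0.5 + fo / 2.0

# Hypothetical 5 x 5 wind fields (m s-1) with one strong-wind point
# displaced by two grid lengths between analysis and observation.
analysis = [[0.0] * 5 for _ in range(5)]
observed = [[0.0] * 5 for _ in range(5)]
analysis[2][1] = 10.0
observed[2][3] = 10.0
for size in (1, 3, 5):
    print(size, fss(analysis, observed, 4.0, size))
```

The displaced feature gives no skill at the grid scale, and the score increases as the neighborhood grows, which is the behavior that the skillful-scale criterion relies on.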
Figure 13 shows the averaged FSS of EARR and ERA-I reanalysis for 0000 and 1200 UTC in January and July of 2014 as a function of spatial scale (length of square) for each threshold. Overall, the EARR FSS is higher than the ERA-I FSS for all thresholds. For thresholds of 2 and 4 m s−1 (Figs. 13a,b), the EARR FSS surpasses FSSuniform at around 40 and 100 km, respectively, whereas the ERA-I FSS does not reach FSSuniform for any spatial scales up to 132 km. In addition, for thresholds of 6 and 8 m s−1 (Figs. 13c,d), even though neither reanalysis curve achieves a skillful scale (FSS < 0.5 + fo/2), the EARR FSS is still higher than the ERA-I FSS except for the 8 m s−1 threshold at a spatial scale of 12 km. Mittermaier et al. (2013) mentioned that the decreasing trend of FSS with increasing thresholds for precipitation forecasts is affected by a combination of increased biases and mismatch in location as the thresholds increase. In addition, Mittermaier and Roberts (2010) pointed out that a skillful spatial scale is difficult to achieve for higher thresholds because the higher thresholds are related to smaller-scale features that are more difficult to represent accurately. The limitation of the maximum size of neighborhood in this study may be one of the reasons for the lower FSS at higher thresholds. Therefore, increased biases and mismatch in location and limited neighborhood size in this study may be reasons for low FSS values at higher thresholds of EARR and ERA-I. Nevertheless, on the basis of the FSS, the near-surface wind of EARR is statistically significantly better simulated than that of ERA-I for all thresholds at the 5% significance level, except at the 6 m s−1 threshold for a spatial scale of 12 km and the 8 m s−1 threshold for spatial scales of 12 and 60 km.
5. Summary and conclusions
A reanalysis is a high-quality climatic dataset that represents the best estimate of the atmospheric state, which is produced by assimilating long time series of observations with a consistent and state-of-the-art NWP model and DA system. Therefore, reanalyses have widespread applications in many research areas. Although most reanalyses have been produced for the global area, because of the limitation of the coarse resolution used in global reanalyses in analyzing regional-scale phenomena, the meteorological organizations in North America, Europe, the Arctic, and South Asia have produced or plan to produce regional reanalyses with higher resolution.
Because of the large uncertainties of global reanalyses at the regional scale (Bao and Zhang 2013; Chen et al. 2014) and the high population density in Asia, a high-resolution regional reanalysis is required to precisely analyze the regional climate in Asia. No regional reanalysis has been produced for East Asia, however. Therefore, in this study the East Asia Regional Reanalysis (EARR) is developed for 2013–14. EARR is based on the UM with 12-km horizontal resolution and 4DVAR, which has been an operational NWP model at the KMA since being adopted from the UKMO in 2011. Because reanalysis is an estimate of the atmospheric state, for practical applications of reanalysis, validation is vital. Thus, the characteristics and uncertainties of EARR for the period of 2013–14 are examined and are compared with ERA-I and observations.
In terms of skill scores, except for MSLP, the geopotential height at 850 and 1000 hPa, and the upper-air relative humidity, the EARR performance for most variables, such as wind, temperature, relative humidity, and geopotential height, is improved relative to ERA-I. In addition, for the near-surface wind and temperature, the SS in winter is higher than in summer.
The vertical distributions of the wind, temperature, and relative humidity RMSEs show that the EARR RMSEs are smaller than those of ERA-I (except for the upper-air meridional wind and relative humidity in January). Throughout all levels (between 1000 and 200 hPa), on average, the temperature RMSE of the EARR is improved by approximately 0.2 K relative to that of ERA-I. Furthermore, for the relative humidity, EARR provides improvements over ERA-I in January below 850 hPa and in July throughout all levels.
For the near-surface temperature, the EARR RMSEs are smaller by 0.36 K in January and 0.18 K in July than those of ERA-I. Furthermore, the EARR RMSEs in July are smaller by 0.56 K than those in January, which indicates that EARR can better simulate near-surface temperature in July than in January. In a similar way, for the near-surface wind, the EARR RMSEs are smaller by 0.91 m s−1 in January and by 0.64 m s−1 in July than those of ERA-I. In addition, the EARR RMSEs in July are smaller by 0.11 m s−1 than those in January. Therefore, both EARR and ERA-I produce more accurate near-surface wind and temperature fields in July than in January. In addition, EARR is able to better simulate the near-surface variables than ERA-I is, and the EARR improvement over ERA-I in terms of the reduced RMSE is larger in January than in July. The better representation of the near-surface variables in EARR than in ERA-I is due to the higher resolution of the forecast model and DA system, similar to Jermey and Renshaw (2016).
In terms of the correlation coefficients for the near-surface wind and temperature, EARR can provide a much improved representation of the near-surface variables that is more consistent with the observations than is that of ERA-I. Overall, the spatial distributions of the monthly mean near-surface wind and temperature from EARR for January and July 2014 are similar to those from ERA-I. EARR, however, is good at reproducing the finescale features of the near-surface variables in greater detail than is achieved by ERA-I.
The results of the KE spectra analysis show that at scales that are larger than 1000 km the EARR and ERA-I KE spectra have generally similar energy levels and slopes closer to −3. In contrast, for scales of less than 1000 km, because of the effective resolution difference caused by the higher resolution of EARR, the ERA-I KE spectra decrease dramatically as the wavenumber increases. As a result, the slope of the EARR spectra is relatively closer to −5/3 when compared with that of ERA-I, and therefore the EARR KE spectra are in better agreement with the canonical atmospheric KE spectra.
From triple collocation analysis, the variances of errors in EARR are estimated to be much lower than those for SYNOP and ERA-I for the near-surface wind and temperature, which can be reaffirmed by the scatter diagrams. In addition, using the FSS approach, the near-surface wind of EARR is statistically significantly better simulated than that of ERA-I for all thresholds at the 5% significance level, except at the 6 m s−1 threshold for a spatial scale of 12 km and the 8 m s−1 threshold for spatial scales of 12 and 60 km.
We conclude that the EARR near-surface variables are more accurate than those of ERA-I and that special care should be taken when using the EARR upper-air relative humidity. To address this weakness, advanced methods used for producing other reanalysis products could be applied when producing EARR for a longer period: for example, the variational bias correction that was used for generating ERA-I (Dee et al. 2011) and JRA-55 (Kobayashi et al. 2015), the cloud assimilation used for generating EURO4M (Renshaw et al. 2013), and the precipitation assimilation applied to NARR (Mesinger et al. 2006). More analysis will be conducted after EARR is produced for a longer period.
The authors thank three reviewers for their valuable comments. This study was supported by a National Research Foundation of Korea (NRF) grant funded by the South Korean government (Ministry of Science and ICT) (Grant 2017R1E1A1A03070968), the Korea Meteorological Administration Research and Development Program under Grant KMIPA 2015-5200, and an operation-oriented research project of the Numerical Modeling Center of the Korea Meteorological Administration. The authors thank Dr. Jun Kyung Kay for discussions at the earlier stages of the study and thank the Numerical Modeling Center and the National Center for Meteorological Supercomputer of the Korea Meteorological Administration and the Met Office for providing computer facility support and resources for this study.