Reanalyses have the potential to provide meteorological information in areas where few or no traditional observation records are available. The terrestrial branch of the water cycle of CFSR, MERRA, ERA-Interim, and NARR is examined over Quebec, Canada, for the 1979–2008 time period. Precipitation, evaporation, runoff, and water balance are studied using observed precipitation and streamflows, according to three spatial scales: 1) the entire province of Quebec, 2) five regions derived from a climate classification, and 3) 11 river basins. The results reveal that MERRA provides a relatively closed water balance, while a significant residual was found for the other three reanalyses. MERRA and ERA-Interim seem to provide the most reliable precipitation over the province. On the other hand, precipitation from CFSR and NARR does not appear to be particularly reliable, especially over southern Quebec, as these two reanalyses almost systematically showed the highest and the lowest values, respectively. Moreover, the partitioning of precipitation into evaporation and runoff from MERRA and NARR does not agree with what was expected, particularly over southern, central, and eastern Quebec. Despite the weaknesses identified, the ability of reanalyses to reproduce the terrestrial water cycle of the recent past (i.e., 1979–2008) remains satisfactory overall. Nonetheless, their potential to provide reliable information must be validated by comparing reanalyses directly with weather stations, especially in remote areas.
Since the end of the last century, interest in the study of climate change has grown considerably. Looking at observations or data produced by global climate models representing the recent past and analyzing them using statistics could be relevant to detecting possible trends regarding climate change. In the 1980s and 1990s, considerable improvements in weather forecasting and numerous upgrades of model and data assimilation methods contributed to the notion of reanalyzing the recent past (Bengtsson and Shukla 1988). Briefly, a reanalysis aims to provide the best estimation of the state of the atmosphere, ocean, and land surface from the recent past. The term “reanalysis” stands for “retroactive analysis,” since the analysis is done on a past period extending up to the near present. Reanalyses are three-dimensional gridded datasets produced by a weather forecasting model. Two main characteristics can define these datasets. First, a data assimilation scheme is used to integrate observations from different sources in order to provide the most coherent state of the atmosphere. Second, the assimilation scheme and the forecasting model of a reanalysis remain unchanged during the entire simulation period. As such, inconsistencies that might be induced by continuous updates of the data production system are avoided.
Many observations from different sources, such as radiosondes, aircraft, ships, satellites, surface sensors, and buoys, are assimilated in the production of a reanalysis. Moreover, the large range of available reanalysis data products provides information such as radiative fluxes, wind, temperature, humidity, precipitation, albedo, snow, vegetation, and land cover, to name just a few. The global coverage and the huge range of available variables with a consistent time and space resolution during the simulated period represent some of the main benefits that reanalyses provide to climate studies, along with a smaller bias and a finer spatial resolution from one generation of reanalyses to the next. Nevertheless, reanalyses also show some limitations. Indeed, the reliability of some variables may vary significantly in time and space. Moreover, changes in the number and quality of assimilated observations may introduce some artificial variability and trends. In addition, the water balance is rarely conserved, and reanalyses sometimes show substantial biases in variables, such as precipitation, that are not directly constrained by assimilation.
In the 1990s, the European Centre for Medium-Range Weather Forecasts (ECMWF) and the National Centers for Environmental Prediction (NCEP) began producing the first global reanalyses, the 15-yr ECMWF Re-Analysis (ERA-15; Gibson 1997) and the NCEP–National Center for Atmospheric Research (NCEP–NCAR) reanalyses, also known as R1 (Kalnay et al. 1996). Armed with awareness about the limitations of these first two datasets, a second generation of reanalyses was produced, namely, the NCEP–U.S. Department of Energy AMIP-II reanalysis, also known as R2 (Kanamitsu et al. 2002); the North American Regional Reanalysis (NARR) (Mesinger et al. 2006); the 40-yr ECMWF Re-Analysis (ERA-40; Uppala et al. 2005); and the Japanese 25-year Reanalysis Project (JRA-25; Onogi et al. 2007) developed by the Japan Meteorological Agency (JMA). Recently, a third generation of reanalyses was developed, including the Climate Forecast System Reanalysis (CFSR; Saha et al. 2010) from NCEP, the ECMWF interim reanalysis (ERA-Interim; Dee et al. 2011), and the Modern-Era Retrospective Analysis for Research and Applications (MERRA) produced by NASA (Rienecker et al. 2011).
In addition to their application to climate studies, reanalyses have the potential to provide climate information in areas that are sparsely inhabited or with limited surface observations (in space or time), for applications such as water resource management and hydrological modeling. For the latter, examining precipitation, evaporation, runoff, and the water balance for the terrestrial branch of the water cycle (Peixoto and Oort 1992) obtained through reanalyses should provide useful information.
Bukovsky and Karoly (2007) compared the precipitation of NARR, R2, and ERA-40 using a set of gridded observations. Exploring the spatial distribution and the diurnal and annual cycles, they revealed that NARR showed better results than the other two reanalyses over the continental United States. Nevertheless, these authors recommended that users should proceed cautiously when looking at the rest of the North American domain, especially along the U.S. borders and in southeastern Canada, where the overall NARR precipitation is strongly underestimated. Among all existing reanalyses, NARR is the only one that assimilates precipitation and in which the quality of the simulated precipitation strongly depends on the quality of observed data and on the assimilation process (Mesinger et al. 2006). Zhang et al. (2012) focused on the change in the global average of precipitation from CFSR during the period 1998–2001. They demonstrated that an interaction between the bias of the data assimilation model and the nonstationarity in the ingestion of some observed data is the source of the global average increase in the CFSR precipitation after 1998. Bosilovich et al. (2011) evaluated the water balance of MERRA over the entire globe. Since this reanalysis was configured to include in the water balance the residual generated by the data assimilation process, the water balance over the land and oceans is closed. However, a notable shift in annual water balance was identified, starting in 1999, which coincides with the beginning of Advanced Microwave Sounding Unit (AMSU) radiance assimilation. Sheffield et al. (2012) identified noteworthy differences in NARR evaporation and runoff, compared to two offline land surface model simulations [Noah, version 2.7.1, and Variable Infiltration Capacity model (VIC)], using observational runoff estimates over the continental United States. 
NARR (which uses a previous version of Noah) and Noah simulations present an overestimation of annual evaporation and runoff ratios (simulated runoff divided by observed runoff) that are 50% lower than the VIC simulation results. Regarding NARR, the authors identified these differences as being mainly related to the evaporation component of the Noah model, versus other factors such as atmospheric forcings or biases induced by precipitation assimilation into NARR. Lorenz and Kunstmann (2012) investigated the closure of the water balance (i.e., balance between precipitation, evaporation, surface runoff, and moisture flux) of ERA-Interim, MERRA, and CFSR over the entire globe and found that ERA-Interim is the reanalysis that likely provides the most reliable rainfall estimates globally, especially over regions with a dense network of observations. Moreover, they showed that, in the long-term mean, ERA-Interim and MERRA show a reasonable closure of the global surface water balance, as precipitation P minus evaporation E (i.e., P − E) over land equals the divergence of moisture E − P over the oceans. Furthermore, the change in the number of assimilated data in CFSR and MERRA around 1998, revealed by Zhang et al. (2012) for the case of CFSR, leads to a substantial imbalance between P − E over the land and oceans.
The province of Quebec in Canada has abundant freshwater resources. Its numerous lakes and rivers play a fundamental role in local wildlife and flora sustainability. On the other hand, there is great interest in managing water resources, particularly in terms of hydroelectricity production. Quebec has a surface area greater than 1.5 million km2 and is characterized by five different climate regimes (Bukovsky 2011). However, the spatial distribution of meteorological stations varies across the province, being dense in the south, decreasing to the north, and almost nonexistent in the far north. Therefore, reanalyses should be useful in providing substantial information, particularly in these remote regions with little observational data. This study focuses on the assessment of the components of the terrestrial branch of the water cycle (Peixoto and Oort 1992) of four recent reanalyses, CFSR, ERA-Interim, MERRA, and NARR, over the province of Quebec, and especially on the assessment of the reliability of the four reanalyses in representing the terrestrial water cycle components in the northern regions of the province. The analysis is divided into three parts. In the first part, the long-term mean water balance is examined over the entire territory of Quebec [the water balance will be further defined in section 2 using Eq. (1)]. Second, the mean annual cycle (expressed through long-term mean monthly values) of the terrestrial water cycle components and annual water balance are analyzed according to the climate classification of Bukovsky. Third, the precipitation, runoff, and E/P ratio (evaporation divided by precipitation) are examined over river basins representing the different hydrological regimes of the province. Section 2 details the methodology and the main characteristics of the datasets used in this study. Section 3 presents the results and discussion, and section 4 provides the concluding remarks.
2. Data and methods
The reanalysis and observational datasets that were used in this study are first described, and then the methodology follows.
a. Reanalysis datasets
The widely used NARR, CFSR, MERRA, and ERA-Interim were chosen to evaluate the terrestrial branch of the water cycle over the province of Quebec. These reanalyses benefited from improvements obtained from the preceding generation of reanalyses and thus represent the most advanced and suitable products available for carrying out this study. This section describes the relevant properties of each reanalysis as used in this study, including their similarities and differences. For the analysis, we selected the 1979–2008 period since the 30-yr period is widely used in climate studies. Moreover, the selected period starts in 1979, as it is the first year covered by the four selected reanalyses. All the references and main properties of the datasets used in this study are summarized in Table 1.
The recent global ERA-Interim includes an assimilation system based on a four-dimensional variational data assimilation (4D-VAR) approach, which is more complex and computationally intensive than the three-dimensional variational data assimilation (3D-VAR) used by CFSR, NARR, and MERRA, but more efficiently uses available observations (Dee et al. 2011). Most of the surface fields are available every 3 h, and every 6 h for the atmospheric fields, on a 0.75° regular grid (about 83-km horizontal resolution).
NARR, which covers the North American continent and parts of the North Atlantic and Pacific Oceans, has the particularity of assimilating observed precipitation, contrary to the other three reanalyses. In fact, before being assimilated, the observed precipitation in NARR is converted into latent heat. Additional details about how NARR is generated and about precipitation assimilation can be found in Mesinger et al. (2006). NARR outputs are available every 3 h, with a 32-km horizontal resolution (about 0.3°).
CFSR is the first coupled global atmosphere, ocean, land surface, and sea ice system reanalysis (Saha et al. 2010). As for ERA-Interim and MERRA, the atmosphere, ocean, land surface, and sea ice models share boundary data during the forecasting process. However, the analysis of each model component in CFSR, ERA-Interim, and MERRA is processed separately. Both CFSR and NARR use the Noah land surface model, but in CFSR, the Noah model has the particularity to be forced with observed precipitation, instead of using the precipitation generated by the atmospheric model, which is considered too biased (Saha et al. 2010). Furthermore, Meng et al. (2012, p. 1623) explained that “previous studies have shown nontrivial biases in the [Global Data Assimilation System] precipitation (Gottschalck et al. 2005). Such a bias over land often leads to biases in many simulated land surface variables.” Surface fields of CFSR are available at a resolution of 0.3° (about 32-km horizontal resolution) and every 6 h.
MERRA is a global reanalysis simulated on a ⅔° longitude and ½° latitude regular grid, with hourly outputs. The assimilation process of MERRA is similar to that of CFSR, which is the Gridpoint Statistical Interpolation analysis system, developed at NCEP. In addition, the main specificity of MERRA consists of the use of an incremental analysis update (IAU) procedure to improve water balance conservation [for further explanations, see Rienecker et al. (2011)].
b. Observational datasets
This section describes the Natural Resources Canada (NRCAN) observationally based gridded precipitation, as well as the Impact des changements climatiques sur l’hydrologie (Q) au Québec [(cQ)2] observed streamflow time series used as reference in this study. The NRCAN dataset provides daily Canada-wide precipitation and minimum and maximum temperatures on a 10-km-resolution regular grid (Hutchinson et al. 2009). It was originally produced to support studies requiring daily data, for instance, hydrological modeling, agricultural and forestry applications, extreme event analysis, etc. The gridded precipitation was obtained from the interpolation of measurements at individual stations, whose number varies in time from about 2000 to 3000 for the 1961–2010 period. The Australian National University Splines (ANUSPLIN) model was applied for the interpolation. It uses a trivariate thin plate smoothing splines method to model the spatial distribution of precipitation as functions of latitude, longitude, and elevation across Canada. Figure 1b shows the locations of the 638 weather stations available to generate the daily precipitation for the period 1979–2008 over the province of Quebec, provided by Environment Canada (now known as Environment and Climate Change Canada).
The (cQ)2 database contains daily outlet streamflows, names, and surface areas, as well as center and contour coordinates of 306 river basins over the province of Quebec (Arsenault and Brissette 2014). This database was jointly produced by Hydro-Quebec, Rio Tinto Alcan, and Centre d’expertise hydrique du Québec in order to unify water resources information and simplify access to hydrometric data.
Both observational datasets have some known limitations mainly related to streamflow measurements in (cQ)2 and to decreasing spatial density of weather stations used to produce the NRCAN dataset from southern to northern Quebec. For instance, over the Saguenay–Lac Saint-Jean and west of the Côte Nord regions, NRCAN shows inconsistent precipitation between 1995 and 1996, and significant underestimation from 2004 to 2008, due to the malfunction of a weather station (C. Guay 2015, personal communication). Moreover, measuring consistent streamflow time series in northern remote areas is challenging and may introduce some errors. Therefore, some of the (cQ)2 time series have been postprocessed to correct some inconsistencies, for instance, natural streamflow reconstruction over regulated river basins or measurement correction due to inaccurate measurements of river flows under ice cover. Despite the application of postprocessing techniques, some biases remain, and streamflow time series in the southern part of the province may be more reliable than those in the northern regions. Moreover, as the density of observed data integrated into the NRCAN decreases from south to north, the reliability of this dataset may be more questionable over northern regions.
c. Methodology
The terrestrial water balance of the reanalyses can be written following Eq. (1):

$$\frac{dW}{dt} = P - E - R - \mathrm{RES}, \tag{1}$$
where W is the surface water storage (mm), t is time (days), P is the precipitation (mm day−1), E is the total surface evaporation (mm day−1), and R is the total runoff (mm day−1), including the subsurface runoff. The term RES (mm day−1) is a residual that comes from the assimilation process. During this process, a new state of the atmosphere is generated, which is different from the one simulated by the model, but closer to observations. Atmospheric states are then discontinuous, which prevents the water balance from closing. In other words, the RES term may also be viewed as an estimate of the overall error in the water balance (Roads et al. 2003). Roads et al. (2003) computed annual mean (1996–99) surface variables of R1 and R2 over the Mississippi River basin and presented the RES values of these two reanalyses, which were equal to 0.592 and 0.255 mm day−1, respectively. In this study, Eq. (1) is rearranged into Eq. (2):

$$B = \frac{P - E - R}{P} \times 100, \tag{2}$$
where B is the relative water balance (% of P), which includes both RES and the change of surface water storage. Considering climatic time scales, the temporal change in surface water storage (soil water and snow water equivalent) is assumed to be negligible (Kleidon and Schymanski 2008). In this case, B equals RES. This assumption is not completely verified on an annual time scale. Nevertheless, Roads et al. (1998) concluded that the RES term should be the most important contribution in comparison with the change in water storage.
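As an illustration, the relative water balance can be sketched in a few lines of Python. This is a minimal sketch that assumes Eq. (2) takes the form B = 100 (P − E − R)/P, which is consistent with the sign conventions used in the results (high precipitation gives positive B, high runoff gives negative B). The function name and the sample values are hypothetical and are not taken from any reanalysis.

```python
def relative_water_balance(p, e, r):
    """Relative water balance B (% of P) from mean P, E, and R in mm/day.

    Assumes B = 100 * (P - E - R) / P; a value near 0 indicates a closed balance.
    """
    if p <= 0:
        raise ValueError("precipitation must be positive")
    return 100.0 * (p - e - r) / p


# Hypothetical long-term means (mm/day), for illustration only
b_closed = relative_water_balance(4.0, 2.0, 2.0)   # all of P leaves as E or R
b_surplus = relative_water_balance(4.0, 2.0, 1.0)  # P exceeds E + R
print(b_closed, b_surplus)
```

A positive B means that precipitation exceeds the sum of evaporation and runoff, and a negative B means the opposite.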
Precipitation, evaporation, and runoff from CFSR, ERA-Interim, MERRA, and NARR were first downloaded from their respective websites for the period from 1979 to 2008 (30 years). Each variable is aggregated to daily values expressed in mm day−1. The evaporation includes evaporation from the bare soil, transpiration from vegetation, interception loss, and snow sublimation. For CFSR, the evaporation component is derived from the latent heat flux (W m−2), as the evaporation is not directly available. Equation (3) was used to compute the evaporation:

$$E = C\,\frac{H_f}{L_e(T)}, \tag{3}$$
where E is the evaporation (mm day−1), Hf is the latent heat flux (W m−2), Le(T) is the latent heat of vaporization (J kg−1), and C is a constant to convert the evaporation to a daily time step (C = 86 400 s day−1). Although the latent heat of vaporization depends on the surface temperature T, its influence on the evaporation can be neglected (Lorenz and Kunstmann 2012). Therefore, the latent heat of vaporization is approximated to 2.5 MJ kg−1, and Eq. (3) can be simplified as follows:

$$E = \frac{86\,400}{2.5 \times 10^{6}}\,H_f \approx 0.0346\,H_f. \tag{4}$$
In the case of the NARR and MERRA datasets, base flow and surface runoff, which are provided separately, are summed to compute the total runoff. The analysis that was carried out in this paper is divided into three parts, according to the three different spatial scales that are considered. Long-term and annual time scales have also been computed, as introduced hereafter.
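The two preprocessing steps described above, the conversion of latent heat flux to evaporation and the summation of runoff components, can be sketched as follows. This is a minimal sketch under the constant latent heat assumption stated in the text (2.5 MJ kg−1), with 1 kg m−2 of water taken as 1 mm of depth. The function and constant names are hypothetical.

```python
SECONDS_PER_DAY = 86_400.0        # C in Eq. (3), s/day
LATENT_HEAT_VAPORIZATION = 2.5e6  # J/kg, constant approximation of Le(T)


def latent_heat_flux_to_evaporation(hf_w_m2, le_j_kg=LATENT_HEAT_VAPORIZATION):
    """Convert a latent heat flux (W/m2) to evaporation (mm/day).

    W m-2 divided by J kg-1 gives kg m-2 s-1, i.e. mm of water per second.
    """
    return SECONDS_PER_DAY * hf_w_m2 / le_j_kg


def total_runoff(surface_runoff, base_flow):
    """Total runoff (mm/day) when the two components are provided separately."""
    return surface_runoff + base_flow


# Hypothetical values, for illustration only
print(latent_heat_flux_to_evaporation(100.0))  # evaporation for 100 W/m2
print(total_runoff(0.5, 1.0))
```

With the constant latent heat, the conversion factor is 86 400 / 2.5 × 10^6 ≈ 0.0346 mm day−1 per W m−2.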
1) Long-term mean of the water cycle components over the province of Quebec
In the first part of this study, precipitation, evaporation, runoff, and the relative water balance from the four reanalyses are averaged from daily values over the period 1979–2008 for each reanalysis tile. First, P, E, and R are averaged annually (mm day−1) from 1 October to 30 September of the following year. In this way, the snowpack is assumed to accumulate, melt completely, and be transferred to runoff within the same year. Therefore, there are 29 annual values, since the periods from 1 January to 30 September 1979 and from 1 October to 31 December 2008 are not considered. Values of P, E, and R are then averaged during the entire period on each tile following Eq. (5):

$$X = \frac{1}{n} \sum_{i=1}^{n} x_i, \tag{5}$$
where X (mm day−1) is the averaged variable P, E, or R; xi is the value of P, E, or R at year i; and n is the total number of years in the period 1979–2008 (29 values, as explained above). The relative water balance B is computed from the averaged values of P, E, and R using Eq. (2). To consider only data over the land, land–sea masks of the four reanalyses are applied. As MERRA natively provides land-cover fractions, a threshold of 0.5 was used to distinguish the land from the water surfaces.
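The water-year averaging described above can be sketched as follows for a single tile. This is a minimal sketch that assumes a complete daily series from 1 January 1979 to 31 December 2008; the helper names are hypothetical, and a constant synthetic series stands in for real reanalysis data.

```python
from collections import defaultdict
from datetime import date, timedelta


def water_year(d):
    """Label a date by the calendar year in which its 1 Oct - 30 Sep year ends."""
    return d.year + 1 if d.month >= 10 else d.year


def water_year_means(dates, values, first_year=1979, last_year=2008):
    """Mean daily value (mm/day) for each complete water year.

    The partial periods Jan-Sep of first_year and Oct-Dec of last_year are dropped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, v in zip(dates, values):
        wy = water_year(d)
        sums[wy] += v
        counts[wy] += 1
    complete = range(first_year + 1, last_year + 1)
    return {wy: sums[wy] / counts[wy] for wy in complete if counts[wy] > 0}


def long_term_mean(annual_means):
    """Eq. (5): arithmetic mean of the n annual values."""
    return sum(annual_means.values()) / len(annual_means)


# Synthetic daily series (constant 2.0 mm/day), for illustration only
start = date(1979, 1, 1)
n_days = (date(2008, 12, 31) - start).days + 1
dates = [start + timedelta(days=i) for i in range(n_days)]
values = [2.0] * n_days

annual = water_year_means(dates, values)
x_bar = long_term_mean(annual)
print(len(annual), x_bar)
```

Labelling each water year by the year in which it ends is only a convention chosen for this sketch; any consistent labelling yields the same 29 annual values.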
2) Water cycle components over the climatic regions
In the second part of this study, spatial averages of P, E, R, and B are computed over subregions of Quebec, according to the climate classification of Bukovsky (2011). This classification was created to provide a consistent climate division, as part of the North American Regional Climate Change Assessment Program (NARCCAP). The climate classification is based on a simplification of ecoregions of Ricketts et al. (1999). According to this classification, Quebec is divided into five climatic regions: Great Lakes (GL), North Atlantic (NA), East Boreal (EB), East Taiga (ETA), and East Tundra (ETU). Figure 2a shows the five climatic regions over the province of Quebec, at the resolution of the CFSR dataset. The annual cycles of P, E, and R are produced (one value per month per variable) following Eq. (6):

$$Y = \frac{1}{k\,p} \sum_{i=1}^{k} \sum_{j=1}^{p} x_{ij}, \tag{6}$$
where xij is the value of P, E, or R (mm day−1) on day i and on tile j, p is the number of reanalysis tiles within a climatic region, and k is the total number of days in a particular month throughout the entire 1979–2008 period (e.g., k = 930 for January). Finally, one value (Y; mm day−1) is computed for each month. The NRCAN precipitation is used as the reference for precipitation, and its mean annual cycle is computed in the same way as for the reanalyses.
The relative water balance of reanalyses is averaged annually (one value per year) for each climatic region, following the same methodology as for the preceding calculation over all of Quebec: annual average values of P, E, and R are first computed using Eq. (6), except that k is now the number of days within the period from 1 October to 30 September of the following year. The relative water balance is then calculated using Eq. (2) for each year.
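The mean annual cycle of Eq. (6) can be sketched as a double average over days and tiles. This is a minimal sketch that assumes the same number of tiles on every day and uses synthetic values; the function name and the data layout (one list of tile values per day) are hypothetical.

```python
from datetime import date, timedelta


def monthly_climatology(dates, fields):
    """Mean annual cycle: one value per calendar month (mm/day).

    fields[i] holds the p tile values of a region on dates[i]. For each month,
    the average runs over all k days of that month in the period and all p tiles.
    Returns (monthly_means, days_per_month).
    """
    sums = [0.0] * 12
    value_counts = [0] * 12  # k * p for each month
    day_counts = [0] * 12    # k for each month
    for d, tiles in zip(dates, fields):
        m = d.month - 1
        sums[m] += sum(tiles)
        value_counts[m] += len(tiles)
        day_counts[m] += 1
    means = [s / c if c else float("nan") for s, c in zip(sums, value_counts)]
    return means, day_counts


# Synthetic region with p = 3 tiles; each value equals the month number
start = date(1979, 1, 1)
n_days = (date(2008, 12, 31) - start).days + 1
dates = [start + timedelta(days=i) for i in range(n_days)]
fields = [[float(d.month)] * 3 for d in dates]

cycle, k = monthly_climatology(dates, fields)
print(k[0], cycle[0], cycle[11])
```

For January, the day count over 1979–2008 is 30 × 31 = 930, which matches the value of k quoted in the text.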
3) Water cycle components within river basins
In the third part of this study, P, R, and the E/P ratio are computed annually over 11 selected river basins from the (cQ)2 database using Eq. (6), in which p is the number of dataset grid points within a river basin and k is the number of days within the period from 1 October to 30 September of the following year. Spatial averages are computed as follows: 1) when at least four dataset grid points are located inside the contour of a river basin (or aggregated river basin), a simple arithmetic mean is used; 2) otherwise, the Thiessen polygon method is applied (Rhynsburger 1973), attributing specific weights to the four closest grid points inside or outside the river basin(s) contour. Table 2 presents the number of grid points of the five datasets located inside the contour of the selected river basins. The river basins were selected according to three main criteria: 1) their representativeness of the six major hydrologic regimes of Quebec, namely South, Center, North, Gaspésie, Côte Nord, and Arctic; 2) the longest period without any missing data, since the (cQ)2 streamflow time series contain missing data, especially during the winter low-flow period; and 3) the longest temporal coverage. Rivers in Arctic and North regions flow to the west and north, whereas those in Côte Nord, Gaspésie, South, and Center flow into the St. Lawrence River. These regimes correspond fairly well with the Bukovsky climatic regions, with a distinction being made between the Center and Côte Nord, as compared to the unique East Boreal climate region. Each hydrological regime is represented by two river basins, except for the Arctic, where only one river basin showed consistent observed streamflows. 
Furthermore, one of the two river basins in the South is formed by three small aggregated river basins; their streamflow time series were spatially aggregated, while their surface areas were summed up, which increases the number of reanalysis grid points within the river basin, providing more relevant information.
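The Thiessen weighting used for small basins can be approximated numerically, as sketched below. This is an illustrative sketch and not the procedure of the study: it assumes planar coordinates, represents the basin by evenly spaced sample points lying inside its contour, selects the four grid points closest to the basin centroid (the selection rule is an assumption), and weights each one by the fraction of sample points for which it is the nearest. All names and values are hypothetical.

```python
import math


def thiessen_weights(grid_points, basin_samples, n_nearest=4):
    """Approximate Thiessen weights for the n_nearest grid points of a basin.

    grid_points: list of (x, y) grid point coordinates.
    basin_samples: list of (x, y) points evenly covering the basin interior.
    Returns {grid_index: weight}, with weights summing to 1.
    """
    cx = sum(x for x, _ in basin_samples) / len(basin_samples)
    cy = sum(y for _, y in basin_samples) / len(basin_samples)
    by_distance = sorted(
        range(len(grid_points)),
        key=lambda i: math.dist(grid_points[i], (cx, cy)),
    )
    nearest = by_distance[:n_nearest]
    counts = {i: 0 for i in nearest}
    for s in basin_samples:
        owner = min(nearest, key=lambda i: math.dist(grid_points[i], s))
        counts[owner] += 1
    return {i: c / len(basin_samples) for i, c in counts.items()}


def basin_average(values, weights):
    """Weighted basin mean of one value per grid point."""
    return sum(values[i] * w for i, w in weights.items())


# Hypothetical square basin sampled on a 10 x 10 lattice, with five grid points
samples = [(0.05 + 0.1 * i, 0.05 + 0.1 * j) for i in range(10) for j in range(10)]
grid = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
weights = thiessen_weights(grid, samples)
mean_p = basin_average([1.0, 2.0, 3.0, 4.0, 100.0], weights)
print(weights, mean_p)
```

In this symmetric example each of the four corner grid points owns one quarter of the basin, and the distant fifth point is ignored. An exact implementation would intersect the Thiessen polygons with the basin contour rather than counting sample points.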
Figure 2b shows the hydrologic regions and the selected river basins of (cQ)2 over Quebec. Among the 306 river basins, some are subbasins and are not represented in Fig. 2b. To further investigate precipitation from the four reanalyses, the temporal correlation coefficient between the daily time series of precipitation from each reanalysis and from NRCAN is computed over the river basins of interest. Moreover, the distribution of the daily precipitation intensities from CFSR, ERA-Interim, MERRA, NARR, and NRCAN over the 11 river basins for the period 1979–2008 is also computed for winter [December–February (DJF)] and summer [June–August (JJA)]. Each daily precipitation value is assigned to one of eight bins, from 0.25–1 mm day−1 to 64–128 mm day−1, and the bins are presented as the percentage of their contribution to the total amount of seasonal precipitation. A threshold of 0.25 mm day−1 is applied to separate dry days from wet days. For the five datasets, no spatial aggregation is applied: each grid point contributes its own daily precipitation value. Regarding the runoff analysis, runoff estimates are calculated using the (cQ)2 streamflows relative to the surface area of each of the 11 river basins. Moreover, NRCAN precipitation is used as reference data, as well as an estimated E/P ratio derived from NRCAN precipitation and (cQ)2 streamflows. To calculate the reference ratio, annual streamflows (m3 s−1) are first converted into runoff (mm day−1) according to the surface area of each river basin. Assuming that the water balance of observed data is closed, an estimation of evaporation is computed using Eq. (7):

$$E = P - R, \tag{7}$$
where E is the evaporation estimate derived from observations (mm day−1), P is the NRCAN precipitation (mm day−1), and R is the runoff (mm day−1) derived from observed streamflows. As for the preceding calculations, Eq. (7) is used on an annual basis from 1 October to 30 September of the following year. The E/P ratio of estimated observations is then computed, dividing E by the NRCAN precipitation. Since streamflow time series differ from one river basin to another, E/P is calculated only during the available streamflow periods. Finally, those computations are supplemented with the analysis of the distribution of the annual mean precipitation and runoff (mm day−1) and of the E/P ratio previously calculated from CFSR, ERA-Interim, MERRA, NARR, and NRCAN, from 1979 to 2008 over the 11 selected river basins.
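The observation-based reference quantities and the intensity binning described above can be sketched as follows. This is a minimal sketch under several assumptions: the closed observed water balance E = P − R, bin edges that double from 1 to 128 mm day−1 (the text gives only the first and last of the eight bins), and contributions expressed relative to the binned wet-day total. All function names and sample values are hypothetical.

```python
SECONDS_PER_DAY = 86_400.0
# Assumed edges of the eight intensity bins (mm/day)
BIN_EDGES = [0.25, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]


def streamflow_to_runoff(q_m3_s, area_km2):
    """Convert a mean streamflow (m3/s) to basin runoff depth (mm/day)."""
    area_m2 = area_km2 * 1.0e6
    return q_m3_s * SECONDS_PER_DAY / area_m2 * 1000.0


def observed_ep_ratio(p_mm_day, r_mm_day):
    """E/P ratio with E estimated as P - R (closed observed water balance)."""
    return (p_mm_day - r_mm_day) / p_mm_day


def bin_contributions(daily_precip):
    """Percentage of the binned precipitation total falling in each bin.

    Days below 0.25 mm/day are treated as dry; days at or above 128 mm/day
    are ignored in this sketch.
    """
    totals = [0.0] * (len(BIN_EDGES) - 1)
    for v in daily_precip:
        for b in range(len(totals)):
            if BIN_EDGES[b] <= v < BIN_EDGES[b + 1]:
                totals[b] += v
                break
    wet_total = sum(totals)
    if wet_total == 0.0:
        return [0.0] * len(totals)
    return [100.0 * t / wet_total for t in totals]


# Hypothetical basin and values, for illustration only
runoff = streamflow_to_runoff(100.0, 8640.0)
ratio = observed_ep_ratio(3.0, runoff)
shares = bin_contributions([0.1, 0.5, 1.5, 3.0, 5.0])
print(runoff, ratio, shares)
```

Because runoff is expressed as a depth over the basin area, it can be compared directly with gridded precipitation in mm day−1.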
3. Results and discussion
a. Long-term mean of the water cycle components over the province of Quebec
Figure 3 shows the long-term mean of precipitation, evaporation, and runoff for the four reanalyses over the province of Quebec, as well as the contour of the river basin Manic-5, in which Manicouagan Lake is located. Overall, the spatial distributions of precipitation from ERA-Interim and MERRA are quite similar, whereas precipitation from CFSR generally reaches the highest values among the four datasets (up to 5 mm day−1 over the south of the province). Moreover, the underestimation of precipitation from NARR highlighted by Bukovsky and Karoly (2007) at the U.S.–Canadian border is evident in Fig. 3. Regarding evaporation, MERRA shows higher values in the center and in the south relative to the other three reanalyses (2.5–3 mm day−1), while NARR and ERA-Interim show the lowest values in the north of the province (0–1 mm day−1). CFSR and NARR show high runoff values close to Manicouagan Lake, especially NARR, which reveals values above 3 mm day−1 (blue spot on the NARR runoff map in Fig. 3), while runoff from the other three reanalyses ranges from 0.5 to 2.5 mm day−1. Furthermore, ERA-Interim shows the highest runoff values in the center and the south, while runoff values from MERRA and NARR drop to almost zero in the west and south, respectively.
Figure 4 shows the closure of the water balance using the long-term mean B values computed for CFSR, ERA-Interim, MERRA, and NARR over Quebec. As expected, the water balance of MERRA is closed, with values of B almost equal to 0%. Conversely, the water balances of CFSR, NARR, and ERA-Interim are not closed, with values of B different from 0%. Over the entire province, CFSR presents positive values of B from 30% to 50%. These noticeable results are mainly related to the high values of CFSR precipitation illustrated in Fig. 3. It is worth recalling that CFSR uses observed precipitation instead of that simulated to force its surface scheme, which introduces some imbalance between evaporation, runoff, and the model-generated precipitation. Over central and southern Quebec, ERA-Interim shows B values between −30% and −10%, and a relatively closed water balance with B close to 0% over the rest of the province. NARR presents positive B values of about 20% in the far north latitudes, negative B values in the center, and reaches its lowest negative values in the south. The underestimations of NARR precipitation at the U.S.–Canadian border and high runoff values close to Manicouagan Lake illustrated in Fig. 3 lead to significantly underestimated B values (from −70% to −90%).
b. Water cycle components over the climatic regions
Figure 5 shows relative water balance, runoff, evaporation, and precipitation of the four reanalyses over the five climatic regions of Bukovsky, with the addition of the NRCAN dataset for the case of precipitation. Mean annual cycles are shown for runoff, evaporation, and precipitation, whereas mean annual values are shown for the relative water balance (one value per year) from 1979 to 2008.
As already seen in Fig. 3, precipitation from CFSR is generally higher than that from the other four datasets over the five climatic regions, except in the summer months over the GL and NA regions (Fig. 5). Precipitation from NARR is lower than for MERRA, ERA-Interim, and CFSR, especially over the EB, GL, and NA regions (0.5–1 mm day−1 below the values of the other reanalyses). Nevertheless, although precipitation values may differ greatly depending on the dataset, the temporal distributions are quite similar over the EB, ETA, and ETU regions, which is in agreement with the precipitation from NRCAN. Over the GL and NA regions, the precipitation values from the different datasets show discrepancies, mainly in the summer (JJA).
Regarding evaporation, MERRA summer peak values are systematically higher than those from the other reanalyses, particularly over the EB, ETA, and ETU regions, reaching 4.3, 3.5, and 3 mm day−1, respectively. Over the NA region, the evaporation values from MERRA are also very close to those from NARR, with summer peaks being 1.4 times higher than those from ERA-Interim and CFSR. Overall, the evaporation estimates are questionable, as summer maximum values may double or triple from one reanalysis to another, depending on the climatic region. The general north–south gradients seen in Fig. 3 for precipitation and evaporation are also reproduced in the mean annual cycles, as precipitation and evaporation generally tend to decrease from southern to northern regions, with precipitation from CFSR being an exception, as already pointed out, and precipitation over the GL region being lower than that over the NA region. Such gradients agree well with the known gradients in precipitation and evaporation across the province, also illustrated by Natural Resources Canada (2014, 2015).
Runoff values from ERA-Interim are higher than those from MERRA, NARR, and CFSR all year long for the NA and GL regions, and during fall and/or winter seasons for the EB, ETA, and ETU regions (1–2 mm day−1 more than the other reanalyses). For instance, over the GL region, the ERA-Interim peak values during the spring are 2.5 times higher than those from the other three datasets. On the other hand, MERRA, NARR, and CFSR agree well over this region. Likewise, MERRA, NARR, and CFSR show similar runoff values over the NA region, whereas those for ERA-Interim are substantially higher. Moreover, MERRA presents higher maxima over ETA and ETU (4.3 and 3.6 mm day−1, respectively) compared with ERA-Interim, NARR, and CFSR values. Generally, runoff values from the different reanalyses show more disagreement over the EB, ETA, and ETU regions, as the low- and high-flow seasons show a large range of values.
The relative water balance values from the four reanalysis datasets cover a range from −50% to +50% of precipitation, except for the NARR dataset in the NA region, which will be discussed further. The high precipitation values and average values of evaporation and runoff from CFSR induce positive values of relative water balance (about 25%), higher than for MERRA, ERA-Interim, and NARR over the five climatic regions. In some years, the underestimation of precipitation from NARR leads to negative values of B over the five regions, especially over NA in the early 2000s, where B values are very low, below −150%. This may have been induced by the high evaporation values during summer, combined with the low precipitation of NARR in this region, which includes the U.S.–Canadian border where precipitation from NARR is strongly underestimated (section 3a). Despite having the highest evaporation maxima, MERRA shows a closed water balance with B values around 0% over the five climatic regions, as expected (section 2a). Regarding the ETA and ETU regions, ERA-Interim reveals a relatively closed water balance with B values close to 0%, compared with NARR, and particularly CFSR. However, the higher runoff values from ERA-Interim over the EB, GL, and NA regions tend to produce a negative relative water balance (about −15%) for most years. Globally, the reanalyses do not agree in terms of relative water balance, while some of their water cycle components present some similarities. On the other hand, although MERRA was designed to close its water balance, it also reveals some overestimation of different water cycle components (with respect to the other reanalyses, and with respect to NRCAN observed precipitation mostly in ETA and ETU regions).
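To make the sign and magnitude of B easier to interpret, the following is a minimal sketch of how a relative water balance could be computed for one year. It assumes that B is the residual P − E − R expressed as a percentage of P, which is consistent with the "percentage of precipitation" wording above but should be checked against the definition in section 2a. The function name and the sample values are hypothetical and are not taken from any of the reanalyses.

```python
import math

def relative_water_balance(p, e, r):
    """Relative water balance B (% of precipitation) for one year.

    Assumed definition: B = 100 * (P - E - R) / P, with P, E, and R the
    annual mean precipitation, evaporation, and runoff (mm/day).
    Returns NaN when P is not positive.
    """
    if p <= 0:
        return math.nan
    return 100.0 * (p - e - r) / p

# Illustrative values only (mm/day), not taken from any reanalysis.
years = {
    "closed": (3.0, 1.5, 1.5),   # E + R equals P
    "surplus": (4.0, 1.5, 1.5),  # high P leaves a positive residual
    "deficit": (2.0, 2.5, 2.5),  # low P combined with high E and R
}
for label, (p, e, r) in years.items():
    print(label, relative_water_balance(p, e, r))
```

Under this assumed definition, a value below −100% can only occur when evaporation plus runoff exceeds twice the precipitation, which is the kind of combination (low precipitation with high evaporation) invoked above for NARR over the NA region.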
c. Water cycle components within river basins
Mean annual precipitation values from the four reanalyses and the NRCAN dataset were computed from 1979 to 2008 (Fig. 6a) as spatial averages over the 11 selected river basins (RBs; section 2c and Fig. 2b). As already seen in the previous sections, precipitation from CFSR is generally higher than that of the other datasets over the 11 river basins, including NRCAN. Among the four reanalyses, MERRA and ERA-Interim seem to be the least biased with respect to the NRCAN observations. On the other hand, NARR presents significant underestimations, as compared to ERA-Interim and MERRA, especially over RB 1 and 6 (1.5–2 mm day−1 less). Furthermore, a sudden change in precipitation from NARR around 2003 is clearly noticeable in RB 1–6 (which are located in the GL climatic region for RB 1, in the EB region for RB 2–4, and in the NA region for RB 5 and 6). Figure 6b shows the distribution of the yearly mean annual precipitation from the four reanalyses and NRCAN datasets over the river basins. Globally, median values from the five datasets differ more from one another over the southern river basins than over the northern ones. On the other hand, the dispersion of annual values from the five datasets is quite similar. The sudden change in precipitation from NARR around 2003 is also noticeable in Fig. 6b over the affected river basins (outlier values).
Looking at Fig. 7, the spatial distribution of the mean precipitation from NARR is quite different between the 1979–2002 and 2003–08 periods. Indeed, the bias toward low precipitation values over the U.S.–Canadian border between 1979 and 2002 vanished between 2003 and 2008. In southern Quebec, there is significantly more precipitation after 2003 than before. This could likely be related to the update of the NARR assimilation system in April 2003 and to the change in the number and nature of assimilated observations (Mesinger et al. 2006).
Table 3 shows the correlation between daily time series of the spatially averaged precipitation from each of the four reanalyses and NRCAN over the selected river basins (section 2c). Despite the global overestimation of precipitation from CFSR, the correlations over all river basins are similar to those from MERRA and ERA-Interim. On the other hand, NARR presents low or inconsistent correlations compared with MERRA, ERA-Interim, and CFSR, ranging from −0.33 to 0.62. The negative value of NARR over RB 10 (−0.13) is not surprising when considering the low correlations of the other three datasets. However, the negative correlation value from NARR over RB 4 (−0.33) is probably related to the sudden shift in precipitation happening around 2003, discussed above (Fig. 7). Nevertheless, these results must also be taken with caution since the quality of the precipitation from NRCAN is questionable in the northern part of the province, as the gridded data have been interpolated from very few weather stations (section 2b). For further analysis, it could be relevant to perform a direct comparison between reanalysis grid points and measurements from the closest weather station in order to avoid biases induced by the generation of gridded observation datasets. In any case, a correlation between daily time series of about 0.8 or more should be sufficient to indicate that reanalyses represent the climate over the river basins of interest quite well, especially in southern Quebec, and therefore may be useful for hydrological modeling purposes.
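The correlations discussed above can be illustrated with a short sketch. It assumes that the Pearson coefficient is the statistic reported in Table 3, which is not stated explicitly in this section. The function name, the handling of missing values, and the sample series are hypothetical.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation between two equal-length daily series.

    Pairs in which either value is missing (None or NaN) are skipped,
    since gridded observations may contain gaps.
    """
    pairs = [
        (a, b) for a, b in zip(x, y)
        if a is not None and b is not None
        and not math.isnan(a) and not math.isnan(b)
    ]
    n = len(pairs)
    if n < 2:
        return math.nan
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return math.nan  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)

# Illustrative basin-averaged series (mm/day): the second is twice the first.
reanalysis = [0.0, 2.0, 5.0, 1.0, 0.0]
observed = [0.0, 4.0, 10.0, 2.0, 0.0]
print(pearson_correlation(reanalysis, observed))
```

Because the coefficient is insensitive to a constant multiplicative bias, a dataset that systematically overestimates precipitation can still correlate well with observations, which is consistent with the CFSR result above.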
Figure 8 shows the distribution of the daily precipitation intensities from CFSR, ERA-Interim, MERRA, NARR, and NRCAN, over the 11 river basins, for the period 1979–2008. Some of the main characteristics of the precipitation pattern over the province are well depicted in Fig. 8. For instance, there are more dry days in summer than in winter, and precipitation intensities are lower in winter, whereas the most extreme events generally happen during summer. On the other hand, reanalysis datasets underestimate the percentage of dry days compared with the NRCAN dataset. This remark is not valid for NARR in southern Quebec (RB 1–6), as this dataset tends to underestimate the precipitation, compared with the four other datasets (Figs. 5–7). Moreover, CFSR shows the lowest percentage of dry days among the four reanalyses in winter. However, over the northern river basins in winter, NRCAN presents fewer dry days than the four reanalyses. One has to keep in mind that, in this part of the province, the density of weather stations (from which NRCAN has been generated) is very low (Fig. 1b). In summer, NARR, which assimilates precipitation during the analysis process, shows quite similar distributions of precipitation, compared with NRCAN, except over RB 1, 5, 6, and 9. Globally, among the four reanalyses, CFSR presents the highest percentage of the contribution of extreme events (16–128 mm day−1) to the total amount of precipitation during winter and summer.
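The two diagnostics used in this paragraph, the percentage of dry days and the contribution of intensity classes to the total precipitation, could be computed along the following lines. This is a sketch under stated assumptions: the dry-day threshold (1 mm day−1) and the power-of-two class edges are hypothetical choices that are merely compatible with the 16–128 mm day−1 range quoted above, and the function name is illustrative.

```python
def intensity_summary(daily_precip, dry_threshold=1.0,
                      edges=(1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0)):
    """Summarize a daily precipitation series (mm/day).

    Returns the percentage of dry days (values below dry_threshold) and,
    for each class [edges[i], edges[i+1]), its percentage contribution to
    the total precipitation amount. The threshold and the class edges are
    assumptions. Values below edges[0] or at/above edges[-1] count toward
    the total but belong to no class.
    """
    n_days = len(daily_precip)
    if n_days == 0:
        raise ValueError("empty series")
    n_dry = sum(1 for v in daily_precip if v < dry_threshold)
    dry_pct = 100.0 * n_dry / n_days
    total = sum(daily_precip)
    contributions = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        amount = sum(v for v in daily_precip if lo <= v < hi)
        contributions[(lo, hi)] = 100.0 * amount / total if total > 0 else 0.0
    return dry_pct, contributions

# Illustrative 10-day series (mm/day), not taken from any dataset.
series = [0.0, 0.5, 3.0, 20.0, 0.0, 40.0, 6.0, 0.0, 10.5, 0.0]
dry_pct, contrib = intensity_summary(series)
extreme_pct = sum(pct for (lo, _), pct in contrib.items() if lo >= 16.0)
print(dry_pct, extreme_pct)
```

To reproduce a seasonal comparison such as the winter and summer panels, the same function would be applied separately to the days of each season.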
Figure 9a shows the mean annual runoff from CFSR, ERA-Interim, MERRA, NARR, and the runoff estimates from (cQ)2 over the 11 river basins. The streamflow records for the 11 selected river basins do not completely cover the 1979–2008 period, so the runoff calculation periods differ among the river basins. Despite the high values of runoff from ERA-Interim over GL, NA, and EB climatic regions (Fig. 3), this reanalysis shows runoff values similar to the runoff estimates at the watershed scale, with greater differences over RB 3, 4, and 9. On the other hand, CFSR and MERRA generally reveal comparable amounts of runoff water (from 0.5 to 1 mm day−1) that are systematically lower than those from ERA-Interim and the runoff estimates (about 1–1.5 mm day−1 less). Over RB 10, CFSR and NARR show high runoff values (3 and 5.5 mm day−1, respectively) in the period 1979–84 (Fig. 9a). This river basin encompasses Manicouagan Lake, where high local values of runoff were also highlighted in Fig. 3 for the case of NARR and CFSR. As this result is common to these two reanalyses, one can assume that their common surface model Noah may be involved in the production of those high runoff values. Figure 9b shows the distribution of the yearly mean annual runoff from the four reanalyses and the runoff estimates over the river basins. In agreement with the previous results, runoff median values of the four reanalyses differ from that of NRCAN over all river basins, except for ERA-Interim. The dispersion of annual values from CFSR, ERA-Interim, and MERRA is relatively similar to that of NRCAN, except for RB 1, 5, 6, and 10 for CFSR; RB 1, 5, and 6 for MERRA; and RB 3, 5, and 8 for ERA-Interim. Regarding NARR, the dispersion of annual values is systematically different from that of NRCAN, except for RB 11.
Figure 10a shows the mean annual E/P ratio from CFSR, ERA-Interim, MERRA, NARR, and the observational estimates over the 11 river basins. As for the runoff results, the E/P calculation periods differ among the river basins. Considering the quality of the precipitation from NRCAN and of some (cQ)2 streamflow time series, one can expect the observational estimates to sometimes be questionable (section 2b). Therefore, the observational estimates should not always be considered as the absolute truth, but rather, as another dataset for the purpose of analysis. Over southern and central Quebec, evaporation is expected to be about half of precipitation, which means an E/P ratio around 0.5 (Fisheries and Environment Canada 1978; Wang et al. 2014). Looking at Fig. 10, the observational estimates ratio approaches 0.5 over RB 1 and 3, and is moderately below 0.5 over RB 2. Looking more closely at RB 4 and 10, the observational estimates ratio drops in the early 2000s (Fig. 10a). This drop is also noticeable in Fig. 6a, where, starting in 2004, the precipitation from NRCAN shows a slight downward trend, related to the quality issue of this precipitation dataset (discussed in section 2b) over Lac Saint-Jean and west of the Côte Nord region, where RB 4 and 10 are located, respectively. For all river basins, MERRA and NARR show the highest E/P ratio values, especially NARR, which even exceeds 1 over the South, Center, and Gaspésie basins, a result that is not physically consistent. The high ratio values from NARR over RB 6 (and to a smaller extent, RB 1 and 2) are directly related to the underestimation of precipitation close to the U.S.–Canadian border, which extends through the river basins of Gaspésie (Figs. 2b, 7). Ratio values from MERRA approach (and sometimes even exceed) 1 for the river basins in the South, Gaspésie, and Center regions. This means that almost all the precipitation evaporates, which implies that there is almost no runoff in these regions for MERRA.
This particularity is also illustrated in Fig. 5, where MERRA shows the lowest runoff and the highest evaporation over the southern climatic regions in summer. Rienecker et al. (2011) highlighted deficiencies in MERRA that lead to the immediate evaporation of much of the rainfall and consequently limit the surface runoff. Regarding ERA-Interim and CFSR, the E/P ratios are quite consistent with the expected value of 0.5 over the South and Center regions, except for some higher values in RB 1 between 1994 and 2008. Globally, the E/P ratios of the reanalyses show a steadier evolution during the 30 years over the North and Côte Nord basins versus in the South, Gaspésie, and Center basins. However, the observational estimates ratios are most often lower than for the four reanalyses in the northern regions. In fact, the general limitations (discussed in section 2b) of the observed streamflow and the precipitation time series reduce the credibility of the evaporation observational estimates in the northern part of the province. Over RB 11 in the Arctic region, only 3 years of streamflow data were available to compute the E/P ratio. Nevertheless, the results show that the general behaviors of the reanalyses and the observational estimates appear to be similar to those over the other northern river basins. Figure 10b shows the distribution of the yearly mean annual ratio from the four reanalyses and the observational estimates over the river basins. Dispersion of annual values from NRCAN is greater than the ones from CFSR, ERA-Interim, and MERRA over the Center, Gaspésie, Côte Nord, and North regions. In addition to the inconsistent values of the NARR ratio, this reanalysis presents the largest dispersions among the five datasets over the South and Gaspésie regions.
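As an illustration of how an observational E/P estimate could be derived from precipitation and runoff, the following sketch assumes a closed long-term basin water balance with negligible storage change, so that E = P − R. This assumption, the function names, and the sample values are hypothetical. The actual procedure used for the observational estimates is described in section 2b and may differ.

```python
import math

def ep_ratio_from_observations(precip, runoff):
    """Estimate E/P from observed precipitation and runoff (same units).

    Assumes a closed basin water balance with negligible storage change,
    so that E = P - R and therefore E/P = 1 - R/P.
    Returns NaN when precipitation is not positive.
    """
    if precip <= 0:
        return math.nan
    return (precip - runoff) / precip

def is_physically_consistent(ratio):
    """Return True when a long-term mean E/P ratio lies within [0, 1]."""
    return 0.0 <= ratio <= 1.0

# Illustrative annual means (mm/day), not taken from NRCAN or (cQ)2.
ratio = ep_ratio_from_observations(3.0, 1.5)
print(ratio, is_physically_consistent(ratio))
```

Under this assumption, a ratio near 1 leaves almost no water for runoff, and a long-term ratio above 1 would require evaporating more water than falls as precipitation, which is why such values are flagged as physically inconsistent in the discussion above.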
4. Concluding remarks
The scope of this study was to determine the reliability of CFSR, MERRA, ERA-Interim, and NARR in representing the terrestrial branch of the water cycle over the province of Quebec and to explore their potential in providing meteorological variables where few or no observations are available. The long-term mean of the water balance and the water cycle components were investigated for the four reanalyses. A first look at the water cycle components of the reanalyses showed that the water balance is not always maintained. Globally, the MERRA water balance is closed, as compared to the other three reanalyses, whereas NARR and ERA-Interim revealed negative water balances in central and southern Quebec, and CFSR revealed a positive water balance. The negative water balance of NARR is strongly influenced by inconsistencies in its components, for instance, underestimations of precipitation over the U.S.–Canadian border and a large bias of runoff close to Manicouagan Lake. Furthermore, the high precipitation from CFSR compared with those of MERRA, ERA-Interim, and NARR contributes to the positive values of the water balance over the entire province.
Precipitation, evaporation, and runoff were also assessed on a multiyear monthly basis, over the Bukovsky climatic regions (Bukovsky 2011). For the four reanalyses, mean annual cycles of the water cycle components showed a similar temporal evolution. However, the amounts of water differed, especially the maxima of evaporation and runoff and the precipitation from NARR and CFSR. A closer look at precipitation from the four reanalyses at the river basin scale confirmed that values from ERA-Interim and MERRA agreed fairly well with one another and with those of NRCAN. On the other hand, CFSR and NARR almost systematically represented the highest and the lowest precipitation values, respectively, and consequently, do not appear to be particularly reliable over the province of Quebec, especially in the south.
Nevertheless, the temporal correlation of the daily precipitation from the reanalyses versus NRCAN over the river basins showed that precipitation from CFSR, ERA-Interim, and MERRA vary quite synchronously despite water amount disparities between them. Precipitation from NARR, which showed an overall poor daily correlation, should be taken with care, depending on the region of interest. Despite having the highest runoff values among the reanalyses over southern climatic regions, ERA-Interim showed runoff values similar to runoff estimates over almost all studied river basins. Moreover, the E/P ratios revealed that ERA-Interim and CFSR succeeded relatively well in reproducing the distribution of precipitated water into evaporation and runoff, whereas NARR showed physically inconsistent results over southern Quebec. Regarding MERRA, although this dataset was designed to close the water balance, its surface model does not properly partition precipitated water into evaporation and runoff, since most of the precipitated water seems to evaporate before running off. Therefore, each of the four reanalyses has its own strengths and weaknesses. NARR, which has been shown to be quite reliable in the continental United States (Mesinger et al. 2006; Bukovsky and Karoly 2007; Sheffield et al. 2012), appeared to be the least consistent dataset among the four reanalyses over eastern Canada. MERRA and ERA-Interim probably provide the most reliable precipitation over the province of Quebec, whereas evaporation remains questionable. The runoff values of ERA-Interim are the most consistent, compared with the observations, while runoff values from the other three reanalyses remain questionable.
In any case, for impact studies, care should be taken in using any of the reanalysis terrestrial water cycle components, for instance, in using evaporation and runoff as reference data in water resource management or as input or calibration data when performing hydrological modeling. In fact, it could be relevant to compare reanalysis datasets with weather station measurements (in the case of precipitation) to validate the ability of reanalyses to represent the weather, especially with the small number of weather stations available in remote regions (northern regions of Quebec). Such a task would involve accounting for the discrepancies between grid scales and point scales. A next step following this study will involve feeding hydrological models with reanalysis temperature and precipitation and evaluating and comparing their potential for model calibration and validation. Moreover, work is currently in progress concerning the use of hydrological variables from reanalyses, such as evaporation, as reference data in evaluating intermediate steps of simulation within the hydrological models, or introduced directly inside hydrological models in order to reduce the number of calibration parameters. Nevertheless, even though further investigations are necessary, these reanalyses have good potential to provide meteorological and hydrological information in remote areas of Canada. Moreover, some variables of these reanalyses succeed in revealing consistent values in regions where observational datasets are reliable.
We thank the National Centers for Environmental Prediction, the National Center for Atmospheric Research, the National Aeronautics and Space Administration, and the European Centre for Medium-Range Weather Forecasts for developing and providing the reanalysis datasets. CFSR and NARR were obtained through the CISL Research Data Archive website at rda.ucar.edu/pub/cfsr.html and rda.ucar.edu/datasets/ds608.0, respectively. MERRA was downloaded from gmao.gsfc.nasa.gov/merra, and ERA-Interim was downloaded from apps.ecmwf.int. We also thank Catherine Guay from the Institut de Recherche d’Hydro-Québec for providing the (cQ)2 database and useful additional information about these data. We also would like to thank Blaise Gauvin St-Denis from Ouranos for providing valuable help in downloading and formatting reanalysis datasets, and finally, the organizations that funded this project, the Conseil de recherches en sciences naturelles et en génie du Canada, Hydro-Québec, Rio Tinto Alcan, and Ontario Power Generation.