Index insurance is becoming increasingly popular because of its ability to provide low-cost, relatively easy to implement agricultural insurance for vegetation types whose productivity has been notoriously difficult to measure and to farmers in less-developed nations where traditional crop insurance schemes are not reasonable to implement. This study examines if the remotely sensed normalized difference vegetation index (NDVI) can be an effective basis for index-based crop insurance over a diverse set of locations. To do this the authors compare Advanced Very High Resolution Radiometer (AVHRR) values to cumulative precipitation, extreme heat, and crop yields for 60 locations across the United States for the years 1982–2003. Quadratic regression equations are used to explore these relationships. The findings suggest that the relationship between NDVI, precipitation, extreme heat, and crop yields is highly variable and dependent on location-specific characteristics. Without site-specific calibration, NDVI should not be widely applied to index-based insurance product design. However, NDVI may still be a useful tool in insurance design under certain circumstances. This may be disappointing to proponents of NDVI as a risk transfer mechanism but the authors believe it important to report negative results as a caveat, and to give researchers and practitioners pause before investing time and money into the proposition.
Index-based agricultural insurance is gaining popularity because of its ease of implementation, its safeguards against the moral hazard and adverse selection problems frequently found in traditional agricultural indemnity insurance, and its resulting affordability. In the developing world, index-based insurance is seen as a method of providing risk protection to communities previously thought to be uninsurable (Barrett et al. 2009), while Brown and de Beurs (2008) and Brown et al. (2011) argue that farmers in the developing world could insure crop yield risk using geophysical indexes. Chantarat et al. (2009) describe how the normalized difference vegetation index (NDVI) is actually being implemented in the form of livestock index insurance in Kenya. Food Early Solutions for Africa (FESA) is examining the use of a Meteosat-based drought microinsurance system that can reach every farmer in Africa, with encouraging trials in Tanzania (as described at http://www.ears.nl/fesa.php). In developed countries, it provides an inexpensive method of hedging against weather risk in pasture, rangeland, and large monocultures (Rowley et al. 2007). To date, most index-based insurance products (e.g., rainfall and/or heat insurance) are dependent on meteorological data collected via weather stations (Barnett 2004; Berg et al. 2009; Barrett et al. 2009; Turvey 2001; Turvey and Norton 2008). This makes these products dependent on a strong meteorological information infrastructure, which is often nonexistent in the developing countries that are the target clientele for this innovation in agricultural insurance. Increasingly, satellite information, in particular NDVI, is being promoted as a method of overcoming these weather information deficiencies (Chantarat et al. 2009). As a result, programs employing NDVI data in index-based agricultural insurance currently exist in Canada, the United States, Spain, Kenya, Ethiopia, and India (Leblois and Quirion 2012).
NDVI is a satellite product that measures the vigor and greenness of vegetation on the earth’s surface. It is calculated as the normalized difference between reflectance in the near-infrared (NIR) and visible (VIS, red) spectral wave bands, that is, (NIR − VIS)/(NIR + VIS). Healthy, green vegetation has a high presence of chlorophyll pigment, which causes low reflectance in visible wave bands and high reflectance in near-infrared wave bands. The reverse is true in vegetation under stress. NDVI is a unitless index, with values ranging from −1 to 1. Healthy vegetation has the highest positive values, while bare soil, water, snow, ice, or clouds have NDVI values that are near zero or slightly negative. Vegetation under stress or with a small leaf area has lower positive NDVI values. Typically the NDVI values from healthy vegetation will increase as plant cover increases at the beginning of the growing season, reach a peak sometime during the middle of the growing season, and will then decrease as the season comes to its end (Mkhabela et al. 2005).
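As a minimal sketch of the index just described, the following Python function computes NDVI from near-infrared and red reflectance. The function name and the reflectance values are illustrative assumptions, not data from this study.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Where both bands are zero the index is undefined; return NaN there.
    safe_total = np.where(total == 0, 1.0, total)
    return np.where(total == 0, np.nan, (nir - red) / safe_total)

# Hypothetical reflectances chosen only to illustrate the ordering in the text.
healthy = float(ndvi(0.50, 0.08))   # dense green canopy: high NIR, low red
stressed = float(ndvi(0.30, 0.15))  # stressed or sparse vegetation
bare = float(ndvi(0.25, 0.22))      # bare soil: the two bands are nearly equal
```

With these made-up inputs the healthy canopy scores highest, the stressed canopy lower, and bare soil close to zero, which is the ordering the paragraph above describes.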
The idea behind using NDVI in index-based insurance is that by measuring greenness, NDVI provides a description of vegetative health at any given time, and in turn should be closely correlated with crop yields and primary production. NDVI values are thought to reflect the many variables affecting crop growth and distress, leading to a more comprehensive expression of crop health than an index based on one or two weather variables alone.
But how broadly can NDVI be used as an index to insure crop losses beyond pasture and rangeland? For NDVI to work as an insurance index it must be efficient in capturing the covariate risk between weather variables and crop or pastoral yields. For this to occur two relationships need to be investigated. First, for a general application of NDVI index insurance it needs to be shown that the weather parameters of interest (e.g., extreme heat and/or drought) are strongly related to the NDVI metric. Second, it needs to be shown that NDVI can explain the variability of crop yields to a high degree. These two criteria are quite distinct from the effective use of NDVI in phenological growth models that relate climate and climate change dynamically to natural phenomena such as plant growth. On the other hand, for insurance to be risk and cost efficient NDVI measures must be correlated with loss to obtain regulatory approval. This is at the core of the problem investigated in this paper. Smith and Watts (2009) lament that NDVI is generally poorly correlated with actual plant growth even though the data are available for relatively small 8 km × 8 km grids, but there is not enough publicly available information for an informed consensus. Indeed, there has been little research linking NDVI to crop indemnities, particularly in the economics literature. Likewise, there is scarce research tying the scientific innovations of remote sensing to economic innovations in global insurance markets.
With an increasing number of climate scientists, meteorologists, and agrometeorologists being employed by the insurance industry or conducting scholarly research on behalf of the insurance industry, this paper takes an objective look at this problem by investigating correlates between NDVI and agricultural output at multiple sites in the United States and correlates between NDVI and heat and precipitation measures, and then assesses the efficacy of a stylized NDVI-based insurance product. We use meteorological data from 59 weather stations across the United States to explore the relationship between NDVI and weather variables. We then use a subsample of 25 locations to compare NDVI with yield data directly. Our findings suggest that NDVI cannot be applied generally as a reliable indicator of crop loss, though it may work under specific conditions and in select locations, or when used in conjunction with in situ data. This may be disappointing to proponents of NDVI as a risk transfer mechanism (of which we were part) but we believe it important to report negative results as a caveat and to give researchers and practitioners pause before investing time and money into the proposition.
For NDVI to be an adequate base for generalized index-based agricultural insurance products, it needs to be a reliable indicator of acute crop loss across a variety of locations representing an assortment of growing practices and diverse climate regimes. This paper contributes to the literature by exploring the relationship between NDVI, meteorological variables, and crop productivity. Specifically, it seeks to determine if NDVI can be applied generally as a substitute for meteorological and yield data in the development of index-based agricultural insurance products. Of particular interest is how NDVI can be used as a risk transfer mechanism for developing countries. Leblois and Quirion (2012), in a rather extensive review of meteorological indexes used in insurance, state that NDVI, while well suited to biomass assessment (pasture and rangeland), is less clear-cut as a proxy for crop yield. Mahul and Skees (2007), commenting on the low correlation between livestock mortality and rainfall in Mongolia, indicated that the next route to insuring livestock could be through NDVI. Combining pasture grazing and livestock mortality, the potential use of NDVI for insuring meteorological risks in developing countries has been highlighted by Barrett et al. (2009). Chantarat et al. (2009) describe an index-based livestock insurance product for northern Kenya based on NDVI. McPeak et al. (2010) discuss the practical task of explaining the scheme within an experimental framework. NDVI has been shown to be a reasonable predictor of drought conditions in Africa (Ward et al. 2008). Carter (2009), in a policy brief, states that among index insurance contracts for West Africa NDVI had high power in predicting village grain production relative to local precipitation measures. However, he cautions against overgeneralization, noting that in Mali, NDVI was inferior to a district area yield index.
Borrowing from Smith and Watts (2009), scalability and sustainability are important concerns for index insurance programs. Scalability involves 1) access or coverage, 2) participation, and 3) the cost of operating and administering the program. Sustainability includes 1) the willingness of farmers to participate over the long term and 2) the ability of a country’s public administration (regulatory agencies, etc.) and private insurance sector to deliver and manage the program after initiation. NDVI insurance appears on the surface to increase access and coverage, opening up hitherto inaccessible insurance options for vast numbers of farmers and at low cost. If NDVI proves to be generally highly correlated with multiple perils it could be sustainable in terms of continuous participation and insurer and reinsurer delivery, easily regulated and designed, and attractive, on the development side, to nongovernmental organizations (NGOs) and international aid organizations.
Most of the literature on NDVI is centered in the remote sensing community, and research on its ability to serve as a proxy for yield data or as a measure of primary productivity has arrived at mixed conclusions. For the most part crop scientists and agrometeorologists have confined use of NDVI and other remote sensing techniques to crop yield simulation models (Doraiswamy et al. 2005) and these have worked quite well. But matching NDVI to harvested yields and modeling physical phenological growth are quite separate exercises. Of the limited published works on NDVI in an insurance context, Rowley et al. (2007) provide some evidence that NDVI correlates with the perceptions of ranchers in Kansas and Oklahoma at the county level, especially in drought years. They call for a national NDVI-based insurance program for rangeland management. Atwood et al. (2005) provide some evidence that an out-of-sample forecast of an NDVI predictive model is highly correlated to lower-than-average production when measured against an actual yield index. Insurance fundamentally deals with anomalous losses and this is the more important criterion for using NDVI as an insurance product. Some year-to-year variation in crop yield is simply part of the business of agriculture; crop insurance is in place to protect the farmer against the event of extreme crop loss. Index-based insurance, in particular, functions as a hedging product intended to protect a policy owner from environmental shocks. Thus, the ability of NDVI to capture years of extreme crop loss is of the greatest interest. Indeed Atwood et al. (2005) recommend an NDVI insurance product for rangeland and pastureland and base this in part on the observed predictive relationship between NDVI and crop yields.
Their recommendation is a dual trigger in which an indemnity would be made if the NDVI falls below a percentage of its historical average or (not and) when a county-based proxy average falls below a percentage of its historical average. The trigger difference is due to imperfections in the NDVI measure relative to ground production metrics.
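The "or (not and)" logic of such a dual trigger can be sketched as follows. This is only an illustration of the rule as described here: the function name and the 85% trigger percentages are hypothetical placeholders, not values taken from Atwood et al. (2005).

```python
def dual_trigger_pays(ndvi_value, ndvi_hist_mean,
                      county_value, county_hist_mean,
                      ndvi_trigger=0.85, county_trigger=0.85):
    """Return True if either index falls below its share of the historical mean.

    The trigger shares (0.85) are hypothetical; a real contract would set them.
    """
    ndvi_hit = ndvi_value < ndvi_trigger * ndvi_hist_mean
    county_hit = county_value < county_trigger * county_hist_mean
    # Either shortfall alone is enough to trigger an indemnity ("or", not "and").
    return ndvi_hit or county_hit

# NDVI looks normal but the county proxy shows a shortfall: still pays.
pays_on_county_only = dual_trigger_pays(0.60, 0.62, 70.0, 100.0)
# Neither index is below its trigger: no payment.
pays_on_neither = dual_trigger_pays(0.60, 0.62, 95.0, 100.0)
```

The first example shows why the second trigger matters: a loss that NDVI fails to register can still be picked up by the ground-based proxy.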
Several studies exploring the relationship between NDVI and climate variables carried out in the U.S. Great Plains found varying results on the influence of these variables on NDVI (Di et al. 1994; Tieszen et al. 1997; Wang et al. 2001, 2003; Tan 2007; Yang et al. 1998). In a study using precipitation data from 410 weather stations and temperature data constructed from 17 weather stations across the state of Kansas, Wang et al. found precipitation to be the dominant climate regime influencing both temporal (Wang et al. 2001) and spatial (Wang et al. 2003) NDVI variation. However, other research suggests that temperature has greater influence on NDVI variation in much of the Great Plains (Tan 2007; Yang et al. 1998; Zhou et al. 2001). Precipitation data from the preceding growing season also seems to play an important role in the NDVI for the current growing season (Wang et al. 2001). Differences in studies likely result from the size of the study area and climate regimes covered, as the climate variable most affecting NDVI appears to vary depending on vegetation type and frequency of rainfall events.
In Africa, meaningful direct relationships have been found between NDVI, rainfall, and vegetation cover in many studies carried out in the Sahel zone (Tucker et al. 2005; Malo and Nicholson 1990), Botswana (Prince and Tucker 1986), East Africa (Boutton and Tieszen 1983; Davenport and Nicholson 1993), and Tunisia (Kennedy 1989). However, findings for these locations were still highly variable. Nonetheless, it is often concluded from this body of work that NDVI and precipitation have a strong linear (Malo and Nicholson 1990) or log-linear (Davenport and Nicholson 1993) relationship, when monthly and annual precipitation is within a range that makes rainfall a limiting factor on vegetation growth. This last condition confines the strongest relationships between NDVI and precipitation to regions where annual rainfall is within a specific range, identified as between 300 and 900 mm in South Africa (Richard and Poccard 1998), between 500 and 700 mm in China (Li et al. 2002), and less than 700 mm with an average rainfall/potential evapotranspiration ratio of 0.5 to 2 in Spain (Udelhoven et al. 2009). In regions where the precipitation regime is not of a range that characterizes it as a limiting factor to plant growth, soil moisture appears to become the dominant climate variable affecting NDVI values (Farrar et al. 1994; Nicholson and Farrar 1994; Udelhoven et al. 2009; Yang et al. 1998).
Collectively, the work on relationships between NDVI, precipitation, and temperature is characterized by highly variable relationships including complex radiative interactions between the atmosphere, sensor view angle, and solar zenith angle, which all affect NDVI value accuracy and are difficult to account for in standard adjustments (du Plessis 1999). These findings implicate the strong influence of other factors, such as soil and vegetation type, geographic region, climate zone, and radiometric disturbances of NDVI measurement in the ability of NDVI to reflect changes in precipitation and temperature.
Much like the relationship between NDVI and meteorological variables, the relationship between NDVI and agricultural productivity appears to be highly variable, depending on a wide variety of additional site-specific characteristics (Box et al. 1989; Paruelo and Lauenroth 1995; Fuller 1998). In Swaziland, Mkhabela et al. (2005) discovered that NDVI could be used effectively to forecast maize yield in three of the country’s four agroecological regions. Lewis et al. (1998) found NDVI to be a good indicator of maize production in a study using annual maize production statistics for 36 agricultural districts in Kenya. In their study of grasslands throughout Montana, Thoma et al. (2002) found NDVI to explain 63% of the statistical variation in live biomass. This relationship was consistent for all six of their sample locations, which spanned north and south Montana. Similar results are reported in Kennedy (1989) in grazing lands in Tunisia and du Plessis (1999) in Etosha National Park, Namibia, with a coefficient of determination of 0.5166. Farmland also often contains a multitude of vegetation types, including several crops, often rotated, each with different planting and harvesting times. These factors all add to the complexity involved in measuring crop health using satellite data.
Nonetheless, just as variability characterizes the relationships between NDVI, temperature, and rainfall, it is a defining characteristic in the relationships reported between NDVI, vegetation, and crop yield in the literature. This points to a need to clarify the limitations of NDVI to best understand the potential for future applications.
3. Investigating applicability of NDVI in index-based insurance
Very few studies directly explore NDVI’s potential for use in insurance products, and those that do deal predominantly with rangeland and pasture insurance. In the United States, Rowley et al. (2007) compared rancher perception of low production years to NDVI data for several counties in Kansas and Oklahoma. Their study found weak statistical relationships between individual range productivity and NDVI values (R² = 0.15). Importantly, when these data were aggregated to the county level, the relationship between NDVI values and rancher perception of productivity improved substantially (R² = 0.65). Rowley et al. (2007) also found NDVI failed to capture midseason low rainfall events flanked by periods of high rainfall. While ranchers classified such years as years of poor range productivity because of the critical timing of the low-rainfall event, NDVI was unable to mark this effect on range conditions.
Makaudze and Miranda (2010) assessed the possibility of using NDVI for crop insurance for corn and cotton crops in Zimbabwe. While the authors conclude that NDVI holds great potential for index-based insurance, their results suggest the relationship between NDVI and crop yields to be highly variable between crops and districts. They discover a correlation between NDVI and critical growth stages of the crops ranging from 0.42 to 0.70, depending on crop and district. The timing of this critical growth stage also varies depending on district location, thus requiring calibration based on farmer input on a district-by-district basis.
4. Existing index-based agricultural insurance projects using NDVI data
NDVI-based insurance programs currently exist in the United States, Canada, India, Kenya, Ethiopia, and Spain. The U.S. Department of Agriculture’s (USDA’s) Risk Management Agency (RMA) runs a pilot vegetation index insurance program for pasture, rangeland, and forage. The program has been available since 2007 in select states, and was expanded in 2009 to include additional states. States were chosen such that the index would be tested in various climate, soil, and weather conditions via six U.S. regions: the warm and humid Southeast, the cool and humid Northeast, the northern Great Plains, the southern Great Plains, the semiarid Southwest, and the intermountain region of the Northwest. Farmers select one or more 3-month time periods that represent the high-risk time period for their pasture, rangeland, and forage management practices. NDVI data used to evaluate loss are in the form of a 4.8 mile × 4.8 mile grid, and losses are evaluated according to the grid, not losses experienced at individual properties. Losses are calculated as the difference between the expected normal NDVI value and the actual NDVI value experienced for the interval the farmer chose to insure. A payment is received when this number falls below a certain trigger grid index (USDA 2009).
NDVI-based insurance in Canada was launched in 2001 by Agriculture Financial Services Corporation (AFSC). The insurance is limited to areas where pasture is the predominant land cover and is intended to cover hay production. NDVI data are calculated for each township in the area and are scaled to reflect native pasture production. Areas of irrigated cropland and bush are removed, as they can significantly influence the program outcome. A pasture vegetation index (PVI) is developed for each square kilometer of the township. Pasture production data collected by AFSC during a past cage clipping system operation were used for correlation comparisons from 1991 to 1999. Precipitation data measured at Environment Canada weather stations were also compared to NDVI data, and client meetings were conducted where farmers identified their two best and two worst pasture production years in the last 15-yr period. Historical PVI values seemed to identify production shortfalls resulting from cool early season temperatures and drought. In addition, the anecdotal production perception of farmers surveyed corresponded to geographical differences between township PVI values. NDVI correlations with precipitation data and collected production data were not considered good, with an r = 0.65.
AFSC augmented NDVI data by collecting precipitation and pasture growth data over the growing season at test stations throughout the insured pasture area. This substantially improved correlations between the developed PVI index, precipitation, and pasture production. Thus, the developed insurance index takes into account production data alongside NDVI data. A loss payment is triggered when the current year’s PVI falls below 85%–90% of the average PVI from previous years (the percentage depends on whether the farmer has full-season or split-season coverage) (AFSC 2012).
Spain’s NDVI-based insurance was designed to protect farmers from droughts affecting pasture areas. It has also been available since 2001, but unlike Canada’s insurance scheme, Spain’s insurance index is solely based on NDVI data; it is not verified by any additional measurement of yields. The index is constructed using historical NDVI values for the given area to create a curve of expected NDVI values. When actual NDVI observations in a given year fall below the average curve an indemnity is defined. NDVI values are defined for a 10-day period. The maximum value for each 10-day period is used to reduce the effect of clouds; this is called the maximum value composite index (MVCI). The deductible is the historic average MVCI for each area minus 1.25 standard deviations from the average MVCI.
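The index construction described for Spain can be sketched in a few lines. This is an illustrative reading of the description above, not the insurer's actual procedure: the function names, the array layout (years by periods), and the use of the sample standard deviation are assumptions.

```python
import numpy as np

def max_value_composite(daily_ndvi, period_length=10):
    """Maximum NDVI within each consecutive block of `period_length` days.

    Cloud-contaminated readings are biased low, so taking the maximum
    keeps the clearest observation in each period.
    """
    daily = np.asarray(daily_ndvi, dtype=float)
    n_periods = len(daily) // period_length
    blocks = daily[: n_periods * period_length].reshape(n_periods, period_length)
    return blocks.max(axis=1)

def trigger_level(historical_mvci, k=1.25):
    """Historic mean MVCI minus k standard deviations, for each period.

    `historical_mvci` has shape (years, periods). Using the sample
    standard deviation (ddof=1) is an assumption.
    """
    hist = np.asarray(historical_mvci, dtype=float)
    return hist.mean(axis=0) - k * hist.std(axis=0, ddof=1)

def periods_below_trigger(current_mvci, historical_mvci, k=1.25):
    """Boolean array marking periods where the current MVCI breaches the trigger."""
    return np.asarray(current_mvci, dtype=float) < trigger_level(historical_mvci, k)

# Made-up numbers: three historical years, two 10-day periods.
history = [[0.6, 0.5], [0.7, 0.5], [0.8, 0.5]]
breaches = periods_below_trigger([0.5, 0.5], history)
```

In this toy example the first period's trigger is 0.7 − 1.25 × 0.1 = 0.575, so a current value of 0.5 breaches it, while the second period (no historical variation) does not.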
The Agricultural Insurance Company of India (AIC) uses NDVI data combined with temperature parameters for their wheat insurance index. NDVI and temperature data from the peak crop growth stage are used to determine if there is to be a payout (AIC 2012). AIC is also in the process of carrying out research intended to aid in developing an NDVI-based index insurance product for tea production.
A pilot index insurance project for several Millennium Villages locations uses NDVI as its basis. The project used NDVI data with an 8-km resolution, aggregating data to an average area of 100 km × 100 km. The index targeted the time between flowering and harvest in local cropping calendars, since this was found to have the strongest relationship between local rainfall and historical yields. NDVI was found to be sufficient to underpin contracts in regions of Senegal, Mali, Ethiopia, northern Nigeria, and northeastern Kenya because of their semiarid climates. NDVI was used in combination with rainfall for the wetter, less variable climates of Uganda, Rwanda, Tanzania, Malawi, and western Kenya. Research for these projects found that when native vegetation shows signs of stress at the 100 km × 100 km scale, crop yields are typically greatly reduced (Hellmuth et al. 2009).
There are two issues of concern when it comes to understanding NDVI in an insurance context. First, the color tones of NDVI arise from the interaction of heat and precipitation conditions, but how these interact to affect NDVI is not clear. Since the agricultural parameters of interest are heat and precipitation, we need to better understand the covariate relationship between heat and precipitation. Second, in order for NDVI to be effective we also need to show that there is a covariate relationship between NDVI and crop yields. This is in fact a necessary condition for any insurance company to offer NDVI-based insurance under current insurance law. We estimate the two relations as separate regressions because our intent is to segregate the nulls. Under the first model the null hypothesis is that there are no relationships between heat and precipitation affecting NDVI, while under the second the null is that there is no relationship between NDVI, precipitation, or heat on crop yields. To conceptualize the relationship between precipitation (p), heat (h), NDVI, and crop yields (Y), we define a production function with a single input x, which is held constant so that the final variance is due to the exogenous weather factors. We illustrate this using a quadratic production function and assume that the production input (e.g., nitrogen, x) has already been optimally chosen so that any variability from the input decision onward is due to weather variability. Rather than adding variability to the production function we assume, quite reasonably, that the source of variation enters through the standard error of the production coefficients, that is
The standard assumption behind the use of NDVI as a proxy for the combined impact of precipitation and heat on crop yields is that so that we can write
Note that these are random coefficients. For simplicity assume
As these random factors evolve in nature the variance in yield (given the fixed input level) is
We seek an understanding of the relationship between NDVI and precipitation and heat in the context of yield variability. In other words, if NDVI is an acceptable proxy for combined heat and precipitation events we require in a general way
We estimate these relationships in reduced form. Equation (11) compares NDVI to cumulative precipitation and growing degree-days (GDD) using the quadratic equation below:
where NDVI is the semimonthly NDVI value at a given station, cp is cumulative precipitation for the same semimonthly period, gdd is the cumulative growing degree-days for the period, and α is the constant term. The quadratic function was used to capture interaction effects between the variables, as precipitation and heat, as well as the joint effect of the two, are important in identifying instances of severe crop distress and loss. Using a growing degree-day standard of 80°F allows the gdd variable to capture instances of extreme heat. Thus, this equation should reflect not only the vegetation stress caused by a lack of rainfall or extreme heat, but also the combined effect of rainfall and extreme heat events. We used this equation to carry out regressions for each of the 59 locations.
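Written out from the variable definitions above and from the terms reported in the results (CP, CP², GDD, GDD², and CP×GDD), a quadratic specification consistent with this description would take the following general form. The coefficient symbols β₁ through β₅, the period subscript t, and the error term ε are notational assumptions introduced here for illustration.

```latex
\mathrm{NDVI}_{t} = \alpha
  + \beta_{1}\, cp_{t} + \beta_{2}\, cp_{t}^{2}
  + \beta_{3}\, gdd_{t} + \beta_{4}\, gdd_{t}^{2}
  + \beta_{5}\, \left( cp_{t} \times gdd_{t} \right)
  + \varepsilon_{t}
```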
a. NDVI and crop yields
We can surmise from the previous results that the depiction of crop yield response to NDVI versus precipitation and heat as described by Eqs. (1) and (2) does not hold. If the presumed relationship does not hold, then what might be the relationship between NDVI and the weather variables? To investigate this we ran a host of regressions, far too many to report here, and found that the relationships were as inconsistent as those discussed above. To simplify matters we report in this section one series of regressions that illustrates our concerns. In the regressions that follow we wanted to determine whether NDVI and heat/precipitation, if not substitutable as surmised, were at least complementary or of any economic consequence.
Since the relationship of most importance in an index-based insurance product is between the data used in the index itself and crop yields, we selected a subsample of 25 stations to use in comparing crop yields and NDVI directly. Again, we used a quadratic equation to capture interaction effects between precipitation and heat, this time using crop yield as the dependent variable, and adding NDVI into the equation as an independent variable. The model used is presented below:
where “yield” is county-level yield data for years 1982–2003, NDVI is the maximum NDVI value during the growing season, CP is cumulative precipitation for the growing season, GDD is total growing degree-days during the growing season (again measured using the 80°F standard), and β0 is the constant term. As previous studies have suggested both integrated NDVI and maximum NDVI as the best predictors of crop yield, we also performed a regression analysis using integrated NDVI for the growing season as the NDVI variable. Results for the two measures were not significantly different, and all results presented here are for maximum seasonal NDVI values.
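From the definitions above, a specification consistent with this description (a quadratic in CP and GDD with NDVI added as a regressor) would have the following general form. Whether NDVI also entered through squared or interaction terms is not stated, so the linear NDVI term, the coefficient symbols β₁ through β₆, the year subscript t, and the error term ε are assumptions made here for illustration.

```latex
\mathrm{yield}_{t} = \beta_{0}
  + \beta_{1}\, \mathrm{NDVI}_{t}
  + \beta_{2}\, \mathrm{CP}_{t} + \beta_{3}\, \mathrm{CP}_{t}^{2}
  + \beta_{4}\, \mathrm{GDD}_{t} + \beta_{5}\, \mathrm{GDD}_{t}^{2}
  + \beta_{6}\, \left( \mathrm{CP}_{t} \times \mathrm{GDD}_{t} \right)
  + \varepsilon_{t}
```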
The analysis was carried out for each of the 25 locations using county-level corn yields. We did not treat the collective data as a panel dataset because this would have assumed the same marginal response to NDVI, heat, or precipitation at each location, when what we really seek is how marginal effects differ by location. We also fit the model using hay, wheat, and soybean yields for those stations in which these crops were relevant and the data were available. In the case of wheat, this was all stations except Algona, Iowa, and Montrose, Pennsylvania. For hay, this excluded stations located in Missouri. And for soybeans, this excluded Montrose; Heppner, Oregon; Eltopia, Washington; Ardmore, South Dakota; and stations located in New York.
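The station-by-station approach can be sketched as a separate ordinary least squares fit per location, so that each site gets its own coefficients rather than a pooled set. This is a minimal illustration, not the estimation code used in the study: the function names, the dictionary layout, and the assumed regressor set (NDVI, CP, CP², GDD, GDD², CP×GDD) are hypothetical.

```python
import numpy as np

def quadratic_design(ndvi, cp, gdd):
    """Design matrix: constant, NDVI, CP, CP^2, GDD, GDD^2, CP*GDD."""
    ndvi, cp, gdd = (np.asarray(v, dtype=float) for v in (ndvi, cp, gdd))
    return np.column_stack(
        [np.ones_like(cp), ndvi, cp, cp**2, gdd, gdd**2, cp * gdd]
    )

def fit_station(yield_, ndvi, cp, gdd):
    """OLS fit for one station; returns (coefficients, R^2)."""
    X = quadratic_design(ndvi, cp, gdd)
    y = np.asarray(yield_, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - (resid @ resid) / ss_tot
    return coef, r2

def fit_all(stations):
    """Fit each station separately so marginal effects may differ by site.

    `stations` maps a station name to a dict with keys
    'yield_', 'ndvi', 'cp', and 'gdd' (one value per year).
    """
    return {name: fit_station(**data) for name, data in stations.items()}
```

Comparing the sign and size of each coefficient across the returned station entries is then a direct way to see how marginal effects differ by location.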
b. The NDVI
For this study, we used the Advanced Very High Resolution Radiometer (AVHRR)-NDVI dataset provided by the University of Maryland. The AVHRR-NDVI dataset is desirable because it is a global gridded dataset, with an 8 km × 8 km spatial resolution. Additionally, the AVHRR-NDVI dataset has been recommended for use in index-based insurance products above alternative satellite options (Atwood et al. 2005). The dataset begins in 1981 and goes through 2003, with semimonthly NDVI estimates. The NDVI value for each semimonthly period represents the maximum NDVI for that period. This is the most effective method of reducing errors resulting from cloud cover and similar distortions. We use NDVI data for the grid point associated with each of the 60 meteorological stations. Data were collected for an estimated growing season, beginning 1 May and ending 15 August and containing seven semimonthly periods.
c. Meteorological and crop yield data
Meteorological data were obtained from the National Oceanic and Atmospheric Administration (NOAA), via the Internet-based weather risk evaluation tool WeatherWizard (www.weatherwizard.us; cf. Turvey and Norton 2008). We collected cumulative precipitation, average mean daily temperature, and cumulative growing degree-days for each semimonthly period corresponding to the NDVI data. Growing degree-days were calculated using an 80°F standard to capture extreme heat.
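One way to compute the heat variable is sketched below. It assumes that a day's growing degree-days are the excess of the daily mean temperature over the 80°F base (zero on cooler days) and that totals are accumulated within each of the seven semimonthly periods from 1 May to 15 August. Both the formula and the function names are assumptions for illustration, not the documented WeatherWizard calculation.

```python
from datetime import date

BASE_F = 80.0  # extreme-heat base temperature used in the text

def semimonthly_period(d):
    """Index 0..6 for the periods between 1 May and 15 August, else None."""
    if d < date(d.year, 5, 1) or d > date(d.year, 8, 15):
        return None
    # Days 1-15 form the first half of a month, the rest the second half.
    return (d.month - 5) * 2 + (0 if d.day <= 15 else 1)

def cumulative_gdd(daily_mean_temps_f, base=BASE_F):
    """Sum degree-days above `base` within each semimonthly period.

    `daily_mean_temps_f` maps a date to that day's mean temperature in °F.
    Days outside the growing season are ignored.
    """
    totals = [0.0] * 7
    for d, temp in daily_mean_temps_f.items():
        period = semimonthly_period(d)
        if period is not None:
            totals[period] += max(temp - base, 0.0)
    return totals

# Made-up temperatures for illustration only.
example = cumulative_gdd({
    date(2000, 5, 1): 85.0,    # 5 degree-days in period 0
    date(2000, 5, 16): 79.0,   # below the base: contributes nothing
    date(2000, 7, 31): 82.5,   # 2.5 degree-days in period 5
    date(2000, 8, 15): 90.0,   # 10 degree-days in period 6
    date(2000, 8, 16): 95.0,   # outside the season: ignored
})
```

Under this definition a station whose daily mean temperature never exceeds 80°F has zero degree-days in every period, so the heat terms carry no information for that site.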
County-level crop yield data were selected for a subsample of 25 stations for 1982–2003. This included yearly yield data for hay, corn, soybeans, and wheat depending on the crops grown in the selected counties. These data were obtained from the USDA’s National Agricultural Statistical Service (NASS).
d. Site locations
All the sites selected are located in heavily farmed areas known to use little or no irrigation. Locations of meteorological stations are shown in Fig. 1. Circles mark the 60 meteorological stations. Those stations where county-level yield data were also collected have an X to their left. The sites span a variety of climate regions within the United States. Five are located in coastal areas, and one on Lake Erie. The locations also represent a range of elevations, from sea level to 1485 m.
a. General trends
Results of the regressions of NDVI on meteorological variables were mixed across stations, with no apparent pattern in which meteorological variables were significant in determining NDVI, or even common signs on coefficients for the same variables. We anticipated that NDVI would be positively related to rainfall and negatively related to heat, reflecting the effects of drought or extreme heat. Cumulative precipitation was a significant variable in determining NDVI for 7 of the 59 stations. Coefficients for CP were negative in 39 of the 59 stations; CP2 had a negative coefficient in 22 of the locations and was significant for only 8 of the 59 stations. Similarly, GDD was a significant factor in determining NDVI for 4 stations and had negative coefficients in 27 of the locations; GDD2 was significant in only 5 stations, and its coefficient was negative for 28 locations. The interaction term CP×GDD was significant for only 6 stations and had a negative coefficient in 22 locations.
The coefficient of determination for the model when applied to the 59 stations ranged from 0.2435 to 1.0, with 49% of stations having R2 values greater than 0.6. Those stations with higher R2 values did not show more consistency in coefficient signs across variables, and had mixed results in which variables were significant in determining NDVI. In the regression analysis of several stations, GDD, GDD2, and in some cases CP×GDD were dropped from the model. In these instances, no days had average temperatures above 80°F.
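The station-level regression can be sketched as an ordinary least squares fit of NDVI on CP, CP2, GDD, GDD2, and CP×GDD, with the heat terms dropped when GDD is zero throughout the sample. This is a minimal sketch rather than the estimation code used in the study: the function name, the intercept, the unit of observation, and the synthetic data are all assumptions, and no significance tests are computed.

```python
import numpy as np


def fit_quadratic_ndvi(cp, gdd, ndvi):
    """Fit NDVI on a quadratic in CP and GDD by ordinary least squares.

    Returns (coefficients by name, R-squared). Heat terms are omitted
    when GDD is zero for every observation, since they would be
    perfectly collinear with nothing to explain.
    """
    cp = np.asarray(cp, dtype=float)
    gdd = np.asarray(gdd, dtype=float)
    ndvi = np.asarray(ndvi, dtype=float)

    columns = {"const": np.ones_like(cp), "CP": cp, "CP2": cp ** 2}
    if np.any(gdd > 0):
        columns.update({"GDD": gdd, "GDD2": gdd ** 2, "CPxGDD": cp * gdd})

    names = list(columns)
    X = np.column_stack([columns[name] for name in names])
    beta, *_ = np.linalg.lstsq(X, ndvi, rcond=None)

    residuals = ndvi - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((ndvi - ndvi.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    return {n: float(b) for n, b in zip(names, beta)}, r_squared


# Synthetic, noise-free data generated from known (made-up) coefficients.
cp_obs = [0.5, 1.0, 2.0] * 3
gdd_obs = [0.0] * 3 + [0.5] * 3 + [1.5] * 3
ndvi_obs = [0.3 + 0.2 * c - 0.05 * c ** 2 + 0.1 * g - 0.04 * g ** 2 + 0.02 * c * g
            for c, g in zip(cp_obs, gdd_obs)]
coefs, r2 = fit_quadratic_ndvi(cp_obs, gdd_obs, ndvi_obs)
```

With only a few observations per station and up to six parameters, a fit of this kind can reach an R2 near 1.0 without implying a stable relationship, which is worth keeping in mind when reading the upper end of the reported range.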
b. Case studies: Saluda, South Carolina; Windsor, Illinois; and Angelica, New York
Results for the meteorological variables at three of the locations are presented in Table 1. Of these three examples, Saluda had the lowest R2 value and none of the meteorological variables are statistically significant in determining NDVI for this location. The positive coefficient on cumulative precipitation is in line with what would be expected, as precipitation is one of the primary factors determining plant growth, with higher levels of rainfall generally leading to more vigorous growth in vegetation. The positive sign on GDD is not what would be predicted, as extreme heat is often associated with drought conditions and high levels of crop stress. In this case, heat may also be associated with rainfall, causing the positive coefficient. However, GDD2 may be a better indicator of extreme heat for this location and does have the expected negative sign.
In the case of Windsor (see Table 1), we also did not find a significant relationship between NDVI and each meteorological variable using this model. Here CP also has the expected positive sign, and GDD has the predicted negative coefficient. Yet the coefficient for GDD2 is positive, which seems unusual given that extreme heat should lead to lower NDVI values. The negative value on CP2 is within reason, as at a certain point, higher levels of rainfall lead to floods and overly saturated soils, damaging vegetation and causing poor plant growth.
In Angelica, both cumulative precipitation and CP2 are significant variables at the 20% level. However, the signs for the coefficients on both variables are not as anticipated. Cumulative precipitation has a negative coefficient, which does not reflect the positive correlation between rainfall and crop growth. The negative coefficient on CP2 also does not seem appropriate, as we would expect extreme rainfall events to have a negative impact on vegetation health.
These three locations represent the wide variety of relationships between NDVI, precipitation, and extreme heat across the many locations surveyed. Most instances where the model had a low coefficient of determination corresponded to locations in the southeast; however, this was not consistently the case, and several of these instances were for stations located in other regions of the country. Differences in signs on coefficients for precipitation and extreme heat variables could not be explained simply by the station’s regional location. Elevation and coastal location also did not account for inconsistencies between regression results across stations. The significance of these results is that, with sign reversals on estimated coefficients and inconsistent levels of statistical significance, it is difficult to conclude that NDVI can be applied generally to all crop regions. This is not to say that NDVI cannot be used at the local level when covariate risks are indicated, but that index insurance based on NDVI should not be implemented as a matter of course without detailed investigation of the relations at play.
c. NDVI and crop yields
Comparisons of crop yield data to NDVI values also produced ambiguous results. In estimating the quadratic model for corn yields, NDVI was a significant variable in only 3 of the 25 locations at the 5% level. The estimated coefficients on NDVI were negative in 32% of locations. The coefficient of determination for the model ranged from 0.2043 to 0.862. For a given location, neither the signs on coefficients nor the set of statistically significant variables were consistent across multiple crops. The two case studies presented below illustrate the wide variation in relationships between yields, NDVI, and meteorological variables across sites and crops. Because of the number of regressions run, we cannot provide results for all locations.
Results for Saluda show that the model fits well, with a relatively high coefficient of determination when using corn and soy yield data (0.6606 and 0.7826, respectively). For both crops, NDVI has a negative coefficient, which differs from results of prior studies that showed a strong positive linear relationship between NDVI and crop yields. However, NDVI2 has a positive coefficient for both soy and corn. The cross-effect variable CP×NDVI also has a positive coefficient for both crops. Of these variables, NDVI, NDVI2, and CP×NDVI are significant only when using soy yields. These results are provided in Table 2.
For Saluda, the only variable showing a significant relationship with corn yields was GDD2, which had the predicted negative coefficient. Soy yields, however, had a significant relationship with all but two of the variables (CP2 and CP×GDD). Windsor provides a different perspective on the relationship between NDVI and crop yields than what we saw in Saluda. In the case of this station, NDVI has a positive coefficient for both corn and soybean yields. The relationship is not statistically significant in the case of either crop. However, the model does have a good fit for both crops, with R2 values of 0.7346 and 0.6746 for corn and soybeans, respectively. Signs on the coefficients of the other variables are consistent for both crops at this location, and are largely as expected. The negative coefficient on NDVI2 is unusual, as there does not seem to be a reason that higher levels of photosynthetic activity would at some point lead to lower crop yields.
7. A simple index-based insurance contract using NDVI values
Though our regression analysis provided little evidence of a relationship between NDVI and crop yields that could be considered reliable for a range of locations, ultimately the question at hand is whether years of anomalously great crop loss can be captured by an NDVI-based trigger mechanism. To test this, we assessed the effectiveness of a simple NDVI-based contract in addressing major loss events by performing a historical burn analysis.5 The contract would pay out when seasonal NDVI values were below a trigger value of average NDVI − (0.25 × standard deviation), where average NDVI is the mean NDVI for the growing season. This design is similar to the design implemented by USDA RMA’s Pasture, Forage, and Rangeland program, where losses are measured as the difference between expected NDVI for a parcel of land and actual NDVI for that parcel during a critical growing period, and a trigger value is decided upon to render a payout.
For this example, we chose a quarter of a standard deviation below the average NDVI value for the location. We confined the contract to NDVI values during the growing season defined as 1 May to 15 August, as this was the closest to representing critical growing periods that could be applied across a wide variety of locations, each growing different crops at different times depending on their unique growing seasons. We used a historical burn analysis to evaluate this contract for corn at each of the 25 locations, and for wheat, hay, and soybeans at the locations where data were available.
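The trigger rule and the historical burn analysis can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is used (the text does not say whether the sample or population form was applied), the mean and standard deviation are taken over the same historical record being evaluated, and the NDVI values are invented.

```python
from statistics import mean, stdev


def ndvi_trigger(seasonal_ndvi, k=0.25):
    """Trigger value: historical mean minus k standard deviations."""
    return mean(seasonal_ndvi) - k * stdev(seasonal_ndvi)


def burn_analysis(ndvi_by_year, k=0.25):
    """Replay the contract over the historical record.

    Returns the trigger, the years in which seasonal NDVI fell below it,
    and the empirical payout frequency.
    """
    trigger = ndvi_trigger(list(ndvi_by_year.values()), k)
    payout_years = sorted(y for y, v in ndvi_by_year.items() if v < trigger)
    frequency = len(payout_years) / len(ndvi_by_year)
    return trigger, payout_years, frequency


# Invented growing-season NDVI values for a four-year record.
history = {1982: 0.50, 1983: 0.40, 1984: 0.60, 1985: 0.50}
trigger, payout_years, frequency = burn_analysis(history)
```

No probability distribution is fitted at any point; the payout frequency is read directly off the historical record, which is what distinguishes a burn analysis from a parametric pricing approach.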
With the exception of five stations (Dublin, Georgia; Beaver Dam, Kentucky; Eltopia, Washington; and Fredonia, New York), all stations had a payout frequency ranging from 20% to 25%. When evaluating the contract in relation to corn yields, 0%–71% of these payouts occurred in the lowest quarter of crop yields; 20%–86% occurred in the lowest half of crop yields. Rock Rapids, Iowa, was the station where the contract best succeeded in capturing extreme crop loss events, with 71% of payouts falling in years in the lowest quarter of crop yields, and 86% occurring in years in the lowest half of crop yields. In Heppner, Oregon, by contrast, the contract failed to detect any of the years in the lowest quarter of crop yields and captured only one of the events in the lowest half of crop yields. Table 3 presents general statistics on the relationship between payouts and indemnity events for corn crops at all of the locations.
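The payout statistics reported above can be computed along the following lines. The sketch assumes that the "lowest quarter" and "lowest half" are the bottom n/4 and n/2 years when yields are ranked from lowest to highest (rounded down); the study's exact ranking convention is not stated, and the function name and sample data are hypothetical.

```python
def payout_hit_rates(payout_years, yield_by_year):
    """Share of payouts falling in the lowest quarter and half of yields.

    Returns (share_in_lowest_quarter, share_in_lowest_half), or None
    when the contract never paid out.
    """
    if not payout_years:
        return None
    ranked = sorted(yield_by_year, key=yield_by_year.get)  # lowest yield first
    n = len(ranked)
    lowest_quarter = set(ranked[: n // 4])
    lowest_half = set(ranked[: n // 2])
    in_quarter = sum(year in lowest_quarter for year in payout_years)
    in_half = sum(year in lowest_half for year in payout_years)
    return in_quarter / len(payout_years), in_half / len(payout_years)


# Invented yields (e.g., bushels per acre) and payout years.
yields = {1982: 100, 1983: 60, 1984: 70, 1985: 120,
          1986: 110, 1987: 90, 1988: 50, 1989: 130}
payouts = [1983, 1987, 1989]
quarter_share, half_share = payout_hit_rates(payouts, yields)
```

In this invented example one of three payouts falls in the lowest quarter of yields and two of three in the lowest half, so a third of the payouts would have gone to a year with above-median yields.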
Figures 2 and 3 present the timing of payouts for the NDVI-based insurance contract for corn for Saluda and Windsor. Payouts are represented by filled circles. In the case of Saluda, only one of the five payouts occurred in a year exhibiting anomalously low crop yields. The other four payouts all occur in years with above-average corn yields.
The contract is more successful at targeting loss in the case of Windsor. Here, no payouts occur in the years with the highest yields, and loss events, such as low crop yields in 1984 and 1988, are successfully captured by the contract. Additionally, years such as 1989, 1991, and 1996, which are in the lower half of yields, receive a payout. However, the contract does fail to capture 1983, a year of anomalously low yields.
Comparing contract payouts to hay, wheat, and soybean yields had similar results, with 0%–80% of payouts occurring in the lowest quarter of wheat yields, 0%–57% in the lowest quarter of hay yields, and 17%–60% in the lowest quarter of soybean yields (data not shown). The contract was best at capturing extreme wheat crop loss in Menomonie, Washington (80% of payouts in the lowest quarter of yields, 100% in the lowest half). For hay yields, payouts were most often triggered in years with the greatest losses in Rock Rapids, Iowa, and Angelica, New York, where 57% of payouts were in the lowest quarter of yields, and 86% were in the lowest half for both cases. Algona, Iowa, had the greatest percentage of payouts in the worst quarter of soy yields (60%), yet also had 40% of payouts in years in the highest half of soy yields, while Rock Rapids and Windsor had 86% of payouts occurring in the lowest half of crop yields.
The timing for the NDVI-based soybean contract for Saluda is presented in Fig. 4. As was the case of the corn contract in Saluda, only one payout occurs in a year of anomalously low crop yields. Three occur in years in the highest half of yields. The NDVI-based contract for soybeans performed better at the Windsor location (Fig. 5). All but one of the payouts (1998) occur in years in the lowest half of yields, and four of the seven payouts (1984, 1988, 1989, and 1996) occur in years in the lowest quarter of yields.
8. Discussion and conclusions
The relationship between raw NDVI data and precipitation differs depending on location, and likely many other factors outside the scope of this model. Our results show NDVI to have a positive relationship with precipitation in some locations, while exhibiting a negative relationship in others. The quadratic form used in this study shows precipitation to be a significant variable affecting NDVI in very few locations within our study sample. This is in line with the literature showing a high variability in the strength of the relationship between precipitation and NDVI.
It is important to note that prior studies finding strong relationships between NDVI and rainfall were mostly working in smaller regions, with study sites often restricted to one state or county. Additionally, most were working with a shorter time series of data, many comparing data spanning only two to three years. Our results and conclusions do not in any way bring into question the use of NDVI in these small localized regions where it has been proven successful. Our point in preparing this paper was to offer the caveat that, where general applicability is the guiding principle for NDVI index insurance, the researcher, analyst, or practitioner should approach the investigation with an open and objective mind.
Results show the relationship between NDVI and extreme heat to be highly variable among locations. Furthermore, the relationship between NDVI and extreme heat is likely not constant for the growing season. Such changes in the relationship between temperature and NDVI throughout the growing season may account for some of the variation in signs on regression coefficients. Additionally, prior studies have found NDVI to be both positively and negatively correlated with temperature interannually (Yang et al. 1998; Wang et al. 2001, 2003; Zhou et al. 2001). Given the assorted locations covered by this study, it is not surprising that our results agree with both findings.
The strength of the relationship between NDVI and extreme heat also differs depending on location. While extreme heat was a significant variable in some locations, in most cases it was not. This is likely the result of the interplay of many factors including interaction effects between temperature and precipitation, soil water content, potential evapotranspiration, and vegetation type. Though a complex crop model incorporating such factors may be able to explain the relationship between NDVI and temperature, NDVI alone does not reliably reflect instances of extreme heat across a wide variety of locations. This is not to say that NDVI cannot capture vegetation stress resulting from extreme heat in particular geographic areas.
The results of this study did not find a consistent relationship between NDVI and crop yields across study sites. Though the model had low coefficients of determination in some locations, in most it was able to effectively account for a large proportion of the variation in NDVI. Though all the sites in this study were composed predominantly of farmland with little to no irrigation, many locations are growing multiple crops, others consist primarily of pasture, and many contain forested areas. We did not include data on technology changes or crop rotation, which have been shown to make NDVI less reliable in detecting trends in vegetation conditions over time (Atwood et al. 2005). Furthermore, using county-level data may have weakened expressed relationships, as there are certain to be discrepancies between NDVI pixel location and county lines. Nonetheless, our results indicate that the relationship between NDVI and crop yields at the single-pixel level is not strong enough to remain undisturbed amid changes in these sorts of factors.
Several studies have found NDVI is most effective when consolidating pixels covering an expansive area of land (du Plessis 1999; Rowley et al. 2007). Some have found NDVI to be most successful at capturing vegetative condition when estimated over an extended period of time (15 months, in the case of Wang et al. 2003) or when smoothing techniques are employed (du Plessis 1999). While these papers point to a strong underlying relationship between NDVI and vegetation condition, our results suggest NDVI is not particularly useful when looking at the smaller-scale, seasonal information relevant to index-based crop insurance products. This is consistent with other studies assessing the possibilities of using NDVI for index-based agricultural insurance, which found significant differences between farm-level experiences and NDVI values (Rowley et al. 2007; Makaudze and Miranda 2010). This is further evidenced by the weak ability of NDVI to capture years of extreme crop loss across locations, as seen by the historical burn analysis presented.
This study was not able to find a reliable relationship between NDVI and precipitation or between NDVI and extreme heat that would justify the use of NDVI as a substitute for meteorological data in the development of index-based insurance products. While NDVI may be able to reliably reflect rainfall and extreme heat patterns in some locations, notably grasslands and pastures, the relationship is not as clear in mixed vegetation environments including cropland.
Finally, we do not believe that the results of this study are the end of the story. This paper was prepared as a caveat on the general applicability of NDVI for insurance purposes, no more and no less. One cannot deny the efficacy of NDVI as reported in Chantarat et al. (2009) or the positive results reported by the FESA project in Tanzania. More research is needed exploring the time period and area size best suited to maximizing the strength of the relationship between NDVI, meteorological variables, and crop yields. Examining the relationship between NDVI during critical stages of crop growth and crop yields may provide useful information for insurance design.
Index-based insurance is a generic term used to describe insurance policies with payoffs contingent on the value of some underlying index such as temperature, rainfall, or in the case of the present study, NDVI. Index insurance is generally targeted toward volumetric risk such as crop yields in agriculture, or electricity or natural gas demand in the energy sector.
Indeed, to be fair to the reader, the original intent of this research was to use the abundant data in the United States to investigate the efficiency of NDVI as an index for crop insurance in China, where data are limited and expensive. The reasoning was that if the general conditions for insurability held generally in the United States, they would likely hold true in China as well. Based on the results reported in this paper we were forced to rethink the strategy.
We selected this model after exploring both linear and log-linear models, since prior studies have suggested that these describe the relationship between NDVI and precipitation. However, these studies were describing relationships within a specific climate regime, and where grassland was the predominant ground cover. For the variety of locations we are assessing, the quadratic model presents a more comprehensive method of exploring the relationship between NDVI, cumulative precipitation, and growing degree-days. Initial regressions included mean temperature in addition to these variables, but mean temperature was not a significant variable in any of the locations, nor did it appear to contribute additional information relating to temperature, as this study is most interested in cases of extreme heat, which are best captured using the growing degree-day measure.
Although this yield regression may appear to suffer from endogeneity and multicollinearity, this would be true only if we found strong statistical relationships between precipitation and heat in the NDVI equation previously discussed. This was not the case. That we find no consistent relationship between heat and precipitation suggests that the yield regressions will be only minimally biased; but in a broader sense we cannot say that all regressions are free of bias or are efficient. However, our results are observational and not predictive, and with the intent of uncovering general properties we can tolerate bias and inefficiencies in some equations, so long as they do not hold in all equations.
Burn analysis is a common insurance term that simply means that frequencies of events are calculated directly from the historical record without reference to a parametric form of probability distribution.