## Abstract

The short-wavelength infrared bands of the Thermal and Near-Infrared Sensor for Carbon Observation (TANSO)–Fourier transform spectrometer (FTS) instrument on board the *Greenhouse Gases Observing Satellite* (*GOSAT*) have degraded, which affects the retrieval of CO_{2} and CH_{4}. Herein, we present a new algorithm that uses principal component analysis (PCA) to evaluate these degradations from on-orbit solar calibration spectra. The sets of spectra were decomposed using PCA, and the temporal variations of their components were fitted with appropriate functions. Our results show that PCA is effective for constructing a suitable degradation model for TANSO-FTS. Comparisons of CO_{2} data retrieved using the new degradation model with those from ground-based FTS measurements indicate that the new model reduces the measurement biases.

## 1. Introduction

A Japanese satellite, the *Greenhouse Gases Observing Satellite* (*GOSAT*), was the first spacecraft specifically designed to monitor greenhouse gases such as carbon dioxide (CO_{2}) and methane (CH_{4}) (Yokota et al. 2009). It was launched on 23 January 2009 and has operated continuously for more than 10 years. Since then, several other satellites have been launched to monitor greenhouse gases. The Thermal and Near-Infrared Sensor for Carbon Observation (TANSO) on board *GOSAT* consists of two sensors. The Fourier transform spectrometer (FTS) is the main sensor used to measure CO_{2} and CH_{4} concentrations. The FTS has three bands (bands 1–3) in the short-wavelength infrared (SWIR) region; they are centered at 0.76, 1.6, and 2.0 *μ*m, respectively. One additional band (band 4) covers the thermal infrared (TIR) region from 5.5 to 14.3 *μ*m. The spectral-sampling steps are approximately 0.2 cm^{−1} for each band. The SWIR bands measure the orthogonal polarizations “P” and “S.” The other sensor is the Cloud and Aerosol Imager, which is used to discriminate scenes contaminated with clouds and aerosols; it also has four bands, at 0.38, 0.67, 0.87, and 1.6 *μ*m. The *GOSAT* project provides two types of level 2 (L2) products: the column-averaged dry-air mole fractions of CO_{2} and CH_{4} (called XCO_{2} and XCH_{4}, respectively) and the vertical profiles of those gases. XCO_{2} and XCH_{4} are retrieved from the SWIR bands (Yoshida et al. 2011, 2013), and the profiles of CO_{2} and CH_{4} are retrieved from the TIR band (Saitoh et al. 2009).

It is known that the radiometric responsivities of the SWIR bands of TANSO-FTS have degraded relative to the prelaunch calibration (Kuze et al. 2011; Yoshida et al. 2012; Kuze et al. 2014). The degradations must therefore be corrected when processing the SWIR spectral data. The current retrieval system (Yoshida et al. 2013) is based on the maximum a posteriori solution, which minimizes the difference between the observed and calculated spectra. Because modification of the optical pathlength by atmospheric particles and surface reflectance has a large impact on the retrieval, it must be considered in the radiative transfer calculation. Moreover, we use the entire ranges of the absorption bands in the spectral fitting. Therefore, a radiometric calibration covering these entire wavenumber ranges is necessary for an accurate retrieval. Yoshida et al. (2012) have previously considered these degradations in the SWIR L2 processing. They evaluated the degradations using the on-orbit solar calibration data for the first 2.8 years, spanning April 2009–December 2011, and constructed a degradation model that predicts the relative degradation of each spectral band and polarization at 50 cm^{−1} increments.

In September 2014, *GOSAT* experienced an anomaly in the primary pointing system, which caused large pointing errors and instability (see section 2.3 of Kuze et al. 2016). The pointing system was therefore switched from the primary system (PM-A) to the backup system (PM-B) on 26 January 2015. Because this switch changed the angle of the light path relative to the diffuser plate used for solar calibration, it has become necessary to evaluate this effect by considering the bidirectional reflection distribution function (BRDF) of the diffuser plate. Moreover, inspection of the solar calibration spectra shows that the degradation consists of several components whose contributions vary independently with time, as shown in section 3.

Recent accuracy requirements for satellite-based CO_{2} measurements to enable accurate local- and regional-scale CO_{2} inversion are much tighter than 1 ppm. However, as shown in section 4, comparison studies show that the XCO_{2} from *GOSAT* still has some biases and that they depend on time. These biases have become larger in recent years; this can be partly attributed to the responsivity degradation of the FTS. The aim of this study is to reduce the uncertainty caused by the radiometric responsivity degradations. We therefore propose a new algorithm based on principal component analysis (PCA) to evaluate the degradations and construct a new degradation model for TANSO-FTS that is applicable to the entire period of the *GOSAT* observations.

## 2. Data and methodology

### a. Data selection

In this study, we evaluated the radiometric responsivity of TANSO-FTS using the on-orbit solar calibration data of the product version V210.210. The spectra in units of V cm were converted to units of W cm^{−2} sr^{−1} cm^{−1} using the conversion coefficients provided by the *GOSAT* project (https://data2.gosat.nies.go.jp/doc/document.html#TechnicalInformation). Our data selection strategy follows that of Yoshida et al. (2012). We obtained solar calibration data during the times when *GOSAT* was passing over Earth’s North Pole. The solar calibration data are usually measured using the front side of the diffuser plate, and the back side is used once a month (Kuze et al. 2011). As in Yoshida et al. (2012), we used only the data obtained with the back side of the plate because these data are less affected by degradation of the plate itself. In Yoshida et al. (2012), the data were selected based on the angle between the vector from Earth to the satellite and that from Earth to the sun to avoid contamination from the terrestrial atmosphere (see Figs. 3 and 4 of Yoshida et al. 2012). Accordingly, we selected the data obtained when this angle was 105° ± 0.5°. We analyzed the data obtained from the launch to the end of 2017. The number of selected observations is 305. In the analysis, the incident angle of the solar radiation to the diffuser plate (*θ*_{in}) must be considered because the angle changes seasonally with the relative positions of the satellite, Earth, and sun. The intensity of solar radiation reflected by the plate varies with this angle according to its BRDF (Yoshida et al. 2012).
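The angle-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`side`, `sat_vec`, `sun_vec`) and the function names are hypothetical and are not taken from the *GOSAT* product format.

```python
import numpy as np

def sun_earth_satellite_angle(sat_vec, sun_vec):
    """Angle (degrees) between the Earth-to-satellite and Earth-to-sun vectors."""
    sat = np.asarray(sat_vec, dtype=float)
    sun = np.asarray(sun_vec, dtype=float)
    cos_angle = np.dot(sat, sun) / (np.linalg.norm(sat) * np.linalg.norm(sun))
    # Clip guards against rounding slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def select_calibration(records, center_deg=105.0, half_width_deg=0.5):
    """Keep back-side records whose angle lies within center +/- half width.

    Each record is a dict with hypothetical keys 'side', 'sat_vec', 'sun_vec'.
    """
    selected = []
    for rec in records:
        if rec["side"] != "back":
            continue
        angle = sun_earth_satellite_angle(rec["sat_vec"], rec["sun_vec"])
        if abs(angle - center_deg) <= half_width_deg:
            selected.append(rec)
    return selected
```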

### b. Principal component analysis

In this work, we applied PCA to the spectra in order to extract the characteristics of the spectral dependencies. PCA is widely used to reduce the dimensions of a dataset by transforming it into a new dataset constructed with uncorrelated variables, called principal components (PCs) (Jolliffe 2002). PCA is an effective tool for exploring data to seek interesting characteristics and to determine how many significant modes of variability exist. PCs are easily obtained by using singular value decomposition (SVD). If there are *N* samples (here, *N* is the number of spectra, each at an associated time *t*) of *P* parameters (here, *P* is the number of spectral or pseudospectral points per band/polarization), the SVD of an *N* × *P* dataset matrix $X$ is given by

$$X = U\boldsymbol{\Sigma}P^{\mathrm{T}}, \tag{1}$$

where $U$ and $P$ are *N* × *M* and *P* × *M* matrices whose columns are the left and right singular vectors, respectively; **Σ** is an *M* × *M* diagonal matrix; and *M* is the number of considered components, i.e., *M* ≤ *P*. The columns of $P$ and the diagonal elements of **Σ** are, respectively, the eigenvectors and eigenvalues of $X$^{T}$X$. The matrix $Z$ of PCs is defined by

$$Z = \boldsymbol{\Sigma}P^{\mathrm{T}}. \tag{2}$$

Moreover, the *i*th PC vector is the *i*th row of $Z$. Hence, the columns of $P$ and the diagonal elements of **Σ** are the unit vectors and magnitudes of the PCs. The matrix of PC scores, which gives the values of the projections on the PCs, is defined as $T$ = $XP$, which is also obtained from

$$T = U\boldsymbol{\Sigma}. \tag{3}$$

The proportion of the total variation due to the *i*th PC is given by

$$\frac{\lambda_{i}}{\mathrm{tr}(\boldsymbol{\Sigma})}, \tag{4}$$

where *λ*_{i} is the *i*th eigenvalue and tr(**Σ**) is the trace of matrix **Σ**. A representation of $X$ in terms of a subset of *m* components can be written as

$$X' = T'P'^{\mathrm{T}}, \tag{5}$$

where $X$′ is the reconstructed dataset, $P$′ is the matrix composed of the first to *m*th eigenvectors, and $T$′ holds the corresponding first *m* columns of $T$. The following section describes the procedure used to construct the new degradation model $X$′.
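The decomposition and truncated reconstruction described in this subsection can be sketched with NumPy as follows. This is a minimal illustration, not the authors' code: the function name `pca_svd` is hypothetical, no mean centering is applied (the text does not mention it), and the proportions are computed from the diagonal of **Σ** as the text defines them. A variance-based convention would use the squared singular values instead.

```python
import numpy as np

def pca_svd(X, m):
    """Decompose X (N samples x P points) and rebuild it from m components."""
    # Thin SVD: X = U @ diag(s) @ Pt, with the rows of Pt as eigenvectors.
    U, s, Pt = np.linalg.svd(X, full_matrices=False)
    P = Pt.T                     # columns are the eigenvectors
    T = U * s                    # PC scores, T = U Sigma (equal to X @ P)
    proportion = s / s.sum()     # diagonal of Sigma over its trace
    X_rec = T[:, :m] @ P[:, :m].T  # reconstruction from the first m components
    return P, T, proportion, X_rec
```

Calling `pca_svd(X, m)` with `m` equal to the rank of `X` returns `X` itself up to rounding; smaller `m` gives the truncated model.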

## 3. Constructing the degradation model

### a. Normalizing and interpolating spectra

First, we normalized the observed spectra using the calculated spectra as

$$R = \frac{S_{\mathrm{obs}}}{S_{\mathrm{calc}}}, \tag{6}$$

where $S$_{obs} and $S$_{calc} are the observed and calculated spectra, respectively. The calculated spectra are expressed as follows:

where *ν* is the wavenumber, *θ* and *ϕ* are the incident and relative azimuth angles to the diffuser plate, *t* is the number of days after launch, *F*_{SUN} is the solar irradiance, *R* is the distance between the sun and Earth in astronomical units, OPT is the optical efficiency of TANSO-FTS without the diffuser plate, and subscripts *P* and *S* indicate the polarization components (Yoshida et al. 2012). The wavenumber shifts introduced by the Doppler effect and by the instabilities of the FTS mechanism reported in Kuze et al. (2016) were considered in the calculated spectra. Each normalized spectrum at the native full spectral resolution was interpolated using spline fitting with knot points every 25 cm^{−1}, which is twice as fine as in Yoshida et al. (2012). The numbers of wavenumber points are 13, 25, and 18 for bands 1, 2, and 3, respectively. Figure 1 shows an example of the normalized-and-interpolated spectra for different incident angles on four close dates. The spectral ratios differ with the incident angle. In addition, there are significant spectral dependencies, which differ from band to band. For example, in the panel for band 1P, the difference between the blue line and the other lines varies strongly with wavenumber, and similar tendencies are observed in the other bands. These variations do not depend on the number of days after launch, and their causes are unknown. One of our major purposes in applying PCA to the spectra is to distinguish the components whose contributions vary independently with time and to evaluate these components separately to construct a degradation model.
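The normalization and knot-based reduction can be illustrated as follows. The order of the spline used in the study is not stated, so this sketch assumes a degree-1 (piecewise-linear) spline fitted by least squares; the function names and the synthetic wavenumber grid are hypothetical.

```python
import numpy as np

def normalize(s_obs, s_calc):
    """Ratio of observed to calculated spectrum."""
    return np.asarray(s_obs, dtype=float) / np.asarray(s_calc, dtype=float)

def fit_linear_spline(nu, ratio, knots):
    """Least-squares fit of a piecewise-linear spline; returns values at knots.

    Column j of the design matrix is the hat function that equals 1 at
    knots[j] and 0 at every other knot (built with np.interp).
    """
    identity = np.eye(len(knots))
    basis = np.column_stack([np.interp(nu, knots, row) for row in identity])
    coeffs, *_ = np.linalg.lstsq(basis, ratio, rcond=None)
    return coeffs
```

For example, with `knots = np.arange(12950.0, 13251.0, 25.0)` the fit returns one value per 25 cm^{−1} knot, which then forms one row of the dataset passed to the PCA.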

### b. Decomposing the dataset into PC scores and eigenvectors

We defined the dataset matrix $X$ in Eq. (1) as the set of differences between the normalized-and-interpolated spectra and a reference spectrum. Here, we defined the reference spectrum as the normalized-and-interpolated spectrum observed on 4 March 2009 (40 days after launch), when the incident angle to the diffuser plate was *θ*_{in} = 33°, so that $R' = R/R_{\mathrm{ref}}$. We decomposed $X$ according to Eq. (1) to derive quantities such as the eigenvectors and PC scores. Figure 2 shows the first to fifth eigenvectors for each band. The first eigenvector for each band has less wavenumber dependency than the others because it is caused by variations in the light intensity entering the instrument due to the change of *θ*_{in} (see Fig. 4 for more details). However, the remaining eigenvectors have significant wavenumber dependencies. Figure 3 shows the cumulative proportions of the PCs. The contributions of the components differ for each band. The proportions of the first PC, defined by Eq. (4), for bands 1P and 1S exceed 0.9. For the other bands, the contribution of the first PC is more than 0.6.

### c. Correcting for angular dependencies

As shown in Eq. (5), the contributions of the eigenvectors to the dataset are expressed by the PC scores. Figure 4 shows the temporal variations of the first to sixth PC scores for band 1P. The first PC score changes exponentially with time, and its annual variation corresponds to that of *θ*_{in}. This correspondence is also seen in some of the other components, such as PC3. Moreover, for several components, the variation of the PC scores with *θ*_{in} shown in Fig. 4 changed at the switch from PM-A to PM-B (i.e., at 2194 days after launch). This is because the output angle, which is the angle between the reflected radiation and the normal vector of the diffuser plate, differs between PM-A and PM-B (Kuze et al. 2016), and the intensity of the reflected radiation depends on this angle through the BRDF of the diffuser plate. These dependencies arise because the diffuser plate is not Lambertian; its reflection has an angular dependency. To evaluate the degradation of the FTS, it is necessary to correct for these variations due to changes in *θ*_{in} and the output angle.

We performed the correction for the switch from PM-A to PM-B as follows. We first fitted the variations of the PC scores with linear functions of cos*θ*_{in} for each period. Here, we used the data obtained in 2014 for PM-A and in 2015 for PM-B. We define the correction factor as

$$C(\theta_{\mathrm{in}}) = \frac{f_{\text{PM-A}}(\cos\theta_{\mathrm{in}})}{f_{\text{PM-B}}(\cos\theta_{\mathrm{in}})}, \tag{8}$$

where *f* is the fitting function for each period. We obtained the corrected PC scores for PM-B by multiplying the values for PM-B by this factor for the *θ*_{in} corresponding to each observation.
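A minimal sketch of this correction is given below. It assumes the factor is the ratio of the PM-A fit to the PM-B fit, evaluated at the same cos*θ*_{in}, which is consistent with multiplying the PM-B scores by the factor. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def pm_switch_factor(cos_a, score_a, cos_b, score_b):
    """Return a function giving the PM-A / PM-B correction factor.

    cos_a, score_a: cos(theta_in) and PC scores from the PM-A period.
    cos_b, score_b: the same quantities from the PM-B period.
    """
    # Degree-1 fits: each PC score is modeled as linear in cos(theta_in).
    f_a = np.poly1d(np.polyfit(cos_a, score_a, 1))
    f_b = np.poly1d(np.polyfit(cos_b, score_b, 1))
    return lambda cos_theta: f_a(cos_theta) / f_b(cos_theta)
```

The corrected PM-B scores are then `score_b * factor(cos_b)`, where `factor` is the returned function. Note that a ratio of this kind becomes unstable wherever the PM-B fit approaches zero.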

After making this correction, we corrected the *θ*_{in} dependencies for the entire period. As shown in Fig. 4, the PC scores vary with the incident angle. We fitted the variations attributed to the differences in *θ*_{in} with second-order polynomials in cos*θ*_{in} using the data obtained on 4 March 2009, as in Yoshida et al. (2012):

$$\frac{r(\nu, \theta_{\mathrm{in}})}{r(\nu, \theta_{0})} = a\cos^{2}\theta_{\mathrm{in}} + b\cos\theta_{\mathrm{in}} + c, \tag{9}$$

where *r* is the diffuser plate reflectance, *ν* is the wavenumber, and *θ*_{0} is the reference incident angle. The coefficients *a*, *b*, and *c* were obtained for each band. We corrected the PC scores to those corresponding to *θ*_{0} = 33° using the fitting functions.
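The angular correction can be sketched as below. How the fitted polynomial is applied to the PC scores is an assumption here: the sketch divides each value by the fitted response relative to the response at *θ*_{0}. The function names are hypothetical.

```python
import numpy as np

def fit_angle_dependence(theta_deg, values, theta0_deg=33.0):
    """Fit a second-order polynomial in cos(theta) and normalize it at theta0.

    Returns a function giving the response at an angle relative to theta0.
    """
    cos_theta = np.cos(np.radians(theta_deg))
    poly = np.poly1d(np.polyfit(cos_theta, values, 2))
    reference = poly(np.cos(np.radians(theta0_deg)))
    return lambda theta: poly(np.cos(np.radians(theta))) / reference

def correct_to_reference(values, theta_deg, relative_response):
    """Rescale values to what they would be at the reference angle (assumed form)."""
    return np.asarray(values, dtype=float) / relative_response(np.asarray(theta_deg))
```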

### d. Fitting the temporal variations of the PC scores

We fitted the temporal variations of the corrected PC scores discussed in the previous sections with appropriate functions. The functions we used are listed in Table 1, where *t* is the time and the other quantities are the coefficients of the functions. Each corrected PC score was fitted with all of these functions, and the function with the smallest residual was adopted as the fitting function for that PC score. Here, we used only the data with *θ*_{in} < 35°, following Yoshida et al. (2012). Of the data selected in section 2a, 180 observations satisfy this condition. Figure 5 shows the time series of the angle-corrected PC scores and the fitting functions for band 1P, as in Fig. 4. The angular dependency was successfully eliminated, and the functions fit the data well, especially for the first PC.
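The selection of a fitting function by smallest residual can be illustrated as follows. The candidate functions here (a straight line and a single exponential with a grid-searched time constant) are hypothetical stand-ins, since the actual candidates are those in Table 1; the grid of time constants is likewise an assumption.

```python
import numpy as np

def fit_linear(t, y):
    """Least-squares fit of y = a + b t; returns the fitted curve."""
    A = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

def fit_exponential(t, y, taus):
    """Fit y = a + b exp(-t / tau), scanning tau over a grid.

    For each tau the model is linear in (a, b), so ordinary least squares
    applies; the tau with the smallest residual sum of squares is kept.
    """
    best_rss, best_fit = None, None
    for tau in taus:
        A = np.column_stack([np.ones_like(t), np.exp(-t / tau)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        fit = A @ coef
        rss = float(np.sum((y - fit) ** 2))
        if best_rss is None or rss < best_rss:
            best_rss, best_fit = rss, fit
    return best_fit

def select_best(t, y, taus=np.arange(100.0, 3001.0, 100.0)):
    """Return the name of the candidate with the smallest residual, and all residuals."""
    fits = {"linear": fit_linear(t, y), "exponential": fit_exponential(t, y, taus)}
    rss = {name: float(np.sum((y - fit) ** 2)) for name, fit in fits.items()}
    return min(rss, key=rss.get), rss
```

Comparing raw residuals favors functions with more free parameters, so in practice the candidate set should be kept small or the comparison penalized for complexity.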

### e. Reconstructing the radiometric responsivity degradations

The eigenvectors corresponding to the PC scores for which the cumulative proportions exceed 0.95 are stored as columns of the matrix $P$′, and we calculated the reconstructed dataset $X$′ according to Eq. (5). The numbers of components we used for the reconstructions were 3, 2, 13, 16, 9, and 13 for band 1P, band 1S, band 2P, band 2S, band 3P, and band 3S, respectively. The reconstructed degradations [represented in the form of Eq. (5)] using these fitting curves are shown in Figs. 6–8. The dots are the *θ*_{in}-corrected observation data. The time series increases only for band 3P around 5200 cm^{−1}. This finding is the same as in previous reports (Yoshida et al. 2012), but the reasons are still unknown. It seems that the temporal change in the reflection property of the diffuser plate is caused by the exposure to solar radiation because similar behavior was observed when the front side of the diffuser plate was used. Although we cannot determine the cause of this, one possible cause is a volatile on the detector. If a volatile substance that prevents exposure to radiation adheres to the detector and slowly evaporates, the sensor detectability likely increases. For the other bands, the values decreased exponentially. These are generally similar to the results reported in Yoshida et al. (2012); however, there are certain differences in the details.
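The choice of the number of components from the cumulative proportion can be sketched as follows. The function name is hypothetical, and the proportions are computed from the diagonal of **Σ** as in Eq. (4).

```python
import numpy as np

def n_components(sigma_diag, threshold=0.95):
    """Smallest number of leading components whose cumulative proportion
    exceeds the threshold.

    sigma_diag: diagonal elements of Sigma, sorted in descending order.
    """
    s = np.asarray(sigma_diag, dtype=float)
    cumulative = np.cumsum(s) / s.sum()
    # side="right" finds the first index whose cumulative value is > threshold.
    index = int(np.searchsorted(cumulative, threshold, side="right"))
    return min(index + 1, len(s))
```

The retained eigenvectors and the fitted PC scores are then combined as in Eq. (5) to give the reconstructed degradation for each band and polarization.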

We scaled the model using the absolute degradation factor based on the vicarious calibration on 29 June 2009 (Kuze et al. 2011). The absolute degradation factor was estimated from the spectra observed by *GOSAT* and those calculated using surface and atmospheric parameters measured in situ. The 2D plots of the scaled degradation model are shown in Fig. 9. This corresponds to $X$′ in Eq. (5) scaled with the absolute degradation factor. Figure 10 shows the differences between the degradation model constructed in this study and that of Yoshida et al. (2012), which is used in the current retrieval system. In this figure, a value of 0.01 corresponds to a 1% change in radiance. For all bands, the values with the new model are slightly higher than those with the current model in the early period. However, the values with the new model decrease rapidly, in particular for band 2P. The exception is band 3P. Moreover, there are some wavenumber dependencies in the differences. The decreases are relatively slow at approximately 13 100–13 150 cm^{−1} in band 1 and 5900 cm^{−1} in band 2. Overall, the values with the new model are larger than those with the current model in band 3P.

## 4. Application to CO_{2} retrieval

In this section, the performance of the retrievals using the proposed degradation model is evaluated. The new model was applied to the retrieval system described in Yoshida et al. (2013), and CO_{2} retrievals were performed. The processing and screening procedure is the same as for the official product (Yoshida et al. 2017). To evaluate the bias and scatter of the retrieved XCO_{2}, XCO_{2} data from the ground-based FTS network, the Total Carbon Column Observing Network (TCCON) (Wunch et al. 2011), are used. The XCO_{2} products from satellites, such as *GOSAT* and *Orbiting Carbon Observatory-2* (*OCO-2*), are validated with TCCON measurements (Morino et al. 2011; Wunch et al. 2017). The comparison here follows the validation studies of *GOSAT* and *OCO-2*. For comparison, we selected the retrieved *GOSAT* XCO_{2} within a ±2° latitude–longitude box centered at each TCCON site and paired it with the mean value of the TCCON data measured within ±30 min of the *GOSAT* overpass time. We selected the five TCCON sites listed in Table 2 by considering the quantity of matchup data and the corresponding observation period. Here, we show the results for the common data that are successfully retrieved with both the current and new models. The results from all the data are shown in appendix A.
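The matchup criteria and the bias and scatter statistics can be sketched as follows. The data layout and function names are hypothetical, and longitude wraparound at ±180° is ignored for brevity.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev

def matchups(gosat, tccon, site_lat, site_lon,
             box_deg=2.0, window=timedelta(minutes=30)):
    """Pair each GOSAT retrieval near a site with the mean TCCON value.

    gosat: list of dicts with keys 'lat', 'lon', 'time', 'xco2'.
    tccon: list of (time, xco2) tuples measured at the site.
    """
    pairs = []
    for obs in gosat:
        inside_box = (abs(obs["lat"] - site_lat) <= box_deg
                      and abs(obs["lon"] - site_lon) <= box_deg)
        if not inside_box:
            continue
        nearby = [x for (t, x) in tccon if abs(t - obs["time"]) <= window]
        if nearby:
            pairs.append((obs["xco2"], mean(nearby)))
    return pairs

def bias_and_scatter(pairs):
    """Mean and sample standard deviation of GOSAT minus TCCON (needs >= 2 pairs)."""
    diffs = [g - t for g, t in pairs]
    return mean(diffs), stdev(diffs)
```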

Comparisons of the retrieved XCO_{2} values are shown in Table 3. Although TANSO-FTS normally observes with gain *H* over land, gain *M* is used over bright surfaces such as deserts. For Pasadena and JPL, the comparisons are separated for gain *H* and gain *M* because the matchup data include both gains. Although the biases and standard deviations for Tsukuba and Wollongong obtained using the new model are larger than those obtained using the current operational model, those for the other sites are smaller with the new model. For gain *H* and gain *M*, the biases improved from −0.44 and −0.70 ppm to −0.25 and −0.41 ppm, respectively. In total, the bias improved from −0.49 to −0.28 ppm. The standard deviation changed only slightly, from 2.18 to 2.14 ppm, indicating that the degradation model has less impact on the precision of the CO_{2} retrieval. Figure 11 shows the temporal variations of the differences between XCO_{2} from *GOSAT* and TCCON. For both the current and new models, negative biases are observed, except for 2013, and the biases become larger with time after 2013. The biases from the new model between 2010 and 2012 are slightly larger than those from the current model. However, the biases from the new model are smaller than those from the current model after 2013, and the improvements become larger with time. This is the main cause of the improvement of the total bias. The improvement of the biases for gain *M* is larger than that for gain *H* because the data for gain *M* have been available only in the recent period. See appendix B for additional temporal variation comparisons.

According to Kuze et al. (2011), the uncertainty of the vicarious calibration is estimated at approximately 7%, including 2% that is related to solar irradiance. They also mentioned that the required uncertainty of the prelaunch absolute calibration is 5% to achieve an XCO_{2} retrieval accuracy of 1 ppm. For example, as shown in Fig. 11, the improvement in bias in 2018 is approximately 0.5 ppm. Note that Kuze et al. (2011) addressed the absolute calibration, whereas this study addresses the relative calibration that corrects temporal variations. Nevertheless, the improvement in 2018 corresponds to approximately 2.5% in terms of radiometric calibration uncertainty.
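The 2.5% figure follows from a simple proportion, assuming that the XCO_{2} error scales linearly with the radiometric calibration uncertainty (5% corresponding to 1 ppm); this linearity is an assumption made here for illustration:

$$0.5\ \mathrm{ppm} \times \frac{5\%}{1\ \mathrm{ppm}} = 2.5\%.$$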

## 5. Summary and conclusions

In this study, we have proposed a new scheme for evaluating sensor degradations and have applied it to TANSO-FTS on board *GOSAT*. We used the on-orbit solar calibration data of *GOSAT* for the evaluation. The new scheme employs PCA to distinguish and separately evaluate the temporal variations of each independent component. The sequence of analysis is as follows: 1) Normalize and interpolate the spectra using spline interpolation. 2) Decompose the sets of spectra into PCs and eigenvectors using SVD. 3) Correct the PC scores for their dependence on the incident angle. 4) Fit the corrected PC scores with appropriate functions. 5) Reconstruct the degradations from the eigenvectors and the fitting curves.

Although the models we obtained were generally similar to those of Yoshida et al. (2012), there are some differences, especially in the recent period. We applied the newly constructed degradation model to the CO_{2} retrieval system and compared the results with TCCON measurements at selected sites. In total, the bias obtained using the new model improved from −0.48 to −0.28 ppm; in particular, the biases after 2013 improved. The standard deviations only slightly changed from 2.17 to 2.14 ppm. We plan next to apply this model to the operational SWIR retrieval algorithm to generate the next version of the L2 product.

## Acknowledgments

This work was funded by the National Institute for Environmental Studies *GOSAT* and *GOSAT-2* project. The authors thank Prof. Wennberg, Prof. Griffith, and Dr. Morino for operating the TCCON measurements. The TCCON site at Tsukuba is supported in part by the *GOSAT* series project. The authors would also like to thank Dr. Shiomi and Dr. Kikuchi of JAXA for their helpful comments on the results. Retrievals using the new degradation model were conducted at the *GOSAT-2* Research Computation Facility. The authors have no conflicts of interest, financial or otherwise, related to this study.

### APPENDIX A

#### Comparisons of XCO_{2} Difference for All the Data

In section 4, we showed the comparisons of the XCO_{2} from *GOSAT* with the TCCON measurements for the common data that are successfully retrieved using both the current and new degradation models. Here, we show the results from all the retrieved data for both cases in Table A1. The biases and standard deviations are similar to those listed in Table 3. The number of data from the new model decreased by approximately 7% because more data were screened out. The *GOSAT* SWIR L2 data are screened using an empirical threshold on the spectral residuals after the retrieval processing. The decrease in the number of data is caused by increased residuals in band 1 for some of the data. Therefore, the number of data should improve if the threshold is reexamined in the future.

### APPENDIX B

#### Temporal Variations of the XCO_{2} Differences for Each TCCON Site

## REFERENCES

*IEEE Trans. Geosci. Remote Sens.*

*IEEE Trans. Geosci. Remote Sens.*

*Atmos. Meas. Tech.*

*Atmos. Meas. Tech.*

_{2}retrieval algorithm for the thermal infrared spectra of the Greenhouse Gases Observing Satellite: Potential of retrieving CO

_{2}vertical profile from high-resolution FTS sensor

*J. Geophys. Res.*

*Philos. Trans. Roy. Soc.*

_{2}measurements with TCCON

*Atmos. Meas. Tech.*

_{2}and CH

_{4}retrieved from GOSAT: First preliminary results

*SOLA*

_{2}and CH

_{4}column abundances from short-wavelength infrared spectral observations by the Greenhouse Gases Observing Satellite

*Atmos. Meas. Tech.*

*Atmos. Meas. Tech.*

_{2}and XCH

_{4}and their validation using TCCON data

*Atmos. Meas. Tech.*

_{2}, CH

_{4}and H

_{2}O column amounts retrieval from GOSAT TANSO-FTS SWIR. National Institute for Environmental Studies Algorithm Theoretical Basis Doc. NIES-GOSAT-PO-014, 88 pp.
