## 1. Introduction

Radiative transfer modeling is crucial in determining the radiation budget for weather/climate prediction models and for remote sensing of the earth and atmosphere. Demands for accurate radiation modeling have accompanied recent advances in finescale atmospheric modeling to resolve clouds, and in satellite measurements by sensors with improved spatial and spectral resolution. Monte Carlo photon transport algorithms are often employed to simulate realistic radiation processes, such as three-dimensional (3D) radiative transfer (e.g., Barker et al. 1998; O’Hirok and Gautier 1998; Macke et al. 1999). Numerous studies have revealed that 3D radiative transport is particularly important when examining radiation processes at a cloud-resolving scale (Várnai and Davies 1999; Fu et al. 2000). Adequate treatments of 3D radiative effects are important for pixel-scale satellite retrieval of cloud properties such as optical thickness and effective droplet radius; radiative effects can also impact global-scale climatology (Iwabuchi and Hayasaka 2003; Cornet et al. 2005). The Monte Carlo model is useful for calculating path-length statistics when retrieving amounts of gaseous species using differential optical absorption spectroscopy (DOAS) techniques (e.g., Hönninger et al. 2004).

Several tests and a physically correct basis have made the accuracy of the Monte Carlo model widely recognized. Thus, this model has frequently been used to validate other models (e.g., Evans 1998). Although the Monte Carlo method is associated with random noise, in recent decades, simulation accuracy has been improved by increasing computational power, a trend that will likely continue in the future. Nevertheless, practical applications of the Monte Carlo model often encounter difficulties with current computational power, especially when calculating radiances that are essential for inverse modeling of radiative transfer.

A common question that arises regarding the Monte Carlo model relates to the local estimation method used for radiance calculations. The method requires computationally intensive ray tracing at each collision. The use of realistic, strongly peaked phase functions of Mie scattering by cloud and aerosol particles poses another difficulty. Phase functions are used to sample radiance contributions from each scattering event. Computed radiance can be contaminated by significant noise because sharp peaks are often infrequently sampled. This poor sampling problem has been well addressed by Barker et al. (2003), who proposed an approximation that uses the Henyey–Greenstein phase function to multiply scattered light. The approximation can reduce the noise caused by strong spikes in the original phase function, although biases due to the approximation are not negligible, especially for optically thin cases. Barker et al. (2003) also proposed another method, which truncates the spiky radiance contribution and then redistributes the excess to the whole radiance image. The technique seems to work well when incident photon packets are not very large (∼5 × 10^{4} per unit area). However, it is questionable how much the efficiency can be improved and whether the technique of redistribution to the whole domain is generally adequate regardless of the incident number of photons and domain sizes. Furthermore, the issue of radiance noise for an optically thin atmosphere must still be addressed. Studies of variance (noise) reduction techniques have been rare in atmospheric physics. It is thus meaningful to discuss strategies to improve numerical efficiency.

The purpose of this study was to address problems specific to Monte Carlo atmospheric radiative transfer models and to develop efficient algorithms for variance reduction. The nominal target accuracy in this study was about 1% or less for radiance at the unit area scale (pixel scale). Recent optical instruments for remote radiance measurements are highly accurate, and thus theoretical simulations should be comparable to or more accurate than observations. This paper mainly discusses calculations of solar radiance, although the methods and techniques presented are also applicable to calculations of fluxes and heating rates. The paper is organized as follows: section 2 outlines some basic methods implemented in standard Monte Carlo models. In section 3, we propose several techniques for variance reduction. Section 4 demonstrates the performance of the proposed techniques by numerical experiments. Finally, a summary and conclusions are presented in section 5.

## 2. Monte Carlo radiative transfer model

Several model variants of the classic Monte Carlo algorithms for radiative transfer modeling have been developed (e.g., Marchuk et al. 1980; Evans and Marshak 2005). A review of the many possible algorithms is beyond the scope of this paper. Therefore, a model that employs several standard methods is described in this section. During this study, a parallelized Monte Carlo model was developed for multiple purposes. The model was based on the forward-propagating photon-transport algorithm; the model was designed to trace the trajectories of photon packets from radiation sources (solar light, thermal emissions, artificial lamps, laser beams, or a mixture thereof) to termination due to absorption or escape from the top of the atmosphere. Note that some of the methods and techniques used are similar to those in backward-propagating models. The Cartesian coordinate system and a cyclic boundary condition are employed. Figure 1 shows examples of model output: camera image–like, hemispherical plots of radiance viewed under cloudy sky conditions from a point on the surface. The cloud fields were taken from large-eddy simulations. The radiation model simulated realistic 3D radiative effects, such as shadowing and cloud side illumination, even for complex geometry. Basic components of the model are described below.

### a. Fundamentals of the model

The initial location of each photon packet, **r** = (*x*, *y*, *z*)^{T}, and the motion direction are initialized by random numbers. Photon packets are first regionalized in the model domain according to the 3D probability distribution of the radiative source power. Note that when the source is solely solar light, the initialization is simple: the photon packets are distributed uniformly in a horizontal plane at the top of the atmosphere. Each photon packet is also characterized by the order of collision *n* and the energy flux weight *w* at each step of the simulation. The radiative power of a photon packet with unity weight (*w* = 1) is given as […], where *N*_{tot} is the total number of photon packets incident to the whole domain, and *F*_{src} is the vertically integrated source irradiance.

The optical thickness *τ*_{free} to the next collision point is determined from a uniform random number *ρ*_{τ} between 0 and 1, as follows: […], where *m* is a type index for a constituent, which can be a polydispersion of air molecules, aerosols, or hydrometeors (e.g., cloud water, cloud ice, raindrops, snowflakes). A single size-distribution bin computed from a bin microphysics model or derived from in situ observation is also possible for a constituent (Barker et al. 2003), if large computer memory is available. As usual, the extinction coefficient and other optical properties are uniform in each grid cell.
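The free-path step can be sketched as follows. This is a minimal illustration, not the model's actual code: it assumes the usual exponential sampling *τ*_{free} = −ln *ρ*_{τ} (consistent with the later statement that the average of *τ*_{free} is unity) and a one-dimensional row of cells with uniform extinction. The function names and the 1D geometry are hypothetical.

```python
import math
import random


def sample_free_path_optical_thickness(rng):
    """Draw tau_free from an exponential distribution with unit mean."""
    rho = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return -math.log(rho)


def distance_to_collision(tau_free, beta_e_cells, cell_width):
    """Walk through a 1D row of cells with uniform extinction.

    Returns the distance at which the accumulated optical thickness
    reaches tau_free, or None if the packet leaves the row first.
    """
    travelled = 0.0
    remaining = tau_free
    for beta_e in beta_e_cells:
        tau_cell = beta_e * cell_width
        if beta_e > 0.0 and remaining <= tau_cell:
            # The collision happens inside this cell.
            return travelled + remaining / beta_e
        remaining -= tau_cell
        travelled += cell_width
    return None
```

For example, `distance_to_collision(1.5, [1.0, 0.0, 2.0], 1.0)` crosses the first cell (optical thickness 1), passes the empty second cell, and stops a quarter of the way into the third.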

At the collision of order *n* at location **r**_{n}, the weight *w*_{n} is scaled as […], where *ω* is the single scattering albedo. Calculations for small weights (*w* ≪ 1) are less important than those for large weight values. To save computer time, a random number *ρ*_{w} is generated, and the weight is rescaled as follows: […], or *w*′ = 0 otherwise. In other words, the survival of the packet is determined randomly (a so-called Russian roulette method). Note that the total energy is strictly conserved in the formulation for any *W* and that this method can be used at any step of the simulation algorithm. This standard technique can improve efficiency if used properly (Booth 1985; Kawrakow and Rogers 2001). In this paper, the method is used only when *w* < *W*/2 after applying (4). We set *W* = 1, as is usual in atmospheric science modeling (e.g., O’Hirok and Gautier 1998). Note that the photon packet would be “killed” with high probability if *W* were large.
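A minimal sketch of the Russian roulette step is given below. It assumes the standard form of the technique, in which the packet survives with probability *w*/*W* and is then given weight *W*; this form is an assumption that matches the description above, not a transcription of the rescaling equation. The function name is hypothetical.

```python
import random


def russian_roulette(w, W, rng):
    """Return the rescaled weight of a photon packet.

    The roulette is applied only when w < W / 2, as stated in the text.
    The packet survives with probability w / W and then carries weight W,
    so the expected weight after the roulette equals w.
    """
    if w >= W / 2:
        return w
    return W if rng.random() < w / W else 0.0
```

With *W* = 1 and *w* = 0.1, about one packet in ten survives and carries unit weight, so the mean weight over many packets stays close to 0.1.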

At each scattering event, the constituent type (*m*) is determined from a random number according to fractions of the scattering coefficients *β*_{e}(**r**, *m*)*ω*(**r**, *m*) for respective *m*. Subsequently, the scattering angle Θ is determined by solving […], where *ρ*_{Θ} is a uniform random number, and *Q*(Θ; *l*) is a cumulative probability distribution function of the scattering angle with an index *l* as follows: […], where *P* is an azimuthally averaged, normalized phase function. The index *l* can correspond to the location **r** and the type of collision (*m*). In actual implementation, *P* and *Q* are tabulated in lookup tables, and the scattering angle is determined by interpolation from a table. Tables can be made for different media (e.g., molecules, aerosols, or hydrometeors) and for different moments of size distributions (e.g., effective radius and dispersion), or for bins of the size distribution. In any case, the tables should be made with fine angular resolution (ideally 10 000 points or more for radiance computations). After scattering, a new free path should be simulated. Trajectory tracing is then continued until the photon packet escapes from the top of the atmosphere.
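The table lookup can be illustrated with a short sketch. It assumes that the scattering angle solves *ρ*_{Θ} = *Q*(Θ), with *Q* the normalized integral of *P*(Θ) sin Θ, and it uses a Henyey–Greenstein function purely as a stand-in for a tabulated Mie phase function. All function names are hypothetical.

```python
import bisect
import math
import random


def hg_phase(cos_theta, g):
    """Henyey-Greenstein phase function (stand-in for a tabulated Mie function)."""
    return (1.0 - g * g) / (1.0 + g * g - 2.0 * g * cos_theta) ** 1.5


def build_cumulative_table(phase, n_points):
    """Tabulate angles and the cumulative distribution Q on a uniform angle grid."""
    angles = [math.pi * i / (n_points - 1) for i in range(n_points)]
    integrand = [phase(math.cos(a)) * math.sin(a) for a in angles]
    q = [0.0]
    for i in range(1, n_points):
        step = angles[i] - angles[i - 1]
        # Trapezoidal integration of P(theta) * sin(theta).
        q.append(q[-1] + 0.5 * (integrand[i] + integrand[i - 1]) * step)
    total = q[-1]
    return angles, [v / total for v in q]  # normalized so that Q(pi) = 1


def sample_scattering_angle(angles, q, rho):
    """Solve rho = Q(theta) by linear interpolation in the table."""
    i = bisect.bisect_left(q, rho)
    if i == 0:
        return angles[0]
    if i >= len(q):
        return angles[-1]
    frac = (rho - q[i - 1]) / (q[i] - q[i - 1])
    return angles[i - 1] + frac * (angles[i] - angles[i - 1])
```

A quick consistency check is that the sample mean of cos Θ approaches the asymmetry factor *g* of the stand-in function. Sharply peaked Mie functions need a much finer table than the one used for this check.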

### b. The local estimation method

For a direction specified by a zenith angle *θ* and an azimuth angle *ϕ* and for a horizontal section **S** with an area Δ*A*, the area-averaged radiance can be calculated by integrating the contribution from the *n*th event, […], where *ζ*_{n} is given as […], Ψ_{n} is an angular distribution function, *θ*′ and *ϕ*′ are the zenith angle and azimuth angle of the direction before the collision event, and *τ*(**r**_{n}, **r**_{υ}) is the optical thickness between the location **r**_{υ} involved in section **S** (i.e., **r**_{υ} ∈ **S**) and the location **r**_{n} of the *n*th collision. The direction specified by *θ* and *ϕ* should correspond to a vector pointing from **r**_{n} to **r**_{υ}. Calculation of *τ*(**r**_{n}, **r**_{υ}) is a computational burden because it requires ray tracing from **r**_{n} to **r**_{υ} (for each *n*th event). The camera-like image similar to that shown in Fig. 1 is composed of the angle-averaged local radiances (i.e., the radiative quantity at a certain point), which can be calculated in a modified way (Marchuk et al. 1980; Evans and Marshak 2005).

For *n* ≥ 1 (collisions with atmospheric particles or the underlying surface), the function Ψ_{n} is given by […], where *P*_{n} and *R*_{n} are, respectively, the phase function and the bidirectional reflectance distribution function (BRDF) for the *n*th collision, and *α*_{n}(*θ*′, *ϕ*′) is the surface albedo for the direction of incidence (*θ*′, *ϕ*′). The function Ψ_{0} for an isotropically emitting source from a conical angular region **C** with a half-cone angle *θ*_{c} is represented as […], where the direction (*θ*′, *ϕ*′) corresponds to the center of **C**; this form can be used for the solar source. For better precision, we should consider a more realistic angular distribution of emitted radiance in the solar disk.

## 3. Variance reduction techniques

### a. Modification of the local estimate

The local estimation method requires numerically intensive ray tracing from each collision point to the sensor being considered for the optical thickness [*τ*(**r**_{n}, **r**_{υ})] integration [see (10)]. Barker et al. (2003) discussed treatments of small *ζ*_{n}, which could be from small Ψ_{n} and/or large *τ*(**r**_{n}, **r**_{υ}), and suggested that for solar radiance reflected from clouds the population of small *ζ*_{n} was large, while the contribution to total radiance was small. The use of ray-tracing computing time for such a small contribution would be inefficient. Barker et al. (2003) have proposed a method that excludes the contribution of small *ζ*_{n} simply by terminating ray tracing for small *ζ*_{n}. They found that computing time could be reduced by about 10% with negligible total bias if the cutoff threshold was 0.001 in a limited number of cases with clouds. The method is clearly biased if the threshold is too high, so the threshold cannot be set a priori.

A better acceleration method without bias could use Monte Carlo techniques to sample *ζ*_{n}. By this method, sampled *ζ*_{n} is modified to null or a value that is larger than a prescribed threshold *ζ*_{min} using random numbers. The modification is based on concepts similar to those of the Russian roulette method and the random sampling of optical thickness as in (2). Two cases are considered here.

In the first case, when Ψ_{n} is as small as *π*Ψ_{n} ≤ *ζ*_{min}, we sample *ζ*′ instead of *ζ*, […], where *ρ*_{Ψ} is a random number, and *τ*_{free} is a randomly chosen optical thickness as in (2). This small Ψ_{n} case is often observed for highly anisotropic phase functions. For example, for the phase function in Fig. 3, if |cos *θ*| = 1 (zenith/nadir-looking directions) and *ζ*_{min} = 1/4, then *π*Ψ_{n} ≤ *ζ*_{min} for a scattering angle larger than approximately 45°. According to (13), ray tracing is frequently omitted (*ζ*′ = 0) when Ψ_{n} is small as compared to *ζ*_{min} (the omission is judged by *ρ*_{Ψ}). In addition, the ray tracing can be discontinued if the partially integrated optical thickness exceeds *τ*_{free}. This situation is frequently observed in optically thick clouds because the average of *τ*_{free} is unity.

In the second case, when *π*Ψ_{n} > *ζ*_{min}, […] *τ* ≤ *τ*_{max}, and the termination of the ray tracing depends on the randomly chosen *τ*_{free} when *τ* > *τ*_{max}.

Using the above modifications, the cost of ray tracing is reduced when the set *ζ*_{min} is large. It is easy to show that this method is perfectly unbiased (energy conservative) for any *ζ*_{min}. As a result, we can set a large *ζ*_{min} and reduce the time required for computation. However, if *ζ*_{min} is extremely large (≫1), sampling of *ζ*′_{n} will be too rare, increasing the variance of *ζ*′_{n}. Consequently, the optimal value for *ζ*_{min} should be evaluated by tests.
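The two cases can be sketched in code under explicit assumptions. The sketch takes the unmodified contribution to be *ζ* = *π*Ψ exp(−*τ*), with geometric factors omitted, and sets *τ*_{max} = ln(*π*Ψ/*ζ*_{min}). Both are inferences from the description above rather than the paper's equations (13) and (14). The function name is hypothetical, and *τ* is passed in fully integrated for simplicity, whereas an actual model would stop the ray tracing as soon as the outcome is decided.

```python
import math
import random


def modified_local_estimate(pi_psi, tau, zeta_min, rng):
    """Return a randomized zeta' whose expectation is pi_psi * exp(-tau).

    Every nonzero value returned is at least zeta_min.
    """
    tau_free = -math.log(1.0 - rng.random())  # unit-mean exponential
    if pi_psi <= zeta_min:
        # Case 1: roulette on the angular factor, then on the optical path.
        if rng.random() >= pi_psi / zeta_min:
            return 0.0  # ray tracing would be skipped entirely
        return zeta_min if tau <= tau_free else 0.0
    # Case 2: deterministic attenuation down to zeta_min, random beyond it.
    tau_max = math.log(pi_psi / zeta_min)
    if tau <= tau_max:
        return pi_psi * math.exp(-tau)
    return zeta_min if (tau - tau_max) <= tau_free else 0.0
```

In case 1 the expectation is (*π*Ψ/*ζ*_{min}) exp(−*τ*) *ζ*_{min} = *π*Ψ exp(−*τ*). In case 2 with *τ* > *τ*_{max} it is *ζ*_{min} exp[−(*τ* − *τ*_{max})], which reduces to the same value, so the sketch is unbiased for any *ζ*_{min}.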

### b. Dual-end truncation approximation for sharply peaked phase functions

Another difficulty in radiance computation relates to large *ζ*, which often occurs when there are sharp diffraction peaks in the Mie phase function, as mentioned in section 1. Peaks in the forward direction can be larger than the minimum phase function by a factor of a million or more for large water droplets at short wavelengths. The radiance contribution *ψ*_{n} from a scattering event can be very noisy due to infrequent sampling of the peaks. This problem can be partly resolved if the peaks are truncated and the phase function is transformed to a smoother function (i.e., truncation approximation).

In a truncation approximation, the phase function is approximated as […], where *f*_{δ} is hereafter referred to as the delta fraction, Δ(Θ) is the Dirac delta function, and *P̂* is the truncated phase function. The extinction coefficient and single scattering albedo values are scaled as […], and the radiative transfer is then simulated with *β̂*_{e}, *ω̂*, and *P̂*. Various truncation approximations have been proposed, with individual representations of *P̂* (e.g., Nakajima and Tanaka 1988; Antyufeev 1996; Thomas and Stamnes 1999; Modest 2003).
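The scaling of the optical properties can be illustrated with the delta-scaling (similarity) relations commonly used with truncated phase functions, *β̂*_{e} = (1 − *ω f*_{δ})*β*_{e} and *ω̂* = (1 − *f*_{δ})*ω*/(1 − *ω f*_{δ}). The sketch below assumes these standard relations; it does not reproduce this paper's own scaling equations, and the function name is hypothetical.

```python
def delta_scale(beta_e, omega, f_delta):
    """Return (beta_e_hat, omega_hat) under the standard delta-scaling relations.

    The absorption coefficient is unchanged, and the scattering coefficient
    is reduced by the factor (1 - f_delta). Requires omega * f_delta < 1.
    """
    beta_e_hat = (1.0 - omega * f_delta) * beta_e
    omega_hat = (1.0 - f_delta) * omega / (1.0 - omega * f_delta)
    return beta_e_hat, omega_hat
```

For example, with *β*_{e} = 2, *ω* = 0.9, and *f*_{δ} = 0.5 the scaled extinction is 1.1, which shows why a larger delta fraction lowers the collision frequency.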

[…] Θ_{f} and Θ_{b}. By defining the first and second moments of the original phase function *P* as […], where *ĝ*_{1} and *ĝ*_{2} are similar to *g*_{1} and *g*_{2}, respectively, but for *P̂*. Because the delta fraction (*f*_{δ}) is prescribed here, angles Θ_{f} and Θ_{b} are determined by numerically solving the two equations in (21). To fulfill the conditions, the forward fraction […] *f*_{δ}, as in Fig. 3. We found that the method presented above achieved slightly higher accuracy than a method that truncated only the forward part (see the appendix). Other possible methods for truncation approximation are presented in the appendix.

An advantage of the method is the similarity between *P* and *P̂* [see (18)]. As mentioned in the previous section, the lookup tables for the phase function and its cumulative function should be tabulated for many angles. If the shapes of the phase functions differ by *n*, then huge tables are needed. The similarity between *P* and *P̂* avoids the necessity of making different tables for respective approximation regimes with different *f*_{δ}, which can vary by the collision order. For the method presented above, the scattering angle for the truncated phase function can be determined by solving […] *Q* of the original phase function. Thus, the method does not require large additional computer memory and is suitable for the Monte Carlo model.

A key parameter is the delta fraction *f*_{δ}, which controls the accuracy of the approximation. In general, a relatively large delta fraction can be used for the transfer of near-isotropic light such as infrared radiation and high-order scattered solar radiation. In contrast, a small delta fraction should be used for low-order scattering in order to simulate the strong anisotropy in transmitted solar light. It is particularly recommended that the original phase functions be used for first-order scattering in solar spectral regions. Furthermore, it is reasonable to consider the statistical directionality of photon packets because the order of collision is not a good parameter for inhomogeneous media with varying asymmetry of the phase function. The directionality parameter *χ*_{n} after the *n*th scattering is given here as […], where *g* is the asymmetry factor of the phase function. Since −1 < *g* < 1 for actual scattering, *χ* is a decreasing function against *n*, with a lower limit of null when *n* is infinite. At the source emission (*n* = 0), *χ*_{0} is initialized according to the directionality of the corresponding source. For example, *χ*_{0} = 1 for a collimated radiation source, and *χ*_{0} = 0.5 for thermal emission.

The delta fraction is designed to be small for strongly directional light (large *χ*_{n}) and to be large for a sharply peaked phase function. Considering forward-hemispherical moments of the original phase function, the final form of *f*_{δ} for the (*n* + 1)th order of collision is as follows: […], where *F*_{max} is a tuning parameter. The function *H* is given as […], where *h* is an integer function defined as […], and *h*_{max} determines the resolution of the order of the truncation regimes and can usually be set to five or more. The design of (26) was chosen to adapt to various phase functions for small to large particles. According to (26), no truncation approximation is applied (*f*_{δ} = 0) for low-order collisions (with *χ*_{n} ⩾ *χ*_{max}), while *f*_{δ} increases for multiply scattered light. Preliminary study showed that gradually increasing *f*_{δ} achieves significantly higher accuracy than suddenly changing *f*_{δ} = 0 for single scattering to a large fixed value (with *H* = 1) for subsequent scatterings. It should be noted that when the delta fraction varies with the order of collision, the optical thickness used in (10) should be scaled with respect to the (*n* + 1)th order of collision (the next order of the collision being sampled).

### c. Collision-forcing method for optically thin media

If the extinction coefficient *β*_{e} is scaled to a larger value *β*′_{e}, then more frequent collisions are forced. Accordingly, a scaled single scattering albedo is given as […], where *f*_{d} is the delta fraction, given as […]. The radiative transfer can then be simulated using *β*′_{e}, *ω*′, and *P*′ without any bias.
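One consistent way to construct the scaled properties is to treat the added extinction as pure forward ("delta") scattering, which leaves the absorption coefficient and the true scattering coefficient unchanged. The sketch below follows from that assumption. It is offered as an illustration and may differ in form from the paper's own expressions. The function name and the ratio argument `f_e` (taken as *β*′_{e}/*β*_{e}) are hypothetical.

```python
def force_collisions(beta_e, omega, f_e):
    """Scale optical properties by adding forward 'delta' scattering.

    f_e is the ratio beta_e' / beta_e and must be >= 1.
    Returns (beta_e_scaled, omega_scaled, f_d), where f_d is the fraction
    of scattering events that leave the direction unchanged.
    """
    beta_e_scaled = f_e * beta_e
    # Absorption coefficient (1 - omega) * beta_e is preserved.
    omega_scaled = 1.0 - (1.0 - omega) / f_e
    # Guard against 0/0 when there is no scattering at all.
    f_d = (f_e - 1.0) / (omega + f_e - 1.0) if omega_scaled > 0.0 else 0.0
    return beta_e_scaled, omega_scaled, f_d
```

With this construction, (1 − *f*_{d})*ω*′*β*′_{e} = *ωβ*_{e}, so the rate of direction-changing scattering is the same as in the original medium, while the number of sampled collisions increases by the factor `f_e`.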

The algorithm flow of the Monte Carlo model should be slightly modified when the collision-forcing method is introduced. At each collision, the weight of a photon packet is multiplied by *ω*′ instead of *ω*, and *P*_{n}(1 − *f*_{d}) is used instead of *P*_{n}. The scatter direction is determined according to (32), in which probabilities of scattering due to the delta function and the original phase function are, respectively, *f*_{d} and 1 − *f*_{d}. A random number determines which kind of scattering occurs. The direction of motion of the photon packet is not altered for the delta function.

The method leaves freedom in the choice of *β*′_{e}. One can specify a large *β*′_{e} for an arbitrary collision order or for regions where sampling of many collisions is needed. It is reasonable to prescribe a minimum *τ*_{min} for the total column optical thickness. In this paper, a domainwide constant is set for *f*_{e}, […], where *τ*_{col} is the domain average of the total column optical thickness. Note that since the collision-forcing method can be used in conjunction with truncation approximations (described in the previous subsection), *τ*_{col} and *f*_{e} depend on the truncation approximation regime.

### d. Numerical diffusion

In the usual Monte Carlo model, each sampling (integration) of radiative quantity is performed at a localized point. Sampling conducted for the region (area or volume) around the point could have a denoising (smoothing) effect on the spatial distribution of the computed quantity. Thus, an artificial numerical diffusion can be used to reduce the noise caused by an insufficient number of photon packets. A simple method is introduced here. At each integration process, the radiative energy is virtually distributed in the horizontal, rectangular area around the site of the photon packet, and energy fractions are integrated in respective subareas, while the actual trajectories of photon packets are the same as those without the diffusion. Since the numerical diffusion is limited to horizontal directions, the horizontal domain averages of radiative quantities are unaltered.

The diffusion area should depend on the total number of photon packets, *N*_{tot}. For radiances, we can expect a better denoising effect if the diffusion area is proportional to the sampled contribution [see (9)], *w*_{n}*ζ*_{n}(*θ*, *ϕ*). Thus, the widths *s*_{x} and *s*_{y} along the *x* and *y* axes, respectively, of the rectangular diffusion space for radiance sampling are given as […], where *c*_{rad} and *d*_{rad} are coefficients, *X*_{max} and *Y*_{max} are domain sizes along the *x* and *y* axes, respectively, and *σ* is a length given as […] *χ*, which increases with increasing light isotropy [see (25)]. Thus, *σ* = 0 for the source emission, and *σ* becomes large when near-isotropic light travels for a long distance. From a physics viewpoint, *σ* can be considered an indicator of the uncertainty in the horizontal location within the photon packet. As seen from (36), the diffusion area used for radiance sampling is determined at the scattering point (not at the detector point). Spreading of the diffusion area along a path from the scattering point to detector is not taken into account because the local estimation method forces virtual scattering in one specific, unique direction. Because sampling from a grid with a null scattering coefficient is physically impossible, the sampled energy is distributed according to a weighting function, […], for −*s*_{x}/2 ≤ *x*′ ≤ *s*_{x}/2, and −*s*_{y}/2 ≤ *y*′ ≤ *s*_{y}/2, with *β*_{sca} as the scattering coefficient. By rule, the distribution functions are normalized to unity at each collision event, conserving the total energy sampled.
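The redistribution step can be sketched as follows. This illustration assumes weights proportional to the scattering coefficient of each grid cell inside the diffusion rectangle, normalized to unity, with the rectangle rounded to whole cells and the cyclic horizontal boundary of section 2. The function name, the grid layout, and the cell-based rounding are hypothetical simplifications.

```python
def diffusion_weights(beta_sca, ix, iy, half_x, half_y):
    """Distribute one sampled contribution over grid cells around (ix, iy).

    beta_sca is a 2D list indexed as beta_sca[iy][ix]. The rectangle spans
    half_x cells on each side in x and half_y cells on each side in y, and
    wraps around the cyclic domain. The cell (ix, iy) is assumed to have a
    nonzero scattering coefficient, since a collision occurred there.
    Returns {(jx, jy): weight}, with weights summing to one.
    """
    ny, nx = len(beta_sca), len(beta_sca[0])
    weights = {}
    for dy in range(-half_y, half_y + 1):
        for dx in range(-half_x, half_x + 1):
            jx, jy = (ix + dx) % nx, (iy + dy) % ny
            weights[(jx, jy)] = weights.get((jx, jy), 0.0) + beta_sca[jy][jx]
    total = sum(weights.values())
    # Cells with a null scattering coefficient receive no energy.
    return {cell: w / total for cell, w in weights.items() if w > 0.0}
```

Because the weights sum to one for every collision event, the total sampled energy, and hence the horizontal domain average, is unchanged by the smoothing.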

The method presented here is an approximation. There is no reason to expect that we can derive a unique, universal method for numerical diffusion. More complicated methods can be employed, such as accounting for exponential profiles in all directions to redistribute the radiative energy. However, such methods may increase the computational burden, as there is a tradeoff between accuracy and computer time. Thus, a rather simple, rapid computation scheme is employed in the present study. It should be noted that systematic errors might be large if the coefficients in (35) (hereafter referred to as the diffusion coefficients) are too large, although noise would decrease. Optimal values for the coefficients are theoretically unknown and should be determined from numerical experiments.

## 4. Evaluations

### a. Description of numerical experiments

[…] *q*_{c}, *κ* = 0.8 (Martin et al. 1994) and a constant number density *N*_{c} = 100 cm^{−3} over the domain. Water-soluble aerosols were imposed with an exponential vertical profile. The single scattering properties of aerosols and cloud water droplets were calculated using Mie theory. Scattering and absorption by gases were included. The surface was assumed to be a flat, Lambertian reflector. The same cloud data were used as in Fig. 1.

It would be difficult to find the best combination of the numerous tuning parameters that are used in variance reduction techniques. The purpose of this section is to clarify the effectiveness and the characteristics of individual techniques. Table 2 summarizes an example (scheme V) that used combined variance reduction techniques. The parameters were tuned by preliminary tests. For convenience, we will refer to the modified local estimation as MLE, the dual-end truncation approximation as DTA, the collision forcing as CF, and the numerical diffusion as ND. To evaluate the performance of the various methods, CPU times were measured on a single-CPU personal computer. The accuracy (error) of the computed quantities was estimated with benchmarks that used a larger number of photon packets. The benchmarks were obtained from parallel computations on a large-scale scalar computer, not using biasing techniques (e.g., DTA and ND). For this paper, computed radiances were normalized as *πI*/(*F*_{0}cos *θ*_{0}), where *I* is radiance, *F*_{0} is extraterrestrial solar irradiance, and *θ*_{0} is the solar zenith angle.

### b. Results

#### 1) Biases due to the truncation approximation

First, mean bias errors due to DTA were checked, because DTA is the only technique presented in this paper that can introduce a domain-average bias error. Visible radiances were computed for various *F*_{max} values [see (26)] for single-layer, plane-parallel cloud cases with optical thicknesses of 1, 5, and 25. Figure 4 shows the bias errors of the radiances computed by DTA, with *χ*_{max} = 0.9 and *χ*_{min} = 0.4, for various cloud optical thicknesses (*τ*) and view zenith angles. A large number of photon packets were used for this experiment (e.g., 2 × 10^{9} for *τ* = 1) so that the Monte Carlo noise was almost negligible (∼0.1% or less). When *F*_{max} was large, the delta fraction was large [see (26)]. Therefore, larger bias would be expected for larger *F*_{max}. The results show that the bias errors were quite small (<0.3%) when *F*_{max} ≤ 0.8, except for the solar aureole region (within 10° of the solar direction) and the opposite reflection directions where *τ* < 25. Relatively accurate radiances were obtained in the optically thin case (*τ* ∼ 1) because the delta fraction was null for the direct beam and small for low-order scattering. For optically thick cases (*τ* ⩾ 25), the truncation approximation worked well with very small biases even for the solar aureole. In general, if accuracy of 1% is required for radiance computation, the delta fraction seemed too large when *F*_{max} = 1, which introduced relatively large biases in the solar aureole region, and for large view zenith angles (>70°). Thus, it is recommended to set *F*_{max} ≤ 0.8.

Even when *F*_{max} = 0.8, bias in the solar aureole region was not negligible for a moderate optical thickness of about 5; maximum error was approximately +8% in the solar direction. This large positive bias implies extreme transmission of first- (or low) order scattering light. With first-order scattering, light is strongly scattered in the forward direction, and can transmit more due to the reduction in the extinction coefficient in (17a) for higher-order scattering. In the circumsolar region, negative biases (about −3%) were shown. This result is an artifact of the truncation of the forward part of the phase function. In practice, accurate aureole radiances can be computed by using small *F*_{max} and/or small *χ*_{max} and *χ*_{min}, with a relatively small delta fraction. Actually, the solar aureole is bright because of frequent sampling of the forward peaks, although the truncation approximation was proposed to remove noise resulting from rare sampling of the peaks. There is thus no apparent reason to use truncation approximations for solar aureole simulation.

#### 2) Performance comparisons: Case 1

From the viewpoint of numerical efficiency, both the accuracy and CPU times of different schemes should be compared. Figure 5 shows the effects of DTA and MLE on the root-mean-square (rms) relative error, *ε* (%), CPU time, *T*, and the efficiency factor, defined as 1/(*ε*^{2}*T*). Nadir-reflected visible radiances were tested for inhomogeneous clouds (case 1). The number of incident photon packets used per column was *N*_{col} = 1000 for each simulation. The rms error can be considered to be Monte Carlo noise because the MLE method is unbiased, and DTA added negligible bias (<1%) as compared to the rms error (>8%) in this experiment. It is clear from the figure that DTA with larger *F*_{max} had two effects: both noise and CPU time were reduced. The latter was due to decreased collision frequency by the scaling of extinction coefficients in (17a). When *F*_{max} = 1 with *ζ*_{max} < 0.1, the rms error was smaller than *F*_{max} = 0 (no DTA) by a factor of about 7, and the numerical efficiency was about 100 times higher than that for *F*_{max} = 0. This significant improvement indicates the very high ability of DTA to reduce Monte Carlo noise for a realistically peaked phase function. In the MLE method, CPU time could also be reduced by using a larger *ζ*_{max}, as expected. Note that the usual local estimation method (not modified) was almost the same as for *ζ*_{max} = 10^{−7}. For *ζ*_{max} > 1, no significant time reduction was shown. However, rms errors were almost constant for *ζ*_{max} < 1. This may not be a straightforward result because the use of a larger *ζ*_{max} reduces the sampling frequency, so increased variance might be expected. Two explanations are suggested: the contribution of a small *ζ* to total radiance is not very large (Barker et al. 2003), and by MLE, the *ζ* sampled was always larger than *ζ*_{max} [see (13) and (14)] so as to cause a partial reduction in variance. 
When *ζ*_{max} > 1, significant increases in rms error were found, as expected, because of poor sampling. Therefore, we used an optimal setting, *ζ*_{max} = 0.3, in subsequent experiments. When *ζ*_{max} = 0.3, CPU time was reduced by approximately 50% without a significant increase in rms error.
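The efficiency factor 1/(*ε*^{2}*T*) used in this comparison is simple to compute. The sketch below assumes that the rms relative error is taken pixel by pixel against a benchmark field, which is an assumption about the bookkeeping rather than a statement of the paper's exact procedure. The function names are hypothetical.

```python
import math


def rms_relative_error(computed, benchmark):
    """Root-mean-square relative error in percent over all pixels."""
    terms = [((c - b) / b) ** 2 for c, b in zip(computed, benchmark)]
    return 100.0 * math.sqrt(sum(terms) / len(terms))


def efficiency_factor(eps, cpu_time):
    """Efficiency factor 1 / (eps^2 * T)."""
    return 1.0 / (eps * eps * cpu_time)
```

Because the error enters squared, a sevenfold reduction in rms error alone raises the factor by 49. If the quoted figures (a factor of about 7 in error and about 100 in efficiency) are taken at face value, the remaining factor of roughly 2 corresponds to the shorter CPU time.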

Figure 6 shows the nadir-reflected radiances and corresponding errors when *N*_{col} = 1000. The standard scheme S used no variance reduction technique except the MLE method (*ζ*_{max} = 0.3). The impact of applying truncation approximation was clear. The DTA technique reduced noise imperfectly but significantly. No significant tendency was found for error depending on the brightness of radiance, suggesting a small biasing effect of DTA. Residual noise in scheme T with *F*_{max} = 0.8 could be further smoothed by scheme V, mainly because ND relaxed spiky noise. Error reduction in scheme V from scheme T was mainly due to ND. As shown in Fig. 6, the error in scheme S or T does not resemble white noise. Smoothing is needed in some but not all locations. After Monte Carlo integration, it would be difficult to know where smoothing is needed. Using ND, smoothing can be performed for each sampling process with adaptive diffusion widths (35).

The numerical efficiency should be tested for various numbers of photon packets; a method that works well when low accuracy is required may not always be effective when a high degree of accuracy is needed. Figure 7 shows rms errors plotted against CPU time for *N*_{col} = 10^{3}, 10^{4}, and 10^{5}. The square of the rms error is theoretically proportional to 1/*N*_{col} for the standard scheme S. Scheme T with *F*_{max} = 1 shows good performance when *N*_{col} was as small as 10^{3}, but performance was poor for larger *N*_{col}. This result reflects contamination by relatively large biases (∼1%) when *F*_{max} = 1. We should set *F*_{max} = 0.8 to achieve 1% accuracy; ND further improved the performance. These results suggest that the dependence of the ND length on the number of photon packets [as in (35)] is reasonable, reducing the noise in radiances independent of *N*_{col}. The use of ND was found to be efficient, especially for small *N*_{col}. This is a reasonable result because the area for ND decreases with increasing *N*_{col}, and denoising artifacts are less pronounced. By using scheme V, the rms error was reduced by a factor of approximately 9 at the visible wavelength with *N*_{col} = 10^{5}, even though the CPU time was almost the same in schemes S and V. The increased CPU time in scheme V as compared to scheme T with *F*_{max} = 0.8 was mainly due to the use of the CF method with *τ*_{min} = 10. Thus, scheme V successfully improved efficiency (equivalent computer-time reduction) by a factor of approximately 80 to obtain a fixed accuracy of 0.8% for the visible wavelength. This efficiency improvement was more notable in the visible wavelengths than in the near-infrared because the forward peaks of the phase function were sharper. For both wavelengths, accuracy of approximately 5% can be achieved by using *N*_{col} = 10^{3}, and accuracy of approximately 1% can be expected when *N*_{col} = 10^{5}.
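The scaling arguments in this paragraph can be made concrete with a short sketch. It assumes pure Monte Carlo noise, so that the squared rms error is proportional to 1/*N*_{col} as stated for scheme S; it does not apply to schemes whose bias or smoothing changes with *N*_{col}. The function names are hypothetical.

```python
def packets_for_target(n_ref, eps_ref, eps_target):
    """Packets per column needed to reach eps_target, given eps_ref at n_ref.

    Assumes eps^2 is proportional to 1 / N_col (unbiased noise only).
    """
    return n_ref * (eps_ref / eps_target) ** 2


def equivalent_speedup(error_reduction_factor):
    """Computer-time reduction equivalent to an error reduction at fixed CPU time."""
    return error_reduction_factor ** 2
```

For instance, an error reduction by a factor of 9 at equal CPU time is equivalent to a speedup of 81, in line with the quoted efficiency improvement of approximately 80.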

By using large diffusion coefficients [as in (35)], noise becomes small, but the spatial distribution of computed radiance could be unrealistically smooth due to the smoothing artifact. Table 3 presents the rms error in scheme V with different diffusion coefficients. The error does not change greatly when the diffusion coefficients are changed by a factor of 2. Although the diffusion coefficients used in this study should work well for similar cases, sensitivity studies are recommended to determine optimal coefficient values for substantially different cases.

#### 3) Effect of the collision-forcing method: Case 2

To demonstrate the impacts of the CF method, an optically thin case (case 2 in Table 1) was selected here. In case 2, the domain averages of the total (cloud + aerosol + molecules) column optical thickness were approximately 1.2 at 0.67 *μ*m and 1.1 at 2.13 *μ*m. The surface was set as black, as in Table 1. In cloudless regions, scattering media were so thin that reflected radiance was very low. Radiances would thus be poorly sampled without the CF method, exhibiting large relative error. It is easy to imagine similar situations encountered in practical applications, not only in reflection but also in transmission. Figure 8 shows how the *τ*_{min} used in the CF method affected numerical performance. For *τ*_{min} = 0, no CF was used. As expected, the rms error was reduced by using the CF with a large *τ*_{min} due to increased sampling (collision) frequency. Error (noise) reduction was significant at relatively small *τ*_{min}. CPU times increased linearly with *τ*_{min}. As a result, the numerical efficiency factor was maximized at *τ*_{min} ∼ 5. For larger *τ*_{min}, however, efficiency could decrease with increasing *τ*_{min} (as for 0.67 *μ*m). In this case, it was more efficient to use a larger number of photon packets than to force too many collisions. The merit of the CF method was more notable at 2.13 *μ*m. Cloudless regions are optically very thin at this wavelength, leading to larger rms error and lower efficiency than at 0.67 *μ*m. Although collisions were forced with constant *f*_{e} for the entire domain, the efficiency could be further improved if the CF method was used only where needed, with varying *f*_{e}. For example, if CF was used only for light with *n* = 0 (direct beams), then the first scattering would be forced to occur frequently. Results suggest that the CF method is useful for sampling radiances scattered from optically thin regions.

## 5. Summary and conclusions

Several variance reduction techniques have been proposed for the Monte Carlo radiative transfer model, and their efficiency has been demonstrated for cloudy cases. All of the techniques are energy conservative and usually have very small biases. One technique is an unbiased modification of local estimates for radiance calculations. This technique was introduced to reduce the computational burden required for sampling many small contributions from each scattering event, especially for highly anisotropic phase functions. According to Monte Carlo practice, sampling frequency can be successfully reduced without bias in calculation results on average and without significant increase in random noise. Another method is a truncation approximation that is well suited to the Monte Carlo model. The approximation transforms a sharply peaked scattering phase function to a linear mixture of Dirac’s delta function and a truncated phase function that is smoother than the original phase function. Using the method, a significant reduction can be expected in noise due to poor sampling of spiky peaks. A fraction of the peak truncation was set to increase with the diffusivity of the photon packets. The method resulted in very small biases (<0.3%), except for the solar aureole region with moderate cloud optical thickness (between 1 and 20). Numerical efficiency can be dramatically improved for solar radiances using this approximation, reducing both computation time and Monte Carlo noise. An efficiency improvement factor of approximately 80 was shown for visible wavelengths when we tried to achieve accuracy of 1%.
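The delta-plus-truncated mixture described above can be sketched as a two-branch sampling step. The example is a simplified illustration rather than the DTA of the text: it assumes a fixed delta fraction `f_delta` and uses a Henyey–Greenstein function with asymmetry `g_trunc` as a stand-in for the smooth truncated phase function. All names are hypothetical.

```python
import random


def sample_hg_cosine(g, xi):
    """Scattering-angle cosine from a Henyey-Greenstein function (inverse CDF)."""
    if abs(g) < 1e-6:
        return 2.0 * xi - 1.0  # isotropic limit
    s = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    mu = (1.0 + g * g - s * s) / (2.0 * g)
    return max(-1.0, min(1.0, mu))  # guard against round-off


def sample_delta_mixture(f_delta, g_trunc, rng):
    """Sample a cosine from f_delta * delta(forward) + (1 - f_delta) * smooth part."""
    if rng.random() < f_delta:
        return 1.0  # delta branch: direction unchanged
    return sample_hg_cosine(g_trunc, rng.random())


def mixture_asymmetry(f_delta, g_trunc):
    """First moment (mean cosine) of the mixture."""
    return f_delta + (1.0 - f_delta) * g_trunc


rng = random.Random(0)
samples = [sample_delta_mixture(0.5, 0.6, rng) for _ in range(20000)]
mean_cosine = sum(samples) / len(samples)  # close to mixture_asymmetry(0.5, 0.6)
```

The sampled mean cosine approaches `f_delta + (1 - f_delta) * g_trunc`, which is the first-moment relation that a truncation approximation must match to the original phase function if the asymmetry is to be conserved.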

A collision-forcing method for optically thin media was also proposed. By using this method, the extinction coefficient can be modified to an arbitrary value larger than the original; consequently, single scattering albedo and the phase function are also modified according to similarity relations. This method can flexibly force frequent collisions where needed, thereby reducing Monte Carlo noise. In addition, compared with other methods of forcing collisions, modifications of the algorithm are minimal and easily implemented. Last, artificial numerical diffusion was proposed. In each sampling process, the energy of each photon packet is partitioned and redistributed to a rectangular horizontal area, the size of which is adaptively determined; the area is large when the sampled energy is large and when near-isotropic light travels a long path. Results showed that this method successfully smoothed noise in radiance and that the method worked well regardless of the number of trajectories.
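As an illustration of the collision-forcing idea, the sketch below enlarges the extinction coefficient by adding artificial scattering straight ahead (a delta function in the forward direction), which leaves the radiation field unchanged. This is a generic similarity transformation written under that assumption; the function and variable names are hypothetical, and the paper's own definition of *f*_{e} is not reproduced.

```python
def force_collisions(beta_ext, omega, g, beta_forced):
    """Return (omega_new, f_delta, g_new) for an enlarged extinction coefficient.

    beta_ext    : original extinction coefficient
    omega       : original single scattering albedo
    g           : original asymmetry factor
    beta_forced : modified extinction coefficient (must be >= beta_ext and > 0)

    The extra extinction is pure forward (delta) scattering, so the absorption
    coefficient and the angular redistribution by real scattering are preserved.
    """
    if beta_ext < 0.0 or beta_forced <= 0.0 or beta_forced < beta_ext:
        raise ValueError("beta_forced must be positive and not smaller than beta_ext")
    beta_delta = beta_forced - beta_ext          # artificial forward scattering
    beta_sca = omega * beta_ext + beta_delta     # modified scattering coefficient
    omega_new = beta_sca / beta_forced
    f_delta = beta_delta / beta_sca if beta_sca > 0.0 else 0.0
    g_new = f_delta + (1.0 - f_delta) * g        # delta part has cosine 1
    return omega_new, f_delta, g_new


omega_new, f_delta, g_new = force_collisions(0.1, 0.9, 0.85, 1.0)
```

Two invariants can be used to check an implementation: the absorption coefficient `beta * (1 - omega)` and the product `beta * omega * (1 - g)` are the same before and after the transformation. In this sketch the modified phase function is the mixture `f_delta * delta + (1 - f_delta) * P`, so a forced collision either leaves the direction unchanged or scatters with the original phase function.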

All the proposed methods can be used in combination. According to the results presented in the previous section, we can expect about 1% accuracy by using a combination of the methods for pixel radiances with 10^{5} photon packets per pixel in typical cases with cloud optical thickness of approximately 10. Better accuracy can be expected, of course, if more photon packets are used. Thus, these methods will still be valuable even when increased computing power becomes available in the future. Although all of the simulations in this paper were performed for monochromatic wavelengths, the computer time required for broadband calculations is independent of the number of spectral intervals and almost the same as for monochromatic calculations (e.g., Fu et al. 2000). This is a unique, attractive property of the Monte Carlo model. Thus, the merits of the model will be maximized when it is used for calculation of spectrally integrated radiative quantities.

The proposed methods are also useful for calculations of fluxes and heating rates, although we have restricted our discussion mainly to radiance in this paper. In particular, the collision-forcing method could significantly improve the efficiency of heating rate calculations in optically thin regions. In addition, numerical diffusion similar to that discussed in this paper could be incorporated for efficient sampling of fluxes and heating rates. Further discussion will be presented in a separate paper.

## Acknowledgments

The author thanks Dr. Akira T. Noda of Tohoku University, Japan, for providing the large-eddy simulation data, and Dr. Tsuneaki Suzuki of the Japan Agency for Marine–Earth Science and Technology, Japan, for helpful comments. This work was partly supported by the Ministry of Education, Culture, Sports, Science and Technology, Grant-in-Aid for Scientific Research [(A) 17204039].

## REFERENCES

Antyufeev, V. S., 1996: Solution of the generalized transport equation with a peak-shaped indicatrix by the Monte Carlo method. *Russ. J. Numer. Anal. Math. Model.*, **11**, 113–137.

Barker, H. W., J-J. Morcrette, and G. D. Alexander, 1998: Broadband solar fluxes and heating rates for atmospheres with 3D broken clouds. *Quart. J. Roy. Meteor. Soc.*, **124**, 1245–1271.

Barker, H. W., R. K. Goldstein, and D. E. Stevens, 2003: Monte Carlo simulation of solar reflectances for cloudy atmospheres. *J. Atmos. Sci.*, **60**, 1881–1894.

Booth, T. E., 1985: A sample problem for variance reduction in MCNP. Los Alamos National Laboratory Rep. LA-10363-MS, 68 pp.

Cornet, C., J-C. Buriez, J. Riédi, H. Isaka, and B. Guillemet, 2005: Case study of inhomogeneous cloud parameter retrieval from MODIS data. *Geophys. Res. Lett.*, **32**, L13807, doi:10.1029/2005GL022791.

Evans, K. F., 1998: The spherical harmonics discrete ordinate method for three-dimensional atmospheric radiative transfer. *J. Atmos. Sci.*, **55**, 429–446.

Evans, K. F., and A. Marshak, 2005: Numerical methods. *3D Radiative Transfer for Cloudy Atmospheres*, A. B. Davis and A. Marshak, Eds., Springer-Verlag, 243–281.

Fu, Q., M. C. Caribb, H. W. Barker, S. K. Krueger, and A. Grossman, 2000: Cloud geometry effects on atmospheric solar absorption. *J. Atmos. Sci.*, **57**, 1156–1168.

Hönninger, G., C. von Friedeburg, and U. Platt, 2004: Multi axis differential optical absorption spectroscopy (MAX-DOAS). *Atmos. Chem. Phys.*, **4**, 231–254.

Iwabuchi, H., and T. Hayasaka, 2003: A multi-spectral non-local method for retrieval of boundary layer cloud properties from optical remote sensing data. *Remote Sens. Environ.*, **88**, 294–308.

Kawrakow, I., and D. W. O. Rogers, 2001: The EGSnrc code system: Monte Carlo simulation of electron and photon transport. NRCC Rep. PIRS-701, 287 pp. (revised in 2003).

Liou, K-N., 2002: *Introduction to Atmospheric Radiation*. 2d ed. Academic Press, 583 pp.

Macke, A., D. L. Mitchell, and L. V. Bremen, 1999: Monte Carlo radiative transfer calculations for inhomogeneous mixed phase clouds. *Phys. Chem. Earth*, **24B**, 237–241.

Marchuk, G., G. Mikhailov, M. Nazaraliev, R. Darbinjan, B. Kargin, and B. Elepov, 1980: *The Monte Carlo Methods in Atmospheric Optics*. Springer-Verlag, 208 pp.

Martin, G. M., D. W. Johnson, and A. Spice, 1994: The measurement and parameterization of effective radius of droplets in warm stratocumulus clouds. *J. Atmos. Sci.*, **51**, 1823–1842.

Modest, M. F., 2003: *Radiative Heat Transfer*. 2d ed. Academic Press, 822 pp.

Nakajima, T., and M. Tanaka, 1988: Algorithms for radiative intensity calculations in moderately thick atmospheres using a truncation approximation. *J. Quant. Spectrosc. Radiat. Transfer*, **40**, 51–69.

O’Hirok, W., and C. Gautier, 1998: A three-dimensional radiative transfer model to investigate the solar radiation within a cloudy atmosphere. Part I: Spatial effects. *J. Atmos. Sci.*, **55**, 2162–2179.

Thomas, G. E., and K. Stamnes, 1999: *Radiative Transfer in the Atmosphere and Ocean*. Cambridge University Press, 517 pp.

Várnai, T., and R. Davies, 1999: Effects of cloud heterogeneities on shortwave radiation: Comparison of cloud-top variability and internal heterogeneity. *J. Atmos. Sci.*, **56**, 4206–4224.

## APPENDIX

### Other Possible Truncation Approximations

There are several possible methods for truncation approximation that are well suited for the Monte Carlo model. In the preliminary study, we tested the accuracy of the methods described below.

One method assumes *f*_{δ} = 1 − *f*_{t}. However, the assumption does not satisfy the conservation of the first moment between the original phase function and approximated one [right-hand side of Eq. (16)]. Consequently, it can be shown that the method assuming *f*_{δ} = 1 − *f*_{t} results in bias error in radiance that is too large for remote sensing purposes. By adding a constraint of moment conservation, significantly higher accuracy can be achieved. Accordingly, the truncation point Θ_{f} is determined by numerically solving the moment-conservation condition, in which *f*_{δ} is prescribed [e.g., by Eq. (26)], and *ĝ*_{1} is similar to that in (22), but Θ_{b} = *π*. To fulfill the conditions, 1 − *f*_{t} should be larger than *f*_{δ}. The method described here is slightly less accurate (but simpler) than the DTA presented in the text.

In another possible formulation, *f*_{t} is a normalization coefficient. For such a formulation, too, we can determine Θ_{f} so as to fulfill (A3). Experiments showed that this method is slightly less accurate than the above method in this appendix. Other possibilities are the delta–isotropic approximation and the delta–Henyey–Greenstein approximation (Thomas and Stamnes 1999). Although these methods cannot be used for remote sensing purposes because of their large biases, they may be useful for calculating fluxes and heating rates.

Table 1. Summary of two cloud cases.

Table 2. Parameter set used in scheme V.

Table 3. Rms errors for scheme V, varying the coefficients for the numerical diffusion.