Abstract
The forward model solution and its functional (e.g., the cost function in 4DVAR) are discontinuous with respect to the model's control variables if the model contains discontinuous physical processes that occur during the assimilation window. In such a case, the tangent linear model (the first-order approximation of a finite perturbation) is unable to represent the sharp jumps of the nonlinear model solution. Likewise, the first-order approximation provided by the adjoint model is unable to represent a finite perturbation of the cost function when the perturbation introduced in the control variables crosses discontinuous points. Using an idealized simple model and the Arakawa–Schubert cumulus parameterization scheme, the authors examined the behavior of a cost function and its gradient obtained by the adjoint model with discontinuous model physics. Numerical results show that a cost function involving discontinuous physical processes is zeroth-order discontinuous, but piecewise differentiable. The maximum possible number of discontinuity points in a cost function increases exponentially as 2^{kn}, where k is the total number of thresholds associated with on–off switches, and n is the total number of time steps in the assimilation window. A backward adjoint model integration with the proper forcings added at various time steps, similar to the backward adjoint model integration that provides the gradient of the cost function at a continuous point, produces a one-sided gradient (called a subgradient and denoted as ∇_s J) at a discontinuous point. An accuracy check of the gradient shows that the adjoint-calculated gradient is computed exactly on either side of a discontinuous surface. While a cost function evaluated at small intervals in the control variable space oscillates, the distribution of the gradient calculated at the same resolution not only shows a rather smooth variation, but also is consistent with the general convexity of the original cost function.
The gradients of discontinuous cost functions appear roughly smooth because the adjoint integration correctly computes the one-sided gradient on either side of a discontinuous surface. This implies that, although (∇_s J)^T δx may not approximate δJ = J(x + δx) − J(x) well near the discontinuous surface, the subgradient calculated by the adjoint of discontinuous physics may still provide useful information for finding the search directions in a minimization procedure. While this does not eliminate the possible need for a nondifferentiable optimization algorithm for 4DVAR with discontinuous physics, the consistency between the adjoint-computed gradient and the convexity of the cost function may explain why a differentiable limited-memory quasi-Newton algorithm still worked well in many 4DVAR experiments that use a diabatic assimilation model with discontinuous physics.
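The behavior summarized above can be illustrated with a minimal sketch. It is not the idealized model or the Arakawa–Schubert scheme used by the authors. It assumes a hypothetical scalar model with a single on–off switch, and every name and parameter value (XC, Q, R_OFF, R_ON, N_STEPS, OBS) is invented for illustration. The backward sweep uses the Jacobian of whichever branch the forward run took at each step, so it returns the one-sided gradient of the piecewise-differentiable cost.

```python
# Toy illustration only: a scalar model with one on-off switch.
# All names and parameter values are hypothetical, not taken from the paper.

XC = 1.0      # threshold of the on-off switch
Q = 0.5       # finite amount removed whenever the switch is on
R_OFF = 0.95  # linear factor applied when the switch is off
R_ON = 0.80   # linear factor applied when the switch is on
N_STEPS = 10  # number of time steps in the assimilation window
OBS = 0.5     # constant "observation" at every time level


def forward(x0):
    """Integrate the toy model; return the trajectory and the switch pattern."""
    traj, switches = [x0], []
    x = x0
    for _ in range(N_STEPS):
        on = x > XC  # discontinuous physics: finite jump when x crosses XC
        switches.append(on)
        x = R_ON * x - Q if on else R_OFF * x
        traj.append(x)
    return traj, switches


def cost(x0):
    """J(x0) = 0.5 * sum over all time levels of (x_t - OBS)^2."""
    traj, _ = forward(x0)
    return 0.5 * sum((x - OBS) ** 2 for x in traj)


def adjoint_gradient(x0):
    """Backward sweep along the branches taken by the forward run.

    The forcing (x_t - OBS) is added at each time level, and the Jacobian
    of the active branch propagates the adjoint variable backward. The
    result is the one-sided gradient of J at x0.
    """
    traj, switches = forward(x0)
    lam = traj[-1] - OBS
    for t in range(N_STEPS - 1, -1, -1):
        jac = R_ON if switches[t] else R_OFF
        lam = jac * lam + (traj[t] - OBS)
    return lam


def gradient_check(x0, eps=1e-6):
    """Return (finite-difference slope, adjoint gradient) at x0."""
    fd = (cost(x0 + eps) - cost(x0)) / eps
    return fd, adjoint_gradient(x0)


if __name__ == "__main__":
    below, above = XC - 1e-9, XC + 1e-9
    print("J just below / above threshold:", cost(below), cost(above))
    print("one-sided gradients:", adjoint_gradient(below), adjoint_gradient(above))
    print("gradient check at x0 = 0.7:", gradient_check(0.7))
```

In this sketch the cost jumps by a finite amount as x0 crosses XC, while the finite-difference slope and the adjoint gradient agree wherever the perturbation does not change the switch pattern. A perturbation that does change the pattern crosses a discontinuity, which is the case where the first-order estimate (∇_s J)^T δx can fail.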
Corresponding author address: Dr. S. Zhang, GFDL/NOAA, Princeton University, P.O. Box 308, Princeton, NJ 08542. Email: snz@gfdl.noaa.gov