A multiple-set canonical correlation analysis (MCCA), which can be used to study atmospheric motions by analyzing the relationships among more than two sets of data fields, is proposed. By using the product or squared product of correlation matrices as the optimization criterion, this method generalizes the two-set canonical correlation analysis (CCA) and reduces the complications associated with the supermatrix approaches previously proposed in statistical textbooks. The final optimization equations can be greatly simplified to involve weighting functions of one field at a time. Furthermore, excluding or emphasizing correlations between special field pairs based on physical considerations can be easily implemented.
The method is identical to a supermatrix approach based on maximizing the product of canonical correlation coefficients when the individual canonical correlation matrices are perfectly diagonal. This would be true for idealized data that contain only orthogonal motion systems so that all datasets are perfectly correlated. In such a case, all supermatrix methods will also converge to the same solution. In real cases, cross-component correlations will occur, and their largest values, called largest residual correlations (LRCs), are a crude measure of the validity of the approximation. When LRCs are small compared to the corresponding canonical correlation coefficients, the results are reliable. Otherwise, solutions of different methods diverge and are all doubtful.
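As a rough illustration of the LRC diagnostic described above, the following Python sketch computes the correlation matrix between the canonical variates of two fields and compares its largest off-diagonal (cross-component) entry with the diagonal canonical correlation for each mode. The function name `relative_lrc`, the input layout (samples in rows, modes in columns), and the choice to take the largest off-diagonal magnitude over row and column `k` are assumptions made for this sketch; the abstract does not give the exact definition.

```python
import numpy as np

def relative_lrc(U, V):
    """Hypothetical helper: U and V are (n_samples, n_modes) arrays holding
    the canonical variates of two fields. Returns the diagonal (canonical)
    correlations and, per mode, the largest residual correlation divided by
    the magnitude of that mode's canonical correlation."""
    # Standardize each canonical variate (zero mean, unit variance).
    Uz = (U - U.mean(axis=0)) / U.std(axis=0)
    Vz = (V - V.mean(axis=0)) / V.std(axis=0)
    # Correlation matrix between modes of the two fields.
    R = Uz.T @ Vz / U.shape[0]
    diag = np.diag(R).copy()
    # Off-diagonal entries are the cross-component (residual) correlations.
    off = np.abs(R - np.diag(diag))
    # Assumption: the LRC of mode k is the largest entry in row k or column k.
    lrc = np.maximum(off.max(axis=1), off.max(axis=0))
    return diag, lrc / np.abs(diag)

# Idealized case: orthogonal variates, identical in both fields.
U = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
diag_ideal, rel_ideal = relative_lrc(U, U)

# Contaminated case: the first variate of V mixes in the second mode.
V = np.column_stack([U[:, 0] + 0.5 * U[:, 1], U[:, 1]])
diag_mixed, rel_mixed = relative_lrc(U, V)
```

In the idealized case the correlation matrix is exactly diagonal and the relative LRCs vanish; in the contaminated case they are no longer small compared with the canonical correlations, which is the situation in which the abstract regards the solutions as doubtful.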
A statistical textbook example illustrates that the solutions obtained are comparable to those from the supermatrix methods, with relative LRCs of about 20%. A meteorological application example shows that, compared to the two-set CCA, the proposed MCCA gives a more powerful concentration of variance in the leading modes and higher canonical correlation coefficients. The resultant relative LRCs are small throughout all leading modes, apparently because meteorological data contain highly correlated variations.
The proposed technique may also be applied to singular-value decomposition analysis, allowing a multiple-set singular-value decomposition analysis to be used on more than two sets of data fields.