Kansas Geological Survey, Computer Contributions 32, originally published in 1969


FORTRAN IV Programs for Canonical Correlation and Canonical Trend-Surface Analysis

by P.J. Lee

McMaster University

Originally published in 1969 as Kansas Geological Survey Computer Contributions 32.

Introduction

In the statistical technique of simple linear regression a dependent variable, Y, is related to an independent variable, X, by an equation of the type

Y = bX + ε,

where ε is a random variable. The "strength of the relationship" of Y and X may be expressed by the correlation coefficient, r.
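The relation above can be illustrated with a minimal Python sketch. It assumes ordinary least squares as the fitting criterion (the text does not name one) and fits the model exactly as written, without an intercept term. The function names and the data values are hypothetical and are not taken from the report.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope b for the model Y = bX + e (no intercept term)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.dot(x, y) / np.dot(x, x))

def correlation(x, y):
    """Product-moment correlation coefficient r between x and y."""
    return float(np.corrcoef(x, y)[0, 1])

# Made-up illustrative data, not from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b = fit_through_origin(x, y)
r = correlation(x, y)
print("slope b =", b)
print("correlation r =", r)
```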

In multiple linear regression, the dependent variable is related to several independent variables by an equation of the type

Y = ∑biXi + ε.

The multiple correlation coefficient serves as a measure of the "strength of the relationship". More precisely, the square of the multiple correlation coefficient gives the proportion of the total sum of squares of Y that may be attributed to variation of the Xi.
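The sum-of-squares interpretation can be sketched in Python as follows. This is an illustration under stated assumptions, not the report's program: all variables are expressed as deviations from their means (so that the model without a constant term applies), the coefficients are estimated by least squares with `numpy.linalg.lstsq`, and the function name and data are hypothetical.

```python
import numpy as np

def multiple_r_squared(X, y):
    """Return (b, R2) for the centered model y = sum(b_i * X_i) + e.

    R2 is the share of the total sum of squares of y accounted for by
    the fitted combination of the X columns.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)          # assumption: work with deviations from means
    yc = y - y.mean()
    b, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    residual = yc - Xc @ b
    ss_total = float(yc @ yc)
    ss_resid = float(residual @ residual)
    return b, 1.0 - ss_resid / ss_total

# Made-up data: y is an exact linear function of the two columns of X,
# so the squared multiple correlation should be 1 up to rounding.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 3.0

b, r2 = multiple_r_squared(X, y)
print("coefficients:", b)
print("R squared:", r2)
```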

Multiple correlation techniques have been applied frequently in geology. A typical application might relate a bulk property of a rock (e.g. permeability) to a series of mineralogical or textural properties (e.g. mean grain size, sorting, and skewness).

It is not unusual, however, for geologists to make two (or more) sets of measurements on a single specimen (e.g. size and compositional parameters, trace elements and major elements, or chemical and modal analyses) or at a single locality (e.g. formation thickness and sand percentage together with size or compositional parameters). It may then be of interest to relate one set of variables to the other, that is, to discover equations of the type

U = ∑aiXi, and

V = ∑bjYj,

where the coefficients ai and bj are chosen to give the largest possible correlation between U and V. This technique is called canonical correlation.
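A minimal Python sketch of this idea follows. It uses a QR and singular-value formulation, which is one standard way to obtain canonical correlations; it is not claimed to be the algorithm used in the report's FORTRAN IV programs. The function name, the synthetic data, and the requirement that both centered data sets have full column rank are assumptions of the sketch.

```python
import numpy as np

def canonical_correlation(X, Y):
    """Return (rho, a, b) for data sets X (n-by-p) and Y (n-by-q).

    rho holds the canonical correlations in decreasing order; a and b are
    the weights of the first pair, so that U = Xc @ a and V = Yc @ b
    (Xc, Yc being the column-centered data) have correlation rho[0].
    """
    Xc = np.asarray(X, dtype=float)
    Yc = np.asarray(Y, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    Yc = Yc - Yc.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # Singular values of Qx' Qy are the canonical correlations.
    Uc, rho, Vt = np.linalg.svd(Qx.T @ Qy)
    # Map the leading singular vectors back to weights on the original columns.
    a = np.linalg.solve(Rx, Uc[:, 0])
    b = np.linalg.solve(Ry, Vt[0, :])
    return rho, a, b

# Synthetic data: the first Y column is an exact combination of the X
# columns, so the largest canonical correlation should be 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
Y = np.column_stack([X @ np.array([1.0, -2.0, 0.5]), rng.normal(size=30)])

rho, a, b = canonical_correlation(X, Y)
U = (X - X.mean(axis=0)) @ a
V = (Y - Y.mean(axis=0)) @ b
print("canonical correlations:", rho)
print("corr(U, V):", np.corrcoef(U, V)[0, 1])
```

The weights are determined only up to scale and sign; the sketch returns whichever scaling the decomposition produces.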

As an extension of this technique, the Xi might be chosen to be the geographical coordinates (and polynomial terms in those coordinates) of the localities at which a set of spatially distributed variables, Yj, is measured. In this instance, the technique would be similar to that of trend analysis, except that the trend which is determined is not the trend of any particular variable, Y, but is the common trend of a set of variables, Yj. This technique may be called canonical trend-surface analysis.
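The extension to geographical coordinates can be sketched in Python under the same assumptions as any QR and singular-value treatment of canonical correlation. The helper names (`polynomial_terms`, `first_canonical_pair`), the choice of a second-degree surface, and the synthetic coordinates and variables are all hypothetical; none of them comes from the report.

```python
import numpy as np

def polynomial_terms(east, north, degree):
    """Columns east**i * north**j for 1 <= i + j <= degree (no constant)."""
    east = np.asarray(east, dtype=float)
    north = np.asarray(north, dtype=float)
    cols = []
    for total in range(1, degree + 1):
        for j in range(total + 1):
            cols.append(east ** (total - j) * north ** j)
    return np.column_stack(cols)

def first_canonical_pair(X, Y):
    """Largest canonical correlation and its weights (QR plus SVD sketch)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    Uc, rho, Vt = np.linalg.svd(Qx.T @ Qy)
    a = np.linalg.solve(Rx, Uc[:, 0])
    b = np.linalg.solve(Ry, Vt[0, :])
    return float(rho[0]), a, b

# Synthetic localities and three mapped variables; two of them share a
# common spatial trend, the third is pure noise.
rng = np.random.default_rng(1)
east = rng.uniform(0.0, 10.0, size=40)
north = rng.uniform(0.0, 10.0, size=40)
shared = 0.8 * east - 0.3 * north + 0.05 * east * north
Y = np.column_stack([
    2.0 * shared + rng.normal(scale=0.5, size=40),
    -1.0 * shared + rng.normal(scale=0.5, size=40),
    rng.normal(size=40),
])

X = polynomial_terms(east, north, degree=2)   # e, n, e^2, e*n, n^2
rho1, a, b = first_canonical_pair(X, Y)
trend = (X - X.mean(axis=0)) @ a              # common trend at each locality
print("first canonical correlation:", rho1)
```

Here `trend` holds the value of the fitted common surface at each locality, and `b` shows how strongly each mapped variable contributes to the combination that follows that surface.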

Acknowledgments

Special thanks are due to Prof. G. Middleton for his valuable comments. The writer is indebted to Dr. George Lynts of Duke University for supplying the raw data. The Geological Survey of Canada provided funds for support of the research.

The complete text of this report is available as an Adobe Acrobat PDF file.

Read the PDF version (6.8 MB)


Kansas Geological Survey
Placed on web Sept. 9, 2019; originally published 1969.
The URL for this page is http://www.kgs.ku.edu/Publications/Bulletins/CC/32/index.html