
Kansas Geological Survey, Computer Contributions 2, originally published in 1966


A Generalized Two-dimensional Regression Procedure

by John R. Dempsey

Northern Natural Gas Company

[Cover image: light green paper with black text]


Introduction

Frequently, in the solution of an engineering problem with the aid of a digital computer, the investigator wishes to use values in the calculations that originate from a chart or table of values. These values may have been derived by discrete evaluations of some complicated function or may have been obtained experimentally. In the former case the trend is usually smooth, whereas in the latter it may be quite erratic. With experimental data, he normally desires the most general trend of the tabulated values, which smooths out much of the experimental inconsistency; he must, however, be very cautious with this assumption.

Several methods are at the programmer-engineer's disposal to represent data for automatic computation. A few of these are:

  1. Read in the value of the dependent variable at each step,
  2. Represent a table by matrices and use a table look-up procedure,
  3. Apply differencing techniques (polynomial approximation), and
  4. Compute normal least squares.

There are inherent problems in using any of the above methods. The first is obviously extremely time-consuming and expensive, because the computer is idle while the value of the dependent variable corresponding to the specified independent variable is found.

The second method is often satisfactory when accuracy is desired. It normally requires a great deal of computer space, however, and is relatively slow in the evaluation of the desired value.
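As a rough illustration, not taken from the original 1966 report, a table look-up of this kind might be sketched as follows in Python; the table values and the use of linear interpolation between entries are assumptions made only for the example.

    import bisect

    # Hypothetical tabulated data: independent variable x (sorted) and dependent variable y.
    X_TABLE = [0.0, 1.0, 2.0, 3.0, 4.0]
    Y_TABLE = [1.00, 1.65, 2.72, 4.48, 7.39]

    def table_lookup(x):
        """Return y for a given x by linear interpolation between stored table entries."""
        if x <= X_TABLE[0]:
            return Y_TABLE[0]
        if x >= X_TABLE[-1]:
            return Y_TABLE[-1]
        i = bisect.bisect_right(X_TABLE, x)           # first table point above x
        x0, x1 = X_TABLE[i - 1], X_TABLE[i]
        y0, y1 = Y_TABLE[i - 1], Y_TABLE[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)  # straight-line interpolation

    print(table_lookup(2.5))   # interpolates between the entries at x = 2.0 and x = 3.0

The whole table must be held in storage, and every evaluation repeats the search, which is the space and speed penalty noted above.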

Differencing techniques are easily evaluated once the polynomial has been defined; however, there are many problems in evaluating the required differences to specify the polynomial. Some of the problems are: spacing, boundary conditions, and error accumulation.
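The following sketch, again a Python illustration rather than anything from the report, builds a forward-difference table for equally spaced ordinates and evaluates the Newton forward-difference polynomial; the requirement of uniform spacing is exactly the kind of restriction mentioned above, and the sample ordinates are hypothetical.

    def forward_differences(y):
        """Leading forward differences of equally spaced ordinates y."""
        diffs = [y[0]]
        row = list(y)
        while len(row) > 1:
            row = [b - a for a, b in zip(row, row[1:])]
            diffs.append(row[0])
        return diffs

    def newton_forward(x, x0, h, diffs):
        """Evaluate the Newton forward-difference polynomial at x (spacing h must be uniform)."""
        s = (x - x0) / h
        term, total = 1.0, diffs[0]
        for k in range(1, len(diffs)):
            term *= (s - (k - 1)) / k      # binomial factor s(s-1)...(s-k+1)/k!
            total += term * diffs[k]
        return total

    # Hypothetical equally spaced ordinates (y = x**2, so second differences are constant).
    ys = [0.0, 1.0, 4.0, 9.0, 16.0]
    print(newton_forward(2.5, x0=0.0, h=1.0, diffs=forward_differences(ys)))   # 6.25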

The fourth is probably the best method, at least from the theoretical and utility standpoints. The theory guarantees that the fitted function has the minimum squared deviation from any set of discrete data, for any chosen order. It is also easily evaluated by digital computation.

Several problems are inherent in the computation of a normal least-squares fit. One disadvantage is that a system of simultaneous equations must be solved to evaluate the coefficients. The matrix formed by these equations is of the well-known Hilbert matrix form; it is extremely ill-conditioned and fails to converge at higher orders, and numerical round-off is also a problem during the matrix inversion. A second major disadvantage of the normal least-squares method lies in the evaluation of the minimum variance for a set of different orders. Because the coefficients themselves are not analytically independent (i.e., they depend on both the order and the set of data), a separate set of coefficients must be evaluated for each order before the associated variance can be computed.
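To make the ill-conditioning concrete, the sketch below, an assumed modern illustration using NumPy and not the procedure of the report, forms the normal equations A^T A c = A^T y for a polynomial fit and prints the condition number of A^T A as the order increases; the sample data are hypothetical.

    import numpy as np

    def normal_equations_fit(x, y, order):
        """Fit a polynomial of the given order by solving the normal equations A^T A c = A^T y."""
        A = np.vander(x, order + 1, increasing=True)   # columns 1, x, x**2, ..., x**order
        ata = A.T @ A                                   # moment matrix of Hilbert-like form
        coeffs = np.linalg.solve(ata, A.T @ y)
        return coeffs, np.linalg.cond(ata)

    # Hypothetical equally spaced sample; conditioning worsens rapidly with increasing order.
    x = np.linspace(0.0, 1.0, 21)
    y = np.exp(x)
    for order in (2, 4, 8, 12):
        _, cond = normal_equations_fit(x, y, order)
        print(f"order {order:2d}: condition number of A^T A = {cond:.2e}")

Note that the coefficients must be recomputed from scratch for every trial order, which is the second disadvantage described above.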

Because of the many problems inherent in function approximation by the above techniques, an alternative technique is appropriate. The method described here, and the procedure that implements it, eliminates or partly mitigates the adverse conditions described above.

The complete text of this report is available as an Adobe Acrobat PDF file.

Read the PDF version (2 MB)


Kansas Geological Survey
Placed on web March 16, 2015; originally published 1966.
Comments to webadmin@kgs.ku.edu
The URL for this page is http://www.kgs.ku.edu/Publications/Bulletins/CC/2/index.html