
Preliminary Report on Statistical Quality Control for Year 2002 Water Well Measurements

John C. Davis

Kansas Geological Survey
1930 Constant Avenue
Lawrence, Kansas 66047-3726

Report to the Director of the
Kansas Geological Survey
University of Kansas
Open-file Report No. 2002-5
Released Feb. 8, 2002, Electronic version created June 2002


The year 2002 Quality Control and Assurance Program for observation well water-level measurements in western Kansas is patterned after the quality assurance techniques developed during the preceding five years of field work and statistical analysis. This discussion of procedures is adapted from Miller, Davis, and Olea (1997), incorporating adjustments in the program that were noted in last year's report (Davis, 2001).

The primary variable measured in the water well observation program is depth to water in an observation well. This primary variable is associated with three secondary variables: the ground elevation, east-west coordinate, and north-south coordinate of the well. The secondary variables serve to locate the primary variable in space, and make it possible to determine spatial relationships between observation wells, including mapping the water table and calculating changes in aquifer volume. Historically, the three location variables were determined initially by the U.S. Geological Survey for each well and not re-determined unless a serious error in the original coordinates was suspected. In the 1997 ground water observation measurement program conducted by the Kansas Geological Survey, the geographic (latitude and longitude) coordinates of all wells were re-determined by GPS techniques. In subsequent years' measurement programs, the locations of all observation wells were again re-determined by GPS. "Selective Availability," which limited the resolution of GPS measurements, was turned off by the Federal government in 2001, so locations determined that year were substituted for previous determinations. For a few locations where year 2001 GPS measurements were not taken, measurements made in 1999 are used.

In addition, several secondary characteristics of the observation wells and of the measurement procedure were noted in order to determine whether they might influence the quality of the measurements being made (these characteristics are referred to as exogenous variables). As part of the quality control program, water level measurements were repeated two or more times on 171 wells, yielding a collection of 225 quality control observations. Because these data include replicates, they provide an additional check on estimates of the influence of well conditions or measuring techniques on water levels. A subsequent round of measurements resampled 50 wells selected at random from the original set for quality assurance purposes. These wells were measured two or more times for a set of 111 quality assurance values.
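One common way to use replicate readings as a check on measurement repeatability is a pooled within-well standard deviation. The following is a minimal sketch under that assumption; the report does not state that this particular statistic was computed, and the well names and readings are hypothetical.

```python
from math import sqrt
from statistics import mean

# Hypothetical replicate depth-to-water readings (ft); not taken from the report.
replicates = {
    "well_A": [101.2, 101.3],
    "well_B": [56.0, 56.1, 55.9],
    "well_C": [210.4, 210.4],
}

def pooled_within_sd(groups):
    """Pooled standard deviation of repeated readings within each well."""
    sum_sq = 0.0
    deg_free = 0
    for readings in groups.values():
        well_mean = mean(readings)
        sum_sq += sum((x - well_mean) ** 2 for x in readings)
        deg_free += len(readings) - 1   # each well contributes n - 1
    return sqrt(sum_sq / deg_free)

print(round(pooled_within_sd(replicates), 3))
```

Because each well is compared only with its own mean, differences in water level between wells do not enter the estimate; only the scatter among repeated readings does.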

The primary variable, depth to water, changes with geographic location and differences in topography so much that these factors will overwhelm all other sources of variation. Because of this, any errors in location may have a profound effect on the water table elevation. To avoid the complications of simultaneously considering uncertainties in the secondary variables, this statistical quality control study is based on first differences (specifically, the difference between 2002 and 2001 depth-to-water measurements). The secondary variables cancel out, leaving only the difference in depth, which is numerically identical to the year's change in water level. In this statistical quality control study, the difference between 2002 and 2001 corrected depth measurements is abbreviated "'02-'01." If the water table is lower this year, the variable '02-'01 will be a positive number. Because all wells measured in the current program were also measured in 2001, there are a total of 495 wells having the variable '02-'01. This is one more than the number of measurements available last year.
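The cancellation of the secondary variables can be illustrated with a short sketch. All numbers below are hypothetical, and the function name is introduced only for illustration; the point is that the first difference in depth does not depend on the ground elevation assigned to the well.

```python
# Hypothetical values for a single well (ft); not taken from the report's data.
depth_2001 = 150.0   # depth to water measured in 2001
depth_2002 = 151.5   # depth to water measured in 2002

def water_table_elevation(ground_elevation, depth_to_water):
    """Water-table elevation is ground elevation minus depth to water."""
    return ground_elevation - depth_to_water

# The variable '02-'01: positive when the water table is lower this year.
diff_02_01 = depth_2002 - depth_2001

# The change in water-table elevation is the same for any ground elevation,
# so an error in the surveyed elevation does not affect the first difference.
for ground_elevation in (3000.0, 3020.0):
    change = (water_table_elevation(ground_elevation, depth_2002)
              - water_table_elevation(ground_elevation, depth_2001))
    print(ground_elevation, change, diff_02_01)
```

In both cases the change in elevation is the negative of '02-'01, which is why the study can work with depth differences alone.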

The objective in our quality control study is to identify and assess possible sources of unwanted variation in water level measurements made by the KGS. The purpose of the analysis is to provide guidance to the KGS field measurement program, to suggest ways in which field measurements might be improved, and to provide information necessary to identify past or current measurements that are suspect. The statistical quality control and field measurement programs have been intimately intertwined from the outset when the KGS assumed responsibility in 1997 for measuring observation wells formerly measured by the USGS. A comparison of results from 2002 with those from previous years shows that the desired improvements in the measurement program are being achieved through quality control.

Statistical Procedures

Preliminary examination did not detect any wells that deviated from last year's measurement by a significant amount. Two wells that were measured in this year's program were not measured in 2001, so for them the variable '02-'01 cannot be calculated. All repeated measurements are excluded from this analysis to avoid inflating the total variance. A total of 492 observations are included in the initial statistical analysis, which is an unbalanced analysis of variance (ANOVA) procedure designed to estimate the influence of different well characteristics and procedural differences on variable '02-'01. The following variables have been recorded for each well.
1. Depth to water
2. GPS longitude
3. GPS latitude
4. Date
5. Measurer's initials
6. Well Access
    1 = good
    0 = poor
7. Weighted Tape
    1 = yes
    0 = no
8. Oil on Water
    1 = yes
    0 = no
9. Chalk Cut Quality
    2 = excellent
    1 = good
    0 = poor
In addition, the data file contains several variables that do not enter into the analyses. These include a unique USGS ID number and KGS ID designation, a surface elevation, a legal description of the well location, and a decimal latitude and longitude (obtained by LEO conversion of the legal description). Two other variables, taken from the historical records, are used in the statistical analyses. These are Well Use, the purpose for which water from the well is used, and Aquifer Code, which describes the primary source of water in the well. The manner in which aquifer code values were assigned is summarized in Miller, Davis, and Olea (1997).
10. Well Use
    H = household water supply
    S = stock water supply
    I = irrigation
    U = unused observation
    Z = animal disposal
11. Aquifer Code
    KD = Cretaceous Dakota aquifer
    KJ = undifferentiated Cretaceous/Jurassic aquifer
    KN = Cretaceous Niobrara aquifer
    QA = Quaternary alluvium aquifer
    QAQU = Quaternary alluvium and undifferentiated aquifers
    QAQUTO = Quaternary alluvium and undifferentiated aquifers and Tertiary Ogallala aquifer
    QATO = Quaternary alluvium and Tertiary Ogallala aquifers
    QU = Quaternary undifferentiated aquifer
    QUTO = Quaternary undifferentiated and Tertiary Ogallala aquifers
    QUTOKJ = Quaternary undifferentiated, Tertiary Ogallala, and Cretaceous/Jurassic aquifers
    QUTOKD = Quaternary undifferentiated, Tertiary Ogallala, and Cretaceous Dakota aquifers
    QUKD = Quaternary undifferentiated and Cretaceous Dakota aquifers
    TO = Tertiary Ogallala aquifer
    TOKD = Tertiary Ogallala and Cretaceous Dakota aquifers
    TOKJ = Tertiary Ogallala and undifferentiated Cretaceous/Jurassic aquifers
The set of aquifer codes used in 2002 differs slightly from previous years because of changes in the areas where the KGS measures wells. Specifically, the code KU is no longer used, and the code QUTOKD has been added. The initial statistical model includes all exogenous variables recorded during the quality control study that may contribute to the variability in the response, '02-'01, plus the variables Well Use and Aquifer Code. Unlike the 2001 measurement program, several exogenous variables contribute significantly to the total variance. These include a significant operator effect as measured by the variable Measurer, and significant effects of Well Access and Oil on Water. Unlike previous years, Chalk Cut Quality is not a significant source of variation. As expected, there are significant contributions to total variance from Well Use and Aquifer Code.
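The idea behind the F ratios reported for each variable can be sketched with a one-factor analysis of variance on groups of unequal size. This is a deliberate simplification: the report's model fits all variables simultaneously, which this sketch does not attempt, and the data values and function name are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (df_between, df_within, F ratio) for a dict of group -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means about the grand mean, weighted by group size.
    ss_between = sum(len(values) * (mean(values) - grand_mean) ** 2
                     for values in groups.values())
    # Variation of individual values about their own group mean.
    ss_within = sum((v - mean(values)) ** 2
                    for values in groups.values() for v in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, f_ratio

# Hypothetical '02-'01 values (ft) for wells with good and poor access.
changes_by_access = {
    "good": [1.0, 2.0, 3.0, 2.0],
    "poor": [4.0, 6.0],
}
print(one_way_anova(changes_by_access))
```

A large F ratio means the group means differ by more than the scatter within groups would suggest; the significance level then follows from the F distribution with the two degrees of freedom returned.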

Analysis of Variance Table for Initial Model
Source               DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Well Access           1          28.1042       28.1042    4.7577   0.0297    *
Weighted Tape         1           4.8052        4.8052    0.8135   0.3676    ns
Well Use              4          78.3423       19.5856    3.3156   0.0108    *
Oil on Water          1          31.8955       31.8955    5.3995   0.0206    *
Chalk Cut Quality     2          34.0758       17.0379    2.8843   0.0569    ns
Aquifer Code         13         249.7334       19.2103    3.2520   <0.0001   **

RSquare 0.1528
ns = Not significant; * = Significant; ** = Highly significant
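The columns of the table above are linked by two standard relations: Mean Square = Sum of Squares / DF, and F Ratio = Mean Square / error mean square. The error mean square is not printed in the table, so the sketch below backs it out from each row as a consistency check; the function name is introduced only for illustration.

```python
# Rows transcribed from the initial-model table:
# (source, DF, sum of squares, F ratio)
rows = [
    ("Well Access", 1, 28.1042, 4.7577),
    ("Weighted Tape", 1, 4.8052, 0.8135),
    ("Well Use", 4, 78.3423, 3.3156),
    ("Oil on Water", 1, 31.8955, 5.3995),
    ("Chalk Cut Quality", 2, 34.0758, 2.8843),
    ("Aquifer Code", 13, 249.7334, 3.2520),
]

def implied_error_mean_square(deg_free, sum_sq, f_ratio):
    """Back out the error mean square from one row of an ANOVA table."""
    mean_sq = sum_sq / deg_free    # Mean Square = Sum of Squares / DF
    return mean_sq / f_ratio       # F Ratio = Mean Square / error mean square

implied = {name: implied_error_mean_square(df, ss, f) for name, df, ss, f in rows}
for name, value in implied.items():
    print(f"{name:18s} {value:.3f}")
```

Every row implies nearly the same error mean square, about 5.91, which indicates that the reconstructed columns are internally consistent.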

A revised model was run that combined aquifers into classes similar to those used in 1997 through 2001. This 5-part classification distinguishes between (1) wells that tap alluvial aquifers, (2) wells that tap both alluvial aquifers and other unconsolidated aquifers, (3) wells drawing from the High Plains aquifer, (4) wells into bedrock aquifers, and (5) wells that draw from both bedrock and unconsolidated aquifers. This has the effect of reducing the degrees of freedom required for the model and thus increasing the sensitivity of the analysis for detecting other influences.
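A grouping of this kind might be sketched as follows. The rule used here is an assumption made for illustration, not the report's documented assignment: each aquifer code is split into two-letter units, units beginning with K are treated as bedrock, QA as alluvium, and QU and TO as High Plains units.

```python
# Aquifer codes as listed in the report.
AQUIFER_CODES = [
    "KD", "KJ", "KN", "QA", "QAQU", "QAQUTO", "QATO", "QU",
    "QUTO", "QUTOKJ", "QUTOKD", "QUKD", "TO", "TOKD", "TOKJ",
]

def aquifer_group(code):
    """Assign a code to one of five classes (hypothetical rule, see lead-in)."""
    units = {code[i:i + 2] for i in range(0, len(code), 2)}
    bedrock = {u for u in units if u.startswith("K")}   # assumed bedrock units
    unconsolidated = units - bedrock
    if bedrock and unconsolidated:
        return "bedrock + unconsolidated"
    if bedrock:
        return "bedrock"
    if unconsolidated == {"QA"}:
        return "alluvial"
    if "QA" in unconsolidated:
        return "alluvial + other unconsolidated"
    return "High Plains"

groups = {code: aquifer_group(code) for code in AQUIFER_CODES}
n_classes = len(set(groups.values()))
print(n_classes, "classes,", n_classes - 1, "degrees of freedom")
```

Five classes require 4 degrees of freedom, the value listed for Aquifer Group in the grouped-aquifer table, compared with 13 for the ungrouped Aquifer Code.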

Analysis of Variance table for Grouped Aquifers
Source               DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Well Access           1          27.2153       27.2153    4.5410   0.0336    *
Weighted Tape         1           4.7891        4.7891    0.7991   0.3718    ns
Well Use              4          90.3622       22.5906    3.7694   0.0050    **
Oil on Water          1          40.7660       40.7660    6.8021   0.0094    **
Chalk Cut Quality     2          32.4306       16.2153    2.7056   0.0679    ns
Aquifer Group         4         156.0442       39.0111    6.5092   <.0001    **

ns = Not significant; * = Significant; ** = Highly significant

Measurer, Well Access, Well Use, Oil on Water, and Aquifer Group are significant sources of variation in the revised model, similar to last year except that Oil on Water is significant and Chalk Cut Quality is not. Unfortunately, past models are not directly comparable because there are different numbers of degrees of freedom assigned to some variables, and the response (annual change in water level) has significantly different variances from year to year. It has been noted that the variance of the response variable seems to alternate in magnitude every other year; this pattern continues in 2002, which has a significantly lower variance than measurements made in 2001. Although the year-to-year changes in total variance are highly significant, the cause is speculative (Davis, 2001).

One way to improve the statistical results of the measurement program is to discard wells in which exogenous variables make unusually high contributions to the total variance, arguing that the readings from such wells are atypical and likely erroneous. Only six wells exhibited extreme behavior in 2001 and one of these was deleted from the network because it was plugged and could not be measured.

Importance of contributing variables

We can determine the relative contributions of each category of the contributing variables by examining the least-squares means (averages) of '02-'01 for a specified state of a variable, while holding all other variables at their average value. (In statistical terms, these averages are referred to as the expected values of the variables.) A positive value indicates the average depth to water in a well is greater in 2002 than in 2001 (the water level has declined from last year's measurement). That is, the elevation of the water level in the well is lower than it was previously. The following list gives the least-squares means for the complete data set.

[Tables of least-squares means for Measurer (* indicates new operator in 2002), Well Access, Weighted Tape, Well Use, Oil on Water, and Chalk Cut; values not recoverable.]

Geologic Group       Least Sq Mean
1 (Cretaceous)              2.8587
2 (Alluvium)                0.4822
3 (Al. + Tert.)             0.8856
4 (Tertiary)                1.6541
5 (Tert. + K)               0.5025

Summary of the Analyses of Variance

Year 2002 measurements show significant or highly significant variations attributable to Measurer, Well Access, Well Use, and Oil on Water in addition to differences between the aquifers being tapped by the wells. The standard deviation of variable '02-'01 is 2.56 ft., which is less than the standard deviation of variable '01-'00 (3.07 ft.), the standard deviation of variable '00-'99 (2.69 ft.), or the standard deviation of variable '99-'98 (4.21 ft.). The median decline in water level from 2001 to 2002 is 1.09 ft., slightly less than the median decline from 2000 to 2001 (1.39 ft.), but much greater than the 1999 to 2000 decline (0.31 ft.), the 1998 to 1999 decline (0.72 ft.), or the 1997 to 1998 decline (0.41 ft.).

There are significant differences between measurers, mostly attributable to JDM (who tended to produce shallower than expected measurements) and DRL and NC (who tended to record deeper measurements). Note that both JDM and NC are first-time measurers.

Water levels measured in 2002 in exclusively Cretaceous aquifers (Group 1) show declines of over 5.3 ft. from 2001. The water level in the Ogallala aquifer (Group 4) tends to be over 3.5 ft. deeper than last year. Measurements made in wells tapping alluvial aquifers (Group 2) show a decline of 2.2 ft., whereas last year these wells had a slight increase in average water level. Wells in alluvial plus other sources (Group 3) show a decline in water level of 2.3 ft. Water levels in wells tapping Cretaceous aquifers plus Quaternary and/or Tertiary aquifers (Group 5) tend to be 2.6 ft. deeper on average this year. There are highly significant differences in the annual change in water level among aquifers, mostly due to the behavior of Cretaceous wells. (Statistics for 2002 can only be compared in detail with those from 2001 because of the change in responsibility for wells in two counties that occurred after year 2000.)

The ANOVA equation can be used to create an expected value and residual (difference between observed and expected value) for each well. The distribution of residuals should be approximately normal. Examination of the residual outliers will reveal any well measurements which cannot be explained by extreme combinations of the different sources of variation. The residual plot, shown in Figure 1, deviates somewhat from normality; it is more peaked than normal, and slightly skewed to negative values. Outliers, or extreme values, are measurements which differ from their expected values by more than ±10 feet. Six wells have been identified by this process. These wells show changes in water level between 2001 and 2002 that are outside the range expected. These well measurements may be correct and reflect unusual changes in aquifer level; the wrong wells may have been measured in one year or the other; or changes in well construction or other factors may have altered the measurability of a well. The six wells, with their residuals, are:

Well ID               Residual, ft.
33S 37W 35ACD 01          -15.9
05S 40W 18ADB 01          -11.8
30S 32W 22BBB 01           10.0
24S 33W 19DBB 02           10.5
30S 32W 35BBA 01           10.8
27S 23W 28AAA 01           15.5

A positive residual indicates that the 2002 water level is lower than predicted in a well with a declining water level, or is not as high as predicted in a well with an increasing water level. A negative residual indicates that the 2002 water level has declined less than predicted in a well with a declining water level, or has risen more than predicted in a well with a rising water table. One well, 27S 23W 28AAA 01, has poor access and a weight could not be used on the tape; wells with such deficiencies should be considered for replacement. The remaining wells have very limited histories of measurement, have been unusually variable from year to year, and show an exceptionally large drawdown this year. However, because only a few wells had questionable measurements, the decision was again made not to have a post-season remeasurement program in 2002.
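The residual screening described above can be sketched as follows. The well names and values are hypothetical, and the expected values stand in for predictions from the fitted ANOVA model, which is not reproduced here.

```python
THRESHOLD_FT = 10.0   # residuals beyond +/-10 ft are treated as outliers

def flag_outliers(observed, expected, threshold=THRESHOLD_FT):
    """Return {well: residual} for wells whose residual exceeds the threshold."""
    flagged = {}
    for well_id, observed_change in observed.items():
        # Residual = observed '02-'01 minus the value predicted by the model.
        residual = observed_change - expected[well_id]
        if abs(residual) > threshold:
            flagged[well_id] = residual
    return flagged

# Hypothetical observed and expected '02-'01 values (ft).
observed = {"well_A": 1.2, "well_B": 14.0, "well_C": -9.5}
expected = {"well_A": 1.0, "well_B": 1.5, "well_C": 2.0}

for well_id, residual in flag_outliers(observed, expected).items():
    print(well_id, round(residual, 1))
```

With this sign convention a positive residual means the depth to water increased more than predicted, matching the interpretation given in the text.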

Quality Assurance (remeasure) Program

The year 2002 Quality Assurance program of random remeasurements showed that the QA data contained three statistically significant sources of variation. Fifty randomly selected QA wells were remeasured by experienced personnel during the period when the regular field measurement program was underway. These were combined with data from the regular measurement program, to yield 135 measurements for statistical quality control. Two significant exogenous variables, weighted tape and chalk cut, were detected. The variance among the QA replicates is essentially the same as the variance of the complete data set. However, the most extreme value of '02-'01 among the QA wells is only -10.3 ft., compared to an extreme of 21.2 ft. in the complete data set.

Within the QA data set alone, there are no significant contributions due to Measurer. Geological Units shows significant differences between levels in the QA data set, due almost entirely to the different behavior of KJ and QA from 2001 to 2002.


The purpose of the Quality Control and Assurance Program is to identify wells and procedural conditions that may contribute significantly to the variance of Depth to Water measured in observation wells, and which do not reflect true changes in the water table elevation. Gathering Quality Control information requires little additional effort by the field crews, emphasizes the importance of procedural consistency, and certifies performance. Quality Control for the year 2002 field season, like the 2001 season, is remarkably free of inconsistencies compared to previous field seasons. The results can be interpreted as demonstrating the value of training and the desirability of deleting troublesome wells from the monitoring program. The QA process continues to identify specific wells as troublesome, and flags well locations which require verification before being permanently incorporated into the WIZARD data base. The improvement in locational accuracy caused by removal of Selective Availability may also contribute to the improvement in measurement quality.

The Quality Control program has achieved its objectives of identifying and quantifying sources of unwanted variation in observation well data collection, and in flagging wells whose measurements require verification. It detected a small number of spurious values, confirming the benefits of "cleaning" the data base in past years. As the Quality Control process is routinely applied to KGS observation well measurements in the future, and particularly if it is applied to the entire Kansas observation well network, the quality of the groundwater measurement data will continue to be progressively improved with time.

Figure 1--Histogram of residuals from predicted change in water level '02-'01, as estimated by regression model. Curve is fitted normal distribution with same mean and variance as residuals. Wells whose change in water level deviates more than 10 feet from the predicted value are indicated.



References

Davis, J.C., 2001, Statistical Quality Control For Year 2001 Water Well Measurements: Kansas Geological Survey, Open-File Report 2001-2, 23 p. [Available Online]

Miller, R.D., J.C. Davis, and R.A. Olea, 1997, Acquisition Activity, Statistical Quality Control, and Spatial Quality Control for 1997 Annual Water Level Data Acquired by the Kansas Geological Survey: Kansas Geological Survey Open-File Report No. 97-33, 45 p. [Available Online]

Miller, R.D., J.C. Davis, and R.A. Olea, 1998, 1998 Annual Water Level Raw Data Report for Kansas: Kansas Geological Survey Open-File Report No. 98-7, 275 p., 6 plates, and 1 compact disk. [Available Online]


Kansas Geological Survey, Water Level CD-ROM
Updated June 13, 2002
Available online at URL = http://www.kgs.ku.edu/Magellan/WaterLevels/CD/Reports/OFR02_5/rep00.htm