Dakota--OFR 93-1B--3--Comparison of Methods

Kansas Geological Survey, Open-File Rept. 93-1B
Statistical Methods for Delineating Water Quality--Page 3 of 5

Comparison of Methods

Piper Trilinear Diagrams and Geographic Distribution

The 250 water analyses from the statewide distribution of one sample per location were converted to milliequivalents per liter (meq/L) and to percentages (%) of total cations and total anions for use in producing trilinear plots. The cations used were total dissolved calcium (Ca), magnesium (Mg), and sodium (Na). The anions used were total dissolved chloride (Cl), sulfate (SO₄), and bicarbonate (HCO₃).
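The conversion described above can be sketched as follows. This is a minimal illustration, not the procedure used in the report: it assumes the analyses are reported in mg/L, uses equivalent weights computed from standard atomic weights, and the sample values and function names are made up for the example.

```python
# Equivalent weight (mg per meq) = formula weight / ionic charge.
EQUIVALENT_WEIGHT = {
    "Ca": 40.078 / 2,
    "Mg": 24.305 / 2,
    "Na": 22.990 / 1,
    "Cl": 35.453 / 1,
    "SO4": 96.06 / 2,
    "HCO3": 61.02 / 1,
}
CATIONS = ("Ca", "Mg", "Na")
ANIONS = ("Cl", "SO4", "HCO3")


def to_meq_per_l(sample_mg_per_l):
    """Convert each ion concentration from mg/L to meq/L."""
    return {ion: conc / EQUIVALENT_WEIGHT[ion]
            for ion, conc in sample_mg_per_l.items()}


def percent_of_group(meq, ions):
    """Express each ion as a percentage of the summed meq/L of its group."""
    total = sum(meq[ion] for ion in ions)
    return {ion: 100.0 * meq[ion] / total for ion in ions}


# Hypothetical sample (mg/L); not taken from the report.
sample = {"Ca": 80.0, "Mg": 24.0, "Na": 46.0,
          "Cl": 71.0, "SO4": 96.0, "HCO3": 244.0}

meq = to_meq_per_l(sample)
cation_pct = percent_of_group(meq, CATIONS)  # plotted on the cation triangle
anion_pct = percent_of_group(meq, ANIONS)    # plotted on the anion triangle
```

The two percentage sets each sum to 100, which is what allows a sample to be placed as a single point in each triangle of a trilinear diagram.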

Figure 1. Location of study area subdivided into groups on the basis of geology. Zones 1 to 4 are used for plotting of trilinear diagrams and statistical tests. Symbols represent counties in each zone in figures 2 and 3.

The study area was divided into four zones (based on geologic outcrop, subcrop, and depth of well) to facilitate presentation of similar water chemistry (fig. 1). These zones are the basis for the trilinear diagram groupings. The counties in zones 1 and 4, the outcrop and subcrop regions of the Dakota aquifer, were combined in fig. 2 because of similar chemistry and depth of wells. The counties in zones 2 and 3, the deeper portions of the Dakota Formation, are shown together in fig. 3 because of similar chemistry and depth of wells.

Figure 2. Piper (1944) trilinear diagrams of zones 1 and 4. Areas are grouped together based on their geology. Symbols represent counties in each zone.

Figure 3. Piper (1944) trilinear diagrams of zones 2 and 3. Areas are grouped together based on their geology. Symbols represent counties in each zone.

The trilinear diagrams show an evolution of water types as one moves westward and toward the interior of Kansas. The north-central and southwestern parts of the state (zones 1 and 4; figs. 1 and 2) have predominantly calcium bicarbonate waters with some sodium bicarbonate and a few sodium chloride waters. These water types represent recharge areas in the outcrop and subcrop regions of the Dakota aquifer and reflect the presence of local flow systems.

The west-central and central parts of the state (zones 2 and 3; figs. 1 and 3) have more sodium chloride and mixed-water types than either zone 1 or 4. These water types are representative of confined regional flow areas. In addition, particularly in zone 2, there is evidence of saltwater intrusion upward from the underlying Permian units.

Discriminant Analysis

Discriminant analysis requires that the data be grouped into predefined classes before the analysis is performed. The procedure calculates one or more discriminant functions, depending on the number of predefined groups. These functions are applied to the data to determine whether each sample fits its predefined group. A sample that fails this internal test is reclassified into a new group. Two tables are generated: a table of misclassified samples, and a group-fit table giving the percentage of samples that fit the predefined groups together with the reclassified samples. The group-fit table indicates how well the statistical method works for analysis of the data.
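The classify-and-reclassify procedure described above can be sketched with a small linear discriminant. This is an illustration under stated assumptions, not the program used in the study: it assumes two variables, a pooled within-group covariance matrix, and prior probabilities proportional to group size. The data and all function names are made up.

```python
import math

# Made-up samples with two variables each; keys are the predefined groups.
# The last sample in group 1 is deliberately placed near group 2.
groups = {
    1: [(1.0, 1.2), (1.3, 1.0), (1.1, 1.5), (1.4, 1.3), (2.6, 2.6)],
    2: [(2.4, 2.6), (2.6, 2.3), (2.5, 2.8), (2.8, 2.5)],
    3: [(3.6, 3.4), (3.4, 3.7), (3.8, 3.6), (3.5, 3.9)],
}


def mean_vector(rows):
    return [sum(r[k] for r in rows) / len(rows) for k in range(2)]


def pooled_covariance(groups, means):
    """Within-group scatter summed over groups, divided by (N - groups)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for label, rows in groups.items():
        for r in rows:
            d = [r[0] - means[label][0], r[1] - means[label][1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = sum(len(rows) for rows in groups.values()) - len(groups)
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]


def invert_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def fit(groups):
    """Return, per group, the weights and constant of its linear score."""
    means = {g: mean_vector(rows) for g, rows in groups.items()}
    inv = invert_2x2(pooled_covariance(groups, means))
    n_total = sum(len(rows) for rows in groups.values())
    model = {}
    for g, m in means.items():
        w = [inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1]]
        c = -0.5 * (w[0] * m[0] + w[1] * m[1]) + math.log(len(groups[g]) / n_total)
        model[g] = (w, c)
    return model


def classify(model, x):
    """Assign x to the group with the largest linear score."""
    return max(model, key=lambda g: model[g][0][0] * x[0]
               + model[g][0][1] * x[1] + model[g][1])


model = fit(groups)
# Table of misclassified samples: (old group, new group, sample).
misclassified = [(label, classify(model, x), x)
                 for label, rows in groups.items() for x in rows
                 if classify(model, x) != label]
n_total = sum(len(rows) for rows in groups.values())
percent_correct = 100.0 * (n_total - len(misclassified)) / n_total
```

Here `misclassified` plays the role of the table of misclassified results, and `percent_correct` the overall figure reported in a group-fit table.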

For this study, chloride concentration was used as the group delimiter for the discriminant analysis because chloride is the most commonly measured constituent and is a conservative ion; that is, the concentration of chloride in water is not generally affected by microbial or geochemical processes or by water-rock interactions. The group classifications, as total chloride concentration (in mg/L), were Cl < 250, 250 <= Cl <= 1500, and Cl > 1500. The 250 mg/L limit was chosen because this is the maximum recommended contaminant limit set by the U.S. EPA for safe drinking water. The 250 <= Cl <= 1500 range was set for mixed-water types. Concentrations greater than 1500 mg/L are considered brines.
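The three chloride classes can be written as a small function. The thresholds come from the text above; the function name is hypothetical.

```python
def chloride_group(cl_mg_per_l):
    """Return the predefined group (1, 2, or 3) for a chloride value in mg/L."""
    if cl_mg_per_l < 250:
        return 1   # below the 250 mg/L drinking-water limit
    if cl_mg_per_l <= 1500:
        return 2   # mixed-water types, 250 <= Cl <= 1500
    return 3       # brines, Cl > 1500
```

Both boundary values (250 and 1500 mg/L) fall in group 2, following the inequalities as stated.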

The group classifications and the use of transformed data resulted in an 87.2% correct classification of the data (table 1). Because sodium is generally associated with chloride, the tests were run both with and without sodium (Na⁺) to determine whether sodium acted as a secondary group delimiter. The results of the analyses were similar, indicating the effectiveness of using chloride as the sole group delimiter.

The analyses that were misclassified and assigned to another group had large percentages of cations or anions other than chloride in the data set. These other concentrations influenced the calculation of the discriminant function and caused the different classification (Table 2).

Table 1. Discriminant Analysis Classification Results

Actual group	Group no.	No. of cases	Predicted group 1	Predicted group 2	Predicted group 3
Cl < 250	1	181	166 (91.7%)	15 (8.3%)	0 (0.0%)
250 <= Cl <= 1500	2	48	12 (25.0%)	36 (75.0%)	0 (0.0%)
Cl > 1500	3	21	0 (0.0%)	5 (23.8%)	16 (76.2%)

Percent of "grouped" cases correctly classified: 87.2%
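The percentages in table 1 follow directly from the case counts. The short check below uses only the counts reported in the table; the variable names are illustrative.

```python
# Rows: actual group 1-3; columns: predicted group 1-3 (counts from table 1).
confusion = [
    [166, 15, 0],
    [12, 36, 0],
    [0, 5, 16],
]

total_cases = sum(sum(row) for row in confusion)
correct_cases = sum(confusion[i][i] for i in range(len(confusion)))
overall_percent = 100.0 * correct_cases / total_cases

# Row percentages: share of each actual group assigned to each predicted group.
row_percent = [[100.0 * n / sum(row) for n in row] for row in confusion]
```

The diagonal holds the correctly classified cases (166 + 36 + 16 = 218 of 250), and the off-diagonal counts (15, 12, and 5) match the numbers of samples in each part of table 2.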

Table 2. Misclassifications from Discriminant Analysis.

County	Township, range, section, quarter section	Old group	New group	Ca	Mg	Na	Cl	SO₄	HCO₃
Cloud	07S 05W 02BBAB	1	2	104	17.9	403	78.1	568	684
Ellis	15S 20W 35CAA	1	2	5	3	350	215	179	320
Gove	14S 26W 14DC	1	2	4.5	2.2	430	210	180	560
Gove	14S 27W 11DC	1	2	2.4	1	370	140	86	620
Greeley	17S 42W 36BBB	1	2	18	8.8	380	220	210	450
Logan	14S 33W 22CDD	1	2	4	1	350	70	230	450
Logan	14S 35W 17BCA	1	2	4.8	1	320	69	100	600
Lane	16S 28W 04BCD	1	2	4.8	2.8	300	120	77	480
Lane	16S 29W 11CCC	1	2	5.2	3	380	110	39	730
Mitchell	07S 06W 10CBB	1	2	10	2.3	350	110	150	570
Pawnee	20S 19W 28AAD	1	2	26	16	370	130	320	480
Rush	17S 16W 20	1	2	5	3	303	119	137	431
Trego	15S 24W 15CCC	1	2	26	1.1	429	244	304	390
Trego	14S 24W 19CCA	1	2	4	2	361	201	192	337
Wichita	18S 37W 24CCCA	1	2	5	2.6	365	63	500	350
Cloud	06S 05W 06CB	2	1	20	13	340	250	74	500
Cloud	08S 05W 01CCCD	2	1	78.3	18	257	294	97	402
Ellsworth	15S 10W 36CCB	2	1	140	55	710	1200	170	370
Ellsworth	16S 08W 04DAA	2	1	130	43	180	310	160	340
Graham	08S 23W 25	2	1	158	22	271	485	197	205
Hodgeman	62IS21W 31DDA	2	1	27	14	480	540	190	270
Hodgeman	23S 22W 29DDD	2	1	35	19	460	560	110	330
Mitchell	06S 07W 14BADD	2	1	89.7	35.6	380	507	128	458
Pawnee	22S 17W 19CBB	2	1	42	26	490	610	130	340
Rush	16S 17W 16DCDC	2	1	24	7.5	402	422	186	271
Rush	17S 20W 30CCB	2	1	32	25	400	270	400	300
Russell	14S 13W 12DAD	2	1	74	19	400	501	178	299
Ellis	15S 17W 23AB	3	2	35	26	1330	1650	389	370
Republic	04S 05W 23BC	3	2	19	15	1500	1700	260	920
Rice	19S 10W 33AAD	3	2	1100	84	210	2300	31	120
Rooks	10S 19W 35	3	2	56	58	1660	1920	664	630
Russell	15S 14W 07ABD	3	2	22	21	1400	1800	310	460

The advantage of using discriminant analysis is that the resulting groups can be easily plotted and evaluated geographically to determine whether there is a trend in the data resulting from location. Figures 4 and 5 show the original and reclassified data based on the three groups listed in table 1. Figure 6 shows the data distribution and the points that were reclassified.

Figure 4. Plot of original classification of discriminant analysis data based on concentration of chloride.

Figure 5. Reclassification of discriminant analysis data based on concentration of chloride.

Figure 6. Data distribution and reclassification results of discriminant analysis.

Figures 4 and 5 show that the majority of the samples with low chloride concentrations (<250 mg/L) are in the outcrop and subcrop regions of the Dakota aquifer. The mixed-water types (250 mg/L <= chloride <= 1500 mg/L) are in the southwestern part of Kansas and in parts of the northeast. The zone of high chloride (>1500 mg/L) is near the middle of the state, with a few sites in the northeast.

Comparison of the discriminant analysis data with the trilinear classification of samples (figs. 2-5) suggests that chloride concentration alone may not be the best indicator of water typing for the area. As shown in table 2, the values that were misclassified from group 1 to 2 are mixed-water types with chloride concentrations of less than 250 mg/L but with a correspondingly high concentration of another anion and associated cation. The other misclassifications have chloride concentrations of greater than 250 mg/L but also higher concentrations of sulfate (SO₄²⁻) and/or bicarbonate (HCO₃⁻).

The discriminant analysis is in close agreement with the water typing for samples that have chloride concentrations greater than 1500 mg/L (figs. 3-5). Based on these results, discriminant analysis can provide a reasonable estimate for defining geographic areas of interest for further study by generating the table of misclassified data. Further evaluation of the misclassified data is presented in the discussion section.

Factor Analysis

Factor analysis is frequently used to reduce multiple variables for a single location to several factors that help to explain the relationship among the variables. In the current set of water chemistry data for the Dakota aquifer, the factor analysis did not result in any additional clarification of the data beyond that provided by discriminant analysis; however, the results were useful for plotting the data distribution by factor score. The factor score maps support the results of other methods for the distribution of water chemistry throughout Kansas.

The variables in our analysis were projected as vectors onto two arbitrarily oriented new axes termed factor axes and labeled factor I and factor II. If the factor axes are assumed to be of unit length as measured from the origin, then the value (termed loading) of any given projection from a variable vector onto a factor axis must be between -1 and +1.

The effectiveness of factor loadings in representing the relationships between variables is shown by the communality, which is the sum of the squared factor loadings for each variable (Harbaugh and Demirmen, 1964). The present analysis shows that 88.8% (mean communality) of the relationships between the six chemical variables is explained by the two factors (table 3). The factor matrix indicates that five of the variables (Mg, Na, Cl, SO₄, and HCO₃) contribute to factor I and that one variable (Ca) contributes the most to factor II.
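The communalities can be recomputed from the loadings reported in table 3. The sketch below uses only those published loadings; the variable names are illustrative, and small differences in the last decimal place are expected because the published loadings are rounded.

```python
# (factor I loading, factor II loading) for each variable, from table 3.
LOADINGS = {
    "Ca": (0.45873, 0.85499),
    "Mg": (0.95046, 0.08248),
    "Na": (0.96082, -0.13155),
    "Cl": (0.95075, 0.03581),
    "SO4": (0.90435, -0.01403),
    "HCO3": (0.76243, -0.47947),
}

# Communality: sum of squared loadings across the factors, per variable.
communality = {var: f1 ** 2 + f2 ** 2 for var, (f1, f2) in LOADINGS.items()}
mean_communality = sum(communality.values()) / len(communality)

# Share of total variance carried by factor I: mean of its squared loadings.
factor1_share = sum(f1 ** 2 for f1, _ in LOADINGS.values()) / len(LOADINGS)
```

The mean communality comes out near 0.888, and the factor I share near 0.72, in line with the 72% of variation attributed to factor I in the text.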

The original data were converted to factor scores by first converting them to z scores, z = (x - x̄)/s (where x is the sample value, x̄ is the mean, and s is the standard deviation), and then multiplying the z scores by the factor score coefficient matrix generated by the SPSS factor analysis program. The calculated factor scores for the dominant factor (factor I, which accounted for 72% of the variation in the data) were plotted on a map of the areal distribution of the Dakota aquifer (fig. 7).
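The two steps, standardizing and then weighting by score coefficients, can be sketched as follows. The report does not list the SPSS factor score coefficients, so the coefficients below are placeholders, the concentrations are made up, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values as (x - mean) / s, with s the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]


def factor_scores(data, coefficients):
    """Weighted sum of z scores for each sample, for a single factor."""
    standardized = {var: z_scores(vals) for var, vals in data.items()}
    n_samples = len(next(iter(data.values())))
    return [sum(coefficients[var] * standardized[var][i] for var in data)
            for i in range(n_samples)]


# Made-up concentrations (mg/L) for four samples; not from the report.
data = {
    "Na": [30.0, 350.0, 480.0, 1400.0],
    "Cl": [20.0, 120.0, 540.0, 1800.0],
    "SO4": [40.0, 150.0, 190.0, 310.0],
}
# Placeholder score coefficients for one factor; hypothetical values.
coefficients = {"Na": 0.4, "Cl": 0.4, "SO4": 0.3}

scores = factor_scores(data, coefficients)
```

Because every variable is standardized first, the scores are centered on zero, and a sample that is high in all heavily weighted variables receives the highest score.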

Figure 7. Highest factor analysis scores plot in central portion of state.

The area of highest values plots in the central part of Kansas in Russell, Barton, Ellsworth, Ellis, Trego, Rooks, and Rush counties (fig. 7). This concentration of high factor values adds support to the discriminant analysis plots (figs. 4 and 5) which indicate the occurrence of high chloride waters in this part of the state, to the rank order tests for correlation between chloride concentration and depth (next section), and to the predominance of sodium chloride waters on the trilinear diagrams for this area (fig. 3). The implication is that upward recharge of underlying saltwater occurs in this area because the base of the Dakota is in hydraulic contact with lower units, resulting in mixing of waters in this area (Townsend et al., 1989; Macfarlane et al., 1988). Factor analysis of the chemistry data in conjunction with hydrogeologic parameters may result in an increased understanding of the flow system and geochemical mixing in future studies of this area.

Table 3. Factor Matrix (Loadings)

	Ca	Mg	Na	Cl	SO₄	HCO₃
Factor I	0.45873	0.95046	0.96082	0.95075	0.90435	0.76243
Factor II	0.85499	0.08248	-0.13155	0.03581	-0.01403	-0.47947
Communality	0.94144	0.91018	0.94047	0.90521	0.81805	0.81118

Mean communality = 0.888

Nonparametric Correlation

Nonparametric tests (Kendall's τ and Spearman's ρ) were used to measure the degree of correlation between chloride concentration and depth. Both methods are rank-order tests of the degree of independence between random variables. There are 926 valid cases in the data set. The calculated Spearman's ρ of 0.365 is significant, indicating that some relationship exists between depth and chloride content (table 4). The value of Kendall's τ (0.247) also shows that some correlation exists between chloride and depth. Both values are small, suggesting that the correlation is not strong.
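Both rank-order statistics can be computed with a few lines of code. The sketch below is an illustration, not the program used in the study: it assumes average ranks for ties in the Spearman calculation and the tau-b form of Kendall's statistic, since the report does not say which variant was used. The depth and chloride values are made up.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    denom = ((pairs + tied_x_only) * (pairs + tied_y_only)) ** 0.5
    return (concordant - discordant) / denom


# Hypothetical well depths and chloride concentrations (mg/L).
depth = [50, 120, 200, 310, 450]
chloride = [40, 25, 600, 300, 2100]

rho = spearman_rho(depth, chloride)
tau = kendall_tau_b(depth, chloride)
```

For the same data, Kendall's statistic is typically smaller in magnitude than Spearman's, as is also the case for every row of table 4.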

The data set (926 measurements) was also evaluated for a correlation between depth and chloride concentration (in mg/L) in smaller groups based on county code and approximate geologic depth of the Dakota Formation (table 4; fig. 1). The results show that the north-central tier of counties (zone 1), which corresponds to the outcrop regions of the Dakota Formation, has low Kendall and Spearman values, indicating that depth and chloride are not well correlated; zones 3 (west-central) and 4 (south-central) also show low correlations; and zone 2 (central) shows the best correlation for both methods.

Table 4. Kendall and Spearman Correlation for Chloride and Depth

Area of interest	Sample size	Kendall	Spearman
All zones	926	0.247^a	0.365^a
Zone 1	357	-0.002^b (0.951)	0.006^b (0.908)
Zone 2	301	0.465^a	0.662^a
Zone 3	100	0.326^a	0.469^a
Zone 4	150	0.189^b (0.0007)	0.300^b (0.0002)

a. Significant at the 0.0001 level.
b. Significance level indicated in parentheses.


Kansas Geological Survey, Dakota Aquifer Program
Original report available from the Kansas Geological Survey.
Electronic version placed online Nov. 1998
URL=http://www.kgs.ku.edu/Dakota/vol3/ofr93_1b/rep03.htm