Dataset Refinement and Editing:
Why filter? There are a number of reasons why the user may wish to
modify a dataset by “filtering”: selecting only a certain range of values
for one or more of the variables. Examples include:
The first two examples can conveniently be done either within the database (the on-line filter option) or off-line, in a dataset downloaded for modification and uploading. The geographic range definition is best done off-line at present.
On-line filtering: After the variables have been selected, proceed to the variable review page, where a "Filter" button is available next to the listing of each variable. Clicking a Filter button displays the range of values for that variable over the current geographic and cell type selection, along with a set of button choices: Greater than, Less than, Equal, Not Equal, Between. Select one of these and enter the appropriate numerical value in the box. For the "Between" choice, the proper format is: 111 and 222
Repeat the process for as many variables as desired (it is not necessary to filter all variables, but the data set will be treated as filtered if any component is).
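The logic of the five filter choices can be illustrated with a minimal Python sketch. This is not the LOICZVIEW implementation: the function names, the row layout, and the variable names (rainfall, runoff) are hypothetical, the values are made up, and the inclusive treatment of the "Between" bounds is an assumption, since the page does not say whether the end values are included.

```python
import operator

# Hypothetical mapping of four of the button choices to comparisons.
COMPARISONS = {
    "Greater than": operator.gt,
    "Less than": operator.lt,
    "Equal": operator.eq,
    "Not Equal": operator.ne,
}


def make_filter(choice, entry):
    """Build a test for one variable from a button choice and the box text."""
    if choice == "Between":
        # The box text uses the "111 and 222" format.
        low_text, high_text = entry.split(" and ")
        low, high = float(low_text), float(high_text)
        # Assumption: both bounds are inclusive.
        return lambda value: low <= value <= high
    threshold = float(entry)
    compare = COMPARISONS[choice]
    return lambda value: compare(value, threshold)


def apply_filters(rows, filters):
    """Keep rows that pass every filter; variables without a filter are ignored."""
    return [row for row in rows
            if all(test(row[name]) for name, test in filters.items())]


# Made-up cells, for illustration only.
rows = [
    {"cell": 1, "rainfall": 50.0, "runoff": 5.0},
    {"cell": 2, "rainfall": 150.0, "runoff": 20.0},
    {"cell": 3, "rainfall": 300.0, "runoff": 80.0},
]
filters = {"rainfall": make_filter("Between", "111 and 222")}
kept = apply_filters(rows, filters)
```

Only one variable is filtered here, yet the whole result is a filtered dataset, which mirrors the note above that a dataset is treated as filtered if any component is.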
This variable review page confirms the geographic range, cell type, and variables selected. It also offers a choice of "No Null" for each of the variables. If this box is checked, any cells that have no value associated with that variable (indicated by -9999 in the database entry) will be dropped from the final data set. In general, elimination of null values will make a more statistically satisfying cluster group, but at the expense of omitting parts of the geographic visualization.
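The effect of the "No Null" box can be sketched in a few lines of Python. Only the -9999 null marker comes from the description above; the function name, the row layout, and the variable names are hypothetical, and the values are made up.

```python
NULL_VALUE = -9999  # null marker used in the database entries


def drop_nulls(rows, no_null_variables):
    """Drop any cell that has a null for a variable with "No Null" checked."""
    return [row for row in rows
            if all(row[name] != NULL_VALUE for name in no_null_variables)]


# Made-up cells, for illustration only.
cells = [
    {"cell": 1, "rainfall": 120.0, "runoff": -9999},
    {"cell": 2, "rainfall": -9999, "runoff": 14.0},
    {"cell": 3, "rainfall": 300.0, "runoff": 80.0},
]

# "No Null" checked for rainfall only: cell 2 is dropped.
rainfall_complete = drop_nulls(cells, ["rainfall"])

# "No Null" checked for both variables: only cell 3 survives.
fully_complete = drop_nulls(cells, ["rainfall", "runoff"])
```

The second call shows the trade-off described above: the more variables are checked, the fewer cells remain to be mapped.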
Once you have made the decisions at this stage, proceed to the Generate Cluster Data step.
Off-line filtering: Although the data selection process provides basic filter capabilities, it will never be possible to provide every kind of tool that the advanced user might desire. Fortunately, the LOICZVIEW capability to accept uploaded datasets, in combination with the database download option, permits the user to adjust data sets using relatively simple spreadsheet operations. At present, off-line filtering is the only practical way to modify a geographic range to an irregular (non-rectangular) shape. The following example provides a procedural outline.
EXAMPLE:
Clustering of the Australia-New Zealand region yielded poor results for
hydrologic variables when the standard geographic region selection (Zones
21 and 26) was used. This was because the rectangular lat-long boxes that
include all of Australia and NZ also include portions of Indonesia and New
Guinea, with a very different rainfall regime. Use of the coordinate selection
boxes cannot solve the problem, because a rectangular box that includes
all of Australia (south of 10 degrees S latitude) still clips enough of
Indonesia to skew the data distribution.
The following procedure can be used to adjust the Australia-NZ geographic region:
Once a geographic template is constructed, it can be applied to future dataset downloads.
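For users who prefer a script to a spreadsheet, applying a saved template to a new download could look something like the sketch below. It rests on several assumptions not stated on this page: that the download is comma-separated, that each cell is identified by latitude and longitude columns (here hypothetically named LAT and LON), and that the template is simply a list of the cells to keep. The function names and all data values are made up for illustration.

```python
import csv
import io


def load_template(template_file):
    """Read the template: one (latitude, longitude) pair per cell to keep."""
    reader = csv.DictReader(template_file)
    return {(float(row["LAT"]), float(row["LON"])) for row in reader}


def apply_template(dataset_file, template, output_file):
    """Copy only the cells listed in the template; return how many were kept."""
    reader = csv.DictReader(dataset_file)
    writer = csv.DictWriter(output_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if (float(row["LAT"]), float(row["LON"])) in template:
            writer.writerow(row)
            kept += 1
    return kept


# In-memory stand-ins for the template and a downloaded dataset.
template_text = "LAT,LON\n-25.5,135.5\n-41.5,174.5\n"
dataset_text = (
    "LAT,LON,RAINFALL\n"
    "-25.5,135.5,250\n"
    "-8.5,140.5,3200\n"   # cell outside the template, to be removed
    "-41.5,174.5,1100\n"
)

template = load_template(io.StringIO(template_text))
output = io.StringIO()
kept_count = apply_template(io.StringIO(dataset_text), template, output)
```

With real files, the io.StringIO objects would be replaced by files opened with open(); the filtered output is what would then be uploaded.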