U.S. National Science Foundation Project OCE 00-03970

The LOICZView clustering program was created by Bruce Maxwell. The web interface was created by Casey Smith.

NOTES AND SUGGESTIONS

-getting started in LOICZVIEW (LV)-

These are Bob's personal comments, and do not necessarily represent the views of anybody well informed. They are oriented primarily to the beginner. Good luck, and let us know what works, what doesn't, and what else you need. -- Bob Buddemeier, 1 April 01

 

A Basic Idea: Clustering is not a magic answer-generating machine; rather, it is a tool that can be combined with the standardized database (another tool) to facilitate the application and testing of your expert judgment -- or, alternatively, to help you use your judgment to expand your expertise.

There are two main classes of use. One is exploratory -- getting a real picture of the characteristics of the landscape when you are working with variables or regions not fully familiar to you. The other is hypothesis testing -- using the tool to develop and calibrate a statistical model of spatial distributions or characteristics. Although these uses are related and can overlap, it is wise to remember that their operational strategies are different. At the point where you shift from one mode to the other, it is a good idea to step back and review your approach.

 

1. If you just want to practice with LOICZVIEW, you can bypass the database front end and go directly to LV at www.palantir.swarthmore.edu/loicz. There is a built-in dataset of coastal variables for Australia at one- (rather than half-) degree resolution -- you can use it to test the system and practice a bit.

2. If you can't wait to personalize the process with edited or added data, LV can accept uploads from a spreadsheet -- format instructions and procedures can be found on the "Data" page of LV, as well as in the "guidelines to data filtering" that are linked to the Tool pages of the site.

3. While it can be fun to do "garbage can" (lots of variables, lots of area) clusters, and they do tend to show that there is something useful to be discovered, they are too complicated to refine very effectively. To develop skill and understanding, it is probably better to pick a region and a small number of variables you are familiar with, and work up gradually to exploring more complex interactions and larger systems. Comparing such results with "big picture" clustering exercises can be very informative about issues of scale and range of variables.

4. Keep notes and records (in addition to saved/downloaded files) -- not only can LV handle more variables than humans can integrate, it can also quickly generate more images and datasets than we can remember. Save your results by copying or downloading the cluster summaries and/or source files -- this protects you against discovering that your best result came early in the process but you didn't save it because you were sure you could do better.

5. Share results and questions -- we all want to know, and some may be able to help out with problem fixes or ideas. We'll be happy to post results on the website for review and discussion if that's not something you can do yourself.

6. The Minimum Description Length (MDL) option in LV is a tool for determining the optimum number of clusters for a given data set. It has two problems -- it can be very slow to run, and it tends to suggest numbers of clusters that are too large for easy visualization. It is best used in the refinement stage, when you think you have a basic strategy developed and are working on final implementation.

7. It's most effective to 'overcluster' and then combine or back down. The "Merge Cluster" function is not presently enabled on the web version of LV (it will be eventually), but you can do something similar by reviewing the cluster sizes, similarities, and statistics.

8. In terms of number of iterations and number of runs, it will be easiest (and generally safe, even if not optimally efficient) to accept the default values in the early stages of learning.
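
The "overcluster, then combine" idea in note 7 can be tried offline on cluster summaries you have saved. What follows is only a rough sketch in Python, not LV's "Merge Cluster" function: it assumes you have copied each cluster's size and per-variable mean into a list by hand (the `size`/`mean` layout, the function names, and the example numbers are all made up for illustration), and it assumes the variables are on comparable scales so that straight-line distance between means is a sensible measure of similarity.

```python
import math


def merge_closest(clusters):
    """Merge the two clusters whose mean vectors are closest (Euclidean).

    clusters: list of dicts like {"size": int, "mean": [float, ...]}.
    This layout is a hypothetical one, not an LV file format.
    Returns a new list containing one fewer cluster.
    """
    if len(clusters) < 2:
        return list(clusters)

    best = None  # (distance, index_a, index_b) of the most similar pair
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = math.dist(clusters[i]["mean"], clusters[j]["mean"])
            if best is None or d < best[0]:
                best = (d, i, j)

    _, i, j = best
    a, b = clusters[i], clusters[j]
    n = a["size"] + b["size"]
    # The size-weighted average of the two means is the mean of the union.
    mean = [(a["size"] * x + b["size"] * y) / n
            for x, y in zip(a["mean"], b["mean"])]
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [{"size": n, "mean": mean}]


def merge_down(clusters, target):
    """Repeatedly merge the closest pair until only `target` clusters remain."""
    while len(clusters) > max(target, 1):
        clusters = merge_closest(clusters)
    return clusters


# Made-up summaries for four clusters described by two variables.
example = [
    {"size": 10, "mean": [0.0, 0.0]},
    {"size": 30, "mean": [1.0, 0.0]},
    {"size": 20, "mean": [10.0, 10.0]},
    {"size": 5, "mean": [10.0, 12.0]},
]

for cluster in merge_down(example, 2):
    print(cluster["size"], cluster["mean"])
```

Merging on means alone ignores the spread within each cluster, so treat the result as a prompt for your own judgment about which clusters belong together rather than as an answer.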