At the OBIS workshop at URI on September 12-13, 2000, there was considerable discussion of issues related to specification, preservation and retrieval of information concerning spatial and temporal resolution (or uncertainty) associated with observations and collections. The following is an effort to start a discussion forum by proposing some general problem statements and identifying approaches to problem resolution that are presently in use or have been proposed.


Statement of the problem (proposed for discussion)

Many of the available records of organism occurrence/abundance contain useful but imprecise or non-quantitative location information. In addition, some sampling methods and survey techniques result in data that are aggregated over a significant area or water volume. Transforming this information into digital data for database storage and retrieval presents the challenge of identifying and providing access to the information without loss and without misrepresentation (e.g., spurious precision introduced by using a point coordinate for an areal observation).
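One way to picture the spatial problem is a record that keeps the original wording next to any derived coordinate and states how large the associated area is and what that area means. The following is only a sketch for discussion: the class name, field names, and the example values are hypothetical and are not drawn from any existing OBIS convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Locality:
    """Hypothetical locality record; all field names are assumptions."""
    verbatim: str                       # original wording, always retained
    lat: Optional[float] = None         # representative latitude, decimal degrees
    lon: Optional[float] = None         # representative longitude, decimal degrees
    radius_km: Optional[float] = None   # extent around (lat, lon), if known
    extent_kind: str = "unknown"        # "uncertainty", "sampled_area", or "unknown"

    def is_point(self, tolerance_km: float = 1.0) -> bool:
        """True only if a coordinate exists and its stated extent is within tolerance."""
        return (
            self.lat is not None
            and self.lon is not None
            and self.radius_km is not None
            and self.radius_km <= tolerance_km
        )


# Made-up example: a coordinate derived from a vague description.
vague = Locality("off Block Island, ca. 20 fathoms", 41.1, -71.6, 15.0, "uncertainty")
# Made-up example: text only, no coordinate could be derived.
text_only = Locality("North Atlantic")
```

A record without a stated extent is deliberately not treated as a point, so a derived coordinate cannot pass as a precise position by default.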

Similar problems occur with date/time data. Collection or observation times may be known to the minute, hour, day, month or year, or to any interval based on those units. As with the spatial case, an interval can have two meanings -- one, the precision with which an actual point event is known, and two, the actual interval over which the collection (e.g., a plankton tow) was in operation. In some cases, a time interval can be safely inferred; for example, there is little doubt that the sample was collected before the date of published description or museum accession.
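The two meanings of an interval can be kept apart by storing a start, an end, and a label saying which meaning applies. The sketch below is an illustration under assumptions, not a proposed standard: the names `DateInterval`, `interval_from_partial`, and `bounded_by`, the "YYYY", "YYYY-MM", "YYYY-MM-DD" input form, and the use of `date.min` as a stand-in for an unknown lower bound are all choices made here for the example.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateInterval:
    start: date
    end: date
    kind: str  # "precision": a point event known only to this interval
               # "duration": the collection actually ran over this interval


def interval_from_partial(text: str) -> DateInterval:
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into the interval it implies."""
    parts = [int(p) for p in text.split("-")]
    year = parts[0]
    if len(parts) == 1:
        return DateInterval(date(year, 1, 1), date(year, 12, 31), "precision")
    month = parts[1]
    if len(parts) == 2:
        last_day = calendar.monthrange(year, month)[1]
        return DateInterval(date(year, month, 1), date(year, month, last_day), "precision")
    day = date(year, month, parts[2])
    return DateInterval(day, day, "precision")


def bounded_by(latest: date) -> DateInterval:
    """Inferred interval: collected no later than a publication or accession date."""
    # date.min stands in for "no known lower bound" (an assumption of this sketch).
    return DateInterval(date.min, latest, "precision")
```

A plankton tow would instead be entered directly with kind "duration", so that a later user can tell an extended sampling event from a poorly dated one.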

These problems are most common with older collections or descriptions, but they will never disappear completely. Organisms that are not of commercial significance, not easily observed and identified, or not the target of specific research investigations will continue to be found and identified as 'bycatch' with other samples, or by a dispersed network of individual investigators. In these and other cases, useful information will become available without the technologically based ancillary information expected -- even demanded -- of focused scientific collection efforts. Since the vast majority of the ocean's organisms probably fall into either this category or into the class of organisms collected by 'non-point' methods, and since at least some future observations will be based on remote sensing, a comprehensive system for documenting and retrieving a variety of types of spatio-temporal information is essential.

The information discussed is particularly critical to the biodiversity and biogeography aspects of OBIS/CoML. Historic observations provide the baselines against which present and future censuses can and must be compared to achieve any understanding of long-term population and distribution dynamics. We know that for many species and regions, human pressures had already substantially affected populations and habitats before the advent of precisely recorded time and location data. Our understanding of trends and forcing functions will depend critically on the information inherent in historical data. Our ability to use those data with the power of developing information technologies will in turn depend on our ability to formulate semantic and syntactic links that will allow computers to provide and portray data without censorship or misrepresentation.

Objectives (proposed for discussion)

The objective is to use community discussion and input to devise a family of data conventions and storage techniques that preserve and make usable valuable information whose spatial and temporal reference information carries various types of imprecision. The approaches used should meet the following criteria:

1. Preserve, in some accessible form, essentially all of the original information;

2. Abstract salient aspects of the information in forms amenable to automated analysis and data processing;

3. Link caveats and precision estimates to the derived data in such a way as to minimize inadvertent misinterpretation;

4. Be sufficiently simple and well-defined so that basic search and entry operations do not require expert intervention in most cases; and,

5. Be as consistent and compatible as possible across various taxa, applications, and types of information.
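Criterion 3 suggests that a search should return a qualified answer rather than a plain yes or no when a record's date is known only as an interval. The following sketch shows one possible form of such a test; the function name `match_window` and the three result labels are assumptions made for illustration.

```python
from datetime import date


def match_window(rec_start: date, rec_end: date,
                 win_start: date, win_end: date) -> str:
    """Compare a record's date interval with a query window (all bounds inclusive)."""
    if rec_end < win_start or rec_start > win_end:
        return "outside"    # no overlap at all
    if win_start <= rec_start and rec_end <= win_end:
        return "within"     # the whole interval lies inside the window
    return "possible"       # partial overlap: the caveat travels with the result


# A record known only to the year 1898, tested against a query for July 1898.
result = match_window(date(1898, 1, 1), date(1898, 12, 31),
                      date(1898, 7, 1), date(1898, 7, 31))
```

Here `result` is "possible": the record is neither dropped nor reported as a confirmed July observation. The same three-way outcome could be applied to spatial extents.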
