GPS Ground Truth Data Acquisition


Chapter 4 - Methodology



Before the preprocessing and classification of satellite imagery began, an extensive field survey was performed throughout Carroll County using Global Positioning System (GPS) equipment. GPS has developed into an efficient GIS data collection technology that allows users to compile their own data sets directly in the field as part of ‘ground truthing’ (Cunningham, 1998). Ground-truth surveys are essential to assessing the accuracy of classified satellite imagery (Congalton, 1996).

This survey was performed to obtain accurate locational point data for each land use and land cover class included in the classification scheme. The points served two purposes: creating training sites for signature generation, and creating an independent data set reserved for accuracy assessment. The land use and land cover categories of focus were pasture, deciduous forest, coniferous forest, barren land, and herbaceous land. Urban and perennial water classes were not included in the survey; alternative classification procedures were employed for these classes.

The herbaceous category was not included in the final classification scheme. During image classification, herbaceous areas proved difficult to differentiate from pasture because of similar reflectance values and seasonal vegetation characteristics. Anderson et al. (1976) characterized herbaceous land as dominated by naturally occurring grasses and forbs. This type of landscape is often used for livestock grazing, which may also help explain the misclassification.

Additional ground truth categories were collected on an experimental basis: orchards, cedar glades, and chicken houses. These were also discarded from the final land use and land cover classification because too few quality sites were collected and because of resolution issues. Previous research (Culpepper & Bayard, 1997) has also identified the difficulty of discerning cedar glades in Landsat TM imagery.

The field survey was carried out over a six-week period beginning in mid-March and ending at the end of April 2000. A total of nine field reconnaissance trips were made. Before each trip, mission preplanning was conducted to ensure successful data collection. Three specific factors were considered for data collection:

1: Geographic distribution - an attempt to obtain point data evenly throughout the study area

2: Proximity routes - for travel logistics purposes

3: Comprehensive classification - to ensure point data was collected for all land use classes employed in this study

The GPS equipment utilized in the field survey was a Trimble TDC1 data logger and rover receiver. This equipment was borrowed from The Center for Advanced Spatial Technologies (CAST) at the University of Arkansas, Fayetteville with permission and instruction from Michael Garner, former CAST GPS instructor.

The rover receiver allowed real-time differential correction of data as it was collected in the field, which greatly reduced the post-processing time for the rover files. The rover receiver collects data simultaneously with a base station receiver located at a known, recorded position, with both receivers tracking the same or similar sets of satellites. The base station calculates positional corrections and transmits them to the rover receiver in the field in real time (Gibbons, 1992).
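The idea behind differential correction can be illustrated with a deliberately simplified sketch. It works in the position domain: the base station compares its known position with its GPS-measured position and the difference is applied to the rover. This is an assumption made for illustration only; operational differential GPS generally applies corrections to individual satellite range measurements rather than to final positions, and nothing here reflects how the Trimble equipment is implemented. The function names and all coordinates are hypothetical.

```python
def position_correction(base_known, base_measured):
    """Correction vector (easting, northing) in meters.

    The base station sits at a known position, so the difference between
    the known and the GPS-measured position estimates the current error.
    """
    return tuple(k - m for k, m in zip(base_known, base_measured))


def apply_correction(rover_measured, correction):
    """Shift the rover's measured position by the base station correction."""
    return tuple(r + c for r, c in zip(rover_measured, correction))


# Hypothetical planar coordinates in meters (not survey data).
base_known = (100.0, 200.0)
base_measured = (101.5, 198.0)
rover_measured = (500.0, 600.0)

correction = position_correction(base_known, base_measured)
corrected_rover = apply_correction(rover_measured, correction)
print(correction, corrected_rover)
```

The sketch assumes that both receivers experience the same error at the same moment, which is the reason they need to track the same or similar sets of satellites.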

The base station used during the ground truth field surveys is located in Sallisaw, Oklahoma, and is part of the Continuously Operating Reference Station (CORS) Network, which is maintained by the National Geodetic Survey (NGS), an office of the National Oceanic and Atmospheric Administration (NOAA). These base stations form a network of continuously operating reference positions that provide GPS carrier phase and code range measurements in support of three-dimensional positioning activities throughout the United States and its territories (National Geodetic Survey, 2000).

Point data were collected in the field using a bearing and distance measure. The rover receiver was installed on the roof of the author’s 4-wheel drive vehicle and was connected to the Trimble TDC1 data logger. Once an appropriate field site was located, the vehicle was brought to a complete stop. A compass bearing was taken in degrees from North, and the distance to the center of the field site (e.g., a pasture) was estimated in meters; both were entered into the data logger with the appropriate attribute or land category name. Approximately 4 to 6 points were collected at each site. To minimize errors in distance estimation, no data points were collected more than 100 meters from the vehicle. It should be noted that this procedure limited the collection of ground truth to roadside observation. However, over 870 miles of U.S., state, and county roads in Carroll County were traversed in order to collect these data (see Photograph 4.1).
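The bearing and distance measure amounts to offsetting the vehicle position by a vector. The sketch below shows the arithmetic under several assumptions that are not stated in the text: coordinates are planar and in meters (for example UTM), the bearing is measured clockwise from grid north, and no magnetic declination correction is applied. The function name and the sample coordinates are hypothetical; the 100-meter limit is the one described above.

```python
import math

MAX_DISTANCE_M = 100.0  # survey rule: no points farther than 100 m from the vehicle


def offset_point(easting, northing, bearing_deg, distance_m):
    """Return (easting, northing) of a point at a bearing and distance.

    bearing_deg is measured clockwise from north; distance_m is in meters.
    """
    if not 0.0 <= distance_m <= MAX_DISTANCE_M:
        raise ValueError("distance must be between 0 and 100 meters")
    theta = math.radians(bearing_deg)
    # For a clockwise-from-north bearing, east uses sine and north uses cosine.
    return (easting + distance_m * math.sin(theta),
            northing + distance_m * math.cos(theta))


# Hypothetical vehicle position; the target lies 50 m due east.
site_e, site_n = offset_point(450000.0, 4020000.0, 90.0, 50.0)
print(round(site_e, 2), round(site_n, 2))
```

A compass gives a magnetic bearing, so a real workflow would also need to account for local declination before applying an offset of this kind.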

Post-processing was completed using Pathfinder Office software, Version 2.51, produced by Trimble Navigation Corp. Pathfinder Office is used for the management and processing of field data collected with Trimble GPS receivers for mapping and GIS applications. Point data accuracies were consistently sub-meter. On occasion, a few points in each rover file were accurate only to +/- 2 meters. However, given the 30-meter spatial resolution of the Landsat TM data, these accuracies were more than adequate. Every effort was made during field data collection to collect points in homogeneous areas greater than 30 square meters to coincide with the spatial resolution of the TM imagery. To account for temporal differences between the Landsat overpass (October 1999) and the ground truth survey (March-April 2000), point data were collected only for areas that were recognizably constant or unchanged.

Photograph 4.1 Ground Truthing Barren Areas around Beaver Lake

Photo by Alice Bottomley, 4/09/2000

A total of 424 data points were obtained from the field survey, of which 344 were used as ground truth for accuracy assessment: 251 pasture, 54 deciduous forest, 34 coniferous forest, and 5 barren land. The spatial distribution of these 344 ground truth points is portrayed in Map Foldout 4.1. Significantly fewer deciduous, coniferous, and barren data points were collected because of the difficulty of locating homogeneous field sites of at least 30 square meters. The barren field sites were restricted to dry lake beds, river banks, and large gravel bars. Perennial water pixels used for ground truth and accuracy assessment were not obtained with GPS equipment, but were simply digitized on the Landsat imagery from a priori sites such as Beaver and Table Rock Lakes, Lake Leatherwood, and the White and Kings Rivers. A total of 69 pixels were digitized in order to capture a variety of water conditions, both clear and turbid.

Map Foldout 4.1

To assess the percentage of the study area sampled for ground truth, the number of ground truth pixels collected in the field for each land use and land cover class in the final 1999 land use and land cover map was divided by the total number of pixels in the corresponding classified land cover class. These results are presented in Table 4.2.

Table 4.2  Percent of Classified Image Sampled for Ground Truth Data by Land Class

* Does not include Urban Classes since no ground truth data was collected for this land class.
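The percentage calculation behind Table 4.2 is a simple ratio per land class. The sketch below shows that arithmetic; the function name and both pixel counts are hypothetical placeholders chosen for illustration and are not values from the table.

```python
def percent_sampled(ground_truth_pixels, classified_pixels):
    """Percent of a classified land class covered by ground truth pixels."""
    if classified_pixels <= 0:
        raise ValueError("classified_pixels must be positive")
    return 100.0 * ground_truth_pixels / classified_pixels


# Hypothetical counts for one land class (not taken from Table 4.2).
example = percent_sampled(ground_truth_pixels=250, classified_pixels=100_000)
print(f"{example:.2f}% of the class was sampled")
```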

Because of the large number of pixels in the classified image (almost 2 million), it was not practical to collect ground truth data for a substantial percentage of the study area; only one-half of one percent of the study area was sampled. For such cases, Congalton (1996) states that collecting a minimum of 50 samples for each land use category for use in an error matrix is a good rule of thumb. An attempt was made during the field surveys to collect as many ‘good’ samples as possible, yet reaching this minimum was still not practical for all land use classes. However, given the relative importance of each land cover class in this study, the barren land class is not a significant factor in the overall objective of detecting land cover conversion of forests to pastures.

Once post-processing was completed, the rover files (point data) were reprojected to UTM Zone 15, NAD27 and exported as ArcView shapefiles. They were sorted in ArcView by land class theme. Each ground truth category was then imported as vectors into PCI’s ImageWorks module for the creation of training sites.
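The 50-sample rule of thumb attributed to Congalton (1996) can be checked against the sample counts reported in this section. The sketch below uses the counts given in the text (GPS points for four classes and digitized pixels for perennial water); the function and variable names are hypothetical, and treating the rule as a strict threshold is an assumption made for illustration.

```python
MIN_SAMPLES = 50  # rule-of-thumb minimum per land use category

# Sample counts as reported in the text.
sample_counts = {
    "pasture": 251,
    "deciduous forest": 54,
    "coniferous forest": 34,
    "barren land": 5,
    "perennial water": 69,
}


def classes_below_minimum(counts, minimum=MIN_SAMPLES):
    """Return the sorted names of classes with fewer samples than the minimum."""
    return sorted(name for name, n in counts.items() if n < minimum)


print(classes_below_minimum(sample_counts))
```

Under that reading, the coniferous forest and barren land classes fall short of the minimum, which is consistent with the statement that the target was not practical for all land use classes.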