

Chapter 4 - Methodology


Image Classification

Within the scope of this project, image classification is defined as the extraction of differentiated classes or themes, in this case land use and land cover categories, from raw remotely sensed digital satellite data (Gorham, 1999). The process used to develop the 1999 land use and land cover map for Carroll County is presented in Figure 4.3.

Figure 4.3

Classification Scheme

For the purposes of this project, the terms land use and land cover have been combined as a single entity for describing the landscape within the study area. While land use and land cover are recognized as separate entities (Meyer, 1995), they were combined here in order to conform with the level of detail employed. Finer levels of inquiry would most likely require separate measures of land use and land cover, or more detailed levels of the classification scheme.

A multilevel, hierarchical land use classification was derived from the author’s a priori knowledge of the study area and is roughly based upon an Anderson level II classification (Anderson et al, 1976). Five level I categories and eight level II categories compose the hierarchical land use and land cover classification employed in this project (Table 4.3). Level I and level II categories were purposely kept broad in order to minimize confusion between land cover classes that experienced change over the time period considered in this study.

Table 4.3  Land Use & Land Cover Classification Categories

The urban level I category was grouped into 3 level II categories based upon the results of an independent unsupervised classification and subsequent aggregation by visual interpretation of the raw TM images. Urban intensity was determined from the brightness of each pixel cluster in urban areas. Changes in urban areas with regard to urban expansion, although a topic of much research, were not the focus of this project.

All land cover classes relative to herbaceous, grassland, rangeland, forage land, agricultural land, and pasture were merged into one level II category, grass and pasture lands. Carroll County experiences very limited agricultural production with respect to crop growth. Small amounts of winter wheat are grown in Carroll County, most of which is used for local cattle feed. In fact, so little recorded winter wheat is grown that Carroll County is grouped in the "All other combined counties" category in district one for the state of Arkansas (Arkansas Agricultural Statistics Service, 1998). Hay is the other main forage crop, grown during the warm or summer season; it too is used predominantly for cattle feed. Photograph 4.2 portrays a typical pasture on a bright spring day in Carroll County.

Photograph 4.2  Typical Pasture in south central Carroll County

Photograph by Author, 4/18/2000

Level I forested lands were simply divided into two level II categories, deciduous and coniferous. This was based upon the GPS ground truth data collected in the field. For the scope of this study, level III forest categories, species level, were not deemed necessary for the determination of forest to pasture conversion. Photograph 4.3 depicts a deciduous forest in the White River Valley of northwest Carroll County while Photograph 4.4 depicts a roadside stand of coniferous cedar trees along Highway 187.

Photograph 4.3  Deciduous Forest in the White River Valley

Photograph 4.4  Roadside Coniferous, Cedar Stand, Hwy 187 East

Both photographs by Author, 4/28/2000

Only one level II water category, perennial water, was included in this study due to the temporal characteristics of the imagery employed. Both images utilized in the change detection process were obtained in late summer, which is typically characterized by dry conditions. Photograph 4.5 shows Beaver Dam and the largest perennial water source in Carroll County, Beaver Lake Reservoir.

Photograph 4.5  Beaver Dam and Beaver Lake Reservoir

Photograph by Author, 4/29/2000

Landscape categories usually associated with barren conditions, such as rock outcrops, bare earth, bare soil, gravel bars, and dry lake beds, were all considered as a single barren land level II category. This was due in part to the difficulty of differentiating between such areas, which exhibit similarly high reflectance values in multi-spectral imagery, and in part to the low number of ground truth points collected for this land class. Photograph 4.1, page 53, depicts the dry lake beds around Beaver Lake, utilized as field sites for ground truth data.

Modification of Classification Technique

Both supervised (maximum likelihood) and unsupervised (ISOCLUS) remote sensing classification methodologies were utilized in a multi-step approach for this project. The predetermined classification methodology had to be adjusted somewhat, as initial classification results were not deemed satisfactory. An initial single-scene, supervised maximum likelihood classification of the October 6th, 1999 image was first attempted but did not meet anticipated project accuracy requirements. This was most likely the result of two factors: difficulty in differentiating vegetation classes as a result of similar seasonal and phenological vegetation characteristics, and poorly developed signatures for each land cover class. Both of these problems were assessed and amended through alternative approaches, by trial and error, until an acceptable 1999 land use and land cover classification image was created.

The first problem was resolved by employing a combination of three multitemporal 1999 TM images in the supervised classification. The implementation of multi-temporal or multi-seasonal imagery has shown potential for higher forest classification precision than single-date classifications (Schriever & Congalton, 1993). Another benefit of a multi-temporal approach stems from the fact that single-date analyses rarely permit accurate classification of all cover types of interest in an agricultural setting over the course of a growing season (Lo, et al, 1986). This was the case here: some seasonal pastures were misclassified as barren land when fallow.

Training Site Development & Signature Generation

The second problem, concerning quality signature generation, was also addressed. Difficulty in generating ‘good’ signatures was a reflection of the heterogeneity within the study area, and thus of the internal spectral variety within the training sites. The rural landscape is a mosaic of natural and human-managed patches that vary in size, shape, and arrangement (Turner, 1990). In turn, this variety in the landscape is reflected by an array of different spectral characteristics. If training sites are characterized by inherent spectral variation, classification performance or accuracy will decrease as a result of their large spectral variability (Arai, 1992).

Effective classification of remote sensing image data depends upon separating land cover types of interest into sets of spectral classes (signatures) that represent the data in a form suited to the particular classifier algorithm used (Richards & Kelly, 1984). Supervised classification processes involve the initial selection of areas (training sets) on the image which represent specific land classes to be mapped (Robinove, 1981). Training sites are sets of pixels that represent what is recognized as a discernable pattern, or potential land cover class (ERDAS, 1999). The delineation of training sites representative of land cover types is most effective when an image analyst has knowledge of the geography of a region and experience with the spectral properties of the cover class (Skidmore, 1989).

Initial training sites for signature generation were developed from points obtained from the GPS ground truth data. These points were converted from vectors into bitmap masks categorized by land use class. Statistically, however, groups of single pixels do not make good training sites: the more pixels that can be used in training (within reason), the better the statistical representation of each spectral class (Lillesand & Kiefer, 1987). In an attempt to rectify this problem, each ground truth point was treated as a seed pixel and, by means of a proximity analysis, was grown out 1 pixel on all sides to create new training sites, each composed of 9 pixels. However, this was not a reliable solution. The proximity analyses introduced unknown amounts of variability in pixel values within each training class and increased the potential for confusion between land cover categories. The signatures generated from these training sites were not characterized by sufficient separation, so an alternative training site creation method was developed. Sufficient signature separation implies that the signature generated from each training site has a high probability of being correctly classified (Lillesand & Kiefer, 1987).
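As an illustration of the seed-growing step, the sketch below grows each ground truth point into a 9-pixel (3 x 3) training-site mask. This is a hypothetical numpy rendering of the idea; the project itself used a proximity analysis in the image processing software, not this code.

```python
import numpy as np

def grow_seed_sites(seed_rows, seed_cols, shape):
    """Grow each ground-truth seed pixel out by 1 pixel on all sides,
    yielding a 3x3 (9-pixel) training site per point, clipped at the
    image edges."""
    mask = np.zeros(shape, dtype=bool)
    for r, c in zip(seed_rows, seed_cols):
        r0, r1 = max(r - 1, 0), min(r + 2, shape[0])
        c0, c1 = max(c - 1, 0), min(c + 2, shape[1])
        mask[r0:r1, c0:c1] = True
    return mask
```

Because the neighboring pixels may belong to a different cover type than the seed, sites grown this way can absorb exactly the kind of unwanted spectral variability described above.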

Training sites were again developed from the ground truth data obtained by Global Positioning System (GPS) field collection, this time combined with a modified decision rule based upon multi-seasonal vegetation phenology, visual interpretation of the multitemporal imagery, and a distance measure from each ground truth data point. New training sites were generated by on-screen digitizing of selected areas for each land cover class over images loaded with different band combinations; each new site was required to contain at least one pixel no more than 2 pixels away from the original ground truth point. The original ground truth pixels themselves were purposely excluded from the training sites in order to maintain an independent data set reserved for accuracy assessment.

For example, coniferous forests training sites were digitized over the December 25th, 1999 TM image loaded with Bands 4, 3, 2 (RGB) to produce a false-color composite as well as the Band combination 3, 2, 1 (RGB) to produce a true-color composite. Coniferous vegetation appeared red to bright red in the false-color composite image (Figure 4.4) and green to dark green in the true-color composite (Figure 4.5). By viewing these different band combinations simultaneously, side by side in full resolution windows, coniferous vegetation patterns could be visually interpreted with relative ease.

Figure 4.4

Figure 4.5

This specific training site development method was also used for cool season pastures. The cool season pastures appeared pink to light red in the December false-color composite image and light green in the true-color composite, and were visually discernable in many instances by their mostly rectangular shape. Although ground truth sites were collected only for level I pastures, both warm and cool season pasture training sites were developed in order to accommodate the seasonal rotation of pasture use in Carroll County. The warm season training sites were digitized over both the May 5th true-color composite, Bands 3, 2, 1 (RGB), and the December 25th false-color composite, Bands 4, 3, 2 (RGB) (Figure 4.6).

Figure 4.6

Warm season pastures appeared bright green in the May 5th true color composite and light blue in the December false color composite. Both warm and cool season pasture classes were classified in the supervised maximum likelihood classification to generate the most accurate pasture land class. However, they were merged back into one broad pasture land class (to coincide with the GPS ground truth data).

Deciduous forest training sites were also developed in a similar fashion except for the inclusion of the May 5th, 1999 TM image loaded with Bands 3, 2, 1 (RGB), true-color composite, at full resolution. Forested areas that appeared green in the May true-color composite and tan to light-brown in the December true-color composite and in the December false-color composite were digitized as deciduous training sites.

No water ground truth sites were collected during the GPS field surveys. Water training sites were simply digitized on screen over the December 25th TM scene (false-color composite, Bands 4, 3, 2 RGB) in a priori known areas of perennial water. Training sites were digitized over portions of Beaver Lake, Table Rock Lake, Lake Leatherwood, and the White and Kings Rivers in an attempt to capture a variety of water conditions, both clear and turbid. Areas of clear water appear black in most TM scenes as a result of the absorptive properties of water, while turbid water can appear more bluish to cyan as a result of reflection off suspended sediment particles.

Unfortunately, the collection of barren land ground truth sites was limited to 5 points as a result of difficulty in locating large homogeneous (natural) barren areas. With that in mind, barren training sites were also digitized over the December 25th scene, false color composite, Bands 4, 3, 2 (RGB), in order to minimize areas of dead or dying vegetation that might appear similarly bright in the image. Barren areas often appear bright or white in most TM scenes as a result of the reflective properties of bare soil/earth and exposed rock. All training sites were digitized as image graphics (bitmap masks). These bitmaps were utilized in the generation of signatures required for supervised maximum likelihood classification.

A signature is a set of data that statistically defines the training site for each land category of interest. The spectral signature data is derived from an image on selected database channels (TM Bands) as sampled under a selected window or bitmap (mask). Signature data is principally used to define partitions (land classes) in the feature space (TM Image), which will subsequently be used to classify the data. The classification decision rule (classification algorithm) requires the signature definitions as input (ERDAS, 1999).

Each parametric signature is defined by statistical parameters of the pixels included in its training site. A parametric signature is characterized by statistical attributes such as the number of bands in the input image, and, for each training site, the minimum and maximum data value in each band, the mean data value in each band, the covariance matrix, and the number of pixels (ERDAS, 1999).
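As a sketch, the statistics defining such a parametric signature can be assembled as follows; the function name and dictionary layout are illustrative only, not PCI's or ERDAS's actual interface.

```python
import numpy as np

def parametric_signature(image, site_mask):
    """Statistics defining a parametric signature: per-band min, max, and
    mean, the band covariance matrix, and the pixel count. `image` has
    shape (bands, rows, cols); `site_mask` is a boolean training-site mask."""
    pixels = image[:, site_mask]            # shape (bands, n_pixels)
    return {
        "n_pixels": pixels.shape[1],
        "min": pixels.min(axis=1),
        "max": pixels.max(axis=1),
        "mean": pixels.mean(axis=1),
        "cov": np.cov(pixels),              # (bands, bands) covariance matrix
    }
```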

Before the signatures were generated, each training site was evaluated graphically to determine its spectral response pattern. By visually evaluating training site graphics, an analyst is able to determine whether the signature data is a true representation of the pixels to be classified for each land category. Two-dimensional scatterplots, in the form of feature class ellipses, were calculated from the means and standard deviations of the pixel values in each training site for two different TM band combinations at a time. The scatterplots offer a visual analysis of the correlations between spectral bands, to determine which combination of TM bands best captures the desired features in the image.

When feature class ellipses in the scatterplot show extensive overlap, the spectral characteristics of the pixels represented by the training site signatures cannot be distinguished in the two bands that are graphed. In the best case there is no overlap, although some overlap is expected. After extensive viewing of each feature class scatterplot with multiple band combinations, the two-band combination depicting the least overlap between feature classes was Bands 3 and 4 of the December 25th, 1999 image. Feature classes were refined by drawing an ellipse, based upon two standard deviations from the mean of each feature class, in each scatterplot. In order to minimize overlap between feature class ellipses, outliers (noise) outside the two standard deviation ellipses were removed (Figure 4.7).

Figure 4.7
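The two-standard-deviation refinement can be sketched as below. This simplified version uses an axis-aligned ellipse in the two graphed bands; the actual feature-space ellipses reflect correlation between bands, so treat this only as an illustration of the outlier-removal idea.

```python
import numpy as np

def remove_ellipse_outliers(pixels, n_std=2.0):
    """Keep only training-site pixels inside an axis-aligned ellipse of
    n_std standard deviations around the band means. `pixels` has shape
    (n, 2), one column per graphed band."""
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    z = (pixels - mean) / std               # standardized deviations
    inside = (z ** 2).sum(axis=1) <= n_std ** 2
    return pixels[inside]
```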

All refined feature class ellipses were then displayed together to visually determine final overlap between feature classes. There was some minor overlap between deciduous and coniferous feature classes as well as between barren and warm season pastures. However, this was determined to be insignificant once signature separation was calculated. Otherwise, virtually all overlap between classes was removed while still maintaining a high percentage of the original training site values. Figure 4.8 displays all six feature class ellipses together in feature space.

Figure 4.8

Multitemporal signatures were generated from the image data of all three 1999 Landsat TM images. Bands 2, 3, 4, 5, and 7 of each image were incorporated as the database input channels, for a total of 15 channels; TM Bands 1 and 6 were excluded. TM Band 6 (thermal infrared, 10.40-12.50 um), which consists almost entirely of emitted radiation, was not considered in this study. The signature generator in PCI allows a maximum of 15 database input channels, so one additional TM band from each image had to be excluded to conform to this limit. Of the remaining six TM bands, Band 1 (visible blue, 0.45-0.52 um) was determined to be the least useful for signature generation and was excluded, leaving data from each of the visible, near-infrared, and mid-infrared portions of the spectrum available for classification.


Signature separability was determined to be good for all land categories. Separability was calculated employing the Bhattacharyya distance measure, which calculates distance based upon class means and covariance matrices (Richards, 1986). Separation between signatures is characterized as good when 1.9 < x <= 2.0, poor when 1.0 < x <= 1.9, and very poor when 0.0 <= x <= 1.0. The minimum calculated separability between the signatures generated in this project was 1.97966 (between warm and cool season pastures). Average calculated separability was 1.99810, and the maximum was 2.0 (Table 4.4).

Table 4.4: Signature Separability
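A sketch of the separability computation is given below. The Bhattacharyya distance itself is unbounded, so this sketch assumes the 0-2 scale reported above comes from the standard Jeffries-Matusita rescaling, 2(1 - e^-B); the exact transform used by the software is an assumption here.

```python
import numpy as np

def separability(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two class signatures (class means and
    covariance matrices), rescaled to the 0-2 range via the Jeffries-Matusita
    transform. Values near 2.0 indicate well-separated signatures."""
    mean1, mean2 = np.asarray(mean1, float), np.asarray(mean2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov_avg = (cov1 + cov2) / 2.0
    diff = mean1 - mean2
    # Bhattacharyya distance: mean-difference term plus covariance term
    b = 0.125 * diff @ np.linalg.inv(cov_avg) @ diff \
        + 0.5 * np.log(np.linalg.det(cov_avg)
                       / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return 2.0 * (1.0 - np.exp(-b))
```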

1999 Multitemporal Image Classification

Once ‘good’ separation of the training site signatures was confirmed, a supervised (full Gaussian) maximum likelihood classification was performed again, implementing the three 1999 multitemporal TM scenes (May, October, and December). This classification incorporated the pixel values (digital numbers) from TM Bands 2, 3, 4, 5, and 7 of each scene, based upon the signatures generated for each land cover category.

The full maximum likelihood classifier uses the Gaussian threshold stored in each class signature to determine if a given pixel falls within the class or not. The threshold is the radius (in standard deviation units) of a hyperellipse surrounding the mean of the class in feature space. If the pixel falls inside the hyperellipse, it is assigned to the class, otherwise it is assigned to a null class. However, for this classification, the null class option was not utilized, thus the thresholds were ignored and every pixel was assigned to the most probable class, the nearest class based upon the Mahalanobis distance measure (Richards, 1986).
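A minimal sketch of this decision rule, with the null-class threshold disabled as in this project (every pixel is assigned its most probable class), might look like the following. Names are illustrative; this is not the software's implementation.

```python
import numpy as np

def max_likelihood_classify(pixels, signatures):
    """Gaussian maximum likelihood classification with no null class.
    `pixels` has shape (n, bands); `signatures` is a list of (mean, cov)
    pairs, one per land cover class. Returns the index of the most
    probable class for each pixel."""
    scores = []
    for mean, cov in signatures:
        mean, cov = np.asarray(mean, float), np.asarray(cov, float)
        diff = pixels - mean
        # Squared Mahalanobis distance of each pixel from the class mean
        mahal2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        # Gaussian log-likelihood up to a constant
        scores.append(-np.log(np.linalg.det(cov)) - mahal2)
    return np.argmax(np.stack(scores), axis=0)
```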

The maximum likelihood classifier is considered to give very accurate results, yet is a much slower process due to the large number of calculations involved. Its accuracy, however, is largely dependent upon the quality of the signatures. The resulting multitemporal 1999 land use and land cover image comprised the pasture, deciduous, coniferous, perennial water, and barren land level II land cover classes. Each land class category was assigned a numeric code (Table 4.3, p. 63) in a manner similar to Anderson et al (1976). A maximum likelihood report was generated to determine the outcome of the classification.

Visual inspection of the resulting classified image was promising, with two exceptions: there were large areas of misclassified barren land, and urban areas were absent. Both problems were largely the result of urban areas being misclassified as barren land, because no ground truth was collected, training sites created, or signatures generated for urban areas. Urban areas and barren land have similarly bright reflectance values as captured in TM imagery.

For the scope of this project, urban areas were purposely not included in ground truth data collection and thus not incorporated in the Maximum Likelihood Classification. An alternative classification approach roughly based upon Gorham’s (1999) urban classification was employed specifically for the classification of urban areas with the assistance of ancillary vector data.

The level I urban category was extracted by identifying all areas within the city boundaries as defined by the Arkansas State Highway and Transportation Department (AHTD) data for Carroll County. These urban areas included the cities of Alpena, Beaver, Berryville, Blue Eye, Eureka Springs, Green Forest, Holiday Island (unincorporated), and Oak Grove, and were treated as potential urban areas. An ISOCLUS unsupervised classification was performed under a potential urban area binary mask on all three 1999 TM images (Bands 2, 3, 4, 5, and 7), grouping the potential urban pixels into 174 cluster classes based upon their digital numbers (DNs). The ISOCLUS program is based on the ISODATA method. In principle, it represents a comprehensive set of heuristic procedures incorporated into an interactive scheme, in which pixel cluster centers are iteratively determined sample means (Tou, et al, 1974).
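The iterative-means core of the ISODATA/ISOCLUS idea can be sketched as plain k-means-style clustering; the full ISOCLUS heuristics (cluster splitting, merging, and discarding) are omitted here, and the deterministic initialization is purely for illustration.

```python
import numpy as np

def isodata_core(pixels, n_clusters, n_iter=20):
    """Iteratively recompute cluster centres as the sample means of the
    pixels assigned to them (the core of ISODATA, without the split/merge
    heuristics). `pixels` has shape (n, bands)."""
    # Deterministic initialization: evenly spaced pixels as starting centres
    idx = np.linspace(0, len(pixels) - 1, n_clusters).astype(int)
    centres = pixels[idx].astype(float)
    for _ in range(n_iter):
        # Assign each pixel to its nearest centre
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centre as the mean of its assigned pixels
        for k in range(n_clusters):
            if np.any(labels == k):
                centres[k] = pixels[labels == k].mean(axis=0)
    return labels, centres
```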

The resulting clusters were aggregated, based on visual image interpretation of all three TM scenes with the band combination 5, 4, 3 (RGB), into 4 categories: low-intensity urban, moderate-intensity urban, high-intensity urban, and non-urban. Non-urban areas were subtracted from the potential urban mask, by employing an image modeling logical NOT operation, to create a new binary urban mask comprised of the 3 level II urban areas. All urban areas within the city boundaries were filtered using a 3 by 3 pixel averaging filter to remove any unwanted noise. By means of an image modeling script, all newly classified urban areas were then encoded into the 1999 land use and land cover image under their new numeric codes.
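The mask-subtraction step amounts to a one-line boolean operation; the array names in this sketch are hypothetical.

```python
import numpy as np

def refine_urban_mask(potential_urban, non_urban):
    """Subtract the non-urban clusters from the potential urban mask via a
    logical NOT, leaving only the pixels assigned to the urban intensity
    classes. Both inputs are boolean arrays of the same shape."""
    return potential_urban & ~non_urban
```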

This urban classification procedure dramatically reduced many of the misclassified barren land clusters in the 1999 land use and land cover image and produced well-represented urban areas. However, a noticeable number of barren areas, as classified by the maximum likelihood classifier, remained. An initial attempt to rectify this problem employed a mode filter with a kernel size of 7 by 7 pixels. A mode filter computes the mode (the most frequently occurring value) of the pixel values under the kernel and assigns that value to the pixel being filtered. All land class values were preserved with the exception of the barren land class.
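A standard mode filter over such a kernel can be sketched as below. This simplified version replaces every pixel with its window mode and does not implement the barren-only exception described above; it is an illustration, not the software's filter.

```python
import numpy as np

def mode_filter(classes, size=7):
    """Replace each pixel with the most frequently occurring class value in
    the size x size window centred on it. Edges are handled by replicating
    the border pixels."""
    pad = size // 2
    padded = np.pad(classes, pad, mode="edge")
    out = np.empty_like(classes)
    rows, cols = classes.shape
    for r in range(rows):
        for c in range(cols):
            window = padded[r:r + size, c:c + size]
            values, counts = np.unique(window, return_counts=True)
            out[r, c] = values[np.argmax(counts)]
    return out
```

On a classified image, this removes isolated single-pixel speckle while leaving homogeneous regions untouched.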

Filtering moderately decreased the number of barren areas, yet overly visible regions of barren land remained. The author’s a priori familiarity with the study area prompted concern over this over-classification. In order to correct the over-represented barren areas, a final unsupervised ISOCLUS classification was computed under a bitmap mask of all remaining barren areas. This classification grouped all barren areas into 99 cluster classes based upon their pixel values. These classes were aggregated into their correct land categories (urban, pasture, deciduous, water, or barren) based upon visual interpretation of all three 1999 TM images, each loaded at full resolution with the band combinations that best displayed each land class, similar to the training site creation process.

During the aggregation process, one small cloud and its accompanying shadow were detected in the southern portion of the county. Upon further investigation, the cloud and shadow were traced to the May 7, 1999 TM image. The maximum likelihood classifier had classified them as barren as a result of their bright pixel values. These pixels were simply aggregated to their proper category, deciduous forest, with the visual aid of the October 6 and December 25, 1999 multitemporal images.

The inclusion of perennial water training sites in the maximum likelihood classification produced well-represented water bodies. However, due to the spatial resolution of the TM imagery, overhanging tree canopy, and the natural width of many perennial rivers and streams in Carroll County, most of the major drainage networks were not resolvable during classification. In order to include these significant physiographic features in the 1999 land use and land cover map, additional post-classification procedures were implemented. Primary focus was directed on the White and Kings Rivers, Osage, Yocum, Long, and Dry Creeks.

Water pixels have relatively low reflectance values (digital numbers) in TM Band 7. Thus, major rivers and streams were extracted using a simple grey-level threshold derived from the May 7, 1999 TM image. This scene was chosen to coincide with higher river levels and the conditions associated with the annual spring peak in rainfall. Digital number (DN) values of 1-55 were arbitrarily thresholded out of the May TM scene to create a potential water bitmap. This range of values introduced some additional noise, such as hill shadows. Unwanted pixels were subjectively removed with the visual aid of ancillary river and stream vector data overlain on the bitmap. The edited perennial rivers and streams bitmap was then encoded into the perennial water category by applying a logical AND operation by means of an image modeling script.
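The thresholding and encoding steps can be sketched as follows. The DN range follows the text; the variable names, class code, and function name are hypothetical, and the manual removal of hill-shadow noise is of course not reproducible in code.

```python
import numpy as np

def encode_threshold_water(band, classified, water_code, lo=1, hi=55):
    """Build a potential-water bitmap by thresholding DN values in lo..hi,
    then encode those pixels into the perennial water class of an existing
    classified image."""
    water_mask = (band >= lo) & (band <= hi)
    out = classified.copy()
    out[water_mask] = water_code   # encode water pixels over prior classes
    return out
```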

Following the multiple image classification techniques employed to create the 1999 land use and land cover map of Carroll County, Arkansas, a final accuracy assessment was performed on the image. The accuracy results are discussed further in Chapter 5, Results and Analysis.