Supervised Classification
Supervised classification is a technique for the computer-assisted interpretation of remotely sensed imagery. The operator trains the computer to look for surface features with reflectance characteristics similar to a set of examples of known interpretation within the image. The Supervised Classification Tool, wxIClass, is a GUI application that generates spectral signatures for an image by letting the user outline regions of interest.
The resulting signature file can be used as input for i. The image is classified on the basis of predefined land use/land cover classes and an algorithm chosen by the analyst.
Both the center line and the boundary line of color classes can be vectorized automatically using R2V's vectorization function. The clusters are usually identified or labeled as some useful type of material. Dragon can measure length and area on any georeferenced image. However, this assumes the image uses a distance-preserving projection.
Supervised Classification
Here we explore supervised classification.
Various supervised classification algorithms exist, and the choice of algorithm can affect the results. In supervised classification, we have prior knowledge about some of the land-cover types through, for example, fieldwork, reference spatial data, or interpretation of high-resolution imagery such as that available on Google Maps.
Specific sites in the study area that represent homogeneous examples of these known land-cover types are identified. These areas are commonly referred to as training sites because the spectral properties of these sites are used to train the classification algorithm. NLCD is a Landsat-based land cover database spanning 4 epochs, and it is based primarily on a decision-tree classification of Landsat data.
It has two pairs of class values and names that correspond to the levels of the land use and land cover classification system. These levels usually represent the level of complexity, level I being the simplest with broad land use and land cover categories. Read the report by Anderson et al. to learn more about this land use and land cover classification system.
We did a lot of things here. Take a step back and read more about ratify. Alternatively, you can use predefined sites that you may have collected from other sources. We will generate the sample sites using stratified random sampling to ensure samples from each LULC class. You can see there are two variables in samp. The cell column contains the cell numbers of nlcd that were sampled.
We will drop the cell column later. Here nlcd has integer values. You will often find class names provided as string labels instead. There are several approaches that could be used to convert these classes to integer codes. We can make a function that will reclassify the character strings representing land cover classes into integers based on the existing factor levels.
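The idea of turning string class labels into integer codes can be sketched as follows. This is only an illustration in Python, not the tutorial's own function: the name `encode_labels` and the sample labels are assumptions, and the codes are assigned in sorted order starting at 1, which mimics how factor levels are commonly ordered.

```python
def encode_labels(labels):
    """Map each distinct class name to an integer code (sorted order, starting at 1)."""
    classes = sorted(set(labels))
    code_of = {name: code for code, name in enumerate(classes, start=1)}
    return [code_of[name] for name in labels], code_of


# Hypothetical land cover labels for five sample sites.
labels = ["water", "forest", "cropland", "forest", "water"]
codes, code_of = encode_labels(labels)
print(codes)    # integer code for each sample
print(code_of)  # lookup table from class name to code
```

Keeping the lookup table alongside the codes makes it possible to translate the classified map back into readable class names later.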
Please install the package if it is not available on your machine. Once we have the sites, we can extract the cell values from the landsat5 RasterStack.

The user specifies the various pixel values or spectral signatures that should be associated with each class.
This is done by selecting representative sample sites of a known cover type called Training Sites or Areas. The computer algorithm then uses the spectral signatures from these training areas to classify the whole image. Ideally, the classes should not overlap or should only minimally overlap with other classes. In ENVI there are four different classification algorithms you can choose from in the supervised classification procedure.
They are as follows:
Maximum Likelihood: Assumes that the statistics for each class in each band are normally distributed and calculates the probability that a given pixel belongs to a specific class. Each pixel is assigned to the class that has the highest probability (that is, the maximum likelihood). This is the default.
Minimum Distance: Uses the mean vectors for each class and calculates the Euclidean distance from each unknown pixel to the mean vector for each class. Each pixel is assigned to the nearest class.
Mahalanobis Distance: A direction-sensitive distance classifier that uses statistics for each class. It is similar to maximum likelihood classification, but it assumes all class covariances are equal, and therefore is a faster method. All pixels are classified to the closest training data.
Spectral Angle Mapper (SAM): A physically based spectral classification that uses an n-dimensional angle to match pixels to training data. This method determines the spectral similarity between two spectra by calculating the angle between the spectra and treating them as vectors in a space with dimensionality equal to the number of bands. This technique, when used on calibrated reflectance data, is relatively insensitive to illumination and albedo effects.
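To make the difference between a distance-based and an angle-based rule concrete, here is a minimal NumPy sketch of the Minimum Distance and Spectral Angle Mapper ideas. It is not ENVI's implementation; the function names, the two class means, and the three-band pixel values are all made up for illustration.

```python
import numpy as np


def minimum_distance(pixels, class_means):
    """Assign each pixel to the class whose mean vector is nearest (Euclidean)."""
    # pixels: (n_pixels, n_bands); class_means: (n_classes, n_bands)
    diffs = pixels[:, None, :] - class_means[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    return np.argmin(distances, axis=1)


def spectral_angle(pixels, class_means):
    """Assign each pixel to the class whose mean spectrum makes the smallest angle with it."""
    dots = pixels @ class_means.T
    norms = (np.linalg.norm(pixels, axis=1)[:, None]
             * np.linalg.norm(class_means, axis=1)[None, :])
    cosines = np.clip(dots / norms, -1.0, 1.0)  # guard against rounding error
    return np.argmin(np.arccos(cosines), axis=1)


# Hypothetical class means in three bands: class 0 "water", class 1 "vegetation".
class_means = np.array([[0.05, 0.03, 0.02],
                        [0.04, 0.08, 0.40]])
pixels = np.array([[0.06, 0.04, 0.03],   # close to water
                   [0.05, 0.09, 0.35],   # close to vegetation
                   [0.02, 0.04, 0.20]])  # vegetation spectrum at half brightness

md_labels = minimum_distance(pixels, class_means)
sam_labels = spectral_angle(pixels, class_means)
print(md_labels, sam_labels)
```

The third pixel has the same spectral shape as the vegetation mean but half the brightness. The distance rule places it with the darker class, while the angle rule still matches it to vegetation, which is the insensitivity to illumination described above.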
Training Sites
Training sites are areas that are known to be representative of a particular land cover type. The computer determines the spectral signature of the pixels within each training area, and uses this information to define the statistics, including the mean and variance of each of the classes.
Preferably the location of the training sites should be based on field collected data or high resolution reference imagery. It is important to choose training sites that cover the full range of variability within each class to allow the software to accurately classify the rest of the image.
If the training areas are not representative of the range of variability found within a particular land cover type, the classification may be much less accurate.
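As a rough illustration of how class statistics are derived from training pixels, the following sketch computes the per-band mean and variance for each class. The function name `class_statistics`, the two-band sample values, and the class codes are hypothetical.

```python
import numpy as np


def class_statistics(samples, labels):
    """Per-class mean and sample variance of each band, from training pixels."""
    stats = {}
    for c in np.unique(labels):
        rows = samples[labels == c]
        stats[int(c)] = {
            "mean": rows.mean(axis=0),
            "variance": rows.var(axis=0, ddof=1),  # ddof=1 gives the sample variance
            "count": len(rows),
        }
    return stats


# Hypothetical training pixels in two bands; labels 1 and 2 are class codes.
samples = np.array([[0.05, 0.02],
                    [0.07, 0.04],
                    [0.30, 0.50],
                    [0.34, 0.54],
                    [0.32, 0.52]])
labels = np.array([1, 1, 2, 2, 2])

stats = class_statistics(samples, labels)
for code, s in stats.items():
    print(code, s["mean"], s["variance"], s["count"])
```

A class whose training pixels cover only part of its real variability will get a variance that is too small, which is one way unrepresentative training sites hurt the result.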
Multiple small training sites should be selected for each class. The more time and effort spent collecting and selecting training sites, the better the classification results.
In supervised classification the majority of the effort is done prior to the actual classification process. Once the classification is run the output is a thematic image with classes that are labeled and correspond to information classes or land cover types.
Supervised classification can be much more accurate than unsupervised classification, but it depends heavily on the training sites, the skill of the individual processing the image, and the spectral distinctness of the classes. If two or more classes are very similar to each other in terms of their spectral reflectance, they are harder to separate and misclassification between them becomes more likely.
Supervised classification requires close attention to the development of training data.

Image classification in the field of remote sensing refers to the assignment of land cover categories or classes to image pixels.
For instance, land cover data collections and imagery can be classified into urban, agriculture, forest, and other classes for the sake of further analysis and processing.
Typically, professionals in GIS and remote sensing work with three types of image classification techniques. Of these, supervised and unsupervised image classification are the most commonly used.
Once a clustering algorithm is selected, the number of groups to be generated has to be identified. In the next step, every individual unclassified cluster is identified with land cover classes. As samples are not necessary for unsupervised classification, this technique serves as an easy means of segmenting and understanding images.
Supervised classification requires the selection of representative samples for individual land cover classes. In this technique of remote sensing image classification, the spectral signatures described in the training set are used by trained GIS experts to deliver accurate and detailed results. The most commonly used supervised classification algorithms are minimum-distance classification and maximum likelihood.
Both types of classification method require a degree of knowledge of the area of interest. In the case of the unsupervised classification technique, the analyst designates labels and combines classes after ascertaining useful facts and information about classes such as agricultural, water, forest, etc.
Because of the presence of mixed land cover classes, the assignment of geo-spectral clusters becomes a difficult task for GIS experts. Therefore, unsupervised classification is mainly used for the quick assignment of labels to simpler, less complex, and broadly defined land cover classes.
In other words, unsupervised classification is responsible for reducing analyst bias. On the other hand, supervised classification permits the fine tuning of information classes by analysts; these may be linked to finer subcategories including level classes.
In this image classification technique, high-accuracy GPS devices are used for collecting training data in the field. By using the supervised approach, GIS analysts can zoom in on any area, decipher the problem minutely, and use more accurate data to train classification algorithms. In comparison to unsupervised data, the usage of training data in supervised classification yields more accurate results.
This is because of the presence of reduced mixed pixels in the data collected through the supervised approach.
One common application of remotely-sensed images to rangeland management is the creation of maps of land cover, vegetation type, or other discrete classes by remote sensing software.
In supervised classification, the image processing software is guided by the user to specify the land cover classes of interest.
The software determines the spectral signature of the pixels within each training area, and uses this information to define the mean and variance of the classes in relation to all of the input bands or layers. Each pixel in the image is then assigned, based on its spectral signature, to the class it most closely matches.
It is important to choose training areas that cover the full range of variability within each land cover type to allow the software to accurately classify the rest of the image. Some of the more common classification algorithms used for supervised classification include the Minimum-Distance to the Mean Classifier, Parallelepiped Classifier, and Gaussian Maximum Likelihood Classifier.
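The Gaussian maximum likelihood idea mentioned above can be sketched in a few lines of NumPy: fit a mean vector and covariance matrix per class from training pixels, then assign each pixel to the class with the highest log-likelihood. This is an illustrative sketch with made-up two-band data and hypothetical function names, and it assumes equal prior probabilities for all classes.

```python
import numpy as np


def fit_gaussians(samples, labels):
    """Estimate a mean vector and covariance matrix for each class."""
    params = {}
    for c in np.unique(labels):
        rows = samples[labels == c]
        params[int(c)] = (rows.mean(axis=0), np.cov(rows, rowvar=False))
    return params


def log_likelihood(pixels, mean, cov):
    """Log of the multivariate normal density for each pixel."""
    diff = pixels - mean
    inv = np.linalg.inv(cov)
    mahal = np.einsum("ij,jk,ik->i", diff, inv, diff)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    n_bands = pixels.shape[1]
    return -0.5 * (mahal + logdet + n_bands * np.log(2 * np.pi))


def classify(pixels, params):
    """Assign each pixel to the class with the highest log-likelihood."""
    classes = sorted(params)
    scores = np.column_stack([log_likelihood(pixels, *params[c]) for c in classes])
    return np.array(classes)[np.argmax(scores, axis=1)]


# Hypothetical training pixels in two bands for class 1 and class 2.
samples = np.array([[0.05, 0.02], [0.07, 0.03], [0.06, 0.05], [0.04, 0.04],
                    [0.30, 0.50], [0.34, 0.52], [0.32, 0.56], [0.28, 0.54]])
labels = np.array([1, 1, 1, 1, 2, 2, 2, 2])

params = fit_gaussians(samples, labels)
predicted = classify(np.array([[0.06, 0.03], [0.31, 0.55]]), params)
print(predicted)
```

Each class needs enough training pixels for its covariance matrix to be invertible, which is one practical reason to collect several training sites per class.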
Supervised classification can be very effective and accurate in classifying satellite images and can be applied at the individual pixel level or to image objects (groups of adjacent, similar pixels). However, for the process to work effectively, the person processing the image needs to have a priori knowledge (field data, aerial photographs, or other knowledge) of where the classes of interest are located.
This method is often used with unsupervised classification in a process called hybrid classification. Unsupervised classification can be used first to determine the spectral class composition of the image and to see how well the intended land cover classes can be defined from the image. After this initial step, supervised classification can be used to classify the image into the land cover types of interest.
Supervised classification methods are used to generate a map with each pixel assigned to a class based on its multispectral composition. The classes are determined based on the spectral composition of training areas defined by the user.
Satellite images can be classified based on many distinguishable cover types that are specified by the user, including land cover classes. Supervised classification can be much more accurate than unsupervised classification, but it depends heavily on the prior knowledge and skill of the individual processing the image, and on the distinctness of the classes. If the designated training sites are not representative of the range of variability found within a particular land cover type, the classification may be much less accurate.
Likewise, if two or more classes are very similar to each other in terms of their spectral reflectance, they are harder to separate and misclassification between them becomes more likely.
A combination of supervised and unsupervised classification (hybrid classification) is often employed; this allows the remote sensing program to classify the image based on the user-specified land cover classes, but will also classify other less common or lesser known cover types into separate groups. Supervised and unsupervised classification are both pixel-based classification methods, and may be less accurate than object-based classification (Ghorbani et al.).
Unsupervised classification can be performed with any number of different remote-sensing or GIS-derived inputs. Commonly used inputs include spectral bands from satellite or airborne sensors, band ratios, and vegetation indices. Unsupervised classification is relatively easy to perform in any remote sensing software.
This image shows the use of training sites, shown as colored polygons, to inform the remote sensing software of major land cover and vegetation classes in the image for a supervised classification (image source: Short, N. The Remote Sensing Tutorial, Section 1).
Many of the current land cover maps that are routinely used in rangeland management were developed using supervised classification techniques.
Alrababah, M. (International Journal of Remote Sensing) used unsupervised and supervised classification methods to map land use, and showed that supervised classification improved map accuracy.

This is the continuation of the previous topics.
Classification is the type of supervised learning that we already discussed in the previous posts. Here I intend to write about classification and its algorithm types.
Classification is a machine learning task that identifies the class to which an instance belongs. It involves training a model to predict a qualitative target. It is a technique for determining whether a data sample will or will not be part of a category or type. Classification models predict categorical class labels. For card holders, a classification algorithm uses information such as income, purchase information, and occupation. Such systems include optimisation techniques such as genetic algorithms and support vector machines.
Face recognition refers to identifying an unknown face image using a computational algorithm. A successful face recognition methodology depends heavily on the particular choice of features. Consider the hours of studying needed to pass an exam. Logistic regression is used here because the outcome to predict, pass or fail, is binary.
In this case, you model the probability distribution of the output y as 1 or 0. Unlike linear regression, logistic regression has no closed-form solution for the optimal weights. Instead, you must solve for them with maximum likelihood estimation (choosing the weights under which the observed outcomes are most probable).
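Because there is no closed-form solution, the weights are usually found iteratively. The sketch below fits a one-feature logistic regression by plain gradient ascent on the log-likelihood. The hours-studied data, the learning rate, and the iteration count are all invented for illustration, and real libraries typically use more robust optimisers.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, n_iter=5000):
    """Fit intercept and slope by gradient ascent on the mean log-likelihood."""
    X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)                # current predicted P(y = 1)
        gradient = X.T @ (y - p) / len(y)  # gradient of the mean log-likelihood
        w += learning_rate * gradient
    return w


# Hypothetical data: hours studied and whether the exam was passed (1) or not (0).
hours = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0])
passed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

w = fit_logistic(hours, passed)
p_1h = sigmoid(w[0] + w[1] * 1.0)  # estimated pass probability after 1 hour
p_5h = sigmoid(w[0] + w[1] * 5.0)  # estimated pass probability after 5 hours
print(w, p_1h, p_5h)
```

The example data deliberately overlaps (a pass at 2.5 hours, a fail at 3.0), since with perfectly separable data the maximum likelihood weights would grow without bound.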
It can be used to calculate the probability of a given outcome in a binary model, like probability of being classified as sick or passing an exam.
The model gives the probability of the output variable y being equal to 1 and the probability of y being equal to 0. The two probabilities sum to 1. The data is used to conclude whether the weather will be sunny, stormy, cloudy, or rainy. The points lying on the fitted sigmoid curve are classified as either positive or negative cases.
A threshold is chosen for classifying the cases. The curve is bounded: its output always lies between 0 and 1, and the predicted class y is restricted to 1 or 0. The confusion matrix shows where your classification model is confused when it makes predictions. The ROC curve compares the model's true positive and false positive rates to those from a random assignment.
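As a small illustration of these evaluation ideas, the sketch below counts the four cells of a binary confusion matrix and derives the true positive rate and false positive rate, which together give one point on the ROC curve for a given threshold. The label vectors are made up.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, true negatives, and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


# Hypothetical true labels and model predictions at one threshold.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
tpr = tp / (tp + fn)  # true positive rate (sensitivity)
fpr = fp / (fp + tn)  # false positive rate
print(tp, fp, tn, fn, tpr, fpr)
```

Repeating this calculation over a range of thresholds traces out the full ROC curve.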
The AUC (area under the curve) measures the entire two-dimensional area under the ROC curve.

As stated in the first article of this series, classification is a subcategory of supervised learning where the goal is to predict the categorical class labels (discrete, unordered values, group membership) of new instances based on past observations. There are two main types of classification problems: binary and multiclass. The following example illustrates binary classification.
There are 2 classes, circles and crosses, and 2 features, X1 and X2. The model is able to find the relationship between the features of each data point and its class, and to set a boundary line between them, so when provided with new data, it can estimate the class where it belongs, given its features. In this case, the new data point falls into the circle subspace and, therefore, the model will predict its class to be a circle.
Different Classes
It is important to note that not every classification model will be able to properly separate the different classes in a dataset.
Some of the most typical cases are represented in the following picture. So the task of selecting an appropriate algorithm becomes of paramount importance in classification problems, and this will be one of the main topics discussed throughout the article.
Classification in Practice
The performance of a classifier is strongly influenced by the data available, the number of features and samples, the different classes, and whether they are linearly separable or not.
Recall that the development of a machine learning model involves six main steps. Next, we'll proceed to explore the different classification algorithms and learn which one is more suitable for each task. To tackle binary classification problems, logistic regression is one of the most used algorithms.
Logistic regression is a simple but powerful classification algorithm, despite its name. It works very well on linearly separable classes and can be extended to multiclass classification via the one-vs-rest (OvR) technique.
Odds Ratio
The odds ratio is an important concept for understanding the idea behind logistic regression. The odds ratio is the ratio of the probability that a certain event will occur to the probability that it will not. It can be written as:
odds = P / (1 - P)
where P stands for the probability of the positive event (the one that we are trying to predict). From this we can define the logit function.
Logit Function
The logit function is simply the logarithm of the odds ratio (log-odds):
logit(P) = log(P / (1 - P))
We will use it to express linear relationships between feature values and the log-odds.
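A quick numeric check can make the odds and log-odds more tangible. This is a minimal sketch using only the standard library; the function names `odds` and `logit` are chosen for illustration.

```python
import math


def odds(p):
    """Odds of the positive event: P / (1 - P)."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds: the natural logarithm of the odds."""
    return math.log(odds(p))


for p in (0.2, 0.5, 0.8):
    print(p, odds(p), logit(p))
```

A probability of 0.5 gives odds of 1 and log-odds of 0, and probabilities of 0.2 and 0.8 give log-odds of equal size and opposite sign, so the logit maps probabilities in (0, 1) symmetrically onto the whole real line.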
Our true motivation behind this is to predict the probability that a sample belongs to a certain class. This is the inverse of the logit function, and is frequently called the sigmoid function.
The Sigmoid Function
The formula of the sigmoid function is:
φ(z) = 1 / (1 + e^(-z))
Here z is the net input, which is the linear combination of weights and sample features and can be calculated as:
z = w0*x0 + w1*x1 + ... + wm*xm
When plotted, the sigmoid has an S shape. The function approaches one as z tends to infinity and approaches zero as z tends to minus infinity.
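The net input and the sigmoid can be written out directly. The weights and feature values below are arbitrary numbers chosen for illustration, and the helper names are assumptions.

```python
import math


def net_input(weights, features):
    """Linear combination of weights and features: z = w0*x0 + w1*x1 + ..."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    """Squash the net input into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


weights = [0.5, -1.0, 2.0]   # hypothetical weights; the first acts as the bias term
features = [1.0, 2.0, 1.5]   # hypothetical sample; the first feature is fixed at 1

z = net_input(weights, features)
print(z, sigmoid(z))
print(sigmoid(-10.0), sigmoid(0.0), sigmoid(10.0))
```

The last line shows the limiting behaviour described above: the output is close to zero for a large negative net input, exactly 0.5 at zero, and close to one for a large positive net input.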
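In summary, this is what the logistic regression model does while being trained. The predicted probability can be converted into a binary outcome by a unit step function (a quantizer): the predicted class is 1 if the predicted probability is at least 0.5, and 0 otherwise.
Looking at the sigmoid graph, this is equivalent to predicting class 1 if z is greater than or equal to 0, and class 0 otherwise.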
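The equivalence between thresholding the probability at 0.5 and thresholding the net input at 0 can be checked with a short sketch. The function names and the test values of z are chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_from_probability(p, threshold=0.5):
    """Unit step on the predicted probability."""
    return 1 if p >= threshold else 0


def predict_from_net_input(z):
    """Unit step applied directly to the net input."""
    return 1 if z >= 0.0 else 0


z_values = [-2.0, -0.1, 0.0, 0.1, 2.0]
from_probability = [predict_from_probability(sigmoid(z)) for z in z_values]
from_net_input = [predict_from_net_input(z) for z in z_values]
print(from_probability, from_net_input)
```

Both rules give the same labels because the sigmoid is increasing and equals 0.5 exactly at z = 0.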