CLASSIFIER

There are three types of events that can occur in a certain process, labeled e1, e2 and e3. We can measure an output variable of the process, called Obsv, and we use it to try to determine which event occurred. When event e1 occurs, the value of Obsv has a normal (Gaussian) distribution with a mean of 2 and a standard deviation of 1. If e2 occurs, the mean is 5 and the standard deviation is 2; if e3 occurs, the mean is 8 and the standard deviation is 1. Events e1, e2 and e3 occur with equal frequency.
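
The model described above can be sketched outside the network as a mixture of three normal distributions. This is a minimal illustration in Python, not the equation stored in the Obsv node; the names CLASSES, PRIOR and obsv_density are hypothetical, and any range limit or discretization that the node applies is ignored here.

```python
from statistics import NormalDist

# Class-conditional distributions of Obsv, taken from the description above.
CLASSES = {
    "e1": NormalDist(mu=2, sigma=1),
    "e2": NormalDist(mu=5, sigma=2),
    "e3": NormalDist(mu=8, sigma=1),
}
# Equal frequency of the three events.
PRIOR = {name: 1 / 3 for name in CLASSES}


def obsv_density(x):
    """Marginal density of Obsv: the prior-weighted sum of the three normals."""
    return sum(PRIOR[c] * CLASSES[c].pdf(x) for c in CLASSES)


for x in (2, 5, 8):
    print(x, round(obsv_density(x), 4))
```

The density is highest near 2 and 8, where the narrow distributions of e1 and e3 sit, and lower near 5, where the wider e2 distribution spreads its probability out.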

When an observation is made, we can never be sure which event occurred, since each event can produce any value of Obsv. Instead, we use the value of Obsv to determine the most likely event that occurred, and we return that as the answer (represented by node Classifier). The answer is either correct or an error, as indicated by node Result.

If you examine the equation of node Obsv, you will see it captures the combination of normal distributions described above. The equation of node Classifier uses thresholds to divide up the observation space.

Compile the network and try clicking on different values of the Class node. When Class is e1, you can see that Obsv is a normal distribution centered at 2 and fairly narrow. The Classifier node indicates that e1 will be predicted 83.8% of the time, e2 16.2% of the time and e3 very rarely. Of course that means that the Result will be correct 83.8% of the time.
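
The percentages above can be approximated by hand under two assumptions that are not stated in this description: that Obsv is limited to the range 0 to 10, and that the Classifier thresholds sit at 3 and 7. Both values are guesses chosen for illustration; check them against the actual node equations. A sketch in Python:

```python
from statistics import NormalDist

LO, HI = 0.0, 10.0   # assumed range of Obsv (a guess, not from the text)
CUTS = (3.0, 7.0)    # assumed Classifier thresholds (a guess, not from the text)


def truncated_prob(dist, a, b):
    """P(a < Obsv < b) for a normal distribution restricted to [LO, HI]."""
    total = dist.cdf(HI) - dist.cdf(LO)
    return (dist.cdf(b) - dist.cdf(a)) / total


e1 = NormalDist(mu=2, sigma=1)
edges = (LO, *CUTS, HI)

# Probability of each Classifier answer when the true class is e1.
predicted = {
    label: truncated_prob(e1, edges[i], edges[i + 1])
    for i, label in enumerate(("e1", "e2", "e3"))
}

for label, p in predicted.items():
    print(f"P(Classifier={label} | Class=e1) = {p:.3f}")
```

Under these assumptions the first two values come out close to the 83.8% and 16.2% shown by the compiled network, and the third is nearly zero.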

By clicking on different values of the Obsv node, you can see what happens when different observations are made. If the observation is between 2 and 3, then e1 occurred with probability 79%, e2 with probability 21%, and e3 remains a slight possibility. Since Classifier always predicts e1 for this observation, Result will be correct 79% of the time.
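
The figures for an observation between 2 and 3 follow from Bayes' rule: weight the probability that each class puts on that interval by its prior, then normalize. The Python sketch below is only a cross-check of the arithmetic; the function name posterior_given_interval is hypothetical, and any range limit the network places on Obsv is ignored.

```python
from statistics import NormalDist

CLASSES = {
    "e1": NormalDist(mu=2, sigma=1),
    "e2": NormalDist(mu=5, sigma=2),
    "e3": NormalDist(mu=8, sigma=1),
}
PRIOR = {name: 1 / 3 for name in CLASSES}


def posterior_given_interval(a, b):
    """P(class | a < Obsv < b) by Bayes' rule."""
    joint = {
        c: PRIOR[c] * (dist.cdf(b) - dist.cdf(a))
        for c, dist in CLASSES.items()
    }
    total = sum(joint.values())
    return {c: j / total for c, j in joint.items()}


post = posterior_given_interval(2, 3)
for c, p in post.items():
    print(f"P({c} | 2 < Obsv < 3) = {p:.4f}")
```

The result is about 0.79 for e1 and 0.21 for e2, with e3 below one in a million.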

By clicking on different values of Classifier, you can determine what the confidence of classification is. For instance, if our system returns an answer of e1, the actual class is e1 with probability 84.5%, e2 with probability 15.5%, and possibly even e3.
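
The confidence of classification is the reverse calculation: given the answer, how likely is each class? The sketch below again assumes a 0 to 10 range for Obsv and thresholds at 3 and 7, neither of which is stated in this description, so small differences from the figures above would not be surprising. All names are hypothetical.

```python
from statistics import NormalDist

LO, HI = 0.0, 10.0   # assumed range of Obsv (a guess)
CUTS = (3.0, 7.0)    # assumed Classifier thresholds (a guess)
LABELS = ("e1", "e2", "e3")
DISTS = {
    "e1": NormalDist(mu=2, sigma=1),
    "e2": NormalDist(mu=5, sigma=2),
    "e3": NormalDist(mu=8, sigma=1),
}
PRIOR = 1 / 3        # equal frequency of the three events


def p_predict(dist, k):
    """P(Classifier returns LABELS[k]) for a class with distribution dist."""
    edges = (LO, *CUTS, HI)
    total = dist.cdf(HI) - dist.cdf(LO)
    return (dist.cdf(edges[k + 1]) - dist.cdf(edges[k])) / total


def confidence(predicted):
    """P(actual class | Classifier answer) by Bayes' rule."""
    k = LABELS.index(predicted)
    joint = {c: PRIOR * p_predict(DISTS[c], k) for c in LABELS}
    total = sum(joint.values())
    return {c: j / total for c, j in joint.items()}


conf_e1 = confidence("e1")
for c, p in conf_e1.items():
    print(f"P(Class={c} | Classifier=e1) = {p:.3f}")
```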

If you click on the Error value of Result, you can see that the greatest probability of error is when the observations are near the boundaries between classes (i.e. Obsv is 3 to 4, or 6 to 7). The class most likely to result in an error is e2.

This is the classic form of a classifier problem that often occurs in engineering. Sometimes instead of events, there are objects to classify, or something else, but the form is the same. You can adjust this network to represent many of these problems by adjusting the prior probabilities of e1, e2 and e3 (corresponding to prior information of their relative frequencies), changing the number of events, changing the equation of the Obsv node (to represent different physical situations), and changing the equation of the Classifier node (to represent different classification rules). Sometimes there are multiple class variables, or multiple observation variables, which can simply be added as additional nodes.

Belief networks are an extremely powerful tool for analyzing all these classifier situations, since they can answer so many different types of questions so quickly.

Remember that if you change an equation, you have to choose Table -> Equation to Table before re-compiling.

To find the optimal cutoff points for the equation of the Classifier node in order to maximize some expected utility, you can use a decision network. For an example, see "Classifier Optimization".
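
To give a rough idea of what such an optimization does, the sketch below replaces the decision network with a brute-force grid search in Python. It assumes the simplest utility (1 for a correct Result, 0 for an error), a two-threshold rule, and no range limit on Obsv; these are illustrative assumptions, and the function name expected_accuracy is hypothetical.

```python
from statistics import NormalDist

E1, E2, E3 = NormalDist(2, 1), NormalDist(5, 2), NormalDist(8, 1)


def expected_accuracy(c1, c2):
    """P(Result is correct) with equal priors and the rule:
    Obsv < c1 -> e1, c1 <= Obsv < c2 -> e2, otherwise e3."""
    correct_e1 = E1.cdf(c1)
    correct_e2 = E2.cdf(c2) - E2.cdf(c1)
    correct_e3 = 1 - E3.cdf(c2)
    return (correct_e1 + correct_e2 + correct_e3) / 3


# Candidate cutoffs in steps of 0.01: c1 between 2 and 5, c2 between 5 and 8.
grid1 = [i / 100 for i in range(200, 501)]
grid2 = [i / 100 for i in range(500, 801)]

best_acc, best_cuts = max(
    (expected_accuracy(c1, c2), (c1, c2)) for c1 in grid1 for c2 in grid2
)

print("best cutoffs:", best_cuts)
print("expected accuracy:", round(best_acc, 4))
```

With this utility the best cutoffs land where the weighted densities of neighboring classes cross, which is inside the 3 to 4 and 6 to 7 regions. A different utility, such as one that penalizes some errors more than others, would move them.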

This network was designed by Norsys Software Corp.

Copyright 1998 Norsys Software Corp.