Learn Latent

This Bayes net demonstrates learning a latent (or "hidden") variable, which occurs when you have an important node in the net for which *no* data appears in the case file.

An example of that situation is when you conjecture that a number of nodes have a common unobservable cause, such as a disease that cannot be observed directly but gives rise to a number of symptoms.

When you build the net with Netica, you add a node for the latent variable and give it the number of states you think it should have. You may have to repeat the whole process with a different number of states each time, to see what best matches the data. It is better to err on the side of having too few states, since too many states will be difficult for Netica to learn, and for you to interpret (a rough parameter count appears in the sketch below). Keep in mind that the node Netica learns will probably have its states in a different order than you expect, since Netica has no knowledge that would help it decide an order for the states.
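One way to see why too many states hurts is to count the free parameters that must be estimated. The short Python sketch below does this for the A -> {R, S, T} structure of this example; the assumption that each child has two states is ours, for illustration, not read from the net file.

    # Free parameters to learn in the net A -> {R, S, T}, as a function of
    # the number k of latent states.  Two-state children are assumed here
    # for illustration.
    def free_params(k, child_states=(2, 2, 2)):
        n = k - 1                    # CPT of A: k entries with a sum-to-1 constraint
        for s in child_states:
            n += k * (s - 1)         # child CPT: one constrained row per state of A
        return n

    for k in range(2, 7):
        print(f"{k} latent states -> {free_params(k)} free parameters")

Every extra latent state adds a full row to each child's CPT, so the amount of data needed to pin the tables down grows accordingly.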

As an example, consider this net, which has a parent node A with three children: R, S and T. The first step is to generate a case file that contains information on R, S and T only, not A. You can do that by compiling the net, selecting the nodes R, S and T, and then choosing "Cases->Simulate Cases". Such a case file, containing a large number of cases (100,000) in a condensed format, is provided in the same directory as this net under the name "Learn Latent Data.cas". You can use "File->Open as Text" to examine it now.
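If you want to reproduce such a file outside Netica, the following Python sketch forward-samples cases from a hypothetical version of this net and writes them out with node A omitted. The CPT numbers and the simple whitespace-delimited layout are our assumptions; Netica's own .cas format (especially the condensed form) has details this sketch does not reproduce.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical CPTs for A -> {R, S, T}; the real ones live in the
    # "Learn Latent" net file, so these numbers are placeholders.
    pA = np.array([0.4, 0.6])
    pR = np.array([[0.8, 0.2], [0.3, 0.7]])   # P(R | A), one row per state of A
    pS = np.array([[0.9, 0.1], [0.2, 0.8]])
    pT = np.array([[0.7, 0.3], [0.4, 0.6]])

    N = 100_000
    a = rng.choice(2, size=N, p=pA)              # sample A first (no parents)
    r = (rng.random(N) < pR[a, 1]).astype(int)   # then each child given A
    s = (rng.random(N) < pS[a, 1]).astype(int)
    t = (rng.random(N) < pT[a, 1]).astype(int)

    # Write the cases with A omitted -- a plain whitespace-delimited file,
    # not Netica's exact .cas format.
    with open("learn_latent_data.txt", "w") as f:
        f.write("R S T\n")
        for row in zip(r, s, t):
            f.write("%d %d %d\n" % row)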

That case file represents data that could have been gathered from some real-world process whose internal mechanism is modeled by the net, but for which no observations of node A were available. Now suppose we are given that case file only, and through our insight into the world we conjecture that there is a common cause (A) for the observations R, S and T, and we want to determine how R, S and T probabilistically depend on A.

You could build the new net from scratch, or just delete the CPTs of the "Learn Latent" net, but the most convenient way right now is to open the net "Learn Latent no CPTs", which is in the same directory and has the same structure as "Learn Latent", but without any CPTables.

After opening it, choose "Cases->Learn Using EM", select the "Learn Latent Data.cas" file from the dialog box, and accept the default degree of 1. Netica will open the Messages window and proceed to do many steps of the EM algorithm to learn the new CPTs. For each step of EM, it prints a line showing the iteration number, the "log likelihood", and the percentage change in log likelihood from the previous iteration. Here the "log likelihood" is the per-case average of the negative of the logarithm of the probability of the case given the current Bayes net (structure + CPTs).
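To make the computation concrete, here is a minimal NumPy sketch of EM for this net structure, written independently of Netica. The CPT values used to simulate the data are placeholders (the real ones are in the "Learn Latent" net), and the two-state assumption for every node is ours; the printed columns mirror the iteration number, per-case negative log probability, and percentage change described above.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate cases from a "true" net A -> {R, S, T}.  These CPT values are
    # placeholders for illustration, not the ones in the Learn Latent net.
    true_pA = np.array([0.4, 0.6])
    true_pR = np.array([[0.8, 0.2], [0.3, 0.7]])   # rows indexed by state of A
    true_pS = np.array([[0.9, 0.1], [0.2, 0.8]])
    true_pT = np.array([[0.7, 0.3], [0.4, 0.6]])
    N = 100_000
    a = rng.choice(2, size=N, p=true_pA)           # latent; never shown to EM
    r = (rng.random(N) < true_pR[a, 1]).astype(int)
    s = (rng.random(N) < true_pS[a, 1]).astype(int)
    t = (rng.random(N) < true_pT[a, 1]).astype(int)

    # Random starting CPTs for the conjectured latent node with k states.
    k = 2
    pA = rng.dirichlet(np.ones(k))
    pR, pS, pT = (rng.dirichlet(np.ones(2), size=k) for _ in range(3))

    prev = None
    for it in range(1, 201):
        # E-step: posterior responsibility of each latent state for each case.
        joint = pA * pR[:, r].T * pS[:, s].T * pT[:, t].T     # shape (N, k)
        case_prob = joint.sum(axis=1)                         # P(r, s, t)
        w = joint / case_prob[:, None]                        # P(A | r, s, t)

        # Per-case average of -log P(case | net), the quantity Netica reports.
        neg_ll = -np.log(case_prob).mean()
        change = "" if prev is None else f"  {100 * (prev - neg_ll) / prev:+.5f}%"
        print(f"{it:3d}  {neg_ll:.6f}{change}")
        if prev is not None and abs(prev - neg_ll) < 1e-9:
            break
        prev = neg_ll

        # M-step: re-estimate every CPT from expected counts.
        pA = w.mean(axis=0)
        for v in (0, 1):
            pR[:, v] = w[r == v].sum(axis=0)
            pS[:, v] = w[s == v].sum(axis=0)
            pT[:, v] = w[t == v].sum(axis=0)
        for cpt in (pR, pS, pT):
            cpt /= cpt.sum(axis=1, keepdims=True)

    print("learned P(R|A):\n", pR)   # close to true_pR, up to state permutation

With 100,000 cases the expected counts are sharp, so the learned tables typically come close to the generating ones, though possibly with A's states swapped, which echoes the state-ordering caveat above.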

Normally you will only do a few steps of the EM algorithm, and then stop it by holding down the left mouse button and the Ctrl and Alt keys for a little while. However, this example runs to completion quite quickly.

You can now compare the learned net with the original (which represents the distribution in the real world). For instance, select node R of the original net and choose "Table->View/Edit", then do the same for node R of the learned net. You can see that the learned net has captured the real-world relationships quite well, considering that it had no observations of node A.

The EM algorithm searches over Bayes net CPTs in an attempt to maximize the probability of the data given the Bayes net (i.e. minimize the negative log likelihood). The same objective can also be optimized with a gradient descent algorithm: from the Netica menu, choose "Cases->Learn Using Gradient". That works similarly to EM learning, but uses a very different algorithm internally.
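As a rough illustration of the same objective being optimized by gradient methods, here is a sketch (again plain NumPy, not Netica's internal algorithm) that parameterizes each CPT with unconstrained logits, so a softmax keeps every row a valid distribution, and takes gradient-ascent steps on the average log likelihood. The data setup repeats the placeholder CPTs from the EM sketch above, and the step size is chosen ad hoc.

    import numpy as np

    rng = np.random.default_rng(0)

    # Same placeholder data-generating net as in the EM sketch above.
    true_pA = np.array([0.4, 0.6])
    true_pR = np.array([[0.8, 0.2], [0.3, 0.7]])
    true_pS = np.array([[0.9, 0.1], [0.2, 0.8]])
    true_pT = np.array([[0.7, 0.3], [0.4, 0.6]])
    N = 100_000
    a = rng.choice(2, size=N, p=true_pA)
    r = (rng.random(N) < true_pR[a, 1]).astype(int)
    s = (rng.random(N) < true_pS[a, 1]).astype(int)
    t = (rng.random(N) < true_pT[a, 1]).astype(int)

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    # Unconstrained logits; softmax turns each into a valid CPT row.
    k, lr = 2, 5.0
    thA = rng.normal(size=k)
    thR, thS, thT = (rng.normal(size=(k, 2)) for _ in range(3))
    onehot = np.eye(2)

    for it in range(1, 301):
        pA, pR, pS, pT = softmax(thA), softmax(thR), softmax(thS), softmax(thT)
        joint = pA * pR[:, r].T * pS[:, s].T * pT[:, t].T      # shape (N, k)
        case_prob = joint.sum(axis=1)
        w = joint / case_prob[:, None]                         # P(A | case)
        if it % 25 == 0:
            print(it, -np.log(case_prob).mean())   # same quantity EM reports

        # Analytic gradients of the mean log likelihood w.r.t. the logits:
        # posterior expected counts minus the model's current expectations.
        wbar = w.mean(axis=0)
        gA = wbar - pA
        gR = (w.T @ onehot[r]) / N - wbar[:, None] * pR
        gS = (w.T @ onehot[s]) / N - wbar[:, None] * pS
        gT = (w.T @ onehot[t]) / N - wbar[:, None] * pT
        thA += lr * gA
        thR += lr * gR
        thS += lr * gS
        thT += lr * gT

Both procedures climb the same likelihood surface; EM takes closed-form steps from expected counts, while the gradient version takes many small steps, which is why the two can behave quite differently on the same data.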

At Norsys, we would appreciate any comments you have on our advanced learning algorithms, and reports of particular successes or failures you have had on your data sets (email info@norsys.com).