Learning From a Case File

This topic describes how to learn a Bayes net from a file of cases. (Alternatively, you can learn from cases one by one as you enter them, or use other learning functions.) The steps involved are listed below, followed by more detailed instructions:

1. Obtain a file of cases

2. Create nodes for the variables of interest

3. Connect the nodes with links

4. Learn the conditional probability tables (CPTs)

5. You may then want to view or modify the CPTs, harden the CPTs, absorb nodes, or learn from further case files (which may require changing the names of nodes or states, or adding new nodes or links)

1. Case File:  See Creating Case Files.  Note: Netica on a Mac cannot learn cases from an Excel file.  You must first convert the Excel file into a text file before learning.  Note: Be sure your text file follows the proper case file format.

2. Nodes:  Before learning begins, you must have a Bayes net whose nodes are the variables (i.e., attributes) of the cases.  The net may also contain additional nodes, whether or not they are related to the cases in the file.

If you don’t already have a net constructed, or the net you have doesn’t include all the variables in the case file that you want, Cases → Learn → Add Case File Nodes may be helpful.  It scans through a case file and adds to the current net a new node for each variable it finds in the case file that isn’t already in the net.  The states of each new node are all the values found for that variable in the case file.  If your net already has a node with the same name as a variable in the case file, but that node doesn’t have all the states mentioned in the case file for that variable, then the missing states are added to the node (unless the node has a state called ‘other’).

After Netica has added all the nodes, you can move them to the positions you want and delete any that you aren’t interested in.
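The scan described above can be sketched roughly as follows. This is an illustration only, not Netica's implementation: it assumes a tab-delimited text file whose first line holds the variable names and in which `*` marks a missing value, and it represents the net as a plain dictionary. The names `scan_case_file` and `add_case_file_nodes` are hypothetical.

```python
import csv
import io

MISSING = "*"  # assumption: marker used for a missing value in the case file


def scan_case_file(text, delimiter="\t"):
    """Return {variable name: list of distinct values, in order first seen}."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # assumption: first line lists the variable names
    found = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            value = value.strip()
            if value and value != MISSING and value not in found[name]:
                found[name].append(value)
    return found


def add_case_file_nodes(net, found):
    """Merge discovered variables into net, a dict {node name: list of states}."""
    for name, values in found.items():
        if name not in net:
            # Variable not in the net yet: create a node with every value seen.
            net[name] = list(values)
        elif "other" not in net[name]:
            # Existing node without an 'other' state: add the states it lacks.
            net[name].extend(v for v in values if v not in net[name])
    return net


cases = "Smoking\tCancer\nyes\tpresent\nno\tabsent\nno\t*\n"
net = {"Cancer": ["present"]}
add_case_file_nodes(net, scan_case_file(cases))
print(net)
```

In this sketch the existing Cancer node gains the state `absent`, and a new Smoking node is created with the states `yes` and `no`.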

3. Links:  Add links between the nodes in the net to capture the dependencies that you wish to learn.  Try to avoid giving any node too many parents, especially if you don’t have very many cases to learn from.  Alternatively, you can use TAN learning to learn the link structure, given a target node.
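To see why many parents are a problem when cases are scarce, note that a node's CPT has one row for every combination of its parents' states, so each added parent multiplies the number of rows. The short sketch below works through that arithmetic; the function names and the figure of 500 cases are hypothetical, chosen only for illustration.

```python
from math import prod


def cpt_rows(parent_state_counts):
    """Number of parent configurations, i.e. rows in the child's CPT."""
    return prod(parent_state_counts)  # the product of an empty list is 1


def average_cases_per_row(num_cases, parent_state_counts):
    """Average number of cases available to estimate each CPT row."""
    return num_cases / cpt_rows(parent_state_counts)


# Two parents with three states each versus six such parents.
print(cpt_rows([3, 3]))      # 3 * 3 = 9 rows
print(cpt_rows([3] * 6))     # 3 ** 6 = 729 rows

# With 500 cases, six three-state parents leave less than one case per row.
print(average_cases_per_row(500, [3] * 6) < 1)
```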

4. Learn:  When you choose Cases → Learn → Incorp Case File, Netica will ask you for a case file and a “degree”.  Normally, you enter 1 for the degree, but you can enter other numbers for special effects.  If you enter 2 for the degree, the learning will act as if it sees every case in the file twice, and similarly for other numbers (fractional numbers are okay).  If you want to undo the effect of earlier learning, you can learn again from the same file, but with a degree of –1 (it doesn’t matter if you have done other counting learning since then, provided you haven’t hardened, softened, faded, or edited the CPTs).  Netica builds up the CPTs according to its learning algorithm, and as it processes the cases, it reports its progress in the Messages window.
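The effect of the degree can be illustrated with a simplified model of counting learning. The sketch below is not Netica's actual algorithm (for example, it starts from empty counts rather than from existing CPTs); it assumes plain weighted frequency counting, where each case adds the degree to one count, and the class `CountTable` is hypothetical. It shows why a degree of –1 cancels an earlier pass with degree 1 even when other counting has happened in between.

```python
from collections import defaultdict


class CountTable:
    """Weighted counts for one node: parent configuration -> state -> count."""

    def __init__(self, states):
        self.states = list(states)
        self.counts = defaultdict(lambda: {s: 0.0 for s in self.states})

    def learn(self, cases, degree=1.0):
        """Add `degree` to the count matching each (parent_config, state) case."""
        for parent_config, state in cases:
            self.counts[parent_config][state] += degree

    def cpt_row(self, parent_config):
        """Normalize counts into probabilities; uniform if nothing was counted."""
        row = self.counts[parent_config]
        total = sum(row.values())
        if total <= 0:
            return {s: 1.0 / len(self.states) for s in self.states}
        return {s: c / total for s, c in row.items()}


file_a = [(("yes",), "present"), (("yes",), "absent"), (("yes",), "present")]
file_b = [(("yes",), "absent")]

table = CountTable(["present", "absent"])
table.learn(file_a, degree=1)    # counts for ("yes",): present 2, absent 1
table.learn(file_b, degree=1)    # other counting learning in between
table.learn(file_a, degree=-1)   # undo the first file
print(table.cpt_row(("yes",)))   # only file_b's single case remains
```

Because the counts are simply added, the order of the passes does not matter, which is why the undo works only as long as the table has not been changed by some other kind of operation.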