Simulating Random Cases

You can use Netica to generate a series of random cases whose probability distribution matches that of a particular Bayes net, a process known as simulation (sometimes called sampling).  These cases can be used as example scenarios of what one should expect if the Bayes net matches reality.  Or they can be manipulated and combined with other cases, and then used to learn a new net.

The sampling algorithms used are exact, so the long-run frequencies of the cases will converge to the probabilities of the Bayes net, while taking account of all findings currently entered.
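To illustrate what it means for long-run frequencies to converge to the net's probabilities, here is a minimal Python sketch of forward sampling on a hypothetical two-node net (Rain → WetGrass).  The net, its numbers, and the function names are assumptions made for illustration; this is not Netica's API or its internal algorithm.

```python
import random

# Hypothetical two-node net: Rain -> WetGrass (illustrative numbers only).
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}


def sample_case(rng):
    """Draw one case by sampling each node after its parents."""
    rain = rng.random() < P_RAIN
    wet = rng.random() < P_WET_GIVEN_RAIN[rain]
    return {"Rain": rain, "WetGrass": wet}


def simulate(n, seed=0):
    """Generate n independent random cases."""
    rng = random.Random(seed)
    return [sample_case(rng) for _ in range(n)]


def frequency(cases, node):
    """Fraction of cases in which the given node is true."""
    return sum(case[node] for case in cases) / len(cases)


if __name__ == "__main__":
    cases = simulate(100_000)
    # Exact marginal for WetGrass: 0.2 * 0.9 + 0.8 * 0.1 = 0.26
    print(frequency(cases, "Rain"), frequency(cases, "WetGrass"))
```

With more cases, the observed frequencies should settle closer to 0.2 for Rain and 0.26 for WetGrass.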

The cases will be stored in a file whose format matches the specification of a case file.  Once Netica has made the case file, you can browse it with the F8 key to see the individual cases.

How To:  To generate a case file for the active Bayes net, compile it, select the nodes for which you wish to have values in the case file, and then choose Cases → Simulate Cases.  All the nodes of the net will be used to generate the cases, but columns will only be made for the selected ones.  You will be asked how many cases to generate, the file name for the case file, where to put it, and how much missing data you want.  Normally you will enter 0 for the amount of missing data, but if you want a case file with asterisks in place of some fraction of the values, enter that fraction (e.g. entering 0.25 means 25% of the values will be missing).  If you wish to generate only a single random case (and not save it to a file), choose Cases → Random Case.
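The effect of the column selection and the missing-data fraction can be sketched in Python as follows.  This is only an illustration under stated assumptions: the tab-delimited layout is a simplification rather than the full case file specification, the function name write_case_file is hypothetical, and each value is assumed to go missing independently with the given probability, which may differ from how Netica chooses the missing fields.

```python
import csv
import io
import random
import sys


def write_case_file(cases, columns, missing_fraction, out, seed=0):
    """Write cases as tab-delimited text, one column per selected node.

    Each value is replaced by '*' with probability missing_fraction
    (an assumed, independent-per-field scheme).
    """
    rng = random.Random(seed)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)  # header row: the selected node names
    for case in cases:
        row = [
            "*" if rng.random() < missing_fraction else case[col]
            for col in columns
        ]
        writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical cases; only two of the three nodes get columns.
    cases = [
        {"Smoking": "smoker", "Cancer": "absent", "XRay": "normal"},
        {"Smoking": "nonsmoker", "Cancer": "absent", "XRay": "normal"},
    ]
    write_case_file(cases, ["Smoking", "XRay"], 0.25, sys.stdout)
```

Every node contributes to each generated case, but only the columns named in the selection are written out.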

Example:  If you do a Cases → Simulate Cases command with ‘Chest Clinic.dne’ from the Examples folder, and enter 120, “Chest Clinic.cases” and 0 in the dialog boxes, then you will obtain a case file similar to this (the case file you obtain may be a little different, since random numbers are involved).

With Equations:  If one or more nodes have an equation to define the relation between the node and its parents, then you may want Netica to use those equations directly to generate the random cases, instead of the probability tables which approximate the equations.  In that case, don’t compile the net before doing Cases → Simulate Cases.  Because a rejection method is used, the sampling process will be slow if the net has an unlikely set of findings entered.  In the case file generated, each continuous variable (whether or not it has been discretized) will have its actual real-number value for each case, not just a state representing a range of values.
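The reason unlikely findings slow things down can be seen in a generic rejection-sampling sketch in Python.  The two-node net, its equations, and the function names below are hypothetical, and the code is a textbook rejection scheme written for illustration, not a description of Netica's implementation.

```python
import random


def sample_equation_case(rng):
    """One forward draw from a hypothetical net defined by equations."""
    temp = rng.uniform(0.0, 40.0)             # root node: Temp ~ Uniform(0, 40)
    load = 2.0 * temp + rng.gauss(0.0, 1.0)   # child: Load = 2*Temp + noise
    return {"Temp": temp, "Load": load}


def rejection_sample(n, finding, seed=0):
    """Draw cases until n of them agree with the finding.

    Returns the accepted cases and the total number of draws made.
    """
    rng = random.Random(seed)
    accepted = []
    draws = 0
    while len(accepted) < n:
        case = sample_equation_case(rng)
        draws += 1
        if finding(case):  # discard cases that contradict the finding
            accepted.append(case)
    return accepted, draws


if __name__ == "__main__":
    # A likely finding (about 75% of draws) versus an unlikely one (about 5%).
    _, likely_draws = rejection_sample(200, lambda c: c["Temp"] > 10.0)
    cases, unlikely_draws = rejection_sample(200, lambda c: c["Temp"] > 38.0)
    print(likely_draws, unlikely_draws)
    print(cases[0])  # values are real numbers, not discretized states
```

The less probable the finding, the more draws are discarded for each accepted case, so the total work grows roughly in inverse proportion to the probability of the findings.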