Test Net with Cases

Testing a Net Using Cases

This section documents the menu choice Cases → Test Net Using Cases (or "Network → Test Using Cases" on older versions of Netica). The purpose of this feature is to grade a Bayes net using a set of real cases, to see how well the net's predictions or diagnoses match the actual cases. It is not for decision networks.

First, select the nodes whose values you do not want the network to know during inference. For example, if the network is for medical diagnosis, you might select the disease nodes and nodes representing other unobservable internal states. We will call these nodes the "unobserved" nodes. Then choose Network → Test With Cases. You will be asked which file of cases to use, and after you choose one, Netica will start processing.

The Messages window will come to the front and display the percentage of cases processed so far. To stop processing early, hold down Ctrl + Alt + the left mouse button together (the results for the cases already processed will then be printed). Netica passes through the case file, processing the cases one by one. For each case, it first reads in the case, except for any findings for the unobserved nodes. It then does belief updating to generate beliefs for each of the unobserved nodes. Finally, it checks the true values for those nodes as supplied by the case file (if they are supplied for that case) and compares them with the beliefs it generated. It accumulates all the comparisons into summary statistics.
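The per-case procedure described above can be sketched in a few lines of Python. This is only an illustration of the loop, not Netica's implementation: the function `grade`, the callback `toy_infer`, and the node names "Test" and "Disease" are all hypothetical stand-ins, and a real network would replace `toy_infer` with actual belief updating.

```python
from collections import defaultdict

def grade(cases, unobserved, infer):
    """Replay cases and collect (beliefs, actual_state) pairs per unobserved node."""
    comparisons = defaultdict(list)
    for case in cases:
        # Read in the case, except for findings on the unobserved nodes.
        evidence = {n: v for n, v in case.items() if n not in unobserved}
        # Belief updating; `infer` is a hypothetical callback, not a Netica call.
        beliefs = infer(evidence)
        for node in unobserved:
            actual = case.get(node)
            if actual is not None:  # the true value may be missing for a case
                comparisons[node].append((beliefs[node], actual))
    return comparisons

def toy_infer(evidence):
    """Toy stand-in for a Bayes net: beliefs depend only on the 'Test' finding."""
    if evidence.get("Test") == "positive":
        return {"Disease": {"present": 0.8, "absent": 0.2}}
    return {"Disease": {"present": 0.1, "absent": 0.9}}

cases = [
    {"Test": "positive", "Disease": "present"},
    {"Test": "negative", "Disease": "absent"},
    {"Test": "negative"},  # no true value supplied, so nothing to compare
]
results = grade(cases, {"Disease"}, toy_infer)
print(len(results["Disease"]))  # number of comparisons accumulated
```

The accumulated pairs are the raw material for the summary statistics in the report: the confusion matrix, the scoring rules, and the calibration tables.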

When Netica is done, it will print a report for each of the unobserved nodes (except constant nodes). Typically you are only interested in some of them, so you can ignore the rest. The report for a node named "SpkQual" (with node title "Spark quality") might look something like this:

For SpkQual: Spark quality

Confusion:
  .......Predicted......
  good    bad     very_b    Actual
  ------  ------  ------    ------
  253     0       0         good
  22      176     4         bad
  13      19      430       very_bad

Error rate = 6.325%

Scoring Rule Results:
  Logarithmic loss = 0.2144
  Quadratic loss   = 0.1099
  Spherical payoff = 0.9409

Calibration:
  good      0-0.5: 0     | 0.5-1: 0     | 1-2: 0       | 2-5: 0       |
            5-80: 49     | 80-95: 87.5  | 95-98: 95.7  |
  bad       0-1: 0       | 1-2: 1.52    | 2-5: 2.4     | 5-10: 5.17   |
            10-50: 20    | 50-85: 82.6  | 85-95: 90    | 95-100: 100  |
  very_bad  0-0.1: 0     | 0.1-0.5: 0   | 0.5-5: 6.94  | 5-10: 9.33   |
            10-20: 16.2  | 20-95: 83.3  | 95-98: 98.9  | 98-99: 100   |
            99-100: 100  |
  Total     0-0.1: 0     | 0.1-0.5: 0   | 0.5-1: 0     | 1-2: 0.431   |
            2-5: 2.5     | 5-10: 6.28   | 10-15: 10.9  | 15-20: 13.3  |
            20-50: 30.1  | 50-80: 81.5  | 80-90: 86    | 90-95: 93.7  |
            95-98: 97.6  | 98-99: 100   | 99-100: 100  |

Times Surprised (percentage):
            .................Predicted Probability...................
  State     < 1%          < 10%           > 90%           > 99%
  -----     ----          -----           -----           -----
  good      0.00 (0/312)  0.00 (0/614)    6.86 (14/204)   0.00 (0/0)
  bad       0.00 (0/225)  1.98 (13/657)   0.00 (0/69)     0.00 (0/0)
  very_bad  0.00 (0/216)  3.32 (12/361)   0.25 (1/399)    0.00 (0/31)
  Total     0.00 (0/753)  1.53 (25/1632)  2.23 (15/672)   0.00 (0/31)
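The error rate in the sample report follows directly from its confusion matrix: it is the share of cases that fall off the diagonal. The short Python check below reproduces the figure from the counts shown above. The dictionary layout (`confusion[actual][predicted]`) is just a convenient representation chosen for this sketch.

```python
# Counts copied from the sample report: confusion[actual][predicted].
confusion = {
    "good":     {"good": 253, "bad": 0,   "very_bad": 0},
    "bad":      {"good": 22,  "bad": 176, "very_bad": 4},
    "very_bad": {"good": 13,  "bad": 19,  "very_bad": 430},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[state][state] for state in confusion)  # diagonal entries
error_rate = 100.0 * (total - correct) / total

print(f"{total - correct} of {total} cases misclassified: {error_rate:.3f}%")
# prints "58 of 917 cases misclassified: 6.325%"
```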

Sections of the Report

Confusion Matrix & Errors

Scoring Rule Results

Calibration & Times Surprised Table

Quality of Test
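As we read the sample report, both the Calibration and the Times Surprised sections are simple counts over (belief, outcome) pairs for one state: calibration gives, for each belief range, the percentage of times the state actually occurred, and "times surprised" counts how often a confident belief turned out wrong. The Python sketch below illustrates that reading. The function names, the fixed bin edges, and the eight forecasts are hypothetical; Netica chooses its own ranges, as the uneven bins in the sample report show.

```python
def calibration(forecasts, edges):
    """For each belief bin, the percentage of times the state actually occurred."""
    rows = []
    last = len(edges) - 2
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        # Bins are [lo, hi); the final bin also includes its upper edge.
        in_bin = [hit for p, hit in forecasts
                  if lo <= p < hi or (i == last and p == hi)]
        if in_bin:  # skip empty bins
            rows.append((lo, hi, 100.0 * sum(in_bin) / len(in_bin)))
    return rows

def times_surprised(forecasts, threshold, confident_yes):
    """Return (surprises, confident_predictions) for one column of the table."""
    if confident_yes:   # e.g. "> 90%": surprised when the state did NOT occur
        picked = [hit for p, hit in forecasts if p > threshold]
        surprises = sum(1 for hit in picked if not hit)
    else:               # e.g. "< 10%": surprised when the state DID occur
        picked = [hit for p, hit in forecasts if p < threshold]
        surprises = sum(1 for hit in picked if hit)
    return surprises, len(picked)

# Hypothetical (belief in the state, did the state occur?) pairs.
forecasts = [(0.02, False), (0.05, False), (0.08, True), (0.40, False),
             (0.60, True), (0.93, True), (0.95, True), (0.97, False)]

rows = calibration(forecasts, [0.0, 0.1, 0.9, 1.0])
low = times_surprised(forecasts, 0.10, confident_yes=False)
high = times_surprised(forecasts, 0.90, confident_yes=True)
print(rows, low, high)
```

A well-calibrated network has calibration percentages that fall inside their own ranges, and surprise ratios that stay below what the threshold allows (roughly under 10% in the "< 10%" and "> 90%" columns).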

NOTES:

If you have any findings entered before choosing Network → Test With Cases, they will be taken into account during all belief updating (unless the case file has a column for that node). Netica will warn you in this event, so that you don't obtain wrong results by inadvertently leaving some findings in the network. A situation in which you would want to leave a finding in the network is when the network is designed for a broader class of cases than the case file covers. For example, if you have a network designed to handle people of both genders (and it has a 'gender' node), but the case file contains females only, you should enter a finding of 'female' for the 'gender' node before grading the network.

If the findings for the observed nodes of a case (those not selected as unobserved) are impossible according to the network, then an inconsistent-findings error message will be displayed, that case will be ignored, and processing will continue. If the network makes predictions for the unobserved nodes that are inconsistent with the case file (that is, it gives zero probability to a value that actually occurs), no error message is generated; the network is simply graded more poorly (and has a logarithmic loss of INFINITY).

Depending on your application, any of the measures calculated could be the most valuable to you. However, if you want a single number to grade a network, and aren't sure which one to pick, we suggest the logarithmic loss.

This function will properly support a 'NumCases' column in the case file, if one is present.
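To make the scoring rules and the INFINITY case concrete, here is a Python sketch using the standard textbook definitions of the three rules, averaged over cases: logarithmic loss is the mean of -ln(Pc), quadratic loss the mean of 1 - 2·Pc + Σ Pj², and spherical payoff the mean of Pc / sqrt(Σ Pj²), where Pc is the belief given to the state that actually occurred. We assume these match what Netica reports. Treating the 'NumCases' value as a per-case weight is also an assumption of this sketch, and `scoring_rules` is a hypothetical helper.

```python
import math

def scoring_rules(cases):
    """cases: list of (beliefs, actual_state, num_cases) triples.
    Returns (logarithmic loss, quadratic loss, spherical payoff)."""
    total_weight = sum(w for _, _, w in cases)
    log_loss = quad_loss = spherical = 0.0
    for beliefs, actual, w in cases:
        pc = beliefs[actual]                       # belief in the true state
        sq = sum(p * p for p in beliefs.values())  # sum of squared beliefs
        # A zero belief in the true state gives an infinite logarithmic loss.
        log_loss += w * (-math.log(pc) if pc > 0 else math.inf)
        quad_loss += w * (1 - 2 * pc + sq)
        spherical += w * pc / math.sqrt(sq)
    return (log_loss / total_weight,
            quad_loss / total_weight,
            spherical / total_weight)

# Hypothetical cases; the first stands for 3 identical cases (NumCases = 3).
cases = [
    ({"good": 0.8, "bad": 0.2}, "good", 3),
    ({"good": 0.5, "bad": 0.5}, "bad", 1),
]
scores = scoring_rules(cases)

# One case the network considered impossible makes the whole log loss infinite.
impossible = cases + [({"good": 1.0, "bad": 0.0}, "bad", 1)]
scores_impossible = scoring_rules(impossible)
print(scores, scores_impossible)
```

Note that only the logarithmic loss becomes infinite for the impossible case; the quadratic loss and spherical payoff stay finite, which is one reason the three measures can rank networks differently.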

As well as grading a network, this feature can also be used to determine the usefulness of particular tests or findings in a real-world environment. Groups of findings or tests often have quite a different usefulness when considered together than when considered one by one, and this feature also allows you to investigate such groups. By selecting extra nodes in the first step, you can make some possible findings from the case file unavailable to the network. Then you can see how much the results of the network are degraded by not having access to those findings. In the medical example mentioned earlier, you might additionally select the nodes 'Blood Test' and 'Smear Test', and then compare the new confusion matrix with the old one, to see whether the number of false negatives and false positives for serious diseases changed significantly.
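One way to carry out that comparison is to pull the false negatives and false positives for a given state out of each confusion matrix and set them side by side. The helper below is a hypothetical Python sketch; it is applied here to the sample "SpkQual" matrix from the report above, since no second matrix is available. In practice you would run it on the matrices from both test runs.

```python
def misses_and_false_alarms(confusion, state):
    """confusion[actual][predicted] -> count.
    Returns (false negatives, false positives) for `state`."""
    # False negatives: the state was actual but something else was predicted.
    false_neg = sum(n for pred, n in confusion[state].items() if pred != state)
    # False positives: the state was predicted but something else was actual.
    false_pos = sum(row[state] for actual, row in confusion.items()
                    if actual != state)
    return false_neg, false_pos

# Counts copied from the sample report.
confusion = {
    "good":     {"good": 253, "bad": 0,   "very_bad": 0},
    "bad":      {"good": 22,  "bad": 176, "very_bad": 4},
    "very_bad": {"good": 13,  "bad": 19,  "very_bad": 430},
}

for state in confusion:
    fn, fp = misses_and_false_alarms(confusion, state)
    print(f"{state}: {fn} false negatives, {fp} false positives")
```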

This feature is also available to programmers using the Netica API; contact Norsys for more information.