asia {bnlearn} R Documentation

Asia (synthetic) data set by Lauritzen and Spiegelhalter

Description

Small synthetic data set from Lauritzen and Spiegelhalter (1988) about lung diseases (tuberculosis, lung cancer or bronchitis) and visits to Asia.

Usage

data(asia)

Format

The asia data set contains the following variables:

  • D (dyspnoea), a two-level factor with levels yes and no.

  • T (tuberculosis), a two-level factor with levels yes and no.

  • L (lung cancer), a two-level factor with levels yes and no.

  • B (bronchitis), a two-level factor with levels yes and no.

  • A (visit to Asia), a two-level factor with levels yes and no.

  • S (smoking), a two-level factor with levels yes and no.

  • X (chest X-ray), a two-level factor with levels yes and no.

  • E (tuberculosis versus lung cancer/bronchitis), a two-level factor with levels yes and no.

Note

Lauritzen and Spiegelhalter (1988) motivate this example as follows:

“Shortness-of-breath (dyspnoea) may be due to tuberculosis, lung cancer or bronchitis, or none of them, or more than one of them. A recent visit to Asia increases the chances of tuberculosis, while smoking is known to be a risk factor for both lung cancer and bronchitis. The results of a single chest X-ray do not discriminate between lung cancer and tuberculosis, as neither does the presence or absence of dyspnoea.”

Standard learning algorithms are not able to recover the true structure of the network because of the presence of a node (E) with conditional probabilities equal to both 0 and 1. Monte Carlo tests seems to behave better than their parametric counterparts.

The complete BN can be downloaded from https://www.bnlearn.com/bnrepository/.

Source

Lauritzen S, Spiegelhalter D (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)". Journal of the Royal Statistical Society: Series B, 50(2):157–224.

Examples

# load the data.
data(asia)
# create and plot the network structure.
dag = model2network("[A][S][T|A][L|S][B|S][D|B:E][E|T:L][X|E]")
## Not run: graphviz.plot(dag)