structural.em {bnlearn} R Documentation

Structure learning from missing data

Description

Learn the structure of a Bayesian network from a data set containing missing values using Structural EM.

Usage

structural.em(x, maximize = "hc", maximize.args = list(), fit,
    fit.args = list(), impute, impute.args = list(), return.all = FALSE,
    start = NULL, max.iter = 5, debug = FALSE)

Arguments

x

a data frame containing the variables in the model.

maximize

a character string, the score-based algorithm to be used in the “maximization” step. See structure learning for details.

maximize.args

a list of arguments to be passed to the algorithm specified by maximize, such as restart for hill-climbing or tabu for tabu search.

fit

a character string, the parameter learning method to be used in the “maximization” step. See bn.fit for details.

fit.args

a list of arguments to be passed to the parameter learning method specified by fit.

impute

a character string, the imputation method to be used in the “expectation” step. See impute for details.

impute.args

a list of arguments to be passed to the imputation method specified by impute.

return.all

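a boolean value. If TRUE, the imputed data and the fitted network from the last iteration are returned along with the learned network structure. See the Value section below for details.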

start

a bn or bn.fit object, the network used to perform the first imputation and as a starting point for the score-based algorithm specified by maximize.

max.iter

an integer, the maximum number of iterations.

debug

a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.
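As a minimal sketch of how the arguments above fit together, the call below selects tabu search for the maximization step and passes its tabu argument through maximize.args, while choosing "parents" as the imputation method. The amount of missing data, the tabu list length and the iteration cap are arbitrary values chosen for illustration; fit is left unspecified so that the function picks the parameter learning method itself.

```r
library(bnlearn)
data(learning.test)

# introduce some missing values (illustrative pattern only).
incomplete.data = learning.test
incomplete.data[1:100, "A"] = NA
incomplete.data[101:200, "B"] = NA

# tabu search in the maximization step, imputation from the parents
# in the expectation step, and at most 3 iterations.
dag = structural.em(incomplete.data,
        maximize = "tabu", maximize.args = list(tabu = 20),
        impute = "parents", max.iter = 3)
dag
```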

Value

If return.all is FALSE, structural.em() returns an object of class bn. (See bn-class for details.)

If return.all is TRUE, structural.em() returns a list with three elements named dag (an object of class bn), imputed (a data frame containing the imputed data from the last iteration) and fitted (an object of class bn.fit, again from the last iteration; see bn.fit-class for details).
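The two return shapes described above can be inspected as follows. This is a small sketch on learning.test with an arbitrary block of missing values; the object name res is only illustrative.

```r
library(bnlearn)
data(learning.test)

incomplete.data = learning.test
incomplete.data[1:100, 1] = NA

# request the imputed data and the fitted network as well as the DAG.
res = structural.em(incomplete.data, return.all = TRUE)

names(res)            # element names of the returned list
class(res$dag)        # the learned structure
class(res$fitted)     # the fitted network from the last iteration
head(res$imputed)     # the imputed data from the last iteration
```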

Note

If at least one of the variables in the data x does not contain any observed value, the start network must be specified and it must be a bn.fit object. Otherwise, structural.em() is unable to complete the first maximization step because it cannot fit the corresponding local distribution(s).

If impute is set to "bayes-lw", each call to structural.em() may produce a different model because the imputation is based on a stochastic simulation.
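One way to make such runs repeatable is to fix the random seed before each call. The sketch below assumes that the simulation draws from R's own random number generator and that no parallel cluster is involved; the seed value is arbitrary.

```r
library(bnlearn)
data(learning.test)

incomplete.data = learning.test
incomplete.data[1:100, 1] = NA

# same seed before each call, so both runs see the same random draws
# (assumption: the imputation uses R's random number generator).
set.seed(42)
dag1 = structural.em(incomplete.data, impute = "bayes-lw")
set.seed(42)
dag2 = structural.em(incomplete.data, impute = "bayes-lw")

# compare the two learned structures.
all.equal(dag1, dag2)
```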

Author(s)

Marco Scutari

References

Friedman N (1997). "Learning Belief Networks in the Presence of Missing Values and Hidden Variables". Proceedings of the 14th International Conference on Machine Learning, 125–133.

See Also

score-based algorithms, bn.fit, impute.

Examples

data(learning.test)

# learn with incomplete data.
incomplete.data = learning.test
incomplete.data[1:100, 1] = NA
incomplete.data[101:200, 2] = NA
incomplete.data[1:200, 5] = NA
structural.em(incomplete.data)

## Not run: 
# learn with a latent variable.
incomplete.data = learning.test
incomplete.data[seq(nrow(incomplete.data)), 1] = NA
start = bn.fit(empty.graph(names(learning.test)), learning.test)
wl = data.frame(from = c("A", "A"), to = c("B", "D"))
structural.em(incomplete.data, start = start,
  maximize.args = list(whitelist = wl))

## End(Not run)