bn.fit {bnlearn} R Documentation

Fit the parameters of a Bayesian network

Description

Fit, assign or replace the parameters of a Bayesian network conditional on its structure.

Usage

bn.fit(x, data, cluster, method, ..., keep.fitted = TRUE,
  debug = FALSE)
custom.fit(x, dist, ordinal, debug = FALSE)
bn.net(x)

Arguments

x

an object of class bn (for bn.fit() and custom.fit()) or an object of class bn.fit (for bn.net).

data

a data frame containing the variables in the model.

cluster

an optional cluster object from package parallel.

dist

a named list, with element for each node of x. See below.

method

a character string, see below for details.

...

additional arguments for the parameter estimation procedure, see below.

ordinal

a vector of character strings, the labels of the discrete nodes which should be saved as ordinal random variables (bn.fit.onode) instead of unordered factors (bn.fit.dnode).

keep.fitted

a boolean value. If TRUE, the object returned by bn.fit will contain fitted values and residuals for all Gaussian and conditional Gaussian nodes, and the configurations of the discrete parents for conditional Gaussian nodes.

debug

a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.

Details

bn.fit() fits the parameters of a Bayesian network given its structure and a data set; bn.net returns the structure underlying a fitted Bayesian network.

bn.fit() accepts data with missing values encoded as NA. If the parameter estimation method was not specifically designed to deal with incomplete data, bn.fit() uses locally complete observations to fit the parameters of each local distribution.

Available methods for discrete Bayesian networks are:

  • mle: the maximum likelihood estimator for conditional probabilities.

  • bayes: the classic Bayesian posterior estimator with a uniform prior matching that in the Bayesian Dirichlet equivalent (bde) score.

  • hdir: the hierarchical Dirichlet posterior estimator for related data sets from Azzimonti, Corani and Zaffalon (2019).

  • hard-em: the Expectation-Maximization implementation of the estimators above.

Available methods for hybrid Bayesian networks are:

  • mle-g: the maximum likelihood estimator for least squares regression models.

  • hard-em-g: the Expectation-Maximization implementation of the estimators above.

Available methods for discrete Bayesian networks are:

  • mle-cg: a combination of the maximum likelihood estimators mle and mle-g.

  • hard-em-cg: the Expectation-Maximization implementation of the estimators above.

Additional arguments for the bn.fit() function:

  • iss: a numeric value, the imaginary sample size used by the bayes method to estimate the conditional probability tables associated with discrete nodes (see score for details).

  • replace.unidentifiable: a boolean value. If TRUE and method one of mle, mle-g or mle-cg, unidentifiable parameters are replaced by zeroes (in the case of regression coefficients and standard errors in Gaussian and conditional Gaussian nodes) or by uniform conditional probabilities (in discrete nodes).

    If FALSE (the default), the conditional probabilities in the local distributions of discrete nodes have a mximum likelihood estimate of NaN for all parents configurations that are not observed in data. Similarly, regression coefficients are set to NA if the linear regressions correspoding to the local distributions of continuous nodes are singular. Such missing values propagate to the results of functions such as predict().

  • alpha0: a positive number, the amount of information pooling between the related data sets in the hdir estimator.

  • group: a character string, the label of the node with the grouping of the observations into the related data sets in the hdir estimator.

  • impute and impute.args: a character string, the label of the imputation method (and its arguments) used by hard-em, hard-em-g and hard-em-cg to complete the data in the expectation step. The default method is the same as for impute().

  • fit and fit.args: a character string, the label of the parameter estimation method used by hard-em, hard-em-g and hard-em-cg to estimate the parameters in the maximization step. The default method is the same as for bn.fit().

  • loglik.threshold: a non-negative numeric value, the minimum improvement threshold to continue iterating in hard-em, hard-em-g and hard-em-cg. The threshold is defined as the relative likelihood improvement divided by the sample size of data, and defaults to 1e-3. Setting it to zero means that iterations only stop in case of negative improvement, which can happen due to stochastic noise if the imputation of the missing data uses approximate inference.

  • params.threshold: a non-negative numeric value, the minimum maximum elative change in the parameter values to continue iterating in hard-em, hard-em-g and hard-em-cg. The threshold is defined as the maximum of the differences between parameter values divided scaled by the parameter value in the lastest iteration. The default value is 1e-3.

  • max.iter: a positive integer value, the maximum number of iterations in hard-em, hard-em-g and hard-em-cg. The default value is 5.

  • start: a bn.fit object, the fitted network used to initialize the hard-em, hard-em-g and hard-em-cg estimators. The default is to use the bn.fit object obtained from x with the default parameter estimator for the data, which will use locally complete data to fit the local distributions.

  • newdata: a data frame, a separate set of data used to assess the convergence of the hard-em, hard-em-g and hard-em-cg estimators. The data in data are used by default for this purpose.

An in-place replacement method is available to change the parameters of each node in a bn.fit object; see the examples for discrete, continuous and hybrid networks below. For a discrete node (class bn.fit.dnode or bn.fit.onode), the new parameters must be in a table object. For a Gaussian node (class bn.fit.gnode), the new parameters can be defined either by an lm, glm or pensim object (the latter is from the penalized package) or in a list with elements named coef, sd and optionally fitted and resid. For a conditional Gaussian node (class bn.fit.cgnode), the new parameters can be defined by a list with elements named coef, sd and optionally fitted, resid and configs. In both cases coef should contain the new regression coefficients, sd the standard deviation of the residuals, fitted the fitted values and resid the residuals. configs should contain the configurations if the discrete parents of the conditional Gaussian node, stored as a factor.

custom.fit() takes a set of user-specified distributions and their parameters and uses them to build a bn.fit object. Its purpose is to specify a Bayesian network (complete with the parameters, not only the structure) using knowledge from experts in the field instead of learning it from a data set. The distributions must be passed to the function in a list, with elements named after the nodes of the network structure x. Each element of the list must be in one of the formats described above for in-place replacement.

Value

bn.fit() and custom.fit()returns an object of class bn.fit, bn.net() an object of class bn. See bn class and bn.fit class for details.

Note

Due to the way Bayesian networks are defined it is possible to estimate their parameters only if the network structure is completely directed (i.e. there are no undirected arcs). See set.arc and cextend for two ways of manually setting the direction of one or more arcs.

In the case of maximum likelihood estimators, bn.fit() produces NA parameter estimates for discrete and conditional Gaussian nodes when there are (discrete) parents configurations that are not observed in data. To avoid this either set replace.unidentifiable to TRUE or, in the case of discrete networks, use method = "bayes".

Author(s)

Marco Scutari

References

Azzimonti L, Corani G, Zaffalon M (2019). "Hierarchical Estimation of Parameters in Bayesian Networks". Computational Statistics & Data Analysis, 137:67–91.

See Also

bn.fit utilities, bn.fit plots.

Examples

data(learning.test)

# learn the network structure.
cpdag = pc.stable(learning.test)
# set the direction of the only undirected arc, A - B.
dag = set.arc(cpdag, "A", "B")
# estimate the parameters of the Bayesian network.
fitted = bn.fit(dag, learning.test)
# replace the parameters of the node B.
new.cpt = matrix(c(0.1, 0.2, 0.3, 0.2, 0.5, 0.6, 0.7, 0.3, 0.1),
            byrow = TRUE, ncol = 3,
            dimnames = list(B = c("a", "b", "c"), A = c("a", "b", "c")))
fitted$B = as.table(new.cpt)
# the network structure is still the same.
all.equal(dag, bn.net(fitted))

# learn the network structure.
dag = hc(gaussian.test)
# estimate the parameters of the Bayesian network.
fitted = bn.fit(dag, gaussian.test)
# replace the parameters of the node F.
fitted$F = list(coef = c(1, 2, 3, 4, 5), sd = 3)
# set again the original parameters
fitted$F = lm(F ~ A + D + E + G, data = gaussian.test)

# discrete Bayesian network from expert knowledge.
dag = model2network("[A][B][C|A:B]")
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
cptB = matrix(c(0.8, 0.2), ncol = 2, dimnames = list(NULL, c("GOOD", "BAD")))
cptC = c(0.5, 0.5, 0.4, 0.6, 0.3, 0.7, 0.2, 0.8)
dim(cptC) = c(2, 2, 2)
dimnames(cptC) = list("C" = c("TRUE", "FALSE"), "A" =  c("LOW", "HIGH"),
                   "B" = c("GOOD", "BAD"))
cfit = custom.fit(dag, dist = list(A = cptA, B = cptB, C = cptC))
# for ordinal nodes it is nearly the same.
cfit = custom.fit(dag, dist = list(A = cptA, B = cptB, C = cptC),
         ordinal = c("A", "B"))

# Gaussian Bayesian network from expert knowledge.
distA = list(coef = c("(Intercept)" = 2), sd = 1)
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = c("(Intercept)" = 0.5, "A" = 0.75, "B" = 1.32), sd = 0.4)
cfit = custom.fit(dag, dist = list(A = distA, B = distB, C = distC))

# conditional Gaussian Bayesian network from expert knowledge.
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = matrix(c(1.2, 2.3, 3.4, 4.5), ncol = 2,
               dimnames = list(c("(Intercept)", "B"), NULL)),
          sd = c(0.3, 0.6))
cgfit = custom.fit(dag, dist = list(A = cptA, B = distB, C = distC))