| Index | Topics |
| bn.fit {bnlearn} | R Documentation |
Fit the parameters of a Bayesian network
Description
Fit, assign or replace the parameters of a Bayesian network conditional on its structure.
Usage
bn.fit(x, data, cluster, method, ..., keep.fitted = TRUE,
debug = FALSE)
custom.fit(x, dist, ordinal, debug = FALSE)
bn.net(x)
Arguments
x |
an object of class |
data |
a data frame containing the variables in the model. |
cluster |
an optional cluster object from package parallel. |
dist |
a named list, with element for each node of |
method |
a character string, see below for details. |
... |
additional arguments for the parameter estimation procedure, see below. |
ordinal |
a vector of character strings, the labels of the discrete nodes which should be saved as ordinal
random variables ( |
keep.fitted |
a boolean value. If |
debug |
a boolean value. If |
Details
bn.fit() fits the parameters of a Bayesian network given its structure and a data set;
bn.net returns the structure underlying a fitted Bayesian network.
bn.fit() accepts data with missing values encoded as NA. If the parameter
estimation method was not specifically designed to deal with incomplete data, bn.fit() uses
locally complete observations to fit the parameters of each local distribution.
Available methods for discrete Bayesian networks are:
-
"mle": the maximum likelihood estimator for conditional probabilities. -
"bayes": the classic Bayesian posterior estimator with a uniform prior matching that in the Bayesian Dirichlet equivalent ("bde") score. -
"hdir": the hierarchical Dirichlet posterior estimator for related data sets from Azzimonti, Corani and Zaffalon (2019). -
"hard-em": the Expectation-Maximimation implementation of the estimators above.
Available methods for Gaussian Bayesian networks are:
-
"mle-g": the maximum likelihood estimator for least squares regression models. -
"hard-em-g": the Expectation-Maximisation implementation of the estimators above.
Available methods for conditional Gaussian Bayesian networks are:
-
"mle-cg": a combination of the maximum likelihood estimators"mle"and"mle-g". -
"hard-em-cg": the Expectation-Maximisation implementation of the estimators above.
Available methods for zero-inflated Bayesian networks are:
-
"mle-zihp": the maximum likelihood estimator for zero-inflated hyper-Poisson regression models, computed by expectation-maximization with the generalised linear model components fitted by iteratively reweighted least squares. -
"mle-zinb": the maximum likelihood estimator for zero-inflated negative binomial regression models, computed by expectation-maximization with the generalised linear model components fitted by iteratively reweighted least squares.
Additional arguments for the bn.fit() function:
-
iss: a numeric value, the imaginary sample size used by the"bayes"method to estimate the conditional probability tables associated with discrete nodes (seescorefor details). -
replace.unidentifiable: a boolean value. IfTRUEandmethodis a maximum likelihood estimator, unidentifiable parameters are replaced by zeroes (in the case of regression coefficients and standard errors in Gaussian and conditional Gaussian nodes) or by uniform conditional probabilities (in discrete nodes).If
FALSE(the default), the conditional probabilities in the local distributions of discrete nodes have a maximum likelihood estimate ofNaNfor all parent configurations that are not observed indata. Similarly, regression coefficients are set toNAif the linear regressions correspoinding to the local distributions of continuous nodes are singular. Such missing values propagate to the results of functions such aspredict(). -
alpha0: a positive number, the amount of information pooling between the related data sets in the"hdir"estimator. -
group: a character string, the label of the node with the grouping of the observations into the related data sets in the"hdir"estimator. -
imputeandimpute.args: a character string, the label of the imputation method (and its arguments) used by"hard-em","hard-em-g"and"hard-em-cg"to complete the data in the expectation step. The default method is the same as forimpute(). -
fitandfit.args: a character string, the label of the parameter estimation method used by"hard-em","hard-em-g"and"hard-em-cg"to estimate the parameters in the maximisation step. The default method is the same as forbn.fit(). -
loglik.threshold: a non-negative numeric value, the minimum improvement threshold to continue iterating in"hard-em","hard-em-g"and"hard-em-cg". The threshold is defined as the relative likelihood improvement divided by the sample size ofdata, and defaults to1e-3. Setting it to zero means that iterations stop only when the improvement is negative, which can happen due to stochastic noise if the imputation of missing data uses approximate inference. -
params.threshold: a non-negative numeric value, the minimum maximum relative change in the parameter values to continue iterating in"hard-em","hard-em-g"and"hard-em-cg". The threshold is defined as the maximum difference between parameter values, divided by the parameter value in the latest iteration. The default value is1e-3. -
max.iter: a positive integer value, the maximum number of iterations in"hard-em","hard-em-g"and"hard-em-cg". The default value is5. -
start: abn.fitobject, the fitted network used to initialise the"hard-em","hard-em-g"and"hard-em-cg"estimators. The default is to use thebn.fitobject obtained fromxwith the default parameter estimator for the data, which will use locally complete data to fit the local distributions. -
newdata: a data frame, a separate set of data used to assess the convergence of the"hard-em","hard-em-g"and"hard-em-cg"estimators. The data indataare used by default for this purpose. -
em.max.iter: a positive integer value, the maximum number of expectation-maximization iterations in the"mle-zihp"and"mle-zinb"estimators. The default value is100. -
em.tol: a non-negative numeric value, the relative convergence tolerance (on both the log-likelihood and the parameters) of the expectation-maximization iteration in the"mle-zihp"and"mle-zinb"estimators. The default value is1e-8. -
m.step: a character string, either"full"(the default) to re-fit each generalised linear model component to convergence in every maximisation step, or"one-step"to take a single iteratively reweighted least squares step per iteration (a cheaper generalized-EM update) in the"mle-zihp"and"mle-zinb"estimators.
An in-place replacement method is available to change the parameters of each node in a
bn.fit object; see the examples for discrete, continuous and conditional Gaussian networks
below.
-
For a discrete node (class
bn.fit.dnodeorbn.fit.onode), the new parameters must be in atableobject. -
For a Gaussian node (class
bn.fit.gnode), the new parameters can be defined either by anlm,glmorpensimobject (the latter is from thepenalizedpackage) or in a list with elements namedcoef,sdand optionallyfittedandresid. -
For a conditional Gaussian node (class
bn.fit.cgnode), the new parameters can be defined by a list with elements namedcoef,sdand optionallyfitted,residandconfigs. In both casescoefshould contain the new regression coefficients,sdthe standard deviation of the residuals,fittedthe fitted values andresidthe residuals.configsshould contain the configurations of the discrete parents of the conditional Gaussian node, stored as a factor. -
For a zero-inflated hyper-Poisson node (class
bn.fit.zihpnode), the new parameters can be defined by a list with elements namedinflation,intensity,dispersionand optionallyfittedandresid. -
For a zero-inflated negative binomial node (class
bn.fit.zinbnode), the new parameters can be defined by a list with elements namedinflation,intensity,dispersionand optionallyfittedandresid.
custom.fit() takes a set of user-specified distributions and their parameters and uses them
to build a bn.fit object. Its purpose is to specify a Bayesian network (complete with
parameters, not just the structure) using expert knowledge rather than learning it from a data set. The
distributions must be passed to the function in a list, with elements named after the nodes of the network
structure x. Each element of the list must be in one of the formats described above for
in-place replacement.
Value
bn.fit() and custom.fit()returns an object of class bn.fit,
bn.net() an object of class bn. See bn
class and bn.fit class for details.
Note
Due to the way Bayesian networks are defined, it is possible to estimate their parameters only if the
network structure is completely directed (that is, there are no undirected arcs). See set.arc and cextend for two ways of
manually setting the direction of one or more arcs.
In the case of maximum likelihood estimators, bn.fit() may produce NA
parameter estimates. For instance, for discrete and conditional Gaussian nodes, this happens when there are
(discrete) parent configurations that are not observed in data. To avoid this, either set
replace.unidentifiable to TRUE or, in the case of discrete networks, use
method = "bayes".
Author(s)
Marco Scutari
References
Azzimonti L, Corani G, Zaffalon M (2019). "Hierarchical Estimation of Parameters in Bayesian Networks." Computational Statistics & Data Analysis, 137:67–91.
See Also
bn.fit utilities, bn.fit plots.
Examples
data(learning.test)
# learn the network structure.
cpdag = pc.stable(learning.test)
# set the direction of the only undirected arc, A - B.
dag = set.arc(cpdag, "A", "B")
# estimate the parameters of the Bayesian network.
fitted = bn.fit(dag, learning.test)
# replace the parameters of node B.
new.cpt = matrix(c(0.1, 0.2, 0.3, 0.2, 0.5, 0.6, 0.7, 0.3, 0.1),
byrow = TRUE, ncol = 3,
dimnames = list(B = c("a", "b", "c"), A = c("a", "b", "c")))
fitted$B = as.table(new.cpt)
# the network structure is still the same.
all.equal(dag, bn.net(fitted))
# learn the network structure.
dag = hc(gaussian.test)
# estimate the parameters of the Bayesian network.
fitted = bn.fit(dag, gaussian.test)
# replace the parameters of the node F.
fitted$F = list(coef = c(1, 2, 3, 4, 5), sd = 3)
# set the original parameters again.
fitted$F = lm(F ~ A + D + E + G, data = gaussian.test)
# discrete Bayesian network from expert knowledge.
dag = model2network("[A][B][C|A:B]")
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
cptB = matrix(c(0.8, 0.2), ncol = 2, dimnames = list(NULL, c("GOOD", "BAD")))
cptC = c(0.5, 0.5, 0.4, 0.6, 0.3, 0.7, 0.2, 0.8)
dim(cptC) = c(2, 2, 2)
dimnames(cptC) = list("C" = c("TRUE", "FALSE"), "A" = c("LOW", "HIGH"),
"B" = c("GOOD", "BAD"))
cfit = custom.fit(dag, dist = list(A = cptA, B = cptB, C = cptC))
# for ordinal nodes, it is nearly the same.
cfit = custom.fit(dag, dist = list(A = cptA, B = cptB, C = cptC),
ordinal = c("A", "B"))
# Gaussian Bayesian network from expert knowledge.
distA = list(coef = c("(Intercept)" = 2), sd = 1)
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = c("(Intercept)" = 0.5, "A" = 0.75, "B" = 1.32), sd = 0.4)
cfit = custom.fit(dag, dist = list(A = distA, B = distB, C = distC))
# conditional Gaussian Bayesian network from expert knowledge.
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = matrix(c(1.2, 2.3, 3.4, 4.5), ncol = 2,
dimnames = list(c("(Intercept)", "B"), NULL)),
sd = c(0.3, 0.6))
cgfit = custom.fit(dag, dist = list(A = cptA, B = distB, C = distC))
# zero-inflated hyper-Poisson nodes from expert knowledge.
dag = model2network("[A][B|A][C|A:B]")
ldistA = list(inflation = c("(Intercept)" = 0),
intensity = c("(Intercept)" = 0),
dispersion = 1)
ldistB = list(inflation = c("(Intercept)" = 0, A = 1),
intensity = c("(Intercept)" = 0, A = 1),
dispersion = 0.5)
ldistC = list(inflation = c("(Intercept)" = 0, A = 0.5, B = 0.5),
intensity = c("(Intercept)" = 0, A = 0.4, B = 0.6),
dispersion = 0.25)
zifit = custom.fit(dag, list(A = ldistA, B = ldistB, C = ldistC))
# zero-inflated negative binomial nodes from expert knowledge.
ldistA = list(inflation = c("(Intercept)" = 0),
prsucc = c("(Intercept)" = 0),
failures = 1)
ldistB = list(inflation = c("(Intercept)" = 0, A = 1),
prsucc = c("(Intercept)" = 0, A = 1),
failures = 0.5)
ldistC = list(inflation = c("(Intercept)" = 0, A = 0.5, B = 0.5),
prsucc = c("(Intercept)" = 0, A = 0.4, B = 0.6),
failures = 0.25)
zifit = custom.fit(dag, list(A = ldistA, B = ldistB, C = ldistC))
| Index | Topics |