## Computing the log-likelihood of data for a Bayesian network

Computing the log-likelihood of some data for a given Bayesian network is a fundamental task in many inference tasks,
whether Bayesian (predictive/posterior log-likelihood) or not (log-likelihood loss in cross-validation).
**bnlearn** provides a `logik()`

method for `bn.fit`

objects. In its most basic
form, it takes just a fitted Bayesian network and the data frame containing the data and returns the log-likelihood
of the whole sample:

> library(bnlearn) > training.set = learning.test[1:4000, ] > test.set = learning.test[4001:5000, ] > dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]") > bn = bn.fit(dag, training.set) > logLik(bn, test.set)

[1] -4780.369

However, `logLik()`

can also compute the log-likelihood of the data for a subset of the nodes in the
network specified by the `nodes`

argument.

> logLik(bn, test.set, nodes = c("A", "C", "F"), debug = TRUE)

> computing the log-likelihood of a discrete network. * processing node A. > 1000 locally-complete observations out of 1000. > log-likelihood is -1098.737553. * processing node C. > 1000 locally-complete observations out of 1000. > log-likelihood is -704.801355. * processing node F. > 1000 locally-complete observations out of 1000. > log-likelihood is -693.493696.

[1] -2497.033

And it can return the log-likelihood for the individual observations in the data instead of summing it up over the
whole sample (for all the nodes in the network or the nodes specified in the `nodes`

argument).

> summary(logLik(bn, test.set, nodes = c("A", "C", "F"), by.sample = TRUE, debug = TRUE))

> computing the log-likelihood of a discrete network. * processing node A. * processing node C. * processing node F.

Min. 1st Qu. Median Mean 3rd Qu. Max. -4.784 -3.356 -2.102 -2.497 -2.080 -2.068

`logLik()`

returns `-Inf`

if the data have probability or density equal to zero, which
typically happens if the model is singular. For discrete nodes, if there are probabilities equal to zero in their
conditional probability tables for the values observed in the data:

> dag = model2network("[A][B|A]") > dists = list( + A = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH"))), + B = matrix(c(0.8, 0.2, 0, 1), ncol = 2, + dimnames = list(c("GOOD", "BAD"), c("LOW", "HIGH"))) + ) > bnD = custom.fit(dag, dists) > dataD = data.frame( + A = factor(c("LOW", "LOW", "HIGH", "HIGH"), levels = c("LOW", "HIGH")), + B = factor(c("GOOD", "BAD", "GOOD", "BAD"), c("GOOD", "BAD")) + ) > cbind(dataD, loglik = logLik(bnD, dataD, by.sample = TRUE))

A B loglik 1 LOW GOOD -1.1394343 2 LOW BAD -2.5257286 3 HIGH GOOD -Inf 4 HIGH BAD -0.5108256

For Gaussian and conditional Gaussian nodes, if their distribution is singular and it has a standard error equal to zero:

> dag = model2network("[A][B|A]") > dists = list( + A = list(coef = c("(Intercept)" = 2), sd = 1), + B = list(coef = c("(Intercept)" = 2, A = 3), sd = 0) + ) > bnG = custom.fit(dag, dists) > > dataG = data.frame(A = rnorm(4), B = rnorm(4)) > cbind(dataG, loglik = logLik(bnG, dataG, by.sample = TRUE))

A B loglik 1 -0.3049104 -1.804960 -Inf 2 -1.2893305 -0.435960 -Inf 3 -1.4712841 -1.046516 -Inf 4 1.9151273 1.736503 -Inf

However, `logLik()`

returns `+Inf`

for those obsevations that coincide with the expected value
of the singular distribution.

> spike = data.frame(A = 1, B = 2 + 3 * 1) > cbind(dataG, loglik = logLik(bnG, spike, by.sample = TRUE))

A B loglik 1 -0.3049104 -1.804960 Inf 2 -1.2893305 -0.435960 Inf 3 -1.4712841 -1.046516 Inf 4 1.9151273 1.736503 Inf

`logLik()`

returns `NA`

if any of the parameters of the Bayesian network that are involved in
the computation of the log-likelihood are equal to `NA`

, which may happen when they are estimated by maximum
likelihood.

> cl = class(bnD) > class(bnD) = "list" > bnD$B$prob[, "HIGH"] = NA > class(bnD) = cl > > cbind(dataD, loglik = logLik(bnD, dataD, by.sample = TRUE))

A B loglik 1 LOW GOOD -1.139434 2 LOW BAD -2.525729 3 HIGH GOOD NA 4 HIGH BAD NA

> cl = class(bnG) > class(bnG) = "list" > bnG$B$coefficients["A"] = NA > class(bnG) = cl > > cbind(dataG, loglik = logLik(bnG, dataG, by.sample = TRUE))

A B loglik 1 -0.3049104 -1.804960 NA 2 -1.2893305 -0.435960 NA 3 -1.4712841 -1.046516 NA 4 1.9151273 1.736503 NA

The behaviour of `logLik()`

for data with missing values is described
here.

`Thu Mar 9 17:50:16 2023`

with **bnlearn**

`4.9-20230309`

and `R version 4.2.2 (2022-10-31)`

.