Comparing Bayesian networks according to their respective scores
Computing a network score
Let's first load the gaussian.test data set.
> library(bnlearn) > data(gaussian.test)
The correct network structure of the data, described in the manual page, is the following:
> res = empty.graph(names(gaussian.test))
> modelstring(res) = "[A][B][E][G][C|A:B][D|B][F|A:D:E:G]"
> res
Randomly generated Bayesian network
model:
[A][B][E][G][C|A:B][D|B][F|A:D:E:G]
nodes: 7
arcs: 7
undirected arcs: 0
directed arcs: 7
average markov blanket size: 4.00
average neighbourhood size: 2.00
average branching factor: 1.00
generation algorithm: empty
and it's correctly learned by all the learning algorithms implemented in bnlearn.
Its BIC score can be computed with the score function:
> score(res, gaussian.test, type = "bic-g") [1] -53191.54
Other score functions can be used by changing the type, as documented in the
manual page of the function.
Testing score equivalence
Arcs whose direction does not influence the v-structures present in the network structure are said to be score equivalent, because their reversal does not alter the score of the network (with the notbale exception of the K2 score function). Usually these arcs are not oriented in the networks learned with constraint-based algorithms.
In gaussian.test the only score equivalent arc is D -> B,
as shown below.
> res = set.arc(res, "D", "B") > score(res, gaussian.test, type = "bic-g") [1] -53191.54 > res = set.arc(res, "B", "D") > score(res, gaussian.test, type = "bic-g") [1] -53191.54
Changing the direction of any other arc alters the score of the network; for example
reversing B -> C lowers the BIC by 6788.81.
> res = set.arc(res, "B", "C") > score(res, gaussian.test, type = "bic-g") [1] -53191.54 > res = set.arc(res, "C", "B") > score(res, gaussian.test, type = "bic-g") [1] -59980.35
The debugging output of the choose.direction function reaches the same conclusions;
in the former case the network score is not altered, while in the latter the direction B -> C
results in a higher network score.
> choose.direction(res, data = gaussian.test, c("B", "D"), debug = TRUE, criterion="bic-g")
* testing B - D for direction.
> initial score for node B is -12648.04 .
> initial score for node D is -14683.05 .
> score delta for arc B -> D is 13144.39 .
> score delta for arc D -> B is 13144.39 .
@ nothing to do, same score delta.
> choose.direction(res, data = gaussian.test, c("B", "C"), debug = TRUE, criterion="bic-g")
* testing B - C for direction.
> initial score for node B is -12648.04 .
> initial score for node C is -16131.65 .
> score delta for arc B -> C is 12402.44 .
> score delta for arc C -> B is 5613.624 .
@ arc B -> C is better .
[...]