Plotting networks and marginal distributions with the Rgraphviz package

Commercial and free software suites implementing Bayesian network modelling typically display a Bayesian network as:

  1. a nicely laid-out graph, with nodes positioned according to the topological ordering of the network (root nodes on top, leaves at the bottom);
  2. marginal distributions summarized with key parameters in each node.

The aim of this kind of plot is to summarize the structure and the parameters of the Bayesian network in a single figure that can also be used to compare the effects of inference (incorporating evidence, performing interventions, counterfactuals, etc.). We can produce this kind of plot in bnlearn using the graphviz.chart() function (documented here). In its simplest form, it takes an object of class bn.fit and produces a greyscale barplot.

Discrete networks

All nodes in discrete Bayesian networks are categorical or ordinal: the bars (or the lines) in each box represent the marginal probabilities of all the values taken by the corresponding node.

> library(bnlearn)
> dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")
> fitted = bn.fit(dag, learning.test)
> graphviz.chart(fitted)
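To make concrete what the bars summarize, the following minimal sketch computes the marginal distribution of node D by hand from the fitted conditional probability tables. It assumes that the dimensions of the table returned by coef(fitted$D) are ordered as D, A, C (the node first, then its parents); the names p.A, p.C, cpt.D, p.AC and p.D are introduced here only for illustration.

```r
library(bnlearn)

# Same network as in the text; learning.test ships with bnlearn.
dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")
fitted = bn.fit(dag, learning.test)

# Conditional probability tables: A and C are root nodes, D has parents A and C.
p.A = coef(fitted$A)      # P(A)
p.C = coef(fitted$C)      # P(C)
cpt.D = coef(fitted$D)    # P(D | A, C), assumed to have dimensions D x A x C

# A and C are both root nodes, hence marginally independent in this DAG:
# P(A, C) = P(A) P(C).
p.AC = outer(as.vector(p.A), as.vector(p.C))

# Marginalise: P(D = d) is the sum over (a, c) of P(d | a, c) P(a) P(c).
p.D = apply(cpt.D, 1, function(slice) sum(slice * p.AC))
print(round(p.D, 3))
```

The resulting probabilities sum to one and should correspond to the bars drawn in the box for D; the same reasoning applies to the other nodes, with more terms in the sum for nodes further down the topological ordering.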

graphviz.chart() has several groups of options to customize how the plot looks:

  1. the layout of the graph, as in graphviz.plot() (see some examples);
  2. the type of plot to use for the marginal distributions: "barchart", "dotplot" or "barprob";
  3. whether to display the labels of the values of each variable (draw.levels), a grid of reference values (grid), and the aspect ratio of the node (scale);
  4. various colours like col (node frame colour), bg (node background colour), text.col (colour of the labels, including the node label), bar.col (colour of the bars in a barchart, of the lines in a dotplot), and strip.bg (the background colour of the strip containing the node label).
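As a rough illustration of how these groups of options combine, the sketch below collects one or more arguments from each group in a list and passes them to graphviz.chart() in a single call. The colour values are arbitrary choices and chart.args is a name introduced here. The guard around the plotting call assumes that Rgraphviz and gRain are the packages needed for drawing the chart; adjust it if your installation differs.

```r
library(bnlearn)

dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")
fitted = bn.fit(dag, learning.test)

# One or more options from each group listed above; the values are arbitrary.
chart.args = list(
  layout = "dot",            # graph layout, as in graphviz.plot()
  type = "barprob",          # bars annotated with their probabilities
  grid = TRUE,               # reference grid at the default locations
  col = "darkblue",          # node frame colour
  bg = "azure",              # node background colour
  text.col = "darkblue",     # colour of the labels
  bar.col = "steelblue",     # colour of the bars
  strip.bg = "lightyellow"   # background of the strip with the node label
)

# Only plot if the packages assumed to be required are installed.
if (requireNamespace("Rgraphviz", quietly = TRUE) &&
    requireNamespace("gRain", quietly = TRUE))
  do.call(graphviz.chart, c(list(fitted), chart.args))
```

Keeping the arguments in a list makes it easy to reuse the same formatting across several charts, for instance when comparing a network before and after incorporating evidence.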

By default, graphviz.chart() produces barcharts like the one shown above. We can overlay the probabilities represented by the bars on the barcharts by changing the type of the plot to type = "barprob".

> graphviz.chart(fitted, type = "barprob")

Changing the type from type = "barchart" to type = "dotplot" replaces the bars with lines and points resembling the plots produced by dotplot in the lattice package.

> graphviz.chart(fitted, type = "dotplot")

We can also add a grid, either by setting grid = TRUE, in which case grid lines are placed at c(0, 0.25, 0.50, 0.75), or by specifying their locations manually.

> graphviz.chart(fitted, type = "dotplot", grid = c(0, 0.1, 0.2, 0.4, 0.8))

Finally, we can add some colour to the plot with the arguments mentioned in the last bullet point above. Note that colour settings apply to all nodes: it is currently not possible to use a different set of colours for a particular node.

> graphviz.chart(fitted, type = "barprob", grid = TRUE, bar.col = "darkgreen",
+   strip.bg = "lightskyblue")

Continuous networks

In the case of Gaussian networks, all nodes have local distributions that can be represented as linear regression models: the bars (or the lines) represent the regression coefficients associated with the parents of each node.

> dag = model2network("[A][B][E][G][C|A:B][D|B][F|A:D:E:G]")
> fitted = bn.fit(dag, gaussian.test)
> graphviz.chart(fitted, type = "barprob", grid = TRUE, bar.col = "darkgreen",
+   strip.bg = "lightskyblue")
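The quantities behind the bars can also be inspected directly. The sketch below extracts, for each node, the regression coefficients associated with its parents, leaving out the intercept on the assumption that only the parents' coefficients are of interest here. The name parent.coefs is introduced purely for illustration.

```r
library(bnlearn)

dag = model2network("[A][B][E][G][C|A:B][D|B][F|A:D:E:G]")
fitted = bn.fit(dag, gaussian.test)

# For each node, keep the coefficients named after its parents; root nodes
# have no parents and therefore yield an empty vector.
parent.coefs = lapply(nodes(fitted), function(node) {
  coef(fitted[[node]])[fitted[[node]]$parents]
})
names(parent.coefs) = nodes(fitted)

# Coefficients of the local distribution of F, which has four parents.
print(parent.coefs$F)
```

Comparing these values with the chart is a quick way to check which bar belongs to which parent.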

Since regression coefficients can be negative, the default grid is set to c(0, range[[node]] / 2, range[[node]]) when there are both positive and negative coefficients. If the coefficients are either all positive or all negative, the grid is set to c(0, range[[node]], mean(range[[node]])). In either case, the baseline at zero is drawn. After flipping the signs of some of the coefficients in the plot above, the effect is as follows.

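> for (node in nodes(fitted)) {
+   # multiply the coefficients alternately by -1 and 1, starting with the
+   # intercept, so that every other coefficient changes sign.
+   coefs = coef(fitted[[node]])
+   coefs = coefs * rep_len(c(-1, 1), length(coefs))
+   # replace the local distribution, keeping the residual standard deviation.
+   fitted[[node]] = list(coef = coefs, sd = sigma(fitted[[node]]))
+ }#FOR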
> graphviz.chart(fitted, type = "barprob", grid = TRUE, bar.col = "darkgreen",
+   strip.bg = "lightskyblue")

All the graphical formatting options discussed above for discrete networks apply to continuous networks as well.

Hybrid networks

In the case of conditional Gaussian networks, the meaning of the bars (or the lines) depends on the type of node. For discrete nodes, they represent conditional probabilities. For continuous nodes with no discrete parents, they represent the regression coefficients associated with the parents. For continuous nodes with discrete parents, the bars (or the lines) in the plot represent the regression coefficients of the continuous parents in the local distribution averaged over all the configurations of the discrete parents.

> dag = model2network("[A][B][C][H][D|A:H][F|B:C][E|B:D][G|A:D:E:F]")
> fitted = bn.fit(dag, clgaussian.test)
> graphviz.chart(fitted, type = "barprob", grid = TRUE, bar.col = "darkgreen",
+   strip.bg = "lightskyblue")

All the graphical formatting options discussed above for discrete networks again apply.
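To illustrate the averaging described above for continuous nodes with discrete parents, the sketch below looks at node G, whose parents are A and F (discrete) and D and E (continuous). It assumes a plain unweighted average over the configurations of the discrete parents; the text above does not say whether graphviz.chart() weights the configurations, so treat the numbers as indicative only. The names coefs.G and avg.G are introduced here.

```r
library(bnlearn)

dag = model2network("[A][B][C][H][D|A:H][F|B:C][E|B:D][G|A:D:E:F]")
fitted = bn.fit(dag, clgaussian.test)

# For a continuous node with discrete parents, coef() returns a matrix with
# one row per regression coefficient and one column per configuration of the
# discrete parents (here A and F).
coefs.G = coef(fitted$G)
print(coefs.G)

# Unweighted average across configurations (an assumption, see above),
# restricted to the continuous parents D and E.
avg.G = rowMeans(coefs.G)
print(avg.G[c("D", "E")])
```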

Last updated on Sat Feb 17 23:47:20 2024 with bnlearn 5.0-20240208 and R version 4.3.2 (2023-10-31).