Interventions and counterfactuals
The fundamental techniques of Judea's Pearl framework for causality are interventions and counterfactuals, which operate on Bayesian networks through their structural causal model representation. Both have implementations in bnlearn, basic but under active development (documented here).
Consider the following Bayesian network.
> bn = model2network("[X1][X3][X5][X6|X8][X2|X1][X7|X5][X4|X1:X2][X8|X3:X7][X9|X2:X7][X10|X1:X9]") > graphviz.plot(bn)
Interventions
Interventions alter the graphical structure of the network by removing the arcs incoming to the nodes we choose to
intervene upon. The resulting netwoks is also known as the mutilated network. The intervention()
function takes a bn
or bn.fit
object; and a named list containing the nodes we intervene on
and the values we fix them to as arguments.
> mutilated = intervention(bn, evidence = list(X9 = 6.9, X6 = 4.2)) > par(mfrow = c(1, 2)) > graphviz.compare(bn, mutilated, main = c("ORIGINAL", "MUTILATED"))
This corresponds to a hard intervention. Soft interventions are not supported by intervention()
, but can
be implemented by manually changing the network structure (see some examples here.)
The values of the variables listed in the evidence
argument are ignored when the network is a
bn
object. On the other hand, if the network is a bn.fit
object the intervention nodes are
also set to the specified values. In Gaussian and conditional Gaussian nodes, the local distribution will become a
simple regression model with a standard error of zero.
> data = as.data.frame(matrix(rnorm(100), nrow = 10, ncol = 10, + dimnames = list(NULL, nodes(bn)))) > fitted = bn.fit(bn, data) > mutilated = intervention(fitted, evidence = list(X9 = 6.9, X6 = 4.2)) > mutilated$X6
Parameters of node X6 (Gaussian distribution) Conditional density: X6 Coefficients: (Intercept) 4.2 Standard deviation of the residuals: 0
In discrete nodes, the local distribution will be a marginal distribution assigning a probability of 1 to the specified value.
> data = discretize(data, breaks = 3) > for (var in colnames(data)) + levels(data[, var]) = letters[1:3] > fitted = bn.fit(bn, data) > mutilated = intervention(fitted, evidence = list(X9 = "c", X6 = "b")) > mutilated$X9
Parameters of node X9 (multinomial distribution) Conditional probability table: a b c 0 0 1
The function mutilated
is an alias of intervention
, retained for backward compatibility.
Counterfactuals
Unlike intervention()
, conterfactuals operate on the twin network constructed from the Bayesian
network and thus explicitly require to convert it into a structural causal model. The main difference between a Bayesian
network and a structural causal model is that all nodes have a local distribution in the former but not in the latter.
Instead, a structural causal model splits each node into two: one with the exhogenous noise and one with a deterministic
functional relationship between the node and its parents.
The twin()
creates the twin network, in which each node from the Bayesian network is duplicated into
its factual and counterfactual copies. The exhogenous noise is separated and connected to both as a
parent.
> twn = twin(bn) > graphviz.plot(twn)
The returned twin network has class bn, bn.twin
. For node X
in the original Bayesian
network, it contains the node X
itself, its counterfactual copy X.
and the exhogenous noise
uX
linke to both.
A counterfactual()
is essentially an intervention on the counterfactual copies of the
evidence
nodes in the twin network. For instance, intervening on node X8
removes all the arcs
incoming on X8.
, including that from the noise node uX8
.
> ctf = counterfactual(bn, evidence = list(X8 = 4.2)) > graphviz.plot(ctf)
Comparing ctf
with the twin network twn
, we can see the missing arc uX8
→ X8.
. We can also see that ctf
has fewer nodes. By default,
counterfactual()
merges factual and counterfactual nodes with the same distribution, and removes the
associated noise nodes. This facilitates interpretation and significantly reduces the computational burden of exact
and approximate inference.
> par(mfrow = c(1, 2)) > graphviz.compare(twn, ctf)
Passing merging = FALSE
to counterfactual()
will make it skip the node merging.
Tue Aug 5 22:03:12 2025
with bnlearn
5.1
and R version 4.5.0 (2025-04-11)
.