cpquery {bnlearn}  R Documentation 
Perform conditional probability queries
Description
Perform conditional probability queries (CPQs).
Usage
cpquery(fitted, event, evidence, cluster = NULL, method = "ls", ..., debug = FALSE)
cpdist(fitted, nodes, evidence, cluster = NULL, method = "ls", ..., debug = FALSE)
mutilated(x, evidence)
Arguments
fitted 
an object of class 
x 
an object of class 
event, evidence 
see below. 
nodes 
a vector of character strings, the labels of the nodes whose conditional distribution we are interested in. 
cluster 
an optional cluster object from package parallel. 
method 
a character string, the method used to perform the conditional probability query. Currently only logic sampling ("ls", the default) and likelihood weighting ("lw") are implemented.
... 
additional tuning parameters. 
debug 
a boolean value. If 
Details
cpquery estimates the conditional probability of event given evidence using the method specified in the method argument.
cpdist generates random samples conditional on the evidence using the method specified in the method argument.
mutilated constructs the mutilated network arising from an ideal intervention setting the nodes involved to the values specified by evidence. In this case evidence must be provided as a list in the same format as for likelihood weighting (see below).
Note that both cpquery and cpdist are based on Monte Carlo particle filters, and therefore they may return slightly different values on different runs.
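The run-to-run variability mentioned above can be illustrated with a short sketch. It assumes the learning.test data set shipped with bnlearn and that the sampling draws on R's own random number generator when no cluster is used, so that set.seed() makes a call repeatable; the seed value and the variable names are arbitrary.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# two consecutive runs generally give slightly different estimates.
p1 = cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
p2 = cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
c(p1, p2)

# setting the same seed before each call should make the estimate repeatable
# (assumption: no cluster is used, so R's own RNG drives the sampling).
set.seed(42)
q1 = cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
set.seed(42)
q2 = cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
identical(q1, q2)
```

Increasing the n tuning parameter described below is the usual way to reduce the variability between runs, at the cost of a longer running time.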
Value
cpquery() returns a numeric value, the conditional probability of event conditional on evidence.
cpdist() returns a data frame containing the samples generated from the conditional distribution of the nodes conditional on evidence. The data frame has class c("bn.cpdist", "data.frame"), and a method attribute storing the value of the method argument. In the case of likelihood weighting, the weights are also attached as an attribute called weights.
mutilated() returns a bn or bn.fit object, depending on the class of x.
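As a rough illustration of the return values described above, the sketch below inspects the result of each function. It assumes the learning.test data set shipped with bnlearn; the choice of nodes and values is arbitrary, and the exact class vectors printed may differ between bnlearn versions.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# cpquery() returns a single number.
p = cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
str(p)

# cpdist() returns a data frame with one column per requested node.
samples = cpdist(fitted, nodes = "A", evidence = (C == "c"))
class(samples)
attr(samples, "method")

# mutilated() returns an object of the same kind as its first argument.
mut.fit = mutilated(fitted, evidence = list(B = "b"))
mut.net = mutilated(bn.net(fitted), evidence = list(B = "b"))
class(mut.fit)
class(mut.net)

# the node that was intervened on has no parents in the mutilated network.
parents(mut.fit, "B")
```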
Logic Sampling
The event and evidence arguments must be two expressions describing the event of interest and the conditioning evidence in a format such that, if we denote with data the data set the network was learned from, data[evidence, ] and data[event, ] return the correct observations. If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.
Three tuning parameters are available:

n: a positive integer number, the number of random samples to generate from fitted. The default value is 5000 * log10(nparams(fitted)) for discrete and conditional Gaussian networks and 500 * nparams(fitted) for Gaussian networks.

batch: a positive integer number, the number of random samples that are generated at one time. Defaults to 10^4. If n is very large (e.g. 10^12), R would run out of memory if it tried to generate them all at once. Instead random samples are generated in batches of size batch, discarding each batch before generating the next.

query.nodes: a vector of character strings, the labels of the nodes involved in event and evidence. Simple queries do not require generating samples from all the nodes in the network, so cpquery and cpdist try to identify which nodes are used in event and evidence and reduce the network to their upper closure. query.nodes may be used to manually specify these nodes when automatic identification fails; there is no reason to use it otherwise.
Note that the number of samples returned by cpdist() is at most n, and usually smaller, because logic sampling is a form of rejection sampling: only the observations matching evidence (out of the n that are generated) are returned, and their number depends on the probability of evidence.
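The effect of rejection sampling on the number of returned samples can be seen in a small sketch. It assumes the learning.test data set shipped with bnlearn; the value chosen for n and the variable names are arbitrary.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

n = 10^4

# only the samples matching the evidence are kept, so fewer than n rows
# are typically returned.
samples = cpdist(fitted, nodes = "A", evidence = (C == "c"), n = n)
nrow(samples)

# the fraction of retained samples is itself an estimate of the probability
# of the evidence, and should be close to the unconditional query below.
nrow(samples) / n
cpquery(fitted, event = (C == "c"), evidence = TRUE, n = n)
```

When the evidence has very low probability, most generated samples are discarded, and a larger n is needed to retain a useful number of observations.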
Likelihood Weighting
The event argument must be an expression describing the event of interest, as in logic sampling. The evidence argument must be a named list:

Each element corresponds to one node in the network and must contain the value that node will be set to when sampling.

In the case of a continuous node, two values can also be provided. In that case, the value for that node will be sampled from a uniform distribution on the interval delimited by the specified values.

In the case of a discrete or ordinal node, two or more values can also be provided. In that case, the value for that node will be sampled with uniform probability from the set of specified values.
If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.
Tuning parameters are the same as for logic sampling: n, batch and query.nodes.
Note that the samples returned by cpdist() are generated from the mutilated network, and need to be weighted appropriately when computing summary statistics (for more details, see the references below). cpquery() does that automatically when computing the final conditional probability. Also note that the batch argument is ignored in cpdist() for speed and memory efficiency.
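A minimal sketch of how the weights might be used follows. It assumes the learning.test data set shipped with bnlearn and that the weights attribute holds one weight per returned sample; the variable names are arbitrary, and the weighted proportion computed by hand is only an illustration of the idea, not necessarily the exact computation cpquery() performs internally.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# with likelihood weighting the evidence is a named list, not an expression.
p.lw = cpquery(fitted, event = (B == "b"), evidence = list(A = "a"),
         method = "lw")

# cpdist() returns unweighted samples from the mutilated network, with the
# weights attached as an attribute.
samples = cpdist(fitted, nodes = "B", evidence = list(A = "a"), method = "lw")
w = attr(samples, "weights")

# weighted proportion of samples in which the event occurs (illustrative).
p.manual = sum(w[samples$B == "b"]) / sum(w)

c(p.lw, p.manual)
```

The two estimates come from separate random samples, so they are expected to be close but not identical.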
Author(s)
Marco Scutari
References
Koller D, Friedman N (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
Korb K, Nicholson AE (2010). Bayesian Artificial Intelligence. Chapman & Hall/CRC, 2nd edition.
Examples
## discrete Bayesian network (it is the same with ordinal nodes).
data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)
# the result should be around 0.025.
cpquery(fitted, (B == "b"), (A == "a"))
# programmatically build a conditional probability query...
var = names(learning.test)
obs = 2
# evidence: the values of all the variables except the third one (C).
str = paste("(", names(learning.test)[-3], " == '",
        sapply(learning.test[obs, -3], as.character), "')",
        sep = "", collapse = " & ")
str
# event: the value of the third variable (C).
str2 = paste("(", names(learning.test)[3], " == '",
         as.character(learning.test[obs, 3]), "')", sep = "")
str2
cmd = paste("cpquery(fitted, ", str2, ", ", str, ")", sep = "")
eval(parse(text = cmd))
# ... but note that predict works better in this particular case.
attr(predict(fitted, "C", learning.test[obs, -3], prob = TRUE), "prob")
# do the same with likelihood weighting.
cpquery(fitted, event = eval(parse(text = str2)),
  evidence = as.list(learning.test[2, -3]), method = "lw")
attr(predict(fitted, "C", learning.test[obs, -3],
  method = "bayes-lw", prob = TRUE), "prob")
# conditional distribution of A given C == "c".
table(cpdist(fitted, "A", (C == "c")))

## Gaussian Bayesian network.
data(gaussian.test)
fitted = bn.fit(hc(gaussian.test), gaussian.test)
# the result should be around 0.04.
cpquery(fitted,
  event = ((A >= 0) & (A <= 1)) & ((B >= 0) & (B <= 3)),
  evidence = (C + D < 10))

## ideal interventions and mutilated networks.
mutilated(fitted, evidence = list(F = 42))