Bayesian Networks in R

with Applications in Systems Biology

R. Nagarajan, M. Scutari and S. Lèbre (2013).
Use R!, Vol. 48, Springer (US).
ISBN-10: 1461464455
ISBN-13: 978-1461464457
Springer Website
Amazon Website

Errata Corrige

page 3: “if a node v_i precedes v_j, there can be no arc from v_j to v_i” should be “if a node v_i precedes v_j, there can be no path from v_j to v_i”.
page 3: leaf nodes have no outgoing arcs, but they are not required to have any incoming arcs.
page 17: “S_AB ⊂ V” should be “S_AB ⊂ V”, where V is the node set.
page 22: in Equation (2.12), the numerator of the ratio under the square root should be “n - |Z| - 2” and not “n - 2”.
page 35: bnlearn 3.2 and later versions are stricter about setting arc directions; as a result bn.gs is an undirected graph and must be extended into a DAG with cextend() to conclude the example.
page 39: at least in recent versions, deal is unable to fit a network containing only continuous variables. A workaround is to include a dummy factor (e.g. marks$XYZ <- factor(rep("xyz", nrow(marks)))) before calling network() so that jointprior() does not fail.
page 47: “phopsholypids” should be “phospholipids”.
page 75: “coef(object)” should be “coef(lasso.fit)”.
page 78, 79: “arth.edges” should be “arth.arcs”.
page 89, 149: “Cheng & Druzdel (2000)” should be “Cheng & Druzdzel (2000)”.
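
The page 22 correction can be checked numerically. The sketch below assumes that Equation (2.12) is the usual exact t statistic for a partial correlation, t = rho * sqrt((n - |Z| - 2) / (1 - rho^2)); the simulated data and all variable names are illustrative and do not come from the book. With the corrected degrees of freedom, the statistic should agree with the t value that lm() reports for x in the regression of y on x and the conditioning variables.

```r
# Simulated data: x and y both depend on the conditioning set Z = {z1, z2}.
set.seed(42)
n <- 200
z1 <- rnorm(n)
z2 <- rnorm(n)
x <- 0.5 * z1 - 0.3 * z2 + rnorm(n)
y <- 0.4 * x + 0.6 * z1 + rnorm(n)

# Partial correlation of x and y given Z: correlate the residuals of x and y
# after regressing each of them on the conditioning variables.
rho <- cor(resid(lm(x ~ z1 + z2)), resid(lm(y ~ z1 + z2)))

# Corrected degrees of freedom: n - |Z| - 2, not n - 2.
size.z <- 2
df <- n - size.z - 2
t.stat <- rho * sqrt(df / (1 - rho^2))
p.value <- 2 * pt(abs(t.stat), df = df, lower.tail = FALSE)

# Cross-check against the t value of x in the full linear regression.
t.lm <- summary(lm(y ~ x + z1 + z2))$coefficients["x", "t value"]
isTRUE(all.equal(t.stat, t.lm))
```

Using n - 2 instead would overstate the degrees of freedom whenever the conditioning set is not empty, and the two statistics would no longer match.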

page 98: the code to create and fit the dynamic Bayesian network inference example fails in modern versions of R and bnlearn. The following, slightly modified snippet works with an updated installation as of May 2015.

dbn2 = empty.graph(c("265768_at", "245094_at1",
        "258736_at", "257710_at", "255070_at",
        "245319_at", "245094_at"))
dbn2 = set.arc(dbn2, "245094_at", "265768_at")
for (node in names(coef(lasso.s))[-c(1, 6)])
  dbn2 = set.arc(dbn2, node, "245094_at")
dbn2 = set.arc(dbn2, "245094_at1", "245094_at")
dbn2.data = as.data.frame(x[, nodes(dbn2)[1:6]])
dbn2.data[, "245094_at"] = y
dbn2.data[, "245094_at1"] =
  arth12[2:(nrow(arth12) - 1), "245094_at"]
dbn2.fit = bn.fit(dbn2, dbn2.data)

page 104: “a single operations” should be “a single operation”.

pages 113, 114: the code

start = random.graph(names(hailfinder), num = 4, 382 method = "melancon")

should read

start = random.graph(names(hailfinder), num = 4, method = "melancon", max.in.degree = 2)

without the “382”, and with “max.in.degree = 2” added to produce networks with a reasonable number of parameters. Similarly,

s0 = random.graph(names(hailfinder), method = "melancon")

should be as follows.

s0 = random.graph(names(hailfinder), method = "melancon", max.in.degree = 2)
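
The effect of the extra argument can be inspected directly. The following is a minimal sketch, assuming a recent bnlearn installation that ships the hailfinder data set; the seed and the max.parents helper variable are illustrative and are not part of the book's code.

```r
library(bnlearn)
data(hailfinder)

set.seed(123)
start = random.graph(names(hailfinder), num = 4, method = "melancon",
          max.in.degree = 2)

# Largest number of parents of any node in each generated network;
# with max.in.degree = 2 this should never exceed 2.
max.parents = sapply(start, function(dag)
  max(sapply(nodes(dag), function(node) length(parents(dag, node)))))
max.parents

# Number of parameters each structure would imply for the hailfinder data.
sapply(start, nparams, data = hailfinder)
```

Capping the number of parents keeps the conditional probability tables small, since the size of each table grows with the product of the numbers of levels of the node's parents.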

Reference Versions of the Relevant R Packages

The following R packages were used (or at least mentioned) in the book. The reference version used in the writing of the book and a link to the CRAN/BioConductor homepage are reported for each package.

Rgraphviz version 1.32.0 [ BioConductor ]
graph version 1.32.0 [ BioConductor ]
igraph version 0.6-2 [ CRAN ]
bnlearn version 3.1 [ CRAN ]
gRbase version 1.3.4 [ CRAN ]
gRain version 0.8.5 [ CRAN ]
catnet version 1.13.4 [ CRAN ]
mugnet version 0.13.5 [ CRAN ]
vars version 1.5-0 [ CRAN ]
G1DBN version 3.1 [ CRAN ]
deal version 1.2-34 [ CRAN ]

ARTIVA version 1.2 [ CRAN ]
simone version 1.0-1 [ CRAN Archive ]
GeneNet version 1.2.5 [ CRAN ]
lars version 1.1 [ CRAN ]
glmnet version 1.8-2 [ CRAN ]
penalized version 0.9-41 [ CRAN ]
EDISON version 1.0 [ CRAN ]
rsprng version 1.0 [ CRAN ]
Rmpi version 0.5-8 [ CRAN ]
snow version 0.3-3 [ CRAN ]
rpvm version 1.0-4 [ CRAN Archive ]
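
Since several errata depend on package versions, it can help to compare a local installation against the reference versions above. The sketch below uses only base R; the compare.versions helper is hypothetical, and the reference vector lists only a few of the packages for brevity.

```r
# A subset of the reference versions listed above.
reference = c(bnlearn = "3.1", gRain = "0.8.5", igraph = "0.6-2",
              lars = "1.1", glmnet = "1.8-2")

# Hypothetical helper: report the installed version of each package
# (NA if the package is not installed) next to the reference version,
# and flag whether the installed version is newer.
compare.versions = function(reference) {
  installed = vapply(names(reference), function(pkg) {
    if (nzchar(system.file(package = pkg)))
      as.character(utils::packageVersion(pkg))
    else
      NA_character_
  }, character(1))
  newer = mapply(function(inst, ref) {
    if (is.na(inst)) NA else utils::compareVersion(inst, ref) > 0
  }, installed, reference)
  data.frame(package = names(reference), reference = unname(reference),
             installed = unname(installed), newer = unname(newer),
             stringsAsFactors = FALSE)
}

compare.versions(reference)
```

A TRUE in the newer column suggests that the corresponding code in the book may need the adjustments described in the errata.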

R Code and Data Files

R code, Chapter 1. [ link ]
R code, Chapter 2. [ link ]
R code, Chapter 3. [ link ]
R code, Chapter 4. [ link ]
R code, Chapter 5. [ link ]

R code for the analysis of Sachs' data. [ link ]
Sachs' raw observational data. [ link ]
Sachs' complete pre-processed data. [ link ]

Table of Contents

Introduction
1. A Brief Introduction to Graph Theory
  1. Graphs, Nodes, and Arcs
  2. The Structure of a Graph
  3. Further Reading
2. The R Environment for Statistical Computing
  1. Base Distribution and Contributed Packages
  2. A Quick Introduction to R
  3. Further Reading
Exercises
Bayesian Networks in the Absence of Temporal Information
1. Bayesian Networks: Essential Definitions and Properties
  1. Graph Structure and Probability Factorization
  2. Fundamental Connections
  3. Equivalent Structures
  4. Markov Blankets
2. Static Bayesian Networks Modeling
  1. Constraint-Based Structure Learning Algorithms
  2. Score-Based Structure Learning Algorithms
  3. Hybrid Structure Learning Algorithms
  4. Choosing Distributions, Conditional Independence Tests, and Network Scores
  5. Parameter Learning
  6. Discretization
3. Static Bayesian Networks Modeling with R
  1. Popular R Packages for Bayesian Network Modeling
  2. Creating and Manipulating Network Structures
  3. Plotting Network Structures
  4. Structure Learning
  5. Parameter Learning
  6. Discretization
4. Pearl’s Causality
5. Applications to Gene Expression Profiles
  1. Model Averaging
  2. Choosing the Significance Threshold
  3. Handling Interventional Data
Exercises
Bayesian Networks in the Presence of Temporal Information
1. Time Series and Vector Auto-Regressive Processes
  1. Univariate Time Series
  2. Multivariate Time Series
2. Dynamic Bayesian Networks: Essential Definitions and Properties
  1. Definitions
  2. Dynamic Bayesian Network Representation of a VAR Process
3. Dynamic Bayesian Network Learning Algorithms
  1. Least Absolute Shrinkage and Selection Operator
  2. James–Stein Shrinkage
  3. First-Order Conditional Dependencies Approximation
  4. Modular Networks
4. Non-homogeneous Dynamic Bayesian Network Learning
5. Dynamic Bayesian Network Learning with R
  1. Multivariate Time Series Analysis
  2. LASSO Learning: lars and simone
  3. Other Shrinkage Approaches: GeneNet, G1DBN
  4. Non-homogeneous Dynamic Bayesian Network Learning: ARTIVA
Exercises
Bayesian Network Inference Algorithms
1. Reasoning Under Uncertainty
  1. Probabilistic Reasoning and Evidence
  2. Algorithms for Belief Updating: Exact and Approximate Inference
  3. Causal Inference
2. Inference in Static Bayesian Networks
  1. Exact Inference
  2. Approximate Inference
3. Inference in Dynamic Bayesian Networks
Exercises
Parallel Computing for Bayesian Networks
1. Foundations of Parallel Computing
2. Parallel Programming in R
3. Applications to Structure and Parameter Learning
  1. Constraint-Based Structure Learning Algorithms
  2. Score-Based Structure Learning Algorithms
  3. Hybrid Structure Learning Algorithms
  4. Parameter Learning
4. Applications to Inference Procedures
  1. Bootstrap
  2. Cross-Validation
  3. Conditional Probability Queries
Exercises

Solutions

References

Index