Bayesian Networks in R

with Applications in Systems Biology


R. Nagarajan, M. Scutari and S. Lèbre (2013).
Use R!, Vol. 48, Springer (US).
ISBN-10: 1461464455
ISBN-13: 978-1461464457
Springer Website
Amazon Website

Errata Corrige

  • page 3: “if a node v_i precedes v_j, there can be no arc from v_j to v_i” should be “if a node v_i precedes v_j, there can be no path from v_j to v_i”.
  • page 3: it's true that leaf nodes do not have any outgoing arc, but they are not required to have any incoming arc.
  • page 17: “S_AB ⊂ V” should be “S_AB ⊆ V”, where V is the node set.
  • page 22: in Equation (2.12), the numerator of the ratio under the square root should be “n - |Z| - 2” and not “n - 2”.
  • page 35: bnlearn 3.2 and later versions are stricter about setting arc directions; as a result, bn.gs is an undirected graph and must be extended into a DAG with cextend() to complete the example.
  • page 39: at least in recent versions, deal is unable to fit a network containing only continuous variables. A workaround is to add a dummy factor to the data (e.g. marks$XYZ <- factor(rep("xyz", nrow(marks)))) before calling network(), so that jointprior() does not fail.
  • page 47: “phopsholypids” should be “phospholipids”.
  • page 75: “coef(object)” should be “coef(lasso.fit)”.
  • pages 78, 79: “arth.edges” should be “arth.arcs”.
  • pages 89, 149: “Cheng & Druzdel (2000)” should be “Cheng & Druzdzel (2000)”.
  • page 98: the code to create and fit the dynamic Bayesian network inference example fails in modern versions of R and bnlearn. The following, slightly modified snippet works with an up-to-date installation as of May 2015.
    dbn2 = empty.graph(c("265768_at", "245094_at1",
            "258736_at", "257710_at", "255070_at",
            "245319_at", "245094_at"))
    dbn2 = set.arc(dbn2, "245094_at", "265768_at")
    for (node in names(coef(lasso.s))[-c(1, 6)])
      dbn2 = set.arc(dbn2, node, "245094_at")
    dbn2 = set.arc(dbn2, "245094_at1", "245094_at")
    dbn2.data = as.data.frame(x[, nodes(dbn2)[1:6]])
    dbn2.data[, "245094_at"] = y
    dbn2.data[, "245094_at1"] =
      arth12[2:(nrow(arth12) - 1), "245094_at"]
    dbn2.fit = bn.fit(dbn2, dbn2.data)
    
  • page 104: “a single operations” should be “a single operation”.
  • pages 113, 114: the code
    start = random.graph(names(hailfinder), num = 4, 382 method = "melancon")
    
    should read
    start = random.graph(names(hailfinder), num = 4, method = "melancon", max.in.degree = 2)
    
    without the “382”, and with the “max.in.degree = 2” to produce networks with a reasonable number of parameters. Similarly,
    s0 = random.graph(names(hailfinder), method = "melancon")
    
    should be as follows.
    s0 = random.graph(names(hailfinder), method = "melancon", max.in.degree = 2)
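
For the page 22 correction listed above, the corrected statistic can be written out in full. This is only a sketch: it assumes that Equation (2.12) is the exact t test for partial correlation, and the symbols used here (ρ for the partial correlation of X and Y given the conditioning set Z, n for the sample size) are assumptions that may differ from the book's notation.

```latex
t(X, Y \mid \mathbf{Z}) =
  \rho_{XY \mid \mathbf{Z}}
  \sqrt{\frac{n - |\mathbf{Z}| - 2}{1 - \rho_{XY \mid \mathbf{Z}}^{2}}}
```

With an empty conditioning set, |Z| = 0 and the numerator reduces to n - 2, so the formula as printed holds only in the unconditional case.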
    

Reference Versions of the Relevant R Packages

The following R packages were used (or at least mentioned) in the book. The reference version used in the writing of the book and a link to the CRAN/Bioconductor homepage are reported for each package.

  • Rgraphviz version 1.32.0 [ Bioconductor ]
  • graph version 1.32.0 [ Bioconductor ]
  • igraph version 0.6-2 [ CRAN ]
  • bnlearn version 3.1 [ CRAN ]
  • gRbase version 1.3.4 [ CRAN ]
  • gRain version 0.8.5 [ CRAN ]
  • catnet version 1.13.4 [ CRAN ]
  • mugnet version 0.13.5 [ CRAN ]
  • vars version 1.5-0 [ CRAN ]
  • G1DBN version 3.1 [ CRAN ]
  • deal version 1.2-34 [ CRAN ]
  • ARTIVA version 1.2 [ CRAN ]
  • simone version 1.0-1 [ CRAN Archive ]
  • GeneNet version 1.2.5 [ CRAN ]
  • lars version 1.1 [ CRAN ]
  • glmnet version 1.8-2 [ CRAN ]
  • penalized version 0.9-41 [ CRAN ]
  • EDISON version 1.0 [ CRAN ]
  • rsprng version 1.0 [ CRAN ]
  • Rmpi version 0.5-8 [ CRAN ]
  • snow version 0.3-3 [ CRAN ]
  • rpvm version 1.0-4 [ CRAN Archive ]

R Code and Data Files

  • R code, Chapter 1. [ link ]
  • R code, Chapter 2. [ link ]
  • R code, Chapter 3. [ link ]
  • R code, Chapter 4. [ link ]
  • R code, Chapter 5. [ link ]
  • R code for the analysis of Sachs' data. [ link ]
  • Sachs' raw observational data. [ link ]
  • Sachs' complete pre-processed data. [ link ]

Table of Contents

  1. Introduction

    1. A Brief Introduction to Graph Theory

      1. Graphs, Nodes, and Arcs
      2. The Structure of a Graph
      3. Further Reading
    2. The R Environment for Statistical Computing

      1. Base Distribution and Contributed Packages
      2. A Quick Introduction to R
      3. Further Reading
       

    Exercises

  2. Bayesian Networks in the Absence of Temporal Information


    1. Bayesian Networks: Essential Definitions and Properties

      1. Graph Structure and Probability Factorization
      2. Fundamental Connections
      3. Equivalent Structures
      4. Markov Blankets
    2. Static Bayesian Networks Modeling

      1. Constraint-Based Structure Learning Algorithms
      2. Score-Based Structure Learning Algorithms
      3. Hybrid Structure Learning Algorithms
      4. Choosing Distributions, Conditional Independence Tests, and Network Scores
      5. Parameter Learning
      6. Discretization
    3. Static Bayesian Networks Modeling with R

      1. Popular R Packages for Bayesian Network Modeling
      2. Creating and Manipulating Network Structures
      3. Plotting Network Structures
      4. Structure Learning
      5. Parameter Learning
      6. Discretization
    4. Pearl’s Causality

    5. Applications to Gene Expression Profiles

      1. Model Averaging
      2. Choosing the Significance Threshold
      3. Handling Interventional Data
       

    Exercises

  3. Bayesian Networks in the Presence of Temporal Information

    1. Time Series and Vector Auto-Regressive Processes

      1. Univariate Time Series
      2. Multivariate Time Series
    2. Dynamic Bayesian Networks: Essential Definitions and Properties

      1. Definitions
      2. Dynamic Bayesian Network Representation of a VAR Process
    3. Dynamic Bayesian Network Learning Algorithms

      1. Least Absolute Shrinkage and Selection Operator
      2. James–Stein Shrinkage
      3. First-Order Conditional Dependencies Approximation
      4. Modular Networks
    4. Non-homogeneous Dynamic Bayesian Network Learning

    5. Dynamic Bayesian Network Learning with R

      1. Multivariate Time Series Analysis
      2. LASSO Learning: lars and simone
      3. Other Shrinkage Approaches: GeneNet, G1DBN
      4. Non-homogeneous Dynamic Bayesian Network Learning: ARTIVA
       

    Exercises

  4. Bayesian Network Inference Algorithms

    1. Reasoning Under Uncertainty

      1. Probabilistic Reasoning and Evidence
      2. Algorithms for Belief Updating: Exact and Approximate Inference
      3. Causal Inference
    2. Inference in Static Bayesian Networks

      1. Exact Inference
      2. Approximate Inference
    3. Inference in Dynamic Bayesian Networks

       

    Exercises

  5. Parallel Computing for Bayesian Networks

    1. Foundations of Parallel Computing

    2. Parallel Programming in R

    3. Applications to Structure and Parameter Learning

      1. Constraint-Based Structure Learning Algorithms
      2. Score-Based Structure Learning Algorithms
      3. Hybrid Structure Learning Algorithms
      4. Parameter Learning
    4. Applications to Inference Procedures

      1. Bootstrap
      2. Cross-Validation
      3. Conditional Probability Queries
       

    Exercises

   Solutions

   References

   Index