## Dynamic Bayesian network of dermatologic and mental conditions, Scientific Reports (2024)

This is a short HOWTO describing the code for the paper “Inferring Skin-Brain-Skin Connections from Infodemiology Data using Dynamic Bayesian Networks” by Scutari, Kerob and Salah (2024, Scientific Reports).

### The Data: an Overview

> data = readRDS("data/prepd.v3.rds") > symptoms = c("Acne", "Asthma", "ADHD", "Burn", "Erectile_dysfunction", "Scar", + "Alcoholism", "Anxiety", "Depression", "Dermatitis", + "Sleep_disorder", "Obesity")

The data contain `100`

weeks worth of data on `12`

dermatological and psychiatric conditions, collected between `2020-03-02`

and
`2022-01-24`

. The data are organised in a multivariate time series for
`2879`

counties spread over `50`

US states. All time series
are observed over the whole period.

> data = data[data$county %in% names(which(table(data$county) > 35)), ] > data$county = droplevels(data$county) > data = data[, c("state", "county", "date", symptoms)]

The data originally contained missing data, either single data points or whole time series for specific counties and periods. Hence, we explore the imputation error of three approaches to impute missing values in time series: moving average (with a window of 10 data points), interpolation and Kalman filters. We estimate the relative imputation error by removing individual values from the data, reimputing them and measuring the relative difference.

> experiment = function(each.county, data, m, imputeFUN, symptoms) { + complete = incomplete = as.matrix(data[data$county == each.county, symptoms]) + how.many = ceiling(m * length(complete)) + incomplete[sample(length(complete), how.many)] = NA + # for each symptom individually. + for (s in colnames(incomplete)) { + # impute using the user-provided function FUN; short time series cannot + # be imputed, just return zero average error. + imputed = try(imputeFUN(incomplete[, s]), silent = TRUE) + if (is(imputed, "try-error")) + return(0) + else + incomplete[, s] = imputed + }#FOR + reldiff = abs(complete - incomplete) / complete + reldiff[is.na(reldiff) | is.infinite(reldiff)] = 0 + return(sum(reldiff) / how.many) + }#EXPERIMENT

All methods perform similarly for all of 2% and 5%, 10% and 20% proportions of missing values.

We perform a similar experiment for values missing in a block of consecutive data points to simulate symptoms not being recorded for a whole month in specific counties.

> experiment = function(each.county, data, m, imputeFUN, symptoms) { + complete = incomplete = as.matrix(data[data$county == each.county, symptoms]) + which.symptoms = sample(ncol(complete), ceiling(m * length(complete)), replace = TRUE) + which.times = sample(nrow(complete) - 3, ceiling(m * length(complete)), replace = TRUE) + for (s in which.symptoms) + incomplete[which.times[s] + 0:3, s] = NA + how.many = sum(is.na(incomplete)) + # for each symptom individually. + for (s in colnames(incomplete)) { + # impute using the user-provided function FUN; short time series cannot + # be imputed, just return zero average error. + imputed = try(imputeFUN(incomplete[, s])) + if (is(imputed, "try-error")) + return(0) + else + incomplete[, s] = imputed + }#FOR + reldiff = abs(complete - incomplete) / complete + reldiff[is.na(reldiff) | is.infinite(reldiff)] = 0 + return(sum(reldiff) / how.many) + }#EXPERIMENT

All methods perform similarly for all of the considered proportions of missing values.

### The Dependence Structure of the Data

The data are characterised by a strong dependence between observations across both space and time. In the case of space, we can measure how much of the variance of the conditions of interest is explained by the US states and the counties.

> r2.states = parSapply(cl, symptoms, function(symptom, data) { + model = lmer(paste(symptom, "~ 1 | state"), data = data) + return(1 - var(resid(model)) / var(data[, symptom])) + }, data = data[, c("state", "date", symptoms)]) > > r2.counties = parSapply(cl, symptoms, function(symptom, data) { + model = lmer(paste(symptom, "~ 1 | county"), data = data) + return(1 - var(resid(model)) / var(data[, symptom])) + }, data = data[, c("county", "date", symptoms)])

We compute the proportion of explained variance using an intercept-only random-effect model because the heterogeneity
between states and between counties violates the homoscedasticity assumption of fixed-effect models. We can see that US
states explain between `8`

% and `16`

% of the variance in
each of the symptoms (average `12`

%); counties explain between
`26`

% and `84`

%
(average `51`

%).

To account for time dependence, we can add an autocorrelation with lag 1 to the individual time series corresponding to the different symptoms in the random-effect model for the counties. The implied assumption is that each symptom depends on itself one week prior within each county.

> r2.time = parSapply(cl, symptoms, function(symptom, data) { + model = lmer(paste(symptom, "~ 1 | county"), data = data) + rr = residuals(model) + date = as.integer(as.factor(data$date)) + model = try(lme(rr ~ 1, random = ~ 1 | date, correlation = corAR1(), + control = lmeControl(returnObject = TRUE))) + return(1 - var(resid(model)) / var(data[, symptom])) + }, data = data[, c("county", "date", symptoms)])

Time and space dependence explain between `35`

% and `86`

%
of the variance in each symptom (average `63`

%), suggesting that a non-negligible part of
the information in the symptoms stems from their longitudinal nature.

### Bayesian Networks: Static versus Dynamic Structures

The classic (static) formulation of Bayesian networks is as follows: for a set of variables and a directed acyclic graph , the joint distribution of can be written as

Each node in the graph corresponds to one of the variables in , and the arcs in determine which other variables each variable depends on. If all the variables are continuous, as is the case for the medical conditions in the data, we typically assume that follows a multivariate normal distribution. As a result, the local distributions take the form of a linear regression model in which the node is the response variable and its parents are the explanatory variables:

where the are the regression coefficients associated with the parents and the are independent, normally distributed errors with some variance specific to If a node does not have any parents, the equation above will contain only the intercept and the error terms.

The assumption that residuals are independent, which implies that data points are independent, is violated by the data: as we noted above, the data points are dependent across time and space due to how they were collected. Replacing linear regression models with more complex regressions such as the random-effect models we used above makes learning the graph from data computationally unfeasible for large data sets. However, we can use the dynamic formulation of Bayesian networks. It is constructed as follows:

- Include two nodes for each variable in the graph one for the variable as observed at a time and one for the variable as observed at the following time .
- Only allow arcs going from nodes at time to nodes at time in the graph. Arcs between nodes associated with the same time point, sometimes called "instantaneous dependencies", describe probabilistic dependencies that take effect without any time lag. Such dependencies are difficult to interpret as causal effects because Granger causality requires a time lag between cause and effect.

Arcs between nodes corresponding to the same variable at different time points model autocorrelation with lag 1: an arc from to implies that is regressed against itself at the previous time point like in an AR(1) time series. Crossing arcs model feedback loops: if both and are in the graph then the relationship between the underlying variables is . Note that static Bayesian networks are unable to express feedback loops because nodes are not duplicated and the graph is acyclic by definition.

The local distributions in a dynamic Bayesian network take the form

where the parents will all be nodes for variables observed at time . The local distributions for the nodes at time take the form

since nodes at time do not have any parents by construction. As a result, dynamic Bayesian networks focus completely on modelling the transition between any two consecutive times and, if we again assume that follows a multivariate normal distribution, generalise vector autoregressive time series. The implication of using two generic time points is that we assume that the time series is stationary.

Dynamic Bayesian networks allow us to model the space dependence between the data points as well. Assume that each condition has a different baseline value in each state that does not change over time. Then, the local distribution of each condition at time is regressed against itself at time (same county, previous time point). The different baseline for the state then appears in both sides of the equation and can be accounted for in the regression model.

Learning the graph and the regression coefficients of a static Bayesian network from the data while disregarding the space-time dependence structure of the data produces strongly biased models.

> decorrelated = parSapply(cl, symptoms, function(symptom, data) { + model = lmer(paste(symptom, "~ 1 | county"), data = data) + rr = residuals(model) + date = as.integer(as.factor(data$date)) + model = try(lme(rr ~ 1, random = ~ 1 | date, correlation = corAR1(), + control = lmeControl(returnObject = TRUE))) + return(residuals(model)) + }, data = data[, c("county", "date", symptoms)]) > decorrelated = as.data.frame(decorrelated) > > dag = hc(data[, symptoms]) > fitted.original = bn.fit(dag, data[, symptoms], cluster = cl) > fitted.decorrelated = bn.fit(dag, decorrelated[, symptoms], cluster = cl) > > inflation = unlist(lapply(symptoms, function(node) { + coef(fitted.original[[node]])[-1] / coef(fitted.decorrelated[[node]])[-1] + }))

We find that `63`

% of the regression coefficients are
inflated by a factor of at least 2, and the sign differs in `29`

%
of the coefficients. The reason is that the model is misspecified, and the dependence relationships between the
variables are confounded with space-time dependencies between the data points.

> dag.original = hc(data[, symptoms]) > dag.decorrelated = hc(decorrelated[, symptoms]) > shd(dag.original, dag.decorrelated) / narcs(dag.original)

[1] 0.7096774

Furthermore, we find that the graph learned from the original data and that learned from the decorrelated data where
the space-time dependence has been removed differ in
`71`

% of the arcs.

### Learning the Dynamic Bayesian Network

Learning the structure of the dynamic Bayesian network requires us to reshape the data to construct the pairs of variables at for all the time series for the different counties.

> reshaped = parLapply(cl, levels(data$county), function(each.county, data, symptoms) { + county.data = subset(data, county == each.county) + available.t0 = county.data$date[-length(county.data$date)] + available.t1 = county.data$date[-1] + t0 = subset(county.data, date %in% available.t0)[, symptoms] + t1 = subset(county.data, date %in% available.t1)[, symptoms] + names(t0) = paste0(symptoms, "_0") + names(t1) = paste0(symptoms, "_1") + return(cbind(county = each.county, date = available.t0, t0, t1)) + }, data = data, symptoms = symptoms) > > dyndata = do.call(rbind, reshaped) > dyndata$county = as.factor(dyndata$county) > > t0.vars = grep("_0", names(dyndata), value = TRUE) > t1.vars = grep("_1", names(dyndata), value = TRUE)

We then use the reshaped data to learn the graph of the dynamic Bayesian network. As we discussed above, we blacklist all the arcs between nodes in the same time points, as well as arcs pointing backwards in time from nodes in to nodes in . We define the optimal graph as that We define the optimal graph as that with the best BIC score computed as the sum of the BIC value for all the regression models in the local distributions.

> learn.dbn = function(data, penalty = 1) { + bl = rbind(tiers2blacklist(list(t0.vars, t1.vars)), + set2blacklist(t0.vars), set2blacklist(t1.vars)) + hc(data, blacklist = bl, score = "bic-g", k = penalty * log(nrow(data) / 2)) + }#LEARN.DBN

Furthermore, we use model averaging to increase the stability of the learned graph. We implement it using bagging:

- We produce 500 bootstrap samples from the data. We sample only 75% of the available counties and shuffle the order of the variables to increase the heterogeneity of the samples, which improves the performance of bagging.
- We learn a graph from each of them.
- We construct a consensus graph that contains only those arcs that appear in a large proportion of the learned graphs, with a data-driven inclusion threshold.

> bagging = function(i, data, counties, penalty = 1) { + keep = sample(counties, 0.75 * nlevels(counties)) + boot.sample = data[counties %in% keep, ] + boot.sample = boot.sample[, sample(ncol(data), ncol(data))] + learn.dbn(boot.sample, penalty = penalty) + }#BAGGING

> averaging = function(data, penalty = 1) { + dags = parLapply(cl, seq(500), bagging, data = data[, c(t0.vars, t1.vars)], + counties = data$county, penalty = penalty) + strength = custom.strength(dags, nodes = c(t0.vars, t1.vars)) + averaged.network(strength) + }#AVERAGING > > avg.dag = averaging(dyndata, penalty = 1)

The consensus graph produced by the code above is the following.

We abbreviate the labels of the nodes here and in the following for readability.

original names (time 0) | abbreviations (time 0) | original names (time 1) | abbreviations (time 1) |
---|---|---|---|

Acne_0 | ACNE0 | Acne_1 | ACNE1 |

Asthma_0 | ASTH0 | Asthma_1 | ASTH1 |

ADHD_0 | ADHD0 | ADHD_1 | ADHD1 |

Burn_0 | BURN0 | Burn_1 | BURN1 |

Erectile_dysfunction_0 | ED0 | Erectile_dysfunction_1 | ED1 |

Scar_0 | SCAR0 | Scar_1 | SCAR1 |

Alcoholism_0 | ALC0 | Alcoholism_1 | ALC1 |

Anxiety_0 | ANX0 | Anxiety_1 | ANX1 |

Depression_0 | DEP0 | Depression_1 | DEP1 |

Dermatitis_0 | DER0 | Dermatitis_1 | DER1 |

Sleep_disorder_0 | SLD0 | Sleep_disorder_1 | SLD1 |

Obesity_0 | OBE0 | Obesity_1 | OBE1 |

Merging the nodes that correspond to the same variable (at different time points) produces the following graph, in which we show feedback loops as bi-directed arcs.

We observe that the graph is dense. Because of the large sample size, even small effects are significant, and therefore, we learn a large number of arcs. However, we would like a sparser graph with few enough arcs to interpret their meaning and reason about them qualitatively.

Recall that we use BIC to choose which arc to include. If we take two networks and with the inclusion criterion for that arc is:

Taking into account that if we assume all models are equally probable a priori, the above means:

so we accept over if the Bayes factor for the two models is larger than 1. If we increase the BIC penalty, say,

then

and the resulting Bayes factor becomes

so that the strength of the evidence we need increases with the sample size at a rate that we control, gradually excluding arcs that correspond to small effects. For instance, increasing to yields the following consensus graph.

Further increasing to yields the following consensus graph.

### Validating Predictive Accuracy

The best value of is that striking the right balance between the sparsity of the graph and predictive accuracy. Since all variables are continuous, we measure predictive accuracy using the proportion of variance in the symptoms explained by the predicted values.

> dbn.accuracy = function(dag, training, validation) { + fitted = bn.fit(dag, training[, c(t0.vars, t1.vars)]) + accuracy = structure(vector(length(t1.vars), mode = "numeric"), names = t1.vars) + for (node in t1.vars) { + pred = predict(fitted, data = validation[, c(t0.vars, t1.vars)], + node = node, method = "parents") + accuracy[node] = summary(lm(validation[, node] ~ pred))$adj.r.squared + }#FOR + accuracy["AVERAGE"] = mean(accuracy) + return(accuracy) + }#DBN.ACCURACY

In particular:

- We create a training set comprising the first 53 weeks' worth of data, and we keep the remaining 21 weeks as a validation set.
- We learn the graph of the dynamic Bayesian network from the training set for .
- We learn the regression coefficients for each node in each network from the training set.
- For each data point in the validation set, we use the symptoms at time to predict the symptoms at time .
- For each symptom, we regress the observed values on the predicted values, and measure the proportion of explained variance.
- We average the proportions of explained variance in all symptoms to compute a summary measure of the overall predictive accuracy of the dynamic Bayesian network.

> training = subset(dyndata, date < min(date) + 52 * 7) > validation = subset(dyndata, date >= min(date) + 52 * 7) > > penalty = c(1, 2, 4, 8, 16, 32, 64, 128) > accuracy = vector(length(penalty), mode = "list") > > for (i in seq_along(penalty)) { + dag = averaging(training, penalty = penalty[i]) + accuracy[[i]] = dbn.accuracy(dag, training, validation) + }#FOR

The proportion of variance explained in the validation set is comparable to that in the training set, which suggests that the model generalises well. Note that many of the data points in the validation set are measured weeks and months after the most recent data point in the training set, which confirms that the model can predict the symptoms well in the next time point beyond the immediate future.

The number of arcs in the graph decreases linearly in
,
from `109`

for
to `48`

for
and to `31`

for
.

Furthermore, the proportion of variance explained in the validation set remains stable for all
between 1 and 64: it decreases by only `0.01`

even though the number of arcs in the graph decreases from `109`

to
`21`

. The proportion of the variance in the training set, which measures the goodness of
fit (as opposed to predictive accuracy), starts to fall sharply after
.
This suggests that
produce dynamic Bayesian networks with a good balance between sparsity, goodness of fit and predictive accuracy.

The dynamic Bayesian network learned with is shown above. For reference, that learned with is shown below.

Finally, we choose the Bayesian network learned with
as the best compromise between predictive accuracy and sparsity: it has `80`

arcs and a
predictive accuracy of `0.67`

(`99.96`

% of the network learned with
with only `73.39`

% of the arcs).

The strengths of the arcs included in the final Bayesian network are reported below.

from | to | strength | direction | |
---|---|---|---|---|

12 | Acne_0 | Acne_1 | 1.000 | 1 |

14 | Acne_0 | ADHD_1 | 1.000 | 1 |

17 | Acne_0 | Scar_1 | 1.000 | 1 |

18 | Acne_0 | Alcoholism_1 | 1.000 | 1 |

19 | Acne_0 | Anxiety_1 | 1.000 | 1 |

21 | Acne_0 | Dermatitis_1 | 1.000 | 1 |

22 | Acne_0 | Sleep_disorder_1 | 1.000 | 1 |

23 | Acne_0 | Obesity_1 | 1.000 | 1 |

35 | Asthma_0 | Acne_1 | 1.000 | 1 |

36 | Asthma_0 | Asthma_1 | 1.000 | 1 |

38 | Asthma_0 | Burn_1 | 1.000 | 1 |

39 | Asthma_0 | Erectile_dysfunction_1 | 1.000 | 1 |

40 | Asthma_0 | Scar_1 | 1.000 | 1 |

42 | Asthma_0 | Anxiety_1 | 0.990 | 1 |

58 | ADHD_0 | Acne_1 | 1.000 | 1 |

59 | ADHD_0 | Asthma_1 | 1.000 | 1 |

60 | ADHD_0 | ADHD_1 | 1.000 | 1 |

61 | ADHD_0 | Burn_1 | 1.000 | 1 |

62 | ADHD_0 | Erectile_dysfunction_1 | 1.000 | 1 |

63 | ADHD_0 | Scar_1 | 1.000 | 1 |

64 | ADHD_0 | Alcoholism_1 | 1.000 | 1 |

65 | ADHD_0 | Anxiety_1 | 1.000 | 1 |

66 | ADHD_0 | Depression_1 | 1.000 | 1 |

67 | ADHD_0 | Dermatitis_1 | 0.698 | 1 |

69 | ADHD_0 | Obesity_1 | 1.000 | 1 |

82 | Burn_0 | Asthma_1 | 1.000 | 1 |

83 | Burn_0 | ADHD_1 | 0.992 | 1 |

84 | Burn_0 | Burn_1 | 1.000 | 1 |

85 | Burn_0 | Erectile_dysfunction_1 | 1.000 | 1 |

86 | Burn_0 | Scar_1 | 1.000 | 1 |

87 | Burn_0 | Alcoholism_1 | 1.000 | 1 |

88 | Burn_0 | Anxiety_1 | 1.000 | 1 |

90 | Burn_0 | Dermatitis_1 | 1.000 | 1 |

91 | Burn_0 | Sleep_disorder_1 | 0.960 | 1 |

105 | Erectile_dysfunction_0 | Asthma_1 | 1.000 | 1 |

106 | Erectile_dysfunction_0 | ADHD_1 | 1.000 | 1 |

107 | Erectile_dysfunction_0 | Burn_1 | 1.000 | 1 |

108 | Erectile_dysfunction_0 | Erectile_dysfunction_1 | 1.000 | 1 |

109 | Erectile_dysfunction_0 | Scar_1 | 1.000 | 1 |

114 | Erectile_dysfunction_0 | Sleep_disorder_1 | 0.934 | 1 |

127 | Scar_0 | Acne_1 | 1.000 | 1 |

128 | Scar_0 | Asthma_1 | 1.000 | 1 |

129 | Scar_0 | ADHD_1 | 1.000 | 1 |

130 | Scar_0 | Burn_1 | 1.000 | 1 |

131 | Scar_0 | Erectile_dysfunction_1 | 1.000 | 1 |

132 | Scar_0 | Scar_1 | 1.000 | 1 |

133 | Scar_0 | Alcoholism_1 | 1.000 | 1 |

135 | Scar_0 | Depression_1 | 0.688 | 1 |

136 | Scar_0 | Dermatitis_1 | 1.000 | 1 |

137 | Scar_0 | Sleep_disorder_1 | 0.992 | 1 |

150 | Alcoholism_0 | Acne_1 | 0.958 | 1 |

152 | Alcoholism_0 | ADHD_1 | 1.000 | 1 |

155 | Alcoholism_0 | Scar_1 | 1.000 | 1 |

156 | Alcoholism_0 | Alcoholism_1 | 1.000 | 1 |

161 | Alcoholism_0 | Obesity_1 | 0.994 | 1 |

173 | Anxiety_0 | Acne_1 | 1.000 | 1 |

175 | Anxiety_0 | ADHD_1 | 1.000 | 1 |

176 | Anxiety_0 | Burn_1 | 1.000 | 1 |

180 | Anxiety_0 | Anxiety_1 | 1.000 | 1 |

181 | Anxiety_0 | Depression_1 | 1.000 | 1 |

198 | Depression_0 | ADHD_1 | 1.000 | 1 |

201 | Depression_0 | Scar_1 | 0.648 | 1 |

203 | Depression_0 | Anxiety_1 | 1.000 | 1 |

204 | Depression_0 | Depression_1 | 1.000 | 1 |

206 | Depression_0 | Sleep_disorder_1 | 1.000 | 1 |

219 | Dermatitis_0 | Acne_1 | 0.930 | 1 |

221 | Dermatitis_0 | ADHD_1 | 0.990 | 1 |

222 | Dermatitis_0 | Burn_1 | 1.000 | 1 |

224 | Dermatitis_0 | Scar_1 | 1.000 | 1 |

226 | Dermatitis_0 | Anxiety_1 | 1.000 | 1 |

227 | Dermatitis_0 | Depression_1 | 0.970 | 1 |

228 | Dermatitis_0 | Dermatitis_1 | 1.000 | 1 |

229 | Dermatitis_0 | Sleep_disorder_1 | 0.994 | 1 |

230 | Dermatitis_0 | Obesity_1 | 1.000 | 1 |

245 | Sleep_disorder_0 | Burn_1 | 1.000 | 1 |

246 | Sleep_disorder_0 | Erectile_dysfunction_1 | 0.944 | 1 |

247 | Sleep_disorder_0 | Scar_1 | 0.996 | 1 |

249 | Sleep_disorder_0 | Anxiety_1 | 1.000 | 1 |

250 | Sleep_disorder_0 | Depression_1 | 1.000 | 1 |

252 | Sleep_disorder_0 | Sleep_disorder_1 | 1.000 | 1 |

253 | Sleep_disorder_0 | Obesity_1 | 1.000 | 1 |

265 | Obesity_0 | Acne_1 | 1.000 | 1 |

266 | Obesity_0 | Asthma_1 | 0.978 | 1 |

271 | Obesity_0 | Alcoholism_1 | 1.000 | 1 |

272 | Obesity_0 | Anxiety_1 | 0.990 | 1 |

273 | Obesity_0 | Depression_1 | 1.000 | 1 |

274 | Obesity_0 | Dermatitis_1 | 1.000 | 1 |

275 | Obesity_0 | Sleep_disorder_1 | 1.000 | 1 |

276 | Obesity_0 | Obesity_1 | 1.000 | 1 |

The predictive accuracies for the individual symptoms are as follows.

x | |
---|---|

Acne_1 | 0.42 |

Asthma_1 | 0.85 |

ADHD_1 | 0.58 |

Burn_1 | 0.87 |

Erectile_dysfunction_1 | 0.76 |

Scar_1 | 0.88 |

Alcoholism_1 | 0.57 |

Anxiety_1 | 0.57 |

Depression_1 | 0.53 |

Dermatitis_1 | 0.74 |

Sleep_disorder_1 | 0.60 |

Obesity_1 | 0.66 |

AVERAGE | 0.67 |

The final Bayesian network follows a multivariate normal distribution. Therefore, we can represent it with its marginal and partial correlation matrices. Below are the portions of the those matrices that describe the (absolute) correlations between variables across time.

The proportions of variance explained by the parents of key conditions, discounting the variance explained by the condition itself due to autocorrelation, are reported below.

> of.interest = c("Acne_1", "Dermatitis_1", "Depression_1", "Anxiety_1", + "Sleep_disorder_1", "ADHD_1", "Obesity_1") > > total.explained = rep(0, length(of.interest)) > names(total.explained) = of.interest > proportions.explained = + matrix(0, nrow = length(of.interest), ncol = length(t0.vars), + dimnames = list(of.interest, t0.vars)) > > for (node in of.interest) { + # construct the local model to compute the variances.. + pa = unique(c(sub("_1", "_0", node), parents(fitted, node))) + formula = paste(node, "~", paste(pa, collapse = " + ")) + aa = anova(lm(formula, data = dyndata)) + # ... save and print the total explained variance... + total.explained[node] = + sum(aa[row.names(aa) != "Residuals", "Sum Sq"]) / sum(aa[, "Sum Sq"]) + cat("<p><strong>node:", node, "</strong> ", + round(total.explained[node], 3), " explained variance.</p>\n") + # ... remove the residuals and scale the other components. + aa = aa[!(row.names(aa) %in% c("Residuals", sub("_1", "_0", node))), , drop = FALSE] + aa = aa[, "Sum Sq", drop = FALSE] / sum(aa[, "Sum Sq", drop = FALSE]) + print(knitr::kable(round(aa, 3))) + proportions.explained[node, row.names(aa)] = aa[, "Sum Sq"] + cat("\n") + }#FOR

**node: Acne_1 ** 0.48 explained variance.

Sum Sq | |
---|---|

Asthma_0 | 0.235 |

ADHD_0 | 0.066 |

Scar_0 | 0.127 |

Alcoholism_0 | 0.018 |

Anxiety_0 | 0.216 |

Dermatitis_0 | 0.074 |

Obesity_0 | 0.264 |

**node: Dermatitis_1 ** 0.745 explained variance.

Sum Sq | |
---|---|

Acne_0 | 0.261 |

ADHD_0 | 0.005 |

Burn_0 | 0.185 |

Scar_0 | 0.160 |

Obesity_0 | 0.389 |

**node: Depression_1 ** 0.533 explained variance.

Sum Sq | |
---|---|

ADHD_0 | 0.081 |

Scar_0 | 0.072 |

Anxiety_0 | 0.371 |

Dermatitis_0 | 0.109 |

Sleep_disorder_0 | 0.207 |

Obesity_0 | 0.160 |

**node: Anxiety_1 ** 0.579 explained variance.

Sum Sq | |
---|---|

Acne_0 | 0.175 |

Asthma_0 | 0.219 |

ADHD_0 | 0.064 |

Burn_0 | 0.067 |

Depression_0 | 0.230 |

Dermatitis_0 | 0.128 |

Sleep_disorder_0 | 0.079 |

Obesity_0 | 0.039 |

**node: Sleep_disorder_1 ** 0.597 explained variance.

Sum Sq | |
---|---|

Acne_0 | 0.090 |

Burn_0 | 0.331 |

Erectile_dysfunction_0 | 0.011 |

Scar_0 | 0.170 |

Depression_0 | 0.156 |

Dermatitis_0 | 0.145 |

Obesity_0 | 0.097 |

**node: ADHD_1 ** 0.681 explained variance.

Sum Sq | |
---|---|

Acne_0 | 0.014 |

Burn_0 | 0.495 |

Erectile_dysfunction_0 | 0.138 |

Scar_0 | 0.121 |

Alcoholism_0 | 0.140 |

Anxiety_0 | 0.046 |

Depression_0 | 0.033 |

Dermatitis_0 | 0.011 |

**node: Obesity_1 ** 0.658 explained variance.

Sum Sq | |
---|---|

Acne_0 | 0.622 |

ADHD_0 | 0.158 |

Alcoholism_0 | 0.043 |

Dermatitis_0 | 0.052 |

Sleep_disorder_0 | 0.125 |

Here are the same proportions of variance, arranged in stacked barplots as in the paper.

`Sun May 5 20:20:15 2024`

with **bnlearn**

`5.0-20240418`

and `R version 4.3.2 (2023-10-31)`

.