Local discovery algorithms for structure learning from data with missing values
Like other structure learning algorithms, local discovery algorithms were originally formulated in the literature for complete data. The Chow-Liu algorithm for learning trees and the ARACNE algorithm are no exception.
Both algorithms are based on pairwise measures of association, organised as a mutual information matrix.
bnlearn handles incomplete data sets transparently by using pairwise-complete observations to estimate each element of this matrix, in the same way as R estimates covariance matrices with cov(.., use = "pairwise.complete.obs").
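The idea of pairwise-complete estimation can be sketched in base R as follows. This is only an illustration and not bnlearn's internal code: the functions pairwise.mi() and mi.matrix(), the data frame toy and its variables are hypothetical names introduced here, and the estimator is assumed to be the plain empirical mutual information for discrete variables.

```r
# Empirical mutual information (in nats) between two factors, estimated only
# from the rows in which both variables are observed.
pairwise.mi = function(x, y) {
  keep = !is.na(x) & !is.na(y)
  joint = table(x[keep], y[keep]) / sum(keep)
  indep = outer(rowSums(joint), colSums(joint))
  pos = joint > 0
  sum(joint[pos] * log(joint[pos] / indep[pos]))
}

# Symmetric mutual information matrix over all pairs of columns.
mi.matrix = function(data) {
  p = ncol(data)
  mi = matrix(0, p, p, dimnames = list(names(data), names(data)))
  for (i in seq_len(p - 1))
    for (j in (i + 1):p)
      mi[i, j] = mi[j, i] = pairwise.mi(data[[i]], data[[j]])
  mi
}

# Hypothetical toy data: B is a noisy copy of A, C is independent of both.
set.seed(42)
A = factor(sample(c("x", "y", "z"), 500, replace = TRUE))
B = A
B[sample(500, 100)] = sample(levels(A), 100, replace = TRUE)
C = factor(sample(c("u", "v"), 500, replace = TRUE))
toy = data.frame(A = A, B = B, C = C)

# Remove some values from A and B independently of each other.
toy$A[sample(500, 30)] = NA
toy$B[sample(500, 30)] = NA

m = mi.matrix(toy)
m
```

Each entry of m is computed from a different subset of rows: the (A, C) entry drops only the rows where A is missing, while the (A, B) entry drops the rows where either A or B is missing. No row is discarded for a missing value in a variable that is not part of the pair.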
> missing = matrix(FALSE, nrow(learning.test), ncol(learning.test))
> missing[sample(length(missing), 100)] = TRUE
> incomplete = learning.test
> incomplete[missing] = NA
> par(mfrow = c(1, 2))
> graphviz.compare(chow.liu(learning.test), chow.liu(incomplete))
> par(mfrow = c(1, 2))
> graphviz.compare(aracne(learning.test), aracne(incomplete))
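The graphical comparison can be complemented with a numeric one. The sketch below rebuilds the incomplete data set so that it runs on its own, and assumes that bnlearn's compare() accepts the undirected graphs returned by chow.liu(). The seed and the object names complete.tree, incomplete.tree and cmp are arbitrary choices made for this illustration.

```r
library(bnlearn)

# Arbitrary seed, only to make the pattern of missing values reproducible.
set.seed(1)

# Introduce 100 missing values at random positions, as in the example above.
missing = matrix(FALSE, nrow(learning.test), ncol(learning.test))
missing[sample(length(missing), 100)] = TRUE
incomplete = learning.test
incomplete[missing] = NA

# Learn the Chow-Liu tree from the complete and the incomplete data.
complete.tree = chow.liu(learning.test)
incomplete.tree = chow.liu(incomplete)

# Count the arcs shared by the two trees (tp) and the arcs found in only one
# of them (fp, fn), taking the tree from the complete data as the reference.
cmp = compare(complete.tree, incomplete.tree)
unlist(cmp)
```

If the two trees coincide, the fp and fn counts should both be zero. The same check can be repeated with aracne() in place of chow.liu().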
The considerations on how constraint-based algorithms handle missing data apply to local discovery algorithms as well.
Mon Aug 5 02:46:38 2024 with bnlearn 5.0 and R version 4.4.1 (2024-06-14).