Default value for the gamma parameter of the extended BIC
The extended BIC (eBIC) score was originally introduced by Foygel and Drton in their NIPS paper on Gaussian graphical models. It extends the classical BIC score with an additional penalty term, weighted by a penalty coefficient γ. Their simulation studies suggest that γ = 0.5 works well enough across different scenarios, which makes it a natural default value for Gaussian Bayesian networks (score = "ebic-g"). But is γ = 0.5 a good default for discrete Bayesian networks (score = "ebic") as well?
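As a minimal sketch of how the two scores are selected in practice, the snippet below runs hc() with each of them and sets γ explicitly. It assumes a bnlearn version that provides the eBIC scores (4.9 or later) and that the coefficient is passed through the gamma argument; the learning.test and gaussian.test data sets shipped with bnlearn are used purely as stand-ins and are not the networks studied here.

```r
library(bnlearn)

# Stand-in data sets shipped with bnlearn (not the networks used in this note).
data(learning.test)   # discrete variables
data(gaussian.test)   # continuous variables

# Discrete Bayesian network: extended BIC with an explicit gamma.
dag.discrete <- hc(learning.test, score = "ebic", gamma = 0.5)

# Gaussian Bayesian network: the Gaussian counterpart of the same score.
dag.gaussian <- hc(gaussian.test, score = "ebic-g", gamma = 0.5)

# Score of the learned discrete network under the same eBIC settings.
ebic.value <- score(dag.discrete, learning.test, type = "ebic", gamma = 0.5)
```

Setting gamma = 0 in the same calls should reproduce the classical BIC, which is what makes BIC a convenient baseline for the comparison.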
Using hc() to perform structure learning for 11 reference networks from the Bayesian network repository, with sample sizes between 0.1 * nparams(bn) and 5 * nparams(bn), we evaluated γ = {0, 0.125, 0.25, 0.5, 0.75, 1}. The value γ = 0 corresponds to the classical BIC score, so we can normalise the structural Hamming distance (SHD) as SHD(eBIC(γ)) / SHD(BIC) to put all the plots on a common scale. Values below 1 mean that eBIC improves on BIC, and the lower the normalised SHD, the larger the improvement.
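The normalisation described above can be sketched for a single network and a single sample size as follows. This is an illustration under stated assumptions, not the code behind the results: the small network behind learning.test stands in for the repository networks, the seed and the 0.5 multiplier are arbitrary, and shd() is called with its default settings.

```r
library(bnlearn)

# Assumption: the known structure behind learning.test plays the role of
# a reference network from the repository.
data(learning.test)
true.dag <- model2network("[A][C][F][B|A][D|A:C][E|B:F]")
fitted <- bn.fit(true.dag, learning.test)

# Sample size expressed as a multiple of the number of parameters.
n <- ceiling(0.5 * nparams(fitted))
set.seed(42)  # arbitrary seed, only for reproducibility of the sketch
sim <- rbn(fitted, n = n)

# Learn one network per gamma and record its SHD from the true network.
gammas <- c(0, 0.125, 0.25, 0.5, 0.75, 1)
shd.by.gamma <- sapply(gammas, function(g) {
  learned <- hc(sim, score = "ebic", gamma = g)
  shd(learned, true.dag)
})
names(shd.by.gamma) <- gammas

# gamma = 0 is the classical BIC, so it provides the denominator.
# Note: the ratio is undefined if BIC happens to recover the true network.
normalised <- shd.by.gamma / shd.by.gamma["0"]
```

A full replication would repeat this over several simulated data sets for each sample size and each network, then summarise the normalised values.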
In the small sample size regime, between 0.1 * nparams(bn) and 0.5 * nparams(bn), the value γ = 0.5 sometimes gives the largest improvement and sometimes falls in the middle of the range, but it is never the worst performer among the values under consideration.
The same is true in the large sample size regime, between 1 * nparams(bn) and 5 * nparams(bn).
In conclusion, γ = 0.5 seems a reasonable default choice that can perform well and never performs the worst among γ = {0, 0.125, 0.25, 0.5, 0.75, 1}. If nothing else, it ensures better performance than BIC in the vast majority of simulations in the small sample size regime.
Tue Nov 29 18:24:11 2022 with bnlearn 4.9-20221107 and R version 4.2.2 Patched (2022-11-10 r83330).