New Techniques for Learning Parameters in Bayesian Networks.
Abstract
One of the hardest challenges in building a realistic Bayesian network (BN) model is
to construct the node probability tables (NPTs). Even with a fixed predefined model
structure and very large amounts of relevant data, machine learning methods do not
consistently achieve high accuracy, relative to the ground truth, when learning the
NPT entries (parameters). Hence, it is widely believed that incorporating expert judgment
or related domain knowledge can improve parameter learning accuracy, especially
when data are sparse. Expert judgments come in many forms.
In this thesis we focus on expert judgment that specifies inequality or equality relationships
among variables. Related domain knowledge is data that comes from a different
but related problem.
By exploiting expert judgment and related knowledge, this thesis makes the following
novel contributions to improving BN parameter learning performance:
• The multinomial parameter learning model with interior constraints (MPL-C)
and exterior constraints (MPL-EC). The model is itself an auxiliary BN that
encodes the multinomial parameter learning process together with constraints
elicited from expert judgments.
• The BN parameter transfer learning (BNPTL) algorithm. Given some potentially
related (source) BNs, this algorithm automatically searches for the most relevant
source BNs and BN fragments, and fuses the selected source and target parameters
in a robust way.
• A generic BN parameter learning framework. This framework uses both expert
judgments and transferred knowledge to improve learning accuracy, transferring
data statistics mined from the source network as the parameter priors of the
target network.
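To make the first contribution concrete: a minimal sketch of the idea behind constraint-aware parameter learning, assuming a hypothetical expert judgment that one state of a node is at least as probable as another. This is only an illustration of combining sparse counts with such a constraint (via Dirichlet smoothing and a simple projection), not the auxiliary-BN formulation of MPL-C itself.

```python
import numpy as np

def mle(counts):
    """Plain maximum-likelihood estimate of a multinomial NPT column."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def constrained_estimate(counts, alpha=1.0):
    """Dirichlet-smoothed estimate that then enforces the (hypothetical)
    expert constraint theta[0] >= theta[1] by averaging a violating pair."""
    theta = np.asarray(counts, dtype=float) + alpha  # add pseudo-counts
    theta /= theta.sum()                             # normalise to a distribution
    if theta[0] < theta[1]:                          # project onto the constraint
        theta[0] = theta[1] = (theta[0] + theta[1]) / 2.0
    return theta

# With only 5 observations, the raw MLE violates the expert's judgment;
# the constrained estimate respects it while staying a valid distribution.
raw = mle([1, 3, 1])
con = constrained_estimate([1, 3, 1])
```

The point of the sketch is that with sparse data the smoothed, constrained estimate can sit closer to the true NPT than the raw frequencies, which is the effect the MPL-C/MPL-EC experiments in the thesis measure.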
Experiments based on BNs from a well-known repository, as well as two real-world
case studies using different data sample sizes, demonstrate that the proposed new
approaches achieve much greater learning accuracy than other state-of-the-art
methods when data are relatively sparse.
Authors
Zhou, Yun