Bayesian networks are a knowledge representation formalism. They were originally developed to capture and use the domain knowledge of some human expert in order to develop automated reasoning systems.
A Bayesian network is a graph where nodes represent stochastic variables and (arrowhead) arcs represent dependencies among these variables. A variable can take a finite set of values. For instance, suppose that you want to represent the variable Gender. The variable Gender may take two possible values: Male and Female. The assignment of a value to a variable is called the state of the variable. So, the variable Gender has two states: Gender = Male and Gender = Female. The graphical structure of a Bayesian network looks like this:
The network represent notion that Obesity and Gender affects the Heart_Condition of a patient. The variable Obese can take three values: Yes, Borderline and No. The variable Heart_Condition has two states: True and False.
We said that the variables used in a Bayesian networks are stochastic. This mean that the assignment of a value to a variable is represented by a probability distribution. For instance, if we do not know for sure what is the gender of a patient, we may want to encode the information that we have better chances of having a female patient rather than a male one. This guess could be based, for instance, on statistical considerations, but this may not be our unique source of information. So, for the sake of this example, let's say that there are 80% chances of being Female and 20% of being Male.
Similarly, we can encode that the incidence of obesity is 10% and there is a 20% of borderline cases.
The following set of distributions tries to encode the fact that obesity increases the cardiac risk of a patient, but this effect is more significant in men than women.
The dependency is modeled by a set of probability distributions, one for each combination of states of the variables Gender and Obesity, called the parent variables of Heart_Condition.
Once the network has been defined, it can be used to solve problems, make predictions, and find explanations. In a word, they can be used for reasoning about real or hypothetical cases.
When compared with traditional production rule-based systems, systems developed using Bayesian networks do not encode heuristic rules but explicit representation of dependencies dictated by the domain knowledge. This feature makes easier the acquisition and maintenance of the knowledge base.