George F. Luger & William A. Stubblefield, "Reasoning with uncertain or incomplete information"
in Artificial Intelligence:
Structures and strategies for complex problem solving (3rd ed)
pp. 247-268
Addison Wesley, 1998
Summary
by
Saira Qureshi
CIS 718: Expert Systems
Instructor: Dr. Danny Kopec
Introduction:
Uncertainty is the lack of adequate information to make a decision. It may prevent us from making the best decision, and may even cause us to make a bad one. Uncertainty is also very common in real-world problems, where a set of heuristics is required to determine an approximate answer when an exact one cannot be achieved.
Consider an example:
If the engine does not turn over, and
The lights do not come on then
The problem is battery or cables.
This rule appears to be sound, but it is not: it is possible, though very unlikely, that the fault lies in both the starter motor and the headlights. This is an example of Abductive Reasoning, or Inference to the Best Explanation. Abductive reasoning has the form:
Q is a collection of data (facts, observations, givens),
P explains Q (would, if true, explain Q),
No other hypothesis explains Q as well as P does.
--------------------------------------------------------
Therefore, P is probably correct.
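The schema above can be sketched in code. This is a minimal illustration, not an algorithm from the text: the candidate hypotheses and their explanatory scores are hypothetical numbers chosen for the car example.

```python
# Abductive "inference to the best explanation": pick the hypothesis
# that best explains the observed data Q. Scores are hypothetical.
def best_explanation(scores):
    """Return the hypothesis whose explanatory score for Q is highest."""
    return max(scores, key=scores.get)

# Hypothetical scores for "engine does not turn over, lights do not come on"
scores = {
    "dead battery or bad cables": 0.9,
    "faulty starter motor and faulty headlights": 0.1,
}
best = best_explanation(scores)  # "dead battery or bad cables"
```

Note that, as the text warns, the best available explanation is only *probably* correct; a low-scoring hypothesis may still be the true one.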
In knowledge-based systems, we often attach a certainty factor to a rule to measure our confidence in its conclusion. E.g., the rule P → Q (.9) expresses the belief "if you believe P to be true, then you believe Q with confidence .9".
Uncertainty Methods
Ad hoc
In this method uncertainty is represented as a Degree of Belief called a Certainty Factor (CF). Certainty factors express belief in an event (or fact, or hypothesis) based on evidence (or the expert's assessment). Certainty factors range from -1 to +1. As the certainty factor (CF) approaches 1, the evidence for a hypothesis grows stronger. As the CF approaches -1, the confidence against the hypothesis grows stronger. A CF around 0 indicates that there is little evidence either for or against the hypothesis.
Bayesian Reasoning
P(d|s) = P(d) × P(s|d) / P(s)
"The number of people having both (intersection) the disease D and symptom S divided by the total number of people having symptom S."
P(Hi|E) = P(E|Hi) × P(Hi) / Σk P(E|Hk) × P(Hk)
Where
P(Hi|E) is the probability that Hi is true given evidence E.
P(Hi) is the prior probability that Hi is true overall.
P(E|Hi) is the probability of observing evidence E when Hi is true.
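The formula above can be sketched directly in code. The hypotheses, priors, and likelihoods below are made-up illustrative numbers for a diagnosis example, not data from the text.

```python
# Bayes' theorem over a set of competing hypotheses:
# P(Hi|E) = P(E|Hi) P(Hi) / sum_k P(E|Hk) P(Hk)
def posterior(priors, likelihoods):
    """Return P(Hi|E) for every hypothesis Hi, normalised over all Hk."""
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}

# Hypothetical example: diagnosing the cause of a fever
priors = {"flu": 0.10, "cold": 0.30, "healthy": 0.60}       # P(Hi)
likelihoods = {"flu": 0.90, "cold": 0.70, "healthy": 0.05}  # P(fever|Hi)
post = posterior(priors, likelihoods)
```

The denominator (the total probability of the evidence) guarantees the posteriors sum to 1; note how "cold", despite a lower likelihood than "flu", wins on its higher prior.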
The Dempster-Shafer Theory of Evidence
It is a more general approach to representing uncertainty than the Bayesian approach. It is particularly useful when the decision to be made is based on the amount of evidence that has been collected. It is appropriate for combining expert opinions, since experts differ in their opinions with a certain degree of ignorance. This approach distinguishes between uncertainty and ignorance by creating belief functions. It also assumes that the sources of information to be combined are statistically independent. The basic idea in representing uncertainty in this model is:
Set up a confidence interval -- an interval of probabilities within which the true probability lies with a certain confidence -- based on the Belief B and plausibility PL provided by some evidence E for a proposition P.
The measures of belief and disbelief, MB and MD (defined in the Stanford certainty factor algebra below), can be tied together:
CF(H|E) = MB(H|E) - MD(H|E)
As CF approaches 1, the confidence for H grows stronger.
As CF approaches -1, the confidence against H grows stronger.
As CF approaches 0, there is little support either for belief or for disbelief in H.
A basic probability assignment m(S), a belief BELIEF(S), and a plausibility PLAUSIBILITY(S) all have values in the interval [0,1].
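These three quantities can be sketched from first principles: belief sums the masses of all subsets of S, and plausibility sums the masses of all subsets that overlap S. The frame and masses below are hypothetical examples; subsets are represented as frozensets.

```python
# Belief and plausibility from a basic probability assignment m
# over subsets of a frame of discernment (here, hypothetically {a, b, c}).
def belief(m, S):
    """BELIEF(S) = sum of m(X) over all X that are subsets of S."""
    return sum(v for X, v in m.items() if X <= S)

def plausibility(m, S):
    """PLAUSIBILITY(S) = sum of m(X) over all X that intersect S."""
    return sum(v for X, v in m.items() if X & S)

# Hypothetical masses; they sum to 1 as required
m = {frozenset("a"): 0.4, frozenset("ab"): 0.3, frozenset("abc"): 0.3}
S = frozenset("a")
interval = (belief(m, S), plausibility(m, S))  # confidence interval for S
```

Here the mass assigned to larger sets like {a, b, c} represents ignorance: it supports the plausibility of {a} without committing belief to it, which is exactly the uncertainty/ignorance distinction the text describes.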
Combining beliefs
To combine multiple sources of evidence bearing on a single (or multiple) hypothesis, do the following. Suppose M1 and M2 are two belief functions. Let X be the set of subsets of Ω to which M1 assigns a nonzero value, and let Y be the similar set for M2. A new belief function M3, the combination of the beliefs M1 and M2, is then given by:
M3(Z) = Σ{X∩Y=Z} M1(X) M2(Y) / (1 - Σ{X∩Y=∅} M1(X) M2(Y))
The Stanford Certainty Factor Algebra
This approach was first used in MYCIN. It is based on several observations. Here facts and the outcomes of rules are given a confidence factor between -1 and +1:
+1 means fact is known to be true.
-1 means fact is known to be false.
Here the knowledge content of the rules is more important than the algebra of confidences holding the system together. In this approach, confidence measures correspond to informal assessments by humans such as "it is probably true" or "it is often the case." It also separates "confidence for" from "confidence against":
MB(H|E) - The measure of belief in hypothesis H given evidence E.
MD(H|E) - The measure of disbelief of H given E.
(In the Dempster-Shafer model above, by contrast, belief brings together all the evidence that would lead us to believe in P with some certainty, and plausibility brings together the evidence that is compatible with P and is not inconsistent with it.)
Rules of the algebra:
CF(P1 AND P2) = min(CF(P1), CF(P2))
CF(P1 OR P2) = max(CF(P1), CF(P2))
Given a rule IF P THEN Q with rule confidence CF(R):
CF(Q) = CF(P) × CF(R)
If two rules R1 and R2 both support Q, with CF(QR1) and CF(QR2), the combined CF(Q) is found as follows:
If both are positive: CF(Q) = CF(QR1) + CF(QR2) - CF(QR1) × CF(QR2)
If both are negative: CF(Q) = CF(QR1) + CF(QR2) + CF(QR1) × CF(QR2)
If of opposite signs: CF(Q) = (CF(QR1) + CF(QR2)) / (1 - min(|CF(QR1)|, |CF(QR2)|))
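The rules of the algebra translate directly into code. This is a minimal sketch of the combination functions, with hypothetical CF values in the comments:

```python
# The Stanford certainty factor algebra (as used in MYCIN).
def cf_and(cf1, cf2):
    """CF of a conjunction of premises: the weakest link."""
    return min(cf1, cf2)

def cf_or(cf1, cf2):
    """CF of a disjunction of premises: the strongest alternative."""
    return max(cf1, cf2)

def cf_rule(cf_premise, cf_rule_strength):
    """CF(Q) for IF P THEN Q: attenuate the rule strength by CF(P)."""
    return cf_premise * cf_rule_strength

def cf_combine(cf1, cf2):
    """Combine the CFs of two rules that both conclude Q."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 - cf1 * cf2
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 + cf1 * cf2
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

combined = cf_combine(0.6, 0.4)  # two agreeing positive rules -> 0.76
```

Note that cf_combine never pushes the result past +1 or -1, and that agreeing evidence reinforces: two rules of CF 0.6 and 0.4 yield 0.76, stronger than either alone.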
Causal Networks
This approach is well suited to the kind of descriptions provided by experts (e.g. CASNET, ABEL). Causal models depict relationships as links between nodes in a graph or network of nodes, which is critical to explanatory power. Developed mostly for medical applications, these networks provide multiple levels of detail linking clinical findings to pathophysiological states. They provide focusing mechanisms for diagnostic investigation, exhibit further information-seeking behavior, and provide a mechanism for accounting for disease interactions.
Limitations
A lot of detail is required.
It is not immediately clear how to choose which level to reason at, or how to combine reasoning at different levels.
It is not a purely mechanistic approach; it remains unclear when the problem space is adequately covered.