Basic Statistical Methods -- Probability
The basic approach statistical methods adopt to deal with uncertainty is via the axioms of probability:
- Probabilities are (real) numbers in the range 0 to 1.
- A probability of P(A) = 0 indicates that A is certainly false, P(A) = 1 that A is certainly true, and values in between some degree of (un)certainty.
- Probabilities can be calculated in a number of ways.
Very simply, when all outcomes are equally likely:
Probability = (number of desired outcomes) / (total number of outcomes)
So, given a full, normal deck of playing cards, the probability of being dealt an ace is 4 (the number of aces) / 52 (the number of cards in the deck), which is 1/13. Similarly, the probability of being dealt a spade is 13 / 52 = 1/4.
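The card figures above can be checked by counting outcomes directly. The following is a minimal sketch, assuming a standard 52-card deck in which every card is equally likely; the helper name `probability` and the rank and suit labels are illustrative choices, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Assumed representation of a standard deck: 13 ranks x 4 suits.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))


def probability(event, outcomes):
    """Desired outcomes / total outcomes, assuming all are equally likely."""
    desired = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(desired, len(outcomes))


p_ace = probability(lambda card: card[0] == "A", DECK)
p_spade = probability(lambda card: card[1] == "spades", DECK)
print(p_ace, p_spade)  # 1/13 1/4
```

Using `Fraction` keeps the results exact, so they can be compared with the hand calculation without rounding.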
If you choose k items from a set of n items, then the formula
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
is applied to find the number of ways of making this choice (! = factorial).
So the chance of winning the national lottery (choosing 6 from 49) is 1 in 13,983,816.
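The number of ways of choosing k items from n can be computed from factorials. This is a minimal sketch; the function name `n_choose_k` is illustrative, and the cross-check against `math.comb` assumes Python 3.8 or later.

```python
import math


def n_choose_k(n, k):
    """Number of ways to choose k items from n: n! / (k! * (n - k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))


lottery_ways = n_choose_k(49, 6)
# Cross-check against the standard library's own implementation.
assert lottery_ways == math.comb(49, 6)
print(lottery_ways)  # 13983816
```

Since exactly one of these combinations wins, the chance of winning with a single ticket is 1 in 13,983,816.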
- Conditional probability, P(A|B), indicates the probability of event A given that we know event B has occurred.
- Bayes' theorem states:
$$P(H_i \mid E) = \frac{P(E \mid H_i)\,P(H_i)}{\sum_{k=1}^{n} P(E \mid H_k)\,P(H_k)}$$
- This reads: given some evidence E, the probability that hypothesis H_i is true is equal to the ratio of the probability that E will be true given H_i times the a priori probability of H_i, and the sum, over the set of all hypotheses H_k, of the probability of E given H_k times the probability of H_k.
- The set of all hypotheses must be mutually exclusive and exhaustive.
- Thus, to diagnose an illness from medical evidence, we must know the prior probability of each illness and also the probability of observing the symptoms given each illness.
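The diagnosis case can be sketched numerically. The block below is a minimal illustration of Bayes' theorem over a set of hypotheses that is assumed to be mutually exclusive and exhaustive. The illness names, the priors and the likelihoods are all invented for illustration and carry no medical meaning; the function name `posterior` is likewise hypothetical.

```python
import math

# Hypothetical priors P(H): must sum to 1 (mutually exclusive, exhaustive).
PRIORS = {"measles": 0.01, "flu": 0.10, "neither": 0.89}
# Hypothetical likelihoods P(E | H) for the evidence E = "patient has spots".
LIKELIHOODS = {"measles": 0.90, "flu": 0.05, "neither": 0.01}


def posterior(priors, likelihoods):
    """Return P(H | E) for every hypothesis H using Bayes' theorem."""
    if not math.isclose(sum(priors.values()), 1.0):
        raise ValueError("priors must sum to 1 (exhaustive hypothesis set)")
    # Numerator for each hypothesis: P(E | H) * P(H).
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator: sum over all hypotheses of P(E | H) * P(H).
    evidence = sum(joint.values())
    return {h: value / evidence for h, value in joint.items()}


for hypothesis, p in posterior(PRIORS, LIKELIHOODS).items():
    print(f"P({hypothesis} | spots) = {p:.3f}")
```

With these made-up numbers, observing spots raises the probability of measles well above its prior, even though measles remains less likely than not, because the other hypotheses have far larger priors.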
Bayesian statistics lie at the heart of most statistical reasoning systems.
How is Bayes' theorem exploited?
- The key is to formulate the problem correctly:
P(A|B) states the probability of A given only B's evidence. If there is other relevant evidence, then it must also be considered.
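The effect of leaving out relevant evidence can be seen by counting cards. This is a minimal sketch, assuming a standard 52-card deck with equally likely cards; the helper `conditional` and the predicate names are illustrative. It computes P(A|B) as the fraction of outcomes satisfying B that also satisfy A.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))


def conditional(event, given, outcomes):
    """P(event | given) by counting, assuming equally likely outcomes."""
    condition_holds = [o for o in outcomes if given(o)]
    both = sum(1 for o in condition_holds if event(o))
    return Fraction(both, len(condition_holds))


def is_king(card):
    return card[0] == "K"


def is_face(card):
    return card[0] in {"J", "Q", "K"}


def is_face_not_jack(card):
    # Extra evidence: the card is a face card AND is known not to be a jack.
    return is_face(card) and card[0] != "J"


p_king_given_face = conditional(is_king, is_face, DECK)
p_king_given_more = conditional(is_king, is_face_not_jack, DECK)
print(p_king_given_face, p_king_given_more)  # 1/3 1/2
```

The answer changes from 1/3 to 1/2 once the additional evidence is taken into account, which is why every relevant piece of evidence has to appear in the condition.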
Herein lies a problem:
- All events must be mutually exclusive. However, in real-world problems events are generally related. For example, in diagnosing measles, the symptoms of spots and a fever are related. This means that computing the conditional probabilities gets complex.
In general, given prior evidence p and some new observation N, computing
$$P(H \mid N, p) = P(H \mid N)\,\frac{P(p \mid N, H)}{P(p \mid N)}$$
grows exponentially for large sets of p.
- All events must be exhaustive. This means that in order to compute all probabilities the set of possible events must be closed. Thus, if new information arises, the set must be created afresh and all probabilities recalculated.
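The exponential growth mentioned for related evidence can be made concrete by enumerating the combinations of evidence that may need their own conditional probability. This is a rough sketch under the assumption that each piece of evidence is simply present or absent; the function name `evidence_combinations` and the symptom names are illustrative.

```python
from itertools import product


def evidence_combinations(symptoms):
    """Every present/absent assignment over the given symptoms."""
    return [
        dict(zip(symptoms, values))
        for values in product([False, True], repeat=len(symptoms))
    ]


combos = evidence_combinations(["spots", "fever"])
print(len(combos))  # 4

# With n related binary symptoms there are 2**n combinations per hypothesis.
for n in (2, 10, 20):
    print(n, 2 ** n)
```

If the symptoms cannot be treated separately, a probability is needed for each combination and each hypothesis, so twenty symptoms already imply over a million entries.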
Thus simple Bayes rule-based systems are not suitable for uncertain reasoning.
- Knowledge acquisition is very hard.
- Too many probabilities are needed -- too large a storage space.
- Computation time is too large.
- Updating with new information is difficult and time consuming.
- Exceptions like "none of the above" cannot be represented.
- Humans are not very good probability estimators.
However, Bayesian statistics still provide the core of reasoning in many uncertain reasoning systems, with suitable enhancements to overcome the above problems.
We will look at three broad categories:
- Certainty factors,
- Dempster-Shafer models,
- Bayesian networks.