Knowledge elicitation: issues and methods
Author: Anna Hart
Journal: Computer-Aided Design, Vol. 17, No. 9, pp. 445-462, November 1985
*Expertise
*Knowledge elicitation
*Conclusions
*References
EXPERTISE
Expertise is a collection of knowledge, experience, skills, techniques, facts, rules and so on, together with the ability to apply them to reach goals.
Human beings solve problems with less well defined methods than formal logic, and the methods they use are not fully understood.
Experts try methods and approaches, developing rules of thumb or heuristics. ‘Brain problems’ do not demand the correct answer, but an adequate one. This involves weighing up different pieces of evidence in order to select a path from the several available. The potential outcomes of different paths need to be assessed and compared with the goal; the path with the best-looking outcome is chosen.
The expert can often tell you the decision, i.e. ‘what’, but not describe the process, i.e. the details of ‘how’. Expertise is difficult to teach and to describe: experts use it without knowing what they are doing, but confident that their methods are effective.
Welbank gives a good definition of an expert system as ‘a program which uses Artificial Intelligence techniques to do the same type of task as an expert does’, i.e. complex inferential reasoning based on a wide knowledge of a limited domain1.
The smaller projects with fairly modest aims are the most likely to succeed.
Conclusions
The interface between users and an expert system must be ‘intelligent’2. It should be able to guide the user in the selection of routines, parameters and commands, and tailor the output to the user’s needs. It can build a simple model of the user and behave accordingly.
An expert can look at a possible design and say, ‘That is no good, because …’. Such a judgement depends on knowledge about constraints on design, and heuristics about what is good, efficient, or known to work. If this knowledge can be coded, then the design process can be made more efficient.
The main problem here is that the knowledge varies greatly according to the items being designed, i.e. it may be possible to encode the knowledge for designing metal pipes, but not for the design of hollow metal objects in general (see Harvey3).
KNOWLEDGE ELICITATION
Knowledge elicitation is the process of acquiring knowledge from a domain expert for entry into the knowledge base of an expert system.
*Method of elicitation
The obvious method of elicitation is to identify an
expert, and question him, or to get a group of
experts to talk to each other.
* Difficulties in interview
* Variations on the basic interview technique
* Definition
Protocol analysis is based on a transcribed interview, but attempts to structure the process and produce more meaningful results.
* Advantages of Protocol Analysis Method
Experts find it much easier to talk about specific
examples of problems than to talk in abstract terms.
They find it much easier to answer questions like
‘How did you know that this design would not work?’
than ‘What makes a poor design?’. From comments
on specific examples it is possible to detect general
patterns, e.g. the expert may always look at one
particular characteristic first. It is easier to structure
the knowledge into groups and concepts.
* Areas where protocol analysis has been used successfully
* Definition:
Inducing rules from knowledge contained in a set of
examples.
* Difference between induction and deduction
Induction is the converse of deduction. In deduction, we are given a general rule from which it is possible to deduce facts about specific cases (general rule → specific facts). Induction works the other way round: given a set of specific examples, we investigate the examples and induce general rules or patterns (specific examples → general rules).
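The contrast can be sketched in a few lines of Python. This is a hypothetical illustration, not from the article: the attribute (‘cost’) and the rule are invented for the example.

```python
# Deduction: apply a known general rule to a specific case
# (general rule -> specific fact).
def deduce(cost):
    return "reject" if cost == "very high" else "accept"

# Induction: infer a general pattern from specific (case, outcome)
# examples (specific examples -> general rule). Here we simply collect
# the attribute values that were only ever seen with "reject".
def induce(examples):
    rejected = {cost for cost, outcome in examples if outcome == "reject"}
    accepted = {cost for cost, outcome in examples if outcome == "accept"}
    return rejected - accepted  # values that predict rejection so far

examples = [("very high", "reject"), ("low", "accept"), ("high", "accept")]
print(induce(examples))  # the induced pattern, valid only for this training set
```

Note that the induced pattern is only guaranteed to hold for the examples seen, which is exactly the limitation discussed below for training sets.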
* Advantages of Induction
* Disadvantages of Induction
* Conclusion of Induction
If a training set is available then induction can be
useful. It should be viewed as a method of raising
questions as well as answering them. It can
identify contradictions, gaps, interesting cases or
important attributes: with current technology it
is unlikely to replace consultation with the expert.
* Training set example
Table 1 shows a very simple example of a training set
and the induced rules. The training set consists of
three different attributes, and the action is a yes/no
decision.
[Figure 1. Decision trees for example 1: (a) possible ‘true’ rules, branching on match, then cost, then accuracy; (b) rules induced from the training set, branching on match (good → no) and then cost (very high → yes; high or low → no).]

Table 1. Training set for example 1

Required accuracy | Match | Cost of item | Action: change process?
------------------|-------|--------------|------------------------
High              | Good  | Very high    | No
Low               | Good  | Very high    | No
Low               | Poor  | High         | No
Very high         | Good  | High         | No
Medium            | Poor  | Very high    | Yes
Low               | Poor  | Very high    | Yes
Low               | Poor  | Low          | No
High              | Good  | High         | No
Low               | Good  | Low          | No
Very high         | Good  | Very high    | No
High              | Poor  | Low          | No
High              | Poor  | Very high    | Yes
Very high         | Good  | Low          | No
Altogether there are 24 possible combinations of values for these attributes, and so a complete training set would have 24 examples. In this case only 13 cases have been covered, and the resulting induced rules are shown in Figure 1(b). The important principle to appreciate here is that the induced rules work exactly for every example in the training set. However, it is possible that the ‘true’ rules are shown in Figure 1(a). If this is the case then the induced rules will give the incorrect answer for some types of examples.
This emphasizes the importance of a complete training set. Notice that the terms high, low, good, poor, etc. have to be defined if they are to be useful.
Some attributes may not appear in the induced rules. In Table 1 the attribute ‘accuracy’ appears redundant. In this case this is caused by the incompleteness of the training set. It is also possible for an attribute not to appear because it is highly correlated with other attributes which do feature in the induced rules. All these issues must be discussed with the expert.
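As a concrete sketch of how such rules can be induced, the following Python implements a minimal ID3-style decision-tree learner over the 13 examples of Table 1. The article does not specify which induction algorithm was used; this is one standard choice, shown here under that assumption.

```python
import math
from collections import Counter

# The 13 training examples from Table 1:
# (required accuracy, match, cost of item) -> change process?
DATA = [
    ("high", "good", "very high", "no"),
    ("low", "good", "very high", "no"),
    ("low", "poor", "high", "no"),
    ("very high", "good", "high", "no"),
    ("medium", "poor", "very high", "yes"),
    ("low", "poor", "very high", "yes"),
    ("low", "poor", "low", "no"),
    ("high", "good", "high", "no"),
    ("low", "good", "low", "no"),
    ("very high", "good", "very high", "no"),
    ("high", "poor", "low", "no"),
    ("high", "poor", "very high", "yes"),
    ("very high", "good", "low", "no"),
]
ATTRS = ["accuracy", "match", "cost"]

def entropy(rows):
    # Shannon entropy of the yes/no decision over a set of rows
    counts = Counter(r[-1] for r in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def induce(rows, attrs):
    """ID3-style induction: returns a nested-dict decision tree."""
    outcomes = {r[-1] for r in rows}
    if len(outcomes) == 1:          # pure node: emit the decision
        return outcomes.pop()
    if not attrs:                   # no attributes left: majority vote
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    # pick the attribute whose split leaves the lowest weighted entropy
    def split_entropy(a):
        i = ATTRS.index(a)
        groups = {}
        for r in rows:
            groups.setdefault(r[i], []).append(r)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    best = min(attrs, key=split_entropy)
    i = ATTRS.index(best)
    return {best: {value: induce([r for r in rows if r[i] == value],
                                 [a for a in attrs if a != best])
                   for value in {r[i] for r in rows}}}

tree = induce(DATA, ATTRS)
print(tree)
```

On this training set the learner splits on ‘match’ first and then on ‘cost’, reproducing the induced rules of Figure 1(b); ‘accuracy’ never appears, illustrating the redundant-attribute point made above.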
* Definition
The repertory grid is a method of investigating such a model, i.e. the expert’s personal view of a problem, and can be used effectively6 in knowledge elicitation.
Much of the difficulty in knowledge elicitation lies in the fact that the expert cannot easily describe how he views a problem. He may not distinguish between facts or beliefs and the factors which actually influence his decision-making. Much of his expertise lies in the way in which he views problems, i.e. his perception or insight (see Chi, Feltovich and Glaser7). This is essentially a psychological problem.
The model consists of elements and constructs. The constructs are analogous to attributes in induction, except that they must be bipolar, e.g. strong/weak, true/false, large/small, heavy/light. ‘Colour’, for example, would not be a construct, but ‘degree of redness’ would. Constructs are the way in which pairs of elements can be described as either alike or different, e.g. A is strong and B is weak; C and D are both true.
Elements are analogous to examples in induction, and
they are chosen by the user as elements which are
important to him. There is no right answer.
First of all, it is essential to define a particular problem
for the expert to think about. He then produces the
elements and constructs which he considers relevant to
this particular problem. The grid is a system of cross-
references between constructs and elements for that
problem. It is successively refined until the user is
happy with the result. In this manner the expert is
forced to investigate how he thinks about the problem.
There are various ways in which the grids can be
elicited. Elements chosen by the expert are those which
seem most relevant to the problem under discussion.
Constructs can be supplied by the expert in a similar
manner, or by a systematic analysis of the elements.
One method is to select groups of three elements at
random, and then to ask the expert in what way two of
them are alike and different from the third; the
construct describes this distinction. Having
supplied the construct the expert rates each element
according to this construct. This rating can be
true/false or a subjective rating on a numerical scale 1
to N (N=5, 7, …). At any stage, the expert can add more
elements or constructs, or alter entries in the grid. In
this way the process heightens his awareness of how he
views the problem.
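The elicitation loop above can be sketched in Python. This is a hypothetical illustration: the element names, construct poles and ratings are invented, and only the triad-prompting and rating steps are shown.

```python
import random

# Elements are the expert's chosen examples; constructs are bipolar
# (pole, opposite) pairs; the grid holds ratings on a 1..5 scale.
elements = ["design A", "design B", "design C", "design D"]
constructs = []   # list of (pole, opposite) pairs
grid = {}         # (element, construct index) -> rating 1..5

def next_triad(rng=random.Random(0)):
    """Pick three elements at random; the expert says how two of them
    are alike and different from the third, yielding a new construct."""
    return rng.sample(elements, 3)

def add_construct(pole, opposite, ratings):
    """Record a construct and the expert's rating of every element."""
    constructs.append((pole, opposite))
    for element, rating in ratings.items():
        assert 1 <= rating <= 5, "ratings are on a 1..5 scale"
        grid[(element, len(constructs) - 1)] = rating

# e.g. the expert supplies strong/weak and rates each element on it
add_construct("strong", "weak",
              {"design A": 1, "design B": 5, "design C": 2, "design D": 4})
print(next_triad())  # three elements to show the expert next
```

At any point the expert could add elements, add constructs, or revise ratings, exactly as described above; the grid is simply a cross-reference table between the two.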
This process of making the expert rethink is, in itself,
invaluable, and could be a very useful technique for
extracting attributes for induction or forming the basis
of a consultation. However, it is also possible to analyse
the grid by computer to identify patterns. One useful
method is to use cluster analysis (see Shaw8) to identify
constructs which are similar (or correlated) and also
elements which are similar. On the basis of such results
the grid can be reordered, or focused, to represent a
coherent model of the expert’s view. If he disagrees
with any results then the expert can modify the grid
until it best represents his perception of the issues. The
programs are tools to assist the expert, not to
contradict him.
* Grid example showing programming evaluation
Figure 2 shows some results from a consultation with a
project leader. He was evaluating the programming
skills of his programmers. Before the consultation he
was able to describe ‘good programs’ in only abstract
terms such as maintainability, which he could not
easily define. He was asked to name specific
programmers and the features of their programs. As he
named the programmers, he rated their work. Figure 2
shows the results of his investigation. He gave enough
elements to summarize programs, and enough
constructs to distinguish between them. He was happy
that this described the main characteristics of
programs. At this stage he did not wish to add any
more names (elements) or characteristics (constructs)
or alter any of the ratings.
The grid was analysed using a simple computer
program. The measure of difference between two
examples was the sum of the absolute values of
differences in ratings. This enables similar elements to
be placed close to each other in the grid. When
comparing constructs it is necessary not only to
measure the differences between all pairs of constructs,
but also to compare constructs with reversed
constructs. (In this case, with ratings on a 1 to 5 scale, if the rating on construct C is n, then its rating on the reverse of C, written C', is 6 - n.)
This enables a clustering of constructs. Figure 3 shows
the output from this program where elements and
constructs have been reordered and construct 5 has
been reversed.
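The distance measures used in this analysis can be sketched as follows. The grid values are invented for illustration; the city-block distance and the 6 - n reversal rule are the ones described above.

```python
# A small grid: rows are constructs, columns are elements, ratings 1..5.
GRID = [
    [1, 5, 2, 4],
    [2, 5, 1, 4],   # rates elements much like construct 0
    [5, 1, 4, 2],   # roughly the reverse of construct 0
]

def element_distance(grid, i, j):
    # city-block distance: sum of absolute rating differences down a column
    return sum(abs(row[i] - row[j]) for row in grid)

def construct_distance(grid, a, b):
    # compare construct a with construct b directly, and also with the
    # reverse of b (reverse rating = 6 - n on a 1..5 scale)
    direct = sum(abs(x - y) for x, y in zip(grid[a], grid[b]))
    reverse = sum(abs(x - (6 - y)) for x, y in zip(grid[a], grid[b]))
    return min(direct, reverse)

print(construct_distance(GRID, 0, 1))  # small: constructs 0 and 1 cluster
print(construct_distance(GRID, 0, 2))  # small once construct 2 is reversed
```

Clustering on these distances is what allows similar elements, and similar (or reversed) constructs, to be placed next to each other when the grid is focused.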
In this example the output gives a representation of the
expert’s opinions. The expert found this useful in
clarifying his ideas, and made the following
observations based on the focused grid:
This example shows how a relatively poorly defined
problem can be clarified using this technique.
Induction would have been of little help at this stage
because there were many classifications, and the expert
was very unclear about the relationship between the
constructs. This method can identify correlations
between constructs: in induction correlated attributes
must be used with care. Techniques are available to
compare two grids from different people, to investigate
how their views differ, and also to analyse the focused
grid by further analysis of the concepts. This grid
method does not give rules as such, but identifies
concepts which are grouped or similar, and is very
useful in coding knowledge.
CONCLUSIONS
REFERENCES