Author: Guido Naudts, e-mail:
naudts_vannoten@yahoo.com
RDF
is one of the pillars in the SemanticWeb Initiative . In order to realize
the semantic web, some elementary reasoning power will be necessary. This
reasoning power is expressed by means of rules. If I want to make an
appointment with a cardiologist and in hospital A none is available, my program needs a rule:
If none
is available and this is the first try then contact another hospital.
Another rule might exist also:
If none
is available and this is not the first try then look at site http://blabla/#doctors_list.
Btw, in N3 (Notation3 ) rules are written like:
which has
the same meaning as:
if A
then B.
A Semantic Web file consists of RDF
triples and rules.
A user of this file will want to know
something: he will make a query.
As an
answer to his query, he will recieve one or more solutions. E.g. I ask the
(RDF) database about cardiologists that are available I will recieve probably
more than one answer.
These
solutions can be obtained in two ways:
1)
with a
forward reasoning process
2)
with a
backward reasoning process
When
executing a query one or more axiom
files are given and one query file. E.g. for RDFEngine:
python
proof.py gedcom-facts.n3 gedcom-relations.n3 gedcom.query.n3
In Euler(java version) this becomes:
java Euler gedcom-facts.n3 gedcom-relations.n3 gecom.query.n3 and in CWM this would be: python cwm gedcom-facts.n3 gedcom-relations.n3 gedcom.query.n3 --think
The axiom
files contain the basic statements and rules and the query file of course
contains the query.
I will
start the explanation of these two fundamental processes with a very simple
example in notation 3 . I will assume that the reader has some basic knowledge of notation 3.
:Frank :mother :Christine.
:Christine :mother :Elza.
{?a :mother ?b. ?b :mother ?c} => {?a
:grandmother ?c}.
Now a
user wants to know who is the grandmother of Frank.
So he asks:
:Frank :grandmother ?who.
Two basic ways are often used for solving this
riddle (other techniques exist also).
a) forward reasoning
In forward
reasoning we take the facts of the
N3 file and try to apply the rules to the facts.
So
applying the only rule to the two facts of the example we get:
:Frank
:grandmother :Elza.
In the rule ?a was replaced by :Frank,
?b was replaced by :Christine and ?c was replaced by :Elza.
These replacements are called substitutions. They are indicated in this text as
tuples. This gives the following tuples:
(?a, :Frank), (?b, :Christine), (?c, :Elza).
After using this rule once (in this simple
example) the answer to the query has been found: the grandmother of Frank is
Elza.
Thus this is the recept of forwards reasoning:
Apply the rules to the facts. Thereby, new
triples are obtained. Add those triples to the database. Keep applying the
rules to the facts (in sequence or using other selection criteria) till the
answer to the query has been found.
A solution is the query plus a substitution for
each variable in the query. If the query doesn’t have variables, the
solution is a confirmation of the truth
of the query (e.g. suppose the query is: :Frank :grandmother :Elza).
What will
the query:
:Frank
:grandmother :Louisa.
give?
This will give a negative result. The program
will answer:
No
solutions were found
b) backward reasoning.
Backward reasoning proceeds in a totally
different way from forward reasoning.
First I repeat the facts and rules:
:Frank
:mother :Christine.
:Christine
:mother :Elza.
{?a
:mother ?b. ?b :mother ?c} => {?a :grandmother ?c}.
and the
query is:
:Frank
:grandmother ?who.
Now in backward reasoning we start with the
query. We try to match the query with a fact or a rule. In this case the query
matches with:
?a
:grandmother ?c.
That is
to say: ?a is sbstituted by :Frank and ?who is subtituted by ?c.
So the
substitutions are:
(?a,
:Frank) and (?who,
?c)
and we
get:
:Frank
:grandmother ?c.
Note: the first part of the rule is called the antecedent and the last part of the
rule is called the consequent.
Now, if the found substitutions (by applying
the query against the consequent of the rule) are applied to the antecedent of
the rule we get:
{:Frank
:mother ?b. ?b :mother ?c}.
Now, the consequent of a rule will be true,
whenever the antecedent is true.
This implies that we now have to proof:
{:Frank
:mother ?b. ?b :mother ?c}.
Of course,
we can easily see that:
:Frank
:mother ?b.
matches
with:
:Frank
:mother :Christine.
Which
gives the substitution:
(?b,
:Christine)
and ?b
:mother ?c becomes:
:Christine
:mother ?c.
This, of course, matches with:
:Christine
:mother :Elza.
Which
gives the substitution:
(?c,
:Elza)
Now we
have accumulated the substitutions:
(?a,
:Frank), (?who, ?c),(?b, Christine), (?c, :Elza)
as far as the solution is concerned only the
substitution (?c, :Elza) is of
importance.
RDFEngine will give following proof of the
solution:
:Frank
:mother :Christine.
:Christine
:mother :Elza.
{:Frank
:mother :Christine. :Christine :mother:Elza} => {:Frank :grandmother :Elza}.
:Frank
:grandmother :Elza.
Two more
facts in this proof are also intresting:
:Frank
:mother :Christine.
:Christine
:mother :Elza.
These facts were already known before the
process started, but in more complex inferences, other intermediate facts can
be deduced that can potentially be of intrest.
I will repeat the process, however with a
little difference. Pay attention, reader!!!
The query is now:
?x
:grandmother ?y.
Now in backwards reasoning we start with the
query. We try to match the query with a fact or a rule. In this case the query
matches with:
?a
:grandmother ?c.
That is
to say: ?a is substituted by ?x and ?y is substituted by ?c.
So the
substitutions are:
(?a,
?x) and (?y, ?c)
So we
get:
:a
:grandmother ?c.
Now, if the found substitutions (by applying the
query against the consequent of the rule) are applied to the antecedent of the
rule we get:
{:?x
:mother ?b. ?b :mother ?y}.
Now, the consequent of a rule will be true,
whenever the antecedent is true.
This implies that we now have to proof:
{:?x
:mother ?b. ?b :mother ?y}.
Of
course, we can easily see that:
?x
:mother ?b.
matches
with:
:Christine
:mother :Elza.
Which
gives the substitutions:
(?x,
:Christine), (?b, :Elza)
and ?b
:mother ?y becomes:
:Elza
:mother ?y.
Now, :Elza
:mother ?y., matches with …. euhhhhhh???
This matches with nothing. So this time we have
made bad substitutions (namely (?x,
:Christine), (?b, :Elza))
What do we do now? Of course we have to go back
and instead of matching
?x
:mother ?b.
with:
:Christine
:mother :Elza.
we must
match:
?x
:mother ?b.
with:
:Frank
:mother :Christine.
which was what we did the first time.
The process of going back and trying something else is called backtracking.
In order to do backtracking a stack must be
used. What information is put on the stack? (I’ll explain first; an example
follows). At certain points in the inferencing process, there are different
possibilities for continuing the inferencing. These are called alternatives.
One of these possibilities will be explored, the others will be pushed on
the stack. When the first alternative is explored (giving a solution or a
failure), then the other alternatives can be explored. One of those is
retrieved from the stack.
This pushing of alternatives on the stack can
happen at each step in the inferencing process. A step is when we try to
match a goal with a rule or a fact.
This finally leads to a tree-like exploring
process.
I will illustrate this using the simple example
from above.
The first step in the process is the matching
of the query:
?x
:grandmother ?y.
with the (only) available rule:
{?a :mother ?b. ?b :mother ?c} => {?a
:grandmother ?c}.
giving
the substitution:
(?a,
?x) and (?y, ?c).
There is only one
possibile way for matching, so there are no alternatives and nothing has to be
pushed on the stack.
Now we have two new goals to proof:
:?x
:mother ?b. and ?b :mother ?y.
We start with:
:?x :mother ?b.
The other goal is kept in a list of goals that have
to be proven i.e. the goallist.
This goal
now matches with two triples, so there are two alternatives:
alternative1
with the substitutions:
((?x, :Frank), (?b,:Christine))
and alternative 2 with the substitutions:
((?x, :Christine), (?b, :Elza))
What is
now the current state (say S1)? There is one goal in the goallist and
there are two alternatives. One alternative will be looked into immediately,
the other must be pushed on the stack. The goallist too must be pushed on the
stack but why? When we look into the first alternative, we will try to
proof the goals that are on the goallist one by one, each time taking a goal
away. When the goallist is empty, everything has been proved and a solution is
found.
However, we then want to go backward to
look into the second alternative, we must return to the state (S1) at
that moment. At that moment there was one goal in the goallist, so this state
of the goallist has to be restored.
Now at state S1, we had already made
also some substitutions namely:
(?a,
?x) and (?y, ?c). This subtitutions have to be pushed on the
stack too.
This gives three items that have to be pushed
on the stack at state S1:
1) The current list of goals to be proved:
?b :mother ?y.
2) All substitutions that were done till arriving
at state S1:
(?a, ?x) and (?y,
?c)
3) The list
of alternatives at state S1 (except the first) (here only one alternative):
((?x, :Christine), (?b, :Elza))
Remark
that an alternative is expressed as a substitution list.
We start
looking into the first alternative (which gives, of course, the solution:
:Frank :grandmother :Elza.
After
that we need to look into the second alternative (which, potentially might give
a solution too). We then retrieve from the stack all the informations that were
previously pushed onto it. One of the elements retrieved is the list of
alternatives; one is chosen and the rest is repushed on the stack. Then
everything proceeds as before.
I will
now put this somewhat more orderly into an algorithm:
1) The query is put in the goallist
2) A goal is selected from the goallist. If there is no goal anymore then a solution is found (goto 5). The chosen goal is matched with all triples and rules in the database. All the matches that succeed produce an alternative.
3) One alternative is selected and new goals are pushed on the goallist. If there are no alternatives we must go back to a previous state (goto 4). The current goallist, the current list of substitutions and the list of alternatives (minus the selected one) are pushed on the stack. Goto 2.
4) Retrieve the list of alternatives from the stack, the goallist and the list of substitutions. Goto 3. If the stack is empty (all possibilities were looked into), then goto 6.
5) A solution is found. Push the current list of substitutions into the solutions list. Goto 4 (try to find more solutions by backtracking).
6) End of the inferencing process. Now display
the results using the list of solutions (each solution is a list of
susbtitutions among which a substitution for each variable in the query).
So far this simple presentation of inferencing
with notation 3. Some points that could
be treated in a follow-up might be:
·
the different kind of variables
·
the treatment of blank nodes
·
the logic builtins of CWM
·
what to do with builtins
·
what to do with OWL