Inferencing with RDF and Notation 3 for beginners

Author: Guido Naudts, e-mail: naudts_vannoten@yahoo.com

1)Introduction

RDF is one of the pillars in the SemanticWeb Initiative . In order to realize the semantic web, some elementary reasoning power will be necessary. This reasoning power is expressed by means of rules. If I want to make an appointment with a cardiologist and in hospital A none is available, my program needs a rule:

If none is available and this is the first try then contact another hospital.

Another rule might exist also:

If none is available and this is not the first try then look at site http://blabla/#doctors_list.

Btw, in N3 (Notation3 ) rules are written like:

A => B

which has the same meaning as:

if A then B.

A Semantic Web file consists of RDF triples and rules.

A user of this file will want to know something: he will make a query.

As an answer to his query, he will recieve one or more solutions. E.g. I ask the (RDF) database about cardiologists that are available I will recieve probably more than one answer.

These solutions can be obtained in two ways:

1) with a forward reasoning process

2) with a backward reasoning process

When executing a query one or more axiom files are given and one query file. E.g. for RDFEngine:

python proof.py gedcom-facts.n3 gedcom-relations.n3 gedcom.query.n3

In Euler(java version) this becomes:

java Euler gedcom-facts.n3 gedcom-relations.n3 gecom.query.n3 and in CWM this would be: python cwm gedcom-facts.n3 gedcom-relations.n3 gedcom.query.n3 --think

The axiom files contain the basic statements and rules and the query file of course contains the query.

I will start the explanation of these two fundamental processes with a very simple example in notation 3 . I will assume that the reader has some basic knowledge of notation 3.

2) A simple example

:Frank :mother :Christine.

:Christine :mother :Elza.

{?a :mother ?b. ?b :mother ?c} => {?a :grandmother ?c}.

Now a user wants to know who is the grandmother of Frank.

So he asks:

:Frank :grandmother ?who.

Two basic ways are often used for solving this riddle (other techniques exist also).

a) forward reasoning

In forward reasoning we take the facts of the N3 file and try to apply the rules to the facts.

So applying the only rule to the two facts of the example we get:

:Frank :grandmother :Elza.

In the rule ?a was replaced by :Frank, ?b was replaced by :Christine and ?c was replaced by :Elza. These replacements are called substitutions. They are indicated in this text as tuples. This gives the following tuples:

(?a, :Frank), (?b, :Christine), (?c, :Elza).

After using this rule once (in this simple example) the answer to the query has been found: the grandmother of Frank is Elza.

Thus this is the recept of forwards reasoning:

Apply the rules to the facts. Thereby, new triples are obtained. Add those triples to the database. Keep applying the rules to the facts (in sequence or using other selection criteria) till the answer to the query has been found.

A solution is the query plus a substitution for each variable in the query. If the query doesn’t have variables, the solution is a confirmation of the truth of the query (e.g. suppose the query is: :Frank :grandmother :Elza).

What will the query:

:Frank :grandmother :Louisa.

give?

This will give a negative result. The program will answer:

No solutions were found

b) backward reasoning.

Backward reasoning proceeds in a totally different way from forward reasoning.

First I repeat the facts and rules:

:Frank :mother :Christine.

:Christine :mother :Elza.

{?a :mother ?b. ?b :mother ?c} => {?a :grandmother ?c}.

and the query is:

:Frank :grandmother ?who.

Now in backward reasoning we start with the query. We try to match the query with a fact or a rule. In this case the query matches with:

?a :grandmother ?c.

That is to say: ?a is sbstituted by :Frank and ?who is subtituted by ?c.

So the substitutions are:

(?a, :Frank) and (?who, ?c)

and we get:

:Frank :grandmother ?c.

Note: the first part of the rule is called the antecedent and the last part of the rule is called the consequent.

Now, if the found substitutions (by applying the query against the consequent of the rule) are applied to the antecedent of the rule we get:

{:Frank :mother ?b. ?b :mother ?c}.

Now, the consequent of a rule will be true, whenever the antecedent is true.

This implies that we now have to proof:

{:Frank :mother ?b. ?b :mother ?c}.

Of course, we can easily see that:

:Frank :mother ?b.

matches with:

:Frank :mother :Christine.

Which gives the substitution:

(?b, :Christine)

and ?b :mother ?c becomes:

:Christine :mother ?c.

This, of course, matches with:

:Christine :mother :Elza.

Which gives the substitution:

(?c, :Elza)

Now we have accumulated the substitutions:

(?a, :Frank), (?who, ?c),(?b, Christine), (?c, :Elza)

If this list of substitutions is applied to the query (:Frank :grandmother ?c.),

as far as the solution is concerned only the substitution (?c, :Elza) is of importance.

RDFEngine will give following proof of the solution:

:Frank :mother :Christine.

:Christine :mother :Elza.

{:Frank :mother :Christine. :Christine :mother:Elza} => {:Frank :grandmother :Elza}.

:Frank :grandmother :Elza.

Two more facts in this proof are also intresting:

:Frank :mother :Christine.

:Christine :mother :Elza.

These facts were already known before the process started, but in more complex inferences, other intermediate facts can be deduced that can potentially be of intrest.

I will repeat the process, however with a little difference. Pay attention, reader!!!

The query is now:

?x :grandmother ?y.

Now in backwards reasoning we start with the query. We try to match the query with a fact or a rule. In this case the query matches with:

?a :grandmother ?c.

That is to say: ?a is substituted by ?x and ?y is substituted by ?c.

So the substitutions are:

(?a, ?x) and (?y, ?c)

So we get:

:a :grandmother ?c.

Now, if the found substitutions (by applying the query against the consequent of the rule) are applied to the antecedent of the rule we get:

{:?x :mother ?b. ?b :mother ?y}.

Now, the consequent of a rule will be true, whenever the antecedent is true.

This implies that we now have to proof:

{:?x :mother ?b. ?b :mother ?y}.

Of course, we can easily see that:

?x :mother ?b.

matches with:

:Christine :mother :Elza.

Which gives the substitutions:

(?x, :Christine), (?b, :Elza)

and ?b :mother ?y becomes:

:Elza :mother ?y.

Now, :Elza :mother ?y., matches with …. euhhhhhh???

This matches with nothing. So this time we have made bad substitutions (namely (?x, :Christine), (?b, :Elza))

What do we do now? Of course we have to go back and instead of matching

?x :mother ?b.

with:

:Christine :mother :Elza.

we must match:

?x :mother ?b.

with:

:Frank :mother :Christine.

which was what we did the first time.
The process of going back and trying something else is called backtracking.

In order to do backtracking a stack must be used. What information is put on the stack? (I’ll explain first; an example follows). At certain points in the inferencing process, there are different possibilities for continuing the inferencing. These are called alternatives. One of these possibilities will be explored, the others will be pushed on the stack. When the first alternative is explored (giving a solution or a failure), then the other alternatives can be explored. One of those is retrieved from the stack.

This pushing of alternatives on the stack can happen at each step in the inferencing process. A step is when we try to match a goal with a rule or a fact.

This finally leads to a tree-like exploring process.

I will illustrate this using the simple example from above.

The first step in the process is the matching of the query:

?x :grandmother ?y.

with the (only) available rule:

{?a :mother ?b. ?b :mother ?c} => {?a :grandmother ?c}.

giving the substitution:

(?a, ?x) and (?y, ?c).

There is only one possibile way for matching, so there are no alternatives and nothing has to be pushed on the stack.

Now we have two new goals to proof:

:?x :mother ?b. and ?b :mother ?y.

We start with:

:?x :mother ?b.

The other goal is kept in a list of goals that have to be proven i.e. the goallist.

This goal now matches with two triples, so there are two alternatives:

alternative1 with the substitutions:

((?x, :Frank), (?b,:Christine))

and alternative 2 with the substitutions:

((?x, :Christine), (?b, :Elza))

What is now the current state (say S1)? There is one goal in the goallist and there are two alternatives. One alternative will be looked into immediately, the other must be pushed on the stack. The goallist too must be pushed on the stack but why? When we look into the first alternative, we will try to proof the goals that are on the goallist one by one, each time taking a goal away. When the goallist is empty, everything has been proved and a solution is found.

However, we then want to go backward to look into the second alternative, we must return to the state (S1) at that moment. At that moment there was one goal in the goallist, so this state of the goallist has to be restored.

Now at state S1, we had already made also some substitutions namely:

(?a, ?x) and (?y, ?c). This subtitutions have to be pushed on the stack too.

This gives three items that have to be pushed on the stack at state S1:

1) The current list of goals to be proved:

?b :mother ?y.

2) All substitutions that were done till arriving at state S1:

(?a, ?x) and (?y, ?c)

3) The list of alternatives at state S1 (except the first) (here only one alternative):

((?x, :Christine), (?b, :Elza))

Remark that an alternative is expressed as a substitution list.

We start looking into the first alternative (which gives, of course, the solution:

:Frank :grandmother :Elza.

After that we need to look into the second alternative (which, potentially might give a solution too). We then retrieve from the stack all the informations that were previously pushed onto it. One of the elements retrieved is the list of alternatives; one is chosen and the rest is repushed on the stack. Then everything proceeds as before.

3) An algorithm

I will now put this somewhat more orderly into an algorithm:

1) The query is put in the goallist

2) A goal is selected from the goallist. If there is no goal anymore then a solution is found (goto 5). The chosen goal is matched with all triples and rules in the database. All the matches that succeed produce an alternative.

3) One alternative is selected and new goals are pushed on the goallist. If there are no alternatives we must go back to a previous state (goto 4). The current goallist, the current list of substitutions and the list of alternatives (minus the selected one) are pushed on the stack. Goto 2.

4) Retrieve the list of alternatives from the stack, the goallist and the list of substitutions. Goto 3. If the stack is empty (all possibilities were looked into), then goto 6.

5) A solution is found. Push the current list of substitutions into the solutions list. Goto 4 (try to find more solutions by backtracking).

6) End of the inferencing process. Now display the results using the list of solutions (each solution is a list of susbtitutions among which a substitution for each variable in the query).

4) Conclusion

So far this simple presentation of inferencing with notation 3. Some points that could be treated in a follow-up might be:

· the different kind of variables

· the treatment of blank nodes

· the logic builtins of CWM

· what to do with builtins

· what to do with OWL