A Prolog view of the Semantic Web (101)
One of my favourite blogs about the Semantic Web is “dbtune”. It is the blog of Yves Raimond, a very talented Semantic Web / Prolog programmer (and PhD student) who has created (among other goodies) an Ontology of music and musicians, as his PhD thesis. His personal site is “http://moustaki.org”, which made me wonder if he is related to the… Greek musician Georges Moustaki – who lives in France (probably not).
Well, since I consider myself a Prolog veteran but (more-or-less) a novice in the Semantic Web, I couldn’t help admiring the simplicity of the following piece of code in dbtune’s post “Henry: A small N3 parser/reasoner for SWI-Prolog”, back in 2007:
rdf(C,'http://example.org/uncle',U):- rdf(C,'http://example.org/parent',F), rdf(F,'http://example.org/brother',U).
In more typical Prolog fashion, this is the same as:
uncle(C,U):- parent(C,F), brother(F,U).
In other words (i.e. in natural language):
Someone(C) has an uncle(U) if: He(C) has a parent(F), who(F) is a brother of this uncle(U).
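To try the rule out, here is a minimal sketch with a handful of made-up facts. The names (carol, dave, erin, frank, gina, ulrich, victor) are purely illustrative and do not come from the original post; parent(C,F) is read as "C has parent F" and brother(F,U) as "F has brother U", as in the rule above.

```prolog
% Hypothetical family facts, invented for illustration only.
% parent(Child, Parent): Child has Parent as a parent.
parent(carol, frank).
parent(dave,  frank).
parent(erin,  gina).

% brother(Person, Brother): Person has Brother as a brother.
brother(frank, ulrich).
brother(gina,  victor).

% Someone (C) has an uncle (U) if C has a parent (F) who has a brother (U).
uncle(C, U) :-
    parent(C, F),
    brother(F, U).
```

With these facts loaded, the query `?- uncle(carol, U).` should bind U to ulrich, and `?- uncle(C, victor).` should bind C to erin.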
Now, suppose that instead of the binary relation (uncle/2) we wanted a unary relation that checks whether or not someone (U) “is an uncle”. In Prolog, this might be:
is_uncle(U):- parent_of(C,F), brother_of(F,U).
In other words (i.e. in natural language):
someone(U) is an "uncle", if: there exists someone(C) whose parent(F) ...is a brother of this guy(U).
Unfortunately, the modified code above will probably… run forever (or at least seem to, on a large database)! To understand why this is so, you don’t have to be a Prolog programmer, or even a programmer in general:
- Bear in mind that Prolog programs are executed sequentially, just like every other piece of code, despite their “logical” semantics: the goals in the body of a rule are tried from left to right. So, a program that first searches the entire database to retrieve all possible parents of everyone, and only then uses the specific fact you’ve already supplied to it (the person U), is likely to take much longer to respond than a program that uses U immediately and only then combines it with more general information (like checking whether a much smaller set of individuals, the brothers of U, are parents).
So, the necessary optimisation here is to reverse the order of the two calls (‘parent_of’ and ‘brother_of’), in order to check first if U has a brother (F), and then check if this brother (F) is also a parent (of anyone else, C):
is_uncle(U):- brother_of(F,U), parent_of(_,F).
In other words, someone (U) is an “uncle” if he has a brother (F) who also happens to be a parent of someone else (_).
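The effect of the two goal orderings above can be measured with a rough sketch like the following. It assumes SWI-Prolog (statistics(inferences, _) is used as the cost measure), and the data, the helper predicates make_facts/0 and inferences/2, and the names is_uncle_slow/1 and is_uncle_fast/1 are all invented for this illustration.

```prolog
:- dynamic parent_of/2, brother_of/2.

% Hypothetical data: 1000 parent facts and a single brother fact.
% parent_of(Child, Parent), brother_of(Person, Brother).
make_facts :-
    retractall(parent_of(_, _)),
    retractall(brother_of(_, _)),
    forall(between(1, 1000, I),
           ( atom_concat(child, I, C),
             atom_concat(parent, I, P),
             assertz(parent_of(C, P)) )),
    assertz(brother_of(parent1000, ulrich)).

% Original ordering: walks through the parent_of/2 facts first,
% and only then tests each parent F against the known U.
is_uncle_slow(U) :-
    parent_of(_, F),
    brother_of(F, U).

% Reversed ordering: starts from the known U, then checks
% whether the brother F found is a parent of anyone.
is_uncle_fast(U) :-
    brother_of(F, U),
    parent_of(_, F).

% inferences(Goal, N): N is the number of logical inferences
% SWI-Prolog reports for proving Goal once.
inferences(Goal, N) :-
    statistics(inferences, I0),
    once(Goal),
    statistics(inferences, I1),
    N is I1 - I0.
```

After `?- make_facts.`, running `?- inferences(is_uncle_slow(ulrich), S), inferences(is_uncle_fast(ulrich), F).` should report a considerably larger count for S than for F, since the slow version calls brother_of/2 once for every parent fact it enumerates. The exact figures depend on the Prolog version.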
Which brings me… to the realisation that this may not be so obvious to a lot of people who aren’t acquainted with Prolog and simple optimisations like this. Suppose, for example, that you are a programmer and you write a program to access dbpedia (the downloadable Semantic version of Wikipedia); a program much more complicated than the short rule above, operating on a database of millions (or even billions) of “triples” (relations of the form subject-predicate-object).
In this case, chances are high that you’ll make mistakes like the one mentioned. As a result, you shouldn’t be surprised if your program runs (almost) forever, while… “impatient customers” start blaming the… Semantic Web’s “innate inefficiency”!
If you wish to experiment with Prolog and the Semantic Web, a list of download suggestions follows. Amazingly enough, you can now download machine-readable Semantic editions of Wikipedia (in its entirety):
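The same ordering concern carries over to a real triple store. The sketch below uses rdf_assert/3 and rdf/3 from SWI-Prolog's library(semweb/rdf_db) rather than the Henry reasoner from the dbtune post; this is an assumption made for illustration, and the example.org resources and the helper load_demo_triples/0 are made up.

```prolog
:- use_module(library(semweb/rdf_db)).

% Hypothetical triples (subject, predicate, object) for illustration.
load_demo_triples :-
    rdf_assert('http://example.org/carol',
               'http://example.org/parent',
               'http://example.org/frank'),
    rdf_assert('http://example.org/frank',
               'http://example.org/brother',
               'http://example.org/ulrich').

% U is an uncle if some F has brother U and F is a parent of someone.
% The brother pattern comes first because U is already known, so the
% store only has to look at triples whose object is U.
is_uncle(U) :-
    rdf(F, 'http://example.org/brother', U),
    rdf(_, 'http://example.org/parent', F).
```

After `?- load_demo_triples.`, the query `?- is_uncle('http://example.org/ulrich').` should succeed. On a store with millions of triples, putting the most specific pattern first in this way is usually what keeps the query from scanning the whole database.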
1) SWI-Prolog (open source Prolog compiler for WinXP/Vista/Linux):
2) All the SWI-Prolog code in dbtune’s blog:
3) The SmartWeb Integrated Ontology (only 1 Mb)
4) The YAGO ontology:
- The YAGO ontology: Download (1Gb, version 2008-w40-2) Preview
  The YAGO ontology is licensed under the GNU Free Documentation License.
- YAGO in RDFS: Download (1Gb, version 2008-w40-2) Preview
  The RDFS version of YAGO is licensed under the GNU Free Documentation License.
- The subclassof hierarchy of YAGO in RDFS: Download (7Mb, version 2008-w40-2) Preview
  The subclassof hierarchy of YAGO in RDFS is licensed under the GNU Free Documentation License.
5) Freebase downloads:
- http://download.freebase.com
- especially the Freebase Wikipedia Extraction; a processed, machine-readable dump of the English language Wikipedia.
6) dbpedia (the Semantic version of Wikipedia):
Recommended (English) DbPedia Core Datasets for easier access:
Extended dbpedia Datasets:
7) Other Downloads, suggested in my bookmarks’ collection:
[...] (more links may be added here, as updates)