ReviveFoafBotProject


OUT OF DATE! FOAFBot has been revived by Edd. Need to rummage through this to see if anything is useful for a FOAFBot FAQ... --danbri

A Semantic Web developer project looking for help... Foafbot is an RDF-based IRC community support agent. Foafbot is written in Python, and uses the Redland toolkit for RDF-related functionality, such as parsing and storing RDF/XML documents it collects from the Web. The project outlined here is to bring foafbot back up to date with the latest versions of the Redland toolkit.

Nearby: IRC chat with Edd and others on the future of Foafbot (summary: talk to Edd before investing time in the work outlined below; he is rewriting Foafbot to accompany a new developerWorks article).

What's the problem?

At the time of writing, the last release of Foafbot was July 14th 2002, and it worked against a July 2002 development release of Redland. Since then, changes to Redland have (for reasons not entirely understood by any of us in #foaf) broken compatibility with Foafbot. We would like to have Foafbot back in our IRC channels, but it doesn't work with current Redland. This document was written up (by DanBri) in the hope that someone (perhaps a student project?) might be interested in helping or taking a lead with this work.

There is a patch for Foafbot 0.4 providing a basic fix so the code runs with the newer Redland APIs. Apply the patch with: "cd foafbot-0.4; patch -p1 < /path/to/patch".

Ironically enough, the changes to Redland include the addition of 'context' functionality, allowing the labelling of sub-sets of RDF data in the store. This was in part motivated by Foafbot, an application that needed to implement such functionality on top of the earlier pre-context Redland system. Foafbot, because it collects RDF/XML data from the public Web, needs to keep track of 'who said what', and which RDF 'statement' came from which document. Redland itself now offers APIs to support this, via the notion of a 'context'.
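To make the idea of a context concrete, here is a minimal pure-Python sketch. It deliberately does not use Redland's API: the `ContextStore` class, its method names, and the example.org document URIs are all hypothetical, chosen only to illustrate how filing each statement under its source document answers 'who said what'.

```python
from collections import defaultdict


class ContextStore:
    """Toy in-memory store: each triple is filed under a context (source URI).

    Hypothetical illustration only; this is not Redland's context API.
    """

    def __init__(self):
        # context URI -> set of (subject, predicate, object) triples
        self._by_context = defaultdict(set)

    def add(self, triple, context):
        self._by_context[context].add(triple)

    def statements_from(self, context):
        """Everything a given document said."""
        return set(self._by_context.get(context, set()))

    def sources_of(self, triple):
        """Every document that asserted this triple."""
        return {ctx for ctx, triples in self._by_context.items() if triple in triples}

    def forget(self, context):
        """Drop all statements from one document, e.g. before re-fetching it."""
        self._by_context.pop(context, None)


FOAF = "http://xmlns.com/foaf/0.1/"
store = ContextStore()
claim = ("_:genid3", FOAF + "name", "Donnie Dinko")
# Two made-up documents asserting the same statement.
store.add(claim, "http://example.org/a.rdf")
store.add(claim, "http://example.org/b.rdf")
print(sorted(store.sources_of(claim)))
```

With the source kept alongside each statement, dropping or refreshing one document's data becomes a single operation rather than a search through rewritten triples.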

The basic patch linked above is enough to fix the most obvious incompatibilities between Foafbot and recent Redland. However, more work is needed, since:

   * Foafbot will run, collecting and storing RDF, when patched, but appears unable to 'find' any of the data it stores
   * Foafbot uses its own (clever, elaborate, and slow) context-tracking mechanisms instead of Redland's new built-in support

What to do?

   * It would be good to repair Foafbot so that it at least works using its own 'attributions' (contexts) system with a current Redland installation
   * Foafbot's code could probably be simplified now that Redland has contexts


Can't we just use an older Redland installation?

We would miss out on a lot of fixes and updates to Redland and the Raptor RDF parser. Also, Redland might already be installed system-wide, and it would be good to keep the cost of installing Foafbot low.

What is going wrong?

We're not sure. When patched, Foafbot seems to collect RDF data OK. Crude methods such as using 'strings' against the BerkeleyDB RDF stores used by Redland show that RDF statements are indeed stored locally.
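The same crude check can be done from Python, which is handy if you want to filter the output. The sketch below imitates the 'strings' utility by pulling runs of printable ASCII out of binary data. The function names are hypothetical, and the path you pass to `strings_in_file` would be whichever BerkeleyDB file your Redland storage created (not named here, since it depends on your setup).

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long, like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


def strings_in_file(path, containing=""):
    """Scan a binary file and keep only the runs that mention `containing`."""
    with open(path, "rb") as handle:
        data = handle.read()
    return [run for run in printable_runs(data) if containing in run]


# Self-contained demonstration on made-up bytes rather than a real store file.
sample = b"\x00\x01http://xmlns.com/foaf/0.1/name\x00\x07Anon0\x00ab\x00"
print(printable_runs(sample))
```

Passing `containing="http://xmlns.com/foaf/0.1/"` to `strings_in_file` would narrow the output to FOAF property URIs, which is usually enough to confirm that statements reached the store.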

How does Foafbot store its data in Redland?

This is the subtle bit. Foafbot's author, Edd Dumbill, wrote an article describing the approach for IBM developerWorks. This, plus the Python source code, is your best reference. You will need some familiarity with the basic RDF graph data structure to follow this stuff.

The basic idea is to rewrite each RDF statement {p,s,o} being stored to be {p_n, s, o}, using a different uniquely identified predicate (i.e. relationship name) for each statement. This provides a key for other information (also stored as RDF) to be attached to each statement in the database. A similar approach, known as 'reification', is part of the RDF standard, but is even more verbose. The Redland store just sees a whole load of somewhat nonsensical triples. The Foafbot application, on the other hand, goes through its 'AttributedModel.py' library, which mediates access to Redland, reflecting the complex attributed structures back into simpler RDF statements which are tagged with their origin.
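The rewriting trick can be sketched in a few lines of plain Python. This is an illustration of the general technique under stated assumptions, not Foafbot's actual code: the `AttributedSketch` class, the `ATTR_NS` namespace, and the `originalPredicate` and `source` property names are all hypothetical, and triples are written here as (subject, predicate, object) tuples.

```python
import itertools

ATTR_NS = "http://example.org/attrib#"  # hypothetical namespace
ORIGINAL = ATTR_NS + "originalPredicate"  # hypothetical bookkeeping property
SOURCE = ATTR_NS + "source"  # hypothetical bookkeeping property


class AttributedSketch:
    def __init__(self):
        self.triples = set()  # what the underlying store would see
        self._ids = itertools.count()

    def add(self, s, p, o, src):
        """Store (s, p, o) under a fresh predicate and attach facts to it."""
        p_n = "%sstmt%d" % (ATTR_NS, next(self._ids))
        self.triples.add((s, p_n, o))  # the rewritten statement
        self.triples.add((p_n, ORIGINAL, p))  # what the predicate really was
        self.triples.add((p_n, SOURCE, src))  # which document said it
        return p_n

    def find(self, s=None, p=None, o=None):
        """Yield (s, p, o, src), undoing the rewrite; None matches anything."""
        original = {t[0]: t[2] for t in self.triples if t[1] == ORIGINAL}
        source = {t[0]: t[2] for t in self.triples if t[1] == SOURCE}
        for ts, tp, to in self.triples:
            if tp not in original:
                continue  # a bookkeeping triple, not a rewritten statement
            real_p = original[tp]
            if s in (None, ts) and p in (None, real_p) and o in (None, to):
                yield ts, real_p, to, source[tp]


FOAF = "http://xmlns.com/foaf/0.1/"
model = AttributedSketch()
model.add("_:a", FOAF + "name", "Anon0", "http://example.org/a.rdf")
print(list(model.find(p=FOAF + "name")))
```

Note that one logical statement costs three stored triples, and every query has to join through the bookkeeping triples, which gives a feel for why this approach is elaborate and slow compared with store-level contexts.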

A more in-depth example is discussed below.

Where can I get some sample FOAF data? Using the entire foaf web for testing is slow and unpredictable...

There is the start of a FOAF test dataset, which links only to its own RDF files so is useful for testing harvesting and indexing tools. That dataset, however, doesn't seem to make Foafbot happy. This is perhaps because it doesn't include foaf:nick properties, which Foafbot uses to organise its representation of the people described in the FOAF RDF/XML documents it indexes.
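One quick way to test the foaf:nick theory is to check whether a given RDF/XML file mentions the property at all. The sketch below uses only Python's standard XML parser, not an RDF parser, so it is a rough syntactic check. The function name `has_foaf_nick`, the sample document, and the nick value 'donnie' are made up for illustration.

```python
import xml.etree.ElementTree as ET

FOAF_NICK = "{http://xmlns.com/foaf/0.1/}nick"


def has_foaf_nick(rdfxml_text):
    """True if foaf:nick appears as a property element or property attribute."""
    root = ET.fromstring(rdfxml_text)
    for element in root.iter():
        if element.tag == FOAF_NICK or FOAF_NICK in element.attrib:
            return True
    return False


sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Donnie Dinko</foaf:name>
  </foaf:Person>
</rdf:RDF>"""

# Same document with a made-up nick added, for comparison.
with_nick = sample.replace(
    "</foaf:Person>", "<foaf:nick>donnie</foaf:nick></foaf:Person>"
)
print(has_foaf_nick(sample), has_foaf_nick(with_nick))
```

If adding foaf:nick properties to a copy of the test dataset changes Foafbot's behaviour, that would help separate data problems from the Redland compatibility problem.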

Why would I want to help with this? What skills are needed?

Foafbot is an interesting application because (like FOAF itself) it brings into practical focus many of the design challenges of the Semantic Web. By providing a text-chat interface to an aggregation of RDF data, and by showing the need to track the provenance of merged data (even using PGP/crypto to make really sure of who-said-what), Foafbot can be seen as a prototype for the sort of applications we'll see when the Semantic Web is mature.

To help with this, you'll need (or acquire!) some knowledge of Python, of the RDF graph data model, and of debugging/testing.

Anything else need doing? Bugfixing sounds boring...

Once Foafbot is up and running again, there are any number of interesting things that could be explored. You might look into using Foafbot's RDF database to make HTML or SVG (see [FoafNaut]) interfaces to the same data. Or providing support for more RDF vocabulary, such as other FOAF terms. Or doing usability testing to find out what real users of IRC channels might want from such a chat bot...

Who should I contact about this project?

This page was written initially by Dan Brickley (DanBri). Foafbot was written by Edd Dumbill. It would probably be best to use the FOAF mailing list, rdfweb-dev@vapours.rdfweb.org, for discussing this, as others are potentially interested. You can (of course) also find many of us in IRC, c/o the #foaf channel on irc.freenode.net.

(Note: At the time of writing I haven't even shown this page to Edd, though I expect he will approve. --DanBri)


Is there a Todo list?

Not really. If there were, it might look something like:

 * find out why this isn't working even though we're not using Redland's context api
    * perhaps compare/contrast by installing an old (July 2002?) version of Redland, see if that works
 * fix this
 * migrate to use Redland's context API instead
    * which bits of the code would need changing? is there a nice bottleneck or is it spread throughout?
 * celebrate

Is there any more info available to help with debugging?

Here is a dump created by adding a crude 'print' trap into the data loading code. It shows the non-attributed triples being stored, as well as their src URIs, via the add_attributed_statement method of the AttributedModel class in AttributedModel.py.

python ./foafbot.py -s irc.freenode.net -c foaftest -n foafless -f
Consulting plan file:./plan.rdf
parsing URI file:./plan.rdf
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [file:./plan.rdf] [http://www.w3.org/2000/01/rdf-schema#seeAlso] [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [file:./plan.rdf] [http://www.w3.org/2000/01/rdf-schema#seeAlso] [http://rdfweb.org/2003/02/a-z/c-foaf.rdf]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [file:./plan.rdf] [http://www.w3.org/2000/01/rdf-schema#seeAlso] [http://rdfweb.org/2003/02/a-z/b-foaf.rdf]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [file:./plan.rdf] [http://www.w3.org/2000/01/rdf-schema#seeAlso] [http://rdfweb.org/2003/02/a-z/a-foaf.rdf]
referrer [http://usefulinc.com/ns/scutter/dynamic]
encrypted to []
Fetching http://rdfweb.org/2003/02/a-z/d-foaf.rdf -> 0.rdf
Parsing into test model, base URI http://rdfweb.org/2003/02/a-z/d-foaf.rdf
Fetching signature from None
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#0] [http://xmlns.com/foaf/0.1/mbox] [mailto:anon0@no.where]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#0] [http://xmlns.com/foaf/0.1/mbox_sha1sum] 3a742efac0ba1bce3ef7ff34bcb15cf1aa9f0398
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#0] [http://xmlns.com/foaf/0.1/name] Anon0
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/ns/scutter/anoncount] [http://www.w3.org/1999/02/22-rdf-syntax-ns#value] 1
***** Setting anon counter to 1
owner [http://usefulinc.com/tmp/anon#0]
parsing URI file:./0.rdf
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/knows] (genid4)
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid4) [http://www.w3.org/2000/01/rdf-schema#seeAlso] [file:b-foaf.rdf]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid4) [http://xmlns.com/foaf/0.1/mbox_sha1sum] ec68415eab32042abcae975198c23d747592f27a
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid4) [http://xmlns.com/foaf/0.1/homepage] [http://www.example.org/foaf/people/betrand/]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid4) [http://www.w3.org/1999/02/22-rdf-syntax-ns#type] [http://xmlns.com/foaf/0.1/Person]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/schoolHomepage] [http://foafhigh.example.org/]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/homepage] [http://www.example.org/foaf/people/donnie/]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/mbox] [mailto:dd@foaf.example.org]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/mbox_sha1sum] 8bef752eeb5016538ef081f992e91bab93fadb64
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/name] Donnie Dinko
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://www.w3.org/1999/02/22-rdf-syntax-ns#type] [http://xmlns.com/foaf/0.1/Person]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf] [http://usefulinc.com/ns/scutter/0.1#owner] [http://usefulinc.com/tmp/anon#0]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf] [http://usefulinc.com/ns/scutter/0.1#referrer] [http://usefulinc.com/ns/scutter/dynamic]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf] [http://usefulinc.com/ns/scutter/0.1#status] 0
***** Setting status for [http://rdfweb.org/2003/02/a-z/d-foaf.rdf] to 0
referrer [http://usefulinc.com/ns/scutter/dynamic]
encrypted to []
Fetching http://rdfweb.org/2003/02/a-z/c-foaf.rdf -> 1.rdf
Parsing into test model, base URI http://rdfweb.org/2003/02/a-z/c-foaf.rdf
Fetching signature from None
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#1] [http://xmlns.com/foaf/0.1/mbox] [mailto:anon1@no.where]
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#1] [http://xmlns.com/foaf/0.1/mbox_sha1sum] d8b2ef6bd5f3347cfcfc99e6add7a8530fcffca7
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/tmp/anon#1] [http://xmlns.com/foaf/0.1/name] Anon1
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [http://usefulinc.com/ns/scutter/anoncount] [http://www.w3.org/1999/02/22-rdf-syntax-ns#value] 2
***** Setting anon counter to 2
owner [http://usefulinc.com/tmp/anon#1]

(etc etc...)
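A dump in this format can be summarised with a short script, which makes it easier to compare what was stored against what Foafbot later fails to find. The sketch below counts 'ADDING TRIPLE' records per src URI. It assumes the exact line layout shown in the dump, and the function name `triples_per_source` is hypothetical.

```python
import re
from collections import Counter


def triples_per_source(log_text):
    """Count 'ADDING TRIPLE:' records in the dump, grouped by their src URI."""
    counts = Counter()
    lines = iter(log_text.splitlines())
    for line in lines:
        if line.strip() != "ADDING TRIPLE:":
            continue
        # The src line is assumed to follow the marker line immediately.
        src_line = next(lines, "")
        match = re.match(r"src:\s+\[(.*)\]\s*$", src_line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# A few records copied from the dump.
sample = """\
ADDING TRIPLE:
src:  [http://usefulinc.com/ns/scutter/dynamic]
triple:  [file:./plan.rdf] [http://www.w3.org/2000/01/rdf-schema#seeAlso] [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/knows] (genid4)
ADDING TRIPLE:
src:  [http://rdfweb.org/2003/02/a-z/d-foaf.rdf]
triple:  (genid3) [http://xmlns.com/foaf/0.1/name] Donnie Dinko
"""

for src, count in sorted(triples_per_source(sample).items()):
    print(count, src)
```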