Scutter Spec

There is no Scutter spec.

But if there were one, it might look a bit like this.

See also Mark Pilgrim's article on Atom aggregator behaviour; many of the issues are the same for FOAF aggregators.

Author: DanBri Date: Feb 2003

Status of this Document

This is a rough draft. The intended audience at this stage is RDF and XML developers who have a reasonable understanding of RDF and want to put some tools into practical use as a way of gaining more experience. Future revisions might presuppose less background knowledge. Several parts of this document are incomplete (missing URLs, etc.), but enough is here to be useful as a basic overview.

What is a Scutter?

A robot, harvester, or data collector working over the ESW:SemanticWeb. See Scutter for more information.

What is a Scutter Plan?

A ScutterPlan is an RDF file that lists RDF documents for a scutter to use as a starting point.
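As a rough illustration, reading a plan might look like the Python sketch below. The plan vocabulary is not fixed on this page, so the sketch assumes (purely for illustration) that the plan is RDF/XML and points at its starting documents with rdfs:seeAlso. The function name plan_seeds and the example.org URLs are made up. A real scutter would use a proper RDF parser rather than walking the XML directly.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def plan_seeds(rdfxml_text):
    """Return the URLs named by rdfs:seeAlso rdf:resource attributes.

    Assumption: the plan is RDF/XML and lists its starting documents
    with rdfs:seeAlso. Duplicates are dropped, document order is kept.
    """
    root = ET.fromstring(rdfxml_text)
    seeds = []
    for element in root.iter("{%s}seeAlso" % RDFS_NS):
        url = element.get("{%s}resource" % RDF_NS)
        if url and url not in seeds:
            seeds.append(url)
    return seeds


# Hypothetical plan file; the URLs are placeholders.
EXAMPLE_PLAN = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="">
    <rdfs:seeAlso rdf:resource="http://example.org/alice/foaf.rdf"/>
    <rdfs:seeAlso rdf:resource="http://example.org/bob/foaf.rdf"/>
    <rdfs:seeAlso rdf:resource="http://example.org/alice/foaf.rdf"/>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    for seed in plan_seeds(EXAMPLE_PLAN):
        print(seed)
```

The returned list would then seed the scutter's queue of documents to fetch.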

Technical Issues

This bit is still in draft; see the #rdfig chat from 2003-02-10 for context.

Caching etc.: like any Web indexing effort, a well-written Scutter should implement caching using the relevant HTTP machinery. (Details? Update rate? If-Modified-Since headers, ETags, etc. See OptimalScuttering.)
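One possible shape for this is a conditional GET: remember the ETag and Last-Modified values from the previous fetch, send them back as If-None-Match and If-Modified-Since, and treat a 304 response as "nothing changed". The Python sketch below uses only the standard library. The cache entry layout (a dict with etag, last_modified and body keys) and the function names are assumptions made for illustration, not part of any agreed scutter design.

```python
import urllib.error
import urllib.request


def conditional_headers(cache_entry):
    """Build the validators to send, based on what the last fetch returned."""
    headers = {}
    if cache_entry.get("etag"):
        headers["If-None-Match"] = cache_entry["etag"]
    if cache_entry.get("last_modified"):
        headers["If-Modified-Since"] = cache_entry["last_modified"]
    return headers


def fetch_if_changed(url, cache_entry):
    """Return (entry, changed). On 304 the old entry is returned untouched."""
    request = urllib.request.Request(url, headers=conditional_headers(cache_entry))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            fresh = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "body": response.read(),
            }
            return fresh, True
    except urllib.error.HTTPError as error:
        # urllib reports 304 Not Modified as an HTTPError.
        if error.code == 304:
            return cache_entry, False
        raise
```

A scutter would call fetch_if_changed(url, {}) on the first visit and pass the stored entry on later visits, re-parsing the RDF only when the second return value is true. How often to revisit a document is a separate question that this sketch does not address.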

robots.txt: scutters should observe the Robots Exclusion Protocol.
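A minimal sketch of such a check, using Python's standard urllib.robotparser module, is shown below. The user-agent name ExampleScutter, the helper names and the sample rules are all hypothetical. In practice the rules text would be fetched from /robots.txt on each host and cached per host.

```python
from urllib import robotparser


def build_robot_rules(robots_txt_text):
    """Parse the text of a robots.txt file that has already been fetched."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt_text.splitlines())
    return rules


def may_fetch(rules, user_agent, url):
    """True if the parsed rules allow this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


# Hypothetical rules for a host that keeps one directory off limits.
SAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    rules = build_robot_rules(SAMPLE_ROBOTS_TXT)
    for url in ("http://example.org/people/foaf.rdf",
                "http://example.org/private/foaf.rdf"):
        print(url, may_fetch(rules, "ExampleScutter", url))
```

The check belongs just before each fetch, so that a disallowed rdfs:seeAlso link is skipped rather than retrieved and discarded.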

Sample scutterplans: see [DataSources].

A vocabulary for scutter metadata: See [ScutterVocab].

Strategies for focused, subject-oriented and/or limited but open scuttering: See [ScutterStrategies].

Background on ETags and robots.txt

* The Peril Of Using ETags In A Cluster (more for server-side really, but interesting).
* Example Perl script using ETags

Further Reading

* Scutter HOWTO / developer roadmap
* Aggregation strategies ('smushing')
* ayf - 'all your foaf', a very basic scutter in Perl, since rewritten in Ruby
* Ruby scutter, an earlier Ruby scutter that can store data in SQL databases
* Scutter HOWTO / Spanish version
* hackscutter, an experimental RDF crawler based on Jena2 (where is this file?)
* ... more links here please


This page knows about the following RDF indexing tools.

* danbri's vapourware ruby scutter (/usr/bin/webdata in ESW:RubyRdf)
* mattb's Java scutter hackscutter-0.1.tar.gz
* KjetilK's Perl RDF::Scutter.
* eikeon's?
* libby's?
* jim's?
* Zool's - see the AYF Perl module documentation for more details.



In general, answer questions in a FAQ; don't just link to an answer somewhere.

If we wanted to see disconnected documents all over the place, we'd just use Google. ;)

-- LionKimbro2 DateTime(2004-06-23T05:50:37Z)