UsingDublinCoreCreator

The Dublin Core Metadata Initiative defines a property, dc:creator, which can be used in RDF data.

Here we explore some ways it can be mixed with other RDF data such as FOAF, and through doing so highlight a well-known problem.


(first draft by danbri, July 2003)

It isn't clear whether dc:creator relates documents to people, or to their names.

For historical reasons, DC has had a concern with 'dumbing down', to allow simple and complex metadata to co-exist.

Some DC RDF looks like this:

<rdf:Description>
  <dc:title>My document...</dc:title>
  <dc:creator>Dan Brickley</dc:creator>
  <dc:description>A textual abstract could go here</dc:description>
</rdf:Description>

In fact, the Simple DC in RDF document encourages just this style.

However, it is also quite common to see dc:creator used as a relationship to a Person. For example:

<rdf:Description>
  <dc:title>My document...</dc:title>
  <dc:creator>
     <foaf:Person> 
       <foaf:name>Dan Brickley</foaf:name>
       <foaf:homepage rdf:resource="http://rdfweb.org/people/danbri/"/>
     </foaf:Person>
  </dc:creator>
  <dc:description>A textual abstract could go here</dc:description>
</rdf:Description>

Currently a fair amount of FOAF data takes this approach. We have our own fairly rich way of describing people, but we try to use Dublin Core's 'creator' property to relate people to documents. We use DC to describe documents, of course; that's what it is particularly good for.

OK, so what do the specs say? Time for a trip down memory lane...

Those who have not participated in Dublin Core should, first of all, know that the specs have changed a bit.

One thing about DC: there are generic specs defining the properties, and then specs defining their representation in RDF/XML (and also non-RDF XML).

The first spec for DC-in-RDF we had out was this one:

   * Guidance on expressing the Dublin Core within the Resource Description Framework (RDF) (Paul Miller, Eric Miller, Dan Brickley).

But this has been considered obsolete since ~2000. If you look at it, you'll see a pretty complex attempt at allowing DC to be extended, yet for all DC applications to know enough about the extensions to at least retrieve a textual value for 'creator'. Note that my second example above fails at this aim, since there is no way that a non-FOAF-aware tool could realise that the string 'Dan Brickley' was the name of the creator. Many many many Dublin Core discussions centred around different ways of achieving such extensibility.


A more mature DC in RDF spec:

   * Expressing Simple Dublin Core in RDF/XML (Dave Beckett, Eric Miller, Dan Brickley)

This only deals with using the 15 DC properties as simple textually-valued properties, as in the first example above.


The holy grail of Dublin Core, of course, was the seamless extension of this to 'qualified' Dublin Core.

Here is the latest qualified DC / RDF spec:

   * Expressing Qualified Dublin Core in RDF / XML (Stefan Kokkelink, Roland Schwänzl)

A strength of this spec is that it tries to use RDF's native facilities wherever possible. But that is also unfortunately one of its weaknesses: it makes things more complicated by using bits of RDF that (to be honest) are best ignored.

For example:

<rdf:Description dc:title="Healthy Meat">
 <dc:creator>
   <rdf:Bag>
     <rdf:li>Jon Doe</rdf:li>
     <rdf:li>Karin Mustermann</rdf:li>
   </rdf:Bag>
 </dc:creator>
</rdf:Description>

This uses RDF's notion of a Bag, i.e. an unordered collection. The spec also shows the use of rdf:Seq (an ordered list) and rdf:Alt (alternatives).

Our investigation began by wondering whether we should use dc:creator as a relationship between a document and a person, or just as a relationship between a document and that person's name. Looking at this latest DC-in-RDF spec, things get more complex. We are being encouraged to also, sometimes, use dc:creator as a relation between a document and ... another datastructure (list of some kind).

This, to be blunt, is a pain in the butt for implementors. When you meet a dc:creator property, what on earth is it referencing?

  * a name of a person?
  * a person (who may have arbitrary other properties you understand, or don't understand)?
  * a list (alt, seq, bag) of names?
  * a list (alt, seq, bag) of persons?

And that's without going into the particular uselessness of RDF's Alt construct (short version: it borders on meaningless) or Bag (short version: just use repeated properties, simpler and clearer).
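
To make the implementor's burden concrete, here is a minimal sketch in Python of what a consumer has to do to get plain names out of dc:creator when all four shapes listed above may occur. It is not the dumb-down algorithm from the spec; the triple representation (plain tuples), the Literal class and the function names are all assumptions made up for this illustration.

```python
from dataclasses import dataclass

DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal; nodes are plain strings."""
    text: str


def objects(graph, subject, predicate):
    """All objects of (subject, predicate, ?o) in a list of triples."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def members(graph, container):
    """Members of a Bag, Seq or Alt node, ordered by rdf:_1, rdf:_2, ..."""
    found = []
    for s, p, o in graph:
        suffix = p[len(RDF_NS) + 1:] if p.startswith(RDF_NS + "_") else ""
        if s == container and suffix.isdigit():
            found.append((int(suffix), o))
    return [o for _, o in sorted(found, key=lambda pair: pair[0])]


def names_for(graph, value):
    """Names reachable from one dc:creator value, whatever its shape."""
    if isinstance(value, Literal):          # case 1: a bare name
        return [value.text]
    listed = members(graph, value)
    if listed:                              # cases 3 and 4: a container
        return [n for m in listed for n in names_for(graph, m)]
    # case 2: a person (or other agent) node carrying a foaf:name
    return [o.text for o in objects(graph, value, FOAF_NAME)
            if isinstance(o, Literal)]


def creator_names(graph, document):
    return [n for v in objects(graph, document, DC_CREATOR)
            for n in names_for(graph, v)]


graph = [
    ("doc1", DC_CREATOR, Literal("Dan Brickley")),
    ("doc2", DC_CREATOR, "_:p"),
    ("_:p", FOAF_NAME, Literal("Dan Brickley")),
    ("doc3", DC_CREATOR, "_:bag"),
    ("_:bag", RDF_NS + "_2", Literal("Karin Mustermann")),
    ("_:bag", RDF_NS + "_1", Literal("Jon Doe")),
]
print(creator_names(graph, "doc3"))  # ['Jon Doe', 'Karin Mustermann']
```

Even this toy version needs three code paths plus recursion, and it silently returns nothing for a person described in a vocabulary it does not know.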

There is actually a lot more in the dcq-rdf-xml document, including a 'dumb down' algorithm for trying to pull out relevant simple strings from the nearby regions of the RDF graph.

Here is an example it gives, showing a vcard-in-rdf vocab mixed with DC:

<rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:vCard = "http://www.w3.org/2001/vcard-rdf/3.0#">
<rdf:Description> 
<dc:creator>              
 <rdf:Description rdf:about = "http://qqqfoo.com/staff/corky" >
  <rdfs:label> Corky Crystal </rdfs:label>
  <vCard:FN> Corky Crystal </vCard:FN>
  <vCard:N rdf:parseType="Resource">
    <vCard:Family> Crystal </vCard:Family>
    <vCard:Given>  Corky </vCard:Given>
    <vCard:Other>  Jacky </vCard:Other>
    <vCard:Prefix> Dr </vCard:Prefix>
  </vCard:N>
  <vCard:BDAY> 1980-01-01 </vCard:BDAY>
 </rdf:Description>
</dc:creator>
</rdf:Description> 
</rdf:RDF>

See the above spec for the details of their dumb-down algorithm.


There are lots of ways dc:creator is being used. This makes it hard to know, given some document, what its DC creator is. How can it be true that both I and my name are dc:creator of the same thing? We're in a bit of a mess.

Let's go back and look at the core definition of dc:creator...

From feeding 'http://purl.org/dc/elements/1.1/creator' to the Web, I get the following definition back:

<rdf:Property rdf:about="http://purl.org/dc/elements/1.1/creator">
<rdfs:label xml:lang="en-US">Creator</rdfs:label>
<rdfs:comment xml:lang="en-US">An entity primarily responsible for making the content 
		of the resource.</rdfs:comment>
<dc:description xml:lang="en-US">Examples of a Creator include a person, an organisation,
		or a service.  Typically, the name of a Creator should 
		be used to indicate the entity.</dc:description>
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/elements/1.1/"/>
<dcterms:issued>1999-07-02</dcterms:issued>
<dcterms:modified>2002-10-04</dcterms:modified>
<dc:type rdf:resource="http://dublincore.org/usage/documents/principles/#element"/>
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#creator-004"/>
</rdf:Property>

This is the formal definition of dc:creator. And the core problem is quite clear: we are trying to have it both ways.

We say 'an entity primarily responsible...', which sets the expectation that the dc:creator of something is a thing that created it.

But we also say, 'Typically the name of a creator should be used to indicate the entity.', which sets the expectation that the dc:creator of something will be a name of its creator.

Now the problem is that RDF is quite happy with doing either, but it makes the difference painfully obvious. And so we are left with a world in which some people write:

<dc:creator>Dan Brickley</dc:creator>

...while others write:

<dc:creator>
   <Person>
     <name>Dan Brickley</name>
   </Person>
</dc:creator>


So, what to do?

At the moment, we have a number of FOAF examples which encourage the latter use. This is because it seemed rude to discourage use of dc:creator. But is this right? Should we instead use a non-DC property to relate people to documents?

A proposal.

   * use dc:creator only as a relationship between a document and a simple flat name of (one of) its creators
   * use other relationships (eg. propose foaf:maker) to relate a document to an Agent that made it
   * write down some logical rules that express how foaf:maker and dc:creator relate
   * don't use rdf:Seq, rdf:Alt, rdf:Bag with either property
   * if we want to explicitly model ordering of authorship (eg. academic papers) we should invent markup to do so

The relation of foaf:maker to admin:generatorAgent (also known as RSS mod_admin) was discussed in #foaf.  It was determined that admin:generatorAgent was a logical subproperty of foaf:maker, "a version of a software package that is a maker of a resource representation".

If we do this, we should be able to infer that 'N' is the dc:creator of 'D' wherever we see markup to the effect that 'D' has a foaf:maker of 'M' and that 'M' has a foaf:name of 'N'.

  ?D dc:creator ?N .
  implied by
  ?D foaf:maker ?M .
  ?M foaf:name ?N 

(hmm, how to write this in proper machine-processable N3?)
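
Leaving the N3 question open, the rule itself is easy to sketch in ordinary code. The following Python is only an illustration: triples are modelled as plain tuples with names as plain strings, the function name is made up, and the foaf:maker URI assumes the proposed property would live in the FOAF namespace.

```python
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
# Assumption: the proposed foaf:maker would be published in the FOAF namespace.
FOAF_MAKER = "http://xmlns.com/foaf/0.1/maker"


def infer_creator_names(triples):
    """Return the dc:creator triples implied by foaf:maker plus foaf:name.

    `triples` must be a re-iterable collection (list or set) of
    (subject, predicate, object) tuples, since it is scanned twice.
    """
    # First pass: index every foaf:name by the thing that bears it.
    names = {}
    for s, p, o in triples:
        if p == FOAF_NAME:
            names.setdefault(s, set()).add(o)

    # Second pass: ?D foaf:maker ?M and ?M foaf:name ?N gives ?D dc:creator ?N.
    inferred = set()
    for d, p, m in triples:
        if p == FOAF_MAKER:
            for n in names.get(m, ()):
                inferred.add((d, DC_CREATOR, n))
    return inferred


data = [
    ("http://example.org/doc", FOAF_MAKER, "_:danbri"),
    ("_:danbri", FOAF_NAME, "Dan Brickley"),
]
print(infer_creator_names(data))
```

The function returns only the newly implied triples, so a caller can decide whether to merge them back into the original data.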

So, costs/benefits of this approach:

  * (+) we have a single simple use for dc:creator
  * (-) we appear less standards compliant, using the obscure foaf:maker property instead of the widely known dc:creator
  * (+) we have a clear mapping of the one usage to the other
  * (-) most RDF/XML tools cannot automatically understand such mappings
  * (+) the meaning of the data is much clearer than for the various ways in which dc:creator has been (ab)used in RDF

What else can we do to help? Well, the FOAF spec says that the foaf:name property is a sub-property of the more general and widely known rdfs:label property. So whenever some text T is the foaf:name of a thing, RDF-smart tools can conclude that T is also the rdfs:label of that thing. This is a step towards the free-flowing interoperability that DC has been looking for. But ultimately we can't get away from the problems caused by the have-your-cake-and-eat-it approach to the use of dc:creator.
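
The sub-property reasoning can be sketched in a few lines of Python. This is a rough illustration rather than how any particular RDF toolkit does it: triples are plain tuples, and the function name and the dictionary describing the sub-property relationships are assumptions for the example.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def with_superproperties(triples, sub_property_of):
    """Return the triples plus everything implied by rdfs:subPropertyOf.

    `sub_property_of` maps a property URI to a collection of its direct
    super-properties. Chains (a sub-property of b, b of c) are followed.
    """
    closed = set(triples)
    frontier = list(closed)
    while frontier:
        s, p, o = frontier.pop()
        for parent in sub_property_of.get(p, ()):
            candidate = (s, parent, o)
            if candidate not in closed:   # also guards against cycles
                closed.add(candidate)
                frontier.append(candidate)
    return closed


# The one relationship the text relies on: foaf:name is a kind of rdfs:label.
schema = {FOAF_NAME: [RDFS_LABEL]}
data = {("_:danbri", FOAF_NAME, "Dan Brickley")}
print(("_:danbri", RDFS_LABEL, "Dan Brickley") in with_superproperties(data, schema))
# True
```

A tool that only understands rdfs:label can then still display something sensible for a foaf:Person.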

We need to decide whether dc:creator is so loose it can be used in all these different ways, or whether some are in error, or at the very least not best practice.

The proposal for FOAF usage is that using dc:creator only with a simple textual value is best practice, given what we know about the tools and technology at this time. When we want to relate a document to a thing that made it, we should use foaf:maker (a proposed new property, the inverse of the existing foaf:made property).
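
Treating foaf:maker as the inverse of foaf:made means either statement can be derived from the other. A small Python sketch, with tuples for triples, a made-up function name, and the assumption that foaf:maker would sit in the FOAF namespace alongside foaf:made:

```python
FOAF_MADE = "http://xmlns.com/foaf/0.1/made"
# Assumption: the proposed foaf:maker would be published in the FOAF namespace.
FOAF_MAKER = "http://xmlns.com/foaf/0.1/maker"


def add_inverse(triples, prop, inverse):
    """Return the triples plus the mirrored statement for each use of
    `prop` or `inverse` (swap subject and object, swap the property)."""
    result = set(triples)
    for s, p, o in triples:
        if p == prop:
            result.add((o, inverse, s))
        elif p == inverse:
            result.add((o, prop, s))
    return result


data = [("_:danbri", FOAF_MADE, "http://example.org/doc")]
print(add_inverse(data, FOAF_MADE, FOAF_MAKER))
```

Existing FOAF data that already uses foaf:made would then yield foaf:maker statements without being rewritten.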

If this is accepted among FOAF, RDF and DC developers, some possible work items are suggested:

  * within FOAF documents, replace our ambitious over-use of dc:creator with foaf:maker
  * within Dublin Core, seek a new DC term which relates documents to agents (this is too useful to risk obscurity by solving in FOAF only)
  * within RDF tools, work to allow automatic conversion between these different ways of describing the world


Note that there is a new Dublin Core in RDF Draft available that addresses this very issue. -- Mikael Nilsson