THERE IS A MORE RECENT DRAFT OF THIS PAPER AVAILABLE AT: [1]
Using and Combining RDF Vocabularies for Expert Finding
Boanerges Aleman-Meza and Harold Boley and John G. Breslin and Malgorzata Mochol and Lyndon JB Nixon and Axel Polleres and Anna V. Zhdanova
LSDIS Lab, University of Georgia and DERI, National University of Ireland, Galway and Freie Universität Berlin and Universidad Rey Juan Carlos, Madrid and University of Surrey, UK and National Research Council of Canada
Abstract
This paper presents a framework for the reuse and extension of existing, established vocabularies in the Semantic Web. Driven by the primary application of expert finding, we have been exploring the reuse of vocabularies which have attracted a considerable user community already (FOAF, SIOC, etc.) or are derived from de facto standards used in tools or industrial practice (such as vCard, iCal and Dublin Core). This focus guarantees direct applicability and low entry barriers, unlike when devising a new ontology from scratch. The Web is already populated with several such vocabulary approaches which complement each other (but also have considerable overlap) in that they cover a wide range of necessary features to adequately describe the expert finding domain. Little effort has been made so far to identify and compare these approaches, and to devise best practices on how to use and extend various vocabularies conjointly in order to provide the (Semantic) Web user with a tool for description and querying. It is the goal of the recently started ExpertFinder Initiative to fill this gap. In this paper we provide the ExpertFinder framework, including a practical analysis of overlaps and options for combined use and extensions of several existing RDF formats as well as a proposal for applying rules and other enabling technologies to the expert finding task.
Introduction
The Semantic Web has arrived. A constantly growing number of people and organizations provide metadata on their personal or institutional Web pages in formats mainly relying on RDF; microformats provide a way to embed metadata directly within XHTML documents. The GRDDL Working Group FootNote(http://www.w3.org/2001/sw/grddl-wg/), recently founded by the W3C, encompasses these developments and provides ways to syntactically combine these formats. Also, the Semantic Web Best Practices and Deployment Working Group FootNote(http://www.w3.org/2001/sw/BestPractices/) has been providing guidelines on how to publish and syntactically combine RDF/XML or OWL data and ontologies. So let's take the next step, assume the syntactical issues are resolved, and have a closer look at the actual vocabularies and ontologies that we can employ for our (personal or institutional) metadata.
We are proposing a basic, but equally challenging, lead application for the take-up of Semantic Web technologies: automating the task of finding experts (individuals, teams, and organizations), which is a tedious manual effort at the moment. Our assumption is that when persons, institutions, projects, and events are described in Web pages using agreed-upon machine readable formats, the automatic location of experts/expertise in a particular area or for a particular endeavor will become feasible.
We identify the following main issues that need to be resolved in order to accomplish this task, as well as similar tasks for many other applications of Semantic Web search:
* Common machine readable formats (syntax, semantics)
* Critical mass (low entry barrier, tool support, reuse)
* Enabling technologies to solve practical use cases
* Incentives for early adoption (Web visibility)
The remainder of this paper is structured as follows. In Sec. #sec:csf we will briefly discuss these critical success factors. In Sec. #sec:ExpertFinder we will describe the ExpertFinder initiative and some specific use cases which shall be covered within our general application domain. Sec. #sec:ontologies contains the core of this paper: it identifies several quasi-standard ontologies relevant to expert finding, describes their relation to our domain and how we plan to combine them, and attempts to formalize the overlaps between them.
I mean we define the overlap tables and suggest some mappings!
Would add some user scenarios and areas of deployment.
We list related initiatives and projects in Sec. #sec:projects and discuss relations to research in the area of Ontology reuse in #sec:ontreuse. We conclude and give an outlook to the significant research work which remains to be done in Sec. #sec:concl.
Critical Success Factors
Common machine readable formats
As mentioned before, we consider syntactical issues to be solved for the moment and assume that people know how to publish semantic annotations with their webpages and that there is proper tool support. We can therefore focus on semantic aspects, by which we mean in this paper the common existing RDF and OWL vocabularies covering areas such as: descriptions of personal and institutional data, including curriculum vitae and addresses; ontologies for modeling areas of knowledge/expertise, business sectors and communities; and events and publications. In Sec. #sec:ontologies we will analyze existing vocabularies, ontologies and business standards for each of these fields.
Critical mass
Regardless of which vocabulary we finally decide on, in order for a description in this vocabulary to be usable for expert finding on a Semantic Web scale, we have to either convince a critical mass of users and content publishers to support this vocabulary or translate/extract existing content automatically to this vocabulary. Creating a new ontology from scratch and disseminating it within a closed community is already difficult, and promoting its use among Web users even more so. It is more likely that different individuals and organisations will select portions of an ontology or extend it to meet their particular descriptive needs, and that several ontologies or vocabularies will be used for expressing the same or similar things (see the later discussion on different vocabularies for describing persons). While alignment may be the most realistic aim, the uptake of FOAF (being published by LiveJournal, for example) indicates the value of attempting to reuse established vocabularies where possible, both to access already existing descriptions on the Semantic Web and to further promote the use of a single vocabulary to express the same thing, e.g. persons.
Axel: I can't help, but I'd still feel tempted to mention the KWeb portal ontology as an example for such an approach, but one shouldn't be the feeding hand, right? ;-)
In the context of description extraction, text retrieval methods #addreferences or wrapper technologies #addreferences, facilitated by approaches such as PiggyBank #huynh05piggybank, have become more stable and successful over the last few years. However, they do not guarantee 100% precision or recall, due to the ambiguities of natural language and HTML page structure, nor do they solve the problem of which ontology/vocabulary to use for annotations. There is probably no single right ontology for our domain, so we should rather focus on the reuse and combination of existing vocabularies which are already in use.
The approach we focus on in the ExpertFinder initiative is to analyze and formally define the overlaps of these vocabularies and provide best practices on how to use them together. A practical side effect of focusing on existing vocabularies is that we can rely on existing tools. Take iCal #iCalRef, vCard #vCard or BibTeX #bibtex as examples, which are supported by available tools such as calendar or address book software and online citation indexes.
Enabling technologies to solve practical use cases
Whereas the previous two points were more or less concerned with how to get the necessary metadata onto the Web, we now assume its availability at significant scale. In order to solve practical use cases, we might have to consider several additional technologies. A list of use cases considered in the ExpertFinder domain will be given in Section #subsec:usecases below. As for enabling technologies, we already mentioned text retrieval methods #addreferencesagain above. Another key enabling technology, in our opinion, will be rules. Rules play a crucial role in several respects. (i) Rules (together with expressive ontology languages) allow us to formally define the exact relationships between the existing vocabularies we consider. While vocabularies like FOAF or SIOC already formalize their structure to some degree in OWL, such vocabularies usually do not have many features beyond simple taxonomies expressible in RDFS alone. When defining the exact relations between overlapping vocabularies, we expect to need expressive features even beyond OWL.
add example mapping
Lyndon: text retrieval is pretty error prone. Why not focus more on microformats and GRDDL, where we should be able to get correct RDF results?
(ii) Rules, published together with RDF metadata can serve to link to other metadata. By means of such interlinked metadata, we discover new relations. This would enable us to link to metadata published elsewhere through expressing rules like:
http://polleres.net/foaf.rdf#me is author of all publications listed at http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/p/Polleres:Axel.html
All persons listed at http://www.rdfweb.org/topic/ExpertFinder_2fMembers are experts in http://en.wikipedia.org/wiki/Semantic_Web
but also negative dependencies
All persons listed at http://www.rdfweb.org/topic/ExpertFinder_2fMembers but not those working for any company listed at http://www.bigEvilCompaniesBlackList/ are my friends
One possibility here would be to adopt CONSTRUCT queries from SPARQL as a view/link definition language. We hope that W3C efforts such as the RDF Data Access Working Group (DAWG) FootNote(http://www.w3.org/2001/sw/DataAccess/), which is standardizing SPARQL, and the Rule Interchange Format (RIF) Working Group FootNote(http://www.w3.org/2005/rules/) will soon provide adequate solutions in this direction.
(iii) Other enabling technologies may include recommendation algorithms and strategies to rate the value of metadata, as well as security and trust mechanisms that restrict access to certain metadata, e.g. by encryption.
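The third rule above, with its "but not" condition, can be illustrated with a minimal sketch in plain Python. This is not a proposal for the rule language itself; the names, employers and blacklist entries are hypothetical stand-ins for data that would be extracted from the linked pages, and the negation is read as negation as failure (a person with no known employer is not excluded).

```python
# Hypothetical facts standing in for metadata extracted from the linked pages.
members = {"alice", "bob", "carol", "dave"}            # persons listed on the members page
works_for = {"bob": "GoodLab", "carol": "EvilCorp"}     # known employers (may be incomplete)
blacklisted_companies = {"EvilCorp"}                    # companies on the blacklist page


def derive_friends(members, works_for, blacklist):
    """Apply the rule: every member is a friend unless known to work for a
    blacklisted company. Missing employer information does not exclude anyone
    (negation as failure, which is an assumption of this sketch)."""
    return {
        person
        for person in members
        if works_for.get(person) not in blacklist
    }


friends = derive_friends(members, works_for, blacklisted_companies)
print(sorted(friends))
```

The choice of negation as failure matters on the open Web: a member who simply has not published an employer is treated as a friend here, which may or may not be the intended reading of such a rule.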
Anna: to extend here!
The ExpertFinder Initiative (Axel)
some words on the initiative....
ExpertFinder FootNote(http://www.rdfweb.org/topic/ExpertFinder) is an international collaborative initiative with the aim of devising vocabulary and rule extensions (e.g. FOAF and SIOC) and best practices and recommendations towards standardization in order to annotate personal home pages, pages of institutions, conferences, publication indexes, etc. with adequate metadata to enable computer agents to find experts on particular topics...
Use cases (Axel)
summarize the expertfinder use cases
Vocabularies and their overlaps
We start with a brief description of the vocabularies we take as starting points for our analysis and justify their choice. As mentioned above, we consider the following areas: personal and institutional data including curriculum information and addresses; actual ontologies for modeling areas of knowledge/expertise, business sectors and communities; events; publications, customer opinions, ratings, and reliability and trustworthiness estimates; as well as geographical location and availability. Our approach is general-purpose, and the resulting ExpertFinder ontology shall be applicable for finding any kind of expert, e.g. researchers, accountants, web designers, hairdressers, plumbers and so on.
Our goal is to pick some of the most widely used vocabularies in this area, check how far they are ontologized already, identify what overlaps exist between these formats and how they can be reused in a united way. Here, we do not aim at providing an exhaustive list of all ontologies ever developed in the related areas.
At the end of the section, we will outline how we envision arranging these components into a new ontology within a common framework, and how to identify the normative classes and properties to use within this framework.
Starting point: FOAF (John)
The FOAF (Friend of a Friend) ontology [Brickley and Miller, 2000] is the starting point for our work on finding experts through one's extended social network.
The FOAF ontology was developed to create machine readable web pages for people, groups, organisations and other related concepts - basically, to describe people, what they do and how they interact with each other. One of the most used properties of the FOAF ontology is the "knows" property: a simple way to create social networks through the addition of knows relationships for each individual that a person knows. For example, Alice may specify knows relationships for Bob and Damian, and Eric may specify a knows relationship for Caroline and Damian; therefore Alice and Eric are connected indirectly via Damian.
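The Alice/Eric example can be traced with a short sketch. It treats each foaf:knows statement from the example as an edge and searches for a chain of acquaintances. Treating knows as symmetric is an assumption made here for simplicity (FOAF itself does not require it), and the plain names stand in for the URIs that real FOAF data would use.

```python
from collections import deque

# foaf:knows statements from the example in the text (names stand in for URIs).
KNOWS = [
    ("Alice", "Bob"),
    ("Alice", "Damian"),
    ("Eric", "Caroline"),
    ("Eric", "Damian"),
]


def build_graph(knows_pairs):
    """Build an undirected adjacency map; symmetry of knows is an assumption."""
    graph = {}
    for subject, obj in knows_pairs:
        graph.setdefault(subject, set()).add(obj)
        graph.setdefault(obj, set()).add(subject)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest chain of acquaintances, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(KNOWS)
path = shortest_path(graph, "Alice", "Eric")
print(path)  # ['Alice', 'Damian', 'Eric']
```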
Aggregations of FOAF data from many individual homepages are creating distributed social networks; this can in turn be connected to FOAF data from larger online social networking sites such as LiveJournal or Tribe.
In terms of definitions of expertise by an individual, the FOAF ontology has a number of properties of note: firstly, the foaf:interest property defines topics of interest to a person, and can be used directly to find those with an interest in a particular domain; secondly, people can create foaf:publications or other foaf:Documents (via foaf:made/maker) which may have an associated foaf:topic or foaf:primaryTopic that can again be used to determine a person's domains of interest; and thirdly, foaf:currentProject/pastProject gives some information on "some collaborative or individual undertaking" that a person may be involved in.
FOAF extensions
There have been a number of extensions or modules for the FOAF ontology that are of interest to the expert finding scenarios previously mentioned.
[Bojars, 2004] presented the "Resume" schema FootNote(http://purl.org/captsolo/semweb) for extending FOAF profiles with curriculum vitae-type information. This schema includes terms for work and academic experience, skills, courses and certifications, publications, references, etc.
[Kruk, 2005]'s FOAFRealm FootNote(http://www.foafrealm.org/) is a user profile management system based on FOAF, that provides authentication, access control and social networking features such as "semantic social collaborative filtering". The system allows users to share and annotate their personal taxonomies across a social network using WordNet, DDC and DMoz as base classifications. When implemented in document exchange systems such as JeromeDL FootNote(http://www.jeromedl.org/), a semantic digital library, users can classify their documents or bookmarks and allow others to access these resources using FOAFRealm's ACL-based social networking functionality. Each user's collection is assigned an expertise value that reflects the quality of the information that they provide; this value is calculated based on a PageRank calculation of their social network. Users are then also aware of the expertise level of others on given topics.
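To give a feel for what a PageRank calculation over a social network involves, the following is a minimal, generic power-iteration sketch. It is not FOAFRealm's actual expertise formula, which is not described here; the damping factor, the iteration count and the tiny knows network are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Generic power-iteration PageRank.

    links maps each node to the list of nodes it points to (here: whom a
    person knows). Returns a dict of scores that sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical network: three people state that they know carol.
knows = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"], "dave": ["carol"]}
scores = pagerank(knows)
print(max(scores, key=scores.get))  # carol
```

In this toy network carol ends up with the highest score because three people link to her; a topic-specific expertise value would additionally need to restrict the network to statements about that topic.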
SIOC (John)
The Semantically-Interlinked Online Communities project (SIOC) FootNote(http://sioc-project.org/) aims to provide a framework for the connection and interchange of information from internet-based discussions and community portals. Such communities are primarily made up of users, the posts that they create, and the discussion forums that they subscribe to across a multitude of sites and discussion platforms.
The basis for SIOC is the SIOC ontology, an RDF-based schema which describes the main concepts found in online communities [Breslin, 2005]. While there are many classes and properties in SIOC, the main notion is that sioc:Users create Posts that are contained in Forums that are hosted on Sites, i.e.
Site -> host_of -> Forum -> container_of -> Post -> has_creator -> User
sioc:Posts have reply Posts, and Forums can be parents of other Forums.
With respect to finding experts in a social network, the main SIOC properties of interest are sioc:topic and dc:subject. sioc:topic defines a category resource that a particular discussion post is related to; by aggregating all the sioc:topics that are associated with a particular user's posts across a number of sites, a picture emerges as to where their topics of interest and related expertise lie. sioc:Forums or Sites may also have associated sioc:topics, and again a user with an interest in a particular topic may be a sioc:subscriber_of a certain discussion channel.
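The aggregation of sioc:topics over a user's posts can be sketched as follows. The triples are hypothetical and are written as plain tuples rather than parsed RDF; only the property names sioc:has_creator and sioc:topic come from the text.

```python
from collections import Counter, defaultdict

# Hypothetical triples; in practice these would be harvested from several sites.
TRIPLES = [
    ("post1", "sioc:has_creator", "user:alice"),
    ("post1", "sioc:topic", "topic:SemanticWeb"),
    ("post2", "sioc:has_creator", "user:alice"),
    ("post2", "sioc:topic", "topic:SemanticWeb"),
    ("post3", "sioc:has_creator", "user:alice"),
    ("post3", "sioc:topic", "topic:Databases"),
    ("post4", "sioc:has_creator", "user:bob"),
    ("post4", "sioc:topic", "topic:Databases"),
]


def topic_profiles(triples):
    """Count, for every user, how many of their posts carry each topic."""
    creators = {}
    topics = defaultdict(list)
    for subject, prop, obj in triples:
        if prop == "sioc:has_creator":
            creators[subject] = obj
        elif prop == "sioc:topic":
            topics[subject].append(obj)
    profiles = defaultdict(Counter)
    for post, creator in creators.items():
        profiles[creator].update(topics.get(post, []))
    return profiles


profiles = topic_profiles(TRIPLES)
print(profiles["user:alice"].most_common())
# [('topic:SemanticWeb', 2), ('topic:Databases', 1)]
```

Raw post counts are only a first approximation of expertise; they say where a user is active, not how good their contributions are.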
Relation to FOAF
http://sioc-project.org/files/sioc_foaf_skos_small.png
Figure 1: Connections between SIOC, FOAF and SKOS
The SIOC ontology developers have worked with the FOAF and SKOS creators to align concepts and avoid any unnecessary duplication or term conflicts. The concept of sioc:User has been defined to be a sub-type of foaf:OnlineAccount, so that existing properties from FOAF can be reused and so that new properties for users can be defined in SIOC without directly impacting on the FOAF ontology. As shown in Figure 1, a foaf:Person can own many sioc:User profiles (via the foaf:holdsAccount relationship). Similarly, content that a sioc:User creates on a particular Forum (e.g. a Weblog, Mailing List, Bulletin Board etc.) can be linked using sioc:topic to a skos:Concept (e.g. in Figure 1, one post is talking about clouds and another post is referring to a narrower concept, that of rain clouds). Using SKOS to define topics under discussion and of interest leads to interesting possibilities when the relationships between the various taxonomy terms are formalised using the SKOS vocabulary.
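One such possibility is that a search for posts about clouds can also return posts tagged with the narrower concept of rain clouds. The sketch below illustrates this with hypothetical concept and post identifiers. Two simplifications are assumed: each concept has at most one broader concept, and the broader relation is followed transitively.

```python
# Hypothetical skos:broader statements: narrower concept -> broader concept.
BROADER = {
    "concept:RainClouds": "concept:Clouds",
    "concept:Clouds": "concept:Weather",
}

# Hypothetical sioc:topic statements: post -> concept.
POST_TOPICS = {
    "post1": "concept:Clouds",
    "post2": "concept:RainClouds",
    "post3": "concept:Weather",
}


def is_same_or_narrower(concept, target, broader):
    """Walk up the broader chain from concept and report whether target is reached."""
    seen = set()
    while concept is not None and concept not in seen:
        if concept == target:
            return True
        seen.add(concept)  # guards against accidental cycles in the data
        concept = broader.get(concept)
    return False


def posts_about(target, post_topics, broader):
    """Return posts whose topic is the target concept or any narrower concept."""
    return sorted(
        post
        for post, concept in post_topics.items()
        if is_same_or_narrower(concept, target, broader)
    )


print(posts_about("concept:Clouds", POST_TOPICS, BROADER))  # ['post1', 'post2']
```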
DOAP: Description of a Project (John)
DOAP FootNote(http://usefulinc.com/doap/) is a project to create an XML/RDF vocabulary to describe open source projects. Its initial goals included: (i) internationalizable description of a software project and its associated resources (with participants and Web resources); (ii) basic tools to enable importing of projects into software directories; (iii) data exchange between software directories; (iv) automatic configuration for resources such as shared CVS repositories or bug trackers; (v) assisting package maintainers who bundle software for distributors; (vi) interoperability with other popular Web metadata projects (RSS, FOAF, Dublin Core); and (vii) the ability to extend the vocabulary for specialist purposes.
DOAP describes the current state of a project but does not highlight changes and updates. Nevertheless, to keep a repository up to date with releases, an Atom feed containing embedded DOAP, as provided by CodeZoo FootNote(http://www.codezoo.com/about/doap_over_atom.csp), can be used.
John: Erm, I don't know that much about DOAP? Anyone else?
Relation to FOAF/SIOC
DOAP uses foaf:Person to describe the corresponding contributors of each project part (e.g. project maintainer, developer etc.) ........
DOAC: Description of a Career (Gosia)
DOAC FootNote(http://ramonantonio.net/doac/) is an RDF metadata vocabulary to describe the professional capabilities of workers gleaned from, e.g., CVs or resumes. The metadata makes descriptions more specific and facilitates and shortens the search for a suitable job seeker (with regard to the given position requirements). DOAC has been designed to be compatible with the European CV (known as Europass), which can be generated from a FOAF+DOAC file. It includes information about education, work experience, publications, spoken languages and other skills that can be shared and processed by any application.
Relation to FOAF
The FOAF vocabulary specifies the most important features of individuals acting in online communities. The vocabulary allows, for example, the specification of properties describing the people who commonly appear on personal homepages and their membership in organisations. These parts of the FOAF vocabulary are also used within the DOAC specification FootNote(http://ramonantonio.net/doac/0.1/). DOAC uses the foaf:Person class for general descriptions of job seekers and the foaf:Organization class to define which schools and institutions the individual attended (cf. DOAC-FOAF comparison).
The foaf:pastProject concept could be added as a subclass to the doac:Experience class. This would allow description of not only the job seeker’s general experiences in a company but also their experiences in different projects. Furthermore the doac:publication property which establishes a connection between the foaf:Person and doac:Publication should be defined as a foaf:publication linking foaf:Person with foaf:Document.
vCard (Aleman)
vCard is a standard for representing personal data such as business cards. There are various forms in which vCard data can be written. It started as a plain text format, but there are now also XML and RDF-based FootNote(http://www.w3.org/TR/vcard-rdf and http://norman.walsh.name/2005/12/12/vcard) representations as well as a microformat for vCard data. Our interest is in the RDF-based representation. There has been some discussion on updating vCard in RDF and harmonizing it with FOAF. This vocabulary mostly describes contact information, but some items are relevant for expert finding. For example, affiliation information or the role someone has could indicate knowledge of some area or a particular aspect of expertise.
Relation to FOAF/SIOC
The vCard standard overlaps in many aspects with FOAF, which additionally contains useful mechanisms to specify other people that someone knows. Recent efforts to incorporate modern RDF best practices into vCard have been proposed FootNote(http://www.w3.org/2006/vcard/ns). Early indications of re-using (instead of redefining) vocabulary are quite evident in how SIOC utilizes the FOAF, DC and RSS 1.0 vocabularies. It would seem that SIOC could have adopted vCard instead of FOAF, yet this was probably not done due to the apparently larger adoption of FOAF (as compared to vCard in RDF).
SKOS (Aleman)
SKOS allows one to express the structure and content of RDF vocabularies in a simple way. The basic features of SKOS include declaring whether a concept is broader/narrower than another, preferred and alternative labels for terms, and related terms. SKOS facilitates sharing and representing terminologies that may not extensively require the expressive power of other languages such as OWL.
Relation to FOAF/SIOC
SKOS can be seen as a higher-level abstraction that allows one to specify how concepts relate to other concepts. For example, it could be used to say that a Post in SIOC is a narrower concept than a Document in FOAF. SKOS Mappings can prove valuable for specifying whether and how two concepts relate to each other. As with SIOC, this can lead to improvements in data sharing and integration across otherwise disparate vocabularies.
See [2].
RELATIONSHIP & XFN (Gosia)
RELATIONSHIP FootNote(http://purl.org/vocab/relationship) and XFN FootNote(http://gmpg.org/xfn/join) (XHTML Friends Network) are vocabularies used for describing interpersonal relationships.
Relation to FOAF
FOAF so far has only one property, foaf:knows, to describe interpersonal relationships. This seems insufficient when it comes to specifying relations between people, which in the real world, outside of semantic descriptions and vocabularies, are rather complex. Since foaf:knows describes relationships between people only sketchily, two different vocabularies are being deployed to fill this gap and assert such relationships in detail: RELATIONSHIP and XFN. They refine the foaf:knows property by defining different subproperties and can be viewed as an extension of the FOAF specification.
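The practical benefit of defining these relationship terms as subproperties of foaf:knows is that an application which only understands foaf:knows can still use the more detailed statements. A minimal sketch of this RDFS-style inference follows; the terms rel:friendOf and rel:colleagueOf are used for illustration, and the subproperty table and the triples are hypothetical.

```python
# Illustrative subproperty declarations (property -> its superproperty).
SUBPROPERTY_OF = {
    "rel:friendOf": "foaf:knows",
    "rel:colleagueOf": "foaf:knows",
}

# Hypothetical statements published by different people.
TRIPLES = [
    ("alice", "rel:friendOf", "bob"),
    ("alice", "rel:colleagueOf", "carol"),
    ("bob", "foaf:knows", "dave"),
]


def infer_superproperties(triples, subproperty_of):
    """Add, for every statement, the same statement with each superproperty."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        seen = {prop}  # guards against cyclic subproperty declarations
        parent = subproperty_of.get(prop)
        while parent is not None and parent not in seen:
            inferred.add((subject, parent, obj))
            seen.add(parent)
            parent = subproperty_of.get(parent)
    return inferred


def acquaintances(person, triples):
    """Return everyone the person is stated (or inferred) to know."""
    return sorted(o for s, p, o in triples if s == person and p == "foaf:knows")


closure = infer_superproperties(TRIPLES, SUBPROPERTY_OF)
print(acquaintances("alice", TRIPLES))  # []
print(acquaintances("alice", closure))  # ['bob', 'carol']
```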
HR-XML (Gosia)
Human Resources XML (HR-XML), developed by the HR-XML Consortium, is an XML-based standard for human resources. HR-XML is a library of more than 75 interdependent XML schemas that define data elements for specific HR transactions as well as options and constraints governing the use of those elements. It covers major processes and component schemas used across multiple business processes, such as: Competencies FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/CPO/Competencies.pdf) (provides a structure that enables competencies to be easily compared, ranked, and evaluated, and is capable of referencing the competency taxonomies from which competency descriptions are gleaned or used), Contact Method FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/CPO/ContactMethod.pdf), Education History FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/CPO/EducationHistory.pdf) (provides a method to exchange historical education information between partners), Employment History FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/CPO/EmploymentHistory.pdf) (provides a means by which historical employment information can be exchanged, such as details about the job itself, e.g. time, type and position, as well as details about the employer, such as size, location and business area), Organization FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/CPO/Organization.pdf) (describes legal organizational units as well as their internal structure, taking into account relationships between organizations and their sub-divisions) and Resume FootNote(http://ns.hr-xml.org/2_2/HR-XML-2_2/SEP/Resume.pdf) (combines resume information with a number of shared components: Competencies, Employment History, Education History etc.).
Relation to FOAF
The comparison of the parts of the HR-XML with FOAF concepts is difficult since the level of detail is different in these two sets of specifications. Nevertheless some ideas and ways of how, for instance, organisation is described, can be incorporated within the extension of the current FOAF vocabulary. Furthermore the schemas describing education, employment and competencies form the foundation to define a more detailed expert description.
AXEL: We should also discuss particular classifications of areas of knowledge/expertise. Shall we also mention the idea from above to link to Wikipedia entries, and another idea raised earlier, i.e. using ACM URLs for modelling areas of knowledge in CS? E.g. http://www.acm.org/class/1998/K.4.2.html would be a possible identifier for COMPUTERS AND SOCIETY. Social issues.
I think these are enough for the moment...
Lyndon: Yes, there is already some work in ontologizing conference information and they must use some vocabularies for identifying the topics of the papers, which of course is perfect for identifying who are the experts in complex intersections of research areas...
BOOM Ontology within the Self-Publishing Ontology (Aleman)
Recent efforts on defining new vocabulary seem to include at least some feature related, directly or indirectly, to defining the expertise of a person. For example, the BOOM ontology FootNote(http://web2express.org/openlab/docs/BOON_specs_v0_1.html) includes an 'expertise' relationship applicable to both a person and a group. This is a datatype property that would contain skills and/or expertise. The BOON vocabulary also includes an 'interest' datatype property to specify interest areas that a group or person has. The BOOM ontology is being developed together with an ongoing effort to define an Ontology for Experiment Self-Publishing (SPE FootNote(http://esw.w3.org/topic/HCLS/ScientificPublishingTaskForce)). The SPE ontology is intended to be used to self-publish data (in RDF) about scientific experiments. The tasks of sharing, discovering, and integrating information about scientific experiments are expected to benefit from Semantic Web technologies.
The overall Framework (Gosia, Aleman, John)
Anna: I think that this subsection should be a large main section containing more or less all contributions of the paper.
[from-elsewhere http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html] (this is a table that shows how new vCard vocabulary relates to original vCard and FOAF - we might use similar comparison table; - Aleman)
Maybe in tabular form present all facets and pick one representative wherever several possibilities exist.
http://wissensnetze.ag-nbi.de/expertfinder/person.png
Figure 2: How to describe a person - overview
Overlappings/Mappings of General Descriptions
Comparison: VCard + VCard Ontology + FOAF + W3C PIM + DOAC
Overlappings/Mappings of Expert's Relations
Comparison: RELATIONSHIP + XFN (+FOAF)
Overlappings/Mappings of Educational Aspects
Comparison: DOAC + HR-XML
Overlappings/Mappings of Expert's Activities
Comparison: DOAC + DOAP + HR-XML (+ FOAF)
Comparison: DOAC + HR-XML
Related projects and initiatives (Lyndon)
Other projects and initiatives exist whose aims overlap with or are relevant to those of ExpertFinder. Since ExpertFinder is an umbrella initiative involving many organisations, some of these projects and initiatives are carried out by ExpertFinder participants themselves. Their results will be shared with ExpertFinder and vice versa, giving ExpertFinder the opportunity to influence present activities and allowing these activities in turn to influence the broader ExpertFinder efforts. Other relevant work takes place outside the group of ExpertFinder participants; here we plan to align efforts so that results reach as widely as possible, towards a future Semantic Web where expert finding is enabled by widely supported vocabularies, alignments, rules and best practices.
Knowledge Nets
The project Knowledge Nets (Wissensnetze), which is part of the InterVal-Berlin Research Centre for the Internet Economy funded by the German Ministry of Research (BMBF), approaches the impact of semantic technologies from the business and technical viewpoints in order to make predictions about the influence of these new technologies on markets, enterprises, business-specific value chains and individuals. The project examines the effects of the deployment of Semantic Web technologies for particular application scenarios and market sectors. Every scenario includes a technological component, which makes use of the expected availability of semantic technologies over a perspective of several years, and a deployment component, which assumes the availability of the required information in machine-readable form. The first scenario in this context was situated in the Human Resource (HR) domain, with the aim of analyzing the online job seeking and procurement processes with and without the use of semantic technologies [Mochol, 2004][Bizer, 2005][Tolksdorf, 2006]. In a Semantic Web-based recruitment application, the data exchange between employers, job applicants and job portals is based on a set of shared vocabularies describing domain-relevant terms: occupations, industrial sectors and skills. These commonly used vocabularies can be formally defined by means of a so-called Human Resource ontology (HR-ontology). Given a rich and machine-processable representation of the domain of interest, the ontology was intended to be used in a Semantic Web job portal not only as a uniform representation of job postings and job seeker profiles, but also as a basis for the implementation of semantic matching techniques which compute semantic similarities between information resources.
In order to support common practices from industry and to maximize the integration of job seeker profiles and job postings from different organizations, the HR-ontology had to be aligned to established domain-specific standards and classifications. For this purpose, sub-domains of the application setting (e.g. professional and educational skills, types of professions and industrial areas) and several useful knowledge sources covering them have been identified. As candidate ontologies we selected some of the most relevant classifications in the area, deployed by national and international agencies and statistical organizations: (i) Profession Reference Number Classification (BKZ) and Standard Occupational Classification (SOC); (ii) Classification of Industrial Sector (WZ2003) and North American Industry Classification System (NAICS); (iii) German Translation of Human Resources XML (HRBA-XML); and (iv) KOWIEN Skill Ontology. By reusing the existing standards, the HR-ontology contributes to the realization of more powerful and flexible eRecruitment solutions which include advanced search and presentation facilities based on knowledge about the application domain. Furthermore, the analysis of the potential economic impacts showed that the main actors in the employment market (employers and job seekers) would both benefit from the realization of the ontology-based scenario.
Exploiting Expertise of Computer Science Researchers for the Peer-Review Process
The SemDis project at the LSDIS Lab at the University of Georgia addresses the development of new query and discovery techniques for semantic relationships. In a recent SemDis publication, co-authorship and 'knows' relations from DBLP and FOAF data, respectively, were used to detect possible conflicts of interest between reviewers and authors in a peer-review process [Aleman-Meza, 2006]. An extension of this work aims to determine possible reviewers by comparing their expertise with the topics of a paper. Results to date consist of a dataset that captures the expertise of a person on different topics or areas for a subset of the researchers in the SwetoDblp dataset (more details are available FootNote(http://cs.uga.edu/~cameron/expertise.html)). This dataset builds upon SwetoDblp FootNote(http://lsdis.cs.uga.edu/projects/semdis/swetodblp/), a large populated ontology of computer science publications whose main source of data is the DBLP bibliography FootNote(http://www.informatik.uni-trier.de/~ley/db/). The aim was to relate existing entities in SwetoDblp to areas of expertise in order to avoid duplication of information and to allow the reuse of existing researcher metadata such as published papers and affiliations. The expected results of this work focus on techniques for measuring the expertise of possible reviewers in order to facilitate the assignment of reviewers to articles. The dataset of researcher expertise is a necessary means to this end, yet it also has the potential to be used for other tasks relevant to expert finding. In fact, the same dataset was used for displaying 'areas of expertise' in some of the annotations for people's names appearing on the ISWC2006 conference website. For example, the areas of expertise for Prof. Rudi Studer in this dataset are Semi-structured semantic data, Searching and Querying, and Information Retrieval.
We believe that this provides an example of how projects in the ExpertFinder Initiative can be applicable to real users outside this community.
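The idea of comparing reviewer expertise with the topics of a paper can be sketched as follows. This is an illustration only: the text does not describe the actual technique under development, so the Jaccard overlap used for scoring, the reviewer identifiers, and the topics "Databases" and "Computer Graphics" are assumptions. The excluded set stands for reviewers already flagged by a conflict-of-interest check.

```python
def rank_reviewers(paper_topics, expertise, excluded=frozenset()):
    """Rank candidate reviewers by Jaccard overlap between their areas of
    expertise and the topics of a paper (an assumed scoring rule).

    Reviewers in `excluded` (e.g. those with a detected conflict of interest)
    and reviewers with no overlapping topic are left out.
    """
    paper = set(paper_topics)
    scored = []
    for person, areas in expertise.items():
        if person in excluded:
            continue
        areas = set(areas)
        union = paper | areas
        score = len(paper & areas) / len(union) if union else 0.0
        if score > 0:
            scored.append((person, score))
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical expertise dataset: reviewer identifier -> areas of expertise.
expertise = {
    "reviewer_a": {"Information Retrieval", "Searching and Querying"},
    "reviewer_b": {"Information Retrieval", "Databases"},
    "reviewer_c": {"Computer Graphics"},
}
paper_topics = {"Information Retrieval", "Searching and Querying"}

for person, score in rank_reviewers(paper_topics, expertise):
    print(person, round(score, 2))
```

Passing the result of a conflict-of-interest check as `excluded`, for example `rank_reviewers(paper_topics, expertise, excluded={"reviewer_a"})`, would keep conflicted reviewers out of the ranking.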
x (work by NRC Canada)
x (work by Open University KMI)
and so on
Related Work in the Area of Ontology Sharing and Reuse (Anna)
In this section, we give an overview of practices in community-driven ontology construction, reuse and sharing.
Anna: we should extend and generalize the ideas expressed below to make this section a natural part of the ExpertFinder framework.
Community-driven Ontology Construction
Another recent trend in many popular portals is to allow communities to create their own vocabularies and to tag the items or information they want to share with others using arbitrary tags from these vocabularies. The following applications fall into this category of portals:
* del.icio.us – This community portal allows communities to tag and share their bookmarks on the Web, and also to search others’ bookmarks on the basis of these tags.
* www.43things.com and www.43places.com – These community Web portals allow description by community-created tags and sharing of information about the things people do (www.43things.com) and about the places where people travel or want to travel (www.43places.com).
* www.flickr.com – This community portal allows community members to tag photos with arbitrary keywords and to search and share them.
* base.google.com – This community application was launched in November 2005 and is reminiscent of the functionality of the People’s Portal [Zhdanova, 2006]. The application allows regular Web users to contribute arbitrary items (pictures, text, ads, websites) for searching and sharing, and to annotate these items using pairs of an arbitrary attribute and an arbitrary value. The most popular and most shared attributes and attribute values appear at the top level of the Google search interfaces and are suggested for searching and browsing the available items.
Although none of the portals above is based on Semantic Web technologies, they clearly show a strong trend of the Web toward becoming more structured and annotated in a community-driven manner, via social processes and contributions of regular Web users.
Community-driven Ontology Matching
In environments where communities are free to put forward and reuse their own ontological concepts or tags, entities and tags occur which are syntactically different but semantically similar. This makes it difficult for community members to reuse community-contributed knowledge across different platforms and even within the same platform. In current applications, this problem is either completely ignored or, in the most advanced cases, ad-hoc, non-reusable alignments are created to achieve a specific task. An example illustrating the high relevance of the community-driven ontology matching issue can be seen in the novel service Google Base (http://base.google.com, launched in November 2005). The service allows everybody to publish items (pictures, texts, ads, websites) on the Web and annotate them with arbitrary attributes and arbitrary attribute values, and to search among these items using the attributes and values contributed by the communities. In Figure 2, one can see that even in a simple example of reusing values for a tag “gender”, the communities contributed different names for the same objects: masculine gender was represented as “male”, “m” and (probably) “ma”. Meanwhile, in Google Base one can search using only one of the existing values for the attribute “gender” at a time. This causes inconveniences such as the need to execute several searches (one for each of the semantically similar values) instead of a single search (with one value and several pre-stored community-created ontology alignments).
http://www.ee.surrey.ac.uk/Personal/A.Zhdanova/justpictures/base-tagged.PNG
Figure 2: A Heterogeneity Example in Community-driven Annotation
In line with the general ideas of community-driven ontology management, community-driven ontology matching extends conventional ontology matching by involving end users, knowledge engineers, and developer communities in the processes of establishing, describing and reusing mappings [Zhdanova and Shvaiko, 2006]. Approaches to community-driven ontology matching are especially crucial for the usefulness of novel applications that rely on community-driven ontology construction.
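The “gender” example can be used to sketch how pre-stored, community-created alignments could replace several searches with one. This is a minimal illustration and not a description of Google Base or of any existing matching system: the alignment table, the item records and the function names are hypothetical, and the entries for “female” and “f” are assumptions added only to make the example complete.

```python
# Hypothetical community-created alignments:
# attribute -> contributed value -> canonical value.
ALIGNMENTS = {
    "gender": {
        "male": "male",
        "m": "male",
        "ma": "male",
        "female": "female",  # assumed entry, not from the example in the text
        "f": "female",       # assumed entry, not from the example in the text
    }
}


def canonical(attribute, value):
    """Map a contributed value to its canonical form; unknown values pass through."""
    value = value.strip().lower()
    return ALIGNMENTS.get(attribute, {}).get(value, value)


def search(items, attribute, value):
    """Return all items whose value for `attribute` is aligned with `value`."""
    target = canonical(attribute, value)
    return [
        item
        for item in items
        if attribute in item and canonical(attribute, item[attribute]) == target
    ]


# Hypothetical annotated items.
items = [
    {"id": 1, "gender": "male"},
    {"id": 2, "gender": "m"},
    {"id": 3, "gender": "ma"},
    {"id": 4, "gender": "female"},
]

print([item["id"] for item in search(items, "gender", "m")])
```

With such a table, a single search for any of the aligned values returns the items annotated with all of them, and the alignments can be reused by other applications instead of being rebuilt for each task.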
Conclusions and Outlook (All)
Acknowledgements
This work has been partially supported by the EU Network of Excellence KnowledgeWeb (FP6-507482), by Science Foundation Ireland under grant number SFI/02/CE1/I131, and by the Knowledge Nets project, which is part of the InterVal-Berlin Research Centre for the Internet Economy, funded by the German Ministry of Research (BMBF).
References
[Aleman-Meza, 2006] Aleman-Meza, B., Nagarajan, M., Ramakrishnan, C., Ding, L., Kolari, P., Sheth, A.P., Arpinar, I.B., Joshi, A., Finin, T. "Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection", Proc. of the 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, UK, May 2006.
[Bizer, 2005] Bizer, C., Heese, R., Mochol, M., Oldakowski, R., Tolksdorf, R., Eckstein, R. "The Impact of Semantic Web Technologies on Job Recruitment Processes", Proc. of the International Conference Wirtschaftsinformatik (WI’05), 2005.
[Bojars, 2004] Bojars, U. "Extending FOAF with Resume Information", Proc. of the 1st Workshop on FOAF, Social Networks and the Semantic Web, 2004. http://www.w3.org/2001/sw/Europe/events/foaf-galway/papers/pp/extending_foaf_with_resume/
[Breslin, 2005] Breslin, J.G., Harth, A., Bojars, U., Decker, S. "Towards Semantically-Interlinked Online Communities", Proceedings of the 2nd European Semantic Web Conference (ESWC '05), LNCS vol. 3532, pp. 500-514, Heraklion, Greece, May 2005.
[Brickley and Miller, 2000] Brickley, D., Miller, L., "Friend of a Friend Vocabulary Specification", 2000. http://xmlns.com/foaf/0.1/
[Kruk and Decker, 2005] Kruk, S.R., Decker, S. "Semantic Social Collaborative Filtering with FOAFRealm", Semantic Desktop Workshop, International Semantic Web Conference, Galway, Ireland, 2005.
[Mochol, 2004] Mochol, M., Oldakowski, R., Heese, R. "Ontology-based Recruitment Process", Proc. of the Workshop Semantische Technologien für Informationsportale (SemTech'04) at INFORMATIK, 2004.
[Tolksdorf, 2006] Tolksdorf, R., Mochol, M., Heese, R., Eckstein, R., Oldakowski, R., Bizer, C. "Semantic-Web-Technologien im Arbeitsvermittlungsprozess", Wirtschaftsinformatik: Internetoekonomie, 48(1):17–26, 2006.
[Zhdanova, 2006] Zhdanova, A.V. "An Approach to Ontology Construction and its Application to Community Portals", Doctoral dissertation, University of Innsbruck, 2006.
[Zhdanova and Shvaiko, 2006] Zhdanova, A.V., Shvaiko, P. "Community-Driven Ontology Matching". In Proceedings of the 3rd European Semantic Web Conference (ESWC'2006), 11-14 June 2006, Budva, Montenegro, Springer-Verlag, LNCS 4011, pp. 34-49 (2006).
