Wednesday, September 15, 2010

Using Named Graphs - Joan Baez and Bob Dylan

Let's use SPARQL's named graph functionality to process only the DBpedia records for Joan Baez and Bob Dylan. The query is almost instantaneous because it processes only the graphs you specify, rather than searching through all of DBpedia. These two graphs have only a few hundred triples between them.

This use of named graphs is quite similar to listing multiple tables in the FROM clause of a SELECT statement in Oracle. The database may have hundreds or thousands of tables, but the query reads only the tables you specify.


SHOW HOW THESE SINGERS ARE CLASSIFIED

The source graphs are named in the FROM clauses of the query.

We'll show how the singers are classified by using the skos:subject predicate.

We'll order the results by classification so that any classification the two artists share shows up on adjacent rows and is easy to spot by eye.


SELECT ?o ?s
FROM <http://dbpedia.org/resource/Bob_Dylan>
FROM <http://dbpedia.org/resource/Joan_Baez>
WHERE { ?s <http://www.w3.org/2004/02/skos/core#subject> ?o . }
ORDER BY ?o ?s
See the results.
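
To make the effect of the two FROM clauses concrete, here is a minimal Python sketch of what the query above does. It is a toy model under stated assumptions, not how a SPARQL engine is implemented: the store is a plain dictionary, the function names are hypothetical, and the Category:Example_* values are made-up placeholders rather than real DBpedia classifications.

```python
# Toy triple store: each named graph maps to a set of (subject, predicate, object)
# triples. The category values are invented placeholders, not real DBpedia data.
SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"

store = {
    "http://dbpedia.org/resource/Bob_Dylan": {
        ("Bob_Dylan", SUBJECT, "Category:Example_A"),
        ("Bob_Dylan", SUBJECT, "Category:Example_B"),
    },
    "http://dbpedia.org/resource/Joan_Baez": {
        ("Joan_Baez", SUBJECT, "Category:Example_B"),
        ("Joan_Baez", SUBJECT, "Category:Example_C"),
    },
    # Stands in for the rest of the store, which the query never reads.
    "http://example.org/some_other_graph": {
        ("Someone_Else", SUBJECT, "Category:Example_A"),
    },
}


def default_graph(store, from_graphs):
    """Merge the graphs named in the FROM clauses into one set of triples."""
    merged = set()
    for name in from_graphs:
        merged |= store.get(name, set())
    return merged


def select_o_s(triples, predicate):
    """Mimic SELECT ?o ?s WHERE { ?s <predicate> ?o } ORDER BY ?o ?s."""
    return sorted((o, s) for (s, p, o) in triples if p == predicate)


rows = select_o_s(
    default_graph(
        store,
        [
            "http://dbpedia.org/resource/Bob_Dylan",
            "http://dbpedia.org/resource/Joan_Baez",
        ],
    ),
    SUBJECT,
)

for o, s in rows:
    print(o, s)
```

Only the two named graphs are merged and scanned, so the third graph never contributes a row, and the shared placeholder classification lands on adjacent rows because of the sort.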



GROUPING BY HOW THESE SINGERS ARE CLASSIFIED

To get only one row per classification, I had to output the classification, ?o, without ?s.

Instead, I processed the ?s variable inside aggregate functions.

(count(distinct ?s) as ?singerCount)

shows 2 when both singers have the classification, and 1 when the classification is unique to one artist.

(min(?s) as ?firstSinger)
(max(?s) as ?lastSinger)

show which artist(s) had the classification. When only one artist has that classification, their name appears in both positions. It's a little less elegant than I'd like, but it worked.

SELECT ?o
  (COUNT(DISTINCT ?s) AS ?singerCount)
  (MIN(?s) AS ?firstSinger)
  (MAX(?s) AS ?lastSinger)
FROM <http://dbpedia.org/resource/Bob_Dylan>
FROM <http://dbpedia.org/resource/Joan_Baez>
WHERE
{ ?s <http://www.w3.org/2004/02/skos/core#subject> ?o .
}
# Standard SPARQL 1.1 needs GROUP BY when ?o is selected alongside aggregates
GROUP BY ?o
ORDER BY DESC(?singerCount) ?o
See the results.
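
The grouping logic described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the (classification, singer) pairs are made-up placeholders standing in for the query's ?o and ?s bindings, and the variable names are hypothetical.

```python
from collections import defaultdict

# Made-up (classification, singer) pairs standing in for the ?o ?s bindings.
rows = [
    ("Category:Example_A", "Bob_Dylan"),
    ("Category:Example_B", "Bob_Dylan"),
    ("Category:Example_B", "Joan_Baez"),
    ("Category:Example_C", "Joan_Baez"),
]

# Group the singers by classification; a set gives COUNT(DISTINCT ?s) for free.
singers_by_class = defaultdict(set)
for classification, singer in rows:
    singers_by_class[classification].add(singer)

# One row per classification: (?o, ?singerCount, ?firstSinger, ?lastSinger).
summary = [
    (classification, len(singers), min(singers), max(singers))
    for classification, singers in singers_by_class.items()
]

# Mimic ORDER BY DESC(?singerCount) ?o.
summary.sort(key=lambda row: (-row[1], row[0]))

for row in summary:
    print(row)
```

With a single singer in a group, min and max return the same name, which is why that name shows up in both the first and last positions.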



SHOW ONLY THE CLASSIFICATIONS THEY BOTH SHARE

The final query shows just the classifications that both artists share. It works by using the same variable, ?o, in both GRAPH patterns, so a classification is returned only when it appears as an object in both graphs.


SELECT ?o
WHERE
{
  GRAPH <http://dbpedia.org/resource/Bob_Dylan>
  {
    ?Dylan <http://www.w3.org/2004/02/skos/core#subject> ?o
  } .
  GRAPH <http://dbpedia.org/resource/Joan_Baez>
  {
    ?Baez <http://www.w3.org/2004/02/skos/core#subject> ?o
  } .
}
ORDER BY ?o


See the results.
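
Sharing ?o between the two GRAPH patterns behaves like a join on the object value. Here is a rough Python sketch of that idea, using a toy dictionary store with made-up placeholder classifications and a hypothetical helper function. It collapses duplicates into a set, which a real SPARQL engine would not do without DISTINCT.

```python
# Toy triple store keyed by graph name. The category values are invented
# placeholders, not real DBpedia data.
SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"

store = {
    "http://dbpedia.org/resource/Bob_Dylan": {
        ("Bob_Dylan", SUBJECT, "Category:Example_A"),
        ("Bob_Dylan", SUBJECT, "Category:Example_B"),
    },
    "http://dbpedia.org/resource/Joan_Baez": {
        ("Joan_Baez", SUBJECT, "Category:Example_B"),
        ("Joan_Baez", SUBJECT, "Category:Example_C"),
    },
}


def objects_in_graph(store, graph_name, predicate):
    """Collect every ?o that matches { ?x <predicate> ?o } inside one graph."""
    return {o for (s, p, o) in store.get(graph_name, set()) if p == predicate}


dylan = objects_in_graph(store, "http://dbpedia.org/resource/Bob_Dylan", SUBJECT)
baez = objects_in_graph(store, "http://dbpedia.org/resource/Joan_Baez", SUBJECT)

# ?o must bind to the same value in both graphs, so keep only the intersection,
# then mimic ORDER BY ?o.
shared = sorted(dylan & baez)

print(shared)
```

Unlike the FROM version, the graphs are never merged here; each pattern is matched inside its own graph and only the values common to both survive.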