Tuesday, August 24, 2010

Retrieving Other Info From a Dbpedia Page

While still using the dbpedia Virtuoso browser, http://dbpedia.org/sparql, pointed at a single page about Joan Baez, let's explore what it takes to retrieve different types of data.

DISPLAYING THE ABSTRACT


The subject ?s will point back to this page.

The predicate is the dbpedia-owl:abstract link copied from the page, which expands to <http://dbpedia.org/ontology/abstract>.

The object ?o will be the abstract returned from the query.
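
The query itself is not shown in this section. As a minimal sketch, it is assumed here to be the same pattern used by the later queries, with no FILTER, so that ?s and ?o are bound exactly as described above:

```sparql
# Assumed reconstruction of the abstract query (not shown in the original post).
# ?s is bound to the resource the abstract belongs to,
# ?o to each abstract literal, in whatever language it is tagged with.
select distinct *
where {
  ?s <http://dbpedia.org/ontology/abstract> ?o .
}
```

If the endpoint is not already scoped to the Joan Baez page, replacing ?s with the resource URI, presumably <http://dbpedia.org/resource/Joan_Baez>, should limit the results to that one page.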

RUN QUERY


See the results.

GET ONLY THE ENGLISH ABSTRACTS

The last query returned abstracts in many different languages. To return just the English version, we modify the SPARQL query as follows:
select distinct *
where {
  ?s <http://dbpedia.org/ontology/abstract> ?o .
  FILTER(langMatches(lang(?o), "en"))
}

The filter condition is how we restrict the results to the language tag for English, abbreviated "en". The built-in function lang returns the language tag attached to a literal such as ?o, and the built-in function langMatches tests whether that tag matches the requested language.
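
For comparison, here is a sketch of the same restriction written as a plain equality test instead of langMatches. It is offered only as an illustration of the difference, not as the query used in this post:

```sparql
# Exact comparison: keeps only literals whose tag is exactly "en".
select distinct *
where {
  ?s <http://dbpedia.org/ontology/abstract> ?o .
  FILTER(lang(?o) = "en")
}
```

The equality test accepts only the tag "en" itself, while langMatches(lang(?o), "en") also accepts regional variants such as "en-GB" and ignores case, which is usually the safer choice.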

RUN QUERY


See the results.

GET ONLY THE ENGLISH ABSTRACTS OR ABSTRACTS AS SIMPLE STRINGS

Sometimes an abstract has no language tag at all and is just a simple text string. To pick those up as well, you can use the UNION operator to combine the two cases.

Note: The clause matching abstracts with the @en tag, for English, is enclosed in braces {}, followed by the keyword UNION, and then a second pair of braces encloses the clause for abstracts with no tag.

select distinct *
where {
  {
    ?s <http://dbpedia.org/ontology/abstract> ?o .
    FILTER(langMatches(lang(?o), "en"))
  }
  UNION
  {
    ?s <http://dbpedia.org/ontology/abstract> ?o .
    FILTER(!langMatches(lang(?o), "*"))
  }
}

The second filter condition is how we select a string without a language tag. For such a string, lang(?o) returns an empty string. The range "*" matches any non-empty language tag, so negating langMatches(lang(?o),"*") keeps only the objects that have no language tag.

In this case it doesn't make a difference because all abstracts include a language tag.
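
The same selection can also be sketched without UNION by combining both conditions in a single FILTER. This is an alternative formulation, not the one used above, and it relies on lang(?o) returning an empty string for untagged literals:

```sparql
select distinct *
where {
  ?s <http://dbpedia.org/ontology/abstract> ?o .
  # Keep English-tagged abstracts, or abstracts with no language tag at all.
  FILTER(langMatches(lang(?o), "en") || lang(?o) = "")
}
```

Both forms should return the same rows. The single FILTER matches the triple pattern only once, while the UNION form keeps the two cases visibly separate.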

RUN QUERY


See the results.
