Wednesday, July 7, 2010

What is the RDF data format?

RDF stands for Resource Description Framework. It is a way of representing information so that related facts can be easily combined.
For the technically inclined, see the RDF Primer.

THE RELATIONAL APPROACH

In a typical database, you have tables that collect related information about a particular kind of thing, e.g. an employee. The table has columns such as id, name, age, etc., and each of those columns has values and usually a type. The id might be a 4-digit number, the name could be a 40-character string, and the age a 3-digit number.

A RELATIONAL EMPLOYEE TABLE

Id  Name           Age
1   Bill Townsend  47
2   Mary Maxwell   33

The number of columns in a table varies from one to several hundred.
In this employee table, one row stores all the various items about a single employee.

 

THE RDF APPROACH

Right now I am oversimplifying, but in the RDF model, the data would be stored something like this.

EMPLOYEE INFORMATION IN RDF FORM
?subject ?predicate ?object
<http://www.BillTownsend.com/me> name "Bill Townsend"
<http://www.BillTownsend.com/me> age 47
<http://www.bestandbrightest.com/mmaxwell> name "Mary Maxwell"
<http://www.bestandbrightest.com/mmaxwell> age 33

The basic RDF "table" structure will always have these same 3 columns.
In the RDF model, one row stores the subject and one item of information about that subject. It will take many rows, or triples (subject, predicate, object combinations), to fully describe the subject.
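
To make the difference concrete, here is a rough Python sketch that turns the rows of the employee table into triples. The subject_for_id mapping and the row_to_triples helper are my own names for illustration, and the tuples are plain Python values rather than a real RDF library.

```python
# Relational rows, keyed by column name (mirrors the employee table).
rows = [
    {"id": 1, "name": "Bill Townsend", "age": 47},
    {"id": 2, "name": "Mary Maxwell", "age": 33},
]

# Assumed mapping from the table's id to the web address used as the subject.
subject_for_id = {
    1: "http://www.BillTownsend.com/me",
    2: "http://www.bestandbrightest.com/mmaxwell",
}


def row_to_triples(row):
    """Return one (subject, predicate, object) tuple per non-id column."""
    subject = subject_for_id[row["id"]]
    return [
        (subject, column, value)
        for column, value in row.items()
        if column != "id"
    ]


# One relational row fans out into several triples.
triples = [triple for row in rows for triple in row_to_triples(row)]

for triple in triples:
    print(triple)
```

Two rows with two data columns each become four triples, one per fact.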


The subject is a web address that the whole world could use to uniquely identify this person. Bill might have created that web page for his resume, but now that web address could be used by anyone to record information about him.
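
Because everyone uses the same web address for Bill, facts from different places can simply be pooled. The sketch below shows the idea with plain Python sets. The second source, its "employer" predicate, and the example.com address are made up for illustration.

```python
BILL = "http://www.BillTownsend.com/me"

# Facts Bill published about himself.
from_bill = {
    (BILL, "name", "Bill Townsend"),
    (BILL, "age", 47),
}

# Hypothetical facts recorded by someone else about the same web address.
from_someone_else = {
    (BILL, "name", "Bill Townsend"),                    # same fact, stated twice
    (BILL, "employer", "http://www.example.com/acme"),  # a new fact
}

# Combining two sets of triples is just a set union; duplicates collapse.
combined = from_bill | from_someone_else

for triple in sorted(combined, key=str):
    print(triple)
```

No join keys or schema changes are needed, because the shared subject is what ties the facts together.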


The predicates are comparable to the relational database world's column names. It's best to use predicates that are already known to the RDF/SPARQL community so people know what you are talking about.
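
As an illustration, one widely used vocabulary is FOAF (Friend of a Friend), which defines name and age properties under the namespace http://xmlns.com/foaf/0.1/. Picking FOAF here is my assumption for the sketch, not the only choice. The code swaps the short column-style names for full, well-known predicate addresses.

```python
# FOAF namespace; choosing FOAF for this example is an assumption.
FOAF = "http://xmlns.com/foaf/0.1/"

# Map the short, column-style names onto well-known predicates.
well_known = {
    "name": FOAF + "name",
    "age": FOAF + "age",
}

triples = [
    ("http://www.BillTownsend.com/me", "name", "Bill Townsend"),
    ("http://www.BillTownsend.com/me", "age", 47),
]


def qualify(triples, vocabulary):
    """Replace each short predicate with its full address, if one is known."""
    return [(s, vocabulary.get(p, p), o) for s, p, o in triples]


qualified = qualify(triples, well_known)

for triple in qualified:
    print(triple)
```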


The objects correspond to the database column values. They could be literals or other web addresses that become the subject of additional predicates and objects.
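
Here is a small sketch of what "an object that becomes a subject" looks like in practice. The "knows" link between Bill and Mary is invented for the example, and the is_web_address check is a deliberately crude stand-in for a real test.

```python
BILL = "http://www.BillTownsend.com/me"
MARY = "http://www.bestandbrightest.com/mmaxwell"

triples = [
    (BILL, "name", "Bill Townsend"),
    (BILL, "age", 47),
    (BILL, "knows", MARY),  # hypothetical link: the object is a web address
    (MARY, "name", "Mary Maxwell"),
    (MARY, "age", 33),
]


def is_web_address(value):
    """Crude check, good enough for this example only."""
    return isinstance(value, str) and value.startswith("http://")


def follow(triples, start):
    """Collect every triple reachable from start via web-address objects."""
    seen, queue, found = set(), [start], []
    while queue:
        subject = queue.pop(0)
        if subject in seen:
            continue
        seen.add(subject)
        for s, p, o in triples:
            if s == subject:
                found.append((s, p, o))
                if is_web_address(o):
                    queue.append(o)  # this object is also a subject
    return found


for triple in follow(triples, BILL):
    print(triple)
```

Starting from Bill, the walk picks up Mary's facts too, because Mary's address appears as an object of one of Bill's triples. Starting from Mary finds only her own two facts.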

 

Graphical Representation of RDF Data

In my time grappling with SPARQL and RDF, I've found it most helpful to plan my queries visually, and even to analyze queries visually. Here is the convention I will use in this blog.

The subjects for my little diagrams will be rounded, because they must be web addresses.

The predicates will be labels on the arrows.

The objects will be square if they are literals and rounded if they are web addresses, which can have other information hanging off them.
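
If you would rather generate such diagrams than draw them by hand, the convention maps naturally onto the Graphviz DOT language. This is only a sketch: the to_dot helper is my own name, ellipses stand in for "rounded", and it assumes labels contain no double quotes.

```python
def is_web_address(value):
    """Crude check, good enough for this example only."""
    return isinstance(value, str) and value.startswith("http://")


def to_dot(triples):
    """Render triples as DOT text: ellipses for web addresses, boxes for literals."""
    lines = ["digraph rdf {"]
    declared = set()

    def declare_resource(url):
        # Each web address gets one rounded node, however often it appears.
        if url not in declared:
            declared.add(url)
            lines.append(f'  "{url}" [shape=ellipse];')

    for index, (s, p, o) in enumerate(triples):
        declare_resource(s)
        if is_web_address(o):
            declare_resource(o)
            target = o
        else:
            # Each literal gets its own square node so equal values stay separate.
            target = f"literal{index}"
            lines.append(f'  "{target}" [shape=box, label="{o}"];')
        # The predicate is the label on the arrow.
        lines.append(f'  "{s}" -> "{target}" [label="{p}"];')

    lines.append("}")
    return "\n".join(lines)


triples = [
    ("http://www.BillTownsend.com/me", "name", "Bill Townsend"),
    ("http://www.BillTownsend.com/me", "age", 47),
]

dot = to_dot(triples)
print(dot)
```

The printed text can be saved to a file and rendered with the Graphviz dot command, for example dot -Tpng.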

