List:General Discussion« Previous MessageNext Message »
From:Kelly Jones Date:February 1 2007 2:13am
Subject:Better way to query table converted from SQL to RDF format?
View as plain text  
I have some data in a regular MySQL table called usplaces:

city         |state|country|latitude |longitude  |population|comments
-------------+-----+-------+---------+-----------+----------+--------
New York     |NY   |USA    |40.704234| -73.917927|8008278   |Big Apple
Chicago      |IL   |USA    |41.840675|-87.679365 |2896016   |Windy City
San Francisco|CA   |USA    |37.759881|-122.437392|776733    |City by the Bay
[pretend there's lots more rows here]

I can now make queries like this:

"All information about cities between 35 and 40 degrees latitude, -90
and -70 longitude with population over 10K":

SELECT * FROM usplaces WHERE (latitude BETWEEN 35 AND 40) AND
(longitude between -90 AND -70) AND (population>10000);

"All states that have at least one city larger than 1M":

SELECT DISTINCT state FROM usplaces WHERE population>1000000;

I now convert the data to RDF format (a different MySQL table called
usplaces_rdf) as follows (I'm cheating slightly: "Chicago" in the
first column means "the URI representing the city of Chicago", and
"IL" in the third column means "the URI representing the state of
Illinois"):

key    |relation  |value
-------+----------+-----
Chicago|state     |IL
Chicago|country   |USA
Chicago|geo:lat   |41.840675
Chicago|geo:lon   |-87.679365
Chicago|population|2896016
Chicago|comments  |Windy City
[lots more rows here, 6 rows for each row in the original table]

[There are many other much more efficient ways to convert to RDF, this
is just an example]

I can now run the first query above ("All information about cities
between 35 and 40 degrees latitude, -90 and -70 longitude with
population over 10K") as something like:

SELECT us1.* FROM usplaces_rdf us1
 LEFT JOIN usplaces_rdf us2 ON (us1.key=us2.key AND relation='geo:lat')
 LEFT JOIN usplaces_rdf us3 ON (us2.key=us3.key AND relation='geo:lon')
 LEFT JOIN usplaces_rdf us4 ON (us3.key=us4.key AND relation='population')
WHERE
 (us2.value BETWEEN 35 AND 40) AND
 (us3.value BETWEEN -90 AND -70) AND
 (us4.value > 10000);

[untested, but should be fairly close].

I'm sure I could get my 2nd result ("All states that have at least one
city larger than 1M") with an even more complex query.

RDF seems to have many advantages (eg, multiple values for a given
field and the ability to store "objects" in a column, not just
values), but the queries seem long and tedious. They're probably
efficient, but hard are hard to write.

Has anyone written a simpler query language for RDF data stored in an
SQL table? Are there better ways of storing/searching RDF data?

[I sense there's a deep connection between SQL and RDF, but haven't
figured it out yet]

-- 
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.
Thread
Better way to query table converted from SQL to RDF format?Kelly Jones1 Feb