List: Internals
From: Renato Golin
Date: April 20 2007 3:37pm
Subject: Re: semantic storage engine
On 20/04/07, Eric Prud'hommeaux <eric@stripped> wrote:
> If I recall, SeRQL has a few points of expressivity that don't exist
> in SPARQL: existentials, set difference, transitive closures (and
> builtin predicates to avoid transitive closures).

Yes, the built-in predicate is quite powerful. It can also return
graphs that can be queried again, recursively (a bit like LISP).


> When you say "query on a semantic data structure", do you mean to
> query the relational structure, much like "SHOW COLUMNS FROM"?
> Virtuoso may do that out of the box. You could add that to
> SPASQL/MySQL. I'd look to d2rq for a good graph convention for
> expressing structure schema.

It depends on how you store your semantic data. RDF is one way (or
several, depending on which flavour you use), but for big data it
becomes quite difficult to deal with all that information just by
reading and parsing the files.

By "semantic data structure" I meant any way to store data in a
semantic structure and be able to query it quickly.
MySQL storage engines provide storage for relational data, and SQL
provides a way to retrieve that information.

SPARQL, SPASQL, SeRQL or whatever may provide a way of retrieving
semantic data from a specific storage (RDF, for instance), but I
believe that a binary storage engine, smaller in size and
optimized to store triples (instead of tuples), would be much better
than dealing with flat files or even translating RDF to relational (and
losing information).


> The Perl stuff is a query re-writer, and probably not too interesting
> to a MySQL hacker up to their elbows in C++. I'd be happy to
> geek about it, though.

I'm still brainstorming about the particular format of that storage
engine, but I guess we'll need at least a metadata section (to
store default predicates, for instance), a data section (a list of all
available objects) and an index whose entries look like:

   pointer_to_data  /  pointer_to_predicate  /  pointer_to_data

This is very straightforward, I know, and even quite dumb, but if,
instead of scanning through all the entries, we navigate them (like
evaluating at every node), we could reduce the number of scans.
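
To make the idea a bit more concrete, here is a rough in-memory sketch in
Python of the three sections and of the difference between scanning the
index and navigating it. Everything in it is an assumption made for
illustration: the class name TripleStore, the methods add, scan, navigate
and reachable, and the use of list positions as "pointers". It is not a
proposed on-disk format and it is not MySQL storage engine code.

```python
from collections import deque


class TripleStore:
    """Toy model: metadata section, data section and a pointer index."""

    def __init__(self):
        self.predicates = []   # metadata section: known predicates
        self.data = []         # data section: every distinct object, stored once
        self.index = []        # entries: (data_ptr, predicate_ptr, data_ptr)
        self._data_pos = {}    # value -> position in self.data
        self._pred_pos = {}    # predicate -> position in self.predicates
        self._outgoing = {}    # subject_ptr -> [(predicate_ptr, object_ptr)]

    @staticmethod
    def _intern(value, table, positions):
        # Store each value once and hand back its position (the "pointer").
        if value not in positions:
            positions[value] = len(table)
            table.append(value)
        return positions[value]

    def add(self, subject, predicate, obj):
        s = self._intern(subject, self.data, self._data_pos)
        p = self._intern(predicate, self.predicates, self._pred_pos)
        o = self._intern(obj, self.data, self._data_pos)
        self.index.append((s, p, o))
        self._outgoing.setdefault(s, []).append((p, o))

    def scan(self, subject, predicate):
        # Naive approach: look at every index entry.
        s, p = self._data_pos[subject], self._pred_pos[predicate]
        return [self.data[o] for (s2, p2, o) in self.index
                if s2 == s and p2 == p]

    def navigate(self, subject, predicate):
        # Navigation: only look at entries that start at the subject node.
        s, p = self._data_pos[subject], self._pred_pos[predicate]
        return [self.data[o] for (p2, o) in self._outgoing.get(s, [])
                if p2 == p]

    def reachable(self, start, predicate):
        # Follow one predicate from node to node (a transitive closure),
        # touching only the nodes that are actually reached.
        p = self._pred_pos[predicate]
        first = self._data_pos[start]
        seen, queue, found = {first}, deque([first]), []
        while queue:
            node = queue.popleft()
            for p2, o in self._outgoing.get(node, []):
                if p2 == p and o not in seen:
                    seen.add(o)
                    found.append(self.data[o])
                    queue.append(o)
        return found


store = TripleStore()
store.add("alice", "knows", "bob")
store.add("bob", "knows", "carol")
store.add("alice", "likes", "tea")
```

With these three triples, scan and navigate give the same answer for
("alice", "knows"), but navigate only touches the entries hanging off
"alice", and reachable walks from "alice" to "bob" to "carol" without ever
looking at unrelated entries. The per-subject adjacency map is just one way
to get that behaviour; a real engine would need something that works on
disk.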

I have to work it out, because what I'm saying might be complete
rubbish... ;)


> I don't know of any RDF query language standardization work in other
> standards bodies. (It's such a beautiful irony that "standardisation"
> can be spelled two ways?)

I found SeRQL to be very interesting and quite capable of being standardi[sz]ed.

cheers,
--renato

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm