On Mon, Nov 25, 2013 at 8:29 AM, Steven Siebert <smsiebe@stripped> wrote:
> If that's the only type of queries you're gonna make, sounds like MySQL
> can be a good solution (from an indexing perspective, those queries will be
> fine). Basically, you would be using MySQL as a read-only copy of the XML
> content...so your goal here is to understand how you'll be needing to
> query, and build the best model (in MySQL) you can to achieve these
> queries. Going back to your initial question, I don't believe there is an
> organic MySQL tool (or even a 3rd party tool that I know about) that will
> do this for you - you'll probably have to write something, very small, to
> do this. Not sure how much programming experience you have...but something
> like a simple SAX parser producing XML element and attribute "rows" that
> you (probably batch) insert into the schema you make will work quite
> easily...with very little code.
I've been programing for over 30 years. I already have written some python
code using ElementTree that parses the file.
> Be careful with the schema, especially the IDs. The schema suggested by
> Johan will work in simple cases, but might become a problem with unbounded
> elements and if you wanted to break out XML arrays, for example (ie more
> than one instance of a node's ID). If you need to preserve sequence, that
> might be something you need to consider too. If you don't foresee the need
> to query the parent hierarchy, I wouldn't add it (no parent_id)...not only
> would it not be necessary, you have to consider how you want to query
> hierarchy...if you need to, checked out the Nested Set model as opposed to
> the adjacent list (see h
> Again, this all depends on what you need...but based on the queries you
> indicated, you wouldn't need it.
Thanks for the link - the Nested Set model looks very interesting, but as
you say, I probably won't need that. This will most likely be an iterative
process as how it will be queried will probably evolve.
> Good luck!
> On Mon, Nov 25, 2013 at 4:29 AM, Johan De Meersman <vegivamp@stripped>wrote:
>> | Nodes |
>> | id | ---
>> | [more fields] | |
>> | parent_id | >--
>> And then you join the table with itself as needed. Do note that you'll
>> need a self-join for every level you query, and that every self-join
>> incrementally slows down the query.
>> It may be beneficial (actually, I'm pretty sure it will be :-) ) to
>> implement the recursion in software as necessary, simply follow the branch
>> you want and check wether there are children at each turn; or pick up the
>> child and go back up until parent_id is NULL.
>> Depending on what you want/need, you could also add a depth field, so you
>> don't have to loop to figure out where you are in the tree, etc.
>> Storing node x/y/z as an identifier, as you suggest, is also an option
>> with it's own benefits and downsides; nothing prevents you from taking any
>> or all of those approaches at the same time.
>> ----- Original Message -----
>> > From: "Larry Martell" <larry.martell@stripped>
>> > To: "Steven Siebert" <smsiebe@stripped>
>> > Cc: "mysql mailing list" <mysql@stripped>
>> > Sent: Friday, 22 November, 2013 3:17:44 PM
>> > Subject: Re: XML to RDB
>> > Yes, I will need to query the data, and yes, it's app specific to the
>> > The parent-node structure will meet my needs if I can figure out how to
>> > preserver the unbound multi level nodes. There will be a higher level
>> > table, which will have the file name, date received, and other
>> > info that will be start of the query. The nodes will have a foreign key
>> > back to the file table. The queries will be something like "get the abcd
>> > parameter from node x/y/z from the file foo.xml from 10/10/13". I guess
>> > can just store the x/y/z in the node table (as opposed to trying to
>> > represent z is child of y which is a child of x in 3 different tables or
>> > rows.
>> Unhappiness is discouraged and will be corrected with kitten pictures.