On 28.05.2007 09:06 CE(S)T, Kevin Hunter wrote:
> At 12:31a -0400 on 28 May 2007, Dan Nelson wrote:
>> You want the ARCHIVE storage engine.
Hm, it doesn't support deleting rows and it cannot use indexes. So doing
statistics on them (which can be a little more complex than counting
rows within a timespan, which is why I wanted to use an SQL database)
could get quite resource demanding.
> In particular, I imagine a lot of the HTTP requests would be the
> same, so you could create a table to store the requested URLs, and
> then have a second table with the timestamp and foreign key
> relationship into the first.
Interesting idea. Inserting would be more work, though, because the
already present "dictionary" rows have to be found first. Also, URLs
sometimes contain things like
session IDs. They're probably not of interest for my use but it's not
always easy to detect them for removal. I could also parse user agent
strings for easier evaluation, but that would take away the possibility
of adding support for newer browsers at a later time. (Well, I could
update the
database from the original access log files when I've updated the UA
IP addresses (IPv4) and especially return codes (which can be mapped to
a 1-byte value) are probably not worth a separate reference table. Data
size values are probably too widely distributed for this, too.
How large is a row reference? 4 bytes?
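For illustration only, the two-table layout suggested in the quoted text could be sketched roughly as below. This is a minimal sketch, not anything from the thread: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (url, hit) and helper names (open_log_db, url_id, log_hit) are made up for the example. On the size question: in MySQL an INT key takes 4 bytes per row, MEDIUMINT 3 and SMALLINT 2, so 4 bytes holds if the reference column is declared as INT.

```python
import sqlite3


def open_log_db(path=":memory:"):
    """Create the two hypothetical tables: a URL dictionary and the hit log."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE url (
            id   INTEGER PRIMARY KEY,
            path TEXT NOT NULL UNIQUE      -- each distinct URL stored once
        );
        CREATE TABLE hit (
            ts     INTEGER NOT NULL,       -- request time, Unix seconds
            url_id INTEGER NOT NULL REFERENCES url(id),
            status INTEGER NOT NULL        -- HTTP return code, kept inline
        );
        CREATE INDEX hit_ts ON hit(ts);    -- for timespan statistics
    """)
    return conn


def url_id(conn, path):
    """Return the dictionary id for a URL, inserting the row if it is new."""
    # SQLite spelling; the MySQL counterpart would be INSERT IGNORE.
    conn.execute("INSERT OR IGNORE INTO url (path) VALUES (?)", (path,))
    row = conn.execute("SELECT id FROM url WHERE path = ?", (path,)).fetchone()
    return row[0]


def log_hit(conn, ts, path, status):
    """Store one request; this is the extra lookup work on every insert."""
    conn.execute(
        "INSERT INTO hit (ts, url_id, status) VALUES (?, ?, ?)",
        (ts, url_id(conn, path), status),
    )
```

Every logged request costs one extra lookup against the unique index on url.path, which is the additional insert work mentioned above. URLs that carry session IDs would each get their own dictionary row unless they are normalised before calling log_hit.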
Yves Goergen "LonelyPixel" <nospam.list@stripped>
Visit my web laboratory at http://beta.unclassified.de