Blake Binkley wrote:
>
> I have Apache logging to Mysql as a proof of concept
>
> logging each CLF column as a column of it's own
>
> (IE: host|useragent|authuser|time|request|status|size )
>
> when Mysql gets > 28800 rows the following sql takes 4 seconds to
> resolve.
>
> select count(useragent) where useragent regexp 'Win95|Windows.95|Win32'
>
> as you can see the line above pulls out a count of how many hits were
> made to a website by Windows 95 clients, so a total search just for OS
> and Browser types results in:
>
> 5 OS's + 10 browser types = 15 x 4 seconds each = 60 seconds and there
> is still alot more data to gleen from the database...
>
> I know an index would probably speed it up a bit but what do I index on?
> none of the data can be considered "unique" any other suggestions will
> be helpful.
>
If there is anything you could index it on to make it faster, it would
be useragent. What I would do here ( maybe it won't work for quite like
I am suggesting, but you get he idea) is add some filtering before you
insert and categorize the useragent there. Then you can store it as
enum, and your queries are going to fly.
--
Sasha Pachev
http://www.sashanet.com