> > Hi,
> >
> > I need some help please! I have 60GB of proxy logs to parse and load
> > into a mysql database. I've written a parsing script but I'm stuck now
> > on how to load the data in.
> >
> > I have a database called PROXY_LOG with 2 tables:
> >
> > USER_TABLE
> > user_id date_time url_id size
> >
> > and
> >
> > URL_TABLE
> > url_id url category
> >
> >
> > The data values I have for each record are:
> >
> > user_id date_time size url category
> >
> > for example:
> >
> > u752359 2004-04-02 12:33:04 3403 http://www.mysql.com/index.html
> > business
> >
> > The problem is I'm not sure how to load the data into the 2 tables and
> > set the url_id. Could anyone point me in the right direction please?
> >
> > Many thanks,
> >
> > js.
> >
> > _________________________________________________________________
> > It's fast, it's easy and it's free. Get MSN Messenger today!
> > http://www.msn.co.uk/messenger
>What language did you use to do the parsing. If it was perl I'd recommend
>looking at using perl's DBI interface and doing it progromaticaly.
>Do you have any idea of how many different url's you have.
>
>Basic steps:
> Get record,
> check to see if url is in database, if it is get the url_id.
> if not insert it and get the generated url_id.
> insert the user record using the url_id you now have.
>repeat until you run out of records.
>
>Not elegent but it will get the job done. Note look into documentation on
>how to get the new url_id after you do an insert. Its in the DBD::mysql
>for perl.
>
Thanks for your reply William. I am using a perl script. If I have to insert
these records one by one it's going to be really slow isn't it? Maybe the
quickest way is to parse the logs twice i.e. fill the URL_TABLE first using
your procedure above, then on the second run, create a LOAD file for
USER_TABLE?
js.
_________________________________________________________________
It's fast, it's easy and it's free. Get MSN Messenger today!
http://www.msn.co.uk/messenger