At 18:57 -0700 4/27/03, Ray Kiddy wrote:
>I am writing a utility for importing data into MySQL. The new thing
>about my utility is that it will behave deterministically when
>dealing with warnings or errors and data validation issues.
>
>I am thinking it will operate in two modes.
>
>The first one is easier. It will take a file that contains such as:
>
> col1,col2,col3
> 1,foo,bar
> 2,judy,frank
> 3,adam,bob,extra
> 4,alpha,beta
>
>and create a table with 3 columns: col1 CHAR(1), col2 VARCHAR(5),
>col3 VARCHAR(5), extra1 VARCHAR(5) and put the 4 lines into the
>table.
>
>The benefit of this approach is that:
>
>1) it will create exactly the columns needed by the data
>2) the columns will be as big as they need to be to hold the data
>3) there will be no data loss from importing.
>
>Imagine that. No data loss. No message about 13274 warnings and no
>way to find out what the heck happened.
>
>This mode will also take an existing table and add columns or
>re-size the columns to get the data to fit.
>
>The second mode (or second tool) will do data validation.
>
>The idea is this. Say that if you have a bunch of data and one
>column is a number, you can import it to be in a SMALLINT column and
>then, if there is a problem fitting the data into the table, you can
>separate that line out of the data file, instead of just shoving it
>into the database, messing up the value and hoping for the best.
>
>I am writing this in C to help with its performance.
>
>I am very open to suggestions or ideas.
You might want to visit http://www.kitebird.com/mysql-cookbook and
download the recipes distribution. Then look in the transfer directory,
where you'll find a couple of Perl scripts that are relevant to your
task:
# guess_table.pl - characterize the contents of a data file and use the
# information to guess a CREATE TABLE statement for the file
# load_diag.pl - LOAD DATA diagnostic
# Simulates the effect of running a LOAD DATA statement, but attempts
# to determine which rows and columns result in warnings or errors.
They might give you some ideas.
>
>thanx - ray
--
Paul DuBois
http://www.kitebird.com/
sql, query