>>>>> "marko" == marko hrastovec <marko.hrastovec@stripped> writes:
>> Description:
marko> When I use LOAD DATA INFILE command I have problems with reading the
marko> first line of the file if I use some strange field delimiters and
marko> text qualifiers.
marko> load data infile 'file.txt' replace into table cd fields terminated by
> '¶' optionally enclosed by '¬';
marko> I have used this strange characters because other common delimiters
marko> may occur in the data.
marko> In the following example you can see the first two lines of the text
marko> file read into the database. If I put the ¬ character as the first
marko> character of the file it gets read into the field although it is
marko> the text enclosure character. With " and ; delimiters it works fine.
marko> So the text enclosure character must be omitted.
marko> This file is read properly:
marko> 0008811110321¬¶¬Pulp
> Fiction¬¶¶2999.00¶2350.00¶8¶12¶1¶1¶1¶¬Bmg¬¶100¶¶0¶4¶¬Kompilacije¬
marko> ¬0008811162429¬¶¬Dude
> Ranch¬¶¶2999.00¶0.00¶8¶5¶1¶0¶1¶¬UNIVERSAL¬¶100¶¶0¶4¶¬Blink
> 182¬
marko> This file is not read properly the first field in the first row is read like
marko> this "¬000881111032":
marko> ¬0008811110321¬¶¬Pulp
> Fiction¬¶¶2999.00¶2350.00¶8¶12¶1¶1¶1¶¬Bmg¬¶100¶¶0¶4¶¬Kompilacije¬
marko> ¬0008811162429¬¶¬Dude
> Ranch¬¶¶2999.00¶0.00¶8¶5¶1¶0¶1¶¬UNIVERSAL¬¶100¶¶0¶4¶¬Blink
> 182¬
Hi!
The problem was with 8 bit characters as field separators. Here is a
patch for this:
*** /my/monty/master/mysql-3.23.8-alpha/include/my_sys.h Sun Dec 26 02:34:45 1999
--- include-lib/my_sys.h Mon Jan 3 23:58:27 2000
***************
*** 247,253 ****
#define my_b_get(info) \
((info)->rc_pos != (info)->rc_end ?\
! ((info)->rc_pos++, (int) (info)->rc_pos[-1]) :\
_my_b_get(info))
#define my_b_write(info,Buffer,Count) \
--- 247,253 ----
#define my_b_get(info) \
((info)->rc_pos != (info)->rc_end ?\
! ((info)->rc_pos++, (int) (uchar) (info)->rc_pos[-1]) :\
_my_b_get(info))
#define my_b_write(info,Buffer,Count) \
*** /my/monty/master/mysql-3.23.8-alpha/sql/sql_load.cc Wed Nov 10 14:38:13 1999
--- sql/sql_load.cc Mon Jan 3 23:58:40 2000
***************
*** 420,429 ****
line_term_length=0;
line_term_ptr=(char*) "";
}
! enclosed_char= (enclosed_length=enclosed_par.length()) ? enclosed_par[0] :
! INT_MAX;
! field_term_char= field_term_length ? field_term_ptr[0] : INT_MAX;
! line_term_char= line_term_length ? line_term_ptr[0] : INT_MAX;
error=eof=found_end_of_line=found_null=line_cuted=0;
buff_length=tot_length;
--- 420,429 ----
line_term_length=0;
line_term_ptr=(char*) "";
}
! enclosed_char= (enclosed_length=enclosed_par.length()) ?
! (uchar) enclosed_par[0] : INT_MAX;
! field_term_char= field_term_length ? (uchar) field_term_ptr[0] : INT_MAX;
! line_term_char= line_term_length ? (uchar) line_term_ptr[0] : INT_MAX;
error=eof=found_end_of_line=found_null=line_cuted=0;
buff_length=tot_length;
***************
*** 479,485 ****
return 1;
PUSH(chr);
while (i-- > 1)
! PUSH(*--ptr);
return 0;
}
--- 479,485 ----
return 1;
PUSH(chr);
while (i-- > 1)
! PUSH((uchar) *--ptr);
return 0;
}
***************
*** 537,543 ****
if (ismbchar(p, (uchar*)to))
continue;
for (int i=0; i<ml; i++)
! PUSH(*--to);
chr = GET;
}
#endif
--- 537,543 ----
if (ismbchar(p, (uchar*)to))
continue;
for (int i=0; i<ml; i++)
! PUSH((uchar) *--to);
chr = GET;
}
#endif
***************
*** 750,756 ****
PUSH(chr);
while (--ptr != line_start_ptr)
{ // Restart with next char
! PUSH(*ptr);
}
goto try_again;
}
--- 750,756 ----
PUSH(chr);
while (--ptr != line_start_ptr)
{ // Restart with next char
! PUSH((uchar) *ptr);
}
goto try_again;
}
Yours,
Monty
*************** Warning commercial signature follows **********
If you like TCX's concept of a 'mostly free' database and free
advice, you should at least CONSIDER supporting us, so that we can
afford to continue this service up. http://www.mysql.com/