List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:January 3 2000 10:20pm
Subject:load data infile with strange delimiters
View as plain text  
>>>>> "marko" == marko hrastovec <marko.hrastovec@stripped> writes:

>> Description:
marko> 	When I use LOAD DATA INFILE command I have problems with reading the
marko> 	first line of the file if I use some strange field delimiters and
marko> 	text qualifiers.

marko>         load data infile 'file.txt' replace into table cd fields terminated by
> '¶' optionally enclosed by '¬';

marko> 	I have used this strange characters because other common delimiters
marko> 	may occur in the data.

marko> 	In the following example you can see the first two lines of the text
marko> 	file read into the database. If I put the ¬ character as the first
marko> 	character of the file it gets read into the field although it is
marko> 	the text enclosure character. With " and ; delimiters it works fine.

marko>         So the text enclosure character must be omitted.

marko> This file is read properly:
marko> 0008811110321¬¶¬Pulp
> Fiction¬¶¶2999.00¶2350.00¶8¶12¶1¶1¶1¶¬Bmg¬¶100¶¶0¶4¶¬Kompilacije¬
marko> ¬0008811162429¬¶¬Dude
> Ranch¬¶¶2999.00¶0.00¶8¶5¶1¶0¶1¶¬UNIVERSAL¬¶100¶¶0¶4¶¬Blink
> 182¬

marko> This file is not read properly the first field in the first row is read like
marko> this "¬000881111032":
marko> ¬0008811110321¬¶¬Pulp
> Fiction¬¶¶2999.00¶2350.00¶8¶12¶1¶1¶1¶¬Bmg¬¶100¶¶0¶4¶¬Kompilacije¬
marko> ¬0008811162429¬¶¬Dude
> Ranch¬¶¶2999.00¶0.00¶8¶5¶1¶0¶1¶¬UNIVERSAL¬¶100¶¶0¶4¶¬Blink
> 182¬

Hi!

The problem was with 8 bit characters as field separators.  Here is a
patch for this:

*** /my/monty/master/mysql-3.23.8-alpha/include/my_sys.h	Sun Dec 26 02:34:45 1999
--- include-lib/my_sys.h	Mon Jan  3 23:58:27 2000
***************
*** 247,253 ****
  
  #define my_b_get(info) \
    ((info)->rc_pos != (info)->rc_end ?\
!    ((info)->rc_pos++, (int) (info)->rc_pos[-1]) :\
     _my_b_get(info))
  
  #define my_b_write(info,Buffer,Count) \
--- 247,253 ----
  
  #define my_b_get(info) \
    ((info)->rc_pos != (info)->rc_end ?\
!    ((info)->rc_pos++, (int) (uchar) (info)->rc_pos[-1]) :\
     _my_b_get(info))
  
  #define my_b_write(info,Buffer,Count) \
*** /my/monty/master/mysql-3.23.8-alpha/sql/sql_load.cc	Wed Nov 10 14:38:13 1999
--- sql/sql_load.cc	Mon Jan  3 23:58:40 2000
***************
*** 420,429 ****
      line_term_length=0;
      line_term_ptr=(char*) "";
    }
!   enclosed_char= (enclosed_length=enclosed_par.length()) ? enclosed_par[0] :
!     INT_MAX;
!   field_term_char= field_term_length ? field_term_ptr[0] : INT_MAX;
!   line_term_char= line_term_length ? line_term_ptr[0] : INT_MAX;
    error=eof=found_end_of_line=found_null=line_cuted=0;
    buff_length=tot_length;
  
--- 420,429 ----
      line_term_length=0;
      line_term_ptr=(char*) "";
    }
!   enclosed_char= (enclosed_length=enclosed_par.length()) ?
!     (uchar) enclosed_par[0] : INT_MAX;
!   field_term_char= field_term_length ? (uchar) field_term_ptr[0] : INT_MAX;
!   line_term_char= line_term_length ? (uchar) line_term_ptr[0] : INT_MAX;
    error=eof=found_end_of_line=found_null=line_cuted=0;
    buff_length=tot_length;
  
***************
*** 479,485 ****
      return 1;
    PUSH(chr);
    while (i-- > 1)
!     PUSH(*--ptr);
    return 0;
  }
  
--- 479,485 ----
      return 1;
    PUSH(chr);
    while (i-- > 1)
!     PUSH((uchar) *--ptr);
    return 0;
  }
  
***************
*** 537,543 ****
  	  if (ismbchar(p, (uchar*)to))
  	    continue;
  	  for (int i=0; i<ml; i++)
! 	    PUSH(*--to);
  	  chr = GET;
        }
  #endif
--- 537,543 ----
  	  if (ismbchar(p, (uchar*)to))
  	    continue;
  	  for (int i=0; i<ml; i++)
! 	    PUSH((uchar) *--to);
  	  chr = GET;
        }
  #endif
***************
*** 750,756 ****
        PUSH(chr);
        while (--ptr != line_start_ptr)
        {					// Restart with next char
! 	PUSH(*ptr);
        }
        goto try_again;
      }
--- 750,756 ----
        PUSH(chr);
        while (--ptr != line_start_ptr)
        {					// Restart with next char
! 	PUSH((uchar) *ptr);
        }
        goto try_again;
      }

Yours,
Monty

*************** Warning commercial signature follows **********
If you like TCX's concept of a 'mostly free' database and free
advice, you should at least CONSIDER supporting us, so that we can
afford to continue this service up.   http://www.mysql.com/
Thread
load data infile with strange delimitersmarko.hrastovec3 Jan
  • Re: load data infile with strange delimitersSasha Pachev3 Jan
  • load data infile with strange delimitersMichael Widenius4 Jan