List:General Discussion« Previous MessageNext Message »
From:SGreen Date:December 21 2005 4:56pm
Subject:Re: Cleaning illegal characters from varchar field
View as plain text  
Nathan Gross <nat101sql@stripped> wrote on 12/21/2005 11:49:10 AM:

> On 12/21/05, SGreen@stripped <SGreen@stripped> wrote:
> >
> >
> > Nathan Gross <nat101sql@stripped> wrote on 12/21/2005 11:20:50 AM:
> >
> >
> >  > On 12/20/05, SGreen@stripped <SGreen@stripped> wrote:
> >  > > Nathan Gross <nat101sql@stripped> wrote on 12/20/2005 05:34:58
> 
PM:
> >  > >
> >  > >  > Hi;
> >  > >  > [Mysql 4x] In a table where I get data from another program, I
> 
have
> >  > >  > many records (about 1000) that have illegal chars in a 
[unique]
> >  > >  > indexed varchar field. I would like to clean the illegal 
characters
> >  > >  > out and leave the rest of the data intact.
> >  > >  > 1. Is there a utility to do this?
> >  > >  > 1b) Or an UPDATE command?
> >  > >  >
> >  > >  > Problem is that I do not even know which characters are 
illegal.
> >  > >  > For the future, I can have my Java program check before the 
inserts.
> >  > >  >
> >  > >  > 2. What are the illegal characters?
> >  > >  >
> >  > >  > Thank you.
> >  > >  > -nat
> >  > >  >
> >  > >
> >  > > If you don't know what is illegal, how do you know you have 
illegal
> >  > > characters? Are you getting some kind of error message or 
warning? You
> > have
> >  > > to give us the same information you have in order for us to make 
an
> > informed
> >  > > decision.
> >  > I get two lists from different sources which I merge into the 
database
> >  > via a Java program. Since these two lists themselves sometimes get
> >  > their data from the same source, my program first does a SELECT on 
the
> >  > varchar field (unique index) [to ensure that this data is not yet 
in
> >  > the db] before inserting the data. (Actually I am using j2ee
> >  > ejbFinds.)
> >  > Occasionaly, say 5% of the time, I get a duplicate (create) 
exception
> >  > from the db, even though the db just told my program that the data 
was
> >  > not there!
> >  > With my debugger I verified that for these fields, if you try to
> >  > find/select this data, mysql will return a null resultset, but yet 
if
> >  > you try to enter this data it will throw the create exception.
> >  > Now these [defective] fields always have (1 or more) suspicious
> >  > characters like {,[,',", or commas colons, what not. I do not know
> >  > which of these is/are the culprits.
> >  > So, my little problem is twofold.
> >  > 1) How to trap this data in the first place in my Java program.
> >  > Obviously I need to know what to look for!
> >  > 2) I do not want to leave my database in this state, so I want to 
clean
> > it.
> >  > 2b)I don't mind adding a boolean field to the table to be used as a
> >  > flag for all records that have suspect data, before deciding 
exactly
> >  > how to clean it. Although I think I can just replace these 
characters
> >  > with a space, in [almost] all cases.
> >  >
> >  > One more important point. This field, although unique, is NOT the 
pk.
> >  > I can use the PK to correctly select and display the contents of 
all
> >  > fields in the record.
> >  >
> >  > Thank you much.
> >  > -nat
> >
> > I believe that what you are describing is the inability to properly 
escape a
> > query string in order to make an appropriate match. Please look atthis 
page
> > for how to escape your incoming data properly so that you can probably 
avoid
> > the problems you describe.
> >
> > http://dev.mysql.com/doc/refman/4.1/en/string-syntax.html
> Does this mean that any character NOT on that page is legal? Like a
> brace {} or brackets[]?
> 
> > Nearly every mysql client library has some kind of function to perform 
this
> > escaping for you. Please refer to the manual of the client library you 
are
> > using to see if your client has it or if it escapes content as part of 
the
> > execution of certain methods or calls.
> I use Java. I don't know where to look for such a function.
> Thank you.
> -nat

Java is organized by packages. Check the docs for the package you are 
using to communicate with MySQL. Google around if you need to. You 
mentioned "j2ee" and "ejbFinds", I would start there and work my way out 
until I either found the function or was satisfied that it doesn't exist 
in the packages you are using. If your package doesn't have an escape 
function, it would be well worth your time to write one.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


Thread
Cleaning illegal characters from varchar fieldNathan Gross20 Dec
  • Re: Cleaning illegal characters from varchar fieldSGreen21 Dec
    • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
      • Re: Cleaning illegal characters from varchar fieldSGreen21 Dec
        • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
          • Re: Cleaning illegal characters from varchar fieldSGreen21 Dec
          • Re: Cleaning illegal characters from varchar fieldHassan Schroeder21 Dec
            • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
              • Re: Cleaning illegal characters from varchar fieldHassan Schroeder21 Dec
                • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
                  • Matching phone numbers to addressesRussell Horn21 Dec
                    • Re: Matching phone numbers to addressesPeter Brawley21 Dec
                  • Re: Cleaning illegal characters from varchar fieldHassan Schroeder21 Dec
                    • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
                  • Re: Cleaning illegal characters from varchar fieldMark Matthews21 Dec
                    • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec
      • Re: Cleaning illegal characters from varchar fieldJamesDR21 Dec
        • Re: Cleaning illegal characters from varchar fieldNathan Gross21 Dec