The fact that the data is coming from the text logs doesn't really change
anything; _something_ is generating the text logs so that something could be
changed to force users to supply phone numbers in whatever format you want.
Of course, that doesn't mean _you_ can force those changes to take place; if
the text logs are coming from a customer, you may not be able to persuade
them to change the way they generate the logs. Your boss may not even want
you to explore the possibility with the customer for fear of ruffling
feathers with the customer(s).
If that is the case, I'd suggest writing a UDF (user-defined function) to do
the stripping of the punctuation for you, unless you can finagle the
existing MySQL functions to do the work for you. I'm surprised by how few
string functions MySQL supports. I use DB2 most of the time and it has lots
and lots of built-in functions, many of which are dedicated to string
manipulation. If you can't keep the punctuation out of the data in the first
place and you can't figure out how strip the punctuation with the existing
MySQL functions, I'd say a UDF is pretty much the _only_ way to get that
----- Original Message -----
From: "Rick Dwyer" <rdwyer@stripped>
Sent: Thursday, November 17, 2005 11:53 AM
Subject: Re: A bit of SQL help for a MySQL novice.
> Unfortunately, the phone numbers come from text logs that get imported
> into mysql. Because the phone number is often displayed on a document
> for the customer, they will dictate how they want it to appear (i.e. with
> ( ) etc.). The phone logs simply record those values as they are so
> data will be entered in an unclean manner.
> Therefore I must deal with it on the backend.
> Thanks for the pointers.
> On Nov 17, 2005, at 11:15 AM, Rhino wrote:
>> ----- Original Message ----- From: "Rick Dwyer" <rdwyer@quick- link.com>
>> To: <mysql@stripped>
>> Sent: Thursday, November 17, 2005 10:28 AM
>> Subject: A bit of SQL help for a MySQL novice.
>>> Hello All.
>>> I am hoping for a bit of help with some code that has really given me
>>> some trouble. If this is not he correct forum for this any help in
>>> pointing me to a more suited list would be appreciated.
>>> I have a MySQL 4.1.x database containing records with phone numbers.
>>> Most of the phone numbers are enter in 12035551212 format, but some
>>> are entered with spaces or "-" or "(" or other characters.
>>> I need to read the first 4 positions in the phone number to determine
>>> it's location.
>>> My statement looks like this:
>>> 'Select mid(phone, 1,4) as phoneareacode from phonetable'
>>> This works but if the number is entered as 1(203)-555-1212 the above
>>> would return "1(20" which is not what I am looking for.
>>> Is there a way to have the select statement examine only numeric
>>> values in the phone number so it would disregard the other charcters?
>>> In Lasso, you can use a Replace with a Regular Expression function to
>>> have just the digits 0-9 examined but haven't been able find a way to
>>> do this in SQL.
>>> Any help is appreciated.
>> You have two basic options:
>> 1. Make the data uniform in format so that it is easily accessed.
>> 2. Let the users input the data in whatever format they like and then
>> try to deal with it.
>> It looks like you have opted for the second choice. If it were me, I'd
>> _strongly_ prefer the first choice. I would put edits on the forms or
>> applications that prompt the user for the phone number and force the
>> input to match one format. For instance, if you prefer to see the phone
>> number as one long string of numbers, e.g. 12025551212, either ignore
>> any characters they type that aren't digits or strip out the punctuation
>> characters afterwards.
>> By the way, I'm _not_ saying that you should store the numbers as one
>> long string; there are other options but I would choose the one that was
>> going to be most useful to you based on your business requirements. If
>> the area code is going to be important to you, as it appears from your
>> question, it might be a good idea to store it in a separate column. For
>> instance, you could put the country code (the '1') in a Country_Code
>> column, put the area code in an Area_Code column, put the 7 digit number
>> in its own column, and then put the extension (if applicable) in yet
>> another column if that would help you. Beware of foreign phone numbers
>> though because they don't look like US ones (and don't make the mistake
>> of thinking that the '1' at the beginning of the phone number
>> automatically means the US; I'm in Canada and our phone numbers also
>> start with 1, our area codes are also three digits, and the rest of the
>> number is also 7 digits. Phone numbers in Europe or Africa or Asia
>> follow rather different patterns that are shared by Canada and the US.)
>> Now, your input routines _could_ mimic the way you store the phone
>> numbers. For instance, if you want separate columns in the database for
>> country code, area code, the rest of the number, and the extension (if
>> any), you _could_ provide a separate field in your input form for each
>> of those things. However, you don't have to do it that way; you could
>> just as well put the full phone number in one input field and then split
>> it out when you insert it into the database. That's up to you.
>> But I would definitely use the input routines to force the phone numbers
>> to follow whatever pattern you want it to have. It shouldn't be the
>> database's job to handle this sort of thing, at least in my opinion. Of
>> course, you'll want to fix the data that is already in the database,
>> too. (If there are only a few rows in the table, you could do that
>> manually. If not, you could write SQL to do it.)
>> However, if you insist on allowing multiple formats for your phone
>> numbers, the String Functions in MySQL should help you. Just look for
>> them in the manual: http://dev.mysql.com/doc/refman/4.1/en/ index.html
>> (chapter 12).
>> You may have to use a combination of functions to create new temporary
>> versions of the phone number that don't have the punctuation but you can
>> probably manage something, although it might be ugly.
>> Another possibility is that you could write a user-defined function to
>> strip the punctuation out of the phone numbers. See http://
>> dev.mysql.com/doc/refman/4.1/en/adding-functions.html for information
>> about this. Basically, you would write your own function in C or C++ to
>> do this work for you, then plug the function into MySQL so that you can
>> use it in your SQL. For instance, if you wrote a function called
>> STRIP_PHONE_PUNCTUATION() and installed it in MySQL, your query would
>> look like this:
>> 'Select mid(STRIP_PHONE_PUNCTUATION(phone, 1,4)) as phoneareacode from
>> Your new function would give you a version of the phone number that had
>> no punctuation, then the mid() function would find the area code for
>> No virus found in this outgoing message.
>> Checked by AVG Free Edition.
>> Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date:
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: http://lists.mysql.com/mysql?unsub=1
> Rick Dwyer
> Computer Operations Manager
> Quick Link Information Services
> ph: 203-483-2922
> fx: 203-483-2920
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=1
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date: 16/11/2005
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.13.3/173 - Release Date: 16/11/2005