List: General Discussion
From: Andreas    Date: November 10 2003 3:56am
Subject:Dupe killing (was: Data sincronization)
>* Roger
>Maybe even give a warning some time _before_ you run out of keys...? ;)
>
>You should never run out of keys. Every time you sync, you also check if
>there are many keys left to use... if you sync once a day, as soon as you
>have less than 10 times the expected daily usage of keys left to use, you
>request a new set of keys.
>  
>
You are right. I took care of that by using intervals of 10,000,000.
We are still working in the 5-digit range, and only some history tables
have 5-digit IDs at all.
So there are plenty of IDs left.
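
The refill rule quoted above (request a new set of keys once fewer than
10 times the expected daily usage is left) could be checked at each sync
roughly like this. It is only a sketch: the function name, its
parameters and the example block boundaries are made up for
illustration and do not come from any real schema.

```python
def needs_new_key_range(next_id, range_end, expected_daily_usage,
                        safety_factor=10):
    """Return True when the current key block is running low.

    next_id              -- next unused ID in this site's block (hypothetical)
    range_end            -- last ID of the block, inclusive (hypothetical)
    expected_daily_usage -- estimated number of keys consumed per day
    safety_factor        -- days of headroom to keep (10 in the quoted rule)
    """
    remaining = range_end - next_id + 1
    return remaining < safety_factor * expected_daily_usage


# Example with a block of 10,000,000 IDs: 10,000,000 .. 19,999,999
if needs_new_key_range(next_id=10_050_000, range_end=19_999_999,
                       expected_daily_usage=200):
    print("request a new set of keys at this sync")
else:
    print("plenty of keys left")
```

With intervals this large and IDs still in the 5-digit range, the check
would not fire for a long time, but it costs nothing to run at every sync.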

>You still would have to handle duplicates.
>  
>
That's an interesting issue.
We organise trade fairs and therefore I have to integrate lists of 
potential customers. Those are companies that attended other fairs and 
might be interested in ours, too. Or they have advertised somewhere and 
would fit into an event.

The problem is that those entries might already exist, but not with 
exactly the same spelling or address.
E.g.
Smith Hats, Baker Street 1, London
Hats Smith, Baker Street 1, London
Smith Hats, Baker Str. 11, London/Soho
John Smith Hats and Shoes, Baker Str. 10, London/Soho
J. Smith & Son, Baker 10, London

It depends on the source the address comes from, or on whether the 
company moved and we already have the old address in the DB. Maybe we 
have the right address but the source list contains an older, obsolete 
one. Possibly the company is known to be closed, but via the new list 
it gets re-entered as an active contact.
And I see all kinds of misspellings, too.

Is there a way to automate the dupe check?
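
To make the question more concrete, here is a rough sketch of what a
first automated pass might look like. It uses SequenceMatcher from
Python's standard difflib module on normalised entries. The
abbreviation table, the 0.8 threshold and all function names are
assumptions chosen for illustration, not a tested recipe, and anything
it flags would still need a manual look.

```python
import re
from difflib import SequenceMatcher

# Assumed abbreviation table; a real one would have to be built from the data.
ABBREVIATIONS = {"str": "street", "st": "street"}


def normalize(entry):
    """Lower-case, drop punctuation, expand abbreviations, sort the words."""
    text = entry.lower().replace("&", " and ")
    tokens = re.findall(r"[a-z0-9]+", text)
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens]
    # Sorting makes word order irrelevant ("Smith Hats" vs "Hats Smith").
    return " ".join(sorted(tokens))


def similarity(a, b):
    """Similarity of two entries between 0.0 and 1.0 after normalisation."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(entries, threshold=0.8):
    """Return (entry, entry, score) for every pair at or above the threshold."""
    pairs = []
    for i in range(len(entries)):
        for j in range(i + 1, len(entries)):
            score = similarity(entries[i], entries[j])
            if score >= threshold:
                pairs.append((entries[i], entries[j], round(score, 2)))
    return pairs


entries = [
    "Smith Hats, Baker Street 1, London",
    "Hats Smith, Baker Street 1, London",
    "Smith Hats, Baker Str. 11, London/Soho",
    "John Smith Hats and Shoes, Baker Str. 10, London/Soho",
    "J. Smith & Son, Baker 10, London",
]
for first, second, score in likely_duplicates(entries):
    print(score, "|", first, "|", second)
```

The first two sample entries normalise to the same string and therefore
score 1.0. How the other variants score depends on the threshold and on
the abbreviation table, which is why this can only pre-sort candidates
for review rather than merge them automatically.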

I fear the day when I have to manually merge our second remote database 
into the main DB.
There we are talking about dupe killing in a pool of 3000 addresses that 
goes into another one with 7000.
And I know there are a lot of dupes.   :(


... Andreas

