From: Brandon Esbach Date: August 31 2007 1:46pm Subject: RE: Two/more seperately unique columns in a table? List-Archive: http://lists.mysql.com/mysql/208846 Message-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi, and thanks Baron; I should have been a bit clearer on the bulk insert - I am using a bulk insert statement, as you assumed. I'll put this onto the db server and check, I think that's a more future proof method. Will this affect any of my linked tables (linked via the row's primary key(id))? -----Original Message----- From: Baron Schwartz [mailto:baron@stripped]=20 Sent: 31 August 2007 14:28 To: Esbach, Brandon Cc: MySQL User Group Subject: Re: Two/more seperately unique columns in a table? Hi, Esbach, Brandon wrote: > Hi all, >=20 > I'm guessing this is a fairly common question, but I've drawn a blank=20 > so far with all workarounds possible. > Here's the rough scenario: > We have data streaming in constantly from various offsite locations. > This comprises of several details: a session, a unit of that session,=20 > and a measurement of that unit. > Data is captured and transferred on a timed process, so often data is=20 > repeated in the data transfer packets. This repeating is unavoidable=20 > as the software used to capture dumps this data for each measurement=20 > and each unit for the session it's working on. >=20 > Due to the volume, a bulk update is done using "values()" with an=20 > insert statement. >=20 > Unfortunately, often there are repeats of either session, unit, or=20 > measurement (as there could be new measurement or unit for the capture > that needs to be associated to the session). > The problem I've been experiencing is fairly straightforward (I hope): > I have two, sometimes three columns in any given record that need to=20 > always be unique. This comprises an ID (the key column), and one=20 > (depending on the table, sometimes two) GUIDs which should be unique=20 > at all times for the entire table. >=20 > I've tried setting the additional columns to be a primary key (which=20 > in turn sets them up to be "unique" when viewed under "Schema Indices" > on the MySQL Administrator tool); however this does not give an error=20 > (or fail silently with "insert ignore") when I insert a duplicate -=20 > mySQL seems quite happy to add the duplicate record. >=20 > At the moment, I'm running a process in the beginning which simply=20 > gathers all guids from the database and compares them as it runs=20 > through the data (then adds new ones as it runs).. This is hardly=20 > reliable, and also means starting the service would take several hours > to gather the existing guids at current data levels... almost=20 > frightening to think, what will end up happening as the data expands. It sounds like you need a separate primary key and unique index: create table t ( id int not null, guid char(32) not null, unique key (guid), primary key(id) ); Then you can do REPLACE or IGNORE with the LOAD DATA INFILE. I can't tell if you are actually using LOAD DATA INFILE or if your "bulk load"=20 is a big INSERT statement. If you're using an INSERT with multiple VALUES() sections, you can also use ON DUPLICATE KEY UPDATE. I agree the current strategy won't hold up well over time. Baron