List:Commits« Previous MessageNext Message »
From:Sergei Golubchik Date:June 21 2007 6:27pm
Subject:Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827
View as plain text  
Hi!

On Jun 21, Ingo Strüwing wrote:
> > On Jun 16, ingo@stripped wrote:
> >> ChangeSet@stripped, 2007-06-16 21:55:04+02:00, istruewing@stripped +9 -0
> >>   Bug#26827 - table->read_set is set incorrectly,
> >>                 causing update of a different column
> >>   Bug#28158 - table->read_set is set incorrectly,
> >>                 causing wrong error message in Falcon
> >>   Bug#28165 - Falcon: hang after select for update
> >>   Bug#28971 - ALTER TABLE followed by UPDATE for a CSV table
> >>               make server crash
> > 
> > Did you do it in 5.1 ? It should be pushed in 5.1, because ndb and csv
> > are also affected.
> 
> Shouldn't be a problem to do that when we have a patch that we agree upon.
> 
> > I don't see where you've fixed NDB case. As far as I understand you
> > solution is HA_REQUIRES_ALL_COLUMNS_FOR_UPDATE flag, which is enough for
> > Falcon. But it's too strong for NDB, it doesn't need all columns for
> > update.
> 
> Which bug report mentions this problem for NDB? I just read of an
> objection by NDB to extend read_set for all engines (in Bug#28158).

wrong error message happens in NDB as well. I'll add a note to the
bugreport.

> ...
> >> +/* Reads full record only if all columns are set in read_set. */
> >> +#define HA_REQUIRES_ALL_COLUMNS_FOR_UPDATE (LL(1) << 34)
> > 
> > I actually like Svoj's solution better. There's a limitation of the
> > storage engine, and a trivial two-line workaround is implemented in the
> > handler or the storage engine. Instead of adding a feature or a flag to
> > the API for everything a storage engine writers can come up with.
> 
> Ok. I will try to remember that it is ok to introduce new flags that are
> not used anywhere (HA_REQUIRES_KEY_COLUMNS_FOR_DELETE), but not new
> flags that are used twice at least.

:)

HA_REQUIRES_KEY_COLUMNS_FOR_DELETE probably wouldn't be added if I'd
noticed that in time.

But as we're going to refactor the handler API, unnecessary flags will
be removed there, in a due course.
 
> In this conjunction I wonder why it was allowed to introduce
> HA_PRIMARY_KEY_REQUIRED_FOR_DELETE, which seems also to be a special
> case for two engines only.

Because we don't have a clear set of guidelines about when and how the
API can be modified. Also, nobody "owns" the API, so anybody can modify
it any way he wants.

> Another thing worries me. In Bug#28158 you wrote that it is a bug in the
> server. Now that I fixed it in the server, you want it fixed in the
> engines. :-(

I only wrote that about Bug#28158.
With that flag you were fixing a different bug - Bug#26827. Your fix
closed Bug#28158, but not completely - not for all storage engines.

> ...
> >> +    We MUST NOT skip update_row for a transactional engine! If another
> >> +    thread has done the same update (and hence we would skip update_row),
> >> +    the engine would not notice the changed rows and continue to work
> >> +    with the original ones.
> >>    */
> > 
> > I don't understand that :(
> > You said:
> > 
> > <ingo> Assume a simple table with one row containing 'x'.
> > <ingo> Two transactions start.
> > <ingo> One updates 'x' to 'a'.
> > <ingo> The other wants to update 'x' to 'a' too. But when it selects the
> >        updated row for update, it has to wait for the other transaction
> >        to commit.
> > <ingo> After the commit, it updates 'a' to 'a'.
> > <ingo> Now it knows that the row contains 'a'.
> > <ingo> It duplicates the row with insert-select.
> > <ingo> It commits and we have two rows with 'a'.
> > <ingo> If the server tries to optimize and notices in 2nd transaction
> >        that 1st trans has changed to 'a' already, 2nd transaction takes
> >        no notice from the change and works with 'x'.
> > 
> > No, why ? The second transaction would read 'a' in the first place. It
> > would try to select the row for update, wait for the first transaction
> > commit, and when it would finally be able to read the row it would see
> > 'a'. It would never see 'x' in the table.
> 
> I must admit that I am nor a SQL guru neither a transaction guru. I just
> *feel* that there is something not as expected.
> 
> First a test case which *feels* correct to me. Both InnoDB and Falcon
> produce the same result:
> 
> DROP TABLE IF EXISTS t1;
> ###########
> # default
> ###########
> SET @@autocommit = 1;
> CREATE TABLE t1 (c1 INT) ENGINE = InnoDB;
> INSERT INTO t1 VALUES (1);
> #########
> # conn1
> #########
> SET @@autocommit = 0;
> ### Start transaction with one row with '1'
> SELECT * FROM t1;
> c1
> 1
> UPDATE t1 SET c1 = 2;
> ### Should have one row with '2', not yet committed
> SELECT * FROM t1;
> c1
> 2
> INSERT INTO t1 SELECT * FROM t1;
> ### Should have two rows with '2', not yet committed
> SELECT * FROM t1;
> c1
> 2
> 2
> #########
> # conn2
> #########
> SET @@autocommit = 0;
> SELECT @@tx_isolation;
> @@tx_isolation
> REPEATABLE-READ
> ### Start transaction with one row with '1'
> SELECT * FROM t1;
> c1
> 1
> ### When selecting rows for update, need to wait for commit.
> UPDATE t1 SET c1 = 2;

I don't know how this should work.
This seems to be very close to non-repeatable read phenomenon

==========[definition from the standard]=======
2) P2 ("Non-repeatable read"): SQL-transaction T1 reads a row.
   SQL-transaction T2 then modifies or deletes that row and performs
   a COMMIT. If T1 then attempts to reread the row, it may receive the
   modified value or discover that the row has been deleted.
===============================================

I also tried a changed version of it, with SET c1=c1+1,
and with DELETE FROM t1 in conn1. With c1=c1+1, the resulting value
was 3. Which looks like conn2 has "received the modified value", doesn't
it ? With DELETE in another transaction my UPDATE returned "0 rows
updated", which I take as "discover that the row has been deleted".
Besides, SELECT was still returning one row, which is plain
inconsistent.

I'd say, the safest action here is to return an error, saying "the row
was changed (and committed) in another transaction. Cannot update the
history".

At least in REPEATABLE READ isolation level.

> #########
> # conn1
> #########
> COMMIT;
> ### Start transaction with two rows with '2'
> SELECT * FROM t1;
> c1
> 2
> 2
> #########
> # conn2
> #########
> ### Since we waited for commit, we do now see the committed rows
> ### Should have two rows with '2', not yet committed

Not really. 'I' in 'ACID' means 'Isolation', every transaction should be
executed as if there were no other transactions. That is, it doesn't
matter if conn1 transaction has committed or not, because conn2 has
started before conn1 committed, it should not see inserted rows.
Ideally.

But this is REPEATABLE READ, and according to the
Standard "phantoms" (when you do a SELECT, repeat it later again and see
more rows satisfying your WHERE condition) are allowed at this isolation
level. Allowed, not required. Which means whether conn2 will see two
rows or one is implementation defined.

> SELECT * FROM t1;
> c1
> 2
> 2
> INSERT INTO t1 SELECT * FROM t1;
... 
> Now I leave it to you to decide if this is expected behavior (or at
> least acceptable behavior). I though it was not. Hence my attempt to fix it.

In my opinion if UPDATE doesn't fail, it's violation of the standard.
If it succeeds in either way it's wrong, no need to change one wrong
behavior to another wrong behavior.

> May I please add a plea? Please look again at my description that you
> quoted above. Please tell me what I should have added so that you would
> have understood what I meant. Or is it still unclear?

I think it's just the subject being complex. It would be helpful if such
a comment would include a small - few lines - test case. something like

  conn1> select * from t1;
  conn1> update t1 set c1=2;
  conn2> select * from t1;
  conn2> update t1 set c1=2;
  conn1> commit;
  conn2> select * from t1; -- now we expect to see XXX

> ===
> 
> Regarding the solution for Bug#28158 (wrong error message due to wrong
> read_set), can you please expand on your suggestion in the bug report:
> "
> But we'll need to wrap rnd_pos in rnd_init/rnd_end. Which basically
> means opening another cursor (as we may need to continue the scan on the
> first one). Which is handler::clone at the moment :(
> "
> Do you suggest that we do bitmap_set_all(read_set); handler= clone();
> rnd_pos(); delete handler; to retrieve a full row? Do that wherever we
> saw a duplicate key error? Or do it in handler::print_error()?

basically yes. few thoughts:
- bitmap_set_all can be done after clone
- after bitmap_set_all you need to call column_bitmaps_signal() on the
  cloned handler
- you need to clone only once, on the first duplicate key error
- you don't need bitmap_set_all, that is, no need to set ALL bits, it's
  enough to set only bits for fields that are part of the UNIQUE key.
  But as we don't expect duplicate key error to happen many times, this
  optimization isn't important.
- in any case all the above can be skipped if the necessary bit is
  already set in read_set. That is, for MyISAM, HEAP, and other storage
  engines that always read the full row the above should not be executed
  at all.

Regards / Mit vielen Grüssen,
Sergei

-- 
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <serg@stripped>
 / /|_/ / // /\ \/ /_/ / /__  Principal Software Developer
/_/  /_/\_, /___/\___\_\___/  MySQL GmbH, Radlkoferstr. 2, D-81373 München
       <___/                  Geschäftsführer: Kaj Arnö - HRB
München 162140
Thread
Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Ingo Strüwing21 Jun
  • Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Sergei Golubchik21 Jun
    • Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Ingo Strüwing22 Jun
Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Sergei Golubchik22 Jun
  • Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Ingo Strüwing22 Jun
    • Re: bk commit - 6.0-falcon tree (istruewing:1.2563) BUG#26827Sergei Golubchik22 Jun