List:Internals« Previous MessageNext Message »
From:Zardosht Kasheff Date:February 26 2010 3:20pm
Subject:Re: question on HA_EXTRA_IGNORE_DUP_KEY
View as plain text  
Hello Sergey and Sergei,

In handler::write_row, the row is written, while the uniqueness check
is deferred to the background or to query time (whichever comes
first). So, if the deferred uniqueness check finds a duplicate, the
inserted row is thrown away.

If the row should not be there due to a duplicate, the user will never
see that row in a query.

We do a similar thing with replace into. Only, instead of throwing
away the new row, we overwrite the old row with the new row.

-Zardosht

On Fri, Feb 26, 2010 at 6:41 AM, Sergey Vojtovich <svoj@stripped> wrote:
> Hi Zardosht,
>
> On Thu, Feb 25, 2010 at 02:31:11PM -0500, Zardosht Kasheff wrote:
>> Can you please elaborate on the problems?
> * http://dev.mysql.com/doc/refman/5.1/en/trigger-syntax.html
>  <quot>
>    An AFTER trigger is executed only if the BEFORE trigger (if any)
>    and the row operation both execute successfully.
>  </quot>
>
>  Assume we have AFTER INSERT trigger, which gets executed for a row
>  that was not actually written. It is rather confusing.
>
> * Written rows counter will count duplicate (=thrown away) rows. Users
>  may rely on this counter.
>
> * As for the auto increment - I can't provide qualified answer atm. But
>  have a strong feeling that it will not work as expected.
>
>> Normally, with Insert ignore, a disk seek is required  to verify
>> uniquness of the element (and thus the return of an error). If I do
>> not need to return an error, the disk seek is unnecessary.
> Basically I'm more interested in what happens with a record, within
> ::write_row(). If a duplicate record gets written actually and thrown
> away later. Why the above seek is absolutely required? And how the
> decision to write or throw away a record is done?
>
>> It is a performance improvement.
>>
>> I realize there are the following downsides:
>>  - the estimate of the number of rows of the table may get inaccurate
>>  - the statement will not properly return the number of duplicates found
>>
>> Are there other downsides?
> Yes, there're:
> * Rather minor, but when there was actually no row written by INSERT
>  IGNORE, query cache for this table will get invalidated.
>
> * If we get success from an engine, row will get binlogged, assuming
>  binlog-format=row. Will likely fail on slave.
>
> * There're more potential side-effects, since this looks like  API
>  violation.
>
> Regards,
> Sergey
>
>> -Zardosht
>>
>> On Thu, Feb 25, 2010 at 1:55 PM, Sergey Vojtovich <svoj@stripped> wrote:
>> > Hi Zardosht,
>> >
>> > I believe this is rather a fragile solution. I can think about problems
>> > at least with counters, wrong auto_increment. Also triggers will get
>> > processed for this non-inserted record.
>> >
>> > Just out of curiosity, what kind of optimization you're trying to
>> > achieve?
>> >
>> > Regards,
>> > Sergey
>> >
>> > On Wed, Feb 24, 2010 at 01:20:12PM -0500, Zardosht Kasheff wrote:
>> >> Ok, so if I understand this correctly, the following if clause:
>> >>
>> >> thd->lex->ignore && thd->lex->duplicates ==
> SQL_ERROR
>> >>
>> >> if it evaluates to true, I can assume that the user is doing "insert
> ignore".
>> >>
>> >> How I plan to use this is as follows. Suppose I have a table, (a int,
>> >> b int, primary key (a)), that has the element (1,1) inserted.
>> >>
>> >> Given the following statements:
>> >> insert ignore into table values (2,2), (1,2);
>> >> select * from table;
>> >>
>> >> The expected result of the select * is to return (1,1),(2,2). The
>> >> insert will make two calls to handler::write_row. For (2,2),
>> >> handler::write_row will insert the element, and return success. With
>> >> other storage engines, for (1,2), handler::write_row will insert
>> >> nothing and return HA_ERR_FOUND_DUPP_KEY. What my change will do is
>> >> still insert nothing, but instead of returning HA_ERR_FOUND_DUPP_KEY,
>> >> it will return success.
>> >>
>> >> I assume there is nothing wrong with this?
>> >>
>> >> Thanks
>> >> -Zardosht
>> >>
>> >> On Wed, Feb 24, 2010 at 1:05 PM, Venu Kalyan <mydb08@stripped>
> wrote:
>> >> > Hi !
>> >> >
>> >> > On Tue, Feb 23, 2010 at 3:23 PM, Zardosht Kasheff
> <zardosht@stripped> wrote:
>> >> >> But does this distinguish the commands "insert ignore" and
> "insert on
>> >> >> duplicate key update"?
>> >> >
>> >> > Actually its contrary and unknown behavior even though MySQL
> allows
>> >> > the synax to use IGNORE and ON DUPLICATE KEY / REPLACE together in
>> >> > single statement, and it should in general through an error when
> used
>> >> > together as server does not which one to prefer.
>> >> >
>> >> > In this case (Ignore and ON DUPLICATE KEY together in single
> insert)
>> >> > ON DUPLICATE KEY takes precedence in the code due to the following
>> >> > condition..
>> >> >
>> >> > if (duplic != DUP_ERROR || ignore)
>> >> >    
>  table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
>> >> >
>> >> > For your case;  you can validate this using :
>> >> >
>> >> > if (thd->lex->ignore && thd->lex->duplicates ==
> SQL_ERROR) {
>> >> >  // safe to ignore my dups
>> >> > }
>> >> >
>> >> > if (thd->lex->ignore && thd->lex->duplicates !=
> SQL_ERROR) {
>> >> > // used ignore together with IGNORE and ON KEY DUPLICATE / REPLACE
>> >> > statements, I will decide what needs to be done here
>> >> > }
>> >> >
>> >> > Hope this helps.
>> >> >
>> >> > Thanks
>> >> > Venu Anuganti
>> >> >
>> >> >>
>> >> >> On Tue, Feb 23, 2010 at 6:20 PM, Venu Kalyan
> <mydb08@stripped> wrote:
>> >> >>> You should look for both sql_command type and
> lex->ignore value (both are
>> >> >>> part of thd->lex->sql_command and
> thd->lex->ignore) to determine if the
>> >> >>> storage engine can safely return success on dup cases. But
> remember, doing
>> >> >>> so will result in wrong stats (number of rows
> inserted/updated)
>> >> >>>
>> >> >>> Apart from insert/replace, load data also uses the same
> logic.
>> >> >>>
>> >> >>> On Tue, Feb 23, 2010 at 12:07 PM, Zardosht Kasheff
> <zardosht@stripped>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Hello,
>> >> >>>>
>> >> >>>> Does anyone have any guidance here?
>> >> >>>>
>> >> >>>> Thanks
>> >> >>>> -Zardosht
>> >> >>>>
>> >> >>>> On Wed, Feb 17, 2010 at 3:47 PM, Zardosht Kasheff
> <zardosht@stripped>
>> >> >>>> wrote:
>> >> >>>> > So it seems that just looking at this flag will
> not work, because
>> >> >>>> > INSERT IGNORE and INSERT...ON DUPLICATE KEY UPDATE
> both set this flag.
>> >> >>>> >
>> >> >>>> > I guess what I am wondering is the following, is
> there any way for the
>> >> >>>> > handler to know for sure that the user has called
> "insert ignore", and
>> >> >>>> > if so, would it be ok for the handler to return
> success even if
>> >> >>>> > internally a duplicate key is found?
>> >> >>>> >
>> >> >>>> > Thanks
>> >> >>>> > -Zardosht
>> >> >>>> >
>> >> >>>> > On Fri, Feb 12, 2010 at 5:08 PM, Zardosht Kasheff
> <zardosht@stripped>
>> >> >>>> > wrote:
>> >> >>>> >> Hello all,
>> >> >>>> >>
>> >> >>>> >> I am trying to implement an optimization in
> our storage engine if
>> >> >>>> >> HA_EXTRA_IGNORE_DUP_KEY is set. Would it be ok
> if handler::write_row
>> >> >>>> >> returns success if this flag is set, the sql
> statement is NOT
>> >> >>>> >> SQLCOM_REPLACE or SQLCOM_REPLACE_SELECT, and
> the row is not inserted
>> >> >>>> >> if a duplicate is otherwise there.
>> >> >>>> >>
>> >> >>>> >> Basically, the question is if this flag is
> set, is it ok to not return
>> >> >>>> >> HA_ERR_FOUND_DUPP_KEY in the cases where the
> statement is not "replace
>> >> >>>> >> into".
>> >> >>>> >>
>> >> >>>> >> Thanks
>> >> >>>> >> -Zardosht
>> >> >>>> >>
>> >> >>>> >
>> >> >>>>
>> >> >>>> --
>> >> >>>> MySQL Internals Mailing List
>> >> >>>> For list archives: http://lists.mysql.com/internals
>> >> >>>> To unsubscribe:  
>  http://lists.mysql.com/internals?unsub=1
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >>
>> >> >
>> >>
>> >> --
>> >> MySQL Internals Mailing List
>> >> For list archives: http://lists.mysql.com/internals
>> >> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
>> >>
>> >
>> > --
>> > Sergey Vojtovich <svoj@stripped>
>> > MySQL AB, Software Engineer
>> > Izhevsk, Russia, www.mysql.com
>> >
>>
>> --
>> MySQL Internals Mailing List
>> For list archives: http://lists.mysql.com/internals
>> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
>>
>
> --
> Sergey Vojtovich <svoj@stripped>
> MySQL AB, Software Engineer
> Izhevsk, Russia, www.mysql.com
>
Thread
question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff12 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff17 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff23 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich25 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergei Golubchik26 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich26 Feb
            • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff26 Feb
              • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff2 Mar
                  • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                    • Re: question on HA_EXTRA_IGNORE_DUP_KEYMichael Widenius19 Mar
                      • Configure error during the buildHiromichi Watari21 Mar
                        • Re: Configure error during the buildMikiya Okuno21 Mar
                          • Re: Configure error during the buildHiromichi Watari22 Mar