List:Internals« Previous MessageNext Message »
From:Zardosht Kasheff Date:March 2 2010 1:11pm
Subject:Re: question on HA_EXTRA_IGNORE_DUP_KEY
View as plain text  
Are there other problems besides what has been listed? I ask because
the performance advantage we can gain from this optimization is huge.

Thanks
-Zardosht

On Tue, Mar 2, 2010 at 4:08 AM, Sergey Vojtovich <svoj@stripped> wrote:
> Hi Zardosht,
>
> thanks for this explanation. Sounds like a nice idea, but still it will
> not work with all MySQL subsystems correctly.
>
> Regards,
> Sergey
>
> On Fri, Feb 26, 2010 at 10:20:40AM -0500, Zardosht Kasheff wrote:
>> Hello Sergey and Sergei,
>>
>> In handler::write_row, the row is written, while the uniqueness check
>> is deferred to the background or to query time (whichever comes
>> first). So, if the deferred uniqueness check finds a duplicate, the
>> inserted row is thrown away.
>>
>> If the row should not be there due to a duplicate, the user will never
>> see that row in a query.
>>
>> We do a similar thing with replace into. Only, instead of throwing
>> away the new row, we overwrite the old row with the new row.
>>
>> -Zardosht
>>
>> On Fri, Feb 26, 2010 at 6:41 AM, Sergey Vojtovich <svoj@stripped> wrote:
>> > Hi Zardosht,
>> >
>> > On Thu, Feb 25, 2010 at 02:31:11PM -0500, Zardosht Kasheff wrote:
>> >> Can you please elaborate on the problems?
>> > * http://dev.mysql.com/doc/refman/5.1/en/trigger-syntax.html
>> >  <quot>
>> >    An AFTER trigger is executed only if the BEFORE trigger (if
> any)
>> >    and the row operation both execute successfully.
>> >  </quot>
>> >
>> >  Assume we have AFTER INSERT trigger, which gets executed for a row
>> >  that was not actually written. It is rather confusing.
>> >
>> > * Written rows counter will count duplicate (=thrown away) rows. Users
>> >  may rely on this counter.
>> >
>> > * As for the auto increment - I can't provide qualified answer atm. But
>> >  have a strong feeling that it will not work as expected.
>> >
>> >> Normally, with Insert ignore, a disk seek is required  to verify
>> >> uniquness of the element (and thus the return of an error). If I do
>> >> not need to return an error, the disk seek is unnecessary.
>> > Basically I'm more interested in what happens with a record, within
>> > ::write_row(). If a duplicate record gets written actually and thrown
>> > away later. Why the above seek is absolutely required? And how the
>> > decision to write or throw away a record is done?
>> >
>> >> It is a performance improvement.
>> >>
>> >> I realize there are the following downsides:
>> >>  - the estimate of the number of rows of the table may get
> inaccurate
>> >>  - the statement will not properly return the number of duplicates
> found
>> >>
>> >> Are there other downsides?
>> > Yes, there're:
>> > * Rather minor, but when there was actually no row written by INSERT
>> >  IGNORE, query cache for this table will get invalidated.
>> >
>> > * If we get success from an engine, row will get binlogged, assuming
>> >  binlog-format=row. Will likely fail on slave.
>> >
>> > * There're more potential side-effects, since this looks like  API
>> >  violation.
>> >
>> > Regards,
>> > Sergey
>> >
>> >> -Zardosht
>> >>
>> >> On Thu, Feb 25, 2010 at 1:55 PM, Sergey Vojtovich <svoj@stripped>
> wrote:
>> >> > Hi Zardosht,
>> >> >
>> >> > I believe this is rather a fragile solution. I can think about
> problems
>> >> > at least with counters, wrong auto_increment. Also triggers will
> get
>> >> > processed for this non-inserted record.
>> >> >
>> >> > Just out of curiosity, what kind of optimization you're trying to
>> >> > achieve?
>> >> >
>> >> > Regards,
>> >> > Sergey
>> >> >
>> >> > On Wed, Feb 24, 2010 at 01:20:12PM -0500, Zardosht Kasheff wrote:
>> >> >> Ok, so if I understand this correctly, the following if
> clause:
>> >> >>
>> >> >> thd->lex->ignore && thd->lex->duplicates ==
> SQL_ERROR
>> >> >>
>> >> >> if it evaluates to true, I can assume that the user is doing
> "insert ignore".
>> >> >>
>> >> >> How I plan to use this is as follows. Suppose I have a table,
> (a int,
>> >> >> b int, primary key (a)), that has the element (1,1) inserted.
>> >> >>
>> >> >> Given the following statements:
>> >> >> insert ignore into table values (2,2), (1,2);
>> >> >> select * from table;
>> >> >>
>> >> >> The expected result of the select * is to return (1,1),(2,2).
> The
>> >> >> insert will make two calls to handler::write_row. For (2,2),
>> >> >> handler::write_row will insert the element, and return success.
> With
>> >> >> other storage engines, for (1,2), handler::write_row will
> insert
>> >> >> nothing and return HA_ERR_FOUND_DUPP_KEY. What my change will
> do is
>> >> >> still insert nothing, but instead of returning
> HA_ERR_FOUND_DUPP_KEY,
>> >> >> it will return success.
>> >> >>
>> >> >> I assume there is nothing wrong with this?
>> >> >>
>> >> >> Thanks
>> >> >> -Zardosht
>> >> >>
>> >> >> On Wed, Feb 24, 2010 at 1:05 PM, Venu Kalyan
> <mydb08@stripped> wrote:
>> >> >> > Hi !
>> >> >> >
>> >> >> > On Tue, Feb 23, 2010 at 3:23 PM, Zardosht Kasheff
> <zardosht@stripped> wrote:
>> >> >> >> But does this distinguish the commands "insert ignore"
> and "insert on
>> >> >> >> duplicate key update"?
>> >> >> >
>> >> >> > Actually its contrary and unknown behavior even though
> MySQL allows
>> >> >> > the synax to use IGNORE and ON DUPLICATE KEY / REPLACE
> together in
>> >> >> > single statement, and it should in general through an
> error when used
>> >> >> > together as server does not which one to prefer.
>> >> >> >
>> >> >> > In this case (Ignore and ON DUPLICATE KEY together in
> single insert)
>> >> >> > ON DUPLICATE KEY takes precedence in the code due to the
> following
>> >> >> > condition..
>> >> >> >
>> >> >> > if (duplic != DUP_ERROR || ignore)
>> >> >> >    
>  table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
>> >> >> >
>> >> >> > For your case;  you can validate this using :
>> >> >> >
>> >> >> > if (thd->lex->ignore &&
> thd->lex->duplicates == SQL_ERROR) {
>> >> >> >  // safe to ignore my dups
>> >> >> > }
>> >> >> >
>> >> >> > if (thd->lex->ignore &&
> thd->lex->duplicates != SQL_ERROR) {
>> >> >> > // used ignore together with IGNORE and ON KEY DUPLICATE /
> REPLACE
>> >> >> > statements, I will decide what needs to be done here
>> >> >> > }
>> >> >> >
>> >> >> > Hope this helps.
>> >> >> >
>> >> >> > Thanks
>> >> >> > Venu Anuganti
>> >> >> >
>> >> >> >>
>> >> >> >> On Tue, Feb 23, 2010 at 6:20 PM, Venu Kalyan
> <mydb08@stripped> wrote:
>> >> >> >>> You should look for both sql_command type and
> lex->ignore value (both are
>> >> >> >>> part of thd->lex->sql_command and
> thd->lex->ignore) to determine if the
>> >> >> >>> storage engine can safely return success on dup
> cases. But remember, doing
>> >> >> >>> so will result in wrong stats (number of rows
> inserted/updated)
>> >> >> >>>
>> >> >> >>> Apart from insert/replace, load data also uses the
> same logic.
>> >> >> >>>
>> >> >> >>> On Tue, Feb 23, 2010 at 12:07 PM, Zardosht Kasheff
> <zardosht@stripped>
>> >> >> >>> wrote:
>> >> >> >>>>
>> >> >> >>>> Hello,
>> >> >> >>>>
>> >> >> >>>> Does anyone have any guidance here?
>> >> >> >>>>
>> >> >> >>>> Thanks
>> >> >> >>>> -Zardosht
>> >> >> >>>>
>> >> >> >>>> On Wed, Feb 17, 2010 at 3:47 PM, Zardosht
> Kasheff <zardosht@stripped>
>> >> >> >>>> wrote:
>> >> >> >>>> > So it seems that just looking at this
> flag will not work, because
>> >> >> >>>> > INSERT IGNORE and INSERT...ON DUPLICATE
> KEY UPDATE both set this flag.
>> >> >> >>>> >
>> >> >> >>>> > I guess what I am wondering is the
> following, is there any way for the
>> >> >> >>>> > handler to know for sure that the user
> has called "insert ignore", and
>> >> >> >>>> > if so, would it be ok for the handler to
> return success even if
>> >> >> >>>> > internally a duplicate key is found?
>> >> >> >>>> >
>> >> >> >>>> > Thanks
>> >> >> >>>> > -Zardosht
>> >> >> >>>> >
>> >> >> >>>> > On Fri, Feb 12, 2010 at 5:08 PM, Zardosht
> Kasheff <zardosht@stripped>
>> >> >> >>>> > wrote:
>> >> >> >>>> >> Hello all,
>> >> >> >>>> >>
>> >> >> >>>> >> I am trying to implement an
> optimization in our storage engine if
>> >> >> >>>> >> HA_EXTRA_IGNORE_DUP_KEY is set. Would
> it be ok if handler::write_row
>> >> >> >>>> >> returns success if this flag is set,
> the sql statement is NOT
>> >> >> >>>> >> SQLCOM_REPLACE or
> SQLCOM_REPLACE_SELECT, and the row is not inserted
>> >> >> >>>> >> if a duplicate is otherwise there.
>> >> >> >>>> >>
>> >> >> >>>> >> Basically, the question is if this
> flag is set, is it ok to not return
>> >> >> >>>> >> HA_ERR_FOUND_DUPP_KEY in the cases
> where the statement is not "replace
>> >> >> >>>> >> into".
>> >> >> >>>> >>
>> >> >> >>>> >> Thanks
>> >> >> >>>> >> -Zardosht
>> >> >> >>>> >>
>> >> >> >>>> >
>> >> >> >>>>
>> >> >> >>>> --
>> >> >> >>>> MySQL Internals Mailing List
>> >> >> >>>> For list archives:
> http://lists.mysql.com/internals
>> >> >> >>>> To unsubscribe:  
>  http://lists.mysql.com/internals?unsub=1
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> MySQL Internals Mailing List
>> >> >> For list archives: http://lists.mysql.com/internals
>> >> >> To unsubscribe:  
>  http://lists.mysql.com/internals?unsub=1
>> >> >>
>> >> >
>> >> > --
>> >> > Sergey Vojtovich <svoj@stripped>
>> >> > MySQL AB, Software Engineer
>> >> > Izhevsk, Russia, www.mysql.com
>> >> >
>> >>
>> >> --
>> >> MySQL Internals Mailing List
>> >> For list archives: http://lists.mysql.com/internals
>> >> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
>> >>
>> >
>> > --
>> > Sergey Vojtovich <svoj@stripped>
>> > MySQL AB, Software Engineer
>> > Izhevsk, Russia, www.mysql.com
>> >
>
> --
> Sergey Vojtovich <svoj@stripped>
> MySQL AB, Software Engineer
> Izhevsk, Russia, www.mysql.com
>
Thread
question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff12 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff17 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff23 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich25 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergei Golubchik26 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich26 Feb
            • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff26 Feb
              • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff2 Mar
                  • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                    • Re: question on HA_EXTRA_IGNORE_DUP_KEYMichael Widenius19 Mar
                      • Configure error during the buildHiromichi Watari21 Mar
                        • Re: Configure error during the buildMikiya Okuno21 Mar
                          • Re: Configure error during the buildHiromichi Watari22 Mar