List:Internals« Previous MessageNext Message »
From:Sergey Vojtovich Date:March 2 2010 9:08am
Subject:Re: question on HA_EXTRA_IGNORE_DUP_KEY
View as plain text  
Hi Zardosht,

thanks for this explanation. Sounds like a nice idea, but still it will
not work with all MySQL subsystems correctly.

Regards,
Sergey

On Fri, Feb 26, 2010 at 10:20:40AM -0500, Zardosht Kasheff wrote:
> Hello Sergey and Sergei,
> 
> In handler::write_row, the row is written, while the uniqueness check
> is deferred to the background or to query time (whichever comes
> first). So, if the deferred uniqueness check finds a duplicate, the
> inserted row is thrown away.
> 
> If the row should not be there due to a duplicate, the user will never
> see that row in a query.
> 
> We do a similar thing with replace into. Only, instead of throwing
> away the new row, we overwrite the old row with the new row.
> 
> -Zardosht
> 
> On Fri, Feb 26, 2010 at 6:41 AM, Sergey Vojtovich <svoj@stripped> wrote:
> > Hi Zardosht,
> >
> > On Thu, Feb 25, 2010 at 02:31:11PM -0500, Zardosht Kasheff wrote:
> >> Can you please elaborate on the problems?
> > * http://dev.mysql.com/doc/refman/5.1/en/trigger-syntax.html
> >  <quot>
> >    An AFTER trigger is executed only if the BEFORE trigger (if any)
> >    and the row operation both execute successfully.
> >  </quot>
> >
> >  Assume we have AFTER INSERT trigger, which gets executed for a row
> >  that was not actually written. It is rather confusing.
> >
> > * Written rows counter will count duplicate (=thrown away) rows. Users
> >  may rely on this counter.
> >
> > * As for the auto increment - I can't provide qualified answer atm. But
> >  have a strong feeling that it will not work as expected.
> >
> >> Normally, with Insert ignore, a disk seek is required  to verify
> >> uniquness of the element (and thus the return of an error). If I do
> >> not need to return an error, the disk seek is unnecessary.
> > Basically I'm more interested in what happens with a record, within
> > ::write_row(). If a duplicate record gets written actually and thrown
> > away later. Why the above seek is absolutely required? And how the
> > decision to write or throw away a record is done?
> >
> >> It is a performance improvement.
> >>
> >> I realize there are the following downsides:
> >>  - the estimate of the number of rows of the table may get inaccurate
> >>  - the statement will not properly return the number of duplicates
> found
> >>
> >> Are there other downsides?
> > Yes, there're:
> > * Rather minor, but when there was actually no row written by INSERT
> >  IGNORE, query cache for this table will get invalidated.
> >
> > * If we get success from an engine, row will get binlogged, assuming
> >  binlog-format=row. Will likely fail on slave.
> >
> > * There're more potential side-effects, since this looks like  API
> >  violation.
> >
> > Regards,
> > Sergey
> >
> >> -Zardosht
> >>
> >> On Thu, Feb 25, 2010 at 1:55 PM, Sergey Vojtovich <svoj@stripped>
> wrote:
> >> > Hi Zardosht,
> >> >
> >> > I believe this is rather a fragile solution. I can think about
> problems
> >> > at least with counters, wrong auto_increment. Also triggers will get
> >> > processed for this non-inserted record.
> >> >
> >> > Just out of curiosity, what kind of optimization you're trying to
> >> > achieve?
> >> >
> >> > Regards,
> >> > Sergey
> >> >
> >> > On Wed, Feb 24, 2010 at 01:20:12PM -0500, Zardosht Kasheff wrote:
> >> >> Ok, so if I understand this correctly, the following if clause:
> >> >>
> >> >> thd->lex->ignore && thd->lex->duplicates ==
> SQL_ERROR
> >> >>
> >> >> if it evaluates to true, I can assume that the user is doing
> "insert ignore".
> >> >>
> >> >> How I plan to use this is as follows. Suppose I have a table, (a
> int,
> >> >> b int, primary key (a)), that has the element (1,1) inserted.
> >> >>
> >> >> Given the following statements:
> >> >> insert ignore into table values (2,2), (1,2);
> >> >> select * from table;
> >> >>
> >> >> The expected result of the select * is to return (1,1),(2,2). The
> >> >> insert will make two calls to handler::write_row. For (2,2),
> >> >> handler::write_row will insert the element, and return success.
> With
> >> >> other storage engines, for (1,2), handler::write_row will insert
> >> >> nothing and return HA_ERR_FOUND_DUPP_KEY. What my change will do
> is
> >> >> still insert nothing, but instead of returning
> HA_ERR_FOUND_DUPP_KEY,
> >> >> it will return success.
> >> >>
> >> >> I assume there is nothing wrong with this?
> >> >>
> >> >> Thanks
> >> >> -Zardosht
> >> >>
> >> >> On Wed, Feb 24, 2010 at 1:05 PM, Venu Kalyan
> <mydb08@stripped> wrote:
> >> >> > Hi !
> >> >> >
> >> >> > On Tue, Feb 23, 2010 at 3:23 PM, Zardosht Kasheff
> <zardosht@stripped> wrote:
> >> >> >> But does this distinguish the commands "insert ignore" and
> "insert on
> >> >> >> duplicate key update"?
> >> >> >
> >> >> > Actually its contrary and unknown behavior even though MySQL
> allows
> >> >> > the synax to use IGNORE and ON DUPLICATE KEY / REPLACE
> together in
> >> >> > single statement, and it should in general through an error
> when used
> >> >> > together as server does not which one to prefer.
> >> >> >
> >> >> > In this case (Ignore and ON DUPLICATE KEY together in single
> insert)
> >> >> > ON DUPLICATE KEY takes precedence in the code due to the
> following
> >> >> > condition..
> >> >> >
> >> >> > if (duplic != DUP_ERROR || ignore)
> >> >> >    
>  table->file->extra(HA_EXTRA_NO_IGNORE_DUP_KEY);
> >> >> >
> >> >> > For your case;  you can validate this using :
> >> >> >
> >> >> > if (thd->lex->ignore &&
> thd->lex->duplicates == SQL_ERROR) {
> >> >> >  // safe to ignore my dups
> >> >> > }
> >> >> >
> >> >> > if (thd->lex->ignore &&
> thd->lex->duplicates != SQL_ERROR) {
> >> >> > // used ignore together with IGNORE and ON KEY DUPLICATE /
> REPLACE
> >> >> > statements, I will decide what needs to be done here
> >> >> > }
> >> >> >
> >> >> > Hope this helps.
> >> >> >
> >> >> > Thanks
> >> >> > Venu Anuganti
> >> >> >
> >> >> >>
> >> >> >> On Tue, Feb 23, 2010 at 6:20 PM, Venu Kalyan
> <mydb08@stripped> wrote:
> >> >> >>> You should look for both sql_command type and
> lex->ignore value (both are
> >> >> >>> part of thd->lex->sql_command and
> thd->lex->ignore) to determine if the
> >> >> >>> storage engine can safely return success on dup cases.
> But remember, doing
> >> >> >>> so will result in wrong stats (number of rows
> inserted/updated)
> >> >> >>>
> >> >> >>> Apart from insert/replace, load data also uses the
> same logic.
> >> >> >>>
> >> >> >>> On Tue, Feb 23, 2010 at 12:07 PM, Zardosht Kasheff
> <zardosht@stripped>
> >> >> >>> wrote:
> >> >> >>>>
> >> >> >>>> Hello,
> >> >> >>>>
> >> >> >>>> Does anyone have any guidance here?
> >> >> >>>>
> >> >> >>>> Thanks
> >> >> >>>> -Zardosht
> >> >> >>>>
> >> >> >>>> On Wed, Feb 17, 2010 at 3:47 PM, Zardosht Kasheff
> <zardosht@stripped>
> >> >> >>>> wrote:
> >> >> >>>> > So it seems that just looking at this flag
> will not work, because
> >> >> >>>> > INSERT IGNORE and INSERT...ON DUPLICATE KEY
> UPDATE both set this flag.
> >> >> >>>> >
> >> >> >>>> > I guess what I am wondering is the following,
> is there any way for the
> >> >> >>>> > handler to know for sure that the user has
> called "insert ignore", and
> >> >> >>>> > if so, would it be ok for the handler to
> return success even if
> >> >> >>>> > internally a duplicate key is found?
> >> >> >>>> >
> >> >> >>>> > Thanks
> >> >> >>>> > -Zardosht
> >> >> >>>> >
> >> >> >>>> > On Fri, Feb 12, 2010 at 5:08 PM, Zardosht
> Kasheff <zardosht@stripped>
> >> >> >>>> > wrote:
> >> >> >>>> >> Hello all,
> >> >> >>>> >>
> >> >> >>>> >> I am trying to implement an optimization
> in our storage engine if
> >> >> >>>> >> HA_EXTRA_IGNORE_DUP_KEY is set. Would it
> be ok if handler::write_row
> >> >> >>>> >> returns success if this flag is set, the
> sql statement is NOT
> >> >> >>>> >> SQLCOM_REPLACE or SQLCOM_REPLACE_SELECT,
> and the row is not inserted
> >> >> >>>> >> if a duplicate is otherwise there.
> >> >> >>>> >>
> >> >> >>>> >> Basically, the question is if this flag
> is set, is it ok to not return
> >> >> >>>> >> HA_ERR_FOUND_DUPP_KEY in the cases where
> the statement is not "replace
> >> >> >>>> >> into".
> >> >> >>>> >>
> >> >> >>>> >> Thanks
> >> >> >>>> >> -Zardosht
> >> >> >>>> >>
> >> >> >>>> >
> >> >> >>>>
> >> >> >>>> --
> >> >> >>>> MySQL Internals Mailing List
> >> >> >>>> For list archives:
> http://lists.mysql.com/internals
> >> >> >>>> To unsubscribe:  
>  http://lists.mysql.com/internals?unsub=1
> >> >> >>>>
> >> >> >>>
> >> >> >>>
> >> >> >>
> >> >> >
> >> >>
> >> >> --
> >> >> MySQL Internals Mailing List
> >> >> For list archives: http://lists.mysql.com/internals
> >> >> To unsubscribe:  
>  http://lists.mysql.com/internals?unsub=1
> >> >>
> >> >
> >> > --
> >> > Sergey Vojtovich <svoj@stripped>
> >> > MySQL AB, Software Engineer
> >> > Izhevsk, Russia, www.mysql.com
> >> >
> >>
> >> --
> >> MySQL Internals Mailing List
> >> For list archives: http://lists.mysql.com/internals
> >> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
> >>
> >
> > --
> > Sergey Vojtovich <svoj@stripped>
> > MySQL AB, Software Engineer
> > Izhevsk, Russia, www.mysql.com
> >

-- 
Sergey Vojtovich <svoj@stripped>
MySQL AB, Software Engineer
Izhevsk, Russia, www.mysql.com
Thread
question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff12 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff17 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff23 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
  • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
    • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff24 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYVenu Kalyan24 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
      • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich25 Feb
        • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff25 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergei Golubchik26 Feb
          • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich26 Feb
            • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff26 Feb
              • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                • Re: question on HA_EXTRA_IGNORE_DUP_KEYZardosht Kasheff2 Mar
                  • Re: question on HA_EXTRA_IGNORE_DUP_KEYSergey Vojtovich2 Mar
                    • Re: question on HA_EXTRA_IGNORE_DUP_KEYMichael Widenius19 Mar
                      • Configure error during the buildHiromichi Watari21 Mar
                        • Re: Configure error during the buildMikiya Okuno21 Mar
                          • Re: Configure error during the buildHiromichi Watari22 Mar