From: Zhangzhigang Date: May 9 2012 4:13am Subject: =?utf-8?B?5Zue5aSN77yaIFdoeSBpcyBjcmVhdGluZyBpbmRleGVzIGZhc3RlciBhZnRl?= =?utf-8?B?ciBpbnNlcnRpbmcgbWFzc2l2ZSBkYXRhIHJvd3M/?= List-Archive: http://lists.mysql.com/mysql/227368 Message-Id: <1336536781.43141.YahooMailNeo@web15208.mail.cnb.yahoo.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="104030145-1687539380-1336536781=:43141" --104030145-1687539380-1336536781=:43141 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable James...=0A>* By doing all the indexes after building the table (or at leas= t all the non-UNIQUE indexes), "sort merge" can be used.=C2=A0 This techniq= ue had been highly optimized over the past half-century, and is more effici= ent.=0A=0A=0AI have a question about "sort merge":=0A=0AWhy does it do the = all "sort merge"? =0A=0A=0AIn my opinion, it just maintains the B tree and = inserts one key into a B tree node which has fewer sorted keys, so it is go= od performance.=0A=0AIf it only does the "sort merge", the B tree data stru= cture have to been createdseparately. it wastes some performance.=0A=0ADoes= it?=0A=0A=0A=0A________________________________=0A =E5=8F=91=E4=BB=B6=E4= =BA=BA=EF=BC=9A Rick James =0A=E6=94=B6=E4=BB=B6=E4= =BA=BA=EF=BC=9A Johan De Meersman ; Zhangzhigang =0A=E6=8A=84=E9=80=81=EF=BC=9A "mysql@stripped"= =0A=E5=8F=91=E9=80=81=E6=97=A5=E6=9C=9F=EF=BC=9A 2= 012=E5=B9=B45=E6=9C=888=E6=97=A5, =E6=98=9F=E6=9C=9F=E4=BA=8C, =E4=B8=8A=E5= =8D=88 12:35=0A=E4=B8=BB=E9=A2=98: RE: Why is creating indexes faster after= inserting massive data rows?=0A =0A* Batch INSERTs run faster than one-row= -at-a-time, but this is unrelated to INDEX updating speed.=0A* The cache si= ze is quite important to dealing with indexing during INSERT; see http://my= sql.rjweb.org/doc.php/memory =0A* Note that mysqldump sets up for an effici= ent creation of indexes after loading the data.=C2=A0 This is not practical= (or necessarily efficient) when incremental INSERTing into a table.=0A=0AA= s for the original question...=0A* Updating the index(es) for one row often= involves random BTree traversals.=C2=A0 When the index(es) are too big to = be cached, this can involve disk hit(s) for each row inserted.=0A* By doing= all the indexes after building the table (or at least all the non-UNIQUE i= ndexes), "sort merge" can be used.=C2=A0 This technique had been highly opt= imized over the past half-century, and is more efficient.=0A=0A=0A> -----Or= iginal Message-----=0A> From: Johan De Meersman [mailto:vegivamp@stripped]= =0A> Sent: Monday, May 07, 2012 1:29 AM=0A> To: Zhangzhigang=0A> Cc: mysql@= lists.mysql.com=0A> Subject: Re: Why is creating indexes faster after inser= ting massive=0A> data rows?=0A> =0A> ----- Original Message -----=0A> > Fro= m: "Zhangzhigang" =0A> >=0A> > Creating indexes a= fter inserting massive data rows is faster than=0A> > before inserting data= rows.=0A> > Please tell me why.=0A> =0A> Plain and simple: the indices get= updated after every insert statement,=0A> whereas if you only create the i= ndex *after* the inserts, the index=0A> gets created in a single operation,= which is a lot more efficient.=0A> =0A> I seem to recall that inside of a = transaction (thus, InnoDB or so) the=0A> difference is markedly less; I mig= ht be wrong, though.=0A> =0A> =0A> --=0A> Bier met grenadyn=0A> Is als most= erd by den wyn=0A> Sy die't drinkt, is eene kwezel=0A> Hy die't drinkt, is = ras een ezel=0A> =0A> --=0A> MySQL General Mailing List=0A> For list archiv= es: http://lists.mysql.com/mysql =0A> To unsubscribe:=C2=A0 =C2=A0 http://l= ists.mysql.com/mysql --104030145-1687539380-1336536781=:43141--