List: Cluster
From: Johan Andersson
Date: November 20 2009 8:59am
Subject: Re: concurrent insert performance decrease dramatically when using NDB API
Hi,
Set forcesend to 1 and try again! Let us know if that helps.
Br
Johan

On Nov 20, 2009, at 9:17, puwei 23 <puwei23@stripped> wrote:

> Hi,
>
> No, forcesend=0, the default.
>
> 2009/11/20 Johan Andersson <Johan.Andersson@stripped>
> Hi,
>
> Are you using forcesend=1 in NdbTransaction::execute(...)?
>
>
>
> BR
> johan
>
>
> puwei 23 wrote:
> Hi,
>
> Thanks Frazer.
> Following are the detail info about my test case.
>
> 2009/11/19 Frazer Clement <Frazer.Clement@stripped>
>
> Hi Puwei,
>  It might help if you described:
>  1) How your NdbApi program works
>    Does it use batching?  Does it use the Async Api?  Does it use more than 1 transaction?
>
>
> Yes, it uses batch inserts, 500 or 1000 records per batch insert.
> It does not use the Async Api.
> It uses more than 1 transaction: each thread uses a new transaction for
> each batch insert, and there is more than one thread.
>
>  2) What hardware you are running the client on?
>    Single core?
>
>
> Two quad-core Intel CPUs, 2.66 GHz, 64-bit.
> 8 GB memory.
>
>
>  3) What CPU utilisation you observe on the client in each case
>
>
> With 1 thread, CPU idle is about 95%;
> with 2 threads, it's about 90%;
> with 10 threads, it looks like this:
> [root@t1 ~]# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>  r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa st
>  2  0    176  46280  73676 6274412    0    0 19200     0 4033 20618 22  5 61 12  0
>  5  2    176  49040  73668 6272476    0    0 10240     0 2353 14217 10  2 83  5  0
>  0  0    176  47840  73664 6273172    0    0  9600     0 2845 10292 10  2 81  6  0
>  0  0    176  46344  73652 6273580    0    0 10240     0 2554 17059 11  3 81  6  0
>  0  0    176  48924  73644 6271892    0    0 13824     0 3222 18993 15  4 73  9  0
>  0  0    176  49764  73628 6270656    0    0  7808     0 2373 12453  8  2 87  3  0
>  1  6    176  47484  73624 6273008    0    0  7808     0 1863 13148  8  2 87  3  0
>  0  0    176  49464  73612 6271872    0    0  6528     0 2396  9439  6  1 88  5  0
>  4  0    176  48020  73612 6273076    0    0  9856     0 2613 14659 11  2 81  5  0
>  5  0    176  47480  73612 6273680    0    0   512     0 1339  5809  0  0 99  0  0
>  3  0    176  47968  73600 6273092    0    0 12416     0 2782 17574 14  3 76  7  0
>  1  3    176  46288  73592 6274232    0    0 18816     0 3888 21614 20  5 62 13  0
>  4  5    176  46112  73572 6275900    0    0 21248     0 3907 20708 22  5 62 11  0
>
> I don't think the client machine is the performance bottleneck.
> But you gave me a hint: how about the data nodes?
> 1 thread:
> [root@d1 ~]# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
>  2  2    160 539556  80004 1685656    0    0    18   333   32    40  1  0 98  1
>  1  3    160 531160  80012 1694228    0    0     8 18652 3218  1466 14 12 50 24
>  0  0    160 523448  80020 1702020    0    0     8 19772 3063  1898  9 11 61 19
>  0  2    160 514452  80028 1710852    0    0     8  9508 2916   722  5  3 85  8
>  0  0    160 505784  80036 1719684    0    0     8  8780 2849   826  4  4 87  6
>  0  0    160 496632  80052 1729028    0    0     8  9792 2928   872  4  3 85  7
>  1  0    160 490808  80060 1734740    0    0     8  9772 2869   840  5  3 87  6
>  0  2    160 473180  80092 1752128    0    0    12 19540 2918   642  9  3 72 16
>  1  2    160 464864  80100 1760700    0    0     8 20344 3090  1954  7  6 74 13
>  1  3    160 455264  80108 1770052    0    0     8 22640 3252  1777 14 13 50 24
>  0  0    160 450296  80112 1775248    0    0     4 27580 2889  3107 12 13 50 24
>  0  0    160 441592  80128 1783812    0    0     8 10040 2889   757  5  3 84  9
>  1  2    160 432400  80148 1792892    0    0    12  9244 2915   796  5  2 87  6
>  0  0    160 423608  80160 1801720    0    0    12  8748 2868   867  4  3 87  6
>  0  0    160 414328  80168 1811072    0    0     8 10508 2928   870  5  3 86  6
>  0  0    160 406648  80176 1818864    0    0     8 16552 2904   702  7  4 77 12
>  0  0    160 389496  80200 1835740    0    0    12 18760 3016   821  8  3 75 14
>  1  1    160 381560  80216 1843784    0    0     8 22612 3216  2097 12 12 54 21
>  1  1    160 373624  80224 1851576    0    0     8 22396 3182  2024 14 12 49 25
>
> 2 threads:
> [root@d1 ~]# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
>  1  2    160  17280  80896 2207104    0    0    18   335   33    41  1  0 98  1
>  2  1    160  18216  80916 2206564    0    0    12 20620 3308  1577 13 13 50 24
>  1  2    160  17584  80928 2206812    0    0    12 25324 3935  1585 15 12 49 25
>  1  2    160  17072  80944 2207576    0    0    16 22604 4162  1345 15 11 48 26
>  1  3    160  17272  80948 2207052    0    0    12 22284 4109  1170 15 11 48 25
>  1  3    160  17380  80968 2206772    0    0    16 22240 4101  1213 16 11 46 28
>  0  1    160  17576  80988 2206752    0    0    12 16364 3851   793  8  5 69 18
>  0  3    160  18312  81008 2206212    0    0    20 14160 3843   823  8  5 75 13
>  1  1    160  16688  81012 2207768    0    0    16 17224 4002   861  8  4 74 14
>  0  0    160  17016  81024 2207756    0    0    12 15448 3985   978  8  5 75 13
>  0  1    160  17812  81040 2206440    0    0    16 20748 4248   852 10  6 70 15
>  1  1    160  17368  81068 2206932    0    0    16 21588 4271   875 11  5 68 17
>  1  2    160  16748  81084 2207696    0    0    16 24008 4300  1566 15 11 46 28
>  1  2    160  16772  81088 2207432    0    0    12 21340 4144  1017 13 13 47 27
>
> 10 threads:
> [root@d1 ~]# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
>  1  2    160  16392  55180 2195900    0    0     3   113    9     8  0  0 99  0
>  1  2    160  16892  55200 2195100    0    0    12 25304 4470  1599 17  8 45 30
>  1  2    160  16308  55212 2194048    0    0    12 33528 3193  1942 18  9 45 28
>  1  1    160  16048  55228 2196372    0    0    12 25316 3087  1534 11  5 52 32
>  0  2    160  16224  55172 2195648    0    0    12 34936 3740  1085 11  6 55 28
>  1  1    160  16624  54804 2196276    0    0    12 24020 3618  2018  9  5 68 18
>  1  1    160  17328  54800 2195500    0    0     4 25880 2538  3182 15 12 49 25
>  1  0    160  17456  54796 2195504    0    0     8 24752 3259  2637 14  9 59 18
>  1  2    160  17160  54792 2195248    0    0    24 27044 5982   613 17  6 59 18
>  1  1    160  17136  54788 2194992    0    0    32 32240 5962  1200 17  6 54 22
>  2  0    160  17200  54792 2195248    0    0    12 24612 4078  1782 13  9 37 42
>  0  0    160  16880  54796 2195764    0    0    12 20488 3515  2053 10  5 70 14
>  0  1    160  15984  54804 2196536    0    0    16 25716 4110  1710 17  5 50 28
>  1  1    160  22384  54460 2190640    0    0    16 38148 3957   991 12  7 50 31
>
> Maybe this is what kills the concurrent insert performance?
>
>  The saturation effect you observe could be due to:
>  a) Client CPU is saturated
>    Extra threads compete with little throughput gain from overlapping with IO
>    Try connecting a second client process on a different client machine to
>    see throughput improvement
>  b) Threads are competing for per-cluster-connection mutex
>    Use more than 1 ndb_cluster_connection in your client process, enabling
>    higher throughput.
>
>
> I've tried that. Performance improved by about 10%.
> How many ndb_cluster_connections can a cluster hold?
>
>
> Hope this helps,
> Frazer
>
>
> puwei 23 wrote:
>
> Hi All,
>
> When testing insert performance with the NDB API, I found that the
> throughput of concurrent inserts was ONLY a little higher than that of a
> single insert client. Here are the test results:
>
> Concurrent threads | Inserted records per second, | Inserted records per second,
>                    | per thread                   | all threads
> -------------------+------------------------------+-----------------------------
> 1                  | 11676.25                     | 11676.25
> 2                  | About 6202.76                | 12405.52
> 10                 | About 1212.94                | 12131.00
>
> Per-thread performance decreases dramatically, and the total number of
> records inserted per second by all threads stays almost the same.
>
> Does anyone have experience with this?
>
>
> Regards,
> puwei
>
>
>
> --
> Frazer Clement, Software Engineer, MySQL Cluster Sun Microsystems -
> www.mysql.com
> Office: Edinburgh, UK
>
> Are you MySQL certified?  www.mysql.com/certification
>
>
>
>
>
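
The throughput table in the original post can be turned into speedup and per-thread efficiency figures, which makes the saturation easier to see. The following is only a minimal sketch: the three totals are copied from the post, while the function name `scaling` and the definition of efficiency as speedup divided by thread count are choices made here for illustration, not something the posters used.

```python
# Total inserted records per second (all threads), as quoted in the original post.
measured = {1: 11676.25, 2: 12405.52, 10: 12131.00}


def scaling(totals):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run.

    speedup    = total throughput / single-thread throughput
    efficiency = speedup / number of threads (1.0 would be perfect scaling)
    """
    base = totals[1]
    result = {}
    for threads, total in sorted(totals.items()):
        speedup = total / base
        result[threads] = (speedup, speedup / threads)
    return result


if __name__ == "__main__":
    for threads, (speedup, efficiency) in scaling(measured).items():
        print(f"{threads:>2} threads: speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.0%}")
```

With these numbers the speedup stays close to 1 for both 2 and 10 threads, which is consistent with a shared bottleneck rather than a per-thread limit.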

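
To compare the data-node captures quoted above without reading them row by row, the vmstat output can be averaged per column. This is a rough sketch under stated assumptions: the column list matches the 16-column header shown in the data-node captures, the three sample rows are copied from the 10-thread capture, and the helper names `parse_vmstat` and `mean` are hypothetical. The first row of a real vmstat run reports averages since boot, so it is usually left out of such a comparison.

```python
# Column names as they appear in the data-node vmstat header quoted in the thread.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat(text):
    """Parse vmstat data rows into dicts, skipping headers and malformed lines."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows have one integer per column; header lines start with text.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        rows.append(dict(zip(COLUMNS, (int(f) for f in fields))))
    return rows


def mean(rows, column):
    """Average one column over all parsed rows."""
    return sum(row[column] for row in rows) / len(rows)


# Three rows copied from the 10-thread data-node capture in the thread.
SAMPLE_10_THREADS = """
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 1  2    160  16892  55200 2195100    0    0    12 25304 4470  1599 17  8 45 30
 1  2    160  16308  55212 2194048    0    0    12 33528 3193  1942 18  9 45 28
 1  1    160  16048  55228 2196372    0    0    12 25316 3087  1534 11  5 52 32
"""

if __name__ == "__main__":
    rows = parse_vmstat(SAMPLE_10_THREADS)
    print(f"rows: {len(rows)}")
    print(f"mean blocks out (bo): {mean(rows, 'bo'):.0f}")
    print(f"mean I/O wait (wa):   {mean(rows, 'wa'):.0f}%")
```

Running the same helpers over the 1-thread and 2-thread captures would show whether disk writes and I/O wait on the data node grow with the number of client threads.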