Hi,
Set it to 1 and try again! Let us know if that helps.
Br
Johan
On Nov 20, 2009, at 9:17, puwei 23 <puwei23@stripped> wrote:
> Hi,
>
> No, forcesend=0, the default.
>
> 2009/11/20 Johan Andersson <Johan.Andersson@stripped>
> Hi,
>
> Are you using forcesend=1 in the NdbTransaction::execute(...); ?
>
>
>
> BR
> johan
>
>
> puwei 23 wrote:
> Hi,
>
> Thanks Frazer.
> Below is the detailed information about my test case.
>
> 2009/11/19 Frazer Clement <Frazer.Clement@stripped>
>
> Hi Puwei,
> It might help if you described :
> 1) How your NdbApi program works
> Does it use batching? Does it use the Async Api? Does it use > 1 transaction?
>
>
> Yes, batch inserts, 500 or 1000 records per batch.
> Not using the Async Api.
> More than one transaction: there are several threads, and each thread
> uses a new transaction for each batch insert.
>
> 2) What hardware you are running the client on?
> Single core?
>
>
> Two quad-core Intel CPUs at 2.66 GHz, 64-bit.
> 8 GB of memory.
>
>
> 3) What CPU utilisation you observe on the client in each case
>
>
> 1 thread: CPU idle is about 95%.
> 2 threads: about 90%.
> 10 threads: it looks like this:
> [root@t1 ~]# vmstat 1
> procs -----------memory----------- --swap- ----io---- ---system--- --------cpu--------
>  r  b  swpd   free   buff    cache  si  so     bi  bo    in     cs  us  sy  id  wa  st
>  2  0   176  46280  73676  6274412   0   0  19200   0  4033  20618  22   5  61  12   0
>  5  2   176  49040  73668  6272476   0   0  10240   0  2353  14217  10   2  83   5   0
>  0  0   176  47840  73664  6273172   0   0   9600   0  2845  10292  10   2  81   6   0
>  0  0   176  46344  73652  6273580   0   0  10240   0  2554  17059  11   3  81   6   0
>  0  0   176  48924  73644  6271892   0   0  13824   0  3222  18993  15   4  73   9   0
>  0  0   176  49764  73628  6270656   0   0   7808   0  2373  12453   8   2  87   3   0
>  1  6   176  47484  73624  6273008   0   0   7808   0  1863  13148   8   2  87   3   0
>  0  0   176  49464  73612  6271872   0   0   6528   0  2396   9439   6   1  88   5   0
>  4  0   176  48020  73612  6273076   0   0   9856   0  2613  14659  11   2  81   5   0
>  5  0   176  47480  73612  6273680   0   0    512   0  1339   5809   0   0  99   0   0
>  3  0   176  47968  73600  6273092   0   0  12416   0  2782  17574  14   3  76   7   0
>  1  3   176  46288  73592  6274232   0   0  18816   0  3888  21614  20   5  62  13   0
>  4  5   176  46112  73572  6275900   0   0  21248   0  3907  20708  22   5  62  11   0
>
> I don't think the client machine is the performance bottleneck.
> But you gave me a hint: how about the data nodes?
> 1 thread:
> [root@d1 ~]# vmstat 1
> procs -----------memory------------ --swap- ----io---- --system--- ------cpu------
>  r  b  swpd    free   buff    cache  si  so  bi     bo    in    cs  us  sy  id  wa
>  2  2   160  539556  80004  1685656   0   0  18    333    32    40   1   0  98   1
>  1  3   160  531160  80012  1694228   0   0   8  18652  3218  1466  14  12  50  24
>  0  0   160  523448  80020  1702020   0   0   8  19772  3063  1898   9  11  61  19
>  0  2   160  514452  80028  1710852   0   0   8   9508  2916   722   5   3  85   8
>  0  0   160  505784  80036  1719684   0   0   8   8780  2849   826   4   4  87   6
>  0  0   160  496632  80052  1729028   0   0   8   9792  2928   872   4   3  85   7
>  1  0   160  490808  80060  1734740   0   0   8   9772  2869   840   5   3  87   6
>  0  2   160  473180  80092  1752128   0   0  12  19540  2918   642   9   3  72  16
>  1  2   160  464864  80100  1760700   0   0   8  20344  3090  1954   7   6  74  13
>  1  3   160  455264  80108  1770052   0   0   8  22640  3252  1777  14  13  50  24
>  0  0   160  450296  80112  1775248   0   0   4  27580  2889  3107  12  13  50  24
>  0  0   160  441592  80128  1783812   0   0   8  10040  2889   757   5   3  84   9
>  1  2   160  432400  80148  1792892   0   0  12   9244  2915   796   5   2  87   6
>  0  0   160  423608  80160  1801720   0   0  12   8748  2868   867   4   3  87   6
>  0  0   160  414328  80168  1811072   0   0   8  10508  2928   870   5   3  86   6
>  0  0   160  406648  80176  1818864   0   0   8  16552  2904   702   7   4  77  12
>  0  0   160  389496  80200  1835740   0   0  12  18760  3016   821   8   3  75  14
>  1  1   160  381560  80216  1843784   0   0   8  22612  3216  2097  12  12  54  21
>  1  1   160  373624  80224  1851576   0   0   8  22396  3182  2024  14  12  49  25
>
> 2 threads:
> [root@d1 ~]# vmstat 1
> procs -----------memory------------ --swap- ----io---- --system--- ------cpu------
>  r  b  swpd    free   buff    cache  si  so  bi     bo    in    cs  us  sy  id  wa
>  1  2   160   17280  80896  2207104   0   0  18    335    33    41   1   0  98   1
>  2  1   160   18216  80916  2206564   0   0  12  20620  3308  1577  13  13  50  24
>  1  2   160   17584  80928  2206812   0   0  12  25324  3935  1585  15  12  49  25
>  1  2   160   17072  80944  2207576   0   0  16  22604  4162  1345  15  11  48  26
>  1  3   160   17272  80948  2207052   0   0  12  22284  4109  1170  15  11  48  25
>  1  3   160   17380  80968  2206772   0   0  16  22240  4101  1213  16  11  46  28
>  0  1   160   17576  80988  2206752   0   0  12  16364  3851   793   8   5  69  18
>  0  3   160   18312  81008  2206212   0   0  20  14160  3843   823   8   5  75  13
>  1  1   160   16688  81012  2207768   0   0  16  17224  4002   861   8   4  74  14
>  0  0   160   17016  81024  2207756   0   0  12  15448  3985   978   8   5  75  13
>  0  1   160   17812  81040  2206440   0   0  16  20748  4248   852  10   6  70  15
>  1  1   160   17368  81068  2206932   0   0  16  21588  4271   875  11   5  68  17
>  1  2   160   16748  81084  2207696   0   0  16  24008  4300  1566  15  11  46  28
>  1  2   160   16772  81088  2207432   0   0  12  21340  4144  1017  13  13  47  27
>
> 10 threads:
> [root@d1 ~]# vmstat 1
> procs -----------memory------------ --swap- ----io---- --system--- ------cpu------
>  r  b  swpd    free   buff    cache  si  so  bi     bo    in    cs  us  sy  id  wa
>  1  2   160   16392  55180  2195900   0   0   3    113     9     8   0   0  99   0
>  1  2   160   16892  55200  2195100   0   0  12  25304  4470  1599  17   8  45  30
>  1  2   160   16308  55212  2194048   0   0  12  33528  3193  1942  18   9  45  28
>  1  1   160   16048  55228  2196372   0   0  12  25316  3087  1534  11   5  52  32
>  0  2   160   16224  55172  2195648   0   0  12  34936  3740  1085  11   6  55  28
>  1  1   160   16624  54804  2196276   0   0  12  24020  3618  2018   9   5  68  18
>  1  1   160   17328  54800  2195500   0   0   4  25880  2538  3182  15  12  49  25
>  1  0   160   17456  54796  2195504   0   0   8  24752  3259  2637  14   9  59  18
>  1  2   160   17160  54792  2195248   0   0  24  27044  5982   613  17   6  59  18
>  1  1   160   17136  54788  2194992   0   0  32  32240  5962  1200  17   6  54  22
>  2  0   160   17200  54792  2195248   0   0  12  24612  4078  1782  13   9  37  42
>  0  0   160   16880  54796  2195764   0   0  12  20488  3515  2053  10   5  70  14
>  0  1   160   15984  54804  2196536   0   0  16  25716  4110  1710  17   5  50  28
>  1  1   160   22384  54460  2190640   0   0  16  38148  3957   991  12   7  50  31
>
> Maybe this is what kills the concurrent insert performance?
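
One way to put a number on that suspicion is to average the "wa" (I/O wait) column of the data-node samples for each run. The following is a minimal C++ sketch, not part of the original test program: the helper name meanSkippingFirst and the array names are made up for illustration, and the values are copied from the d1 vmstat output above. The first row of each run is skipped because the first line vmstat prints reports averages since boot rather than the current interval.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of the per-second samples, ignoring the first vmstat row
// (that row is an average since boot, not a one-second sample).
double meanSkippingFirst(const std::vector<int>& samples) {
    assert(samples.size() >= 2);  // need at least one real sample
    const double sum =
        std::accumulate(samples.begin() + 1, samples.end(), 0.0);
    return sum / static_cast<double>(samples.size() - 1);
}

// "wa" column on data node d1, one vector per run (illustrative names).
const std::vector<int> wa1Thread = {1, 24, 19, 8, 6, 7, 6, 16, 13, 24,
                                    24, 9, 6, 6, 6, 12, 14, 21, 25};
const std::vector<int> wa2Threads = {1, 24, 25, 26, 25, 28, 18, 13,
                                     14, 13, 15, 17, 28, 27};
const std::vector<int> wa10Threads = {0, 30, 28, 32, 28, 18, 25, 18,
                                      18, 22, 42, 14, 28, 31};
```

With these samples the mean I/O wait on d1 comes out at roughly 14% for 1 thread, 21% for 2 threads and 26% for 10 threads. That is consistent with the data node's disk being busier under load, but it does not by itself show that disk I/O is the limiting factor.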
>
> The saturation effect you observe could be due to:
> a) Client CPU is saturated
>    Extra threads compete with little throughput gain from overlapping
>    with IO.
>    Try connecting a second client process on a different client machine
>    to see throughput improvement.
> b) Threads are competing for per-cluster-connection mutex
>    Use more than 1 ndb_cluster_connection in your client process,
>    enabling higher throughput.
>
>
> I've tried that. Performance improved by about 10%.
> How many ndb_cluster_connection objects can a cluster hold?
>
>
> Hope this helps,
> Frazer
>
>
> puwei 23 wrote:
>
> Hi All,
>
> When testing insert performance with the NDB API, I found that the
> throughput of concurrent inserts was only slightly higher than that of
> a single insert client. Here are the test results:
>
> Concurrent   Inserted records per second,   Inserted records per second,
> threads      per thread                     all threads
> ------------------------------------------------------------------------
> 1            11676.25                       11676.25
> 2            About 6202.76                  12405.52
> 10           About 1212.94                  12131.00
>
> Per-thread performance decreases dramatically, and the number of
> records inserted per second by all threads together stays almost the
> same.
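
To make the scaling explicit, the totals reported above can be turned into a speedup and a parallel efficiency relative to the single-threaded run. A minimal C++ sketch follows; the Run struct, the kRuns array and the function names are illustrative only, and the figures are the "all threads" totals from this message.

```cpp
#include <cassert>

// One measurement: number of client threads and total inserts per second.
struct Run {
    int threads;
    double totalPerSec;
};

// Throughput of a run relative to the baseline run (1.0 = no gain).
double speedup(const Run& run, const Run& baseline) {
    assert(baseline.totalPerSec > 0.0);
    return run.totalPerSec / baseline.totalPerSec;
}

// Fraction of ideal linear scaling achieved (1.0 = perfect scaling).
double efficiency(const Run& run, const Run& baseline) {
    assert(run.threads > 0);
    return speedup(run, baseline) / run.threads;
}

// Totals reported in this thread for 1, 2 and 10 threads.
const Run kRuns[] = {{1, 11676.25}, {2, 12405.52}, {10, 12131.00}};
```

For these numbers, 2 threads give a speedup of about 1.06 (efficiency about 0.53) and 10 threads about 1.04 (efficiency about 0.10), so adding client threads is buying almost nothing and the limit presumably lies elsewhere.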
>
> Does anyone have experience with this?
>
>
> Regards,
> puwei
>
>
>
> --
> Frazer Clement, Software Engineer, MySQL Cluster Sun Microsystems -
> www.mysql.com
> Office: Edinburgh, UK
>
> Are you MySQL certified? www.mysql.com/certification
>
>
>
>
>