List: Cluster
From: Xiaobin She
Date: February 1 2010 1:02pm
Subject: Re: Questions about the performance of ndb engine while using ndb api
Hi Andrew, Jorgen, and Bernhard,

Thank you very much for your help.


> Also did you look at the adaptive send algorithm?

Yes, I have looked at the adaptive send algorithm. I set the third
parameter of the execute function of the NdbTransaction class to 1 and to 2
and reran the test.
When the parameter is 1, the result is the same as when the parameter is
set to 0.
When I set the parameter to 2, the result is worse than in the former
two tests: the average execution time of each transaction is about 30 ms.

>It should also be noted that with primary key equality lookups in
>general the more data node servers you have the faster these
>transactions will be.

>I would also look at multi-threading your application which should give
>you a large performance boost.  Cluster excels at multiple simultaneous
>queries.
The test I'm doing right now is just for a single process with a single
data node; the tests with multiple data nodes or multiple threads will be
performed later.


Since a hash index is used for partitioning, how is a range search
carried out in the ndb engine when there is more than one data node?
For example, if I want to select the rows whose ids are from 100 to 1000,
how does the ndb engine run the query when there are two data nodes?
Is the query sent to one data node, which then forwards it to the other
data node and, after it has collected all the data, returns the result
to the client? Or is the query sent to all data nodes at the same time,
with each data node returning the data that it has and the client lib
doing the merge?
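
To make the question concrete, here is a toy sketch of why I expect a range
search to involve every data node. It is only an illustration: the modulo
hash, NUM_NODES, node_for and rows_per_node are names I made up, and the
real hash function and partition layout of ndb are surely different.

```python
# Toy model only: a stand-in for hash partitioning, not ndb's real scheme.
NUM_NODES = 2  # assumed: two data nodes, one partition each


def node_for(row_id):
    """Pick a data node by hashing the primary key (stand-in hash: modulo)."""
    return row_id % NUM_NODES


def rows_per_node(low, high):
    """Count how many ids in [low, high] land on each data node."""
    counts = {node: 0 for node in range(NUM_NODES)}
    for row_id in range(low, high + 1):
        counts[node_for(row_id)] += 1
    return counts


if __name__ == "__main__":
    # The ids 100..1000 are spread over both nodes, so no single node
    # can answer the range query alone.
    print(rows_per_node(100, 1000))
```

With any hash of this kind, neighbouring ids end up on different nodes, so
the rows for ids 100 to 1000 have to be gathered from both nodes somehow.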

And how can I calculate the amount of memory used for the hash index?
Is there a formula for it?


As far as I know, when an insert or update operation is sent by the client
to one data node group, the result of the operation (success or
failure) is sent back to the client only after all data nodes in the group
have updated their data in memory (is that right, or does the log have to
be written too?), which means that the processing is synchronous.
Is the description above right?
If it is, can this processing be made asynchronous, so that the client gets
the success notification as soon as the operation has been received by the
data nodes? Or can it be made half-synchronous, so that the client gets the
success notification after one data node in the group has updated its data,
without waiting for all the data nodes?
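
To show what I mean by the three modes, here is a toy model of when the
client would be notified in each case. The policy names and the ack_latency
function are only labels for this question, not ndb settings, and network
time is ignored. It does not claim that ndb supports any of these modes.

```python
# Toy model: compares three acknowledgement policies, nothing ndb-specific.
def ack_latency(replica_apply_ms, policy):
    """Time (ms) until the client is told 'success'.

    replica_apply_ms: time each data node in the group needs to apply
    the update in memory.
    """
    if policy == "synchronous":
        # Wait until every data node in the group has applied the update.
        return max(replica_apply_ms)
    if policy == "half-synchronous":
        # Wait only for the first data node that finishes.
        return min(replica_apply_ms)
    if policy == "asynchronous":
        # Acknowledge on receipt, before any node has applied the update.
        return 0
    raise ValueError("unknown policy: " + policy)


if __name__ == "__main__":
    times = [3, 5]  # made-up apply times for two data nodes
    for policy in ("synchronous", "half-synchronous", "asynchronous"):
        print(policy, ack_latency(times, policy))
```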


Thank you again for your help.
xiaobin
