Clint Byrum wrote:
>On Wed, 2004-08-11 at 01:14, Martin Skold wrote:
>> From the MySQL server instance the retrieval process is identical to
>>that of any other storage engine (such as InnoDB). The storage engines currently
>>do not do any query processing, only row/column based operations.
>>In the Ndb Cluster storage engine scans on full tables or ordered indexes
>>are executed in parallel on all the involved nodes.
>>We are looking to add capabilities to push part of queries down into the
>>Ndb Cluster storage engine (this is already supported by the storage engine
>>where simple scan filters can limit the amount of data sent back to the
>>MySQL server greatly). We are also planning to look at distributed query
>>but this is future work.
>Martin, Mikael, thank you for your responses. This clears things up a
>lot for me. If this mail sounds negative, please don't take it that way.
>I am trying to be constructive. :-D
>I will say that this kind of disappoints me. NDB has the ability, but
>MySQL Cluster does not take advantage of the elegant model of doing
>query restriction at each node. This clears up for me why Gigabit and/or
>SCI is recommended. That said, for non-search reads, it will be *smokin*
>fast even for an ENORMOUS database.
>Too bad that's already pretty darn fast with indexes. For any sort of
>searching (WHERE recordtype=foo), it's just going to be an ordered scan
>of the indexes and send *everything* back to the MySQL server. Right?
Not exactly. If you have an ordered index you will do a range scan
and only send back what is found in the range (including equality). This
means that if you plan to do particular searches you can optimize
greatly by adding ordered indexes (at the cost of maintaining them, of
course).
And as I said, the storage engine supports defining predicate filters for
all scans (ordered and full table scans), so as soon as we have
implemented support for using these in the handler, most scans will only
send a fraction of the rows back to the MySQL server. For now, use
ordered indexes if you want better search performance.
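To make the difference concrete, here is a rough Python sketch (a toy
model, not NDB code) that counts how many rows reach the MySQL server
under the three strategies discussed above. The function names, the
four-node layout and the (recordtype, payload) row shape are all
assumptions made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy model: each "data node" holds a fragment of
# (recordtype, payload) rows. Nothing here is real NDB API.
def make_nodes(num_nodes=4, rows_per_node=1000):
    nodes = []
    for n in range(num_nodes):
        rows = [(i % 10, f"node{n}-row{i}") for i in range(rows_per_node)]
        nodes.append(rows)
    return nodes

def full_scan_filter_at_server(nodes, wanted):
    """Every node ships every row; the MySQL server filters afterwards."""
    shipped = [row for rows in nodes for row in rows]
    matches = [row for row in shipped if row[0] == wanted]
    return matches, len(shipped)

def ordered_index_range_scan(nodes, low, high):
    """Each node uses a sorted index and ships only rows with low <= key <= high."""
    shipped = []
    for rows in nodes:
        index = sorted(rows)            # stands in for an ordered index (rebuilt here for brevity)
        keys = [r[0] for r in index]
        start = bisect_left(keys, low)
        stop = bisect_right(keys, high)
        shipped.extend(index[start:stop])
    return shipped, len(shipped)

def pushed_down_filter_scan(nodes, predicate):
    """Each node evaluates the predicate locally and ships only matching rows."""
    shipped = [row for rows in nodes for row in rows if predicate(row)]
    return shipped, len(shipped)

nodes = make_nodes()
full_matches, full_shipped = full_scan_filter_at_server(nodes, 3)
range_matches, range_shipped = ordered_index_range_scan(nodes, 3, 3)
push_matches, push_shipped = pushed_down_filter_scan(nodes, lambda r: r[0] == 3)
print(full_shipped, range_shipped, push_shipped)
```

All three return the same matching rows; only the number of rows that
cross the network differs.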
BTW: primary keys and unique indexes create additional ordered indexes
(unless you specify you don't want them with USING HASH).
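As a loose analogy for why the extra ordered index matters, a hash index
answers exact-key lookups while an ordered index can also answer range
queries. The sketch below uses a Python dict and a sorted key list as
stand-ins; the sample rows and function names are hypothetical and this
is not how NDB implements either index.

```python
from bisect import bisect_left, bisect_right

# Hypothetical primary key -> row data.
rows = {17: "q", 3: "a", 42: "z", 8: "c", 23: "m"}

hash_index = dict(rows)        # stand-in for a hash index: exact-match only
ordered_index = sorted(rows)   # stand-in for an ordered index: keys in sorted order

def hash_lookup(key):
    """Exact-key lookup; returns None when the key is absent."""
    return hash_index.get(key)

def range_lookup(low, high):
    """Return (key, row) pairs with low <= key <= high, in key order."""
    start = bisect_left(ordered_index, low)
    stop = bisect_right(ordered_index, high)
    return [(k, rows[k]) for k in ordered_index[start:stop]]

print(hash_lookup(8))
print(range_lookup(5, 25))
```

With only the hash structure, the range query would have to visit every
key; that is the trade-off made when the ordered index is left out.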
>This also explains the 8k row size limit.
>My boss had a great analogy..
>Send one person to the store to get a six pack of beer, and a gallon of
>milk. That's MyISAM or InnoDB doing a join. The person having a map of
>the store, is the same thing, but with indexes. Send 4 people to two
>stores with the same task, and you get MySQL cluster. It's likely you'll
>get your food faster, no doubt, but you spent 4 people's time on it. And
>they brought *every* brand of milk and beer back for you to decide on..
>so you needed two pretty big trucks. ;) (let's stop there before our
>hypothetical characters get drunk and surly ;)
>So, general SQL use is out the window.. but for pure transactional
>workloads, it's probably going to be extremely fast, and pretty scalable.
>As long as you don't ever do big search queries (low timeouts would
>prevent that maybe..), you wouldn't overwhelm the network links, as
>you'd just be doing single node-group lookups, and single node-group
>Since one could replicate an NDB stored database to other servers, it
>might make sense to have a setup where your transactional system is on
>MySQL cluster, but you do all your reporting/searching on smaller
>datamart servers, or even on the same servers, but different instances
>that have all the data stored in InnoDB or MyISAM.
>Keep on pushing the envelope guys. I'd love to see MySQL Cluster just
>stomp the snot out of the alternatives out there. :-D
Martin Sköld, Software Engineer
MySQL AB, www.mysql.com
Office: +46 (0)730 31 26 21