Sergei Golubchik wrote:
>>>> Without doing this, and using encapsulation so that a THD can have
>>>> multiple Statements, it will be very difficult to work on any future
>>>> parallelization efforts.
>>> You mean, a Statement can have multiple THDs, I suppose :)
>> No he means what he is saying. I don't know why you would want a statement
>> shared across multiple THD, but having a THD be able to handle multiple
>> statements means that you can do asynchronous queries within a single
>> connection.
>
> Ah, okay. I see.
>
> I thought that "parallelization", that Jay mentioned, means executing
> parts of a single statement in different threads - which, indeed, may
> need two THDs sharing the same Statement.
Yes, what Brian says...once the THD is distinguished from a pthread,
intra-statement parallelization is possible. We've only just begun this
step in Drizzle (a Session is no longer intricately linked to a
pthread in Drizzle) but there's clearly a ton more to do. :) I know Alex
Esterkin thinks there is no reason to do such a thing, but perhaps he
just needs an example :)
Imagine a Session which sends three long-running SELECTs over the same
client connection. Currently, because the THD is linked to a single
pthread in MySQL, these SELECTs will not only block each other, but will
be executed in order. What if there were no reason to do so? If a
Session contained a vector of Statement objects, it could parse and
optimize all three statements and decide to send two of them off to
other scheduler threads for execution, essentially parallelizing the
operation (particularly on a distributed node architecture...)
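
A rough C++ sketch of that idea, under loud assumptions: the Session and
Statement types below are hypothetical stand-ins, not the real Drizzle or
MySQL classes, and execute() only fakes the parse/optimize/run work. For
simplicity it hands every statement to its own thread rather than keeping
one on the session's thread and farming out the other two.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration only; not the real Statement class.
struct Statement {
    std::string sql;
};

// Stand-in for parsing, optimizing and executing one statement.
std::string execute(const Statement& stmt) {
    assert(!stmt.sql.empty());  // a statement must carry some SQL
    return "done: " + stmt.sql;
}

// Hypothetical Session that owns several Statements instead of being
// tied to a single thread of execution.
class Session {
public:
    void add(std::string sql) {
        statements_.push_back(Statement{std::move(sql)});
    }

    // Launches each statement on its own thread, then collects the
    // results in submission order.
    std::vector<std::string> run_all() const {
        std::vector<std::future<std::string>> pending;
        pending.reserve(statements_.size());
        for (const Statement& stmt : statements_) {
            pending.push_back(std::async(std::launch::async,
                                         [&stmt] { return execute(stmt); }));
        }

        std::vector<std::string> results;
        results.reserve(pending.size());
        for (std::future<std::string>& f : pending) {
            results.push_back(f.get());  // blocks until that statement finishes
        }
        return results;
    }

private:
    std::vector<Statement> statements_;
};
```

Calling add() three times and then run_all() gives back three results in
the order the statements were submitted, even though they may have run
concurrently. A real scheduler would presumably use a thread pool and
would have to deal with locking and transaction state, none of which is
modeled here.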
Sure, like Alex mentioned, Drizzle's protocol supports parallel
operations (well, not really, it is non-blocking operations, but
still...similar concept). But separating the notion of a thread from a
Statement gives us another point at which operations can run in
parallel. The more opportunities for parallelism, the better we can
scale, no?
Cheers!
Jay