A big problem in designing parallelism is making sure the threads don't duplicate each
other's work and don't step on each other. Also, there is a non-trivial risk that
parallelism could make the query go slower. (Example: thrashing the disk and cache in
multiple spots instead of just one.)
As for the CPU vs Disk debate -- I suspect one cannot reliably predict whether a query
will be CPU-bound (hence benefiting from multiple cores), I/O-bound (thereby needing a
different optimization), or some combination of the two (making optimization even more
complicated).
It seems like Partitioning is poised to implement intra-SELECT parallelism. The division
of labor is pretty clear and clean -- one thread per partition (after pruning). That,
plus not opening all partitions before pruning, would make Partitions the 'greatest thing
since sliced bread' for MySQL.
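
To make that division of labor concrete, here is a minimal Python sketch of "one thread
per partition, after pruning". It is not MySQL code: the in-memory PARTITIONS table and
the names prune, scan_partition and parallel_sum are all made up for illustration. Each
worker computes a partial SUM over its own partition and shares no state with the others,
so the threads neither duplicate work nor step on each other; the coordinator only adds
up the partial results. (In CPython the GIL keeps these threads from using several cores
for pure-Python work, so this shows the structure, not a speedup.)

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table partitioned by year: each partition holds (year, amount) rows.
PARTITIONS = {
    "p2007": {"range": (2007, 2007), "rows": [(2007, 10), (2007, 20)]},
    "p2008": {"range": (2008, 2008), "rows": [(2008, 5), (2008, 7)]},
    "p2009": {"range": (2009, 2009), "rows": [(2009, 1), (2009, 2), (2009, 3)]},
}


def prune(partitions, lo, hi):
    """Keep only partitions whose key range overlaps [lo, hi]."""
    return [
        part for part in partitions.values()
        if part["range"][0] <= hi and part["range"][1] >= lo
    ]


def scan_partition(part, lo, hi):
    """Partial SUM(amount) for one partition; reads only its own rows."""
    return sum(amount for year, amount in part["rows"] if lo <= year <= hi)


def parallel_sum(partitions, lo, hi):
    """SUM(amount) WHERE year BETWEEN lo AND hi, one worker per surviving partition."""
    survivors = prune(partitions, lo, hi)  # pruning happens before any scan starts
    if not survivors:
        return 0
    with ThreadPoolExecutor(max_workers=len(survivors)) as pool:
        partials = pool.map(lambda part: scan_partition(part, lo, hi), survivors)
        return sum(partials)  # coordinator merges the partial aggregates


if __name__ == "__main__":
    print(parallel_sum(PARTITIONS, 2008, 2009))
```

Note that the sketch does nothing about the slowdown risk mentioned above: if all
partitions sit on one disk, the workers may still thrash it.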
FYI, InfoBright does not seem to have cracked the parallelism nut, either.
Rick James
MySQL Geeks - Consulting & Review
> -----Original Message-----
> From: Hui Lin [mailto:liova77@stripped]
> Sent: Wednesday, November 25, 2009 8:18 AM
> To: Timour Katchaounov
> Cc: Venu Kalyan; internals@stripped
> Subject: Re: How MySQL exploit performance of a quad core processor?
>
> hi Timour,
>
> Thanks for the reply. As you said (one thread per query), if I want to
> parallelize a single query to exploit multiple cores, I need to look
> at the storage engine level rather than the query engine level.
> To achieve this in MySQL, would I need to re-engineer MySQL? If so,
> that is far too big a job for me. :-(
>
> I'm also interested in the alternatives for implementing intra-query
> parallelism. Are there any examples? I'd also like to hear more about
> the work you mentioned on increasing throughput by removing lock
> contention.
>
> I found a thesis about accelerating critical section execution.
> http://www.ece.cmu.edu/~omutlu/pub/acs_asplos09.pdf
>
> I also have few more comments inline below.
>
> On Wed, Nov 25, 2009 at 9:43 PM, Timour Katchaounov
> <timour@stripped> wrote:
> > Hello Hui,
> >
> > Not having read the Oracle article, few replies on few of your
> > points below.
> >
> > MySQL's query processor architecture is purely single-threaded
> > (there is one thread per one query), thus the execution of a
> > single query cannot benefit from multiple cores at the query
> > engine level. The storage engines themselves make use of multiple
> > cores to various degrees, but this is invisible to the query
> > processor.
> >
> > The main hurdle to implement parallel query processing is that
> > MySQL's execution-related data structures are non-reentrant, and
> > it is a major re-engineering effort to change that. There are
> > other alternatives to implement intra-query parallelism, but none
> > looks trivial to me.
> >
> > However, MySQL makes use of multiple cores/cpus when processing
> > multiple queries. This is done by allocating one thread per query.
> > Multiple such threads may access the same storage engine.
> > The projects I am aware of, whose goal is improving parallelism,
> > are focused on increasing throughput by removing lock contention and
> > generally removing hot spots in the server.
> >
> > I have few more comments inline below.
> >
> >> thanks for your information.
> >>
> >> I have read the article about Oracle's parallel execution that you
> >> supplied. Oracle parses a SELECT into different operations (e.g.
> >> scan, sort), and divides each operation into small jobs that
> >> multiple parallel execution servers run in parallel.
> >>
> >> Some doubts:
> >> 1. About whether it is worthwhile. No doubt I/O matters more than
> >> the processor. So is it worth optimizing MySQL for multi-core
> >> computers? Which steps of a SELECT can benefit from multiple cores?
> >
> > One very notable case is queries over partitioned tables.
> > Your question is relevant for any DBMS, there are many cases
> > when a query engine could do things in parallel.
> >
>
> Does this mean a database also has many tasks that need the processor,
> so that not only I/O is important? Can increasing the utilization of a
> multi-core processor significantly improve the performance of the
> whole database system?
>
> >>
> >> 2. Regarding conditions. The article says parallel execution
> >> benefits four kinds of operations: queries; creation of large
> >> indexes; bulk inserts, updates, and deletes; aggregations and
> >> copying. But it looks like parallel execution only suits SMP and
> >> clusters, since they have independent disks and memory and thus
> >> sufficient I/O bandwidth. In one multi-core computer, the cores
> >> must share memory. If you split one operation (such as a scan or
> >> sort) into multiple small jobs, and different jobs need to read
> >> index or data in memory, do they access memory serially?
> >
> > They need not. Also, consider that there can be different level of
> > granularity - running physical algebra operations in parallel, or
> > running each operation internally in parallel. Both are possible.
> >
>
> But if I want to execute one operation (a scan) in parallel, as Oracle
> does, doesn't it still need to access memory serially?
> In that case, is memory a bottleneck?
>
> >>
> >> 3. If I want to achieve parallel query to benefit my quad core
> >> computer, what is the main problem I will meet according to original
> >> MySQL architecture? I have seen this
> >
> > I don't understand what degree of parallelism you need. Is it to
> > improve the execution speed of single queries, or is it to increase
> > throughput (queries per time unit)?
> >
> > These two are very different problems. The latter is being addressed.
> > The former is much harder and I am not aware of any work on it (except
> > some discussions).
> >
> > So what are your needs?
> >
>
> I want to find some information about the former. But it seems few
> people talk about this area.
> Where can I find more information?
>
> >> thread (http://lists.mysql.com/internals/37347). It points out three
> >> problems: malloc in InnoDB, premature locking of the Query Cache,
> >> and key_buffer locking. It also says, "The problems are addressed to
> >> various degrees in the "Google patch", the "Percona patch" (derived
> >> from Google), and 5.4." Has MySQL 5.4 achieved parallel query? I
> >> want to know what they have done to take advantage of multi-core
> >> computers.
> >
> > I guess these are described somewhere. Unfortunately I have not
> > worked on either of these to give you details.
> >
> >
> Don't mind that, from what you mentioned, I have learned much, thanks.
>
> > Best regards,
> >
> > Timour Katchaounov
>