List:Cluster« Previous MessageNext Message »
From:Raymond Peachey Date:October 16 2013 4:02pm
Subject:Re: Data retrieval causes query times to spike
View as plain text  
Thanks all, unfortunately, the tuning I had done did not help, but adding a
3rd node to offload the change tracking jobs to did. So the two primary
MYSQLD nodes are now serving traffic, a weaker MYSQLD node is running the
CT job.

I am still open for any tuning suggestions (or things to look at) in order
to combat higher load on a MySQL Cluster node. I know 7.3 has significant
improvements on the API connections to the Data nodes, so our equivalent 30
connections per primary MYSQLD node are better utilized. This single aspect
seems to be the change with the most impact based on previous testing we
did load testing MySQL Cluster.

Anyway, thank you all for your assistance!


On Tue, Oct 15, 2013 at 10:36 AM, Johan Andersson <johan@stripped>wrote:

> If the tables are temporary  (if you lose the cluster the table + data
> will be lost) you can do:
>
> set ndb_table_temporary=1;
> CREATE TABLE...
> set ndb_table_temporary=0;
> REPLACE ...
>
>
> Not sure it would give anything but the temporary tables are a bit less
> expensive to create.
>
> Best regards
> Johan
> Severalnines AB
>
>
>
>
> On Tue, Oct 15, 2013 at 4:25 PM, Raymond Peachey <raypeachey@stripped>wrote:
>
>> I agree, unfortunately it is not something that we can change at this
>> point
>> in time. One thing that's strange, is that if we take this particular node
>> that the job is running against out of the load balancer (two sql nodes
>> load balanced), then it runs fine against the other node, until the load
>> becomes a bit unbearable, at which point that second node will start
>> experiencing mysql has gone away as well.
>>
>> Perhaps it's got something to do with the global schema lock you mention?
>> But if that's the case, wouldn't both nodes exhibit the same exact
>> behavior
>> at the same times? The problem is severely lessened when all the load is
>> pushed on to the 2nd node.
>>
>> Unfortunately, increasing the MaxNoOfExecutionThreads did not help. I
>> increased it from 8 to 30, to no avail. What I will be doing today is
>> introducing a 3rd MySQLD node that is dedicated for this job, and leave
>> the
>> original two (much more powerful machines) dedicated to serving traffic.
>>
>>
>> On Tue, Oct 15, 2013 at 1:06 AM, Hendrik Woltersdorf <
>> hendrik.woltersdorf@stripped> wrote:
>>
>> > Creating a table in a cluster SQL node is an expensive operation, even
>> if
>> > the storage engine is not ndbcluster. The process creating the table(s)
>> > needs to acquire the "global cluster schema lock". And that takes some
>> > time. In my experience it is better to avoid creating (temporary)
>> tables in
>> > time critical processes.
>> >
>> >
>> > regards
>> > Hendrik Woltersdorf
>> >
>> > XCOM AG
>> > Softwareentwicklung
>> > Banksystem Front- und Middleoffice
>> > Peter-Breuer-Str. 13
>> > 08056 Zwickau
>> > Germany
>> >
>> > Telefon +49 375 27008-580
>> > Telefax +49 375 27008-25
>> > Mobil   +49 152 22999 584
>> > hendrik.woltersdorf@stripped
>> > http://www.xcom.de
>> >
>> > schrieb Raymond Peachey:
>> >
>> >> No, there was no ALTER TABLE done recently. The job that runs, however,
>> >>
>> >> does create tables, and does indeed do a INSERT .. SELECT from the
>> change
>> >> tracking table and the master tables to the result tables. However,
>> NDB is
>> >> READ COMMITTED/ROW Based logging, so all reads should be consistent
>> reads,
>> >> non locking (correct me if I'm wrong, perhaps it's different with
>> NDB?).
>> >>
>> >> We are currently running version: 5.5.29-ndb-7.2.10.
>> >>
>> >> There are a lot of queries that query against INFORMATION_SCHEMA to
>> >> determine primary keys etc for the change tracking tables. The main
>> >> culprit
>> >> here probably is the query that joins the trigger tables to the master
>> >> tables per job run:
>> >>
>> >> REPLACE result_table
>> >> SELECT <column_list>
>> >> FROM db.master_table AS P LEFT OUTER JOIN db.trigger_table AS CT ON
>> >> p.pk_col = CT.pk_col
>> >> WHERE timestamp_column > 'datetime';
>> >>
>> >> The number of records varies, but we have these jobs running every 5
>> >> minutes, and so it is typically only ~500 records per table, so in this
>> >> case, a total of about ~5000 per job run, and there is a little
>> >> parallelism
>> >> going on in that there's actually two jobs running simultaneously
>> (against
>> >> two separate databases), that make up that 5000 records processed.
>> >>
>> >> I could try removing the frm files and see if that helps.
>> Additionally, I
>> >> will look into increasing the execution threads today to see if that
>> >> helps.
>> >>
>> >>
>> >> On Thu, Oct 10, 2013 at 1:32 PM, Johan Andersson <
>> johan@stripped
>> >> >wrote:
>> >>
>> >>  Hi,
>> >>>
>> >>> Did you recently do an alter table on TABLE_XYZ and is it the same
>> table
>> >>> you run the query on?
>> >>>
>> >>> Not sure what that error message means "NDB online alter table
> commit:
>> >>> distributing TABLE_XYZ timed out. Ignoring..."
>> >>> But usually when i have problems like this, do stop the mysql
> server,
>> >>> remove all the .frm files belonging to NDB tables, and start the
>> server
>> >>> again.
>> >>> This way the server gets new fresh table meta data from the
>> dictionary in
>> >>> the data nodes.
>> >>>
>> >>> Run the query again, does it work?
>> >>>
>> >>> What does the query look like? Is the query trying to acquire
> locks?
>> >>>
>> >>> If you are not locking any records then the 233 error is super
> strange
>> >>> (have you upgraded to the latest 7.2.12 ?).
>> >>> How many records are you trying to read?
>> >>>
>> >>> Best regards
>> >>> johan
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Oct 10, 2013 at 6:47 PM, Raymond Peachey <
>> raypeachey@stripped
>> >>> >wrote:
>> >>>
>> >>>  In the error log, I see this:
>> >>>>
>> >>>> 131009 20:02:53 [ERROR] NDB online alter table commit:
> distributing
>> >>>> TABLE_XYZ timed out. Ignoring...
>> >>>> 131009 20:03:52 [ERROR] Got error 233 when reading table
>> >>>>
>> >>>> I see the first error a lot -- the second error I see
> specifically
>> >>>> during
>> >>>> the trouble periods. 233 I think has to do with reaching a max
> limit
>> of
>> >>>> concurrent operations.
>> >>>>
>> >>>> We do not have TransactionDeadlockDetectionTi**meout defined,
> which
>>
>> >>>> means
>> >>>> it should be at it's default of 1200 ms.
>> >>>>
>> >>>> SHOW GLOBAL VARIABLES LIKE '%timeout%' ;
>> >>>> # Variable_name, Value
>> >>>> connect_timeout, 10
>> >>>> delayed_insert_timeout, 300
>> >>>> innodb_lock_wait_timeout, 50
>> >>>> innodb_rollback_on_timeout, OFF
>> >>>> interactive_timeout, 28800
>> >>>> lock_wait_timeout, 31536000
>> >>>> net_read_timeout, 30
>> >>>> net_write_timeout, 60
>> >>>> slave_net_timeout, 3600
>> >>>> wait_timeout, 28800
>> >>>>
>> >>>> Additionally, here's other settings we have configured, that I
> think
>> may
>> >>>> help:
>> >>>>
>> >>>> MaxNoOfConcurrentOperations=**1000000
>> >>>> MaxNoOfConcurrentTransactions=**16384
>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thu, Oct 10, 2013 at 11:38 AM, Johan Andersson <
>> >>>> johan@stripped
>> >>>>
>> >>>>> wrote:
>> >>>>>
>> >>>> --
>> >>>> MySQL Cluster Mailing List
>> >>>> For list archives: http://lists.mysql.com/cluster
>> >>>> To unsubscribe:    http://lists.mysql.com/cluster
>> >>>>
>> >>>>
>> >>>>
>> >>>>>
>> >
>> > --
>> > MySQL Cluster Mailing List
>> > For list archives: http://lists.mysql.com/cluster
>> > To unsubscribe:    http://lists.mysql.com/cluster
>> >
>> >
>>
>
>

Thread
Data retrieval causes query times to spikeRaymond Peachey7 Oct
  • Re: Data retrieval causes query times to spikeMagnus BlĂ„udd9 Oct
    • Re: Data retrieval causes query times to spikeRaymond Peachey9 Oct
      • Re: Data retrieval causes query times to spikeWagner Bianchi9 Oct
        • Re: Data retrieval causes query times to spikeRaymond Peachey10 Oct
          • Re: Data retrieval causes query times to spikeHendrik Woltersdorf10 Oct
            • Re: Data retrieval causes query times to spikeRaymond Peachey10 Oct
              • Re: Data retrieval causes query times to spikeJohan Andersson10 Oct
                • Re: Data retrieval causes query times to spikeRaymond Peachey10 Oct
                  • Re: Data retrieval causes query times to spikeJohan Andersson10 Oct
                    • Re: Data retrieval causes query times to spikeRaymond Peachey14 Oct
                      • Re: Data retrieval causes query times to spikeHendrik Woltersdorf15 Oct
                        • Re: Data retrieval causes query times to spikeRaymond Peachey15 Oct
                          • Re: Data retrieval causes query times to spikeJohan Andersson15 Oct
                            • Re: Data retrieval causes query times to spikeRaymond Peachey16 Oct
                              • Re: Data retrieval causes query times to spikeRaymond Peachey21 Oct