Your reply contains a wealth of information. Again, many thanks!
I'm not all the way there yet. I've applied non-default settings for
NoOfDiskPagesToDiskAfterRestartTUP and NoOfDiskPagesToDiskAfterRestartACC
based on your suggestions. Now the error I receive when loading has changed ;)
Now I receive "ERROR 1114 (HY000) at line 909: The table 'nbn_leads' is full,"
which I believe indicates an increase in DataMemory and/or IndexMemory is in
If that is true, I'd sure like to understand the reason. I don't want to set
overly-generous parameters to succeed in loading data, but not know why. If the
theory doesn't match reality, I am uncomfortable about going forward.
(For example, if I were to set up 2 nodes per host (which halves the memory
requirement per node and allows me to address more than 2GB per NDB on RHEL 3)
and define DataMemory=1700M and IndexMemory=300M, the load might succeed, but
those params are nearly double what they should be theoretically.)
So if you could help me step forward in understanding I'd be much obliged.
Ok, here's the latest config.ini. Any further suggestions?
MaxNoOfAttributes = 2000
MaxNoOfOrderedIndexes = 5000
MaxNoOfUniqueHashIndexes = 5000
-- Jim Hoadley
Sr Software Eng
Dealer Fusion Inc
--- Mikael Ronstr
> Hi Jim,
> Great to see your progress. These questions you are coming up with now
> is usually the last ones on the
> line towards a production configuration.
> The problem you're facing is that the REDO log gets filled up before 3
> checkpoints have completed. The
> default value of the NoOfFragmentLogFiles is 8 which means 8 * 4 * 16
> MBytes of REDO log files = 512 MB.
> The speed of a local checkpoint is controlled by the parameters
> Quote from manual
> When executing a local checkpoint the algorithm flushes all data pages
> to disk. Merely doing as quickly as possible without any moderation is
> likely to impose excessive loads on processors, networks, and disks.
> To control the write speed, this parameter specifies how many pages
> per 100 milliseconds are to be written. In this context, a "page" is
> defined as 8KB; thus, this parameter is specified in units of 80KB per
> second. Therefore, setting NoOfDiskPagesToDiskAfterRestartTUP to a
> value of 20 means writing 1.6MB of data pages to disk each second
> during a local checkpoint. This value includes the writing of UNDO log
> records for data pages; that is, this parameter handles the limitation
> of writes from data memory. UNDO log records for index pages are
> handled by the parameter NoOfDiskPagesToDiskAfterRestartACC. (See the
> entry for IndexMemory for information about index pages.)
> In short, this parameter specifies how quickly local checkpoints will
> be executed, and operates in conjunction with NoOfFragmentLogFiles,
> DataMemory, and IndexMemory.
> The default value is 40 (3.2MB of data pages per second).
> > [NDBD]NoOfDiskPagesToDiskAfterRestartACC
> This parameter uses the same units as
> NoOfDiskPagesToDiskAfterRestartTUP and acts in a similar fashion, but
> limits the speed of writing index pages from index memory.
> The default value of this parameter is 20 index memory pages per
> second (1.6MB per second).
> So in your configuration the time to checkpoint the ACC index memory
> data is around 61M/1.6M seconds = 38 seconds (Should not be a problem,
> 2 minutes should not fill up the 512 MBytes of REDO log files (would
> require 4.7 MByte of log file generated per second, REDO log has to
> 3 checkpoints).
> The DataMemory should take around 639M / 3.2M seconds = 200 seconds.
> Thus the REDO log has to keep 600 seconds of data. A good guess
> is that you get this error then before 10 minutes have passed.
> So what should your parameters be.
> It depends on how much disk and cpu you want to spend on checkpoint in
> operation compared to how fast you want the system restart to take.
> The slower your checkpoints are the more REDO log has to be processed
> at a system restart.
> My thumbnail rule is to set a checkpoint to take around 5 minutes. In
> your case with 1276 M of DataMemory this would require setting
> NoOfDiskPagesToDiskAfterRestartTUP to (1276MByte / 300 seconds) / 80
> kBytes /second = 53.16 => 54 (4.25 MByte per second)
> NoOfDiskPagesToDiskAfterRestartACC to (175M / 300 seconds) / 80
> kBytes/second = 7.29 => 8 (640 kBytes per second)
> In this case the total load on the disk per node would be 4.89 MBytes
> per second for writing checkpoints.
> If you allow checkpoints to take 10 minutes instead the checkpoint
> speed is half (27 and 4 => 2.12 MBytes + 320 kBytes = 2.44 MBytes /
> When setting NoOfFragmentLogFiles I usually use a thumbnail rule to
> allow for 6 checkpoints (usually allocating twice as much disk space
> for REDO
> logs isn't a big deal) to have a good safety margin.
> Using 5 minutes this means that the REDO log file has to keep 30
> minutes of REDO logs. How much REDO logs are produced during 30 minutes
> very dependent on the application.
> There is 72 bytes of overhead per delete/update/insert of one record
> plus there is an additional overhead of 4 bytes per field changed and
> primary key is also stored in the REDO log. Thus a REDO log record size
> is = 72 + 4 * no_of_fields_updated + size_of_fields_updated +
> As an example an insert in the dog table mentioned in one of your
> previous mails would consume
> 72 + 4 * 5 + (4 + 24 + 52 + 256 + 12) + 4 = 444 bytes
> Usually at max speed the ndbd nodes can write around 5-15 MByte of REDO
> log per second. It is not likely that you come close
> to this speed. I would guess that you are more in the range of 2
> Mbytes/second. Thus setting
> NoOfFragmentLogFiles to 64 would handle this (= 4 GByte of REDO log).
> Actually it is pretty likely a smaller figure would do as well but I
> use some margins to avoid the 410 problem.
> Rgrds Mikael
> 2005-04-04 kl. 04.12 skrev Jim Hoadley:
> > Reading the documentation and various posts in this list (thank you
> > Michael and Pekka especially), I have calculated the size of my
> > datbase as
> > data size=1276M and index size=175M (details available upon request).
> > Using these formulas and given that I have 2 replicas and 2 data
> > nodes, I
> > arrived
> > at these settings for config.ini:
> > DataMemory = (data size + (for each table(number of records * ordered
> > indexes * 10)) * NoOfReplicas) / number of data nodes (Size * 1.1) *
> > NoOfReplicas / NoOfDataNodes
> > IndexMemory = (for each table (for each primary or unique key (size of
> > attribute + 25))) * NoOfReplicas / number of data nodes
> > DataMemory=1404M
> > IndexMemory=193M
> > While loading my data (using 'mysql test < db.dump'), the process stops
> > midway (on table 8 of 20). I've calculated, and at that point, only
> > 639M
> > of DataMemory and 61M of IndexMemory should be required, therefore it's
> > unlikely that the error is caused by having either DataMemory or
> > IndexMemory set too low.
> > In fact, the error received is "ERROR 1297 (HY000) at line 2777438: Got
> > temporary error 410 'REDO log buffers overloaded, consult online manual
> > (increase RedoBuffer, and|or decrease TimeBetweenLocalCheckpoints,
> > and|or
> > increase NoOfFragmentLogFiles)' from ndbcluster". I have RedoBuffer,
> > TimeBetweenLocalCheckpoints and NoOfFragmentLogFiles set to their
> > defaults.
> > What should I set these to?
> > I'm running RHEL 3 (2.4.21-27.0.2.ELsmp) on 2 Dell PowerEdge 1850s
> > each w/6GB RAM, and here's my current config.ini:
> > #----------------------------
> > [ndbd default]
> > NoOfReplicas= 2
> > MaxNoOfConcurrentOperations=131072
> > DataMemory= 1404M
> > IndexMemory= 193M
> > Diskless= 0
> > DataDir= /var/mysql-cluster
> > TimeBetweenWatchDogCheck=10000
> > HeartbeatIntervalDbDb=10000
> > HeartbeatIntervalDbApi=10000
> > MaxNoOfAttributes = 2000
> > MaxNoOfOrderedIndexes = 5000
> > MaxNoOfUniqueHashIndexes = 5000
> > [ndbd]
> > HostName= 10.0.1.199
> > [ndbd]
> > HostName= 10.0.1.200
> > [ndb_mgmd]
> > HostName= 10.0.1.198
> > PortNumber= 2200
> > [mysqld]
> > [mysqld]
> > [tcp default]
> > PortNumber= 2202
> > #----------------------------
> > Any help would be appreciated. Thanks.
> > -- Jim Hoadley
> > Sr Software Eng
> > Dealer Fusion Inc
> > __________________________________
> > Yahoo! Messenger
> > Show us what our next emoticon should look like. Join the fun.
> > http://www.advision.webevents.yahoo.com/emoticontest
> > --
> > MySQL Cluster Mailing List
> > For list archives: http://lists.mysql.com/cluster
> > To unsubscribe:
> > http://lists.mysql.com/cluster?unsub=1
=== message truncated ===
Show us what our next emoticon should look like. Join the fun.