List: Cluster
From: Jim Hoadley
Date: April 4 2005 10:13pm
Subject: Re: error on loading data
Mikael --


> >
> > Your reply contains a wealth of information. Again, many thanks!
> >
> > I'm not all the way there yet. I've applied non-default settings for
> > NoOfDiskPagesToDiskAfterRestartTUP and  
> > NoOfDiskPagesToDiskAfterRestartACC
> > based on your suggestions. Now the error I receive when loading has 
> > changed ;)
> > Now I receive "ERROR 1114 (HY000) at line 909: The table 'nbn_leads' 
> > is full,"
> > which I believe indicates an increase in DataMemory and/or IndexMemory 
> > is in
> > order.
> >
> 
> Correct.
> 
> > If that is true, I'd sure like to understand the reason. I don't want 
> > to set
> > overly-generous parameters to succeed in loading data, but not know 
> > why. If the
> > theory doesn't match reality, I am uncomfortable about going forward.
> >
> 
> There is still an open part in the theory, which is how many TEXT fields
> you have that are bigger than 256 bytes. There will have to be storage
> allocated for them as well (although this should not be very big, since
> the total storage was 250 MByte in some mail if I remember correctly).

I have already accounted for the TEXT fields bigger than 256 bytes. For these
you sum all the characters beyond the first 256 bytes and add that to
DataMemory, right?
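
(In case it helps show how I'm counting, here is the quick sketch I use for
the TEXT columns, in Python. It assumes the first 256 bytes stay in the row
and the remainder is charged in 2000-byte increments, per your rule (2)
quoted further down; the example lengths are made up.)

# Rough per-value TEXT storage estimate. Assumption: 256 bytes stay in-row,
# the overflow is charged in 2000-byte chunks (per Mikael's rule 2 below).
def text_storage(total_len):
    overflow = max(0, total_len - 256)
    chunks = -(-overflow // 2000)      # ceiling division
    return 256 + chunks * 2000

# Made-up example lengths, in bytes:
for n in (180, 700, 5000):
    print(n, "->", text_storage(n), "bytes")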

> Also if I understand the model here I assume that there were 600.000 
> records rather than 60.000
> otherwise there is a factor of 10 that I don't understand. 

Sorry for the confusion. I sent you calculations for a single table. And for
that table I sent 2 versions, one with ~550,000 records (before purging) and
one with ~60,000 (after creative purging). In fact the database in question is
20 tables in all.

> The 
> calculations I did gave
> 1700 MByte of DataMemory + TEXT parts and
> 20 MByte of IndexMemory
 
> So from what I understand the theory should indicate
> DataMemory = 1800M (100 M allocated for big TEXT fields and other 
> overhead)
> IndexMemory = 30M (10M extra to cover for overhead of extra BLOB tables 
> and so forth)
> 
> A more conservative attempt could be made with
> DataMemory = 1900M
> IndexMemory = 100M
> if the first attempt fails. If both fail then we have to return to the
> drawing board :) and recheck our calculations.
>
>
> Rgrds Mikael
> PS: I have cut my comments into the previous email so that others on the
> cluster list have a chance to follow the figures.

I will go back to the drawing board and recalculate. One unanswered question I
have is the overhead for CHARs. For VARCHARs it's size + 2, rounded up to a
multiple of 4. What are the rules for CHARs?
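
(While I wait, this is the little per-column helper I've been using. The
VARCHAR rule is the size + 2 rounded to a 4-byte boundary from above; the
CHAR rule, declared length rounded up to a 4-byte boundary, is only my
assumption and is exactly the part I'm asking about.)

# Per-column storage estimate (sketch). The VARCHAR rule is Mikael's;
# the CHAR rule is only my assumption, pending his answer.
def round4(n):
    return (n + 3) & ~3

def varchar_bytes(declared_len):
    return round4(declared_len + 2)

def char_bytes(declared_len):
    return round4(declared_len)        # assumption!

print(varchar_bytes(255))   # 260 for a hypothetical VARCHAR(255)
print(char_bytes(10))       # 12  for a hypothetical CHAR(10)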

I'll get back to you after recalculating. Thx.

-- Jim
 
> I've checked your calculations and for the most part they are a close 
> shot.
> These are the details that need to be considered.
> 
> 0) All sizes are rounded to the next 4 byte boundary
> 1) VARCHARs take up size + 2
> 2) TEXT takes up 256 bytes (correct in some places), plus increments of
> 2000 bytes for larger TEXT fields.
> 
> This means that the calculation of record size is pretty close to accurate;
> it should be somewhat bigger than what you calculated, 71 bytes bigger
> according to my calculation (=> 2584 bytes).
> 
> However on the index side you'll be happy to hear that a smaller amount 
> of memory is needed.
> 
> 3) The primary key calculation was correct; however, the primary key also
> defines an ordered index. All ordered indexes take up around 10 bytes per
> record, independent of the fields involved in the index. Also, ordered
> index data is kept in DataMemory. IndexMemory is only used for the primary
> key hash index and for the hash part of unique key indexes.
> 
> Thus you need 33 bytes of IndexMemory per record and 7 * 10 bytes of 
> DataMemory per record for ordered
> indexes. =>
> Total: 33 bytes * NoOfRecords = IndexMemory
>        (2584 + 70) bytes = 2654 bytes * NoOfRecords = DataMemory
> 
> NoOfRecords = 60000 =>
> 
> 33 * 60000 = 1,980,000 bytes ≈ 2 MByte
> 2654 * 60000 = 159,240,000 bytes ≈ 152 MByte
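
(Just so you can check that I'm reading the arithmetic the same way, here is
how I reproduce the figures above for that single table; the 2584-byte record,
the 7 ordered indexes at ~10 bytes each and the 33 bytes of hash index per
record are taken straight from your calculation.)

# Reproducing the per-table estimate above (figures from Mikael's mail).
records = 60000

index_memory = 33 * records                  # primary key hash index
data_memory  = (2584 + 7 * 10) * records     # record size + ordered indexes

print(round(index_memory / 1024**2, 1), "MByte IndexMemory")   # ~1.9
print(round(data_memory  / 1024**2, 1), "MByte DataMemory")    # ~151.9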
> 
> 
> > (For example, if I were to set up 2 nodes per host (which halves the 
> > memory
> > requirement per node and allows me to address more than 2GB per NDB on 
> > RHEL 3)
> > and define DataMemory=1700M and IndexMemory=300M, the load might 
> > succeed, but
> > those params are nearly double what they should be theoretically.)
> >
> > So if you could help me step forward in understanding I'd be much 
> > obliged.
> > Ok, here's the latest config.ini. Any further suggestions?
> >
> > [ndbd default]
> > NoOfReplicas= 2
> > MaxNoOfConcurrentOperations=131072
> > DataMemory= 1500M
> > IndexMemory= 193M
> > Diskless= 0
> > DataDir= /var/mysql-cluster
> > TimeBetweenWatchDogCheck=10000
> > HeartbeatIntervalDbDb=10000
> > HeartbeatIntervalDbApi=10000
> > NoOfFragmentLogFiles=64
> > #per Mikael
> > NoOfDiskPagesToDiskAfterRestartTUP=54   #default=40
> > NoOfDiskPagesToDiskAfterRestartACC=8    #default=20
> >
> >
> > #http://lists.mysql.com/cluster/1441
> > MaxNoOfAttributes = 2000
> > MaxNoOfOrderedIndexes = 5000
> > MaxNoOfUniqueHashIndexes = 5000
> >
> > [ndbd]
> > HostName= 10.0.1.199
> >
> >
> > [ndbd]
> > HostName= 10.0.1.200
> >
> >
> > [ndb_mgmd]
> > HostName= 10.0.1.198
> > PortNumber= 2200
> >
> >
> > [mysqld]
> >
> >
> > [mysqld]
> >
> >
> > [tcp default]
> > PortNumber= 2202
> >
> > #-----------------------------------
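
(As a quick sanity check, here is roughly what that config implies per data
node, using the REDO-log formula from Mikael's mail quoted further down
(NoOfFragmentLogFiles * 4 * 16 MByte); a rough sketch that ignores the other
buffers and OS overhead.)

# Rough implications of the config.ini above (sketch only).
no_of_fragment_log_files = 64
data_memory_mb  = 1500
index_memory_mb = 193

redo_log_mb = no_of_fragment_log_files * 4 * 16    # 4096 MByte of REDO log
per_node_mb = data_memory_mb + index_memory_mb     # 1693 MByte Data+Index

print("REDO log per node:  ", redo_log_mb, "MByte")
print("Data+Index per node:", per_node_mb, "MByte")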
> >
> > -- Jim Hoadley
> >    Sr Software Eng
> >    Dealer Fusion Inc
> >
> >
> >
> >
> >
> >
> >
> > --- Mikael Ronström wrote:
> >> Hi Jim,
> >> Great to see your progress. These questions you are coming up with now
> >> are usually the last ones on the way towards a production configuration.
> >>
> >> The problem you're facing is that the REDO log gets filled up before 3
> >> checkpoints have completed. The
> >> default value of the NoOfFragmentLogFiles is 8 which means 8 * 4 * 16
> >> MBytes of REDO log files = 512 MB.
> >> The speed of a local checkpoint is controlled by the parameters
> >>
> >> Quote from manual
> >>
> >> NoOfDiskPagesToDiskAfterRestartTUP
> >>
> >> When executing a local checkpoint the algorithm flushes all data pages
> >> to disk. Merely doing so as quickly as possible without any moderation
> >> is likely to impose excessive loads on processors, networks, and disks.
> >> To control the write speed, this parameter specifies how many pages per
> >> 100 milliseconds are to be written. In this context, a "page" is defined
> >> as 8KB; thus, this parameter is specified in units of 80KB per second.
> >> Therefore, setting NoOfDiskPagesToDiskAfterRestartTUP to a value of 20
> >> means writing 1.6MB of data pages to disk each second during a local
> >> checkpoint. This value includes the writing of UNDO log records for data
> >> pages; that is, this parameter handles the limitation of writes from
> >> data memory. UNDO log records for index pages are handled by the
> >> parameter NoOfDiskPagesToDiskAfterRestartACC. (See the entry for
> >> IndexMemory for information about index pages.)
> >>
> 
=== message truncated ===
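
(Working from the manual excerpt above: a "page" is 8KB and the parameter
counts pages per 100 milliseconds, i.e. units of 80KB per second. Here is a
rough sketch of what my current settings would give; the checkpoint-duration
figure is only my own back-of-the-envelope estimate, not something from the
manual.)

# Checkpoint write rates implied by the manual excerpt (sketch only).
tup_pages = 54        # NoOfDiskPagesToDiskAfterRestartTUP from my config.ini
acc_pages = 8         # NoOfDiskPagesToDiskAfterRestartACC
data_memory_mb = 1500

tup_rate_mb_s = tup_pages * 80 / 1024    # ~4.2 MByte/s of data pages
acc_rate_mb_s = acc_pages * 80 / 1024    # ~0.6 MByte/s of index UNDO pages

# Very rough time to flush all of DataMemory in one local checkpoint:
print(round(data_memory_mb / tup_rate_mb_s), "seconds")   # ~356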



		