List: Cluster
From: Jonas Oreland
Date: March 4 2005 10:15am
Subject: Re: Max number of open files exceeded / Error while reading REDO log / I can't start my cluster!

Hi Alex,

Thanks for your effort and time.

Can you:
1) Create a bug report, so that other people with the same problem know
that we're working on it.

Try to be as descriptive as possible.

2) Pass _all_ files from this crash to the bug report.

I'm particularly interested in the entire trace log.

3) Is it in any way reproducible?
    I.e. is there a point during a system restart where you
    always get this? If so, I would be very grateful if you could "save"
    the entire filesystem, so that I could maybe look at it "on site".
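
For point 2, one possible way to gather the files is sketched below in
Python. It assumes the per-node files all sit in one data directory and
follow the ndb_<nodeid>_* naming seen in this thread (for example
ndb_3_trace.log.7 and ndb_3_out.log). The function name and the archive
name are made up for illustration. Directories such as a node's
filesystem directory are skipped and would have to be saved separately.

```python
import tarfile
from pathlib import Path


def bundle_node_files(data_dir, node_id, archive_path):
    """Pack every regular file named ndb_<node_id>_* into a gzip tarball.

    Returns the sorted list of file names that were added.
    """
    data_dir = Path(data_dir)
    # Assumed naming pattern, inferred from the file names in this thread.
    # The list is built before the archive is opened for writing.
    members = sorted(
        p for p in data_dir.glob(f"ndb_{node_id}_*") if p.is_file()
    )
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in members:
            # Store only the bare file name, not the full directory path.
            tar.add(path, arcname=path.name)
    return [p.name for p in members]
```

For node 3 in this thread the call would look like
bundle_node_files("/var/lib/mysql-cluster", 3, "crash_node3.tar.gz").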

/Jonas

Alex Davies wrote:
> Dear Mikael and others,
> 
> The problem has recurred, so here is the info that you asked for:
> 
> Error log on node 2:
> Date/Time: Tuesday 1 Mars 2005 - 10:30:44
> Type of error: error
> Message: Max number of open files exceeded
> Fault ID: 2806
> Problem data:
> Object of reference:  Ndbfs::createAsyncFile
> ProgramName: /usr/local/mysql/bin/ndbd
> ProcessID: 11832
> TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.7
> ***EOM***
> 
> Contents of bottom of /var/lib/mysql-cluster/ndb_3_trace.log.7:
> --------------- Signal ----------------
> r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 27743391 gsn: 164 "CONTINUEB" prio: 1
> s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 27743388 length: 1 trace: 0
> #sec: 0 fragInf: 0
>  Scanning the memory channel again with no delay
> --------------- Signal ----------------
> r.bn: 248 "DBACC", r.proc: 3, r.sigId: 27743390 gsn: 259 "FSOPENCONF" prio: 1
> s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 27743388 length: 3 trace: 4
> #sec: 0 fragInf: 0
>  UserPointer: 2
>  FilePointer: 23675
> --------------- Signal ----------------
> r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 27746976 gsn: 164 "CONTINUEB" prio: 0
> s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 27746974 length: 1 trace: 0
> #sec: 0 fragInf: 0
>  Scanning the memory channel every 10ms
> --------------- Signal ----------------
> r.bn: 252 "QMGR", r.proc: 3, r.sigId: 27746972 gsn: 164 "CONTINUEB" prio: 0
> s.bn: 252 "QMGR", s.proc: 3, s.sigId: 27746970 length: 1 trace: 0
> #sec: 0 fragInf: 0
>  H'00000004
> 
> Please let me know if you need any more info (or want physical access
> to the machines) to help you debug this.
> 
> With many thanks,
> 
> Alex
> 
> 
> 
> On Tue, 1 Mar 2005 11:07:03 +0100, Mikael Ronström <mikael@stripped> wrote:
> 
>>Hi Alex,
>>
>>On 2005-03-01 at 10:34, Alex Davies wrote:
>>
>>
>>>Dear Mikael,
>>>
>>>I have since deleted the folder and reinitialised the cluster which
>>>was a very significant pain! If this issue reoccurs (which I suspect
>>>that it might, since I have had issues like this before) I will send
>>>the whole trace along. I was slightly worried that it would make my
>>>email too large but I am sure I can find a way of getting it in.
>>>
>>>With many thanks,
>>>
>>
>>You don't need to send the entire trace, only the last part of
>>ndb_x_out.log, where x is the node id of the crashing node, as below.
>>There is a printout of all files and whether each is open or closed.
>>This will show how many open files there are and whether they are all
>>already opened. Actually, the trace for the crashing node might be
>>useful as well.
>>
>>Rgrds Mikael
>>
>>
>>>Alex
>>>
>>>
>>>On Tue, 1 Mar 2005 10:31:37 +0100, Mikael Ronström <mikael@stripped> wrote:
>>>
>>>>Hi Alex,
>>>>
>>>>On 2005-02-28 at 17:11, Alex Davies wrote:
>>>>
>>>>
>>>>>I have now had to delete the contents of the cluster and restart it,
>>>>>and while I was at it I upgraded to 4.1.10. However, as soon as I
>>>>>ran a backup, I got the Max number of open files error again:
>>>>>
>>>>>Date/Time: Monday 28 February 2005 - 15:23:53
>>>>>Type of error: error
>>>>>Message: Max number of open files exceeded
>>>>>Fault ID: 2806
>>>>>Problem data:
>>>>>Object of reference:  Ndbfs::createAsyncFile
>>>>>ProgramName: ndbd
>>>>>ProcessID: 14352
>>>>>TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.1
>>>>>***EOM***
>>>>>
>>>>
>>>>Right before this crash there is a printout to the ndb_3_out.log file
>>>>giving info about all currently open files. Can you send this output?
>>>>It will help us understand what's going on in your system.
>>>>
>>>>Rgrds Mikael
>>>>
>>>>
>>>>>config.ini looks like this:
>>>>>
>>>>>[NDBD DEFAULT]
>>>>>DataMemory: 500MB
>>>>>IndexMemory: 250MB
>>>>>MaxNoOfAttributes: 10000
>>>>>MaxNoOfTables: 800
>>>>>MaxNoOfUniqueHashIndexes: 3000
>>>>>NoOfReplicas=2
>>>>>MaxNoOfOrderedIndexes=1024
>>>>>MaxNoOfConcurrentTransactions=300
>>>>>MaxNoOfConcurrentOperations=700
>>>>>
>>>>>The servers have 2GB of RAM each and plenty of disk space.
>>>>>
>>>>>Any guesses as to what I can do to fix this error?
>>>>>
>>>>>With many thanks,
>>>>>
>>>>>Alex
>>>>>
>>>>>--
>>>>>Alex Davies // http://www.davz.net
>>>>>
>>>>>This email and any files transmitted with it are confidential and
>>>>>intended solely for the use of the individual or entity to whom they
>>>>>are addressed. If you have received this email in error please notify
>>>>>the sender immediately by e-mail and delete this e-mail permanently.
>>>>>
>>>>>
>>>>
>>>>Mikael Ronström, Senior Software Architect
>>>>MySQL AB, www.mysql.com
>>>>
>>>>Jumpstart your cluster:
>>>>http://www.mysql.com/consulting/packaged/cluster.html
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
> 
> 
> 
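
Mikael's request in the quoted thread for "the last part of
ndb_x_out.log" could be handled with something like the Python sketch
below. The function name and the default of 200 lines are arbitrary
choices for illustration. How many lines are actually needed to cover the
open-files printout is not stated in the thread.

```python
from collections import deque


def tail_lines(path, count=200):
    """Return the last `count` lines of a text file as a list of strings."""
    # errors="replace" keeps reading if the log holds stray non-UTF-8 bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent `count` lines,
        # so a large log is never held in memory in full.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For the crashing node in this thread the input would be
/var/lib/mysql-cluster/ndb_3_out.log, assuming the out log sits next to
the trace file named in the error log.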


-- 
Jonas Oreland, Software Engineer
MySQL AB, www.mysql.com