Hey Stewart,
>>Actually its MySQL cluster 4.1. Old, but unfortunately I'm not able to
>>upgrade the production servers on a regular basis.
>great! (he says as he's on the cluster team)
Cluster actually works pretty well for us (love the failover offered by
using multiple replicas), up until the point that the database runs
full, actually. Thats why I chose to implement the functions to estimate
the memory usage, as the only other way of knowing is when the database
is about to go titsup (for some reason the management here usually
doesn't like that ;) ). Can't wait to get my hands on a stable 5.1
release and switch to disk-based clustering ... hopefully that'll solve
most of the memory problems (if only for the data memory).
>>I know, but products such as HP OpenView talk SNMP; as its a
>>standardised protocol I figured SNMP would be a good choice (the
>>protocol was specifically meant for this).
>yeap - sounds like you've gone beyond what these products currently
>offer.
btw, I specifically did not look into the one commercial product I found
offering SNMP support, btw. (AdventNet IIRC). This to avoid any claims
of copyright further down the road. Life for programmers seems to be a
minefield these days.
>Do you deal with redundancy? i.e. can you have more than one of your
>processes running and still get sane SNMP stuff out?
Not sure what you mean here; The subagent only connects to the mysql server on localhost
(with a seperate parameter for the cluster connectstring as this does not always run on
the localhost).
If you have multiple hosts in a cluster running mysqld and wish to have them all export
SNMP as well, sure, why not? That should work.
Multiple agents on the same machine probably won't work as the SNMP OID is registered with
SNMPD by the first agent only. This is something internal in the net-snmp library. Not
sure if that'd be useful, in any case.
If you mean threading, I try to avoid this as the Net-SNMP code is notoriously
thread-unsafe.
>within some limits.... there's been some API changes, but possibly
>nothing that would affect a tool such as this. Should just get compile
>failures IIRC.
>
>Feel free to ask about anything you're unsure of.
Ok, the one thing I need some help with later on would be the Windows
implementation; Net-SNMP does provide a windows implementation, and so
does MySQL, but I do not have a Windows development platform here so
getting this to run on Windows is fairly tricky. When I post the code,
perhaps someone would be willing to check it out and see what changes
need to be made to get that working, too?
Since I try to strictly adhere to POSIX/ISOC99 (ANSI C is not entirely
possible) I assume that most Unix platforms shouldn't pose any major issues.
Also, I assume that the MySQL SNMP OID I mentioned in an earlier mail is
currently not used by anyone in MySQL? (this to prevent any nasty
conflicts later on). If not, I may need to register a new one (which
takes approximately 4 weeks).
>coding ndb_size.pl into C - brave man :)
>
>I've made some updates to ndb_size.pl recently... perhaps it's better to
>call out to the perl and have the perl script produce easier to parse
>and mutilate output? I'd welcome your thoughts.
>
>I've been thinking that XML is perhaps an option as then it's easy to go
>to XHTML and people can use it in their MS Office/OpenOffice docs for
>calculations and the like.
Actually its 'fairly' simple; using the mysql_fetch_fields and a SHOW
COLUMNS FROM db.table I get all the info in terms of types, declarations
and sizes I need to walk the list and apply the mysql documentation on
data types and sizes. Getting the VARCHARS and BLOBS right is a pain
(currently I use the 'length' member of the MYSQL_FIELD structure). I
need to come up with something to get more accurate lengths there.
Other than that its aligning (MY_ALIGN, anyone?). ENUMS and SETS require
a bit of parsing on the COLUMN definitions using strstr().
One thing I did see in the MySQL documentation;
http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster-db-definition.html
Section IndexMemory, example;
" For each record, there are 12 bytes of data plus 12 bytes overhead.
Having no nullable columns saves 4 bytes of overhead. "
If I read the perl script correctly, it doesn't take this 4 byte
reduction into account, calculating 16 bytes of overhead per row in all
cases. Am I correct here, or do I have an old ndb_size.pl? It doesn't
seem like much, but with very large tables 4 bytes adds up nicely.
Might be a bit tricky to call out to the perl module, as parsing
requires a lot of processing time. Not sure, but I think it could cost
me more time than the solution I currently use (MYSQL_FIELD structures).
It is possible, of course.
Perhaps its an idea to export something generic through the API?
Probably there's more uses for sizing functions.
>We should be able to (for cluster tables) talk to the BLOB table to get
>some stats we maintain internally.
Actually its calculated for all tables, even those not in a cluster
database. This might be used for people who would like to migrate to a
cluster to see how much memory they need to put in the system, for instance.
If there's an internal set of functions I could call out to to get this
info in a fairly fast manner that would be great, though. If not I'd
like to either stick with using the maximum sizes possible, or (with a
configuration option) allow the admin to choose for accuracy over speed.
>I'm looking forward to seeing it. It sounds like something we'd like to
>incorporate into the distribution, it'd be cool if you/you're employer
>are open to that (we can discuss details later)
I'm very open to incorporating it. Actually it was my intention to GPL
it from the get-go. My employer only gets benefits if this is done as
they get the SNMP functionality we need for monitoring the cluster with
each and every new version of the database that is installed. It saves
us from having to patch mysql every time we do an upgrade.
Also, other people might be able to add functionality to the mod I'm not
even aware of. Especially in the areas we don't use the database for
(yet), like replication.
Can you list any requirements in terms of code other than the ones put
forth in the initial post by Sascha Pachev a few years back?
(http://lists.mysql.com/internals/1048)
I'm cleaning it up anyway so I could just as well take that into account.
Cheers,
Michel
Stewart Smith wrote:
> On Tue, 2006-06-20 at 12:10 +0200, Michel Stam wrote:
>
>>> was this MySQL Cluster (as in NDB) or a bunch of replication slaves?
>>>
>> Actually its MySQL cluster 4.1. Old, but unfortunately I'm not able to
>> upgrade the production servers on a regular basis.
>>
>
> great! (he says as he's on the cluster team)
>
>
>>> there's utilities such as mytop, nagios etc. too.
>>>
>> I know, but products such as HP OpenView talk SNMP; as its a
>> standardised protocol I figured SNMP would be a good choice (the
>> protocol was specifically meant for this).
>>
>
> yeap - sounds like you've gone beyond what these products currently
> offer.
>
>
>>> I'd be interested to see your version - as no doubt would others. Is it
>>> a patch to the mysql server or a separate process?
>>>
>> I'm currently cleaning it up and making it a little more idiot proof.
>> (such as when you kill the mysqld process, or all of the management
>> nodes in a cluster).
>>
>
> sure, understandable.
>
>
>> The daemon is a seperate process, I thought this would be better.
>>
>
> yes, I think so too.
>
> Do you deal with redundancy? i.e. can you have more than one of your
> processes running and still get sane SNMP stuff out?
>
>
>> Especially since I use the standard mysqlclient libs, and the ndbclient lib.
>>
>
> great!
>
>
>> This way it should be possible to compile it on a version anywhere from
>> 4.1 to 5.1beta (I develop on 4.1, but I'll try to check this before I
>> submit the code).
>>
>
> within some limits.... there's been some API changes, but possibly
> nothing that would affect a tool such as this. Should just get compile
> failures IIRC.
>
> Feel free to ask about anything you're unsure of.
>
>
>> What it currently does is export a large number of the show global and
>> show status commands, list the cluster state (if configured) in the same
>> way ndb_mgm does, show the process list, the table status for all tables
>> in the system, and as an extra, I've coded the ndb_size.pl script into C
>> which gives a fairly accurate resource usage for tables if they were
>> stored on a cluster database (shown per table, unfortunately not
>> databases). Including BLOBS and ENUMS.
>>
>
> coding ndb_size.pl into C - brave man :)
>
> I've made some updates to ndb_size.pl recently... perhaps it's better to
> call out to the perl and have the perl script produce easier to parse
> and mutilate output? I'd welcome your thoughts.
>
> I've been thinking that XML is perhaps an option as then it's easy to go
> to XHTML and people can use it in their MS Office/OpenOffice docs for
> calculations and the like.
>
>
>> LOBS are currently giving me a bit of an issue though, as parsing the
>> column sizes drastically degrades performance (for example by selecting
>> the average sizes using a query such as SELECT AVG( LENGTH( col ) ) FROM
>> database.table like the perl script does). MYSQL_FIELD structures only
>> provide me the maximum size of the BLOB column, which would report
>> memory usage way off from the actual situation. If anyone has any
>> thoughts on this, please let me know.
>>
>
> We should be able to (for cluster tables) talk to the BLOB table to get
> some stats we maintain internally.
>
>
>> As the underlaying SNMP engine I use Net-SNMP (see
>> http://www.net-snmp.org/). I hope that is not a problem.
>>
>
> I had a look at this a little while ago and it seemed like a sane choice
> (seems to be the standard).
>
>
>> I'll keep you posted.
>>
>
> great!
>
> I'm looking forward to seeing it. It sounds like something we'd like to
> incorporate into the distribution, it'd be cool if you/you're employer
> are open to that (we can discuss details later)