From: Denis Pithon
Date: January 26 2001 8:51pm
Subject: MySQL parallel server [long mail]
List-Archive: http://lists.mysql.com/internals/398
Message-Id: <20010126215126.A6797@pc9>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

Hi all,

Yes, it's yet another mail about MySQL as a parallel server, and this one is quite long...

I have been working for two months to make MySQL run as a parallel server on a Linux cluster. As you can guess, I have run into a bunch of problems!

** Context **

At Lineo HA, we provide Linux-based software (Availix) which powers CompactPCI hardware. Roughly, this hardware features one disk shared between several active nodes (up to 5). Each node can be seen as a diskless Linux PC, and all nodes run the same service (one of httpd, ftpd... or why not... mysqld). IPVS runs on a dedicated node (the controller) and dispatches incoming IP requests to the others. I hope you can see why replication is unfortunately not suitable for this kind of hardware.

First of all, we use GFS, which is really better than NFS (on Linux, at least). I had to adapt sql/my_lock.c because GFS does not support fcntl() locking (so I use flock() instead). I tested it, and it seems to work fine (with one server).

I compiled the MySQL server with debug mode on, and I use the following options for the servers:

    safe_mysqld --enable-locking --one-thread --flush --safe-mode

Certainly very slow, but we are not after speed; we want availability first...

In my C test programs, the first query is always 'LOCK TABLES' and the last is 'UNLOCK TABLES'. Moreover, these programs run in a connect - lock - process - unlock - disconnect loop. Each new connection goes to a different server (the goal is to simulate real operating conditions).

** Tests **

I don't want to flood you with all the details, but updates seem to work quite well: an update done by one server is seen by the others. I wrote a C program which fills and empties a stack, and I launched several instances of it.
The stack test ran for more than a day! Great! But this test is quite simple and does not use insert queries.

Unfortunately there are many more problems with insert queries. In fact, if you create a table with one server and add a row with another, that row may be invisible to the other servers (and to the server that created the table, too). I mean that you could get a result like:

    1 'one'     with the server which did the insert
    0 (NULL)    with another server

Wait a couple of seconds, run a new select, and you can get the correct result on both! If you then do a new insert with another server, it does not overwrite the last inserted row, and a select shows both rows (even if you had 0 (NULL) before)... And if you create the table and insert with the same server, the other nodes see the new row! Strange, isn't it?

Moreover, the table check is often upset by these inserts. On the servers which did not do the insert, it tells me things like:

    Size of datafile is: 0       Should be: 25

Which one is wrong: the index file, the data file, both or neither? But if I wait a few seconds and re-check the table, everything may be clean... The situation isn't very clear. And repeating the same actions does not give the same results, which is quite annoying!

I tried this test on a single PC running several servers. The problems are the same. GFS seems to be OK.

** What I want to do **

I'm currently looking for a way to force the mysqld servers to flush the index file after every query and / or to re-read the index file before every query... I know, that's a terrible slowdown for the server, but I think I have no choice other than to test it.

I have to explore the source more deeply. I have seen many IO caches (bad news for me) and even an mmap (sql_mmap.cc), ouch! To check whether we could use MySQL as a parallel database server, I want to disable the caches. Is it possible? And would it really be useful?

OK, I hope I have not offended anybody in the MySQL development team!
I know that I'm trying to slow a Formula 1 car down to a snail's pace :-) But the results I get with MySQL are far better than those of mSQL, PostgreSQL and current commercial products (DB2, Informix, Sybase...), which are designed for distributed databases only.

Thanks a lot for your attention!

Denis

-- 
Denis Pithon                        phone +33 (0) 1 41 40 02 13
Software Engineer                   fax   +33 (0) 1 41 40 02 01
Lineo High Availability Group       mail  denis.pithon@stripped
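
To make the flock() substitution described under Context more concrete, here is a minimal sketch in C. It is not the actual change to sql/my_lock.c: whole_file_lock() and demo_exclusion() are hypothetical names, and the sketch assumes Linux, where a flock() lock belongs to the open file description, so two separate open() calls on the same file contend with each other. Note that flock() always locks the whole file, so the byte-range part of an fcntl() lock has no equivalent here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      /* F_RDLCK, F_WRLCK, F_UNLCK, open() */
#include <stddef.h>
#include <sys/file.h>   /* flock(), LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB */
#include <unistd.h>     /* close(), unlink() */

/* Hypothetical helper, NOT the real my_lock() from sql/my_lock.c.
 * Maps an fcntl()-style lock type onto flock().
 * Returns 0 on success, -1 on failure with errno set. */
int whole_file_lock(int fd, int locktype, int nonblocking)
{
    int op;

    switch (locktype) {
    case F_RDLCK: op = LOCK_SH; break;   /* shared (read) lock     */
    case F_WRLCK: op = LOCK_EX; break;   /* exclusive (write) lock */
    case F_UNLCK: op = LOCK_UN; break;   /* release the lock       */
    default:
        errno = EINVAL;
        return -1;
    }
    if (nonblocking)
        op |= LOCK_NB;

    /* Retry if a signal interrupts a blocking request. */
    while (flock(fd, op) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

/* Usage sketch: two descriptors opened separately on the same file stand in
 * for two servers. Returns 1 if the second opener was refused while the
 * first held the write lock and admitted after release, 0 if not, and -1 if
 * the setup itself failed. */
int demo_exclusion(const char *path)
{
    int a, b, refused, admitted;

    assert(path != NULL);
    a = open(path, O_RDWR | O_CREAT, 0600);
    b = open(path, O_RDWR);
    if (a < 0 || b < 0 || whole_file_lock(a, F_WRLCK, 0) != 0) {
        if (a >= 0) close(a);
        if (b >= 0) close(b);
        return -1;
    }

    refused = (whole_file_lock(b, F_WRLCK, 1) == -1);  /* lock is held by a */
    whole_file_lock(a, F_UNLCK, 0);
    admitted = (whole_file_lock(b, F_WRLCK, 1) == 0);  /* now it is free    */

    close(a);   /* closing a descriptor also drops its flock() lock */
    close(b);
    unlink(path);
    return refused && admitted;
}
```

demo_exclusion() only uses two descriptors inside one process as stand-ins for two servers. Whether the same exclusion holds between separate nodes depends on the cluster filesystem and is not shown here.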