List: Internals
From: Denis Pithon
Date: January 26 2001 8:51pm
Subject: MySQL parallel server [long mail]

Hi all,

Yes, this is yet another mail about a MySQL parallel server, and it is
quite long... I have been working for two months on making MySQL run
as a parallel server on a Linux cluster. As you can guess, I have run
into a bunch of problems!

** Context **

At Lineo HA, we provide Linux-based software (Availix) which powers
CompactPCI hardware. Roughly, this hardware features one disk shared
between several active nodes (up to 5).  Each node can be seen as a
diskless Linux PC, and all nodes run the same service (one of httpd,
ftpd... or, why not, mysqld). IPVS runs on a dedicated node (the
controller) and dispatches IP requests to the others. I hope you
understand that for such a hardware solution, replication is
unfortunately not suitable.

First of all, we use GFS, which is really better than NFS (for Linux,
at least). I had to adapt sql/my_lock.c because GFS doesn't support
fcntl() locking (so I use flock() instead). I tested it, and it seems
to work fine (with one server). I compiled the MySQL server with debug
mode on, and I use the following options for the servers:

safe_mysqld --enable-locking --one-thread --flush --safe-mode

Certainly very slow, but speed is not what we are after; we want
availability first...
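
To illustrate the fcntl() to flock() substitution described above,
here is a minimal sketch of a whole-file lock wrapper. It is not the
actual change made to sql/my_lock.c: the names whole_file_lock,
lock_mode and demo_lock_conflict are hypothetical, and the sketch
assumes that whole-file locking is enough (flock() cannot lock byte
ranges the way fcntl() can).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical lock modes; the real code has its own constants. */
enum lock_mode { LOCK_MODE_READ, LOCK_MODE_WRITE, LOCK_MODE_UNLOCK };

/*
 * Whole-file lock built on flock() instead of fcntl().
 * Returns 0 on success, -1 on failure (errno is set by flock()).
 * With nonblocking != 0 a busy lock fails with EWOULDBLOCK
 * instead of waiting.
 */
int whole_file_lock(int fd, enum lock_mode mode, int nonblocking)
{
    int op;

    assert(fd >= 0);
    switch (mode) {
    case LOCK_MODE_READ:   op = LOCK_SH; break;  /* shared    */
    case LOCK_MODE_WRITE:  op = LOCK_EX; break;  /* exclusive */
    case LOCK_MODE_UNLOCK: op = LOCK_UN; break;
    default:               return -1;
    }
    if (nonblocking && mode != LOCK_MODE_UNLOCK)
        op |= LOCK_NB;
    return flock(fd, op);
}

/*
 * Demo: open the same file twice, take a write lock through the first
 * descriptor and check that a non-blocking write lock through the
 * second one is refused. Returns 1 if the conflict was detected,
 * 0 if not, -1 if the file could not be opened.
 */
int demo_lock_conflict(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    int fd2 = open(path, O_RDWR);
    int conflict = 0;

    if (fd1 < 0 || fd2 < 0) {
        if (fd1 >= 0) close(fd1);
        if (fd2 >= 0) close(fd2);
        return -1;
    }
    if (whole_file_lock(fd1, LOCK_MODE_WRITE, 0) == 0) {
        if (whole_file_lock(fd2, LOCK_MODE_WRITE, 1) == -1 &&
            errno == EWOULDBLOCK)
            conflict = 1;
        whole_file_lock(fd1, LOCK_MODE_UNLOCK, 0);
    }
    close(fd1);
    close(fd2);
    unlink(path);
    return conflict;
}
```

Since flock() locks belong to the open file description, two separate
open() calls conflict even inside one process, which is what the demo
relies on. Whether the lock is honoured between nodes depends entirely
on the cluster filesystem.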

The first query in my C test programs is always 'LOCK TABLES' and the
last is 'UNLOCK TABLES'. Moreover, these programs run in a connect -
lock - process - unlock - disconnect loop. Each new connection is
established with a different server (the goal is to simulate a real
running context).
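
The loop described above could be sketched as follows. This is not
the original test program: the client_ops hooks, the host names, the
table name "stack" and the UPDATE statement are all hypothetical. In
a real client the hooks would wrap the MySQL C API (for instance
mysql_real_connect(), mysql_query() and mysql_close()); here they are
replaced by stubs that only record what would be sent, so the sketch
runs without any server.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hooks standing in for a real client library. */
struct client_ops {
    void *(*connect)(const char *host);
    int   (*query)(void *conn, const char *sql);   /* 0 on success */
    void  (*disconnect)(void *conn);
};

/* One pass: connect, LOCK TABLES, work, UNLOCK TABLES, disconnect. */
static int run_cycle(const struct client_ops *ops, const char *host,
                     const char *table, const char *work_sql)
{
    char lock_sql[256];
    void *conn;
    int rc = -1;
    int n = snprintf(lock_sql, sizeof lock_sql,
                     "LOCK TABLES %s WRITE", table);

    if (n < 0 || (size_t)n >= sizeof lock_sql)
        return -1;
    conn = ops->connect(host);
    if (conn == NULL)
        return -1;
    if (ops->query(conn, lock_sql) == 0) {
        rc = ops->query(conn, work_sql);
        if (ops->query(conn, "UNLOCK TABLES") != 0)
            rc = -1;
    }
    ops->disconnect(conn);
    return rc;
}

/* Each iteration talks to the next server in turn (round robin).
 * Returns the number of cycles that succeeded. */
int run_test_loop(const struct client_ops *ops,
                  const char *const *hosts, size_t nhosts,
                  size_t iterations,
                  const char *table, const char *work_sql)
{
    size_t i, ok = 0;

    assert(ops != NULL);
    if (nhosts == 0)
        return 0;
    for (i = 0; i < iterations; i++)
        if (run_cycle(ops, hosts[i % nhosts], table, work_sql) == 0)
            ok++;
    return (int)ok;
}

/* Dry-run stubs: record the traffic instead of contacting a server. */
static char trace[512];

static void trace_add(const char *s)
{
    strncat(trace, s, sizeof trace - strlen(trace) - 1);
}
static void *stub_connect(const char *host)
{
    trace_add("["); trace_add(host);
    return trace;
}
static int stub_query(void *conn, const char *sql)
{
    (void)conn;
    trace_add("|"); trace_add(sql);
    return 0;
}
static void stub_disconnect(void *conn)
{
    (void)conn;
    trace_add("]");
}

/* Runs two cycles over two made-up nodes and returns the trace. */
const char *dry_run(void)
{
    static const struct client_ops ops = {
        stub_connect, stub_query, stub_disconnect
    };
    const char *const hosts[] = { "node1", "node2" };

    trace[0] = '\0';
    run_test_loop(&ops, hosts, 2, 2, "stack",
                  "UPDATE stack SET v = v + 1");
    return trace;
}
```

Keeping the server rotation in run_test_loop() makes it easy to run
several instances of the program at once, each one hitting the nodes
in turn.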

** Tests **

I don't want to flood you with all the details, but updates seem to
work quite well: an update done by one server is seen by the others. I
wrote a C program which fills/empties a stack, and I launched several
instances of it. It ran for more than a day! Great! But this test is
quite simple and doesn't use insert queries.

Unfortunately there are many more problems with insert queries. In
fact, if you create a table with one server and add a row with
another, that row may be invisible to the other servers (including the
one which created the table). I mean that you could get a result like:
                   1  'one'       on the server which did the insert
                   0  (NULL)      on another server

Wait a couple of seconds, do a new select, and you may get the correct
result on both!  If you do a new insert with another server, it
doesn't overwrite the last inserted row, and a select shows both rows
(even if you had 0 (NULL) before)...  And if you create and insert
with the same server, the other nodes see the new row!  Strange,
isn't it? Moreover, the table check often complains after these
inserts. It tells me things like this on the servers which didn't do
the insert:

   Size of datafile is: 0         Should be: 25

Which is wrong: the index file, the data file, both or neither?

But if I wait a few seconds and re-check the table, everything may be
clean... The situation isn't very clear. And repeating the same
actions doesn't give the same results, which is quite annoying!

I tried this test with one PC which runs several servers. The problems
are the same, so GFS seems to be OK.

** What I want to do **

I'm currently looking for a way to force mysqld servers to flush the
index file after every query and/or to re-read the index file before
every query... I know, that's a terrible slowdown for the server, but
I think I have no other choice than to test it.
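
As a rough illustration of the "flush after every write, re-read
before every read" idea, here is a minimal sketch at the plain file
level. It is not MySQL code and says nothing about where such calls
would go inside mysqld: write_through, read_fresh and demo_roundtrip
are hypothetical names. It only shows how to avoid user-space
buffering, by pushing data out with fsync() and reading through the
descriptor with pread() instead of trusting an in-process cache.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes at offset off, then force them to disk.
 * Returns 0 on success, -1 on error. */
int write_through(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    assert(fd >= 0);
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
        off += n;
    }
    return fsync(fd);
}

/* Read exactly len bytes at offset off straight from the descriptor,
 * with no user-space cache in between.
 * Returns 0 on success, -1 on error or if the file is too short. */
int read_fresh(int fd, void *buf, size_t len, off_t off)
{
    char *p = buf;

    assert(fd >= 0);
    while (len > 0) {
        ssize_t n = pread(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/* Demo: a "writer" and a "reader" descriptor on the same file.
 * Returns 1 if the reader sees exactly what the writer flushed. */
int demo_roundtrip(const char *path)
{
    static const char record[] = "1 'one'";
    char seen[sizeof record];
    int wfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    int rfd = open(path, O_RDONLY);
    int ok = 0;

    if (wfd >= 0 && rfd >= 0 &&
        write_through(wfd, record, sizeof record, 0) == 0 &&
        read_fresh(rfd, seen, sizeof seen, 0) == 0)
        ok = (memcmp(record, seen, sizeof record) == 0);
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);
    return ok;
}
```

This does nothing about state a server keeps in its own memory (cached
index blocks, row counts, file sizes); that state would still have to
be dropped or re-read under the table lock.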

I have to explore the source more deeply. I have seen many IO caches
(bad news for me) and even an mmap (sql_mmap.cc), ouch! To check
whether we could use MySQL as a parallel database server, I want to
disable the use of the cache. Is it possible? And is it really useful?

OK, I hope I haven't offended anybody in the MySQL development team! I
know that I'm trying to slow a Formula 1 car down to a snail's pace
:-) But the results I get with MySQL are far better than those of
mSQL, PostgreSQL and current commercial products (DB2, Informix,
Sybase...) which are designed for distributed databases only.

Thanks a lot for your attention!

Denis

-- 
Denis Pithon                             phone  +33 (0) 1 41 40 02 13   
Software Engineer                        fax    +33 (0) 1 41 40 02 01   
Lineo High Availability Group            mail   denis.pithon@stripped
