List: General Discussion
From: Devananda
Date: July 11 2005 9:19pm
Subject: Re: Storing huge amount of binary data
> First of all, I heard that MySQL does not perform very well when
> table size goes above 1GB. Is this a myth?
I don't have much experience with individual tables being that large, 
though I have used MySQL databases with >100GB of total data with no 
problems at all. In my experience, it's not the table size that degrades 
performance, but the index size - if an index is too large or clumsy, 
then MySQL cannot search it quickly enough. If the indexes are 
well-structured, the data is found quickly and overall performance is 
great regardless of the quantity of data.

> Image table is not a big
> deal, since I can partition/distribute it to ~ 100-200 tables, i.e. by
> table_(id % 100). 
This is a good way to help keep your indexes small and thus keep 
search/seek time to a minimum :)
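To make that concrete, here is a rough sketch of that kind of scheme 
(the table and column names are made up for illustration):

  -- 100 identical tables, image_00 through image_99:
  CREATE TABLE image_42 (
      id   INT UNSIGNED NOT NULL PRIMARY KEY,
      data MEDIUMBLOB NOT NULL
  );

  -- The application routes each lookup by (id % 100),
  -- e.g. id 142 -> image_42:
  SELECT data FROM image_42 WHERE id = 142;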

> However, text information needs to stay in a single
> table (since I need to do queries on it for information) and there
> will be multiple indexes over this information.
A suggestion to help keep your indexes on this table small: if you have 
large text fields that you need to index, you may want to use partial 
(prefix) indexes. For example, if you have a column "filename 
varchar(200)", create an index on filename(50), or whatever length is 
appropriate. This will cut the size of the index file a lot, and even 
though it may produce some duplicate prefix entries in the index, it 
will probably improve overall performance.
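For instance, a minimal sketch (the "files" table and the index name 
are just illustrative):

  CREATE TABLE files (
      id       INT UNSIGNED NOT NULL PRIMARY KEY,
      filename VARCHAR(200) NOT NULL
  );

  -- Index only the first 50 characters of filename, not all 200:
  CREATE INDEX idx_filename ON files (filename(50));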

> And, as you can imagine, I am not sure if mysql can handle something
> like this, and was wondering if you can provide some feedback.
> 
> So my questions are:
> 
> 1. The main question is, do you guys have any experience with this
> much binary and regular data? Do you think MySQL can handle this much
> data in a reliable manner (without corrupting data and/or
> degrading/terrible performance) ?
Based on my experience with >100GB InnoDB databases, with the right 
indexes, MySQL can easily handle 3,000 reads/sec on text/numerical data. 
(This is on a dual Xeon 3GHz machine with 4GB RAM and SCSI RAID-5 
disks.) I've never had any problem with data corruption, but I use 
primarily the InnoDB engine, which is not prone to corruption in the 
same way as MyISAM.

I don't have experience storing large amounts of binary data, so I can't 
say anything about MySQL's performance in that area specifically.

> 
> 2. Can I implement this using regular SCSI disks with regular mysql?
Probably.
> Or do I need advanced solutions such as clustering, replication,
> etc?
No need for clustered or distributed databases, from what you've 
described. Cluster would be useful if you need 100% availability, even 
in the event of hardware failures. Replication, such as single master -> 
multiple slaves, is useful if you have massive reads and minimal writes, 
and _may_ be something you will need.
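If you do end up needing it, the classic single-master setup is mostly 
a couple of my.cnf settings plus one command on each slave - a rough 
sketch, where the host name, user, and password are placeholders:

  # On the master (my.cnf): unique server ID, binary logging enabled.
  [mysqld]
  server-id = 1
  log-bin   = mysql-bin

  # On each slave (my.cnf): just a unique server ID.
  [mysqld]
  server-id = 2

Then, on the slave, point it at the master and start replicating (the 
'repl' user must have the REPLICATION SLAVE privilege on the master):

  CHANGE MASTER TO
      MASTER_HOST='master.example.com',
      MASTER_USER='repl',
      MASTER_PASSWORD='secret',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=4;
  START SLAVE;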

I would recommend using the command SHOW PROCESSLIST, or a tool like 
MyTop, to see what state the client connections spend the most time in. 
(searching the index, or sending the data over the network? if it's the 
latter, then you would benefit from distributing the read load to 
multiple slave servers.)
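For example, from the mysql client:

  -- List every client connection, with the full query text:
  SHOW FULL PROCESSLIST;

  -- The columns to watch are Command (what the thread is doing),
  -- Time (seconds spent so far), and State - e.g. a thread that sits
  -- in "Sending data" is busy reading rows and returning them to the
  -- client.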



Regards,
Devananda vdv