List: General Discussion
From: Chris Knipe
Date: July 26 2013 7:30am
Subject: Re: hypothetical question about data storage
Hi All,

Thanks for the responses, and I do concur.  I was taking a stab in the
dark so to speak.

We are currently working with our hosting providers and will be
introducing a number of small iSCSI SANs to spread the storage
structure over a multitude of disks.  This is something that needs
to be addressed from a systems perspective rather than an
architectural one.

SSDs (or FusionIO and the like) are unfortunately still way too expensive
for the capacity that we require (a good couple of TB), so mechanical
disks it will have to be.  However, with the use of SANs as we hope,
we should be able to go up from 4 to over 64 spindles whilst still
being able to share the storage and have redundancy.

Many thanks for the input and feedback.

--
C


On Fri, Jul 26, 2013 at 9:23 AM, Johan De Meersman <vegivamp@stripped> wrote:
> Hey Chris,
>
> I'm afraid that this is not what databases are for, and the first thing you'll likely
> run into is the number of concurrent connections.
>
> This is typically something you should really tackle from a systems perspective. Seek
> times are dramatically improved on SSD or similar storage - think FusionIO cards, but
> there's also a couple of vendors (Violin comes to mind) who provide full-blown SSD SANs.
>
> If you prefer staying with spinning disks, you could still improve the seeks by
> focusing on the inner cylinders and potentially by using variable sector formatting.
> Again, there are SANs that do this for you.
>
> Another minor trick is to turn off access timestamp updates when you mount the
> filesystem (noatime).
>
> Also benchmark different filesystems; there are major differences between them. I've
> heard XFS being recommended, but I've never needed to benchmark for seek times myself.
> We're using IBM's commercial GPFS here, which is good with enormous amounts of huge files
> (media farm here), but I'm not sure how it'd fare with smaller files.
>
> Hope that helps,
> Johan
>
> ----- Original Message -----
>> From: "Chris Knipe" <savage@stripped>
>> To: mysql@stripped
>> Sent: Thursday, 25 July, 2013 11:53:53 PM
>> Subject: hypothetical question about data storage
>>
>> Hi all,
>>
>> We run a VERY I/O-intensive file application service.  Currently, our
>> problem is that our disk spindles are being completely killed by
>> slow SEEK times on the hard drives (NOT physical read/write
>> speeds).
>>
>> We have a directory structure where the files are stored based on
>> the MD5 checksum of the file name, i.e.
>> /0/00/000/000044533779fce5cf3497f87de1d060
>> The majority of these files are between 256K and 800K, with the ODD
>> exception (say less than 15%) being more than 1M but no more than 5M
>> in size.  The content of the files is pure text (MIME encoded).
>>
>> We believe that storing these files in an InnoDB table may actually
>> give us better performance:
>> - There is one large file that is being read/written, instead of
>>   BILLIONS of small files
>> - We can split the structure so that each directory (4096 in total)
>>   sits in its own database
>> - We can move the databases as load increases, which means that we
>>   can potentially run 2 physical database servers, each with 2048
>>   databases
>> - It's easy to move / migrate the data thanks to MySQL replication;
>>   the same can be said for redundancy of the data
>>
>> We are more than likely looking at BLOB columns of course, and we
>> need to read/write from the DB in excess of 100 Mbit/s.
>>
>> Would the experts consider something like this as being feasible?  Is
>> it worth it to go down this avenue, or are we just going to run into
>> different problems?  If we are facing different problems, what can we
>> possibly expect to go wrong here?
>>
>> Many thanks, and I look forward to any input.
>>
>
> --
> Unhappiness is discouraged and will be corrected with kitten pictures.
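
For reference, here is a minimal sketch of the directory layout described
in the original message. It assumes the path is built from the first one,
two and three hex characters of the MD5 digest of the file name, which
matches the example path and the figure of 4096 leaf directories. The
function names, the UTF-8 encoding of the file name and the even split
across two servers are illustrative assumptions, not the actual
implementation.

```python
import hashlib

# Three hex characters at the deepest level give 16**3 = 4096 leaf directories.
LEAF_DIRS = 16 ** 3


def path_for_digest(digest: str, root: str = "") -> str:
    """Build <root>/d/dd/ddd/<digest> from a hex digest (assumed layout)."""
    return f"{root}/{digest[:1]}/{digest[:2]}/{digest[:3]}/{digest}"


def path_for_name(file_name: str, root: str = "") -> str:
    """Hash the file name with MD5 and map it to its storage path."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return path_for_digest(digest, root)


def server_for_digest(digest: str, servers: int = 2) -> int:
    """Map a leaf directory (0..4095) onto one of `servers` equal ranges.

    Hypothetical helper: with servers=2, each server gets 2048 directories.
    """
    bucket = int(digest[:3], 16)
    return bucket * servers // LEAF_DIRS


if __name__ == "__main__":
    example = "000044533779fce5cf3497f87de1d060"
    print(path_for_digest(example))
    print(server_for_digest(example))
```

Splitting by contiguous ranges keeps each top-level directory tree on a
single server, so moving load later means moving whole subtrees.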



-- 

Regards,
Chris Knipe