On 4/2/2014 10:16 AM, harryxiyou wrote:
> Hi all,
> I am designing and implementing a database upon a KV storage system. Now,
> I wonder how to realize "select * from table" with high performance if there
> are large amount of KV pairs in a table, which RAM cannot load all the KV pairs
> in the table once. Now, I have a simple solution which would load a
> default cursor (actually, it should be called "kv pair numbers per
> time") number of KV pairs one time (So it may load many times and reply to the
> select all request) in RAM. I am looking forward to receiving some
> suggestions from
> Thanks, Harry.
You appear to face the same resource dilemmas that all storage engines
face: there will be more data stored on disk than you can possibly
buffer in memory. There have been many strategies designed to keep only
the 'most important' or 'most recently used' elements in memory. Please
look to them for inspiration.
To go along with trying to keep a 'warm' cache of data for faster
retrieval, you need to decide how to manage scans of large quantities of
data (such as your SELECT * ... query) and if you want them to replace
the warm buffer you have been cultivating since startup. This may mean
that you skip trying to buffer any scan that uses more than X percent of
your table (you get to pick what the X is) to keep your cache nice and
And you may have multiple cache types, hot, warm, cold, temporary, ...
depending on how you want to manage them. Again, this is a hugely broad
topic with many different algorithms used in both commercial and open
source products. There is no one 'correct' answer for all situations.
Then you get to the subject of index management, file storage formats,
row identification, and how to locate a row on disk. That's an entirely
different beast. Of course, if you store nothing on disk then those
design questions become irrelevant.
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN