List: General Discussion
From: Shawn Green
Date: June 20 2012 9:30pm
Subject: Re: Indexing about 40 Billion Entries
On 6/20/2012 5:45 AM, Christian Koetteritzsch wrote:
> Hi guys,
>
> As the title says I'm trying to index 40 billion entries with two
> indexes on a server with 16 cores and 128GB RAM. The table is the one
> below and it is a myisam table. The *.myd file is about 640GB
>
> DROP TABLE IF EXISTS `l4_link`;
> CREATE TABLE  `l4_link` (
>    `ruid1` int NOT NULL,
>    `ruid2` int NOT NULL,
>    `overlap` int NOT NULL
> );
>
> I need an index for ruid1 and for ruid2.
>

Actually, based on your proposed query, I believe you want a single 
composite index on (ruid1, ruid2), not separate indexes on each column.

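As a sketch (the index name here is made up, and this assumes the query really does filter on both columns), the composite index would be built in one pass rather than two:

```sql
-- Hypothetical statement: one composite index covering both columns,
-- so MyISAM sorts the key file once instead of building two indexes.
ALTER TABLE l4_link ADD INDEX idx_ruid1_ruid2 (ruid1, ruid2);
```

On a table this size, each additional index is another full sort pass, so collapsing two indexes into one also roughly halves the repair work.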

> The status for this table is the following:
>
>             Name: l4_link
>           Engine: MyISAM
>          Version: 10
>       Row_format: Fixed
>             Rows: 39806500262
>   Avg_row_length: 17
>      Data_length: 676710504454
> Max_data_length: 4785074604081151
>     Index_length: 1024
>        Data_free: 0
>   Auto_increment: NULL
>      Create_time: 2012-06-19 14:51:29
>      Update_time: 2012-06-19 16:26:35
>       Check_time: NULL
>        Collation: utf8_general_ci
>         Checksum: NULL
>   Create_options:
>          Comment:
>
> The variables for myisam are the following:
> mysql> show global variables like '%myisam%';
> +---------------------------+----------------------+
> | Variable_name             | Value                |
> +---------------------------+----------------------+
> | myisam_data_pointer_size  | 6                    |
> | myisam_max_sort_file_size | 9223372036853727232  |
> | myisam_mmap_size          | 18446744073709551615 |
> | myisam_recover_options    | BACKUP               |
> | myisam_repair_threads     | 1                    |
> | myisam_sort_buffer_size   | 8388608              |
> | myisam_stats_method       | nulls_unequal        |
> | myisam_use_mmap           | OFF                  |
> +---------------------------+----------------------+
>
> 8 rows in set (0.00 sec)
>
> The temp folder has about 16tb free space.
>
> When I start the indexing process, it copies the 640GB into a temp file
> and then starts with "repair with keycache".
> On the internet I found that if it says "repair with keycache" you should
> increase "myisam_max_sort_file_size", but this didn't work.
> It still says "repair with keycache" after it copied the data to the
> temp dir.
>
> I hope you have an idea how to fix this.
>

Try maxing out the following settings:

   myisam_sort_buffer_size
   key_buffer_size

http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_myisam_sort_buffer_size

http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_key_buffer_size

You will need to experiment to find the largest values your platform 
will permit.
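A minimal sketch of raising both settings; the sizes below are illustrative guesses for a 128GB machine, not tuned recommendations, and the actual ceilings depend on your platform:

```sql
-- Illustrative values only; adjust to what your platform permits.
-- A larger sort buffer lets MyISAM use "repair by sorting" instead of
-- the much slower "repair with keycache".
SET GLOBAL myisam_sort_buffer_size = 8 * 1024 * 1024 * 1024;   -- 8GB
SET GLOBAL key_buffer_size         = 32 * 1024 * 1024 * 1024;  -- 32GB
```

The goal is to get the server to choose "repair by sorting": when the sort buffer and sort file limits are large enough for the key data, the index build becomes a filesort rather than row-by-row key insertion through the key cache.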

-- 
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN


Thread
Indexing about 40 Billion Entries (Christian Koetteritzsch, 20 Jun)
  • Re: Indexing about 40 Billion Entries (Ananda Kumar, 20 Jun)
    • RE: Indexing about 40 Billion Entries (Rick James, 20 Jun)
      • Re: Indexing about 40 Billion Entries (Christian Koetteritzsch, 20 Jun)
        • Re: Indexing about 40 Billion Entries (Christian Koetteritzsch, 20 Jun)
        • RE: Indexing about 40 Billion Entries (Rick James, 20 Jun)
          • Re: Indexing about 40 Billion Entries (Christian Koetteritzsch, 21 Jun)
            • Re: Indexing about 40 Billion Entries (mos, 21 Jun)
              • Re: Indexing about 40 Billion Entries (Christian Koetteritzsch, 21 Jun)
              • RE: Indexing about 40 Billion Entries (Rick James, 21 Jun)
          • UNION and ORDER BY (hsv, 22 Jun)
            • RE: UNION and ORDER BY (Rick James, 22 Jun)
  • Re: Indexing about 40 Billion Entries (Shawn Green, 20 Jun)
    • RE: Indexing about 40 Billion Entries (Rick James, 20 Jun)
  • Re: Indexing about 40 Billion Entries (Brent Clark, 21 Jun)