List:General Discussion« Previous MessageNext Message »
From:Ananda Kumar Date:June 20 2012 11:36am
Subject:Re: Indexing about 40 Billion Entries
View as plain text  
looks like the value that you give for myisam_max_sort_size is not enough
for the index creation and hence it doing a "REPAIR WITH KEYCACHE"

Use the below query to set the min values required for myisam_max_sort_size
to avoid "repair with keycache"


select
        a.index_name as index_name,
        ceil(
        (select count(*) from `db_name`.`table_name`) *(
        @@global.myisam_data_pointer_size +
        sum(ifnull(1 + `a`.`sub_part` *
(`b`.`character_octet_length`/`b`.`character_maximum_length`),
                ifnull(1 + `b`.`character_octet_length`,
                        case
                                when `b`.`data_type` = 'tinyint' then 1
                                when `b`.`data_type` = 'smallint' then 2
                                when `b`.`data_type` = 'mediumint' then 3
                                when `b`.`data_type` = 'int' then 4
                                when `b`.`data_type` = 'bigint' then 8
                                when `b`.`data_type` = 'float' then 4
                                when `b`.`data_type` = 'double' then 8
                                when `b`.`data_type` = 'real' then 8
                                when `b`.`data_type` = 'bit' then 8
                                when `b`.`data_type` = 'date' then 3
                                when `b`.`data_type` = 'datetime' then 8
                                when `b`.`data_type` = 'timestamp' then 4
                                when `b`.`data_type` = 'time' then 3
                                when `b`.`data_type` = 'year' then 1
                                when `b`.`data_type` = 'enum' then 2
                                when `b`.`data_type` = 'set' then 8
                                when `b`.`data_type` = 'decimal' then 8

                        end
                        )

        )
        + if(`a`.`nullable`='YES',1,0)
        + if(`b`.`character_octet_length` >=255,2,0)
        ))/1048576)*1048576  as `index_size`
from
`information_schema`.`statistics` `a`,
`information_schema`.`columns` `b`
where
        `a`.`table_name`=`b`.`table_name` and
        `a`.`table_schema`=`b`.`table_schema` and
        `a`.`column_name`=`b`.`column_name` and
        `a`.`table_schema`='db_name' and
        `a`.`table_name`='table_name'

group by `a`.`index_name`
order by `index_size` desc limit 1;

On Wed, Jun 20, 2012 at 3:15 PM, Christian Koetteritzsch <
ckoetteritzsch@stripped> wrote:

> Hi guys,
>
> As the title says I'm trying to index 40 billion entries with two indexes
> on a server with 16 cores and 128GB RAM. The table is the one below and it
> is a myisam table. The *.myd file is about 640GB
>
> DROP TABLE IF EXISTS `l4_link`;
> CREATE TABLE  `l4_link` (
>  `ruid1` int NOT NULL,
>  `ruid2` int NOT NULL,
>  `overlap` int NOT NULL
> );
>
> I need an index for ruid1 and for ruid2.
>
> The status for this table is the following:
>
>           Name: l4_link
>         Engine: MyISAM
>        Version: 10
>     Row_format: Fixed
>           Rows: 39806500262
>  Avg_row_length: 17
>    Data_length: 676710504454
> Max_data_length: 4785074604081151
>   Index_length: 1024
>      Data_free: 0
>  Auto_increment: NULL
>    Create_time: 2012-06-19 14:51:29
>    Update_time: 2012-06-19 16:26:35
>     Check_time: NULL
>      Collation: utf8_general_ci
>       Checksum: NULL
>  Create_options:
>        Comment:
>
> The variables for myisam are the following:
> mysql> show global variables like '%myisam%';
> +-----------------------------**-----------+------------------**
> ---------------------+
> | Variable_name                       | Value
>          |
> +-----------------------------**-----------+------------------**
> ---------------------+
> | myisam_data_pointer_size  | 6         |
> | myisam_max_sort_file_size | 9223372036853727232      |
> | myisam_mmap_size             | 18446744073709551615    |
> | myisam_recover_options     | BACKUP       |
> | myisam_repair_threads       | 1             |
> | myisam_sort_buffer_size    | 8388608                                  |
> | myisam_stats_method        | nulls_unequal |
> | myisam_use_mmap             | OFF
>    |
> +-----------------------------**----------+-------------------**
> ---------------------+
> 8 rows in set (0.00 sec)
>
> The temp folder has about 16tb free space.
>
> When I start the indexing process, it copies the 640Gb into a temp file
> and then starts with "repair with keycache".
> On the internet I found that if it says "repair with keycache" you shold
> increase the "myisam_max_sort_file_size", but this didn't work.
> It still says "repair with keycache" after it copied the data to the temp
> dir.
>
> I hope you have any idea how to fix this.
>
> Thanks in advance.
>
> Christian
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql
>
>

Thread
Indexing about 40 Billion EntriesChristian Koetteritzsch20 Jun
  • Re: Indexing about 40 Billion EntriesAnanda Kumar20 Jun
    • RE: Indexing about 40 Billion EntriesRick James20 Jun
      • Re: Indexing about 40 Billion EntriesChristian Koetteritzsch20 Jun
        • Re: Indexing about 40 Billion EntriesChristian Koetteritzsch20 Jun
        • RE: Indexing about 40 Billion EntriesRick James20 Jun
          • Re: Indexing about 40 Billion EntriesChristian Koetteritzsch21 Jun
            • Re: Indexing about 40 Billion Entriesmos21 Jun
              • Re: Indexing about 40 Billion EntriesChristian Koetteritzsch21 Jun
              • RE: Indexing about 40 Billion EntriesRick James21 Jun
          • UNION and ORDER BYhsv22 Jun
            • RE: UNION and ORDER BYRick James22 Jun
  • Re: Indexing about 40 Billion EntriesShawn Green20 Jun
    • RE: Indexing about 40 Billion EntriesRick James20 Jun
  • Re: Indexing about 40 Billion EntriesBrent Clark21 Jun