MySQL Lists are EOL. Please join:

List:Internals« Previous MessageNext Message »
From:YuFeng Shen Date:August 7 2017 2:10am
Subject:How was this records/fanout logic derived for the “no
statistics” case in MySQL's Query Planner?
View as plain text  
Hi Experts,
 
In the MySQL Server 5.7 source code, the formula records = (x * (b-a) + a*c-b)/(c-1) is
used in the query planner to calculate the number of records when key distribution
statistics are not available.
 Where is this formula coming from, how was it derived, or why is this specific
formula the formula that's being used? Does it have an established theoretical
underpinning, and if so, what is its basis?
 https://github.com/mysql/mysql-server/blob/5.7/sql/sql_planner.cc#L529


          Assume that the first key part
matches 1% of the file
          and that the whole key matches 10
(duplicates) or 1
          (unique) records.
          Assume also that more key matches
proportionally more
          records
          This gives the formula:
          records = (x * (b-a) + a*c-b)/(c-1)
          b = records matched by whole key
          a = records matched by first key
part (1% of all records?)
          c = number of key parts in key
          x = used key parts (1 <= x <=
c)
 
 
Jacky
Thread
How was this records/fanout logic derived for the “no statistics” case in MySQL's Query Planner?YuFeng Shen7 Aug
  • Re: How was this records/fanout logic derived for the “no statistics ”case in MySQL's Query Planner?Sergei Golubchik9 Aug
    • Re: How was this records/fanout logic derived for the “no statistics” case in MySQL's Query Planner?shawn l.green10 Aug