List:Internals« Previous MessageNext Message »
From:Sergei Golubchik Date:August 9 2017 11:51am
Subject:Re: How was this records/fanou
t logic derived for the “no statistics ”
case in MySQL's Query Planner?
View as plain text  
Hi, Jacky!

On Aug 07, YuFeng Shen wrote:
> Hi Experts,
>  
> In the MySQL Server 5.7 source code, the formula records = (x * (b-a) + a*c-b)/(c-1)
> is used in the query planner to calculate the number of records when key distribution
> statistics are not available.
> Where is this formula coming from, how was it derived, or why is this
> specific formula the formula that's being used? Does it have an
> established theoretical underpinning, and if so, what is its basis?
https://github.com/mysql/mysql-server/blob/5.7/sql/sql_planner.cc#L529
> 
>           Assume that the first key part
> matches 1% of the file
>           and that the whole key matches
> 10 (duplicates) or 1
>           (unique) records.
>           Assume also that more key
> matches proportionally more
>           records
>           This gives the formula:
>           records = (x * (b-a) +
> a*c-b)/(c-1)
>           b = records matched by whole
> key
>           a = records matched by first
> key part (1% of all records?)
>           c = number of key parts in
> key
>           x = used key parts (1 <= x
> <= c)

According to git history, this was present even in the very first
revision, when MySQL moved from CVS to BitKeeper in July 2000.
 
There's only one person in Oracle that was in MySQL back than, and I
don't think he's reading this mailing list.

So, I'm afraid, only Monty can know that.

Regards,
Sergei
Thread
How was this records/fanout logic derived for the “no statistics” case in MySQL's Query Planner?YuFeng Shen7 Aug 2017
  • Re: How was this records/fanout logic derived for the “no statistics ”case in MySQL's Query Planner?Sergei Golubchik9 Aug 2017
    • Re: How was this records/fanout logic derived for the “no statistics” case in MySQL's Query Planner?shawn l.green10 Aug 2017