MySQL Lists are EOL. Please join:

List:Internals« Previous MessageNext Message »
From:shawn Date:November 13 2015 1:27am
Subject:Re: Developing and Index Advisor for MySQL
View as plain text  
Hello Eduardo,

On 11/12/2015 7:17 PM, Eduardo Weiland wrote:
> Quoting Physical Database Design [1]:
> "index statistics can be sufficiently estimated for the purposes of
> physical design advisors from relatively straightforward calculations
> based on the table and columns statistics of the underlying table that
> the virtual index will be based on."
> There are already tools like this for other RDBMS: Microsoft SQL Server
> Database Tuning Advisor, Oracle Access Advisor and IBM DB2 Design
> Advisor. I just want to create a tool that does the same thing for MySQL.
> Furthermore I intend to analyze all the SQL queries, including SELECTs,
> INSERTs, UPDATEs and DELETEs. Thus I want to find some index
> configuration that is good enough for the entire workload. I do know
> that workload can vary over time and the optimizations become outdated,
> but this is an academic research and I am assuming a constant workload.
> Adapt to dynamic workload changes is future work.
> I plan to use some statistics such as how many times each query is
> executedto make queries that run most often to perform better. So, if
> there are more INSERTs and UPDATEs than SELECTs in a given table, I
> intend to recommend less indexes for this table, and even suggest to
> drop some existing indexes if needed.
> [1] LIGHTSTONE S., TEOREY T., NADEAU T. Physical Database Design: The
> Database Professional’s Guide to Exploiting Indexes, Views, Storage, and
> More. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2007.
> ISBN 9780080552316.
> Em 12-11-2015 18:56, shawn escreveu:
>> Hello Eduardo,
>> On 11/5/2015 4:58 AM, Eduardo Weiland wrote:
>>> Hello there,
>>> I am new here in the list. I am from Brazil and I am finishing my
>>> bachelor's degree in Computer Science.
>>> I have to do a final work for the course, and (I don't know why) I
>>> came out with the idea of developing an index advisor for MySQL.
>>> Just to be clear, by index advisor I mean some tool that suggests
>>> what indices can be created in the database, based on statistics and
>>> query log. In other words, indexes that do not exist in the database
>>> but can be created by the administrator in order to improve query
>>> execution performance.
>>> I have never worked with MySQL internals before. I know a lot of C
>>> and C++ programming, so I just need some help to understand MySQL
>>> structure.
>>> My main inspirations for doing this work are [1] and [2] (actually,
>>> they are the same work, but two different articles). One important
>>> thing I have found in [1]:
>>>      Before making the decisions, the optimizer allows the developer
>>> to override the information about physical design by using several
>>> function ╠│hooks‘. The hooks can be replaced at runtime with functions
>>> that insert new stastistics information into the list of physical
>>> design features. This makes the optimizer believe that the newly
>>> inserted data regarding the what-if indexes and what-if tables are
>>> present in the database. Then, the optimizer selects the execution
>>> plans using the statistics from the what-if features.
>>> I guess that this is what I need. Some way to virtually create
>>> indexes and then calling the MySQL optimizer or some sort of EXPLAIN
>>> method to understand "what MySQL optimizer thinks about this index?
>>> Would this index be used if it existed?".
>>> I am probably wrong about all I have said here, but I want to hear
>>> what you think about this.
>>> I am very grateful for all your help.
>>> Att.
>>> Eduardo Weiland
>>> [1] PARINDA: an interactive physical designer for PostgreSQL
>>> [2] An automated, yet interactive and portable DB designer
>> The drawback to your approach is that you won't have the cardinality
>> estimates for your index trials until you actually test those tuple
>> combinations for uniqueness.  It is the cardinality that the optimizer
>> bases its estimates on.
>> Other approaches (most of them are manual) have been to examine the
>> query patterns (available in digest form from the PERFORMANCE SCHEMA)
>> or from the actual query statistics in the Slow query log (which can
>> be converted to patterns by tools like mysqldumpslow) to look for
>> patterns that are both
>> a) frequently used
>> b) examining many more rows than they need to return
>> Quite often, they can be improved by adding indexes to certain tables.
>> In many cases, though, it was because the query used a dependent
>> subquery which simply had to search a lot of row combinations.
>> So, it is easy to suggest indexes that would solve certain queries
>> quickly but over-indexing a table (creating more indexes than you
>> need) only hurts INSERT, UPDATE, and DELETE performance if you are
>> trying to improve a query you only execute rarely.

You said (in your top-posted response)
 > "index statistics can be sufficiently estimated for the purposes of
 > physical design advisors from relatively straightforward calculations
 > based on the table and columns statistics of the underlying table that
 > the virtual index will be based on."

That is exactly the task I said you would need to do when I said "until 
you actually test those tuple combinations for uniqueness".  MySQL does 
not keep or generate statistics for every column of a table. It only 
generates statistics for each index based on the columns used in that 

I perfectly agree with modeling a constant distribution of query types. 
There is no need for you to attempt to make your system dynamic.

Here are some references to help you get started:

* The sys Schema (an intrinsic part of MySQL version 5.7 but can be 
added to 5.6 with ease):
(search for any blogs by Mark Leith, the author, for usage ideas)

* Current index usage (useful for finding indexes that should be dropped)

* Useful pages from the manual

One alternative to changing the tables is to change the query to more 
efficiently use the indexes already on the table. But that would 
probably be your next project :)

Shawn Green
MySQL Senior Principal Technical Support Engineer
Oracle USA, Inc. - Integrated Cloud Applications & Platform Services
Office: Blountville, TN

Become certified in MySQL! Visit 
for details.
Developing and Index Advisor for MySQLEduardo Weiland5 Nov
  • Re: Developing and Index Advisor for MySQLshawn l.green12 Nov
    • Re: Developing and Index Advisor for MySQLEduardo Weiland13 Nov
      • Re: Developing and Index Advisor for MySQLshawn l.green13 Nov