List:General Discussion« Previous MessageNext Message »
From:Andrew Moore Date:September 8 2011 6:16pm
Subject:Re: Query Optimization
View as plain text  
Thinking outside the query, is there any archiving that could happen to make
your large tables kinder in the range scan?

Andy

On Thu, Sep 8, 2011 at 7:03 PM, Brandon Phelps <bphelps@stripped> wrote:

> On 09/01/2011 01:32 PM, Brandon Phelps wrote:
>
>> On 09/01/2011 12:47 PM, Shawn Green (MySQL) wrote:
>>
>>> On 9/1/2011 09:42, Brandon Phelps wrote:
>>>
>>>> On 09/01/2011 04:59 AM, Jochem van Dieten wrote:
>>>> > > ...
>>>>
>>>> > > WHERE
>>>> > > (open_dt >= '2011-08-30 00:00:00' OR close_dt >=
> '2011-08-30
>>>> 00:00:00')
>>>> > > AND (open_dt <= '2011-08-30 12:36:53' OR close_dt <=
> '2011-08-30
>>>> 12:36:53')
>>>>
>>>> > In that case your logic here simplifies to:
>>>> > WHERE
>>>> > open_dt >= '2011-08-30 00:00:00'
>>>> > AND
>>>> > close_dt <= '2011-08-30 12:36:53'
>>>>
>>>> > Now add an index over open_dt and close_dt and see what happens.
>>>>
>>>> > Jochem
>>>>
>>>> Jochem,
>>>>
>>>> I can't really use your WHERE logic because I also need to retrieve
>>>> results where the open_dt time is out of the range specified. For
>>>> example, a very large file download might span multiple days so given
>>>> your logic if the connection was started 2 days ago and I want to pull 1
>>>> days worth of connections, I would miss that entry. Basically I want to
>>>> SELECT all of the records that were opened OR closed during the
>>>> specified time period, ie. if any activity happened between my start and
>>>> end dates, I need to see that record.
>>>>
>>>> Any other ideas?
>>>>
>>>>
>>> I believe Jochem was on the right track but he got his dates reversed.
>>>
>>> Let's try a little ASCII art to show the situation. I will setup a query
>>> window with two markers (s) and (e). Events will be marked by |----| markers
>>> showing their durations.
>>>
>>> a) (s) (e)
>>> b) |---|
>>> c) |---|
>>> d) |---|
>>> e) |--------------------|
>>> f) |---|
>>> g) |---|
>>>
>>> To describe these situations:
>>> a) is the window for which you want to query (s) is the starting time and
>>> (e) is the ending time for the date range you are interested in.
>>> b) the event starts and stops before your window exists. It won't be part
>>> of your results.
>>> c) the event starts before the window but ends within the window -
>>> include this
>>> d) the event starts and ends within the window - include this
>>> e) the event starts before the window and ends after the window - include
>>> this
>>> f) the event starts inside the window but ends beyond the window -
>>> include this.
>>> g) the event starts and ends beyond the window - exclude this.
>>>
>>> In order to get every event in the range of c-f, here is what you need
>>> for a WHERE clause
>>>
>>> WHERE start <= (ending time) and end >= (starting time)
>>>
>>> Try that and let us know the results.
>>>
>>
>> Thanks Jochem and Shawn, however the following two queries result in the
>> exact same EXPLAIN output: (I hope the tables don't wrap too early for you)
>>
>> Old method:
>> SELECT
>> sc.open_dt,
>> sc.close_dt,
>> sc.protocol,
>> INET_NTOA(sc.src_address) AS src_address,
>> sc.src_port,
>> INET_NTOA(sc.dst_address) AS dst_address,
>> sc.dst_port,
>> sc.sent,
>> sc.rcvd,
>> spm.desc AS src_port_desc,
>> dpm.desc AS dst_port_desc
>> FROM firewall_connections AS sc
>> LEFT JOIN port_mappings AS spm ON spm.port = sc.src_port
>> LEFT JOIN port_mappings AS dpm ON dpm.port = sc.dst_port
>> WHERE
>> (open_dt >= '2011-08-31 09:53:31' OR close_dt >= '2011-08-31 09:53:31')
>> AND (open_dt <= '2011-09-01 09:53:31' OR close_dt <= '2011-09-01
>> 09:53:31')
>> ORDER BY rcvd DESC
>> LIMIT 0, 10;
>>
>> New method with BTREE index on open_dt, close_dt (index name is
>> ndx_open_close_dt):
>> SELECT
>> sc.open_dt,
>> sc.close_dt,
>> sc.protocol,
>> INET_NTOA(sc.src_address) AS src_address,
>> sc.src_port,
>> INET_NTOA(sc.dst_address) AS dst_address,
>> sc.dst_port,
>> sc.sent,
>> sc.rcvd,
>> spm.desc AS src_port_desc,
>> dpm.desc AS dst_port_desc
>> FROM firewall_connections AS sc
>> LEFT JOIN port_mappings AS spm ON spm.port = sc.src_port
>> LEFT JOIN port_mappings AS dpm ON dpm.port = sc.dst_port
>> WHERE
>> open_dt <= '2011-09-01 09:53:31' AND close_dt >= '2011-08-31 09:53:31'
>> ORDER BY rcvd DESC
>> LIMIT 0, 10;
>>
>> EXPLAIN output for old method:
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>> | id | select_type | table | type | possible_keys | key | key_len | ref |
>> rows | Extra |
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>> | 1 | SIMPLE | sc | index | open_dt,ndx_open_close_dt | ndx_rcvd | 4 |
>> NULL | 10 | Using where |
>> | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.src_port |
>> 1 | |
>> | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.dst_port |
>> 1 | |
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>>
>> EXPLAIN output for new method with new index:
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>> | id | select_type | table | type | possible_keys | key | key_len | ref |
>> rows | Extra |
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>> | 1 | SIMPLE | sc | index | open_dt,ndx_open_close_dt | ndx_rcvd | 4 |
>> NULL | 10 | Using where |
>> | 1 | SIMPLE | spm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.src_port |
>> 1 | |
>> | 1 | SIMPLE | dpm | eq_ref | PRIMARY | PRIMARY | 2 | syslog.sc.dst_port |
>> 1 | |
>> +----+-------------+-------+--**------+-----------------------**
>> ----+----------+---------+----**----------------+------+------**-------+
>>
>> SHOW INDEX:
>> +----------------------+------**------+-------------------+---**
>> -----------+-------------+----**-------+-------------+--------**
>> --+--------+------+-----------**-+---------+
>> | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation |
>> Cardinality | Sub_part | Packed | Null | Index_type | Comment |
>> +----------------------+------**------+-------------------+---**
>> -----------+-------------+----**-------+-------------+--------**
>> --+--------+------+-----------**-+---------+
>> | firewall_connections | 1 | ndx_open_close_dt | 1 | open_dt | A | 1342691
>> | NULL | NULL | | BTREE | |
>> | firewall_connections | 1 | ndx_open_close_dt | 2 | close_dt | A |
>> 6377783 | NULL | NULL | | BTREE | |
>> +----------------------+------**------+-------------------+---**
>> -----------+-------------+----**-------+-------------+--------**
>> --+--------+------+-----------**-+---------+
>>
>>
>> Although right now the queries do seem to be executing much faster,
>> although I'm not quite sure why. And I'm not sure why the new
>> ndx_open_close_dt isn't being used either.
>>
>> -Brandon
>>
>>
> I am still having a big issue with my query as seen above.  The table is up
> to around 32 million records at the moment and either of the two SELECT
> queries above take a very long time to run.  Is there anything at all I can
> do to speed things up?  It seems that changing the format of the WHERE
> clause did not help at all, as the EXPLAIN output is exactly the same for
> both version.  I also tried adding an index on (open_dt, close_dt, rcvd) but
> that index does not get used.
>
> Any other ideas?
>
> Thanks in advance,
>
>
> Brandon
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql?**
> unsub=eroomydna@stripped<http://lists.mysql.com/mysql?unsub=1
>
>

Thread
Query OptimizationBrandon Phelps30 Aug
  • Re: Query OptimizationMichael Dykman30 Aug
  • Re: Query OptimizationJochem van Dieten1 Sep
    • Re: Query OptimizationBrandon Phelps1 Sep
      • Re: Query OptimizationMySQL)1 Sep
        • Re: Query OptimizationBrandon Phelps1 Sep
          • Re: Query OptimizationBrandon Phelps8 Sep
            • Re: Query OptimizationAndrew Moore8 Sep
              • Re: Query OptimizationBrandon Phelps8 Sep
                • Re: Query OptimizationAndrew Moore8 Sep
                  • Re: Query OptimizationBrandon Phelps8 Sep
                    • Re: Query OptimizationDerek Downey8 Sep
                      • Re: Query OptimizationBrandon Phelps8 Sep
                        • Re: Query OptimizationMihail Manolov8 Sep
                          • Re: Query OptimizationBrandon Phelps8 Sep
                            • Re: Query OptimizationMihail Manolov8 Sep
                              • Re: Query OptimizationBrandon Phelps8 Sep
                • Re: Query OptimizationMihail Manolov8 Sep
                  • Re: Query OptimizationBrandon Phelps8 Sep
                    • Re: Query OptimizationAndrew Moore8 Sep
                      • Re: Query OptimizationBrandon Phelps8 Sep