List:General Discussion« Previous MessageNext Message »
From:Rick James Date:June 14 2012 5:21pm
Subject:RE: NoSQL help
View as plain text  
SHOW CREATE TABLE
SHOW TABLE STATUS
EXPLAIN SELECT ... \G


> -----Original Message-----
> From: Ananda Kumar [mailto:anandkl@stripped]
> Sent: Thursday, June 14, 2012 4:04 AM
> To: Manivannan S.
> Cc: mysql@stripped
> Subject: Re: NoSQL help
> 
> As seen below,
> Full table scan is happening on table "ibf".
> Can share the indexes on this table and also the complete sql
> 
> On Thu, Jun 14, 2012 at 4:17 PM, Manivannan S. <
> manivannan_s@stripped> wrote:
> 
> >    id****
> >
> > select_type****
> >
> > table****
> >
> > type****
> >
> > possible_keys****
> >
> > key****
> >
> > key_len****
> >
> > ref****
> >
> > rows****
> >
> > Extra****
> >
> > 1****
> >
> > SIMPLE****
> >
> > ibf****
> >
> > ALL****
> >
> > ibf_MsgId****
> >
> > \N****
> >
> > \N****
> >
> > \N****
> >
> > 160944****
> >
> > 1****
> >
> > SIMPLE****
> >
> > pl****
> >
> > ref****
> >
> >
> idx_unique_key_ib_msg\,index_message_id\,index_message_processing_stat
> > us**
> > **
> >
> > idx_unique_key_ib_msg****
> >
> > 180****
> >
> > reports.ibf.Message_Id\,const****
> >
> > 1****
> >
> > Using where; Using index****
> >
> > 1****
> >
> > SIMPLE****
> >
> > tl****
> >
> > ref****
> >
> > idx_unique_key_ib_text\,index_message_id****
> >
> > idx_unique_key_ib_text****
> >
> > 153****
> >
> > reports.pl.Message_Id****
> >
> > 1****
> >
> > 1****
> >
> > SIMPLE****
> >
> > xl****
> >
> > ref****
> >
> > idx_unique_key_ib_xml\,index_message_id****
> >
> > idx_unique_key_ib_xml****
> >
> > 153****
> >
> > reports.pl.Message_Id****
> >
> > 1****
> >
> > Using where****
> >
> > ** **
> >
> > Sorry for the previous mail..... this is my execution plan for 1.5
> > million
> > records....****
> >
> > ** **
> >
> > *From:* Ananda Kumar [mailto:anandkl@stripped]
> > *Sent:* Thursday, June 14, 2012 3:33 PM
> >
> > *To:* Manivannan S.
> > *Cc:* mysql@stripped
> > *Subject:* Re: NoSQL help****
> >
> > ** **
> >
> > can u share the sql, explain plan, indexes etc,****
> >
> > show full processlist out put when the sql's are running****
> >
> > On Thu, Jun 14, 2012 at 3:03 PM, Manivannan S. <
> > manivannan_s@stripped> wrote:****
> >
> > I tried with myisam engine also. But it also taking more time to
> > generate the report. In my database I am having 8 innodb tables and
> at
> > the same time I am joining 4 tables to get the report.
> >
> > I am maintaining 60days records because the user will try to generate
> > the report out of 60 days in terms of second, minute, hourly, weekly
> > and Monthly report also.
> >
> > From: Ananda Kumar [mailto:anandkl@stripped]
> > Sent: Thursday, June 14, 2012 12:32 AM
> > To: Rick James
> > Cc: Johan De Meersman; Manivannan S.; mysql@stripped
> > Subject: Re: NoSQL help****
> >
> >
> > Did you try with myisam tables.
> > They are supposed to be good for reporting requirement****
> >
> > On Wed, Jun 13, 2012 at 11:52 PM, Rick James <rjames@yahoo-
> inc.com<mailto:
> > rjames@stripped>> wrote:
> > I'll second Johan's comments.
> >
> > "Count the disk hits!"
> >
> > One minor change:  Don't store averages in the summary table; instead
> > store the SUM().  That lets you get the mathematically correct
> AVERAGE
> > over any time range via
> > SUM(sum_foo) / SUM(count_foo)
> >
> > Switching between MySQL and Mongo requires rewriting _all_ of the
> > relevant code.
> >
> > <opinion> NoSQL will be no better than MySQL for 150GB. </opinion>
> > "Count the disk hits!"
> >
> > I recently built a system that topped out at 350GB (90 days' data).
> > It involved hourly ingestion of a few GB of data and a variety of
> "reports".
> >  The prototype showed that most reports would take about an hour to
> run.
> >  Not good.  The final product, with summary tables, lets the reports
> > be run on-demand and online and each takes only a few seconds.  By
> > careful use of MEMORY tables, LOAD DATA, etc, the ingestion takes 5
> > minutes (each hour) for the raw data and 2 minutes (total) for the 7
> summary tables.
> >  PARTITIONing was vital for the design.  Once an hour a new partition
> > is populated; once a day, 24 hourly partitions are rolled into a new
> > daily partition and the 90-day old partition is DROPped.
> >
> >
> > > -----Original Message-----****
> >
> > > From: Johan De Meersman [mailto:vegivamp@stripped<mailto:
> > vegivamp@stripped>]
> > > Sent: Wednesday, June 13, 2012 6:20 AM
> > > To: Manivannan S.****
> >
> > > Cc: mysql@stripped<mailto:mysql@stripped>
> > > Subject: Re: NoSQL help
> > >
> > >
> > > ----- Original Message -----****
> >
> > > > From: "Manivannan S." <manivannan_s@stripped<mailto:
> > manivannan_s@stripped>>
> > > >
> > > > Hi all,
> > > >
> > > > [lots of data]
> > > > [slow reports]
> > > > [wooo NoSQL magic]
> > >
> > > Not that I want to discourage you, but my standard first question
> is
> > > "why do you think NoSQL (let alone any specific product) is the
> > > right solution?" :-)
> > >
> > > Don't get me wrong, it might be; but from what little I now know
> > > about your environment, it sounds like applying some data
> > > warehousing techniques might suffice - and being the cynical
> > > dinosaur that I am, I have a healthy reluctance about welding new
> > > technology onto a stable environment.
> > >
> > > To speed up reporting (and note that these techniques are often
> > > applied even when implementing NoSQL solutions, too) it is usually
> a
> > > good first step to set up a process of data summarization.
> > >
> > > Basically, you pre-calculate averages, medians, groupings, whatever
> > > you need for your reports; and your job also saves the last record
> > > IDs it's processed; then on the next run, you only read the new
> > > records and update your summary tables to incorporate the new data.
> > >
> > > Suppose I have a table like this:
> > >
> > > ID | Val
> > > --------
> > >  1     1
> > >  2     7
> > >  3     5
> > >  4    13
> > >
> > > I want to report the average on a daily basis, and calculating that
> > > over those rows is unbearably slow because I'm running the process
> > > on a wristwatch from 1860 :-)
> > >
> > > So I get a summary table, calculate (1+7+5+13)/4 = 6.5 and that
> then
> > > gets a record saying this:
> > >
> > > Avg | elementCount | lastSeen
> > > -----------------------------
> > > 6.5              4          4
> > >
> > > Now, over the course of the day, the elements 4, 17 and 2 get added
> > > with sequential row numbers. Instead of calculating
> > > (1+7+5+13+4+17+2)/7, which would be slow; I can substitute the
> > > already summarized data by Avg*elementCount. Thus, I calculate
> > > (6.5*4 +
> > > 4+17+2)/7 = 7, which is a lot faster, and my summary table now
> looks
> > > like this:
> > >
> > > Avg | elementCount | lastSeen
> > > -----------------------------
> > >   7              7          7
> > >
> > > This is of course a stupid example, but it saves you a lot of time
> > > if you already have the summary of several thousand elements and
> > > only need to update it for a handful. Similar tricks are possible
> > > for a lot of typical reporting stuff - you don't need to
> > > re-calculate data for past months over and over again, for instance
> > > - and that's what makes your reports run fast.
> > >
> > >
> > > Just my 2 cents :-)
> > > /johan
> > >
> > > --
> > > Bier met grenadyn
> > > Is als mosterd by den wyn
> > > Sy die't drinkt, is eene kwezel
> > > Hy die't drinkt, is ras een ezel
> > >
> > > --
> > > MySQL General Mailing List
> > > For list archives: http://lists.mysql.com/mysql
> > > To unsubscribe:    http://lists.mysql.com/mysql****
> >
> > DISCLAIMER: This email message and all attachments are confidential
> > and may contain information that is privileged, confidential or
> exempt
> > from disclosure under applicable law.  If you are not the intended
> > recipient, you are notified that any dissemination, distribution or
> > copying of this email is strictly prohibited. If you have received
> > this email in error, please notify us immediately by return email or
> > to mailadmin@stripped and destroy the original message.
> > Opinions, conclusions and other information in this message that do
> > not relate to the official business of SPAN, shall be understood to
> be
> > neither given nor endorsed by SPAN.****
> >
> > ** **
> >
> > DISCLAIMER: This email message and all attachments are confidential
> and may contain information that is privileged, confidential or exempt
> from disclosure under applicable law.  If you are not the intended
> recipient, you are notified that any dissemination, distribution or
> copying of this email is strictly prohibited. If you have received this
> email in error, please notify us immediately by return email or to
> mailadmin@stripped and destroy the original message.  Opinions,
> conclusions and other information in this message that do not relate to
> the official business of SPAN, shall be understood to be neither given
> nor endorsed by SPAN.
> >
> >
Thread
NoSQL helpManivannan S .13 Jun
  • Re: NoSQL helpJohan De Meersman13 Jun
    • RE: NoSQL helpRick James13 Jun
      • Re: NoSQL helpAnanda Kumar13 Jun
        • RE: NoSQL helpManivannan S .14 Jun
          • Re: NoSQL helpAnanda Kumar14 Jun
            • RE: NoSQL helpManivannan S .14 Jun
            • RE: NoSQL helpManivannan S .14 Jun
              • Re: NoSQL helpAnanda Kumar14 Jun
                • RE: NoSQL helpRick James14 Jun
          • RE: NoSQL helpRick James14 Jun
  • Re: NoSQL helpmos13 Jun
    • Handler?hsv25 Jun