James Nobis <james_nobis@stripped> wrote on 04/12/2005
10:19:33 AM:
> Hi all,
>
> Essentially I was asked to look into a problem with a query at the
> office which
> used AVG on a datetime column which produces incorrect results. (MySQL
4.1.10
> on RHEL 3 update 4) As you can see below the stark difference between
the
> correct and incorrect results. Though, the incorrect result has the
correct
> year (this seems actually to be coincidence and nothing more) but I
> am not sure
> exactly what MySQL did for casting. The query produces no errors
orwarnings.
> I *think* the incorrect result is an "internal" timestamp format but
with the
> wrong data. Is there a part of the manual that I missed which explains
this
> unexpected behavior?
>
> AVG(col) v FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(col)))
> 20050258324839 v 2005-03-24 02:05:11
>
> James Nobis
> Web Developer
> Academic Superstore
> 223 W. Anderson Ln. Suite A110, Austin, TX 78752
> Voice: (512) 450-1199 x453 Fax: (512) 450-0263
> http://www.academicsuperstore.com
>
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.
>
I can't find it in the online manual but it used to explain that datetime
values will be cast as a number in the form of YYYYMMDDHHNNSS or as a
string like 'YYYY-MM-DD HH:NN:SS'. AVG() is a numeric function so if you
are averaging datetime values, they will appear in their numeric format.
As an example, if you needed the AVG() of the times '2005-04-30 12:55:59'
and '2005-04-30 12:56:01' you would have wound up averaging the following
numbers:
20050430125559
20050430125601
Which does not numerically average out to an numeric representaion of
'2005-04-30 12:56:00' but instead to 20050430125680 which is the numeric
form of the time '2005-04-30 12:55:80'. However, by first converting the
dates to "seconds from epoch" (by using UNIX_TIMESTAMP()) you take the
average of the numbers
1114880159
1114880161
which works out to 1114880160. Run that value through FROM_UNIXTIME() and
you get:
select from_unixtime(1114880160);
+---------------------------+
| from_unixtime(1114880160) |
+---------------------------+
| 2005-04-30 12:56:00 |
+---------------------------+
1 row in set (0.00 sec)
Which is exactly what your test revealed. Basically, you got hosed results
for your first column because the AVG() and SUM() aggregate functions do
not work with dates or times in a format that is conducive to proper
"date" math. They use the human-readable version of the date(a packed
number) and not a machine-useful, mathematically sound value like the
date's offset from epoch.
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine