> I do understand the "pro-double" arguments, however I think that exact
> representation would give us at least some hope (for the future, I know now
> it does not work) to answer certain SQL queries efficiently using index
> only without the need to read the record.
That hope exists now - not exactly from the index, but without
going to the database on disk. The use of doubles in indexes
doesn't make it (much) harder. If we add the removal of index
keys to the information in the serial log so that pending index
changes are done even after a crash, then an index entry that
references a record on disk in the database (as opposed to being
in cache or in the backlog) is guaranteed to be valid for all
transactions. All MVCC issues are dealt with in the record
cache, or, on a bad day, in the backlog.
Numbers with a value of more than 18 digits would still
have to be checked on disk, even if we added an int64 index
key, because they can not be represented as int64, so they
revert to double precision and, being more than 15 digits,
may be imprecise.
Without int64 keys, we'd have to validate index entries
for any numeric value longer than 15 digits.
So, by adding an int64 index type, we complicate the process
of changing from numeric (10,2) to numeric (12,4) - relatively
common - to avoid having to look on disk for values between
1,000,000,000,000,000 and 1,000,000,000,000,000.
> select name order by name, age where name like 'A%' and age > 10
> can be answered just using index on (name,age)
The double precision representation of 10 is precise, so there's
no problem with that query.
> Also, I suspect inexact representation of integer/decimal breaks "unique" on
> some probably rare cases. If 2 different very big int64 numbers are mapped
> to a single double, only one of these numbers can be stored in a unique
> index, isn't it so?
No, unique indexes allow more than one entry with the same value.
If they didn't, you couldn't have one transaction delete a unique
value and another reintroduce it with running transactions bridging
the two changes. Before reporting a unique constraint violation,
Falcon has to check the actual data to be sure it's a conflict
in some transaction's state of the database, and that the values
are actually the same.
I haven't looked at the code, but we certainly should be checking
that the stored values are unique.