Andrei Elkin wrote:
> Mats, hej.
>
> Hope you are getting better!
>
Yes, slowly, but I am.
> A newer patch has been committed.
>
OK. Will have a look at it.
[snip]
> However, the macros themselves neither store nor retrieve.
>
I don't get this. The macros contain assignment operators, so this
should mean that the "macros themselves" store. The other macro reads
from a pointer, so it should mean that it retrieves.
> [you wrote]
>
> 4. It is endian-dependent. For example is_bit_set(0x1, 0) will be
> 'false' on a big-endian machine, but not on a little-endian machine.
>
> Hence, your p.4 applies for this existing macro
>
> #define _bitmap_is_set(MAP, BIT) (uint) (((uchar*)(MAP)->bitmap)[(BIT) / 8] \
> & (1 << ((BIT) & 7)))
>
> as well: *MAP->bitmap = 0x1, BIT = 0.
>
> The only way for the macro to be endian-agnostic is not use merely
> (1 << BIT) shift and never locate the byte with []!
> That's what you offered in your first review mail.
>
Nope, that macro is endian-agnostic basically because 'is_bit_set(0x1,
0)' is invalid code: it should be written 'is_bit_set(map, 0)' for a
'map' defined as a MY_BITMAP. Might not seem like a big difference, but
bear with me for a while.
MY_BITMAP encodes that the bits are stored as little-endian... always.
The rationale is that the way the bits for a bitmap is set is by using
the bitmap_set_bit, and how the memory is organized internally is
unimportant (for the user). As a user, you are not expected to be able
to do a "uint x = *(uint*)bm->bitmap" and get a sensible result, since
this is internal data as far as the bitmap is concerned, and you
shouldn't read that directly.
However, for the macro you wrote above that takes an uint as parameter,
it is expected, and even encouraged, to pass the bitmap data as unsigned
integrals, which makes it endian-sensitive since you have to "decode"
the unsigned integral in some manner... and this differs between systems
with different endianess. No such "decoding" is necessary for the
original macros.
>>>> 5. It does not work if something is passed as 'i' that has a size
>>>> larger than 'int' (e.g., unsigned long long).
>>>>
>>>>
>>> What do you mean? There is the check
>>>
>>> sizeof((i))*8 > b
>>>
>>> so that out-of-range bit forces "not set" return - which is correct.
>>> Although it'd better to return -1 to designate the error in
>>> accessing the bit.
>>> That's what I am adding to your versions below.
>>>
>>>
>> Nope, that is not what I was referring to... here is a small program
>> with a simplified version of the macro. Note that for the values
>> below, sizeof(V) * 8 <= (N), so the "out of range" result would not be
>> returned.
>>
>> #define bit_is_set(V,N) ((V) & (1 << (N)) ? 1 : 0)
>>
>> int main() {
>> printf("bit_is_set(%llx,%d): %d\n",
>> (1ULL << 34), 34, bit_is_set(1ULL << 34, 34));
>> return EXIT_SUCCESS;
>> }
>>
>> The output from this program is:
>>
>> bit_is_set(400000000,34): 0
>>
>
> Sorry, I could not combine your p.5 with this example.
>
> So we have N=34, V=400 000 000 and sizeof(V)=4, that's why
>
The size of 'unsigned long long' is 8 in this case and it is strictly
larger than the "int" size. When using the standard conversion on the
constant 1 (they are always applied before doing anything with a value),
it becomes an "int", which when shifted invokes undefined behavior since
sizeof(int) * CHAR_BIT == 32 and the shift count is 34. (In this case,
it ends up shifting away the set bit, but it is actually more common
that the original value is unchanged, i.e., '1 << 34 == 1'.)
If, instead, the 1 is first cast to unsigned long long (or written as
1ULL), then the shift is correct since 'sizeof(unsigned long long) *
CHAR_BIT == 64' and 34 < 64. Hence the shift does *not* invoke undefined
behavior.
> "sizeof(V) * 8 <= (N)", right?
> But I was expecting the reverse sizeof(V) * 8 > N as
> there should be some hint towards "unsigned long long" ...
>
Just my few cents,
Mats Kindahl
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com