List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:November 30 2007 11:45am
Subject:Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552
View as plain text  
Andrei Elkin wrote:
> Mats, hej.
>
> Hope you are getting better!
>   

Yes, slowly, but I am.

> A newer patch has been committed.
>   

OK. Will have a look at it.

[snip]

> However, the macros themselves neither store nor retrieve.
>   

I don't get this. The macros contain assignment operators, so this 
should mean that the "macros themselves" store. The other macro reads 
from a pointer, so it should mean that it retrieves.

>       [you wrote]
>
>       4. It is endian-dependent. For example is_bit_set(0x1, 0) will be
>      'false' on a big-endian machine, but not on a little-endian machine.
>  
> Hence, your p.4 applies for this existing macro
>
>  #define _bitmap_is_set(MAP, BIT) (uint) (((uchar*)(MAP)->bitmap)[(BIT) / 8] \
>                                          & (1 << ((BIT) & 7)))
>
> as well: *MAP->bitmap = 0x1, BIT = 0.
>
> The only way for the macro to be endian-agnostic is not use merely
> (1 << BIT) shift and never locate the byte with []!
> That's what you offered in your first review mail.
>   

Nope, that macro is endian-agnostic basically because 'is_bit_set(0x1, 
0)' is invalid code: it should be written 'is_bit_set(map, 0)' for a 
'map' defined as a MY_BITMAP. Might not seem like a big difference, but 
bear with me for a while.

MY_BITMAP encodes that the bits are stored as little-endian... always.

The rationale is that the way the bits for a bitmap is set is by using 
the bitmap_set_bit, and how the memory is organized internally is 
unimportant (for the user). As a user, you are not expected to be able 
to do a "uint x = *(uint*)bm->bitmap" and get a sensible result, since 
this is internal data as far as the bitmap is concerned, and you 
shouldn't read that directly.

However, for the macro you wrote above that takes an uint as parameter, 
it is expected, and even encouraged, to pass the bitmap data as unsigned 
integrals, which makes it endian-sensitive since you have to "decode" 
the unsigned integral in some manner... and this differs between systems 
with different endianess. No such "decoding" is necessary for the 
original macros.

>>>>   5. It does not work if something is passed as 'i' that has a size
>>>>      larger than 'int' (e.g., unsigned long long).
>>>>     
>>>>         
>>> What do you mean? There is the check
>>>
>>> sizeof((i))*8 > b
>>>
>>> so that out-of-range bit forces "not set" return - which is correct.
>>> Although it'd better to return -1 to designate the error in
>>> accessing the bit.
>>> That's what I am adding to your versions below.
>>>   
>>>       
>> Nope, that is not what I was referring to... here is a small program
>> with a simplified version of the macro. Note that for the values
>> below, sizeof(V) * 8 <= (N), so the "out of range" result would not be
>> returned.
>>
>>    #define bit_is_set(V,N) ((V) & (1 << (N)) ? 1 : 0)
>>
>>    int main() {
>>      printf("bit_is_set(%llx,%d): %d\n",
>>             (1ULL << 34), 34, bit_is_set(1ULL << 34, 34));
>>      return EXIT_SUCCESS;
>>    }
>>
>> The output from this program is:
>>
>>    bit_is_set(400000000,34): 0
>>     
>
> Sorry, I could not combine your p.5 with this example.
>
> So we have N=34, V=400 000 000 and sizeof(V)=4, that's why
>   

The size of 'unsigned long long' is 8 in this case and it is strictly 
larger than the "int" size. When using the standard conversion on the 
constant 1 (they are always applied before doing anything with a value), 
it becomes an "int", which when shifted invokes undefined behavior since 
sizeof(int) * CHAR_BIT == 32 and the shift count is 34. (In this case, 
it ends up shifting away the set bit, but it is actually more common 
that the original value is unchanged, i.e., '1 << 34 == 1'.)

If, instead, the 1 is first cast to unsigned long long (or written as 
1ULL), then the shift is correct since 'sizeof(unsigned long long) * 
CHAR_BIT == 64' and 34 < 64. Hence the shift does *not* invoke undefined 
behavior.

> "sizeof(V) * 8 <= (N)", right? 
> But I was expecting the reverse sizeof(V) * 8 > N as
> there should be some hint towards "unsigned long long" ...
>   

Just my few cents,
Mats Kindahl

-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com


Thread
bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Andrei Elkin20 Nov
  • Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Mats Kindahl22 Nov
    • Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Andrei Elkin27 Nov
      • Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Mats Kindahl27 Nov
        • Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Andrei Elkin29 Nov
          • Re: bk commit into 5.1 tree (aelkin:1.2610) BUG#31552Mats Kindahl30 Nov