Andrei Elkin wrote:
> Mats, hej.
>
> Thanks for a good review!
>
> I've committed a fresh patch accounting your notes.
>
OK, I'll have a look at it.
> Here are my answers.
[snip]
>> Andrei Elkin wrote:
>>
[snip]
>>> +#define is_bit_set(i, b) (uint) ((sizeof((i))*8 > b)? \
>>> + ((uchar*) (&(i)))[(b) / 8] \
>>> + & (1 << ((b) & 7)) \
>>> + : 0)
>>>
[snip]
>> Although I think that macros or functions for this purpose is a good
>> thing, these macros makes a large number of assumptions that is
>> incorrect and cause confusing error or incorrect behavior.
>>
>>
>
>
>> 1. It assumes that all bytes are 8 bits large, which is not the case.
>>
>
> Generally in the server's code we are better to have the only
> assumption. And this one as you must know is all bytes are 8 bits
> large in my_bitmap.h,cc etc.
>
Referring to another part of the code does not change the fact that it's
not correct, but this is not the important part. I would accept that
bytes are treated as 8-bit entities when it does not affect the
correctness. Comparing with my_bitmap, there is a difference. In that
case, the assumption of 8-bit bytes just lead to wasting memory, but in
this case, it will cause the computations to not work. Still, I would
not consider this as an important problem, if it were not for the issues
below.
[snip]
>
>> 4. It is endian-dependent. For example is_bit_set(0x1, 0) will be
>> 'false' on a big-endian machine, but not on a little-endian
>> machine.
>>
>
> A good catch.
> You must have seen that macros i used as the template are vulnarable as well
>
> #define _bitmap_set_bit(MAP, BIT) (((uchar*)(MAP)->bitmap)[(BIT) / 8] \
> |= (1 << ((BIT) & 7)))
>
> And then they're same wat buggy, are not they?
>
Actually not, they are storing and retrieving data in the same "faulty"
way, so they compensate each others (now, I wouldn't write the code this
way, but that is a different issue). However, in this case, you are
writing an "int" and retrieving it as "char", which is the real problem.
>> 5. It does not work if something is passed as 'i' that has a size
>> larger than 'int' (e.g., unsigned long long).
>>
>
> What do you mean? There is the check
>
> sizeof((i))*8 > b
>
> so that out-of-range bit forces "not set" return - which is correct.
> Although it'd better to return -1 to designate the error in
> accessing the bit.
> That's what I am adding to your versions below.
>
Nope, that is not what I was referring to... here is a small program
with a simplified version of the macro. Note that for the values below,
sizeof(V) * 8 <= (N), so the "out of range" result would not be returned.
#define bit_is_set(V,N) ((V) & (1 << (N)) ? 1 : 0)
int main() {
printf("bit_is_set(%llx,%d): %d\n",
(1ULL << 34), 34, bit_is_set(1ULL << 34, 34));
return EXIT_SUCCESS;
}
The output from this program is:
bit_is_set(400000000,34): 0
[snip]
>> 6. I would prefer to have a naming of the macros with a common
>> prefix, e.g., "bit_" since this is how all other naming is done
>> (e.g., "my_", "mysql_", "COM_", "SQLCOM_")
>>
>
> Okay. Let it be bit_is_set, bit_do_set, bit_do_clear.
>
>
>
>> 7. IMHO, do_set_bit() and do_clear_bit() are statements and should
>> not have a value, but this is not important.
>>
>
> I think if a routine can fail it should return something. And
> accessing a out-of-range bit is such a case.
>
Then the return value should also be consistently checked.
[snip]
>> Also, what do you mean with "conspire" (apart from that
>> it's a typo). "Conspire" basically means to plan something evil or
>> illegal... I suggest to use another word for whatever you want to
>> describe.
>>
>
> I agree, with the meaning. As the word "failure" does not carry any
> moral sense let's tolarate a metaphor which is placed here to make our
> work less routine more intriguing.
> Sometimes it's an art to make slave inconsistent :)!
>
True. :)
[snip]
>>> @@ -7170,43 +7201,48 @@ Write_rows_log_event::do_before_row_oper
>>> {
>>> int error= 0;
>>> - /*
>>> - We are using REPLACE semantics and not INSERT IGNORE semantics
>>> - when writing rows, that is: new rows replace old rows. We need to
>>> - inform the storage engine that it should use this behaviour.
>>> - */
>>> -
>>> - /* Tell the storage engine that we are using REPLACE semantics. */
>>> - thd->lex->duplicates= DUP_REPLACE;
>>> + if (is_bit_set(slave_exec_mode, SLAVE_MODE_IDEMPOTENT) != 0 ||
>>> + m_table->s->db_type()->db_type == DB_TYPE_NDBCLUSTER)
>>>
>>>
>> Here you're hard-coding for the NBD engine, which is not flexible at
>> all. I would suggest adding a property to the handler (e.g., a table
>> flag or a virtual function) and check that instead. That would mean
>> that other engines that need the same feature can just set that flag
>> or override that function.
>>
>
> There is a reason in what you are saying. Still, i think we should no rash
> with generalizing basing on only once existing case, which is it's
> only NDB that replicates Update on table with primary key as
> Write_rows event.
> The feature you are asking would be really welcome as soon as the
> NDB's method will be acquired by some-engine else and we will have
> material for thinking more carefully.
> At this point of time I am not sure if instead we should introduce a flag in
> Write_rows event to force its applying in replace i.e idempotent
> fashion.
> So let's keep it simple as it's done and I will add your todo comments.
>
Acceptable, even though I think it would be better to add the flag.
[snip]
>>> @@ -88,6 +88,16 @@ TYPELIB delay_key_write_typelib=
>>> delay_key_write_type_names, NULL
>>> };
>>> +const char *slave_exec_mode_names[]=
>>> +{ "STRICT", "IDEMPOTENT", NullS };
>>> +static const unsigned int slave_exec_mode_names_len[]=
>>> +{ strlen("STRICT"), strlen("IDEMPOTENT"), 0 };
>>>
>>>
>> Please don't use function calls (that is, strlen()) when doing static
>> initialization. It's unnecessary and it can also cause portability
>> problems. It is also contrary to how initialization is done
>> elsewhere. Without having delved deeper into how the lengths are
>> computed, there's plenty of code to compute the length of these fields
>> when they are being used, so I think that you don't have to add the
>> lengths explicitly.
>>
>> Could you check to see if this definition works?
>>
>> TYPELIB slave_exec_mode_typelib=
>> {
>> array_elements(slave_exec_mode_names)-1, "", slave_exec_mode_names, NULL
>> };
>>
>>
>>
>> If it does, then you can remove the explicit length array. In either
>> case, don't use strlen() inside a static initializer.
>>
>>
>
> I don't see reason to check:
>
> typedef struct st_typelib { /* Different types saved here */
> unsigned int count; /* How many types */
> const char *name; /* Name of typelib */
> const char **type_names;
> unsigned int *type_lengths;
> } TYPELIB;
>
> We need type_lengths to reference to lengths.
>
... but it is not used elsewhere.
> I see that all usages of declaring TYPELIB deals with
> explicit numbers, which is awkward - i believe you agree.
>
I don't see that. Here are a few excerpts from the server code:
client/mysqladmin.c:
static TYPELIB command_typelib=
{ array_elements(command_names)-1,"commands", command_names, NULL};
client/mysqldump.c:
TYPELIB compatible_mode_typelib=
{array_elements(compatible_mode_names)-1, "", compatible_mode_names,
NULL};
client/mysqltest.c:
TYPELIB command_typelib=
{array_elements(command_names),"", command_names, 0};
sql-common/client.c:
static TYPELIB option_types={array_elements(default_options)-1,
"options",default_options, NULL};
TYPELIB sql_protocol_typelib =
{array_elements(sql_protocol_names_lib)-1,"",
sql_protocol_names_lib, NULL};
sql/events.cc:
const TYPELIB Events::opt_typelib= {
array_elements(opt_event_scheduler_state_names)-1, "",
opt_event_scheduler_state_names, NULL
};
sql/handler.cc:
TYPELIB tx_isolation_typelib= {
array_elements(tx_isolation_names)-1,"", tx_isolation_names, NULL
};
> To address your concern i shall invoke sizeof() instead :)
>
Why not stick to the established convention instead?
Just my few cents,
Mats Kindahl
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com