Georgi Kodinov wrote:
> Hi Mats,
> Here's my take on the fix (http://lists.mysql.com/commits/124827) :
> I think my_strspn/my_strcspn are not fully character set aware, they assume that:
> a) the delimiters string is in a 1-byte character set
> b) the string to compare it with is also in 1-byte character set
> While this is probably ok by a stretch with most of the OSes (since the delimiters in
> UTF-8 are 1 byte and I don't think there're valid letters starting with the same byte as
> the delimiters) it's not correct in the general case.
> I see that my_strchr() suffers the same limitations.
> IMHO we either implement a proper character set agnostic
> my_strchr/my_strspn/my_strcspn or we don't put it in the standard library.
> Moreover we have a TODO: in my_strchr.c:
> "TODO: should be moved to CHARSET_INFO if it's going to be called frequently."
> All that said I think that it's probably OK to add these two functions there as long
> as the limitations are clearly explained in a comment and a TODO is added to move them to
> CHARSET_INFO similarly to my_strchr(). But I will leave this to Bar to decide (I've cc-ed
I think it's better not to add a new function,
because we already have a thing like this:
a function in MY_CHARSET_HANDLER:
size_t (*scan)(struct charset_info_st *,
const char *b, const char *e, int sq);
It returns the length of the initial segment of the string
which consist entirely of characters of a certain sequence "sq".
It currently understands these types in "sq":
m_ctype.h:#define MY_SEQ_INTTAIL 1
m_ctype.h:#define MY_SEQ_SPACES 2
I suggest to extend this function with a new sequence type,
Note, you don't need to implement this in ctype-ucs2.c,
a DEBUG_ASSERT(0) is enough.
> One additional question is why are you adding a function that nobody is using
> (my_strspn) ?
> This is a sure recipe for problems later and contributes to the coverage problem that
> we have.
> If you want to explain how it's done add a comment, not a dead function.
> Other than this I don't have objections to the path other than some trivia like e.g.
> adding an inline to the check_if_path() definition (and probably calling it somewhat
> differently, e.g. has_no_path_delimiters()).
> Best Regards,