List:Commits« Previous MessageNext Message »
From:Sergey Vojtovich Date:November 24 2010 1:32pm
Subject:Re: bzr commit into mysql-next-mr branch (bar:3201) WL#5510
View as plain text  
Hi Alexander,

ok to push. Though there are a few comments inline. I'm leaving
them up to you to decide.

On Thu, Nov 18, 2010 at 07:00:40AM +0000, Alexander Barkov wrote:
> #At file:///home/bar/mysql-bzr/mysql-next-mr.w5510/ based on
> revid:alexander.nozdrin@stripped
> 
>  3201 Alexander Barkov	2010-11-18
>       WL#5510 Functions to_base64 and from_base64
...skip...

> === modified file 'mysys/base64.c'
> --- a/mysys/base64.c	2009-03-20 14:27:53 +0000
> +++ b/mysys/base64.c	2010-11-18 06:58:12 +0000
...skip...

> @@ -100,24 +137,176 @@ base64_encode(const void *src, size_t sr
>  }
>  
>  
> +/*
> +  Helper table for base64_decode
> +  -2 means "space character"
> +  -1 means "bad character"
> +  Non-negative values mean valid base64 encoding character.
> +*/
> +static int8
> +from_base64_table[]=
> +{
> +/*00*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-2,-2,-2,-2,-2,-1,-1, 
> +/*10*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*20*/  -2,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,62,-1,-1,-1,63, /*  !"#$%&'()*+,-./ */
> +/*30*/  52,53,54,55,56,57,58,59,60,61,-1,-1,-1,-1,-1,-1, /* 0123456789:;<=>?
> */
> +/*40*/  -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14, /* @ABCDEFGHIJKLMNO */
> +/*50*/  15,16,17,18,19,20,21,22,23,24,25,-1,-1,-1,-1,-1, /* PQRSTUVWXYZ[\]^_ */
> +/*60*/  -1,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40, /* `abcdefghijklmno */
> +/*70*/  41,42,43,44,45,46,47,48,49,50,51,-1,-1,-1,-1,-1, /* pqrstuvwxyz{|}~  */
> +/*80*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*90*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*A0*/  -2,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*B0*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*C0*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*D0*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*E0*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 
> +/*F0*/  -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
> +};
You have ending spaces in the above lines and a lot more
around code. Coding style says: "Avoid trailing whitespace,
in code and comments."

> +
> +
> +/*
> +  Convert base64 encoding character to bits
> +
> +  SYNOPSIS
> +    my_base64_pos
> +    
> +    c     - input character
> +    error - pointer to error
> +
> +  DESCRIPTION
> +    Convert input character to a number in the range [0..63].
> +    If input character is not valid base64 encoding character,
> +    then error is set to 1.
> +
> +  RETURN VALUE
> +    -2  - c is a space character
> +    -1  - c is an unknow character
> +    non-negative number - c is a valid base64 encoding character
> +*/
This type of function comment is not modern. We use doxygen these days.
Coding style says: "As of 2007, MySQL uses Doxygen comments for file,
section, class, structure, function and method comments."

Also you mess up "function status" and "return value". And return 0 on
failure. Coding style says: "All functions that can report an error
(usually an allocation error), should return 0/FALSE/false on success,
1/TRUE/true on failure. Other return values should go in an output argument."

>  static inline uint
> -pos(unsigned char c)
> +my_base64_pos(unsigned char c, int *error)
> +{
> +  int res= from_base64_table[c];
> +  if (res < 0)
> +  {
> +    *error= 1;
> +    return 0;
> +  }
> +  return (uint) res;
> +}
> +
> +
> +typedef struct my_base64_decoder_t
>  {
> -  return (uint) (strchr(base64_table, c) - base64_table);
> +  const char *src; /* Pointer to the current input position        */
> +  const char *end; /* Pointer to the end of input buffer           */
> +  unsigned int c;  /* Collect bits into this number                */
> +  int error;       /* Error code                                   */
> +  uchar state;     /* Character number in the current group of 4   */
> +  uchar mark;      /* Number of padding marks in the current group */
> +} MY_BASE64_DECODER;
> +
> +
> +/*
> +  Skip leading spaces in a base64 encoded string
> +  
> +  SYNOPSIS
> +    my_base64_decoder_skip_spaces
> +    
> +    decoder  Pointer to MY_BASE64_DECODER
> +    
> +  DESCRIPTION
> +    Skip leading spaces and stop on the first non-space character.
> +    decoder->p will point to the first non-space character, or
> +    to the end of the input string.
> +    
> +  RETURN VALUE
> +    0 - end-of-input found
> +    1 - if there are some more non-space characters
> +*/
> +static inline int
> +my_base64_decoder_skip_spaces(MY_BASE64_DECODER *decoder)
> +{
> +  for ( ; decoder->src < decoder->end; decoder->src++)
> +  {
> +    if (from_base64_table[(uchar) *decoder->src] != -2)
> +      return 1;
> +  }
> +  return 0;
>  }
>  
>  
> -#define SKIP_SPACE(src, i, size)                                \
> -{                                                               \
> -  while (i < size && my_isspace(&my_charset_latin1, * src))     \
> -  {                                                             \
> -    i++;                                                        \
> -    src++;                                                      \
> -  }                                                             \
> -  if (i == size)                                                \
> -  {                                                             \
> -    break;                                                      \
> -  }                                                             \
> +/*
> +  Get the next character from a base64 encoded string
> +
> +  SYNOPSIS
> +    my_base64_decoder_getch
> +    
> +    decoder  Pointer to MY_BASE64_DECODER
> +    
> +  DESCRIPTION
> +    Skip spaces, then scan the next base64 character or a pad character
> +    and collect bits into decoder->c.
> +    
> +  RETURN VALUE
> +    0 - error (unexpected character or unexpected end-of-input found)
> +    1 - success (valid base64 encoding character found)
> +*/
> +static inline int
> +my_base64_decoder_getch(MY_BASE64_DECODER *decoder)
> +{
> +  if (!my_base64_decoder_skip_spaces(decoder))
> +  {
> +    if (decoder->state > 0)
> +      decoder->error= 1; /* Unexpected end-of-input found */
> +    return 0;
> +  }
> +  
> +  decoder->c <<= 6;
> +  decoder->c+= my_base64_pos(*decoder->src++, &decoder->error);
> +  
> +  if (!decoder->error) /* Valid base64 character found */
> +  {
> +    if (decoder->mark)
> +    {
> +      /* If we have scanned '=' already, then only '=' is valid */
> +      DBUG_ASSERT(decoder->state == 3);
> +      decoder->error= 1;
> +      decoder->src--;
> +      return 0; /* '=' expected */
> +    }
> +    decoder->state++;
> +    return 1;
> +  }
> +  
> +  switch (decoder->state) {
Coding style says:
  Indent switch like this: 

  switch (condition)
  {
  case XXX:
    statements;
  case YYY:
    {
      statements;
    }
  }

> +  case 0:
> +  case 1:
> +    decoder->src--;
> +    return 0; /* base64 character expected */
> +    break;
> +  
> +  case 2:
> +  case 3:
> +    if (decoder->src[-1] == '=')
Afair there is some document, which says that negative index is
insecure and suggests not to use them. Unfortunately I didn't read
this document and lost it's link. So I don't know details.

> +    {
> +      decoder->error= 0;
> +      decoder->mark++;
> +    }
> +    else
> +    {
> +      decoder->src--;
> +      return 0; /* base64 character or '=' expected */
> +    }
> +    break;
> +  
> +  default:
> +    DBUG_ASSERT(0);
> +  }
> +  
> +  decoder->state++;
> +  return 1;
>  }
>  
>  
> @@ -155,75 +344,40 @@ int
>  base64_decode(const char *src_base, size_t len,
>                void *dst, const char **end_ptr)
>  {
> -  char b[3];
> -  size_t i= 0;
> -  char *dst_base= (char *)dst;
> -  char const *src= src_base;
> -  char *d= dst_base;
> -  size_t j;
> -
> -  while (i < len)
> +  char *d= (char*) dst;
> +  MY_BASE64_DECODER decoder;
> +  
> +  decoder.src= src_base;
> +  decoder.end= src_base + len;
> +  decoder.error= 0;
> +  decoder.mark= 0;
> +  
> +  for ( ; ; )
>    {
> -    unsigned c= 0;
> -    size_t mark= 0;
> -
> -    SKIP_SPACE(src, i, len);
> -
> -    c += pos(*src++);
> -    c <<= 6;
> -    i++;
> -
> -    SKIP_SPACE(src, i, len);
> -
> -    c += pos(*src++);
> -    c <<= 6;
> -    i++;
> -
> -    SKIP_SPACE(src, i, len);
> -
> -    if (*src != '=')
> -      c += pos(*src++);
> -    else
> -    {
> -      src += 2;                /* There should be two bytes padding */
> -      i= len;
> -      mark= 2;
> -      c <<= 6;
> -      goto end;
> -    }
> -    c <<= 6;
> -    i++;
> -
> -    SKIP_SPACE(src, i, len);
> -
> -    if (*src != '=')
> -      c += pos(*src++);
> -    else
> +    decoder.c= 0;
> +    decoder.state= 0;
> +    
> +    if (!my_base64_decoder_getch(&decoder) ||
> +        !my_base64_decoder_getch(&decoder) ||
> +        !my_base64_decoder_getch(&decoder) ||
> +        !my_base64_decoder_getch(&decoder))
> +      break;
> +    
> +    *d++= (decoder.c >> 16) & 0xff;
> +    *d++= (decoder.c >>  8) & 0xff;
> +    *d++= (decoder.c >>  0) & 0xff;
> +    
> +    if (decoder.mark)
>      {
> -      src += 1;                 /* There should be one byte padding */
> -      i= len;
> -      mark= 1;
> -      goto end;
> +      d-= decoder.mark;
> +      break;
>      }
> -    i++;
> -
> -  end:
> -    b[0]= (c >> 16) & 0xff;
> -    b[1]= (c >>  8) & 0xff;
> -    b[2]= (c >>  0) & 0xff;
> -
> -    for (j=0; j<3-mark; j++)
> -      *d++= b[j];
>    }
> -
> +  
>    if (end_ptr != NULL)
> -    *end_ptr= src;
> -
> -  /*
> -    The variable 'i' is set to 'len' when padding has been read, so it
> -    does not actually reflect the number of bytes read from 'src'.
> -   */
> -  return i != len ? -1 : (int) (d - dst_base);
> +    *end_ptr= decoder.src;
> +  
> +  return decoder.error ? -1 : (int) (d - (char*) dst);
>  }
In your reply to my first review you say: "... for the sake of better
code readability".

I did a small subjective comparison:
I had no questions about old implementation after 7 minutes of study.
I had no questions about new implementation after 137 minutes of study.

But new code does what it supposed to do. It doesn't break existing use
cases. So I'm ok with new implementation.

Regards,
Sergey
-- 
Sergey Vojtovich <svoj@stripped>
MySQL AB, Software Engineer
Izhevsk, Russia, www.mysql.com
Thread
bzr commit into mysql-next-mr branch (bar:3201) WL#5510Alexander Barkov18 Nov
  • Re: bzr commit into mysql-next-mr branch (bar:3201) WL#5510Sergey Vojtovich24 Nov
    • Re: bzr commit into mysql-next-mr branch (bar:3201) WL#5510Alexander Barkov25 Nov