List: Internals
From: Øystein Grøvlen
Date: May 19 2016 12:45pm
Subject: Re: storage engine: understanding the row format
Hi,

On 19 May 2016 14:42, Christoph Rupp wrote:
> Hi Øystein,
>
> Thanks for the fast and helpful reply!
>
> I assume that table->s->null_bytes tells me how many bytes are used to
> describe the nullable columns?

Correct.
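
As a rough sketch of how that could be used when packing a row: copy the
first null_bytes bytes unchanged (possibly zero of them), then copy only the
used part of each VARCHAR. ColumnLayout and pack_row are made-up names for
illustration, not MySQL types; a real engine would take these values from
table->s->null_bytes and the Field objects. The sketch also assumes that the
VARCHAR length prefix is 1 or 2 bytes, stored low byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the per-column data an engine would
// read from MySQL's Field objects. Not a MySQL type.
struct ColumnLayout {
    bool is_varchar;        // true for MYSQL_TYPE_VARCHAR columns
    uint32_t length_bytes;  // size of the VARCHAR length prefix (1 or 2), else 0
    uint32_t pack_length;   // bytes the column occupies in the unpacked row
};

// Copies the NULL bitmap and then, per column, only the bytes in use.
std::vector<uint8_t> pack_row(const uint8_t *buf, size_t null_bytes,
                              const std::vector<ColumnLayout> &cols) {
    std::vector<uint8_t> out;

    // The NULL bitmap is null_bytes long; it is empty when no column is nullable.
    out.insert(out.end(), buf, buf + null_bytes);

    const uint8_t *src = buf + null_bytes;
    for (const ColumnLayout &col : cols) {
        size_t used = col.pack_length;  // fixed-size columns are copied whole
        if (col.is_varchar) {
            assert(col.length_bytes == 1 || col.length_bytes == 2);
            // Assumption: the length prefix is stored low byte first.
            size_t data_len = src[0];
            if (col.length_bytes == 2)
                data_len |= static_cast<size_t>(src[1]) << 8;
            used = col.length_bytes + data_len;
            assert(used <= col.pack_length);
        }
        out.insert(out.end(), src, src + used);
        src += col.pack_length;  // skip the unused tail of the column
    }
    return out;
}
```

For the nullable table in this thread, and assuming a single-byte character
set so that VARCHAR(30) occupies 31 bytes, calling pack_row with
null_bytes = 1 would shrink the 36-byte row to 7 bytes. For the NOT NULL
table, null_bytes = 0 and nothing is copied before the first column.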

--
Øystein

>
> Best regards
> Christoph
>
> 2016-05-19 14:34 GMT+02:00 Øystein Grøvlen
> <oystein.grovlen@stripped>:
>> Hi Christoph,
>>
>> On 19 May 2016 14:15, Christoph Rupp wrote:
>>>
>>> Hi,
>>>
>>> If a row has variable-length blobs (fields of type
>>> MYSQL_TYPE_VARCHAR), then the serialized row stores the full length of
>>> the blob, even if most bytes are unused. In such cases, I'd like to
>>> "compress" the row before writing it to disk.
>>
>>
>> Note that VARCHAR and BLOB are different types.  Unlike VARCHAR data,
>> blob data is not stored in the internal record buffer; it is allocated in
>> separate buffers.
>>
>>>
>>> However, I have a few difficulties understanding the row format. The
>>> following two CREATE TABLE statements are relatively similar (the
>>> first one creates an additional index). But the first byte of their row
>>> buffers differs, and I don't understand why.
>>
>>
>> The index is not the only difference.  Nullability of the value column also
>> differs.
>>
>>>
>>> CREATE TABLE test (value VARCHAR(30) NOT NULL, INDEX(value), num
>>> INTEGER PRIMARY KEY);
>>> INSERT INTO test VALUES("1", 1);
>>>
>>> (gdb) x/8b buf
>>> 0x7fff5000e660:    1    49    -113    -113    -113    -113    -113    -113
>>>
>>> buf[0] stores the length of 'value', buf[1] stores the data of 'value'.
>>>
>>> CREATE TABLE test (value VARCHAR(30), num INTEGER PRIMARY KEY);
>>> INSERT INTO test VALUES("1", 1);
>>>
>>> (gdb) x/8b buf
>>> 0x7fff50013cf0:    -2    1    49    -113    -113    -113    -113    -113
>>>
>>> Now buf[1] stores the length and buf[2] stores the data of 'value'.
>>>
>>> But what is buf[0]?
>>
>>
>> The additional byte(s) record which nullable columns are NULL.  For tables
>> with no nullable columns, there is no such byte.
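
To illustrate with the dump above: the -2 that gdb prints as a signed decimal
is the byte 0xFE, and the -113 filler bytes are 0x8F. The helper names below
are made up for this example. The sketch assumes that the first nullable
column is tracked by the lowest bit of the first byte; that is an assumption
about the layout, and real engine code should ask the Field object instead
(for example via Field::is_null()).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Converts a byte value as shown in a signed-decimal dump (for example -2)
// back to its raw bit pattern (0xFE).
uint8_t raw_byte(int dumped_value) {
    assert(dumped_value >= -128 && dumped_value <= 255);
    return static_cast<uint8_t>(dumped_value);
}

// Returns true if bit `bit_index` of the NULL bitmap at the start of `record`
// is set. Assumption: bits are counted from the least significant bit of
// byte 0.
bool null_bit_is_set(const uint8_t *record, size_t bit_index) {
    return ((record[bit_index / 8] >> (bit_index % 8)) & 1u) != 0;
}
```

Under that assumption, 0xFE has its lowest bit cleared, which matches a row
where 'value' is not NULL.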
>>
>> Regards,
>>
>> --
>> Øystein
>>
>>
>>>
>>> Is there documentation for the serialized row format?
>>>
>>> Thanks
>>> Christoph
>>>
>>> PS: here's the code that I currently use:
>>>
>>> static inline ups_record_t
>>> pack_record(TABLE *table, uint8_t *buf, uint8_t *arena)
>>> {
>>>     assert(!row_is_fixed_length(table));
>>>
>>>     uint8_t *src = buf;
>>>     uint8_t *dst = arena;
>>>
>>>     // copy the first byte - whatever it is
>>>     // this causes problems because in some cases there is no "first byte"!
>>>     *dst = *src;
>>>     dst++;
>>>     src++;
>>>
>>>     for (Field **field = table->field; *field != 0; field++) {
>>>       uint32_t type = (*field)->type();
>>>       uint16_t key_size;
>>>       uint32_t len_bytes;
>>>
>>>       if (type == MYSQL_TYPE_VARCHAR) {
>>>         // see Field_blob::Field_blob() (in field.h) - need 1-4 bytes to
>>>         // store the real size
>>>         if ((*field)->field_length <= 255) {
>>>           len_bytes = 1;
>>>           key_size = *src;
>>>         }
>>>         else if ((*field)->field_length <= 65535) {
>>>           len_bytes = 2;
>>>           key_size = *(uint16_t *)src;
>>>         }
>>>         else if ((*field)->field_length <= 16777215) {
>>>           len_bytes = 3;
>>>           key_size = *src; // TODO implement this
>>>         }
>>>         else {
>>>           len_bytes = 4;
>>>           key_size = *(uint32_t *)src;
>>>         }
>>>       }
>>>       else {
>>>         len_bytes = 0;
>>>         key_size = (*field)->key_length();
>>>       }
>>>
>>>       ::memcpy(dst, src, key_size + len_bytes);
>>>       src += (*field)->pack_length();
>>>       dst += key_size + len_bytes;
>>>     }
>>>
>>>     ups_record_t r = ups_make_record(arena, (uint32_t)(dst - arena));
>>>     return r;
>>> }
>>>
>>
>>
>> --
>> Øystein


-- 
Øystein