From: Ann W. Harrison Date: November 11 2008 11:19pm Subject: Re: Serial log record encoding List-Archive: http://lists.mysql.com/falcon/173 Message-Id: <491A1304.8010504@mysql.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="Boundary_(ID_mudiznPJB8eGPf1pgHyx9w)" --Boundary_(ID_mudiznPJB8eGPf1pgHyx9w) Content-type: text/plain; format=flowed; charset=ISO-8859-1 Content-transfer-encoding: 7BIT Chris, Now that you've solved that problem, here's the answer to the encoding... Best, Ann --Boundary_(ID_mudiznPJB8eGPf1pgHyx9w) Content-type: text/plain; name=FalconRecordEncoding.txt Content-transfer-encoding: 8BIT Content-disposition: inline; filename=FalconRecordEncoding.txt Falcon record encoding The purpose of data compression in a database is to improve throughput by reducing the amount of I/O needed to read a set of records. Serious compression algorithms can reduce the amount of data stored enormously – just zip a database file to see how much can be saved – but they do so at a cost of CPU that overwhelms the I/O savings. Simple run-length compression reduces the amount of data stored, cheaply, but when the amount of empty space is significant, it wastes considerable space. Much of the compressible space in a database occurs because fixed length fields are used to hold variable length data. Fields must be declared large enough to hold the largest possible value. Compression helps primarily by removing the trailing blanks in char and varchar fields. Rather than compress data, Falcon uses a variable length data encoding for records stored in the memory-resident record cache and on disk. The encoding depends on the stored values, not on the type declaration of the field. The encoding eliminates excess space, at a slight increase in the cost of referencing individual fields. When records hold their fields in full storage format, the system can reference individual fields by indexing directly into the record. Any value-based compression or encoding requires special handing to locate individual fields. To reference field three, for example, requires reading the encoding byte and possibly the length of fields one and two to locate the code for field three. The in-memory record handling code keeps an index of the offsets of fields it has visited, so finding field thirty-six after finding field thirty-three does not require looking at fields one through thirty-three. Each field in a Falcon encoded record starts with a byte that describes the field, giving a possible 255 different field types, which is a lot, even considering the number of data types currently supported. The types are declared in an enum (EncodedDataStream.h). The enum names generally have the format eds. is the data type. is either Len or Count. If it is Len, then nn is generally the number of bytes that follow. If is Count, then nn is generally the number of count bytes that represent the length of the actual value. Regardless of type, all null-valued fields are represented by a type byte that indicates that the field is null. Nulls are edsNull, regardless of their declared type, making it the exception to the naming rule, since it has neither a section nor an . Before switching from compression to record encoding, Falcon kept bits indicating which fields were null, effectively using one byte for every eight fields in the record. Record encoding eliminates the need for null flag bits. Falcon represents several MySQL data types as unscaled integer: tiny, short, int24, long, longlong, year, enum, set, and bit. Numeric and decimal values less than 19 digits with a scale of zero are also stored as unscaled integers. Unscaled integers between –10 and 31 are represented by distinct types. Regardless of the declaration of the field – tinyint, smallint, int, or bigint, if the value is between –10 and 31, the field is represented by a single byte. For example, any integer field with a value of zero is represented by edsInt0, any integer field with the value 22 is represented as edsInt22, and any integer field with the value –9 is represented by edsIntMinus9. Larger integer values start with a single byte that indicates how many bytes are required to represent the value, and that number of bytes. The range is from one byte (values from –128 to 127, excluding –10 to 31) to eight bytes, including the rarely seen 3, 5, and 7 byte lengths. For example, the value 100,000 is expressed in four bytes – one type byte and three data bytes as edsIntLen3, 1, 134, 160 [Note – the bytes may be in reverse order, but they are consistent and not dependent on machine architecture] [Second note, that conversion was done partly on my fingers and is not reliable.] The representation is the same whether the field was declared as int24, long, or longlong. For some ranges values of integer fields, the encoded data representation is longer than the actual field, but any integer whose value has a leading byte of zeros encodes to its storage length. Any integer whose value has two or more leading bytes of zeros encodes to less than its storage length. Falcon uses scaled integer values to represent scaled numeric and decimal numbers in MySQL. A scaled integer is stored as a binary value plus a scale. They are represented by a type byte that indicates the type and the number of bytes required to represent the actual value, followed by a byte containing the scale. This is the exception to the rule that represents the number of bytes remaining in the field. In the case of scaled numbers, the remaining bytes are plus one for the scale. A numeric field described as (18, 4) containing the value zero is represented in two bytes as edsScaledLen0, 4. The same definition containing the value 100,000 is represented in five bytes as edsScaledLen3, 4, 1, 134, 160 [same notes as above]. If the scaled numeric value has a precision greater than 18 digits it is represented as edsScaledCount1 followed by a byte of scale, a byte of length, followed by the bytes of data. The encoding of floating point numbers is similar, though the truncation occurs at the end of floating point numbers. All floating point is represented internally as double precision. Trailing zeros in the mantissa are eliminated. A floating point zero is represented as a single byte of type edsDoubleLen0. The reader is invited to guess the encoding of 100,000 as a floating-point number. My guess is edsDoubleLen3, 64, 248, 106. [Same notes as above]. The encoding of floating point numbers goes from edsDoubleLen0 to edsDoubleLen8. Netfrastructure represents all character strings as UTF8. An empty character string is encoded as a single byte of edsUtf8Len0. Character strings between 1 byte and 39 bytes have distinct codes ranging from edsUtf8Len1 followed by one byte of data, to edsUtf8Len39 followed by 39 bytes of data. Larger characters strings are represented by a code followed by a number of bytes that describe the length of the string, followed by the string. This paper, so far, would be a string of about 4666 bytes – all in the single character representation of UTF8 - and would be encoded as edsUtf8Count2, 18, 58, followed by 4666 bytes starting with 'F', 'a', 'l', 'c', 'o', 'n', ' r', 'e', 'c', 'o', 'r', 'd', 'e', 'n', 'c', 'o', 'd', 'i', 'n', 'g' …. [Notes; the bytes of the length may be in the wrong order and I may have miscalculated the byte value of 4,666 and the length may have changed.] In the MySQL environment, Falcon handles character strings as opaque data, with an encoding that is similar to the Netfrastructure character encoding. Opaque data with no bytes is represented as edsOpaqueLen0. Four bytes of opaque data is represented as edsOpaqueLen4 plus the four bytes. 4,666 bytes of opaque data is represented as edsOpaqueCount2, 18, 58, …. [Notes as above.] Tiny blobs – blobs of fewer than 256 bytes – are stored as edsOpaqueCount1 if they are longer than 39 bytes, or as edsOpaqueLen for blobs between 0 and 39 bytes. Blob and clob data are not encoded. Falcon stores blobs and clobs in a separate data segment from the record data. Records contain the blob and clob numbers that Falcon uses to locate their data. The major saving of encoding is the removal of unneeded space from records. Blobs and clobs are stored at their exact length and do not suffer from the problem of over definition in fixed length fields. The lengths of blob and clob identifiers vary between zero and four bytes. A two-byte blob identifier is encoded as edsBlobLen2 followed by the two bytes of the identifier. Falcon converts MySQL date and time data types to its internal formats: Time, Timestamp, and Date. Time is stored as the number of milliseconds since midnight, and encoded as edsTimeLen0 (midnight) to edsTimeLen4. Date is stored as the number of milliseconds since January 1, 1970 and encoded as edsMilliSec0 to edsMilliSec8. Timestamp is the number of nanoseconds since January 1, 1970, encoded as edsNanoSecLen0 to edsNanoSecLen8. --Boundary_(ID_mudiznPJB8eGPf1pgHyx9w)--