Hi Rafal,
I looked through the changes and they look good. I have only a few
minor comments, mostly about spelling and terminology.
On Aug 21, 2009, at 6:17 AM, Rafal Somla wrote:
> #At file:///ext/mysql/bzr/backup/stream-doc/ based on
> revid:hema@stripped
>
> 2863 Rafal Somla 2009-08-21
> Bug #40587 - Doxygen documentation of backup image format is
> not up to date
>
> This patch updates the doxygen documentation for the backup
> image format
> and fixes/clarifies issues found there. The format of the
> documentation is
> changed to the one which was proposed by Paul for the internals
> manual.
> @ sql/backup/kernel.cc
> - Add the main documentation page.
> - Include description of how to use kerenel API into doxygen
> docs.
kerenel -> kernel
> @ sql/backup/stream.cc
> - Add page describing general backup stream format used.
> Backup image format
> is described on a separate page.
> @ sql/backup/stream_v1.c
> Add pages describing general structure of image format v1 and
> the image layer
> format as implemented by functions in this file. Format of
> the image layer
> documentation has been changed to the one proposed by Paul
> for the internals
> manual.
> @ sql/backup/stream_v1_transport.c
> Added pages documenting the transport layer format of a
> backup image.
> Documentation format is changed to the one proposed by Paul
> for internals
> manual.
>
> modified:
> sql/backup/kernel.cc
> sql/backup/stream.cc
> sql/backup/stream_v1.c
> sql/backup/stream_v1_transport.c
> === modified file 'sql/backup/kernel.cc'
> --- a/sql/backup/kernel.cc 2009-08-11 11:09:31 +0000
> +++ b/sql/backup/kernel.cc 2009-08-21 11:16:45 +0000
> @@ -3,6 +3,54 @@
>
> @brief Implementation of the backup kernel API.
>
> + @todo Use internal table name representation when passing tables to
> + backup/restore drivers.
> + @todo Handle other types of meta-data in Backup_info methods.
> + @todo Handle item dependencies when adding new items.
> + @todo Handle other kinds of backup locations (far future).
> +*/
> +
> +/**
> + @mainpage
> +
> + @section Structure Structure of the Online Backup System
Well, it's no longer called Online Backup; it's MySQL Backup.
> +
> + @verbatim
> +
> + | Online Backup Module
> + |
> + (1) | +---------------+ +------------------------+
> + Server ---> | Backup Kernel | <--> | Backup/Restore drivers |
> + | +---------------+ (3) +------------------------+
> + Core <--- | Stream layer |
> + (2) | +---------------+
> + | (4) ^ (5)
> + | | |
> + +--------|---|----------------------------
> + | v
> + | Backup Stream Library
> + |
> + @endverbatim
> +
> + Backup stream library is autonomous so that external applications
> can link
> + against it to be able to read backup images created by the system.
> + The format of backup images is described @ref stream_format "here".
> +
> + The components of the system communicate with each other using
> well defined
> + interfaces:
> + -# @ref KernelAPI "Backup Kernel API"
> + -# Object Services API and Backup Log API
> + -# Backup Driver API
> + -# @ref streamlib "Backup Stream Library API"
> + -# Backup Stream Library Callback API
> +*/
> +
> +/**
> + @defgroup KernelAPI Backup kernel API.
> +
> + Backup kernel API is the interface between the backup system and
> its external
> + users.
> +
> @section s1 How to use backup kernel API to perform backup and
> restore operations
>
> To perform backup or restore operation an appropriate context must
> be created.
> @@ -52,11 +100,6 @@
> } // if code jumps here, context destructor will do the clean-up
> automatically
> @endcode
>
> - @todo Use internal table name representation when passing tables to
> - backup/restore drivers.
> - @todo Handle other types of meta-data in Backup_info methods.
> - @todo Handle item dependencies when adding new items.
> - @todo Handle other kinds of backup locations (far future).
> */
>
> #include "../mysql_priv.h"
>
> === modified file 'sql/backup/stream.cc'
> --- a/sql/backup/stream.cc 2009-08-11 11:09:31 +0000
> +++ b/sql/backup/stream.cc 2009-08-21 11:16:45 +0000
> @@ -1,3 +1,66 @@
> +/**
> + @page stream_format Backup Stream Format
> +
> + A backup stream is a sequence of bytes that consists of a 10-byte
> prefix
> + followed by a backup image.
> + @verbatim
> +
> + backup_stream: stream_prefix backup_image
> + stream_prefix: magic_number version
> + @endverbatim
> +
> + @c Magic_number consists of eight bytes. They are as follows:
> + @verbatim
> +
> + 0xE0, // ###.....
> + 0xF8, // #####...
> + 0x7F, // .#######
> + 0x7E, // .######.
> + 0x7E, // .######.
> + 0x5F, // .#.#####
> + 0x0F, // ....####
> + 0x03 // ......##
> + @endverbatim
> +
> + @c Version is stored in the following two bytes, the least
> + significant byte comming first.
comming -> coming
Actually, I would just delete "comming". :-)
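In case a short example would help readers of this page, here is a rough
sketch of how an external application might check the 10-byte prefix. The
helper name read_stream_prefix() is made up for illustration and is not part
of the stream library; only the magic bytes and the least-significant-byte-first
version order are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic number bytes as listed on the stream_format page. */
static const uint8_t BACKUP_MAGIC[8] = {
  0xE0, 0xF8, 0x7F, 0x7E, 0x7E, 0x5F, 0x0F, 0x03
};

/*
  Hypothetical helper (not a stream library function).
  Returns 0 and stores the format version if buf starts with a valid
  prefix, -1 otherwise.
*/
static int read_stream_prefix(const uint8_t *buf, size_t len,
                              unsigned *version)
{
  assert(buf != NULL && version != NULL);
  if (len < 10 || memcmp(buf, BACKUP_MAGIC, 8) != 0)
    return -1;
  /* Two version bytes follow the magic number, low byte first. */
  *version= (unsigned) buf[8] | ((unsigned) buf[9] << 8);
  return 0;
}
```

A caller would then pick the image reader that matches the returned version
(currently only version 1 exists, according to the page).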
> +
> + @c Backup_image that follows @c backup_prefix is stored in the
> format
> + indicated by the version number. Currently only one backup image
> format
> + is used -- @ref image_format1.
> +
> + The stream format does not provide direct support for compression
> or
> + encryption. But a complete backup stream can be compressed and/or
> encrypted
> + externally.
Okay, this is good to know. But BACKUP DATABASE now does have
compression syntax; is the compression supported by that statement a
form of "external" compression?
> + In that case, it will be wrapped in the compression/encryption
> format used,
> + with additional headers added in front of it. Before processing
> such a stream,
> + it should be first uncomressed/decrypted and then the 10 byte
> backup prefix
uncomressed -> uncompressed
> + should be seen.
> +
> + @note
> + The prefix containing the magic number and version number is @em
> not part of
> + the backup image. It is stored in the backup stream to make two
> things
> + possible:
> + - detection of where the backup image starts inside a backup
> stream,
> + - determination of the format of the backup image that follows
> the prefix.
> +
> + @par
> + For example, suppose that the backup image is not stored at the
> beginning
> + of a stream, but is preceded by some other information. An
> application can
> + scan the stream looking for the backup magic number. Once found,
> the
> + application will know with high probability that the version
> number and a
> + backup image follows.
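The scanning scenario in this paragraph could be illustrated with something
like the sketch below. It is only an assumption about how an application
might do it: find_backup_prefix() is a hypothetical name, and the function
works on a buffer already held in memory rather than on a real stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic number bytes as listed on the stream_format page. */
static const uint8_t MAGIC_BYTES[8] = {
  0xE0, 0xF8, 0x7F, 0x7E, 0x7E, 0x5F, 0x0F, 0x03
};

/*
  Hypothetical helper: return the offset of the first position where the
  magic number is found with room for the two version bytes after it,
  or -1 if there is no such position.
*/
static long find_backup_prefix(const uint8_t *buf, size_t len)
{
  size_t i;
  assert(buf != NULL);
  for (i= 0; i + 10 <= len; i++)
  {
    if (memcmp(buf + i, MAGIC_BYTES, 8) == 0)
      return (long) i;
  }
  return -1;
}
```

As the text says, a match only means the image follows with high probability,
so a real reader would still validate what comes after the prefix.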
> +
> + @par
> + In other scenarios, the 10-byte prefix might not be needed.
> Suppose that
> + the backup image is sent from some kind of a server to a client
> using a custom
> + protocol. The client knows that what server sends is a backup
> image. In the
> + initial handshake, the server tells the client the version of the
> backup
> + image format to follow. Thus, after opening the communication
> channel, the
> + client can directly read the bytes of the backup image and there
> is no need
> + to send the 10-byte prefix.
> +*/
> +
> #include "../mysql_priv.h"
> #include "my_dir.h"
>
>
> === modified file 'sql/backup/stream_v1.c'
> --- a/sql/backup/stream_v1.c 2009-06-30 20:37:35 +0000
> +++ b/sql/backup/stream_v1.c 2009-08-21 11:16:45 +0000
> @@ -9,12 +9,61 @@
>
> @brief
> Implementation of the high-level functions for writing and reading
> backup
> - image using version 1 of backup stream format.
> + image using version 1 of backup image format (see @ref
> image_format1).
>
> @todo handle errors when creating iterators in functions like
> bstream_wr_catalogue()
> @todo use data chunk sequence numbers to detect discontinuities in
> backup stream.
> */
>
> +/**
> + @page image_format1 Backup Image Format (v1)
> +
> + The backup image structure can be thought of at two levels:
> + - The transport layer consists of blocks, all of which have the
> same size
> + (except possibly the last). The size is given at the beginning
> of the first
> + block.
> + - The image layer consists of chunks, which have varying size.
> +
> + At the transport layer, blocks are read and decomposed to obtain
> fragments
> + (one or more per block). Fragments are assembled to obtain chunks.
> + A chunk might be smaller than a block, or it might contain
> information from
> + multiple blocks.
> +
> + At the higher level, backup image is seen as a sequence of data
> chunks. How to
> + interpret contents of these chunks is explained on @ref
> image_layer page.
> +
> + Page @ref transport_layer describes the low level structure of a
> backup image
> + -- how fixed size blocks are decomposed into fragments from which
> data chunks
> + are assembled.
> +*/
> +
> +/**
> + @page image_layer Image Layer Format
> +
> + On the top level, backup image consists of a premable, followed
> by several
premable -> preamble
> + chunks with table data and closed with a summary chunk.
> +
> +@verbatim
> + backup_image: preamble table_data ... summary
> +@endverbatim
> +
> + @c Preamble describes the image and objects that it contains. @c
> Table_data
> + chunks contain table contents.
> + @c Summary contains information which is known only when backup
> image has
> + been created. Currently this is mainly inforamtion about the
> validity point of
inforamtion -> information
> + the image and data needed for point-in-time recovery operations.
> +*/
> +
> +/**
> + @defgroup streamlib Backup Stream Library
> +
> + This library defines version 1 of backup image format described
> + @ref image_format1 "here".
> + It provides functions for writing and reading streams using this
> format.
> + The functions are declared in the stream_v1.h header.
> +*/
> +
> +
> #ifdef DBUG_OFF
> # define ASSERT(X)
> #else
> @@ -23,13 +72,6 @@
> # define ASSERT(X) assert(X)
> #endif
>
> -/**
> - @page streamlib Backup Stream Library
> -
> - This library defines version 1 of backup stream format. It
> provides functions
> - for writing and reading streams using this format. The functions
> are declared
> - in the stream_v1.h header.
> -*/
>
> /* local types */
>
> @@ -150,23 +192,29 @@ byte get_byte_size_t(size_t value)
>
> *************************************************************************/
>
> /**
> - @page stream_format Backup Stream Format (v1)
> -
> - Backup image consists of 3 main parts: preamble, table data and
> summary.
> - @verbatim
> -
> - [backup image]= [ preamble | table data | 0x00 ! summary(*) ]
> - @endverbatim
> -
> - The 0x00 byte separates table data chunks from the summary chunk.
> This works
> - because no table data chunk can start with 0x00.
> -
> - Optionally, summary can be included in the preamble, instead of
> being stored
> - at the end of the image. This is indicated by flags in the header.
> - @verbatim
> + @page image_layer
> + @section preamble Preamble Format
> +
> + The preamble contains global information about the image and
> describes
> + objects it contains.
>
> - [preamble]= [ header | summary (*) | catalogue | meta data ]
> - @endverbatim
> +@verbatim
> + preamble: header snapshot_description ... catalog metadata
> +@endverbatim
> +
> + @c Preamble consists of several chunks, which contain the following
> + information:
> + - Header information, such as global image flags, image creation
> time,
> + and server version number
> + - Information about table data snapshots stored in the image
> + - A catalog listing all objects stored in the image
> + - Metadata for the objects
> +
> + Backup image can include several table data snapshots storing
> data from the
> + tables. The snapshots correspond to backup drivers which were
> used to create
> + them. Snapshots are numbered, starting from 1, according to the
> order in
> + which their descriptions appear in the preamble. The number of
> snapshots
> + present in the image is given by the @c snapshot_count field in
> the header.
> */
>
> int bstream_wr_header(backup_stream*, struct
> st_bstream_image_header*);
> @@ -193,12 +241,13 @@ int bstream_wr_preamble(backup_stream *s
> }
>
> /**
> - Read backup image preamble creating all items stored in it
> + Read backup image preamble creating all objects stored in it
>
> @retval BSTREAM_ERROR Error while reading preamble
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOS Preamble has been read and there are no
> more chunks in
> the stream.
> + @see @ref preamble.
> */
> int bstream_rd_preamble(backup_stream *s, struct
> st_bstream_image_header *hdr)
> {
> @@ -233,27 +282,12 @@ int bstream_rd_preamble(backup_stream *s
> return ret;
> }
>
> -/**
> - @page stream_format
> -
> - @section summary Summary section
> -
> - @verbatim
> -
> - [summary]= [ vp time ! end time ! binlog pos ! binlog group pos ]
> - @endverbatim
> -
> - Summary starts with 0x00 byte to distinguish it from table data
> chunks which
> - never start with that value.
> - @verbatim
>
> - [binlog pos]= [ pos:4 ! binlog file name ]
> +/**
> + Save binlog position.
>
> - [binlog group pos] uses the same format as [binlog pos].
> - @endverbatim
> + @see @ref summary.
> */
> -
> -/** Save binlog position. */
> int bstream_wr_binlog_pos(backup_stream *s, struct
> st_bstream_binlog_info pos)
> {
> blob name;
> @@ -276,6 +310,8 @@ int bstream_wr_binlog_pos(backup_stream
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref summary.
> */
> int bstream_rd_binlog_pos(backup_stream *s, struct
> st_bstream_binlog_info *pos)
> {
> @@ -299,6 +335,8 @@ int bstream_rd_binlog_pos(backup_stream
> This function assumes that all members of the header (such as @c
> vp_time) are
> already filled. It also stores the 0x00 byte separating summary
> from the
> preceding table data chunks.
> +
> + @see @ref summary
> */
> int bstream_wr_summary(backup_stream *s, struct
> st_bstream_image_header *hdr)
> {
> @@ -325,6 +363,8 @@ int bstream_wr_summary(backup_stream *s,
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref summary
> */
> int bstream_rd_summary(backup_stream *s, struct
> st_bstream_image_header *hdr)
> {
> @@ -348,29 +388,59 @@ int bstream_rd_summary(backup_stream *s,
>
> *************************************************************************/
>
> int bstream_wr_snapshot_info(backup_stream*, struct
> st_bstream_snapshot_info*);
> -int bstream_rd_image_info(backup_stream*, struct
> st_bstream_snapshot_info*);
> +int bstream_rd_snapshot_info(backup_stream*, struct
> st_bstream_snapshot_info*);
>
> /**
> - @page stream_format
> -
> - @section header Header
> + @page image_layer
> + @subsection header Header Format
>
> - @verbatim
> +@verbatim
> + header: flags creation_time snapshot_count server_version
> extra_data
> +@endverbatim
> +
> + @c Header chunk contains the following fields:
> + - @c flags (2 bytes). Integer image flags.
> + - @c creation_time (6 bytes). Time of image creation.
> + - @c snapshot_count (1 byte). Number of snapshots included in the
> image.
> + - @c server_version (variable length). Version of the server
> which created the image.
> + - @c extra_data (variable length). The remaining bytes to the end
> of the chunk are reserved
> + for storing additional information - currently they are
> ignored.
> +
> + Bits in @c flags field have the following meanings:
> + - Bit 0: Reserved.
> + - Bit 1 (BSTREAM_FLAG_BIG_ENDIAN)
> + - If set, the server that created the backup has big-endian
> architecture.
> + - If clear, the server has little-endian architecture.
> + - Bit 2 (BSTREAM_FLAG_BINLOG)
> + - If set, the binlog_coords and binlog_group_coords fields in
> the summary contain valid values.
> + - If clear, binlog_coords and binlog_group_coords should be
> ignored (they are present in the summary but the values are not
> useful).
> + - Bits 3-15: Reserved.
> +
> + @c Server_version is encoded as follows:
> + - 1 byte. Integer major number
> + - 1 byte. Integer minor number
> + - 1 byte. Integer release number
> + - Variable length. String representation of the version
> + .
> + For example, a server version of "6.0.8-alpha" is stored using this
> + byte sequence: 0x06 0x00 0x08 0x0b 0x36 0x2e 0x30 0x2e 0x38 0x2d
> 0x61 0x6c 0x70 0x68 0x61
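A decoding example might make the byte sequence easier to follow. The sketch
below is based only on the example above: it assumes the version string is
preceded by a single length byte (0x0b in the example), which may not hold
for strings whose length needs more than one byte. The struct and function
names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for a decoded server_version field. */
struct server_version_sketch
{
  unsigned ver_major;
  unsigned ver_minor;
  unsigned ver_release;
  char     text[256];   /* room for 255 characters plus terminator */
};

/*
  Hypothetical decoder. Assumes a one-byte string length, as in the
  "6.0.8-alpha" example. Returns 0 on success, -1 if buf is too short.
*/
static int decode_server_version(const uint8_t *buf, size_t len,
                                 struct server_version_sketch *out)
{
  size_t slen;
  assert(buf != NULL && out != NULL);
  if (len < 4)
    return -1;
  out->ver_major=   buf[0];
  out->ver_minor=   buf[1];
  out->ver_release= buf[2];
  slen= buf[3];                 /* assumed single length byte */
  if (len < 4 + slen)
    return -1;
  memcpy(out->text, buf + 4, slen);
  out->text[slen]= '\0';
  return 0;
}
```

Fed the 15 bytes from the example, this yields 6, 0, 8 and the string
"6.0.8-alpha".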
>
> - [header]= [ flags:2 ! creation time ! #of snapshots:1 ! server
> version !
> - extra data | snapshot descriptions ]
> - @endverbatim
> + @note
> + The big-endian flag has no bearing on storage of values in the
> backup image
> + itself. It might be of use in the case where an image was written
> by some
> + native driver but fails to restore on a machine other than the
> one on which
> + it was created. If one checks the big-endian flag and finds that
> the backup
> + host byte order differs from the restore host byte order, that
> might indicate
> + that the native backup and restore drivers did not correctly deal
> with
> + different host byte orders. A possible workaround would be to
> attempt the
> + restore on a host with the same byte order as the backup host.
> +*/
>
> - [snapshot descriptions] contains descriptions of the table data
> snapshots
> - present in the image. Each description is stored in a separate
> chunk
> - (number of snapshots is given in the header).
> - @verbatim
> +/**
> + Write header of backup image
>
> - [snapshot descriptions]= [ snapshot description | ... | snapshot
> description ]
> - @endverbatim
> + @see @ref header.
> */
> -
> -/** Write header of backup image */
> int bstream_wr_header(backup_stream *s, struct
> st_bstream_image_header *hdr)
> {
> int ret= BSTREAM_OK;
> @@ -404,6 +474,8 @@ int bstream_wr_header(backup_stream *s,
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref header.
> */
> int bstream_rd_header(backup_stream *s, struct
> st_bstream_image_header *hdr)
> {
> @@ -425,7 +497,7 @@ int bstream_rd_header(backup_stream *s,
> return BSTREAM_ERROR;
>
> CHECK_RD_OK(bstream_next_chunk(s));
> - CHECK_RD_RES(bstream_rd_image_info(s, &hdr->snapshot[i]));
> + CHECK_RD_RES(bstream_rd_snapshot_info(s, &hdr->snapshot[i]));
> }
>
> rd_error:
> @@ -434,37 +506,50 @@ int bstream_rd_header(backup_stream *s,
> }
>
> /**
> - @page stream_format
> + @page image_layer
> + @subsection snapshot Snapshot Description Format
>
> - @subsection snapshot Snapshot description entry
> -
> - @verbatim
> -
> - [snapshot description] = [ image type:1 ! format version: 2 !
> global options:2 !
> - #of tables ! backup engine info !
> extra data ]
> - @endverbatim
> -
> - [image type] is encoded as follows:
> -
> - - 0 = snapshot created by native backup driver (BI_NATIVE),
> - - 1 = snapshot created by built-in blocking driver (BI_DEFAULT),
> - - 2 = snapshot created using created by built-in driver using
> consistent
> - read transaction (BI_CS).
> - - 3 = snapshot created by built-in no data driver (BI_NODATA),
> + Table data snapshots are created by backup drivers and store the
> data from
> + tables handled by these drivers. Each snapshot present in the
> image is
> + described by a @c snapshot_description chunk in the preamble.
> +
> +@verbatim
> + snapshot_description: snapshot_type format_version global_options
> + table_count [storage_engine_info] extra_data
> +
> + backup_engine_info: engine_name major_version minor_version
> +@endverbatim
>
> - Format of [backup engine info] depends on snapshot type. It is
> empty for the
> - default and CS snapshots. For native snapshots it has format
> - @verbatim
> + Each @c snapshot_description chunk contains the following
> information:
> + - @c snapshot_type (1 byte). Integer snapshot type, encoded as
> follows:
> + - 0 (BI_NATIVE): Snapshot was created by native backup driver.
> + - 1 (BI_DEFAULT): Snapshot was created by the built-in blocking
> driver.
> + - 2 (BI_CS): Snapshot was created by the built-in driver that uses
> + a consistent read transaction.
> + - 3 (BI_NODATA): Phony snapshot used for tables whose data is
> not stored
> + in the image.
> + - @c format_version (2 bytes). Integer snapshot format version.
> + - @c global_options (2 bytes). Integer global options. Reserved
> for future use.
> + - @c table_count (variable length). Integer count of the tables
> stored
> + in the snapshot.
> + - @c storage_engine_info (variable length). Information about
> storage engine
> + which provided the native backup driver (present only for
> native snapshots).
> + - @c extra_data (variable length). Extra data. Reserved for
> future use;
> + currently empty.
>
> - [backup engine info (native)] = [ storage engine name ! storage
> engine version ]
> + The @c backup_engine_info field is present only for native
> snapshots, that is,
> + snapshots of type 0 (BI_NATIVE). It has this format:
> +
> + - @c engine_name (variable-length). String engine name.
> + - @c major_version (1 byte). Integer major version of the engine.
> + - @c minor_version (1 byte). Integer minor version of the engine.
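If it is useful, the snapshot type encoding could be shown in code form.
This is just a sketch: the SKETCH_ names are mine and only mirror the BI_*
values listed above, and both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values of the snapshot_type byte as listed in the description above. */
enum sketch_snapshot_type
{
  SKETCH_BI_NATIVE=  0,   /* native backup driver */
  SKETCH_BI_DEFAULT= 1,   /* built-in blocking driver */
  SKETCH_BI_CS=      2,   /* built-in consistent read driver */
  SKETCH_BI_NODATA=  3    /* tables whose data is not stored */
};

/* Map a snapshot_type value to its symbolic name, NULL if unknown. */
static const char *snapshot_type_name(unsigned type)
{
  switch (type)
  {
  case SKETCH_BI_NATIVE:  return "BI_NATIVE";
  case SKETCH_BI_DEFAULT: return "BI_DEFAULT";
  case SKETCH_BI_CS:      return "BI_CS";
  case SKETCH_BI_NODATA:  return "BI_NODATA";
  default:                return NULL;
  }
}

/*
  The engine info field is present only for native snapshots, so a
  reader can decide from the first byte of the description chunk.
*/
static int snapshot_has_engine_info(const unsigned char *desc, size_t len)
{
  assert(desc != NULL);
  return len >= 1 && desc[0] == SKETCH_BI_NATIVE;
}
```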
> +*/
>
> - [server version] = [ major:1 ! minor:1 ! release:1 ! extra string ]
> +/**
> + Save description of table data snapshot
>
> - [engine version] = [ major:1 ! minor:1 ]
> - @endverbatim
> + @see @ref snapshot
> */
> -
> -/** Save description of table data snapshot */
> int bstream_wr_snapshot_info(backup_stream *s, struct
> st_bstream_snapshot_info *info)
> {
> int ret= BSTREAM_OK;
> @@ -509,8 +594,10 @@ int bstream_wr_snapshot_info(backup_stre
> @note
> This function allocates memory to store snapshot info. The caller is
> responsible for freeing this memory.
> +
> + @see @ref snapshot
> */
> -int bstream_rd_image_info(backup_stream *s, struct
> st_bstream_snapshot_info *info)
> +int bstream_rd_snapshot_info(backup_stream *s, struct
> st_bstream_snapshot_info *info)
> {
> unsigned short int type;
> int ret= BSTREAM_OK;
> @@ -555,75 +642,107 @@ int bstream_rd_image_info(backup_stream
>
> *************************************************************************/
>
> /**
> - @page stream_format
> -
> - @section catalogue Image catalogue
> -
> - The catalogue describes what items are stored in the image. Note
> that it
> - doesn't contain any meta-data, only item names and other
> information needed to
> - identify and select them.
> - @verbatim
> -
> - [catalogue]= [ charsets ! 0x00 ! users ! 0x00 ! tablespaces !
> 0x00 !
> - databases | db catalogue | ... | db catalogue ]
> - @endverbatim
> -
> - Catalogue starts with list of charsets where each charset is
> identified by its
> - name. In other places of the image, charsets can be identified by
> their
> - positions in this list. Number of charsets is limited to 256 so
> that one byte
> - is enough to identify a charset.
> - @verbatim
> -
> - [charsets]= [ charset name ! ... ! charset name ]
> - @endverbatim
> -
> - Two first entries in [charsets] have special meaning and should
> be always
> - present.
> -
> - First charset is the charset used to encode all strings stored in
> - the preamble. This should be a universal charset like UTF8, capable
> - of representing any string.
> -
> - Second charset in the list is the default charset of the server
> on which
> - image was created. It can be the same as the first charset.
> -
> - The following charsets are any charsets used by the items stored
> in the image
> - and thus needed to restore these items.
> -
> - @verbatim
> + @page image_layer
>
> - [users]= [ user name ! ... ! user name ]
> - @endverbatim
> + @section catalog Catalog Format
>
> - User list contains users for which any privileges are stored in
> the image.
> -
> - Following user list is a list of tablespaces used by the tables
> stored in
> - the backup image. Only tablespace names are listed here, their
> definitions
> - are stored in the meta-data section.
> - @verbatim
> -
> - [tablespaces]= [ ts name ! ... ! ts name ]
> - @endverbatim
> -
> - Finally, a list of all databases follows. If the list is empty, it
> - consists of a single null string. Otherwise it has format:
> - @verbatim
> -
> - [databases]= [ db info ! ... ! db info ]
> -
> - [db info]= [ db name ! db flags:1 ! optional extra data ]
> - [db flags]= [ has extra data:.1 ! unused:.7 ]
> - [optional extra data]= [data len:2 ! the data:(data len) ]
> - @endverbatim
> -
> - [optional extra data] is present only if indicated in the flags.
> -
> - If there are no databases in the image, the database list is
> empty and there
> - are no database catalogues.
> - @verbatim
> -
> - [catalogue (no databases)] = [ charsets ! 0x00 ! users ! 0x00 !
> tablespaces ! 0x00 ]
> - @endverbatim
> + The catalog describes what objects are stored in the backup
> image. It contains
> + no metadata because that is stored in a separate section of the
> image.
> + The catalog only lists the objects and provides the information
> needed to
> + identify and select them. It consists of a header chunk followed
> by zero or
> + more chunks that each describe objects in a single database:
> +
> +@verbatim
> + catalog: catalog_header [db_catalog ...]
> +@endverbatim
> +
> + @subsection catalog_header Catalog Header Format
> +
> +@verbatim
> + catalog_header: charsets 0x00 [users] 0x00 [tablespaces] 0x00
> databases
> +@endverbatim
> +
> + The charsets, users, and tablespaces sections of the @c
> catalog_header chunk
> + each contain a list of strings and are terminated by a 0x00 byte
> + (an empty string). The databases section contains database-
> information items
> + and extends to the end of the @c catalog_header chunk.
> +
> + The @c catalog_header starts with a list of character sets
> identified by name:
> +
> +@verbatim
> + charsets: charset_name ...
> +@endverbatim
> +
> + Character set names are string values represented using ASCII
> characters.
> + The first two @c charset_name entries have a special meaning and
> should
> + always be present:
> + -# The character set used to encode all strings stored in
> + the preamble following the charsets list. This should be a
> universal
> + character set capable of representing any string, such as UTF8.
> + -# The default character set of the server on which the backup
> image was
> + created. It can be the same as the first character set.
> +
> + @c charset_name values following the first two, if present, are
> any character
> + sets used by the objects stored in the image and thus needed to
> restore those
> + objects.
> +
> + References to character sets at other locations in the backup
> image are as
> + 0-based positions within the character set list. The number of
> character sets
> + is limited to 256 so that one byte is sufficient to identify a
> character set
> + by its number.
> +
> + @note Currently, even if some objects (such as tables or
> databases) use
> + character sets in their definition, these character sets will not
> be listed in
> + the image. On restore, all entries in this list are ignored.
> + Collation information is not stored currently, but might be in
> the future.
> +
> + @note Character set and collation numbers used within the image
> catalog are
> + internal to the backup image and have nothing to do with the IDs
> used in the
> + server. Backup image-processing code must translate between image
> IDs and
> + internal server IDs if necessary.
> +
> + After the character set names, a list of users follows:
> +
> +@verbatim
> + users: user_name ...
> +@endverbatim
> +
> + The users list is currently always empty. It is reserved for
> future use when
> + there will be a need to refer to users inside backup image (e.g.,
> when user
> + definitions are included).
> +
> + After users, a list of names of all tablespaces stored in the
> image follows:
> +
> +@verbatim
> + tablespace: tablespace_name ...
> +@endverbatim
> +
> + Each @c tablespace_name value is a string.
> +
> + After tablespaces, a list of all databases follows. If the list
> is empty, it
> + consists of a single empty string. Otherwise, it is a list of one
> or more
> + @c db_info entries:
> +
> +@verbatim
> + databases: 0x00 | db_info ...
> +@endverbatim
> +
> + @c db_info entries each contain fields that describe a single
> database:
> +
> +@verbatim
> + db_info: db_name db_flags [extra_data]
> +@endverbatim
> + - @c db_name (variable length). String database name.
> + - @c db_flags (1 byte). Integer database flags.
> + - @c extra_data (variable length). Extra data. Optional; present
> only if
> + indicated in the flags. It consists of:
> + - Integer data length (2 bytes).
> + - Data bytes, as many as specified by the length.
> +
> + Bits in the @c db_flags field are used as follows:
> + - Bits 0-6: Reserved
> + - Bit 7 (BSTREAM_FLAG_HAS_EXTRA_DATA): Set if the extra_data
> field is present
> + in the db_info entry.
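A small example of reading db_flags and the optional extra_data might help
here. The sketch below makes two assumptions that the text does not state
outright: that bit 0 is the least significant bit (so bit 7 is mask 0x80),
and that the 2-byte length is stored low byte first (by analogy with the
0x05 0x00 table type elsewhere in this patch). The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mask for bit 7 of db_flags (bit 0 taken as least significant). */
#define SKETCH_FLAG_HAS_EXTRA_DATA 0x80u

/*
  Hypothetical helper. buf points at the db_flags byte of a db_info entry.
  On success returns the number of bytes consumed (flags plus any extra
  data) and sets *extra and *extra_len; returns -1 if buf is too short.
*/
static long read_db_flags_and_extra(const uint8_t *buf, size_t len,
                                    const uint8_t **extra, size_t *extra_len)
{
  size_t n;
  assert(buf != NULL && extra != NULL && extra_len != NULL);
  *extra= NULL;
  *extra_len= 0;
  if (len < 1)
    return -1;
  if (!(buf[0] & SKETCH_FLAG_HAS_EXTRA_DATA))
    return 1;                       /* flags only, no extra data */
  if (len < 3)
    return -1;
  /* 2-byte data length, assumed low byte first. */
  n= (size_t) buf[1] | ((size_t) buf[2] << 8);
  if (len < 3 + n)
    return -1;
  *extra= buf + 3;
  *extra_len= n;
  return (long) (3 + n);
}
```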
> */
>
> /** Definition of extra data flag bits. */
> @@ -640,7 +759,7 @@ int bstream_rd_db_catalogue(backup_strea
> The contents of the image is read from the @c cat object using
> iterators
> and @c bcat_*() functions defined by the program using this library.
>
> - @see @c bcat_iterator_get(), @c bcat_iterator_next(), @c
> bcat_iterator_free()
> + @see @ref catalog
> */
> int bstream_wr_catalogue(backup_stream *s, struct
> st_bstream_image_header *cat)
> {
> @@ -753,7 +872,7 @@ int bstream_wr_catalogue(backup_stream *
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
>
> - @see @c bcat_add_item()
> + @see @ref catalog
> */
> int bstream_rd_catalogue(backup_stream *s, struct
> st_bstream_image_header *cat)
> {
> @@ -906,28 +1025,11 @@ int bstream_rd_catalogue(backup_stream *
> return ret;
> }
>
> -/**
> - @page stream_format
> -
> - Encoding of item types used in a backup image.
> -
> - - 1 = character set,
> - - 2 = user,
> - - 3 = privilege,
> - - 4 = database,
> - - 5 = table,
> - - 6 = view.
> - - 7 = stored procedure.
> - - 8 = stored function.
> - - 9 = event.
> - - 10 = trigger.
> - - 11 = table space.
> -
> - Value 0 doesn't encode a valid item type and is used as item list
> separator.
> - */
>
> /**
> - Save item type.
> + Save object type.
> +
> + See @ref basic_data_itype for encoding of object types.
>
> @retval BSTREAM_OK type was saved successfully
> @retval BSTREAM_ERROR error writing or attempt to save unknown type.
> @@ -952,7 +1054,9 @@ int bstream_wr_item_type(backup_stream *
> }
>
> /**
> - Read item type.
> + Read object type.
> +
> + See @ref basic_data_itype for encoding of object types.
>
> @retval BSTREAM_ERROR Error while reading or non-recognized type
> found.
> @retval BSTREAM_OK Read successful
> @@ -989,52 +1093,69 @@ int bstream_rd_item_type(backup_stream *
> }
>
> /**
> - @page stream_format
> -
> - @subsection db_catalogue Database catalogue
> -
> - Database catalogue lists all tables and other per-db items
> belonging to that
> - database.
> - @verbatim
> -
> - [db catalogue]= [ db-item info ! ... ! db-item info ]
> - @endverbatim
> -
> - Each entry in the catalogue describes a single item, which can be
> a table or
> - of other kind.
> - @verbatim
> -
> - [db-item info]= [ type:2 ! name ! optional item data ]
> - @endverbatim
> + @page image_layer
>
> - [optional item data] is used only for tables:
> + @subsection db_catalog Database Catalog Format
>
> - @verbatim
> -
> - [optional item data (table)]= [ flags:1 ! snapshot no.:1 ! pos !
> - optional extra data ]
> - @endverbatim
> -
> - [snapshot no.] tells which snapshot contains tables data and
> [pos] tells what
> - is the position of the table in this snapshot.
> -
> - Presence of extra data is indicated by a flag.
> - @verbatim
> -
> - [flags]= [ has_extra_data:.1 ! unused:.7 ]
> -
> - [optional extra data]= [ data_len:1 ! extra data:(data_len) ]
> - @endverbatim
> -
> - If database is empty, it stores two 0x00 bytes.
> - @verbatim
> -
> - [db catalogue (empty)] = [ 0x00 0x00 ]
> - @endverbatim
> + Each @c db_catalog chunk lists all tables and other per-database
> objects
> + belonging to a single database. If there are no databases in the
> image, the
> + databases list in the @c catalog_header chunk is empty and the
> catalog
> + contains no @c db_catalog chunks following the @c catalog_header
> chunk.
> +
> + If a database is empty, its @c db_catalog chunk consists of two
> 0x00 bytes.
> + Otherwise, @c db_catalog contains lists of @c table_info and @c
> db_item_info
> + entries:
> +
> +@verbatim
> + db_catalog: 0x00 0x00 | db_tables db_other_items
> +
> + db_tables: table_info ...
> + db_other_items: db_item_info ...
> +@endverbatim
> +
> + Each item entry in @c db_catalog starts with a 2-byte integer
> indicating the type of
> + the object. All tables are stored before other per-database
> objects. This
> + is important for addressing the per-database objects because the
> coordinates
> + for such objects refer to the position in the @c db_other_items
> list.
> + (For example, per-database object 0 is the object described by
> the first
> + item in @c db_other_items.)
> +
> +@verbatim
> + table_info: type table_name flags snapshot_num table_pos
> [extra_data]
> +@endverbatim
> +
> + - @c type (2 bytes): Integer object type (always 0x05 0x00 for a
> table).
> + - @c table_name (variable length): String table name.
> + - @c flags (1 byte): Integer flags.
> + - @c snapshot_num (1 byte): Integer indicating which snapshot
> contains
> + table's data (0-based).
> + - @c table_pos (variable length): Integer position of the table
> within
> + its snapshot (0-based).
> + - @c extra_data (variable length): Item data. (Optional;
> currently not used.)
> + If present, it consists of:
> + - Integer data length (2 bytes).
> + - Data bytes, as many as specified by the length.
> +
> +@verbatim
> + db_item_info: type name [item_data]
> +@endverbatim
> +
> + - @c type (2 bytes): Integer object type.
> + - @c name (variable length): String object name.
> + - @c item_data (variable length): Item data. (Optional; currently
> not used.)
> +
> + Allowable object type values are given in @ref basic_data_itype.
> + Bits in the flags field are used as follows:
> + - Bits 0-6: Reserved
> + - Bit 7 (BSTREAM_FLAG_HAS_EXTRA_DATA): Set if the extra_data
> field is present
> + in item_data.
> */
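In case it helps to check the table_info description against the code, here is
how I read it, as a minimal C sketch. All names (sk_rd_num, parse_table_info,
struct sk_table_info, SK_FLAG_HAS_EXTRA_DATA) are made up for illustration and
are not part of the stream library; the 0x80 flag value is inferred from
"Bit 7" in the text, and the sketch only accepts variable-length integers of up
to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SK_FLAG_HAS_EXTRA_DATA 0x80 /* bit 7; value inferred from the text */

struct sk_table_info {
  const unsigned char *name;  /* points into the input buffer */
  size_t name_len;
  unsigned flags;
  unsigned snapshot_num;      /* 0-based */
  unsigned long table_pos;    /* 0-based */
};

/* Read a variable-length integer (7 bits per byte, low group first).
   Returns bytes consumed, or 0 if truncated or longer than 4 bytes. */
static size_t sk_rd_num(const unsigned char *buf, size_t len, unsigned long *x)
{
  size_t i = 0;
  unsigned shift = 0;
  *x = 0;
  while (i < len && shift <= 21) {
    *x |= (unsigned long)(buf[i] & 0x7F) << shift;
    shift += 7;
    if (!(buf[i++] & 0x80))
      return i;
  }
  return 0;
}

/* Parse one table_info entry. Returns bytes consumed, or 0 if the entry
   is not a table (type != 0x05 0x00) or is malformed. */
size_t parse_table_info(const unsigned char *buf, size_t len,
                        struct sk_table_info *t)
{
  size_t i = 2, n;
  unsigned long v;
  if (len < 2 || buf[0] != 0x05 || buf[1] != 0x00)
    return 0;
  if (!(n = sk_rd_num(buf + i, len - i, &v)))   /* name length */
    return 0;
  i += n;
  if (len - i < v)
    return 0;
  t->name = buf + i;
  t->name_len = (size_t)v;
  i += (size_t)v;
  if (len - i < 2)
    return 0;
  t->flags = buf[i++];
  t->snapshot_num = buf[i++];
  if (!(n = sk_rd_num(buf + i, len - i, &v)))   /* table_pos */
    return 0;
  t->table_pos = v;
  i += n;
  if (t->flags & SK_FLAG_HAS_EXTRA_DATA) {      /* skip optional extra_data */
    size_t xlen;
    if (len - i < 2)
      return 0;
    xlen = buf[i] | ((size_t)buf[i + 1] << 8);
    i += 2;
    if (len - i < xlen)
      return 0;
    i += xlen;
  }
  return i;
}
```

The return value is the entry length, so a caller could walk the db_tables list
by advancing the buffer by that amount until the type is no longer 0x05 0x00.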
>
> +/**
> + Save catalogue of objects belonging to given database.
>
> -/** Save catalogue of items belonging to given database. */
> + @see @ref db_catalog
> +*/
> int bstream_wr_db_catalogue(backup_stream *s, struct
> st_bstream_image_header *cat,
> struct st_bstream_db_info *db_info)
> {
> @@ -1090,7 +1211,7 @@ int bstream_wr_db_catalogue(backup_strea
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
>
> - @see @c bcat_add_item()
> + @see @ref db_catalog
> */
> int bstream_rd_db_catalogue(backup_stream *s, struct
> st_bstream_image_header *cat,
> struct st_bstream_db_info *db_info)
> @@ -1145,67 +1266,57 @@ int bstream_rd_db_catalogue(backup_strea
>
> /
> *************************************************************************
> *
> - * META DATA
> + * METADATA
> *
>
> *************************************************************************/
>
> /**
> - @page stream_format
> -
> - @section meta_data Meta data section
> -
> - Meta data section contains meta-data for items which need to be
> created
> - when restoring data. It is divided into three main parts, storing
> meta data
> - for global items, tables and other items (per-db and per-table).
> - @verbatim
> -
> - [meta data]= [ global items | tables | other items ]
> - @endverbatim
> -
> - The only global items for which we store meta-data information
> are tablespaces
> - and databases. Tablespace definitions should come before database
> definitions
> - on the [global items] list.
> -
> - [Tables] section contains all tables which are grouped on per-
> database basis
> - (this is for easier skipping of tables upon selective restore).
> - @verbatim
> -
> - [tables] = [ tables from db1 | ... | tables from dbN ]
> - @endverbatim
> -
> - [Other items] has two parts for all per-database items (except
> tables) and
> - all per-table items.
> - @verbatim
> -
> - [other items]= [ per-db items ! 0x00 0x00 ! per-table items ]
> - @endverbatim
> -
> - The per-database items other than tables can not be grouped by
> database
> - because of possible inter-database dependenciens. This is why
> they are stored
> - in a separate section.
> -
> - If there are no databases in the image, [meta data] consists of
> [global items]
> - only.
> - @verbatim
> -
> - [meta data (no databases)]= [ global items ]
> - @endverbatim
> + @page image_layer
>
> - Meta data item lists can be empty or consist of several item
> entries. Empty
> - item list consist of two 0x00 bytes which can not start any valid
> - [item entry].
> - @verbatim
> + @section metadata Metadata Format
>
> - [item list] = [ item entry ! ... ! item entry ]
> - [item list (empty)]= [ 0x00 0x00 ]
> - @endverbatim
> + The metadata section contains information for objects that need
> to be created
> + when restoring data. It has three main sections:
> + - Metadata for global objects (single chunk)
> + - Metadata for tables (one chunk per database)
> + - Metadata for other per-database objects (single chunk)
> +
> +@verbatim
> + metadata: global_items [tables ... other_items]
> +@endverbatim
> +
> + If there are no databases in the image, @c metadata consists of the
> + @c global_items chunk only. Otherwise, for each database stored
> in the image
> + one @c tables chunk with table metadata follows. Finally, there is an
> + @c other_items chunk containing metadata for per-database objects
> other than
> + tables.
> +
> + Each of these chunks contains a list of metadata entries, each
> entry containing
> + information required for restoring a single object.
> + The order of metadata entries is relevant. They are stored in
> such an order
> + that objects can be created while reading these entries without
> breaking any
> + dependencies. An exception is metadata for tables which are
> grouped by
> + database, which can break dependencies. However, this is OK since
> only
> + foreign key constraints can introduce dependencies between tables
> and they are
> + disabled during restore. If the list stored in a chunk is empty,
> then the
> + chunk consists of two 0x00 bytes.
> +
> + Currently the @c global_items chunk contains metadata only for
> tablespaces and
> + databases, but it might be used to store other global objects in
> the future.
> + Tablespace definitions precede database definitions.
> +
> + The @c other_items chunk stores metadata for per-database objects
> other than
> + tables. Currently these are stored routines, views, triggers,
> events and
> + grants. They are not grouped by a database but rather stored in a
> dependency
> + preserving order. For historical reasons, there are always two
> 0x00 bytes at
> + the end of @c other_items chunk.
> */
>
> -/** different formats in which item positions are stored */
> +/** different formats in which object positions are stored */
> enum enum_bstream_meta_item_kind {
> - GLOBAL_ITEM, /**< only item position is stored */
> + GLOBAL_ITEM, /**< only object position is stored */
> TABLE_ITEM, /**< only table position is stored (database is
> implicit) */
> - PER_DB_ITEM, /**< item position followed by it's database
> position */
> + PER_DB_ITEM, /**< object position followed by it's database
> position */
it's -> its (possessive)
(This occurs in other places as well, such as two lines below.)
> /**
> Item position followed by it's table's database position
> followed by the
> table's position inside that database
> @@ -1228,7 +1339,11 @@ int bstream_wr_item_def(backup_stream*,
> int read_and_create_items(backup_stream *s, struct
> st_bstream_image_header *cat,
> enum enum_bstream_meta_item_kind kind);
>
> -/** Write meta-data section of a backup image */
> +/**
> + Write meta-data section of a backup image
> +
> + @see @ref metadata
> +*/
> int bstream_wr_meta_data(backup_stream *s, struct
> st_bstream_image_header *cat)
> {
> void *iter= NULL, *titer= NULL;
> @@ -1238,7 +1353,7 @@ int bstream_wr_meta_data(backup_stream *
> bool item_written= FALSE;
> bool has_db= FALSE;
>
> - /* global items (this includes databases) */
> + /* global objects (this includes databases) */
>
> iter= bcat_iterator_get(cat,BSTREAM_IT_GLOBAL);
>
> @@ -1304,7 +1419,7 @@ int bstream_wr_meta_data(backup_stream *
> if (!has_db)
> return BSTREAM_OK;
>
> - /* other per-db items */
> + /* other per-db objects */
>
> CHECK_WR_RES(bstream_end_chunk(s));
>
> @@ -1351,12 +1466,14 @@ wr_error:
> /**
> Read backup image meta-data section.
>
> - All items read are created using @c bstream_create_item() function.
> + All objects read are created using @c bstream_create_item()
> function.
>
> @retval BSTREAM_ERROR Error while reading
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref metadata
> */
> int bstream_rd_meta_data(backup_stream *s, struct
> st_bstream_image_header *cat)
> {
> @@ -1365,7 +1482,7 @@ int bstream_rd_meta_data(backup_stream *
> int ret=BSTREAM_OK;
> bool has_db= FALSE;
>
> - /* global items */
> + /* global objects */
>
> CHECK_RD_RES(read_and_create_items(s,cat,GLOBAL_ITEM));
>
> @@ -1398,7 +1515,7 @@ int bstream_rd_meta_data(backup_stream *
> if (!has_db)
> return ret;
>
> - /* other per-db item */
> + /* other per-db objects */
>
> if (ret != BSTREAM_EOC)
> return BSTREAM_ERROR;
> @@ -1408,13 +1525,13 @@ int bstream_rd_meta_data(backup_stream *
>
> /*
> If we hit end of chunk/stream, there is nothing more to read
> - (no per-table items)
> + (no per-table objects)
> */
>
> if (ret != BSTREAM_OK)
> return ret;
>
> - /* per-table items */
> + /* per-table objects */
>
> CHECK_RD_RES(read_and_create_items(s,cat,PER_TABLE_ITEM));
>
> @@ -1429,50 +1546,113 @@ int bstream_rd_meta_data(backup_stream *
>
>
> /**
> - @page stream_format
> + @page image_layer
>
> - @subsection item_entry Single item entry
> + @subsection metadata_entry Single metadata entry format
>
> - Item list is a sequence of meta data item entries, each having the
> - following format:
> - @verbatim
> -
> - [item entry]= [ type:2 ! flags:1 ! position in the catalogue !
> - optional extra data ! optional CREATE statement ]
> - @endverbatim
> -
> - Item meta-data contains a CREATE statement or other data in
> unspecified format
> - or both. [flags] inform about which meta-data elements are
> present in the
> - entry.
> - @verbatim
> -
> - [flags]= [ has_extra_data:.1 ! has_create_stmt:.1 ! unused:.6 ]
> - @endverbatim
> -
> - The position in the catalogue is represented by 1 to 3 numbers,
> depending on
> - in which part of catalogue the entry lies.
> - @verbatim
> -
> - [item position (global)]= [db no.]
> - [item position (table)]= [ snap no. ! pos in snapshot's table
> list ]
> - [item position (other per-db item)]= [ pos in db item list ! db
> no. ]
> - [item position (per-table item)] = [ pos in table's item list !
> db no. ! table pos ]
> - @endverbatim
> -
> - Note that table is identified by its position inside the snapshot
> to which it
> - belongs.
> - @verbatim
> + A single metadata entry has the following format:
>
> - [optional extra data]= [ data_len:2 ! extra data:(data_len) ]
> - @endverbatim
> +@verbatim
> + item_entry: type flags catalog_pos [extra_data] [object_metadata]
> +@endverbatim
> +
> + - @c type (2 bytes): Integer object type.
> + - @c flags (1 byte): Integer flags.
> + - @c catalog_pos (variable length): Catalog coordinates of the
> object - exact
> + format depends on the type of the object.
> + - @c extra_data (variable length): Optional; present only if
> + BSTREAM_FLAG_HAS_EXTRA_DATA is set in the flags. (Currently not
> used.) If
> + present it consists of
> + - Integer data length (2 bytes).
> + - Data bytes, as many as specified by the length.
> + - @c object_metadata (variable length): String with object's
> serialization.
> + Optional; present only if BSTREAM_FLAG_HAS_CREATE_STMT is set
> in the flags.
> +
> + Allowable object type values are given in @ref basic_data_itype.
> The format of
> + the catalog coordinates depends on the object type, as described
> below.
> +
> + Object metadata is a serialization of the object as a string, as
> + provided by server's object services API. Usually, it is an SQL
> CREATE
> + statement for the object, but it can also contain additional SQL
> statements
> + and other information.
> + There is also a possibility of storing arbitrary additional
> metadata in the
> + optional @c extra_data field, however this field is not used
> currently.
> +
> + The @c flags field indicates which metadata elements are present in
> + the entry:
> + - Bits 0-5: Reserved
> + - Bit 6 (BSTREAM_FLAG_HAS_CREATE_STMT): The object_metadata field
> is present.
> + - Bit 7 (BSTREAM_FLAG_HAS_EXTRA_DATA): The extra_data field is
> present.
> +
> + @subsubsection catalog_pos_global Catalog coordinates for global
> objects
> +
> + These are used to identify global objects whose metadata is
> stored in
> + @c global_items chunk of metadata section. Coordinate of a global
> object
> + consists of a single variable length integer:
> +
> +@verbatim
> + catalog_pos: pos_in_the_list
> +@endverbatim
> +
> + It is the 0-based position of the object in the corresponding list
> + in the @c catalog_header chunk of the catalog. The catalog is
> capable of
> + storing four types of global objects: databases, tablespaces,
> users, and
> + character sets. For all these global objects, @c catalog_pos is a
> single
> + number pointing at the entry on the appropriate list in the @c
> catalog_header
> + chunk:
> + - Database: The @c databases list.
> + - Tablespace: The @c tablespaces list.
> + - User: The @c users list.
> + - Character set: The @c charsets list.
> +
> + @note Only tablespace and database metadata is stored in the @c
> global_items
> + chunk of metadata section thus only tablespace and database
> coordinates are
> + currently used.
> +
> + @subsubsection catalog_pos_table Catalog coordinates for tables
> +
> + These are used for identifying tables inside table metadata
> entries in @c
> + tables chunks of metadata section.
> +
> +@verbatim
> + catalog_pos: table_pos snapshot_num
> +@endverbatim
> + - @c table_pos (variable length): Integer position of the table
> within the
> + snapshot storing this table's data (0-based).
> + - @c snapshot_num (1 byte): Integer snapshot number (0-based).
> +
> + Table coordinates (K,N) indicate the table described by the @c
> table_info
> + entry with @c snapshot_num = N and @c table_pos = K. This @c
> table_info entry
> + can be found in the @c db_catalog chunk corresponding to the
> database
> + containing the table.
> +
> + @subsubsection catalog_pos_perdb Catalog coordinates for other
> per-database objects
> +
> + They are used for identifying per-database objects for which
> metadata is
> + stored in the @c other_items chunk of the metadata section.
> +
> +@verbatim
> + catalog_pos: item_pos db_num
> +@endverbatim
> + - @c item_pos (variable length): 0-based position in the @c
> db_other_items
> + list from the @c db_catalog chunk corresponding to the given
> database.
> + - @c db_num (variable length): 0-based position of the database
> in the
> + @c databases list in @c catalog_header.
> +
> + The other per-database objects represent views, stored procedures,
> + stored functions, events, triggers, and grants. Coordinates (K,N)
> indicate
> + the object described by the K-th @c db_item_info entry in the
> + @c db_other_items list inside the N-th @c db_catalog chunk.
> */
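To make sure I understood the three coordinate formats, I wrote them down as a
small C sketch of reading the fixed part of an item_entry. The names
(sk_kind, sk_entry_hdr, parse_entry_hdr, sk_rd_num) are hypothetical and only
loosely mirror enum_bstream_meta_item_kind; the sketch covers just the three
kinds described on this page, takes HAS_EXTRA_DATA to be 0x80 (from "Bit 7"),
and limits variable-length integers to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SK_FLAG_HAS_CREATE_STMT 0x40 /* bit 6, same value as in the patch */
#define SK_FLAG_HAS_EXTRA_DATA  0x80 /* bit 7; value inferred from the text */

enum sk_kind { SK_GLOBAL, SK_TABLE, SK_PER_DB };

struct sk_entry_hdr {
  unsigned type;
  unsigned flags;
  unsigned long pos;     /* pos_in_the_list, table_pos or item_pos */
  unsigned long parent;  /* snapshot_num (tables) or db_num (per-db) */
};

/* Variable-length integer reader; returns bytes consumed, 0 on error. */
static size_t sk_rd_num(const unsigned char *buf, size_t len, unsigned long *x)
{
  size_t i = 0;
  unsigned shift = 0;
  *x = 0;
  while (i < len && shift <= 21) {
    *x |= (unsigned long)(buf[i] & 0x7F) << shift;
    shift += 7;
    if (!(buf[i++] & 0x80))
      return i;
  }
  return 0;
}

/* Parse type, flags and catalog_pos. Returns bytes consumed, or 0 on
   malformed input or when the 0x00 0x00 empty-list marker is found. */
size_t parse_entry_hdr(const unsigned char *buf, size_t len,
                       enum sk_kind kind, struct sk_entry_hdr *e)
{
  size_t i = 3, n;
  if (len < 3)
    return 0;
  e->type = buf[0] | ((unsigned)buf[1] << 8);
  if (e->type == 0)
    return 0;
  e->flags = buf[2];
  e->parent = 0;
  if (!(n = sk_rd_num(buf + i, len - i, &e->pos)))
    return 0;
  i += n;
  if (kind == SK_TABLE) {            /* table_pos, then 1-byte snapshot_num */
    if (i >= len)
      return 0;
    e->parent = buf[i++];
  } else if (kind == SK_PER_DB) {    /* item_pos, then variable-length db_num */
    if (!(n = sk_rd_num(buf + i, len - i, &e->parent)))
      return 0;
    i += n;
  }
  return i;
}
```

After the header, a reader would test e->flags against the two flag bits to
decide whether extra_data and object_metadata follow.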
>
> /** Definition of create statement flag bits. */
> #define BSTREAM_FLAG_HAS_CREATE_STMT 0x40
>
> /**
> - Write entry describing single item but without CREATE statement or
> + Write entry describing single object but without CREATE statement
> or
> other meta data.
> +
> + @see @ref metadata_entry
> */
> int bstream_wr_meta_item(backup_stream *s,
> enum enum_bstream_meta_item_kind kind,
> @@ -1486,7 +1666,7 @@ int bstream_wr_meta_item(backup_stream *
> CHECK_WR_RES(bstream_wr_item_type(s,item->type));
> CHECK_WR_RES(bstream_wr_byte(s,flags));
>
> - /* save item's position in the catalogue */
> + /* save object's position in the catalogue */
>
> CHECK_WR_RES(bstream_wr_num(s,item->pos));
>
> @@ -1502,23 +1682,22 @@ int bstream_wr_meta_item(backup_stream *
> if (kind == PER_TABLE_ITEM)
> CHECK_WR_RES(bstream_wr_num(s,
> ((struct st_bstream_titem_info*)item)->table-
> >base.base.pos));
> -
> wr_error:
>
> return ret;
> }
>
> /**
> - Read initial part of item entry and locate that item in the
> catalogue.
> + Read initial part of object entry and locate that object in the
> catalogue.
>
> - Pointer to an appropriate structure describing the located item
> is stored in
> + Pointer to an appropriate structure describing the located object
> is stored in
> @c (*item). This description is not persistent - next call to this
> function
> - can overwrite it with description of another item.
> + can overwrite it with description of another object.
>
> @param[in] s the backup stream
> - @param[in] kind format in which item coordinates are stored
> + @param[in] kind format in which object coordinates are stored
> @param[in] flags the flags saved in the entry are stored in that
> location
> - @param[in] item pointer to a structure describing item found is
> stored here.
> + @param[in] item pointer to a structure describing object found
> is stored here.
>
> @retval BSTREAM_ERROR Error while reading
> @retval BSTREAM_OK Read successful
> @@ -1526,7 +1705,9 @@ int bstream_wr_meta_item(backup_stream *
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
>
> @note If function returns BSTREAM_OK and @c (*item) is set to
> NULL, it means
> - that we are looking at an empty item list.
> + that we are looking at an empty object list.
> +
> + @see @ref metadata_entry
> */
> int bstream_rd_meta_item(backup_stream *s,
> enum enum_bstream_meta_item_kind kind,
> @@ -1564,7 +1745,7 @@ int bstream_rd_meta_item(backup_stream *
>
> CHECK_RD_OK(bstream_rd_byte(s,flags));
>
> - /* read item's position */
> + /* read object's position */
>
> CHECK_RD_RES(bstream_rd_num(s,&item_buf.any.pos));
>
> @@ -1611,14 +1792,16 @@ int bstream_rd_meta_item(backup_stream *
>
>
> /**
> - Write entry with given item's meta-data.
> + Write entry with given object's meta-data.
>
> @param[in] s backup stream.
> @param[in] cat image catalogue.
> - @param[in] kind determines format in which item's coordinates
> are saved.
> + @param[in] kind determines format in which object's coordinates
> are saved.
> @param[in] item stream item to write.
>
> @returns Status of operation.
> +
> + @see @ref metadata_entry
> */
> int bstream_wr_item_def(backup_stream *s,
> struct st_bstream_image_header *cat,
> @@ -1637,7 +1820,7 @@ int bstream_wr_item_def(backup_stream *s
> query.end= 0;
>
> /*
> - Fetch item's create query and/or extra metadata data. Note that
> + Fetch object's create query and/or extra metadata data. Note that
> the BSTREAM_EOS reply from bcat_get_item_create_*() functions
> indicates lack of the corresponding piece of metadata.
> */
> @@ -1654,7 +1837,7 @@ int bstream_wr_item_def(backup_stream *s
> else if (ret == BSTREAM_ERROR)
> goto wr_error;
>
> - /* save the header of metadata entry, containing item coordinates
> */
> + /* save the header of metadata entry, containing object
> coordinates */
"metadata" here, "meta-data" a few lines below. You probably want to decide
on one (I prefer "metadata") and use it consistently throughout.
>
> ret= bstream_wr_meta_item(s,kind,flags,item);
> if (ret == BSTREAM_ERROR)
> @@ -1674,18 +1857,20 @@ int bstream_wr_item_def(backup_stream *s
> }
>
> /**
> - Read list of meta-data entries and create the corresponding items.
> + Read list of meta-data entries and create the corresponding
> objects.
>
> The entries are read until the end of chunk or 0x00 marker is hit.
> - After reading meta-data for each item (CREATE statement and/or
> extra meta-data)
> - the item is created using @c bstream_create_item() function which
> should be
> + After reading meta-data for each object (CREATE statement and/or
> extra meta-data)
> + the object is created using @c bstream_create_item() function
> which should be
> implemented by the program using this library.
>
> @retval BSTREAM_ERROR error while reading
> - @retval BSTREAM_EOC either list was empty or all items were
> read and created
> + @retval BSTREAM_EOC either list was empty or all objects were
> read and created
> successfully
> - @retval BSTREAM_EOS all items read and created successfully and
> end of
> + @retval BSTREAM_EOS all objects read and created successfully
> and end of
> stream has been reached
> +
> + @see @ref metadata_entry
> */
> int read_and_create_items(backup_stream *s, struct
> st_bstream_image_header *cat,
> enum enum_bstream_meta_item_kind kind)
> @@ -1741,29 +1926,66 @@ int read_and_create_items(backup_stream
>
> *************************************************************************/
>
> /**
> - @page stream_format
> -
> - @section data Table data section
> -
> - Format of table data section of backup image.
> - @verbatim
> + @page image_layer
>
> - [table data]= [ table data chunk | ... | table data chunk ]
> + @section table_data Table Data Format
>
> - [table data chunk]= [ snapshot no.:1 ! seq no.:2 ! flags:1 !
> table no. ! data ]
> - @endverbatim
> -
> - Data chunks of each snapshot are numbered by consecutive numbers.
> This can be
> - used to detect discontinuities in a backup stream. Currently only
> one flag
> - is used, indicating last data chunk for a given table.
> - @verbatim
> -
> - [flags]= [ unused:.7 ! last data block:.1 ]
> - @endverbatim
> + A snapshot of data from several tables is created by a backup
> driver
> + and used by restore driver to restore these table data. Each
> table is handled
> + by one of the drivers and each driver has a list of tables which
> it handles.
> + The number of snapshots contained in a backup image is given in the
> + @c header chunk (see @ref header). Each snapshot is described by
> a single
> + @c snapshot_description chunk in the preamble (see @ref
> snapshot). Snapshots
> + are numbered by consecutive numbers starting with 1.
> +
> + Table data snapshot consists of several data chunks. Usually,
> each data chunk
> + is assigned to one of the tables within the snapshot, but there
> can also
> + be "common data" chunks which are not tied to any particular table.
> + Data chunks from different snapshots can be mixed in the image.
> +
> + A single @c table_data chunk within backup image contains a data
> chunk from
> + one of the snapshots. It has the following format:
> +
> +@verbatim
> + table_data: snapshot_num sequence_num flags table_num data
> +@endverbatim
> + - @c snapshot_num (1 byte): Integer snapshot number (1-based).
> + - @c sequence_num (2 bytes): Integer sequence number. Currently
> not used and
> + is always 0.
> + - @c flags (1 byte): Integer flags.
> + - @c table_num (variable length): Integer table number (1-based).
> + - @c data (variable length): Table data, to end of chunk.
> +
> + The snapshot to which this data chunk belongs is identified in
> the first byte
> + of the chunk (@c snapshot_num field). Since snapshot numbers
> begin with 1,
> + this byte must be non-zero. This allows detecting the end of the
> @c table_data
> + chunks sequence in the stream as the final @c summary chunk
> starts with 0x00
> + byte (see @ref summary).
> +
> + In the future, sequence numebers will be used to detect
> discontinuities in a
numebers -> numbers
> + backup stream. However, this is not currently implemented and a
> sequence
> + number is always 0 (two 0x00 bytes). Upon restore, sequence
> numbers are
> + currently ignored.
> +
> + Bits in the flags field are used as follows:
> + - Bit 0 (BSTREAM_FLAG_LAST_CHUNK): Set if the chunk is the last
> data chunk
> + for the table.
> + - Bits 1-7: Reserved.
> +
> + Field @c table_num identifies the table to which this data chunk
> belongs. If
> + @c table_num is 0 then this is a common data chunk not associated
> with any
> + particular table. Otherwise (@c table_num-1, @c snapshot_num-1)
> are the
> + catalog coordinates of the table as described in @ref
> catalog_pos_table.
> +
> + The format of @c data field is determined by the backup driver
> which created
> + given snapshot, as specified in the corresponding @c
> snapshot_description
> + chunk in the preamble.
> */
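Here is how I read the table_data header, as a rough C sketch, including the
0x00 first-byte check that separates table data from the summary chunk. The
names (sk_chunk_hdr, parse_table_data, SK_FLAG_LAST_CHUNK) are mine and only
illustrative; the sketch assumes the whole chunk is already in memory and
accepts a table_num of at most 4 encoded bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SK_FLAG_LAST_CHUNK 0x01 /* bit 0, per the flags description */

struct sk_chunk_hdr {
  unsigned snapshot_num;   /* 1-based */
  unsigned sequence_num;   /* currently always 0 */
  unsigned flags;
  unsigned long table_num; /* 1-based; 0 means a common data chunk */
  size_t data_offset;      /* where the driver-specific data begins */
};

/* Returns 1 if a table_data header was parsed, 0 if the chunk starts with
   0x00 (so it is the summary chunk), -1 if the chunk is malformed. */
int parse_table_data(const unsigned char *buf, size_t len,
                     struct sk_chunk_hdr *h)
{
  size_t i = 4;
  unsigned shift = 0;
  if (len < 1)
    return -1;
  if (buf[0] == 0x00)
    return 0;
  if (len < 5)
    return -1;
  h->snapshot_num = buf[0];
  h->sequence_num = buf[1] | ((unsigned)buf[2] << 8);
  h->flags = buf[3];
  h->table_num = 0;
  for (;;) {                       /* variable-length table_num */
    if (i >= len || shift > 21)
      return -1;
    h->table_num |= (unsigned long)(buf[i] & 0x7F) << shift;
    shift += 7;
    if (!(buf[i++] & 0x80))
      break;
  }
  h->data_offset = i;
  return 1;
}
```

For a non-zero table_num, (h.table_num - 1, h.snapshot_num - 1) would then be
the catalog coordinates described in catalog_pos_table.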
>
> /**
> Write chunk with data from backup driver.
> +
> + @see @ref table_data
> */
> int bstream_wr_data_chunk(backup_stream *s,
> struct st_bstream_data_chunk *chunk)
> @@ -1808,6 +2030,8 @@ int bstream_wr_data_chunk(backup_stream
>
> Return value @c BSTREAM_EOC indicates that all table data chunks
> have been read.
> The rest of the backup stream can contain image summary block.
> +
> + @see @ref table_data
> */
> int bstream_rd_data_chunk(backup_stream *s,
> struct st_bstream_data_chunk *chunk)
> @@ -1919,11 +2143,70 @@ int bstream_rd_data_chunk(backup_stream
> return ret;
> }
>
> +
> +/**
> + @page image_layer
> +
> + @section summary Summary Format
> +
> + The summary chunk stores information that is known only at the
> end of the backup
> + process, in particular the binary log coordinates of the image's
> validity point,
> + which enable point-in-time recovery operations that combine use
> of the backup
> + image and a log of modifications made after the validity point.
> +
> + The summary chunk appears at the end of the backup image, after
> all table data
> + chunks. It always starts with 0x00 byte, to distinguish it from
> table data
> + chunks which never begin with 0x00 (see @ref table_data).
> +
> +@verbatim
> + summary: 0x00 vp_time end_time binlog_coords binlog_group_coords
> +@endverbatim
> + - 0x00 (1 byte): Distinguishes the summary from preceding
> table data
> + chunks.
> + - @c vp_time (6 bytes): Time of validity point.
> + - @c end_time (6 bytes): Time when backup ended.
> + - @c binlog_coords (variable length): Binary log coordinates,
> stored as:
> + - @c pos (4 bytes): Integer binary log position.
> + - @c binlog_file (variable length): String binary log filename.
> + - @c binlog_group_coords (variable length): Binary log group
> coordinates.
> + The format is the same as for @c binlog_coords. This field is
> reserved for
> + future use.
> +*/
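A small C sketch of how I understand the summary layout, assuming the chunk is
fully buffered. The names (sk_binlog_pos, parse_coords, parse_summary) are
hypothetical, the string length is limited to 4 encoded bytes, and the reserved
binlog_group_coords field is simply not examined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sk_binlog_pos {
  unsigned long pos;
  const unsigned char *file; /* points into the input, not NUL-terminated */
  size_t file_len;
};

/* Parse "pos (4 bytes, low byte first) + counted string".
   Returns bytes consumed, or 0 on malformed input. */
static size_t parse_coords(const unsigned char *buf, size_t len,
                           struct sk_binlog_pos *c)
{
  size_t i = 4, slen = 0;
  unsigned shift = 0;
  if (len < 5)
    return 0;
  c->pos = buf[0] | ((unsigned long)buf[1] << 8) |
           ((unsigned long)buf[2] << 16) | ((unsigned long)buf[3] << 24);
  for (;;) {                       /* variable-length byte count */
    if (i >= len || shift > 21)
      return 0;
    slen |= (size_t)(buf[i] & 0x7F) << shift;
    shift += 7;
    if (!(buf[i++] & 0x80))
      break;
  }
  if (len - i < slen)
    return 0;
  c->file = buf + i;
  c->file_len = slen;
  return i + slen;
}

/* Returns 1 on success, 0 if this is not a well-formed summary chunk.
   vp_time and end_time are set to point at the two 6-byte time values. */
int parse_summary(const unsigned char *buf, size_t len,
                  const unsigned char **vp_time,
                  const unsigned char **end_time,
                  struct sk_binlog_pos *binlog)
{
  if (len < 13 || buf[0] != 0x00)
    return 0;
  *vp_time = buf + 1;
  *end_time = buf + 7;
  return parse_coords(buf + 13, len - 13, binlog) != 0;
}
```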
> +
> +
> /*********************************************************************
> *
> * WRITING/READING BASIC TYPES
> *
>
> *********************************************************************/
> +/**
> + @page image_layer
> +
> + @section basic_data Basic Data Type Storage
> +
> + This section discusses storage for basic data types:
> + - Fixed-length and variable-length integers
> + - Strings
> + - Times
> +
> + @subsection basic_data_int Integer Storage
> +
> + Only unsigned integers are stored.
> +
> + Fixed-length integers are stored as 1-byte, 2-byte, or 4-byte
> values, least
> + significant byte first.
> +
> + Variable-length integers are stored as a sequence of bytes, each
> byte
> + storing 7 bits (least significant bits first). The most
> significant bit in
> + a byte is set if more bytes follow. There is no maximum number of
> bytes that
> + a variable-length integer can contain. A program that reads them
> must be
> + prepared to deal with arbitrarily large values, or to throw an
> error if
> + it encounters a value larger than what it can handle.
> +
> + Example: The binary number 100101011101011010110100011110 will be
> split
> + into 7-bit groups 10 0101011 1010110 1011010 0011110 and then
> stored in
> + 5 bytes (least significant bytes first): 0x9E 0xDA 0xD6 0xAB 0x02
> +*/
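I checked the example by hand and it comes out right. For readers who want to
try it, here is a minimal C sketch of the encoding as I understand it. The
function names are hypothetical (the library's own routines are
bstream_wr_num() and bstream_rd_num()), and the decoder takes the "throw an
error" option when a value does not fit in unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Encode x using 7 bits per byte, least significant group first; the high
   bit of a byte is set if more bytes follow. Returns bytes written. */
size_t varint_encode(unsigned long x, unsigned char *out)
{
  size_t n = 0;
  do {
    unsigned char b = (unsigned char)(x & 0x7F);
    x >>= 7;
    if (x)
      b |= 0x80;
    out[n++] = b;
  } while (x);
  return n;
}

/* Decode from buf[0..len). Returns bytes consumed, or 0 if the input is
   truncated or the value does not fit in unsigned long. */
size_t varint_decode(const unsigned char *buf, size_t len, unsigned long *x)
{
  unsigned long val = 0;
  unsigned shift = 0;
  size_t i;
  for (i = 0; i < len; i++) {
    unsigned long group = buf[i] & 0x7F;
    if (shift >= sizeof(unsigned long) * CHAR_BIT ||
        (group << shift) >> shift != group)
      return 0;                    /* would overflow */
    val |= group << shift;
    shift += 7;
    if (!(buf[i] & 0x80)) {
      *x = val;
      return i + 1;
    }
  }
  return 0;                        /* ran out of input */
}
```

The binary number in the example is 0x2575AD1E, and the encoder above produces
0x9E 0xDA 0xD6 0xAB 0x02 for it, matching the text.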
>
> /** Write single byte to backup stream */
> int bstream_wr_byte(backup_stream *s, unsigned short int x)
> @@ -2065,6 +2348,8 @@ int bstream_rd_int4(backup_stream *s, un
> significant bit in a byte tells if there are more bytes to follow
> (if it is set) or if current byte is the last one (if it is not
> set).
> The bits are saved starting with least significant ones.
> +
> + @see @ref basic_data_int
> */
> int bstream_wr_num(backup_stream *s, unsigned long int x)
> {
> @@ -2090,6 +2375,8 @@ int bstream_wr_num(backup_stream *s, uns
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref basic_data_int
> */
> int bstream_rd_num(backup_stream *s, unsigned long int *x)
> {
> @@ -2113,17 +2400,28 @@ int bstream_rd_num(backup_stream *s, uns
> return (b & 0x80) ? BSTREAM_ERROR : ret;
> }
>
> -/*
> - String format.
> +/**
> + @page image_layer
>
> - [string]= [ size ! string bytes:(size) ]
> + @subsection basic_data_string String Storage
> +
> + Strings are stored as counted strings: A byte count indicating
> the length of
> + the string, followed by the bytes in the string. The length is
> stored as a
> + variable-length integer using the encoding just described.
> +
> + Strings are encoded using a universal encoding listed as the
> first item in
> + the @c charsets list in the @c catalog_header chunk.
> +
> + An empty string (or "null string") is represented as a single
> 0x00 byte.
>
> - All strings are stored using the same universal character set,
> which is listed
> - in image's catalogue as the first entry.
> + Example: "abcd" is encoded as 0x04 0x61 0x62 0x63 0x64
> + Example: "" (the empty string) is encoded as 0x00
> */
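A quick C sketch of the counted-string encoding, in case a runnable version of
the two examples is useful. string_encode is a made-up name (the library
function is bstream_wr_string()), the caller is assumed to provide a large
enough buffer, and character set conversion is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a variable-length byte count followed by the string bytes.
   Returns the total number of bytes written to out. */
size_t string_encode(const char *s, size_t len, unsigned char *out)
{
  size_t n = 0;
  size_t rest = len;
  do {                             /* byte count, 7 bits per byte */
    unsigned char b = (unsigned char)(rest & 0x7F);
    rest >>= 7;
    if (rest)
      b |= 0x80;
    out[n++] = b;
  } while (rest);
  memcpy(out + n, s, len);         /* then the bytes themselves */
  return n + len;
}
```

Note that a string of 128 bytes or more needs a multi-byte count; for example,
a 200-byte string starts with 0xC8 0x01.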
>
> /**
> Write a string.
> +
> + @see @ref basic_data_string
> */
> int bstream_wr_string(backup_stream *s, bstream_blob str)
> {
> @@ -2152,6 +2450,8 @@ int bstream_wr_string(backup_stream *s,
> @note
> Caller of this function is responsible for freeing memory
> allocated to store
> the string.
> +
> + @see @ref basic_data_string
> */
> int bstream_rd_string(backup_stream *s, bstream_blob *str)
> {
> @@ -2178,16 +2478,39 @@ int bstream_rd_string(backup_stream *s,
> return bstream_read_blob(s, *str);
> }
>
> -/*
> - Time format:
> +/**
> + @page image_layer
> +
> + @subsection basic_data_time Time Storage
> +
> + Times are stored in UTC. A time value takes six bytes:
>
> - [time]= [ year and month:2 ! mday:1 ! hour:1 ! min:1 ! sec:1 ]
> + - 2 bytes. Years since 1900 and month (0..11)
> + - Bits 0-7, 12-15. Year:
> + - High 8 bits from bits 0-7 of byte 0
> + - Low 4 bits from bits 4-7 of byte 1
> + - Bits 8-11. Month (bits 0-3 of byte 1)
> + - 1 byte. Day of month (1..31)
> + - 1 byte. Hour (0..23)
> + - 1 byte. Minute (0..59)
> + - 1 byte. Second (0..60)
>
> - [year and month]= [ year:.12 ! month:.4 ]
> + Example: 2008-10-11 15:28:17 is encoded as 0x06 0xc9 0x0b 0x0f
> 0x1c 0x11
> + - Year = (0x06 << 4) + (0xc9 >> 4) = 0x60 + 0x0c = 0x6c = 108
> (that is, 2008 - 1900)
> + - Month = 0xc9 & 0x0f = 9
> + - Day = 0x0b = 11
> + - Hour = 0x0f = 15
> + - Minute = 0x1c = 28
> + - Second = 0x11 = 17
>
> + A time value may consist entirely of 0x00 bytes, which means "no
> date."
> */
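The worked example checks out. Here is the same packing as a short C sketch,
using a hypothetical struct sk_time with made-up helper names rather than the
real bstream_time_t (whose field names I have not assumed).

```c
#include <assert.h>

struct sk_time {
  int year;  /* years since 1900 */
  int mon;   /* 0..11 */
  int mday;  /* 1..31 */
  int hour;  /* 0..23 */
  int min;   /* 0..59 */
  int sec;   /* 0..60 */
};

/* Pack a time into the 6-byte format: a 12-bit year and a 4-bit month
   share the first two bytes, followed by one byte per remaining field. */
void sk_time_encode(const struct sk_time *t, unsigned char buf[6])
{
  buf[0] = (unsigned char)((t->year >> 4) & 0xFF);
  buf[1] = (unsigned char)(((t->year & 0x0F) << 4) | (t->mon & 0x0F));
  buf[2] = (unsigned char)t->mday;
  buf[3] = (unsigned char)t->hour;
  buf[4] = (unsigned char)t->min;
  buf[5] = (unsigned char)t->sec;
}

/* Unpack the 6-byte format back into fields. */
void sk_time_decode(const unsigned char buf[6], struct sk_time *t)
{
  t->year = (buf[0] << 4) | (buf[1] >> 4);
  t->mon  = buf[1] & 0x0F;
  t->mday = buf[2];
  t->hour = buf[3];
  t->min  = buf[4];
  t->sec  = buf[5];
}
```

Encoding {108, 9, 11, 15, 28, 17}, that is 2008-10-11 15:28:17, gives
0x06 0xc9 0x0b 0x0f 0x1c 0x11 as in the text.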
>
> -/** Write time entry */
> +/**
> + Write time entry
> +
> + @see @ref basic_data_time
> +*/
> int bstream_wr_time(backup_stream *s, bstream_time_t *time)
> {
> byte buf[6];
> @@ -2214,6 +2537,8 @@ int bstream_wr_time(backup_stream *s, bs
> @retval BSTREAM_OK Read successful
> @retval BSTREAM_EOC Read successful and end of chunk has been
> reached
> @retval BSTREAM_EOS Read successful and end of stream has been
> reached
> +
> + @see @ref basic_data_time
> */
> int bstream_rd_time(backup_stream *s, bstream_time_t *time)
> {
> @@ -2238,3 +2563,29 @@ int bstream_rd_time(backup_stream *s, bs
>
> return BSTREAM_OK;
> }
> +
> +/**
> + @page image_layer
> +
> + @subsection basic_data_itype Object Type Encoding
> +
> + Object type encoding is used for @c db_item values in @c
> db_catalog chunks,
> + and for object types in metadata item lists.
> +
> + Allowable object types are encoded as follows:
> + - 1 = character set
> + - 2 = user
> + - 3 = privilege
> + - 4 = database
> + - 5 = table
> + - 6 = view
> + - 7 = stored procedure
> + - 8 = stored function
> + - 9 = event
> + - 10 = trigger
> + - 11 = tablespace
> +
> + Object types are encoded using two bytes. A value of 0 is not a
> valid object
> + type, so a 0x00 0x00 sequence can be used to separate one object
> list from
> + the next.
> +*/
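
If an example helps here, the table could be mirrored in code roughly as
follows. The enum and function names are hypothetical (I did not take them
from the sources); only the numeric values come from the list in this page.
The sketch works on the already-decoded integer, so it makes no assumption
about the byte order of the two-byte encoding.

```c
#include <stddef.h>

/* Hypothetical names; values are the ones listed in the page above. */
enum example_item_type {
  EX_ITEM_CHARSET    = 1,
  EX_ITEM_USER       = 2,
  EX_ITEM_PRIVILEGE  = 3,
  EX_ITEM_DATABASE   = 4,
  EX_ITEM_TABLE      = 5,
  EX_ITEM_VIEW       = 6,
  EX_ITEM_SPROC      = 7,
  EX_ITEM_SFUNC      = 8,
  EX_ITEM_EVENT      = 9,
  EX_ITEM_TRIGGER    = 10,
  EX_ITEM_TABLESPACE = 11
};

/*
  Map a decoded object type value to a readable name.
  Returns NULL for 0 (the list separator, not a valid type) and for any
  value outside 1..11.
*/
static const char *example_item_type_name(unsigned type)
{
  static const char *const names[] = {
    NULL,               /* 0 is not a valid object type */
    "character set", "user", "privilege", "database", "table", "view",
    "stored procedure", "stored function", "event", "trigger", "tablespace"
  };

  if (type == 0 || type >= sizeof(names) / sizeof(names[0]))
    return NULL;
  return names[type];
}
```

A reader of an object list could treat a NULL result for value 0 as "end of
this list" and a NULL result for any other value as an error.
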
>
> === modified file 'sql/backup/stream_v1_transport.c'
> --- a/sql/backup/stream_v1_transport.c 2009-08-03 07:59:51 +0000
> +++ b/sql/backup/stream_v1_transport.c 2009-08-21 11:16:45 +0000
> @@ -11,11 +11,92 @@
> Implementation of the low-level I/O functions from backup stream
> library.
>
> These functions form the transport layer of the 1st version of
> backup stream
> - format. They split stream into a sequence of data chunks.
> + format. They split stream into a sequence of data chunks.
> + See @ref transport_layer.
>
> @todo free internal buffer memory in case of errors.
> */
>
> +/**
> + @page transport_layer Transport Layer (Block Level) Format
> +
> + The transport layer consists of fixed-size blocks. All blocks within a given
> + backup image are the same size (except possibly the last). The block size is
> + not fixed by the format; it is encoded within the first few blocks of the
> + image itself. Blocks are encoded such that variable-size backup image chunks
> + can be recognized and extracted from the byte stream.
> +
> + The backup stream prefix (magic number and image format version)
> is not part
> + of the sequence of blocks comprising the backup image. If
> present, the prefix
> + must be read separately.
> +
> +@verbatim
> + backup_image: first_block [initial_block ...] [regular_block ...]
> + first_block: block_size initial_block_count block_data
> + initial_block: block_size block_data
> + regular_block: block_data
> +@endverbatim
> +
> + The first few blocks are special:
> + - The first block contains the block size for all following
> blocks, the
> + number N of "initial" blocks to follow, and some data.
> + - The next N blocks following the first block (the "initial"
> blocks) contain
> + the block size and some data.
> + - The blocks following the first and initial blocks contain only
> data.
> +
> + The first and initial blocks all contain the block size. The redundancy serves
> + to enable detection of data corruption.
> +
> + Fields in the first block of a stream:
> + - @c block_size (4 bytes): An integer indicating how large blocks
> are.
> + - @c initial_block_count (1 byte): An integer indicating how many
> initial
> + blocks follow the first block.
> + - @c block_data (block_size-5 bytes): Stream data to the end of
> the block.
> +
> + The size of @c block_data is block_size-5 because the first 5
> bytes of the
> + block contain the @c block_size and @c initial_block_count fields.
> +
> + Fields in the initial blocks of a stream:
> + - @c block_size (4 bytes): An integer indicating how large blocks
> are.
> + - @c block_data (block_size-4 bytes): Stream data to the end of
> the block.
> +
> + The size of @c block_data is block_size-4 because the first 4
> bytes of the
> + block contain the @c block_size field.
> + The @c block_size value in all @c initial_block blocks must match
> the
> + @c block_size in @c first_block.
> +
> + Regular blocks contain only @c block_data without any additional
> fields.
> +
> + Because the block size is unknown initially, the block reader
> must begin
> + by reading the first four bytes of @c first_block to get the
> block size
> + before it can read the rest of the block.
> +
> + @note
> + The last block in a backup image might not be the full block
> size, and there
> + might not be as many initial blocks as indicated in the first
> block. For
> + example, the first block might indicate a @c block_size of 16384
> bytes and
> + @c initial_block_count of 2. But if the entire backup image size
> is 3000
> + bytes, the first block will contain only 3000 bytes and there
> will be no
> + initial blocks following it. The block-reading level must be
> prepared to deal
> + with this and handle what block data bytes are actually present,
> interpreting
> + them according to the rules governing @c block_data content
> described in
> + @ref block_data.
> +
> + @note
> + It is possible to have external data at the beginning of the first block of
> + a backup image. This is for situations where, e.g., backup image storage
> + requires storing additional headers and still wants to preserve block
> + alignment of all the data. For example, when a backup image is stored in a
> + file, it is preceded by a 10-byte prefix containing the magic number and image
> + version number (see @ref stream_format). This 10-byte prefix is stored at
> + the beginning of the first block and is not considered a part of the backup
> + image. In that case the @c block_size field of the first block starts at
> + byte 11 and the size of @c block_data equals block_size-15.
> + If such an external prefix is present in the first block, its presence is not
> + indicated in the image itself. It must be determined by other means (e.g., by
> + a convention as in the above example).
> +*/
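
The block_size-5 / block_size-4 / block_size-15 arithmetic in this page might
be easier to follow with a small example. A sketch, with a made-up function
name and parameters (nothing here is from the patch); it only covers
full-size blocks, not a short last block:

```c
#include <stddef.h>

/*
  Number of block_data bytes in a full-size block.

  block_size     value of the block_size field
  index          0 for the first block, 1..initial_count for the initial
                 blocks, larger values for regular blocks
  initial_count  value of the initial_block_count field
  prefix_len     bytes of external data stored at the start of the first
                 block (0 if none; 10 for the file prefix example)
*/
static size_t example_block_data_size(size_t block_size, unsigned index,
                                      unsigned initial_count,
                                      size_t prefix_len)
{
  if (index == 0)
    return block_size - prefix_len - 5;  /* block_size + initial_block_count */
  if (index <= initial_count)
    return block_size - 4;               /* block_size field only */
  return block_size;                     /* regular block: data only */
}
```

For block_size 16384 and initial_block_count 2 this gives 16379 bytes of data
in the first block (16369 with the 10-byte file prefix), 16380 in each of the
two initial blocks, and 16384 in every regular block.
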
> +
> #ifdef DBUG_OFF
> # define ASSERT(X)
> #else
> @@ -121,52 +202,6 @@ extern byte get_byte_short(short value);
> */
> extern byte get_byte_size_t(size_t value);
>
> -/*
> - Carrier format
> - --------------
> -
> - [backup stream] = [ block ! ... ! block ]
> -
> - [block] = [ fragment ! ... ! fragment ]
> -
> - where [fragment] is one of
> -
> - [EOC marker] = [ 0x80 ]
> - [EOS marker] = [ 0xC0 ]
> - [fragment extending to the end of block, last in a chunk] =
> [ 0x00 ! payload ]
> - [fragment extending to the end of block, more follow] = [ 0x40 !
> payload ]
> - [limited size fragment] = [ header:1 ! payload ]
> -
> - The header of [limited size fragment] consists of two bit fragment
> type info
> - followed by 6 bit, non-zero value encoding length of the fragment:
> -
> - [fragment header] = [ type:.2 ! value:.6 ]
> -
> - There are four types of fragments determined by the first 2 bits
> of the header:
> -
> - 00 - small fragment which is the last fragment of a chunk,
> - 01 - small fragment with more fragments following it,
> - 10 - big fragment (more fragments follow),
> - 11 - huge fragment (more fragments follow).
> -
> - Encoding of the size of the fragment depends on its type:
> -
> - - for small fragments: size= value
> - - for big fragments: size= value << 6
> - - for huge fragments: size= value << 12
> -
> - For small fragments, second bit of the header determines if it is
> the last
> - fragment of a chunk or there are more fragments to come. Chunk
> can't end with
> - a big or huge fragment and thus for these fragments we always
> expect more
> - fragments to come. [EOC marker] ends a chunk, even if the last
> fragment said
> - that more fragments will follow.
> -
> - The biggest fragment size is 64*2^12 ~= 250 Kb. The format of
> fragment header
> - puts constraints on possible fragment sizes. If a chunk of data
> has size not
> - possible to encode by a single fragment header, it is split into
> several
> - fragments of correct sizes.
> -*/
> -
>
> /
> *************************************************************************
> *
> @@ -174,6 +209,83 @@ extern byte get_byte_size_t(size_t value
> *
>
> *************************************************************************/
>
> +/**
> + @page transport_layer
> +
> + @section block_data Block Data Format (Transport Layer)
> +
> + When a block is read in the transport layer, its @c block_data
> bytes are
> + extracted and then interpreted to find the fragments that it
> contains.
> + (No fragment ever crosses a block boundary.) These fragments are
> assembled
> + into chunks for use by the image layer, which operates on chunk
> units
> + (see @ref image_format1). There is no fixed relationship between
> chunks and
> + blocks. A chunk can be constructed from fragments spanning
> several blocks,
> + or a chunk might require fragments from only part of a block.
> +
> + The @c block_data part of a given block consists of one or more
> fragments:
> +
> +@verbatim
> + block_data_stream: fragment [fragment ...]
> + fragment: EOC | EOS | frag_header payload
> + EOC: 0x80 (end of chunk)
> + EOS: 0xc0 (end of stream)
> +@endverbatim
> +
> + An EOC marker ends a chunk even if the preceding fragment indicates that
> + more fragments follow. EOS indicates the end of the stream; any data following
> + it does not belong to the backup image. A regular fragment consists of a
> + 1-byte header and a data payload.
> +
> +@verbatim
> + frag_header: frag_type frag_size
> + payload: data bytes (size depends on frag_header contents)
> +@endverbatim
> +
> + A @c frag_header fragment header byte indicates the type and size
> of the
> + fragment:
> + - Bits 0-5: The fragment size. Interpretation of this value
> depends on the
> + fragment type.
> + - Bits 6-7: The fragment type.
> +
> + Thus:
> +@verbatim
> + frag_type = (frag_header & 0xc0)
> + frag_size = (frag_header & 0x3f)
> +@endverbatim
> +
> + Possible frag_type values:
> + - 0x00: Small fragment, more fragments in chunk to follow
> + - 0x40: Small fragment, last fragment in chunk
> + - 0x80: Big fragment, more fragments in chunk to follow
> + - 0xc0: Huge fragment, more fragments in chunk to follow
> +
> + For a small fragment, the second of the two @c frag_type bits (mask 0x40)
> + determines whether there are more fragments in the chunk.
> + A chunk cannot end with a big or huge fragment, but an EOC marker can follow
> + to indicate the end of the chunk.
> +
> + The @c frag_size value must be interpreted to determine the
> actual fragment
> + size:
> + - Small fragment:
> +@code
> + size = rest of block if frag_size == 0;
> + size = frag_size otherwise
> +@endcode
> + - Big fragment:
> +@code
> + size = (frag_size << 6)
> +@endcode
> + - Huge fragment:
> +@code
> + size = (frag_size << 12)
> +@endcode
> +
> + The biggest fragment size is 63<<12 = 258048 bytes (252 KB), because
> + @c frag_size occupies 6 bits. The format of the fragment header puts
> + constraints on possible fragment sizes. Data of arbitrary length
> + can be divided into several fragments of appropriate sizes.
> + Each fragment adds one byte of overhead for the fragment header.
> +*/
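
A header-decoding example might be useful alongside this section. Here is a
sketch of how I read the rules; the enum, struct, and function names are
invented for illustration, and rest_of_block (the number of bytes left in the
block after the header byte) is my assumption about what "rest of block"
means. The more/last flag follows the frag_type list as written in this page
(0x00 = more to follow, 0x40 = last).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
enum example_frag_kind {
  EX_FRAG_SMALL, EX_FRAG_BIG, EX_FRAG_HUGE, EX_FRAG_EOC, EX_FRAG_EOS
};

struct example_frag {
  enum example_frag_kind kind;
  size_t size;  /* payload bytes following the header (0 for EOC/EOS) */
  int more;     /* nonzero if more fragments of the chunk follow */
};

static struct example_frag example_decode_frag_header(uint8_t header,
                                                      size_t rest_of_block)
{
  struct example_frag f;
  unsigned frag_type = header & 0xc0;
  size_t   frag_size = header & 0x3f;

  f.size = 0;
  f.more = 1;
  switch (frag_type) {
  case 0x00:  /* small fragment, more fragments follow */
  case 0x40:  /* small fragment, last in chunk */
    f.kind = EX_FRAG_SMALL;
    f.size = frag_size ? frag_size : rest_of_block;
    f.more = (frag_type == 0x00);
    break;
  case 0x80:  /* 0x80 itself is the EOC marker */
    if (frag_size == 0) {
      f.kind = EX_FRAG_EOC;
      f.more = 0;
    } else {
      f.kind = EX_FRAG_BIG;
      f.size = frag_size << 6;
    }
    break;
  default:    /* 0xc0; 0xc0 itself is the EOS marker */
    if (frag_size == 0) {
      f.kind = EX_FRAG_EOS;
      f.more = 0;
    } else {
      f.kind = EX_FRAG_HUGE;
      f.size = frag_size << 12;
    }
    break;
  }
  return f;
}
```

With this reading, 0x45 is a small 5-byte fragment that ends its chunk, 0x81
is a big fragment of 64 bytes, and 0xff is the largest possible fragment,
63<<12 = 258048 bytes.
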
> +
> #define FR_EOC 0x80 /**< bits for EOC fragment */
> #define FR_EOS 0xC0 /**< bits for EOS fragment */
> #define FR_MORE 0x00 /**< bits for MORE fragment */
--
Paul DuBois
Sun Microsystems / MySQL Documentation Team
Madison, Wisconsin, USA
www.mysql.com