Thanks for taking time to share your ideas. I agree that we could gain
a lot if we could pass value objects around instead of going through all
the transformations you describe. However, this will be a pretty large
task, but I think what I have done so far is a step in the right
direction, and hopefully this work can evolve further.
Jay Pipes wrote:
> Hi! I've looked through your code and have a few comments inline.
> Please understand these are just suggestions; I'm actually very
> supportive of the concept of Value objects in the runtime, and I'll try
> to explain some alternate strategies to ponder...
> I will say ahead of time that I was not surprised to see this thread
> immediately devolve into a discussion on the memory and performance
> ramifications of the code change. It seems that the performance of the
> bits most often takes precedence over everything else, especially at the
> cost of maintainability and code readability. A Value Object system is
> designed to increase code maintainability and ease of use. It is *not*
> designed around performance. It may well be true that a Value object
> system may add additional memcpy's. It may also be true that a Value
> object system may reduce the *total* number of conversions/copies that
> need to be done in the runtime.
> Hopefully the comments below will shed some light on this, but I will
> state now that my fondness for the Value object system stems not from a
> desire for increased performance, but from a need for greater system
> Øystein Grøvlen wrote:
>> I am working on extracting much duplicated type conversion code from
>> Item classes and Field classes into a Value class. (See
>> http://forge.mysql.com/worklog/task.php?id=4760). Note that the
>> architectural review for this worklog is still pending, but a
>> preliminary patch for the Field classes part can be found at
>> Comments are very welcomed.
>> I am currently looking at the interaction between Field and Protocol,
>> and would appreciate some input.
>> Each Field subclass has a send_binary() method (a bit misleading name
>> since it depends on the protocol whether what it sends is binary or
>> textual.) Currently, these methods calls a corresponding
>> Protocol::store_xxx() method, and in my preliminary patch this looks
>> something like this:
>> bool Field_short::send_binary(Protocol *protocol)
>> return protocol->store_short(value().get_int());
>> I think it would be a good idea to add a Protocol::store(Value) method
>> that will be called from Field::send_binary(Protocol*). Then all the
>> overriding send_binary methods in the Field subclasses could be dropped.
>> Protocol_text::store(Value) could just call Value::to_string() to get
>> the textual representation of the Value. The question is what to do
>> for Protocol_binary and other protocols.
> May I present an alternate strategy?
> Currently, you have one large mysql::Value class (kudos on using
> namespacing BTW!) which "understands" how to convert its value into a
> number of different native "types" (custom String class, integers,
> my_decimal, etc... using the to_xxx() methods).
> Up until now, this is basically what the Item class and its derived
> classes already do via the val_str(), val_int(), etc methods.
> So, you've actually just added an additional layer of abstraction on top
> of the already complex Item tree. The benefit of a Value object system
> has yet to be realized since you have, to now, only really duplicated
> the current system of type conversion within the server.
> The true advantage of a Value object system is in two respects:
> * the immutability of Value objects, once constructed
> * the ability of Value objects to transform themselves into another
> Value object via construction
> By this last bullet point I mean something like the following:
> Number a= Number(20080911123059);
> DateTime b= DateTime(a);
> This may look like a trivial piece of code...but in there is some beauty
> which can remove or encapsulate a large chunk of type conversion code in
> the server and allow code to be written in a more natural (IMHO) manner.
> Assume that Number and DateTime are "Value objects". Perhaps they
> inherit from a base mysql::Value class, perhaps they don't. By saying
> they are "Value objects", I mean that:
> a) Number and DateTime objects, once constructed, are immutable
> b) Number and DateTime can be constructed from instances of each other
> Ignore, for now, the implementation of the above Value objects Number
> and DateTime (operator and constructor overloads). If we take the above
> interface (of construction via another Value object instance), then we
> can begin to refactor two distinct parts of the parser and runtime which
> revolve around:
> * Construction of constant things
> * Transformation and reduction of constant things into other constant
> An example of the former would be in the Item tree constructed via a
> simple statement such as:
> CREATE TABLE t1 (
> b DATETIME NOT NULL
> SELECT b FROM t1 WHERE b BETWEEN '20080911123059' AND '20090911123059';
> In the current server, the SELECT statement above is parsed into a
> Select_lex structure which contains a series of Item, Table and Field
> objects. The Item objects represent the constant strings
> '20080911123059' and '20090911123059' as Item_string objects. The Field
> object represents the "b" field as a Field_datetime.
> Because Field_datetime is evaluated by the MyISAM and HEAP storage
> engines and runtime as 64-bit integers, each Item_string's val_int()
> method is called to return a signed 64-bit integer number that is passed
> to the runtime during its evaluation of the COND representing the
> between condition on the "b" field. (Transformation #1)
> These 64-bit signed integer are then passed to the
> Field_datetime::store() method in the runtime and the runtime asks the
> storage engine to give it an appropriate key to use in reading the data
> from the t1 table.
> Before it can pass the data to the storage engine, however,
> Field_datetime::store() must first verify that the passed integer is
> indeed a valid datetime. So, it "parses" the signed 64-bit integer into
> a timestamp-like number (see sql/time.cc:number_to_datetime())
> represented by the MYSQL_TIME struct. This struct contains temporal
> information like year, month, day, etc. (Transformation #2)
> Field_datetime::store() then places the pieces of this MYSQL_TIME struct
> into a series of raw uchar* bytes, pointed to by the Field::ptr member.
> (Transformation #3)
> The storage engine can then use these raw uchar* bytes to find and
> retrieve the records needed by the call from the runtime to the record
> pointer needed during send of the record back to the client over the
> However, the Protocol itself will see (in Item::send()) that the
> returned Field data type is of type MYSQL_TYPE_DATETIME, and will
> construct a MYSQL_TIME instance by calling
> The get_date() method then takes the value of the retrieved uchar*
> records, converts the uchar* into a signed 64-bit integer (via macros in
> sql/korr.h) (Transformation #4). These 64-bit integers are then passed
> to the number_to_datetime() method to construct the MYSQL_TIME structs.
> (Transformation #5) and these are passed back to the Item::send() method.
> Item::send() then calls Protocol::store(*MYSQL_TIME), passing in the
> MYSQL_TIME structs. This method then transforms each MYSQL_TIME struct
> into a series of char* via the datetime_to_str() methods, which calls
> sprintf() to change the data into a textual format of "YYYY-MM-DD
> HH:MM:SS" (Transformation #6) which is then sent along the wire in text
> As you can see, there are quite a few transformations which occur for
> both the constant strings as well as the datetime data sent/retrieved
> from the storage engine.
> Once functions such as DATE_FORMAT(), CAST(), STR_TO_DATE() and others
> get involved, the whole process can be downright overwhelming! :)
> Currently, when you follow the code in the server, you weave in and out
> of the various val_xxx(), store(xxx), get_xxx() methods, and in the case
> of decimal and datetime conversions, all their associated routines. It
> can be very difficult to follow at times.
> A Value object implementation can simplify much of this code spaghetti
> and even get rid of many of the Item classes entirely, namely all of the
> Item_xxx_typecast classes and the Cached_item classes.
> So, how to go about implementing a Value object system in the runtime?
> 1) Create a hierarchy of Value classes subclassing from a Value class.
> You'll need classes for temporal objects like Date and Timestamp as well
> as classes for a Number (don't have to break it into Decimal and
> Natural, I'd encapsulate all that in one class) and a String class of
> course...but not like the current String class; you'll want an immutable
> 2) Instead of the parser creating Item_string or Item_num objects, it
> would create immutable Value objects of type String and Number.
> 3) For each column in a SELECT's result and each WHERE condition you
> might create a vector<> of Value object pointers. Instead of calling
> val_int(), val_str() and likewise, simply use the Session's vector<> of
> Value objects as needed. Push and pop objects off the vector as needed,
> and construct new Value objects from other ones.
> For example, Let's take the example above. The parser would construct a
> new String value object from the string "20080911123059".
> After parsing, some analysis of the parsed nodes is done, including a
> name resolution step. During this step, the "b" column would be
> determined to be of type DATETIME. A DateTime value object for each
> side of the condition would then be pushed onto the Session's object
> vector<> like so:
> String *left_side= <call to get the left String value of the condition>
> session.values.push_back((Value *) new DateTime(left_side));
> Within the optimizer, instead of using Field_datetime::store() to both
> validate and transform the datetime-string data, the optimizer would
> simply access the condition's constants like so:
> DateTime *left_side= (DateTime *) session.values;
> DateTime *right_side= (DateTime *) session.values;
> If temporal functions were used in a statement (say, DATE_FORMAT()), it
> could work with DateTimes as DateTimes, with calendrical calculations
> native to a DateTime object, instead of constantly having to convert
> args to a MYSQL_TIME or an integer or a string (just see the amount
> of code in the temporal built-in functions for doing conversion...)
> For implementing DATE_ADD(), for instance, there's a ton of code which
> could be encapsulated in a Value object system that understands how to
> add and subtract dates and times properly...
> Anyway, these are all just thoughts to get the ideas flowing, and
> nothing more. In Drizzle-land, we're constantly thinking about this
> very problem and how to tackle it. It's not an easy one to address with
> the current architecture, but hopefully we can share our successes and
> failures and collaborate together on this :)