From: Mats Kindahl Date: May 12 2009 3:16pm Subject: Re: MySQL Reengineering Project List-Archive: http://lists.mysql.com/internals/36637 Message-Id: <4A0992B6.4080208@sun.com> MIME-Version: 1.0 Content-Type: text/plain; CHARSET=US-ASCII Content-Transfer-Encoding: 7BIT Alaric Snell-Pym wrote: > > Excellent news! > > One word of warning, though: make sure it's a series of small steps. > It's far too easy, with this sort of thing, ending up going off on > huge yak-shaving tangents. By all means take lots of small steps > towards a lofty distant goal, but make sure each step is useful in its > own right (even if just by allowing other steps to happen), or you can > get lost on a branch that will never merge ;-) Yes, we don't want to do the work in macro steps (at least not at this point), we want to proceed carefully. > I see that a few macroscopic tasks have appeared on the Forge already, > but I'd like to add something I think could be changed for the better, > on a grass-roots level throughout the codebase: > > I see a lot of methods that are called with arguments and return a > value, but most of their input and output is actually through member > fields of the object - not that the method is operating on its object > per se, but that the caller actually puts things into the member > fields, then calls the method, then inspects the results in member > fields. For example, in the storage engine API, update_row is called > with a buffer in unireg format, which it almost universally ignores > and instead uses the array of Field objects set up in the handler > object by the caller. And we spent some time in debugging our storage > engine - it would return rows fine when you did selects on the table, > but when you did certain types of join, it would fail to return any > rows, despite our logging clearly showing we'd returned rows to MySQL > - because it seems that sometimes MySQL not only looks at the return > value of rnd_next, but also checks the 'status' member of the table > object to see if the current row of the table is valid or not. So our > rnd_next had to assign success/failure to table->status as well as > returning success, and then everything worked OK. Doh. > > Making the inputs and outputs of every method/function explicit, > rather than sneaking stuff in and out via members, will make the > calling interfaces between things a lot easier to read, which will > reduce the chances of developers working on a module introducing bugs! > Plus, it'll simplify the classes a lot, and make them easier to read, > as they will end up with only members that really relate to the actual > domain object - eg, the table or whatever - rather than members that > are part of the calling protocol of particular operations on the > class. Less short-term mutable state in classes means they can be > shared between threads in more and more contexts, too, as function > arguments and return values live only on the thread-local stacks! Yes; the fact that the handler interface doesn't really honor the arguments has been a major bummer for me several times. This is actually because some engines that we support internally ignore the argument and use the stored records record[0] and record[1] instead, which means that every engine (with the exception of a few) started doing that. So now you have to both pass the argument and the record to make sure that all engines work. Just getting clear semantics on how this part of the handler interface works, and add assertion to weed out the bad usages, would simplify the code significantly and improve the speed for all. However, do you know of any other interfaces that work this way? I am personally not aware of any other, but then I don't know every corner of the code like Serg does. :) Best wishes, Mats Kindahl -- Mats Kindahl Senior Software Engineer Database Technology Group Sun Microsystems