Jay Pipes wrote:
> Mats Kindahl wrote:
>> Alaric Snell-Pym wrote:
>>> Excellent news!
>>> One word of warning, though: make sure it's a series of small steps.
>>> It's far too easy, with this sort of thing, ending up going off on
>>> huge yak-shaving tangents. By all means take lots of small steps
>>> towards a lofty distant goal, but make sure each step is useful in its
>>> own right (even if just by allowing other steps to happen), or you can
>>> get lost on a branch that will never merge ;-)
>> Yes, we don't want to do the work in macro steps (at least not at this
>> we want to proceed carefully.
>>> I see that a few macroscopic tasks have appeared on the Forge already,
>>> but I'd like to add something I think could be changed for the better,
>>> on a grass-roots level throughout the codebase:
>>> I see a lot of methods that are called with arguments and return a
>>> value, but most of their input and output is actually through member
>>> fields of the object - not that the method is operating on its object
>>> per se, but that the caller actually puts things into the member
>>> fields, then calls the method, then inspects the results in member
>>> fields. For example, in the storage engine API, update_row is called
>>> with a buffer in unireg format, which it almost universally ignores
>>> and instead uses the array of Field objects set up in the handler
>>> object by the caller. And we spent some time in debugging our storage
>>> engine - it would return rows fine when you did selects on the table,
>>> but when you did certain types of join, it would fail to return any
>>> rows, despite our logging clearly showing we'd returned rows to MySQL
>>> - because it seems that sometimes MySQL not only looks at the return
>>> value of rnd_next, but also checks the 'status' member of the table
>>> object to see if the current row of the table is valid or not. So our
>>> rnd_next had to assign success/failure to table->status as well as
>>> returning success, and then everything worked OK. Doh.
>>> Making the inputs and outputs of every method/function explicit,
>>> rather than sneaking stuff in and out via members, will make the
>>> calling interfaces between things a lot easier to read, which will
>>> reduce the chances of developers working on a module introducing bugs!
>>> Plus, it'll simplify the classes a lot, and make them easier to read,
>>> as they will end up with only members that really relate to the actual
>>> domain object - eg, the table or whatever - rather than members that
>>> are part of the calling protocol of particular operations on the
>>> class. Less short-term mutable state in classes means they can be
>>> shared between threads in more and more contexts, too, as function
>>> arguments and return values live only on the thread-local stacks!
>> Yes; the fact that the handler interface doesn't really honor the
>> arguments has
>> been a major bummer for me several times. This is actually because
>> some engines
>> that we support internally ignore the argument and use the stored records
>> record and record instead, which means that every engine (with the
>> exception of a few) started doing that. So now you have to both pass the
>> argument and the record to make sure that all engines work.
>> Just getting clear semantics on how this part of the handler interface
>> and add assertion to weed out the bad usages, would simplify the code
>> significantly and improve the speed for all.
>> However, do you know of any other interfaces that work this way? I am
>> not aware of any other, but then I don't know every corner of the code
>> like Serg
>> does. :)
> "External" interfaces? See all the plugin "interfaces". There's no
> enforcement of types really at all. Just passing void *'s around.
> As for the internal interfaces, I would suggest cleaning up the class
> interfaces of THD, JOIN, and other major classes to enforce public
> accessors and getters, protecting private member variables behind a
> clean API. This would, eventually, make some of these classes
> semi-usable in public interfaces. Right now, the passing of the THD*
> everywhere, and THD having basically a bunch of public member variables,
> means that there is no enforcement of state changes through an
> interface. This leads to serious problems where the "internal" state of
> a THD is actually public and cannot be seen as reliable for the lifetime
> of a session's requests.
> Another thing to think about in your refactoring efforts is detaching
> the THD from its current inheritance from Statement, Query_arena and
> ilink. Without doing this, and using encapsulation so that a THD can
> have multiple Statements, it will be very difficult to work on any
> future parallelization efforts.