| List: | Commits | « Previous MessageNext Message » | |
| From: | Ingo Strüwing | Date: | February 21 2009 12:56pm |
| Subject: | Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745) WL#4630 | ||
| View as plain text | |||
Hi Øystein, Øystein Grøvlen, 19.02.2009 10:17: ... > Ingo Strüwing wrote: ... >> However, in a couple of cases you complained without proposing >> alternatives. Please see below. > > I do this because I do not necessarily have the answer. I think it is > the implementor's responsibilty to decide on a solution, and that it > is OK for a reviewer to just challenge the existing code to make the > implementor consider whether better solutions exists. (I think Rafal > covered that point of view pretty good in an earlier email.) I am still not all that happy with this approach. If the reviewer says "I don't like your code. Make it better.", I can change things, but I don't know if the reviewer would like that better. That's hunting a black cat in the darkness. I think a reviewer better says, how to change things to make him happy. Example: You asked me to re-use a pre-existing parser for the search string. But you don't know, where to find it, nor if it exists at all. So is it my duty to read all MySQL source files, to find one? And if it is not there, to search open-source archives? This could easily take a hundred times the time, it took me to write it from scratch. But the clever move may be to claim, there is no such parser. then it is your turn to prove me wrong, if you find the issue important enough. ... >> Øystein Grøvlen, 17.02.2009 14:57: ... >>> Ingo Struewing wrote: ... >>>> + puts("By Sun Microsystems, for your professional use\n\ >>>> +This software comes with NO WARRANTY: This is free software,\n\ >>>> +and you are welcome to modify and redistribute it under the GPL >>> license\n"); >>>> + >>> Have you checked that this is correct? Should not some copyright >>> notice with year be included (e.g., see mysqld)? >> >> >> Hm. I copied it from mysqlbinlog and changed the name. It seems that >> every client displays a different text. Perhaps we could also leave it >> away. Or check with someone authority, what we should print. > > Maybe we should check with build or documentation people. I checked with Trudy, who announced, how to handle copyright notices in source code. It turned out that this announcement included client copyrigths too. You are right. A copyright notice with year shall be included. Fixed. > >> >> ... >>>> +/** >>>> + Print a human readable number. >>> I do not think "number" is right here. Isn't it a size in bytes? >> >> >> Doesn't include a size a number too? I use the function for the file >> size and the number of bytes per table-data chunk. So what do you >> suggest to write here? "human readable number of bytes"? Or just add >> "K", "M", ... and leave "bytes" to be printed outside of the function? > > It is not a big issue, but my point is that numbers are a sequence of > digits (and +, -, and .) What is printed here is a number AND a unit > (bytes). I'll try to make it clearer. > >> >> ... >>>> + if (!opt_mysqlbackup_exact) >>>> + { >>>> + /* Shrink value while it's > 16K and postfix list is not at > end. */ >>>> + while (((value > 16383) || (value < -16384)) && > *(++postfix_p)) >>>> + value/= 1024; >>>> + /* >>>> + In case that we terminated the loop due to end of the postfixes >>>> + list, we need to compensate the loop's ++postfix_p so that we >>>> + point to the last non-NULL postfix. Note that this can only > happen >>>> + on machines where longlong has more than 64 bits. >>>> + */ >>>> + if (!*postfix_p) >>>> + --postfix_p; /* purecov: inspected */ >>> I am not very fond of dead code like this. It is not possible to >>> test, and it took me some time to verify that it is actually correct. >>> I would guess that if longlong ever becomes something more than 64 >>> bits, there are bigger problems that need to be solved ... >> >> >> Possible. But do you suggest to consciously leave this unsolved? Didn't >> we face enough problems with code that didn't expect "long" could be >> longer than 32 bits? If we will get problems with code that doesn't >> think of longlong to be longer than 64 bits ever, do we want to >> consciously add yet another one here? > > It does not seem to me that spending time on fixing .1% of the issue > we will have if that happens is worthwhile. This is still a case, where I don't know, how to change it to make you happy. Do you want me to remove the 'if' and the decrement? Above you said, that the "implementor [has to] consider whether better solutions exists". But if I remove it, you might come back to me, complaining about a possible bug. So please give me a hint, what solution you would accept. > >> >> ... >>>> + case BSTREAM_IT_CHARSET: >>>> + case BSTREAM_IT_USER: >>>> + case BSTREAM_IT_TABLESPACE: >>>> + { >>>> + struct st_backup_global *bup_global= (struct st_backup_global*) >>> item; >>>> + type_s= bup_global->glb_typename; >>>> + mdata= &bup_global->glb_metadata; >>>> + DBUG_PRINT("bupstrm", ("%s", type_s)); >>>> + break; >>>> + } >>> In the above code block, the order of the statements is different from >>> the others below. I think it would be good with some symmetry here. >> >> >> Ok. Which order do you like best? > > I do not really care which order. I just feel that some symmetry > makes the code easier to read. Fixed. > >> >> ... >>>> + case BSTREAM_IT_SPROC: >>>> + DBUG_PRINT("bupstrm", ("sproc")); >>>> + type_s= "Sproc"; >>>> + break; >>>> + case BSTREAM_IT_SFUNC: >>>> + DBUG_PRINT("bupstrm", ("sfunc")); >>>> + type_s= "Sfunc"; >>>> + break; >>> Is sproc and sfunc well known abbreviations? Does the s mean stored or >>> sql? >>> I think it would be good to be more explicit here. >> >> >> Sure, but look. At the moment I print: >> >> Table 'mysqltest1'.'t3' >> Sproc 'mysqltest1'.'p1' >> Sproc 'mysqltest1'.'p2' >> Sfunc 'mysqltest1'.'f1' >> Sfunc 'mysqltest1'.'f2' >> View 'mysqltest1'.'v1' >> View 'mysqltest1'.'v2' >> Event 'mysqltest1'.'e1' >> Event 'mysqltest1'.'e2' >> Privilege 'mysqltest1'.'<empty>' >> >> If I change to "stored procedure", the indentation of the object names >> will become so wide that it may be difficult to associate the type name >> with the object name for "table", "view", and "event". For table one >> could print "relation", which is longer, but I don't know how to make >> "view" longer. So what shall I do? > > I see your point. Do really "stored" need to be included here? Would > just using "procedure" and "function" introduce ambiguity? "Procedure" may be unique. But if we one day decide to back up user defined functions, we'd be in trouble with "Function". Does this concern change your suggestion? ... > My suggestion would be to use errm_with_prefix directly where needed > and add the "Cannot parse ..." directly to the string. Done. > >> ... >>>> +/** >>>> + Parse the search name. >>>> + >>>> + Split search name into database name, if present, and object name. >>>> + >>>> + @param[in] search_name search name >>>> + @param[out] search_database_name_p database name >>>> + @param[out] search_object_name_p object name >>>> + >>>> + @return status >>>> + @retval 0 ok >>>> + @retval != 0 error >>>> +*/ >>>> + >>>> +static int parse_search_name(const char *search_name, >>>> + char **search_database_name_p, >>>> + char **search_object_name_p) >>>> +{ >>> It seems to me like you are re-inventing the wheel here. Parsing for >>> quoted strings etc. must have been done before. Can't some existing >>> code be reused? >> >> >> Which existing code do you mean? > > I am not sure, but I would think that MySQL would need code to handle > quoted strings (e.g., the interactive client). See above for a discussion of this. ... > I think it is a huge difference between reusing a simple string and > simplifying an algorithm by extracting common sub-operations. As far > as I can tell (count), we are talking about moire than 20 lines of > code that are almost identical. (Only difference is whether quote is > initialized to '.' or '\0') ... > I am not insisting on anything. It is your code. No. It is Sun's code. I don't have a trace of ownership in it. > However, this issue > is more important to me than most of my other comments. By extracting > common code into a function, one will improve readability and > maintainability. One will immediately see that the code is doing the > same thing twice. This was not evident when I was reading the code, > which meant that it took me longer than necessary to understand what > the code was doing. Ok, I refactored the parser. It looks less overwhelming, but I doubt that it is easier to understand and maintain. If something goes wrong, the whole thing must be understood anyway. And understanding, how multiple functions interact, can be more difficult, than to have the whole algorithm in one place. ... >> :) I searched for "stuff" through the MySQL sources and found > 300 >> lines. I found it in > 300 MySQL related emails in my archive. I found >> it in > 200 Linux man pages. It seems to be the way people are talking. >> >> Anyway, what do you suggest as a replacement? > > content? However, this is not a big issue. Changed. ... > I do not think you need to change the names. I think there a > different standards for what a presented to programmers and to end > users. I think we need to be more careful about the visible part of > the product. Hm... ... >>> You are repeating the same pattern here. Please, consider if some >>> common code can be extracted into a separate method. >> >> >> Done. > > Considered or extracted? ;-) Extracted. ... > This is not an exact science. There are many different aspects to > consider, and probably many different opinions. That you are doing > this in two places is an argument for keeping it as it is. Some > arguments against: > > - This function is not always doing read_and_print. It only prints > if that is asked for. Changing the code where > read_and_print_summary is called like this would make that more > evident: > > backup_read_summary(...) > if (opt_mysqlbackup_summary) > print_summary(...) // New method that does the print part Together with the required error checking it makes quite a few lines of code. And exactly the same code is needed at two places. After all, this function combines all actions that are required for processing the summary section. If this is not an acceptable reason to make a function, I need some education in programming style. > > - The current function is a bit assymetric since you call another method > for reading, but do the printing in the function. (This is not a > major point.) Fixed. ... > The point is that we are talking about different levels of > abstraction. The next higher level should have a single > responsibility as seen from the caller's level of abstraction. > >> >> See for example Backup_restore_ctx::do_backup(). It has the >> resposibilities logging, locking, starting drivers, polling drivers, >> writing image file, and so on. > > No, at the caller's abstraction level, it has the single > responsibility of doing backup. At a lower abstraction level, this > involves logging, locking, ... But if do_backup() is allowed to combine several resposibilities into a single one, why is it bad to combine everything required for the summary section into a function? I still don't get the difference. ... >> Can you please define "single responsibility"? I have the impression >> that I won't get it right without such definition. > > My suggestion is to put the parts that prints stuff in separate > methods. (E.g., print backup stram header, print snapshot section, > print catalog, print meta data). The reading seems to already happen > in separate methods. > >> >>>> +{ >>>> + char **defaults_argv; >>>> + void *image_handle; >>>> + char *search_database_name= NULL; >>>> + char *search_object_name= NULL; >>>> + struct st_backup_catalog *bup_catalog= NULL; >>>> + struct st_bstream_image_header *hdr; >>>> + enum enum_bstream_ret_codes brc; >>>> + int rc; >>>> + uint idx; >>>> + int errpos= 0; >>>> + char llbuff[22]; >>> Some explanations for the different local variables would be good. >>> In particular, I would like to know the significance of the different >>> values for rc. >> >> >> Can you please point me at some code, where this is done exemplarily? > > I am not looking for something very advanced here. Just a line for > each variable that explains it's purpose. When a method is a long as > this one, it is not straight-forward to see what a variable is used > for, and an explanation would be good. I added this to the function comment. Comment lines between the declarations looked pretty confusing. Especially as it is not common MySQL style. ... >>>> + /* >>>> + The first two character sets are special: ... > My suggestion is to explain this thoroughly as you do the first time. > In the other places, you could just say "Skip first two character > sets. They have been displayed in the header section", or something > like that. Done. ... >>>> + use_err: >>>> + usage(); >>>> + rc= 1; >>>> + >>> Instead of a particular use_err target, why to just go to end and use >>> the value of rc to determine whether usage should be printed. >> >> >> I do not want to print the usage one every error, but only for wrong >> usage, so to say (wrong command line option, including invalid search >> string, wrong number of command line arguments). Using a special value >> for rc might do the trick, but requires careful documentation. What >> would be the win? > > I would like careful documentation of rc anyhow :-) After all the refactoring you suggested, there is no 'rc' left. ;-) Unfortunately, this grave refactoring means that you have to repeat the review of mysqlbackup.cc from scratch. :-( I'll send a notice to you and Rafal as soon as I commit the new patch. Regards Ingo -- Ingo Strüwing, Database Group Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten Geschäftsführer: Thomas Schröder, Wolfgang Engels, Dr. Roland Bömer Vorsitzender des Aufsichtsrates: Martin Häring HRB München 161028
