From:
Date: September 10 2007 6:26am
Subject: Re: Re: [PATCH] (super)user-loadable mysqld parsers [API stuff]
List-Archive: http://lists.mysql.com/internals/35030
Message-Id: <20070910042600.GD30577@w3.org>

* Arnold Daniels [2007-09-09 13:10-0400]
>
>
> Eric Prud'hommeaux wrote:
>> * Arnold Daniels [2007-09-08 18:11-0400]
>>
>>> Hi Eric,
>>>
>>
>>
>> == API stuff ==
>>
>>> I'm not fully agreeing with you here.
>>>
>>
>> You can't strongly disagree with me 'cause I don't have a strong
>> opinion. I have a vague preference for the approach I've taken, and
>> will defend it 'till I'm dead^h^hbored. Seriously, I appreciate
>> this discussion.
>>
> Aren't you even considering, I might just convince you ;).

Oh yes, I was flippantly attempting to imply that. I'm not super
invested in one approach over the other. I implemented the
prefix-switched parser first and did the protocol-switched one after
talking to a few people, any of whom are welcome to chime in and
either back their preference or indicate that they've been convinced
by your arguments. The prefix-switched parser was certainly easier for
me to implement.

>>
>>> I think changing the API just has
>>> too big of an impact on all the different clients. Let's say you're
>>> using ODBC or PDO in PHP. There's no way to implement that neatly. I
>>> really don't like automatic selection either 'cause it's too easy to
>>> mess up.
>>>
>>
>> My uninformed guess is that PHP, DBI, et al are using mysql_query. The
>> deployment challenge in that case is to add a call for the new API
>> entry, mysql_send_query, which is like mysql_query but takes an extra
>> parameter.
>> Likewise, ODBC's executeQuery function would need to link
>> to mysql_send_query and have some settable var to indicate the
>> language code to use.
>>
> The point here was not about the examples. But there are a few mysql
> clients out there: ODBC, /J, native PHP, etc. On top of those there are
> hundreds of other drivers, usually multiple for each programming
> language. All of these would need to be changed to support parser
> switching. Now on top of those there are thousands and thousands of DB
> abstraction classes and libs. All of those would need to be changed as
> well.
> Besides that, the API of all of these clients can't break compatibility.
> In many cases that means that having a second parameter to choose the
> parser is out of the question. Adding a function like you did for the
> mysql client isn't a possibility for most clients either, since they are
> data access abstraction layers, having to conform to a specific API.
> Sure, in most cases you can come up with a workaround. But it will be a
> huge mess.

The extra parser codes are at the end of the set of protocol codes, so
any client should be able to speak to the extended server (I tested
with a 5.0 client). Likewise with the things that speak to the client:
I don't see that they break compatibility when a new API point is
added. However, their code would need to be visited in order to take
advantage of the extension points.

>>
>>> If you don't like to integrate the parser, perhaps a good solution is
>>> to select it with a local setting, so you could do:
>>> SET LOCAL query_parser=PARSER_SPASQL;
>>> SELECT ?s WHERE { ?s "hibbyhop" };
>>>
>>
>> The problem with that is getting back. SET (LOCAL|GLOBAL) is parsed
>> by the SQL parser. Each parser would have to implement its own way of
>> getting back or you'd lose the flexibility of being able to
>> intersperse queries.
>>
> Each parser would need to implement the SET command, I don't see a
> problem there. Implementing a specific command `PARSER SPASQL` could
> also work. Though, so as not to add yet another non-ANSI keyword,
> something like `SET PARSER SPASQL` would be a better alternative. That
> way you don't need to implement SET in each parser. Though I think
> having SET in each parser would be a good idea, since it is the way to
> control how MySQL acts and you want to be able to do that no matter
> what type of query you're sending.

* Arnold Daniels [2007-09-09 16:54-0400]
> In any case, Oracle uses a specific keyword, XQuery, to select a
> different parser. I really think that is the way to go, regardless of
> whether you do or do not want to embed the XQuery statement.

Here is a perhaps oversimplified patch to prefix-switch on the server:

--- sql/sql_parse.cc.orig 2007-08-26 04:36:04.000000000 -0400
+++ sql/sql_parse.cc 2007-09-05 09:54:19.000000000 -0400
@@ -5336,7 +5348,24 @@
 
     Lex_input_stream lip(thd, inBuf, length);
 
-    bool err= parse_sql(thd, &lip, NULL);
+    bool err;
+    if (!strncmp(buf, "parser", 6) && buf[6] != 0 && buf[7] == ':') {
+      uchar parser = buf[6]-'0';
+      if (parser > 0 && parser < PARSERS) {  // both bounds must hold
+        buf += 8;
+        length -= 8;
+        struct dynamic_parser_info* p = &dynamic_parsers[parser-1];
+        if (p->initialize) {
+          char* space = (char*)malloc(strlen(lip.get_ptr())+1);
+          strcpy(space, lip.get_ptr());
+
+          Dynamic_parser* parser = p->initialize(thd, &space);
+          err= parser->parse();
+          delete parser;
+        } else {
+          my_error(ER_CANT_FIND_DL_ENTRY ,MYF(0), "initialize"); //!!!not loaded
+          err= 1;  // don't leave err uninitialized on this path
+        }
+      } else {
+        my_error(ER_UDF_NO_PATHS, MYF(0)); // !!! illegal, really
+        err= 1;
+      }
+    } else {
+      err= parse_sql(thd, &lip, NULL);
+    }
+
     *found_semicolon= lip.found_semicolon;
 
     if (!err)

but I'd like to hear from a couple other folks to gauge the temperature
of the list.
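
For anyone who wants to poke at the prefix check outside the server, here
is a minimal standalone sketch of the same idea: strip an optional
"parserN:" prefix and report which parser slot was asked for. It is not
part of the patch. The names split_parser_prefix and PrefixResult are
hypothetical, and PARSERS = 4 is an assumed value, since the patch does not
show its definition. The range test follows what the patch appears to
intend (parser > 0 and parser < PARSERS).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed slot count; the real value of PARSERS is not shown in the patch.
static const unsigned PARSERS = 4;

// Hypothetical result type for this sketch.
struct PrefixResult {
  int parser;          // 0 = default SQL parser, >0 = dynamic slot, -1 = illegal slot
  const char *query;   // text to hand to the selected parser
  std::size_t length;  // number of bytes remaining at query
};

// Split an optional "parserN:" prefix (N is a single digit) off a query buffer.
// The buffer does not need to be NUL-terminated; length bounds every read.
static PrefixResult split_parser_prefix(const char *buf, std::size_t length) {
  assert(buf != nullptr);
  PrefixResult r = {0, buf, length};

  // "parser" + one selector byte + ':' is 8 bytes; anything shorter,
  // or anything not matching that shape, goes to the SQL parser untouched.
  if (length < 8 || std::strncmp(buf, "parser", 6) != 0 ||
      buf[6] == '\0' || buf[7] != ':')
    return r;

  // Same arithmetic as the patch: a non-digit wraps to a large value
  // and falls out of range below.
  unsigned char n = static_cast<unsigned char>(buf[6] - '0');

  r.query = buf + 8;
  r.length = length - 8;
  r.parser = (n > 0 && n < PARSERS) ? static_cast<int>(n) : -1;
  return r;
}
```

With these assumptions, split_parser_prefix("parser1:SELECT ?s", 17)
selects slot 1 and leaves the 9 bytes "SELECT ?s", a plain "SELECT 1" comes
back unchanged with slot 0, and "parser9:x" is reported as an illegal slot.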

[ cutting out XQuery stuff for this subthread ]
-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@stripped)
Feel free to forward this message to any list for any purpose other than
email address distribution.