Sorry for bringing this topic up again, knowing what this list is for.
Much of the discussion here is about performance; why is that?
It seems most webmasters care about problems that are hardly relevant
to them. Even if your site is well visited, I can almost guarantee that
sites with ten times the traffic use ASP, shell scripts, Perl CGI, or
even Java for dynamic content. They probably have more important
problems than how their site performs.
I have spent my spare time over the last year studying and developing a
similar parser, taking the good influences and discarding the
unimportant ones.
There seems to be a belief that server API modules are much faster in
every respect; this is far from the truth. A tool like a document parser
needs an input file (template), which is often embedded in HTML code.
The file I/O required to open that file and read its contents into a
string is a far more expensive task than passing an environment list and
forking a new process. If a CGI program written in C contains all of its
intended output without needing to read an external file, you are
guaranteed faster execution. That goes for a compiled C program; C++
programs using the string object and iostreams, and perhaps even
exception handling, are many times slower. Perl and Java programs are
slower still.
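To make the comparison concrete, here is a rough sketch of the kind of
self-contained C CGI I mean, where the whole page lives in the binary
and no template file is ever opened (the page content is of course made
up):

```c
#include <stdio.h>
#include <string.h>

/* Build the complete CGI response in memory. No template file is
 * opened, so apart from the fork/exec itself the only I/O the program
 * does is the final write to stdout, which the web server captures. */
static int render_page(char *buf, size_t cap)
{
    return snprintf(buf, cap,
                    "Content-Type: text/html\r\n\r\n"
                    "<html><body><h1>Hello</h1></body></html>\n");
}
```

A real CGI binary would simply call render_page() from main() and write
the buffer to stdout.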
The performance discussion reminds me a lot of top-speed discussions.
Basically, all a server API module saves is the start-up of a new
process. The real issue, however, should be what the program does, what
it is supposed to do, and how it is done. I can fairly truthfully state
that I have spent a lot of time studying different aspects of memory
handling, file I/O, and database connections. I have chosen ODBC, even
in a UNIX environment; I think it is among the best ideas Microsoft has
launched on its own.
It offers a great way to query capabilities and handle the transactions
between the database and the client with very little overhead. The open
architecture of ODBC wins; native DB APIs are for writing ODBC drivers,
in my mind.
Many client tools use the ODBC function SQLExtendedFetch(), which
allows the record cursor to be moved incrementally backwards as well as
forwards, even back to the beginning. Most web connectors use it too,
just to be able to output the data twice or more. My study has shown
that using the regular SQLFetch() (forward only) consumes far fewer
resources; if the records are needed again after the result set has
been fetched, the query is simply re-executed. This is only done for
the first executed query in a stack referred to by name.
The result is equal speed when just one query is executed, but a huge
win when queries are executed inside a result loop. This is extremely
fast, and I would like to know how it is done in other solutions.
SQLExtendedFetch() is for scrolling record sets in a graphical
environment and is not suited for this purpose.
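The re-execute trick can be sketched like this; the query layer below
is a hypothetical stand-in for the real ODBC calls (SQLExecDirect(),
SQLFetch()), and the three-row result set and all names are mine, for
illustration only:

```c
#include <stddef.h>

/* Fake three-row result set standing in for a database. */
#define NROWS 3
static const int result_rows[NROWS] = { 10, 20, 30 };

struct cursor { int pos; };

/* (Re)opening the query resets the forward-only cursor. */
static void exec_query(struct cursor *c) { c->pos = 0; }

/* Forward-only fetch, like SQLFetch(): returns 0 at end of data. */
static int fetch_next(struct cursor *c, int *row)
{
    if (c->pos >= NROWS)
        return 0;
    *row = result_rows[c->pos++];
    return 1;
}

/* Output the data twice without a scrollable cursor: instead of
 * scrolling back with SQLExtendedFetch(), simply re-execute. */
static int sum_two_passes(void)
{
    struct cursor c;
    int row, sum = 0;

    exec_query(&c);                    /* first pass */
    while (fetch_next(&c, &row))
        sum += row;

    exec_query(&c);                    /* rows needed again: re-open */
    while (fetch_next(&c, &row))
        sum += row;

    return sum;                        /* both passes saw the same rows */
}
```

Both passes walk the same rows, so the caller can render the data twice
while the driver only ever needs a cheap forward-only cursor.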
Strings and memory handling: I have not studied the string objects in
C++, but I can say that normally the memory allocations seem far too
frequent for efficient CPU use. We have developed a string class which
tries to determine the nature of the string and allocates more memory
each time it needs to grow (up to a certain point); it also appears
that malloc and free are faster than new and delete.
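For illustration, here is a bare-bones version of such a string in
plain C (not our actual class): capacity doubles on overflow, so n
appends cost only O(log n) calls to realloc() rather than one
allocation per append.

```c
#include <stdlib.h>
#include <string.h>

/* A growable string with geometric capacity growth. */
struct gstr {
    char  *buf;
    size_t len, cap;
};

static void gstr_init(struct gstr *s)
{
    s->cap = 16;
    s->len = 0;
    s->buf = malloc(s->cap);
    s->buf[0] = '\0';
}

static void gstr_append(struct gstr *s, const char *text)
{
    size_t n = strlen(text);
    while (s->len + n + 1 > s->cap) {
        s->cap *= 2;                   /* geometric growth */
        s->buf = realloc(s->buf, s->cap);
    }
    memcpy(s->buf + s->len, text, n + 1);
    s->len += n;
}

static void gstr_free(struct gstr *s) { free(s->buf); }
```

A real implementation would also check the malloc()/realloc() return
values and perhaps cap the doubling at some upper bound, as our class
does.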
One of the most important aspects of execution speed, when a parser
tool opens a file, is how the file is scanned. The general rule, as I
understand it, is that the more functions (or better, tags) there are
to locate at each position, the longer execution will take. Passing
references to the original string instead of copying new ones also
results in a more efficient scan. We use bit flags for tags while
checking occurrences, and it seems to work well.
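A minimal sketch of the bit-flag idea (the tag names and comment-style
syntax here are invented, not Webmerger's actual tags): one bit per tag
the parser knows about, OR-ed together in a single pass so later stages
can test cheaply which tags occur at all.

```c
#include <string.h>

/* One bit per known tag. */
enum {
    TAG_QUERY = 1 << 0,
    TAG_FIELD = 1 << 1,
    TAG_IF    = 1 << 2
};

/* Scan the template once and record which tags occur. */
static unsigned scan_tags(const char *tmpl)
{
    unsigned found = 0;
    if (strstr(tmpl, "<!--query")) found |= TAG_QUERY;
    if (strstr(tmpl, "<!--field")) found |= TAG_FIELD;
    if (strstr(tmpl, "<!--if"))    found |= TAG_IF;
    return found;
}
```

Later stages then test a single integer (found & TAG_QUERY) instead of
re-searching the template, which is the point of keeping the flags.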
Why C, PHP or others (at last)...
C and PHP use similar syntax and function calls, and PHP seems like a
mix of shell, Perl, Java, and C. The idea is fine, but why use a
programming tool for creating dynamic web content? I don't use C for
creating a Word document, I use Word! I don't use C for storing data
either; I prefer MySQL, or perhaps MS Access. The final document,
outputting dynamically resolved data, should not look at first glance
like the source code for the entire web server. Do you get my point? It
just isn't right.
In a Windows environment the discussion is Cold Fusion vs. ASP. CF wins
every time; it is designed to do this. ASP is a subset of Visual Basic
Script designed to do completely different things. I don't think a site
builder should need to be any more of a programmer than his or her
colleague writing a document in Word.
This is why we released webmerger, and why we keep bettering the parser
continually.
If you have gotten this far in this mail, you are either interested in
the subject or have too much spare time.
It would be interesting if someone else could benchmark a webmerger
result fetch against PHP's.
Thanks for your time