On Nov 20, 2009, at 1:36 PM, Kurt D. Knudsen wrote:
> In the current state, it can list a specific directory,
> create the md5 hashes, and output to a text file in about 8 seconds.
> When I use my MySQL version, I didn't let it finish because it already
> passed the 5 minute mark and from the looks of it, it wasn't close to
> finishing.
Writing sequential data out to a flat file will almost always be faster than inserting
rows into a remote DB server. The DB alternative carries several overheads that the
flat-file one doesn't: index updates, random disk seeks, IPC latency, binary data
serialization...
The point of using a DB server is to allow concurrent access to data from multiple
clients, to change the schema at run time easily, and to query a subset of rows without
doing a full table scan. You hear speed talked about in the context of DB servers so
often precisely because it's a bottleneck, not because they're inherently fast. :)
I'm not suggesting that your hundreds-to-one speed hit is normal or acceptable, just
making sure you realize you won't get back to 1:1, and that you're using a client-server
DBMS for the right reasons.
If you don't need concurrent access from multiple machines, you might be better off with
an embedded, non-server-based DB engine like SQLite.
> I'm sure there has to be a more efficient way
> to insert these rows into the database and speed this thing up a bit.
The fastest way in pure MySQL++ is probably Query::insertfrom(). See examples/ssqls6.cpp.
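To show why batching helps, here is a minimal sketch of the general idea: pack many rows into each INSERT statement so you pay the client-server round trip once per batch instead of once per row. It uses only the standard library and is not MySQL++ code. The FileHash struct, the file_hashes table with its path and md5 columns, and both function names are made up for illustration, and the escaping is deliberately simplistic. Real code should use the library's own quoting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row type: one file path and its MD5 hex digest.
struct FileHash {
    std::string path;
    std::string md5;
};

// Simplified escaping for illustration only: backslash-escapes single
// quotes and backslashes. Not a substitute for the client library's quoting.
std::string escape_sql(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\'' || c == '\\') {
            out += '\\';
        }
        out += c;
    }
    return out;
}

// Group rows into multi-row INSERT statements, at most max_rows rows each.
// The table and column names are assumptions for this example.
std::vector<std::string> build_batched_inserts(const std::vector<FileHash>& rows,
                                               std::size_t max_rows)
{
    std::vector<std::string> statements;
    if (max_rows == 0) {
        return statements;  // nothing sensible to build
    }

    std::string current;
    std::size_t rows_in_current = 0;
    for (const FileHash& row : rows) {
        if (rows_in_current == 0) {
            current = "INSERT INTO file_hashes (path, md5) VALUES ";
        } else {
            current += ", ";
        }
        current += "('" + escape_sql(row.path) + "', '" + escape_sql(row.md5) + "')";

        if (++rows_in_current == max_rows) {
            statements.push_back(current);  // batch is full, start a new one
            rows_in_current = 0;
        }
    }
    if (rows_in_current != 0) {
        statements.push_back(current);  // flush the final partial batch
    }
    return statements;
}
```

Each resulting string would then be sent to the server as a single query. Capping by row count is just the simplest policy to show. In practice the cap that matters is the server's maximum packet size, so a size-based limit is the safer choice.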