On Nov 20, 2009, at 1:36 PM, Kurt D. Knudsen wrote:
> In the current state, it can list a specific directory,
> create the md5 hashes, and output to a text file in about 8 seconds.
> When I use my MySQL version, I didn't let it finish because it already
> passed the 5 minute mark and from the looks of it, it wasn't close to
Writing serial data out to flat files will always be faster than inserting rows into
remote DB servers. There are several overheads in the DB alternative missing from the
flat-file one: index updates, random disk seeks, IPC latency, binary data serialization...
The point of using a DB server is to allow concurrent access to data from multiple
clients, to change the schema at run time easily, and to query a subset of rows without
doing a full table scan. You hear speed talked about so often in the context of DB servers
precisely because it's a bottleneck, not because DB servers are inherently fast. :)
I'm not suggesting that your hundreds-to-one speed hit is normal or acceptable, just
making sure you realize you won't get back to 1:1, and that you're using a client-server
DBMS for the right reasons.
If you don't need concurrent access from multiple machines, you might be better off with
some non-server-based DB system, like SQLite.
> I'm sure there has to be a more efficient way
> to insert these rows into the database and speed this thing up a bit.
The fastest way in pure MySQL++ is probably Query::insertfrom(). See examples/ssqls6.cpp.
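
To show why batching helps, here is a rough sketch of the general idea: pack many rows into each INSERT statement so you pay the client-server round trip once per batch rather than once per row. This is plain standard C++, not MySQL++'s actual implementation. The table name file_hashes, its columns, the FileHash type, the quote() helper and the max_bytes limit are all made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: a file path and its MD5 hex digest.
using FileHash = std::pair<std::string, std::string>;

// Naive quoting for illustration only; real code should rely on the
// client library's own escaping rather than this helper.
std::string quote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'' || c == '\\') out += '\\';
        out += c;
    }
    out += '\'';
    return out;
}

// Build multi-row INSERT statements, starting a new statement whenever
// adding the next row would push the current one past max_bytes.
// A single row larger than max_bytes still gets a statement of its own.
std::vector<std::string> build_batched_inserts(
        const std::vector<FileHash>& rows, std::size_t max_bytes) {
    // Hypothetical table and column names.
    const std::string head = "INSERT INTO file_hashes (path, md5) VALUES ";
    std::vector<std::string> statements;
    std::string current;
    for (const auto& row : rows) {
        const std::string tuple =
            "(" + quote(row.first) + "," + quote(row.second) + ")";
        if (!current.empty() &&
                current.size() + 1 + tuple.size() > max_bytes) {
            statements.push_back(current);
            current.clear();
        }
        if (current.empty()) {
            current = head + tuple;
        } else {
            current += "," + tuple;
        }
    }
    if (!current.empty()) statements.push_back(current);
    return statements;
}
```

Each returned string would then be sent to the server as one query. In a sketch like this you would want max_bytes to stay below the server's max_allowed_packet setting.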