Hi Ingo, Serg!
A big thank you for your comments and insight. My research is
summarized in the first in a blog article series:
http://www.jpipes.com/index.php?/archives/182-Internals-of-MyISAMs-Concurrent-Insert-Functionality-Part-I.html
I welcome suggestions and corrections, of course :)
Thanks again!
Jay
Ingo Strüwing wrote:
> Hi Jay,
>
> Jay Pipes wrote:
> ...
>> 1) I notice that if the append_insert_at_end property of the share's
>> info struct is not set, then both a file_read() *and* a file_write() are
>> called when writing the data -- presumably because a seek is needed to
>> find the first deleted record slot...
>
> If append_insert_at_end is not set *and* there is a deleted record. Then
> we read the first deleted record to get at the pointer to the next
> deleted record. This is then assigned to info->s->state.dellink (for the
> next insert). Only then we can overwrite the deleted record.
>
> Note that this means that we cannot do concurrent insert when entering
> this branch. Concurrent inserts *always* append at end of file. So only
> the 'else' branch can be used for concurrent inserts, but also for
> non-concurrent inserts when there is no deleted record.
>
> ...
>> Is this the reason why a lock is needed when concurrent inserts aren't
>> enabled? In other words, does the file_read() actually lock the file
>> descriptor or the file in the place seeked to?
>
> No. On table lock level it is assured that only *one* modifying
> operation can touch a table at a time. If there are multiple concurrent
> inserts, they have to queue up. But multiple reads are allowed on a
> table in this situation.
>
> The trick is that every read-only statement takes a copy of
> info->state->data_file_length when it starts. If concurrent inserts
> increase this value, the read operations will see the new records, but
> ignore them because they are beyond the saved data_file_length.
>
>> 2) If file locking doesn't happen in either file_read() nor file_write()
>> (which I could not find any evidence that in a PREAD environment they
>> do...) when exactly does the file locking occur? I see that in
>> mi_lock_database() (in /storage/myisam/mi_locking.c) there are calls to
>> my_lock(), but only for the share->kfile, which I thought was just the
>> .frm file?
>
> Correct. Neither file_read() nor file_write() nor pread() do lock anything.
>
> my_lock() *can* do file locking, but this affects other *processes* only
> (not other threads in the same server). my_lock() is a NOOP if the
> server is started with skip-external-locking, which is the default.
>
> I hope, my above explanation is sufficient. Locking is handled on the
> table level. At any time there can only be *one* modifying operation on
> a table. Concurrent insert allows *reading* operations to run in parallel.
>
> share->kfile is the index file (.MYI). The .frm file is only open for a
> short time during open of the table. Since every use of a MyISAM table
> needs access to data file *and* index file, it is sufficient to
> (external) lock one of them: the index file.
>
>> Where does the .MYD file get locked during non-concurrent inserts/writes?
>
> The .MYD file is never locked. All locking is done on table level.
> Non-concurrent inserts have an exclusive lock on the table. Concurrent
> inserts allow parallel reads.
>
> If you really want to find the locking code, do the following.
>
> Execution of SQL statements almost always starts at
> sql_parse.cc:mysql_execute_command(). For INSERT search for the switch
> label SQLCOM_INSERT.
>
> Most statements call open_and_lock_tables() near their beginning.
> Sometimes this is done in a top-level function. In our example this is
> sql_insert.cc:mysql_insert().
>
> sql_base.cc:open_and_lock_tables() is quite trivial. Dive into
> sql_base.cc:lock_tables(). Here you will find (I hope) some code that
> converts the table list into an array ('start'). It then calls
> mysql_lock_tables().
>
> lock.cc:mysql_lock_tables() first calls lock_external(). This is where
> the engines are called to prepare for a table lock. For MyISAM,
> mi_locking() is called, where a file locking can take place, if external
> locking is enabled.
>
> lock.cc:mysql_lock_tables() then calls thr_multi_lock().
>
> mysys/thr_lock.c:thr_multi_lock() is the real fancy stuff. Or should I
> better point to thr_lock(), called by thr_multi_lock() for each table?
>
> mysys/thr_lock.c:thr_lock() handles all the cases of distinct lock
> types, has precedence rules, maintains wait queues, etc... Enjoy. :)
>
> Through this function all table locking is done. It is responsible for
> letting readers through when a concurrent insert is going on, but pushes
> other writers in a wait queue. Or lets one concurrent insert through
> when only readers are on the table.
>
> Now, if you have learned that the locking for concurrent inserts is done
> on a layer above MyISAM, it is time to relativize your knowledge. ;-)
>
> The trick with remembering the end of file, and ignore everything past
> it, does only work with the data file. Index entries (keys) for the new
> records will usually have to go somewhere between other keys to keep the
> sort order.
>
> Index locking is done on MyISAM level, but only in cases where
> concurrent inserts are possible, by means of a 'key_root_lock' per
> index. You will find that name in several files, e.g. mi_write.c.
>
> Indexes are maintained before the data file. One after the other. An
> index is locked only as long as it is used. This allows for maximum
> parallelism with readers.
>
> If not completely confused yet, I could go into more details? ;-)
>
>> Your insight would be GREATLY appreciated, as my eyesight is beginning
>> to blur as I grep my way through countless function pointer redirections
>> and redefines in /storage/myisam and /mysys
>
> Oh yes. I understand *very well* what you mean. ;-)
>
> Regards
> Ingo