My question is basically whether it is safe, when using parallel processing in R, to have multiple threads accessing an SQLite database simultaneously. I understand that SQLite is a file level dbs, so every connection gets access to the whole db. So, it is possible to have multiple connections going simultaneously (e.g., via the SQLite3 front end and, in R, via RSQLite's dbConnect() and via dplyr's src_sqlite()). I guess that this is OK so long as there is a single user who can assure that commands submitted one way are completed before other commands are submitted. But with multithreading, it would seem possible that one thread might submit a command to an SQLite db while a command submitted by another thread might not have completed. Does the underlying SQLite engine serialize received commands so that it is assured that one command is completed before the next one is processed, so as to avoid creating an inconsistent status of the database? I have read the SQLite documentation on locking and "ACID," and as I understand this documentation, the answer appears to be "Yes." But I want to be sure that I have understood things correctly. (Another question is whether it is safe to have separate threads submitting commands simultaneously that actually change the database. Since one can't control the exact timing by which the two threads submit their commands, I assume that using parallel processes that might change an SQLite data table in an inconsistent way would not be a good idea -- e.g., having one thread insert a record into a table and another thread doing a SELECT on the same table.) Thanks in advance for any help or clarification of this issue.
Aucun commentaire:
Enregistrer un commentaire