I am running into disk I/O problems when deploying to a Docker instance, and I haven't figured out how to solve them yet.
Inside the Docker instance, a process uses Shove with sqlite as the backend to persist data to disk through a dictionary interface (I am saving around 200,000 records).
The problem is that I am using Shove like a dictionary to save the data, which turns every __setitem__ into a separate INSERT in sqlite.
Now, Shove offers a sync parameter (an integer) to control when the actual writes to disk happen, but that only buffers the items in a temporary in-memory dictionary; when the flush happens, it still performs one disk write per buffered item, which is exactly what I want to avoid.
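For reference, this is roughly the pattern I have now. The file path and sync value are placeholders, and I am going from Shove's documented URI/keyword style, so treat the exact signature as an assumption:

    from shove import Shove

    # Shove exposes a dict interface over a sqlite store
    # (URI style per the shove docs; path and sync value are illustrative)
    store = Shove('sqlite:///data.db', sync=1000)

    for i in range(200000):
        # each assignment ends up as one INSERT; with sync=1000 the writes
        # are buffered in memory first, but the flush still issues them
        # one by one
        store['key-%d' % i] = 'value-%d' % i

    store.close()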
I checked that sqlite offers an executemany operation, which should reduce the per-call overhead, but does it actually result in a single (or at least close to a single) disk operation? I can't seem to find a clear explanation of this.
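From what I understand (and this is the assumption I'd like confirmed), the savings would come less from executemany itself and more from batching everything into one transaction, since SQLite only commits to disk once per transaction. This is the kind of thing I'd try with the plain sqlite3 module; the table name and schema are just for illustration:

    import sqlite3

    conn = sqlite3.connect('data.db')  # path is illustrative
    conn.execute(
        'CREATE TABLE IF NOT EXISTS store (key TEXT PRIMARY KEY, value TEXT)'
    )

    records = [('key-%d' % i, 'value-%d' % i) for i in range(1000)]

    # using the connection as a context manager wraps the inserts in a
    # single transaction, so SQLite commits to disk once at the end
    # instead of once per row
    with conn:
        conn.executemany(
            'INSERT OR REPLACE INTO store (key, value) VALUES (?, ?)',
            records,
        )
    conn.close()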
I also saw that a bulk insert can be done in sqlite, but I can't find it documented for Python, so should I just call execute with one very big multi-row INSERT string? (I want to be able to save at least 1,000 records at a time.)
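This is what I had in mind for the "very big string" approach: one multi-row INSERT built from placeholders rather than string concatenation. I've read that SQLite caps the number of bound parameters per statement (historically 999) and that multi-row VALUES needs SQLite 3.7.11 or later, so I'm assuming both constraints apply here:

    import sqlite3

    conn = sqlite3.connect('data.db')  # same illustrative table as above
    conn.execute(
        'CREATE TABLE IF NOT EXISTS store (key TEXT PRIMARY KEY, value TEXT)'
    )

    records = [('key-%d' % i, 'value-%d' % i) for i in range(400)]

    # build one statement: "... VALUES (?, ?), (?, ?), ..."
    # with two columns per row, a 999-parameter limit caps a batch
    # at roughly 499 rows, so larger batches would need chunking
    placeholders = ', '.join(['(?, ?)'] * len(records))
    flat_params = [field for row in records for field in row]

    with conn:
        conn.execute(
            'INSERT OR REPLACE INTO store (key, value) VALUES ' + placeholders,
            flat_params,
        )
    conn.close()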
So, what is the best way to write these records to disk with the minimum number of disk operations?