I have written a web crawler that does some online pre-processing and stores the data in a database. The database layout is quite simple; let me outline it (a rough schema sketch follows the list):
- The table `dispatched` lists items that were dispatched at a particular time, which is stored in one of the columns. There are 15 other columns that describe the items: one of them is of type TEXT and the rest are of type INT.
- The table `missed` lists the time periods that the crawler failed to watch, e.g. because of network problems. It has 4 columns of type INT and 2 columns of type TEXT.
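To make the layout concrete, here is a rough sketch of the schema. The column names are placeholders, since my actual names shouldn't matter for the question; only the types follow the description above.

```python
# Rough sketch of the layout described above; column names are placeholders.
import sqlite3

conn = sqlite3.connect("crawler.db")  # file name is illustrative
conn.executescript("""
    CREATE TABLE IF NOT EXISTS dispatched (
        dispatch_time INTEGER,  -- the time the item was dispatched
        description   TEXT,     -- the single TEXT column
        col_01 INTEGER,
        col_02 INTEGER,
        col_03 INTEGER          -- the real table has 14 INT columns here;
                                -- the rest are omitted for brevity
    );
    CREATE TABLE IF NOT EXISTS missed (
        gap_start  INTEGER,
        gap_end    INTEGER,
        gap_info_1 INTEGER,
        gap_info_2 INTEGER,     -- 4 INT columns
        reason     TEXT,
        note       TEXT         -- 2 TEXT columns
    );
""")
conn.commit()
```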
I let it run for many hours, twice. Both times, the resulting database file had a total size of exactly 256000 bytes, at least according to ls -l. From the recorded data I can see that 1-3 items are normally recorded per minute, but starting from a particular point in time, no new items are listed any more.
To me, this sounds as if I hit some limitation. Given that the resulting database file size was exactly 1000 * 2^8 bytes both times, I would suspect a limit on the maximum database file size, but the documentation doesn't mention anything like that.
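In case it is relevant, this is how I understand such a limit could be inspected from Python (a sketch with an illustrative file name); as far as I know, max_page_count times page_size gives the effective maximum file size of an SQLite database.

```python
# Sketch: query the SQLite pragmas relevant to a file-size limit.
import sqlite3

conn = sqlite3.connect("crawler.db")  # file name is illustrative
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
page_count = conn.execute("PRAGMA page_count").fetchone()[0]
max_page_count = conn.execute("PRAGMA max_page_count").fetchone()[0]
print("page_size=%d page_count=%d max_page_count=%d"
      % (page_size, page_count, max_page_count))
```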
At the moment SQLite stopped appending new rows to the database, there were
- 5187 rows in `dispatched` and 3 rows in `missed` during the first run, and
- 5212 rows in `dispatched` and 2 rows in `missed` during the second run.
I'm using the sqlite3 module of Python 2.7. I'd appreciate any help that points out what was going on, why SQLite stopped appending new rows after hitting 256000 bytes, and how I can fix this.
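For reference, a simplified sketch of the kind of insert code involved (placeholder names, far fewer columns than the real tables, and not my exact code):

```python
# Simplified sketch of the insert path; with the sqlite3 module's default
# settings, INSERTs run inside an implicit transaction, so rows are only
# persisted to the database file once commit() is called.
import sqlite3

conn = sqlite3.connect("crawler.db")  # file name is illustrative

def record_item(dispatch_time, description, value):
    conn.execute(
        "INSERT INTO dispatched (dispatch_time, description, col_01) "
        "VALUES (?, ?, ?)",
        (dispatch_time, description, value),
    )
    conn.commit()
```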