I have some array data (in Python) which I want to add to an SQL database, using the data index as its primary key. I am doing this for multiple columns (shown below for a column named theta0, but really I'm stepping through columns named theta0, theta1, etc.), and each time I would like the data either to be added to the existing row (if the primary key exists) or to create a new row (if the primary key does not exist). Here's the meat of the code, where nonzero_index is a list of indices:
# Either inserts a new id with the given value, or ignores it if the id already exists
c.executemany("INSERT OR IGNORE INTO "+tablename+" (id, theta0) VALUES (?, ?)", [(i, data[i]) for i in nonzero_index])
# Updates the value for ids that already existed (the rows the INSERT OR IGNORE skipped)
c.executemany("UPDATE "+tablename+" SET theta0=? WHERE id=?", [(data[i], i) for i in nonzero_index])
The problem is that the nonzero_index list is typically very long, ~8 million integers, and this code is very slow, taking 4-5 hours for ~8 million indices. How can I speed this up? Would it be faster to pre-determine which indices already exist in the database versus which need to be new rows, e.g. using Python sets?
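To make that last idea concrete, here is roughly what I have in mind (just a sketch; existing, to_insert, and to_update are illustrative names, and conn is the sqlite3 connection that c came from):
# Read the existing primary keys once, then split nonzero_index with set lookups
existing = set(row[0] for row in c.execute("SELECT id FROM "+tablename))
to_insert = [i for i in nonzero_index if i not in existing]
to_update = [i for i in nonzero_index if i in existing]

# Plain INSERT for brand-new ids, UPDATE only for ids that already exist
c.executemany("INSERT INTO "+tablename+" (id, theta0) VALUES (?, ?)",
              [(i, data[i]) for i in to_insert])
c.executemany("UPDATE "+tablename+" SET theta0=? WHERE id=?",
              [(data[i], i) for i in to_update])
conn.commit()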