Wednesday, 8 July 2015

Replacing duplicates with pandas to_sql (SQLite)

I am appending pandas DataFrames to a SQLite table. The table's primary key is the combination of:

Datetime | UserID | CustomerID

My issue is that I sometimes receive a new file containing rows that are already in the table, and I need to append that file to the existing SQLite table. I am not reading the table into memory, so I can't call drop_duplicates in pandas. (For example, one file always holds month-to-date data and is sent to me every day.)

How can I ensure that I only append rows that are unique with respect to my primary key? Does pandas to_sql have an option to insert or replace when I append the new data?
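To make the question concrete, here is a minimal sketch of one possible workaround, not a built-in to_sql feature: write each incoming file to a scratch table, then let SQLite's INSERT OR IGNORE skip rows whose primary key already exists. The table names (events, events_staging), the Amount column, and the helper append_new_rows are hypothetical and chosen only for illustration; the sketch also assumes the target table was created with the composite primary key and that Datetime is stored as text.

```python
import sqlite3

import pandas as pd


def append_new_rows(df, conn, table="events", staging="events_staging"):
    """Append only the rows of df whose primary key is not yet in `table`."""
    # Write the incoming batch to a scratch table, replacing any earlier batch.
    df.to_sql(staging, conn, if_exists="replace", index=False)
    cols = ", ".join(f'"{c}"' for c in df.columns)
    with conn:  # commits on success, rolls back on error
        # INSERT OR IGNORE silently skips rows that would violate the PRIMARY KEY.
        conn.execute(
            f'INSERT OR IGNORE INTO "{table}" ({cols}) '
            f'SELECT {cols} FROM "{staging}"'
        )


conn = sqlite3.connect(":memory:")
# Hypothetical target table with the composite primary key from the question.
conn.execute(
    """
    CREATE TABLE events (
        Datetime   TEXT,
        UserID     INTEGER,
        CustomerID INTEGER,
        Amount     REAL,
        PRIMARY KEY (Datetime, UserID, CustomerID)
    )
    """
)

# Two month-to-date files: the second repeats the first and adds one new day.
first = pd.DataFrame(
    {
        "Datetime": ["2015-07-01", "2015-07-02"],
        "UserID": [1, 1],
        "CustomerID": [10, 10],
        "Amount": [5.0, 7.0],
    }
)
second = pd.DataFrame(
    {
        "Datetime": ["2015-07-01", "2015-07-02", "2015-07-03"],
        "UserID": [1, 1, 1],
        "CustomerID": [10, 10, 10],
        "Amount": [5.0, 7.0, 9.0],
    }
)

append_new_rows(first, conn)
append_new_rows(second, conn)
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Swapping INSERT OR IGNORE for INSERT OR REPLACE in the same statement would overwrite the stored row with the newer one instead of keeping the old one, which may suit corrected month-to-date files better.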

Also, should I specify dtypes in pandas before writing to SQL? I got error messages when I tried to write a DataFrame with categorical dtypes to SQLite.
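As an illustration of the dtype question, the sketch below casts categorical columns to plain Python objects before writing and passes explicit SQLite column types through the dtype argument of to_sql. The helper decategorize, the users table, and the Region column are hypothetical; whether the cast is strictly required may depend on the pandas version in use, so treat this as one way to sidestep the error rather than a confirmed fix.

```python
import sqlite3

import pandas as pd


def decategorize(df):
    """Return a copy of df with every categorical column cast to object dtype."""
    out = df.copy()
    for col in out.select_dtypes(include="category").columns:
        # Casting to object keeps the underlying values and drops the category wrapper.
        out[col] = out[col].astype(object)
    return out


df = pd.DataFrame(
    {
        "UserID": [1, 2],
        "Region": pd.Categorical(["north", "south"]),
    }
)
clean = decategorize(df)

conn = sqlite3.connect(":memory:")
# With a plain sqlite3 connection, dtype values are SQL type names given as strings.
clean.to_sql(
    "users",
    conn,
    index=False,
    dtype={"UserID": "INTEGER", "Region": "TEXT"},
)
region_values = [
    row[0] for row in conn.execute("SELECT Region FROM users ORDER BY UserID")
]
```

Declaring the column types explicitly also keeps the schema stable across files, instead of letting each batch's inferred dtypes decide it.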
