How do you set up a valid auto-incrementing integer primary key on a table if you want to join it with separate files? I get data like this on a daily basis:
Interaction data:
Date | PersonID | DateTime | CustomerID | Other values...
The primary key there would be PersonID + DateTime + CustomerID. If I have an integer key, how can I get that to relate back to another table? I want to know the rows where a specific person interacted with a specific customer so I can tie back those pieces of data together into one master-file.
Survey return data:
Date | PersonID | DateTime | CustomerID | Other values...
I am normally processing all raw data first in pandas before loading it into a database. Some other files also do not have a datetime stamp and only have a date. It is rare for one person to interact with the same customer on the same day so I normally drop all rows where there are duplicates (all instances) so my sample of joins are just purely unique.
Other Data:
Date | PersonID | CustomerID | Other values...
I can't imagine how I can set it up so I know row 56,547 on 'Interaction Data' table matches with row 10,982 on 'Survey Return Data' table. Or should I keep doing it the way I am with a composite key of three columns?
Aucun commentaire:
Enregistrer un commentaire