lundi 12 janvier 2015

Creating SQLAlchemy engine from multiple SQLite databases in memory

I am very new to databases in general, so pardon my question if it is very simple. Right now I have to work on a task to do analytics and visualization from data in databases. I am forced to work with legacy code so my approach is something like this:




  1. Create an SQLite database (call it A) in memory (I have to do this because I am given huge chunks of SQL scripts)




  2. Create another distinct SQLite database (call it B) again in memory. With some looping, query and extract information out of database A, do a bit of combinations to them, and insert them into database B. Database B is now the database from which we will take data out from to perform analyses.




Starting from steps 3 and 4, I want to switch over to Python's pandas, a new move which the old codes did not have.




  1. Convert that entire SQLite database B into a Python pandas DataFrame object




  2. Perform analyses from that data taken from the DataFrame object (I do not want to continue these analyses with the SQLite database because it is much more tedious having to query it each time I need to extract the data I need, compared to if I were to use a DataFrame)




However, the difficulty I am facing is that in moving from step 2 to 3, I do not know how I could actually safely have my database B be converted into an SQLAlchemy engine that would make it ready to be fed in pandas to form a DataFrame.


As I read from the pandas documentation on this, and also the SQLAlchemy documentation on SQLite and in-memory databases, all I am supposed to do to turn my SQLite database into the SQLAlchemy engine is to tell SQLAlchemy that I want it to look for the database in memory



The sqlite :memory: identifier is the default if no filepath is present. Specify sqlite:// and nothing else:




#in-memory database
e = create_engine('sqlite://')


However, for my case here, there are 2 databases A and B sitting in memory. In this case, how can I tell SQLAlchemy that I want it to take database B and not A? Is this a valid concern?


I hope to refactor the codes when I have the time. Can anyone suggest to me a more elegant way to move from step 1 to 4?


Aucun commentaire:

Enregistrer un commentaire