I have 2 tables named geostat and res. I want to do an inner join on those two tables and store the result in to tables. I know the sql queryu for the same but when I do it in python using sqlite, It seems to take longer time to execute the query (approx 1hr which is not good). I wonder since it has an inner join does it take longer time to execute? How do I optimize the query?
My sql query is :
select res.[MaxOfState count percent], res.[grpno.], geostat.[state code] from res inner join geostat on res.[MaxOfState count percent] = geostat.[count percent] AND res.[grpno.] = geostat.[grpno.]
My tables are as follows:
geostat:
count percent grpno. state code
0 14.78 1 CA
1 0.00 2 CA
2 8.80 3 CA
3 9.60 4 FL
4 55.90 4 MA
5 0.00 2 FL
6 0.00 6 NC
7 0.00 5 NC
8 6.90 1 FL
9 59.00 4 MA
res:
grpno. MaxOfcount percent
0 1 14.78
1 2 0.00
2 3 8.80
3 4 59.00
4 5 0.00
5 6 0.00
Now when i execute the inner join query it takes more than an hour to execute. My code in python is as below:
import pandas as pd
from sqlalchemy import create_engine # database connection
disk_engine = create_engine('sqlite:///fnma.db') # Initializes database with filename fnma.db in current directory
inner_join_geostat = pd.read_sql('select res.[MaxOfState count percent], res.[grpno.], geostat.[state code] from res inner join geostat on res.[MaxOfState count percent] = geostat.[count percent] AND res.[grpno.] = geostat.[grpno.] ',disk_engine)
inner_join_geostat.to_sql('inner_join_geostat', disk_engine, if_exists='replace')
Can anyone tell me how to optimise the query/code?
Aucun commentaire:
Enregistrer un commentaire