
I have a large database (more than 5 million rows) accessible with Microsoft Access. So far I am able to pull the entire table into a data frame with Python, but it takes very long to process (more than 15 minutes), which is nuts as I only need to work on a much smaller section of the entire table ...

import pyodbc
import pandas as pd


conn_str = (
    r'DRIVER={Adaptive Server Enterprise};'
    r'DBQ=\\path_where_the_table_is_located.mdb;'
    r'SERVER=server_name;'
    r'DATABASE=...;'
    r'UID=xxx;'
    r'PWD=xxx;'
    r'PORT=xxx')


con = pyodbc.connect(conn_str)
cursor = con.cursor()

# read the whole table into a data frame
df = pd.read_sql('select * from table_to_be_converted_into_a_df', con)

How can I enhance my query above to only request a smaller portion of the entire table and run it much faster?

df1 = df.loc[df['date'] == '2021-07-07']

This is the code I run to shrink the data frame once it is loaded. I would like to add this filter somehow to the initial query so that I ONLY query the data I need and it runs much faster.

1 Answer


Altering the SQL query itself such that it only returns rows of that date would look something like...

select * from table_to_be_converted_into_a_df where date = '2021-07-07';
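On the Python side, that filter can be wired into pd.read_sql instead of building the date into the SQL string by hand. A rough sketch, reusing the connection string from the question (the table and column names are just the placeholders used above):

import pyodbc
import pandas as pd

con = pyodbc.connect(conn_str)

# The database does the filtering, so only rows for the requested date
# travel back to pandas. The '?' placeholder is filled in by the driver.
query = 'select * from table_to_be_converted_into_a_df where date = ?'
df = pd.read_sql(query, con, params=['2021-07-07'])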

This will reduce the total amount of data returned from the DB to your python script—which could significantly speed up your script. However, if your table table_to_be_converted_into_a_df does NOT have an index on the date column, then your query will still be scanning the entire table, which may take a while.

If that's the case, consider adding an index to the date column.
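If the table really does live in an Access (.mdb) file, one way to add such an index is a plain SQL DDL statement run over the same connection. A sketch under those assumptions (the index name idx_date is just illustrative, and you need write access to the file):

cursor = con.cursor()
# Index the date column so the WHERE clause above can use it instead of
# scanning the whole table. [date] is bracketed because DATE is a
# reserved word in Access SQL.
cursor.execute('CREATE INDEX idx_date ON table_to_be_converted_into_a_df ([date])')
con.commit()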


Comments

Amazing! I tried your code and I get my result in less than 3 seconds instead of 20 minutes, sounds like a good improvement to me :)
@chris is it possible to add an index in an Access DB?
