I have a large database (more than 5 million rows) that I access through Microsoft Access. So far I can pull the entire table into a DataFrame with Python, but this takes a very long time (more than 15 minutes), which is nuts because I only need to work on a much smaller section of the table.
import pyodbc
import pandas as pd
conn_str = (
r'DRIVER = {Adaptive Server Enterprise};'
r'DBQ = \\path_where_the_table_is_located.mdb;'
r'SERVER = server_name;'
r'Database = ...;'
r'UID = xxx;'
r'PWD = xxx;'
r'port = xxx')
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
df = pd.read_sql('select * from table_to_be_converted_into_a_df', conn)
How can I change the query above so that it only requests a small portion of the table and runs much faster?
df1 = df.loc[df['date'] == '2021-07-07']
This is the code I run to shrink the DataFrame once it has loaded. I would like to move this filter into the initial query so that it ONLY fetches the data I need and runs much faster.
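To make it concrete, here is a rough sketch of the kind of thing I am after: the date filter sits in a WHERE clause and the value is passed as a `?` parameter, so only matching rows are returned. Everything here is an assumption for illustration. The helper `build_date_query` and the table `demo_table` are made up, and an in-memory SQLite database stands in for my real connection so the snippet runs on its own. I have not verified that this works the same way against my Access table.

```python
import sqlite3


def build_date_query(table, date_column):
    """Return a SELECT that filters on one date column (hypothetical helper).

    Table and column names cannot be bound as parameters, so they are
    interpolated here; only pass trusted names. The value to match is
    left as a '?' placeholder, which both sqlite3 and pyodbc accept.
    """
    return f"SELECT * FROM [{table}] WHERE [{date_column}] = ?"


# Stand-in database (assumption): an in-memory SQLite table with text dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_table (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO demo_table VALUES (?, ?)",
    [(1, "2021-07-06"), (2, "2021-07-07"), (3, "2021-07-07")],
)

query = build_date_query("demo_table", "date")

# Only the rows for the requested day are fetched from the database.
rows = conn.execute(query, ("2021-07-07",)).fetchall()
print(rows)

# With pandas, the same query and parameter would be passed as:
#   df1 = pd.read_sql(query, conn, params=("2021-07-07",))
```

If the real column is a Date/Time type, the parameter may need to be a `datetime.date` object rather than a string; that part is a guess on my side.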
