
I have a pandas dataframe of approx 300,000 rows (20mb), and want to write to a SQL server database.

I have the following code but it is very very slow to execute. Wondering if there is a better way?

import pandas
import sqlalchemy

engine = sqlalchemy.create_engine('mssql+pyodbc://rea-eqx-dwpb/BIWorkArea?driver=SQL+Server')

df.to_sql(name='LeadGen Imps&Clicks', con=engine, schema='BIWorkArea',
          if_exists='replace', index=False)

1 Answer


If you want to speed up the process of writing to the SQL database, you can pre-set the SQL types of the table columns based on the dtypes of your pandas DataFrame:

from sqlalchemy import types

# Map each DataFrame column to an explicit SQL type so to_sql does not
# fall back on generic defaults (object columns otherwise become TEXT/VARCHAR(max))
d = {}
for k, v in zip(df.dtypes.index, df.dtypes):
    if v == 'object':
        # Size VARCHAR to the longest string found in the column
        d[k] = types.VARCHAR(int(df[k].str.len().max()))
    elif v == 'float64':
        d[k] = types.FLOAT(53)   # SQL Server allows a float precision of at most 53
    elif v == 'int64':
        d[k] = types.INTEGER()

Then pass the mapping to to_sql through the dtype argument:

df.to_sql(name='LeadGen Imps&Clicks', con=engine, schema='BIWorkArea',
          if_exists='replace', index=False, dtype=d)
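
For illustration, here is a minimal self-contained sketch of the same approach. The small in-memory DataFrame, its column names, and the connection string are placeholders standing in for your real 300,000-row data and server; only the dtype-mapping technique itself comes from the answer above.

import pandas as pd
import sqlalchemy
from sqlalchemy import types

# Hypothetical data standing in for the real DataFrame
df = pd.DataFrame({
    'campaign': ['brand_a', 'brand_b', 'brand_c'],
    'impressions': [1200, 3400, 560],
    'ctr': [0.012, 0.034, 0.0056],
})

# Build the column -> SQL type mapping as described above
d = {}
for k, v in zip(df.dtypes.index, df.dtypes):
    if v == 'object':
        d[k] = types.VARCHAR(int(df[k].str.len().max()))
    elif v == 'float64':
        d[k] = types.FLOAT(53)
    elif v == 'int64':
        d[k] = types.INTEGER()

# Placeholder connection string; replace with your own server and database
engine = sqlalchemy.create_engine('mssql+pyodbc://my-server/MyDatabase?driver=SQL+Server')

df.to_sql(name='LeadGen Imps&Clicks', con=engine, schema='BIWorkArea',
          if_exists='replace', index=False, dtype=d)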