
I'm looking to create a temp table and insert some data into it. I have used pyodbc extensively to pull data, but I am not familiar with writing data to SQL from a Python environment. I am doing this at work, so I don't have the ability to create permanent tables, but I can create temp and global temp tables. My intent is to insert a relatively small dataframe (150 rows x 4 cols) into a temp table and reference it throughout my session; my program structure makes it so that a global variable in the session will not suffice. I am getting the following error when trying the piece below. What am I doing wrong?

pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")

import numpy as np
import pandas as pd
import pyodbc


conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=SERVER;'
                      'Database=DATABASE;'
                      'Trusted_Connection=yes;')

cursor = conn.cursor()

temp_creator = '''CREATE TABLE #rankings (Col1 int, Col2 int)'''

cursor.execute(temp_creator)

df_insert = pd.DataFrame({'Col1' : [1, 2, 3], 'Col2':[4,5,6]})
df_insert.to_sql(r'#rankings', conn, if_exists='append')
read_query = '''SELECT * FROM #rankings'''
df_back = pd.read_sql(read_query,conn)
  • Is it on Windows? Commented Aug 22, 2019 at 19:11
  • Yeah, this is Windows OS. Commented Aug 22, 2019 at 19:17
  • I would try a DSN connection; it's probably the easiest way. Other methods might need some time to try out. Check the documentation for instructions. github.com/mkleehammer/pyodbc/wiki/… Commented Aug 22, 2019 at 19:32
  • I'm not really worried about the connection; that is fine. I've tested this code and the only line that fails is df_insert.to_sql(r'#rankings', conn, if_exists='append'). I can create the temp table and read from it, but I'm having trouble inserting my dataframe. Commented Aug 22, 2019 at 19:39
  • 1
    Sorry, for inserting need to use sqlalchemy. Check stackoverflow.com/questions/25661754/… Commented Aug 22, 2019 at 20:07

2 Answers


Pandas.to_sql is failing there because pandas only supports raw DBAPI connections for SQLite: given a plain pyodbc connection, it probes the sqlite_master catalog, which is why SQL Server reports "Invalid object name 'sqlite_master'". But for SQL Server 2016+/Azure SQL Database there's a better way in any case. Instead of having pandas insert each row, send the whole dataframe to the server in JSON format and insert it in a single statement. Like this:

import numpy as np
import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={Sql Server};'
                      'Server=localhost;'
                      'Database=tempdb;'
                      'Trusted_Connection=yes;')

cursor = conn.cursor()

temp_creator = '''CREATE TABLE #rankings (Col1 int, Col2 int);'''
cursor.execute(temp_creator)

df_insert = pd.DataFrame({'Col1' : [1, 2, 3], 'Col2':[4,5,6]})

df_json = df_insert.to_json(orient='records')
print(df_json)

load_df = """\
insert into #rankings(Col1, Col2)
select Col1, Col2
from openjson(?)
with 
(
  Col1 int '$.Col1',
  Col2 int '$.Col2'
);
"""

cursor.execute(load_df, df_json)

#df_insert.to_sql(r'#rankings', conn, if_exists='append')
read_query = '''SELECT * FROM #rankings'''
df_back = pd.read_sql(read_query, conn)
print(df_back)

which outputs

[{"Col1":1,"Col2":4},{"Col1":2,"Col2":5},{"Col1":3,"Col2":6}]
   Col1  Col2
0     1     4
1     2     5
2     3     6
Press any key to continue . . .
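The 150×4 frame in the question fits comfortably in a single OPENJSON statement, but the same insert can be reused per batch if the frame ever grows large. A minimal sketch of the batching side (the helper name `df_to_json_batches` is mine, not part of pandas):

```python
import pandas as pd

def df_to_json_batches(df, batch_size=1000):
    """Yield the dataframe as JSON strings of at most batch_size rows each."""
    for start in range(0, len(df), batch_size):
        yield df.iloc[start:start + batch_size].to_json(orient='records')

df = pd.DataFrame({'Col1': [1, 2, 3], 'Col2': [4, 5, 6]})
for batch in df_to_json_batches(df, batch_size=2):
    # each batch would be passed to cursor.execute(load_df, batch) as above
    print(batch)
```

Each yielded string is a valid JSON array, so the OPENJSON statement from the answer works unchanged on every batch.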

7 Comments

What about Oracle? Is this also the best way?
You should ask that in a separate question with the [oracle] tag, which is watched by people who might know the answer.
@DavidBrowne-Microsoft could you explain the WITH (...) statement when loading the json?
The with clause in OPENJSON specifies the projection of JSON values into relational columns with SQL Server data types. It's documented here: learn.microsoft.com/en-us/sql/t-sql/functions/…
To use this JSON "trick" with to_sql(), check out the gist here.

Inserting into a temp table using SQLAlchemy works great:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mssql://sql-server/MY_DB?trusted_connection=yes&driver=ODBC+Driver+17+for+SQL+Server')

df1 = pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})
df1.to_sql(name='#my_temp_table', con=engine)
df2 = pd.read_sql_query(sql='select * from #my_temp_table', con=engine)
# Now we can test they are the same:
pd.testing.assert_frame_equal(df1,df2.drop(columns=['index']))
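One caveat worth noting: a #temp table lives only as long as the connection that created it, and an Engine hands out pooled connections, so back-to-back to_sql/read_sql_query calls are only guaranteed to see the same table if they share one connection. A sketch of pinning a single connection for the round trip (demonstrated here against an in-memory SQLite database so it runs anywhere; at work you would pass the mssql engine and a '#'-prefixed table name):

```python
import pandas as pd
from sqlalchemy import create_engine

def roundtrip(df, engine, table):
    # hold one pooled connection so a session-scoped temp table
    # stays visible between the write and the read
    with engine.connect() as conn:
        df.to_sql(name=table, con=conn, index=False, if_exists='replace')
        return pd.read_sql_query(f'select * from {table}', con=conn)

engine = create_engine('sqlite:///:memory:')  # stand-in; use the mssql engine at work
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
df2 = roundtrip(df1, engine, 'my_table')
```

Passing index=False also avoids the extra index column that had to be dropped before the assert_frame_equal check above.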

