
I am trying to write a program in Python 3 that runs a query on a table in Microsoft SQL Server and puts the results into a Pandas DataFrame.

My first attempt was the code below. For a reason I don't understand, the columns do not appear in the order I selected them in the query. Both their order and the labels they are given change from run to run, which breaks the rest of my program:

    import pandas as pd, pyodbc

    result_port_mapl = []

    # Use pyodbc to connect to SQL Database
    con_string = 'DRIVER={SQL Server};SERVER='+ <server> +';DATABASE=' + <database>
    cnxn = pyodbc.connect(con_string)
    cursor = cnxn.cursor()

    # Run SQL Query
    cursor.execute("""
                   SELECT <field1>, <field2>, <field3>
                   FROM result
                   """)

    # Put data into a list
    for row in cursor.fetchall():
        temp_list = [row[2], row[1], row[0]]
        result_port_mapl.append(temp_list)

    # Make list of results into dataframe with column names
    ## FOR SOME REASON HERE row[1] AND row[0] DO NOT CONSISTENTLY APPEAR IN THE 
    ## SAME ORDER AND SO THEY ARE MISLABELLED
    result_port_map = pd.DataFrame(result_port_mapl, columns={'<field1>', '<field2>', '<field3>'})

I have also tried the following code:

    import pandas as pd, pyodbc

    # Use pyodbc to connect to SQL Database
    con_string = 'DRIVER={SQL Server};SERVER='+ <server> +';DATABASE=' + <database>
    cnxn = pyodbc.connect(con_string)
    cursor = cnxn.cursor()

    # Run SQL Query
    cursor.execute("""
                   SELECT <field1>, <field2>, <field3>
                   FROM result
                   """)

    # Put data into DataFrame
    # This becomes one column with a list in it with the three columns 
    # divided by a comma
    result_port_map = pd.DataFrame(cursor.fetchall())

    # Get column headers
    # This gives the error "AttributeError: 'pyodbc.Cursor' object has no 
    # attribute 'keys'"
    result_port_map.columns = cursor.keys()

If anyone could suggest why either of those problems is happening, or provide a more efficient way to do this, it would be greatly appreciated.

Thanks

1 Answer


Why not just use read_sql? For example:

    import pandas as pd, pyodbc

    con_string = 'DRIVER={SQL Server};SERVER='+ <server> +';DATABASE=' + <database>
    cnxn = pyodbc.connect(con_string)
    query = """
      SELECT <field1>, <field2>, <field3>
      FROM result
    """
    result_port_map = pd.read_sql(query, cnxn)
    result_port_map.columns.tolist()
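
As for why the two attempts in the question misbehave, here is a minimal sketch under some assumptions. In the first attempt the labels are passed as a set (`{...}`), and a set has no defined order, so the labels can land on the wrong columns; a list keeps them in order. In the second attempt, DB-API cursors have no `keys()` method, but they do expose `cursor.description`, whose entries start with the column name. The sketch uses the standard-library `sqlite3` module as a stand-in for pyodbc so that it runs without a SQL Server; the helper name `fetch_with_names` and the sample table contents are hypothetical.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs without a SQL Server


def fetch_with_names(cursor):
    """Return (column_names, rows) for the query last executed on cursor."""
    # DB-API cursors expose cursor.description: one 7-item sequence per
    # column, whose first item is the column name, in SELECT order.
    names = [col[0] for col in cursor.description]
    # Convert each row object to a plain tuple so that every field can
    # become its own column later on.
    rows = [tuple(row) for row in cursor.fetchall()]
    return names, rows


# Hypothetical table and data, for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE result (field1 INTEGER, field2 TEXT, field3 REAL)")
con.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [(1, "a", 0.5), (2, "b", 1.5)],
)

cur = con.cursor()
cur.execute("SELECT field1, field2, field3 FROM result")
names, rows = fetch_with_names(cur)
print(names)
print(rows)
```

With pandas available, `pd.DataFrame.from_records(rows, columns=names)` should then build the frame with the labels in query order. `read_sql` does all of this in one call, so the manual route is mainly useful for understanding what went wrong.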

4 Comments

Thanks, that works great. I had never seen the read_sql() function before.
Could you please explain con_string? I only know how to use connection = pyodbc.connect('DSN=B1P HANA;UID=***;PWD=***') and do not know how to do it your way. Thanks
You can get the standard elements of the SQL ODBC connection string here: github.com/mkleehammer/pyodbc/wiki/…. For your ODBC connection, check "Data Sources (ODBC)" in Windows (click the Windows button). You can find the respective data source details there, e.g. data source name (DSN), server name.
pyodbc doesn't seem to be the right way to go. pandas warns: "pandas only support SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested, please consider using SQLAlchemy"
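
One possible way to follow the warning's advice is to wrap the existing ODBC connection string in a SQLAlchemy URL for the `mssql+pyodbc` dialect, which accepts a URL-encoded ODBC string through its `odbc_connect` parameter. The sketch below only builds that URL with the standard library; the helper name `sqlalchemy_url_from_odbc` and the server and database names are placeholders, not part of the original answer.

```python
from urllib.parse import quote_plus


def sqlalchemy_url_from_odbc(odbc_connection_string):
    """Wrap a raw ODBC connection string in a SQLAlchemy mssql+pyodbc URL."""
    # The ODBC string is URL-encoded and handed to the dialect through
    # the odbc_connect query parameter.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_connection_string)


# Placeholder server and database names, not real hosts.
url = sqlalchemy_url_from_odbc(
    "DRIVER={SQL Server};SERVER=my_server;DATABASE=my_database"
)
print(url)
```

Assuming SQLAlchemy is installed, the URL can be passed to `sqlalchemy.create_engine(url)`, and the resulting engine can be given to `pd.read_sql(query, engine)` in place of the raw pyodbc connection.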
