1

I have MySQL sever that we can call MySever there are a hundred databases on that server, we can call the databases: database1, database2, database3… database100. Each database has a table called tablepull. What I am trying to do is get each table within each database called ‘tablepull’ and store the table into a python list, the table “tablepull” has the exact same headers for each database and can save as a dataframe in pandas.

Here are the variables being defined:

from sqlalchemy import create_engine
import pandas as pd

UserName = MyUserName
Password = MyPassword
Server = MyServer
Databases = [‘database1’, ‘database2’, ‘database3’,… ‘database100’]
Table = tablepull

I can easily pull one database table in using this syntax:

engine = create_engine(mssql:// UserName: Password@Server/Table?driver=SQL Server Native Client 11.0')
connection = engine.connect()
cnxn = connection.connection
#Read in data into from sql table#
raw_data = pd.read_sql(sql = """SELECT * FROM Table""", con = cnxn)

The goal is to be able to iterate through all of the databases and grab the desired table (tablepull) for each database and save into a list. Any help or guidance would be greatly appreciated.

Thank You

1
  • Which DB-API are you using? pyodbc, pymssql, other? Usually ,it is denoted by the mssql keyword in engine connection which you do not have in above redacted code. Commented Apr 6, 2020 at 22:20

1 Answer 1

2

You can connect to multiple databases, provided they are on same server by not specifying database name in connect command.

Now, you can have a list with all database names:

db_list = ['db1', 'db2', ... 'db100']
df_dict = {}

for c, db in enumerate(db_list):
    sql = """SELECT * FROM {}.Table"""
    df_dict[c] = pd.read_sql_query(sql.format(db), engine)        

Finally you will get a dictionary with all dataframes.

This should help.

Sign up to request clarification or add additional context in comments.

8 Comments

Thanks, I was able to connect without specifying the database name. I am getting a syntax error on you last line: df{}.format(c) = pd.read_sql_query(query, engine, params=(db))
Edited my answer. Please check.
Good answer but you cannot parameterize identifiers like table and column names. Use f-strings: pd.read_sql_query(f"SELECT * FROM {db}.Table", engine).
Thank you both but now my connection is not working I must have been mistaken when I thought the connect was made. What I tried is is: engine = create_engine(mssql:// UserName: Password@Server?driver=SQL Server Native Client 11.0')
@Parfait Thanks for pointing that out. Updated my answer. Should get a +1 from your end :P
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.