I am new to Python programming and am looking for some help/guidance in correcting my Python code.

Here is my query.

  1. I have one Excel file with 7 tabs.
  2. I have one folder containing 7 different text files. Each text file holds the SQL query for its respective tab, and each text file has the same name as the corresponding tab in the Excel file.

I have written Python code that loops through the text files one by one, executes each file's SQL query, and should dump the output data into the respective sheet/tab of the existing Excel file. I am using pandas for this. The code runs fine, but when it updates the Excel file, pandas removes all existing sheets and writes only the current output data into the file.

Example: if the Python code executes a text file (filename: Data), the data returned by that SQL query should be dumped into the Excel file (sheet name: Data).
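
The file-to-sheet naming rule described here can be sketched on its own, independent of the database code. This is only an illustration: the helper name `sheet_names_for` and its `skip` parameter are hypothetical and not part of the original code.

```python
from pathlib import Path


def sheet_names_for(folder, skip=()):
    """Map sheet names to the .txt query files in `folder` that fill them.

    Hypothetical helper: the sheet name is the file name without its
    extension, and any file name listed in `skip` is left out.
    """
    mapping = {}
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name in skip:
            continue
        # "Data.txt" -> sheet "Data"
        mapping[path.stem.strip()] = path
    return mapping
```

For example, `sheet_names_for(fpath, skip=("ClosedAging_Cont.txt",))` would give one entry per query file that should become a sheet, which makes the naming easy to check before any query runs.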

<pre><code>
import pypyodbc
import pandas as pd
import os
import ctypes
from pandas import ExcelWriter
fpath = r"C:\MNaveed\DataScience\Python Practice New\SQL Queries"
xlfile = r"C:\MNaveed\DataScience\Python Practice New\SQL Queries\Open_Case_Data.xlsx"
cnxn = pypyodbc.connect('Driver={SQL Server};Server=MyServerName;Database=MyDatabaseName;Trusted_Connection=Yes')
cursor = cnxn.cursor()

for subdir, dirs, files in os.walk(fpath):
    for file in files:
        #print(os.path.join(subdir,file))
        filepath = os.path.join(subdir,file)
        #print("FilePath: ", filepath)

        if filepath.endswith(".txt"):
            if file != "ClosedAging_Cont.txt":
                txtdata = open(filepath, 'r')
                script = txtdata.read().strip()
                txtdata.close()
                cursor.execute(script)
                if file == "ClosedAging.txt":
                    txtdata = open(os.path.join(subdir,"ClosedAging_Cont.txt"), 'r')
                    script = txtdata.read().strip()
                    txtdata.close()
                    cursor.execute(script)

                col = [desc[0] for desc in cursor.description]
                data = cursor.fetchall()
                df = pd.DataFrame(list(data),columns=col)

                #save_xls(df,xlfile)

                writer = pd.ExcelWriter(xlfile)
                flnm = file.replace('.txt','').strip()
                df.to_excel(writer,sheet_name=flnm,index=False)
                writer.save()

                print(file, " : Successfully Updated.")
            else:
                print(file, " : Ignoring this File")
        else:
            print(file, " : Ignoring this File")

ctypes.windll.user32.MessageBoxW(0,"Open Case Reporting Data Successfully Updated","Open Case Reporting",1)
</code></pre>
  • From what I've read, xlsxwriter is just a writer. It doesn't have the ability to read files, meaning it wouldn't be able to look for the last row where you want to append the data. You would need openpyxl to look for the empty row, then use xlsxwriter to write the data by specifying the row. Or you can rewrite your code in openpyxl. Commented Aug 14, 2017 at 12:56
  • formatted and typos fixed Commented Aug 14, 2017 at 16:29
  • Hi Muthu Kumaran, I am not looking to append data. I want to dump data into the existing worksheets (the existing data should be removed first and the updated data dumped in). However, pandas is not creating a new data frame each time it loops. Commented Aug 15, 2017 at 13:56

1 Answer

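Because pd.ExcelWriter(xlfile) is instantiated inside the loop over the text files, each iteration overwrites the Excel file. Instead, instantiate pd.ExcelWriter(xlfile) once before the loop and call writer.save() (writer.close() in pandas 2.0 and later, where save() was removed) once after the loop.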

The following example is adapted from the XlsxWriter documentation.

You can find more information about multiple sheets here: XlsxWriter documentation - multiple sheets

import pandas as pd

# Create a Pandas Excel writer using XlsxWriter as the engine outside the loop.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
# Sample loop, replace with directory browsing loop
for i in range(7):
    # Sample Pandas dataframe. Replace with SQL query and resulting data frame.
    df = pd.DataFrame({'DataFromSQLQuery': ['SQL query result {0}'.format(i)]})
    # Convert the dataframe to an XlsxWriter Excel object.
    df.to_excel(writer, sheet_name='Sheet{0}'.format(i))
# Close the Pandas Excel writer and output the Excel file.
# close() also saves the file; writer.save() was removed in pandas 2.0.
writer.close()

The following code addresses the concrete question but is untested.

import pypyodbc
import pandas as pd
import os
import ctypes
from pandas import ExcelWriter

fpath = r"C:\MNaveed\DataScience\Python Practice New\SQL Queries"
xlfile = r"C:\MNaveed\DataScience\Python Practice New\SQL Queries\Open_Case_Data.xlsx"
cnxn = pypyodbc.connect('Driver={SQL Server};Server=MyServerName;Database=MyDatabaseName;Trusted_Connection=Yes')
cursor = cnxn.cursor()

# Create a Pandas Excel writer using XlsxWriter as the engine outside the loop.
# Write to xlfile so the output lands in the intended workbook.
writer = pd.ExcelWriter(xlfile, engine='xlsxwriter')

# File loop
for subdir, dirs, files in os.walk(fpath):
    for file in files:
        filepath = os.path.join(subdir,file)
        if filepath.endswith(".txt"):
            if file != "ClosedAging_Cont.txt":
                txtdata = open(filepath, 'r')
                script = txtdata.read().strip()
                txtdata.close()
                cursor.execute(script)
                if file == "ClosedAging.txt":
                    txtdata = open(os.path.join(subdir,"ClosedAging_Cont.txt"), 'r')
                    script = txtdata.read().strip()
                    txtdata.close()
                    cursor.execute(script)

                col = [desc[0] for desc in cursor.description]
                data = cursor.fetchall()

                # Data frame from original question
                df = pd.DataFrame(list(data),columns=col)

                # Convert the dataframe to an XlsxWriter Excel object
                flnm = file.replace('.txt','').strip()
                df.to_excel(writer, sheet_name=flnm, index=False)

                print(file, " : Successfully Updated.")
            else:
                print(file, " : Ignoring this File")
        else:
            print(file, " : Ignoring this File")

# Close the Pandas Excel writer and output the Excel file.
# close() also saves the file; writer.save() was removed in pandas 2.0.
writer.close()

ctypes.windll.user32.MessageBoxW(0,"Open Case Reporting Data Successfully Updated","Open Case Reporting",1)
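
If the workbook has sheets that the loop does not rewrite and they must survive, a different approach may fit better: open the existing file in append mode and replace only the target sheet. The sketch below assumes pandas 1.3 or later (for `if_sheet_exists`) and the openpyxl engine, since XlsxWriter cannot open existing files. The helper name `replace_sheet` is hypothetical, and the sketch has not been run against the asker's workbook.

```python
import pandas as pd


def replace_sheet(xlfile, sheet_name, df):
    """Overwrite one sheet of an existing workbook, keeping the other sheets.

    Hypothetical helper; assumes pandas >= 1.3 and openpyxl are installed.
    """
    # mode="a" opens the existing file instead of creating a new one;
    # if_sheet_exists="replace" drops the old sheet before writing the new data.
    with pd.ExcelWriter(xlfile, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

Inside the file loop this would be called as `replace_sheet(xlfile, flnm, df)`. The workbook is reopened for every sheet, which is slower than a single writer but leaves sheets that are not rewritten in place. Content that openpyxl cannot round-trip may not survive, so it is worth trying on a copy of the file first.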

7 Comments

Hi silvanoe, thank you so much for your response. The code above deletes all existing sheets and creates new sheets (Sheet0, Sheet1, ...) with the same data dumped into each. My concern is this: my code picks up the text files one by one, executes each SQL query, and converts the output into a pandas DataFrame. I want to dump that data into a specific sheet (the sheet in the Excel file that has the same name as the text file).
Since I cannot replicate your database, the data frame just serves as example data standing in for the result of your database query. You will have to replace this part of the code with your database query code. The same goes for the loop counting to 7: you will have to replace it with your directory browsing loop. I have updated my answer with additional comments.
Okay, thank you Silvanoe. Can you help me change this line of my code, df = pd.DataFrame(list(data), columns=col), into your format, df = pd.DataFrame({'DataFromSQLQuery': ['SQL query result {0}'.format(i)]})?
I changed your code and added it to the answer. Please test it and let us know if it works.