from docx import Document
import pandas as pd

document = Document("D:/tmp/test.docx")

# Collect the cell text of every row in every table of the document
rows = []
for table in document.tables:
    for row in table.rows:
        rows.append([cell.text for cell in row.cells])

# Build the DataFrame once; DataFrame.append was removed in pandas 2.0
df = pd.DataFrame(rows, columns=["Column1", "Column2"])
df.to_excel("D:/tmp/test.xlsx")
print(df)

Output

>>>
  Column1 Column2
0   Hello    TEST
1     Est    Ting
2      Gg      ff

How can I remove the row and column labels (0, 1, 2), and how can I add some images with this code?

  • Does this answer your question? How to remove index from a created Dataframe in Python? Commented Jun 7, 2021 at 3:51
  • I don't see any column index here. Can you point out where it is? By the column index, do you mean the column names Column1 and Column2? Commented Jun 7, 2021 at 4:10

2 Answers


You can remove the index and the header when exporting to Excel by passing the following arguments:

df.to_excel("test.xlsx", header=False, index=False)
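To see the effect of those two arguments, here is a minimal sketch that writes a small frame to an in-memory buffer and reads it back. The sample data is copied from the output in the question. The in-memory buffer and the read-back step are only for illustration, and the sketch assumes openpyxl is installed so that pandas can read the .xlsx data.

```python
import io

import pandas as pd

# Sample data taken from the output shown in the question
df = pd.DataFrame(
    [["Hello", "TEST"], ["Est", "Ting"], ["Gg", "ff"]],
    columns=["Column1", "Column2"],
)

# Write without the 0, 1, 2 index column and without the header row
buffer = io.BytesIO()
df.to_excel(buffer, index=False, header=False)

# Read the sheet back; header=None tells pandas the first row is data
buffer.seek(0)
back = pd.read_excel(buffer, header=None)
print(back.values.tolist())
```

The sheet then contains only the three data rows, starting in cell A1.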

2 Comments

@cha Hey, could you please be more specific? I didn't see any image mentioned. Do you mean adding borders in Excel?
I think XlsxWriter can do that, but unfortunately I'm not familiar with it. Sorry :(

It can be done like this.

import pandas as pd

dataset = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Write the data without the index column and the header row
dataset.to_excel(writer, sheet_name='Data', index=False, header=False)

sheet_name = 'Images'  # Sheet in which the image will be placed
cell = 'B2'  # Cell at which the image's top-left corner is anchored

workbook = writer.book
worksheet = workbook.add_worksheet(sheet_name)
worksheet.insert_image(cell, 'Tmp.jpg')  # Add the image; Tmp.jpg must exist
# Close the writer once; this saves the file (writer.save() was removed in pandas 2.0)
writer.close()
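If the image should sit on the same sheet as the data rather than on a separate one, a variation along these lines may work. It is only a sketch: `tiny_png` and `write_with_image` are hypothetical helper names, the generated red square stands in for a real picture, and the workbook is built in memory so the example does not depend on any file on disk. It assumes the xlsxwriter package is installed.

```python
import io
import struct
import zlib

import pandas as pd


def tiny_png(width=8, height=8, rgb=(255, 0, 0)):
    """Build a small solid-colour PNG in memory (stand-in for a real image)."""
    def chunk(tag, data):
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    # 8-bit RGB, no interlacing; every scanline starts with filter byte 0
    header = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    scanline = b"\x00" + bytes(rgb) * width
    body = zlib.compress(scanline * height)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", header)
            + chunk(b"IDAT", body) + chunk(b"IEND", b""))


def write_with_image():
    """Write the data and an image to the same sheet; return the .xlsx bytes."""
    dataset = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
        dataset.to_excel(writer, sheet_name="Data", index=False, header=False)
        # Reuse the worksheet that pandas just created instead of adding a new one
        worksheet = writer.sheets["Data"]
        # With image_data set, the file name is only a label and need not exist
        worksheet.insert_image("D2", "pixel.png",
                               {"image_data": io.BytesIO(tiny_png())})
    return buffer.getvalue()


xlsx_bytes = write_with_image()
```

To use a picture from disk instead, pass its path as the second argument and drop the `image_data` option. Writing `xlsx_bytes` to a file opened in binary mode gives a regular .xlsx workbook.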

1 Comment

I have never done it myself, but check out this link. I think it will solve your issue.