I am trying to read an entire directory of txt files into a pandas DataFrame so that each row holds the content of one file.
As far as I can tell, the text files are not delimited; they are the bodies of email messages. All files but one are split into many rows, so instead of having 20-something rows (one for each file) I get over 500. I cannot tell how the one file that loads as a single row differs from the rest. They are all plain text.
The code I am using is:
import csv
import pandas as pd
list_ = []  # collects one DataFrame per file
for i in files:
    list_.append(pd.read_csv('//directory' + i, sep="\t", quoting=csv.QUOTE_NONE, header=None, names=["message", "label"]))
I've set the separator to tab because I don't think it will affect how the text is ingested. Any ideas what the problem is here?
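
One likely explanation is that read_csv starts a new record at every line break no matter which separator is set, so a multi-line email body arrives as many rows. The file that loads as a single row may simply contain no line breaks, though that is a guess. A minimal sketch of the intended result, assuming the files end in .txt and decode as UTF-8, reads each file whole instead of parsing it as CSV. The function name load_messages and the "file" column are hypothetical and not part of the code above.

```python
from pathlib import Path

import pandas as pd


def load_messages(directory):
    """Return a DataFrame with one row per .txt file in `directory`.

    Hypothetical helper: the name and the "file" column are illustrative.
    """
    rows = []
    for path in sorted(Path(directory).glob("*.txt")):
        # Read the whole file as one string so that line breaks inside
        # the message do not start new rows.
        text = path.read_text(encoding="utf-8", errors="replace")
        rows.append({"file": path.name, "message": text})
    return pd.DataFrame(rows, columns=["file", "message"])
```

Calling load_messages on the directory should give one row per file, with the full message text in the "message" column. A "label" column could be added afterwards if the labels come from somewhere other than the file contents.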