I have CSV files that I read into pandas with:
#!/usr/bin/env python
import pandas as pd
import sys
filename = sys.argv[1]
df = pd.read_csv(filename)
Unfortunately, the last line of these files is often corrupt (it has the wrong number of commas). Currently I open each file in a text editor and remove the last line by hand.
Is it possible to drop the last line in the same Python/pandas script that loads the CSV, so that I can skip this manual step?
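One possible approach, as a minimal sketch: pandas' read_csv accepts a skipfooter argument that ignores the given number of lines at the end of the file. It is only supported by the Python parsing engine, so engine="python" is passed explicitly. The helper name read_csv_without_last_line and the in-memory sample data are made up for illustration; in the script above the call would take filename instead.

```python
import io

import pandas as pd


def read_csv_without_last_line(source):
    """Read a CSV while ignoring its final line (hypothetical helper)."""
    # skipfooter=1 drops the last line; it requires the Python engine.
    return pd.read_csv(source, skipfooter=1, engine="python")


# Illustrative data: the last line is truncated (too few commas).
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n7,8\n")
df = read_csv_without_last_line(sample)
print(df)
```

Note that this drops the last line unconditionally, including in files where it happens to be valid. If only malformed rows should be discarded, on_bad_lines="skip" (available in pandas 1.3 and later) may be worth a look, though it only skips rows with too many fields.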
str.extract:
for col in df.columns[2:]:
    df[col] = df[col].str.extract(r'(\d+)').astype(int)