Can't read CSV file in python

Question

I'm trying to read a CSV file in Python and I get some errors. I think there's something wrong with this particular CSV file, because it works with others that I've tried. This is the code:

import pandas as pd
import numpy as np

def execute():
    tabel = pd.read_csv("FoodV.csv", index_col=0)
    print(tabel, type(tabel))

if __name__ == "__main__":
    execute()

and these are the errors

 
Can you please help?

Not without seeing the CSV file (or at least first three lines of it).
– Amadan, Commented Jan 19, 2022 at 0:38
The problem is with the data, which you haven't shown us. Can you post the contents of FoodV.csv?
– John Gordon, Commented Jan 19, 2022 at 0:38

Amadan · Accepted Answer · 2022-01-20 07:33:42Z

By default, CSV is delimited by commas ("Comma-Separated Values"), but your file is delimited by semicolons. To make matters worse, you do have commas in your file, but you use them as decimal separators, and not the default period. These defaults mean that the fields in your first line are being read as:

001;BUTTER
WITH SALT;15
87;717;0
85;81
11;0
06;0
06;24;0
02;2;24;24;643;0
09;0;0;1;0
003;0
17;2499;684;671;215

which is almost certainly not what you want. To override these two defaults, specify the separator and the decimal mark explicitly:

tabel = pd.read_csv("FoodV.csv", index_col=0, sep=";", decimal=",")

Note that this does not mean your CSV file is bad, just that it is non-standard, though that's likely Microsoft's fault. The CSV standard is modeled on US usage, where . separates the integral and fractional parts: 15.87. However, in some countries (particularly in Europe), the decimal separator is a comma (15,87), which also means the comma is not available as a field separator. By making Windows software respond to regional settings even when writing CSV, Microsoft opened a can of worms: non-standard "CSV" formats make CSV less readily usable as a common global data interchange format. This is how I would expect Excel to save a CSV if your Windows is set to, e.g., a French locale.
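To see the effect of sep and decimal without the original file, here is a minimal sketch on in-memory data. The column names and the two rows are hypothetical, made up only to mimic the format described above (semicolons between fields, commas as decimal marks); they are not the contents of FoodV.csv.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-delimited, comma as decimal mark.
raw = (
    "id;name;water;energy\n"
    "1;BUTTER,WITH SALT;15,87;717\n"
    "2;CHEESE,BLUE;42,41;353\n"
)

# Same arguments as in the fix above, reading from a string buffer
# instead of a file on disk.
tabel = pd.read_csv(io.StringIO(raw), index_col=0, sep=";", decimal=",")

print(tabel)
print(tabel.dtypes)  # "water" should come out as a float column
```

The comma inside the name column is left alone because it is no longer the field separator, while the numeric columns are parsed as numbers rather than strings.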

vinalti · Accepted Answer · 2022-01-19 00:37:48Z

The problem is explained in the error message:

Expected 11 fields in line 3, saw 14

You probably have a few commas too many on line 3 (or a few missing on the previous lines). It seems that for the CSV to work, every row needs the same number of columns so that pandas can turn it into a DataFrame.
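A minimal reproduction of this kind of error, using made-up data rather than the file from the question: the header declares two columns and the third line carries three fields, so the default parser refuses to continue.

```python
import io

import pandas as pd

# Made-up data: two header columns, but line 3 has three fields.
raw = "a,b\n1,2\n3,4,5\n"

message = ""
try:
    pd.read_csv(io.StringIO(raw))
except pd.errors.ParserError as exc:
    # pandas reports the expected and the actual field count.
    message = str(exc)

print(message)
```

The message has the same shape as the one quoted above, which tells you where the field count first stops matching the header but not why it does.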

ti7 · Accepted Answer · 2022-01-19 00:41:38Z

Your CSV file is corrupted, likely because some inputs have extra (unescaped) commas!
If you can skip them, set on_bad_lines="warn" (available since pandas 1.3) when calling .read_csv(); each bad line is then skipped with a warning.

df = pd.read_csv("FoodV.csv", index_col=0, on_bad_lines="warn")

If you need the corrupt lines, you could fix them manually, or read the file line by line and repair the lines that have extra fields:

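contents = []
with open("FoodV.csv") as fh:
    for line in fh:  # file objects are iterable line by line
        # naive split; the csv module would also handle quoted fields
        fields = line.rstrip("\n").split(",")
        if len(fields) != 11:  # expected field count, guessed from the question
            pass  # fix the line here, e.g. merge or drop the extra fields
        contents.append(fields)

# create the dataframe from contents once every row has the same length,
# e.g. pd.DataFrame(contents[1:], columns=contents[0])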
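Before repairing anything, it can help to list which rows are off. The sketch below uses the standard csv module, so quoted fields containing the delimiter are counted correctly. The helper name find_bad_rows and the sample data are hypothetical, introduced only for illustration.

```python
import csv
import io


def find_bad_rows(fh, expected_fields, delimiter=","):
    """Return (line_number, fields) for rows with an unexpected field count."""
    reader = csv.reader(fh, delimiter=delimiter)
    return [
        (reader.line_num, row)  # line_num is the line the current row ended on
        for row in reader
        if len(row) != expected_fields
    ]


# Made-up sample: line 3 has an extra field, line 4 has a quoted comma.
sample = io.StringIO('a,b,c\n1,2,3\n4,5,6,7\n"x,y",8,9\n')
bad = find_bad_rows(sample, expected_fields=3)
print(bad)
```

With a real file you would pass an open file handle (opened with newline="", as the csv documentation recommends) in place of the StringIO buffer, and the delimiter argument lets you try ";" if commas turn out not to be the separator.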
