I have a data.csv file like this:

Col1,Col2,Col3,Col4,Col5  
10,12,14,15,16  
18,20,22,24,26  
28,30,32,34,36  
38,40,42,44,46  
48,50,52,54,56

Col6,Col7  
11,12  
13,14  
...

I want to read only the first block of data (columns Col1 to Col5); I don't need Col6 and Col7.

I tried reading this file using:

df = pd.read_csv('data.csv',header=0)

but it throws an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 3: invalid start byte

Then I tried this:

df = pd.read_csv('data.csv',header=0,error_bad_lines=True)

But this does not give the desired result either. How can I read only up to the first blank line in the CSV file?

  • Have you tried encoding = 'utf-16'? Commented Oct 18, 2018 at 21:40
  • In my opinion, the problem is with the file. This is not a valid csv file, but two csv files concatenated into one. Try splitting the file into two files. Commented Oct 18, 2018 at 21:41

2 Answers

You can create a generator that reads the file line by line and stops at the first blank line. The lines it yields are joined and passed to pandas:

import pandas as pd
import io


def file_reader(filename):
    # Yield lines until the first blank (or whitespace-only) line.
    with open(filename) as f:
        for line in f:
            if not line.strip():
                break
            yield line


data = io.StringIO(''.join(file_reader('data.csv')))
df = pd.read_csv(data)
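
The same idea can be written with itertools.takewhile and an explicit encoding, which may also help with the UnicodeDecodeError from the question. This is only a minimal sketch: read_first_block and its encoding parameter are names chosen here for illustration, and the correct encoding for the asker's file is not known from the question, so the caller has to supply it.

```python
import io
import os
import tempfile
from itertools import takewhile

import pandas as pd


def read_first_block(filename, encoding="utf-8", **kwargs):
    """Read a CSV up to (not including) the first blank line.

    Hypothetical helper: `encoding` must match the real file.
    """
    with open(filename, encoding=encoding) as f:
        # takewhile stops at the first line that is empty after stripping
        block = "".join(takewhile(lambda line: line.strip(), f))
    return pd.read_csv(io.StringIO(block), **kwargs)


# Small demonstration on a made-up file with two concatenated tables.
sample = (
    "Col1,Col2,Col3,Col4,Col5\n"
    "10,12,14,15,16\n"
    "18,20,22,24,26\n"
    "\n"
    "Col6,Col7\n"
    "11,12\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    df = read_first_block(path)

print(df)
```

Only the text before the blank line ever reaches pandas, so the second table cannot affect column inference.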

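Pandas doesn't have an option to stop reading at a condition, but it does have an option to stop after n rows. So you could scan the file first, count the lines up to the first blank line, and then load it into pandas with

pd.read_csv('file.csv', nrows=count - 1)

(nrows counts data rows only, so the header line is subtracted from the count.)

Along the lines of this:

filename = 'data.csv'

count = 0
with open(filename) as f:
    for line in f:
        # stop at the first blank (or whitespace-only) line
        if not line.strip():
            break
        count += 1

# count includes the header line; nrows does not
df = pd.read_csv(filename, nrows=count - 1)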
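
If there are many files, the counting step can be wrapped in a function and applied to each one. The following is a rough sketch under assumptions: rows_before_blank and read_until_blank are hypothetical names, the files are assumed to be UTF-8, and the "*.csv" glob pattern and the sample contents are made up for the demonstration.

```python
import glob
import os
import tempfile

import pandas as pd


def rows_before_blank(filename, encoding="utf-8"):
    """Count the lines before the first blank line, header included."""
    count = 0
    with open(filename, encoding=encoding) as f:
        for line in f:
            if not line.strip():
                break
            count += 1
    return count


def read_until_blank(filename, encoding="utf-8"):
    # nrows counts data rows only, so subtract the header line
    n = rows_before_blank(filename, encoding)
    return pd.read_csv(filename, nrows=max(n - 1, 0), encoding=encoding)


# Demonstration with two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("Col1,Col2\n1,2\n3,4\n\nCol6,Col7\n11,12\n")
    frames = {
        os.path.basename(p): read_until_blank(p)
        for p in sorted(glob.glob(os.path.join(tmp, "*.csv")))
    }

print(frames["a.csv"])
```

Each file is scanned once for the count and then read once by pandas, so nothing has to be counted by hand.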

2 Comments

There are many files to read, @Christian Sloper, so it would be extremely difficult to count the rows in each file.
That comment is a bit hard to understand: the program snippet does the counting for you, just before you load the file into pandas.
