1

Using Python 3.3, I am trying to fill a NumPy array with contents from a .CSV file. The .CSV file has the following contents:

CellID  X   Y   Z   
1230    1   1   0
1231    2   1   0 
1232    1   1   1

The first row contains a header and so it must be skipped.

import csv
import numpy as np

csv_fn = "input.csv"

with open(csv_fn, "rb") as infile:
    reader = csv.reader(infile)
    next(reader, None)         # Skips the header? 
    x = list(reader) 
    result = np.array(x).astype("int")  # Converts to a matrix of int? 

The variable result doesn't seem to contain the expected values. I've tried to query the dimension using result.shape.

How do I fix this code so it reads the contents into the array?

1

4 Answers

3

You can use pandas to read the CSV file into a DataFrame and then take only its values:

import pandas as pd
import numpy as np

csv_fn = "input.csv"

file = pd.read_csv(csv_fn)
result = file.values
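
If the file is whitespace-delimited, as it appears in the question, the default comma separator will not split the columns. Here is a minimal sketch under that assumption; the in-memory `sample` string is a stand-in for input.csv, and `sep=r"\s+"` is my guess at the right separator for this data:

```python
from io import StringIO

import pandas as pd

# Stand-in for input.csv; assumes the columns are separated by whitespace
sample = """CellID  X   Y   Z
1230    1   1   0
1231    2   1   0
1232    1   1   1"""

# sep=r"\s+" splits on runs of spaces or tabs; the first row becomes the header
frame = pd.read_csv(StringIO(sample), sep=r"\s+")

# to_numpy() returns the data (without the header) as a NumPy array
result = frame.to_numpy()
print(result.shape)
```

For a real file, pass the path instead of the StringIO object.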

2 Comments

Where did you use the numpy?
result is a numpy array
2

Use np.loadtxt:

from io import StringIO
import numpy as np

file_content = """CellID  X   Y   Z
1230    1   1   0
1231    2   1   0
1232    1   1   1"""

# Replace StringIO with your file object
with StringIO(file_content) as f:
    data = np.loadtxt(f, skiprows=1, dtype=int)

print(data)

Output:

[[1230    1    1    0]
 [1231    2    1    0]
 [1232    1    1    1]]

0

What exactly is the question here? Have you tried numpy.genfromtxt? It is a nice function for loading files like this.
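
For illustration, a rough sketch of what the numpy.genfromtxt call could look like, assuming the columns are separated by whitespace (the default when no delimiter is given). The `sample` string is a stand-in for the real file:

```python
from io import StringIO

import numpy as np

# Stand-in for input.csv; assumes whitespace-separated columns
sample = """CellID  X   Y   Z
1230    1   1   0
1231    2   1   0
1232    1   1   1"""

# skip_header=1 drops the header row; dtype=int yields an integer array
data = np.genfromtxt(StringIO(sample), skip_header=1, dtype=int)
print(data.shape)
```

With a file on disk, pass the filename as the first argument instead.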

2 Comments

Clarified my question now
Don't want to brag about your question, but... do you simply want to read in the file, or do you want to know how to convert your code to do this?
0

Calling next() to skip the first line is good, but using itertools.dropwhile() might be clearer as to your intent.

Now, unless you show exactly what you got in result that you didn't expect, I can only guess. One possible problem I can see is that the default csv.reader() dialect, 'excel', uses the comma as its delimiter, while in your file the delimiters seem to be tabs. As a result, the reader will interpret each line of the file as having a single element. Your list x will then look like this:

[['1230    1   1   0'],
 ['1231    2   1   0'], 
 ['1232    1   1   1']]

And obviously you'll have some problems converting those strings to int.

When using csv, always check that you have the correct delimiter and line-ending characters.
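
As a sketch of the fix, assuming the file really is tab-separated (I can't verify that from the question), passing the delimiter explicitly gives one element per column. The `sample` string below is a hypothetical stand-in for input.csv:

```python
import csv
from io import StringIO

import numpy as np

# Hypothetical tab-separated stand-in for input.csv
sample = "CellID\tX\tY\tZ\n1230\t1\t1\t0\n1231\t2\t1\t0\n1232\t1\t1\t1\n"

with StringIO(sample) as infile:
    reader = csv.reader(infile, delimiter="\t")  # split on tabs, not commas
    next(reader, None)                           # skip the header row
    rows = list(reader)                          # list of lists of strings

# Convert the string fields to integers
result = np.array(rows).astype(int)
print(result.shape)
```

Note that in Python 3 the csv module expects a file opened in text mode, so with a real file use open(csv_fn, "r", newline="") rather than "rb".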
