I am trying to iterate through a CSV file and create a NumPy array for each row, where the first column holds the x-coordinates and the second column holds the y-coordinates. I then want to append each array to a master array and return it.

import numpy as np 

thedoc = open("data.csv")
headers = thedoc.readline()


def generatingArray(thedoc):
    masterArray = np.array([])

    for numbers in thedoc: 
        editDocument = numbers.strip().split(",")
        x = editDocument[0]
        y = editDocument[1]
        createdArray = np.array((x, y))
        masterArray = np.append([createdArray])


    return masterArray


print(generatingArray(thedoc))

I am hoping to see an array with all the CSV info in it. Instead, I receive an error: "append() missing 1 required positional argument: 'values'". Any help on where my error is and how to fix it is greatly appreciated!

  • numpy.append is not like list.append: it is not an in-place operation. Pass the target array as well, e.g. numpy.append(ind, i). Commented Apr 5, 2019 at 0:31
  • Refer to this doc. Commented Apr 5, 2019 at 0:32
  • Thanks very much for the comment. When I change masterArray = np.append([createdArray]) to np.append(masterArray, createdArray) all it returns is [ ]. Any suggestion on why this is now happening? Commented Apr 5, 2019 at 0:34
  • check this answer Commented Apr 5, 2019 at 0:44
  • @Dyland yes, that is generally a better way to do it. Best is to not do this at all and instead read the entire file into a numpy array to begin with or a pandas dataframe. Commented Apr 5, 2019 at 0:49

1 Answer

NumPy arrays don't grow in place the way Python lists do. np.append returns a new, larger copy each time rather than modifying masterArray, so you must either assign its result back or, better, allocate the space for the whole array up front instead of starting from "masterArray = np.array([])".
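To make the point about np.append concrete, here is a minimal sketch. The names master, row, unchanged_size and pairs are made up for illustration and are not from the question:

```python
import numpy as np

master = np.array([])
row = np.array([1.0, 2.0])

# np.append returns a NEW array; `master` itself is left unchanged.
np.append(master, row)
unchanged_size = master.size  # still 0, which is why the loop returned []

# Assigning the result back is what actually "grows" the array.
master = np.append(master, row)
master = np.append(master, np.array([3.0, 4.0]))

# Without an axis argument np.append flattens its inputs,
# so reshape to get one (x, y) pair per row.
pairs = master.reshape(-1, 2)
print(pairs)
```

This works, but every call copies the whole array, so it gets slow for large files. That is the reason the approaches below avoid calling np.append in a loop.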

The best answer is to import the file directly into a NumPy array using something like genfromtxt (https://docs.scipy.org/doc/numpy-1.10.1/user/basics.io.genfromtxt.html).
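A minimal sketch of the genfromtxt route, assuming the file has one header line followed by comma-separated x,y rows. The in-memory sample_csv and its contents are a made-up stand-in for data.csv so the snippet runs on its own:

```python
import io
import numpy as np

# Stand-in for data.csv (hypothetical contents): a header line, then x,y rows.
sample_csv = io.StringIO("x,y\n1.0,2.0\n3.5,4.5\n")

# skip_header=1 drops the header line; delimiter="," splits the columns.
data = np.genfromtxt(sample_csv, delimiter=",", skip_header=1)

x = data[:, 0]  # first column: x-coordinates
y = data[:, 1]  # second column: y-coordinates
print(data)
```

With a real file you would pass the path, e.g. "data.csv", in place of sample_csv.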

If you want to keep the loop, and you know the number of lines you're reading in or can get it with something like this:

with open("data.csv") as f:
    file_length = len(f.readlines()) - 1  # subtract 1 for the header line

Then you can preallocate the NumPy array and fill it row by row, something like this:

masterArray = np.empty((file_length, 2))  # one row per data line, columns x and y

for i, numbers in enumerate(thedoc):
    editDocument = numbers.strip().split(",")
    x = float(editDocument[0])  # convert the text fields to numbers
    y = float(editDocument[1])
    masterArray[i] = [x, y]

I would recommend the first method, but if you're lazy you can always just build a Python list and then make a NumPy array from it.

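def generatingArray(thedoc):
    masterArray = []  # plain Python list, which is cheap to append to

    for numbers in thedoc:
        editDocument = numbers.strip().split(",")
        x = float(editDocument[0])
        y = float(editDocument[1])
        masterArray.append([x, y])

    # Convert once at the end; the result has shape (number_of_rows, 2)
    return np.array(masterArray)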

1 Comment

genfromtxt and loadtxt use the list append approach. Pretty well have to because they don't know ahead of time the number of rows.
