0

I have a CSV file that looks like this:

[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]

I need to convert this to a NumPy array. If I use the code below:

data = genfromtxt(file, delimiter=',', encoding="utf8")

I get nan in the output.

If I do this

np.genfromtxt(file, encoding=None, dtype=None)

It does not remove the opening and closing brackets of each list, and the output looks like this:

array = ([['[0.037621960043907166,', '0.04622473940253258,',
        '0.9161532521247864]'],
       ['[0.030109738931059837,', '0.03261643648147583,',
        '0.9372738003730774]'],
       ['[0.030109738931059837,', '0.03261643648147583,',
        '0.9372738003730774]']], dtype='<U22')

The expected output is:

array = ([['0.037621960043907166,', '0.04622473940253258,',
            '0.9161532521247864'],
           ['0.030109738931059837,', '0.03261643648147583,',
            '0.9372738003730774'],
           ['0.030109738931059837,', '0.03261643648147583,',
            '0.9372738003730774']], dtype='<U22')

How can I get the expected output? It seems I need to remove the brackets first, before applying the NumPy operations. Any suggestions?

3 Answers

1

As long as you know the format of the content, simple slicing will do:

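import numpy as np

# Strip whitespace first so a missing trailing newline cannot cut off a digit,
# then drop the surrounding brackets and split on commas.
with open('tmp', 'r') as f:
    tmp = np.array([[float(num) for num in line.strip()[1:-1].split(',')]
                    for line in f if line.strip()])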

0

What you need is eval():

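from numpy import array

# The with block closes the file, so no explicit f.close() is needed.
with open('your file name', 'r') as f:
    # eval() runs each line as Python code, so use it only on trusted files.
    lines = [eval(x) for x in f.readlines()]
ary = array(lines)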
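
If the file might not be fully trusted, a variant of the same idea could use ast.literal_eval, which only accepts Python literals. This is a minimal sketch, not part of the original answer; the `sample` object is a hypothetical in-memory stand-in for the real file, filled with the rows from the question.

```python
import ast
import io

import numpy as np

# Hypothetical stand-in for the real file; the rows are copied from the question.
sample = io.StringIO(
    "[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
)

# literal_eval parses each line as a Python list literal without executing code.
rows = [ast.literal_eval(line.strip()) for line in sample if line.strip()]
ary = np.array(rows)
```

With a real file, `sample` would be replaced by the handle returned from open(); the result should be a 3x3 float array.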

0

When you have a text file like:

[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]
[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]

You can try this:

np.genfromtxt(filename, dtype=str, encoding=None, delimiter=',',
              converters={0: lambda s: s.strip('['),
                          2: lambda s: s.strip(']')})

Output:

array([['0.037621960043907166', ' 0.04622473940253258',
        ' 0.9161532521247864'],
       ['0.030109738931059837', ' 0.03261643648147583',
        ' 0.9372738003730774'],
       ['0.030109738931059837', ' 0.03261643648147583',
        ' 0.9372738003730774']], dtype='<U20')
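
The array above still holds strings, some with a leading space. If numeric values are the goal, one option is to strip the brackets from each line first and then let NumPy parse the cleaned lines. The sketch below is an assumption about what is wanted rather than part of the answer above; `sample` is a hypothetical in-memory stand-in for the file.

```python
import io

import numpy as np

# Hypothetical stand-in for the real file; the rows are copied from the question.
sample = io.StringIO(
    "[0.037621960043907166, 0.04622473940253258, 0.9161532521247864]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
    "[0.030109738931059837, 0.03261643648147583, 0.9372738003730774]\n"
)

# Remove surrounding whitespace and the brackets from every non-empty line.
cleaned = [line.strip().strip("[]") for line in sample if line.strip()]

# loadtxt accepts a list of strings and parses each field as a float by default.
data = np.loadtxt(cleaned, delimiter=",")
```

Here `data` should be a 3x3 float array, so no later string-to-float conversion is needed.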
