How to convert numpy bytes to float in python3?

Question

My question is similar to this one. I tried using genfromtxt, but it still doesn't work: it reads the file as expected, but the values come back as bytes rather than floats. The code and a file excerpt are below.

     temp = np.genfromtxt('PFRP_12.csv', names=True, skip_header=1, comments="#", delimiter=",", dtype=None)

The first data row reads as (b'"0"', b'"0.2241135"', b'"0"', b'"0.01245075"', b'"0"', b'"0"')

     "1 _ 1",,,,,
     "Time","Force","Stroke","Stress","Strain","Disp."
     #"sec","N","mm","MPa","%","mm"
     "0","0.2241135","0","0.01245075","0","0"
     "0.1","0.2304713","0.0016","0.01280396","0.001066667","0.0016"
     "0.2","1.707077","0.004675","0.09483761","0.003116667","0.004675"

I tried different dtypes (None, str, float, bytes), still with no success. Thanks!

Edit: As Evert suggested, I also tried dtype=float, but it reads all of the values as nan: (nan, nan, nan, nan, nan, nan)

Please read the documentation, and use dtype=float instead of dtype=None. – user707650, Commented Feb 24, 2017 at 10:22
@Evert Yes I did, float gives all nan. Since it seems a simple thing, I spent roughly an hour looking for a solution, but nothing helped. – Anbu, Commented Feb 24, 2017 at 10:37

user707650 · Accepted Answer · 2017-02-24 11:26:23Z

1

Another solution is to use the converters argument:

np.genfromtxt('inp.txt', names=True, skip_header=1, comments="#", 
delimiter=",", dtype=None, 
converters=dict((i, lambda s: float(s.decode().strip('"'))) for i in range(6)))

(you'll need to specify a converter for each column).

Side remark: oddly enough, while dtype="U12" or similar should actually produce strings instead of bytes (avoiding the .decode() part), this doesn't seem to work and results in empty entries.
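If writing one converter per column feels heavy, a different approach (not the converters technique above) is to strip the quotes from each line before genfromtxt sees it. The following is a minimal sketch under some assumptions: the sample text is copied from the question's excerpt and wrapped in io.StringIO so the snippet is self-contained, and no field is assumed to contain a comma or an escaped quote. It relies on genfromtxt accepting a generator of text lines in place of a filename.

```python
import io

import numpy as np

# Sample text copied from the question's file excerpt.
RAW = '''"1 _ 1",,,,,
"Time","Force","Stroke","Stress","Strain","Disp."
#"sec","N","mm","MPa","%","mm"
"0","0.2241135","0","0.01245075","0","0"
"0.1","0.2304713","0.0016","0.01280396","0.001066667","0.0016"
"0.2","1.707077","0.004675","0.09483761","0.003116667","0.004675"
'''

# Assumption: no field contains a comma or an escaped quote, so deleting
# every double quote leaves a plain comma-separated line.
cleaned_lines = (line.replace('"', '') for line in io.StringIO(RAW))

# With the quotes gone, dtype=float can parse every column directly.
data = np.genfromtxt(cleaned_lines, names=True, skip_header=1,
                     comments="#", delimiter=",", dtype=float)

print(data.dtype.names)
print(data["Force"])
```

With a real file, the same generator expression can wrap an open file object instead of io.StringIO(RAW), as long as the genfromtxt call happens while the file is still open.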


1 Comment

hpaulj Over a year ago

This converter also works: lambda s: float(s.strip(b'"')) (that is, bytestrings have a strip method as well).

user707650 · Accepted Answer · 2017-02-24 11:11:14Z

Here is a fancy, unreadable, functional programming style way of converting your input to the record array you're looking for:

>>> np.core.records.fromarrays(np.asarray([float(y.decode().strip('"')) for x in temp for y in x]).reshape(temp.shape[0], -1).T, names=temp.dtype.names, formats=['f'] * len(temp.dtype.names))

or spread out across a few lines:

>>> np.core.records.fromarrays(
...   np.asarray(
...     [float(y.decode().strip('"')) for x in temp for y in x]
...   ).reshape(temp.shape[0], -1).T,
...   names=temp.dtype.names,
...   formats=['f'] * len(temp.dtype.names))

I wouldn't recommend this solution, but sometimes it's fun to hack something like this together.

The issue with your data is a bit more complicated than it may seem. That is because the numbers in your CSV files really are not numbers: they are explicitly strings, as they have surrounding double quotes.

So, there are 3 steps involved in the conversion to float:

- decode the bytes to a Python 3 (unicode) string
- remove (strip) the double quotes from each end of each string
- convert the remaining string to float
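As a minimal sketch, the three steps can be applied to a single field like this. The helper name bytes_field_to_float is made up for illustration; it is not part of NumPy.

```python
def bytes_field_to_float(raw: bytes) -> float:
    """Convert a quoted bytes field such as b'"0.2241135"' to a float."""
    text = raw.decode()         # step 1: bytes -> str (UTF-8 by default)
    unquoted = text.strip('"')  # step 2: drop the surrounding double quotes
    return float(unquoted)      # step 3: parse the remaining text as a float


print(bytes_field_to_float(b'"0.2241135"'))
```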

This happens inside the double list comprehension, on line 3. It's a double list comprehension, since a rec-array is essentially 2D.
The resulting list, however, is 1D. I turn it back into a numpy array (np.asarray) so I can easily reshape it to something 2D. That (now plain float) array is then given to np.core.records.fromarrays, with the names taken from the original rec-array, and the format of each field set to float.
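To make the flatten-and-reshape step concrete, here is a small self-contained sketch. The rows list is a hand-written stand-in for temp (values copied from the question), the field names are typed in by hand rather than taken from temp.dtype.names, and np.rec.fromarrays is used as the public spelling of the same function. The flat list comes out row by row, so this sketch reshapes to (rows, fields) and then transposes to get one array per field.

```python
import numpy as np

# Hypothetical stand-in for `temp`: rows of quoted bytes fields, as in the question.
rows = [
    (b'"0"', b'"0.2241135"', b'"0"', b'"0.01245075"', b'"0"', b'"0"'),
    (b'"0.1"', b'"0.2304713"', b'"0.0016"', b'"0.01280396"', b'"0.001066667"', b'"0.0016"'),
    (b'"0.2"', b'"1.707077"', b'"0.004675"', b'"0.09483761"', b'"0.003116667"', b'"0.004675"'),
]
names = ["Time", "Force", "Stroke", "Stress", "Strain", "Disp"]

# Flatten row by row, converting each field: decode, strip quotes, parse.
flat = [float(field.decode().strip('"')) for row in rows for field in row]

# The flat list is ordered row by row, so reshape to (n_rows, n_fields) first,
# then transpose to get one 1D array per field (column).
columns = np.asarray(flat).reshape(len(rows), -1).T

# 'f8' is a 64-bit float; one format per field.
rec = np.rec.fromarrays(list(columns), names=names, formats=["f8"] * len(names))

print(rec["Force"])
```

Each field can then be read either as rec["Force"] or as rec.Force.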
