I have been using Python 2.7 for some time and recently switched to Python 3. I have already updated parts of my code, but the problem I have now eludes me. I am trying to load a dataset with np.loadtxt. Because the data also contains strings, I import the full array as strings and want to convert some entries to float afterwards. The conversion fails and I do not understand why. All I can see is that in Python 3 every string gets the prefix 'b', and I suspect this is related, but I cannot find a concise explanation. Code and error below.
filename = 'train.csv'
raw_data = open(filename, 'rb')
data = np.loadtxt(raw_data, delimiter=",", dtype = 'str')
dataset = data[1:,1:]
print(dataset)
original_data = dataset
test = float(dataset[0,0])
print(test)
Result
[["b'60'" "b'RL'" "b'65'" ..., "b'WD'" "b'Normal'" "b'208500'"]
["b'20'" "b'RL'" "b'80'" ..., "b'WD'" "b'Normal'" "b'181500'"]
["b'60'" "b'RL'" "b'68'" ..., "b'WD'" "b'Normal'" "b'223500'"]
...,
["b'70'" "b'RL'" "b'66'" ..., "b'WD'" "b'Normal'" "b'266500'"]
["b'20'" "b'RL'" "b'68'" ..., "b'WD'" "b'Normal'" "b'142125'"]
["b'20'" "b'RL'" "b'75'" ..., "b'WD'" "b'Normal'" "b'147500'"]]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-38-c154945cd6f1> in <module>()
      5 print(dataset)
      6 original_data = dataset
----> 7 test = float(dataset[0,0])
      8 print(test)

ValueError: could not convert string to float: "b'60'"
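A reduced reproduction, in case it helps narrow this down. It is only a sketch under assumptions: the CSV content below is made up (it is not the real train.csv), and the second half assumes a NumPy version whose loadtxt accepts text-mode input. It suggests that the 'b' prefix comes from calling str() on bytes read in 'rb' mode, and that reading the data as text avoids it.

import io
import numpy as np

# In Python 3, str() on a bytes object returns its repr, prefix and quotes included.
raw = b"60"
as_text = str(raw)
print(as_text)  # prints b'60'

# That repr is not a valid number, which matches the error in the traceback.
try:
    float(as_text)
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Decoding the bytes first gives plain text that float() accepts.
value = float(raw.decode("utf-8"))

# Hypothetical stand-in for the file, read as text instead of bytes.
sample = "id,a,b,c\n1,60,RL,65\n2,20,RL,80\n"
data = np.loadtxt(io.StringIO(sample), delimiter=",", dtype=str)
dataset = data[1:, 1:]
test = float(dataset[0, 0])
print(test)

With a real file, the equivalent change would presumably be opening it with open(filename, 'r') rather than 'rb', but I have not confirmed that this behaves the same on every NumPy version.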