I have read all kinds of tutorials, but I somehow can't apply them to my task.
My objective is to extract the data from a text file and later plot some histograms based on it. However, I'm new to Python and I'm stuck on the basics of slicing an array. The text file contains a raw dataset: each item is in its own row, and each row has multiple attributes separated by commas.
I'm trying to split the dataset into two arrays: the first attribute of each row (the cultivar) goes into one array, and the remaining attributes (the properties of that cultivar) go into a second array. The raw data has 178 rows and 14 columns.
I successfully extracted the first array with the following code:
readFile = open('wine.data', 'r')
cultivar = np.loadtxt(readFile, delimiter=',', usecols=[0], unpack=True)
But when I try to make the second array, I run into problems.
readFile = open('wine.data','r')
attributes = np.loadtxt(readFile, delimiter=',', usecols=[-13], unpack=True)
Whatever I pass to the usecols argument, it is either wrong by syntax (as the code above is) or I get a distorted array, like this:
[[ 1.00000000e+00 1.00000000e+00 1.00000000e+00 ..., 3.00000000e+00 3.00000000e+00 3.00000000e+00]
 [ 1.42300000e+01 1.32000000e+01 1.31600000e+01 ..., 1.32700000e+01 1.31700000e+01 1.41300000e+01]
 [ 1.71000000e+00 1.78000000e+00 2.36000000e+00 ..., 4.28000000e+00 2.59000000e+00 4.10000000e+00]
 ...,
 [ 1.04000000e+00 1.05000000e+00 1.03000000e+00 ..., 5.90000000e-01 6.00000000e-01 6.10000000e-01]
 [ 3.92000000e+00 3.40000000e+00 3.17000000e+00 ..., 1.56000000e+00 1.62000000e+00 1.60000000e+00]
 [ 1.06500000e+03 1.05000000e+03 1.18500000e+03 ..., 8.35000000e+02 8.40000000e+02 5.60000000e+02]]
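A possible reading of the output above: its first row holds only cultivar labels (1 ... 3), which suggests each printed row is one column of the file. That is what `unpack=True` does in `np.loadtxt`: it returns the transposed array. The following is a minimal sketch of that effect, assuming Python 3 and a made-up three-row, four-column sample (`sample`, `by_row` and `by_col` are illustrative names, not part of the original code):

```python
import io
import numpy as np

# Made-up sample in the same comma-separated layout (4 columns instead of 14).
sample = "1,14.23,1.71,2.43\n1,13.20,1.78,2.14\n3,14.13,4.10,2.74\n"

# Default behaviour: one array row per line of the file.
by_row = np.loadtxt(io.StringIO(sample), delimiter=',')

# unpack=True returns the transpose: one array row per file column.
by_col = np.loadtxt(io.StringIO(sample), delimiter=',', unpack=True)

print(by_row.shape)  # (3, 4)
print(by_col.shape)  # (4, 3)
print(by_col[0])     # the first file column, i.e. the labels
```

If the "distorted" array was produced with `unpack=True` and all columns selected, dropping `unpack=True` (or transposing the result with `.T`) should restore the row-per-item layout.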
The whole Python code is here:
import numpy as np
import matplotlib.pyplot as plt
import urllib
readFile = open('wine.data', 'r')
first = np.loadtxt(readFile, delimiter=',', usecols=[0], unpack=True)
readFile = open('wine.data','r')
rest = np.loadtxt(readFile, delimiter=',', usecols=[-13], unpack=True)
readFile.close()
print rest
Raw data: http://pastebin.com/YqV1AZ3r
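For reference, the split described above (first column into one array, the remaining columns into another) can be sketched in two ways: load everything once and slice, or pass an explicit list of column indices to `usecols`. This is only a sketch under assumptions: it uses Python 3, a made-up three-row, four-column sample instead of `wine.data`, and the hypothetical names `sample`, `n_cols`, `data` and `attributes_b`.

```python
import io
import numpy as np

# Made-up sample standing in for wine.data (4 columns instead of 14).
sample = "1,14.23,1.71,2.43\n1,13.20,1.78,2.14\n3,14.13,4.10,2.74\n"
n_cols = 4  # assumption: this would be 14 for the real file

# Option A: read the whole table once, then slice it.
data = np.loadtxt(io.StringIO(sample), delimiter=',')
cultivar = data[:, 0]     # first column of every row
attributes = data[:, 1:]  # every column except the first

# Option B: let loadtxt select the columns via usecols.
attributes_b = np.loadtxt(io.StringIO(sample), delimiter=',',
                          usecols=list(range(1, n_cols)))

print(cultivar.shape)    # (3,)
print(attributes.shape)  # (3, 3)
```

Option A reads the file only once, and both arrays keep one row per item, which is the layout a per-attribute histogram such as `plt.hist(attributes[:, 0])` would expect.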