332

Does anybody know how to extract a column from a multi-dimensional array in Python?

20 Answers

292
>>> import numpy as np
>>> A = np.array([[1,2,3,4],[5,6,7,8]])

>>> A
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

>>> A[:,2] # returns the third column
array([3, 7])

See also: "numpy.arange" and "reshape" for creating an array of a given shape.

Example: (Allocating an array shaped as a 3x4 matrix)

nrows = 3
ncols = 4
my_array = np.arange(nrows*ncols, dtype='double')
my_array = my_array.reshape(nrows, ncols)
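
To connect the two halves of this answer, here is a minimal sketch that builds the 3x4 array with arange and reshape and then pulls columns out of it with the same [:, i] indexing. The printed values follow from arange counting up from 0.

```python
import numpy as np

nrows, ncols = 3, 4
# Values 0.0 .. 11.0 laid out row by row in a 3x4 array
my_array = np.arange(nrows * ncols, dtype='double').reshape(nrows, ncols)

print(my_array[:, 1].tolist())   # [1.0, 5.0, 9.0]   second column
print(my_array[:, -1].tolist())  # [3.0, 7.0, 11.0]  last column
```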

8 Comments

Took me 2 hours to discover [:,2] guess this feature not in official literature on slicing?
What does the comma mean?
@Phil [row, col]. the comma separates.
How can this answer have so many upvotes? OP never said it's a numpy array
To extract two columns: A[:,[1,3]], for example, extracts the second and fourth columns.
263

Could it be that you're using a NumPy array? Python has the array module, but that does not support multi-dimensional arrays. Normal Python lists are single-dimensional too.

However, if you have a simple two-dimensional list like this:

A = [[1,2,3,4],
     [5,6,7,8]]

then you can extract a column like this:

def column(matrix, i):
    return [row[i] for row in matrix]

Extracting the second column (index 1):

>>> column(A, 1)
[2, 6]

Or alternatively, simply:

>>> [row[1] for row in A]
[2, 6]
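
The comprehension assumes every row is long enough to have index i; a shorter row raises IndexError. Below is a minimal sketch of a tolerant variant. The name column_or_default and its default parameter are illustrative assumptions, not part of the answer.

```python
def column_or_default(matrix, i, default=None):
    """Return column i, using `default` for rows too short to have it."""
    return [row[i] if -len(row) <= i < len(row) else default
            for row in matrix]

ragged = [[1, 2, 3], [4], [5, 6]]
print(column_or_default(ragged, 1))     # [2, None, 6]
print(column_or_default(ragged, 2, 0))  # [3, 0, 0]
```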

1 Comment

This should be the top answer. It answers the asked question while pointing to an alternative in NumPy.
110

If you have an array like

a = [[1, 2], [2, 3], [3, 4]]

Then you can extract the first column like this:

[row[0] for row in a]

So the result looks like this:

[1, 2, 3]

54

Check it out!

a = [[1, 2], [2, 3], [3, 4]]
a2 = list(zip(*a))  # list() is needed in Python 3, where zip returns an iterator
a2[0]

This gives the same result as a list comprehension over the rows, but is arguably neater. zip does the work, but it takes each row as a separate argument; the *a syntax unpacks the nested list into those separate arguments.

7 Comments

What is above? Remember that the answers are not always sorted the same way.
This is clean, but might not be the most efficient if performance is a concern, since it is transposing the entire matrix.
FYI, this works in python 2, but in python 3 you'll get generator object, which ofcourse isn't subscriptable.
@RishabhAgrahari Anyway to do this zip in Py3?
@WarpDriveEnterprises yup, you'll have to convert the generator object to list and then do the subscripting. example: a2 = zip(*a); a2 = list(a2); a2[0]
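
On the efficiency point raised in the comments: if only one column is needed, the zip iterator can be consumed just far enough to reach it instead of being turned into a full list. This is a sketch for Python 3; the helper name lazy_column is hypothetical, and an out-of-range index raises StopIteration rather than IndexError.

```python
from itertools import islice

a = [[1, 2], [2, 3], [3, 4]]

def lazy_column(rows, i):
    """Return column i as a tuple, stopping once zip has produced it."""
    # islice skips the first i columns; next() takes the one we want
    return next(islice(zip(*rows), i, None))

print(lazy_column(a, 0))  # (1, 2, 3)
print(lazy_column(a, 1))  # (2, 3, 4)
```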
18
>>> import numpy as np
>>> x = np.arange(20).reshape(4,5)
>>> x
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

If you want the second column, you can use

>>> x[:, 1]
array([ 1,  6, 11, 16])

3 Comments

This is using numpy?
I can't find any documentation for arange() in Python3 outside of numpy. Anyone?
i think it is tensorflow, @KevinWMatthews
17

If you have a two-dimensional array in Python (not numpy), you can extract all the columns like so,

data = [
['a', 1, 2], 
['b', 3, 4], 
['c', 5, 6]
]

columns = list(zip(*data))

print("column[0] = {}".format(columns[0]))
print("column[1] = {}".format(columns[1]))
print("column[2] = {}".format(columns[2]))

Executing this code will yield,

>>> print("column[0] = {}".format(columns[0]))
column[0] = ('a', 'b', 'c')

>>> print("column[1] = {}".format(columns[1]))
column[1] = (1, 3, 5)

>>> print("column[2] = {}".format(columns[2]))
column[2] = (2, 4, 6)

16
def get_col(arr, col):
    # list() makes this return a list in Python 3, where map is lazy
    return list(map(lambda x: x[col], arr))

a = [[1,2,3,4], [5,6,7,8], [9,10,11,12],[13,14,15,16]]

print(get_col(a, 3))

The map function in Python is another way to go.

16
array = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]

# Python indexes from 0, so the four columns are 0 through 3
col1 = [val[0] for val in array]
col2 = [val[1] for val in array]
col3 = [val[2] for val in array]
col4 = [val[3] for val in array]
print(col1)
print(col2)
print(col3)
print(col4)

Output:
[1, 5, 9, 13]
[2, 6, 10, 14]
[3, 7, 11, 15]
[4, 8, 12, 16]

1 Comment

16 likes, and nobody realize that val[4] is out of range... Python starts the index at 0.
12
[matrix[i][column] for i in range(len(matrix))]

9

The itemgetter function can help too, if you prefer map-style Python to list comprehensions, for a little variety!

# originally tested in Python 2.4; the list() calls make it work in Python 3 too
from operator import itemgetter
def column(matrix,i):
    f = itemgetter(i)
    return list(map(f,matrix))

M = [list(range(x,x+5)) for x in range(10)]
assert column(M,1) == list(range(1,11))

2 Comments

use itertools.imap for large data
The itemgetter approach ran about 50x faster than the list comprehension approach for my use case. Python 2.7.2, use case was lots of iterations on a matrix with a few hundred rows and columns.
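
The speed-up reported in the comment is for Python 2.7.2 and one particular workload, so it is worth measuring on your own data. A minimal timing sketch follows; the 200x200 matrix and the function names are assumptions chosen for illustration, and no particular result is implied.

```python
import timeit
from operator import itemgetter

# Hypothetical test data: 200 rows of 200 integers each
matrix = [list(range(r, r + 200)) for r in range(200)]

def column_comprehension(m, i):
    return [row[i] for row in m]

def column_itemgetter(m, i):
    return list(map(itemgetter(i), m))

# Both approaches must agree before timing them
assert column_comprehension(matrix, 5) == column_itemgetter(matrix, 5)

for fn in (column_comprehension, column_itemgetter):
    seconds = timeit.timeit(lambda: fn(matrix, 5), number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s for 1000 calls")
```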
7

You can use this as well:

import numpy as np

values = np.array([[1,2,3],[4,5,6]])
values[...,0] # first column
# array([1, 4])

Note: This does not work for built-in lists, or for ragged rows of unequal length (e.g. np.array([[1,2,3],[4,5,6,7]]) )
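
A short sketch of how the ellipsis form differs from [:, 0]. On a 2-D array the two are equivalent, but with more dimensions the ellipsis always indexes the last axis. The 3-D array named cube is an illustrative assumption.

```python
import numpy as np

values = np.array([[1, 2, 3], [4, 5, 6]])
# On a 2-D array both spellings select the first column
assert np.array_equal(values[..., 0], values[:, 0])

cube = np.arange(24).reshape(2, 3, 4)
print(cube[..., 0].shape)  # (2, 3): indexes the last axis
print(cube[:, 0].shape)    # (2, 4): indexes the second axis
```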

7

Let's say we have an n x m matrix (n rows and m columns), for example 5 rows and 4 columns:

matrix = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16],[17,18,19,20]]

To extract the columns in Python, we can use a list comprehension like this:

[ [row[i] for row in matrix] for i in range(4) ]

You can replace 4 with the number of columns your matrix has. The result is

[ [1,5,9,13,17],[2,6,10,14,18],[3,7,11,15,19],[4,8,12,16,20] ]

1 Comment

Does this create an entirely new list?
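
To the question in the comment: yes, the nested comprehension builds new lists. A small sketch that checks this by mutating the result and confirming the original rows are unchanged. Note that the elements themselves are not copied, only the list structure.

```python
matrix = [[1, 2, 3, 4], [5, 6, 7, 8]]
columns = [[row[i] for row in matrix] for i in range(4)]

columns[0][0] = 99   # change the new list...
print(columns[0])    # [99, 5]
print(matrix[0])     # [1, 2, 3, 4]  ...the original row is untouched
```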
6

I think you want to extract a column from an array such as the one below:

import numpy as np
A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

Now if you want to get the third column in the format

D = array([[ 3],
           [ 7],
           [11]])

then one way is to first convert the array to a matrix:

B = np.asmatrix(A)
C = B[:,2]
D = np.asarray(C)

Now you can do element-wise calculations, much as you would in Excel.

1 Comment

While this helped me a lot, I think the answer can be much shorter: 1. A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]]) 2. A[:, 1] >> array([ 2, 6, 10])
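
If the goal is specifically a column of shape (3, 1), plain array indexing can also produce it without going through a matrix. A minimal sketch, where the names D1 and D2 are illustrative:

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

D1 = A[:, [2]]                # a list index keeps the column axis
D2 = A[:, 2].reshape(-1, 1)   # or reshape the 1-D column afterwards

print(D1.shape, D2.shape)     # (3, 1) (3, 1)
print(D1.tolist())            # [[3], [7], [11]]
```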
5

One more way using matrices

>>> from numpy import matrix
>>> a = [ [1,2,3],[4,5,6],[7,8,9] ]
>>> matrix(a).transpose()[1].getA()[0]
array([2, 5, 8])
>>> matrix(a).transpose()[0].getA()[0]
array([1, 4, 7])

5

Just use transpose(); then you can get the columns as easily as you get rows:

matrix = np.array(originalMatrix).transpose()
print(matrix[NumberOfColumns])  # NumberOfColumns holds the index of the column you want

4

If you want to grab more than one column, index with a list of column numbers:

a = np.array([[1, 2, 3],[4, 5, 6],[7, 8, 9]])
print(a[:, [1, 2]])
[[2 3]
 [5 6]
 [8 9]]
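
For adjacent columns, a slice such as 1:3 selects the same values as the index list [1, 2], with one difference worth knowing: basic slicing returns a view of the original array, while indexing with a list returns a copy. A short sketch, with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

by_slice = a[:, 1:3]      # basic slicing: a view of a
by_list = a[:, [1, 2]]    # index list: a new array

print(np.array_equal(by_slice, by_list))  # True
print(np.shares_memory(a, by_slice))      # True
print(np.shares_memory(a, by_list))       # False
```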

3

Well a 'bit' late ...

In case performance matters and your data is rectangular, you might also store it in one dimension and access the columns by regular slicing, e.g.:

import operator
from functools import reduce  # reduce is no longer a builtin in Python 3

A = [[1,2,3,4],[5,6,7,8]]     #< assume this 4x2-matrix (4 columns, 2 rows)
B = reduce( operator.add, A ) #< get it one-dimensional

def column1d( matrix, dimX, colIdx ):
  return matrix[colIdx::dimX]

def row1d( matrix, dimX, rowIdx ):
  start = rowIdx * dimX       #< offset of the first element of the row
  return matrix[start:start+dimX]

>>> column1d( B, 4, 1 )
[2, 6]
>>> row1d( B, 4, 1 )
[5, 6, 7, 8]

The neat thing is this is really fast. However, negative indexes don't work here! So you can't access the last column or row by index -1.

If you need negative indexing you can tune the accessor-functions a bit, e.g.

def column1d( matrix, dimX, colIdx ):
  return matrix[colIdx % dimX::dimX]

def row1d( matrix, dimX, dimY, rowIdx ):
  rowIdx = (rowIdx % dimY) * dimX
  return matrix[rowIdx:rowIdx+dimX]

1 Comment

I checked this method and the cost of retrieving column is way cheaper than nested for loops. However, reducing a 2d matrix to 1d is expensive if the matrix is large, say 1000*1000.
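
On the flattening cost mentioned in the comment: reduce with operator.add builds a new, longer list at every step, so the work grows roughly quadratically with the number of rows. A sketch of a single-pass alternative using itertools.chain, offered as one option rather than the answer's own method:

```python
from itertools import chain

A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = list(chain.from_iterable(A))  # visits each element once

print(B)        # [1, 2, 3, 4, 5, 6, 7, 8]
print(B[1::4])  # [2, 6]  second column, with dimX = 4
```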
2

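Besides using zip(*iterable) to transpose a nested list, you can also use the following if the nested lists vary in length. In Python 2:

map(None, *[(1,2,3,), (4,5,), (6,)])

In Python 3, map no longer accepts None as the function; itertools.zip_longest gives the same result:

from itertools import zip_longest
list(zip_longest(*[(1,2,3,), (4,5,), (6,)]))

results in:

[(1, 4, 6), (2, 5, None), (3, None, None)]

The first column is thus:

list(zip_longest(*[(1,2,3,), (4,5,), (6,)]))[0]
#>(1, 4, 6)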

2

I prefer the following approach, with the matrix named matrix_a and the column index in column_number, for example:

import numpy as np
matrix_a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
column_number=2

# you can get the row from transposed matrix - it will be a column:
col=matrix_a.transpose()[column_number]
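
As a quick check, the transposed-row form selects the same values as direct column indexing, and .T is shorthand for transpose(). A minimal sketch; the names col_t and col_s are illustrative:

```python
import numpy as np

matrix_a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
column_number = 2

col_t = matrix_a.T[column_number]   # row of the transposed matrix
col_s = matrix_a[:, column_number]  # direct column indexing

print(col_t.tolist())  # [3, 7, 11]
assert np.array_equal(col_t, col_s)
```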

0

All columns from a matrix into a new list:

n_rows = len(matrix)
n_cols = len(matrix[0])  # assumes every row has the same length
column_list = [ [matrix[row][column] for row in range(n_rows)] for column in range(n_cols) ]
