Python/Numpy array dimension confusion

Question

Suppose batch_size = 64. I created a batch: batch = np.zeros((self._batch_size,), dtype=np.int64). Suppose I have a batch of 64 chars, batch = ['o', 'w', ....'s'], where each char such as 'o' is represented as a 1-hot vector [0,0, .... 0] of size 27. Is there any way for batch to keep the shape (batch_size,) instead of batch_size x vocabulary_size? The code is as follows:

batch = np.zeros((self._batch_size,), dtype=np.int64)  
temp1 = list()
for b in range(self._batch_size):
  temp = np.zeros(shape=(vocabulary_size), dtype=np.int64)
  temp[char2id(self._text[self._cursor[b]])] = 1.0
  temp1.append(temp)
  self._cursor[b] = (self._cursor[b] + 1) % self._text_size
batch = np.asarray(temp1)
return batch

This returns batch with shape batch_size x vocabulary_size.

batch = np.zeros((self._batch_size,), dtype=np.int64)  
for b in range(self._batch_size):
  batch[b, char2id(self._text[self._cursor[b]])] = 1.0
  self._cursor[b] = (self._cursor[b] + 1) % self._text_size
return batch

This code raises a "too many indices" error.
Is there any way of specifying the array shape as [batch_size :, None]?

How do you expect it to represent 27 values (0 and 1) for 64 'batches'? What do you intend to do with the array once it's created?
– hpaulj, Commented Sep 10, 2016 at 19:07

hpaulj · Accepted Answer · 2016-09-10 15:56:46Z


In the 1st block, initializing batch to zeros does nothing for you, because batch is replaced by asarray(temp1) later. (Note my correction.) temp1 is a list of 1d arrays (temp), so asarray produces a 2d array.
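A minimal sketch of why the list of 1d arrays turns into a 2d array. The sizes 4 and 5 are made up to keep the output small (the question uses 64 and 27), and the index expression is only a stand-in for char2id(...):

```python
import numpy as np

batch_size, vocabulary_size = 4, 5  # illustrative sizes, not the asker's 64 and 27

rows = []
for b in range(batch_size):
    row = np.zeros(vocabulary_size, dtype=np.int64)  # one 1d one-hot vector
    row[b % vocabulary_size] = 1                     # stand-in for char2id(...)
    rows.append(row)

# A list of equal-length 1d arrays is stacked row by row.
stacked = np.asarray(rows)
print(stacked.shape)  # (4, 5)
```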

In the 2nd block, if you start with batch = np.zeros((batch_size, vocab_size)) you avoid the "too many indices" error, because batch[b, i] needs a 2d array.
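A small sketch of both cases, assuming made-up sizes and a hypothetical list of char ids in place of char2id(self._text[...]):

```python
import numpy as np

batch_size, vocabulary_size = 4, 5  # illustrative sizes
ids = [2, 0, 4, 1]                  # hypothetical char ids, one per batch entry

# 1d array: two indices are one too many.
one_d = np.zeros((batch_size,), dtype=np.int64)
index_error_raised = False
try:
    one_d[0, ids[0]] = 1
except IndexError as err:
    index_error_raised = True
    print("IndexError:", err)

# 2d array: the same assignment works.
batch = np.zeros((batch_size, vocabulary_size), dtype=np.int64)
for b in range(batch_size):
    batch[b, ids[b]] = 1
print(batch.shape)  # (4, 5)
```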

You can't use None in place of a real integer in a shape. None does not act as a broadcasting newaxis there, and arrays don't grow when you assign to a new, larger index. Even when None is used in indexing, as in np.zeros((batchsize,))[:,None], the result is 2d, with shape (batchsize,1).
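These three points can be checked directly. A sketch with an illustrative batch_size of 4:

```python
import numpy as np

batch_size = 4  # illustrative size
flat = np.zeros((batch_size,))

# None in an index adds a length-1 axis; it does not leave a dimension open.
column = flat[:, None]  # same as flat[:, np.newaxis]
print(flat.shape, column.shape)  # (4,) (4, 1)

# None is not accepted as a dimension when creating an array.
shape_rejected = False
try:
    np.zeros((batch_size, None))
except (TypeError, ValueError) as err:
    shape_rejected = True
    print("rejected:", err)

# Assigning past the end does not grow the array.
grow_rejected = False
try:
    flat[batch_size] = 1.0
except IndexError as err:
    grow_rejected = True
    print("IndexError:", err)
```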

Why do you want a 1d array? It's possible to construct a 1d array of dtype object that contains arrays (or any other object), but for many purposes it is just a glorified list.
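For reference, a sketch of what such an object array looks like, with made-up sizes. It has shape (batch_size,), but each element is a separate array, so numeric operations on the whole batch no longer vectorize:

```python
import numpy as np

batch_size, vocabulary_size = 4, 5  # illustrative sizes

# 1d array whose slots hold arbitrary Python objects (here, 1d arrays).
obj = np.empty(batch_size, dtype=object)
for b in range(batch_size):
    row = np.zeros(vocabulary_size, dtype=np.int64)
    row[b] = 1       # arbitrary one-hot position for the demo
    obj[b] = row

print(obj.shape)     # (4,)
print(obj[1])        # [0 1 0 0 0]

# Getting a regular numeric array back means stacking the elements again.
restored = np.stack(list(obj))
print(restored.shape)  # (4, 5)
```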

edited Sep 10, 2016 at 15:56

answered Sep 10, 2016 at 12:18



1 Comment

SupposeXYZ Over a year ago

I needed it for the embedded_lookup() function in TensorFlow. The input should have shape (batch_size,). How do I construct the 1d array of dtype object that you were about to mention?
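If the lookup expects integer ids rather than one-hot rows (an assumption; the comment does not show the call), a plain 1d integer array already has shape (batch_size,) and no object array is needed. A sketch using NumPy indexing as a stand-in for the lookup: the char2id below is a hypothetical version of the asker's function, and embeddings is made-up data.

```python
import numpy as np

def char2id(ch):
    # Hypothetical mapping: 'a'..'z' -> 1..26, anything else -> 0.
    return ord(ch) - ord('a') + 1 if 'a' <= ch <= 'z' else 0

text = "owls"  # illustrative batch of 4 chars
ids = np.array([char2id(c) for c in text], dtype=np.int64)
print(ids.shape)  # (4,)

# Made-up embedding table: one row per vocabulary entry.
vocabulary_size, embedding_size = 27, 3
embeddings = np.arange(vocabulary_size * embedding_size, dtype=np.float64)
embeddings = embeddings.reshape(vocabulary_size, embedding_size)

# Integer-array indexing picks one row per id.
looked_up = embeddings[ids]
print(looked_up.shape)  # (4, 3)

# One-hot rows and integer ids carry the same information.
one_hot = np.eye(vocabulary_size, dtype=np.int64)[ids]
recovered = one_hot.argmax(axis=1)
```

Converting existing one-hot rows back to ids with argmax(axis=1), as in the last two lines, is one way to get a (batch_size,) array from the 2d batch.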
