I'm pulling my hair out over this. I'm trying to change the elements of a numpy array, to no avail:

import numpy as np
c = np.empty((1), dtype='i4, S, S, S, S, S, S, S, S, S')
print(c)
c[0][1]="hello"
c[0][2]='hello'
c[0][3]=b'hello'
print(c)

Output:

[(0, b'', b'', b'', b'', b'', b'', b'', b'', b'')]
[(0, b'', b'', b'', b'', b'', b'', b'', b'', b'')]
  • What are you doing with that dtype? Commented Dec 16, 2017 at 18:04
  • The array should contain one 10-tuple per row, with 1 row in this case. I picked this example from a numpy page: docs.scipy.org/doc/numpy-1.10.1/user/basics.rec.html - x = np.zeros(3, dtype='3int8, float32, (2,3)float64') Commented Dec 16, 2017 at 18:50

2 Answers

Strings are fixed length in numpy. Whatever doesn't fit is silently discarded:

np.array('hello', dtype='S4')
# array(b'hell', dtype='|S4')

dtype('S') appears to be equivalent to dtype('S0'):

np.dtype('S').itemsize
# 0

so assigning to that gets your strings truncated at position 0.
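The truncation described above can be detected before it silently drops data. A minimal sketch, assuming ASCII-only text; `fits` is a hypothetical helper, not part of numpy:

```python
import numpy as np

def fits(value: str, dtype_str: str) -> bool:
    """Hypothetical helper: True if `value` survives storage in `dtype_str`.

    Assumes ASCII-only text; non-ASCII input would raise on encoding.
    """
    # Store the value in a fixed-width bytes dtype and read it back.
    stored = np.array(value, dtype=dtype_str).item()
    # If numpy had to truncate, the round trip no longer matches.
    return stored == value.encode('ascii')

print(fits('hello', 'S5'))
print(fits('hello', 'S4'))
```

With a five-character string, the check passes for 'S5' and fails for 'S4', which mirrors the 'hell' result shown earlier.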

If you know the maximum length to expect in advance:

c = np.empty((1,), dtype=', '.join(['i4'] + 9*['S5']))
for i in range(1, 10):
    c[0][i] = 'hello'

c
# array([ (-1710610776, b'hello', b'hello', b'hello', b'hello', b'hello', b'hello', b'hello', b'hello', b'hello')],
#   dtype=[('f0', '<i4'), ('f1', 'S5'), ('f2', 'S5'), ('f3', 'S5'), ('f4', 'S5'), ('f5', 'S5'), ('f6', 'S5'), ('f7', 'S5'), ('f8', 'S5'), ('f9', 'S5')])
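If the strings are known up front, the field width can be measured rather than hard-coded. This is a rough sketch under a few assumptions: the text is ASCII-only, the `values` list is made up for illustration, and `np.zeros` is swapped in for `np.empty` so the integer field starts at 0 instead of holding uninitialized memory:

```python
import numpy as np

# Hypothetical sample data; replace with the strings you expect to store.
values = ['hello', 'hi', 'greetings']

# Width of the longest string in bytes (ASCII assumed).
width = max(len(v.encode('ascii')) for v in values)

# One int field followed by one fixed-width bytes field per string.
fields = [('f0', 'i4')] + [(f'f{i}', f'S{width}') for i in range(1, len(values) + 1)]
c = np.zeros(1, dtype=np.dtype(fields))

# Select the field first, then the row, and assign.
for i, v in enumerate(values, start=1):
    c[f'f{i}'][0] = v

print(c)
```

Every string field gets the same width here, which wastes a few bytes on the short entries but keeps the dtype simple.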

If you need flexible length you can use object dtype:

c = np.empty((1,), dtype=', '.join(['i4'] + 9*['O']))
for i in range(1, 10):
    c[0][i] = 'hello world'[:i]

c
# array([ (0, 'h', 'he', 'hel', 'hell', 'hello', 'hello ', 'hello w', 'hello wo', 'hello wor')],
#   dtype=[('f0', '<i4'), ('f1', 'O'), ('f2', 'O'), ('f3', 'O'), ('f4', 'O'), ('f5', 'O'), ('f6', 'O'), ('f7', 'O'), ('f8', 'O'), ('f9', 'O')])
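One consequence of the object approach is that the field holds a reference to an ordinary Python object, so a later, longer value is not cut off and a `str` stays a `str` rather than becoming bytes. A small sketch to illustrate; the field names `id` and `label` are chosen here for the example only:

```python
import numpy as np

# Field names are illustrative; 'O' marks an object field.
c = np.zeros(1, dtype=[('id', 'i4'), ('label', 'O')])

c['label'][0] = 'short'
# Overwriting with a longer string works because only a reference is stored.
c['label'][0] = 'a considerably longer replacement string'

stored = c['label'][0]
print(type(stored).__name__, len(stored))
```

The trade-off is that object fields lose the compact, fixed-size memory layout that the 'S' fields provide.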

If you want a fixed length that is just large enough, have all the records at hand, and are not too picky about the exact types, you can have numpy work it out for you:

lot = [(5,) + tuple('hello world 2 3 4 5 6 7 8 9'.split()), (8,) + tuple('0 1 2 3 short loooooooong 6 7 8 9'.split())]
lot
# [(5, 'hello', 'world', '2', '3', '4', '5', '6', '7', '8', '9'), (8, '0', '1', '2', '3', 'short', 'loooooooong', '6', '7', '8', '9')]
c = np.rec.fromrecords(lot)
c
# rec.array([(5, 'hello', 'world', '2', '3', '4', '5', '6', '7', '8', '9'),
#       (8, '0', '1', '2', '3', 'short', 'loooooooong', '6', '7', '8', '9')], 
#      dtype=[('f0', '<i8'), ('f1', '<U5'), ('f2', '<U5'), ('f3', '<U1'), ('f4', '<U1'), ('f5', '<U5'), ('f6', '<U11'), ('f7', '<U1'), ('f8', '<U1'), ('f9', '<U1'), ('f10', '<U1')])
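Note that the inferred string fields above are Unicode ('U') rather than bytes ('S'), which is part of not being picky about the exact types. As a hedged sketch of the same idea on a smaller, made-up data set, `np.rec.fromrecords` also accepts a `names` argument so the fields get readable names instead of 'f0', 'f1', and so on:

```python
import numpy as np

# Made-up rows for illustration: one int and two strings each.
rows = [(5, 'hello', 'world'), (8, 'short', 'loooooooong')]

# Field names are assumptions for this example; widths are inferred per column.
rec = np.rec.fromrecords(rows, names=['id', 'first', 'second'])

# Each string column is sized to its longest entry.
print(rec.dtype['first'], rec.dtype['second'])
print(rec['second'][1])
```

The width of each column is taken from its longest entry, so 'first' and 'second' end up with different sizes.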

You are using strings of length 0. You have to make the fields large enough for your text:

import numpy as np
c = np.empty((1), dtype='i4, S5, S5, S5, S5, S5, S5, S5, S5, S5')
print(c)
c[0][1]="hello"
c[0][2]='hello'
c[0][3]=b'hello'
print(c)
