
Passing a NumPy array typed as np.float64_t works fine (see below), but I can't pass string arrays.

This is what works:

# cython_testing.pyx
import numpy as np
cimport numpy as np

ctypedef np.float64_t dtype_t 

cdef func1(np.ndarray[dtype_t, ndim=2] A):
    print(A)

def testing():
    chunk = np.array([[94., 3.], [44., 4.]], dtype=np.float64)

    func1(chunk)

But I can't make this work: I can't find the matching 'type identifiers' for numpy string dtypes.

# cython_testing.pyx
import numpy as np
cimport numpy as np

ctypedef np.string_t dtype_str_t 

cdef func1(np.ndarray[dtype_str_t, ndim=2] A):
    print(A)

def testing():
    chunk = np.array([['huh', 'yea'], ['swell', 'ray']], dtype=np.string_)

    func1(chunk)

The compilation error is:

Error compiling Cython file:
------------------------------------------------------------
ctypedef np.string_t dtype_str_t 
    ^
------------------------------------------------------------

cython_testing.pyx:9:9: 'string_t' is not a type identifier

UPDATE

Looking through numpy.pxd, I see the following ctypedef statements. Maybe that means I can type the array as uint8_t and treat the strings as raw bytes, as long as I can do some casting?

ctypedef unsigned char      npy_uint8
ctypedef npy_uint8      uint8_t

Just have to see how expensive that casting will be.
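
A minimal sketch of that idea in plain NumPy, assuming the strings are fixed-width byte strings (dtype 'S5' here, chosen only for illustration). It reinterprets the existing buffer as unsigned bytes with `view`, so the "cast" itself does not copy any data:

```python
import numpy as np

# Fixed-width byte strings: each element occupies exactly 5 bytes (dtype 'S5').
chunk = np.array([[b'huh', b'yea'], [b'swell', b'ray']], dtype='S5')

# Reinterpret the same buffer as unsigned bytes, adding one trailing axis
# that indexes the bytes inside each string. No data is copied.
as_bytes = chunk.view(np.uint8).reshape(chunk.shape + (chunk.itemsize,))

print(as_bytes.shape)                     # (rows, cols, bytes per string)
print(np.shares_memory(chunk, as_bytes))  # whether both arrays use one buffer
print(as_bytes[1, 0].tobytes())           # raw bytes behind chunk[1, 0]
```

The resulting 3-D array could then be declared in Cython as np.ndarray[np.uint8_t, ndim=3]. Strings shorter than the item size are padded with NUL bytes, so the receiving code has to handle that padding itself.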

2 Answers


With Cython 0.20.1 it works using cdef np.ndarray, without specifying the data type and the number of dimensions:

import numpy as np
cimport numpy as np

cdef func1(np.ndarray A):
    print(A)

def testing():
    chunk = np.array([['huh','yea'], ['swell','ray']])
    func1(chunk)
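
Because an untyped np.ndarray argument accepts any dtype, nothing checks at compile time that the array really holds strings. A rough sketch of a runtime check follows, written in plain Python so it can be tried without compiling; `check_string_array` is a hypothetical helper name, not part of NumPy or Cython:

```python
import numpy as np

def check_string_array(A):
    """Hypothetical helper: report which kind of string storage A uses."""
    # dtype.kind is 'S' for byte strings, 'U' for unicode, 'O' for objects.
    kinds = {'S': 'bytes', 'U': 'unicode', 'O': 'object'}
    if A.dtype.kind not in kinds:
        raise TypeError("expected a string-like array, got dtype %s" % A.dtype)
    return kinds[A.dtype.kind]

chunk = np.array([['huh', 'yea'], ['swell', 'ray']])
print(check_string_array(chunk))
print(check_string_array(chunk.astype(object)))
```

The same check could sit at the top of func1, since the body of a cdef function can still inspect A.dtype at runtime.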

2 Comments

@TedPetrou I am trying to build an example where dtype=object speeds things up so I can update the answer, but so far I have found it equivalent to not specifying a dtype. How did you measure the 100x speedup?
Looks like I massively misspoke in my previous comment. It looks like I am getting 5x improvement by changing to object. Use this array. a = np.array(['some', 'strings', 'in', 'an', 'array'] * 10 ** 5)
7

Looks like you're out of luck.

http://cython.readthedocs.org/en/latest/src/tutorial/numpy.html

Some data types are not yet supported, like boolean arrays and string arrays.


This answer is no longer valid as shown by Saullo Castro's answer, but I'll leave it for historical purposes.

4 Comments

Thanks. I upvoted your answer. I still hope there is a workaround, perhaps using a NumPy structured array [docs.scipy.org/doc/numpy/user/…. But I am still looking for how to pass one of those too.
At least for my purposes, using cProfile, it looks like you can still pass Numpy arrays w/o typing, in Cython. But you do not get the Cython optimizations described in your readthedocs.org reference.
Being able to use them slowly is still better than not being able to use them at all, though, right?
Content of this link has been modified. The quote doesn't exist.
