#

rocm

Here are 66 public repositories matching this topic...

numba
juliusbierk
juliusbierk commented Oct 19, 2021

I believe this is undocumented behaviour.

import numba as nb


@nb.njit
def f1():
    for i in nb.prange(1):
        print(type(i))  # >>> int64


@nb.njit(parallel=True)
def f2():
    for i in nb.prange(1):
        print(type(i))  # >>> uint64


f1()
f2()

This caused a nasty bug in my own code that was hard to debug as the problem did not exist without `parallel=Tr
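The excerpt is cut off and does not say what the bug was. As a purely hypothetical illustration of why a signed versus unsigned loop index can matter, the sketch below emulates fixed-width 64-bit arithmetic with the standard-library `ctypes` module. It does not call Numba and does not claim to reproduce Numba's own type-promotion rules; the helper names (`as_int64`, `as_uint64`, `previous_index`) are invented for this example.

```python
import ctypes


def as_int64(value):
    """Interpret an integer as a signed 64-bit value (two's complement)."""
    return ctypes.c_int64(value).value


def as_uint64(value):
    """Interpret an integer as an unsigned 64-bit value (wraps modulo 2**64)."""
    return ctypes.c_uint64(value).value


def previous_index(i, unsigned):
    """Compute i - 1 as a fixed-width 64-bit integer would store it.

    Hypothetical helper: it stands in for index arithmetic such as
    `arr[i - 1]` inside a loop body.
    """
    wrap = as_uint64 if unsigned else as_int64
    return wrap(i - 1)


print(previous_index(0, unsigned=False))  # -1
print(previous_index(0, unsigned=True))   # 18446744073709551615
```

With a signed index, `i - 1` at `i == 0` is `-1`; with an unsigned one of the same width, the same expression wraps to `2**64 - 1`. Code that behaves correctly under one interpretation can therefore misbehave under the other.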

hipSYCL
illuhad
illuhad commented Sep 6, 2021

Bug summary
There is evidence that sub_group::get_group_id() does not return the same value as threadIdx.x / warpSize (assuming a 1D kernel), as would be expected on CUDA. We should check the implementation of this function. It performs bit manipulation magic; presumably the optimization went too far...
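The expected relationship can be written down as a small host-side model. This is a minimal sketch in plain Python, not hipSYCL code: `expected_sub_group_id` encodes the `threadIdx.x / warpSize` rule from the summary, while `shifted_sub_group_id` is a hypothetical bit-shift variant invented here to stand in for "bit manipulation magic". The actual hipSYCL implementation is not shown in the excerpt.

```python
def expected_sub_group_id(thread_idx_x, warp_size):
    """Reference value for a 1D kernel: integer division of the thread index."""
    return thread_idx_x // warp_size


def shifted_sub_group_id(thread_idx_x, warp_size):
    """Hypothetical bit-shift variant; only valid if warp_size is a power of two."""
    shift = warp_size.bit_length() - 1  # floor(log2(warp_size))
    return thread_idx_x >> shift


def find_mismatches(work_group_size, warp_size, candidate):
    """Return the thread indices where candidate disagrees with the reference."""
    return [
        t
        for t in range(work_group_size)
        if candidate(t, warp_size) != expected_sub_group_id(t, warp_size)
    ]


# For power-of-two warp sizes the shift agrees with the division.
print(find_mismatches(256, 32, shifted_sub_group_id))  # []
print(find_mismatches(256, 64, shifted_sub_group_id))  # []
```

A shift is interchangeable with the division only under the power-of-two assumption. Comparing a candidate against the plain division for every thread index in a work group, as `find_mismatches` does, is one way to narrow down where an optimized version diverges.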

To Reproduce
Compare sub_group{}.get_group_id() or `sub

MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

  • Updated Nov 3, 2021
  • C++
trafficVision
