I use OpenCL.NET, a C# wrapper for OpenCL.

According to GPU-Z, my GPU is an AMD Radeon Barcelo, with these OpenCL details:

  • Platform Version: OpenCL 2.1 AMD-APP (3570.0)
  • Device Name: gfx90c
  • Device Profile: FULL_PROFILE
  • Device Version: OpenCL 2.0 AMD-APP (3570.0)
  • Device Version OpenCL C: OpenCL C 2.0
  • Device Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p cl_amd_planar_yuv

Part of the code:

// probably useless
#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
#pragma OPENCL EXTENSION cl_khr_global_int32_extended_atomics : enable
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

void vector_is_zero_partial(
    uint row,
    uint row_to,
    __global const double *x,
    double tolerance,
    __global atomic_int *is_zero)
{
    for (; row < row_to; ++row)
    {
        if (fabs(x[row]) > tolerance)
        {
            atomic_store(is_zero, 0);
            break;
        }
        if (!atomic_load(is_zero)) break;
    }
}

The error:

C:\Users\CHAMEL~1\AppData\Local\Temp\\OCL8036T0.cl:264:4: error: implicit declaration of function 'atomic_store' is invalid in C99
                        atomic_store(is_zero, 0);
                        ^
C:\Users\CHAMEL~1\AppData\Local\Temp\\OCL8036T0.cl:267:8: error: implicit declaration of function 'atomic_load' is invalid in C99
                if (!atomic_load(is_zero)) break;
                     ^
2 errors generated.

error: Clang front-end compilation failed!
Frontend phase failed compilation.
Error: Compiling CL to IR
 

So the atomic extensions are listed and the device reports OpenCL 2.0, but atomic_store and atomic_load are not declared.

Did I do something wrong here?

  • Have you checked the documentation here: registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/… Also, what atomic load/store do you expect from the host program to the kernel? Commented Mar 10 at 20:00
  • @IlianZapryanov Yes, I already did. It says "// Requires OpenCL C 2.0", but I am not sure whether I am missing something, as English is not my native language. Can C atomic_load(volatile A *object) not handle a pointer to a __global variable? If not, why have atomics at all? Commented Mar 10 at 20:14

1 Answer

atomic_load requires the device features __opencl_c_atomic_order_seq_cst and __opencl_c_atomic_scope_device. As you are on the AMD-APP platform, it is possible that these are not available. You could check the clinfo output to be sure.

It is also worth checking the build options: OpenCL C 2.0 built-ins are only declared when the program is compiled as OpenCL C 2.0. Without a -cl-std=CL2.0 build option, the compiler defaults to the highest OpenCL C 1.x version the device supports.

Two options to consider:

  1. Try the Mesa drivers with rusticl. As your GPU is GCN, this should give you OpenCL 3.0 support. As you are probably on Windows, you would likely need WSL to get it working.
  2. Change your function to use the legacy OpenCL atomics. Something like this should work:
void vector_is_zero_partial(
    uint row,
    uint row_to,
    __global const double *x,
    double tolerance,
    volatile __global int *is_zero) // legacy atomics are declared for plain int pointers, not atomic_int
{
    for (; row < row_to; ++row)
    {
        if (fabs(x[row]) > tolerance)
        {
            atomic_xchg(is_zero, 0); // stores 0 to *is_zero and returns the old value
            break;
        }
        /* atomic_max returns the old value of *is_zero and stores
           max(*is_zero, 0) back. If *is_zero is 1 it stays 1, and if it
           has already been set to 0 it stays 0, so for a 0/1 flag this
           acts as an atomic read. */
        if (!atomic_max(is_zero, 0)) break;
    }
}
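
To try the early-exit flag logic without a GPU, here is a minimal host-side sketch in plain C11 using <stdatomic.h>. It is only an analogue of the kernel helper, not OpenCL code: atomic_exchange stands in for atomic_xchg, and atomic_fetch_or with 0 stands in for the atomic_max(is_zero, 0) read trick (C11 has no fetch-max). The vector_is_zero wrapper and its two-way split are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Host-side analogue of the kernel helper: clears *is_zero as soon as an
   element of x in [row, row_to) has a magnitude above `tolerance`. */
static void vector_is_zero_partial(size_t row, size_t row_to,
                                   const double *x, double tolerance,
                                   atomic_int *is_zero)
{
    assert(row <= row_to);
    for (; row < row_to; ++row)
    {
        /* Same test as fabs(x[row]) > tolerance, written without <math.h>. */
        if (x[row] > tolerance || x[row] < -tolerance)
        {
            /* Stand-in for atomic_xchg(is_zero, 0): store 0, ignore the old value. */
            atomic_exchange(is_zero, 0);
            break;
        }
        /* Stand-in for atomic_max(is_zero, 0): OR-ing with 0 returns the
           current value and leaves it unchanged. Stop early if the flag
           has already been cleared. */
        if (!atomic_fetch_or(is_zero, 0)) break;
    }
}

/* Hypothetical wrapper: checks the vector in two chunks, as two work-items
   might, and returns 1 if every element is within tolerance, else 0. */
static int vector_is_zero(const double *x, size_t n, double tolerance)
{
    atomic_int is_zero;
    atomic_init(&is_zero, 1);
    size_t half = n / 2;
    vector_is_zero_partial(0, half, x, tolerance, &is_zero);
    vector_is_zero_partial(half, n, x, tolerance, &is_zero);
    return atomic_load(&is_zero);
}
```

The chunks run sequentially here, so this only exercises the flag logic. It says nothing about memory ordering or performance on the device.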