How to compile opencl-kernel-file(.cl) to LLVM IR

Question

This question is about LLVM/Clang. I already know how to compile an OpenCL kernel file (.cl) through the OpenCL API (clBuildProgram() and clGetProgramBuildInfo()).

My question is: how can I compile an OpenCL kernel file (.cl) to LLVM IR with OpenCL 1.2 or higher?
In other words, how can I compile an OpenCL kernel file (.cl) to LLVM IR without libclc?

I have tried various methods to generate LLVM-IR from an OpenCL kernel.

I first followed the Clang user manual (https://clang.llvm.org/docs/UsersManual.html#opencl-features), but that did not work for me.

Secondly, I found a way to use libclc:

clang++ -emit-llvm -c -target nvptx64-nvidia-nvcl -Dcl_clang_storage_class_specifiers -include /usr/local/include/clc/clc.h -fpack-struct=64 -o "$@".bc "$@"
llvm-link "$@".bc /usr/local/lib/clc/nvptx64--nvidiacl.bc -o "$@".linked.bc
llc -mcpu=sm_52 -march=nvptx64 "$@".linked.bc -o "$@".nvptx.s

This method worked fine, but because libclc is built on the OpenCL 1.1 specification, it cannot be used with code that needs OpenCL 1.2 or later, such as code that calls printf.

This method also relies on libclc implementing the OpenCL built-in functions as ordinary functions. You can observe that in the assembly (PTX) of the resulting OpenCL binary: it goes straight to a function call instead of converting it to inline assembly. I am concerned that this will affect GPU performance.

So I am looking for a way to replace compilation using libclc. As a last resort, I'm considering using libclc with the NVPTX backend and AMDGPU backend of LLVM. But if there is already another way, I want to use it. (I expect that the OpenCL front-end I have not found yet exists in clang)

My program's scenario is:

1. There is an OpenCL kernel source file (.cl).
2. Compile the file to LLVM IR.
3. Process the IR at the IR level.
4. Compile the IR to a binary using llc
   - for each GPU target (nvptx, amdgcn, ...).
5. Using the binary, run the host program (.c or .cpp with the OpenCL library) with clCreateProgramWithBinary().
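
The scenario above can be sketched as a small dry-run script. It only prints the commands for one kernel and one target and does not run them. The file names, the print_pipeline helper and the placeholder for the IR-level step are hypothetical, and the sketch leaves out how built-in declarations and libraries are supplied, which is the open part of this question.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for one kernel and one GPU target.
# print_pipeline and the file names are hypothetical, for illustration only.
print_pipeline() {
    src=$1      # OpenCL C source file, e.g. kernel.cl
    triple=$2   # target triple passed to clang, e.g. nvptx64-nvidia-nvcl
    march=$3    # llc architecture, e.g. nvptx64
    mcpu=$4     # llc processor, e.g. sm_52
    base=${src%.cl}
    # Step 2: compile the kernel source to LLVM bitcode.
    echo "clang -x cl -emit-llvm -c -target $triple $src -o $base.bc"
    # Step 3: placeholder for the IR-level processing.
    echo "# IR-level processing of $base.bc goes here"
    # Step 4: lower the IR to target assembly with llc.
    echo "llc -march=$march -mcpu=$mcpu $base.bc -o $base.s"
}

print_pipeline kernel.cl nvptx64-nvidia-nvcl nvptx64 sm_52
```

For an AMD target the same helper could be called with a different triple, architecture and processor, for example print_pipeline kernel.cl amdgcn-amd-amdhsa amdgcn gfx900; those values are assumptions and depend on the actual GPU.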

Currently, when I compile the kernel source file to LLVM IR, I have to include the libclc header (the -include option in the first command above) so that the built-in functions compile, and I have to link the libclc libraries before compiling the IR to a binary.

My environment is as follows:

- GTX 960
  - NVIDIA's OpenCL binaries come out in nvptx (PTX) format.
  - I'm using sm_52 nvptx for my GPU.
- Ubuntu Linux 16.04 LTS
- LLVM/Clang 5.0.0
  - If there is another way, I am willing to change the LLVM version.

Thanks in advance!

So basically your only problem with libclc is that call instructions to built-in functions are not inlined, right?
– Andrew Savonichev, Commented Jan 11, 2018 at 6:17
Yes, that's my first concern. In addition, I want to compile built-in functions from OpenCL 1.2 or later.
– WON, Commented Jan 11, 2018 at 14:38

Josh Milthorpe · Accepted Answer · 2019-07-22 08:29:31Z

Clang 9 (and up) can compile OpenCL kernels written in the OpenCL C language. You can tell Clang to emit LLVM IR by passing the -emit-llvm flag (add -S to output the IR as text rather than as binary bitcode), and specify the version of the OpenCL standard with e.g. -cl-std=CL2.0. Clang currently supports up to OpenCL 2.0.

By default, Clang will not add the standard OpenCL headers, so if your kernel uses any of the OpenCL built-in functions you may see an error like the following:

clang-9 -c -x cl -emit-llvm -S -cl-std=CL2.0 my_kernel.cl -o my_kernel.ll
my_kernel.cl:17:12: error: implicit declaration of function 'get_global_id' is invalid in OpenCL
  int i = get_global_id(0);
          ^
1 error generated.

You can tell Clang to include the standard OpenCL headers by passing the -finclude-default-header flag to the Clang frontend, e.g.

clang-9 -c -x cl -emit-llvm -S -cl-std=CL2.0 -Xclang -finclude-default-header my_kernel.cl -o my_kernel.ll
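
As a minimal sketch of that command in use, the script below writes a small kernel that calls get_global_id and compiles it to textual IR. The file name vadd.cl, the kernel itself and the CLANG variable are hypothetical; set CLANG=clang-9 (or a newer Clang) to match your installation. The script skips compilation if no such binary is found.

```shell
#!/bin/sh
# Hypothetical example kernel; the name vadd.cl is for illustration only.
cat > vadd.cl <<'EOF'
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *c) {
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
EOF

# Assumption: CLANG names a Clang 9 or newer binary (defaults to "clang").
CLANG="${CLANG:-clang}"

if command -v "$CLANG" >/dev/null 2>&1; then
    # Same flags as in the answer: emit textual LLVM IR for OpenCL C 2.0,
    # with the default OpenCL header included by the frontend.
    if "$CLANG" -c -x cl -emit-llvm -S -cl-std=CL2.0 \
        -Xclang -finclude-default-header vadd.cl -o vadd.ll; then
        echo "wrote vadd.ll"
    else
        echo "compilation failed" >&2
    fi
else
    echo "$CLANG not found; only vadd.cl was written" >&2
fi
```

If compilation succeeds, vadd.ll should contain a definition of the vadd kernel and a declaration of get_global_id rather than an implicit-declaration error.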

mogu · Accepted Answer · 2018-01-11 09:19:02Z

> (I expect that the OpenCL front-end I have not found yet exists in clang)

There is an OpenCL front-end in Clang, and you're using it; otherwise you couldn't compile a single line of OpenCL with Clang. The front-end is the part of Clang that recognizes the OpenCL language. There is no OpenCL backend of any kind in LLVM, and it's not the job of LLVM; it's the job of the various OpenCL implementations to provide the proper libraries. Clang+LLVM just recognizes the language and compiles it to bitcode and machine binaries; that's all it does.

> in the assembly(ptx) of result opencl binary, it goes straight to the function call instead of converting it to an inline assembly.

You could try linking to a different library instead of libclc, if you can find one. Perhaps NVIDIA's CUDA ships some bitcode libraries somewhere, but then there are licensing issues. By the way, are you 100% sure you need LLVM IR? Getting OpenCL binaries from the OpenCL runtime, or using SPIR-V, might give you faster binaries and would certainly be less painful to work with. Even if you manage to get nice LLVM IR, you'll need a runtime that actually accepts it (I could be wrong, but I doubt that the proprietary AMD/NVIDIA OpenCL implementations will just accept arbitrary LLVM IR as input).


WON Over a year ago

Thanks for your advice. When I compiled OpenCL code with Clang without including the libclc header, built-in functions and OpenCL types such as get_global_id and float4 failed to compile.

WON Over a year ago

These are the error messages: kernel.cl:3:73: error: unknown type name 'float4'; did you mean 'float'? kernel.cl:5:30: error: implicit declaration of function 'get_global_id' is invalid in OpenCL

WON Over a year ago

Thank you for suggesting a different approach to the problem, but I need IR-level processing. LLVM already supports an AMD GPU backend (llvm.org/docs/AMDGPUUsage.html) and an NVPTX backend (llvm.org/docs/NVPTXUsage.html), so I have no problem generating a GPU executable (usable from OpenCL) from LLVM IR. I just compile to the target (nvptx and amdgcn) using llc (the LLVM static compiler).

mogu Over a year ago

> I just compile to the target(nvptx and amdgcn) using llc( llvm static compiler)
... and then what? Will any OpenCL runtime accept your LLVM IR? Or how do you plan on using it? That's why I said it's easier to get the binaries from the OpenCL runtime; those are guaranteed to be loadable. Also, when I said there isn't any backend in LLVM, that was badly formulated. Of course there is a "backend" in the LLVM sense (a compilation target backend); what I meant is that there are no OpenCL libraries (like libclc) and no OpenCL runtime in Clang/LLVM.

mogu Over a year ago

So when you said "(I expect that the OpenCL front-end I have not found yet exists in clang)": there is no such hidden magic, sorry. There's a Clang front-end (the language) and the LLVM backends (AMD / NVPTX), and that's all.

silverclaw · Accepted Answer · 2018-01-17 13:52:35Z

Clang does not provide a standard OpenCL declaration header (the counterpart of C's stdio.h, for example), which is why you're getting "unknown type name 'float4'" and similar errors.

If you obtain such a header, you can have it included implicitly with "clang -include cl.h -x cl [your filename here]"

One such declaration header can be retrieved from the reference OpenCL compiler implementation at

https://github.com/KhronosGroup/SPIR-Tools/blob/master/headers/opencl_spir.h

By the way, consider using this compiler, which generates SPIR (albeit 1.0) that can be fed into OpenCL drivers as input.


WON Over a year ago

Thanks for the advice, and sorry for the late reply. I'll give it another try.

Andrew Marshall Over a year ago

NVIDIA's OpenCL driver, which is one of the target platforms, does not support SPIR.
