Vectorized Code on GPU

Question

I am using OpenCL to execute a procedure on different GPUs and CPUs simultaneously to get high-performance results. The Intel OpenCL implementation always reports that the kernel is not vectorized, so it runs across multiple cores but does not use SIMD instructions. My question is: if I rewrite the OpenCL code so that SIMD instructions can be exploited, will that also increase GPU performance?

Jason Newton · Accepted Answer · 2015-05-13 03:36:51Z

Yes, but beware that this is not necessary for good performance on AMD GCN-based APUs/GPUs or on Nvidia Fermi or later GPUs: they execute scalar operations with great utilization. CPUs and Intel's GPUs, however, can benefit greatly from SIMD instructions, which is what the vector operations boil down to.
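
To make the scalar/vector distinction concrete, here is a minimal sketch in plain C rather than OpenCL C, so it can be compiled anywhere. The names `vec4`, `saxpy_scalar`, and `saxpy_vec4` are hypothetical and not from the question; `vec4` only stands in for OpenCL C's built-in `float4`, and each loop iteration plays the role of one work-item.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenCL C's built-in float4 vector type. */
typedef struct { float x, y, z, w; } vec4;

/* Scalar form: each "work-item" (loop iteration) handles one element. */
void saxpy_scalar(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Vectorized form: each "work-item" handles four elements at once.
   n is the element count and is assumed to be a multiple of 4. */
void saxpy_vec4(float a, const vec4 *x, vec4 *y, size_t n) {
    assert(n % 4 == 0);
    for (size_t i = 0; i < n / 4; ++i) {
        y[i].x = a * x[i].x + y[i].x;
        y[i].y = a * x[i].y + y[i].y;
        y[i].z = a * x[i].z + y[i].z;
        y[i].w = a * x[i].w + y[i].w;
    }
}
```

Both functions compute the same result; only the amount of work per item differs. In an actual kernel the four component lines would collapse to a single `float4` expression, and whether that form pays off is the hardware-dependent part described above.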


2 Comments

mmostajab Over a year ago

So, what I understood from your answer is that I should vectorize the code, then profile and see which version is better. Is that true? Or will the vectorized code run faster on the GPU for sure?

Jason Newton Over a year ago

Well, profiling is never a bad idea, but if you look into the hardware architecture you are programming for, you will easily find out whether vectorizing is a good idea or a waste of time. With that quick check, you probably don't need to program anything to know which to do.
