6,727 questions
Advice
0
votes
1
replies
83
views
Can OpenMP be offloaded to Adreno GPU on ARM64 hardware?
GCC documentation is rather clear that the only GPUs supported for OpenMP offload are AMD and Nvidia.
LLVM documentation states Clang supports OpenMP target offloading to several different ...
0
votes
0
answers
83
views
Finding proper BLAS and OpenMP for Intel compilers with CMake
I have a project in both C and Fortran built with CMake. This project requires standard stuff like BLAS, LAPACK and OpenMP. This is the snippet that I use to find the packages and it works great with ...
Advice
2
votes
3
replies
72
views
Does a masked region have to be in an explicit parallel region?
This is an odd OpenMP question triggered by a possible compiler bug (?) we are chasing down. For a long time, we've had code that had:
!$omp masked
...
!$omp end masked
that didn't have an explicit ...
Advice
0
votes
2
replies
97
views
Is calling `omp_pause_resource` allowed after forking (vs. before forking)?
I have a question on what is technically allowed by the OpenMP standard when it comes to relinquishing resources (using omp_pause_resource) with forking.
I understand that in section 12.12, pause_resource.1....
Advice
0
votes
0
replies
23
views
Is lastprivate on nested loop indices valid without a COLLAPSE clause?
For the flang implementation of OpenMP, we are discussing the following program. At issue is whether e.g. ii, iii, and iiii should be considered private or lastprivate, and if the former, whether the ...
Advice
0
votes
2
replies
71
views
OpenMP 5.2 declare simd with linear(p: ref) inconsistency between specification and example?
I'm seeking opinions on what appears to be an inconsistency between the OpenMP 5.2 specification and an official example regarding the declare simd directive with the linear(p: ref) clause.
1. The ...
2
votes
2
answers
114
views
OMP_DYNAMIC and scaling experiments
When conducting scaling experiments on a code accelerated with OpenMP, is it standard practice to set OMP_DYNAMIC to false so that the number of threads spawned per parallel region is deterministic?
I ...
Best practices
0
votes
3
replies
104
views
Passive-container sieve: computing prime coordinates on a mod-60 grid rather than walking segments
I've implemented a prime sieve where segments are passive containers rather than active search spaces. Instead of walking through each segment to mark multiples, a function (ricerca_ciclo) computes ...
Advice
2
votes
6
replies
92
views
C++ containers and omp parallel
So I have an algorithm that gets a set of objects and computes a next set of objects. The code below is a toy version of the real code for brevity. The next snip illustrates the theme of the program ...
1
vote
0
answers
111
views
What is the difference between the values obtained from sched_getcpu() and getcpu(int*, int*) from sched.h?
Here is a minimal working C++ example to show what I mean:
#include <sched.h>
#include <iostream>
#include <sstream>
#include <omp.h>
int main(int argc, char** argv) {
#pragma ...
1
vote
2
answers
225
views
How to recognize the main thread in an OpenMP program?
OpenMP uses its own scheme for numbering threads starting from 0 for the master thread. A master thread is the thread that initiated a parallel region, which is the main program thread in the following ...
2
votes
1
answer
167
views
How is a non-parallelized for loop inside an OpenMP parallel section executed?
Consider the following code:
#pragma omp parallel
for (int run = 0; run < 10; run++)
{
    std::vector<int> out;
    #pragma omp for
    for (int i = 0; i < 1'000'000; i++)
    {
        ...
    }
}
...
Best practices
0
votes
3
replies
108
views
Code design conundrum: runtime polymorphism, templates, compile times, and OpenMP
I'm struggling to finalise the design of my C++17 library.
One of the primary goals is to use runtime polymorphism to allow users to extend or rewrite default features of the library for their own use ...
2
votes
1
answer
154
views
Does the construct `workshare` work with `do-concurrent`?
I'm one of the developers of the Lumen code: https://www.lumen-code.org/.
It is a computational code for condensed matter physics simulations.
We are replacing FORALL with DO CONCURRENT, since FORALL ...
0
votes
1
answer
103
views
C++ segmentation fault when throwing in ordered OMP parallel for
The code below crashes with
terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
Aborted ...