Implement a “split” synchronization barrier for C++ with OpenMP

Question

EDIT, TL;DR: If you are considering using my code below in production and can afford to require C++20, use std::barrier instead, as suggested by G. Sliepen in his excellent answer.

I’m working on some OpenMP-parallelized C++ code made of 3 parts, with the constraint that no thread may begin part 3 before all threads have finished part 1. It is perfectly acceptable, however, for some threads to run part 2 while others are still running part 1, or for some threads to run part 3 while others are still running part 2. (See my question on StackOverflow for details.)

A synchronization barrier anywhere between part 1 and part 3 would satisfy the constraint but there’s no need for such “hard” synchronization. So I thought it would be nice to have a “split” barrier: no thread can pass the second half before all threads have passed the first half.

I managed to implement such a thing with the following code:

#include <condition_variable>
#include <mutex>

class split_barrier {
private:
    std::mutex m;
    std::condition_variable cv;
    int threads_in_section;
    int total_threads;
    bool may_enter;
    bool may_leave;

public:
    split_barrier():
        threads_in_section(0),
        total_threads(0),   // set for real by init()
        may_enter(false),
        may_leave(false)
    {}

    void init(int threads) {
        std::lock_guard<std::mutex> lock(m);
        total_threads = threads;
        may_enter = true;
    }

    void enter() {
        std::unique_lock lock(m);
        cv.wait(lock, [this]{return may_enter;});
        if (++threads_in_section == total_threads) {
            may_enter = false;
            may_leave = true;
            lock.unlock();
            cv.notify_all();
        }
    }

    void leave() {
        std::unique_lock lock(m);
        cv.wait(lock, [this]{return may_leave;});
        if (--threads_in_section == 0) {
            may_leave = false;
            may_enter = true;
            lock.unlock();
            cv.notify_all();
        }
    }
};

Then my code looks like:

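int main() {
    split_barrier barrier;
    #pragma omp parallel
    {
        // The implicit barrier at the end of "single" guarantees that
        // init() has completed before any thread calls enter().
        #pragma omp single
        barrier.init(omp_get_num_threads());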
        part1();
        barrier.enter();
        part2();
        barrier.leave();
        part3();
    }
}

EDIT: Since it does not invalidate the only and accepted answer, I hope I am allowed to add that my “real” use-case looks more like:

int main() {
    split_barrier barrier;
    #pragma omp parallel
    {
        #pragma omp single
        barrier.init(omp_get_num_threads());
        while (…) {
            part1();
            barrier.enter();
            part2();
            barrier.leave();
            part3();
            #pragma omp barrier
            part4();
        }
    }
}

I consider synchronization code to be very error-prone. Is my code thread-safe?

G. Sliepen · Accepted Answer · 2025-05-23 18:54:03Z

6

Thread safety

You correctly use a single std::mutex to guard may_enter, may_leave and threads_in_section. There are probably ways to make it more performant using atomic variables, but your code takes a robust approach.
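To give a rough idea of what an atomic-based variant could look like, here is a minimal sketch, not a drop-in replacement. It assumes every thread calls arrive() and then wait() exactly once per round. The names atomic_split_barrier, arrive(), wait() and no_early_part3() are hypothetical, and the check uses std::thread instead of OpenMP so that it builds without -fopenmp. Note that wait() busy-waits, so this is only likely to pay off when waits are short and the machine is not oversubscribed.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch: a reusable split barrier built on one monotonic counter.
class atomic_split_barrier {
    std::atomic<std::uint64_t> arrived{0};
    const std::uint64_t total_threads;

public:
    explicit atomic_split_barrier(unsigned threads) : total_threads(threads) {}

    // First half: never blocks. Returns the counter value that marks the
    // end of the round this arrival belongs to.
    std::uint64_t arrive() {
        const std::uint64_t ticket =
            arrived.fetch_add(1, std::memory_order_acq_rel);
        return (ticket / total_threads + 1) * total_threads;
    }

    // Second half: spins until every thread has arrived in that round.
    void wait(std::uint64_t round_end) const {
        while (arrived.load(std::memory_order_acquire) < round_end)
            std::this_thread::yield();
    }
};

// Runs `rounds` rounds on `threads` threads and returns true if no thread
// reached "part 3" before all threads had finished "part 1" of that round.
bool no_early_part3(unsigned threads, unsigned rounds) {
    atomic_split_barrier barrier(threads);
    std::atomic<std::uint64_t> part1_done{0};
    std::atomic<bool> ok{true};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (unsigned r = 0; r < rounds; ++r) {
                part1_done.fetch_add(1);                   // stands in for part 1
                const std::uint64_t round_end = barrier.arrive();
                // part 2 would run here
                barrier.wait(round_end);
                // stands in for part 3: all part-1 increments must be visible
                if (part1_done.load() < std::uint64_t(r + 1) * threads)
                    ok = false;
            }
        });
    }
    for (auto& th : pool) th.join();
    return ok.load();
}
```

A check like no_early_part3(4, 200) can catch gross mistakes, but a passing run does not prove the absence of races.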

Note that it does depend on the caller using your split_barrier correctly; the following code will result in a deadlock:

split_barrier barrier; // total_threads == 0
barrier.enter();       // ++threads_in_section != total_threads
foo();
barrier.leave();       // may_leave is still false, so will wait() forever

Use `std::barrier`

Since C++20, the standard library has a barrier primitive: std::barrier. It has separate member functions arrive() and wait(). Using this, your example main() would look like:

int main() {
    std::optional<std::barrier<>> barrier;
    #pragma omp parallel
    {
        #pragma omp single
        barrier.emplace(omp_get_num_threads());
        part1();
        auto arrival_token = barrier->arrive();
        part2();
        barrier->wait(std::move(arrival_token));
        part3();
    }
}

The std::optional is a workaround for the fact that std::barrier can only be given the number of threads in its constructor, and that number is not known until execution is inside the parallel region.

Unnecessary waiting in `enter()`

Both enter() and leave() call cv.wait(), which means there are actually two barriers. This shows up, for example, when the barrier is reused:

part1();
barrier.enter();
part2();
barrier.leave(); // Waits for all threads to finish part1()
part3();
barrier.enter(); // Waits for all threads to finish part2()
part4();
barrier.leave(); // Waits for all threads to finish part3()
part5();

But I would expect the second call to barrier.enter() to not block anything.
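One way to get that behaviour, sketched here under the assumption that each thread strictly alternates enter() and leave(): let enter() only count arrivals and hand back the current generation number, and let leave() wait until that generation is complete. The class name nonblocking_split_barrier and the token-passing signatures are hypothetical, not part of the original class.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical sketch: split barrier whose first half never blocks.
class nonblocking_split_barrier {
    std::mutex m;
    std::condition_variable cv;
    const int total_threads;
    int arrived = 0;               // arrivals in the current generation
    unsigned long generation = 0;  // number of completed generations

public:
    explicit nonblocking_split_barrier(int threads) : total_threads(threads) {}

    // Never blocks. Returns the generation this arrival belongs to.
    unsigned long enter() {
        std::unique_lock<std::mutex> lock(m);
        const unsigned long my_generation = generation;
        if (++arrived == total_threads) {
            arrived = 0;
            ++generation;          // releases everyone waiting on my_generation
            lock.unlock();
            cv.notify_all();
        }
        return my_generation;
    }

    // Blocks until all threads have entered generation `my_generation`.
    void leave(unsigned long my_generation) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return generation != my_generation; });
    }
};
```

In the five-part example, each pair would then read `auto g = barrier.enter(); ...; barrier.leave(g);`, and the second enter() returns immediately even if other threads are still in part2(). Taking the thread count in the constructor also removes the uninitialized state.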

edited May 23 at 18:54

answered May 22 at 18:25

G. Sliepen


Thanks a lot for pointing me to the std::barrier C++20 standard class, although the doc does not mention the emplace() method… That’s exactly what I was looking for on StackOverflow. If you have an account there, please consider writing an answer so I can accept it!
– user2233709, 2025-05-23 07:46:55 +00:00

As for the deadlock when the object is not correctly initialized before use: I was aware of that, but I think such misuse should be considered (or, better, documented as) undefined behaviour.
– user2233709, 2025-05-23 07:58:13 +00:00

About the unnecessary waiting: I was also aware of that. In your 5-part example, I would have used 2 distinct split_barrier objects. In my use-case, I have only 4 parts, with a hard barrier between parts 3 and 4, and the whole section is enclosed in a while loop. Hence, I needed my barrier object to be reusable, but I did not find a way to allow some threads to enter the barrier in iteration n+1 while others have not yet left it in iteration n. I also did not care, since the hard barrier ensures that cannot happen anyway.
– user2233709, 2025-05-23 07:59:26 +00:00

The emplace() method comes from std::optional. As for the undefined behaviour, I think it's fine this way (std::barrier has exactly the same issue), I just wanted to point it out.
– G. Sliepen, 2025-05-23 09:12:28 +00:00

Oh, stupid me, I missed the std::optional when I read your example code!
– user2233709, 2025-05-23 09:29:13 +00:00
