Revisions to Protecting garbage collection data structure with reader-writer lock might not be appropriate?

added 26 characters in body

edited Apr 3, 2024 at 4:24

Given the expected use of RW-locks, and given that no standard currently specifies the memory semantics of reader-writer locks, I expect that someone will eventually file a bug at AustinGroupBugs.net, with the resolution being that releasing a reader lock after changing relevant memory locations may result in undefined behavior.

Currently, draft 4.1 of Single Unix Specification 5 (Issue 8) lists reader-writer locks among the operations that synchronize memory. Since the holders of reader locks are likely to modify other memory regions, each of which a single holder operates on exclusively, some fences will have to be added to ensure that these per-thread memory regions are visible globally after the reader locks are released.

So I thought of 2 solutions, both valid, yet each suited to particular purposes that the other isn't.

Use the RW-lock provided by the host, and include an explicit memory fence call when releasing the reader lock. Example of releasing the reader lock:

#include <stdatomic.h>
#include <pthread.h>
...
    atomic_thread_fence(memory_order_release);
    pthread_rwlock_unlock(&gc->thr_xor_gc);
...

Advantages:

performance is optimized by the OS,
realtime-aware.

Use regular mutexes and condition variables. Example of releasing the reader lock:

#include <pthread.h>
...
    pthread_mutex_lock(&gc->lock_main); // the master lock of the entire GC structure.
    // inprogress_gc: whether the GC thread is active;
    // inprogress_threads: count of application threads that are active.
    if( gc->inprogress_gc || !gc->inprogress_threads )
    {
        pthread_mutex_unlock(&gc->lock_main);
        return; // may return a value if appropriate.
    }
    if( !--gc->inprogress_threads )
        pthread_cond_signal(&gc->cv_writer);
    pthread_mutex_unlock(&gc->lock_main);
...

Advantages:

If I want application threads to stall whenever any thread is blocked attempting to do GC, I can customize this behavior (whereas a regular RW-lock cannot do that explicitly, unless optional POSIX APIs that alter scheduling are available).
All fences are implicit.
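
The stalling behavior described above could be sketched roughly as follows. The fields `lock_main`, `cv_writer`, `inprogress_gc` and `inprogress_threads` come from the answer; the `cv_readers` condition variable, the `waiting_gc` counter, the struct layout and all function names are assumptions added for illustration. This is one possible writer-preferring arrangement, not necessarily the one the answer has in mind.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct gc_state {
    pthread_mutex_t lock_main;          /* master lock of the GC structure */
    pthread_cond_t  cv_writer;          /* GC waits here for readers to drain */
    pthread_cond_t  cv_readers;         /* hypothetical: app threads wait here */
    bool            inprogress_gc;      /* a GC pass is running */
    unsigned        waiting_gc;         /* hypothetical: GC requests pending */
    unsigned        inprogress_threads; /* active application threads */
};

static void gc_reader_enter(struct gc_state *gc)
{
    pthread_mutex_lock(&gc->lock_main);
    /* Stall while a GC is running or merely waiting to run. */
    while (gc->inprogress_gc || gc->waiting_gc > 0)
        pthread_cond_wait(&gc->cv_readers, &gc->lock_main);
    gc->inprogress_threads++;
    pthread_mutex_unlock(&gc->lock_main);
}

static void gc_reader_leave(struct gc_state *gc)
{
    pthread_mutex_lock(&gc->lock_main);
    assert(gc->inprogress_threads > 0);
    if (--gc->inprogress_threads == 0)
        pthread_cond_signal(&gc->cv_writer); /* last reader wakes a GC */
    pthread_mutex_unlock(&gc->lock_main);
}

static void gc_writer_enter(struct gc_state *gc)
{
    pthread_mutex_lock(&gc->lock_main);
    gc->waiting_gc++;                        /* blocks new readers from now on */
    while (gc->inprogress_gc || gc->inprogress_threads > 0)
        pthread_cond_wait(&gc->cv_writer, &gc->lock_main);
    gc->waiting_gc--;
    gc->inprogress_gc = true;
    pthread_mutex_unlock(&gc->lock_main);
}

static void gc_writer_leave(struct gc_state *gc)
{
    pthread_mutex_lock(&gc->lock_main);
    gc->inprogress_gc = false;
    if (gc->waiting_gc > 0)
        pthread_cond_signal(&gc->cv_writer);     /* hand over to the next GC */
    else
        pthread_cond_broadcast(&gc->cv_readers); /* let app threads resume */
    pthread_mutex_unlock(&gc->lock_main);
}
```

Because every state change happens under `lock_main`, the mutex operations supply the memory synchronization and no explicit fence appears anywhere in the sketch.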

The latest GCC and Clang compilers support the atomic_thread_fence function, and the Single Unix Specification mandates that implementations support all pthread interfaces. For earlier versions of the compilers and OSes, compiler built-ins and vendor-specific APIs may be available. Toolchains and platforms lacking all of these probably don't support multi-threading in the first place.
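
One way to select among these options at compile time might look like the sketch below. The macro `GC_RELEASE_FENCE` and the function `gc_publish` are hypothetical names; `__atomic_thread_fence` and `__sync_synchronize` are GCC-style built-ins, and the version check assumes the `__atomic` built-ins arrived in GCC 4.7. Other vendors' compilers would need their own branch.

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
   /* C11 atomics are available. */
#  include <stdatomic.h>
#  define GC_RELEASE_FENCE() atomic_thread_fence(memory_order_release)
#elif defined(__GNUC__) \
    && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
   /* GCC-style __atomic built-ins (assumed available from GCC 4.7). */
#  define GC_RELEASE_FENCE() __atomic_thread_fence(__ATOMIC_RELEASE)
#elif defined(__GNUC__)
   /* Older GCC-compatible compilers: full barrier, stronger than needed. */
#  define GC_RELEASE_FENCE() __sync_synchronize()
#else
#  error "no known memory fence primitive for this toolchain"
#endif

/* Store a value, then fence so that it is published before a later unlock. */
static int gc_publish(int *slot, int value)
{
    *slot = value;
    GC_RELEASE_FENCE();
    return *slot;
}
```

The fence call in the first solution could then be written as `GC_RELEASE_FENCE()` so that the same source builds on both newer and older toolchains.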

added 13 characters in body

edited Mar 6, 2024 at 9:12

DannyNiu

Alternatively, the holders of reader locks are likely to modify other memory regions that they each operate on exclusively, so some fences may still need to be added to ensure that these per-thread memory regions are visible globally after the reader locks are released. But this is also a guess; we still need a definitive answer from the next revision of either C or POSIX.

added 366 characters in body

edited Mar 6, 2024 at 9:06

DannyNiu

Alternatively, the holders of reader locks are likely to modify other memory regions that they operate on exclusively, so some fences may still need to be added to ensure that these per-thread memory regions are visible globally after the reader locks are released. But this is also a guess; we still need a definitive answer from the next revision of either C or POSIX.

Adding my prediction of the future POSIX standardization effort with regard to the memory semantics of reader-writer locks.

edited Feb 20, 2024 at 1:26

DannyNiu

deleted 1 character in body

edited Feb 19, 2024 at 6:45

DannyNiu

added 32 characters in body

edited Feb 19, 2024 at 1:54

DannyNiu

added 8 characters in body

edited Feb 19, 2024 at 1:47

DannyNiu

answered Feb 19, 2024 at 1:17

DannyNiu
