I'm quite new to C++ and want to focus on writing performant multithreaded code because I will try to port our company internal GUI framework which is currently implemented in C#. So I'd love to get some recommendations about code style and Performance on a reader-writer spin lock that favors writers I wrote:
namespace UI
{
namespace Threading
{
/**
* \brief ReaderWriterSpinLock which favors writers
*/
class ReaderWriterSpinLock
{
public:
ReaderWriterSpinLock() = default;
~ReaderWriterSpinLock() = default;
ReaderWriterSpinLock(ReaderWriterSpinLock&) = delete;
ReaderWriterSpinLock(ReaderWriterSpinLock&&) = delete;
ReaderWriterSpinLock& operator =(ReaderWriterSpinLock&) = delete;
ReaderWriterSpinLock& operator =(ReaderWriterSpinLock&&) = delete;
void AcquireReaderLock()
{
int computedValue, initialValue;
do
{
int count{0};
while (m_waitingWriters != 0 || (initialValue = m_lockCount) < 0)
{
if (++count % MAX_SPIN_COUNT == 0)
std::this_thread::yield();
}
computedValue = initialValue + 1;
}
while (!m_lockCount.compare_exchange_strong(initialValue, computedValue));
}
void ReleaseReaderLock()
{
assert(m_lockCount > 0);
--m_lockCount;
}
void AcquireWriterLock()
{
int computedValue, initialValue;
++m_waitingWriters;
do
{
int count{0};
while (m_lockCount != 0)
{
if (++count % MAX_SPIN_COUNT == 0)
std::this_thread::yield();
}
initialValue = 0;
computedValue = -1;
}
while (!m_lockCount.compare_exchange_strong(initialValue, computedValue));
--m_waitingWriters;
}
void ReleaseWriterLock()
{
assert(m_lockCount == -1);
++m_lockCount;
}
private:
static constexpr int MAX_SPIN_COUNT = 1000;
std::atomic<int> m_lockCount{0};
std::atomic<int> m_waitingWriters{0};
};
}
}
Explanation: A negative m_lockCount denotes the active writers (because there can only be one writer at a time the minimum is -1) and a positive count denotes the active readers. When aquiring the reader lock it spins (and yields every MAX_SPIN_COUNT spins) until there are no waiting writers (because it should favor those) and there are no active writers (m_lockCount >= 0), then it tries to update the lock count and repeats the process if it fails. When releasing the lock, the lock count can get decremented without further checks. The methods for the writer lock are quite similar with the difference of temporarily incrementing the m_waitingWriters instead of checking them in the acquire method.
I know that you should normally rely on library functions but in my test it was ten times faster for acquiring the writer lock and twice as fast for readers+writers (with four readers trying to acquire the reader lock in a loop) than the boost::shared_mutex in combination with the boost::shared_lock and boost::unique_lock. So it would actually be worth it.
Also in my application the thread count equals the processor count so the yield should rarely be necessary and the spin lock will only be used for methods which only need a few cycles.
I also wanted to explicitly delete the move and copy constructors because it makes sense for such a class (even though they are deleted implicitly because of the atomic fields). Is this a good idea?