“My C++ knowledge is outdated and I'm trying to learn about C++11 threading.” 😲 C++11?
You’re probably not aware of this, but right now—as you were posting your code—the C++ standard committee was meeting in Sofia, Bulgaria, to begin the process of standardizing C++29.
It’s good that you want to update your knowledge, but perhaps you should be a little more ambitious, and try to at least update your knowledge to the current decade. C++20 in particular added a lot of very cool and very powerful extensions to the fairly basic concurrency facilities introduced in C++11. C++26 is going to add even more powerful tools, although nothing that would really help what you are trying to do. And the next generation of concurrency will (probably) be done with senders and receivers, but that’s getting ahead of ourselves a bit.
So I’m not going to review your code as C++11 code. Instead, I am going to point out some of the C++20 facilities that would massively simplify and em-power-ify your code.
But before I start, I have to say that your overall design seems rather confused and haphazard. That’s not great in non-concurrent code, but once you introduce concurrency, vagueness will kill. You need to have a crystal clear understanding of who owns what, who might be accessing or modifying what, and when and how all these things are happening, or you will end up with data races. You can’t just fart out good concurrent code like you could with non-concurrent code. Especially if you are trying to use low-level facilities.
For example, you use an std::atomic<ssize_t> (which is not portable C++, but we’ll get to that later) to keep track of when the buffer is something-something ready to read or write or something. I don’t even know, and I doubt you do either. The variable is called _buf_count (which is not legal C++, but, again, later), but it doesn’t seem to be counting any buffers. Indeed, you only have one buffer. And at one point you set _buf_count to -1… so, you have negative one buffers?
See what I mean? Confused.
And this confusion causes real problems. When _buf_count is greater than zero, that is supposed to be the signal that there is data ready to read. Except what actually happens is that _buf_count is incremented before the buffer mutex is locked, where the comment says that _buf_count is now the number of buffers “about to be filled”… meanwhile another comment calls _buf_count “the number of filled buffers” (not about to be filled, already filled), and yet another says _buf_count is not a count at all, but an ad-hoc enumeration where 0 means the buffer is “acquired” and 1 means it’s “filled”. It’s a complete mess.
Frankly, I don’t even know what you think you mean by “acquiring” the buffer. The buffer is a static array. It is “acquired” when the program begins. acquireWriteBuffer() literally does nothing but increment a (supposed) counter, then (sometimes) wait for something else to decrement it. It’s not “acquiring” anything. It would be better called something like (sometimes_)wait_for_something_to_finish_reading_from_the_buffer(). Which is exactly what you describe in the text. Don’t use jargon to obscure simple ideas.
Speaking of:
The odd thing about my situation is that I know when the client wants to acquire a buffer, but I don't know when it has actually written to it. So, all I can do is assume that it's written to the last buffer it acquired once it requests another one.
But you do realize this is a terrible idea, right? The producer has to linger after it is done, waiting for the consumer to announce that it is done. They cannot be independent threads of execution anymore. The producer is now dependent on the consumer.
Also, again with the confusion and vagueness. You talk about “the last buffer” and requesting “another one”… but there is only one buffer!
Before reviewing the existing code, it might be a better idea to start from a clean slate to get a better handle on what the design really needs.
On the transmission side, you need a worker function:
auto tx_loop(/*...*/)
{
    // ...
}
What inputs does that function need? Well, obviously, it needs the buffer it will be getting its transmission data from. What type should the buffer be? Well, if it’s just going to be a chunk of memory, you might as well use a span:
auto tx_loop(std::span<char const> buffer, /*...*/)
{
    // ...
}
This doesn’t consider whether the amount of data in the buffer may vary in size—even if the buffer is always “filled” before writing, the last bit of data may not completely fill it. But since that’s not a consideration in the original code, I won’t worry about it either.
Next you need some way to signal that the buffer is full and ready to transmit. A simple reference to a bool will do the trick:
auto tx_loop(std::span<char const> buffer, bool& buffer_is_full, /*...*/)
{
    // ...
}
Next we need some synchronization for the buffer and signal. For that, we’ll need a mutex and condition variable. Actually, two condition variables: one for the condition that the buffer is full and ready to transmit, and one for the condition that the buffer is empty and ready to be filled.
auto tx_loop(
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition,
    /*...*/)
{
    // ...
}
And finally, we need a way to signal that there is nothing more to be transmitted, rather than just crashing out with a timeout error. For that we will use the first of the C++20 stuff I’ll be introducing: std::stop_token. We’ll make it the first argument because it makes things easier (we don’t have to; but std::jthread prefers it that way, so let’s make it easier on ourselves).
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    // ...
}
(I know you also need the start time in your transmission thread. If you want it, you can add it. I won’t bother because it’s not important.)
Yes, that’s a lot of arguments, but bear with it for now.
Now, inside the transmission loop, we need to know when to stop looping. We have the stop token for that:
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    while (not stoken.stop_requested())
    {
        // ...
    }
}
The first thing we should do is lock the mutex.
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    while (not stoken.stop_requested())
    {
        auto lock = std::unique_lock{buffer_mutex};
        // ...
    }
}
Once we have the mutex, we could then wait for the condition variable. However, before doing that, we could do a quick check of the ready flag to see if we even need to bother waiting.
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    using namespace std::literals;
    while (not stoken.stop_requested())
    {
        auto lock = std::unique_lock{buffer_mutex};
        if (not buffer_is_full)
        {
            if (not buffer_is_full_condition.wait_for(lock, 1s, [&buffer_is_full] { return buffer_is_full; }))
            {
                // handle timeout error
            }
        }
        // ...
    }
}
Now we transmit the data in the buffer, and then signal that it is now empty:
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    using namespace std::literals::chrono_literals;
    while (not stoken.stop_requested())
    {
        auto lock = std::unique_lock{buffer_mutex};
        if (not buffer_is_full)
        {
            if (not buffer_is_full_condition.wait_for(lock, 1s, [&buffer_is_full] { return buffer_is_full; }))
            {
                // handle timeout error
                return;
            }
        }
        std::println("{:05}: {}", /*...*/, std::string_view{buffer.data(), buffer.size()});
        std::this_thread::sleep_for(500ms);
        buffer_is_full = false;
        lock.unlock();
        buffer_is_empty_condition.notify_all();
    }
}
That’s pretty much it for the transmission side of things. Filling the buffer is basically the same.
auto write_data(
    std::ranges::input_range auto&& input,
    std::span<char> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition,
    std::condition_variable& buffer_is_empty_condition)
{
    using namespace std::literals::chrono_literals;
    std::ranges::for_each(
        std::forward<decltype(input)>(input) | std::views::chunk(buffer.size()),
        [&] (auto&& chunk)
        {
            auto lock = std::unique_lock{buffer_mutex};
            if (buffer_is_full)
            {
                if (not buffer_is_empty_condition.wait_for(lock, 1s, [&buffer_is_full] { return not buffer_is_full; }))
                {
                    // handle timeout error
                    // (note: returning here only abandons this chunk;
                    // for_each will carry on with the next one)
                    return;
                }
            }
            std::ranges::copy(std::forward<decltype(chunk)>(chunk), buffer.begin());
            // Not bothering to account for partial chunk.
            buffer_is_full = true;
            lock.unlock();
            buffer_is_full_condition.notify_one();
        });
}
And now your program is pretty trivial:
auto main() -> int
{
    using namespace std::literals::string_view_literals;
    auto buffer = std::array<char, 10>{};
    auto buffer_is_full = false;
    auto buffer_mutex = std::mutex{};
    auto buffer_is_full_condition = std::condition_variable{};
    auto buffer_is_empty_condition = std::condition_variable{};
    auto tx_thread = std::jthread{tx_loop,
        std::span<char const>{buffer}, // pass a span; jthread decay-copies its arguments
        std::ref(buffer_is_full),
        std::ref(buffer_mutex),
        std::ref(buffer_is_full_condition),
        std::ref(buffer_is_empty_condition)
    };
    auto const data = "Use whatever input data source you please"sv;
    write_data(
        data,
        buffer,
        buffer_is_full,
        buffer_mutex,
        buffer_is_full_condition,
        buffer_is_empty_condition
    );
    // Everything gets cleaned up automatically.
}
Now, this does not take into account the “odd thing” about your situation. You can make it model your situation simply by removing the buffer_is_empty_condition. So basically, this:
auto tx_loop(
    std::stop_token stoken,
    std::span<char const> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition)
{
    using namespace std::literals::chrono_literals;
    while (not stoken.stop_requested())
    {
        auto lock = std::unique_lock{buffer_mutex};
        if (not buffer_is_full_condition.wait_for(lock, 1s, [&buffer_is_full] { return buffer_is_full; }))
        {
            // handle timeout error
            return;
        }
        // ...
        buffer_is_full = false;
    }
}
auto write_data(
    std::ranges::input_range auto&& input,
    std::span<char> buffer,
    bool& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition)
{
    std::ranges::for_each(
        std::forward<decltype(input)>(input) | std::views::chunk(buffer.size()),
        [&] (auto&& chunk)
        {
            auto lock = std::unique_lock{buffer_mutex};
            if (not buffer_is_full)
            {
                // ...
                buffer_is_full = true;
                lock.unlock();
                buffer_is_full_condition.notify_one();
            }
            else
            {
                // Well, what to do here?
                //
                // Unlock the mutex obviously, but then what?
                //
                // Yield? Not exactly the most performant choice, but
                // probably better than a busy-wait.
                //
                // Beware of the situation where the consumer stops for
                // whatever reason. You don't want to wait forever if
                // there is no one emptying the buffer.
            }
        });
}
Note that if this is the case, you can probably change the buffer_is_full flag to an atomic bool, or an atomic_flag, and avoid locking the mutex unless necessary:
auto write_data(
    std::ranges::input_range auto&& input,
    std::span<char> buffer,
    std::atomic_flag& buffer_is_full,
    std::mutex& buffer_mutex,
    std::condition_variable& buffer_is_full_condition)
{
    std::ranges::for_each(
        std::forward<decltype(input)>(input) | std::views::chunk(buffer.size()),
        [&] (auto&& chunk)
        {
            if (not buffer_is_full.test(std::memory_order::acquire))
            {
                auto lock = std::unique_lock{buffer_mutex};
                // ...
                buffer_is_full.test_and_set(std::memory_order::release);
                lock.unlock();
                buffer_is_full_condition.notify_one();
            }
            else
            {
                // You could do:
                //  buffer_is_full.wait(true, std::memory_order::acquire);
                // (wait() takes the old value to wait on; and don't
                // forget to notify in the transmission thread).
                //
                // But beware of waiting forever if no-one is emptying
                // the buffer.
            }
        });
}
If you are bothered by the large number of parameters to the various functions, that is because the correct way to do this in C++ would be to make a proper type, rather than tooling around with naked buffers and synchronization primitives.
Indeed, the proper way to solve this problem would be to use a concurrent queue, like the one currently proposed for C++26 (though it probably won’t make it; it will probably wait for senders and receivers to get specified first). With that class, your code might look more like this:
auto tx_loop(std::stop_token stoken, std::bounded_queue<char>& data)
{
    using namespace std::literals::chrono_literals;
    auto const transmit = [] (std::string_view s)
    {
        std::println("{:05}: {}", /*...*/, s);
        std::this_thread::sleep_for(500ms);
    };
    auto buffer = std::array<char, 10>{};
    auto p = std::ranges::begin(buffer);
    auto c = std::optional<char>{};
    while (not stoken.stop_requested() and (c = data.pop()))
    {
        *p++ = *c;
        // Once the buffer is full, transmit.
        if (p == std::ranges::end(buffer))
        {
            transmit(std::string_view{std::ranges::begin(buffer), p});
            p = std::ranges::begin(buffer);
        }
    }
    // Transmit the last bit, if any.
    if (p != std::ranges::begin(buffer))
        transmit(std::string_view{std::ranges::begin(buffer), p});
}
auto write_data(std::ranges::input_range auto&& input, std::bounded_queue<char>& data)
{
    std::ranges::for_each(std::forward<decltype(input)>(input), [&data] (auto c) { data.push(c); });
}
auto main() -> int
{
    using namespace std::literals::string_view_literals;
    auto queue = std::bounded_queue<char>(10);
    auto tx_thread = std::jthread{tx_loop, std::ref(queue)};
    auto const data = "Use whatever input data source you please"sv;
    write_data(data, queue);
    queue.close(); // not strictly necessary
    // Everything gets cleaned up automatically.
}
That is what good C++ concurrent code looks like.
Code review
#include <iostream>           // std::cout
#include <thread>             // std::thread
#include <chrono>             // std::chrono::seconds
#include <mutex>              // std::mutex, std::unique_lock
#include <condition_variable> // std::condition_variable, std::cv_status
#include <iomanip>
It is good practice to order includes in some logical manner, usually alphabetically.
std::mutex _buf_mutex;
std::condition_variable _buf_cond;
char _buff[10];
std::atomic<ssize_t>    _buf_count;
All of these are illegal identifiers. You cannot start an identifier with an underscore at global scope.
Also, using a naked array? Ew. Use std::array.
Also also, ssize_t is not a standard C++ type. It is POSIX, but you haven’t included any headers that define it. (And while we’re at it: you use std::atomic without including <atomic>.)
Global variables in general, though, are a bad idea.
bool running;
This is not an atomic type (generally), yet you set it and read it across threads. That is a no-no.
More broadly, this is a cancellation token (albeit one that is logically inverted). For this, you should use the built-in cancellation facilities that come with std::jthread. In any case, you should use std::jthread instead of std::thread; std::thread should be considered obsolete.
void tx_loop()
{
    while (running)
    {
        // ...
        
        // Decrement the number of filled buffers.
        _buf_count--;
It is sloppy practice to use postfix increment/decrement when you mean prefix increment/decrement. --_buf_count; can do a simple atomic decrement, but _buf_count-- has to tie up another register to hold the old value. Generally it is not actually harmful, because compilers are smart enough to fix the error. It’s still sloppy.
size_t acquireWriteBuffer(void **buf)
It’s std::size_t, and you need to include the proper header.
“Out” parameters like the one here are an anti-pattern. Any outputs from a function should be in the return, not in the arguments. If you need multiple return values, use a std::tuple. But in this case, it would probably make more sense to return a std::span that includes both the pointer and size.
    return sizeof(_buff) / sizeof(_buff[0]);
We don’t need to resort to C hacks. Just do std::ranges::size(_buff).
Questions
Is my use of _buf_count thread-safe?
So long as there is only a single producer thread, and a single consumer thread, then sure. But this will not scale.
Is my code prone to deadlocks?
Yes and no.
Yes, in that it is very much prone to deadlock. In fact, it will pretty much always deadlock, because of the confused spaghetti mess of responsibilities and abstractions.
No, in that it will never actually deadlock, because every time you lock anything, you do so with a timeout.
You can even see this in your sample output. Right there at the end, it deadlocks… then times out.
Is my use of a single condition variable appropriate?
That depends on what you mean by “appropriate”.
Logically, the answer is “no”, because you have two conditions: ready to read the buffer, and ready to fill the buffer. Two conditions means you should have two condition variables.
But your actual requirements are such that one of those conditions is non-notifiable. There is no way to know when the “ready to write” condition has been met; you are assuming the “ready to read” also means “ready to write”. So in this model, yes, you only have a single condition… thus, you should appropriately only have a single condition variable.
However, these requirements mean the producer is now bound to the consumer. Your producer has to sit idle after it is done, waiting for the consumer to announce it is ready for more. So you don’t really have a producer-consumer situation so much as a controller-worker situation, where the controller is basically a loop of “while there is data: send some data; wait for worker to ask for more”.
Am I managing my mutexes well? Since I'm using wait_for, they automatically get unlocked, right?
Correct, but I would hardly call that “managing well”.
Even though it technically doesn’t completely deadlock, it does freeze for a moment at the end while it waits for the timeout.
Generally, relying on timeouts to avoid deadlocks is not a great idea. Timeouts are great as emergency escape hatches, for when situations arise where something should happen, but for some unexpected reason doesn’t. But you should not rely on them, for the same reasons you should not use exceptions for control flow, or use segfaults as the normal way to exit the program.
Suggestion
You are too fixated on the mechanisms, and not paying enough attention to the problem. Or put another way: you are so focused on how you are trying to get things done that you have lost sight of what you are trying to get done.
Concurrent programming—especially low-level concurrent programming—is unforgiving. If you have a loosey-goosey, vague, wishy-washy model, you will either end up with very poorly-performing code, or straight-up deadlocks and data races. You really, really need to lock down the design and abstractions before even considering which synchronization primitives you will be using, or how.
Of course, truly good concurrent code does not actually use any synchronization primitives. At least not directly. The moment you start tossing in mutexes and condition variables, you are no longer writing good concurrent code. Take a look at the version of the code with the concurrent queue. Not a single lock in sight. In fact, except for the one jthread construction, there is no sign that it is multi-threaded code at all! And yet there are no data races, and no deadlocks. Not even deadlocks that have to be timed-out to be avoided.
I would suggest stepping back and very carefully thinking about your problem. Lock down precisely what you want to happen. Don’t get lost in jargon or low-level details like buffers and locks. If you can’t ELI5 (“explain it like I’m 5 (years old)”) the problem, then you do not fully grok the problem, and you will fail to produce the most efficient and race-free concurrent code for it.
“I want to send chunks of data to be transmitted; and I want the next chunk (or several chunks) to be produced at the same time (concurrently) as the current chunk is being transmitted.” As soon as you phrase it that plainly, you can see that your design is sub-optimal, because you are dealing with only a single chunk at a time: the producer has to stop and wait while the consumer works, and the consumer has to stop and wait while the producer works; you are getting very little actual concurrency. What you want, at least, is two chunks, so the producer can be filling a chunk while the consumer is transmitting one. That is, double-buffering.
That is, of course, not the only way to phrase the problem, and if you phrase it differently, you might get a different design. For example you might conceive of a stream of data, and say: “I want to send a stream of data, to be transmitted in chunks; and I want data to be written to the stream uninterrupted unless the stream fills up.” That design more matches the version with the concurrent queue.