Mahdi Chaari

C++, Unsafe Doesn’t Always Mean Dangerous: How I increased performance with raw pointers

TL;DR:

When building a real-time video rendering and encoding pipeline in C++, I hit a surprising bottleneck: memory allocation. Despite using high-performance libraries like OpenGL and FFmpeg, the cost of safely copying ~5MB per frame killed my frame budget. The fix? Using raw pointers with a manually managed double buffer. It’s not idiomatic C++, but in this case, “unsafe” code—done carefully—was the only way to hit performance targets.


Coming from managed runtimes like Node.js and Go, I always felt that thinking about memory management was premature optimisation: the runtime or the underlying libraries generally handled it well. In C++, however, it is the name of the game. The language offers multiple ways to manage memory and resource lifecycles: RAII, smart pointers, and manual work with raw pointers. In general, working with raw pointers is a bad idea because of the complexity and risk it brings. Leaks, dangling pointers, double frees… the list is long. But sometimes, performance demands pull you back into the wild.

I'll share a story about how raw pointers saved my project from being commercially unviable.
 
In a project I am working on, I have to render 800 frames per second from 48GB of GPU assets, then encode them into real-time video streams, all in real time, on a single powerful machine. If I can't hit these numbers, the project dies.

The rendering side uses OpenGL; the encoding side uses libav (FFmpeg). Both are highly optimised, but they live in completely different ecosystems.

At first glance, the heavy GPU work seemed like the obvious bottleneck:

  • Rendering 4M+ triangles per frame
  • Reading GPU memory back to the CPU
  • Passing the frame to the encoder
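
For reference, the readback step is essentially a single glReadPixels call into CPU memory. A minimal sketch, assuming the source framebuffer is already bound and `pixels` is a placeholder for a sufficiently large CPU-side buffer:

// GPU -> CPU readback (sketch).
// `pixels` must point to at least width * height * 4 bytes.
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);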

But after measuring… surprise: The bottleneck wasn’t GPU work.
It was memory allocation.

The Hidden Bottleneck: Memory Allocation

Back to the drawing board?

When planning the project, I did some benchmarks:

  • Real-time encoding can be done quickly on the dedicated NVENC chip (or Apple's VideoToolbox on Apple Silicon)
  • Optimised rendering allows for huge gains even with large polygon counts, and testing showed it could clearly meet our frame budget
  • Encoding and rendering run on separate dedicated hardware blocks of the GPU, so they can be parallelised and shouldn't affect each other
  • We have a double buffer per stream, and since we render in a round-robin fashion and encoding is much faster than rendering, rendering should never be blocked by the locking mechanism in place

According to these estimates, putting everything together should work. But there was a problem hiding in plain sight.

Both OpenGL and libav manage their own memory layouts and object lifetimes. When bridging them, the easy (and “safe”) path is to copy frame data into a newly allocated buffer for each AVFrame.

That means allocating and copying ~5MB per frame, at hundreds or thousands of frames per second.
Across multiple threads.
While trying to maintain real-time performance.
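
In libav terms, the naive bridge looks roughly like this. A sketch, assuming RGBA frames; `pixels` is a placeholder for the renderer's output:

// The "safe" path: a fresh heap allocation and a full copy for every frame.
// Needs <libavutil/frame.h> and <cstring>.
AVFrame* frame = av_frame_alloc();
frame->format = AV_PIX_FMT_RGBA;
frame->width  = width;
frame->height = height;
av_frame_get_buffer(frame, 0);          // allocates ~5MB, every single frame
for (int y = 0; y < height; ++y)        // copy row by row, respecting
    memcpy(frame->data[0] + y * frame->linesize[0],  // libav's row padding
           pixels + y * width * 4, width * 4);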

That quickly became a page-fault storm. My rendering time went from 1.3ms per frame to more than double that.

The Fix: Controlled Unsafe Code

So I broke the rules.

Instead of copying memory or leaning on RAII, I created a manually managed double buffer. Both OpenGL and libav operate on the same shared raw memory. I carefully coordinate access, and I validate that every pointer and memory layout match through source inspection and testing.

This is what optimised code sometimes looks like in C++:

#include <cstddef>   // std::size_t
#include <utility>   // std::swap

struct DoubleBuffer {
    char* front;
    char* back;
    std::size_t size;

    explicit DoubleBuffer(std::size_t n)
        : front(new char[n]), back(new char[n]), size(n) {}

    // Owning raw pointers: copying would double-free, so forbid it.
    DoubleBuffer(const DoubleBuffer&) = delete;
    DoubleBuffer& operator=(const DoubleBuffer&) = delete;

    void swap() { std::swap(front, back); }  // hand the rendered frame to the encoder
    char* frontBuffer() { return front; }    // renderer writes here
    char* backBuffer()  { return back; }     // encoder reads here

    ~DoubleBuffer() { delete[] front; delete[] back; }
};


The renderer writes into the front buffer.

The encoder reads from the back buffer.

We swap them on each frame.

And we use the less-safe versions of AVFrame and Image that just wrap existing buffers.
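
Concretely, the wrapping can be sketched with libavutil's av_image_fill_arrays, which points an AVFrame at memory it does not own. The RGBA format and the wrapBuffer helper here are assumptions for illustration:

// Needs <libavutil/frame.h> and <libavutil/imgutils.h>.
// Sketch: point an AVFrame at an existing buffer instead of allocating one.
// The frame does NOT own `buf`; the DoubleBuffer keeps it alive.
AVFrame* wrapBuffer(uint8_t* buf, int width, int height) {
    AVFrame* frame = av_frame_alloc();
    frame->format = AV_PIX_FMT_RGBA;
    frame->width  = width;
    frame->height = height;
    // Fills in data pointers and line sizes only: no allocation, no copy.
    av_image_fill_arrays(frame->data, frame->linesize, buf,
                         AV_PIX_FMT_RGBA, width, height, 1);
    return frame;
}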

No allocations. No copies. Just raw throughput.
Is it safe? Not in the traditional RAII sense.
But it is controlled, isolated, and fast.

In higher-level C++ code this would feel like a crime, but reading the libraries' source code, verifying behaviour through testing, and understanding the memory layout well is also a form of safety.

Sometimes encapsulation leads to performance traps in favor of clean, easy-to-use interfaces.

In C++, "unsafe" doesn't have to mean dangerous.
It can mean deliberate, validated, and surgical.

Top comments (2)

Kristofer

Great article! 100% true. "Back in the days" when I learned C++ there was no such thing as a "safe" mode. Either you had a memory access violation or you didn't. I wouldn't say it was better, but it definitely served a purpose, especially when you access something as low-level as OpenGL.

Mahdi Chaari

I first started programming in C; most of the challenges I did as a beginner were pointer arithmetic, since it was super hard to do anything besides simple in-memory data manipulation in C. At the time I never thought of it as unsafe: it was either good code that worked or code that segfaulted. Simpler times.