3,396 questions
            
            
            25
            votes
        
        
            2
            answers
        
        
            4k
            views
        
    Does excessive use of [[likely]] and [[unlikely]] really degrade program performance in C++?
                The C++ standard [dcl.attr.likelihood] says:
[Note 2: Excessive usage of either of these attributes is liable to result in performance degradation.
— end note]
I’m trying to understand what “...
            
        
       
    
            2
            votes
        
        
            0
            answers
        
        
            87
            views
        
    Clang __builtin_constant_p Inconsistent Behavior Issue Report [closed]
                Problem Description
In Clang 21.1.0, the __builtin_constant_p builtin function exhibits inconsistent behavior when using constant arrays (.rodata) versus stack arrays. The function returns true in ...
            
        
       
    
            3
            votes
        
        
            0
            answers
        
        
            105
            views
        
    How well can clang 20 infer the likelihood of branches without annotations?
                I have a performance-critical C++ code base, and I want to improve (or at least measure if it's worth improving) the likelihood that clang assigns to branches, and in general understand what it's ...
            
        
       
    
            0
            votes
        
        
            0
            answers
        
        
            39
            views
        
    How to build a gcc_tree_node from custom language Nodes
                Nodes:
building a gcc_tree_node for a custom programming language
compiled and based on C++26
the modules are available
the language uses a tab-block system
every keyword starts with '/'
I want to ...
            
        
       
    
            1
            vote
        
        
            1
            answer
        
        
            130
            views
        
    Can the compiler elide a const local copy of const& vector parameter?
                Consider these two functions:
int foo(std::array<int, 10> const& v) {
    auto const w = v;
    int s{};
    for (int i = 0; i < v.size(); ++i) {
        s += (i % 2 == 0 ? v : w)[i];
    ...
            
        
       
    
            5
            votes
        
        
            3
            answers
        
        
            268
            views
        
    How to make the optimiser treat a local function as a black box and not optimise based on its implementation?
                I thought that the noinline function attribute would force the compiler to treat a local function as a black box:
__attribute__((noinline)) void touch_noinline(int&) {}
void touch_external(int&...
            
        
       
    
            0
            votes
        
        
            1
            answer
        
        
            64
            views
        
    Why does MSVC AVX2 /FP:strict sometimes generate inferior (slower) code to SSE2?
                I was testing various expressions of a sixth order polynomial to find the fastest possible throughput. I have stumbled upon a simple polynomial expression of length 6 that provokes poor code generation ...
            
        
       
    
            2
            votes
        
        
            0
            answers
        
        
            61
            views
        
    Why is sequential indexing with fixed length stride slower in Estrin's method?
                Preparing to make Estrin's method vectorisable, I changed from normal linear indexing of the coefficients to bit-reversed indexing and restricted it to strictly powers of 2. Neither MSVC nor ICX can see how to ...
            
        
       
    
            1
            vote
        
        
            1
            answer
        
        
            117
            views
        
    How is Rust able to optimize Option::is_some_and so effectively? [closed]
                Looking at the codegen of a check inside a for-loop, I wanted to see if there is an optimization opportunity in outlining is_some_and, but both cases had the same codegen.
struct V {
    len: Option<...
            
        
       
    
            5
            votes
        
        
            1
            answer
        
        
            182
            views
        
    Why are [[no_unique_address]] members not transparently replaceable?
                In the classic talk An (In-)Complete Guide to C++ Object Lifetimes by Jonathan Müller, there is a useful guideline as follows:
Q: When do I need to use std::launder?
A: When you want to re-use the ...
            
        
       
    
            7
            votes
        
        
            1
            answer
        
        
            317
            views
        
    Does GCC optimize array access with __int128 indexes incorrectly?
                When compiling the following code using GCC 9.3.0 with O2 optimization enabled and running it on Ubuntu 20.04 LTS, x86_64 architecture, unexpected output occurs.
#include <algorithm>
#include &...
            
        
       
    
            1
            vote
        
        
            2
            answers
        
        
            188
            views
        
    GCC switch statements do not simplify on identical handling
                The switch statements in the following two functions
int foo(int value) {
    switch (value) {
    case 0:
        return 0;
    case 1:
        return 0;
    case 2:
        return 1;
    }
}
int ...
            
        
       
    
            1
            vote
        
        
            0
            answers
        
        
            99
            views
        
    Is a critical section protected by a semaphore, mutex, etc. implicitly volatile? [duplicate]
                Say I have an array of integers, int array[NUM_ELEMENTS];, and access to it is encapsulated in setter and getter functions that are protected by synchronization such as a semaphore, mutex, etc. Do I need to ...
            
        
       
    
            4
            votes
        
        
            1
            answer
        
        
            151
            views
        
    Optimize computation of the real part of a complex product
                I need (only) the real part of the product of two complex numbers. Naturally, I can code this as
real(x)*real(y) - imag(x)*imag(y);
or
real(x*y);
The latter, however, formally first computes the ...
            
        
       
    
            29
            votes
        
        
            1
            answer
        
        
            4k
            views
        
    Why do C compilers still prefer push over mov for saving registers, even when mov appears faster in llvm-mca?
                I noticed that modern C compilers typically use push instructions to save caller-saved registers, rather than explicit mov + sub sequences. However, based on llvm-mca simulations, the mov approach ...
            
        
       
     
         
         
         
        