integer overflow system variability?

Question

note : removed printf part, since it's explained in a different post.

I learned C from K&R 2nd edition a few yours ago. I have not used C in a while, so I decided to skim through a more modern book. “C How to Program, 8th Edition” by Deitel and Deitel (published by Pearson in 2015), which has C99 and C11.

He covers security, unlike K&R. One thing that surprised me, is integer overflow. He wrote:

Section 3.13 Secure C Programming • Adding integers can result in a value that’s too large to store in an int variable. This is known as arithmetic overflow and can cause unpredictable runtime behavior, possibly leaving a system open to attack.

In a different page he has:

It’s considered a good practice to ensure that before you perform arithmetic calculations like the one in line 18 of Fig. 2.5, they will not overflow. The code for doing this is shown on the CERT website https://www.securecoding.cert.org — just search for guideline “INT32-C.”

If you look up the code they recommend:

5.3.3.2 Compliant Solution This compliant solution ensures that the addition operation cannot overflow, regardless of representation:

#include <limits.h>

void f(signed int si_a, signed int si_b)
{
    signed int sum;

    if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
        ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        /* Handle error */
    }
    else {
        sum = si_a + si_b;
    }
    /* ... */
}

My understanding was, although unsigned int behavior is undefined, it's always a fixed size of bits. On my computer it's a 32 bit int. So on my laptop my INT_MAX = 2147483647 and if I add 1 to it i get -2147483648. If I keep adding one to it, it will eventually get to 0, then go back up to INT_MAX and keep going round and round. I don’t see how this can be attacked by someone using my code?

To add extra code, every time I add ints, seems very wasteful, unless there really is a variability I have to look out for, not just getting the wrong result.

edit: I'm adding the quote back, because of the discussion below:

Avoid Single-Argument printfs. One such guideline is to avoid using printf with a single string argument. If you need to display a string that terminates with a newline, use the puts function, which displays its string argument followed by a newline character. For example, in Fig. 2.1, line 8

printf( "Welcome to C!\n" ); should be written as: puts( "Welcome to C!" ); We did not include \n in the preceding string because puts adds it automatically. If you need to display a string without a terminating newline character, use printf with two arguments — a "%s" format control string and the string to display. The %s conversion specifier is for displaying a string. For example, in Fig. 2.3, line 8

printf( "Welcome " );should be written as:

printf( "%s", "Welcome " );

Although the printf in this chapter as written are actually not insecure, these changes are responsible coding practices that will eliminate certain security vulnerabilities as we get deeper into C.

The problem is that undefined behavior is undefined, and signed integer overflow doesn't always wrap around. Compilers can assume undefined behavior will never occur, so they might optimize your code in a way such that overflows results in exploitable behavior instead of just wrapping around. You have to get out of the habit of making assumptions about UB. — user9254539
– user9254539, Commented Dec 16, 2018 at 0:32
@curiousguy: why do you think using volatile will affect the behaviour of integer arithmetic overflow? Can you provide any authoritative reference that explains your suggestion? — Jonathan Leffler
– Jonathan Leffler, Commented Dec 16, 2018 at 2:03
@curiousguy: I think I understand what volatile does — and it is not at all obvious to me why you think it would have any effect whatsoever on the behaviour of integer overflow. "All" volatile does is say that the compiler must make an appropriate reference to the volatile-qualified variable whenever the source code refers to it, so it cannot (for example) optimize reads away assuming that the value is still the same as the last time it was read — which has precisely nothing to do with how an integer expression is evaluated once the values are read, which is what's relevant to overflow. — Jonathan Leffler
– Jonathan Leffler, Commented Dec 16, 2018 at 2:28
If you want to guarantee wraparound behavior for signed integers, nemequ's suggestion of passing flags to your specific compiler that force it to use wraparound semantics for signed integers is the only reliable approach. — Ray
– Ray, Commented Dec 16, 2018 at 6:09
@curiousguy please read and understand this article before you continue this "smaller, fastest impl. of int ops" mantra) — Antti Haapala
– Antti Haapala, Commented Dec 16, 2018 at 9:12

Ray · Accepted Answer · 2018-12-16 00:29:01Z

Integer overflow

Behavior on overflow for signed integers is undefined. (That's actually the example used when defining undefined behavior in C11 3.4.3). It's not uncommon for systems to wrap around, but it's not universal, and even if that is how the system handles it, the compiler is permitted to assume that overflow will not occur and optimize accordingly.

Unsigned integers are guaranteed to wrap around on overflow (or more formally, the value is reduced modulo the max representable value) (C11 6.2.5.9). You should still be aware of whether the value is in a range where this is likely to happen, since having well-defined behavior isn't especially helpful if that behavior isn't what you wanted it to do.

As for whether you actually need to add those checks before each addition...unless it's really critical code (and I mean space shuttle guidance system or pacemaker controller levels of mission critical), my opinion is that those checks would be overkill most of the time. Instead, be aware of what sort of range of values are possible, and choose a datatype such that the value will never be in the sort of range where it might overflow (and if there are cases where it might, test it in those places and only those places). To ensure this in cases where the numbers are user-supplied, you might have some explicit checks when the user first enters the data so as to verify that the values are in sane ranges. (Of course, if you do that, there is always the risk that the user and you will have different ideas as to what ranges are sane, but better to reject an input than to accept it and return incorrect results.)

printf

There's no risk in printf("Hello, world!\n"). It's possible that puts("Hello, world!") will compile to more efficient code, but I doubt it'd make a noticeable difference; the bottleneck will be the actual I/O.

But there is a risk if you do printf(s), where s contains user-supplied data. If the user causes s to contain, e.g., "Foo %s", then it will attempt to run printf("Foo %s"), scan to the "%s", try to read the (nonexistent) next argument, and crash (or do some other undefined behavior).

puts("Hello, world!"); is more efficient because it doesn't need to scan for %s - but compilers already do substitute puts for printf("Hello, world!\n");

nemequ · Accepted Answer · 2018-12-16 03:13:40Z

You don't need to check every single operation, only the ones that could overflow. One obvious example is any operations based on untrusted input, but even then it may be more appropriate to do a sanity check on arguments instead of generically checking all math operations.

Ray's statement that "… unless it's really critical code (and I mean space shuttle guidance system or pacemaker controller levels of mission critical), my opinion is that those checks would be overkill most of the time …" is dangerous. Yes, he's right that you don't need to check every operation, but you should always check the ones which can overflow.

https://undeadly.org/cgi?action=article&sid=20060330071917 has a good example of where this can bite you, even if you're not writing a space shuttle guidance system or pacemaker controller.

It's also worth noting that compilers can sometimes optimize away checks that it can prove will never fail. For example, if you were to call f() with two constants and check the result, there is a decent chance that the compiler will completely optimize it away, especially at higher optimization levels.
The code suggested by CERT is portable, but generally not the fastest option. GCC and clang have __builtin_*_overflow intrinsics, and on Windows there is an <intsafe.h>. If a larger type is available (e.g., if you want to check the result of a 32-bit operation and you have 64-bit types available) it should be pretty quick to perform the operation using the larger type then cast back. If you want some portable code to do it, there is a safe-math module in portable-snippets (disclaimer: my project) you can steal.
Static analysis tools can be very helpful here; several of them can detect potential integer overflows. I know I've seen such errors from Coverity, and I think I've seen them from scan-build and cppcheck.

Ray's statement...is dangerous. ...you don't need to check every operation, but you should always check the ones which can overflow. I agree, and said something to that effect a bit later in the answer. The space shuttle and pacemaker cases are the ones where you should check everywhere just in case.

score 0 · Accepted Answer · 2018-12-16 00:54:09Z

0

1) On some machines, x = MAX_INT + 1 causes an overflow exception to be raised, thus causing an alteration in control flow that may be exploitable.

2) Checks on limits are at risk from overflow. On a machine without overflow exceptions, it is not always true that x + y >= x even where x and y are both positive. Thus, if the attacker can supply a length, a naïve check like if (pointer + length < end of allocated space) can be bypassed by supplying a very large length.

EDITED: to remove printf part of answer, since that part of the question has been removed. The linked page gives the answer I gave, in more detail, so that part of my answer is redundant.

edited Dec 16, 2018 at 0:54

answered Dec 16, 2018 at 0:14

user10762593

8 Comments

David C. Rankin Over a year ago

3 is questionable and the term is more properly "conversion specifiers" rather than "insertion-specifier". It's hard to see how it could end up accessing thing that they shouldn't -- unless the programmer has wholly failed to insure a nul-terminated string is passed. The point being, don't use printf ("%s\n", "the string"); to simply puts ("the string"); which is akin to not using printf ("%c", 'a'); where putchar('a'); is proper (even though a good compiler will optimize printf to puts or putchar in both cases)

user10762593 Over a year ago

In many implementations, printf("%s") will read off the end of the argument list, treating whatever happens to be there as a pointer, and will merrily fetch characters from there until a zero byte is encountered. The %s was not literally supplied by the programmer, it was read as input (network, user, whatever). See the link posted above to format-string vulnerabilities.

David C. Rankin Over a year ago

Granted, a misuse of printf can do bad things (as a misuse of any other function can). The compiler in that case is required to warn of a missing argument. (the C11 Standard also defines a mismatch in printf argument types as Undefined Behavior C11 §7.21.6.1(p2 & p9) The fprintf function - insufficient arguments, or incorrect type.)

user10762593 Over a year ago

The compiler can warn of nothing in the case I am talking about. char *s = read_string_from_socket(); printf(s); The content of s is not known until execution. And, yes, the behavior is undefined; but it is nevertheless exploitable by an attacker.

David C. Rankin Over a year ago

Like I said, the misuse of printf can and will cause problems. Your example violates the "const char * restrict format, ..." definition of printf/fprintf See C11 Standard - 7.21.6.1p1

|

M.M · Accepted Answer · 2018-12-20 03:22:13Z

although unsigned int behavior is undefined, [...]

You are misunderstanding "undefined". It really does mean undefined. There is no definition of behaviour. Anything at all can happen. You should not expect, let alone rely on, any particular behaviour as you describe in your answer.

Here are some of the possibilities:

Appears to work like wraparound arithmetic
Appears to work like saturating arithmetic
Prints unexpected thing
Formats the hard drive
Launches missiles
Sets the computer on fire
(anything else you want to put here)

See What is undefined behaviour?

Collectives™ on Stack Overflow

integer overflow system variability?

4 Answers 4

Integer overflow

printf

1 Comment

1 Comment

8 Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Integer overflow

printf

1 Comment

1 Comment

8 Comments

Comments

Linked

Related