4

Is the exploitation of a format string vulnerability possible if the number of characters you're allowed to enter is limited?

Let's say I'm just allowed to enter input with 23 characters. I can read the stack like this of course:

AAAA%1$08xBBBBBBBBBBBBB
...
AAAA%576$08xBBBBBBBBBBB

But is it possible to exploit it somehow? (Shell, ...)

The system behind it is a Linux server that I access with netcat. It runs a simple ELF binary that performs a string length check and then calls the vulnerable printf function.

3
  • The question "Is ___ possible?" is very different from "Can you give me an example of ___?". When doing defensive security, I always assume things are possible :P (may want to edit the title) Commented Aug 28, 2018 at 16:22
  • I agree, but I don't need an example for now. I just want to know if it is even possible. Commented Aug 29, 2018 at 9:00
  • Yes, 23 seems like an okay bound to exploit. Show us the binary/code and I'll be happy to help. Commented Sep 10, 2018 at 2:34

2 Answers

1

Let's say I'm just allowed to enter input with 23 characters... But is it possible to exploit it somehow? (Shell, ...)

Yes (assuming that unsanitized user input is used as the format string)! I'm not going to give you a full exploit (I'd need, at minimum, to know the offsets of various functions on your platform, and probably a lot of tinkering) but I'll explain the basics. Before we get too far, open up the printf(3) manual in another tab.

The core tool for using format string vulns to achieve code execution is the %n conversion specifier. Its theoretical purpose is to take a parameter of type int* and write, to the pointed-at address, the number of bytes of output produced thus far. However, it is possible to instead use it to write an arbitrary value to an arbitrary address (within the process's address space), even with a relatively short format string.

First problem: how to specify an address of interest?
The C runtime implements "variadic" functions (those which take a variable number of parameters, like printf) by putting the parameters on the stack. At least some C runtimes actually also record the number of parameters passed as another value on the stack, but I've never seen[1] an implementation of C printf which uses this information, so you can reference parameters that don't exist, and the function will simply check controllable offsets from the stack pointer and treat the values at those addresses as parameters of the expected type.

Usually this step is pretty easy if you control other input into the program. The functions "below" printf on the stack (so, at higher addresses, because the stack grows downward) likely contain stack-allocated buffers with data you supplied, so if you supply the right data, you can reference your own buffer as a "parameter" holding any arbitrary address.

But what if no value near the stack pointer is suitable? Especially with a short format string, you can only put so many %p or similar conversions in there (to skip a bunch of stack bytes at a time) before you run out of space, and maybe that isn't enough room to reach the data with the right value. While the C standard doesn't support addressing arbitrary parameters, the UNIX standard (POSIX) does; instead of just the % sign for a conversion, write %m$ to get the mth parameter, indexed from 1 (as in, %17$n to use the 17th parameter as the int*). Note that this is not available on all implementations! Also, the size of nonexistent parameters will be taken as some default value, probably either 1, 4, or 8 bytes; you might need to experiment some to land on the exact "parameter" with the value you want. Negative indices are unlikely to work, but if you really need one, try positive values so large that they cause integer overflow when converted to memory offsets, wrapping around to locations behind parameter 1. (You might think this doesn't need to be super large, because the stack is near the top of the process's address space and you're probably only trying to get to the bottom, but don't forget the entire kernel address space above the user-mode address space! In 64-bit processes, this would need to be a very large number and it might not be possible.)

Generally, the value you'll be looking for is the address (on the stack) of the saved instruction pointer (return address) of some function that will return soon. Overwriting such a return address is often the easiest way to jump to an arbitrary function (such as system or execve). However, addresses pointing to other locations - such as function pointers that will be invoked, or pointers to data structures which contain function pointers, and so on - can also be used.

Note that you can chain these. For example, maybe you can't find a value that is exactly the right address, but you can find a value that is a writable address. Write into that writable address the address you actually want, and then use the "index" of that writable address (where you just wrote) for the actual write that changes program flow. If you can find a location whose value is its address (and it's writable), you can even just use that index twice. This requires additional format string capacity, though, so it might get tight for you.

Second problem: how to specify a value of interest?
%n writes to memory (at the address in the parameter) the number of bytes output so far by printf. A 23-character format string doesn't sound like enough to overwrite even a single byte with a fully chosen value, much less a complete pointer, especially when part of that format string will be the complicated %n conversion (or possibly two conversions, chaining one address to another). Fortunately, even a pretty short format conversion can output an arbitrary number of bytes.

The trick is to abuse the "field width" specifier, which comes after the positional parameter specifier (the m$ part) and any flags, and before the conversion specifier (and any precision specifier, which you can omit). For example, %1$12345d will always output exactly 12345 bytes (since no matter what value is in parameter 1, it won't be longer than 12345 bytes when represented as a decimal value, and will get left-padded with spaces). The value you want to write into memory (e.g. the address of the function system) is probably short enough to fit in a single field width without trouble. Do note that the field width is always given in decimal, not hex. Simply start your format string with a conversion whose field width is the value you want to write into memory, and follow it directly with the actual memory write.


There you go, the tools to write an arbitrary value to an arbitrary address, using a short format string!

Note that actually weaponizing this is hard unless you have the ability to scan memory. If you control other parts of memory outside of the format string, and know either their absolute or relative positions, that will help a lot; you could for example create a fake stack frame with all the parameters, including strings, that you want a function invocation or even chain of invocations to have, and use that for return-oriented programming (with a single format string vuln to redirect the control flow). However, it is almost certainly weaponizable with enough effort (even on a remote server, you can scan memory using enough printf invocations, assuming you see the output string).


[1] Note that the last time I messed with format string vulns closely enough to care about implementation minutiae was ~15 years ago; some modern implementations may well have implemented protections against many of these techniques, although e.g. glibc still allows specifying nonexistent parameters.

0

printf is not vulnerable on its own, so you probably need to explain what the program does with the output.

It looks like you only need 2-3 characters.

Read this: https://www.owasp.org/index.php/Format_string_attack

5
  • In my case, it is definitely vulnerable. I already read the whole stack and know at which position my payload is. However, it is not my question if it is vulnerable. My question is: Is it possible to exploit it with a limited number of characters? With "exploit" I mean popping up a shell, RCE and so on. Commented Aug 29, 2018 at 8:57
  • you'd still have to have memory allocated I think. Do you have the source code? Commented Aug 29, 2018 at 16:40
  • I have to reconnect to the netcat server every time I send a string. So the memory stays allocated, because the ELF is executed with every reconnect. I don't have the source code, but I'm 100% sure that it is just a string length check and then the printf(input) function. Commented Aug 30, 2018 at 6:34
  • Notepad has an encoding hack that Microsoft determined is not fixable and not a security issue. It’s posted on stackoverflow.com. Educational on UTF8 Commented Sep 13, 2021 at 0:28
  • This answer is simply wrong. printf(3), the C standard library function, is vulnerable when user-supplied input is used as the format string! It doesn't matter what is done with either the characters written to stdout or the return value. Some modern implementations disable the %n conversion specifier by default, which limits the potential damage, but this is technically against the standard. Other languages may have a function that works similarly, maybe even called "printf", but those are generally specifically immune to format string vulns (also modern runtimes are safer than C's). Commented Dec 29, 2024 at 0:18
