There are two "unknowns" that the attacker has to contend with.
First, the attacker is overflowing a buffer, presumably on the stack, and among the bytes that follow the buffer in RAM are the bytes that store the "return address", where execution jumps once the current function finishes. The attacker wants to overwrite these bytes with another address, making the CPU jump into code written by the attacker. The first unknown is then: "what is the distance between the overflowed buffer and the return address slot?". The attacker must guess that distance, so as to know at which point in the attacker-provided data the replacement return address must be placed.
The second unknown is the actual address of his shell code. In a traditional buffer overflow on the stack, the shell code is part of the data which triggers the overflow, i.e. it is in the buffer. The return address is absolute, so it is not sufficient for the attacker to know the code of the attacked function; the attacker must also know the stack depth at that moment, and this depends on previous application behaviour.
The NOP slide, or NOP sled, is a simple technique to cope with the accuracy issue for the second unknown. When the attacker can only guess the address of his shell code with some possible jitter, he puts a lot of nop opcodes (or similarly harmless opcodes) before his shell code; it then suffices that the CPU jumps anywhere in the sled for it to ultimately reach the shell code.
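To make the "jitter" argument concrete, here is a minimal toy model in Python. It does not execute anything; it only walks over bytes the way a CPU would walk over nop opcodes. The names build_region and lands_on_payload, the 0xCC marker standing in for the shell code, and the sizes used are all assumptions chosen for illustration.

```python
NOP = 0x90   # single-byte x86 nop opcode
MARK = 0xCC  # placeholder byte standing in for the shell code (not real code)

def build_region(sled_len: int, payload_len: int) -> bytes:
    """Return sled_len nop bytes followed by payload_len placeholder bytes."""
    return bytes([NOP]) * sled_len + bytes([MARK]) * payload_len

def lands_on_payload(region: bytes, entry: int, payload_start: int) -> bool:
    """Toy model: starting at offset `entry`, skip over nop bytes and report
    whether the first non-nop byte reached is the first byte of the payload."""
    if not 0 <= entry < len(region):
        return False
    pc = entry
    while pc < len(region) and region[pc] == NOP:
        pc += 1
    return pc == payload_start

# With a 64-byte sled, every entry offset from 0 to 64 ends at the payload start.
region = build_region(sled_len=64, payload_len=8)
hits = [off for off in range(len(region)) if lands_on_payload(region, off, 64)]

# Without a sled, only the exact offset works.
bare = build_region(sled_len=0, payload_len=8)
bare_hits = [off for off in range(len(bare)) if lands_on_payload(bare, off, 0)]
```

In this model the sled turns a guess that must be exact into one that may be off by up to the sled length, which is the whole point of the technique.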
For the first unknown, the tool is not a NOP sled, but another kind of sled. Simply put, the complete overflow will look like this:
90 90 90 90 90... 90 shellcode addr addr addr... addr
where shellcode is the actual payload, 90 is the hexadecimal code for the nop opcode, and addr is the attacker's guess of the buffer start address in RAM. When the overflow occurs, one of the addr values will overwrite the return address slot and make the CPU jump to address addr. Which one of the addr copies does the trick is the "first unknown" described above. Usually, the "first unknown" is not that unknown, because it only depends on the layout of local variables in the function itself, which is known through reverse-engineering (disassembly).
The attacker has guessed that the byte at address addr will be one of the 90 bytes, so execution jumps there. Which one exactly is not very important, because the CPU will proceed to execute all subsequent nop opcodes and then reach the shellcode.
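The layout can be sketched as a byte string to see why repeating addr takes care of the first unknown. This is only an illustration under stated assumptions: a 32-bit little-endian x86 target, a made-up guess value 0xBFFFF5C0, a 16-byte 0xCC placeholder instead of real shell code, and hypothetical helper names build_overflow and value_at_slot.

```python
import struct

NOP = b"\x90"
PLACEHOLDER = b"\xcc" * 16   # stands in for the shell code; not real code
GUESS = 0xBFFFF5C0           # made-up guess of the buffer start address

def build_overflow(addr_guess: int, sled_len: int, addr_copies: int) -> bytes:
    """Lay out: nop sled, placeholder payload, then repeated copies of the guess."""
    addr = struct.pack("<I", addr_guess)  # assumption: 32-bit little-endian
    return NOP * sled_len + PLACEHOLDER + addr * addr_copies

def value_at_slot(data: bytes, slot_offset: int) -> int:
    """Read the 4-byte value that would sit in a return address slot located
    slot_offset bytes after the start of the buffer."""
    return struct.unpack_from("<I", data, slot_offset)[0]

data = build_overflow(GUESS, sled_len=48, addr_copies=16)

# The addr copies start at offset 48 + 16 = 64. Wherever the return address
# slot really is among these 4-byte-aligned offsets, it receives the guess.
aligned_slots = range(64, len(data), 4)
all_match = all(value_at_slot(data, off) == GUESS for off in aligned_slots)

# A slot that is not aligned with the copies sees the bytes rotated instead.
misaligned = value_at_slot(data, 66)
```

The last line hints at a practical detail of this model: the repeated addr values only help if they line up with the slot, so the attacker still has to get the distance right modulo the address size.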
The pictures you show are indeed quite confusing. They seem to relate to more complex cases where the attacker, for some reason, cannot put his shell code immediately after the NOP sled, but at some place before the NOP sled, so he must then arrange for an appropriate jump at the end of the sled (jump opcodes use relative addressing, so that jump is easy to get right).
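As an illustration of why that final jump is "easy to get right", here is a small sketch that encodes an x86 short jump (opcode EB followed by a signed 8-bit displacement, counted from the end of the 2-byte instruction). The helper name short_jmp and the offsets used are assumptions for the example; offsets are relative to the start of the buffer.

```python
import struct

def short_jmp(jmp_offset: int, target_offset: int) -> bytes:
    """Encode an x86 short jump placed at jmp_offset that goes to target_offset.
    Both offsets are measured from the start of the same buffer."""
    # The displacement is relative to the address of the next instruction,
    # i.e. the jump's own offset plus its 2-byte length.
    disp = target_offset - (jmp_offset + 2)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return b"\xeb" + struct.pack("b", disp)

# Hypothetical layout: 16 payload bytes at offset 0, a 64-byte sled after
# them, and the jump placed right at the end of the sled (offset 80).
jump_back = short_jmp(jmp_offset=80, target_offset=0)
```

The encoded bytes depend only on the distance between the jump and its target, not on where the buffer ends up in RAM, so this part needs no guessing.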