TL;DR:
- It's a memory word.
- It was the IBM /360 in 1965.
- It was to allow optimal memory utilization for common data types.
The Questions:
What was the minimum amount of addressable memory?
One item? But I guess you mean the minimum data item size the hardware can access. That's a memory word, which depends on the implementation used. It can be a bit, a nibble, a byte, or any larger word, double or quadruple word.
Note that this may differ from what the ISA defines as a word. For example an 8088 has an 8 bit memory word, but a 16 bit ISA word.
The x86 evolution gives a nice example here. All members use byte addressing and 16/32 bit words/double words, but have different memory words:
- 8088 - 8 bit
- 8086 - 16 bit
- 80286 - 16 bit
- 80386SX - 16 bit
- 80386DX - 32 bit
- Pentium - 64 bit
- ...
I'm watching this video, which, at 9:51, says that around late 1980s the majority of computers were byte-addressable.
Well, bytes are way older, but I guess you're referring to a byte of 8 bits (*1). So yes, by the 80s the 8 bit byte as basic character unit, as well as byte addressing, was pretty much common sense.
So the minimum amount of data you could retrieve from RAM was 8 bit.
Note: Byte addressing does not mean it being 8 bit.
Byte addressable only means that the smallest common data unit that can be named (addressed) in all (most *2) instructions is less than a memory word (*3). What is fetched from that address depends on the instruction. It could as well be more than a byte, like a word or double word.
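To make the "what is fetched depends on the instruction" part concrete, here's a minimal C sketch (not modelled on any particular CPU): the same byte address is accessed once as a byte, once as a half word and once as a word. The memcpy calls stand in for differently sized load instructions; the byte order of the wider reads depends on the machine's endianness.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Eight consecutive byte addressed memory cells. */
    uint8_t mem[8] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};

    uint8_t  b;
    uint16_t hw;
    uint32_t w;

    /* Same byte address (offset 2), three different access widths -
       what gets fetched depends on the "instruction", not on the address. */
    memcpy(&b,  &mem[2], sizeof b);   /* 1 byte              */
    memcpy(&hw, &mem[2], sizeof hw);  /* 2 bytes (half word) */
    memcpy(&w,  &mem[2], sizeof w);   /* 4 bytes (word)      */

    printf("byte : %02X\n", (unsigned)b);
    printf("half : %04X\n", (unsigned)hw);
    printf("word : %08X\n", (unsigned)w);
    return 0;
}
```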
Likewise, this is only the view from the instruction side. A CPU's hardware (aka the memory interface) would usually fetch (at least *4) one memory word. That could be 32 bit on a Motorola 68030 or Intel 80486, or 64 bit on a Pentium.
Not to mention that most RAM internally uses additional buffers. For example, a byte wide RAM built from 4116 RAM chips will fetch 128 bytes from its storage matrix, but only deliver one of them.
Also, byte does not always mean 8 bit - especially not early on. Back then a byte was the abstract term for a character sized cell and could be anything from 6 to 12 bit, depending on the machine. Heck, some even allowed bytes of variable size.
What was the minimum amount of data you could retrieve from RAM before that?
At least in the beginning it was always a machine word. So 18 bit for an 18 bit computer, 36 for a 36 bit machine and so on.
Oh, and then there are of course serial architectures which will always have to read multiple bits to deliver a byte :))
Why did they switch to 8 bit eventually?
When and Why is rather clear:
- In 1965 with introduction of the IBM /360 series
- To accommodate multiple data types with the same ease
IBM's goal for the /360 was to replace the different architectures they had with a single one fitting all use cases. Until then IBM maintained several major, incompatible CPU lines: some for data processing with large 36 bit word integers, some with 36/72 bit float, some to handle mainly decimal numbers (BCD) as well as character based types. Not to forget the still existing large installations of electro-mechanical punch card processing.
The 360 in System/360 was meant to symbolize all-around capabilities - like the 360 degrees of a circle. So when looking at their portfolio, they noted 5 different data types that the all-in-one machine needed to handle at comparable performance and with as few exceptions/special cases as possible:
- Decimal data (BCD)
- Character data (bytes)
- Integer data
- Floating point data
- Address data
Some of them conflict when it comes to efficient handling, and it's not the ones you may think of. Integer, float and address are all not very picky about size: whether an integer is 32 or 36 bit or a float is 64 or 72 doesn't make a difference for most applications. The real problem lies with those pesky little decimal numbers and characters.
IBM's accounting machines already worked with byte handling, but their bytes were 6 bit - as was common in the 1950s (*5). BCD on the other hand fits in 4 bits. And BCD was THE main data type (*6). It was where the customer was and where the money went. So making a machine with 6 bit bytes would have worked fine for characters (and in turn for addresses, integer and float), but would have left a whopping 1/3 of memory unused when storing BCD. Memory was expensive.
As a result, the /360 architecture was designed around BCD digits of 4 bits each. That way decimal numbers could be stored in the most efficient way. So far this would mean the machine would best be made to address each 4 bit digit as its base unit. Then again, one rarely needs to address single digits in a machine that operates on strings of digits (*7). A decimal number rarely comes as a single digit - especially not in accounting.
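To illustrate the packing idea, here's a minimal C sketch (my own illustration, not the actual /360 packed decimal format - that one also carries a sign nibble, which is left out here): two 4 bit digits share each 8 bit byte.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack the decimal digits of n as BCD, two 4 bit digits per 8 bit byte,
   least significant byte first. Sign handling is omitted for brevity. */
static int pack_bcd(uint32_t n, uint8_t *out, int max_bytes) {
    int bytes = 0;
    do {
        uint8_t lo = n % 10; n /= 10;      /* low digit  -> low nibble  */
        uint8_t hi = n % 10; n /= 10;      /* next digit -> high nibble */
        if (bytes >= max_bytes) return -1; /* number too long           */
        out[bytes++] = (uint8_t)((hi << 4) | lo);
    } while (n != 0);
    return bytes;                          /* bytes used                */
}

int main(void) {
    uint8_t buf[8];
    int used = pack_bcd(190465, buf, sizeof buf);
    printf("%d bytes:", used);
    for (int i = used - 1; i >= 0; i--)    /* most significant first    */
        printf(" %02X", (unsigned)buf[i]); /* prints: 19 04 65          */
    printf("\n");
    return 0;
}
```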
The next bigger building block (data type) would be the character. Combining two BCD digits into one character seems logical. Sure, with a 6 bit character set this would now waste two of the 8 bits that two BCD digits give - but that's already only 1/4th. Also, character data is way less used than decimal numbers, so overall it's still way more efficient. Not to mention that the time was right to also include lower case letters, which would need at least a 7 bit character cell (*8). Characters are also something that one actually wants to access individually.
So 8 bit bytes and byte addressing are the sensible lower limit for data access. All other ("higher") types can easily be constructed to build upon those 8 bit bytes (a little C sketch follows after the list):
- 16 bit for Half Words (short)
- 32 bit for Words (long)
- 64 bit for Double Words
- 32 bit for Short Float
- 64 bit for Float
- 128 bit for Extended Float (*9)
And since they are all multiples of 8, they can just as well be addressed using a byte address, so no problem either (*10).
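If you want to see that stacking in today's terms, here's a small C sketch using the fixed width types (the float/double sizes shown are the typical ones on current platforms, not a /360 definition):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* The "higher" types as multiples of the 8 bit byte. */
    printf("half word   : %zu bytes\n", sizeof(uint16_t));
    printf("word        : %zu bytes\n", sizeof(uint32_t));
    printf("double word : %zu bytes\n", sizeof(uint64_t));
    printf("short float : %zu bytes\n", sizeof(float));   /* typically 4 */
    printf("long float  : %zu bytes\n", sizeof(double));  /* typically 8 */

    /* Their addresses are still plain byte addresses: consecutive
       32 bit words are simply 4 byte addresses apart. */
    uint32_t words[2] = {0, 0};
    printf("&words[0]=%p  &words[1]=%p\n",
           (void *)&words[0], (void *)&words[1]);
    return 0;
}
```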
And as they say, The Rest is History
With IBM being THE normative force in business computing, it became mandatory to handle their data types without much hassle - most importantly the 8 bit byte. What company would buy some smaller/complementary computers if they couldn't exchange data with the company's mainframe? What department would order one if it needed more complicated programming to do the same?
As a result, next to all other manufacturers moved their new designs to fit the 8 bit byte and in turn 16/32/64 bit word sizes. Sure, a few, like DEC's 36 bit systems, had a longer hold-out period due to a perfect fit for a certain niche, but in the end - that's the 1980s you mentioned - VAXen replaced the PDP-10.
And microprocessors, which only became a thing a good 10 years after the /360, started out with the 8 bit byte right away. History repeats: there is no sense in going a different way when most modern devices are already 8 bit based.
Heck, in the case of Intel it was explicitly designed to match IBM's 8 bit byte. The original 8008 (*11) is a single chip implementation of the Datapoint 2200 terminal, which was designed from scratch to work as a data entry station for IBM /360 systems.
So yes, there is a straight line from the /360 all the way to whatever PC is on your desk.
*1 - In ye olde times bytes could be anywhere from 6 to 12 bit, depending on machine structure, with 6, 8 and 9 being the most common values.
*2 - Well, most, as some instructions may not accept an arbitrary byte address even if the address field could hold one. This is for example the case for implementations that require larger data items (words, etc.) to be aligned to boundaries - like words having to be on word addresses, so for 32 bit words only on byte addresses x0, x4, x8, and so on.
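In code, the alignment rule from this footnote boils down to a simple modulo check (a sketch, assuming the "operands must sit on a multiple of their size" scheme described above):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* An operand of 'size' bytes must sit on a byte address that is a
   multiple of its size (2, 4, 8, ...). */
static bool is_aligned(uint32_t byte_addr, uint32_t size) {
    return byte_addr % size == 0;   /* same as (addr & (size-1)) == 0
                                       for power-of-two sizes */
}

int main(void) {
    printf("%d\n", is_aligned(0x1000, 4)); /* 1: word on x0 is fine      */
    printf("%d\n", is_aligned(0x1002, 4)); /* 0: word on x2 would fault  */
    printf("%d\n", is_aligned(0x1002, 2)); /* 1: half word on x2 is fine */
    return 0;
}
```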
*3 - Well, or equal for the fringe case where byte and memory word are the same - like on an Intel 8088.
*4 - When accessing unaligned data or data wider than the machine word, the CPU would do multiple memory access cycles. Just imagine a 16 bit wide memory interface. When accessing a byte, the CPU does one fetch, no matter whether it is located on an even or odd address. Same when fetching a word from an even byte address. But doing so from an odd address, it needs to do two 16 bit reads (= 32 bit of data) to deliver one 16 bit data item.
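The cycle count from that example can be put into a few lines of C (a sketch for a hypothetical 16 bit wide memory interface, not any specific CPU's bus protocol):

```c
#include <stdint.h>
#include <stdio.h>

/* Number of bus cycles a read needs on a 16 bit (2 byte) wide memory
   interface: count how many 2 byte rows the access touches. */
static unsigned bus_cycles_16(uint32_t byte_addr, uint32_t size) {
    uint32_t first_row = byte_addr / 2;
    uint32_t last_row  = (byte_addr + size - 1) / 2;
    return last_row - first_row + 1;
}

int main(void) {
    printf("byte @ even: %u cycle(s)\n", bus_cycles_16(0x100, 1)); /* 1 */
    printf("byte @ odd : %u cycle(s)\n", bus_cycles_16(0x101, 1)); /* 1 */
    printf("word @ even: %u cycle(s)\n", bus_cycles_16(0x100, 2)); /* 1 */
    printf("word @ odd : %u cycle(s)\n", bus_cycles_16(0x101, 2)); /* 2 */
    return 0;
}
```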
*5 - Seriously, who on earth needs lower case or fancy punctuation?
*6 - Believe me, any money transaction any of us does anywhere in the world will at some point run through a mainframe using BCD - 'cause that's what accounting wants for their Dollars and Pence, Marks and Centimes.
And those accounting guys are also the same ones underwriting orders for new computers - connect the dots :))
*7 - The /360's decimal operations handle signed decimal numbers of 1..31 digits at once, so there's usually no need to address each digit.
*8 - Always worth remembering that IBM was a main proponent of the definition of (7 bit) ASCII. It wasn't their fault that standardization took longer than expected, so they went ahead with their EBCDIC using the full 8 bit range anyway.
*9 - Only added some years later with the Model 85, so more of a /370 thing :)
*10 - Originally the /360 models only allowed aligned access, so while addresses were technically encoded as byte addresses, the machine would fault on any address not fitting the data type's alignment.
*11 - Of which all CPUs until the latest x86-64 variant are a direct evolution.