10

I'm watching this video, which, at 9:51, says that around the late 1980s the majority of computers were byte-addressable.

So the minimum amount of data you could retrieve from RAM was 8 bits.

What was the minimum amount of data you could retrieve from RAM before that?

Why did they switch to 8 bits eventually?

11
  • 8
    Consider looking at this question and this question, the latter of which has an answer which points to this wikipedia page "Word (Computer Architecture)", all of which may be helpful. Commented Oct 20 at 0:08
  • 5
    So the minimum amount of data you can retrieve from RAM was 8 bit. Not entirely true. The CPU executing a 'move byte from memory' doesn't actually require the memory controller to be able to deliver a single byte from RAM; maybe the memory delivers 8 bytes and the CPU presents one byte to the program. Commented Oct 20 at 0:27
  • 2
    @dave or it fetched 6..9 bits (whatever a byte was) in consecutive order - like serial architectures may do :)) Commented Oct 20 at 2:38
  • 4
    Sure - them bits come through the mercury one at a time! Commented Oct 20 at 3:26
  • 1
    The question is sort of backwards: most computers have switched away from byte-addressable memory now, or at least all discourage it. Commented Oct 20 at 3:40

6 Answers

26

The majority of 'scientific' machines were word-addressable; the word size was a compromise between a desire for arithmetic range and precision, and the cost of memory (and address bits). 36 bits was a common word size.

What became clear was that addressing down to the character level was useful for many purposes; even scientific machines needed to generate readable output.

Prior to the IBM System/360, IBM had separate scientific and business machines, with consequent effects on cost of hardware design and software support. The S/360 was intended to unify the two product sets. One of the design principles was 'addressable down to small units', preferably to individual characters.

The main decision then was how large a character should be. The two contenders were 6 and 8 bits, discussed at some depth in Architecture of the IBM System/360 by Amdahl, Brooks, and Blaauw – see page 91. To cut the story short, 8 bits won.

Industry considerations led to this being adopted by other designers, to the extent that it is nearly ubiquitous today.

We use 'byte' for 8-bit units because the 8-bit unit is common. If the standard unit size were anything different, we'd be calling that size a byte.

15
  • 1
    Eight bits = one octet. “Oct” like the eight legged or eight armed octopus. Commented Oct 20 at 15:07
  • 4
    The term 'byte' was invented for the design of Stretch (IBM 7030), where it was used to mean 'any field of 1 to 8 bits'. Other computer architectures used it for any field less than a full word. people.computing.clemson.edu/~mark/stretch.html Commented Oct 20 at 15:14
  • 13
    @gnasher729 - good point. The term 'octet' is necessary precisely because 'byte' did not unambiguously mean '8 bits'. Commented Oct 20 at 15:23
  • 6
    @gidds A byte is a group of bits usually good for a character-like data element. Depending on machine and code set, this has historically been anything from 5 to 12 bits. This is why even modern standards still start by defining a byte as 8 bits - as that is what is commonly used today (see below) - unless they prefer the term octet instead, as that is always 8 bits ('okto' is Greek for eight). Commented Oct 20 at 15:25
  • 2
    And 'word' is a lot more convenient to say than 'sequence of 8 contiguous bytes starting at an address that is an integral multiple of 8'. Commented Oct 20 at 17:47
8

TL;DR:

  • It's a memory word.
  • It was the IBM /360 in 1965.
  • It was to allow optimal memory utilization for common data types.

The Questions:

What was the minimum amount of addressable memory?

One item? But I guess you mean the minimum data item size the hardware can access. That's a memory word, which depends on the implementation used. It can be a bit, a nibble, a byte, or any larger unit: a word, double word, or quadruple word.

Note that this may differ from what the ISA defines as a word. For example, an 8088 has an 8-bit memory word but a 16-bit ISA word.

The x86 evolution gives a nice example here. All members use byte addressing and 16/32 bit words/double words, but have different memory words:

  • 8088 - 8 bit
  • 8086 - 16 bit
  • 80286 - 16 bit
  • 80386SX - 16 bit
  • 80386DX - 32 bit
  • Pentium - 64 bit
  • ...

I'm watching this video, which, at 9:51, says that around late 1980s the majority of computers were byte-addressable.

Well, bytes are way older, but I guess you're referring to a byte of 8 bits (*1). So yes, by the 1980s the 8-bit byte as the basic character unit, as well as byte addressing, was pretty much common sense.

So the minimum amount of data you can retrieve from RAM was 8 bit.

Note: byte addressing does not imply that a byte is 8 bits.

Byte addressable only means that the smallest common data unit that can be named (addressed) in all (most *2) instructions is smaller than a memory word (*3). What is fetched from that address depends on the instruction. It may well be more than a byte, like a word or double word.

Likewise, this is only the view from the instruction side. The CPU's hardware (i.e. the memory interface) would usually fetch (at least *4) one memory word. That could be 32 bits on a Motorola 68030 or Intel 80486, or 64 bits on a Pentium.
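
To make that concrete, here is a small C sketch (a generic illustration, not modelled on any particular CPU; function names are mine) of what a 'load byte' looks like when the memory interface can only deliver whole 32-bit memory words: the controller fetches the aligned word, and the low address bits merely select one byte lane out of it.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 32-bit-wide memory: the "bus" can only deliver whole words. */
static uint32_t memory_words[2] = { 0x44434241u, 0x48474645u };

/* One bus cycle: fetch the aligned 32-bit word containing the byte address. */
static uint32_t fetch_word(uint32_t byte_addr)
{
    return memory_words[(byte_addr & ~3u) / 4];
}

/* What a 'load byte' instruction effectively does on such a machine: the low
   two address bits never reach the memory array, they only pick one byte
   lane out of the fetched word (little-endian lane order assumed here). */
static uint8_t load_byte(uint32_t byte_addr)
{
    uint32_t word = fetch_word(byte_addr);
    return (uint8_t)(word >> (8 * (byte_addr & 3u)));
}

int main(void)
{
    for (uint32_t a = 0; a < 8; a++)
        printf("byte at %u = %c\n", (unsigned)a, load_byte(a)); /* prints A..H */
    return 0;
}
```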

Not to mention that most RAMs internally use additional buffers. For example, a byte-wide RAM built from 4116 RAM chips will fetch 128 bytes from its storage matrix but only deliver one of them.

Also, byte does not always mean 8 bits - especially not early on. A byte was originally an abstract term for a character-sized cell and could be anything from 6 to 12 bits, depending on the machine. Heck, some machines even allowed bytes of variable size.

What was the minimum amount of data you could retrieve from RAM before that?

At least in the beginning it was always a machine word. So 18 bits for an 18-bit computer, 36 for a 36-bit machine, and so on.
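
What that meant in practice: getting at a single character on such a machine was a software job. Here is a rough C sketch (using a 64-bit integer to stand in for a 36-bit word holding six 6-bit characters; the helper names are mine) of the shift-and-mask work a word-addressed machine forces on you:

```c
#include <stdint.h>
#include <stdio.h>

/* A 36-bit word holding six 6-bit characters, packed left to right,
   kept in the low 36 bits of a 64-bit integer. */
typedef uint64_t word36;

/* Extract character n (0 = leftmost) from a word: shift and mask by hand. */
static unsigned get_char6(word36 w, int n)
{
    int shift = 30 - 6 * n;                 /* leftmost char lives in bits 35..30 */
    return (unsigned)((w >> shift) & 077);  /* 6-bit mask, written in octal */
}

/* Replace character n in a word: clear the old field, insert the new one. */
static word36 put_char6(word36 w, int n, unsigned c)
{
    int shift = 30 - 6 * n;
    w &= ~((word36)077 << shift);
    w |= (word36)(c & 077) << shift;
    return w;
}

int main(void)
{
    word36 w = 0;
    for (int i = 0; i < 6; i++)             /* pack the character codes 1..6 */
        w = put_char6(w, i, (unsigned)(i + 1));
    for (int i = 0; i < 6; i++)
        printf("char %d = %o\n", i, get_char6(w, i));
    return 0;
}
```

Byte addressing makes exactly this shift/mask dance unnecessary for character data.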

Oh, and then there are of course serial architectures which will always have to read multiple bits to deliver a byte :))

Why did they switch to 8 bit eventually?

The when and why are rather clear:

  • In 1965, with the introduction of the IBM /360 series
  • To accommodate multiple data types with the same ease

IBM's goal for the /360 was to replace the different architectures they had with a single one fitting all use cases. Until then, IBM maintained several major incompatible CPU lines: some for data processing with large 36-bit word integers, some with 36/72-bit floats, some mainly handling decimal numbers (BCD) as well as character-based types. Not to forget the still existing large installations of electro-mechanical punch card processing.

The 360 in System/360 was meant to symbolize all-around capability - like the 360 degrees of a circle. So when looking at their portfolio, they noted 5 different data types that this one machine needed to handle, all at comparable performance and with as few exceptions/special cases as possible.

  • Decimal data (BCD)
  • Character data (bytes)
  • Integer data
  • Floating point data
  • Address data

Some of them conflict when it comes to efficient handling, and not the ones you may think of. Integer, float and address are all not very picky about size: whether an integer is 32 or 36 bits, or a float is 64 or 72, doesn't make much difference for most applications. The real problem lies with those pesky little decimal numbers and characters.

IBM's accounting machines already worked with byte handling, but their bytes were 6 bits - as was common in the 1950s (*5). BCD, on the other hand, fits in 4 bits. And BCD was THE main data type (*6). It was where the customers were and where the money went. So making a machine with 6-bit bytes would have worked fine for characters (and in turn for addresses, integers and floats), but would leave a whopping 1/3 of memory unused when storing decimal digits. Memory was expensive.

As a result, the /360 architecture was designed around BCD digits of 4 bits each. That way decimal numbers could be stored in the most efficient way. Taken alone, this would mean the machine would best address each 4-bit digit as its base unit. Then again, one rarely needs to address single digits in a machine that operates on strings of digits (*7). A decimal number rarely comes as a single digit - especially not in accounting.
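
As a quick illustration of the packed-decimal idea (plain C of my own, not /360 code; the helper names are invented): two 4-bit digits share one 8-bit byte, so decimal data wastes nothing.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack two decimal digits (0..9 each) into one 8-bit byte:
   high nibble = first digit, low nibble = second digit. */
static uint8_t pack_bcd(unsigned hi, unsigned lo)
{
    return (uint8_t)((hi << 4) | (lo & 0x0F));
}

/* Unpack a packed-BCD byte back into its two digits. */
static void unpack_bcd(uint8_t b, unsigned *hi, unsigned *lo)
{
    *hi = b >> 4;
    *lo = b & 0x0F;
}

int main(void)
{
    /* The number 1965 stored as two packed-BCD bytes: 0x19 0x65. */
    uint8_t year[2] = { pack_bcd(1, 9), pack_bcd(6, 5) };
    unsigned hi, lo;

    for (int i = 0; i < 2; i++) {
        unpack_bcd(year[i], &hi, &lo);
        printf("%u%u", hi, lo);             /* prints 1965 */
    }
    printf("\n");
    return 0;
}
```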

The next bigger building block (data type) would be the character. Combining two BCD digits into one character cell seems logical. Sure, with a 6-bit character set this would waste two of the 8 bits that two BCD digits occupy - but that's already only 1/4. Also, character data is used way less than decimal numbers, so overall this is way more efficient. Not to mention that the time was right to also include lower case letters, which would need at least a 7-bit character cell (*8). Characters are also something that one wants to access.

So 8-bit bytes and byte addressing are the sensible lower limit for data access. All other ("higher") types can easily be constructed to build upon those 8-bit bytes:

  • 16 bit for Half Words (short)
  • 32 bit for Words (long)
  • 64 bit for Double Words
  • 32 bit for Short Float
  • 64 bit for Float
  • 128 bit for Extended Float (*9)

and finally a

  • 32 bit Address Word

Since they are multiples of 8 bits, they can just as well be addressed using a byte address, so no problem there either (*10).
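
A tiny, hypothetical check of that property (generic C, not tied to any real machine): with byte addresses, the only restriction on the larger types - at least on the original aligned-access models (see *10) - is that the byte address be a multiple of the operand size.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Is a byte address acceptable for an operand of the given size in bytes
   (2 = halfword, 4 = word, 8 = doubleword)?  This mirrors the "address must
   be a multiple of the operand size" rule, not any specific machine. */
static bool is_aligned(uint32_t byte_addr, uint32_t size)
{
    return byte_addr % size == 0;
}

int main(void)
{
    uint32_t addrs[] = { 0, 2, 5, 8, 10, 12 };

    for (int i = 0; i < 6; i++)
        printf("addr %2u: halfword %s, word %s, doubleword %s\n",
               (unsigned)addrs[i],
               is_aligned(addrs[i], 2) ? "ok" : "fault",
               is_aligned(addrs[i], 4) ? "ok" : "fault",
               is_aligned(addrs[i], 8) ? "ok" : "fault");
    return 0;
}
```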

And as they say, The Rest is History

With IBM being THE normative force in business computing, it became mandatory to handle their data types without much hassle - most importantly the 8-bit byte. What company would buy smaller/complementary computers if they couldn't exchange data with the company's mainframe? What department would order one if it needed more complicated programming to do the same?

As a result, next to all other manufacturers moved their new designs to fit the 8-bit byte and, in turn, 16/32/64-bit word sizes. Sure, a few, like DEC's 36-bit systems, held out longer due to a perfect fit for a certain niche, but in the end - that's the 1980s you mentioned - VAXen replaced PDP-10s.

And microprocessors, which only became a thing a good 10 years after the /360, already started out with the 8-bit byte. History repeats: there is no sense in going a different way when most modern devices are already 8-bit based.

Heck, in Intel's case it was explicitly designed to match IBM's 8-bit byte. The original 8008 (*11) is a single-chip implementation of the Datapoint 2200 terminal's CPU, and the Datapoint 2200 was designed from scratch to work as a data entry station for IBM /360 systems.

So yes, there is a straight line from the /360 all the way to whatever PC is on your desk.


*1 - In ye olde times, bytes could be anywhere from 6 to 12 bits, depending on machine structure, with 6, 8 and 9 being the most common values.

*2 - Well, most, as some instructions may not accept an arbitrary byte address even if their address field could hold one. This is for example the case for implementations that require larger data items (words, etc.) to be aligned to boundaries - words have to be on word addresses, so for a 32-bit memory only on byte addresses x0, x4, x8, and so on.

*3 - Well, or equal to it, for the fringe case where byte and memory word are the same size - like on an Intel 8088.

*4 - When accessing unaligned data, or data wider than the machine word, the CPU has to do multiple memory access cycles. Just imagine a 16-bit wide memory interface. When accessing a byte, the CPU does one fetch, no matter whether it is located at an even or odd address. Same when fetching a word from an even byte address. But doing so from an odd address, it needs two 16-bit reads (= 32 bits of data) to deliver one 16-bit data item.
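
A sketch of that cycle counting (generic C, my own illustration, assuming a simple controller without caches or write buffers): the number of bus transactions is just the number of bus-wide aligned chunks the access touches.

```c
#include <stdint.h>
#include <stdio.h>

/* How many bus cycles does an access of 'size' bytes at byte address 'addr'
   need on a bus 'bus_bytes' wide?  Simply count the aligned bus-wide chunks
   the access spans (no caches or write buffers assumed). */
static unsigned bus_cycles(uint32_t addr, uint32_t size, uint32_t bus_bytes)
{
    uint32_t first_chunk = addr / bus_bytes;
    uint32_t last_chunk  = (addr + size - 1) / bus_bytes;
    return (unsigned)(last_chunk - first_chunk + 1);
}

int main(void)
{
    /* A 16-bit (2-byte) bus, as in the example above. */
    printf("byte at even address: %u cycle(s)\n", bus_cycles(0x100, 1, 2)); /* 1 */
    printf("byte at odd address:  %u cycle(s)\n", bus_cycles(0x101, 1, 2)); /* 1 */
    printf("word at even address: %u cycle(s)\n", bus_cycles(0x100, 2, 2)); /* 1 */
    printf("word at odd address:  %u cycle(s)\n", bus_cycles(0x101, 2, 2)); /* 2 */
    return 0;
}
```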

*5 - Seriously, who on earth needs lower case or fancy punctuation?

*6 - Believe me, any money transaction any of us makes anywhere in the world will at some point run through a mainframe using BCD - 'cause that's what accounting wants for their Dollars and Pence, Marks and Centimes.

And those accounting guys are also the same ones authorizing orders for new computers - connect the dots :))

*7 - The /360's decimal operations handle signed decimal numbers of 1..31 digits at once, so there is usually no need to address each digit.

*8 - Always worth remembering that IBM was a main proponent of the definition of (7-bit) ASCII. It wasn't their fault that standardization took longer than expected, so they went ahead with their EBCDIC, using the full 8-bit range, anyway.

*9 - Only added some years later with the Model 85, so more of a /370 thing :)

*10 - Originally the /360 models only allowed aligned access, so while addresses were technically encoded as byte addresses, an instruction would fault if given an address not fitting its data type's alignment.

*11 - Of which all CPUs until the latest x86-64 variant are a direct evolution.

5
  • 2
    "doing so from an uneven address, it will need to do two 16 bit reads (=32 bit of data) to deliver one 16 bit data item" unless the CPU enforces alignment like the 68000. If you wanted to read/write a 16-bit word, it had to be on an even address, or it triggered an unaligned access exception. Commented Oct 20 at 14:28
  • @jcaron you may want to read the footnote in relation to where it is added. Commented Oct 20 at 15:20
  • Yep, it was IBM's market dominance that made everybody copy the 360. But one might consider an alternate history with IBM adopting a 7 bit byte to match the ASCII effort. Intel might have followed the 4004 with the 7007 and 7070 ツ Commented Oct 20 at 18:29
  • @JohnDoty Hihi, true. Then again, using 7 bits would be even worse than continuing with 6 bits. Remember, BCD is what customer data were. Commented Oct 20 at 18:38
  • Yet the 1401, IBM's most popular machine prior to the 360 series, stored each BCD digit in a seven bit memory cell. Customers were happy with it. Commented 2 days ago
8

Very rough historical overview:

  • Computers started out as "calculating machines"; there was no RAM. Machines targeted at "Business" used a decimal representation (to avoid rounding errors), while "Scientific" machines used floating point or fixed point, some decimal, some binary. But all representations used many bits (30-40), and that was the word size.
  • Before RAM, computers used revolving magnetic drums, delay line memory, or storage tubes. Eventually core memory became the preferred main storage.
  • Before 8-bit bytes, sizes in multiples of 6 bits (conveniently written in octal notation) were common: e.g. DEC PDP computers used 12-bit, 18-bit, and 36-bit words. Characters in DEC machines often used a 6-bit representation, and multiple characters were stored in a word. Characters in other 36-bit machines (Honeywell) were 9 bits, again with multiple characters in a word.
  • Eventually computers converged on an 8-bit byte as the minimal "moveable" piece of data. Examples are the IBM/360 and the PDP-11. Often actual RAM accesses were still done in larger multiples of 8 bits, with the requested byte masked out.
10
  • 1
    Don't forget the 9 bit bytes of various Honeywell machines :) Commented Oct 20 at 15:26
  • @Raffzahn which Honeywell had 9-bit bytes as addressable unit? The 6000 series had 36-bit words, and to my understanding the 9-bit "bytes" were more like the "packed" characters in other word-based systems (e.g. IBM), but maybe I am wrong... Commented Oct 20 at 20:11
  • Erm, please do not insert what has not been said - in this case 'byte addressable'. The comment is in relation to your third point, which talks about byte/character size before 8 bits, not about how they are accessed. The very difference between a DEC (e.g. PDP-10) and a Honeywell was that one packed six 6-bit characters into a 36-bit word, while the other packed four 9-bit characters - both doing word addressing. Wasn't it? So that's exactly the addition supplied here. Commented Oct 20 at 22:06
  • @Raffzahn: "do not insert what has not been said" -- but it was, in the question title and again in the question body. Anything not discussing addressable units would be off-topic on this page. Commented 2 days ago
  • @BenVoigt Note that this was a comment to an answer, not the question, or do you want to say Dirk's answer is off topic? Commented 2 days ago
7

The Burroughs B1000 Series - also known as the Burroughs Small Systems - was bit-addressable, meaning individual bits could be directly addressed in memory. Its data path, however, was 24 bits wide, a design that balanced fine-grained addressing with practical data handling. Later models moved to 32-bit granularity in their physical RAM implementation, even though the architecture seen by software remained bit-addressable.
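
To illustrate what bit addressing means in practice, here is a rough C model (my own simplification, not the actual B1700 hardware or microcode; the bit-numbering convention and function name are mine) of reading a field of arbitrary width starting at an arbitrary bit address:

```c
#include <stdint.h>
#include <stdio.h>

/* A toy bit-addressable memory: bit address 0 is the most significant bit
   of byte 0.  The numbering convention is chosen for readability only. */
static const uint8_t mem[] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Read a field of 'width' bits (1..32) starting at bit address 'bit_addr'. */
static uint32_t read_bits(uint32_t bit_addr, unsigned width)
{
    uint32_t result = 0;
    for (unsigned i = 0; i < width; i++) {
        uint32_t b = bit_addr + i;
        unsigned bit = (mem[b / 8] >> (7 - b % 8)) & 1u;
        result = (result << 1) | bit;
    }
    return result;
}

int main(void)
{
    printf("4 bits at bit 0  = 0x%X\n", (unsigned)read_bits(0, 4));  /* 0xD   */
    printf("8 bits at bit 4  = 0x%X\n", (unsigned)read_bits(4, 8));  /* 0xEA  */
    printf("12 bits at bit 6 = 0x%X\n", (unsigned)read_bits(6, 12)); /* 0xAB6 */
    return 0;
}
```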

If we look even further back in computing history, we find ourselves in the era of rotating-drum serial computers, where data was stored and accessed sequentially on spinning magnetic drums.

4

Prior to things converging on byte-addressing, there was no standard. Some machines were bit-addressable, while others used word addressing... which is why you'll see old mainframes and minicomputers having their memory capacity given in kWords.

Here's the examples section from that Wikipedia page:

  • The ERA 1103 uses word addressing with 36-bit words. Only addresses 0–1023 refer to random-access memory; others are either unmapped or refer to drum memory.
  • The PDP-10 uses word addressing with 36-bit words and 18-bit addresses.
  • Most Cray supercomputers from the 1980s and 1990s use word addressing with 64-bit words. The Cray-1 and Cray X-MP use 24-bit addresses, while most others use 32-bit addresses.
  • The Cray X1 uses byte addressing with 64-bit addresses. It does not directly support memory accesses smaller than 64 bits, and such accesses must be emulated in software. The C compiler for the X1 was the first Cray compiler to support emulating 16-bit accesses.[1]
  • The DEC Alpha uses byte addressing with 64-bit addresses. Early Alpha processors do not provide any direct support for 8-bit and 16-bit memory accesses, and programs are required to e.g. load a byte by loading the containing 64-bit word and then separately extracting the byte. Because the Alpha uses byte addressing, this offset is still represented in the least significant bits of the address (rather than separately as a wide address), and the Alpha conveniently provides load and store unaligned instructions (ldq_u and stq_u) which ignore those bits and simply load and store the containing aligned word.[2] The later byte-word extensions to the architecture (BWX) added 8-bit and 16-bit loads and stores, starting with the Alpha 21164a.[3] Again, this extension was possible without serious software incompatibilities because the Alpha had always used byte addressing.

(Note that the DEC Alpha was introduced in 1992 and the Cray X1 was introduced in 2003. Addressing only converged on bytes where the advantage of consistency outweighed the advantage of doing something different.)
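
As a rough illustration of the pre-BWX byte load described in the Alpha bullet above (plain C rather than Alpha assembly; the function name is invented): the byte address is kept as-is, but the hardware load fetches the containing aligned 64-bit word, and the low three address bits only select which byte to extract.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Emulate a byte load on a machine that only has aligned 64-bit loads
   (little-endian, as the Alpha is): mask the low 3 address bits off for
   the load itself, then use them to pick the byte out of the word. */
static uint8_t load_byte_via_quadword(const uint8_t *base, uintptr_t byte_addr)
{
    uintptr_t aligned = byte_addr & ~(uintptr_t)7;    /* roughly what ldq_u does */
    uint64_t quad;
    memcpy(&quad, base + aligned, sizeof quad);       /* aligned 64-bit load     */
    return (uint8_t)(quad >> (8 * (byte_addr & 7)));  /* byte extraction         */
}

int main(void)
{
    /* 16 bytes of test data; the union keeps the buffer 8-byte aligned.
       The byte selection above assumes a little-endian host. */
    union { uint64_t align[2]; uint8_t bytes[16]; } buf;
    for (int i = 0; i < 16; i++)
        buf.bytes[i] = (uint8_t)(0xA0 + i);

    for (uintptr_t a = 5; a < 11; a++)
        printf("byte %2u = 0x%02X\n", (unsigned)a,
               (unsigned)load_byte_via_quadword(buf.bytes, a));
    return 0;
}
```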

As for why they did it: from what I remember of old reading, it was because 8 bits was a good balance point. It was large enough for ASCII and small enough not to be wasteful, unlike sizes rooted in 6, like 12-bit or 36-bit addressing, and it was convenient that it was a power of 2.

Prior to that, there was a split between "business machines" and "scientific machines" that also influenced each machine's choice of word sizes, given how different the kinds of math they were performing were.

(While not literally using the same techniques for storing numbers, the split between what business needed and what scientists needed is the distant ancestor of the "integer vs. floating point" split in modern language primitives. Financial computing still implements fixed-point decimals on top of integer operations to avoid rounding errors.)

5
  • 1
    "Why" - the focus was on computation; word sizes were chosen for balance between cost (of address bits) and range/precision of numeric operands. Byte addressing allows access at the character level without the need for explicit shift/mask operations or special insert/extract instructions. Commented Oct 20 at 0:22
  • 1
    Early on in language design, it was not 'integer versus floating point' but 'integer versus real'. Kids today don't realize you can have real numbers without floating point. Commented Oct 20 at 0:24
  • @dave Good point. I should have been clear that I didn't mean they were literally doing integer vs. float. (BCD has entered the chat. :P) Edited. Commented Oct 20 at 0:29
  • @ssokolow :)) Then again, real does not necessarily mean BCD - nor does float mean power of 2 :)) Commented Oct 20 at 15:29
  • 2
    @dave REAL was always a misnomer. Floats are a subset of the rationals. Commented Oct 20 at 18:31
1

The Motorola MC14500B, marketed as an "Industrial Control Unit", is a 4-bit (instruction width) CPU that operates on a single data line (i.e. its bi-directional "data bus" is one bit wide) in pure serial fashion. It has 16 instructions and supports conditionals, jumps, subroutines, ...

Doesn't get any simpler.
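
Purely to illustrate the 1-bit datapath idea, here is a toy interpreter in C for a handful of MC14500B-style operations. The mnemonics follow the real chip only loosely; the program encoding is invented for illustration, and the real chip's IEN/OEN gating and flow-control pins are omitted.

```c
#include <stdbool.h>
#include <stdio.h>

/* A tiny 1-bit machine: one result register (RR), a few 1-bit inputs and
   outputs.  Simplified model, not the actual MC14500B instruction set. */
enum op { LD, LDC, AND, OR, XNOR, STO };

struct insn { enum op op; int io; };      /* io selects an input or output line */

int main(void)
{
    bool in[4]  = { true, false, true, true };  /* sampled input lines   */
    bool out[2] = { false, false };             /* latched output lines  */
    bool rr = false;                            /* the 1-bit result reg  */

    /* out[0] = in[0] AND in[1];  out[1] = (NOT in[2]) OR in[3] */
    struct insn prog[] = {
        { LD,  0 }, { AND, 1 }, { STO, 0 },
        { LDC, 2 }, { OR,  3 }, { STO, 1 },
    };

    for (size_t pc = 0; pc < sizeof prog / sizeof prog[0]; pc++) {
        bool d = in[prog[pc].io];               /* the single data bit    */
        switch (prog[pc].op) {
            case LD:   rr = d;                  break;
            case LDC:  rr = !d;                 break;
            case AND:  rr = rr && d;            break;
            case OR:   rr = rr || d;            break;
            case XNOR: rr = (rr == d);          break;
            case STO:  out[prog[pc].io] = rr;   break;
        }
    }
    printf("out[0]=%d out[1]=%d\n", out[0], out[1]);   /* prints 0 and 1 */
    return 0;
}
```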
