Computers can't perfectly represent most decimal numbers!
YES! You read that right... even I was like, "Wait, what? That can't be right!" But then I remembered the story of the Ariane 5 rocket exploding because of a tiny numeric-conversion error, which I saw on an IG reel. That's why precision matters!
We naturally work in decimal (base-10), while computers use binary (base-2) arithmetic. This simple difference creates complex problems. For example, the seemingly simple number 0.1 becomes an endless repeating sequence in binary: 0.00011001100110011...
and
0.1 + 0.7 ≠ 0.8
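You can check this in any language that uses standard 64-bit doubles; here's a quick Python sketch:

```python
# 0.1 and 0.7 can't be stored exactly in binary, so their sum drifts off 0.8
result = 0.1 + 0.7
print(result)         # 0.7999999999999999
print(result == 0.8)  # False
```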
To solve this problem of representing values consistently, the IEEE 754 standard was introduced. For a detailed history, read the IEEE 754 Wikipedia article.
To handle these decimal-to-binary challenges, computers use the IEEE 754 floating-point format. There are two main ways computers store these numbers:
- Single Precision
- Double Precision
What is Mantissa?
The mantissa (also called the significand) holds the significant digits of a number. In decimal, for 3.745 the fractional part is 0.745; in IEEE 754, the mantissa is the string of bits that follows the leading 1 once the number is written in binary scientific notation.
Now let's see how a computer converts a decimal (floating-point) number into binary, using the example of 0.1 + 0.7.
We already know for a fact that the result != 0.8 (!= means "not equal to").
Step 1: Convert 0.1 and 0.7 into Binary numbers
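This step can be sketched with the classic repeated-doubling method (a minimal sketch; `frac_to_binary` and the 16-bit cutoff are my own choices, not part of the standard):

```python
def frac_to_binary(x, bits=16):
    """Convert the fractional part of x to binary by repeatedly doubling it."""
    out = []
    for _ in range(bits):
        x *= 2
        bit = int(x)      # 1 if doubling crossed 1.0, else 0
        out.append(str(bit))
        x -= bit          # keep only the fractional part
    return "0." + "".join(out)

print(frac_to_binary(0.1))  # 0.0001100110011001...
print(frac_to_binary(0.7))  # 0.1011001100110011...
```

Notice both fractions repeat forever — the `0011` pattern never terminates.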
Step 2: Convert the Binary into Scientific Notation
When we try to represent the decimal number 0.1 in binary, we get the repeating binary fraction 0.000110011... We convert this into scientific notation, just like in decimal we write 1234 = 1.234 × 10³.
Here we are converting a binary number into scientific notation.
How does it work? We did these 2 steps:
- Move the binary point to the right of the first '1'
- Count how many places you moved the point; that count becomes your exponent

Applied to 0.000110011...:
- The first 1 appears at the 4th position after the point
- Moving the point 4 places right gives 1.10011...
- Multiply by 2⁻⁴, because you moved the point right
Quick Note:
If the point moved left, the exponent is positive.
If the point moved right, the exponent is negative.
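Python's standard library can confirm this normalization: `math.frexp` splits a float into m × 2^e with 0.5 ≤ m < 1, and one extra doubling gives the 1.xxx form used above:

```python
import math

m, e = math.frexp(0.1)   # 0.1 == m * 2**e, with 0.5 <= m < 1
print(m, e)              # 0.8 -3
print(m * 2, e - 1)      # 1.6 -4  ->  0.1 == 1.6 * 2**-4, i.e. 1.10011... × 2^-4
```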
Step 3: Whether the fraction is recurring or not, convert it into the IEEE 754 32-bit format
Bias = 2^(k-1) - 1
Stored exponent = 2^(k-1) - 1 + P
here
k = number of bits used to represent the exponent
(8 bits for single precision, 11 bits for double precision)
P = the power of 2 from the scientific notation
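Plugging in single precision (k = 8) and our exponent P = −4 from Step 2:

```python
k = 8                    # exponent bits in single precision
P = -4                   # power of 2 from 0.1's scientific notation
bias = 2 ** (k - 1) - 1  # 127
stored = bias + P        # 123
print(bias, stored, format(stored, '08b'))  # 127 123 01111011
```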
Once we get the stored exponent, i.e. 127 + (−4) = 123, **convert it into binary**.
Represent this in the IEEE 754 single-precision format.
Note:
We take the mantissa bits from the scientific notation when representing the number in IEEE 754 format.
This is how a floating-point number is converted into binary. We follow the same process for 0.7 as well.
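You can inspect the bit fields a machine actually stores using Python's `struct` module (`float32_fields` is a hypothetical helper name; note that real hardware rounds to nearest, so 0.1's last mantissa bit comes out as 1 instead of the truncated 0 shown in the walkthrough):

```python
import struct

def float32_fields(x):
    """Pack x as an IEEE 754 single and split it into sign/exponent/mantissa bits."""
    raw = struct.unpack('>I', struct.pack('>f', x))[0]  # the 32 bits as an integer
    bits = format(raw, '032b')
    return bits[0], bits[1:9], bits[9:]

for v in (0.1, 0.7):
    sign, exp, mant = float32_fields(v)
    print(v, sign, exp, f"(= {int(exp, 2)})", mant)
```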
Here comes the final part:
To add 2 floating-point numbers we need to align the exponents. Just like when you add decimal numbers:
  0.1
+ 0.7
you line up the decimal points. In binary, we do something similar — we align the exponents so that both numbers use the same power of 2.
So we shift the mantissa of 0.1 to the right by 3 bits to match the exponent of 0.7 (because going from −4 to −1 takes 3 steps).
Now both numbers share the same exponent, −1, so they can be added easily.
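Here's a toy sketch of the alignment, assuming 8-bit mantissas (leading 1 included) instead of the real 23 bits, just to keep the numbers readable:

```python
# 0.1 ≈ 1.10011010₂ × 2^-4 and 0.7 ≈ 1.01100110₂ × 2^-1 (rounded to 8 fraction bits)
m1, e1 = 0b110011010, -4   # mantissa of 0.1, leading 1 included
m2, e2 = 0b101100110, -1   # mantissa of 0.7
shift = e2 - e1            # 3 places
m1 >>= shift               # shift 0.1's mantissa right; its exponent becomes -1
total = m1 + m2            # both share exponent -1 now, so we can add
value = total / 2 ** 8 * 2 ** e2
print(bin(total), value)   # close to 0.8 at this toy precision
```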
The sum already starts with 1., so it is normalized, which is what IEEE 754 wants; no shifting of the point is needed. The stored exponent stays at 126 (same as 0.7's, because we repeat Step 3: 127 + (−1) = 126).
So the final representation of 0.1 + 0.7 in binary is
0 01111110 10011001100110011001100
Let's convert this binary result of 0.1 + 0.7 back to decimal:
Hence 0.1 + 0.7 ≈ 0.7999999523
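You can verify that figure by decoding the bit pattern by hand:

```python
sign_bit = 0
exp_bits = '01111110'                  # stored exponent 126 -> actual power 126 - 127 = -1
mant_bits = '10011001100110011001100'  # 23 mantissa bits
value = (-1) ** sign_bit * (1 + int(mant_bits, 2) / 2 ** 23) * 2 ** (int(exp_bits, 2) - 127)
print(value)  # 0.7999999523162842
```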
Note:
Computers can't store endless digits. They have a fixed number of bits (like 32 or 64) to represent a number, so if a number is too long, the computer has to round it off. Rounding is used when it's impractical or unnecessary to keep full precision, yet even tiny rounding issues can turn into big errors. Check the IEEE 754 rounding rules for details.
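For example, round-tripping 0.1 through 32 bits (instead of the 64 bits Python normally uses) exposes the rounding:

```python
import struct

x = 0.1                                             # stored as a 64-bit double
x32 = struct.unpack('>f', struct.pack('>f', x))[0]  # squeezed into 32 bits and back
print(x)    # 0.1
print(x32)  # 0.10000000149011612 -- precision lost to rounding
```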