0 votes
0 answers
6 views

Surprising result in mixed integer/double arithmetic in Swift

I am working with Swift 5 in Xcode 16.4. I have the following bit of code: let x = 20.0; let a = 1/6; let b = 1/6*x; let c = (1/6)*x; let d = Double(a)*x and when I stop at the next ...
Robert Dodier
-2 votes
0 answers
44 views

Why is 0.1*0.1 not equal to 1e-2 in Python? [duplicate]

I'm using Python 3.12.10 on Spyder, packaged through conda-forge on a Windows x64 machine. I'm seeing the following output: 0.1 - 1e-1 Out[1]: 0.0 and 0.1*0.1 - 1e-2 Out[3]: 1.734723475976807e-18 ...
NNN
  • 589
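For the Python question above, here is a minimal sketch (standard library only) of one way to inspect the discrepancy. Both 0.1 and 1e-1 parse to the same double, but the product 0.1*0.1 is rounded again after the multiplication, so it need not equal the double nearest to 0.01. The variable names are illustrative, and `math.isclose` with its default tolerance is only one common way to compare such values.

```python
import math
from fractions import Fraction

same_literal = (0.1 == 1e-1)      # both literals parse to the same double
product = 0.1 * 0.1               # rounded result of the multiplication
literal = 1e-2                    # double nearest to 0.01
difference = product - literal    # the excerpt reports 1.734723475976807e-18

# Converting a float to Fraction is exact, so these show the stored values.
exact_product = Fraction(product)
exact_literal = Fraction(literal)

# Tolerance-based comparison instead of ==
close_enough = math.isclose(product, literal)

print(same_literal, difference, close_enough)
print(exact_product == Fraction(1, 100), exact_literal == Fraction(1, 100))
```

Neither stored value is exactly 1/100, which is why a tolerance-based comparison is usually preferred over `==` for computed results.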
1 vote
0 answers
119 views

How are you supposed to normalize a fixed point signed number? [closed]

Let's say you have a signed 8-bit integer, and you want to turn it into a normalised floating point number. Do you: (1) divide by 127 and clamp any values < -1.0 to -1.0, or (2) divide by 128 and accept that ...
Zebrafish
  • 16k
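The two options named in the fixed-point question above can be sketched as follows. This is only an illustration in Python, with hypothetical function names, and it does not settle which convention is preferable for a given application.

```python
def normalize_clamped(value: int) -> float:
    """Divide by 127 and clamp, so both -128 and -127 map to -1.0."""
    return max(value / 127.0, -1.0)


def normalize_by_128(value: int) -> float:
    """Divide by 128, so -128 maps to -1.0 and 127 maps to 127/128."""
    return value / 128.0


# Compare the two conventions at the ends of the signed 8-bit range.
for v in (-128, -127, 0, 127):
    print(v, normalize_clamped(v), normalize_by_128(v))
```

The first variant reaches both -1.0 and 1.0 but maps two inputs to -1.0; the second keeps every input distinct but never reaches 1.0.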
2 votes
1 answer
216 views

Why does Newton’s method overshoot on the first deceleration step in my motion profile generator?

I’m porting a Python motion profile generator to C to implement for my STM32H743. The generator produces step timings for a simple acceleration → cruise → deceleration motion profile. See the ...
Marvin W
0 votes
1 answer
273 views

Inaccuracy replicating Fortran mixed-precision expression in Rust

I have the following code in my Fortran program, where both a and b are declared as REAL (KIND=8): a = 0.12497443596150659d0; b = 1.0 + 0.00737 * a. This yields b as 1.0009210615647672. For comparison, ...
sgfw
  • 356
4 votes
1 answer
146 views

Weird behavior in large complex128 NumPy arrays, imaginary part only [closed]

I'm working on numerical simulations. I ran into an issue with large NumPy arrays (~ 26 GB) on Linux with 128 GB of RAM. The arrays are of type complex128. Arrays are instantiated without errors (if ...
laserpropsims
0 votes
2 answers
238 views

turn Python float argument into numpy array, keep array argument the same

I have a simple function that is math-like: def y(x): return x**2 I know the operation x**2 will return a NumPy array if supplied a NumPy array and a float if supplied a float. For more complicated ...
villaa
  • 1,259
25 votes
12 answers
3k views

How can I parse a string to a float in C in a way that isn't affected by the current locale?

I'm writing a program where I need to parse some configuration files in addition to user input from a graphical user interface. In particular, I'm having issues with parsing strings taken from the ...
Newbyte
  • 3,865
2 votes
0 answers
172 views

Why does floating point division take less than 50% of the latency of integer division and also 10x more latency than usual when underflow occurs?

I am measuring the latency of instructions. For 64-bit primitives, integer division takes about 25 cycles each, usually on my 2.3GHz Digital Ocean vCPU, while floating point division takes about 10 ...
Zack Light
4 votes
2 answers
297 views

Why does adding a value to Float.MAX_VALUE not reach infinity?

According to the standard, overflow in Java is handled using a special value called infinity, but here the sum is 3.4028235E38. Why is this the case? public class FloatingPointTest { public static ...
saul goodman
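The behaviour asked about in the Float.MAX_VALUE question above can be illustrated with an analogous sketch. It assumes Python's 64-bit `float` rather than Java's 32-bit `float`, so the magnitudes differ, but the same round-to-nearest rule applies: an addend much smaller than the spacing between adjacent values at the top of the range is rounded away, and only a sufficiently large addend produces infinity. The variable names are illustrative.

```python
import math
import sys

largest = sys.float_info.max     # largest finite double
spacing = math.ulp(largest)      # gap between largest and the next double below it

small_step = largest + 1.0       # 1.0 is far below half the spacing: rounds back
big_step = largest + spacing     # a full spacing step exceeds the finite range

print(small_step == largest)     # the sum is unchanged
print(math.isinf(big_step))      # the sum overflowed to infinity
```

`math.ulp` requires Python 3.9 or later.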
6 votes
1 answer
244 views

Speeding up integer division with doubles

I have a fixed-point math-heavy project and I was looking to speed up integer divisions. I tested double division with SSE4 and AVX2 and got nearly 2x speedup versus scalar integer division. I wonder ...
M.kazem Akhgary
0 votes
1 answer
103 views

GCC offers a _Float16 type, but - what about the functions to work with it?

GCC offers a 16-bit floating point type, outside of the C language standard: _Float16 - at least for x86_64. This allowance is described here. However - the GCC documentation does not seem to indicate ...
einpoklum
  • 137k
5 votes
1 answer
137 views

Is it expected that vmapping over different input sizes for the same function impacts the accuracy of the result?

I was surprised to see that depending on the size of an input matrix, which is vmapped over inside of a function, the output of the function changes slightly. That is, not only does the size of the ...
hvater
  • 100
3 votes
2 answers
183 views

How does Oracle convert decimal values to float?

If I have a float(5) column, why does 7.89 get rounded to 7.9 but 12.79 gets rounded to 13, not 12.8? Binary forms are as follows for 3 examples: 7.89 0111.01011001 ------ round to ------> 7.9 ...
titi zarif
0 votes
1 answer
216 views

How can a long double be that big in C++? [duplicate]

The sizeof(long double) is 8, which means that if I use all the bits for the integer part of an unsigned number, I can maximum store 2^64-1=18446744073709551615. However, std::numeric_limits<long ...
alekscooper
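For the long double question above, here is a hedged sketch in Python. It assumes the 8-byte `long double` in the excerpt uses the same 64-bit IEEE 754 format as Python's `float`, which the reported size suggests but the excerpt does not confirm. An 8-byte floating-point value spends some of its bits on an exponent, so its maximum far exceeds 2^64-1, at the cost of not representing every integer of that magnitude.

```python
import struct
import sys

double_size = struct.calcsize("d")        # size in bytes of a native C double
largest_uint64 = 2**64 - 1                # maximum of a 64-bit unsigned integer
largest_double = sys.float_info.max       # maximum finite double

exceeds_uint64 = largest_double > largest_uint64

# The trade-off: neighbouring integers near 2**64 share one double.
collides = float(2**64 + 1) == float(2**64)

print(double_size, exceeds_uint64, collides)
```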
