Basic String to Floating-point Conversion

Question

Compiler: GCC 4.7.2 (Debian 4.7.2-5)
Platform: Linux 3.2.0 x86 (Debian 7.1)

I am attempting to write my own character string to float conversion function. It is basically a cheap ripoff of strtof(), but I can't get it to mimic strtof() exactly. I do not expect my function to mimic strtof() exactly, but I want to know why it differs where it does. I have tested a few different strings and found that the following strings produce different values when they are given to my function and when they are given to strtof(), with both results printed using printf("%.38f").

1234.5678
44444.44444
333.333
777.777

Why does this happen? (Also feel free to point out any other mistakes, or inform me of any other strings that also have different values (there is no way I can find them all).)

#include <stdlib.h>
#include <stdio.h>
#include <float.h>
#include <math.h>

int dec_to_f(char *dec, float *f)
{
int i = 0;
float tmp_f = 0;

if(dec == NULL) return 1;

if(f == NULL) return 2;

if(dec[i] == '\000') return 3;

if(dec[i] == '-')
{
    i++;

    if(dec[i] == '\000') return 3;

    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            float dec_place = 10;
            int power_of_ten = 1;

            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 4;
                    else tmp_f -= (dec[i] - '0') / dec_place;
                }
                else return 5;
            }

            break;
        }

        if(dec[i] >= '0' && dec[i] <= '9')
        {
            tmp_f = tmp_f * 10 - (dec[i] - '0');
            if(!isfinite(tmp_f)) return 6;
        }
        else return 5;
    }
}
else
{
    if(dec[i] == '+')
    {
        if(dec[i+1] == '\000') return 3;
        else i++;
    }

    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            float dec_place = 10;
            int power_of_ten = 1;

            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 7;
                    else tmp_f += (dec[i] - '0') / dec_place;
                }
                else return 5;
            }

            break;
        }

        if(dec[i] >= '0' && dec[i] <= '9')
        {   
            tmp_f = tmp_f * 10 + (dec[i] - '0');
            if(!isfinite(tmp_f)) return 8;
        }
        else return 5;
    }
}

*f = tmp_f;
return 0;
}

int main()
{
printf("FLT_MIN = %.38f\n", FLT_MIN);
printf("FLT_MAX = %f\n", FLT_MAX);
float f = 0;
int return_value = 0;
char str[256];

printf("INPUT = ");
scanf("%s", str);

return_value = dec_to_f(str, &f);

printf("return_value = %i\nstr = \"%s\"\nf = %.38f\nstrtof = %.38f\n", return_value, str, f, strtof(str, NULL));
}

Could you be more specific? What do you expect to get, and what do you get instead?
– Barmar, Commented Oct 11, 2013 at 20:25
Very small errors will be due to rounding, which is unavoidable when dealing with floating point. Not an error at all. If that's not the problem you're going to have to speak up.
– Mark Ransom, Commented Oct 11, 2013 at 20:30
"%.38f" suggests that you are expecting a lot more precision than a 32-bit (or 64-bit) float can give you.
– ryyker, Commented Oct 11, 2013 at 20:31
@MarkRansom: Errors due to exact results not being representable are unavoidable, but errors due to incorrect conversions are avoidable. Two properly implemented decimal-binary conversions should produce the same results.
– Eric Postpischil, Commented Oct 11, 2013 at 20:46
An excellent resource to read if you are interested in doing decimal-to-binary conversion properly is this post: exploringbinary.com/… . I do not know where you got your reference function from but it is terribly imprecise for large and small numbers.
– Pascal Cuoq, Commented Oct 11, 2013 at 21:05

Eric Postpischil · Accepted Answer · 2013-10-11 21:46:17Z

Converting decimal to binary or vice-versa with correct rounding is complicated, requires detailed knowledge of floating-point arithmetic, and requires care.

There are a number of reasons why conversion is hard. Two of them are:

When calculations are performed with floating-point, those calculations often experience rounding errors. If the computations are not carefully designed, those rounding errors will affect the final results.
Some inputs will be very close to a rounding point, a point where rounding changes because the two nearest representable values are almost equally distant. As an example, consider 1.30000001192092895507812x. If that x is 4, the result should be 1.2999999523162841796875. If it is 6, the result should be 1.30000007152557373046875. Yet the digit x is well beyond the number of decimal digits that 32-bit binary floating-point can distinguish. It is even beyond the number of digits that 64-bit can distinguish. So you cannot use ordinary arithmetic to perform these conversions. You need some form of extended-precision arithmetic.

(In fact, consider 1.30000001192092895507812500000000…x. If x is a non-zero digit after any number of zeros in that numeral, then the conversion should round upward. If there is no non-zero digit, then the conversion should round downward. This means there is no limit to how many digits you must examine in order to determine the correctly rounded result. Fortunately, there are limits to the amount of arithmetic you must do, aside from scanning digits, as shown in the paper.)
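The rounding-point example can be checked against the C library itself. The following is a minimal sketch, not part of the original answer: the helper name `only_float_distinguishes` is hypothetical, and it assumes the platform's strtof and strtod are correctly rounded (glibc's are intended to be).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 when strtof maps a and b to different
   floats even though strtod maps them to the same double, 0 otherwise. */
int only_float_distinguishes(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strtof(a, NULL) != strtof(b, NULL)
        && strtod(a, NULL) == strtod(b, NULL);
}
```

Calling it with "1.300000011920928955078124" and "1.300000011920928955078126" (the numeral above with x = 4 and x = 6) is expected to return 1 under those assumptions: the two strings differ only in the 25th significant digit, which is too fine for double to register, yet that digit decides which of the two neighbouring floats is correct.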

A link to this paper was already posted as a comment on the question (where it's suitable as a reference). This answer won't be any good if that link goes down, and doesn't contain the information that the asker would need.
@JoshuaTaylor: The contents of the paper are necessary; designing correctly rounded conversions is not easy. How do you propose to put the contents in an answer? Perhaps the question ought to be closed as too broad.
Well, the conditions for too broad do include that "there are either too many possible answers, or good answers would be too long for this format." If this really can't be answered without an entire paper, then it's probably too broad. However, I expect that the differences in the behavior of strtof and printf( ... ) can be explained with less than a full paper (though references to other sources could very well still be appropriate).
@JoshuaTaylor: I have added two reasons simple conversion code does not work.
@EricPostpischil That is a nice update, thanks for the additional input. They explain why this is a difficult task, but they don't explain why the OP's function, strtof, and printf produce different results. Even if it's a hard task, strtof and printf are doing it, and OP is attempting it. If OP did exactly what strtof does, then OP would get the same answer.

Rick Regan · Accepted Answer · 2013-10-12 15:31:52Z

The short answer is: You can't use floats or doubles to convert to floats or doubles. You need arithmetic of higher precision, either "big floats" or "big integers".

The longer answer is in David Gay's paper (cited in other answers) and David Gay's implementation of that paper.

The even longer answer is on my Web site, where I explain David Gay's code in a series of detailed articles.

If you don't care about how to get conversions right, and just want to understand why yours went wrong, read my article Quick and Dirty Decimal to Floating-Point Conversion. It shows a small program like yours, which seems like it should work, but doesn't. Then see my article Decimal to Floating-Point Needs Arbitrary Precision to understand why.
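A minimal sketch of why the quick approach drifts, and of one restricted case where plain float arithmetic is still enough. It is not taken from the articles mentioned above, and both function names are hypothetical. `naive_to_float` mirrors the question's digit-by-digit loop for unsigned input, so it may round at every division and every addition. `exact_small_to_float` rounds only once: assuming IEEE 754 single precision, integers up to 2^24 and powers of ten up to 10^10 are exactly representable in float, so one division of the two is correctly rounded. Anything outside that range is rejected rather than guessed, and that is where higher-precision arithmetic is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>   /* strtof, for comparing results */

/* Digit-by-digit accumulation in float, as in the question.
   Unsigned input only; stops at the first unexpected character. */
float naive_to_float(const char *s)
{
    float value = 0.0f;
    float place = 10.0f;

    assert(s != NULL);
    for (; *s >= '0' && *s <= '9'; s++)
        value = value * 10.0f + (float)(*s - '0');
    if (*s == '.')
        for (s++; *s >= '0' && *s <= '9'; s++, place *= 10.0f)
            value += (float)(*s - '0') / place;   /* rounds on every step */
    return value;
}

/* Single-rounding shortcut: gather all digits in an integer, count the
   fractional digits, divide once. Returns 0 on success, -1 when the
   input is malformed or outside the range where the shortcut is exact. */
int exact_small_to_float(const char *s, float *out)
{
    /* Every entry is exactly representable in IEEE 754 single precision. */
    static const float ten_to_the[] = {
        1e0f, 1e1f, 1e2f, 1e3f, 1e4f, 1e5f, 1e6f, 1e7f, 1e8f, 1e9f, 1e10f
    };
    uint32_t digits = 0;
    int frac = 0, seen_point = 0, seen_digit = 0;

    assert(s != NULL && out != NULL);
    for (; *s != '\0'; s++) {
        if (*s == '.' && !seen_point) { seen_point = 1; continue; }
        if (*s < '0' || *s > '9') return -1;
        digits = digits * 10u + (uint32_t)(*s - '0');
        if (digits > (1u << 24)) return -1;   /* no longer exact as a float */
        frac += seen_point;
        seen_digit = 1;
    }
    if (!seen_digit || frac > 10) return -1;

    *out = (float)digits / ten_to_the[frac];  /* the only rounding step */
    return 0;
}
```

For "333.333", "777.777" and "1234.5678" the shortcut applies and is expected to agree with a correctly rounded strtof, while `naive_to_float("333.333")` is expected to land one unit in the last place below it. "44444.44444" has too many digits for the shortcut and is rejected.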

Nahuel Fouilleul · Accepted Answer · 2013-10-11 21:00:12Z

After looking at the source of strtof/strtod: they compute in double and then cast the result to float.

Replacing float by double gives the same result as strtof:

#include <stdlib.h>
#include <stdio.h>
#include <float.h>
#include <math.h>
int dec_to_f(char *dec, float *f)
{
int i = 0;
double tmp_f = 0;
if(dec == NULL) return 1;
if(f == NULL) return 2;
if(dec[i] == '\000') return 3;
if(dec[i] == '-')
{
    i++;
    if(dec[i] == '\000') return 3;
    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            double dec_place = 10;
            int power_of_ten = 1;
            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 4;
                    else tmp_f -= (dec[i] - '0') / dec_place;
                }
                else return 5;
            }
            break;
        }
        if(dec[i] >= '0' && dec[i] <= '9')
        {
            tmp_f = tmp_f * 10 - (dec[i] - '0');
            if(!isfinite(tmp_f)) return 6;
        }
        else return 5;
    }
}
else
{
    if(dec[i] == '+')
    {
        if(dec[i+1] == '\000') return 3;
        else i++;
    }
    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            double dec_place = 10;
            int power_of_ten = 1;
            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 7;
                    else tmp_f += (dec[i] - '0') / dec_place;
                }
                else return 5;
            }
            break;
        }
        if(dec[i] >= '0' && dec[i] <= '9')
        {   
            tmp_f = tmp_f * 10 + (dec[i] - '0');
            if(!isfinite(tmp_f)) return 8;
        }
        else return 5;
    }
}
*f = (float)tmp_f;
return 0;
}
int main()
{
printf("FLT_MIN = %.38f\n", FLT_MIN);
printf("FLT_MAX = %f\n", FLT_MAX);
float f = 0;
int return_value = 0;
char str[256];
printf("INPUT = ");
scanf("%s", str);
return_value = dec_to_f(str, &f);
printf("return_value = %i\nstr = \"%s\"\nf = %.38f\nstrtof = %.38f\n", return_value, str, f, strtof(str, NULL));
}

Does this code work in all cases, including cases where the input numeral is very close to a point where rounding changes because the two nearest representable values are almost equally close?
All tests give the same result, except for -0 and for strings containing non-numeric characters, e.g. 1e-5.
Given “1.30000001192092896”, this code shows different results for f and strtof.
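The mismatch reported in the last comment is a double-rounding effect, and it can be reproduced without the hand-written parser. A minimal sketch, assuming the library's strtod and strtof are both correctly rounded; `via_double` is a hypothetical name:

```c
#include <assert.h>
#include <stdlib.h>

/* Converts through double and then narrows to float, so the value is
   rounded twice: once to double, once more to float. */
float via_double(const char *s)
{
    assert(s != NULL);
    return (float)strtod(s, NULL);
}
```

"1.30000001192092896" lies just above the midpoint between two adjacent floats. Rounding to double first lands exactly on that midpoint, and the second rounding then breaks the tie toward the lower float, whereas a direct conversion to float goes to the upper one. Under the stated assumptions, `via_double("1.30000001192092896")` and `strtof("1.30000001192092896", NULL)` are therefore expected to differ by one unit in the last place.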

chux · Accepted Answer · 2013-10-12 14:35:06Z

@Eric Postpischil and @Nahuel Fouilleul have provided good info. I'll add some more thoughts that don't fit well as a comment.

1) Text to FP needs to be evaluated in the other direction. Rather than going from the most significant digit to the least, form the result from the least significant digit to the most. Ignore leading zeros. This will best maintain the subtle effects of your least significant text digits. As you go right to left, maintain a power_of_10 to multiply by at the end.

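float power_of_10 = 1.0f;
...
/* dec[0..len-1] holds digits only (len is assumed known); walk from last digit to first */
for (i = len - 1; i >= 0; i--)
{
  // tmp_f = tmp_f * 10 + (dec[i] - '0');   (left-to-right form)
  tmp_f = tmp_f/10 + (dec[i] - '0');
  if (i > 0) power_of_10 *= 10.0f;   /* the leading digit needs no extra factor */
}
...
tmp_f *= power_of_10;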

2) Upon reaching the decimal point '.' (going right to left), reset your power_of_10 to 1.0.

3) Fold your - and + code into one.

4) Use "%.9e" to compare results.

5) Use nextafterf(x,0.99*x) and nextafterf(x,1.01*x) to bracket acceptable results.

6) Typical float has about 1 part in 2^23 precision (~7 decimal digits). As OP is closing in on that, the overall conversion is OK; it just needs the parsing order reversed.
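For points 4 and 5, one way to quantify "how far off" a result is, sketched here as an alternative to bracketing with nextafterf: count the representable floats between two values by comparing their bit patterns. This swaps in a different technique from the one named above, the names `ordered_bits` and `ulp_distance` are hypothetical, and the code assumes float is IEEE 754 binary32 and that neither argument is NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Maps a float's bit pattern to an integer that grows monotonically
   with the float's value; +0.0 and -0.0 both map to 0. */
static int32_t ordered_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return (u & 0x80000000u) ? -(int32_t)(u & 0x7fffffffu) : (int32_t)u;
}

/* Number of representable floats separating a and b (0 if equal). */
int64_t ulp_distance(float a, float b)
{
    int64_t d;

    assert(a == a && b == b);   /* NaN is not supported */
    d = (int64_t)ordered_bits(a) - (int64_t)ordered_bits(b);
    return d < 0 ? -d : d;
}
```

A conversion routine can then be compared with strtof by checking that `ulp_distance(mine, strtof(str, NULL))` is 0 for a match, or at most 1 if off-by-one results are tolerated.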
