25

I'm writing a program where I need to parse some configuration files in addition to user input from a graphical user interface. In particular, I'm having issues with parsing strings taken from the configuration file into floats as the function I've been using for this purpose so far, strtof(), respects the current locale which means a string that represents a floating point number may parse into 0.10000000149011612 in one locale and 0 in another—not good. This is because some locales use the full stop (.) for denoting the decimal separator whereas others use a comma (,), but the strings from the configuration file always use a full stop.

These configuration files are distributed to users in identical format regardless of their locale, and it is not feasible to distribute different versions dependent on the locale they have set—especially as they are a global immutable resource part of the operating system base and a system may have multiple users that aren't necessarily using the same locale.

I can't just set the locale to something predictable at program startup because removing support for i18n is a non-starter. I also want to preserve locale-specific parsing for user input as referenced earlier. I also don't think I safely can call setlocale(LC_ALL, "C") when I start parsing and then finish with setlocale(LC_ALL, "whatever it was before") as this is a multi-threaded program and I can't guarantee that other threads aren't doing locale-dependent work while configuration file parsing is happening.

So, how can I parse strings into floats in a locale-independent fashion in C, preferably without relying on functionality outside of the standard library? The program I'm writing only targets Linux (although it may also be possible to run it on BSDs, but they are not a priority), so Linux-specific answers are just fine.

17
  • 1
    It looks like Linux/glibc may support strtod_l. Commented Aug 24 at 10:06
  • 1
    @Newbyte, Aside from '.' versus ',' (which could be handled with a string substitution), what other numeric locale concerns do you have? Commented Aug 24 at 10:07
  • 1
    @robertklep Never mind, I had to do #define _GNU_SOURCE and then it became available! Commented Aug 24 at 10:22
  • 6
    @WeatherVane "Why don't you make your own?" --> string to FP is quite non-trivial to do well and get consistent results. Sure, it is easy to code one up, yet the edge cases are challenging. Typically a good implementation involves controlling the rounding mode (a problem like controlling the locale), wider precision and a wider exponent range. Commented Aug 24 at 10:30
  • 1
    You could get a little more fine-grained by calling setlocale(LC_NUMERIC, "C"). That would preserve i18n in other aspects, although it would indeed affect numeric handling everywhere, not just in your config file parser. Commented Aug 24 at 11:05

13 Answers

26

It is indeed unfortunate that the C Standard does not provide functions to handle these conversions for a specified locale.

There is no simple portable solution to this problem using standard functions. Converting the strings from the config file to the locale specific alternative is feasible but tricky.

There is a simple workaround for the config file: use exponent notation without a decimal separator. 123e-3 is a portable, locale-neutral way of writing 0.123 (or 0,123).

POSIX has alternate functions for most standard functions with locale specific behavior, but unfortunately not for strtod() and friends.

Yet both the GNU libc on Linux (and alternative libraries such as musl) and the BSD systems support extended POSIX locale functions:

#define _GNU_SOURCE   // for linux
#include <stdlib.h>
#ifdef __APPLE__
#include <xlocale.h>  // on macOS
#endif

double strtod_l(const char * restrict nptr, char ** restrict endptr,
                locale_t loc);

float strtof_l(const char * restrict nptr, char ** restrict endptr,
               locale_t loc);

long double strtold_l(const char * restrict nptr, char ** restrict endptr,
                      locale_t loc);

On macOS, it seems you can pass 0 for the loc argument and get the C locale. On Linux, the header file declares loc as non-null, so you need to create a C locale with newlocale().

Here is an example:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

locale_t c_locale;

int main(void) {
    const char locale_name[] = "fr_FR.UTF-8";
    const char locale_string[] = "0,123";
    const char standard_string[] = "0.123";

    c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);

    setlocale(LC_ALL, locale_name);

    double x1, x2, y1, y2;
    x1 = strtod(locale_string, NULL);
    x2 = strtod_l(standard_string, NULL, c_locale);
    int s1 = sscanf(locale_string, "%lf", &y1);
    /* sscanf_l() and printf_l() are BSD/macOS extensions; glibc does not provide them */
    int s2 = sscanf_l(standard_string, c_locale, "%lf", &y2);

    printf("default locale: %s\n\n", locale_name);
    printf("using printf(...):\n");
    printf("  strtod(\"%s\", NULL) -> %f\n", locale_string, x1);
    printf("  strtod_l(\"%s\", NULL, c_locale) -> %f\n", standard_string, x2);
    printf("  sscanf(\"%s\", &y1) -> %d,  y1=%f\n", locale_string, s1, y1);
    printf("  sscanf_l(\"%s\", c_locale, &y2) -> %d, y2=%f\n", standard_string, s2, y2);

    printf("\nusing printf_l(c_locale, ...):\n");
    printf_l(c_locale, "  strtod(\"%s\", NULL) -> %f\n", locale_string, x1);
    printf_l(c_locale, "  strtod_l(\"%s\", NULL, c_locale) -> %f\n", standard_string, x2);
    printf_l(c_locale, "  sscanf(\"%s\", &y1) -> %d,  y1=%f\n", locale_string, s1, y1);
    printf_l(c_locale, "  sscanf_l(\"%s\", c_locale, &y2) -> %d, y2=%f\n", standard_string, s2, y2);

    return 0;
}

Output:

default locale: fr_FR.UTF-8

using printf(...):
  strtod("0,123", NULL) -> 0,123000
  strtod_l("0.123", NULL, c_locale) -> 0,123000
  sscanf("0,123", &y1) -> 1,  y1=0,123000
  sscanf_l("0.123", c_locale, &y2) -> 1, y2=0,123000

using printf_l(c_locale, ...):
  strtod("0,123", NULL) -> 0.123000
  strtod_l("0.123", NULL, c_locale) -> 0.123000
  sscanf("0,123", &y1) -> 1,  y1=0.123000
  sscanf_l("0.123", c_locale, &y2) -> 1, y2=0.123000
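For the configuration-file case in the question, the strtof_l() call could be wrapped with some error checking. This is only a sketch for Linux (glibc or musl); the names config_locale, config_locale_init and config_parse_float are made up for illustration and do not belong to any library.

```c
#define _GNU_SOURCE          /* exposes strtof_l() on glibc and musl */
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/* Hypothetical names, used only in this sketch. */
static locale_t config_locale = (locale_t)0;

/* Call once at startup, before any thread parses configuration data. */
bool config_locale_init(void)
{
    config_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    return config_locale != (locale_t)0;
}

/* Parse the whole string as a float with '.' as the decimal separator,
   regardless of the global or per-thread locale. Returns false on empty
   input, trailing characters, or a range error. */
bool config_parse_float(const char *text, float *out)
{
    assert(config_locale != (locale_t)0);   /* config_locale_init() must have run */

    char *end;
    errno = 0;
    float value = strtof_l(text, &end, config_locale);
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;
    *out = value;
    return true;
}
```

Because the locale object is only read after initialization, the parser never touches the global locale, so other threads can keep doing locale-dependent work.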

If strtod_l is not available, the string must be copied and its decimal separator substituted. That approach comes with caveats:

  • the source string must be copied to a temporary buffer of sufficient length: at least 300 bytes, and possibly more.
  • the source array can contain arbitrary spacing before the number and arbitrary text after it, and might not be null terminated.
  • endptr must be computed so that it points into the source string; if the prefix was not copied, an adjustment is necessary.
  • swapping . for , is incorrect: making assumptions about the current decimal separator is risky. The appropriate one for the currently selected locale must be retrieved via localeconv(): it is the string pointed to by the decimal_point member of the struct lconv. If this string has more than one character, updating the end pointer is tricky.
  • if the current locale is changed concurrently in another thread, the behavior is undefined.
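The substitution approach described in these caveats could look roughly like the sketch below. The function name strtod_dot and the 512-byte buffer are assumptions made for illustration. To keep it short, it accepts only plain decimal numbers (digits, signs, '.', and e/E exponents), so hex floats, "inf" and "nan" are not handled, and it requires a null-terminated source string.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a decimal number written with '.' under whatever
   locale is current. Numbers longer than buf are truncated.
   Not safe if another thread changes the locale during the call. */
double strtod_dot(const char *src, char **endptr)
{
    const char *dp = localeconv()->decimal_point;
    size_t dplen = strlen(dp);
    char buf[512];
    size_t in = 0, out = 0;
    size_t dot_in = (size_t)-1;          /* index of the replaced '.' */

    /* Leading white space is copied unchanged, so indexes still match. */
    while (isspace((unsigned char)src[in]) && out + 1 < sizeof buf)
        buf[out++] = src[in++];

    /* Copy characters that can belong to the number; swap the first '.'. */
    while (src[in] != '\0' && strchr("+-0123456789.eE", src[in]) != NULL
           && out + dplen + 1 < sizeof buf) {
        if (src[in] == '.' && dot_in == (size_t)-1) {
            dot_in = in;
            memcpy(buf + out, dp, dplen);
            out += dplen;
        } else {
            buf[out++] = src[in];
        }
        in++;
    }
    buf[out] = '\0';

    char *end;
    double value = strtod(buf, &end);
    if (endptr != NULL) {
        size_t consumed = (size_t)(end - buf);
        if (dot_in != (size_t)-1) {
            if (consumed >= dot_in + dplen)
                consumed = consumed - dplen + 1;   /* separator maps back to one '.' */
            else if (consumed > dot_in)
                consumed = dot_in;                 /* stopped inside the separator */
        }
        *endptr = (char *)src + consumed;
    }
    return value;
}
```

Stopping the copy at the first character outside the accepted set also means a ',' in the source is never handed to strtod(), so "1,5" is not silently accepted in a comma locale.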

A simpler and safer solution is to read the settings at the beginning of the process, before changing the locale. The process starts in the "C" locale. The problem remains if the settings must be updated as snprintf() will use the current locale too.


15 Comments

On Linux, xlocale.h doesn't exist and #define _GNU_SOURCE must be specified for these functions to be available. This seems to be the case on both glibc and musl. Otherwise this is a great answer!
@Newbyte: good point, answer updated.
@DavidRanieri: yes, js_dtoa in quickjs.c but we have custom code in libbf.c and other versions too for improved performance. converting floats to strings and vice-versa is, as chux commented, non-trivial.
@MichaelSerretta Localization (like floating-point conversion) is a big, complicated, difficult topic, with more pitfalls and lurking special cases than the average programmer knows about. So saying "I know, I can just swap '.' and ',' and I'll be all set" is kind of, well, a recipe for disaster. How do you know there isn't a locale that uses different digit characters, too, instead of ASCII 0-9? (It's true, strtod probably doesn't use non-ASCII digits, ever, but are you sure it doesn't, and won't ever, on any platform, or under any future revision of the Standard?)
@chqrlie "UTF-8 is a perfect fit for C string functions" Which is no surprise when you look at who invented it, after they noticed what a hash the Unicode committee was making out of things. :-)
16

POSIX-specific

The uselocale function only changes the calling thread's locale. Therefore you can use it temporarily in a wrapper around your formatting functions. However, it is somewhat tedious to use as there is no predefined locale_t value representing the C locale. You have to allocate a new one.

#include <locale.h>
#include <stdio.h>

locale_t C_LOCALE;

int main(void)
{
  /*
   * Global initialization
   */
  C_LOCALE = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
  setlocale(LC_ALL, "");
  printf("With locale: %g\n", 3.14);
  {
    /*
     * Per thread
     */
    locale_t saved_locale;
    saved_locale = uselocale(C_LOCALE);
    printf("Without locale: %g\n", 3.14);
    uselocale(saved_locale);
  }
  /*
   * Could be a global destructor
   */
  freelocale(C_LOCALE);
}
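Applied to the parsing problem in the question, the same save/switch/restore pattern could be wrapped around strtof(). This is a sketch assuming a POSIX.1-2008 system; strtof_in_c_locale is a hypothetical name, and the caller is expected to pass a locale object created once with newlocale(LC_ALL_MASK, "C", (locale_t) 0).

```c
#define _POSIX_C_SOURCE 200809L   /* newlocale(), uselocale(), freelocale() */
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: behaves like strtof(), but with the "C" locale active
   for the calling thread only while the conversion runs. */
float strtof_in_c_locale(const char *text, char **endptr, locale_t c_locale)
{
    assert(c_locale != (locale_t)0);

    locale_t saved = uselocale(c_locale);   /* returns the previous thread locale */
    float value = strtof(text, endptr);
    uselocale(saved);                       /* restore it (may be LC_GLOBAL_LOCALE) */
    return value;
}
```

Other threads are unaffected, because uselocale() only changes the locale of the thread that calls it.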

3 Comments

I wonder if C_LOCALE = duplocale(LC_GLOBAL_LOCALE); would work? This assumes the global locale is valid (and equal to the C locale) before the first call to setlocale(), even though the POSIX spec says "If the locobj argument is LC_GLOBAL_LOCALE, duplocale() shall create a new locale object containing a copy of the global locale determined by the setlocale() function."
I guess it should work but I don't see what it would give you unless there is reason to believe that newlocale(…, "C", …) is more expensive. The potential downside I see is that there is a risk that you get the order wrong; maybe a global constructor called setlocale before you could access it. However, duplicating followed by newlocale(…, base) (which, confusingly, modifies the base in-place) would be good to override only parts of the locale, e.g. the numeric mask while keeping other parts set to the current locale
This comes really close to my imagined answer of "just set it before the conversion and then revert to old locale after conversion", but since it's thread-specific, even better. Depending on performance requirements and/or overhead I'd suggest simply having a thread with the locale set that's only there for these conversions. Yes, it'd require some wrapper setup to have an easy-to-use function, so the set-convert-unset may be more practical
9

If you're using GLib as I am, another option is g_ascii_strtod (). It essentially acts like strtod_l () as mentioned in other answers, except that it always behaves as if it is using the C locale so you don't need to manually set up and manage the memory of a locale object.

Comments

7

Don't use your current locale anywhere

If you're worried about changing behaviour with setlocale(), there's a simple solution. Don't let C use locales at all. Change locale to "C" at the very start of your app, so that the whole app just runs that way.

"But wait!" I hear you say. "What about the other parts of the program which do need to know what the locale's decimal place is, or standard date or time format, or whatever?"

It's a pretty simple answer. You read those from your locale settings before you change to using C locale, and then you write your own code to do what you need. There's good odds that all you need to do is swap decimal points for a localised decimal symbol, and that's trivially easy to post-process in a text string. I've personally bashed my head against this problem in an internationalised DLL, and tried every other "correct" solution, and none of them work. Or more precisely, they may work for Linux, or for Windows, or for one compiler's interpretation of the standard, but across the board? Hard nope. The only thing which worked was getting C locales out of the way and doing it myself.

Before anyone suggests C++ streaming with locales - no, that's also broken. And it's slower than a slug with heavy shopping.

It sucks that the only way to do such a basic thing is to boot out the standard library, for sure. But we are where we are.
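As a rough illustration of the post-processing step, assuming the program runs in the "C" locale and has captured the user's separator at startup (for example by copying localeconv()->decimal_point right after a temporary setlocale(LC_ALL, "")), a formatting helper might look like this. The name format_with_separator is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format value with "%g" (which prints '.' because the
   program is assumed to run in the "C" locale) and replace the '.' with sep.
   Returns the length written, or -1 if a buffer is too small. */
int format_with_separator(char *dst, size_t cap, double value, const char *sep)
{
    char tmp[64];
    int n = snprintf(tmp, sizeof tmp, "%g", value);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    const char *dot = strchr(tmp, '.');
    size_t head = dot ? (size_t)(dot - tmp) : (size_t)n;   /* text before '.' */
    size_t seplen = dot ? strlen(sep) : 0;
    size_t tail = dot ? (size_t)n - head - 1 : 0;           /* text after '.' */
    if (head + seplen + tail + 1 > cap)
        return -1;

    memcpy(dst, tmp, head);
    memcpy(dst + head, sep, seplen);
    memcpy(dst + head + seplen, dot ? dot + 1 : "", tail);
    dst[head + seplen + tail] = '\0';
    return (int)(head + seplen + tail);
}
```

This covers only the decimal separator; grouping, dates and other locale-dependent formatting would each need their own handling under this approach.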

5 Comments

A bit drastic @graham!! Don't you think so?
Normally I would have said so. I hate the idea of having to reinvent the wheel. But I've got a chunk of code cross-compiling on Linux and MSVC where I spent a fortnight trying every alternative. The only thing I haven't done is change to using Boost, because I don't need all the extra libraries. I can't for the life of me understand why the standards people think this is good enough, but it clearly isn't.
Well, you indicate you hate many more things than the question asks about, not only using locales. Your answer makes people think that you actually don't know how to internationalize software, rather than that you are arguing against the abuse or misuse of something. I don't like using a big library just because I need a single routine either, but localization is required in many scenarios that must deal with users, and the problem stands as valid on its own. You don't know if this is the case.
... And you talk about broken things that I use regularly without any problem... (this is important)
I'm honestly not sure why a language needs a dynamic auto-loaded default locale. It's the same problem in csharp, I can't count how many libraries I've seen generate broken json because my computer thinks it's French. The only time you may want to localize your floats or string comparisons is in UI code. Most code is not UI. Most C code is definitely not UI.
6

If one cannot call setlocale() and must rely on the standard library, one must either pre-process the string (change the configuration file's decimal point to the locale's one), write a custom my_strtof() (not easy, and error prone), or limit the acceptable strings.

Concerning consistency, strtof() implementations do not always convert a given string to the same FP value, because the upper bound on the number of significant decimal digits that must be taken into account is not defined. Typically it is XXX_DECIMAL_DIG + 3 or more. This makes a difference when the textual value is near the midpoint of two adjacent FP values.

Since the goal is "I'm having issues with parsing strings taken from the configuration file into floats", to solve both of the above issues, consider using %a to read/write the values, or avoid writing a decimal/radix point at all.
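One way to combine the two suggestions is sketched below. Note that plain %a output still contains a radix character taken from the locale, so this sketch writes the significand as a hexadecimal integer with a binary exponent instead. The name format_float_no_radix is hypothetical, and the code assumes finite values (no inf or NaN handling).

```c
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write a finite float as "<sign>0x<integer>p<exp>",
   e.g. 1.5f becomes "0xC00000p-23" when FLT_MANT_DIG is 24. The text has no
   radix character and converts back exactly with strtof().
   Returns snprintf()'s result. */
int format_float_no_radix(char *dst, size_t cap, float value)
{
    float magnitude = value < 0 ? -value : value;
    int exp2;
    float frac = frexpf(magnitude, &exp2);           /* magnitude = frac * 2^exp2 */
    unsigned long sig = (unsigned long)ldexpf(frac, FLT_MANT_DIG);
    return snprintf(dst, cap, "%s0x%lXp%d",
                    value < 0 ? "-" : "", sig, exp2 - FLT_MANT_DIG);
}
```

Reading the value back needs nothing special: strtof() accepts hexadecimal floating-point input in every locale, and since no radix character appears in the text, the locale's separator never comes into play.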

6 Comments

Config files may have been edited by a human, so suggestions to constrain the program's output do not help.
Perhaps, yet a human user will see sample data in the config file and edit likewise, or risk difficulties. Further, the config file may include comments describing limitations and/or guidance.
"form your own my_strtof() (not easy and error prone)" This is actually much easier than it sounds: you can copy-paste the code from the published GLIBC or UCRT code (which are licensed under the GNU and MIT licenses, respectively), replacing the locale-derived separator with a hard-coded value.
True about copying good available sources, yet the "locale-derived separator" is one issue. Leading acceptable white-space is another, yet of lesser concern. Other locale issues may exist. Maintenance of one's copy is also another.
@chux And then the user will go and copy-paste a config snippet from the docs or the web to fix an issue, and before they know they have two issues. Or they will change their locale settings after the original creation of said config file and things will break. Or they will run off a NFS-mounted home directory on several machines with different locale settings. And so on; that way lies madness.
2

As far as it goes for GNU/POSIX, I very much like chqrlie’s answer (upvoted), but if you are looking for a relatively system-agnostic way to do it... use a subprocess.

GTK provides a number of ways to spawn, communicate with, and reap a subprocess, through the aptly-named g_spawn_* functions.

When it is time to initialize your configuration data, lock it, pre-initialize it, then spawn the subprocess that processes the configuration file with the correct locale (presumably "" on a first pass, then "C" on a possible second) and reports the data back to the parent via whatever IPC method you choose (a pipe is a good choice here). Once the child terminates/errors/times-out, mark the configuration data as initialized, unlock it, and continue processing as usual in your GTK app.

While it is at it, the subprocess can also normalize the configuration file(s), and also be tasked with writing configuration data to file at the appropriate time(s).


Alternately, you could set up your program to run from a script that just pipes the configuration reading app’s output into the GTK app. Starting from a script (or if on Windows, a .lnk) is a very normal way to do things for more complex program startups.


This is not necessarily a sub-optimal solution. Unix is built on the idea of multiple programs working in tandem to process data. And regardless, unless you have exceedingly strict constraints on your target systems, the overhead of spawning a process is negligible.

Finally, just to mention in passing, you could use the GTK configuration .xml stuff, but I presume you and your team cannot do that for business raisins.

1 Comment

If you are using GTK, presumably you can use the GLib function mentioned in Newbyte's answer.
1

I would normalize the string and use standard methods to convert it to float/double. It is not an easy task, but it is doable. In the code below, accounting-style parentheses, as in (xxx.xxx), denote a negative number.

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stddef.h>


typedef enum
{
    NF_DECIMAL_AUTO = 0,
    NF_DECIMAL_FORCE_DOT,
    NF_DECIMAL_FORCE_COMMA,
    NF_DECIMAL_RIGHTMOST
} nf_decimal_mode_t;

static int nf_is_space_like(unsigned char c)
{
    return (c == ' ' || c == '\t' || c == 0xA0);
}

static int nf_is_grouping(unsigned char c)
{
    return nf_is_space_like(c) || c == '\'' || c == '_';
}

/* Normalize "locale-ish" number to ASCII "[-]digits[.digits][e[+-]digits]".
   Returns bytes written (no NUL), or 0 on error. */
size_t normalize_float_string_mode(const char *src, char *dst, size_t dst_cap, nf_decimal_mode_t mode)
{
    size_t retLen = 0;
    int ok = 1;
    size_t out = 0;

    if(src == NULL || dst == NULL || dst_cap == 0)
    {
        ok = 0;
        goto done;
    }

#define NF_PUTC(ch)                                   \
    do                                                \
    {                                                 \
        if(out + 1 >= dst_cap)                        \
        {                                             \
            ok = 0;                                   \
            goto done;                                \
        }                                             \
        dst[out++] = (char)(ch);                      \
    } while(0)

    /* Find exponent start: first e/E that comes after at least one digit. */
    size_t i = 0, first_digit_pos = (size_t)-1, exp_pos = (size_t)-1;
    for(; src[i] != '\0'; ++i)
    {
        char c = src[i];
        if(first_digit_pos == (size_t)-1 && isdigit((unsigned char)c))
        {
            first_digit_pos = i;
        }
        if((c == 'e' || c == 'E') && first_digit_pos != (size_t)-1)
        {
            exp_pos = i;
            break;
        }
    }
    size_t body_end = (exp_pos != (size_t)-1) ? exp_pos : i;

    /* Positions of last '.' and ',' before exponent. */
    size_t last_dot = (size_t)-1, last_comma = (size_t)-1;
    for(size_t j = 0; j < body_end; ++j)
    {
        if(src[j] == '.')
        {
            last_dot = j;
        }
        else if(src[j] == ',')
        {
            last_comma = j;
        }
    }

    /* Decide which source char acts as decimal (0 => none). */
    char decimal_src = 0;
    if(mode == NF_DECIMAL_FORCE_DOT)
    {
        decimal_src = (last_dot != (size_t)-1) ? '.' : 0;
    }
    else if(mode == NF_DECIMAL_FORCE_COMMA)
    {
        decimal_src = (last_comma != (size_t)-1) ? ',' : 0;
    }
    else if(mode == NF_DECIMAL_RIGHTMOST)
    {
        /* FIX: handle "only one present" correctly (no sentinel compare). */
        if(last_dot != (size_t)-1 && last_comma != (size_t)-1)
        {
            decimal_src = (last_dot > last_comma) ? '.' : ',';
        }
        else if(last_dot != (size_t)-1)
        {
            decimal_src = '.';
        }
        else if(last_comma != (size_t)-1)
        {
            decimal_src = ',';
        }
        else
        {
            decimal_src = 0;
        }
    }
    else /* NF_DECIMAL_AUTO */
    {
        if(last_dot != (size_t)-1 && last_comma != (size_t)-1)
        {
            decimal_src = (last_dot > last_comma) ? '.' : ',';
        }
        else if(last_dot != (size_t)-1 || last_comma != (size_t)-1)
        {
            size_t sep_pos = (last_dot != (size_t)-1) ? last_dot : last_comma;
            char sep_char = (last_dot != (size_t)-1) ? '.' : ',';

            size_t k = sep_pos + 1, digits_right = 0;
            while(k < body_end && isdigit((unsigned char)src[k]))
            {
                ++digits_right;
                ++k;
            }

            size_t digits_left = 0;
            k = sep_pos;
            while(k > 0)
            {
                --k;
                if(isdigit((unsigned char)src[k]))
                {
                    ++digits_left;
                }
                else if(src[k] != '.' && src[k] != ',' && !nf_is_grouping((unsigned char)src[k]))
                {
                    break;
                }
            }

            if(digits_right == 3 && digits_left > 0)
            {
                decimal_src = 0; /* looks like thousands grouping */
            }
            else
            {
                decimal_src = sep_char;
            }
        }
    }

    /* Skip leading spaces/tabs/NBSP */
    size_t pos = 0;
    while(nf_is_space_like((unsigned char)src[pos]))
    {
        ++pos;
    }

    /* Optional sign or accounting parentheses */
    {
        int negative = 0;
        if(src[pos] == '+')
        {
            ++pos;
        }
        else if(src[pos] == '-')
        {
            negative = 1;
            ++pos;
        }
        else if(src[pos] == '(')
        {
            negative = 1;
            ++pos;
            while(nf_is_space_like((unsigned char)src[pos]))
            {
                ++pos;
            }
        }

        if(negative)
        {
            NF_PUTC('-');
        }
    }

    /* Copy body; abort if alpha appears before number "starts". */
    {
        int wrote_digit = 0;
        int wrote_decimal = 0;
        int number_started = 0;

        for(size_t j = pos; j < body_end; ++j)
        {
            unsigned char c = (unsigned char)src[j];

            if(isalpha(c))
            {
                /* If we see letters before the number begins (e.g., "e10" / ".e10"), give up on body. */
                if(!number_started)
                {
                    break;
                }
                else
                {
                    /* Once number started, letters shouldn't appear in body; stop. */
                    break;
                }
            }

            if(isdigit(c))
            {
                NF_PUTC(c);
                wrote_digit = 1;
                number_started = 1;
                continue;
            }

            if(c == '.' || c == ',')
            {
                if(decimal_src && c == (unsigned char)decimal_src)
                {
                    if(!wrote_decimal)
                    {
                        NF_PUTC('.');
                        wrote_decimal = 1;
                        number_started = 1;
                    }
                }
                /* else: treat as grouping -> drop */
                continue;
            }

            if(nf_is_grouping(c) || c == ')')
            {
                continue; /* drop grouping and closing ')' */
            }

            /* Unknown non-alpha junk: ignore. */
        }

        /* Remove trailing '.' if no fraction followed */
        if(wrote_decimal && out > 0 && dst[out - 1] == '.')
        {
            --out;
            wrote_decimal = 0;
        }

        /* Exponent (normalize). */
        if(exp_pos != (size_t)-1)
        {
            size_t j = exp_pos;
            size_t out_before_exp = out; /* FIX: roll back whole exponent if bad */
            int exp_has_digit = 0;

            NF_PUTC('e');
            ++j;

            if(src[j] == '+' || src[j] == '-')
            {
                NF_PUTC(src[j]);
                ++j;
            }

            while(src[j] != '\0')
            {
                if(isdigit((unsigned char)src[j]))
                {
                    NF_PUTC(src[j]);
                    exp_has_digit = 1;
                    ++j;
                }
                else if(nf_is_space_like((unsigned char)src[j]))
                {
                    ++j; /* allow spaces inside exponent */
                }
                else
                {
                    break;
                }
            }

            if(!exp_has_digit)
            {
                out = out_before_exp; /* drop 'e' and optional sign */
            }
        }

        if(!wrote_digit)
        {
            NF_PUTC('0');
        }
    }

    if(out >= dst_cap)
    {
        ok = 0;
        goto done;
    }
    dst[out] = '\0';
    retLen = out;

done:
    if(!ok)
    {
        if(dst_cap > 0)
        {
            dst[0] = '\0';
        }
        retLen = 0;
    }
    return retLen;

#undef NF_PUTC
}

size_t normalize_float_string(const char *src, char *dst, size_t dst_cap)
{
    return normalize_float_string_mode(src, dst, dst_cap, NF_DECIMAL_AUTO);
}

typedef struct
{
    const char *input;
    nf_decimal_mode_t mode;
    const char *expected;
} TestCase;

static const char* mode_name(nf_decimal_mode_t m)
{
    switch(m)
    {
        case NF_DECIMAL_AUTO:        return "AUTO";
        case NF_DECIMAL_FORCE_DOT:   return "FORCE_DOT";
        case NF_DECIMAL_FORCE_COMMA: return "FORCE_COMMA";
        case NF_DECIMAL_RIGHTMOST:   return "RIGHTMOST";
        default:                     return "?";
    }
}

int main(void)
{
    const TestCase tests[] =
    {
        { "1,234.56",              NF_DECIMAL_AUTO,        "1234.56" },
        { "1.234,56",              NF_DECIMAL_AUTO,        "1234.56" },
        { "1 234,56",              NF_DECIMAL_AUTO,        "1234.56" },
        { "12'345",                NF_DECIMAL_AUTO,        "12345" },
        { "(1,234.56)",            NF_DECIMAL_AUTO,        "-1234.56" },

        { "1,234e+3",              NF_DECIMAL_FORCE_DOT,   "1234e+3" },
        { "1,234e+3",              NF_DECIMAL_FORCE_COMMA, "1.234e+3" },
        { "1,234e+3",              NF_DECIMAL_RIGHTMOST,   "1.234e+3" },
        { "1,234e+3",              NF_DECIMAL_AUTO,        "1234e+3" },

        { "1234",                  NF_DECIMAL_AUTO,        "1234" },
        { "  +1,234  ",            NF_DECIMAL_FORCE_DOT,   "1234" },
        { "  ( 1 234 )  ",         NF_DECIMAL_AUTO,        "-1234" },
        { "1.234",                 NF_DECIMAL_FORCE_DOT,   "1.234" },
        { "1.234",                 NF_DECIMAL_AUTO,        "1234" },

        { "1,234",                 NF_DECIMAL_AUTO,        "1234" },
        { "1.234",                 NF_DECIMAL_RIGHTMOST,   "1.234" },
        { "1,234,567.89",          NF_DECIMAL_AUTO,        "1234567.89" },
        { "1.234.567,89",          NF_DECIMAL_AUTO,        "1234567.89" },
        { "1_234_567,89",          NF_DECIMAL_FORCE_COMMA, "1234567.89" },

        { "1\t234,56",             NF_DECIMAL_FORCE_COMMA, "1234.56" },
        { "1\x00A0 234,56",           NF_DECIMAL_FORCE_COMMA, "1234.56" },
        { "00123,4500",            NF_DECIMAL_FORCE_COMMA, "00123.4500" },
        { ".5",                    NF_DECIMAL_FORCE_DOT,   ".5" },
        { ",5",                    NF_DECIMAL_FORCE_COMMA, ".5" },

        { "5,",                    NF_DECIMAL_FORCE_COMMA, "5" },
        { "5.",                    NF_DECIMAL_FORCE_DOT,   "5" },
        { "1,234E-02",             NF_DECIMAL_FORCE_DOT,   "1234e-02" },
        { "1.234E+02",             NF_DECIMAL_FORCE_DOT,   "1.234e+02" },
        { "1.234E",                NF_DECIMAL_FORCE_DOT,   "1.234" },

        { "1,234E+",               NF_DECIMAL_FORCE_DOT,   "1234" },      /* fixed: no trailing 'e' */
        { "(1,23)",                NF_DECIMAL_FORCE_COMMA, "-1.23" },
        { " +\t1'234'567,0 ",      NF_DECIMAL_FORCE_COMMA, "1234567.0" },
        { "1,2,3,4",               NF_DECIMAL_FORCE_COMMA, "1.234" },
        { "1.2.3.4",               NF_DECIMAL_FORCE_DOT,   "1.234" },

        { "1,23,456",              NF_DECIMAL_AUTO,        "123456" },
        { "000",                   NF_DECIMAL_AUTO,        "000" },
        { "",                      NF_DECIMAL_AUTO,        "0" },
        { "   ",                   NF_DECIMAL_AUTO,        "0" },
        { "e10",                   NF_DECIMAL_AUTO,        "0" },         /* fixed: abort on alpha before number */
        { ".e10",                  NF_DECIMAL_FORCE_DOT,   "0" },         /* fixed: same */

        { "1,234e- 3",             NF_DECIMAL_FORCE_DOT,   "1234e-3" },
        { "1,234e -3",             NF_DECIMAL_FORCE_DOT,   "1234" },
        { "1.234,567",             NF_DECIMAL_RIGHTMOST,   "1234.567" },
        { "1,234.567",             NF_DECIMAL_RIGHTMOST,   "1234.567" },

        { "1 234 567",             NF_DECIMAL_AUTO,        "1234567" },
        { "1_234_567",             NF_DECIMAL_AUTO,        "1234567" },
        { "1'234'567",             NF_DECIMAL_AUTO,        "1234567" },
        { " (  14,50 ) ",          NF_DECIMAL_FORCE_COMMA, "-14.50" },    /* fixed expected */
        { "999,999,999,999,999.999999", NF_DECIMAL_AUTO,   "999999999999999.999999" },


        { "( .5 )",                NF_DECIMAL_FORCE_DOT,   "-.5" }
    };

    const size_t N = sizeof(tests) / sizeof(tests[0]);
    char out[256];
    size_t passCount = 0;

    for(size_t i = 0; i < N; ++i)
    {
        const TestCase *t = &tests[i];
        size_t n = normalize_float_string_mode(t->input, out, sizeof(out), t->mode);
        int ok = (n == strlen(t->expected)) && (strcmp(out, t->expected) == 0);

        if(ok)
        {
            ++passCount;
            printf("%2zu. PASS  mode=%-10s  in=\"%s\"  out=\"%s\"\n",
                   i + 1, mode_name(t->mode), t->input, out);
        }
        else
        {
            printf("%2zu. FAIL  mode=%-10s  in=\"%s\"\n"
                   "    exp=\"%s\"\n"
                   "    got=\"%s\"  (len %zu)\n",
                   i + 1, mode_name(t->mode), t->input, t->expected, out, n);
        }
    }

    printf("\nSummary: %zu / %zu tests passed\n", passCount, N);
    return (passCount == N) ? 0 : 1;
}

https://godbolt.org/z/GWvneKoTq

1 Comment

My understanding is that OP's input strings are already normalized in the C locale.
1

Each call to setlocale() establishes the locale that all routines use from then on. There are two ways to take advantage of that:

  • Switch to the user locale (calling setlocale(LC_ALL, "");) only once you have read the configuration. Depending on the program, this may or may not be possible.

  • Switch to the standard C locale when needed, saving the name of the user locale (as returned by setlocale()) so that you can switch back to it afterwards:

/* before the following call you are using "C" locale */
/* do configuration here */
const char *user_locale = setlocale(LC_ALL, "");
/* here you are using the user environment locale */
/* do user related communication here */
const char *saved_locale = setlocale(LC_ALL, "C");
/* NOW YOU HAVE SWITCHED TO THE STANDARD "C" LOCALE AGAIN */
setlocale(LC_ALL, user_locale); /* switch to the user locale */
setlocale(LC_ALL, saved_locale); /* switch to the previously used locale */
setlocale(LC_ALL, "C"); /* switch back to standard "C" locale */

This allows you to use the locale you may want at every program stage. Beware that the locale is a process property (better said a global library property), so IMHO switching locales shouldn't be considered thread safe.
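One detail about saving and restoring: the C standard allows a later setlocale() call to overwrite the string returned by an earlier one, so it is safer to copy the locale name before switching. A minimal sketch of that save/switch/restore pattern follows, limited to LC_NUMERIC. The helper name strtod_c_locale_unsafe is hypothetical, and the approach is only reasonable while no other thread is doing locale-dependent work.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse `s` with LC_NUMERIC temporarily set to "C".
 * NOT thread-safe: setlocale() changes the locale of the whole process. */
double strtod_c_locale_unsafe(const char *s, char **end)
{
    char *saved = NULL;
    const char *cur = setlocale(LC_NUMERIC, NULL);  /* query only, no change */

    if (cur != NULL) {
        /* Copy the name: a later setlocale() call may overwrite `cur`. */
        size_t len = strlen(cur) + 1;
        saved = malloc(len);
        if (saved != NULL)
            memcpy(saved, cur, len);
    }

    setlocale(LC_NUMERIC, "C");
    double value = strtod(s, end);

    /* Restore the previous locale; if the copy failed, "C" stays active. */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
    return value;
}
```

Called during single-threaded startup, strtod_c_locale_unsafe("0.5", &end) parses with '.' as the separator and leaves LC_NUMERIC as it found it.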

3 Comments

The question explicitly states calling setlocale is not acceptable due to multithreading.
downvote it then... but I don't think multithreading makes it unacceptable. The code is said to be related to configuration: the sequence (set the default locale, configure the application, then change to the user locale) can be protected in a non-shared region (e.g. with a mutex) that stops the conflicting threads while the configuration is being applied (or the start of multithreading can be postponed until after the configuration has been done). The question explicitly says the environment is multithreaded, but the OP only mentions it...
... (doesn't state explicitly that he doesn't want a multithread-protected solution), and nowhere is it said that setlocale() cannot be used in a multithreaded environment (not being thread-safe doesn't mean that you cannot make it thread-safe). You can just start the application, configure it using the C locale, then switch to the user locale, then start the running application properly. Please read the question properly before saying I have not done so. Thanks :)
1

If you don't need the full range of floating-point numbers, by far the easiest solution is to simply multiply by a large constant and store an integer, then divide by that constant when loading. For example, if you're using 64-bit integers and IEEE 32-bit floats, using a multiplier of 100 000 000 will give you eight decimal places of precision, while letting you store numbers up to 9.2*10^10.

(I use a multiplier of 100 000 000 000 for storing latitude and longitude, which XKCD assures me is sufficient precision to point out a specific grain of sand).
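A minimal sketch of this scaled-integer idea, assuming a multiplier of 10^8 and values well inside the range mentioned above. The names SCALE, to_scaled, from_scaled, write_scaled and read_scaled are made up for illustration. Only integers are written and read, so no decimal separator ever appears in the file.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SCALE 100000000LL  /* 10^8: eight decimal places (assumed multiplier) */

/* Convert to the stored integer, rounding half away from zero. */
int64_t to_scaled(double x)
{
    double scaled = x * (double)SCALE;
    return (int64_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert the stored integer back to a floating-point value. */
double from_scaled(int64_t n)
{
    return (double)n / (double)SCALE;
}

/* Format a value for the config file as a plain integer. */
int write_scaled(char *buf, size_t size, double x)
{
    return snprintf(buf, size, "%" PRId64, to_scaled(x));
}

/* Read a value back; no decimal separator is involved at any point. */
double read_scaled(const char *text)
{
    return from_scaled((int64_t)strtoll(text, NULL, 10));
}
```

With this scheme 0.875 is stored as the text 87500000. Values outside roughly ±9.2*10^10 would overflow the 64-bit integer, so a real implementation would want a range check before the conversion.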

Comments

0

For systems that do not have strtod_l, here is an alternate approach where digits from the source string are copied to a local buffer but not the decimal separator and the exponent is adjusted to compensate. This ensures a locale independent conversion with the same rounding as for the original string:

#include <ctype.h>
#include <stdlib.h>
#include <stdint.h>

double strtod_nolocale(const char *s, char **endptr) {
    char buf[64];
    size_t prefix, pos, pos2;
    int exp = 0;
    unsigned char c, c0;
    unsigned char has_digits = 1;

    // skip the optional blanks
    for (prefix = 0; isspace(c = s[prefix]); prefix++)
        continue;
    // check for a string with at least one digit
    pos = prefix;
    c = s[pos++];
    if (c == '-' || c == '+')
        c = s[pos++];
    if (c == '.' || c == ',')
        c = s[pos++];
    if (!isdigit(c)) {
        // not a numeric value, just pass the initial string to strtod
        return strtod(s, endptr);
    }
    // copy the sign and digits to the tmp buffer
    pos = prefix;
    pos2 = 0;
    c = s[pos++];
    if (c == '+' || c == '-') {
        buf[pos2++] = c;
        c = s[pos++];
    }
    // skip leading zeroes
    if (c == '0') {
        has_digits = 0;
        while (s[pos] == '0') {
            pos++;
        }
        c = s[pos++];
    }
    // copy digits before the decimal separator
    while (isdigit(c)) {
        has_digits = 1;
        if (pos2 < sizeof(buf) - 16)
            buf[pos2++] = c;
        else
            exp++;
        c = s[pos++];
    }
    if (c == '.' || c == ',') {
        if (!has_digits) {
            // skip leading 0 decimals
            while (s[pos] == '0') {
                exp -= 1;
                pos++;
            }
        }
        // copy decimal digits
        c = s[pos++];
        while (isdigit(c)) {
            has_digits = 1;
            exp -= 1;
            if (pos2 < sizeof(buf) - 16)
                buf[pos2++] = c;
            c = s[pos++];
        }
    }
    if (!has_digits) {
        // no digits: positive or negative zero
        buf[pos2++] = '0';
        exp = 0;
    }
    // check for proper exponent
    if (c == 'e' || c == 'E') {
        c = c0 = s[pos];
        if (c == '+' || c == '-')
            c = s[pos + 1];
        if (isdigit(c)) {
            // do not use strtol to avoid setting errno
            int e = 0;
            if (c0 == '+' || c0 == '-')
                pos++;
            while (isdigit(c = s[pos++])) {
                e = e * 10 + c - '0';
                if (e >= 32000)
                    e = 32000;
            }
            exp += e;
        }
    }
    if (endptr) {
        *endptr = (char*)(uintptr_t)(s + pos - 1);
    }
    // store adjusted exponent
    if (exp) {
        buf[pos2++] = 'E';
        if (exp < 0) {
            buf[pos2++] = '-';
            exp = -exp;
        }
        pos2 += 1 + (exp > 9) + (exp > 99) + (exp > 999) + (exp > 9999);
        size_t p1 = pos2;
        while (exp > 9) {
            buf[--p1] = (char)('0' + exp % 10);
            exp /= 10;
        }
        buf[--p1] = (char)('0' + exp);
    }
    buf[pos2] = '\0';
    return strtod(buf, NULL);
}

And here is a test framework to compare the behavior of this function and strtod:

#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

const char *tests[] = {
    "",
    "x",
    "123.456",
    "123,456",
    "1.23456e2",
    "1,23456e2",
    "00000000001.23456e2",
    "0000,0000000000123456e12",
    "0",
    "0.0",
    "0,0",
    "+0",
    "+0.0",
    "+0,0",
    "-0",
    "-0.0",
    "-0,0",
    "inf",
    "infinity",
    "infx",
    "nan",
    "nan(43)",
    "nanx",
    "INF",
    "INFINITY",
    "INFx",
    "NAN",
    "NAN(43)",
    "NANX",
    "1e10000",
    "0e10000",
    "-1e10000",
    "1e-10000",
    "-1e-10000",
    "123_456.789_012",
    "123_456,789_012",
    "123'456.789'012",
    "123'456,789'012",
};
int num_tests = sizeof(tests) / sizeof(tests[0]);

void test(const char *s) {
    char buf[32];
    double x;
    char *end;
    int len1 = strlen(s);
    int len2;

    errno = 0;
    x = strtod(s, &end);
    if (x >= 0 && x <= 0 && 1 / x < 0)
        strcpy(buf, "-0");
    else
        snprintf(buf, sizeof buf, "%g", x);
    len2 = strlen(end);
    printf("  \"%s\"%*s    %3d  \"%s\"%*s", s, 32 - len1, buf, errno, end, 16 - len2, "");
    errno = 0;
    x = strtod_nolocale(s, &end);
    if (x >= 0 && x <= 0 && 1 / x < 0)
        strcpy(buf, "-0");
    else
        snprintf(buf, sizeof buf, "%g", x);
    len2 = strlen(end);
    printf("  \"%s\"%*s    %3d  \"%s\"\n", s, 32 - len1, buf, errno, end);
}

int main(int argc, char *argv[]) {
    const char locale_name[] = "fr_FR.UTF-8";

    printf("\nUsing C locale:\n\n");
    printf("  %-62s %s\n", "strtod", "strtod_nolocale");
    printf("  string                       value  errno  end                 string                       value  errno  end\n");

    if (argc > 1) {
        for (int i = 1; i < argc; i++) {
            test(argv[i]);
        }
    } else {
        for (int i = 0; i < num_tests; i++) {
            test(tests[i]);
        }
    }

    setlocale(LC_ALL, locale_name);
    printf("\nUsing locale %s:\n\n", locale_name);
    printf("  %-62s %s\n", "strtod", "strtod_nolocale");
    printf("  string                       value  errno  end                 string                       value  errno  end\n");

    if (argc > 1) {
        for (int i = 1; i < argc; i++) {
            test(argv[i]);
        }
    } else {
        for (int i = 0; i < num_tests; i++) {
            test(tests[i]);
        }
    }
    return 0;
}

Output:

Using C locale:

  strtod                                                         strtod_nolocale
  string                       value  errno  end                 string                       value  errno  end
  ""                               0      0  ""                  ""                               0      0  ""
  "x"                              0      0  "x"                 "x"                              0      0  "x"
  "123.456"                  123.456      0  ""                  "123.456"                  123.456      0  ""
  "123,456"                      123      0  ",456"              "123,456"                  123.456      0  ""
  "1.23456e2"                123.456      0  ""                  "1.23456e2"                123.456      0  ""
  "1,23456e2"                      1      0  ",23456e2"          "1,23456e2"                123.456      0  ""
  "00000000001.23456e2"      123.456      0  ""                  "00000000001.23456e2"      123.456      0  ""
  "0000,0000000000123456e12"       0      0  ",0000000000123456e12"      "0000,0000000000123456e12" 12.3456      0  ""
  "0"                              0      0  ""                  "0"                              0      0  ""
  "0.0"                            0      0  ""                  "0.0"                            0      0  ""
  "0,0"                            0      0  ",0"                "0,0"                            0      0  ""
  "+0"                             0      0  ""                  "+0"                             0      0  ""
  "+0.0"                           0      0  ""                  "+0.0"                           0      0  ""
  "+0,0"                           0      0  ",0"                "+0,0"                           0      0  ""
  "-0"                            -0      0  ""                  "-0"                            -0      0  ""
  "-0.0"                          -0      0  ""                  "-0.0"                          -0      0  ""
  "-0,0"                          -0      0  ",0"                "-0,0"                          -0      0  ""
  "inf"                          inf      0  ""                  "inf"                          inf      0  ""
  "infinity"                     inf      0  ""                  "infinity"                     inf      0  ""
  "infx"                         inf      0  "x"                 "infx"                         inf      0  "x"
  "nan"                          nan      0  ""                  "nan"                          nan      0  ""
  "nan(43)"                      nan      0  ""                  "nan(43)"                      nan      0  ""
  "nanx"                         nan      0  "x"                 "nanx"                         nan      0  "x"
  "INF"                          inf      0  ""                  "INF"                          inf      0  ""
  "INFINITY"                     inf      0  ""                  "INFINITY"                     inf      0  ""
  "INFx"                         inf      0  "x"                 "INFx"                         inf      0  "x"
  "NAN"                          nan      0  ""                  "NAN"                          nan      0  ""
  "NAN(43)"                      nan      0  ""                  "NAN(43)"                      nan      0  ""
  "NANX"                         nan      0  "X"                 "NANX"                         nan      0  "X"
  "1e10000"                      inf     34  ""                  "1e10000"                        1      0  ""
  "0e10000"                        0      0  ""                  "0e10000"                        0      0  ""
  "-1e10000"                    -inf     34  ""                  "-1e10000"                      -1      0  ""
  "1e-10000"                       0     34  ""                  "1e-10000"                       1      0  ""
  "-1e-10000"                     -0     34  ""                  "-1e-10000"                     -1      0  ""
  "123_456.789_012"              123      0  "_456.789_012"      "123_456.789_012"              123      0  "_456.789_012"
  "123_456,789_012"              123      0  "_456,789_012"      "123_456,789_012"              123      0  "_456,789_012"
  "123'456.789'012"              123      0  "'456.789'012"      "123'456.789'012"              123      0  "'456.789'012"
  "123'456,789'012"              123      0  "'456,789'012"      "123'456,789'012"              123      0  "'456,789'012"

Using locale fr_FR.UTF-8:

  strtod                                                         strtod_nolocale
  string                       value  errno  end                 string                       value  errno  end
  ""                               0      0  ""                  ""                               0      0  ""
  "x"                              0      0  "x"                 "x"                              0      0  "x"
  "123.456"                      123      0  ".456"              "123.456"                  123,456      0  ""
  "123,456"                  123,456      0  ""                  "123,456"                  123,456      0  ""
  "1.23456e2"                      1      0  ".23456e2"          "1.23456e2"                123,456      0  ""
  "1,23456e2"                123,456      0  ""                  "1,23456e2"                123,456      0  ""
  "00000000001.23456e2"            1      0  ".23456e2"          "00000000001.23456e2"      123,456      0  ""
  "0000,0000000000123456e12" 12,3456      0  ""                  "0000,0000000000123456e12" 12,3456      0  ""
  "0"                              0      0  ""                  "0"                              0      0  ""
  "0.0"                            0      0  ".0"                "0.0"                            0      0  ""
  "0,0"                            0      0  ""                  "0,0"                            0      0  ""
  "+0"                             0      0  ""                  "+0"                             0      0  ""
  "+0.0"                           0      0  ".0"                "+0.0"                           0      0  ""
  "+0,0"                           0      0  ""                  "+0,0"                           0      0  ""
  "-0"                            -0      0  ""                  "-0"                            -0      0  ""
  "-0.0"                          -0      0  ".0"                "-0.0"                          -0      0  ""
  "-0,0"                          -0      0  ""                  "-0,0"                          -0      0  ""
  "inf"                          inf      0  ""                  "inf"                          inf      0  ""
  "infinity"                     inf      0  ""                  "infinity"                     inf      0  ""
  "infx"                         inf      0  "x"                 "infx"                         inf      0  "x"
  "nan"                          nan      0  ""                  "nan"                          nan      0  ""
  "nan(43)"                      nan      0  ""                  "nan(43)"                      nan      0  ""
  "nanx"                         nan      0  "x"                 "nanx"                         nan      0  "x"
  "INF"                          inf      0  ""                  "INF"                          inf      0  ""
  "INFINITY"                     inf      0  ""                  "INFINITY"                     inf      0  ""
  "INFx"                         inf      0  "x"                 "INFx"                         inf      0  "x"
  "NAN"                          nan      0  ""                  "NAN"                          nan      0  ""
  "NAN(43)"                      nan      0  ""                  "NAN(43)"                      nan      0  ""
  "NANX"                         nan      0  "X"                 "NANX"                         nan      0  "X"
  "1e10000"                      inf     34  ""                  "1e10000"                        1      0  ""
  "0e10000"                        0      0  ""                  "0e10000"                        0      0  ""
  "-1e10000"                    -inf     34  ""                  "-1e10000"                      -1      0  ""
  "1e-10000"                       0     34  ""                  "1e-10000"                       1      0  ""
  "-1e-10000"                     -0     34  ""                  "-1e-10000"                     -1      0  ""
  "123_456.789_012"              123      0  "_456.789_012"      "123_456.789_012"              123      0  "_456.789_012"
  "123_456,789_012"              123      0  "_456,789_012"      "123_456,789_012"              123      0  "_456,789_012"
  "123'456.789'012"              123      0  "'456.789'012"      "123'456.789'012"              123      0  "'456.789'012"
  "123'456,789'012"              123      0  "'456,789'012"      "123'456,789'012"              123      0  "'456,789'012"

Comments

0

Use strtof_l() with the "C" locale — it parses using . as the decimal separator regardless of the system locale.

Example:

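#define _GNU_SOURCE   /* strtof_l() is a GNU extension in glibc */
#include <locale.h>
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    /* A separate locale object; the global locale is not modified. */
    locale_t c_locale = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0) {
        perror("newlocale");
        return 1;
    }
    const char *s = "3.14";
    char *end;
    float f = strtof_l(s, &end, c_locale);
    printf("%f\n", f);
    freelocale(c_locale);
    return 0;
}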

This keeps your main locale untouched (safe for multithreaded code) and uses the C locale only for that conversion.
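Where strtof_l() is not available, a related option is POSIX uselocale(), which switches the locale of the calling thread only. The sketch below is an alternative to the code above rather than part of it, and the helper name parse_float_c_locale is hypothetical. It also rejects trailing characters and out-of-range values, which may or may not suit a given config format.

```c
#define _POSIX_C_SOURCE 200809L  /* newlocale(), uselocale(), freelocale() */
#include <errno.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse the whole string `s` as a float using '.' as
 * the decimal separator. Returns 0 on success, -1 on any failure. */
int parse_float_c_locale(const char *s, float *out)
{
    locale_t c_loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    /* Affects only the calling thread; other threads keep their locale. */
    locale_t old = uselocale(c_loc);

    char *end;
    errno = 0;
    float value = strtof(s, &end);
    int saved_errno = errno;

    uselocale(old);      /* restore this thread's previous locale */
    freelocale(c_loc);

    if (end == s || *end != '\0' || saved_errno == ERANGE)
        return -1;       /* nothing parsed, trailing junk, or out of range */
    *out = value;
    return 0;
}
```

Creating the locale object on every call keeps the sketch short; a long-running program would more likely create it once and reuse it.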

Comments

-1

Solution which is 100% standard C, which does not care about locale settings and doesn't involve rolling out the whole conversion manually:

  • Make a function strtod_nolocale with 100% compatible API:

    double strtod_nolocale (const char* restrict nptr, char** restrict endptr);
    
  • In this function, start by calling strtod and see if it parsed the whole input string. If so, all is well - stop there. This also gives support to corner cases "INF" and "NAN" without us having to craft that manually.

  • In case strtod didn't parse the whole string, then make a list of supported decimal separators. Realistically this is only ever going to be '.' and ','.

  • Check if strtod failed because it encountered an unknown separator from a different locale, or if it just failed because it found some garbage.

  • In case it was the other locale, skip past that character and parse the rest of the string from there (the fractional part), using strtol.

  • Cook up a correct floating point number by adding the previous integral part from strtod together with the fractional part obtained by strtol.

This is what I came up with:

#include <stdlib.h>
#include <stdio.h>
#include <stddef.h>
#include <stdbool.h>
#include <math.h>

double strtod_nolocale (const char* restrict nptr, char** restrict endptr)
{
  static const char supported_separators[] = {'.', ','};

  // always parse with a local end pointer;
  // it is copied to the caller's endptr (if any) at the end
  char* end;

  double integral = strtod(nptr, &end);
  double fractional = 0.0;
  if(end == nptr || *end == '\0') // nothing was parsed, or the whole string was parsed (including special cases NAN, INF)
  {
    goto the_end;
  }

  bool found_separator = false;
  for(size_t i=0; i<sizeof supported_separators; i++)
  {
    if(*end == supported_separators[i])
    {
      found_separator = true;
      break;    
    }
  }
  if(!found_separator) // was some other weird symbol, stop here
  {
    goto the_end;
  }

  char* next = end+1;
  long int_fractional = strtol(next, &end, 10);
  if(int_fractional == 0)
  {
    goto the_end;
  }
  ptrdiff_t digits = end - next;
  double sign = integral < 0.0 ? -1.0 : 1.0;
  fractional = sign * (double)int_fractional / pow(10,digits);

  the_end:
  if(endptr != NULL)
  {
    *endptr = end;
  }
  return integral + fractional;
}

Full example with test cases:

#include <stdlib.h>
#include <stdio.h>
#include <stddef.h>
#include <stdbool.h>
#include <math.h>

double strtod_nolocale (const char* restrict nptr, char** restrict endptr)
{
  static const char supported_separators[] = {'.', ','};

  // always parse with a local end pointer;
  // it is copied to the caller's endptr (if any) at the end
  char* end;

  double integral = strtod(nptr, &end);
  double fractional = 0.0;
  if(end == nptr || *end == '\0') // nothing was parsed, or the whole string was parsed (including special cases NAN, INF)
  {
    goto the_end;
  }

  bool found_separator = false;
  for(size_t i=0; i<sizeof supported_separators; i++)
  {
    if(*end == supported_separators[i])
    {
      found_separator = true;
      break;    
    }
  }
  if(!found_separator) // was some other weird symbol, stop here
  {
    goto the_end;
  }

  char* next = end+1;
  long int_fractional = strtol(next, &end, 10);
  if(int_fractional == 0)
  {
    goto the_end;
  }
  ptrdiff_t digits = end - next;
  double sign = integral < 0.0 ? -1.0 : 1.0;
  fractional = sign * (double)int_fractional / pow(10,digits);

  the_end:
  if(endptr != NULL)
  {
    *endptr = end;
  }
  return integral + fractional;
}


#define TEST(str) \
  puts(str); \
  result1 = strtod(str, &endptr); \
  printf("strtod:          %lf, endptr: \"%s\"\n", result1, endptr); \
  result2 = strtod_nolocale(str, &endptr); \
  printf("strtod_nolocale: %lf, endptr: \"%s\"\n", result2, endptr); \
  puts("");

int main (void)
{
  double result1, result2;
  char* endptr;

  TEST("123");
  TEST("123.456");
  TEST("0");
  TEST("123,456");
  TEST("-123,4");
  TEST("123|456");
  TEST("garbage");
  TEST("123.456garbage");
  TEST("123,456garbage");
  TEST("INF");
  TEST("NAN");
}

Notable differences between strtod and strtod_nolocale (tested on a machine with . separator):

123,456
strtod:          123.000000, endptr: ",456"
strtod_nolocale: 123.456000, endptr: ""

-123,4
strtod:          -123.000000, endptr: ",4"
strtod_nolocale: -123.400000, endptr: ""

123,456garbage
strtod:          123.000000, endptr: ",456garbage"
strtod_nolocale: 123.456000, endptr: "garbage"

2 Comments

I'm afraid this does not work for 1,23456e2. Furthermore the rounding may be incorrect for the fractional part parsed by strtol() and divided by pow(10,digits), especially if sizeof(long) < sizeof(uint64_t).
“-0.5” fails due to mishandling the sign. “1.e1” fails due to neglecting the exponent. “9.9900000000000000000” fails due to overflow in strtol (if long is at most 64 bits). “17179869184.5000057220458984” fails due to rounding in division by a power of 10. “4.36028797018963974” fails due to rounding in conversion from long to double (if long is at least 57 bits). “0. 5” incorrectly produces 0.5 due to accepting a space inside the string.
-2

Write a very simple string-to-float conversion function for yourself. About a dozen or so lines of code and you avoid all problems mentioned in the other answers. As the config format is in your hands, you can easily make sure that you won't ever use exponential notation or maybe not even negative numbers. Being a config reader, you don't even have to worry about performance, so even the naivest approach of reading character after character, multiplying by ten and adding the new digit (and the opposite after the separator) will do.

Yes, I know some other comments warn against this — without merit, in my opinion. Creating a full conversion routine that deals with exponentials, very small or very large numbers or similar edge cases is not trivial, that's sure. However, your requirements described in your post, namely reading your own values back don't require that.
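For illustration, here is one way such a restricted parser could look. This is a sketch under stated assumptions, not a general strtod replacement: it accepts only an optional sign, digits and a single '.', with no exponent and at most 15 digits in total. Instead of adding digit/10, digit/100, ... one at a time, it accumulates all digits in an integer and divides once by a power of ten; both operands are then exactly representable, so the single division is correctly rounded and values such as 0.875 come out exact. The name parse_simple_decimal is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: parse [+-]digits[.digits] with '.' as the only
 * accepted separator. Returns 0 on success, -1 on malformed input or on
 * more than 15 digits (leading zeros count toward the limit). */
int parse_simple_decimal(const char *s, double *out)
{
    static const double powers_of_ten[] = {
        1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };
    uint64_t mantissa = 0;
    int negative = 0, ndigits = 0, nfrac = 0, seen_point = 0;

    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    for (; *s != '\0'; s++) {
        if (*s >= '0' && *s <= '9') {
            if (++ndigits > 15)
                return -1;          /* keep the mantissa below 2^53 */
            mantissa = mantissa * 10 + (uint64_t)(*s - '0');
            if (seen_point)
                nfrac++;
        } else if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else {
            return -1;              /* comma, exponent, space, garbage */
        }
    }
    if (ndigits == 0)
        return -1;

    /* One division of two exactly representable values. */
    double value = (double)mantissa / powers_of_ten[nfrac];
    *out = negative ? -value : value;
    return 0;
}
```

Because it never consults the locale, "0.875" parses the same everywhere, and "0,875" is rejected rather than silently truncated.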

6 Comments

I expect it is impossible to write a decimal to binary floating-point conversion in about a dozen lines of C code that is correct. The method you describe will not work for some numerals as simple as “0.875”, because computing 8./10. + 7./100. + 5/1000. with IEEE-754 binary64 arithmetic does not produce 0.875 but rather 0.87500000000000011102230246251565404236316680908203125.
Yes. And in what way is this important for the OP who, according to his description, needs just a few config values stored, and was obviously not worried about precision to the umpteenth digit but the locale differences of decimal separators? If he really needs that kind of precision, he could obviously store the actual IEEE representation in a binhex-coded binary value or similar. But that's not what I inferred from the question. So, I stand by my answer, even if downvoted. :-)
OP showed a 17-digit number, 0.10000000149011612, as an example, and your method will not even work for a three-digit number. We do not know what problems that will cause. If the program had saved, for example, some shading level from 0/8 to 8/8 as a fraction and later read back 0.87500000000000011102230246251565404236316680908203125, that might not match any of the expected values and could cause various errors.
Further, the purpose of Stack Overflow is not to help individual users; it is to provide a durable repository of questions and answers for future users. So, even if some low-quality code served one person’s purpose, it is still a bad answer for the question because it misleads future users who may need code that actually works. The suggestion in this answer is bad engineering and fosters ignorance and negligence.
We don't know whether it's important. The OP didn't say he needed full precision on his config params — but he didn't say he didn't, either. And floating point is an area where many consider it a good rule to not throw away precision gratuitously, because sometimes that precision is vital. (Look at how many people get badly upset when they learn that 0.1 can't be exactly represented in binary. Not being able to handle 0.875 — which can be exactly represented — is, arguably, even worse.)
OK. OP posted the question, I provided a possible solution. It's up to him to decide whether it suits his needs or not. Actually, just now, I found a better solution among the comments: multiply with a hardwired large number and store as an integer. I consider this to be the best answer, obviously better than mine and all others, and urged Mark to make an answer from it. Then OP is free to choose the best answer.
