Fast and precise summation of random numbers

Question

I'm trying to first formulate this challenge.

In short, we want to sum up an stream of random numbers with the following objective:

The objective is "simply" to sum up as many sequences as possible as fast as possible and as accurately as possible.

Problem

Description. There is a sequence of floating-point numbers stored in IEEE-754 binary64 (double precision, fp64) format 𝑥𝑖 of length 𝑁. The sequence needs to be summed up to 𝑆=𝑥1+𝑥2+…+𝑥𝑁. As professional computer equipment with native support for fp16 is usually unavailable to the general audience, we propose to do operations in a simplified simulated environment, that is, we do computations in fp64 format with mantissa and exponent cut to the range admissible in fp16. In particular, small values that do not fit fp16 admissible range turn into zeros, while excessively large values turn into infinities.

Objective. Your objective is to sum up as many sequences as possible as fast as possible and as accurately as possible. Please note that you may do summation in fp64 format, but the summation process would be slow though accurate. If you do plain summation in fp16 format, it can be fast, but inaccurate, especially for larger sequences.

Input
The input consists of a single line. It starts with an integer 𝑁 representing the number of values in the sequence. The following 𝑁 double precision numbers form the sequence 𝑥𝑖, where 𝑖=1,…,𝑁.

Variable constraints:

Length of the sequence: 2≤𝑁≤1 000 000.

Value of any individual number in the sequence: legal IEEE-754 binary64 value stored in decimal format.

Note that the actual binary64 value is not always exactly equal to the given decimal value. Instead, the actual given value is the closest number representable in binary64. When reading the input, most programming languages will do the conversion for you automatically.

It is guaranteed that every number in the sequence either is 0 or has absolute value between 10\$^{−300}\$ and 10\$^{300}\$, inclusive.

Output
Print a single line which will describe the summation process. The line should contain an encoded algorithm for the summation. We use this encoding to do actual summation and report the result to prevent the need to seek hardware capable of doing fp16 operations natively.

An encoded algorithm consists of the data type to use, followed by a list of values to sum up using this data type. The result of the algorithm is the sum of the given values, as computed in the given data type, from left to right, in the given order. It looks as follows:

{type:value_1,value_2,...,value_k}

As you can see, the whole algorithm is surrounded by curly brackets ({ and }). The next character represents one of the three possible data types:

d for fp64 summation,

s for fp32 summation,

h for fp16 summation.

Then goes a colon (:). It is followed by a non-empty list of values to sum up, separated by commas (,). Note that there are no spaces.

Each value can be one of the following:

an integer from 1 to 𝑁 indicating a position in the input sequence: in this case, the value comes directly from the input

another algorithm: in this case, the value is the result of this algorithm.

Some examples:

{d:1,2,3,4} tells to use double precision to compute 𝑥1+𝑥2+𝑥3+𝑥4;

{h:4,3,2,1} tells to use half precision to compute 𝑥4+𝑥3+𝑥2+𝑥1;

{d:{s:3,4},{h:2,1}} tells to use double precision to compute 𝑦+𝑧, where:
• 𝑦 is to use single precision to compute 𝑥3+𝑥4
• 𝑧 is to use half precision to compute 𝑥2+𝑥1

{h:1,4,{d:3,2}} tells to use half precision to compute 𝑥1+𝑥4+𝑦, where:
• 𝑦 is to use double precision to compute 𝑥3+𝑥2.

Each input value must be used exactly once.

Code

#include <iostream>
#include <iomanip>
#include <vector>
#include <random>
#include <limits>
#include <sstream>
#include <algorithm>

template <typename T>
struct RandomGenerator
{
public:
    std::vector<T> generate(int num_elements)
    {
        std::vector<T> random_numbers(num_elements);

        std::random_device rd;

        std::mt19937 gen(rd());

        if constexpr (std::is_floating_point_v<T>)
        {
            std::uniform_real_distribution<T> dis(-1e33, 1e33);

            for (int i = 0; i < num_elements; ++i)
            {
                random_numbers[i] = dis(gen);
            }
        }
        else if constexpr (std::is_integral_v<T>)
        {
            std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(), std::numeric_limits<T>::max());
            for (int i = 0; i < num_elements; ++i)
            {
                random_numbers[i] = dis(gen);
            }
        }

        return random_numbers;
    }
};

struct Solution
{
public:
    template <typename T>
    std::vector<T> gen_rand(int num_elements)
    {
        RandomGenerator<T> generator;
        return generator.generate(num_elements);
    }

    std::string shuffle_random_numbers(int size)
    {
        std::ostringstream stream;
        auto doubles = gen_rand<long double>(size);
        auto long_ints = gen_rand<long long>(size);
        auto int64 = gen_rand<int64_t>(size);
        auto int32 = gen_rand<int32_t>(size);
        auto int16 = gen_rand<int16_t>(size);
        for (int i = 0; i < size; i++)
        {

            stream << std::setprecision(32) << doubles[i] << " ";
            stream << long_ints[i] << " ";
            stream << int64[i] << " ";
            stream << int32[i] << " ";
            stream << int16[i] << " ";
        }

        return stream.str();
    }
};

int main()
{
    Solution solution;
    int total_numbers = 2000000;
    int all_types = 5;
    int size = total_numbers / all_types;
    std::string nums = solution.shuffle_random_numbers(size);
    for (int i = 0; i < 5000; ++i)
    {
        std::cout << nums[i];
    }

    return 0;
}

This code "works" and only generates some random numbers, which are slightly out of the range of the challenge.

Let's review this code snippet and see how off-topic I am.

Prints

8.9933425805702840490035138002944e+32 6068080230499629591 1826670665154147315 1926076633 16679 -2.3399580496251931870772800677478e+32 -2732482697013172932 6425612697248066705 825847785 12479 -3.9444235690444262100081682939904e+32 8553508215025978902 -2670702833087541545 164713529 28675 1.6083512370712003821362383473869e+32 -3506100849371078041 -2165996920030252883 -901675535 8152 -9.5252438825429638154154145742848e+32 -3560862265332198988 8799122362162851542 -54914853 -28996 -6.2094149117166213759869987376333e+32 -5855979725391943300 5713515607426216804 751600839 7763 -9.7987789440709657017344638477926e+32 -3820587565628500649 850340292612739915 -391437763 14125 3.7479446584346197537142620553216e+32 -9057751415679921633 678121319467017279 -1861290375 -8744 6.8093472005846094889347065472614e+32 8321197442813223250 237204722459191531 -479535824 -14669 6.1142155694755141085012146572493e+32 8041428622998088346 2553967891039017903 -976743947 6397 -2.4393183990678838521509170066227e+32 6576801438081748567 7205393039574529791 -358619125 -6168 -7.7622471247437259621051420324659e+32 -3245109382221000642 -7711308135318252274 567027640 -23410 3.6235229292322655666619042653798e+32 4152914541590001194 -3808686697100249036 -995000765 5539 -8.6455228211482700708946848514048e+32 7463233541783770067 3531693187639253190 1558578888 -765 -4.3881685373671754294460423130317e+32 -7115711376329041685 -1737915043077554274 -1320860354 -9994

Toby Speight · Accepted Answer · 2024-05-12 15:30:03Z

The code presented doesn't agree with the problem statement. It appears that this is intended to generate inputs for the challenge. That led me astray at first.

The good news is that I can't provoke any compiler warnings or Valgrind issues with it.

Let's dive into the detail, starting with the problem statement:

every number in the sequence either is 0 or has absolute value between 10⁻³⁰⁰ and 10³⁰⁰

However, we use a distribution with a different range:

       std::uniform_real_distribution<T> dis(-1e33, 1e33);

Whilst this is certainly within the extremes (too far within, for a good test sample), it's including a lot of numbers in the ranges -10⁻³⁰⁰ < x < 0 and 0 < x < 10⁻³⁰⁰. We should be rejecting and resampling such values.

I don't think there's value in recreating our random generator every time generate() is called. We could make it static:

        static std::mt19937 gen(std::random_device{}());

But this still constructs a new one for each template instantiation, so may be worth making global (at file scope).

If neither if constexpr is satisfied, we end up with a collection of default-constructed T, rather than a random selection. We could add else static_assert(false) to protect against this, or constrain the template:

template<typename T>
concept number = std::is_integral_v<T> || std::is_floating_point_v<T>;

template <number T>
struct RandomGenerator
{

Instead of arithmetic loops such as

        for (int i = 0; i < num_elements; ++i)
        {
            random_numbers[i] = dis(gen);
        }

we can iterate over all members of the collection using range-based for:

        for (auto& number: random_numbers)
        {
            number = dis(gen);
        }

This makes it clearer that we're modifying every element, rather than requiring the reader to match up the size with num_elements.

But instead of filling the vector with default-constructed T, we could create it empty, then add in the elements using the standard "generate" algorithm (isn't this why we included <algorithm>?):

    static std::mt19937 gen(std::random_device{}());

    template<std::integral T>
    T random_number() {
        auto dis = std::uniform_int_distribution<T>
            {std::numeric_limits<T>::min(), std::numeric_limits<T>::max()};
        return dis(gen);
    }

    template<std::floating_point T>
    T random_number() {
        auto dis = std::uniform_real_distribution<T>{-1e300, 1e300};
        for (;;) {
            auto result = dis(gen);
            if (result == 0.0 || std::abs(result) >= 1e-300) {
                return result;
            }
        }
    }

    template <typename T>
    std::vector<T> generate(std::size_t num_elements)
    {
        std::vector<T> random_numbers;
        random_numbers.reserve(num_elements);
        std::ranges::generate_n(std::back_inserter(random_numbers), num_elements, random_number<T>);
        return random_numbers;
    }

Notice how we do away with the if constexpr within the generate() function, because we've provided separate functions to call for integer and floating-point targets.

Members of structs are public by default, so we don't need public: labels. In fact, the member functions use no members, so we don't need the structs. Free functions are fine here.

Also, return 0; at the end of main() is redundant.

Finally, consider that random generation of test cases isn't always the best way of stressing an algorithm. Consider also hand-crafting some adversarial test cases that are likely to overflow result types or cause catastrophic loss of precision. For example:

10³⁰⁰ + 10³⁰⁰ + -10³⁰⁰
10³⁰⁰ + 10⁻³⁰⁰ + -10³⁰⁰

Just note that a uniform distribution in the range from $-10^{33}$ to $10^{33}$ will hardly ever generate a number of absolute value smaller than $10^{10}$, let alone anything with negative exponent — Hagen von Eitzen
– Hagen von Eitzen, Commented May 10, 2024 at 13:49
Good point - it's certainly possible that a uniform distribution isn't what's really required here, of course. — Toby Speight
– Toby Speight, Commented May 10, 2024 at 14:41

Stack Exchange Network

Fast and precise summation of random numbers

Problem

Code

Prints

1 Answer 1

You must log in to answer this question.

Hot Network Questions

Fast and precise summation of random numbers

Problem

Code

Prints

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions