I'm trying to first formulate this challenge.
In short, we want to sum up an stream of random numbers with the following objective:
The objective is "simply" to sum up as many sequences as possible as fast as possible and as accurately as possible.
Problem
Description. There is a sequence of floating-point numbers stored in IEEE-754 binary64 (double precision, fp64) format 𝑥𝑖 of length 𝑁. The sequence needs to be summed up to 𝑆=𝑥1+𝑥2+…+𝑥𝑁. As professional computer equipment with native support for fp16 is usually unavailable to the general audience, we propose to do operations in a simplified simulated environment, that is, we do computations in fp64 format with mantissa and exponent cut to the range admissible in fp16. In particular, small values that do not fit fp16 admissible range turn into zeros, while excessively large values turn into infinities.
Objective. Your objective is to sum up as many sequences as possible as fast as possible and as accurately as possible. Please note that you may do summation in fp64 format, but the summation process would be slow though accurate. If you do plain summation in fp16 format, it can be fast, but inaccurate, especially for larger sequences.
Input
The input consists of a single line. It starts with an integer 𝑁 representing the number of values in the sequence. The following 𝑁 double precision numbers form the sequence 𝑥𝑖, where 𝑖=1,…,𝑁.Variable constraints:
Length of the sequence: 2≤𝑁≤1 000 000.
Value of any individual number in the sequence: legal IEEE-754 binary64 value stored in decimal format.
Note that the actual binary64 value is not always exactly equal to the given decimal value. Instead, the actual given value is the closest number representable in binary64. When reading the input, most programming languages will do the conversion for you automatically.
It is guaranteed that every number in the sequence either is 0 or has absolute value between 10\$^{−300}\$ and 10\$^{300}\$, inclusive.
Output
Print a single line which will describe the summation process. The line should contain an encoded algorithm for the summation. We use this encoding to do actual summation and report the result to prevent the need to seek hardware capable of doing fp16 operations natively.An encoded algorithm consists of the data type to use, followed by a list of values to sum up using this data type. The result of the algorithm is the sum of the given values, as computed in the given data type, from left to right, in the given order. It looks as follows:
{type:value_1,value_2,...,value_k}
As you can see, the whole algorithm is surrounded by curly brackets (
{and}). The next character represents one of the three possible data types:
dfor fp64 summation,sfor fp32 summation,hfor fp16 summation.Then goes a colon (
:). It is followed by a non-empty list of values to sum up, separated by commas (,). Note that there are no spaces.Each value can be one of the following:
- an integer from 1 to 𝑁 indicating a position in the input sequence: in this case, the value comes directly from the input
- another algorithm: in this case, the value is the result of this algorithm.
Some examples:
{d:1,2,3,4}tells to use double precision to compute 𝑥1+𝑥2+𝑥3+𝑥4;{h:4,3,2,1}tells to use half precision to compute 𝑥4+𝑥3+𝑥2+𝑥1;{d:{s:3,4},{h:2,1}}tells to use double precision to compute 𝑦+𝑧, where:
• 𝑦 is to use single precision to compute 𝑥3+𝑥4
• 𝑧 is to use half precision to compute 𝑥2+𝑥1{h:1,4,{d:3,2}}tells to use half precision to compute 𝑥1+𝑥4+𝑦, where:
• 𝑦 is to use double precision to compute 𝑥3+𝑥2.Each input value must be used exactly once.
Code
#include <iostream>
#include <iomanip>
#include <vector>
#include <random>
#include <limits>
#include <sstream>
#include <algorithm>
template <typename T>
struct RandomGenerator
{
public:
std::vector<T> generate(int num_elements)
{
std::vector<T> random_numbers(num_elements);
std::random_device rd;
std::mt19937 gen(rd());
if constexpr (std::is_floating_point_v<T>)
{
std::uniform_real_distribution<T> dis(-1e33, 1e33);
for (int i = 0; i < num_elements; ++i)
{
random_numbers[i] = dis(gen);
}
}
else if constexpr (std::is_integral_v<T>)
{
std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(), std::numeric_limits<T>::max());
for (int i = 0; i < num_elements; ++i)
{
random_numbers[i] = dis(gen);
}
}
return random_numbers;
}
};
struct Solution
{
public:
template <typename T>
std::vector<T> gen_rand(int num_elements)
{
RandomGenerator<T> generator;
return generator.generate(num_elements);
}
std::string shuffle_random_numbers(int size)
{
std::ostringstream stream;
auto doubles = gen_rand<long double>(size);
auto long_ints = gen_rand<long long>(size);
auto int64 = gen_rand<int64_t>(size);
auto int32 = gen_rand<int32_t>(size);
auto int16 = gen_rand<int16_t>(size);
for (int i = 0; i < size; i++)
{
stream << std::setprecision(32) << doubles[i] << " ";
stream << long_ints[i] << " ";
stream << int64[i] << " ";
stream << int32[i] << " ";
stream << int16[i] << " ";
}
return stream.str();
}
};
int main()
{
Solution solution;
int total_numbers = 2000000;
int all_types = 5;
int size = total_numbers / all_types;
std::string nums = solution.shuffle_random_numbers(size);
for (int i = 0; i < 5000; ++i)
{
std::cout << nums[i];
}
return 0;
}
This code "works" and only generates some random numbers, which are slightly out of the range of the challenge.
Let's review this code snippet and see how off-topic I am.
Prints
8.9933425805702840490035138002944e+32 6068080230499629591 1826670665154147315 1926076633 16679 -2.3399580496251931870772800677478e+32 -2732482697013172932 6425612697248066705 825847785 12479 -3.9444235690444262100081682939904e+32 8553508215025978902 -2670702833087541545 164713529 28675 1.6083512370712003821362383473869e+32 -3506100849371078041 -2165996920030252883 -901675535 8152 -9.5252438825429638154154145742848e+32 -3560862265332198988 8799122362162851542 -54914853 -28996 -6.2094149117166213759869987376333e+32 -5855979725391943300 5713515607426216804 751600839 7763 -9.7987789440709657017344638477926e+32 -3820587565628500649 850340292612739915 -391437763 14125 3.7479446584346197537142620553216e+32 -9057751415679921633 678121319467017279 -1861290375 -8744 6.8093472005846094889347065472614e+32 8321197442813223250 237204722459191531 -479535824 -14669 6.1142155694755141085012146572493e+32 8041428622998088346 2553967891039017903 -976743947 6397 -2.4393183990678838521509170066227e+32 6576801438081748567 7205393039574529791 -358619125 -6168 -7.7622471247437259621051420324659e+32 -3245109382221000642 -7711308135318252274 567027640 -23410 3.6235229292322655666619042653798e+32 4152914541590001194 -3808686697100249036 -995000765 5539 -8.6455228211482700708946848514048e+32 7463233541783770067 3531693187639253190 1558578888 -765 -4.3881685373671754294460423130317e+32 -7115711376329041685 -1737915043077554274 -1320860354 -9994