Hottest 'statistics' Answers

31 votes

Accepted

Benchmarking, why discard lowest time?

The lowest timing might indeed represent the "true" timing without outside interference, or it might be a measurement error. E.g. the boosting behaviour of a CPU might speed up the first run ...

amon

136k

answered Dec 31, 2020 at 10:50

28 votes

Benchmarking, why discard lowest time?

Outliers indicate unusual situations. Outliers are interesting in science, because they give you something to investigate, but they are useless in benchmarks. If you have 10000 benchmark runs, and ...

Jörg W Mittag

105k

answered Dec 31, 2020 at 10:41
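
One way to act on the benchmarking answers above is to drop the extreme timings before averaging. The sketch below is only an illustration: the function name `trimmed_mean`, the `trim` parameter and the sample timings are assumptions, not something taken from the answers.

```python
def trimmed_mean(timings, trim=1):
    """Average the timings after dropping the `trim` lowest and highest values."""
    if len(timings) <= 2 * trim:
        raise ValueError("not enough timings left after trimming")
    ordered = sorted(timings)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


# Hypothetical run times in milliseconds, with one suspiciously fast
# and one suspiciously slow run.
runs = [10.0, 10.2, 9.9, 10.1, 3.0, 42.0]
print(trimmed_mean(runs))  # mean of 9.9, 10.0, 10.1 and 10.2
```

Whether trimming is appropriate depends on the benchmark; reporting the median alongside it is a common alternative.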

8 votes

Benchmarking, why discard lowest time?

I'd add the thought that before doing anything else I would eyeball the data, that is, plot its distribution. I'd do that with most datasets, for that matter. My experience as a retired statto is ...

thelordgivetj

81

answered Jan 27, 2021 at 16:28

6 votes

Alternative to Wilson Score when I only have the number of ratings and the average rating?

You can use Wilson scoring; however, the main issue with your approach is that you are discretising the data, which results in a loss of information and is best avoided wherever possible. A better ...

Robert Long

802

answered Mar 24 at 16:16
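
For reference, the Wilson score mentioned in the excerpt above can be sketched as the lower bound of the Wilson interval for a binomial proportion. The names `positive` and `total`, and the default `z = 1.96` (roughly a 95% interval), are assumptions made for this illustration. It needs a count of positive ratings, which is exactly the discretisation the answer warns about.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for positive/total."""
    if total == 0:
        return 0.0
    p = positive / total
    denominator = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denominator


# The same 90% positive share ranks lower when it rests on fewer ratings.
print(wilson_lower_bound(9, 10))
print(wilson_lower_bound(90, 100))
```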

5 votes

Accepted

How to get a useful measure for latency

You will probably want to make several measurements here, because you'll want to understand how the system works both including and excluding the outliers. There are many possible solutions for ...

Kyle McVay

1,986

answered Apr 17, 2019 at 18:44

4 votes

Birthday Paradox, Analytical and Monte Carlo solutions give two systemically slightly different results

Consider when your number of people, n, is 366. Using your proposed analytic solution for n=366, you get NumPairs = n*(n-1)/2 = 66,795. You then say that the probability of two people having ...

WRSomsky

141

answered Dec 6, 2019 at 2:24
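
The excerpt above is truncated, so the asker's exact formula is not shown. Assuming it treats all pairs as independent, i.e. 1 - (364/365)^NumPairs, the following sketch compares it with the exact product formula. Function names are illustrative only.

```python
def exact_match_probability(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole principle: a match is certain
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def pairwise_no_match_probability(n, days=365):
    """Assumed approximation: every pair is treated as independent."""
    num_pairs = n * (n - 1) // 2
    return ((days - 1) / days) ** num_pairs


def pairwise_match_probability(n, days=365):
    return 1.0 - pairwise_no_match_probability(n, days)


print(exact_match_probability(23), pairwise_match_probability(23))
# With 366 people a match is certain, yet the pairwise formula still
# leaves a tiny non-zero chance of no match.
print(pairwise_no_match_probability(366) > 0)
```

The pairs are not independent, which is why the approximation sits slightly below the exact value and never reaches certainty.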

3 votes

Accepted

Design a function that indicates significant deviations in response times

I work in a completely different domain, but we have a similar requirement where our system must take action when a measured physical signal is outside a predefined band long enough to be considered ...

Bart van Ingen Schenau

79k

answered May 23, 2021 at 7:23
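
The "outside a predefined band long enough" idea in the answer above can be sketched as a simple run counter. This is only one reading of the truncated excerpt: "long enough" is assumed here to mean a number of consecutive samples, and all names and values are hypothetical.

```python
def out_of_band_alarms(samples, low, high, min_consecutive):
    """Return the indices at which a run of out-of-band samples
    reaches min_consecutive in length."""
    alarms = []
    run_length = 0
    for index, value in enumerate(samples):
        if low <= value <= high:
            run_length = 0  # back inside the band, reset the counter
        else:
            run_length += 1
            if run_length == min_consecutive:
                alarms.append(index)
    return alarms


# Hypothetical response times in ms: the single spike is ignored,
# the sustained deviation raises one alarm.
response_times = [100, 250, 100, 250, 260, 270, 100]
print(out_of_band_alarms(response_times, low=50, high=200, min_consecutive=3))
# → [5]
```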

3 votes

(AI) algorithm to optimize input parameters

I would recommend trying different standard algorithms for optimizing non-differentiable functions and seeing how well they work. Off the top of my head, in increasing complexity: hill climbing threshold ...

Doc Brown

220k

answered Nov 23, 2017 at 11:53
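
As an illustration of the simplest option named in the answer above, here is a minimal hill-climbing sketch. It assumes the parameters are a list of floats and that a higher score is better; the `score` callback, step size and iteration count are hypothetical choices, not part of the answer.

```python
import random


def hill_climb(score, start, step=0.1, iterations=1000, seed=0):
    """Randomly perturb the parameters and keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    best = list(start)
    best_score = score(best)
    for _ in range(iterations):
        candidate = [x + rng.uniform(-step, step) for x in best]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def example_score(params):
    # Hypothetical objective with its maximum at (1, -2).
    return -((params[0] - 1) ** 2 + (params[1] + 2) ** 2)


best_params, best_value = hill_climb(example_score, [0.0, 0.0])
print(best_params, best_value)
```

Plain hill climbing can get stuck in local optima, which is one reason the answer lists more complex alternatives.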

3 votes

Accepted

Compare two arrays by the number of occurrences

You could try an ad-hoc method such as summing the weight of all tags, but that is not a meaningful metric. A better approach would be to perform statistical inference to answer a question like “what ...

amon

136k

answered Jan 13, 2019 at 17:04

2 votes

Weighted correlation coefficient

Because you are only comparing the candidates to the mean sample of each class, you lose information about the distribution of each class. You are trying to compensate for this by assigning a ...

amon

136k

answered Sep 7, 2019 at 20:27

2 votes

(AI) algorithm to optimize input parameters

SMAC, Sequential Model-based Algorithm Configuration, is a relatively recent approach to this problem. It may well be overkill. It is aimed at the scenario where evaluating a particular configuration ...

Derek Elkins left SE

6,821

answered Nov 24, 2017 at 5:20

2 votes

Accepted

Ranking results from a Question and Answer game

I agree with mmathis that this might be a better question for Math/Stats or even GameDev SE. However, here's a suggestion: Points per question Answering questions gets you points and you get more ...

JMekker

427

answered Nov 5, 2021 at 17:36

1 vote

Accepted

Optimal variable-time logging of a real-time data stream

Decide how much memory your filter is allowed to consume. This is what you have to work with when deciding if something is interesting enough to send to the file. In here you can hold more than you ...

candied_orange

120k

answered Mar 4, 2023 at 5:01

1 vote

Design a function that indicates significant deviations in response times

Have a look at HdrHistogram. There are implementations for all kinds of languages. What it effectively is, is a history of latency distributions. So you could have a latency distribution per second ...

pveentjer

121

answered Jul 10, 2021 at 9:06

1 vote

Benchmarking, why discard lowest time?

Another point: benchmarks are generally "averaged" using the geometric mean. The geometric mean intrinsically upweights the lowest value in the list, when compared to the arithmetic mean. On ...

wchlm

149

answered Jan 27, 2021 at 18:03
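
The effect described in the answer above is easy to check numerically. A small sketch using Python's `statistics` module (Python 3.8 or later), with made-up timings:

```python
import statistics

# Hypothetical timings: three ordinary runs and one unusually fast run.
timings = [10.0, 10.0, 10.0, 1.0]

arithmetic = statistics.fmean(timings)
geometric = statistics.geometric_mean(timings)

print(arithmetic)  # 7.75
print(geometric)   # about 5.62, pulled further toward the fast run
```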

1 vote

How to get a useful measure for latency

By themselves the pings are unlikely to show much of anything interesting or useful. Neither are basic stats like median and mean for those. Percentiles are typically more interesting for this kind ...

JimmyJames

30.9k

answered Apr 18, 2019 at 16:23
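
To illustrate why percentiles tend to say more than the mean for latency data, here is a small sketch using the nearest-rank definition of a percentile. The function name and the sample latencies are made up for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in ms with a single slow request.
latencies = [12, 11, 13, 12, 11, 14, 12, 13, 250, 12]

print(sum(latencies) / len(latencies))  # 36.0, distorted by the one slow request
print(percentile(latencies, 50))        # 12
print(percentile(latencies, 90))        # 14
print(percentile(latencies, 99))        # 250
```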

1 vote

FIFO Min-Max-Heap for Rolling Median

This is a refinement of Jerry Coffin's idea. Use a nearly balanced tree, where all nodes reside directly in the circular buffer. Initialize it with dummy values, so that the size stays constant all ...

maaartinus

2,733

answered Apr 18, 2018 at 16:35

1 vote

FIFO Min-Max-Heap for Rolling Median

If I were going to do this, I'd probably use a balanced tree (e.g., AVL or red-black) where each node also keeps track of the size of its left sub-tree (and you keep track of the overall size). This ...

Jerry Coffin

44.8k

answered Apr 16, 2018 at 16:22
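
The two answers above describe tree-based structures. As a much simpler stand-in (not the balanced tree they propose), the sketch below keeps the window in a FIFO queue plus a sorted list. Inserting and removing cost O(n) per update instead of O(log n), which may be acceptable for small windows. The class name and API are illustrative assumptions.

```python
from bisect import bisect_left, insort
from collections import deque


class RollingMedian:
    """Median over the last `window` values, using a sorted list instead of a tree."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self.fifo = deque()  # values in arrival order
        self.ordered = []    # the same values in sorted order

    def push(self, value):
        """Add a value, evict the oldest if the window is full, return the median."""
        self.fifo.append(value)
        insort(self.ordered, value)
        if len(self.fifo) > self.window:
            oldest = self.fifo.popleft()
            del self.ordered[bisect_left(self.ordered, oldest)]
        mid = len(self.ordered) // 2
        if len(self.ordered) % 2:
            return self.ordered[mid]
        return (self.ordered[mid - 1] + self.ordered[mid]) / 2


rolling = RollingMedian(window=3)
print([rolling.push(v) for v in [5, 1, 9, 2, 8]])  # → [5, 3.0, 5, 2, 8]
```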
