Cross Validated Community Digest

Top new questions this week:

Different CIs for the same linear mixed model emmeans, ggemmeans, ggpredict

I fitted a linear mixed model in R and tried to compute marginal means using emmeans, ggemmeans (from ggeffects), and ggpredict (also from ggeffects). The predicted means are similar, but the ...

r mixed-model confidence-interval lsmeans

asked by Fmt Score of 8

answered by PBulls Score of 10

Narrow vs Broad-based U-shape comparisons

I’m modeling mortality using a multivariate logistic regression model with a nonlinear effect of X1 and I’m examining whether this relationship changes across ...

regression r interaction interpretation

asked by Konstantinos Gkirgkiris Score of 6

answered by Eli Score of 3

Why does the p-value increase when I add more observations to my t-test?

I am comparing the means of two groups using an independent two-sample t-test in R. Initially, I had the following samples: Group A: n = 15, mean = 52.3, sd = 4.8; Group B: n = 15, mean = 48.1, sd = 5....

hypothesis-testing statistical-significance t-test p-value sample-size

asked by KoleyPort Score of 6

answered by Sextus Empiricus Score of 21

How to test i.i.d. assumption?

Given a sample $X_1,\ldots,X_n$, how can I test the hypothesis that these are i.i.d. samples from a fixed (unknown) distribution? To add context, assume this is a time series and I want evidence ...

hypothesis-testing iid

asked by yoyo Score of 6

answered by Christian Hennig Score of 9

If a transformation preserves distance for ANY metric, must it be the identity?

I am trying to prove the following statement and I am looking for some guidance or a hint. Let $X = \mathbb{R}^n$ and let $f: X \to X$ be an affine transformation defined by $f(x) = Ax + b$. We know ...

probability distance

asked by rosinn Score of 6

answered by whuber Score of 9

How can I evaluate a time‑series forecasting model when I must train on the entire small dataset?

I’m building a Python forecasting pipeline that tries several models: Holt‑Winters (tuned with Optuna), ARIMA (via pmdarima.auto_arima), XGBoost (tuned with Optuna) ...

time-series cross-validation model-evaluation

asked by CSe Score of 6

answered by Stephan Kolassa Score of 3

Calculate confidence intervals for spline after change in reference- manually

How do I calculate confidence intervals for a spline function after changing the reference? I would like to plot the spline with reference at age=52 along with the confidence limits. ...

r splines

asked by Pam G Score of 6

answered by PBulls Score of 8

Greatest hits from previous weeks:

What's the difference between correlation and simple linear regression?

In particular, I am referring to the Pearson product-moment correlation coefficient.

correlation regression

asked by Neil McGuigan Score of 106

answered by Jeromy Anglim Score of 128

Why normalize images by subtracting dataset's image mean, instead of the current image mean in deep learning?

There are some variations on how to normalize the images, but most seem to use these two methods: Subtract the mean per channel calculated over all images (e.g. VGG_ILSVRC_16_layers); Subtract by pixel/...

deep-learning image-processing

asked by Max Gordon Score of 162

answered by lollercoaster Score of 122

Crossed vs nested random effects: how do they differ and how are they specified correctly in lme4?

Here is how I have understood nested vs. crossed random effects: Nested random effects occur when a lower level factor appears only within a particular level of an upper level factor. For example,...

mixed-model lme4-nlme multilevel-analysis random-effects-model crossed-random-effects

asked by Joe King Score of 209

answered by Robert Long Score of 391

How to choose a predictive model after k-fold cross-validation?

I am wondering how to choose a predictive model after doing K-fold cross-validation. This may be awkwardly phrased, so let me explain in more detail: whenever I run K-fold cross-validation, I use K ...

cross-validation model-selection

asked by Berk U. Score of 304

answered by Bogdanovist Score of 360

What's the difference between probability and statistics?

What's the difference between probability and statistics, and why are they studied together?

probability teaching mathematical-statistics

asked by hslc Score of 156

A list of cost functions used in neural networks, alongside applications

What are common cost functions used in evaluating the performance of neural networks? Details (feel free to skip the rest of this question, my intent here is simply to provide clarification on ...

machine-learning neural-networks

asked by Phylliida Score of 187

answered by Phylliida Score of 110

What exactly are keys, queries, and values in attention mechanisms?

How should one understand the keys, queries, and values that are often mentioned in attention mechanisms? I've tried searching online, but all the resources I find only speak of them as if the reader ...

neural-networks natural-language attention machine-translation

asked by Sean Score of 326

answered by dontloo Score of 299

Can you answer these questions?

Accuracy in Machine Learning vs. Accuracy in Statistics vs. pass@1,1 in Generative Modeling: What's the Difference?

I've encountered the term "accuracy" used differently across several evaluation contexts, and I want to clearly understand their mathematical and conceptual distinctions using consistent ...

machine-learning probability mathematical-statistics model-evaluation accuracy

asked by Charlie Parker Score of 1

answered by Dikran Marsupial Score of 0

Asymptotic bias when misspecifying ARMA-X type model

Consider the following model: $$y_t = c_0 + \phi y_{t-1} + \beta x_{t} + u_t,$$ where $u_t$ follows an MA(1) process: $u_t = \varepsilon_t + \theta\varepsilon_{t-1},$ where $\varepsilon_t$ is white ...

bias consistency generalized-least-squares misspecification armax

asked by Alba Score of 1

Is there a way to perform a correspondence analysis with ordered variables?

I am trying to perform a correspondence analysis on a dataset of anatomical measurements of ecologically relevant features. Most of these variables are ordered factor variables representing binning of ...

categorical-data dimensionality-reduction correspondence-analysis

asked by user2352714 Score of 1