942 questions
Best practices
0
votes
5
replies
88
views
Filling Null values with MEDIAN in Oracle Database table
I have table TABLE_ORIGIN with a primary key PK_ID and a numeric column COLUMN_NAME with some Null values.
I want to create a table called NEW_TABLE with columns PK_ID and COLUMN_NAME but I want to ...
0
votes
0
answers
32
views
Python MissForest Users will have to perform categorical features encoding by themselves
I'm trying to use missforest in python to impute missing values in a data set but I'm having issues with the categorical values.
In the original documentation it gives the example:
categorical=["...
Advice
0
votes
1
replies
95
views
Regression analysis
How should I handle a mass-point in the dependent variable when running OLS regression in R?
I’m working with a household expenditure dataset (Living Costs 2019) where the dependent variable is the ...
0
votes
0
answers
39
views
Imputation with mice for multilevel data that is only missing level 1 values
This is my first time attempting data imputation with the mice package. I've read some tutorials but am still confused about how to apply the different examples to my data.
I have a multilevel dataset ...
1
vote
0
answers
44
views
Having trouble using multiple imputation for covariates in generalized additive model using mice package
I am attempting to figure out how to perform multiple imputation using the mice package when I am fitting a generalized additive model. I have a fully observed continuous response variable with 4 ...
0
votes
0
answers
52
views
Stratified (group-wise) imputation with tidymodels::recipes
In my application, the data-generating process requires stratified handling, as the data was sampled within known strata (e.g., by country), and each stratum is assumed to follow a structurally ...
0
votes
0
answers
62
views
Multiple imputation with mice: Auxiliary variables excluded (loggedEvents, method = "pmm")
I am working with an ecological dataset where I need to impute missing values for 11 variables of interest (species' traits) using the mice package.
To support the imputation of NMAR traits, I included ~...
2
votes
1
answer
216
views
Imputing and adding rows to dataframe using polars expressions
I have a dataframe with incomplete values as below - in particular ages with corresponding years, and I would like to make it square (i.e., all three cust_id to have correctly imputed values for age ...
3
votes
2
answers
76
views
Needing advice on linear regression and then replacing NA's with fitted values
I am quite new to data analytics and R/RStudio, so I am in need of advice. I am doing a project and have been asked to do the following:
for every variable that has missing values, run a linear regression model ...
0
votes
2
answers
114
views
Missing values in olive oil dataset
I have a dataset of olive oil samples and the goal of creating a classification model for oil quality. I'm having trouble deciding how to deal with missing data. Have a look at the data here if you ...
1
vote
1
answer
86
views
Median imputation to a list by mutate() in dplyr
I want to replace missing data with median values in a dataframe within a list. I can do this by entering the column name. However, how can I do this when I need to randomly select the column in a ...
1
vote
0
answers
58
views
summary() of MICE imputation in SEM analysis
I am using the mice and semTools packages to impute missing data for my SEM dataset. However, summary() only gives me unstandardized coefficients, and I would like to have p-values and fit indices.
For ...
0
votes
2
answers
121
views
Issue with MIcombine when using svyciprop for estimating CIs. Multiple Imputation and Complex Survey
I am working with multiply imputed complex survey data and trying to estimate CIs for a proportion using Thomas Lumley's survey and mitools packages, in particular the svyciprop() function with beta ...
0
votes
1
answer
129
views
I'm confused on how to handle missing values and transformations with the Hmisc R package
I'm completely lost attempting to use the Hmisc package for my analysis. I have high-dimensional lipidomics data with several missing values. I usually remove lipids with over 20% missing values and ...
1
vote
0
answers
138
views
Type III ANOVA for linear mixed models using multiply imputed data
I used R to run a mixed linear model on a multiply imputed dataset. I'm trying to evaluate the significance of my main effects. I was not able to find a package/function that would allow me to compute a ...