
I am looking for an elegant or efficient way to select columns in R's data.table.

Personally I value a flexible approach.

Therefore I tend to refer to columns by their characteristics rather than their names.

For example, I want to set the values of all columns to lower case.

If I include all columns in this operation, like so

dt[, lapply(.SD, tolower),.SDcols = names(dt)]

numeric and integer columns, too, will be converted to (lower case) character.

This is undesirable, so I first identify all character columns as follows:

char_cols <- as.character(names(dt[ , lapply(.SD, function(x) which(is.character(x)))]))

and subsequently pass char_cols to .SDcols

dt[ , lapply(.SD, tolower), .SDcols = char_cols ]

If instead all columns are character (for example, to avoid type conversion issues while reading the data), I would go about it like this:

char_cols <- as.character(names(dt[ , lapply(.SD, function(x) which(all(is.na(as.numeric(x)))))]))

One should be certain, however, that no column is of mixed type, i.e. contains some character strings and some numeric values.

Does anyone have a suggestion to approach this more elegantly, or more efficiently?


2 Answers


You can pass a logical/character vector to .SDcols.

For character columns, we can do

library(data.table)
cols <- names(Filter(is.character, dt))
dt[, (cols) := lapply(.SD, tolower), .SDcols = cols]
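
As a minimal sketch of the logical-vector option mentioned above: the sample table and its column names (`id`, `name`, `city`, `score`) are made up for illustration and are not from the question. The logical vector selects the columns for `.SD`, while `:=` updates them in place, so the numeric columns stay in `dt` untouched.

```r
library(data.table)

# Hypothetical sample data, only for illustration
dt <- data.table(
  id    = 1:3,
  name  = c("Ann", "BOB", "Cy"),
  city  = c("Oslo", "ROME", "Lima"),
  score = c(1.5, 2, 3.25)
)

# Named logical vector: TRUE for character columns, FALSE otherwise
is_char <- vapply(dt, is.character, logical(1))

# Character vector of the same columns, used on the left-hand side of :=
cols <- names(dt)[is_char]

# .SDcols takes the logical vector; := overwrites only the selected columns
dt[, (cols) := lapply(.SD, tolower), .SDcols = is_char]

print(dt)
```

Using `vapply` instead of `sapply` is a small design choice here: it guarantees a logical vector even if `dt` has no columns.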

5 Comments

That is definitely more elegant. What about the second case? I load all my data as character type to avoid conversion issues with e.g. dates. Can you think of a better way to select the columns that are genuinely character?
@o_v You can use type.convert to get data in their respective classes. dt <- type.convert(dt, as.is = TRUE)
So I think the disadvantage of the .SDcols = sapply(dt, is.character) solution is that I am left with only the character columns. So if I want to keep both character and numeric columns, I'm still better off with my solution.
See updated answer. This will help you to retain all the columns in dt.
Neat! I knew I was going about it too cumbersomely.
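
For the second case discussed in the comments (everything read in as character), a rough sketch might look like the following. It applies `type.convert` per column inside data.table rather than to the whole table, which is a variant of the suggestion above, not the commenter's exact call. The table `raw` and its columns are hypothetical.

```r
library(data.table)

# Hypothetical data where every column was read as character
raw <- data.table(
  id    = c("1", "2", "3"),
  name  = c("Ann", "BOB", "Cy"),
  score = c("1.5", "2", "3.25")
)

# Let type.convert guess each column's type; as.is = TRUE keeps strings
# as character instead of turning them into factors
raw[, names(raw) := lapply(.SD, type.convert, as.is = TRUE)]

# After conversion, the remaining character columns can be selected as before
cols <- names(which(vapply(raw, is.character, logical(1))))
raw[, (cols) := lapply(.SD, tolower), .SDcols = cols]

print(raw)
```

Note that a column mixing numbers and text stays character after `type.convert`, so the caveat about mixed-type columns still applies.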
0

We can use

library(data.table)
cols <- names(which(sapply(dt, is.character)))
dt[, (cols) := lapply(.SD, tolower), .SDcols = cols]
