2

I have the following dataframe (df) - there are more columns, but these are the relevant columns:

ID  Cost 
1    $100
1    $200
2    $50
2    $0
2    $40
3    $10
4    $100
5    $0
5    $50

I would like to subset this dataframe so that if any cost for a particular ID is $0, all rows for that ID are removed.

Therefore, in this example, ID 2 and 5 contain a $0, so all of ID 2 and ID 5 rows should be deleted.

Here is the resulting df I would like:

    ID  Cost 
    1    $100
    1    $200
    3    $10
    4    $100

Could someone help with this? I tried some combinations of the subset function, but it didn't work.

On a similar note: I have another dataframe that contains NAs. How would I solve the same problem if the values to match were NAs instead of $0?

Thanks in advance!!

1
  • A data.table option is: library(data.table); setDT(df)[, if(!any(Cost=='$0')) .SD, ID] (Commented Jun 9, 2015 at 17:09)

3 Answers

4

Try this:

subset(df,!df$ID %in% df$ID[is.na(df$Cost) | df$Cost == "$0"])

This gives you:

  ID Cost
1  1 $100
2  1 $200
6  3  $10
7  4 $100
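
To reproduce this, here is a minimal sketch that rebuilds the example data and applies the same filter step by step. It assumes Cost is stored as a character column with a leading "$". The ID 6 row with an NA cost is not in the original data; it is added only to exercise the is.na() branch, and the name bad_ids is made up for illustration.

```r
# Rebuild the question's data; Cost is assumed to be character ("$100", ...)
# ID 6 with NA is a hypothetical extra row, not part of the original example
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 3, 4, 5, 5, 6),
  Cost = c("$100", "$200", "$50", "$0", "$40", "$10", "$100", "$0", "$50", NA),
  stringsAsFactors = FALSE
)

# IDs that have at least one "$0" or NA cost
bad_ids <- unique(df$ID[is.na(df$Cost) | df$Cost == "$0"])

# Keep only rows whose ID is not in that set
result <- df[!df$ID %in% bad_ids, ]
print(result)
```

If Cost is numeric in your data, comparing against 0 instead of "$0" should work the same way.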

1 Comment

+1 Nice job using subset. You may save keystrokes with with(df, subset(df,!ID %in% ID[is.na(Cost) | Cost == "$0"]))
3

Try

df[!df$ID %in% df$ID[df$Cost=="$0"],]

1

You can compute the IDs that you want to remove with something like tapply (this assumes Cost is stored as a number, not as a string such as "$0"):

(has.zero <- tapply(df$Cost, df$ID, function(x) sum(x == 0) > 0))
#     1     2     3     4     5 
# FALSE  TRUE FALSE FALSE  TRUE 

Then you can subset, limiting to IDs that you don't want to remove:

df[!df$ID %in% names(has.zero)[has.zero],]
#   ID Cost
# 1  1  100
# 2  1  200
# 6  3   10
# 7  4  100

This is pretty flexible, because it enables you to limit IDs based on more complicated criteria (e.g. "the average cost for the ID must be at least xyz").
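
As a rough sketch of that kind of criterion, the same tapply pattern can keep only IDs whose average cost reaches a threshold. Cost is assumed to be numeric, as in the tapply example above; the threshold of 50 and the names min_avg and keep are made up for illustration.

```r
# Numeric Cost is assumed here, as in the tapply example above
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 3, 4, 5, 5),
  Cost = c(100, 200, 50, 0, 40, 10, 100, 0, 50)
)

min_avg <- 50  # hypothetical threshold, chosen only for illustration

# TRUE for each ID whose mean cost reaches the threshold
keep <- tapply(df$Cost, df$ID, function(x) mean(x) >= min_avg)

# Keep the rows whose ID passed the check
result <- df[df$ID %in% names(keep)[keep], ]
print(result)
```

Only the function passed to tapply changes; the subsetting step stays the same, apart from dropping the negation because keep marks IDs to retain rather than remove.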

2 Comments

thanks @josilber! What if I want to remove the rows based on NAs?
Then you would change sum(x == 0) > 0 to sum(is.na(x)) > 0.
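
Put together, the NA version might look like the sketch below. The small data frame is hypothetical (the question does not show the NA data), and has.na is just a name chosen to mirror has.zero.

```r
# Hypothetical data with one NA cost, for illustration only
df <- data.frame(
  ID   = c(1, 1, 2, 2, 3),
  Cost = c(100, 200, 50, NA, 10)
)

# TRUE for each ID that has at least one NA cost
has.na <- tapply(df$Cost, df$ID, function(x) sum(is.na(x)) > 0)

# Drop every row belonging to an ID flagged above
result <- df[!df$ID %in% names(has.na)[has.na], ]
print(result)
```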
