I have the following dataset (a small example):
idnumber=c(12,12,13,14,14,15,16,17,18,18)
reg = c('FR','FR','DE','US','US','TZ','MK','GR','ES','ES')
code1=c('F56','G76','G56','T78','G78','G76','G64','T65','G79','G56')
code2=c('G56','I89','J83','S46','D78','G56','H89','G56','W34','T89')
df = data.frame(idnumber,reg,code1,code2)
which gives:
   idnumber reg code1 code2
1        12  FR   F56   G56
2        12  FR   G76   I89
3        13  DE   G56   J83
4        14  US   T78   S46
5        14  US   G78   D78
6        15  TZ   G76   G56
7        16  MK   G64   H89
8        17  GR   T65   G56
9        18  ES   G79   W34
10       18  ES   G56   T89
I would like to subset df, keeping only the rows where the value G56 appears in column code1 or code2, but also keeping every other row that shares an idnumber with one of those matching rows. The expected result is:
   idnumber reg code1 code2
1        12  FR   F56   G56
2        12  FR   G76   I89
3        13  DE   G56   J83
6        15  TZ   G76   G56
8        17  GR   T65   G56
9        18  ES   G79   W34
10       18  ES   G56   T89
I have millions of observations and around 30 code columns.
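To make the rule concrete, the logic could be sketched in base R roughly as follows. This is only a minimal sketch under stated assumptions: it assumes every code column has a name starting with "code", the names code_cols, hit and result are made up for illustration, and it has not been benchmarked on millions of rows.

```r
# Example data from the question
df <- data.frame(
  idnumber = c(12, 12, 13, 14, 14, 15, 16, 17, 18, 18),
  reg      = c('FR', 'FR', 'DE', 'US', 'US', 'TZ', 'MK', 'GR', 'ES', 'ES'),
  code1    = c('F56', 'G76', 'G56', 'T78', 'G78', 'G76', 'G64', 'T65', 'G79', 'G56'),
  code2    = c('G56', 'I89', 'J83', 'S46', 'D78', 'G56', 'H89', 'G56', 'W34', 'T89')
)

# Assumption: all code columns are named code1, code2, ..., so they can be
# selected by pattern instead of being typed out one by one
code_cols <- grep("^code", names(df), value = TRUE)

# TRUE for each row in which at least one code column equals "G56"
hit <- rowSums(df[code_cols] == "G56", na.rm = TRUE) > 0

# Keep every row whose idnumber occurs in at least one matching row
result <- df[df$idnumber %in% df$idnumber[hit], ]
print(result)
```

On the example data this keeps rows 1, 2, 3, 6, 8, 9 and 10, which is the result shown above. Selecting the columns by name pattern is what would let the same code cover around 30 code columns.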
I hope the question is clear enough. Any suggestions are welcome!
Cheers