Skip to main content
Typo in title
Link
camille
  • 16.9k
  • 18
  • 44
  • 67

Filter rows based on the dplyr groupby, summaraizesummarize output

Source Link
haimen
  • 2k
  • 7
  • 35
  • 54

Filter rows based on the dplyr groupby, summaraize output

I have a dataset with two columns, metro, state. I give the following command in dplyr,

data %>% group_by(metro, State) %>% summarise(count = n())

I get the following output,

metro           State         count 
A                OH            703
A                NJ              3
B                GA           1453
B                CA            456
B                WA            123

I now want to filter out the rows in the dataframe which are only the maximum counts and leave out the remaining. I need to filter out the corresponding rows. The output after filtering out the rows for the following command should be,

data %>% group_by(metro, State) %>% summarise(count = n())

   metro           State         count 
    A                OH            703
    B                GA           1453

Where every metro has only state which is the state with maximum counts and remaining are removed.

The following is my trying,

data %>% group_by(metro, State) %>% filter(n() == max(n()))

But this is again giving out the same dataframe as input.

Can anybody help me in doing this? My output should be every metro should have a unique state which has the maximum counts and the remaining state entries should be removed.

Thanks