Skip to main content
replaced http://unix.stackexchange.com/ with https://unix.stackexchange.com/
Source Link

I have a large database of this format:

0500506 gi|561108321|ref|NC_018143.2|   71446   0 
0500506 gi|561108321|ref|NC_018143.2|   71447   0
0500508 gi|561108321|ref|NC_018143.2|   71449   0
0500509 gi|561108321|ref|NC_018143.2|   71447   0

What I want to do is to create a new table with the number of occurrences of each of the third column, something like this

71446 1 0500506
71447 2 0500506,0500509
71449 1 0500508

Importantly, the whole lines won't match, so I want to make sure that I'm just matching the values in a single column. I'm assuming I can adapt one of the solutions from Count total number of occurrences using grepCount total number of occurrences using grep, but am not sure how.

Thanks very much, drea

I have a large database of this format:

0500506 gi|561108321|ref|NC_018143.2|   71446   0 
0500506 gi|561108321|ref|NC_018143.2|   71447   0
0500508 gi|561108321|ref|NC_018143.2|   71449   0
0500509 gi|561108321|ref|NC_018143.2|   71447   0

What I want to do is to create a new table with the number of occurrences of each of the third column, something like this

71446 1 0500506
71447 2 0500506,0500509
71449 1 0500508

Importantly, the whole lines won't match, so I want to make sure that I'm just matching the values in a single column. I'm assuming I can adapt one of the solutions from Count total number of occurrences using grep, but am not sure how.

Thanks very much, drea

I have a large database of this format:

0500506 gi|561108321|ref|NC_018143.2|   71446   0 
0500506 gi|561108321|ref|NC_018143.2|   71447   0
0500508 gi|561108321|ref|NC_018143.2|   71449   0
0500509 gi|561108321|ref|NC_018143.2|   71447   0

What I want to do is to create a new table with the number of occurrences of each of the third column, something like this

71446 1 0500506
71447 2 0500506,0500509
71449 1 0500508

Importantly, the whole lines won't match, so I want to make sure that I'm just matching the values in a single column. I'm assuming I can adapt one of the solutions from Count total number of occurrences using grep, but am not sure how.

Thanks very much, drea

Source Link
drea
  • 45
  • 6

counting occurrences of a string

I have a large database of this format:

0500506 gi|561108321|ref|NC_018143.2|   71446   0 
0500506 gi|561108321|ref|NC_018143.2|   71447   0
0500508 gi|561108321|ref|NC_018143.2|   71449   0
0500509 gi|561108321|ref|NC_018143.2|   71447   0

What I want to do is to create a new table with the number of occurrences of each of the third column, something like this

71446 1 0500506
71447 2 0500506,0500509
71449 1 0500508

Importantly, the whole lines won't match, so I want to make sure that I'm just matching the values in a single column. I'm assuming I can adapt one of the solutions from Count total number of occurrences using grep, but am not sure how.

Thanks very much, drea