AdminBee

I want to extract lines from masterFile.list based on a list of numbers in string.txt. masterFile.list has several columns separated by '|'. I am only interested in the lines whose first column matches one of the numbers in string.txt.

string.txt:

3075
3078
3076

masterFile.list:

3078    |       Auxenochlorella pyrenoidosa (H.Chick) Molinari & Calvo-Perez, 2015      |               |       authority       |
3079    |       Auxenochlorella pyrenoidosa 3078    |               |       scientific name |
3076    |       Chlorella pyrenoidosa H.Chick, 1903     |               |       authority       |
3077    |       Chlorella vulgaris var. viridis Chodat, 1913    |               |       authority
487     |       ATCC 13077      |       ATCC 13077 <type strain>        |       type material   |
460     |       DSM 23076       |       DSM 23076 <type strain> |       type material   |

expected output:

3078    |       Auxenochlorella pyrenoidosa (H.Chick) Molinari & Calvo-Perez, 2015      |               |       authority       |
3076    |       Chlorella pyrenoidosa H.Chick, 1903     |               |       authority       |

Most of the previous posts I have found only handle extracting a single string with the match limited to the first column. Is it possible to match more than one string at a time?
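
To make the matching rule described above concrete, here is a minimal Python sketch of the intended behaviour. It is only an illustration, not a tuned solution: the function names `matching_lines` and `filter_files` are made up for this example, and it assumes that the first column is everything before the first '|', compared as a whole field with surrounding whitespace ignored.

```python
def matching_lines(keys, lines, sep="|"):
    """Return the lines whose first sep-delimited field is one of keys."""
    # Collect the wanted numbers once; a set gives exact, whole-field lookups.
    wanted = {key.strip() for key in keys if key.strip()}
    # Keep a line only if its first field (the text before the first
    # separator, with surrounding whitespace removed) is a wanted number.
    return [line for line in lines if line.split(sep, 1)[0].strip() in wanted]


def filter_files(keys_path="string.txt", master_path="masterFile.list"):
    """Hypothetical helper: apply the same rule to the two files on disk."""
    with open(keys_path) as keys, open(master_path) as master:
        return matching_lines(keys, master)


# Sample data abridged from the question (descriptions shortened).
keys = ["3075", "3078", "3076"]
master = [
    "3078    |       Auxenochlorella pyrenoidosa (H.Chick)    |       |       authority       |",
    "3079    |       Auxenochlorella pyrenoidosa 3078    |       |       scientific name |",
    "3076    |       Chlorella pyrenoidosa H.Chick, 1903     |       |       authority       |",
    "3077    |       Chlorella vulgaris var. viridis Chodat, 1913    |       |       authority",
    "487     |       ATCC 13077      |       ATCC 13077 <type strain>        |       type material   |",
    "460     |       DSM 23076       |       DSM 23076 <type strain> |       type material   |",
]

for line in matching_lines(keys, master):
    print(line)
```

With the sample data, only the lines starting with 3078 and 3076 are printed. The 3079 line is skipped even though 3078 appears in its second column, and 3075 produces nothing because no line has it in the first column.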

web

Extract a list of strings from a file, only from first column
