11 events
when what by license comment
Nov 5, 2020 at 12:35 vote accept Jerry
Nov 5, 2020 at 11:01 comment added terdon Do you have two files or three? You only use two files in your script (headers.txt and uniqueheaders.txt) but you also seem to have a file that has both headers and sequences. Is that file headers.txt or is it a third file? And what do you mean that "both files contain unique header names"? Isn't the whole point that one of the files has duplicate header names?
Nov 5, 2020 at 10:58 answer added AdminBee timeline score: 1
Nov 5, 2020 at 10:47 comment added AdminBee Thanks for the clarification. Do I understand correctly that there is no "blank line" in between gene sequences, or anything else that would identify a header line (apart from gene sequences being all uppercase ;) )?
Nov 5, 2020 at 10:44 history edited Jerry CC BY-SA 4.0
deleted 8 characters in body
Nov 5, 2020 at 10:19 history edited AdminBee CC BY-SA 4.0
Formatting and tags
Nov 5, 2020 at 10:15 comment added Jerry I edited the post, that is basically all I wanted... @terdon
Nov 5, 2020 at 10:14 history edited Jerry CC BY-SA 4.0
added 541 characters in body
Nov 5, 2020 at 9:54 comment added terdon Can you show us a few lines of both files and the output you are expecting? Doing this in the shell is incredibly inefficient. You probably just need a simple awk one-liner that will run in seconds (your loop will take several minutes for larger files).
Nov 5, 2020 at 9:27 comment added choroba What do you think the resulting sed command is?
Nov 5, 2020 at 9:22 history asked Jerry CC BY-SA 4.0