Some notes:
- As others have already pointed out, you should use a hash instead of a gigantic case. But make sure the lookups on that data structure are O(1), otherwise the method will be very inefficient.
- There is a common pattern: write the data structure in the most declarative/simple way you can and then programmatically build the derived data structures you need (on initialization, so it's efficient).
- You can use Enumerable#take_while to handle the stop codons.
- Encapsulate the code in a module/class.
- You need a return because it's not the last expression of the method; it's inside the scan block, which you want to break out of.
- Note that this works:
"123456".gsub(/.../) { |triplet| triplet[0] } #=> "14"
I'd write:
module Rosalind
  # Declarative table: each amino acid with the codons that encode it.
  CODONS_BY_AMINOACID = {
    "F" => ["UUU", "UUC"],
    "L" => ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "S" => ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Y" => ["UAU", "UAC"],
    "C" => ["UGU", "UGC"],
    "W" => ["UGG"],
    "P" => ["CCU", "CCC", "CCA", "CCG"],
    "H" => ["CAU", "CAC"],
    "Q" => ["CAA", "CAG"],
    "R" => ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "I" => ["AUU", "AUC", "AUA"],
    "M" => ["AUG"],
    "T" => ["ACU", "ACC", "ACA", "ACG"],
    "N" => ["AAU", "AAC"],
    "K" => ["AAA", "AAG"],
    "V" => ["GUU", "GUC", "GUA", "GUG"],
    "A" => ["GCU", "GCC", "GCA", "GCG"],
    "D" => ["GAU", "GAC"],
    "E" => ["GAA", "GAG"],
    "G" => ["GGU", "GGC", "GGA", "GGG"],
    "STOP" => ["UGA", "UAA", "UAG"],
  }

  # Derived index (codon => amino acid), built once so each lookup is O(1).
  AMINOACID_BY_CODON = CODONS_BY_AMINOACID.
    flat_map { |aminoacid, codons| codons.map { |codon| [codon, aminoacid] } }.to_h

  # Translates an RNA string into a protein string,
  # stopping at the first stop codon.
  def self.problem_prot(rna_string)
    rna_string.
      scan(/[A-Z]{3}/).
      map { |codon| AMINOACID_BY_CODON[codon] }.
      take_while { |aminoacid| aminoacid != "STOP" }.
      join
  end
end
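
To make the notes about return, take_while and the derived lookup table concrete, here is a minimal, self-contained sketch. MiniRosalind and both method names are hypothetical, and the table is deliberately cut down to four entries, so treat it as an illustration rather than a replacement for the module above.

```ruby
# MiniRosalind is a hypothetical name; the table is intentionally incomplete.
module MiniRosalind
  CODONS_BY_AMINOACID = {
    "M" => ["AUG"],
    "F" => ["UUU", "UUC"],
    "W" => ["UGG"],
    "STOP" => ["UGA", "UAA", "UAG"],
  }

  # Derived once at load time: codon => amino acid, so each lookup is O(1).
  AMINOACID_BY_CODON = CODONS_BY_AMINOACID.
    flat_map { |aminoacid, codons| codons.map { |codon| [codon, aminoacid] } }.to_h

  # Imperative variant: `return` inside the scan block exits the whole method.
  def self.translate_with_return(rna)
    aminoacids = []
    rna.scan(/[A-Z]{3}/) do |codon|
      aminoacid = AMINOACID_BY_CODON[codon]
      return aminoacids.join if aminoacid == "STOP"
      aminoacids << aminoacid
    end
    aminoacids.join # reached only when no stop codon was found
  end

  # Functional variant: take_while cuts the sequence at the first stop.
  def self.translate_with_take_while(rna)
    rna.scan(/[A-Z]{3}/).
      map { |codon| AMINOACID_BY_CODON[codon] }.
      take_while { |aminoacid| aminoacid != "STOP" }.
      join
  end
end

p MiniRosalind::AMINOACID_BY_CODON["UUC"]                   #=> "F"
p MiniRosalind.translate_with_return("AUGUUUUGGUAAAUG")     #=> "MFW"
p MiniRosalind.translate_with_take_while("AUGUUUUGGUAAAUG") #=> "MFW"
```

Both variants give the same result; the take_while version avoids the explicit return and the accumulator. In either variant a codon missing from the table maps to nil, which join renders as an empty string, so unknown codons are silently dropped.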