tokland
  • 11.2k
  • 1
  • 21
  • 26

Some notes:

  • As others have already pointed out, you should use a hash instead of a gigantic case statement. But make sure lookups on that data structure are O(1) (for example, a hash keyed by codon), otherwise the method will be very inefficient.

  • There is a common pattern: write the data structure in the simplest, most declarative way you can, and then programmatically build the derived data structures you need (once, on initialization, so it's efficient).

  • You can use Enumerable#take_while to handle the stop codons.

  • Encapsulate the code in a module/class.

  • You need the explicit return because that expression is not the last one in the method: it sits inside the scan block, which you want to break out of.

  • Note that this works: "123456".gsub(/.../) { |triplet| triplet[0] } #=> "14"
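To illustrate the hash lookup and take_while points above, here is a minimal sketch. TINY_TABLE and translate_until_stop are hypothetical names, and the three-entry table is only an excerpt made up for the example, not the full codon table:

```ruby
# Hypothetical three-entry table, for illustration only.
TINY_TABLE = { "AUG" => "M", "GCC" => "A", "UGA" => "STOP" }.freeze

def translate_until_stop(rna)
  rna.scan(/.../)                       # split the string into triplets
     .map { |codon| TINY_TABLE[codon] } # hash lookup, O(1) on average
     .take_while { |aa| aa != "STOP" }  # keep everything before the first stop
     .join
end

puts translate_until_stop("AUGGCCUGAAUG") # prints "MA"
```

Because take_while stops at the first "STOP", the trailing AUG is ignored and no explicit return or break is needed.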

I'd write:

module Rosalind
  CODONS_BY_AMINOACID = {
    "F" => ["UUU", "UUC"],
    "L" => ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "S" => ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Y" => ["UAU", "UAC"],
    "C" => ["UGU", "UGC"],
    "W" => ["UGG"],
    "P" => ["CCU", "CCC", "CCA", "CCG"],
    "H" => ["CAU", "CAC"],
    "Q" => ["CAA", "CAG"],
    "R" => ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "I" => ["AUU", "AUC", "AUA"],
    "M" => ["AUG"],
    "T" => ["ACU", "ACC", "ACA", "ACG"],
    "N" => ["AAU", "AAC"],
    "K" => ["AAA", "AAG"],
    "V" => ["GUU", "GUC", "GUA", "GUG"],
    "A" => ["GCU", "GCC", "GCA", "GCG"],
    "D" => ["GAU", "GAC"],
    "E" => ["GAA", "GAG"],
    "G" => ["GGU", "GGC", "GGA", "GGG"],
    "STOP" => ["UGA", "UAA", "UAG"],
  }
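  # Derived lookup table (codon => amino acid), built once when the module loads.
  AMINOACID_BY_CODON = CODONS_BY_AMINOACID.
    flat_map { |aminoacid, codons| codons.map { |codon| [codon, aminoacid] } }.to_h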

  def self.problem_prot(rna_string)
    rna_string.
      scan(/[A-Z]{3}/).
      map { |codon| AMINOACID_BY_CODON[codon] }.
      take_while { |aminoacid| aminoacid != "STOP" }.
      join
  end
end
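One caveat when deriving a table this way: Array#to_h silently keeps the last pair when the same key appears twice, so a typo that lists one codon under two amino acids would go unnoticed. A possible safeguard, sketched below under the assumption that you want the build step to fail loudly, is a checked inversion. invert_grouped and SUBSET are hypothetical names introduced for this example:

```ruby
# Hypothetical helper: turn { key => [values] } into { value => key },
# raising if any value is listed under more than one key.
def invert_grouped(grouped)
  grouped.each_with_object({}) do |(key, values), inverted|
    values.each do |value|
      if inverted.key?(value)
        raise ArgumentError, "#{value} is listed under both #{inverted[value]} and #{key}"
      end
      inverted[value] = key
    end
  end
end

# Small excerpt of the codon table, for illustration only.
SUBSET = { "F" => ["UUU", "UUC"], "W" => ["UGG"] }.freeze

by_codon = invert_grouped(SUBSET)
puts by_codon["UUC"] # prints "F"
```

With the full table, checking that the derived hash has 64 entries (4 bases in 3 positions) is another cheap sanity test.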