Skip to main content
2 of 3
edited tags
200_success
  • 145.6k
  • 22
  • 191
  • 481

The Genetic Code

This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits. The repository with all my up-to-date solutions so far can be found here.


Problem: PROT

The 20 commonly occurring amino acids are abbreviated by using 20 letters from the English alphabet (all letters except for B, J, O, U, X, and Z). Protein strings are constructed from these 20 symbols. Henceforth, the term genetic string will incorporate protein strings along with DNA strings and RNA strings.

The RNA codon table dictates the details regarding the encoding of specific codons into the amino acid alphabet.

Given:

An RNA string \$s\$ corresponding to a strand of mRNA (of length at most 10 kbp).

Return:

The protein string encoded by \$s\$.

Sample Dataset:

AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA

Sample Output:

MAMAPRTEINSTRING

My solution solves the sample dataset and the actual dataset given.

Dataset:

AUGCGCCCUUGGUCGCUCCUUGGAUCAGAGCAUAUUCUAUCACGGCGCGUCGAAGGAAUAACCCACGACAUCUCUCUAUAUUGGAUUCCCUUUUUUUCGGUUCAGGUAGAUCAUUUGGCUACUGGACUUUCUAAGAUUUACUCCGCCAUGUUCCUAUAUGUUACAUUCUCAGCCGAAGUCGCUGUAUAUCACGUUAAGGUAGACGGUUCCUUGACUACCAGCGACGCCUGUAGGGAGAAUUCCAUCCAUCAUGCAUGUAUGGGCAGUGCGCACUUACAGCGCCAUAGGCAACGGACGGACAGACCUCCUUUCCUGUCGGACGGUAAGCCGCGAUCCAAUACAGAGCAAAGUCCCACGCCCUCCUAUAGACUCACGCCAAGAUUGUAUUCCCCGUUAACCGCUCUCUCAGGGAAGUUGUAUCUACUCGGAUCGGGAUGUCCUUGGAAAUGUAGGAAAAUGGCUCAAACUACGAUUGUAUACCGUGCGAGACGUUGGAUCCCGCUUAUCACUGAUACCAUAAUCUGUGUGGCCCCCUUACCACAACCUAACCAUGGAGUAGUAGCCCUGGCCGUCCCUUCAAGGGCAAGACCUCAUUGUCUUGUACGCUUAUCACAAGGGCCAUCUAACAAUGUGUACCGGUAUAAUUUUACGUGGUAUUGUCCAGACGGCGGUACGGCCGAUCCGUUGCCAUUUCGUCAUGGCAUAACCUCGGUCUAUCUUCCUUCCUACUCGGGAAUAGUUCGCAGUACACCAUACCUCAUCGGCACUUACGCUGUUCCAACACAAAAUUCUGAUCCCUUCGCUACCACCCGCUGGGUAUCUGUCAGGUUACUGGCCUCCACUACCGGAGAGGGCGAUACGGGGGACCGCGGAACACUUUCUACAUUUUUGGACUGCUGUAUGUCUACUUCGGCUCUUCCUCCCGCGAGAUUUAUAUCGGCAUACAGAGUAAACGCUACCCAUGGCGACGACACUCGUCUCACCGUUAAGAAGCUGUGCACACCUUAUAAAAGCCUUUGGUCAGGUAAUGUAUGUAGACCUCUAUCGGUCUAUAAGUUGAAGGAAAGAUAUAUAGUCCUCACCGUUGGUUAUCAUUCCCUCCUAUCCCGCAUGUCCACCCUAAGGAAAACUUACUCCGUGGACCAAGCGGGACCGGCAAGGCCCUUGGUUGCCGAAAAAUCAGGAGUUUGCGAGUGCGACUAUGAUCCCAGGUUUGUAAUCGUUAUCGCGACUGUUCAUUUCGGGAAGGCUACUGGGCUGCAUUCAAAGGCGGACCGCUGCUCUCGCUGGGUCGGGUCCAGUGGACGGGUCGAGGCUGCUCAGCUCCAUCAUCAAAGAAAUAAGAAGCCCACCGGAAGAGACAGUGAUGCCGAGAAGAUGUCCGGCAAAGGCAGACUCUGGUCCCAAGUUAGACUGGGUGGUUUAGAGAUCCAGAUAGGUCGUCGUUACCAGUUCUACGAAUACUUUCUUGGUAAACUGAUUAUCCGUCAAAGGCCGACUCACGAGAGGACACAGGGUGCAAAGAAACCCUCACAGAAGGGAACGCGAUUGGCGACACGUAUGGCGCUCGCGUGGUGUGACAGGUUUGGUAGGAUCUAUAUCCCCCAGUGCAACAUAUUAGUUCACACUAUAAUGAAGGUCCGAUUGCACCAAACAGCCUGCGAUGAUAACACGUGGACUGCUGGAGAGUAUGACUUGUACGACUGCACGCGCGAUGUACCCAACAUCGUCCUGUCCCUACCGCACCAUCUUUAUGAACUGCUGGUCUCUGAUGCACUCCCCGCCCCCCACCUCCUCUUUCUGGGGGAAUCUGGUUCCGCCAGGCAUAGGACACGUUCGGUUGGACUAACUAUUGCUAACUACACGAUCAUUCGUGAUAGGUGUCGGUCCACAUGUAUAAGGUUGGAUGAACCUAACCCCUACCGACGAUUUGAUGUAUUGUGGCCACUACUAAUGACCCCCGCUCGUAACACCCAAACACGCGCAUUUCUCUGCUCCGUGGCUGGCUGGAAUUGCCAGUAUUACAGACCCCCUGACAGAUCCAGUAGUAGGACGUACUUGAUCGCACUACAUUUAGCAAUCAAAUUGCGUUCACGCUACCCCAAUCGUUCAGAUCUGGCUUAUCCCACAUCUGUAAACAGGAACACGGUGUAUAUCACCGUAGUCUCUCUACGCACAGCGAAACUAAGAUACAACAGUUACACACCCAGCAAACUGCGGCCCGAUCAAGGCAAUGACCCCAGCUACCGAACUUCGGAAAGAGGGAAUUUCGUCCGGAACUUGCCAUCAGUGAAACCGUACCGAGACUUCAUGAAAACGAUCAGCAUUUCCUUUACAGGAUUCCGGACCAAAUUUGAUCGUCAAAUUGGGAAUUCCAUAGGCCGAGGAUUCACGGGAGCAAGGCCGACCUCAUUGAAGCGUCAUAGUCGCUUUUCGCUCACCGUACAUUCUAGCAAGCGCUAUCUCUCCCGCCUCAACGCCUACGUCUCUUUUACAAUAAAACAUCGGAUAACGAGAUUCGACGUGGCUACGCGCGAAGUUAAGGCUUCCGCCGUGGUACCUAAUGCGGAAAUAGUCCGGAAAAGUACGAAGGUGUGCUGGUUUAUGUGCAUCAUCAGACUCCAAACUGUCCGAGCUACCAAACCGACCAUUCAGAAGCAGUUGUUGAGAUUAAGUGGCCCGUUUCAAUGCGGGGAUUCCACCAAUUACGUACAACCUGGUUACUUCCACAGUUCAAACGCGCCCCGCCCGGCGUGUGUCAGUGUGUGUAUCAGCCCGGGGUUAUGGGACCUGUUGGUAAAAACCCGGAAUGCUUUCUCCCGUGUGCACGGGGGUACUAUCCUUACUUUAGUUCAGCAUGACAUUCAUAAAGUAGAAUUAUCGUCAGCAUGCACUCGCGAGCGGGCUACCAACCUGGCAAUGACUGAAAGCGUAACGUCAUACUCUCAGUGCGAGGCCUCGACUCGCCAUACCGAAAUACAAGCUGUUAGCUCAAUUGUGUAUCUCCACUUGACUGCGGCCCGCAGGGAGAAACACACAGAGAAGAGGGCCGACGCGAAGCAUCAUGUGUCUACUUGGCGCGAGGGUAAAACCGAAAACGUCAUUGGAAGGCUCAGAUCGUCACAUACAUUAACCCUUAGGUUACAUCCAUCCUCGUUGGACAACUGGCCGUUCAUUCUUGGGGAAUGCCAGAGAGGAACGGAUAUCGAGGAACGCAUCCCGGCACAUGCGGAAUGUACAGACAAGGCUGUAGGCGUAGCCUUUCAGUCGACGCUAUGGGCAGAUUCGGCGAAUCCGCGAGGUGGAUCUCGCUUGAGAAGAGGGAUUAGGGGCCCCAACGCGAUGAAUAUUGAAUGCGGAUAUUAUCUGGCGAGACAGCUUCUUGACCGCUCUUGUAGUCGCAAGAUAGGCGAGACUCUAAGACAAACUAGUUCCCGCACGCCAUUGCCAUGCAAGCGAGGCCGCGUCCCAAAACCCCUGGAACCUAAAGAGUCGAACAGGUCAGGAGCAAGUAGCGUAUGGAUACCAGUCGGAGUUAGGCUGGGUCCCUCCGCUGCGAAGACUCCGCCCUGGCGACAUGGUCGCCCGCGACAACUUCUAAUCUCUCCUCAAGUUAUUCCGUUAAGACACCCGAGCAAACGAGCUAGUCAAAGGGAUCAGUGCGAGCUCCCAUCAUGUCCUGAGUACAAGACCCCAGUGUGCCGACUUGCUUUGGGUAGCUCCAGAAUGGUUCACGAAUUAGCCCUUAAGAUGCUCUCCCCUGUUCCGGAGUUCGUGUGGAGGGUCGGAGGCGGGAAAGUCUAUUUAACGGCGGACCCACCUAGGGUAAAUCUGACGCAUAUGUCUGAACACGCCCUGGUGGUACCAGGAGUUUCCCUAUGGGCCCUUUUUCUUUUAAGACCUUUAUUUUUCCAUCUCCAACCUCGAUUAUCGACAACAUACCGUUGGCGCAGACACUUAUGCUUACCCGUUCAACUGCAUUCGUACAGGCUGGGUGAUAUGCAAUUAGGAGCAUCUCGACGUUGGGUAGGCCCCCGAAAUAUAUGGAGACAGGGUGUGUAUGCGUGGGAGAUAUAUGAGAUUCGAACUGUACCGGCUCCAAGGGUGUCACUGUUCCGCCGUUGGAAGGAAAACACUUACACCCUCUUUGGAUCAGGGGAAAUUACAGCGAAUGUUAAGACGGCUAUGUAUCGGAUAACGUCACAUCCGUUUAGACUGUAUGCGGGCGCCCGAGAAUUCAGUCCAUUCCGACUAAACGAGAAAAAGUUUGCCCCCGCGGGGAUUACGUACAAAACCGGAUGCGAUUAUAGCCGUUCUGGAGAACUCUGUGAGGGGCGGGGAAGGAAAAAUAGUUUUAUGUACCAUUGGGCGGCCCUUCCUCUCCAUGCACAUAAAACAAACAGCCUCAUUGAUUUCUACACUCCGUGCAACCCAAAUGCGGCUGCUGACAUGCUAGUGCGUAGUACAGAAGAUGCCCGAGCUCGAAUACAUUGCAGAUAUUGGGUUAACAAUUCGUUUUGCAAAUAUAUAUGUUGGCACUCCAGGUUCUUGAUAGAACCGAUUCAGAAGAAAUGGUGUACACCCAUUGAGAGGCGUCGCCCCGUAAUUAACGGGGAUGUCUUAAACGGGUCAGAGGUAACUACUAAGACGCGGUGCUGUUUCAGAUGGGCAAGUCAUACGGGCCGUUCUUACGGAAGAGAUCGUGCUAAUACAAACCUGCUUGUUAUGGGCGACGCAUCGCCCGAGGGGGGCGCGAAUCGGAGACUUGCAAGUACGACCGGUGGAUUCGUGCAAUUUAAGGUAUACAUUUCACGCGGAGACCCCGGGAAGGAGCUCCCCUACAUAACACGAAUAUCACCCGGCCGAAUUAGGGCUCGACGGUCCUUCCGCCUAAUGUGUGCAGUGAACGAGUUGGUGCCUGAAGAUGGUUUCACUCACAGGCGCCAAAGUACGACUCCUCCUUCCCGAUCAGUUUGCGACGGGCCUGUCCGCUUUAGAAUAAAGACCCACUUCCAAACUUCGACCGGCUGGGGGAAACAUUGGAGCAGCUUCCAAUGUGAUCAAAACUGUAGCAGAUCUCUAGCAUACUUCAAACAGGAAGUUACUGUAUGUGGUGCACGACCUGGCCAAGGUAGUUUCCUCCCCCUAGGACUGGUGAACGAUGGCGGGUGGAUUGUCAUUCAUAGUGAGAGAUUAGCCGUGCCUGCUUAUGACGGGAUCGGCGAUCUAGUAAGCGAUUCCAAAUCGAACAGCAUGCGCCGAGGGGACACUUACUUGGAGGUACUUAUCCGAGCGAAAAGGAGGGAGCCCAUAUCCAAAUGCGCUAGUAGAGGAGCACUGUCGAGUCAUGACCGCGGCCAUUCACUCGUAAGUACAGGGACACACUUCCAUAUCCUCCGGGGACUGAUGGGUAUUCGUACGAUUCGGCUGAGCGGGUCGCGGGACCCUACGGUCCGAACGUCUCGCGAGGGGUGUCAGGCUCCUCAUUUCAUGGUGCAGUCUGUCUGGAAGCCGACUACACUACGGGGUAGUCCUGCCCUUGAUAAUGCUAAUGAGAGUAACUCACGCCCCGCCCAUAACAAAGGCCGGGGGCCCUCUCAAUCAAACAGGCGGACGGGGAAUGUCAGGCAGGUUGUCGUUGGCAGAGUUACGCUCUCGCAGGAGAUUAAUCCUUUUGUAAAGCAUUUGGAACUAGUCCCCGGCUAUUAUUUAGCUGAGUAUCCAAUGCCUAGAAGCCUUGCGUCCCGUUCUAACCUGCGCGUAAUUCAUACAUCGCAUGAGAGAGCAAGGCAAACAAUCCAUUCGCCUGGCAAGAGAAACCGAGGAGCAAGUCACCGAACGCCCGCCGGGAAGCACCGCGAGUACCCACGACAAAACAGCUGCUUGGACUAUUAUGAACCCUCCAUACGUAGGAAGGAGGCCUAUGGGUGCGUCAAUAACGCACUCCCUGAUUGUCCUGACAAGGACGAUCGCGAAUGGACGCGCUCGCAAUCCAUGAUUGAAAUGUCCAGACCAACCGAGUCCCUGCUCAGUGCCUCCUGGCAUCGGCCAUUGGUUCUUGGAAGCCUCAACUACGGAUUCAUCACUGACCCCGUGGCGCUCACUGGUCAAAGAAAACUAGGAUGCCGUGGAAUGAUGAACACGUUAAUGUUAAUAUGGAACCAUCAUUUCGGCCCCUAUGGUUCAACCCCAAGAUUAGUUUUCGUUUGUGAAGCCAAGCGGCACCGGGGAUCUUGGGCAAACUACACUGAAGCAAAACUCCCUUCCUAUUAUGUAAUAACACUGGCACAGGGUCUUGGCCCGCGCGCUGGGCUCCACCACAACGUGUACUGUCUUCACCCUCAACGAGUUUUCCACUUCUGUCCCUUCGUAUCAGUUCACUUGCAAUUCCUAUCCCAUGUUUCGACUAGCCCAAGCGCUAAGUGUGCCCGCCUAGAUCCAGUCCAUCUUCCGGCUGAGGUGGGGAUCGUCAAACCUGCGGGGCGGAUUAAGAAGUCAUUUGUUGGUGGCGCGGGGCCUCUCAGAAUGUUAAAUAGGCAACGUGUAUGUUCGGUGGGGCCUGGAAGUGGACCGCCGUCCGUGGCGGAUUGUGCCAAAUUAACGGCUGAAGUGGAGUGGACUUCCAUCCACCCAGCUGCAGCAGAUCGGGGGUUAUCCCAAAGCACCAUCCAUGCCAGCAUGAUGCUGACUCACCAAAUAAGCUUCACUGAAUGCGACAAGUUCGCGCAAAUGGCUCAGAGCAGCGUGUCCCACACCGUGGGACAAAGGGUAUACUCGACUUCUCCACCUUGCGCGAAACCUGGCCCCGCUGGAUACAGACUGAUCAGUUCUAUCGAAUGUACCGUGCACAAAUGUAAACGUCGACAUAUGGCCGGCGCGCUACUGCGGCCCCGGGAACCUGGCCUCCUACCCGAUGACAAUGUAUCACCCGUCCCUCUCCGGUACGGCGAUAAUAUAUUGGCUCAUCGGGUGGCAUCUUACUCCCGGCGUACUUCCGACCCGUCGCAUCAAGUCCGAUCUGACCAAUUCUGGGACAUUAAUGUACAACCACCUAUACCCUUCUUCGCACCUCUGCUGAAUUUGGCUUCACGAACCUAUGGGCGAGGAGCGCUGCUGUCCCCGCCGGAACCACAGAUUCACGCUGCCACUAUGGCUUCAGCUAGAUGUGAGUCAAAUAAUAGAUCAGUACUCGUUAUGCGGCAUGAUCAUGAAGGUAAACUGCCCUUGCACCGAUCCAAGCUAAGCGGGCUAGCUGUAAUCCUUAGCCGGGGAUCUUCCGAUGUAUGUGCCCCCUCGGACAUGAAACACAUCCACAGUGGAGAUAGACAAAUGACUGAGGAGCUUAGAUUUCUGGAGAACAAAAACUUGAUGGGCUUAAGAUAUGGUCUAUACUCAUUAACAUCGAGAUGUGCUCGAAACGUCGAUAGACUCAUUCCUUUUAUUCGCCUACAGCAAGUGUUCGGGGAAUCAAAGUUGGAGUCACUUGCCCCAGGGGUCAAGCCGCUCCCGAUUUUCGUCGAGCGUCGUAGGAUGUGGCCGCCGGUUAUAUGGAUAAGUAUACGUUGCGGACACCAGACCAGACCCUAUAUACGAGACCGUUCUGCAGCUAAAUGUCGGGGGGGCCAGUCGCGGCCCGCCCUCUCUAAACAACUUAUUUACGUUCGGCGGGUAAGGCAGUGGCGGUUACAUCCAGGCAGACAGAUGGUCCUUGGUCAUACGUUCGCGAGCUCUUUCCGUCAGGAACUCUCCGCACAGCAUACUGCAACUCGCCGGAUUACAAGACCCCUUAGUGCUCCCUUUUAUGUACGUCCCCGGCCCUGGACUGGCGGUACUGUCGAGCUUUGCAUUUUUAGAGGCGCCUCAUGCAGGACUUCAGAAUUCGGCAAGGGAGCUACCCCCAAAGAGCUCCUCGUAAUGAACGGGUUCCUAGUGGUGUAUUACCCAGCCGGACAAAGGCCCGGUCUAACGUCUUUUGUCCGUUCGCAUUCCAUACGUCCCGUGUACGCCGAGCUACUCGGUAGUGAAACUAGGCGAGACUUGCGGAGGUCUUUUUGGUCAGUAAACGUAGUACUUGGUGUAUAUCGUCACUUACGCCACGGCACAAGGCAAAGGAGUGCGUCAUCCGGAUUGAAGGGCACUCUCAAGGUUGAUUCGCCAAUGGGUGUUGUUCGUCGCAAACCGAACCCGAUCCACUUUUUACCCUGGAAAGGGGUGUCAAGGGCGGACUUCGUGGCUCUAUCCGUCCAUGGAGUAUACUCGUCCUCAGUAAGUAGUGUAGGAUGGUUCACCGGAUGGAAAGGUAACGUUAAAAGACCGCUUCGUUGUUUAAUUGCGCAAGACUUCAAGUGCUCGAGCUUAGGUCUUCCCAUUAUGUUUAGGGAUGUAUUCUCACAAAUGCCUUAUUGUAGAUUGAGACAAGCUCCAUACGUAGUAGCACCCUUUGACUCGGGCGUUCUAUGGAUAGCUCGCAAGACGUGGAUCGCAUUCAGUCACUUACGAAAAUCCAGAUUCUGCCCUGCCUGGCUGUCAACAGACAACACCUUCGAUCAAUACGGAUCUAUCUUGGUGAGCGAAUUUUCUCCCACCCCGCGGGGAAUCGCACUGGUGGUCUGUGUGCCCCGAUCCAUUGUCUGCCGGAGCCACGGGAAAAAUUUUAAAUUCUGUAUCCUACUCCCCCGUGUGGCUGUAGCCCAGCUGAGGUCAGUAUGUCACCUUGUCGCUAUUAGGUGUUUCACCAUCCUAAUUGGCAAACUGUUUCAACCCUGCCAGAUAAGGUCAGAGCAGCCCCUUCGCUGGUAUUUAUCCCACAGCCCCCCUUCGAAGCGUUCCGCUAAGGCAAUACCAGCUCCGUACAGAGCGCCGGGUACCUUCCUCAUCUACUCCUGGAUCUACUUCUUACUUUGUAGGUCCACGGAUCAAGGCUGUUACUUUUGCAUAGUUCAUCGUGCCAUUACGCAGAGGACUGGAUGUCCCAGAAUACUUCUUGGAUUCACACUUGUCUCAAAUGAGCUUACGGUGGCGCACGGGAUUCAAGCUCCCGUGUUAGAGCCUCGGGCGCUGCCGUACAAUAGGGCAACUCCCAGAACCGAUCACGGAGUUUCUCCGGUGCGUAGACGUAGGUGCAGUAAUAUUCCUAUAAAUGUUGGAGAGUACCGCUGGUUGUUUACUUUUUCGGUAUGCAUACCUACCGACAGUCGCAAGGCAGUACAUGCCACGCAAGUUAGUUGUUUAAUGGUUUUGCCGCGCACAGCUCGUGCCUAUCAUAGGGUAAGGUACACCAGCUUCGGGCUUGCUUCUGAGCAGACCCAAACUAUUUUUCUGAUCCACAUAUCAUCAGACAACAAUUUUGCUCGAAAAGUAUGCAUACCCCCAUUAGUCCUUCUCUGA

Output:

MRPWSLLGSEHILSRRVEGITHDISLYWIPFFSVQVDHLATGLSKIYSAMFLYVTFSAEVAVYHVKVDGSLTTSDACRENSIHHACMGSAHLQRHRQRTDRPPFLSDGKPRSNTEQSPTPSYRLTPRLYSPLTALSGKLYLLGSGCPWKCRKMAQTTIVYRARRWIPLITDTIICVAPLPQPNHGVVALAVPSRARPHCLVRLSQGPSNNVYRYNFTWYCPDGGTADPLPFRHGITSVYLPSYSGIVRSTPYLIGTYAVPTQNSDPFATTRWVSVRLLASTTGEGDTGDRGTLSTFLDCCMSTSALPPARFISAYRVNATHGDDTRLTVKKLCTPYKSLWSGNVCRPLSVYKLKERYIVLTVGYHSLLSRMSTLRKTYSVDQAGPARPLVAEKSGVCECDYDPRFVIVIATVHFGKATGLHSKADRCSRWVGSSGRVEAAQLHHQRNKKPTGRDSDAEKMSGKGRLWSQVRLGGLEIQIGRRYQFYEYFLGKLIIRQRPTHERTQGAKKPSQKGTRLATRMALAWCDRFGRIYIPQCNILVHTIMKVRLHQTACDDNTWTAGEYDLYDCTRDVPNIVLSLPHHLYELLVSDALPAPHLLFLGESGSARHRTRSVGLTIANYTIIRDRCRSTCIRLDEPNPYRRFDVLWPLLMTPARNTQTRAFLCSVAGWNCQYYRPPDRSSSRTYLIALHLAIKLRSRYPNRSDLAYPTSVNRNTVYITVVSLRTAKLRYNSYTPSKLRPDQGNDPSYRTSERGNFVRNLPSVKPYRDFMKTISISFTGFRTKFDRQIGNSIGRGFTGARPTSLKRHSRFSLTVHSSKRYLSRLNAYVSFTIKHRITRFDVATREVKASAVVPNAEIVRKSTKVCWFMCIIRLQTVRATKPTIQKQLLRLSGPFQCGDSTNYVQPGYFHSSNAPRPACVSVCISPGLWDLLVKTRNAFSRVHGGTILTLVQHDIHKVELSSACTRERATNLAMTESVTSYSQCEASTRHTEIQAVSSIVYLHLTAARREKHTEKRADAKHHVSTWREGKTENVIGRLRSSHTLTLRLHPSSLDNWPFILGECQRGTDIEERIPAHAECTDKAVGVAFQSTLWADSANPRGGSRLRRGIRGPNAMNIECGYYLARQLLDRSCSRKIGETLRQTSSRTPLPCKRGRVPKPLEPKESNRSGASSVWIPVGVRLGPSAAKTPPWRHGRPRQLLISPQVIPLRHPSKRASQRDQCELPSCPEYKTPVCRLALGSSRMVHELALKMLSPVPEFVWRVGGGKVYLTADPPRVNLTHMSEHALVVPGVSLWALFLLRPLFFHLQPRLSTTYRWRRHLCLPVQLHSYRLGDMQLGASRRWVGPRNIWRQGVYAWEIYEIRTVPAPRVSLFRRWKENTYTLFGSGEITANVKTAMYRITSHPFRLYAGAREFSPFRLNEKKFAPAGITYKTGCDYSRSGELCEGRGRKNSFMYHWAALPLHAHKTNSLIDFYTPCNPNAAADMLVRSTEDARARIHCRYWVNNSFCKYICWHSRFLIEPIQKKWCTPIERRRPVINGDVLNGSEVTTKTRCCFRWASHTGRSYGRDRANTNLLVMGDASPEGGANRRLASTTGGFVQFKVYISRGDPGKELPYITRISPGRIRARRSFRLMCAVNELVPEDGFTHRRQSTTPPSRSVCDGPVRFRIKTHFQTSTGWGKHWSSFQCDQNCSRSLAYFKQEVTVCGARPGQGSFLPLGLVNDGGWIVIHSERLAVPAYDGIGDLVSDSKSNSMRRGDTYLEVLIRAKRREPISKCASRGALSSHDRGHSLVSTGTHFHILRGLMGIRTIRLSGSRDPTVRTSREGCQAPHFMVQSVWKPTTLRGSPALDNANESNSRPAHNKGRGPSQSNRRTGNVRQVVVGRVTLSQEINPFVKHLELVPGYYLAEYPMPRSLASRSNLRVIHTSHERARQTIHSPGKRNRGASHRTPAGKHREYPRQNSCLDYYEPSIRRKEAYGCVNNALPDCPDKDDREWTRSQSMIEMSRPTESLLSASWHRPLVLGSLNYGFITDPVALTGQRKLGCRGMMNTLMLIWNHHFGPYGSTPRLVFVCEAKRHRGSWANYTEAKLPSYYVITLAQGLGPRAGLHHNVYCLHPQRVFHFCPFVSVHLQFLSHVSTSPSAKCARLDPVHLPAEVGIVKPAGRIKKSFVGGAGPLRMLNRQRVCSVGPGSGPPSVADCAKLTAEVEWTSIHPAAADRGLSQSTIHASMMLTHQISFTECDKFAQMAQSSVSHTVGQRVYSTSPPCAKPGPAGYRLISSIECTVHKCKRRHMAGALLRPREPGLLPDDNVSPVPLRYGDNILAHRVASYSRRTSDPSHQVRSDQFWDINVQPPIPFFAPLLNLASRTYGRGALLSPPEPQIHAATMASARCESNNRSVLVMRHDHEGKLPLHRSKLSGLAVILSRGSSDVCAPSDMKHIHSGDRQMTEELRFLENKNLMGLRYGLYSLTSRCARNVDRLIPFIRLQQVFGESKLESLAPGVKPLPIFVERRRMWPPVIWISIRCGHQTRPYIRDRSAAKCRGGQSRPALSKQLIYVRRVRQWRLHPGRQMVLGHTFASSFRQELSAQHTATRRITRPLSAPFYVRPRPWTGGTVELCIFRGASCRTSEFGKGATPKELLVMNGFLVVYYPAGQRPGLTSFVRSHSIRPVYAELLGSETRRDLRRSFWSVNVVLGVYRHLRHGTRQRSASSGLKGTLKVDSPMGVVRRKPNPIHFLPWKGVSRADFVALSVHGVYSSSVSSVGWFTGWKGNVKRPLRCLIAQDFKCSSLGLPIMFRDVFSQMPYCRLRQAPYVVAPFDSGVLWIARKTWIAFSHLRKSRFCPAWLSTDNTFDQYGSILVSEFSPTPRGIALVVCVPRSIVCRSHGKNFKFCILLPRVAVAQLRSVCHLVAIRCFTILIGKLFQPCQIRSEQPLRWYLSHSPPSKRSAKAIPAPYRAPGTFLIYSWIYFLLCRSTDQGCYFCIVHRAITQRTGCPRILLGFTLVSNELTVAHGIQAPVLEPRALPYNRATPRTDHGVSPVRRRRCSNIPINVGEYRWLFTFSVCIPTDSRKAVHATQVSCLMVLPRTARAYHRVRYTSFGLASEQTQTIFLIHISSDNNFARKVCIPPLVLL

PROT.rb:

def abbreviate(str)
  list = ""
  str.scan(/.../) do |sub|
    case sub
      when "UUU", "UUC"
        list += "F"
      when "UUA", "UUG" 
        list += "L"
      when "UCU", "UCC", "UCA", "UCG", "AGU", "AGC"
        list += "S"
      when "UAU", "UAC"
        list += "Y"
      when "UGU", "UGC"
        list += "C"
      when "UGG"
        list += "W"
      when "CUU", "CUC", "CUA", "CUG"
        list += "L"
      when "CCU", "CCC", "CCA", "CCG"
        list += "P"
      when "CAU", "CAC"
        list+= "H"
      when "CAA", "CAG"
        list += "Q"
      when "CGU", "CGC", "CGA", "CGG", "AGA", "AGG"
        list += "R"
      when "AUU", "AUC", "AUA"
        list += "I"
      when "AUG"
        list += "M"
      when "ACU", "ACC", "ACA", "ACG"
        list += "T"
      when "AAU", "AAC"
        list += "N"
      when "AAA", "AAG"
        list += "K"
      when "GUU", "GUC", "GUA", "GUG"
        list += "V"
      when "GCU", "GCC", "GCA", "GCG"
        list += "A"
      when "GAU", "GAC"
        list += "D"
      when "GAA", "GAG"
        list += "E"
      when "GGU", "GGC", "GGA", "GGG"
        list += "G"
      else
        return list
    end
  end
end

user_input = gets.chomp
abbreviate(user_input)

Yea, that's a very long switch.

Basically it's a translator. Take 3 characters, output 1 other. I've thought about implementing this using a key-value map, but that wouldn't solve the repetitiveness.

The readability is quite good and so is the speed. However, I'm quite sure it isn't idiomatic. I have no clue how this would score on maintainability.

There's one thing striking me as odd: return should be implicit. But simply stating list instead of return list modifies the behaviour and I'm not sure why.

Mast
  • 13.8k
  • 12
  • 57
  • 127