DEV Community

PranavMunigala

Central Dogma Explorer

Recently, I came across a simple tutorial on using Biopython (https://www.kaggle.com/code/shtrausslearning/biopython-bioinformatics-basics#8-|-PHYLOGENETIC-ANALYSIS). After studying the material thoroughly, I was inspired to build a few projects that combine these foundational bioinformatics concepts with AI. My goal was to gain hands-on experience while creating something both educational and interactive.

The first project in this series is called the "Central Dogma Explorer." It's an interactive educational tool designed to take a gene name, retrieve its DNA sequence, and visually demonstrate the biological processes of transcription and translation, ultimately showing how a functional protein is produced.

To make this more manageable, I divided the project into three core components:

  1. Biopython Integration – I wrote functions such as fetch_gene_sequence(gene_symbol, organism="Homo sapiens") and transcribe_translate_dna(dna_seq) to retrieve and process DNA sequences. These functions fetch gene data, transcribe DNA into mRNA, and translate it into an amino acid sequence.

  2. AI Explanation with an LLM – I incorporated a language model to explain the transcription and translation processes in simple terms. Here's the core function:

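from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# `llm` is assumed to be a LangChain LLM instance created elsewhere in the app.
def generate_explanation(dna_seq):
    """Ask the LLM for a beginner-friendly walkthrough of transcription and translation."""
    prompt = PromptTemplate(
        input_variables=["sequence"],
        template=(
            "You are an expert biology tutor. Explain the full process of transcription and translation "
            "for this DNA sequence: {sequence}. Your answer should be clear and easy to understand for a first-year biology student."
        )
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    # Send only the first 500 bases to keep the prompt short.
    explanation = chain.run(sequence=dna_seq[:500])
    return explanation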
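
To make it concrete what the model actually receives, here is a small dependency-free sketch that fills the same template with plain `str.format` instead of LangChain. The helper name `build_prompt` and its `max_bases` parameter are hypothetical and not part of the project code; only the template text and the 500-base limit come from the function above.

```python
# Template text copied from generate_explanation above.
TEMPLATE = (
    "You are an expert biology tutor. Explain the full process of transcription and translation "
    "for this DNA sequence: {sequence}. Your answer should be clear and easy to understand for a first-year biology student."
)


def build_prompt(dna_seq, max_bases=500):
    """Hypothetical helper: fill the template with at most max_bases bases."""
    # Slicing never raises, so sequences shorter than max_bases pass through unchanged.
    return TEMPLATE.format(sequence=dna_seq[:max_bases])


if __name__ == "__main__":
    # A short made-up sequence, used only to show the filled-in prompt.
    print(build_prompt("ATGGCCATTGTAATGGGCCGCTGA"))
```

Capping the input keeps the prompt size bounded, but it also means the model only ever sees the first 500 bases, so its explanation cannot refer to anything later in the gene.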

  3. User Interface with Streamlit – I built a simple, user-friendly interface using Streamlit to display the DNA sequence, the corresponding mRNA transcript, the amino acid sequence, and an AI-generated explanation. Example UI elements include:

st.subheader("1️⃣ DNA Sequence")
st.code(dna_seq[:1000] + ("..." if len(dna_seq) > 1000 else ""), language="text")

st.subheader("2️⃣ mRNA Transcript")
st.code(rna_seq[:1000] + ("..." if len(rna_seq) > 1000 else ""), language="text")

st.subheader("3️⃣ Amino Acid Sequence")
st.code(protein_seq, language="text")

st.subheader("4️⃣ AI-Generated Explanation")
st.markdown(explanation)
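
Returning to the first component: Biopython's `Seq` objects provide `transcribe()` and `translate()` methods that handle these steps in a line each. To show what those steps amount to, here is a dependency-free sketch. It is not the project's implementation; it reuses the name `transcribe_translate_dna` from the list above only for continuity. It assumes the input is a coding-strand DNA sequence that starts in frame, uses the standard genetic code, stops at the first stop codon, and returns an (mRNA, protein) pair. All of these are assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# (third base varying fastest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe_translate_dna(dna_seq):
    """Return (mRNA, protein) for a coding-strand DNA string.

    Assumptions (not confirmed by the project code): the sequence starts
    in frame, and translation stops at the first stop codon.
    """
    # Transcription of the coding strand: same sequence with T replaced by U.
    rna_seq = dna_seq.upper().replace("T", "U")

    protein = []
    # Walk the mRNA three bases at a time, ignoring a trailing partial codon.
    for i in range(0, len(rna_seq) - 2, 3):
        # "X" stands in for codons missing from the table (e.g. containing N).
        amino_acid = CODON_TABLE.get(rna_seq[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return rna_seq, "".join(protein)


if __name__ == "__main__":
    rna_seq, protein_seq = transcribe_translate_dna(
        "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    )
    print(rna_seq)      # AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG
    print(protein_seq)  # MAIVMGR
```

With Biopython installed, the same result should come from `Seq(dna).transcribe()` and `Seq(dna).translate(to_stop=True)`, which is the more practical choice in a real app because it also supports alternative codon tables.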


I've also created a short tutorial video walking through the application and its features (https://youtu.be/2qi5UPiiS1Q). I'm always open to feedback and would love to hear any suggestions for future bioinformatics projects that combine biology, AI, and interactivity. Thanks for reading!

Here is a link to the code: https://github.com/PranavMunigala/CentralDogmaExplorer.git
