Overview: What Does “From Gene to Protein” Mean?
In this chapter we follow the information flow inside cells:
- A gene is a segment of DNA that contains the information for building a specific RNA (often an mRNA that encodes a protein).
- This information is copied into RNA in a process called transcription.
- The RNA information is then read and translated into a sequence of amino acids in a process called translation, producing a polypeptide (which can fold into a functional protein).
Together, transcription and translation are often summarized as gene expression. The details of DNA structure, the genetic code, and the differences between prokaryotes and eukaryotes are handled in other chapters; here we concentrate on the overall path from DNA information to protein product.
Basic Idea of Transcription
What Transcription Does
Transcription turns DNA information into an RNA copy:
- Template: One strand of DNA (the template strand) acts as a pattern.
- Product: A complementary RNA strand is built by RNA polymerase.
- Direction: RNA is synthesized in the $5' \to 3'$ direction, adding nucleotides to the 3' end.
- Base pairing rules:
- DNA–RNA pairs: A–U, T–A, G–C, C–G (RNA uses U instead of T).
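The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function name `transcribe` and the example sequence are invented for this sketch, and the template strand is assumed to be written in its $3' \to 5'$ orientation so that the RNA comes out $5' \to 3'$.

```python
# RNA base that pairs with each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the complementary RNA (read 5'->3') for a template given 3'->5'.

    Hypothetical helper for illustration; real transcription also needs a
    promoter, a terminator, and RNA polymerase.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


rna = transcribe("TACGGATTC")  # made-up template sequence
print(rna)  # AUGCCUAAG
```

Note that every T in the template is matched by A in the RNA, while every A in the template is matched by U rather than T.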
The RNA produced can be:
- mRNA (messenger RNA) – carries the code for a protein.
- rRNA (ribosomal RNA) and tRNA (transfer RNA) – play roles in translation (how they work in detail is covered with translation and RNA structure elsewhere).
Key DNA Regions Involved
Along a gene, some regions are especially important for transcription:
- Promoter: A DNA sequence “upstream” of the coding region where RNA polymerase binds and starts transcription.
- Terminator: A sequence that signals RNA polymerase to stop and release the RNA.
The coding region of the gene is the part that corresponds (through mRNA) to the amino acid sequence.
Main Steps of Transcription
Although there are differences between prokaryotes and eukaryotes, transcription can be divided into three general stages.
1. Initiation
- RNA polymerase (alone or with helper proteins) binds to the promoter.
- The local DNA double helix is unwound, exposing the template strand.
- The first RNA nucleotides are joined.
Result: A “transcription bubble” forms and synthesis begins.
2. Elongation
- RNA polymerase moves along the DNA template strand.
- It adds ribonucleotides one by one, complementary to the template.
- The newly made RNA peels away from the DNA; the DNA helix reforms behind the enzyme.
The RNA strand grows in the $5' \to 3'$ direction.
3. Termination
- When RNA polymerase reaches a termination signal:
- It releases the completed RNA.
- It detaches from DNA.
- The DNA is fully rewound.
The result is a primary RNA transcript.
mRNA Processing in Eukaryotes (Conceptual View)
In prokaryotes, the RNA produced by transcription is often ready for translation with very little modification.
In eukaryotes, the initial RNA (the primary transcript or pre‑mRNA) usually needs several processing steps before it becomes a functional mRNA:
- Capping at the 5' end: A special modified nucleotide is added to the 5' end.
- Promotes mRNA stability and later recognition by ribosomes.
- Polyadenylation at the 3' end: A stretch of many adenines (poly‑A tail) is added.
- Protects the mRNA from degradation and aids export from the nucleus.
- RNA splicing:
- Many eukaryotic genes are interrupted by non‑coding segments (introns) that lie between the coding segments (exons).
- Introns are removed; exons are joined together to form a continuous coding sequence.
These steps happen in the nucleus. Only after processing is the mature mRNA exported to the cytoplasm for translation.
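As a rough sketch of the three processing steps, the following Python function treats the pre‑mRNA as a string. The function name, the intron coordinates, the `"cap-"` label, and the tail length are all assumptions made for illustration; in a real cell the spliceosome recognizes intron boundaries from sequence signals, and the cap is a modified nucleotide rather than a text label.

```python
def process_pre_mrna(pre_mrna, introns, tail_length=10):
    """Return a simplified mature mRNA string.

    pre_mrna    -- primary transcript, 5'->3'
    introns     -- list of (start, end) index pairs, end exclusive (hypothetical input)
    tail_length -- number of adenines in the poly-A tail (arbitrary choice here)
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # last exon
    spliced = "".join(exons)                    # splicing: exons joined together
    return "cap-" + spliced + "A" * tail_length  # 5' cap and 3' poly-A tail


# Made-up transcript: exon 1 (6 nt), one intron (12 nt), exon 2 (6 nt).
pre = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
print(process_pre_mrna(pre, introns=[(6, 18)], tail_length=5))
```

The output contains the two exons joined into one continuous sequence, with the intron gone and the cap and tail attached at the ends.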
From Nucleotide Sequence to Amino Acid Sequence
The Role of the Genetic Code
The genetic code (treated in detail elsewhere) specifies how nucleotide triplets in mRNA (called codons) correspond to amino acids.
Key conceptual points here:
- Codon: A sequence of 3 nucleotides in mRNA (e.g. AUG, GAA).
- Each codon codes for:
- One of 20 standard amino acids, or
- A start or stop signal for translation (the usual start codon, AUG, also codes for the amino acid methionine).
- The code is non‑overlapping, almost universal, and degenerate (many amino acids are encoded by more than one codon).
So, gene → mRNA codons → amino acid sequence.
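The non‑overlapping reading of codons can be illustrated with a small Python helper. The name `split_into_codons` and the example sequence are invented for this sketch, which simply assumes the reading frame starts at the first nucleotide.

```python
def split_into_codons(mrna):
    """Split an mRNA string into consecutive, non-overlapping triplets.

    Leftover nucleotides at the end (fewer than 3) are ignored.
    """
    usable_length = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable_length, 3)]


print(split_into_codons("AUGGAAUAA"))  # ['AUG', 'GAA', 'UAA']
```

Each nucleotide belongs to exactly one codon, which is what "non‑overlapping" means here.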
Players in Translation
Translation converts the mRNA sequence into a polypeptide. Three main types of molecules cooperate:
- mRNA – carries the coded information (sequence of codons).
- tRNA – adapter molecules that:
- Have an anticodon that base‑pairs with the mRNA codon.
- Carry a specific amino acid that matches that codon.
- Ribosomes – large complexes of rRNA and protein that:
- Bring mRNA and tRNAs together.
- Catalyze formation of peptide bonds between amino acids.
Other supporting components include:
- Aminoacyl‑tRNA synthetases – enzymes that attach the correct amino acid to each tRNA type.
- Various initiation, elongation, and release factors – helper proteins that regulate and assist each stage of translation.
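To make the adapter role of tRNA concrete, the sketch below computes the anticodon that would pair perfectly with a given codon. It is a simplification under stated assumptions: the helper name is invented, both sequences are written $5' \to 3'$ (so the anticodon is the reverse complement of the codon), and relaxed pairing at the third codon position as well as chemically modified tRNA bases are ignored.

```python
# Base pairing between two RNA strands.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the perfectly matching anticodon, written 5'->3'.

    The two strands pair antiparallel, so the codon is read backwards
    while each base is replaced by its partner.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))  # CAU
```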
Main Stages of Translation
Translation also proceeds through three conceptual stages.
1. Initiation
Goal: Assemble the components at the correct start codon.
- The small ribosomal subunit binds to the mRNA.
- A start codon (usually AUG) is recognized.
- A special initiator tRNA carrying methionine (in eukaryotes) pairs with the start codon.
- The large ribosomal subunit then joins, forming a complete ribosome with the initiator tRNA in the correct site.
Result: The translation machinery is now positioned to build the polypeptide.
2. Elongation
Goal: Add amino acids in the order specified by the mRNA.
For each codon along the mRNA:
- A tRNA with the matching anticodon binds to the codon.
- The ribosome catalyzes formation of a peptide bond between the amino acid of the new tRNA and the growing polypeptide chain.
- The ribosome shifts along the mRNA by one codon, and the cycle repeats.
The polypeptide grows from its N‑terminus (first amino acid) to its C‑terminus (last amino acid).
3. Termination
Goal: Release the completed polypeptide when the end of the coding sequence is reached.
- When a stop codon (UAA, UAG, or UGA in the standard code) appears in the mRNA’s reading frame, there is no corresponding tRNA.
- Instead, release factor proteins bind to the stop codon.
- The polypeptide is released from the last tRNA.
- The ribosome dissociates from the mRNA.
Result: A free polypeptide chain is produced that can fold into a functional protein or combine with other chains.
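The three stages of translation can be mimicked in a short Python sketch. It is a toy model, not a description of the real machinery: the function name is invented, initiation is reduced to "find the first AUG", and ribosomes, tRNAs, and helper factors are not represented. The codon table is the standard genetic code written with one‑letter amino acid symbols, with `*` marking stop codons.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string into a one-letter amino acid sequence."""
    mrna = mrna.upper()
    start = mrna.find("AUG")             # initiation: locate the start codon
    if start == -1:
        return ""                        # no start codon, no polypeptide
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon per step
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":            # termination: stop codon reached
            break
        polypeptide.append(amino_acid)
    return "".join(polypeptide)


print(translate("GGAUGGAAUUUUAAGC"))  # MEF
```

In the example, the nucleotides before AUG and after UAA are not translated; the product is methionine, glutamate, phenylalanine, read from N‑terminus to C‑terminus.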
From Polypeptide to Functional Protein (Conceptual)
The linear amino acid chain must usually undergo additional steps before becoming a fully functional protein:
- Folding:
- The chain folds into secondary, tertiary, and sometimes quaternary structures.
- This folding depends on the amino acid sequence and the cellular environment.
- Post‑translational modifications:
- Chemical additions (e.g. phosphate, sugar, lipid groups).
- Cleavage of certain segments from the chain.
- Targeting and transport:
- Proteins are directed to their correct locations (e.g. cytosol, membranes, organelles, or secretion outside the cell), guided by short signal sequences within their amino acid chain.
These processes allow the information encoded in DNA to result in a wide variety of proteins with specific functions.
Why the Gene–Protein Connection Matters
Understanding the path from gene to protein explains key biological phenomena:
- How genotype leads to phenotype: DNA differences can alter amino acid sequences, protein shape, and function.
- Effects of mutations:
- A change in DNA can change a codon and thus the amino acid, potentially altering the protein.
- Some mutations do not change the amino acid (synonymous mutations); others may have serious consequences.
- Basis of many diseases: Defects in transcription, translation, or protein processing often underlie genetic disorders.
- Biotechnological applications:
- Using cells to produce desired proteins (e.g. insulin) relies on controlling gene expression and translation.
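The effect of a single‑codon change, as described under "Effects of mutations" above, can be checked against the standard genetic code. The sketch below is illustrative only: the function name is invented, the labels "missense" and "nonsense" are standard terms not used elsewhere in this chapter, and the original codon is assumed to code for an amino acid rather than a stop.

```python
from itertools import product

# Standard genetic code (one-letter amino acid symbols, '*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(original, mutated):
    """Classify a change from one mRNA codon to another.

    Assumes `original` codes for an amino acid (not a stop codon).
    """
    before = CODON_TABLE[original]
    after = CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # same amino acid, protein sequence unchanged
    if after == "*":
        return "nonsense"     # new stop codon, polypeptide is cut short
    return "missense"         # a different amino acid is inserted


print(classify_codon_change("GAA", "GAG"))  # synonymous
print(classify_codon_change("GAG", "GUG"))  # missense
print(classify_codon_change("GAA", "UAA"))  # nonsense
```

Because the code is degenerate, some single‑nucleotide changes leave the amino acid untouched, while others alter or truncate the protein.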
In summary, “from gene to protein” describes the central flow of genetic information in living cells: DNA is transcribed into RNA, and RNA is translated into proteins, which carry out most of the cell’s work.