What Proteins Are in Simple Terms
Proteins are large biological macromolecules built from smaller units called amino acids. They are the “working molecules” of cells: almost everything cells do involves proteins.
In this chapter we focus on:
- how amino acids link to form proteins
- how protein chains fold into complex shapes
- how these shapes relate to protein function
Basic chemistry and the general properties of macromolecules are assumed from the parent chapters.
Amino Acids: The Building Blocks of Proteins
General Structure of an Amino Acid
All standard amino acids used in proteins share a common backbone:
- a central carbon atom (α-carbon)
- an amino group: –NH₂ (often protonated to –NH₃⁺ in cells)
- a carboxyl group: –COOH (often deprotonated to –COO⁻ in cells)
- a hydrogen atom: –H
- a variable side chain: –R (this differs between amino acids)
In shorthand:
$$
\text{Amino acid: } \mathrm{H_2N{-}CH(R){-}COOH}
$$
The R group (side chain) determines the chemical character of each amino acid.
Types of Side Chains (R Groups)
For understanding protein structure, we mainly care about how side chains interact with water and with each other:
- Nonpolar (hydrophobic) side chains
  - Often hydrocarbon chains or rings (e.g., leucine, isoleucine, valine)
  - Tend to avoid water and cluster inside proteins
- Polar (uncharged) side chains
  - Contain atoms like O or N that can form hydrogen bonds (e.g., serine, threonine, asparagine)
  - Often found on protein surfaces where they can interact with water
- Positively charged (basic) side chains
  - Carry a positive charge at physiological pH (e.g., lysine, arginine)
  - Attracted to negatively charged groups (e.g., DNA backbone, acidic amino acids)
- Negatively charged (acidic) side chains
  - Carry a negative charge at physiological pH (e.g., aspartate, glutamate)
- Special cases
  - Glycine: small, flexible, side chain is just –H
  - Proline: forms a ring with the backbone N, adds rigidity and bends
  - Cysteine: contains –SH groups that can form disulfide bonds
These chemical properties are crucial for how proteins fold.
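The grouping above can be written down as a small lookup table. The following is a minimal sketch, not a definitive classification: the names `SIDE_CHAIN_CLASSES` and `classify_side_chain` are made up for illustration, and the assignment of borderline residues (for example histidine, tyrosine, and tryptophan) is an assumption that differs between textbooks.

```python
# Illustrative grouping of the 20 standard amino acids by side-chain type.
# Assumption: membership follows one common textbook convention; borderline
# residues such as His, Tyr and Trp are assigned differently by some sources.
SIDE_CHAIN_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp"},
    "polar": {"Ser", "Thr", "Asn", "Gln", "Tyr"},
    "positive": {"Lys", "Arg", "His"},
    "negative": {"Asp", "Glu"},
    "special": {"Gly", "Pro", "Cys"},
}


def classify_side_chain(residue: str) -> str:
    """Return the side-chain class for a three-letter amino acid code."""
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return class_name
    raise ValueError(f"unknown residue: {residue}")


print(classify_side_chain("Leu"))  # nonpolar
print(classify_side_chain("Asp"))  # negative
```

A lookup like this only records the dominant character of each side chain; it says nothing about how a residue behaves in a particular protein environment.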
Essential vs. Non-Essential Amino Acids (Conceptual)
- Non-essential amino acids can be synthesized by the organism.
- Essential amino acids must be supplied by the diet (in humans: e.g., lysine, tryptophan).
This distinction matters nutritionally, but structurally all proteinogenic amino acids are used in the same way by the cell’s protein-building machinery.
From Amino Acids to Polypeptides
Peptide Bonds
Proteins are formed when amino acids link in chains. The link between two amino acids is a peptide bond, formed by a condensation reaction:
- The carboxyl group (–COOH) of one amino acid reacts with the amino group (–NH₂) of the next.
- A molecule of water is released.
Simplified:
$$
\mathrm{Amino\ acid_1{-}COOH + H_2N{-}Amino\ acid_2}
\rightarrow
\mathrm{Amino\ acid_1{-}CO{-}NH{-}Amino\ acid_2 + H_2O}
$$
The repeating backbone of a polypeptide is therefore:
$$
\mathrm{-NH{-}C_{\alpha}H{-}C(=O)-} \ \text{(repeated)}
$$
Side chains R project from this backbone.
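Because each peptide bond releases one water molecule, a chain of n amino acids has lost n − 1 molecules of H₂O relative to its free building blocks. The sketch below does this bookkeeping with atom counts. The function name `peptide_formula` and the three-entry table are illustrative choices, and only glycine, alanine, and serine are included.

```python
from collections import Counter

# Molecular formulas (atom counts) of three free amino acids.
FORMULAS = {
    "Gly": Counter({"C": 2, "H": 5, "N": 1, "O": 2}),
    "Ala": Counter({"C": 3, "H": 7, "N": 1, "O": 2}),
    "Ser": Counter({"C": 3, "H": 7, "N": 1, "O": 3}),
}
WATER = Counter({"H": 2, "O": 1})


def peptide_formula(residues):
    """Atom counts of a linear peptide built from the given amino acids."""
    if not residues:
        raise ValueError("need at least one amino acid")
    total = Counter()
    for name in residues:
        total.update(FORMULAS[name])
    # n amino acids form n - 1 peptide bonds; each bond releases one H2O.
    for _ in range(len(residues) - 1):
        total.subtract(WATER)
    return dict(total)


print(peptide_formula(["Gly", "Ala"]))  # {'C': 5, 'H': 10, 'N': 2, 'O': 3}
```

The dipeptide Gly–Ala therefore has the formula C₅H₁₀N₂O₃: the sum of the two free amino acids minus one water.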
Directionality: N-Terminus and C-Terminus
Polypeptide chains have a direction:
- N-terminus: free amino group (–NH₃⁺) at one end
- C-terminus: free carboxyl group (–COO⁻) at the other end
During synthesis in cells, amino acids are added from the N-terminus toward the C-terminus.
Peptides, Polypeptides, Proteins
- Dipeptide: 2 amino acids
- Oligopeptide: short chain (e.g., 2–20 amino acids)
- Polypeptide: longer chain (can be hundreds of amino acids)
- Protein: one or more polypeptide chains, properly folded and functional
The sequence of amino acids in a polypeptide is called its primary structure.
Levels of Protein Structure
Understanding proteins requires recognizing several hierarchical structural levels. Each level depends on the one below it.
Primary Structure: Amino Acid Sequence
The primary structure is the linear sequence of amino acids, listed from the N-terminus to the C-terminus.
Example (very simplified):
Met–Ala–Gly–Lys–Phe–…
Even a single amino acid change in this sequence can alter protein folding and function.
- The primary structure is determined by the gene encoding the protein.
- All higher-level structures ultimately depend on this sequence.
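A primary structure is simply an ordered list of residues read from the N-terminus. As a minimal sketch, the example sequence above can be stored as a Python list, written in one-letter code, and altered at a single position. The helper names `to_one_letter` and `substitute` are hypothetical, and the lookup table covers only the residues used here.

```python
# Three-letter to one-letter codes for the residues used in this example only.
THREE_TO_ONE = {
    "Met": "M", "Ala": "A", "Gly": "G", "Lys": "K", "Phe": "F", "Glu": "E",
}


def to_one_letter(residues):
    """Write a sequence (N-terminus first) in one-letter code."""
    return "".join(THREE_TO_ONE[r] for r in residues)


def substitute(residues, position, new_residue):
    """Return a copy with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(residues):
        raise IndexError(f"position {position} is outside the sequence")
    mutated = list(residues)
    mutated[position - 1] = new_residue
    return mutated


peptide = ["Met", "Ala", "Gly", "Lys", "Phe"]
print(to_one_letter(peptide))                        # MAGKF
print(to_one_letter(substitute(peptide, 4, "Glu")))  # MAGEF
```

Replacing lysine with glutamate at position 4 swaps a positively charged side chain for a negatively charged one, which is the kind of single-residue change that can affect folding.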
Secondary Structure: Local Folding Patterns
Secondary structures are regular, repeated local shapes formed by hydrogen bonding along the polypeptide backbone (not mainly by side chains).
The two most common types:
α-Helix (Alpha Helix)
- The backbone coils into a right-handed helix.
- Each backbone C=O group forms a hydrogen bond with the N–H group of the amino acid four residues further along the chain (residue i to residue i + 4).
- Side chains stick outward, around the helix.
Features:
- Common in membrane-spanning regions of proteins.
- Stable yet flexible.
- Formed spontaneously if the sequence allows hydrogen bonding and does not contain too many helix-breaking amino acids (like proline).
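The hydrogen-bonding pattern of the α-helix can be enumerated directly. The sketch below lists which residues are paired in an idealized helix. The function name `helix_hbond_pairs` is made up for illustration, and the whole segment is assumed to be helical with 1-based residue numbering.

```python
def helix_hbond_pairs(n_residues: int):
    """List (i, i + 4) pairs for an idealized alpha-helix of n_residues.

    The C=O of residue i is bonded to the N-H of residue i + 4 (1-based).
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(helix_hbond_pairs(8))  # [(1, 5), (2, 6), (3, 7), (4, 8)]
```

A segment shorter than five residues yields no pairs, which matches the fact that the first such bond needs residues i and i + 4 to both be present.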
β-Sheet (Beta Sheet)
- The backbone is extended, and several such strands lie side by side, forming a sheet.
- Hydrogen bonds form between C=O and N–H groups of neighboring strands.
Two arrangements:
- Parallel: strands run in the same N→C direction.
- Antiparallel: adjacent strands run in opposite directions; the hydrogen bonds between strands are better aligned, which generally makes this arrangement somewhat more stable.
Side chains alternate above and below the sheet. β-sheets can form flat or twisted surfaces within proteins.
Other Elements: Turns and Loops
- β-turns / hairpin turns: short segments (often 4 amino acids) that reverse the direction of the chain, often containing glycine or proline.
- Loops: non-regular segments connecting helices and sheets, often on the protein surface and involved in binding or catalysis.
Secondary structures are stabilized mainly by hydrogen bonds between backbone atoms:
C=O ··· H–N
Tertiary Structure: Overall 3D Shape of a Single Polypeptide
Tertiary structure is the complete three-dimensional folding of a single polypeptide chain. It describes:
- how helices, sheets, and loops are arranged in space
- how side chains interact to stabilize the structure
Forces Stabilizing Tertiary Structure
Several types of interactions work together:
- Hydrophobic interactions
  - Nonpolar side chains cluster in the interior, away from water.
  - This is one of the main driving forces of protein folding in aqueous environments.
- Hydrogen bonds
  - Between polar side chains and between side chains and backbone.
  - Contribute to specificity in folding and ligand binding.
- Ionic bonds (salt bridges)
  - Between positively and negatively charged side chains.
  - Especially important in stabilizing certain conformations.
- Disulfide bonds
  - Covalent bonds between two cysteine side chains:
$$
\mathrm{R{-}SH + HS{-}R \rightarrow R{-}S{-}S{-}R + 2H^+ + 2e^-}
$$
  - Often stabilize extracellular proteins (e.g., antibodies, some hormones).
- Van der Waals forces
  - Weak, short-range attractions between all atoms in close contact.
  - Individually weak, but collectively significant.
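As a rough illustration of how side-chain type relates to these interactions, the sketch below names the most characteristic interaction for a pair of residues in contact. It is a deliberate simplification: the function `likely_interaction` and the residue sets are assumptions made for this example, and real interactions depend on distance, geometry, and the local environment, not on residue type alone.

```python
# Assumed residue sets for illustration (three-letter codes).
NONPOLAR = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp"}
POLAR = {"Ser", "Thr", "Asn", "Gln", "Tyr"}
POSITIVE = {"Lys", "Arg"}
NEGATIVE = {"Asp", "Glu"}


def likely_interaction(a: str, b: str) -> str:
    """Name the most characteristic interaction for two side chains in contact."""
    if a == "Cys" and b == "Cys":
        return "disulfide bond"
    charged = POSITIVE | NEGATIVE
    if a in charged and b in charged:
        # Opposite charges attract; like charges repel.
        opposite = (a in POSITIVE) != (b in POSITIVE)
        return "ionic bond (salt bridge)" if opposite else "repulsion (like charges)"
    if a in NONPOLAR and b in NONPOLAR:
        return "hydrophobic interaction"
    hbond_capable = POLAR | charged
    if a in hbond_capable and b in hbond_capable:
        return "hydrogen bond"
    # Any two groups in close contact still experience van der Waals forces.
    return "van der Waals contact"


print(likely_interaction("Lys", "Asp"))  # ionic bond (salt bridge)
print(likely_interaction("Leu", "Val"))  # hydrophobic interaction
```

The fallback case reflects the point made above that van der Waals forces act between all atoms in close contact, whatever the side-chain type.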
Domains
Many proteins are built from domains:
- Compact, semi-independent structural and functional units within a single polypeptide.
- Typically 50–300 amino acids each.
- A single protein can have multiple domains with different roles (e.g., one binds DNA, another binds ATP).
Domains can often fold independently and sometimes correspond to distinct evolutionary units.
Quaternary Structure: Association of Multiple Polypeptide Chains
Some functional proteins consist of more than one polypeptide:
- Each polypeptide is called a subunit.
- The arrangement of these subunits is the quaternary structure.
Examples:
- Hemoglobin
  - 4 subunits (2 α and 2 β chains) that cooperate to transport oxygen.
- Many enzymes
  - Work as dimers (2 subunits), tetramers (4), or higher-order complexes.
Subunits are held together by the same kinds of interactions as in tertiary structure:
- hydrophobic interactions
- hydrogen bonds
- ionic interactions
- sometimes inter-subunit disulfide bonds
Quaternary structure allows:
- cooperative effects (binding in one subunit influencing others)
- regulation via subunit association or dissociation
- modular design of large complexes (e.g., ribosomes, multi-enzyme complexes)
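At its simplest, a quaternary structure can be described by the list of subunits it contains. The following sketch summarizes such a list, using hemoglobin's composition from the example above. The function name `describe_quaternary` and the chain labels are illustrative, and the spatial arrangement of the subunits is not modeled.

```python
from collections import Counter

OLIGOMER_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer"}


def describe_quaternary(subunits):
    """Summarize a complex given one chain label per subunit."""
    if not subunits:
        raise ValueError("a protein needs at least one polypeptide chain")
    total = len(subunits)
    name = OLIGOMER_NAMES.get(total, f"{total}-subunit complex")
    counts = Counter(subunits)
    composition = ", ".join(f"{n} {chain}" for chain, n in sorted(counts.items()))
    return f"{name} ({composition})"


hemoglobin = ["alpha", "alpha", "beta", "beta"]
print(describe_quaternary(hemoglobin))  # tetramer (2 alpha, 2 beta)
```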
Protein Folding and Denaturation
How Proteins Fold
For most small to medium-sized proteins:
- The amino acid sequence alone contains enough information to specify the final 3D structure.
- Folding in the cell is often assisted and guided by chaperone proteins, which do not become part of the final structure.
Folding proceeds through:
- Formation of local secondary structures (helices, sheets)
- Collapse of hydrophobic residues toward the interior
- Fine-tuning via side chain rearrangements to reach a low-energy, stable conformation
Misfolded proteins can lose function and may aggregate; in some cases this is linked to disease.
Denaturation
Denaturation is the loss of a protein’s native (functional) structure without breaking the primary sequence (peptide bonds remain intact).
Causes include:
- High temperature
- Extreme pH
- High salt concentrations
- Organic solvents or detergents
- Certain chemicals (e.g., urea, guanidinium salts)
Consequences:
- Secondary, tertiary, and often quaternary structures are disrupted.
- The protein usually loses its specific biological function.
- Sometimes denaturation is irreversible (e.g., cooking an egg); sometimes proteins can renature if conditions are restored.
Denaturation illustrates that protein function critically depends on three-dimensional structure.
Structure–Function Relationship in Proteins
The specific 3D structure of a protein creates:
- Binding sites: pockets or surfaces where other molecules (substrates, ligands) fit.
- Active sites of enzymes: regions where chemical reactions are catalyzed.
- Recognition motifs: sequences and shapes that allow interaction with DNA, membranes, other proteins, etc.
Even small structural changes—such as exchanging one amino acid for another in a key position—can:
- reduce or abolish function
- alter stability or solubility
- change interactions with other molecules
Thus, the primary structure (sequence) determines the higher-order structures (fold), which in turn determine function.
Overview of Major Structural Classes
Based on their overall architecture, proteins are often grouped as:
- Globular proteins
  - Compact, roughly spherical.
  - Usually soluble in water.
  - Typically enzymes, transport proteins, regulatory proteins (e.g., hemoglobin, many enzymes).
- Fibrous proteins
  - Long, extended, often forming fibers.
  - Usually insoluble, structural or protective roles.
  - Examples: collagen in connective tissue, keratin in hair and nails.
- Membrane proteins
  - Embedded in or associated with cell membranes.
  - Usually have hydrophobic segments (often α-helices or β-barrels) that interact with the lipid bilayer.
  - Roles in transport, signaling, cell recognition.
Despite this variety, all these proteins are built from the same set of amino acids and the same basic principles of structure.
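The hydrophobic segments mentioned for membrane proteins can be looked for in a sequence with a simple sliding window. The sketch below flags windows that consist mostly of nonpolar residues. The function name `hydrophobic_windows`, the nonpolar residue set, and the default window length and threshold are all assumptions chosen for illustration, not an established prediction method.

```python
# Assumed set of nonpolar residues (one-letter codes).
NONPOLAR = set("AVLIMFW")


def hydrophobic_windows(sequence: str, window: int = 19, threshold: float = 0.75):
    """Return (start, end, fraction) for windows rich in nonpolar residues.

    Positions are 1-based and inclusive, counted from the N-terminus.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        fraction = sum(aa in NONPOLAR for aa in segment) / window
        if fraction >= threshold:
            hits.append((start + 1, start + window, fraction))
    return hits


# Toy sequence: charged ends flanking a nonpolar stretch.
toy = "DK" + "LLLLVVAA" + "DK"
for start, end, fraction in hydrophobic_windows(toy, window=4, threshold=1.0):
    print(start, end, fraction)
```

Overlapping hits indicate one continuous nonpolar stretch. Whether such a stretch actually sits in a membrane cannot be decided from composition alone.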
Summary of Key Points
- Proteins are polymers of amino acids linked by peptide bonds.
- Primary structure: linear amino acid sequence.
- Secondary structure: local regular patterns (α-helices, β-sheets, turns).
- Tertiary structure: complete 3D folding of a single polypeptide.
- Quaternary structure: arrangement of multiple polypeptide subunits.
- Protein structure is stabilized by hydrogen bonds, hydrophobic interactions, ionic bonds, disulfide bonds, and van der Waals forces.
- Proper folding is essential for function; denaturation disrupts higher-order structures and usually inactivates the protein.