D. Proteins
Proteins are a different kind of macromolecule from DNA and RNA. They are composed of a chain of amino acids, which are molecules structured as in the diagram below.
Figure 4. The structure of an amino acid.
There are 20 different possible R-groups, and thus 20 distinct amino acids with different properties. A typical protein contains roughly 300 to 1000 amino acids, although lengths vary widely. Adjacent amino acids in the chain are connected via peptide bonds.
Figure 5. Formation of a peptide bond, connecting two amino acids together. The blue box on the right is the peptide bond itself.
Amino acid chains then fold into complex 3D structures, which perform specific functions within the cell. The structure of a protein is intimately connected to its function. The general dogma is that the same sequence implies the same structure, i.e. that the amino acid sequence contains all the information necessary to predict the 3D structure of the protein and thus its function. However, predicting structure from sequence is still an area of extensive research.
This flow of information, from DNA containing many genes, to mRNA carrying the information for only one gene, to the amino acid chain encoded by that gene, and finally to a functional 3D structure, is the Central Dogma of Molecular Biology.
E. DNA in action: Transcription and Translation
DNA is the carrier of vital information for the organism. The two main questions about it are “how is the information stored in DNA” and “how is this stored information used.” In general terms, information is stored as nucleotide sequences, as described above, and used in protein synthesis.
As mentioned in previous sections, DNA is contained in the nucleus of the cell. A stretch of it unwinds and its message (or sequence) is transcribed onto a molecule of mRNA. The mRNA then travels to a molecular workbench in the cytoplasm, a structure called a ribosome, which translates the mRNA into a protein.
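As a rough illustration of the transcription step, the sketch below treats DNA and mRNA as plain strings. It assumes the standard base pairing (A with U, T with A, G with C, C with G) and ignores everything else that happens in a real cell. The function names and the example sequences are hypothetical, chosen only for this illustration.

```python
# Base placed in the mRNA opposite each DNA base of the template strand
# (assumption: standard pairing, with U taking the place of T in RNA).
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding_strand(dna: str) -> str:
    """Return the mRNA for a coding (sense) strand.

    The mRNA has the same sequence as the coding strand,
    except that every T is replaced by U.
    """
    return dna.upper().replace("T", "U")


def transcribe_template_strand(template: str) -> str:
    """Return the mRNA built by pairing against a template strand.

    The template is assumed to be written 3'->5', so the resulting
    mRNA reads 5'->3' without any reversal.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template.upper())


if __name__ == "__main__":
    print(transcribe_coding_strand("ATGCCGGGAGTATAG"))
    print(transcribe_template_strand("TACGGC"))
```

Both functions describe the same process from two viewpoints: the coding strand already looks like the mRNA, while the template strand is the one that is physically paired against.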
During translation, each triplet of nucleotides in the mRNA maps to an amino acid or, in the case of a stop codon, signals the end of the chain. The triplets are known as codons. Thus one can think of the sequence AUGCCGGGAGUAUAG in RNA as AUG-CCG-GGA-GUA-UAG for the purposes of translation; here the final codon, UAG, is a stop codon.
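The codon reading described above can be sketched in a few lines of Python. This is a minimal illustration, not a full translator: the table contains only the five codons that appear in the example sequence (taken from the standard genetic code), and the function names are hypothetical.

```python
# Partial codon table: only the codons used in the example sequence.
# A complete table for the standard genetic code has 64 entries.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start codon
    "CCG": "Pro",   # proline
    "GGA": "Gly",   # glycine
    "GUA": "Val",   # valine
    "UAG": "Stop",  # stop codon: ends translation
}


def split_codons(rna: str) -> list[str]:
    """Split an RNA string into consecutive triplets, dropping any leftover bases."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def translate(rna: str) -> list[str]:
    """Map each codon to its amino acid, stopping at the first stop codon."""
    peptide = []
    for codon in split_codons(rna):
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the partial table
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    rna = "AUGCCGGGAGUAUAG"
    print("-".join(split_codons(rna)))
    print("-".join(translate(rna)))
```

Reading starts at the first base and proceeds three bases at a time, so shifting the starting position by one or two bases would produce an entirely different set of codons.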
Some useful terms to remember are gene, which is a length of DNA that codes for a protein, and genome, which is the entire DNA sequence within the nucleus.