The genetic code, whose basic unit is the codon, establishes how genes encoded in DNA are decoded into proteins. A critical step in protein synthesis is the pairing between the codon in messenger RNA (mRNA) and the anticodon in an aminoacyl-transfer RNA (aminoacyl-tRNA).
A codon is a triplet of adjacent nucleotides in mRNA that specifies an amino acid to be incorporated in a protein. Because each of the three positions in a codon can be occupied by any of the four ribonucleotides (A, U, G, or C), there are 4³ (4 × 4 × 4), or 64, possible combinations, and therefore 64 different codons. The first letter of the codon is at the 5′-end, while the last letter is at the 3′-end, as in 5′-AUG-3′.
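The count of 64 can be checked by enumerating every ordered triplet of the four bases. The following is a minimal Python sketch; the names `BASES` and `codons` are illustrative and do not come from the source.

```python
from itertools import product

BASES = "UCAG"  # the four ribonucleotides found in mRNA

# Every ordered triplet of bases, each written 5'->3'.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))  # 64
print(codons[:4])   # ['UUU', 'UUC', 'UUA', 'UUG']
```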
The amino acid sequence of a protein will be specified by the sequence of contiguous codons in the mRNA template. The initial codon in the mRNA establishes the reading frame and defines the protein's initial amino acid.
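To illustrate why the reading frame matters, the sketch below splits the same mRNA string into contiguous triplets starting at three different offsets. The function name `split_codons` and the example sequence are hypothetical, chosen only for illustration.

```python
def split_codons(mrna, frame=0):
    """Split an mRNA string (5'->3') into complete triplets, starting at offset `frame`."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]

mrna = "AUGGUUUGGUAA"  # made-up example sequence

print(split_codons(mrna, 0))  # ['AUG', 'GUU', 'UGG', 'UAA']
print(split_codons(mrna, 1))  # ['UGG', 'UUU', 'GGU']
print(split_codons(mrna, 2))  # ['GGU', 'UUG', 'GUA']
```

Shifting the starting point by a single nucleotide yields a completely different series of codons, which is why the position of the initial codon fixes the meaning of everything downstream.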
Codons fall into three classes. The initiation codon, AUG, signifies the initial amino acid of the protein (and also codes for methionine residues at internal positions). Sixty-one codons, including AUG, designate individual amino acids. The remaining three codons (UAA, UAG, and UGA) are termination codons (also called stop codons or nonsense codons), which do not code for amino acids but signal the end of the mRNA message and provide the "stop" signal for protein synthesis.
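The three classes can be tallied from a table of the standard genetic code. The sketch below builds such a table from a compact string of one-letter amino acid codes listed in U, C, A, G order; the string layout, the use of "*" for termination, and all variable names are conventions assumed here, not taken from the source.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the standard code, in UCAG order.
# "*" marks a termination codon (an assumed convention).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

print(stop_codons)          # ['UAA', 'UAG', 'UGA']
print(len(sense_codons))    # 61
print(CODON_TABLE["AUG"])   # M (methionine; AUG is also the initiation codon)
```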
Two amino acids, tryptophan and methionine, have unique codons: UGG and AUG, respectively. All other amino acids may be coded for by more than one codon, so the code is said to be degenerate. This degeneracy is not uniform but varies from one amino acid to another: three amino acids (arginine, leucine, and serine) have six codons, five amino acids have four, isoleucine has three, and nine amino acids have two. The first two letters of each codon are the primary determinants of specificity. For example, the four codons for valine (GUU, GUC, GUA, and GUG) differ only in the third letter. The open reading frame of the mRNA, which extends from the AUG codon to the termination codon, establishes the protein that is to be synthesized.
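The degeneracy pattern described above can be recomputed by counting how many codons map to each amino acid. This is a minimal sketch assuming the standard code written as one-letter amino acid symbols in U, C, A, G order, with "*" for termination; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a termination codon (assumed convention).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons for each amino acid (termination codons excluded).
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# How many amino acids share each level of degeneracy.
degeneracy = Counter(codons_per_aa.values())
print(sorted(degeneracy.items()))
# [(1, 2), (2, 9), (3, 1), (4, 5), (6, 3)]

# The valine (V) codons all share the first two letters, GU.
valine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "V")
print(valine_codons)  # ['GUA', 'GUC', 'GUG', 'GUU']
```

Each pair in the first output reads as (codons per amino acid, number of amino acids), so two amino acids have one codon, nine have two, one has three, five have four, and three have six.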
The correspondence between codons and the amino acids that they specify appears to be nearly, but not quite, universal among species. The genetic code is identical within nuclear genes in all species examined, including Escherichia coli, viruses, various plants, and humans; the exceptions are genes encoded in mitochondria and genes found in a small number of other organisms. This is cited as evidence that all life-forms have a common evolutionary ancestor, with the genetic code being preserved throughout evolution.
The genetic information within a gene in DNA is encoded by a sequence of four nucleotides (A, T, G, and C). This must ultimately be translated into the language of proteins, whose twenty letters are the amino acids. It is now known that this information is first copied into an intermediate message form called mRNA, and then converted into a specific protein. This latter process of converting from the "nucleotide alphabet" to the "protein alphabet" requires that specific segments on mRNA correspond to specific amino acids in the protein being manufactured. This connection is provided by the genetic code.
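The two-step flow from DNA to mRNA to protein can be sketched in a few lines of Python. This is a simplified illustration, not a model of the cellular machinery: it assumes the input is the DNA coding strand (so that the mRNA is the same sequence with U in place of T), treats the first AUG as the start of the open reading frame, and uses the standard code with one-letter amino acid symbols. The function names `transcribe` and `translate` and the example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a termination codon (assumed convention).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand: same sequence, U instead of T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Translate from the first AUG up to (not including) the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

dna = "GGATGGTTTGGTAAGC"   # made-up example sequence
mrna = transcribe(dna)
print(mrna)                # GGAUGGUUUGGUAAGC
print(translate(mrna))     # MVW
```

Here the reading frame AUG GUU UGG UAA yields methionine, valine, and tryptophan before the termination codon UAA ends the message.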
Translation, which takes place on the ribosomes in the cytoplasm, requires that the codons in the mRNA specify the amino acid sequence of the protein. The codons on the mRNA must interact with the anticodons on the charged tRNA molecules, which bring the specific amino acid residues to the site. Watson-Crick complementary base pairing provides the specificity for this interaction.
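Codon-anticodon pairing can be illustrated with a small helper that returns the strictly complementary anticodon. Because the two strands pair in antiparallel orientation, the complement is reversed so that the result is also written 5′ to 3′. The sketch assumes only standard Watson-Crick pairs (A with U, G with C) and ignores any non-standard pairing or modified bases found in real tRNAs; the names `PAIR` and `anticodon` are illustrative.

```python
# Watson-Crick partners in RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon):
    """Return the complementary anticodon for a codon, both written 5'->3'."""
    # Strands pair antiparallel, so complement the codon read from its 3' end.
    return "".join(PAIR[base] for base in reversed(codon))

print(anticodon("AUG"))  # CAU
print(anticodon("UGG"))  # CCA
```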