Genetic information is located in the genes, which are formed by discrete segments of the cellular DNA. In a process called transcription (presented schematically in Fig. 7), genes are copied into a complementary length of ribonucleic acid (RNA) by the enzyme RNA polymerase. Most of the RNA molecules, the messenger RNAs (mRNAs), specify the amino acid composition of the cellular proteins. Other RNA molecules derived by transcription, ribosomal RNA (rRNA) and transfer RNA (tRNA), participate as auxiliary molecules in translation. More recently, small regulatory transcripts such as microRNAs (miRNAs) were found in many kinds of cells. These small molecules, together with the related small interfering RNAs (siRNAs), play an important role in the regulation of protein synthesis.
The discovery of how the specific arrangement of nucleotides in the gene codes for the sequence of amino acids in the polypeptide, the unraveling of the genetic code, is one of the milestones of the DNA era. It was found that triplets of nucleotides in the DNA, and consequently in the mRNA, code for the amino acid composition of a protein. Most important was the finding that the genetic code is (almost) universal in nature. A given triplet of nucleotides, a so-called codon, in mRNA codes for the same amino acid in nearly all organisms.
The mRNA molecules contain more information than merely the triplets required for the encoded protein. The protein-encoding information is preceded by a piece of RNA that allows binding to the ribosome, while after the triplet encoding the C-terminal amino acid there is some RNA that functions in the termination of the transcription process. Thus, signals are required within the mRNA molecule to guarantee a proper start and finish for the polypeptide synthesis. Near the 5′-end of the mRNA a specific triplet, coding for the amino acid methionine, dictates the proper start of the polypeptide synthesis, and near the 3′-end a specific triplet (a stop codon) dictates a proper finish. The genetic code, on the basis of the triplets in the mRNA, is presented in Table 1. One may see that there are three different stop codons. Moreover, it is clear from Table 1 that the code is highly redundant: for certain amino acids there are several codons. Whenever there is a choice between various codons for one amino acid, different organisms tend to show different preferences. Later it will become clear that this organism-dependent codon preference has consequences for certain biotechnological processes.
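The reading of an mRNA from a start codon to a stop codon can be illustrated with a short Python sketch. It builds the standard genetic code as a lookup table and translates a made-up mRNA. The function name `translate`, the example sequence and the rule "begin at the first AUG" are assumptions chosen for illustration; in a real cell, start-site selection also depends on the ribosome-binding region that precedes the coding sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varies fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (written 5' to 3') into one-letter amino acid codes.

    Simplifying assumption: translation begins at the first AUG found
    and ends at the first in-frame stop codon (or at the end of the RNA).
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no polypeptide
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: finish the polypeptide
        protein.append(amino_acid)
    return "".join(protein)


# The three stop codons and the redundancy of the code can be read off the table.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

# Hypothetical mRNA: leader, AUG, two sense codons, stop codon, trailer.
example_protein = translate("GGAGGAUGCCGUUUUAAGC")
```

Here `example_protein` is the tripeptide methionine-proline-phenylalanine ("MPF"), and `leucine_codons` lists six codons for a single amino acid, which is the redundancy that leaves room for organism-dependent codon preference.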
Transcription starts with the binding of the enzyme RNA polymerase at a specific site, called the promoter, immediately upstream from a gene or from a set of genes transcribed as an operational unit (an operon). Promoters vary in the efficiency with which they bind RNA polymerase. Some promoters, the strong promoters, are highly efficient, while others are weak and often require additional factors for effective binding of RNA polymerase. Promoter structures, in prokaryotes as well as in eukaryotes, have been studied in great detail. Based on such studies it is now feasible in biotechnology, as shown later, to fuse very effective promoter structures to any gene that one wishes to be expressed.
After binding of the RNA polymerase, the DNA helix is partially unwound and subsequently the transcription process starts. RNA synthesis then proceeds with the ribonucleotides ATP, GTP, CTP and UTP (uridine 5′-triphosphate) as building units. One DNA strand in the gene, the so-called template strand, serves as the matrix for this RNA synthesis.
As in DNA synthesis, RNA synthesis proceeds in the 5′ to 3′ direction, antiparallel to the template strand, and in a complementary way. The latter implies that a G in the matrix DNA leads to a C in the RNA, a C leads to a G, and a T to an A, while an A in the DNA shows up as a U in the RNA. Transcription may stop either on the basis of intrinsic structural features of the RNA at the end of the gene or the operon, or by the intervention of a specific terminating protein factor at this site.
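The pairing rules just described translate directly into a lookup table. The following Python sketch is illustrative only: the function names and the example sequence are made up, and the code ignores promoters and terminators, copying the whole template strand it is given.

```python
# Base-pairing rules from the text: template DNA base -> RNA base.
PAIRING = {"G": "C", "C": "G", "T": "A", "A": "U"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5' to 3') for a template strand written 3' to 5'.

    Because the RNA runs antiparallel to the template, reading the
    template 3' to 5' yields the RNA in its 5' to 3' direction of synthesis.
    """
    return "".join(PAIRING[base] for base in template_3to5.upper())


def transcribe_from_5to3(template_5to3: str) -> str:
    """Same as transcribe, for a template written in the usual 5' to 3' order."""
    return transcribe(template_5to3[::-1])


# Hypothetical template strand, written 3' to 5'.
rna = transcribe("TACGGCATT")
```

For this template, `rna` is `AUGCCGUAA`: a start codon, one further codon and a stop codon. Supplying the same strand in 5′ to 3′ notation (`TTACGGCAT`) to `transcribe_from_5to3` gives the same RNA.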
Transcription can be regulated at various stages in the process. The intrinsic properties of the promoter, together with various kinds of proteins that can either repress or stimulate the binding of RNA polymerase, regulate the transcription start. Transcription termination can also be regulated. Termination may, under the influence of physiological factors, occur at a premature stage. Alternatively, the normal termination signal could be ignored (a process called read-through). This may lead to transcripts of various lengths starting from the same promoter. Finally, gene activity can also be regulated at the level of the formed mRNA. All transcripts are subject to degradation, but rates of degradation can vary widely: some transcripts have a short half-life while others are very stable. Biotechnologists try to influence the expression of a gene encoding a relevant biotechnological protein at each of these regulation levels in order to achieve optimal production.
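To give a feeling for what a difference in half-life means, the sketch below computes the fraction of a transcript pool that remains after a given time. It assumes simple first-order (exponential) decay with no new synthesis, which is a modelling assumption and not a statement from the text; the half-lives of 2 and 20 minutes are hypothetical values chosen only for illustration.

```python
def remaining_fraction(elapsed: float, half_life: float) -> float:
    """Fraction of an mRNA pool left after `elapsed` time units.

    Assumes first-order decay: the pool halves once per half-life.
    Both arguments must use the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


# Hypothetical transcripts, compared 10 minutes after synthesis stops.
unstable = remaining_fraction(10, half_life=2)   # five half-lives have passed
stable = remaining_fraction(10, half_life=20)    # half of one half-life has passed
```

Under these assumptions about 3% of the unstable transcript remains after 10 minutes, against roughly 71% of the stable one, so the stable transcript stays available for translation much longer.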