Of the total mass of RNA in a retroviral particle, about three quarters is the dimeric genomic RNA, which in nondefective viruses ranges in size from approximately 7 kb per monomer for viruses such as ALV to more than 12 kb for human foamy virus (HFV) and a fish virus (walleye dermal sarcoma virus, WDSV) (Table 2). The sizes of genomic RNAs were first characterized by rate of sedimentation in sucrose gradients of RNA extracted from virions (Duesberg 1968). Untreated RNA extracted from virions was found to have a sedimentation coefficient of approximately 70S. Following heat treatment, however, the RNA sedimented more slowly, at approximately 35S. These important findings suggested that the retroviral genome is composed of subunits. Because sedimentation rate is not a linear function of size and depends on shape as well as size, the number of subunits cannot be inferred directly from the change in sedimentation rate. Electron microscopic observation of virion RNAs spread on a film was one of the experiments that led to the conclusion that the genomic RNA is a dimer, held together by sequences near the 5′ ends of both subunits (Bender and Davidson 1976). Although these sedimentation coefficients are not necessarily numerically accurate and vary among different retroviruses, dimeric and monomeric genomic RNAs are still conventionally referred to as “70S RNA” and “35S RNA,” respectively, even when electrophoresis is used to display them. Numerous kinds of viruses carry more than one segment of nucleic acid, but retroviruses are unique in that two identical molecules of the genome are incorporated into a single particle. That the 70S species is a dimer of identical subunits was inferred originally from careful stoichiometric analysis of RNase T1 oligonucleotides (Beemon et al. 1974; Billeter et al. 1974; Quade et al. 1974).
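The point that subunit number cannot be read directly from the 70S-to-35S shift can be illustrated with a small calculation. The sketch below assumes an idealized power law, s ∝ M^a, with textbook exponents for a random coil (a = 1/2) and a compact sphere (a = 2/3). The power law, the exponents, and the function name are illustrative assumptions rather than anything taken from the studies cited above, and the calculation ignores the change in shape that heat treatment itself causes.

```python
def implied_mass_ratio(s_ratio: float, exponent: float) -> float:
    """Mass ratio implied by a ratio of sedimentation coefficients,
    under the ASSUMED idealized scaling s proportional to M**exponent."""
    if exponent <= 0:
        raise ValueError("exponent must be positive")
    return s_ratio ** (1.0 / exponent)


# Approximate sedimentation coefficients quoted in the text.
S_DIMER = 70.0
S_MONOMER = 35.0
s_ratio = S_DIMER / S_MONOMER

# Idealized exponents (assumptions): none of these describes real virion RNA exactly.
SHAPE_EXPONENTS = {
    "naive linear reading (s ~ M)": 1.0,
    "compact sphere (s ~ M^(2/3))": 2.0 / 3.0,
    "random coil (s ~ M^(1/2))": 0.5,
}

for label, exponent in SHAPE_EXPONENTS.items():
    ratio = implied_mass_ratio(s_ratio, exponent)
    print(f"{label}: implied mass ratio = {ratio:.2f}")
```

Under these assumptions the same twofold change in sedimentation coefficient corresponds to a mass ratio anywhere from 2 to 4, which is why independent evidence such as electron microscopy and oligonucleotide stoichiometry was needed to establish that the genome is a dimer.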
The dimeric nature of the retroviral genome is responsible for a high rate of recombination during infection, since portions of both molecules can be copied during reverse transcription into viral DNA (Chapter 4). In addition to the viral genes proper, the genomic RNA contains a number of cis-acting sequences that are important in the viral life cycle (see Fig. 4). They can be grouped by the steps in the life cycle in which they function—RNA processing and translation, virion assembly, and reverse transcription.
The primary transcript of retroviral DNA is modified in several ways and closely resembles a cellular mRNA. It is “capped” at its 5′ end (bearing a 7-methylguanosine attached to the first encoded nucleotide by a 5′-5′ triphosphate linkage) (Furuichi et al. 1975; Keith and Fraenkel-Conrat 1975), polyadenylated at its 3′ end (bearing a poly[A] “tail” of about 50–200 noncoding A residues) (Gillespie et al. 1972; Lai and Duesberg 1972), and methylated at several specific sites internally (Beemon and Keith 1977; Dimock and Stoltzfus 1977). Some of the primary transcripts are spliced to give subgenomic mRNAs, first identified in MLV (Fan and Baltimore 1973). The signal sequence for cleavage and polyadenylation of the primary transcript, typically AAUAAA, is usually located about 20 nucleotides upstream of the end of the transcript, which in most retroviruses falls in the R region. When R is very short, as in the ASLVs, this sequence is in U3. Viruses of the HTLV group are exceptional in that the nearest copy of a consensus polyadenylation signal is more than 200 bases upstream of the beginning of the poly(A) tract, and the site of polyadenylation is determined by secondary structure. In cases where the poly(A) signal functions inefficiently, longer transcripts may be incorporated into virions (Swain and Coffin 1989), creating potential intermediates for the acquisition of oncogenes (Chapter 4).
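The positional relationship between the AAUAAA signal and the end of the transcript can be made concrete with a short sequence scan. The sketch below is only an illustration: the function name is hypothetical, the toy transcript is invented for the example and is not a viral sequence, and "distance" is taken here to mean the number of nucleotides between the last base of the hexamer and the last encoded base before the poly(A) tail.

```python
def find_polya_signals(transcript: str, signal: str = "AAUAAA"):
    """Return (start_index, distance_to_3prime_end) for every occurrence of
    the signal. Input may be RNA or DNA; T is read as U. Overlapping matches
    are reported."""
    rna = transcript.upper().replace("T", "U")
    hits = []
    start = rna.find(signal)
    while start != -1:
        # Nucleotides lying between the end of the hexamer and the transcript end.
        distance_to_end = len(rna) - (start + len(signal))
        hits.append((start, distance_to_end))
        start = rna.find(signal, start + 1)
    return hits


# Invented toy transcript (poly(A) tail not included): a short 5' stretch,
# one AAUAAA hexamer, then 20 further nucleotides before the cleavage site.
toy_transcript = "GCGCGC" + "AAUAAA" + "C" * 20

for start, distance in find_polya_signals(toy_transcript):
    print(f"AAUAAA at index {start}, {distance} nt upstream of the 3' end")
```

Applied to a real transcript, a scan of this kind would flag the typical case (a hexamer roughly 20 nucleotides from the end) and make an HTLV-like arrangement, with the nearest consensus signal more than 200 bases away, immediately visible.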
Unlike most cellular mRNAs, in which all introns are efficiently spliced out, newly synthesized retroviral RNA must be diverted into two populations (Chapter 6). One population remains unspliced, to serve as the genomic RNA and as mRNA for gag and pol. The other population is spliced, fusing the 5′ portion of the genomic RNA to the downstream genes, most commonly env. The intron between the splice donor and splice acceptor sites (SD and SA) that is removed by splicing contains the gag, pro, and pol genes (see Fig. 4). This splicing event creates the mRNA for envelope protein.
The proper ratio of spliced to unspliced RNA must be maintained for efficient replication. For simple retroviruses, this ratio is determined by several cis-acting sequences that have been only partially defined. In ASLV, sequences affecting splicing are found in gag (McNally et al. 1991), as well as near the end of pol (Katz and Skalka 1990) or the end of env (Berberich and Stoltzfus 1991). In M-PMV, a sequence near the 3′ end of the RNA is involved in the regulation of splicing (Bray et al. 1994). For viruses with accessory genes, splicing is regulated through interaction of sequences on the RNA with the protein product of one of the accessory genes, rev in HIV-1 and rex in HTLV-1.
Translation of the retroviral pro and pol genes is controlled by cis-acting sequences around the gag-pro and pro-pol borders (Chapter 7). In each case, a small fraction of the ribosomes translating gag continues on to translate the downstream genes, thereby generating a fusion protein with Gag at its amino terminus. In MLV, this process is the consequence of readthrough (termination suppression) of the termination codon, with the insertion of glutamine at the stop site (Yoshinaka et al. 1985). In most other retroviruses, the gag termination codon is bypassed by frameshifting: Ribosomes stall and then shift their reading frame back one nucleotide before continuing into the downstream gene (see Jacks and Varmus 1985). For both mechanisms, essential consensus secondary structures have been identified. In retroviruses in which the pro gene is in a reading frame by itself (including M-PMV, MMTV, and HTLV), there are two frameshift signals, one before pro and the other before pol. In this case, frameshifting at each site is very efficient (up to 30%), ensuring that enough of the Gag-Pro-Pol protein is made after the two serial shifts. Frameshifting and readthrough apparently have evolved as simple strategies to provide the proper ratios of Gag, Gag-Pro, and Gag-Pro-Pol polypeptides in the infected cell. A consequence of this strategy is that the enzymatic proteins required by the virus are fused to the Gag polyprotein, which provides a direct way to incorporate enzymes into the virion during assembly. Although first discovered in retroviruses, frameshifting has also been found in other viruses (e.g., coronaviruses) and in bacteria, but to date, no example is known where two eukaryotic cellular genes are expressed coordinately by this mechanism.
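The arithmetic behind two serial frameshifts can be sketched as follows. The model is a deliberate simplification and an assumption of this example: every ribosome that starts on gag is taken to shift at each site independently with a fixed probability, and differences in protein stability or premature termination are ignored. The function and parameter names are hypothetical.

```python
def polyprotein_fractions(p_gag_pro: float, p_pro_pol: float) -> dict:
    """Fraction of translating ribosomes that end up making Gag, Gag-Pro, or
    Gag-Pro-Pol, given the frameshift efficiency at each of two sites.
    ASSUMES independent shifts and no other losses."""
    for p in (p_gag_pro, p_pro_pol):
        if not 0.0 <= p <= 1.0:
            raise ValueError("efficiencies must lie between 0 and 1")
    return {
        "Gag": 1.0 - p_gag_pro,                    # no shift at the gag-pro site
        "Gag-Pro": p_gag_pro * (1.0 - p_pro_pol),  # first shift only
        "Gag-Pro-Pol": p_gag_pro * p_pro_pol,      # both shifts in series
    }


# Upper-end efficiency quoted in the text for viruses with two frameshift sites.
fractions = polyprotein_fractions(0.30, 0.30)
for product, fraction in fractions.items():
    print(f"{product}: {fraction:.0%}")
```

With 30% efficiency at each site, only about 9% of ribosomes complete both shifts under this model, which suggests why each individual site needs to be unusually efficient when two shifts occur in series.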
Until recently, this mode of expression of the pro and pol genes seemed to be universal in retroviruses. However, studies with spumaviruses (which are unique in having a +1 frameshift between gag and pro) now imply that the Pro-Pol polyprotein is expressed independently of Gag, from a different mRNA apparently generated by splicing. This conclusion is based on several new findings. Even in the absence of a functional protease, cells expressing spumavirus proteins show no evidence of a Gag-Pro-Pol protein (Konvalinka et al. 1995). In fact, Pol is expressed from a spliced RNA (Yu et al. 1996). The mechanism by which this protein is incorporated into virions remains to be established.
In viral assembly, the genomic RNA is identified as an RNA to be packaged by virtue of a complex sequence in the leader region, called ψ (from the original work on MLV) or E (for encapsidation, in the spleen necrosis virus [SNV] system). The exact nature of this sequence in different retroviruses has been elucidated only in part (for review, see Linial and Miller 1990; Berkowitz et al. 1996; Chapter 7), and the mechanism by which it is recognized by the packaging machinery is only beginning to be deciphered. It is not known how retroviruses incorporate two genomic RNAs. In the mature virion, these two molecules are believed to be held together primarily by a sequence called the “dimer linkage,” which is in the leader region or in some cases in gag. The location of the dimer linkage is based both on early work in which purified virion RNA was partially denatured and then examined by electron microscopy (Bender and Davidson 1976) and on the dimerization of in-vitro-synthesized RNAs (Bieth et al. 1990; Prats et al. 1990). However, the observation that genomic RNAs can carry multiple single-strand breaks (nicks) and still sediment at the same rate as intact dimers suggests that in the virion, the two RNA subunits are held together at multiple points, as is also apparent from early electron microscopic studies. This conclusion is also borne out by more recent in vitro dimerization assays (Feng et al. 1995). As discussed below, the nucleocapsid protein probably has an instrumental role in forming the dimer complex. One attractive model that could explain how two copies of the genome are incorporated postulates that only dimerized RNA is recognized by the packaging machinery. However, this model is at odds with the observations that dimerization appears to occur during the maturation process after viral particles are shed from the cell.
Among the small RNAs included in retroviral virions is the primer for initiation of reverse transcription, a specific cellular tRNA discovered in the initial experiments on reverse transcriptase in permeabilized virions (Verma et al. 1971; Dahlberg et al. 1974). In all retroviruses, the primer tRNA is associated with the genome by base pairing between its 3′-terminal 18 nucleotides and the complementary “primer-binding site” (PBS) sequence on the genome. Retroviruses have evolved to use one of several different tRNAs as primers. The best studied viruses use tRNATrp (ASLV), tRNALys (e.g., HIV-1), and tRNAPro (e.g., MLV) (see Table 2). The left end of the PBS defines the end of U5 (see Chapter 4).
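The base-pairing relationship between the primer and the PBS can be expressed as a reverse-complement calculation: the PBS, read 5′ to 3′ on the genome, is the reverse complement of the 3′-terminal 18 nucleotides of the tRNA. In the minimal sketch below, the function names are hypothetical, the example sequence is illustrative only and is not taken from the text, and only Watson-Crick pairs are considered.

```python
# Watson-Crick complements for RNA (G-U wobble pairs are ignored in this sketch).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence; T in the input is read as U."""
    rna = seq.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return rna.translate(_COMPLEMENT)[::-1]


def pbs_for_primer(trna: str, n_paired: int = 18) -> str:
    """Genomic sequence (5'->3') that would pair perfectly with the last
    n_paired nucleotides of the given tRNA."""
    if len(trna) < n_paired:
        raise ValueError("tRNA sequence is shorter than the paired region")
    return reverse_complement_rna(trna[-n_paired:])


# Illustrative 18-nt 3' terminus ending in CCA (an assumption of this example).
example_3prime_end = "GUCCCUGUUCGGGCGCCA"
print("tRNA 3' end  :", example_3prime_end)
print("matching PBS :", pbs_for_primer(example_3prime_end))
```

Because the pairing is determined entirely by the tRNA 3′ terminus, comparing a candidate PBS with the output of such a calculation for different tRNAs is one simple way to see which primer a given genome is adapted to use.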
Just as a tRNA is used as a primer for synthesis of the first DNA strand (i.e., “minus” strand), an RNA fragment derived from the genome itself is used as the primer for the second DNA strand (i.e., “plus” strand). This RNA is the polypurine tract (PPT), the sequence immediately preceding U3. During reverse transcription, the PPT sequence survives the RNase H activity of the reverse transcriptase, remaining to be used as a primer. Some retroviruses have a second PPT-like sequence, positioned near the middle of the genome, that can serve as a second site for initiation of plus-strand synthesis (Chapter 4). Although proper reverse transcription requires PBS, PPT, and R sequences and virion proteins, it can occur normally in the complete absence of gag, pro, pol, and env genes. This is the basis of retroviral vectors, which are constructed to deliver genes of choice into cells infected in culture or in animals or people (Chapter 9). Sequences upstream of the PBS and downstream from the PPT may also have roles in efficient reverse transcription, and additional sequences, sometimes called att, at the ends of the LTR function in the integration of the DNA into the host genome (Chapter 5).
In addition to genomic RNA, virions contain substantial amounts of cellular tRNA, representing perhaps 50 molecules per virion (Erickson and Erickson 1970). When RNA is extracted gently from viral particles and then separated by sedimentation or gel electrophoresis, most of the tRNA is found to be free. However, some of the tRNA is bound to the 70S genomic RNA dimer and can be released by heating; at least a portion of this bound RNA is the primer tRNA hydrogen-bonded to the PBS (Canaani and Duesberg 1972). Even the free tRNAs in virions do not represent a random sampling of the cellular pool, but rather are predominantly primer tRNAs and tRNAs structurally related to them. In most viruses, the specificity for incorporation of tRNA apparently includes interactions with a specific binding site on reverse transcriptase as well as the PBS and probably some neighboring sequences as well.
The cellular RNA component found in preparations of retroviral particles is not limited to tRNAs but also includes low levels of mRNAs as well as 5S ribosomal RNA and (in ASLV) a 7S RNA encoded by a SINE element. The exact amounts of such cellular RNAs have not been carefully quantified, however, and it is not always easy to exclude contamination of the virus preparation with cellular debris. Although still present only in small amounts, some cellular messengers appear to be packaged selectively, such as that for the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase in ASLV (Linial and Miller 1990). What feature of this RNA facilitates incorporation into virions is not known. MLV efficiently incorporates RNAs derived from the endogenous virus-like VL30 elements in the mouse genome (Chapter 8). In this case, the efficient packaging is explained by the resemblance of sequences in these RNAs to the sequence directing viral RNA packaging in wild-type MLV. The presence of viral RNA is not essential for virion assembly and budding. However, cells that express defective retroviruses in the absence of viral RNA containing a normal encapsidation signal probably produce virions that incorporate more cellular RNAs than do wild-type viruses, although careful quantitation of total RNA in such defective virions remains to be carried out. It is likely that virion formation requires at least some RNA, viral or cellular, but this supposition has not been addressed experimentally (Chapter 7).
Retroviruses also contain low levels of DNA. At least a portion of virion DNA is viral in nature, in HIV representing about 0.1% of the amount of genomic RNA (Arts et al. 1994), and consisting predominantly of the early products of reverse transcription. The amount of this viral DNA in virions is much lower in protease mutants, consistent with the idea that reverse transcription occurs in the nascent virions in which reverse transcriptase has been activated by proteolytic processing. Given the low percentage of infectious particles in retroviruses, it might be hypothesized that virion DNA has an important role in infection. In hepadnaviruses, close cousins of retroviruses that also use reverse transcription in replication, the mature virion contains as its genome a DNA copy of the viral RNA packaged in the cell. Nevertheless, although it is difficult to exclude a biological function for virion DNA, for most retroviruses, it is generally assumed to be a byproduct that is unnecessary for the infectious cycle. An important exception to this generalization apparently is the spumaviruses, which contain high levels of genome-sized DNA (Yu et al. 1996). This viral genus, which is the most distant from all other retroviral genera as judged by reverse transcriptase sequence (Chapter 8), may thus represent an evolutionary link with hepadnaviruses.
Publication Details
Publisher
Cold Spring Harbor Laboratory Press, Cold Spring Harbor (NY)
NLM Citation
Coffin JM, Hughes SH, Varmus HE, editors. Retroviruses. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 1997. Virion RNA.